Installing and Using a Graphics Card for Ollama + Docker LLM on Ubuntu

Reading time: 3 minutes

Today we will learn how to install a graphics card (GPU) in an Ubuntu machine and use it with our Ollama server running in Docker.

First, make sure the system is up to date before doing anything else.

Connect to the machine via SSH and run:

sudo apt update && sudo apt upgrade -y

This may take a few minutes depending on how many updates are pending. Wait for it to finish before rebooting or continuing.

If you don’t have Docker installed, the next step is to install it.

The cleanest way to install Docker on Ubuntu 22.04 is to add the official Docker repository instead of using the system package, which is usually outdated.

Execute this in order:

sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin

Afterwards, add your user to the docker group so you don’t have to use sudo on every command:

sudo usermod -aG docker $USER

The change only takes effect after you log out and back in, but you can run the command now so it is ready for later.
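To see whether the group change is already active in your current session, a small check like the one below can help. It is only a sketch: `session_has_group` is a helper name made up for this example, and it relies on `id -nG` without arguments, which reports the groups of the running session rather than what is stored in the group database.

```shell
#!/usr/bin/env bash
# session_has_group is a hypothetical helper for this example.
# It succeeds if the current shell session already carries the group named in $1.
session_has_group() {
  id -nG | tr ' ' '\n' | grep -qxF -- "$1"
}

if session_has_group docker; then
  echo "docker group is active: docker commands work without sudo"
else
  echo "docker group not active yet: log out and back in, or run 'newgrp docker'"
fi
```

If the second message appears right after running usermod, that is expected; it should change once you start a new session.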

The next block installs the NVIDIA Container Toolkit, which lets Docker containers access the GPU.

You can do this without the card installed; the toolkit simply sits there until it is needed.

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install -y nvidia-container-toolkit

Now create the docker-compose.yml to run both Ollama and Open WebUI. You can place it in /opt/ollama or wherever you prefer. The content is this:

services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3000:8080"
    volumes:
      - open-webui_data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
volumes:
  ollama_data:
  open-webui_data:
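YAML is sensitive to indentation, so it may be worth validating the file before moving on. The sketch below assumes you saved it as /opt/ollama/docker-compose.yml; `validate_compose` is a helper name invented for this example. It uses `docker compose config --quiet`, which parses the file without starting anything, so it does not need the GPU to be present.

```shell
#!/usr/bin/env bash
# validate_compose is a hypothetical helper for this example.
# It succeeds only if Docker is installed and the file passed in $1 parses cleanly.
validate_compose() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker is not installed" >&2
    return 1
  fi
  # --quiet validates the file without printing the rendered configuration
  docker compose -f "$1" config --quiet
}

# Assumed location of the file; adjust if you placed it elsewhere
if validate_compose /opt/ollama/docker-compose.yml; then
  echo "compose file is valid"
else
  echo "compose file is missing or has errors"
fi
```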

With this, you have everything ready from the software side.

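Now it’s time for the physical part: power off the machine, install the RTX 4060 Ti in the PCIe slot, connect the power cables from the new power supply to the card, and power it back on. Depending on the model, the 4060 Ti takes either a 16-pin connector (often through a bundled adapter) or a standard 8-pin PCIe connector, so check which one your card has.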

Once the machine is back online, you need to install the NVIDIA drivers first:

sudo ubuntu-drivers autoinstall
sudo reboot

The reboot is mandatory here. Once the machine is back up, run nvidia-smi to verify that the driver detects the card. You should see something like "NVIDIA GeForce RTX 4060 Ti" with 16376 MiB of memory. If it appears, everything is fine.
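If you prefer a compact answer over the full nvidia-smi table, the query flags can print just the fields of interest. This is a minimal sketch; `gpu_summary` is a helper name made up for this example, and the exact values printed depend on your card and driver.

```shell
#!/usr/bin/env bash
# gpu_summary is a hypothetical helper for this example.
# It prints one CSV line per GPU: name, driver version, total memory.
gpu_summary() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: the driver is not installed or the reboot is pending"
    return 1
  fi
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

gpu_summary || echo "GPU check failed"
```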

Now configure Docker to use the NVIDIA runtime and restart the service:

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
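Before starting the real services, you can check that a throwaway container actually sees the GPU. The sketch below runs nvidia-smi inside a plain ubuntu image, which Docker downloads on first use; `container_gpu_check` is a helper name invented for this example. Prefix the docker command with sudo if your docker group membership is not active yet.

```shell
#!/usr/bin/env bash
# container_gpu_check is a hypothetical helper for this example.
# It runs nvidia-smi inside a temporary container with all GPUs attached.
container_gpu_check() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker is not installed" >&2
    return 1
  fi
  # --rm removes the container afterwards; --gpus all exposes every GPU to it
  docker run --rm --gpus all ubuntu nvidia-smi
}

if container_gpu_check; then
  echo "containers can reach the GPU"
else
  echo "GPU not reachable from containers: recheck the driver and the toolkit setup"
fi
```

The output should be the same table you saw on the host. If the container fails while the host works, the runtime configuration step is the likely place to look.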

Go to the directory where you placed the docker-compose.yml and start the services:

cd /opt/ollama
docker compose up -d

Verify that the containers are running with docker ps. You should see ollama and open-webui in an Up state.
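An Up state only says the containers are alive, so it can also help to check that both services answer over HTTP on the ports published in the compose file. This is a rough sketch; `http_status` is a helper name made up for this example, and Open WebUI may need a minute after its first start before it responds.

```shell
#!/usr/bin/env bash
# http_status is a hypothetical helper for this example.
# It prints the HTTP status code returned by the URL in $1
# (curl reports 000 when nothing answers within 5 seconds).
http_status() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' "$1" || true
}

echo "Ollama API: $(http_status http://localhost:11434/)"
echo "Open WebUI: $(http_status http://localhost:3000/)"
```

A 200 on both lines suggests the services are reachable. From another machine, replace localhost with the server's address.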

Now download the model. This command starts the download inside the container. The model weighs around 9 GB, so it will take a while:

docker exec ollama ollama pull qwen3:14b

While it downloads, you can open another terminal and run nvidia-smi in a loop, so that once the model starts running inference you can see the GPU show active usage: watch -n 1 nvidia-smi
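Once the download has finished, one way to trigger inference is to send a prompt to the Ollama HTTP API and then ask Ollama where the model was loaded. The sketch below assumes the services from the compose file are running on this machine; `build_generate_payload` is a helper name invented for this example, and the prompt text is arbitrary.

```shell
#!/usr/bin/env bash
# build_generate_payload is a hypothetical helper for this example.
# It prints the JSON body for Ollama's /api/generate endpoint,
# given a model name in $1 and a prompt in $2 (the prompt must not contain quotes).
build_generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

payload="$(build_generate_payload qwen3:14b 'Say hello in one sentence.')"

# Send the prompt; the first request is slower because the model is loaded into memory
curl -s -m 300 http://localhost:11434/api/generate -d "$payload" || echo "Ollama did not answer"
echo

# List the loaded models and whether they run on the GPU or the CPU
docker exec ollama ollama ps || echo "could not query the ollama container"
```

If the last command reports the model as running on the GPU, and nvidia-smi shows memory in use, the card is doing the work.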
