Install Ollama with Docker to get a local LLM server that runs entirely on your own hardware, persists downloaded models across updates, and exposes an OpenAI-compatible API on port 11434, with no Python environment required.
Ollama handles everything model-related: downloading weights, managing quantization, and serving inference through a clean REST API. Running it inside Docker means the host system stays untouched, and your models survive container rebuilds because they live in a volume, not the container itself.
What You Need Before Starting
You need Docker and Docker Compose installed, the ability to run Docker commands without sudo (your user is in the docker group), and at least 8 GB of RAM with 10 GB of free disk space.
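Before moving on, it is worth confirming each prerequisite. These are standard checks, and the exact output varies by distribution:
docker --version
docker compose version
groups | grep docker   # output should include "docker"
free -h                # available RAM
df -h ~                # free space on your home filesystem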
Create the Project Directory
mkdir -p ~/homelab/ollama
cd ~/homelab/ollama
Write the Compose File
Create docker-compose.yml:
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ./ollama_data:/root/.ollama
    environment:
      - TZ=America/New_York
Replace America/New_York with your timezone. Run timedatectl if you are unsure what yours is.
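If you want to copy the exact identifier, this filters it out of the timedatectl output on most systemd-based systems:
timedatectl | grep "Time zone"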
The volume mount ./ollama_data:/root/.ollama is the critical line. Ollama stores all downloaded models at /root/.ollama inside the container. Mapping that to a local directory means pulling a new image version later will not touch your models; they stay exactly where they are.
Start the Container
docker compose up -d
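If you want to watch the server start, or debug a container that exits immediately, follow the logs (Ctrl+C stops following without stopping the container):
docker compose logs -f ollama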
Verify It’s Running
Check the container status:
docker compose ps
The ollama service should show as Up. Then confirm the API is responding:
curl http://localhost:11434/api/tags
Expected response:
{"models":[]}
No models yet, but the API is live.
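A second quick check is the version endpoint, which reports the Ollama build the container is running (the number you see depends on the image you pulled):
curl http://localhost:11434/api/version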
Pull Your First Model
docker exec -it ollama ollama pull qwen2.5:3b
Qwen2.5 3B is a practical starting point: about 2 GB to download, it runs on CPU without issues, and it handles general instruction tasks well. The pull output shows layer-by-layer progress and ends with success when complete.
Verify it landed:
docker exec -it ollama ollama list
NAME            ID              SIZE      MODIFIED
qwen2.5:3b      123abc456def    2.1 GB    2 minutes ago
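Because of the volume mount, those weights now live on the host as well. Listing the data directory should show Ollama's model store:
ls ollama_data/models
You should see Ollama's internal layout, typically blobs and manifests directories, though the exact structure can vary between Ollama versions.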
Run a Quick Test
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5:3b",
    "prompt": "What is Docker?",
    "stream": false
  }'
The response JSON will include a response field with the model’s output. If you get a connection refused error instead, the container is not up yet; run docker compose ps to check its status.
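The same port also serves the OpenAI-compatible routes mentioned at the top, so clients that speak the OpenAI chat format can point at the local server directly. A minimal chat completion request looks like this (recent Ollama images expose the /v1 path):
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5:3b",
    "messages": [{"role": "user", "content": "What is Docker?"}]
  }'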
What Port 11434 Means for Your Network
Port 11434 is Ollama’s default API port. The Compose file maps it as 11434:11434 (host port on the left, container port on the right). Any tool or service that can reach your server on that port can now send inference requests, including local scripts, other containers on the same Docker network, or applications configured to use a local Ollama endpoint.
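To see that cross-container reachability in action, you can run a throwaway container on the Compose network and call the service by its container name. The network name below assumes Compose derived the project name from the ollama directory; run docker network ls if yours differs:
docker run --rm --network ollama_default curlimages/curl \
  http://ollama:11434/api/tags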
The API is not authenticated by default. Do not expose this port directly to the internet. For remote access, route it through a VPN tunnel or a reverse proxy with authentication in front of it.
The Takeaway
Ollama is now running in Docker with a persistent volume protecting your models and a live API on port 11434. You pulled a model, confirmed it is loaded, and ran a generation request against it. The server will restart automatically with Docker and survive image updates without losing any model data.
