The best Docker containers for an AI homelab aren’t useful in isolation; they’re useful because of how they connect. You already know what each of these tools does. This post is about how they fit together.
The Core: Model Runner + Interface
Every AI homelab stack starts in the same place. Ollama runs your models. Open WebUI provides a browser-based interface to communicate with them. These two containers are non-negotiable; everything else is optional, depending on what you want your setup to do.
Ollama serves its API on port 11434, including an OpenAI-compatible endpoint under /v1. That single detail is what makes the rest of the stack possible: any tool that can talk to OpenAI can be pointed at Ollama instead.
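As a concrete sketch, the whole core stack fits in one Compose file. The service names, host ports, and volume paths below are illustrative defaults, not the only way to wire it:

```yaml
# docker-compose.yml — minimal core stack (names and paths are illustrative)
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"              # native API plus the OpenAI-compatible /v1 endpoint
    volumes:
      - ollama_data:/root/.ollama  # persist downloaded models across restarts

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      # Point the interface at the Ollama container by its service name
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama_data:
```

With this running, any OpenAI-compatible client on your network can target http://localhost:11434/v1; Ollama accepts any non-empty API key, since it doesn’t actually check one.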
Add Memory: The Vector Database
A model running in Ollama has no memory of your files, notes, or documents. It only knows what’s in its training data.
A vector database fixes that. Qdrant or Chroma stores embeddings of your documents. Open WebUI connects directly to either one, which is how you get RAG: the ability to ask your model questions about your own data.
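Wiring Qdrant in is one more service plus a couple of environment variables on Open WebUI. The variable names below (VECTOR_DB, QDRANT_URI) match recent Open WebUI releases; verify them against the docs for your version:

```yaml
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"                  # HTTP API
    volumes:
      - qdrant_data:/qdrant/storage  # declare qdrant_data under the top-level volumes: key

  open-webui:
    # ...as in the core stack, plus:
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - VECTOR_DB=qdrant             # swap the built-in store for Qdrant
      - QDRANT_URI=http://qdrant:6333
```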
If you want a simpler path, AnythingLLM bundles document management, vector storage, and a chat interface into a single container. Less flexible than the Open WebUI + Qdrant combo, but much faster to get running.
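A sketch of that single-container path, assuming AnythingLLM’s documented Ollama settings (LLM_PROVIDER and OLLAMA_BASE_PATH; check the project’s .env reference for your release):

```yaml
  anythingllm:
    image: mintplexlabs/anythingllm
    ports:
      - "3001:3001"
    environment:
      # Assumed settings: use the Ollama container as the LLM backend
      - LLM_PROVIDER=ollama
      - OLLAMA_BASE_PATH=http://ollama:11434
    volumes:
      - anythingllm_data:/app/server/storage  # documents, vectors, and chat history
```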
Add Automation: n8n
A model that only responds when you type at it is useful. A model that can act on your behalf is a different category of tool: summarising an email before you read it, triggering when a file lands in a folder, routing a task based on its content.
n8n is what makes that possible. Because Ollama speaks the OpenAI API format, pointing n8n at your local instance requires changing one URL. From there, you can build workflows that use your local model the same way you’d use a cloud API, except the data never leaves your machine.
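In Compose terms, n8n is one more service; the actual pointing-at-Ollama happens in n8n’s credential settings, noted in the comments below:

```yaml
  n8n:
    image: n8nio/n8n
    ports:
      - "5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n  # persist workflows and credentials
    # In n8n's UI, create an OpenAI credential with:
    #   Base URL: http://ollama:11434/v1
    #   API key:  any non-empty string (Ollama doesn't check it)
```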
Two Practical Stacks
CPU-only machine (8GB+ RAM): Ollama + Open WebUI. Pull a 3B Q4 model, and you have a working private AI assistant. See what models fit your hardware.
GPU machine (RTX 3060 12GB or better): Ollama + Open WebUI + Qdrant + n8n. This is where the stack starts to feel serious. Your model runs fast, your documents are searchable, and your workflows are automated.
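The only new wiring the GPU stack needs is passing the card through to the Ollama container. Assuming the NVIDIA Container Toolkit is installed on the host, the Compose spec’s device reservation syntax handles it:

```yaml
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all           # or 1, to pin a single GPU
              capabilities: [gpu]
```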
Resource Allocation Matters
Your homelab runs other things. Set memory limits on every AI container. A model that’s allowed to use all available RAM will eventually take down your other services. Docker’s --memory flag and the deploy.resources.limits block in Compose exist for this reason. The post on how to set resource limits on Docker containers covers the specifics.
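A sketch of what that looks like for the Ollama service; the 12G ceiling is illustrative, so size it to your largest model plus headroom:

```yaml
  ollama:
    # ...
    deploy:
      resources:
        limits:
          memory: 12G  # hard ceiling; the container is killed before it starves the host
    # Equivalent for a plain docker run:
    #   docker run --memory=12g ollama/ollama
```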
The Takeaway
The stack has a logic to it: Ollama runs the model, Open WebUI makes it usable, a vector database gives it memory, and n8n gives it reach. You don’t need all of it on day one. Start with Ollama and Open WebUI, then add layers as your needs grow.
