Raspberry Pi 5 as an AI Engineer's Home Server
You’ve got a Raspberry Pi 5 powered on, SSH working, and a blinking cursor staring back at you. Now what?
This guide turns that blank slate into a personal AI development server — Docker-ready, remotely accessible from anywhere, running your automations as persistent services. We’ll finish with a real first app deployed end-to-end, so you can feel the whole workflow before adding anything more complex.
What you’ll end up with:
- Raspberry Pi 5 configured as a headless server
- Docker + Compose for containerized services
- Tailscale for secure remote access (no port forwarding needed)
- A running FastAPI app as your first deployed service
- Ollama as a bonus for local LLM inference
What this guide assumes: Your Pi 5 is already running Raspberry Pi OS Lite (64-bit), you’ve connected via SSH at least once, and you have internet access. The Raspberry Pi Imager setup is well-documented in the official docs — this guide picks up right after first login.
1. Post-Boot Configuration
First, make sure everything is up to date and the basics are configured.
sudo apt update && sudo apt full-upgrade -y
sudo reboot
After reboot, reconnect and run raspi-config for the essentials:
sudo raspi-config
Things worth setting here:
- System Options → Hostname: give your Pi a name like pidev instead of raspberrypi
- Localisation Options → Timezone: set to your timezone
- Localisation Options → Locale: set to your locale (e.g. en_US.UTF-8)
- Performance Options → GPU Memory: drop to 16MB since we’re headless (frees more RAM for services), if your raspi-config version offers this option
If you prefer non-interactive commands:
sudo raspi-config nonint do_hostname "pidev"
sudo raspi-config nonint do_change_timezone "America/Sao_Paulo"
sudo raspi-config nonint do_memory_split 16
SSH Hardening
If you haven’t already, switch to key-based auth and disable password login. On your workstation:
# Generate a key if you don't have one
ssh-keygen -t ed25519 -C "your-email"
# Copy to the Pi
ssh-copy-id pi@192.168.1.x
Then on the Pi, edit /etc/ssh/sshd_config:
PasswordAuthentication no
PubkeyAuthentication yes
PermitRootLogin no
sudo systemctl restart ssh
Keep your current SSH session open while you test the new connection from a second terminal. This way you can’t lock yourself out.
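Before restarting ssh, it can help to double-check that the three settings are actually in effect and not commented out or shadowed by an earlier line. The sketch below is only an illustration: the function names are made up for this guide, and it assumes a plain config with no Include files or Match blocks. For the authoritative answer, sudo sshd -T prints the configuration sshd will really use.

```python
# Hypothetical helper: checks the global section of an sshd_config text.
# Assumption: no Include directives; parsing stops at the first Match block.

EXPECTED = {
    "passwordauthentication": "no",
    "pubkeyauthentication": "yes",
    "permitrootlogin": "no",
}


def effective_sshd_options(config_text: str) -> dict[str, str]:
    """Return the first value seen for each keyword (keys and values lowercased).

    sshd uses the first occurrence of a keyword, so later duplicates are ignored.
    """
    options: dict[str, str] = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out settings
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, value = parts[0].lower(), parts[1].strip().lower()
        if keyword == "match":
            break  # global settings end where the first Match block starts
        options.setdefault(keyword, value)
    return options


def hardening_problems(config_text: str) -> list[str]:
    """List expected settings that are missing or set to a different value."""
    found = effective_sshd_options(config_text)
    return [
        f"{key}: expected {want!r}, found {found.get(key)!r}"
        for key, want in EXPECTED.items()
        if found.get(key) != want
    ]
```

Calling hardening_problems(open("/etc/ssh/sshd_config").read()) on the Pi should return an empty list once the edit is in place.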
Static IP (optional but convenient)
Raspberry Pi OS Bookworm and later use NetworkManager — the old dhcpcd.conf approach no longer works. If you want a fixed IP:
# List connections to find the right name
nmcli connection show
# Set static IP (adjust interface name, IP, gateway, DNS)
sudo nmcli connection modify "Wired connection 1" \
ipv4.method manual \
ipv4.addresses "192.168.1.100/24" \
ipv4.gateway "192.168.1.1" \
ipv4.dns "1.1.1.1,8.8.8.8"
sudo nmcli connection down "Wired connection 1" && \
sudo nmcli connection up "Wired connection 1"
Alternatively, sudo nmtui gives you a friendlier TUI. Or just set a DHCP reservation on your router — often the simpler path.
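A typo in the address or gateway is the usual way to knock a headless Pi off the network, so a quick sanity check before applying the nmcli change is cheap insurance. This is a minimal sketch using Python's standard ipaddress module; check_static_config is a name invented for this example.

```python
import ipaddress


def check_static_config(address_cidr: str, gateway: str) -> list[str]:
    """Return a list of problems with a static IPv4 setup (empty means it looks sane)."""
    iface = ipaddress.ip_interface(address_cidr)  # e.g. "192.168.1.100/24"
    gw = ipaddress.ip_address(gateway)
    network = iface.network
    problems = []
    if gw not in network:
        problems.append(f"gateway {gw} is outside {network}")
    if iface.ip == gw:
        problems.append("address and gateway are identical")
    if iface.ip in (network.network_address, network.broadcast_address):
        problems.append(f"{iface.ip} is the network or broadcast address of {network}")
    return problems
```

With the values from the example above, check_static_config("192.168.1.100/24", "192.168.1.1") returns an empty list.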
2. Tailscale — Access From Anywhere
Tailscale creates an encrypted mesh network between your devices. Install it on the Pi and on your laptop, and you can SSH into the Pi from anywhere in the world — no port forwarding, no VPN configuration, no dynamic DNS.
curl -fsSL https://pkgs.tailscale.com/stable/debian/bookworm.noarmor.gpg \
| sudo tee /usr/share/keyrings/tailscale-archive-keyring.gpg >/dev/null
curl -fsSL https://pkgs.tailscale.com/stable/debian/bookworm.tailscale-keyring.list \
| sudo tee /etc/apt/sources.list.d/tailscale.list
sudo apt update && sudo apt install -y tailscale
sudo tailscale up
Follow the authentication URL it prints. After authenticating, get your Pi’s Tailscale IP:
tailscale ip -4
Critical step: Go to the Tailscale admin console, find your Pi, and click Disable key expiry. Otherwise the node key expires after 180 days by default, and you lose remote access until you can re-authenticate the Pi over the local network.
Optional: enable Tailscale SSH (manages SSH access through Tailscale ACLs):
sudo tailscale up --ssh
From now on, you can ssh pi@100.x.x.x from anywhere.
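Tailscale hands out IPv4 addresses from the 100.64.0.0/10 range, which is why the address looks like 100.x.x.x. If your scripts need to tell a Tailscale address apart from a LAN address (for example, to pick which one to print in a status message), a small check like this is enough. The function name is hypothetical.

```python
import ipaddress

# The shared address space Tailscale assigns IPv4 addresses from.
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")


def looks_like_tailscale_ip(addr: str) -> bool:
    """True if addr parses as an IP address inside the Tailscale IPv4 range."""
    try:
        return ipaddress.ip_address(addr) in TAILSCALE_RANGE
    except ValueError:
        return False  # not an IP address at all
```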
3. Docker — The Foundation
Do not install Docker via sudo apt install docker.io: the Debian package usually lags behind upstream and is not maintained by Docker. Use the official Docker repository:
sudo apt install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg \
-o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
sudo tee /etc/apt/sources.list.d/docker.sources <<EOF
Types: deb
URIs: https://download.docker.com/linux/debian
Suites: $(. /etc/os-release && echo "$VERSION_CODENAME")
Components: stable
Architectures: $(dpkg --print-architecture)
Signed-By: /etc/apt/keyrings/docker.asc
EOF
sudo apt update
sudo apt install -y \
docker-ce \
docker-ce-cli \
containerd.io \
docker-buildx-plugin \
docker-compose-plugin
# Run Docker without sudo
sudo usermod -aG docker $USER
newgrp docker
Verify it works:
docker run hello-world
docker compose version
The $(dpkg --print-architecture) substitution returns arm64 on the Pi 5, which is correct. Old Pi 4 guides sometimes hardcoded armhf, which is wrong for a 64-bit OS.
A note on Trixie
If your Pi is running Debian Trixie, VERSION_CODENAME will be trixie, and the command above handles this automatically. If you run into 404 errors after adding a repository, it may not publish a trixie suite yet (some third-party repos don’t). Check apt-cache policy docker-ce to see which source the package would come from; substituting bookworm in the Suites: line is a common workaround, since the packages are generally compatible.
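To make the fallback logic concrete, here is a rough sketch of what the shell heredoc does, with the bookworm substitution added. The function names are invented for this example, and available_suites is a placeholder you would fill in yourself after checking which suites the repository actually publishes.

```python
DOCKER_SOURCES_TEMPLATE = """\
Types: deb
URIs: https://download.docker.com/linux/debian
Suites: {suite}
Components: stable
Architectures: {arch}
Signed-By: /etc/apt/keyrings/docker.asc
"""


def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines in the style of /etc/os-release, stripping quotes."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key] = value.strip().strip('"')
    return values


def render_docker_sources(codename: str, arch: str,
                          available_suites: set[str],
                          fallback: str = "bookworm") -> str:
    """Use the OS codename if the repo publishes it, else fall back (assumed compatible)."""
    suite = codename if codename in available_suites else fallback
    return DOCKER_SOURCES_TEMPLATE.format(suite=suite, arch=arch)
```

On the Pi you would pass parse_os_release(open("/etc/os-release").read())["VERSION_CODENAME"] as the codename and arm64 as the architecture.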
4. Your First App — End to End
Let’s deploy a real service. We’ll build a tiny FastAPI app that returns a JSON response, containerize it, and run it with Docker Compose. This covers the full workflow you’ll repeat for every future service.
The app
Create a project directory:
mkdir -p ~/apps/hello-api && cd ~/apps/hello-api
Create the app file:
# main.py
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def root():
    return {"message": "Hello from the Pi!", "status": "running"}

@app.get("/health")
def health():
    return {"status": "ok"}
Create the Dockerfile:
FROM python:3.12-slim
WORKDIR /app
RUN pip install fastapi uvicorn
COPY main.py .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Create compose.yml:
services:
  hello-api:
    build: .
    ports:
      - "8000:8000"
    restart: always
Deploy it
docker compose up -d
The first run builds the image. After that:
# Check it's running
docker compose ps
# Test it
curl http://localhost:8000/
# Follow logs
docker compose logs -f
You should see:
{"message":"Hello from the Pi!","status":"running"}
Now test it from your laptop (using the Tailscale IP):
curl http://100.x.x.x:8000/
That’s it — a real service, running on your Pi, accessible from anywhere via Tailscale. The restart: always policy means it survives reboots automatically.
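Since the app exposes /health, you can script the "is it up yet?" check instead of running curl by hand. The following is a minimal sketch using only the standard library; wait_until_healthy and fetch_json are names made up for this guide, and the retry count and delay are arbitrary defaults.

```python
import json
import time
import urllib.request


def fetch_json(url: str, timeout: float = 3.0) -> dict:
    """GET a URL and decode the body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 1.0,
                       fetch=fetch_json) -> bool:
    """Poll url until it returns {"status": "ok"} or the attempts run out."""
    for _ in range(attempts):
        try:
            if fetch(url).get("status") == "ok":
                return True
        except (OSError, ValueError):
            # OSError covers connection errors (URLError is a subclass);
            # ValueError covers a body that is not valid JSON.
            pass
        time.sleep(delay)
    return False
```

On the Pi, wait_until_healthy("http://localhost:8000/health") should return True a few seconds after docker compose up -d. The fetch parameter exists only so the polling logic can be exercised without a network.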
Useful Compose commands
docker compose up -d # start services in background
docker compose down # stop and remove containers
docker compose restart # restart services
docker compose logs -f # follow logs
docker compose ps # show running services
docker compose exec hello-api bash # get a shell inside the container
5. Structuring Multiple Services
Once you have more than one service, a clean directory structure pays off. Here’s what works well:
~/apps/
├── hello-api/
│   ├── compose.yml
│   ├── Dockerfile
│   └── main.py
├── mcp-server/
│   ├── compose.yml
│   └── ...
└── monitoring/
    ├── compose.yml
    └── ...
Each service lives in its own directory with its own compose.yml. This keeps them independent — you can update, restart, or tear down one without touching the others.
For services that share a network (e.g., an API that talks to a database), you can either put them in the same compose.yml or use Docker’s external networks:
# In each compose.yml that needs to communicate
services:
  my-service:
    networks:
      - shared
networks:
  shared:
    external: true
Create the network once: docker network create shared.
Environment variables
Never hardcode secrets in compose.yml. Use a .env file:
# .env
API_KEY=your-key-here
DATABASE_URL=postgresql://user:pass@db:5432/mydb
# compose.yml
services:
  my-service:
    env_file: .env
Add .env to .gitignore — it should never be committed.
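Inside the container, the values from env_file show up as ordinary environment variables. One way to consume them is to read everything once at startup and fail loudly if something is missing, so a forgotten .env entry surfaces immediately instead of at the first request. This is a sketch; the Settings class and load_settings function are illustrative names, and the two variables are the ones from the .env example above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    database_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, raising if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in ("API_KEY", "DATABASE_URL") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(api_key=env["API_KEY"], database_url=env["DATABASE_URL"])
```

Calling load_settings() with no arguments reads from the real environment; passing a dict is handy for tests.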
6. SD Card Survival Tips
Running a 24/7 server on an SD card is risky — constant small writes from logs, Docker, and temp files will degrade it over months. Three mitigations:
log2ram — moves /var/log to RAM and syncs to disk daily:
# Add the azlux repo
. /etc/os-release
sudo wget -O /usr/share/keyrings/azlux-archive-keyring.gpg \
https://azlux.fr/repo.gpg
sudo tee /etc/apt/sources.list.d/azlux.list <<EOF
deb [signed-by=/usr/share/keyrings/azlux-archive-keyring.gpg] \
http://packages.azlux.fr/debian/ $VERSION_CODENAME main
EOF
sudo apt update && sudo apt install -y log2ram rsync
Edit /etc/log2ram.conf and raise SIZE to 128M (the default 40M is too small for Docker), then reboot.
Journald size limit — caps systemd journal writes:
sudo mkdir -p /etc/systemd/journald.conf.d/
sudo tee /etc/systemd/journald.conf.d/size.conf <<'EOF'
[Journal]
SystemMaxUse=50M
RuntimeMaxUse=50M
EOF
sudo systemctl restart systemd-journald
Move Docker data to USB SSD — Docker image layers are the biggest SD card killer. If you have a USB 3.0 SSD, move Docker’s data root there:
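# Assumes the SSD is already mounted at /mnt/ssd
# Stop the socket too, otherwise it can start the daemon again mid-move
sudo systemctl stop docker docker.socket
sudo mv /var/lib/docker /mnt/ssd/docker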
sudo tee /etc/docker/daemon.json <<'EOF'
{"data-root": "/mnt/ssd/docker"}
EOF
sudo systemctl start docker
The Pi 5 also supports NVMe via the M.2 HAT+ — a more permanent upgrade if you’re planning to run heavier workloads.
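One caveat with the tee command above: it overwrites /etc/docker/daemon.json, which is fine on a fresh install but would drop any settings you added earlier. If you already have a daemon.json, merging the key in is safer. A minimal sketch, with with_data_root as a made-up helper name:

```python
import json


def with_data_root(existing_text: str, data_root: str) -> str:
    """Return daemon.json content with data-root set, keeping other keys intact."""
    config = json.loads(existing_text) if existing_text.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("daemon.json must contain a JSON object")
    config["data-root"] = data_root
    return json.dumps(config, indent=2) + "\n"
```

For example, with_data_root('{"log-driver": "local"}', "/mnt/ssd/docker") keeps the log-driver entry and adds the new data root next to it.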
7. Ollama (Bonus) — Local LLM Inference
The Pi 5 is not a GPU server, but with 8GB RAM it can run small-to-medium models. Don’t expect instant responses — think of it as a slow-but-free, always-on LLM endpoint for personal automations.
curl -fsSL https://ollama.com/install.sh | sh
The installer detects ARM64 and sets up a systemd service automatically.
Which models are realistic
With 8GB shared RAM and the OS overhead:
| Model | RAM usage | Speed on Pi 5 |
|---|---|---|
| gemma3:1b | ~1GB | 15–25 tok/s (fastest) |
| llama3.2:1b | ~1.3GB | 10–20 tok/s |
| llama3.2:3b | ~2GB | 8–15 tok/s (good balance) |
| phi3:mini (3.8B) | ~2.3GB | 5–10 tok/s |
| llama3.1:8b | ~4.7GB | 2–5 tok/s (slow but works) |
| 13B+ | n/a | OOM or unusable |
For interactive use and automations, the 1B–3B models hit the sweet spot. 7B/8B works for batch tasks where latency doesn’t matter.
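If the Pi is also running your other containers, the real question is which model fits in whatever RAM is left. The sketch below turns the table into a lookup; the RAM figures are the approximate ones from the table, and the 1GB headroom for the OS and context buffers is an assumption of this example, not a measured value.

```python
# Approximate RAM needs in GB, copied from the table above.
MODEL_RAM_GB = {
    "gemma3:1b": 1.0,
    "llama3.2:1b": 1.3,
    "llama3.2:3b": 2.0,
    "phi3:mini": 2.3,
    "llama3.1:8b": 4.7,
}


def models_that_fit(free_ram_gb: float, headroom_gb: float = 1.0) -> list[str]:
    """Models whose approximate RAM need fits in free RAM minus headroom (assumed 1GB)."""
    budget = free_ram_gb - headroom_gb
    return [name for name, need in MODEL_RAM_GB.items() if need <= budget]


def largest_model_that_fits(free_ram_gb: float, headroom_gb: float = 1.0):
    """The most RAM-hungry model that still fits, or None if nothing does."""
    fitting = models_that_fit(free_ram_gb, headroom_gb)
    return max(fitting, key=MODEL_RAM_GB.get) if fitting else None
```

With about 6GB free, this picks llama3.1:8b; with 2.5GB free, only the two 1B models qualify.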
# Pull and run a model
ollama pull llama3.2:3b
ollama run llama3.2:3b
# Check what's loaded and RAM usage
ollama ps
# The API is at localhost:11434 — OpenAI-compatible via /v1/
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.2:3b",
"messages": [{"role": "user", "content": "Hello!"}]
}'
The OpenAI-compatible endpoint means you can point most LLM tooling that accepts a custom OpenAI-style base URL (LangChain, LlamaIndex, and similar) at your Pi with minimal config changes.
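The same request as the curl call above can be made from Python with nothing but the standard library, which is convenient inside a slim container. This is a sketch under the same assumptions as the curl example (Ollama on localhost:11434, llama3.2:3b already pulled); build_chat_request and chat are names invented here, and the generous timeout is a guess to accommodate slow generation on the Pi.

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434/v1"


def build_chat_request(prompt: str, model: str = "llama3.2:3b",
                       base_url: str = OLLAMA_BASE_URL) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style chat completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, model: str = "llama3.2:3b",
         base_url: str = OLLAMA_BASE_URL, timeout: float = 120.0) -> str:
    """Send the prompt and return the text of the first choice."""
    request = build_chat_request(prompt, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

From another machine on your tailnet, pass base_url="http://100.x.x.x:11434/v1" instead, provided Ollama is configured to listen on more than localhost.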
Active cooling is not optional. Running a 7B model will push your Pi to 80°C within minutes without a fan. The official active cooler or a well-ventilated case with a fan is mandatory for sustained inference.
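To see how close you are to that 80°C mark during a long generation, you can read the SoC temperature straight from sysfs. The sketch assumes the sensor is exposed at /sys/class/thermal/thermal_zone0/temp and reports millidegrees Celsius, which is the usual layout on Raspberry Pi OS; the function names are made up for this example.

```python
from pathlib import Path

# Assumed sensor location; the file holds an integer in millidegrees Celsius.
THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw: str) -> float:
    """Convert a sysfs reading such as '61000\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp_c(path: Path = THERMAL_ZONE) -> float:
    """Read the current temperature in degrees Celsius."""
    return parse_millidegrees(path.read_text())


def is_running_hot(temp_c: float, limit_c: float = 80.0) -> bool:
    """True once the temperature reaches the limit (80°C, the figure used above)."""
    return temp_c >= limit_c
```

Running is_running_hot(read_cpu_temp_c()) in a loop while a model generates gives a quick feel for whether your cooling keeps up.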
What’s Next
You now have the foundation: Docker, Tailscale, a running service, and optionally a local LLM. From here, natural next steps are:
- Portainer or Dockge — web UI for managing your Docker services
- Uptime Kuma — lightweight monitoring that alerts you when a service goes down
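- Caddy: reverse proxy to serve multiple services under clean URLs (it can obtain HTTPS certificates for your tailnet hostname through Tailscale; Tailscale Funnel is only needed if you want a service reachable from the public internet)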
- Actual AI workloads — MCP servers, AI agents, custom automations running 24/7
The pattern is always the same: new directory under ~/apps/, write a compose.yml, docker compose up -d. The Pi handles the rest.