Docker DevOps Agent Rules
Project Context
You are working on containerized applications using Docker for image builds, multi-stage compilation, and local development orchestration with Docker Compose.
Dockerfile Best Practices
- Use multi-stage builds: separate a `builder` stage with SDK/compiler from a lean `runtime` stage with only the binary or compiled output.
- Choose base images deliberately: `debian:bookworm-slim` for glibc compatibility, `alpine:3.20` for minimal footprint, `gcr.io/distroless/static` for Go binaries with zero shell.
- Pin base images to a specific digest: `FROM node:20.11.1-alpine3.19@sha256:<digest>` — never use `latest` or floating minor tags in production Dockerfiles.
- Order layers from least to most volatile: OS deps → language runtime → dependency manifests → `npm install` or `pip install` → source copy → build step.
- Copy only manifests before installing: `COPY package.json pnpm-lock.yaml ./` then `RUN pnpm install --frozen-lockfile` before copying source to maximize cache reuse.
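- Consolidate related `RUN` commands so cleanup happens in the same layer that created the files: `RUN apt-get update && apt-get install -y --no-install-recommends curl=<version> && rm -rf /var/lib/apt/lists/*`. When pinning, use the exact version string published by the distribution.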
- Create a non-root user in the Dockerfile and switch to it before the final `CMD`: `RUN addgroup --system app && adduser --system --ingroup app app` then `USER app`. On Alpine (BusyBox) the equivalent is `RUN addgroup -S app && adduser -S -G app app`.
- Define a container health check: `HEALTHCHECK --interval=30s --timeout=10s --start-period=15s --retries=3 CMD curl -f http://localhost:8080/health || exit 1`. The probe command (`curl` here) must exist in the final image.
- Use `.dockerignore` to exclude `node_modules/`, `.git/`, `*.md`, `test/`, `coverage/`, `.env*` from build context.
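- Never bake secrets into image layers via `ARG` or `ENV`, and never write a mounted secret to a file inside the image. Mount it only for the step that needs it: `RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci`, built with `docker build --secret id=npmrc,src=$HOME/.npmrc .`.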
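
The rules above can be combined into one Dockerfile. The following is a minimal sketch, not a definitive layout: it assumes an npm-based Node.js app with a `package-lock.json`, a hypothetical entry point `server.js`, and a `/health` endpoint on port 8080. The tag is left without a digest on purpose; append the `@sha256:<digest>` of the image you have verified.

```dockerfile
# syntax=docker/dockerfile:1
# Illustrative tag only; pin with @sha256:<digest> in production.
FROM node:20-bookworm-slim

# OS deps first (least volatile). Cleanup runs in the same layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*

# Dedicated non-root user and group (names are assumptions).
RUN groupadd --system app \
 && useradd --system --gid app --create-home app

WORKDIR /app

# Manifests only, so the install layer is reused until dependencies change.
COPY package.json package-lock.json ./

# The secret is mounted for this step only and is not stored in any layer.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci --omit=dev

# Source comes last because it changes most often.
COPY --chown=app:app . .

USER app
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=10s --start-period=15s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1
CMD ["node", "server.js"]
```

Build it with `docker build --secret id=npmrc,src=$HOME/.npmrc -t myimage .`; the image name is a placeholder.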
Multi-Stage Build Patterns
- For Node.js: stage 1 installs all deps and compiles TypeScript; stage 2 installs production deps only with `npm ci --omit=dev` and copies `dist/` from stage 1.
- For Go: use `golang:1.22-alpine` as builder, build with `CGO_ENABLED=0` so the binary is statically linked, then copy it to `scratch` or `gcr.io/distroless/static-debian12`.
- For Python: use a `python:3.12-slim` builder to compile wheels, copy installed packages to a clean runtime stage.
- Name stages explicitly: `FROM node:20-alpine AS deps`, `FROM node:20-alpine AS builder`, `FROM node:20-alpine AS runner`.
- Use `COPY --from=builder /app/dist ./dist` to bring only build artifacts into the final stage.
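
As a sketch of the Node.js pattern with the three named stages, assuming the project has a `build` script that emits `dist/` and a hypothetical entry point `dist/index.js`:

```dockerfile
# syntax=docker/dockerfile:1
# Stage 1: full dependency install (dev deps included, needed for the compiler).
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Stage 2: compile TypeScript using the dependencies from the deps stage.
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
# Assumes package.json defines a "build" script that writes to dist/.
RUN npm run build

# Stage 3: production deps plus compiled output only.
FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
# The official Node images ship a non-root "node" user.
USER node
CMD ["node", "dist/index.js"]
```

The compiler, dev dependencies, and source tree stay in the earlier stages, so none of them reach the final image.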
Docker Compose
- Use `docker-compose.yml` for base configuration and `docker-compose.override.yml` for local dev overrides (hot-reload mounts, debug ports) — never commit the override file.
- Define a health check on every service that other services depend on, for example `healthcheck: test: ["CMD", "pg_isready", "-U", "postgres"]`, before using `condition: service_healthy` in `depends_on`.
- Use named volumes for persistence: `volumes: postgres_data:` at top level, then `volumes: - postgres_data:/var/lib/postgresql/data`.
- Scope services to custom bridge networks: `networks: backend:` and `networks: frontend:` — only the API joins both.
- Use `env_file: .env.local` for secrets; add `.env.local` to `.gitignore`. Never commit credentials in compose files.
- Set `restart: unless-stopped` for long-running services; avoid `restart: always` in dev to allow intentional stops.
- Use `profiles: ["tools"]` for optional services like adminer, mailhog, or pgAdmin to keep default `up` lean.
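
A compose file that follows these rules might look like the sketch below. Service names, image tags, and the health check timings are assumptions for illustration, and `.env.local` is expected to supply values such as `POSTGRES_PASSWORD`.

```yaml
services:
  api:
    build: .
    env_file: .env.local
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy   # waits for the db health check to pass
    networks: [frontend, backend]    # only the API joins both networks

  db:
    image: postgres:16
    env_file: .env.local
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    volumes:
      - postgres_data:/var/lib/postgresql/data
    networks: [backend]

  adminer:
    image: adminer
    profiles: ["tools"]              # skipped by a plain `docker compose up`
    networks: [backend]

volumes:
  postgres_data:

networks:
  frontend:
  backend:
```

`docker compose up` starts only `api` and `db`; `docker compose --profile tools up` also starts `adminer`.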
Layer Caching & Build Performance
- Use BuildKit: it enables parallel stage builds, cache mounts, and secret mounts. It is the default builder in Docker Engine 23.0 and later; set `DOCKER_BUILDKIT=1` on older engines.
- Use `RUN --mount=type=cache,target=/root/.npm npm ci` to persist npm cache across builds without baking it into the layer.
- In CI, pass `--cache-from type=registry,ref=myrepo/app:buildcache` and `--cache-to type=registry,ref=myrepo/app:buildcache,mode=max` to `docker buildx build` so cache layers are shared between runners.
- Avoid `COPY . .` as an early layer — it invalidates cache on any source change including test files.
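
A minimal sketch of the cache mount in context, assuming an npm project with a lockfile:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine AS deps
WORKDIR /app

# Manifests first: this layer is rebuilt only when dependencies change.
COPY package.json package-lock.json ./

# /root/.npm is kept in a BuildKit cache between builds and is not part
# of the resulting image layer.
RUN --mount=type=cache,target=/root/.npm npm ci

# Source is copied after the install so code edits do not invalidate it.
COPY . .
```

Even when the lockfile changes and the install layer is rebuilt, packages already in the cache mount do not need to be downloaded again.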
Security
- Run `docker scout cves` or `trivy image` against every built image in CI; fail the build on `CRITICAL` severity.
- Set `read_only: true` in compose and `securityContext.readOnlyRootFilesystem: true` on containers in Kubernetes pod specs; mount `/tmp` as tmpfs if the app needs a writable temp dir.
- Drop all capabilities with `cap_drop: [ALL]` and add back only required ones like `NET_BIND_SERVICE`.
- Never expose the Docker socket (`/var/run/docker.sock`) into containers in production environments.
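
In compose, the filesystem and capability rules above could be expressed roughly as follows. The service name and image tag are placeholders.

```yaml
services:
  api:
    image: myimage:sha
    read_only: true          # root filesystem is mounted read-only
    tmpfs:
      - /tmp                 # writable scratch space, held in memory
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE     # only needed to bind ports below 1024
```

If the app listens on an unprivileged port such as 8080, the `cap_add` entry can be dropped as well.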
Error Handling & Debugging
- Use `ENTRYPOINT ["dumb-init", "--"]` with `CMD ["/app/server"]` to properly forward signals and avoid zombie processes.
- Set `init: true` on compose services (or pass `--init` to `docker run`) when the image does not ship its own init process.
- Use `docker build --progress=plain` to see full build output without collapsing steps.
- Check container logs with `docker logs --tail 100 --follow <container>`, and have the app emit structured JSON logs so the output can be filtered and parsed.
- Use `docker exec -it <container> sh` for interactive debugging — prefer adding a `busybox` debug stage rather than tools in production images.
- Use `docker system df` to inspect disk usage by images, containers, local volumes, and build cache during cleanup; add `-v` for a per-item breakdown.
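
The `dumb-init` rule can be sketched like this, assuming a Debian-based runtime image and a prebuilt binary named `server` in the build context (both are illustrative choices):

```dockerfile
FROM debian:bookworm-slim

# dumb-init is installed from the distribution package here.
RUN apt-get update \
 && apt-get install -y --no-install-recommends dumb-init \
 && rm -rf /var/lib/apt/lists/*

# Hypothetical prebuilt binary from the build context.
COPY server /app/server

# dumb-init runs as PID 1, forwards signals to the child, and reaps zombies.
ENTRYPOINT ["dumb-init", "--"]
CMD ["/app/server"]
```

The exec (JSON array) form matters: with the shell form, `/bin/sh -c` sits between the init process and the app, and the app may not receive `SIGTERM`.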
Testing & Validation
- Lint Dockerfiles with `hadolint Dockerfile` in CI — it catches anti-patterns like `latest` tags, `apt-get` without pinning, and `ADD` misuse.
- Validate Docker Compose files with `docker compose config` before deploying — it catches syntax errors and missing env substitutions.
- Test image startup behavior with `docker run --rm --env-file .env.test myimage:sha` before pushing to a registry.
- Scan images for vulnerabilities with `trivy image --exit-code 1 --severity CRITICAL myimage:sha` and fail CI on critical findings.
- Test health checks locally: start the container, poll `docker inspect --format='{{.State.Health.Status}}' <container>` until it reports `healthy`, then run assertions against the running container.