Docker Complete Guide: Containers, Images, Multi-Stage Builds & Security

📅 Updated April 2026 · ⏱ 12 min read · 🏷 Docker · Containers · DevOps · CI/CD

master.devops

Practising DevOps Engineer with deep hands-on experience in Kubernetes, AWS, CI/CD, and SRE. Every guide is written from real production work.

Docker is the foundation of modern DevOps. Most CI/CD pipelines build Docker images, and Kubernetes workloads run containers created from those images (usually through containerd or CRI-O rather than Docker Engine itself). In real enterprise environments, teams build and ship images daily, which means managing multi-stage Dockerfiles, enforcing security hardening policies, and debugging container networking issues in production. This guide is everything I wish I had known when I started.

What is Docker? The Core Concept

Docker packages an application and all its dependencies into a portable, self-contained unit called a container. Containers largely solve the "works on my machine" problem: the same image runs the same way on a developer's MacBook, a CI runner, a staging server, and a production Kubernetes cluster, as long as the image is built for each machine's CPU architecture.

Under the hood, Docker uses two Linux kernel features: namespaces (for process, network, and filesystem isolation between containers) and cgroups (to limit CPU and memory usage). Docker itself is not magic — it is a user-friendly wrapper around these kernel primitives. Understanding this is important for security discussions in interviews.
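To make the two kernel primitives visible, here is a minimal sketch. The function names and the alpine:3.20 image are illustrative choices, not part of the guide, and the cgroup file path assumes a host running cgroup v2.

```shell
#!/bin/sh
# Sketch only: the alpine tag and function names are illustrative.

# PID namespace: the process list inside the container starts again at PID 1
# and does not include any host processes.
show_container_processes() {
  docker run --rm alpine:3.20 ps
}

# cgroups: the memory limit passed to Docker shows up in the container's cgroup.
# The path below assumes a cgroup v2 host.
show_memory_limit() {
  limit="$1"
  docker run --rm --memory "$limit" alpine:3.20 cat /sys/fs/cgroup/memory.max
}
```

Calling show_memory_limit 256m should print the limit in bytes as the kernel sees it. On a cgroup v1 host the file lives under a different path, so treat the path as an assumption.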

Container vs Virtual Machine: A VM virtualises the entire hardware stack and runs a full OS kernel. A container shares the host OS kernel and isolates only the process space. As a result, containers typically start in well under a second, carry megabytes rather than gigabytes of overhead, and allow far more workloads per machine than VMs.

Docker Image Layers — How They Work

A Docker image is built from a stack of read-only layers. Instructions that change the filesystem (RUN, COPY, ADD) each create a new layer, while instructions such as ENV, EXPOSE, and CMD only record metadata. When you run a container, Docker adds a thin writable layer on top. All containers from the same image share the read-only layers, which is how Docker achieves storage efficiency.

Layer caching is what makes Docker builds fast. If a layer's instruction and its inputs have not changed, Docker reuses the cached layer. This has a critical implication for Dockerfile ordering: put instructions that change rarely (installing OS packages) early, and instructions that change frequently (copying your application code) late.

# BAD: cache-busting order. The code copy invalidates the package install cache.
# (Dockerfile comments must sit on their own line, not after an instruction.)
FROM node:20-alpine
WORKDIR /app
# ❌ any code change invalidates every layer below
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "server.js"]

# GOOD: stable layers first, frequently-changing last
FROM node:20-alpine
WORKDIR /app
# ✅ this layer only changes when dependencies change
COPY package*.json ./
# ✅ re-runs only when package.json or package-lock.json changed
RUN npm ci --omit=dev
# ✅ code changes invalidate only the layers from here down
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]

Multi-Stage Builds — Production Best Practice

Multi-stage builds are the most important Dockerfile pattern for production. They allow you to use a large build-time image (with compilers, build tools, test frameworks) while producing a tiny runtime image (just the application binary and its runtime dependencies). The final image contains none of the build tools — reducing attack surface and image size dramatically.

# Multi-stage build for a Java Spring Boot application

# Stage 1: Build
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /app
COPY pom.xml .
# Cache dependencies as a separate layer
RUN mvn dependency:go-offline -q
COPY src ./src
RUN mvn clean package -DskipTests -q

# Stage 2: Runtime (only copies the compiled JAR)
# The JRE-only Alpine image is much smaller than the full JDK image
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app

# NEVER run as root in production
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

# Assumes the Maven build produces target/api.jar
COPY --from=builder /app/target/api.jar app.jar

EXPOSE 8080
# Exec form, so SIGTERM reaches the JVM
ENTRYPOINT ["java", "-jar", "app.jar"]

CMD vs ENTRYPOINT, the most common interview question: ENTRYPOINT defines the executable. CMD provides default arguments. Always use exec form (["java", "-jar", "app.jar"]), not shell form (java -jar app.jar). Shell form wraps your process in /bin/sh -c, so the shell runs as PID 1 and your app becomes its child. When Kubernetes sends SIGTERM for graceful shutdown, the signal goes to the shell, which does not forward it. Your app never receives it, so the pod lingers for the full terminationGracePeriodSeconds and is then killed with SIGKILL. This is a very common source of slow Kubernetes rolling updates.
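The parent/child relationship behind this can be observed without Docker. The sketch below uses two hypothetical helper functions: each prints the PID of a wrapper shell and then the PID of the command that wrapper launches. It illustrates the process model only and is not how Docker itself is implemented.

```shell
#!/bin/sh
# Sketch only: function names are illustrative.

# Without exec: the launched command is a child of the wrapper shell,
# so the two PIDs differ. The trailing ":" keeps shells that would
# otherwise turn the last command into an implicit exec from doing so.
without_exec() {
  sh -c 'echo "$$"; sh -c "echo \$\$"; :'
}

# With exec: the launched command replaces the wrapper shell and keeps
# its PID, so signals sent to that PID reach the command directly.
with_exec() {
  sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}
```

The same idea applies to entrypoint scripts: if a wrapper script is unavoidable, ending it with exec "$@" lets the application take over the script's PID and receive SIGTERM itself.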

Docker Networking Modes

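bridge (default) — Each container gets its own network namespace on a private bridge network. On the default bridge, containers can reach each other only by IP address. On a user-defined bridge network (docker network create), Docker's embedded DNS lets them communicate by container name. Port mapping (-p 8080:80) exposes container ports to the host.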
host — Container shares the host's network stack. No network isolation. Use only when you need maximum network performance (e.g., monitoring agents). Never for application containers.
none — No networking. Completely isolated. Use for batch jobs that process local files with no network requirement.
overlay — Multi-host networking for Docker Swarm. Containers on different hosts can communicate. Kubernetes uses its own networking model and does not use Docker overlay networks.
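Name-based communication on a user-defined bridge network might look like the sketch below. The network name passed in, the container name web, and the nginx and busybox image tags are placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch only: "web" and the image tags are placeholder names.

# Create a user-defined bridge network and start an nginx container on it.
start_web_on_network() {
  network="$1"
  docker network create "$network" &&
  docker run -d --name web --network "$network" nginx:1.27-alpine
}

# From a second container on the same network, fetch the page by container name.
fetch_web_by_name() {
  network="$1"
  docker run --rm --network "$network" busybox:1.36 wget -qO- http://web
}
```

Running start_web_on_network appnet followed by fetch_web_by_name appnet should return the nginx welcome page, resolved through Docker's embedded DNS rather than an IP address.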

Docker Volumes — Managing Persistent Data

Containers are ephemeral: all data written inside a container is lost when the container is removed. Mounting storage from outside the container solves this. Docker offers three mount types:

Named volumes (docker run -v pgdata:/var/lib/postgresql/data postgres) — Managed by Docker, stored in Docker's storage area. Preferred for production data. Persists across container restarts and removals.
Bind mounts (docker run -v /host/path:/container/path myapp) — Maps a specific host directory into the container. Use for development (live code reloading) or sharing config files. Avoid in production containers.
tmpfs mounts (docker run --tmpfs /tmp myapp) — In-memory only. Cleared when container stops. Use for sensitive data (secrets, tokens) that should not touch disk.
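A quick way to convince yourself that named volumes outlive containers is sketched below. The function names, the file path, and the alpine:3.20 image are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: function names and the alpine tag are illustrative.

# Write a file into a named volume from a throwaway container.
write_to_volume() {
  volume="$1"
  docker run --rm -v "$volume":/data alpine:3.20 sh -c 'echo persisted > /data/note.txt'
}

# Read the file back from a brand-new container that mounts the same volume.
read_from_volume() {
  volume="$1"
  docker run --rm -v "$volume":/data alpine:3.20 cat /data/note.txt
}
```

Both containers are removed on exit (--rm), yet the second one still finds the file, because the data lives in the volume and not in either container's writable layer.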

Container Security Hardening

Docker security comes up in almost every senior DevOps interview. Here is the checklist I use in production container security reviews:

Never run as root — Add USER 1001 to your Dockerfile. Verify with docker inspect --format='{{.Config.User}}' image:tag.
Use minimal base images — Alpine, distroless, or slim variants. Fewer packages = smaller attack surface. Distroless images have no shell or package manager, which makes it much harder for an attacker to do anything useful inside a compromised container.
Read-only root filesystem — Run with --read-only flag. Mount specific writable paths as volumes. Prevents malware from persisting to the container filesystem.
Drop capabilities — Run with --cap-drop ALL and add back only what is needed. Most applications need zero Linux capabilities.
Scan images with Trivy — trivy image --exit-code 1 --severity HIGH,CRITICAL myimage:tag. Run this in every CI pipeline. Exit code 1 fails the build on findings.
Pin image versions — FROM node:20.11-alpine3.18 not FROM node:latest. Unpinned images introduce non-deterministic builds and can pull in breaking changes or vulnerabilities.
Use .dockerignore — Exclude node_modules, .git, *.env, and build artifacts from the build context. Speeds up builds and prevents sensitive files from entering images accidentally.

# Trivy scan in CI — fail pipeline on HIGH/CRITICAL CVEs
trivy image \
  --exit-code 1 \
  --severity HIGH,CRITICAL \
  --no-progress \
  myapp:$BUILD_TAG

# Run container with security hardening flags
docker run \
  --read-only \
  --cap-drop ALL \
  --user 1001 \
  --memory 512m \
  --cpus 0.5 \
  --security-opt no-new-privileges \
  myapp:latest
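The non-root check from the list can also be turned into a CI gate. The following is a minimal sketch: assert_non_root and its messages are hypothetical, and it relies only on the docker inspect format string already shown above. An empty Config.User is treated as root because that is Docker's default when no USER is set.

```shell
#!/bin/sh
# Sketch only: the function name and messages are illustrative.

# Exit non-zero when an image is configured to run as root.
assert_non_root() {
  image="$1"
  user=$(docker inspect --format '{{.Config.User}}' "$image") || return 2
  # Strip an optional ":group" suffix, e.g. "1001:1001" -> "1001".
  user="${user%%:*}"
  case "$user" in
    ""|root|0)
      echo "FAIL: $image runs as root" >&2
      return 1
      ;;
    *)
      echo "OK: $image runs as user $user"
      ;;
  esac
}
```

In a pipeline this could run right after the build step, for example assert_non_root "myapp:$BUILD_TAG", so that an image without a USER instruction fails before it is pushed.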

Docker Compose for Local Development

Docker Compose orchestrates multi-container applications for local development. A single docker-compose.yml defines your API server, database, cache, and message broker. One command starts the entire stack: docker compose up -d.

# docker-compose.yml: local dev stack
# (the top-level "version" key is obsolete in Compose v2, so it is omitted)
services:
  api:
    build: .
    ports: ["8080:8080"]
    environment:
      DB_HOST: postgres
      REDIS_URL: redis://redis:6379
    depends_on:
      postgres:
        condition: service_healthy   # wait for DB to be ready
    volumes:
      - ./src:/app/src              # live code reload in dev

  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_PASSWORD: devpassword
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb

volumes:
  pgdata:

Interview Q&A

Q1: How do you reduce Docker image size?

Use multi-stage builds: build in a fat image and copy only the final artifact into a minimal runtime image. Use minimal base images (alpine, distroless, slim). Use .dockerignore to exclude unnecessary files from the build context. Combine RUN commands with && to reduce layer count, and remove build-time dependencies in the same RUN layer that installs them (apt-get install && make && apt-get purge), because files deleted in a later layer still occupy space in the earlier one. Inspect the image with dive to examine layer contents and find waste.
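Before reaching for extra tooling, the Docker CLI alone can show where the bytes go. This is a small sketch with hypothetical helper names, useful for comparing an image before and after a change such as moving to a multi-stage build.

```shell
#!/bin/sh
# Sketch only: function names are illustrative.

# Print the size and creating instruction of every layer, newest first.
layer_sizes() {
  docker history --format '{{.Size}}\t{{.CreatedBy}}' "$1"
}

# Print the total image size in whole megabytes (decimal, 1 MB = 1,000,000 bytes).
image_size_mb() {
  bytes=$(docker image inspect --format '{{.Size}}' "$1") || return 1
  echo "$((bytes / 1000000))"
}
```

A large layer in the layer_sizes output usually points at the instruction worth optimising first.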

Q2: What happens when you run docker run?

The Docker CLI sends the run request to the Docker daemon. The daemon checks whether the image exists locally and, if not, pulls it from the registry layer by layer, skipping layers that are already present. It then creates a container: it allocates a new network namespace, creates a writable layer on top of the image, attaches a virtual ethernet interface connected to the bridge network, and applies resource limits via cgroups. Finally it starts the process defined by ENTRYPOINT/CMD inside the container's namespaces and returns the container ID (which the CLI prints in detached mode).
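The same sequence can be approximated with separate CLI commands, which is a handy way to explain it in an interview. The function below is an illustrative sketch, not what the daemon literally executes: run_in_steps is a made-up name, and it mirrors docker run -d using pull, create, and start.

```shell
#!/bin/sh
# Sketch only: roughly what "docker run -d IMAGE" does, as separate CLI steps.
run_in_steps() {
  image="$1"
  # 1. Pull only if the image is not already present locally.
  docker image inspect "$image" >/dev/null 2>&1 || docker pull "$image" || return 1
  # 2. Create the container (writable layer, network settings) without starting it.
  container_id=$(docker create "$image") || return 1
  # 3. Start the ENTRYPOINT/CMD process inside the container's namespaces.
  docker start "$container_id" >/dev/null || return 1
  # 4. Hand the container ID back to the caller.
  echo "$container_id"
}
```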

Q3: What is a dangling image and how do you clean up Docker?

A dangling image has no tag (<none>:<none>). It is created when a new build takes an existing tag, leaving the old image untagged. Clean up with: docker image prune (dangling images only), docker system prune (stopped containers, dangling images, unused networks, and build cache), docker system prune -a (also removes every image not used by a container, so use it carefully). None of these remove volumes unless you add --volumes. In CI, always clean up after builds to prevent disk exhaustion on runners.
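For CI runners, the cleanup is usually scripted. The sketch below uses hypothetical function names and an assumed default age of 24h; the flags themselves are standard Docker CLI options.

```shell
#!/bin/sh
# Sketch only: function names and the 24h default are illustrative.

# List dangling images (untagged, shown as <none>:<none>).
list_dangling_images() {
  docker images --filter dangling=true --format '{{.ID}} {{.Repository}}:{{.Tag}}'
}

# Non-interactive cleanup for a CI runner: prune only objects older than
# the given age so that layers from recent builds stay cached.
ci_cleanup() {
  max_age="${1:-24h}"
  docker system prune --force --filter "until=$max_age"
}
```

Keeping an age filter is a trade-off: pruning everything frees the most disk, but it also throws away the layer cache that makes the next build fast.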

Q4: How do you share data between containers?

Named volumes: both containers mount the same named volume at their respective paths. Docker manages the storage. Bind mount: both containers mount the same host directory. Works but ties containers to a specific host path. Network: containers communicate over the network — one exposes an HTTP or socket API. This is the preferred pattern for microservices. Shared volumes are appropriate for sidecar patterns (log aggregators, metric exporters reading application log files).
