// Docker Guide · Beginner to Advanced

Docker Complete Guide 2026: Images, Multi-Stage Builds, Networking & Security

📅 Updated May 2026 ⏱ 22 min read 🏷 Docker · Containers · DevOps · CI/CD
Dhanush R, Senior DevOps Engineer
4.5+ years building and securing Docker images in production CI/CD pipelines. Every section here is derived from real Dockerfile audits, container security reviews, and production incident investigations.
// Table of Contents
  1. What is Docker? Containers vs Virtual Machines
  2. Image Layers, Caching, and How Docker Builds Work
  3. Dockerfile Instructions Deep Dive
  4. Multi-Stage Builds: The Production Standard
  5. Docker Networking Modes Explained
  6. Volumes and Persistent Data Management
  7. Container Security Hardening Checklist
  8. Docker Compose for Local Development
  9. Working with Container Registries
  10. Debugging Running Containers
  11. Essential Docker Command Reference
  12. 12 Docker Interview Questions with Expert Answers

Docker is the foundation of modern DevOps. Most CI/CD pipelines build Docker images, and every Kubernetes workload runs containers. I have built, secured, and debugged Docker images daily in production for 4.5 years: managing multi-stage Dockerfiles for Java, Go, Node.js, and Python services, enforcing container security policies, and investigating production incidents caused by Dockerfile anti-patterns. This guide covers what you need to use Docker correctly and to answer Docker interview questions with confidence.

What is Docker? Containers vs Virtual Machines

Docker packages an application and all its dependencies (libraries, runtime, configuration) into a portable, self-contained image, which runs as a container. This addresses the "works on my machine" problem: the same image runs the same way on a developer's laptop, a GitHub Actions CI runner, and a production Kubernetes node.

Under the hood, Docker is not magic. It builds on two Linux kernel features that predate it by years: namespaces (mount namespaces appeared in kernel 2.4.19 in 2002, with the other namespace types following later) and cgroups (merged in kernel 2.6.24 in 2008). Namespaces provide isolation: each container gets its own process tree (PID namespace), network stack (network namespace), filesystem view (mount namespace), and hostname (UTS namespace). cgroups (control groups) provide resource limits: the kernel enforces the CPU and memory limits you configure and kills processes in a group that exceeds its memory limit (OOMKill). Docker is a user-friendly management layer on top of these kernel primitives.
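Both primitives can be seen directly in /proc on a Linux host, with no Docker installed. The sketch below assumes a Linux system with /proc mounted. The variable name is only for illustration, and the exact set of namespaces listed depends on the kernel version.

```shell
#!/bin/sh
# Assumes Linux with /proc mounted; output varies by kernel and host.

# Each entry under /proc/<pid>/ns is a handle to one namespace the
# process belongs to (pid, net, mnt, uts, and others).
ns_list=$(ls "/proc/$$/ns")
echo "namespaces of this shell:"
echo "$ns_list"

# /proc/<pid>/cgroup shows which control group(s) the process is in.
# Resource limits are attached to those groups, not to the process itself.
echo "cgroup membership of this shell:"
cat "/proc/$$/cgroup"
```

Running `readlink /proc/<pid>/ns/pid` for two processes gives the same identifier when they share a PID namespace (for example, two processes in one container) and different identifiers when they do not.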

Container vs Virtual Machine: A VM virtualises the entire hardware stack and runs a complete OS kernel (an image of several GB and a boot time of typically 30–60 seconds). A container shares the host OS kernel and isolates only the process space (megabytes of overhead, startup typically well under a second). A physical server that runs 5–10 VMs can often run 100+ containers. Containers are not more secure than VMs by default, since a VM escape is harder than a container escape, but they are dramatically faster to start, smaller, and more resource-efficient.

Image Layers, Caching, and How Docker Builds Work

A Docker image is a stack of read-only layers. Each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) creates a new layer on top of the layers inherited from the FROM image; instructions such as ENV, EXPOSE, or CMD only add metadata. When you run a container, Docker adds a thin read-write layer on top of the image layers, and all writes inside the container go to this writable layer. When the container is deleted, this layer is deleted. The read-only image layers are shared between all containers using the same image, which is why Docker is storage-efficient even when running dozens of containers.

Layer caching is what makes Docker builds fast. If an instruction and all its inputs are identical to a previous build, Docker reuses the cached layer and skips re-running the instruction. Cache is invalidated when: the instruction text changes, any input file changes (for COPY/ADD), or any preceding layer's cache is invalidated. This has a critical implication: the order of instructions in your Dockerfile directly determines how effective layer caching is.

The most expensive Dockerfile mistake: Copying your application code before installing dependencies. Every code change invalidates the npm install / pip install / mvn install layer, forcing a full dependency reinstall on every build. Always copy and install dependencies first (they change rarely), then copy application code (which changes with every commit).
    # WRONG: code copy invalidates dependency install on every commit
    FROM node:20-alpine
    WORKDIR /app
    # ❌ any file change blows the npm ci cache
    COPY . .
    RUN npm ci
    EXPOSE 3000
    CMD ["node", "server.js"]

    # CORRECT: stable layers first, volatile layers last
    FROM node:20-alpine
    WORKDIR /app
    # ✅ only changes when deps change
    COPY package.json package-lock.json ./
    RUN npm ci --omit=dev
    # ✅ code copy only invalidates itself
    COPY . .
    EXPOSE 3000
    CMD ["node", "server.js"]
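The cascade rule can be made concrete with a toy model. The sketch below is not Docker's real cache implementation: it simply assumes that a layer's cache key is a hash of its parent's key, the instruction text, and the contents of any input files. The function names `layer_key` and `build`, and the two generated files, are hypothetical. It assumes a system with `sha256sum` and `mktemp` available (typical on Linux).

```shell
#!/bin/sh
# Toy model only: Docker's real cache lookup is more involved.
# Key = hash of (parent key + instruction text + input file contents).
layer_key() {
  parent_key=$1
  instruction=$2
  shift 2
  {
    printf '%s\n%s\n' "$parent_key" "$instruction"
    # Any remaining arguments are input files (COPY-style instructions).
    for input_file in "$@"; do
      cat "$input_file"
    done
  } | sha256sum | cut -d ' ' -f 1
}

# Compute keys for the "stable layers first" instruction order.
# Sets install_key (the RUN npm ci layer) and code_key (the COPY . . layer).
build() {
  key=$(layer_key "" "FROM node:20-alpine")
  key=$(layer_key "$key" "COPY package.json ./" "$workdir/package.json")
  install_key=$(layer_key "$key" "RUN npm ci")
  code_key=$(layer_key "$install_key" "COPY . ." \
    "$workdir/package.json" "$workdir/server.js")
}

workdir=$(mktemp -d)
printf '{"dependencies":{}}\n' > "$workdir/package.json"
printf 'console.log("v1")\n' > "$workdir/server.js"

build
install_before=$install_key
code_before=$code_key

# Simulate a commit that only touches application code.
printf 'console.log("v2")\n' > "$workdir/server.js"

build
install_after=$install_key
code_after=$code_key

if [ "$install_before" = "$install_after" ]; then
  echo "RUN npm ci: cache hit"
fi
if [ "$code_before" != "$code_after" ]; then
  echo "COPY . .: cache miss"
fi

rm -rf "$workdir"
```

Because the dependency manifest is hashed before the source code enters the chain, the code change leaves the install key untouched. Moving the `COPY . .` step above `RUN npm ci` in `build` would change the install key on every code edit, which is the cascade described above.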

Dockerfile Instructions Deep Dive

Every Dockerfile instruction has specific semantics that affect build performance, image size, and container runtime behaviour. Here are the ones that come up most in production and interviews:

Production tip (.dockerignore): Always create a .dockerignore file in your project root. Without it, docker build sends your entire project directory as the build context and COPY . . copies all of it into the image, including node_modules, .git, *.log files, and environment files containing secrets. A proper .dockerignore file is as important as .gitignore.

    # .dockerignore: always include this in every project
    node_modules
    .git
    .gitignore
    *.md
    *.log
    .env
    .env.*
    coverage/
    .nyc_output
    # for projects that build locally
    dist
    **/*.test.js
    **/__tests__
    .DS_Store

Multi-Stage Builds: The Production Standard

Multi-stage builds are the single most important Dockerfile pattern for production images. They allow you to use a large, build-tool-heavy image to compile and test your application, then produce a minimal runtime image containing only the compiled output and its runtime dependencies. The final image contains zero build tools, test frameworks, or source code, which dramatically reduces image size and attack surface.

In production, I have seen multi-stage builds reduce Java Spring Boot images from 800MB to 120MB and Go images from 1.2GB to 12MB. Smaller images mean faster pulls, faster deployments, less storage cost, and fewer CVEs to patch.

    # Java Spring Boot: multi-stage build
    # Stage 1: build with Maven and the full JDK
    FROM maven:3.9-eclipse-temurin-21 AS builder
    WORKDIR /app
    COPY pom.xml .
    # cache deps as separate layer
    RUN mvn dependency:go-offline -q
    COPY src ./src
    RUN mvn clean package -DskipTests -q

    # Stage 2: runtime with only the JRE, no Maven, no source code
    FROM eclipse-temurin:21-jre-alpine
    WORKDIR /app
    RUN addgroup -S appgroup && adduser -S appuser -G appgroup
    USER appuser
    COPY --from=builder /app/target/api.jar app.jar
    EXPOSE 8080
    ENTRYPOINT ["java", "-jar", "app.jar"]

    # Go application: final image can be as small as 10MB
    FROM golang:1.22-alpine AS builder
    WORKDIR /app
    COPY go.mod go.sum ./
    RUN go mod download
    COPY . .
    RUN CGO_ENABLED=0 GOOS=linux go build -o /bin/app .

    # Distroless image: no shell, no package manager, minimal CVEs
    FROM gcr.io/distroless/static-debian12
    COPY --from=builder /bin/app /app
    USER nonroot:nonroot
    EXPOSE 8080
    ENTRYPOINT ["/app"]

Docker Networking Modes Explained

Docker provides several networking modes, each with different isolation characteristics and use cases. Understanding these is essential for both production container design and security interviews.

Volumes and Persistent Data Management

Containers are ephemeral by design. All data written inside a container's writable layer is lost when the container is removed. Docker provides three storage mechanisms outside the writable layer: volumes and bind mounts, which survive container restarts and removals, and tmpfs mounts, which live only in host memory.

Production warning (database containers): Always use named volumes for database containers in Docker Compose, and remember that docker compose down -v deletes them (the -v flag removes named volumes), taking all data with it. For production, databases should not run on a standalone Docker host at all: use a managed service (AWS RDS, Cloud SQL) or a Kubernetes StatefulSet with proper backup automation.

Container Security Hardening Checklist

Container security is tested in every Senior DevOps and platform engineering interview. I use this checklist during production container security reviews:

Docker Compose for Local Development

Docker Compose defines and runs multi-container applications from a single YAML file. It is the standard tool for local development environments where you need your application, a database, a cache, and perhaps a message broker all running together with a single command.

    # docker-compose.yml: full-stack local dev environment
    services:
      api:
        build:
          context: .
          dockerfile: Dockerfile
          target: builder  # use build stage for hot reload
        ports:
          - "8080:8080"
        environment:
          - DB_HOST=postgres
          - REDIS_HOST=redis
          - NODE_ENV=development
        volumes:
          - .:/app  # mount source for live reload
          - /app/node_modules  # keep container node_modules
        depends_on:
          postgres:
            condition: service_healthy
          redis:
            condition: service_started
        restart: unless-stopped

      postgres:
        image: postgres:16-alpine
        environment:
          POSTGRES_DB: myapp
          POSTGRES_USER: devuser
          POSTGRES_PASSWORD: devpass
        volumes:
          - pgdata:/var/lib/postgresql/data
        ports:
          - "5432:5432"
        healthcheck:
          test: ["CMD-SHELL", "pg_isready -U devuser -d myapp"]
          interval: 5s
          timeout: 5s
          retries: 5

      redis:
        image: redis:7-alpine
        ports:
          - "6379:6379"
        command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru

    volumes:
      pgdata:

Working with Container Registries

A container registry stores and distributes Docker images. Docker Hub is the public default. For production, use a private registry: AWS ECR (Elastic Container Registry), Google Artifact Registry, Azure Container Registry, or self-hosted Harbor. Private registries give you access control, vulnerability scanning, and keep proprietary images private.

    # Authenticate to AWS ECR (123456789012 is a placeholder 12-digit account ID)
    aws ecr get-login-password --region ap-south-1 | \
      docker login --username AWS --password-stdin \
      123456789012.dkr.ecr.ap-south-1.amazonaws.com

    # Build, tag, and push to ECR
    docker build -t myapp:2.1.0 .
    docker tag myapp:2.1.0 123456789012.dkr.ecr.ap-south-1.amazonaws.com/myapp:2.1.0
    docker push 123456789012.dkr.ecr.ap-south-1.amazonaws.com/myapp:2.1.0

    # Inspect an image without pulling it (uses registry API)
    docker manifest inspect nginx:1.26-alpine

    # Sign an image with Docker Content Trust (Notary)
    export DOCKER_CONTENT_TRUST=1
    docker push registry.company.com/myapp:2.1.0

Debugging Running Containers

Effective container debugging separates senior DevOps engineers from juniors. Here are the techniques I use most in production incidents:

    # Get a shell inside a running container
    docker exec -it container_name sh     # Alpine images use sh, not bash
    docker exec -it container_name bash   # for Debian/Ubuntu-based images

    # Check container resource usage in real time
    docker stats container_name
    docker stats --no-stream              # one-shot snapshot

    # Follow live logs from a running container, starting at the last 100 lines
    docker logs -f --tail=100 container_name

    # Inspect host-level settings (mounts, network mode, restart policy, limits)
    # Env vars live under .[0].Config.Env
    docker inspect container_name | jq '.[0].HostConfig'

    # Check what processes are running inside the container
    docker top container_name

    # Debug a distroless or minimal container (no shell inside the image)
    # Use an ephemeral debug sidecar attached to the container's namespaces
    docker run -it --pid=container:myapp --net=container:myapp \
      --cap-add SYS_PTRACE busybox sh

    # Copy files from a container to the host for analysis
    docker cp container_name:/var/log/app.log ./app.log

    # Check why a container exited
    docker inspect container_name | jq '.[0].State'
    # ExitCode: 137 = killed by SIGKILL (OOMKilled: true confirms an OOM kill),
    #           1 = app error, 0 = clean exit
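Exit code 137 is not Docker-specific: it is the 128 + signal-number convention shells use for a process killed by a signal, and SIGKILL is signal 9. The sketch below reproduces it with ordinary processes. The variable names are illustrative, and it only shows where the number comes from, not how to diagnose a particular container.

```shell
#!/bin/sh
# Start a child shell that sends SIGKILL (signal 9) to itself,
# then capture the exit status the parent shell reports for it.
status=0
sh -c 'kill -KILL $$' || status=$?

echo "exit status: $status"
echo "signal number: $((status - 128))"

# kill -l maps a signal number back to its name.
signal_name=$(kill -l "$((status - 128))")
echo "signal name: $signal_name"
```

Since 137 only says "killed by SIGKILL", the `OOMKilled` field in the container's `State` is what separates an out-of-memory kill from a manual `docker kill` or a shutdown timeout.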

Essential Docker Command Reference

12 Docker Interview Questions with Expert Answers

Q1: What is the difference between CMD and ENTRYPOINT? Why does exec form vs shell form matter?
ENTRYPOINT defines the fixed executable that always runs. CMD provides default arguments that can be overridden at docker run time. Together, ENTRYPOINT + CMD = the full command. Exec form (["node", "server.js"]) runs the process directly as PID 1. Shell form (node server.js) runs as /bin/sh -c "node server.js", making your app a child process of sh. The critical runtime difference: when Docker or Kubernetes sends SIGTERM for graceful shutdown, shell form containers have sh as PID 1. sh doesn't forward signals to child processes by default, so your app never receives SIGTERM and eventually gets SIGKILL after the grace period. This causes ungraceful shutdowns, dropped requests, and slow rolling deployments. Always use exec form for production containers.
Q2: How do multi-stage builds reduce image size and improve security?
Multi-stage builds use multiple FROM statements in one Dockerfile. Each stage creates an independent filesystem. You can selectively copy files between stages using COPY --from=stage_name. The final image only contains what you explicitly copy into the last stage: all build tools (Maven, npm, GCC, test frameworks), source code, and intermediate build artifacts are discarded. This eliminates the most common sources of image bloat and CVEs: build tools have large footprints and many dependencies, and source code in production images is both unnecessary and a security risk (it exposes internal logic). A Java app built from scratch goes from ~1GB (if you install Maven and full JDK in one stage) to ~120MB (copying only the JAR into a JRE-only Alpine base).
Q3: What happens when a container runs out of memory?
The Linux kernel's OOM (Out Of Memory) killer terminates the container process with SIGKILL (uncatchable). The container exits with code 137 (128 + 9 for SIGKILL). Docker's behavior after this depends on the restart policy: with --restart=unless-stopped, Docker restarts the container, which will OOMKill again if the memory limit is still too low, creating a restart loop. In Kubernetes, this shows as OOMKilled in kubectl describe pod and CrashLoopBackOff if it keeps happening. Root cause is almost always: memory limit set too low for actual usage, a memory leak in the application, or a Java application not respecting container memory limits (set -XX:MaxRAMPercentage=75 for JVM containers to leave headroom for non-heap memory).
Q4: How does Docker layer caching work and how do you optimise it?
Docker evaluates each instruction top to bottom. If an instruction and all its inputs are identical to a previous build, Docker uses the cached layer and skips re-executing it. Cache is invalidated when the instruction changes, any input file changes (for COPY), or any parent layer's cache is invalidated (cache miss cascades downward). Optimise by: putting rarely-changing instructions first (FROM base image, installing OS packages, copying and installing dependency manifests) and frequently-changing instructions last (copying application source code). For multi-stage builds, cache the dependency download step separately from the source copy step. For CI/CD, use BuildKit's cache mounts (RUN --mount=type=cache,target=/root/.m2 mvn package) to persist Maven/npm caches between pipeline runs even when source changes.
Q5: What is a distroless image? When would you use one?
Distroless images (from Google's gcr.io/distroless) contain only the application and its runtime dependencies: no shell, no package manager, no coreutils. They are much smaller than full Debian or Ubuntu base images, roughly comparable to Alpine in size, and have a very small CVE surface area (no shell for an attacker to spawn, no package manager to pull in tools). Use for compiled languages (Go, Rust) and Java: a statically linked Go binary on distroless/static can produce an image under 10MB. The trade-off is debuggability: there is no shell to exec into for debugging. In production this is acceptable (use ephemeral debug containers via kubectl debug). For Python and Node.js apps with many dynamic dependencies, distroless is harder to use correctly, and Alpine is a good alternative.
Q6: How do you handle secrets in Docker without embedding them in the image?
Three production-safe approaches: (1) Runtime environment variables: pass secrets at container start time via docker run -e DB_PASSWORD=$SECRET or Kubernetes Secrets mounted as env vars. The secret is never in the image. (2) BuildKit secret mounts for build-time secrets: RUN --mount=type=secret,id=npmrc,dst=/root/.npmrc npm install mounts the secret only during that RUN instruction and never persists it in the image layer. (3) Volume-mounted secret files: mount secrets from Kubernetes Secrets or Vault Agent as files at a path inside the container at runtime. Never use ENV or ARG instructions for secrets: both end up in image metadata and are visible in docker history or docker inspect on the image. Never commit .env files containing real secrets to Git.
Q7: What is the difference between a Docker volume and a bind mount?
A Docker volume is fully managed by Docker: Docker controls where it is stored on the host (typically /var/lib/docker/volumes/), its lifecycle, and provides commands to inspect, backup, and remove it. Volumes are portable (work the same on any Docker host), can use volume drivers for cloud storage backends, and are the recommended approach for production persistent data. A bind mount maps a specific, absolute path on the host filesystem directly into the container. The container can read and write files on the host. Bind mounts are ideal for development (mount your source code for live reload) but problematic in production: they create a tight dependency on host filesystem layout, don't work well in container orchestration, and give the container potential access to sensitive host directories.
Q8: How does Docker networking enable container-to-container communication?
On a user-defined bridge network (which Docker Compose creates automatically), Docker runs an embedded DNS server. Each container is registered in this DNS by its container name and service name. When container A connects to postgres:5432, Docker's embedded DNS resolves "postgres" to the postgres container's IP on the bridge network, with no port mapping or external DNS required. This works only on user-defined bridges, not the default bridge (where containers can only reach each other by IP). On the default bridge, you must use --link (deprecated) or explicitly use IPs. For production multi-container applications, always define an explicit network in Docker Compose or your orchestration layer.
Q9: What is Docker BuildKit and why should you use it?
BuildKit is the next-generation image build engine, enabled by default in Docker 23.0+. Key improvements over the legacy builder: parallel stage execution (independent multi-stage build stages run concurrently), persistent build cache (cache mounts survive between builds), secret mounts (build-time secrets that never appear in layers), SSH agent forwarding (access private Git repos during build without embedding SSH keys), and improved cache import/export for CI/CD (push and pull cache layers from a registry). Enable in older Docker with DOCKER_BUILDKIT=1 docker build . or docker buildx build .. In CI/CD, use docker buildx build --cache-from type=registry,ref=registry/image:cache --cache-to type=registry,ref=registry/image:cache,mode=max . to share build cache across pipeline runs.
Q10: How do you debug a container that has no shell?
For distroless or scratch-based containers with no shell, there are three approaches: (1) Ephemeral debug container: docker run -it --pid=container:myapp --net=container:myapp busybox sh attaches a debug container to the same PID and network namespaces as the target container. In Kubernetes, use kubectl debug -it pod/myapp --image=busybox --target=myapp. (2) Override the entrypoint at run time with docker run --entrypoint sh myimage, which only works if the image contains a shell. (3) Use nsenter on the host: docker inspect -f '{{.State.Pid}}' myapp gives the host PID, then nsenter -t PID -n -u -i -p sh enters the container's network, UTS, IPC, and PID namespaces. Leave out -m here, because entering the mount namespace of a shell-less image leaves no sh binary to execute. Method 1 is the most practical for production debugging without modifying the running container.
Q11: What is the difference between COPY and ADD in a Dockerfile?
COPY copies files or directories from the build context into the image. Simple, predictable, explicit. ADD does everything COPY does plus: automatically extracts local tar archives (if the source is a .tar.gz file in the build context, ADD extracts it into the destination), and can download files from URLs (though this is unreliable and caches poorly). The Dockerfile best practices guide recommends always using COPY unless you specifically need ADD's tar extraction feature. ADD's URL download capability is particularly discouraged: by default it downloads at build time with no integrity verification and introduces an external dependency into your build. Use curl in a RUN instruction with checksum verification instead.
Q12: How do you reduce the size of a Docker image that has grown too large?
Systematic approach: (1) Run docker history image:tag to identify which layers are largest. (2) Switch to a minimal base image: from ubuntu to debian-slim to alpine to distroless. (3) Use multi-stage builds to separate build from runtime, which is usually the most impactful change. (4) Chain RUN commands and clean up in the same layer: RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*. The cleanup must be in the same RUN instruction, otherwise the deleted files still exist in the previous layer. (5) Use --no-install-recommends with apt-get. (6) Delete downloaded archives, test files, documentation, and examples in the same RUN instruction that installs them. (7) Use Dive (a CLI tool) to inspect every layer and find hidden large files.
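The PID 1 behaviour described in Q1 can be observed with ordinary processes, no Docker required. This is only an analogy: the outer `sh` plays the role of the `/bin/sh -c` wrapper that shell form adds, and the inner `sh` stands in for the application. All variable names are illustrative.

```shell
#!/bin/sh
# Shell-form analogue: the outer sh stays alive and runs the "app"
# (an inner sh that prints its own PID) as a child process.
set -- $(sh -c 'echo $$; sh -c "echo \$\$"; true')
wrapper_pid=$1
child_pid=$2

# Exec-form analogue: exec replaces the outer sh with the "app",
# so the app keeps the PID the wrapper had.
set -- $(sh -c 'echo $$; exec sh -c "echo \$\$"')
before_exec_pid=$1
after_exec_pid=$2

if [ "$wrapper_pid" != "$child_pid" ]; then
  echo "shell form: app runs as a child of sh, so signals reach sh first"
fi
if [ "$before_exec_pid" = "$after_exec_pid" ]; then
  echo "exec form: app took over the wrapper's PID and receives signals directly"
fi
```

The same mechanism is why an entrypoint script that ends with `exec "$@"` hands PID 1, and therefore SIGTERM, to the real application.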


Written by Dhanush R
Senior DevOps Engineer · 4.5+ Years · Bengaluru · Docker · Kubernetes · AWS · Terraform

DevOps engineer with 4.5+ years of hands-on production experience building, securing, and debugging Docker images across Java, Go, Node.js, and Python services. Every section here is written from real production Dockerfile audits, container security reviews, and CI/CD pipeline work. Last updated: May 2026.
