Docker Fundamentals

What is Docker?

Docker is a containerization platform that packages your application code, runtime, system tools, and dependencies into a standardized unit called a container. Containers are lightweight, portable, and consistent across any environment—from your local machine to production servers.

The "Works on My Machine" Problem

Before Docker, developers faced a classic problem: an application worked perfectly on their development machine but failed in staging or production due to differences in:

Operating systems
Library versions
Environment variables
System configurations
Dependencies

This inconsistency consumed countless hours debugging environmental issues. Docker solved this by containerizing the entire application environment, ensuring it runs identically everywhere.

How Docker Works: The Container Approach

A Docker container wraps your application with everything it needs to run:

┌──────────────────────────────┐
│     Docker Container         │
├──────────────────────────────┤
│     Your Application         │
├──────────────────────────────┤
│   Runtime (Node, Python, etc)│
├──────────────────────────────┤
│   System Libraries & Tools   │
├──────────────────────────────┤
│   Operating System Layer     │
├──────────────────────────────┤
│   Shared Host Kernel         │
└──────────────────────────────┘

The key insight: containers don't contain a full OS. They share the host's kernel but isolate the filesystem, processes, and network. This makes them lightweight and fast compared to virtual machines.

Containers vs Virtual Machines

Understanding the difference between containers and virtual machines (VMs) is crucial for choosing the right tool.

Detailed Comparison

Aspect	Containers	Virtual Machines
Isolation Level	Process and filesystem isolation	Complete OS isolation
Boot Time	Milliseconds	30 seconds to minutes
Size	Megabytes (typically 5-500 MB)	Gigabytes (1-20 GB)
Performance	Near-native (minimal overhead)	5-10% performance overhead
OS Flexibility	Must use same kernel as host	Can run different OS
Resource Usage	~1-2% per container	10-15% per VM minimum
Density	Hundreds per host	Tens per host
Startup Speed	Immediate	Several seconds
Portability	High (Linux kernel compatible)	High (works across platforms)
Use Case	Microservices, CI/CD, rapid scaling	Legacy apps, strict isolation, different OSes

Visual Architecture Comparison

When to use each:

Containers: Microservices, CI/CD pipelines, cloud-native apps, rapid scaling
VMs: Legacy applications, strict security boundaries, running different OS families

Docker Architecture

Docker uses a client-server architecture where components communicate via a REST API. Understanding this architecture is essential for working effectively with Docker.

Core Components

1. Docker Client

The command-line interface (CLI) you interact with daily. Commands like docker build, docker run, and docker pull are sent to the Docker Daemon via the REST API.

2. Docker Daemon (dockerd)

The server process running on the host machine that manages:

Creating and running containers
Building images
Managing volumes and networks
Storing and retrieving images from registries

3. Docker Engine

The complete runtime that combines the Docker Daemon with containerd and runc.

4. containerd

A high-level container runtime that manages the container lifecycle. It handles pulling images, creating containers, and delegating low-level operations to runc.

5. runc

The low-level container runtime that interacts directly with the Linux kernel to:

Create namespaces (for isolation)
Set up cgroups (for resource limits)
Mount filesystems
Create the actual container process

6. Docker Registry

A service storing and distributing Docker images. Docker Hub is the default public registry, but organizations can run private registries.

Docker Architecture Diagram

Communication Flow

You run docker run nginx
Docker CLI sends a request to the Docker Daemon via REST API
Docker Daemon checks if the nginx image exists locally
If not, it pulls the image from Docker Registry
The Daemon instructs containerd to create a container
containerd uses runc to interact with the kernel
The container starts and the Daemon returns control

Images vs Containers

This distinction is fundamental to Docker. Think of it this way:

Image = Blueprint or template (read-only)
Container = Running instance based on that blueprint (writable)

Images: The Blueprint

A Docker image is a lightweight, standalone, executable package containing:

Application code
Runtime environment (Python 3.11, Node.js 18, etc.)
System libraries and tools
Environment variables and default configurations

Key characteristics:

Read-only: Images never change once built
Layered: Built from multiple filesystem layers (see next section)
Portable: Same image runs everywhere
Reusable: One image can create many containers

Containers: The Running Instance

A container is a running process created from an image. Multiple containers can run from the same image simultaneously, each with its own:

Isolated filesystem (writable layer on top of image)
Process namespace
Network interfaces
Environment variables
Resource constraints

The Relationship Visualized

                    Docker Image
                  (Read-only template)
                 ┌──────────────────┐
                 │  nginx:latest    │
                 │  - OS layer      │
                 │  - Library layer │
                 │  - App layer     │
                 └──────────────────┘
                        │
         ┌──────────────┼──────────────┐
         │              │              │
         ▼              ▼              ▼
    Container 1    Container 2    Container 3
    (Writable)     (Writable)     (Writable)
    Port 8080      Port 8081      Port 8082
    Running        Running        Running

Union File System and Copy-on-Write

Docker uses the Union File System (AUFS, OverlayFS, etc.) with a copy-on-write (CoW) mechanism:

Layered Storage: The image consists of read-only layers stacked on top of each other
Container Layer: When a container starts, a thin writable layer is added on top
Copy-on-Write: When a file is modified, only that file is copied from the read-only layer to the writable layer
Efficiency: Unchanged files are shared between containers, saving disk space

Example:

Container 1 (8KB writable)     Container 2 (8KB writable)
    ↓                              ↓
┌─────────────────────────────────────┐
│  Image Layer 3 (modified app.js)    │ (shared, 100KB)
├─────────────────────────────────────┤
│  Image Layer 2 (system libraries)   │ (shared, 300MB)
├─────────────────────────────────────┤
│  Image Layer 1 (base OS)            │ (shared, 200MB)
└─────────────────────────────────────┘

Result: Two containers using only 216MB total
        instead of 412MB each (if duplicated)

Docker Image Layers

Every Docker image is built as a series of layers. Understanding layers is crucial for writing efficient Dockerfiles and optimizing build times.

How Layers Work

Each instruction in a Dockerfile creates a new layer. The complete image is a union of all layers stacked on top of each other.

Example Dockerfile and its layers:

FROM ubuntu:22.04              # Layer 1: Base OS (200MB)
RUN apt-get update             # Layer 2: Package index updates
RUN apt-get install -y python3 # Layer 3: Python runtime (150MB)
COPY app.py /app/              # Layer 4: Application code (50KB)
CMD ["python3", "/app/app.py"] # Layer 5: Metadata (no size)

Results in 5 layers totaling ~350MB.

Docker Layer Caching

Docker caches each layer to speed up subsequent builds:

Build 1: FROM ubuntu → RUN apt-get → RUN install python3 → COPY app.py
         (first time, all steps run)

Build 2: FROM ubuntu → RUN apt-get → RUN install python3 → COPY app.py
         (cache hit for first 3 layers, only COPY runs)

Cache invalidation rules:

If a layer changes, all subsequent layers rebuild (even if unchanged)
A layer is considered "changed" if:
- The Dockerfile instruction changes
- Any ADD/COPY source files change
- Environment variables change
- Parent image updates

Optimizing Layer Order

Wrong (inefficient):

FROM python:3.11
COPY . /app               # COPY changes frequently
RUN pip install -r requirements.txt  # Cache misses every time
WORKDIR /app
CMD ["python", "app.py"]

Each code change invalidates the pip install cache, forcing a full reinstall.

Right (efficient):

FROM python:3.11
WORKDIR /app
COPY requirements.txt .   # Separate from code
RUN pip install -r requirements.txt  # Cache persists
COPY . /app               # COPY code last (most frequent changes)
CMD ["python", "app.py"]

Now dependency installs are cached unless requirements.txt changes.

Viewing Image Layers

# See all layers of an image
docker history nginx:latest

# See layer sizes
docker history --human nginx:latest

# Inspect detailed layer information
docker inspect nginx:latest

Docker Registry & Docker Hub

A Docker registry is a centralized service that stores and distributes Docker images. It's the equivalent of GitHub for container images.

Types of Registries

Public Registries

Docker Hub (default): Millions of community images, free tier available
GitHub Container Registry (ghcr.io): GitHub-integrated registry
GitLab Registry: GitLab CI/CD integrated
Quay.io: Red Hat's container registry with advanced features

Private Registries

Docker Registry: Open-source registry you can self-host
Amazon ECR: AWS Elastic Container Registry
Azure Container Registry: Microsoft's registry
Google Artifact Registry: GCP's registry
Harbor: Enterprise-grade open-source registry

Docker Hub

Docker Hub is the default public registry with over 14 million images. It's built into Docker and requires zero configuration for basic use.

Official images have the Docker seal of approval:

Maintained by Docker or vendor
Follow best practices
Regularly scanned for vulnerabilities
Examples: ubuntu, nginx, python, postgres

Pulling Images

# Pull from Docker Hub (default)
docker pull nginx

# Pull specific version
docker pull nginx:1.25-alpine

# Pull from private registry
docker pull myregistry.azurecr.io/myapp:v1.0

# Pull all tags of an image
docker pull -a myimage

Pushing Images

Before pushing, you must:

Tag your image with the registry URL
Authenticate with the registry
Push the image

# Tag image for Docker Hub
docker tag myapp:latest yourname/myapp:latest

# Login to Docker Hub
docker login

# Push to Docker Hub
docker push yourname/myapp:latest

# Push to private registry
docker tag myapp:v1.0 myregistry.azurecr.io/myapp:v1.0
docker login myregistry.azurecr.io
docker push myregistry.azurecr.io/myapp:v1.0

Image Naming Convention

Docker image names follow this format:

[REGISTRY]/[NAMESPACE]/[REPOSITORY]:[TAG]

Examples:
- nginx                          (Docker Hub official)
- dockerhub.io/library/nginx     (fully qualified)
- myregistry.io/team/myapp:v1.0  (private registry)
- ghcr.io/yourname/project:main  (GitHub registry)

Components:

Registry: Where the image is stored (defaults to Docker Hub)
Namespace: Usually your username or organization
Repository: The image name
Tag: Version identifier (defaults to latest)

Why Docker? The Business Case

Organizations adopt Docker for concrete, measurable benefits:

1. Consistency Across Environments

Problem: "Works on my machine but not in production"

Solution: Docker guarantees identical environments from development → testing → staging → production.

Impact:

Eliminate environment-related bugs
Reduce deployment failures
Speed up problem diagnosis

2. Microservices Enablement

Docker's lightweight nature makes microservices architectures practical:

Deploy dozens of independent services per server
Scale individual services based on demand
Develop and deploy services independently
Isolate failures to specific services

Impact:

Faster feature deployment (minutes instead of hours/days)
Improved system resilience
Better resource utilization

3. Resource Efficiency

Containers use significantly fewer resources than VMs:

Example: Running 100 application instances

VMs (2GB each):
- 100 × 2GB = 200GB required
- 100 boot processes
- High licensing costs for OS

Containers (100MB each):
- 100 × 100MB = 10GB required (20× more efficient)
- Instant startup
- Single OS license

Impact:

Reduce infrastructure costs by 50-70%
Run more workloads per server
Lower cloud bills

4. Faster Deployments

Containers enable rapid, reliable deployments:

Traditional VM: Deploy → Boot → Start app → Health check = 2-5 minutes
Container:      Deploy → Start app → Health check = 10-30 seconds

Impact:

Reduce deployment time 90%
Enable multiple deployments per day
Faster time-to-market for features

5. Developer Productivity

Developers can:

Run production-like environments locally
Test infrastructure changes
Collaborate with consistent environments
Onboard new team members faster

Impact:

Reduced "works on my machine" issues
Faster development cycle
Better code quality

6. Operational Simplicity

Standardized units simplify operations:

Same deployment process for all applications
Reduced operational complexity
Better logging and monitoring
Easier troubleshooting

Impact:

Reduce operational overhead
Faster incident resolution
Improved system reliability

Microservices Architecture with Docker

Docker is the enabling technology for microservices. Let's understand how.

Monolithic vs Microservices

Monolithic Architecture:

┌────────────────────────────┐
│  E-Commerce Application    │
├────────────────────────────┤
│ ├─ User Management         │
│ ├─ Product Catalog         │
│ ├─ Shopping Cart           │
│ ├─ Payment Processing      │
│ ├─ Order Management        │
│ ├─ Inventory               │
│ └─ Shipping                │
└────────────────────────────┘
     Single Database
     Single Deployment
     Single Technology Stack

Problems:

One bug can bring down the entire system
Scaling: must scale entire app, not individual components
Technology lock-in: all services use same language/framework
Deployment: any change requires testing the entire application
Development bottleneck: large teams step on each other's toes

Microservices Architecture:

┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│   User      │  │   Product   │  │   Payment   │
│ Service     │  │  Service    │  │  Service    │
│ (Node.js)   │  │  (Python)   │  │  (Java)     │
└─────────────┘  └─────────────┘  └─────────────┘
      │                │                │
      ▼                ▼                ▼
   User DB         Product DB      Payment DB

┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│   Cart      │  │   Order     │  │ Inventory   │
│ Service     │  │  Service    │  │  Service    │
│ (Go)        │  │  (Node.js)  │  │  (Python)   │
└─────────────┘  └─────────────┘  └─────────────┘
      │                │                │
      ▼                ▼                ▼
   Cart DB         Order DB        Inventory DB

Advantages:

Independent Scaling: Scale only the Payment Service during sales
Technology Freedom: Use Python for ML tasks, Go for high-performance services, Java for enterprise features
Rapid Deployment: Deploy User Service without testing Cart Service
Resilience: Payment Service failure doesn't crash the entire site
Team Autonomy: Different teams own different services

How Docker Enables Microservices

Consistency: Each service runs in its own container with identical behavior everywhere
Lightweight: Hundreds of containers run on a single server
Fast Startup: Services start in milliseconds, enabling dynamic scaling
Isolation: Services can't interfere with each other
Deployment: Same deployment process for all services

Example: E-Commerce Microservices Deployment

# User Service (Node.js)
docker run -d --name user-service \
  -p 3001:3000 \
  -e DB_HOST=postgres \
  user-service:v1.0

# Product Service (Python)
docker run -d --name product-service \
  -p 3002:3000 \
  -e DB_HOST=postgres \
  product-service:v1.0

# Payment Service (Java)
docker run -d --name payment-service \
  -p 3003:8080 \
  -e DB_HOST=postgres \
  payment-service:v1.0

# All three services run independently
# Each can be scaled, updated, or restarted independently

Docker Engine Components

The Docker Engine is composed of three essential components that work together to create and manage containers.

1. Docker Daemon (dockerd)

The server process that manages Docker objects and operations.

Responsibilities:

Listen for Docker API requests from the CLI or SDKs
Manage images (build, pull, store)
Manage containers (create, start, stop, delete)
Manage networks and volumes
Handle authentication and image signing
Coordinate with containerd

Lifecycle:

# Start Docker Daemon (usually started automatically)
sudo systemctl start docker

# Check if daemon is running
docker version

# View daemon logs
journalctl -u docker -f

Configuration: Daemon settings in /etc/docker/daemon.json:

{
  "debug": false,
  "storage-driver": "overlay2",
  "log-driver": "json-file",
  "max-concurrent-downloads": 5,
  "max-concurrent-uploads": 5,
  "insecure-registries": ["myregistry.io"]
}

2. Docker Client

The CLI tool you use to interact with Docker. Sends commands to the Docker Daemon.

Common commands:

docker build    # Build images from Dockerfile
docker run      # Create and run containers
docker pull     # Download images from registry
docker push     # Upload images to registry
docker logs     # View container output
docker exec     # Run commands inside containers
docker ps       # List running containers
docker images   # List available images
docker network  # Manage networks
docker volume   # Manage volumes
docker compose  # Multi-container applications

Client-Daemon Communication: By default, clients connect to /var/run/docker.sock (Unix socket) on the same machine. For remote connections:

# Connect to remote Docker daemon
docker -H tcp://remote-host:2375 ps

# Set DOCKER_HOST environment variable
export DOCKER_HOST=tcp://remote-host:2375

3. REST API

Underlying protocol that enables client-daemon communication.

Access the API directly:

# List containers via API
curl --unix-socket /var/run/docker.sock \
  http://localhost/v1.40/containers/json

# Inspect an image
curl --unix-socket /var/run/docker.sock \
  http://localhost/v1.40/images/nginx/json

API Versions: Docker maintains multiple API versions for backward compatibility. Current version is typically 1.40+. Check with:

docker version
# Shows both Client API and Server API versions

Component Interaction Flow

┌──────────────────────────────────────────────┐
│         Docker Client (CLI)                  │
│  docker run -d --name web nginx              │
└──────────────┬───────────────────────────────┘
               │ REST API
               │ POST /containers/create
               │
┌──────────────▼───────────────────────────────┐
│      Docker Daemon (dockerd)                 │
│  ├─ Validate request                         │
│  ├─ Check if nginx:latest image exists       │
│  ├─ If not, pull from registry               │
│  └─ Delegate to containerd                   │
└──────────────┬───────────────────────────────┘
               │ Native API
               │
┌──────────────▼───────────────────────────────┐
│        containerd                            │
│  ├─ Prepare container rootfs                 │
│  ├─ Set up cgroups                           │
│  ├─ Create namespaces                        │
│  └─ Delegate process creation to runc        │
└──────────────┬───────────────────────────────┘
               │ OCI Runtime Spec
               │
┌──────────────▼───────────────────────────────┐
│        runc (OCI Runtime)                    │
│  ├─ Create Linux namespaces                  │
│  ├─ Apply cgroup limits                      │
│  ├─ Mount filesystems                        │
│  └─ Execute container process                │
└──────────────┬───────────────────────────────┘
               │
┌──────────────▼───────────────────────────────┐
│      Linux Kernel                            │
│  Container process runs isolated             │
└───────────────────────────────────────────────┘

Quick Start: Your First Containers

Get hands-on with Docker in minutes.

1. Verify Docker is Installed

docker --version
docker run hello-world

The hello-world container confirms Docker works correctly.

2. Run an Interactive Container

Explore Ubuntu's environment interactively:

docker run -it ubuntu bash

Inside the container, you can:

apt update
apt install -y curl
curl https://example.com
exit  # Exit the container

Flags explained:

-it = Interactive terminal (attach stdin/stdout)
ubuntu = Image to run
bash = Command to execute

3. Run a Web Server

Start an Nginx web server:

docker run -d -p 8080:80 nginx

Then visit http://localhost:8080 in your browser.

Flags explained:

-d = Detached (run in background)
-p 8080:80 = Map port 8080 on host to port 80 in container
nginx = Image to run

Verify it's running:

docker ps  # See all running containers

Stop the container:

docker stop <container_id>

4. Run a Database Container

Start a PostgreSQL database:

docker run -d \
  --name postgres-db \
  -e POSTGRES_PASSWORD=mysecret \
  -p 5432:5432 \
  postgres:15

Connect to it:

docker exec -it postgres-db psql -U postgres
# Now you're inside the PostgreSQL CLI

5. View Container Logs

# See last 100 lines of logs
docker logs --tail 100 postgres-db

# Follow logs in real-time
docker logs -f postgres-db

# Exit with Ctrl+C

6. Inspect Running Containers

# List all running containers
docker ps

# List all containers (including stopped)
docker ps -a

# Get detailed information about a container
docker inspect <container_id>

# View resource usage
docker stats

7. Clean Up

# Stop a container
docker stop <container_id>

# Remove a stopped container
docker rm <container_id>

# Remove an image
docker rmi nginx

# Remove all stopped containers
docker container prune

# Remove dangling images
docker image prune

What's Next?

You now understand Docker fundamentals. Continue your learning journey:

Docker Images — Build custom images with Dockerfiles
Container Networking — Connect multiple containers
Docker Volumes — Persist data across containers
Docker Compose — Multi-container applications
Best Practices — Write efficient, secure Dockerfiles
Docker Security — Secure your containers in production

Recommended Learning Path

Build and run your first custom image
Connect multiple containers with networks
Use Docker Compose to orchestrate multi-container apps
Deploy to a container orchestration platform (Kubernetes)

Summary

Docker revolutionizes application deployment by solving the "works on my machine" problem through containerization. Understanding images as templates, containers as instances, and the layered architecture enables you to build, ship, and run applications reliably across any environment. The Docker Engine's client-server architecture ensures scalability and consistency, while microservices architectures leverage Docker's lightweight nature to create resilient, independently deployable systems.

What is Docker?​

The "Works on My Machine" Problem​

How Docker Works: The Container Approach​

Containers vs Virtual Machines​

Detailed Comparison​

Visual Architecture Comparison​

Docker Architecture​

Core Components​

1. Docker Client​

2. Docker Daemon (dockerd)​

3. Docker Engine​

4. containerd​

5. runc​

6. Docker Registry​

Docker Architecture Diagram​

Communication Flow​

Images vs Containers​

Images: The Blueprint​

Containers: The Running Instance​

The Relationship Visualized​

Union File System and Copy-on-Write​

Docker Image Layers​

How Layers Work​

Docker Layer Caching​

Optimizing Layer Order​

Viewing Image Layers​

Docker Registry & Docker Hub​

Types of Registries​

Public Registries​

Private Registries​

Docker Hub​

Pulling Images​

Pushing Images​

Image Naming Convention​

Why Docker? The Business Case​

1. Consistency Across Environments​

2. Microservices Enablement​

3. Resource Efficiency​

4. Faster Deployments​

5. Developer Productivity​

6. Operational Simplicity​

Microservices Architecture with Docker​

Monolithic vs Microservices​

How Docker Enables Microservices​

Example: E-Commerce Microservices Deployment​

Docker Engine Components​

1. Docker Daemon (dockerd)​

2. Docker Client​

3. REST API​

Component Interaction Flow​

Quick Start: Your First Containers​

1. Verify Docker is Installed​

2. Run an Interactive Container​

3. Run a Web Server​

4. Run a Database Container​

5. View Container Logs​

6. Inspect Running Containers​

7. Clean Up​

What's Next?​

Recommended Learning Path​

Summary​

What is Docker?

The "Works on My Machine" Problem

How Docker Works: The Container Approach

Containers vs Virtual Machines

Detailed Comparison

Visual Architecture Comparison

Docker Architecture

Core Components

1. Docker Client

2. Docker Daemon (dockerd)

3. Docker Engine

4. containerd

5. runc

6. Docker Registry

Docker Architecture Diagram

Communication Flow

Images vs Containers

Images: The Blueprint

Containers: The Running Instance

The Relationship Visualized

Union File System and Copy-on-Write

Docker Image Layers

How Layers Work

Docker Layer Caching

Optimizing Layer Order

Viewing Image Layers

Docker Registry & Docker Hub

Types of Registries

Public Registries

Private Registries

Docker Hub

Pulling Images

Pushing Images

Image Naming Convention

Why Docker? The Business Case

1. Consistency Across Environments

2. Microservices Enablement

3. Resource Efficiency

4. Faster Deployments

5. Developer Productivity

6. Operational Simplicity

Microservices Architecture with Docker

Monolithic vs Microservices

How Docker Enables Microservices

Example: E-Commerce Microservices Deployment

Docker Engine Components

1. Docker Daemon (dockerd)

2. Docker Client

3. REST API

Component Interaction Flow

Quick Start: Your First Containers

1. Verify Docker is Installed

2. Run an Interactive Container

3. Run a Web Server

4. Run a Database Container

5. View Container Logs

6. Inspect Running Containers

7. Clean Up

What's Next?

Recommended Learning Path

Summary