Docker Under the Hood: Why Your Containers Are Just Glorified Linux Processes (And Where They Break)

Introduction

The Docker container you spin up with a single command is often treated as a magical black box—but under the hood, it’s just a regular Linux process with some clever kernel tricks. Understanding this architecture isn’t just academic; it’s critical for diagnosing production failures where resource limits, networking, and filesystem layers reveal the true nature of containerization.

Learning Objectives

  • Understand the complete container lifecycle from Docker CLI to running process.
  • Master the isolation technologies: Linux namespaces, cgroups, and union filesystems.
  • Diagnose common production container failures across networking, resource limits, and Windows-specific quirks.

You Should Know

1. The Container Lifecycle: From CLI to Kernel Process

The journey from `docker run nginx` to a running web server involves a chain of cooperating components. Here’s the step-by-step breakdown:

Step 1: API Call to Docker Daemon

The Docker CLI translates your command into a REST API request sent to the Docker daemon (dockerd). This daemon listens on a Unix socket (/var/run/docker.sock) by default, though it can also be exposed over TCP for remote management.

# Verify the Docker daemon is listening
curl --unix-socket /var/run/docker.sock http://localhost/version

Step 2: Image Verification and Pull

`dockerd` checks if the requested image (e.g., nginx:latest) exists locally. If not, it initiates a pull from a container registry like Docker Hub, Amazon ECR, or Google Container Registry. The image is stored as a series of layer tarballs, downloaded via HTTPS.

# Pull an image manually
docker pull nginx:latest

# Inspect image layers
docker history nginx:latest

Step 3: Configuration Preparation

The daemon creates a container configuration object that includes environment variables, volumes, networking mode, and resource limits from your command. This configuration is written as a JSON file, forming part of the OCI (Open Container Initiative) runtime bundle.

Step 4: containerd Takes Over

`dockerd` doesn’t start the container directly. Instead, it hands the request to containerd, the industry-standard container runtime. `containerd` manages the full lifecycle, handling image pulling, storage, and execution.

# Check containerd status (systemd-based systems)
systemctl status containerd

Step 5: runc Creates the Container

`containerd` prepares the runtime bundle, a directory containing an OCI `config.json` and the root filesystem. It then invokes `runc` (through a per-container shim process), the low-level runtime that actually creates and starts the container process. `runc` reads the config, creates the namespaces and cgroups, and executes the process. Once the process is running, `runc` exits and the shim remains as the container’s parent process.

# View runc version
runc --version

# Explore Docker's view of a container's configuration (the OCI config.json itself lives in the runtime bundle)
docker inspect <container-id> | jq '.[0].Config'
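
To make the runtime bundle from Step 5 less abstract, here is a rough sketch of what a trimmed-down `config.json` looks like. The field names follow the OCI runtime spec, but the values (the nginx arguments, the `demo` hostname, the 512 MiB limit) are illustrative assumptions, and a real bundle generated by containerd is far larger. The helper names `sample_oci_config` and `list_namespaces` are made up for this example.

```shell
# Illustrative only: a minimal config.json in the shape defined by the OCI
# runtime spec. Values are assumptions, not output captured from Docker.
sample_oci_config() {
  cat <<'EOF'
{
  "ociVersion": "1.0.2",
  "process": {
    "args": ["nginx", "-g", "daemon off;"],
    "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
    "cwd": "/"
  },
  "root": { "path": "rootfs" },
  "hostname": "demo",
  "linux": {
    "namespaces": [
      { "type": "pid" },
      { "type": "network" },
      { "type": "ipc" },
      { "type": "uts" },
      { "type": "mount" }
    ],
    "resources": { "memory": { "limit": 536870912 } }
  }
}
EOF
}

# Print the namespace types requested by a config.json read from stdin.
list_namespaces() {
  grep -o '"type": *"[a-z]*"' | sed 's/.*"\([a-z]*\)"$/\1/'
}

# Show which namespaces runc would be asked to create for this bundle.
sample_oci_config | list_namespaces
```

The point of the sketch is that everything `runc` does is driven by this one file: the process to execute, the namespaces to create, and the cgroup limits to apply.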

Step 6: The Running Process

What you have now is a Linux process with isolated namespaces (PID, network, mount, UTS, and IPC by default; user namespaces only when remapping is enabled) and constrained resource usage. There’s no virtual machine and no guest OS, just kernel features creating the illusion of isolation.

2. Deep Dive: Namespaces and Cgroups

Docker’s “magic” is entirely built on Linux kernel primitives. Understanding these will help you debug the most common production issues.

Linux Namespaces: The Isolation Primitives

Namespaces provide process isolation. Each namespace wraps a global system resource in an abstraction that makes processes appear to have their own isolated instance.

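# List all namespaces on the system
lsns

# View named network namespaces (Docker's usually do not appear here, because Docker does not register them under /var/run/netns)
ip netns list

# Run a command in a container's network namespace (requires root)
nsenter -t <pid> -n ip addr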

Key Namespace Types:

  • PID namespace: Processes inside see only their own process tree.
  • NET namespace: Provides isolated network interfaces, routes, and firewall rules.
  • MNT namespace: Gives the container its own mount table, so mounts such as `/proc` and `/sys` are container-private.
  • UTS namespace: Isolates hostname and NIS domain.
  • IPC namespace: Inter-process communication resources (e.g., System V semaphores).
  • USER namespace: Maps container UIDs/GIDs to host UIDs/GIDs.
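
One way to see these namespace types in practice is to compare the namespace identifiers of two processes. The kernel exposes them as symlinks under `/proc/<pid>/ns/`. The sketch below is a minimal helper, not part of Docker; the function names and the `PROC_ROOT` override are assumptions added so the logic can be exercised against a test directory.

```shell
# Print the namespace identifier (for example "net:[4026531840]") of a process.
# PROC_ROOT is a hypothetical override for testing; it defaults to the real /proc.
ns_id() {
  readlink "${PROC_ROOT:-/proc}/$1/ns/$2"
}

# Report whether two PIDs share a namespace of the given type
# (pid, net, mnt, uts, ipc, user). Returns 2 if either link cannot be read.
same_namespace() {
  a=$(ns_id "$1" "$3") || return 2
  b=$(ns_id "$2" "$3") || return 2
  if [ "$a" = "$b" ]; then
    echo "shared"
  else
    echo "isolated"
  fi
}

# Demo: a process always shares namespaces with itself (Linux only).
if [ -e "/proc/$$/ns/pid" ]; then
  same_namespace "$$" "$$" pid
fi
```

Run as root, comparing PID 1 on the host with a container's PID (from `docker inspect -f '{{.State.Pid}}' <container-id>`) should typically report `isolated` for `net` and `pid`, which is the whole trick behind container isolation.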

Cgroups: Resource Enforcement

Control groups limit and meter resource usage. When you set `--memory=512m` or `--cpus=2`, these settings are translated into cgroup constraints.

# View the container's memory limit (legacy cgroup v1)
cat /sys/fs/cgroup/memory/docker/<container-id>/memory.limit_in_bytes

# View the same limit on cgroup v2 (modern systems using the systemd cgroup driver)
cat /sys/fs/cgroup/system.slice/docker-<container-id>.scope/memory.max

# Take a one-shot snapshot of resource usage (omit --no-stream for a live view)
docker stats --no-stream
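
To check that a flag really landed in the cgroup, it helps to know what value to expect in those files. The sketch below converts Docker-style limits into the numbers the kernel stores. It assumes whole-number sizes with binary suffixes and the default 100000 microsecond CPU period; the function names are invented for this example.

```shell
# Convert a Docker-style size (512m, 1g, 1024k, or plain bytes) to bytes.
# Suffixes are binary multiples: 1m = 1024 * 1024 bytes.
# Assumption: whole numbers only, no fractional sizes.
mem_to_bytes() {
  num=${1%[bkmgBKMG]}
  case "$1" in
    *[kK]) echo $((num * 1024)) ;;
    *[mM]) echo $((num * 1024 * 1024)) ;;
    *[gG]) echo $((num * 1024 * 1024 * 1024)) ;;
    *)     echo "$num" ;;
  esac
}

# Express a --cpus value as the "quota period" pair found in cgroup v2 cpu.max,
# assuming the default period of 100000 microseconds.
cpus_to_cpu_max() {
  awk -v cpus="$1" 'BEGIN { printf "%d 100000\n", cpus * 100000 + 0.5 }'
}

# Expected contents of memory.max and cpu.max for --memory=512m --cpus=2
mem_to_bytes 512m
cpus_to_cpu_max 2
```

If the value in `memory.max` or `cpu.max` does not match what these helpers print, the limit was probably overridden by an orchestrator or never applied.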

Understanding Resource Limit Failures:

When a container exceeds its memory limit, the kernel OOM (Out-Of-Memory) killer may terminate processes. CPU limits cause throttling, leading to degraded performance. For Windows containers, specific quirks emerge, as noted by Bernardo Meireles Correa: “The process wedges against its commit ceiling while working set still looks fine, so health checks return 200 on a container that’s already gone.”
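
For the Linux case, the kernel records OOM kills in the cgroup's `memory.events` file on cgroup v2. A minimal sketch for reading that counter follows; the function name is made up, and the caller supplies the file path because the cgroup location depends on the cgroup driver in use.

```shell
# Print the oom_kill counter from a cgroup v2 memory.events file.
# Prints 0 if the field is absent.
oom_kill_count() {
  awk '$1 == "oom_kill" { print $2; found = 1 }
       END { if (!found) print 0 }' "$1"
}

# Example invocation (path as used with the systemd cgroup driver):
#   oom_kill_count /sys/fs/cgroup/system.slice/docker-<container-id>.scope/memory.events
# Docker also records the last OOM kill on the container itself:
#   docker inspect -f '{{.State.OOMKilled}}' <container-id>
```

A non-zero counter on a container that is still "running" usually means a child process was killed while PID 1 survived, which is easy to miss if you only watch restart counts.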

Windows Container Debugging:

# Start a Windows container with a 1 GB memory limit
docker run --memory=1g mcr.microsoft.com/windows/servercore:ltsc2022

# Compare working set with committed (private) memory via PowerShell inside the container
Get-Process | Select-Object -Property Name, WorkingSet, PeakWorkingSet, PrivateMemorySize64

3. Networking: The Common Pitfall

Networking issues are a top complaint in production environments. Docker creates virtual network interfaces, bridges, and iptables/nftables rules.

The Bridge Network

By default, containers connect to the `bridge` network, where a virtual `docker0` interface routes traffic between containers and the host.

# Inspect the default bridge network
docker network inspect bridge

# View iptables rules (showing NAT and forwarding)
sudo iptables -L -n -v -t nat

Troubleshooting Networking Failures

Issue 1: Port Binding Conflicts

# Check which process is using port 80 (the trailing space keeps :8080 from matching)
sudo netstat -tulpn | grep ':80 '
# Or using ss
sudo ss -tulpn | grep ':80 '
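
Grepping for a port number is easy to get wrong, since a plain pattern for `:80` also matches `:8080`. As a sketch of a stricter check, the helper below parses `ss -ltnH` style output and compares the port field exactly. The function name is an assumption, and it expects the local address in the fourth column, which is how `ss -ltn` prints listening TCP sockets.

```shell
# Read `ss -ltnH` style output on stdin and succeed only if some socket is
# bound to exactly the given local port (so 80 does not match 8080).
port_in_use() {
  awk -v port="$1" '
    { n = split($4, parts, ":"); if (parts[n] == port) hit = 1 }
    END { exit (hit ? 0 : 1) }'
}

# Example invocation:
#   ss -ltnH | port_in_use 80 && echo "port 80 is already taken"
```

Splitting on the last colon also handles IPv6 listeners such as `[::]:80`.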

Issue 2: DNS Resolution Problems

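On the default bridge network, containers inherit a copy of the host’s DNS configuration. On user-defined networks, name resolution goes through Docker’s embedded DNS server at 127.0.0.11. If DNS fails, check `/etc/resolv.conf` inside the container.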

docker exec <container-id> cat /etc/resolv.conf
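
When reading that file, the first thing to establish is which resolver the container is actually talking to. The sketch below classifies a `resolv.conf` by its nameserver entries; it relies on 127.0.0.11 being the address of Docker's embedded DNS server on user-defined networks. The function names are invented for this example.

```shell
# List the nameserver addresses from a resolv.conf-style file.
nameservers() {
  awk '$1 == "nameserver" { print $2 }' "$1"
}

# Describe which resolver a container is using.
describe_resolver() {
  if nameservers "$1" | grep -qxF '127.0.0.11'; then
    echo "embedded Docker DNS"
  else
    echo "external resolvers:"
    nameservers "$1"
  fi
}

# Example invocation against a live container:
#   docker exec <container-id> cat /etc/resolv.conf > /tmp/resolv.conf
#   describe_resolver /tmp/resolv.conf
```

If the embedded DNS server is in use and lookups of other containers fail, the problem is usually network membership rather than the upstream resolver.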

Issue 3: Overlay Network MTU

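When using overlay networks in Swarm or Kubernetes, MTU mismatches can cause packet loss. Verify the MTU of your physical interfaces and set the container network MTU accordingly. In Docker, that means the daemon’s `--mtu` option for the default bridge or the `com.docker.network.driver.mtu` option on `docker network create`.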
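
As a rough guide to picking the value, the sketch below subtracts the VXLAN encapsulation overhead from the physical MTU. It assumes VXLAN over IPv4 with no additional encapsulation (encrypted overlays and some Kubernetes network plugins add more), and the function names are made up for this example.

```shell
# VXLAN over IPv4 adds 50 bytes of headers:
# outer IP (20) + UDP (8) + VXLAN (8) + inner Ethernet (14).
VXLAN_OVERHEAD=50

# Largest MTU the overlay can use without fragmenting on the underlay.
overlay_mtu() {
  echo $(( $1 - VXLAN_OVERHEAD ))
}

# Read an interface's MTU from sysfs (Linux only).
iface_mtu() {
  cat "/sys/class/net/$1/mtu"
}

# Example invocation (eth0 and my-overlay are placeholder names):
#   docker network create -d overlay \
#     --opt com.docker.network.driver.mtu="$(overlay_mtu "$(iface_mtu eth0)")" my-overlay
overlay_mtu 1500
```

Cloud networks with a lower underlay MTU than 1500 are the usual culprit when large responses stall while small requests succeed.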

Container-to-Container Communication

Use user-defined bridge networks for automatic DNS resolution between containers.

# Create a custom network
docker network create my-app-net

# Run containers on this network
docker run -d --network my-app-net --name api my-api:latest
docker run -d --network my-app-net --name web my-web:latest

4. The Union Filesystem: Layers and Storage

The container’s filesystem is a stack of read-only image layers with a writable container layer on top, presented through a union filesystem. OverlayFS (the `overlay2` driver) is the current default; AUFS and Device Mapper are legacy storage drivers.

Understanding OverlayFS

OverlayFS combines multiple directories into a single unified view. The lower directories are read-only, and the upper directory is writable.

# View the overlay mount points
mount | grep overlay

# Inspect the storage driver in use
docker info | grep "Storage Driver"

Copy-on-Write (CoW) Behavior

When you modify a file inside a container, the file is copied from the read-only layer to the writable layer, and subsequent writes go to the writable layer. This means image layers remain unchanged.
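
The copy-on-write rule can be imitated with two plain directories standing in for a read-only layer and the writable layer. This is only a simulation of the lookup and copy-up logic, not a real overlay mount (which needs root and `mount -t overlay`); the function names and argument order are assumptions for this example.

```shell
# Arguments for both helpers: <upper-dir> <lower-dir> <relative-path> [text]

# Resolve a path the way a union view would: the upper (writable) directory
# wins, otherwise fall through to the lower (read-only) directory.
union_read() {
  if [ -e "$1/$3" ]; then
    cat "$1/$3"
  else
    cat "$2/$3"
  fi
}

# Copy-on-write: on first write, copy the file up from the lower directory,
# then append only to the upper copy. The lower file is never modified.
union_append() {
  mkdir -p "$(dirname "$1/$3")"
  if [ ! -e "$1/$3" ] && [ -e "$2/$3" ]; then
    cp "$2/$3" "$1/$3"
  fi
  printf '%s\n' "$4" >> "$1/$3"
}
```

The copy-up step is why the first write to a large file inside a container is slow: the whole file is copied before the change is applied.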

Persistent Storage:

Volumes are the preferred way to persist data, as they bypass the union filesystem and provide better performance.

# Create a named volume
docker volume create my-data

# Mount the volume
docker run -v my-data:/app/data my-app:latest

Common Filesystem Issues

  • Disk Space Exhaustion: Write-heavy containers can fill the `/var/lib/docker` partition. Monitor usage and clean up unused layers. Note that `prune -a` removes every unused image, not just dangling ones.
    docker system df
    docker system prune -a

  • File Permission Errors: Usually caused by UID/GID mismatches between the container user and the host, especially with bind-mounted host directories. On SELinux systems, add the `:z` (shared) or `:Z` (private) label option to the mount.
  • Storage Driver Performance: Device Mapper in loopback mode is slow. OverlayFS is recommended.

5. Production-Ready Observability and Debugging

Ashley Nicholson noted: “Container abstraction hides complexity, but production issues often surface in networking and resource constraints, where observability and runtime understanding become critical.” Here’s a toolkit for deep inspection.

Inspecting a Running Container

# General container info
docker inspect <container-id>

# Resource usage
docker stats <container-id>

# Process list
docker top <container-id>

# View logs
docker logs -f <container-id>

Using `nsenter` for Kernel-Level Inspection

`nsenter` lets you execute commands in the context of another process’s namespaces, providing powerful debugging capabilities.

# Get the container's PID
PID=$(docker inspect -f '{{.State.Pid}}' <container-id>)

# Enter the container's network namespace (starts a shell inside it)
sudo nsenter -t "$PID" -n

# Check network stats within the container's network stack
ip -s link
ss -tunap

Cgroup and System Analysis

# View cgroup v2 CPU statistics for a container (systemd cgroup driver)
cat /sys/fs/cgroup/system.slice/docker-<container-id>.scope/cpu.stat

# Memory statistics
cat /sys/fs/cgroup/system.slice/docker-<container-id>.scope/memory.stat
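
The raw counters in `cpu.stat` are easier to act on as a ratio. As a sketch, the helper below computes the share of scheduling periods in which the container was throttled, using the `nr_periods` and `nr_throttled` fields from cgroup v2. The function name is an assumption, and the caller passes the file path.

```shell
# Print the percentage of CPU periods in which the cgroup was throttled,
# based on nr_periods and nr_throttled in a cgroup v2 cpu.stat file.
throttled_percent() {
  awk '$1 == "nr_periods"   { p = $2 }
       $1 == "nr_throttled" { t = $2 }
       END {
         if (p > 0) printf "%.1f\n", 100 * t / p
         else print "0.0"
       }' "$1"
}

# Example invocation:
#   throttled_percent /sys/fs/cgroup/system.slice/docker-<container-id>.scope/cpu.stat
```

A container that looks idle on average CPU but shows a high throttled percentage is hitting its quota in short bursts, which is a common source of unexplained latency spikes.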

Profiling Resource Limits

Set alerts on metrics like container CPU throttling, memory usage, and network packet drops. Use Prometheus with cAdvisor for full monitoring.

6. Security and Hardening

Isolation is not security. Docker containers share the host kernel, so a kernel exploit can compromise all containers.

Security Best Practices

  1. Drop All Capabilities: Start with `--cap-drop=ALL` and add only necessary capabilities.
  2. Run as Non-Root User: Use the `USER` instruction in the Dockerfile.
  3. Read-Only Root Filesystem: Use `--read-only` to prevent writes to the container filesystem.
  4. Seccomp and AppArmor Profiles: Restrict system calls using seccomp.
    docker run --security-opt seccomp=/path/to/profile.json nginx
  5. No Privileged Containers: Avoid `--privileged` mode.
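
Several of these practices can be combined on one command line. The sketch below only prints the resulting `docker run` command rather than executing it. The UID:GID `1000:1000`, the `/tmp` tmpfs (a writable scratch area for an otherwise read-only container), and the function name are illustrative assumptions; `my-api:latest` is the placeholder image used earlier.

```shell
# Print (not run) a docker run command that drops all capabilities, uses a
# read-only root filesystem, and runs as a non-root user.
# Usage: hardened_run_cmd <image> [capability-to-add ...]
hardened_run_cmd() {
  image=$1
  shift
  printf 'docker run --cap-drop=ALL --read-only --tmpfs /tmp --user 1000:1000'
  for cap in "$@"; do
    printf ' --cap-add=%s' "$cap"
  done
  printf ' %s\n' "$image"
}

hardened_run_cmd my-api:latest NET_BIND_SERVICE
```

Docker's default seccomp profile still applies here because no `--security-opt seccomp=...` override is given.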

Validating Security Configuration

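# Check which capabilities were added to or dropped from a container
docker inspect <container-id> | grep -E 'CapAdd|CapDrop'

# Confirm a seccomp filter is active under Docker's default profile (Seccomp: 2 means filter mode)
docker run --rm ubuntu grep Seccomp /proc/self/status

# For comparison only: seccomp=unconfined disables the filter entirely
docker run --rm -it --security-opt seccomp=unconfined ubuntu /bin/bash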

What Undercode Say

  • Docker is a Process Manager, Not a Virtual Machine: Understanding this fundamental truth separates those who merely use Docker from those who can debug it under pressure.
  • Production Failure ≠ Code Failure: Most container issues relate to resource quotas and networking misconfigurations, not application logic. Observability tools like nsenter, docker inspect, and cgroup stats are essential.

Analysis

The container ecosystem has shifted from “it works on my machine” to “it broke in production under load.” Saqib Shah correctly notes that the classic failure occurs when “the kernel caps the process.” This highlights a critical gap—development environments rarely stress test resource limits. Meanwhile, Windows containers introduce additional complexity with memory commit limits and working set tracking. The underlying trend is clear: as organizations adopt microservices, the need for sysadmin-level understanding of Linux primitives is re-emerging. This isn’t a regression but an evolution—operations are now a core developer skill.

Prediction

  • -1: In the next year, we’ll see a surge in OOM-killed containers as developers migrate more stateful workloads to Kubernetes without proper memory profiling. This will lead to significant production outages.
  • +1: The increasing maturity of eBPF (Extended Berkeley Packet Filter) will provide zero-overhead observability into container namespaces and cgroups, making issues like the Windows commit ceiling transparent and allowing for proactive auto-scaling.
  • -1: The rise of AI-assisted coding will exacerbate the issue, as generated code will lack the nuanced understanding of container resource constraints, leading to “hallucinated” configurations that perform well in static tests but fail under dynamic load.
  • +1: Platform engineering teams will invest heavily in chaos engineering and load testing pipelines, embedding resource limit validation into CI/CD, shifting the “it works locally” failure leftward in the development lifecycle.
  • -1: As containerization expands to more edge and IoT devices, the lack of robust cgroup v2 support on older kernels will fragment security and resource management, increasing the attack surface and operational complexity.

Reported By: Alexxubyte Systemdesign – Hackers Feeds