Docker is software for containerizing applications. Today, we’ll talk about what containerization and Docker are, what they are used for, and what advantages they bring.
Containerization is one of the methods of virtualization. To understand it better, let’s take a brief historical detour.
In the 1960s, computers couldn’t perform multiple tasks at once. This led to long queues for access to such rare machines. The solution was to distribute computing power among different isolated processes. That’s how the history of virtualization began.
Virtualization is the allocation of computing resources to isolated processes within a single physical device.
The main development of virtualization came during the Internet era. Imagine you’re a business owner and you want your company to have a website. You need a server connected to the global network. Today, that’s as easy as visiting hostman.com and choosing a server that fits your needs.
But in the early days of the internet, such convenient services didn’t exist. Companies had to buy and maintain servers on their own, which was inconvenient and expensive. This problem led to the rise of hosting providers: companies that purchased hardware, placed it in their facilities, and rented out servers.
As technology advanced, computers became more powerful, and dedicating a full physical server to a single website became wasteful. Virtualization helped: several isolated virtual machines could run on one computer, each hosting different websites. The technology allowed allocating exactly as many resources as each site needed.
However, that still wasn’t enough. As the internet evolved, the number of applications required for running a website grew, and each required its own dependencies. Eventually, it became “crowded” within a single virtual machine. One workaround was to host each application in its own virtual machine, a kind of virtual “matryoshka doll.” But a full VM was still excessive for a single application: it didn’t need a full OS instance. Meanwhile, virtual machines consumed a lot of resources, much of which went unused.
The solution was containerization. Instead of running a separate virtual machine for each application, developers found a way to run them in isolation within the same operating system. Each container includes the application, its dependencies, and libraries: an isolated environment that ensures consistent operation across systems.
What is a program? It’s a piece of code that must be executed by the CPU.
When you run a container, Docker (through the containerd component) creates an isolated process with its own namespace and file system. To the host system, the container looks like a regular process, while to the program inside it, everything appears as if it’s running on its own dedicated system.
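You can observe this duality yourself. A minimal sketch, assuming a Linux host with Docker installed (the container name demo is illustrative):

docker run -d --name demo alpine sleep 300   # start a container in the background
docker top demo                              # the container's process, seen through Docker
ps aux | grep "sleep 300"                    # the same process, visible on the host
docker exec demo ps                          # inside, only the container's own processes exist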
Containers are isolated but can communicate with each other via networks, shared volumes, or sockets, if allowed by configuration.
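As a hedged sketch of such communication (the names mynet and web are illustrative): on a user-defined network, containers can reach each other by name.

docker network create mynet                           # create an isolated network
docker run -d --name web --network mynet nginx        # first container joins it
docker run --rm --network mynet alpine ping -c 2 web  # second one reaches it by name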
Isolation from the host OS raises a natural question: where does a container's data live? Docker offers three storage mechanisms:
Docker Volume: a storage unit created and managed by Docker itself. It can be located anywhere: within the host’s file system or on an external server.
Bind Mount: storage manually created by the user on the host machine, which is then mounted into containers during runtime.
tmpfs mount: temporary in-memory storage. It is erased when the container stops.
In production environments, volumes are most commonly used, as Docker manages them more securely and reliably.
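A quick sketch of all three options (volume, directory, and image names are illustrative):

docker volume create app_data                 # a volume managed by Docker
docker run -d -v app_data:/data redis         # mount it into a container
docker run -d -v /home/user/conf:/conf nginx  # bind mount: a host directory
docker run -d --tmpfs /cache nginx            # tmpfs: in-memory, erased on stop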
Docker’s architecture consists of several key components that work together to build, run, and manage containers:
The Docker host is a physical or virtual machine running the Docker Engine. This is where images are stored and containers run.
The Docker Engine (Docker daemon) is the central service responsible for building, running, and managing containers. Since Docker 1.11, Docker Engine has used containerd, a low-level component that directly manages container lifecycles (creation, start, stop, and deletion).
containerd is a container runtime that interacts with the operating system kernel to execute containers. It's used not only by Docker but also by other systems such as Kubernetes. Docker Engine communicates with containerd via an API, passing commands received from the client.
The Docker CLI is the command-line interface through which users interact with Docker. CLI commands are sent to the Docker daemon via a REST API (usually over a Unix socket or TCP).
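You can even talk to that API directly. A minimal sketch on a Linux host, where the daemon listens on a Unix socket by default:

curl --unix-socket /var/run/docker.sock http://localhost/version          # daemon and API versions
curl --unix-socket /var/run/docker.sock http://localhost/containers/json  # the same data as docker ps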
A Docker image is a template that includes an application and all its dependencies. It’s similar to a system snapshot from which containers are created.
A Dockerfile is a text file containing instructions on how to build an image. It defines the base image, dependency installation commands, environment variables, and the application's entry point.
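As an illustration, a minimal Dockerfile for a Python application might look like this (the file names here are assumptions for the example):

# Base image with Python preinstalled
FROM python:3
# Working directory inside the image
WORKDIR /app
# Install dependencies first, so this layer is cached between builds
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy the application code
COPY . .
# Default command when a container starts
CMD ["python", "main.py"]

Running docker build -t myapp . in the same directory turns these instructions into an image named myapp.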
A Docker container is a running instance of an image. A container is isolated from other processes and uses host resources through Docker Engine and containerd.
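The basic lifecycle fits in a few commands (the name web is illustrative):

docker run -d --name web nginx  # create and start a container from an image
docker ps                       # list running containers
docker stop web                 # stop the container
docker rm web                   # remove it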
A registry is a repository for storing and distributing Docker images. There are public and private registries; the most popular public one is Docker Hub, which Docker connects to by default.
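Exchanging images with a registry also takes just a few commands (the account name myname is illustrative):

docker pull nginx                  # download an image from Docker Hub
docker tag myapp myname/myapp:1.0  # label a local image for your account
docker push myname/myapp:1.0       # upload it (after docker login)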
Docker Compose is a tool for defining and running multi-container applications using YAML files. It allows developers to configure service dependencies, networks, and volumes for entire projects.
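A hedged sketch of such a file, docker-compose.yml, for a web application with a database (service names, port, and images are assumptions for the example):

services:
  web:
    build: .          # build the image from the local Dockerfile
    ports:
      - "8000:8000"   # expose the app on the host
    depends_on:
      - db            # start the database first
  db:
    image: postgres:16
    volumes:
      - db_data:/var/lib/postgresql/data   # persist the database in a volume
volumes:
  db_data:

A single docker compose up then builds, creates, and starts both services together.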
What does isolation provide in terms of security?
An isolated application cannot easily harm the host operating system.
By default, it has no access to the host's file system, which helps prevent data leaks.
Any application crash stays inside the container and won't affect the host OS.
A container image can be run on any device with Docker installed.
Docker automates application deployment and configuration, saving time and reducing human error.
Docker users have access to repositories with thousands of ready-to-use images for various purposes.
Unlike virtual machines, Docker containers don’t require a separate OS instance, allowing better use of computational resources.
Now let’s move from theory to practice. The first thing we need to do is install Docker.
Installation begins at the official website: docker.com. Go to the “Get Started” section and choose the version for your operating system. In our case, it’s Windows. Installation guides for other OSs are also available. After installation, a system reboot is required.
On Windows, Docker requires a virtualization backend: software that allows another operating system kernel to run alongside the host. We'll use WSL2 (Windows Subsystem for Linux 2), which runs a real Linux kernel inside a lightweight virtual machine.
Docker installs WSL2 automatically, but you must manually download the latest Linux kernel update. Go to Microsoft’s website, download, and install the update package. After rebooting, Docker Desktop will open.
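To check that everything works, open a terminal and run:

docker --version        # prints the installed Docker version
docker run hello-world  # pulls and runs a tiny test image

If the hello-world container prints its greeting, the installation is complete.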
Let’s print the message “Hello World” to the console using a simple Python script:
#!/usr/bin/python3
print("Hello World")
Since we won’t invoke the Python interpreter explicitly, the script needs a shebang—the first line, which tells the Linux kernel which interpreter should execute the file. (The kernel honors the shebang only if the file is executable; with default settings, Docker Desktop typically exposes files on Windows bind mounts as executable.) Let’s name our file the classic way: main.py.
Now open the command line. To run the script, execute:
docker run -v D:\script_dir:/dir python:3 /dir/main.py
Let’s break this down:
docker run: runs a container
-v: mounts a directory into the container (a bind mount)
D:\script_dir: the host directory with our script
/dir: the mount point inside the container
python:3: the image to use
/dir/main.py: the command to execute (our script)
What happens when this command is executed? Docker looks for the python:3 image locally first, then in the registry, and pulls it if necessary. Next, it mounts our script directory into the container and runs the script inside it.
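If everything is set up correctly, the console prints:

Hello World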
In this article, we explored what Docker is, how it works, and even ran our first script. Docker and containerization are not a cure-all, but they’re invaluable tools in modern software development.