
September 20, 2022


Exploring Default Docker Networking
Part 2


Well, hello everyone. I’m back to continue my exploration of default Docker networking. If you haven’t already read Exploring Default Docker Networking Part 1 (or it has been a while since you did), I would recommend checking it out. In that post, I explain what “Default Docker Networking” means, then put on my headlamp and climbing gear as I go deep down into the layer 1 (physical) and layer 2 (data link) aspects of container networking and how Linux networking concepts like virtual ethernet links and bridges provide the “magic.”

In this post, I’m going to continue the exploration into layer 3 (network), shining the light on how container network traffic is routed, NATed, and filtered. So finish a protein bar, take a drink from your canteen, and let’s shed some light on this next part of our journey!

Our default Docker bridge network exploration map

Part 1 ended with a network topology drawing of the container networking discussed throughout the post. For today’s exploration, I’ve expanded that topology beyond the Linux bridge, veths, and containers from Part 1, adding network details from outside the layer 2 space of the containers.

Linux and Container Networking Topology — Here we see the docker0 network and how it connects to the Linux host network and external hosts.

The additions to this topology drawing include:

A fourth container called web has been added. This container is running a web server on port 80 that has been “published” (made available outside the container network) on port 81.
The Linux host’s primary network link, ens160, with its IP address of 172.16.211.128 has been added.
The network processing layer from Linux that provides routing, filtering, and other network functions has been added.
Two additional hosts in the lab network outside of the Linux host running the containers have been shown, along with basic connectivity indications.

A ping in the dark…

We are going to start with a simple test: can we ping from a container to the primary IP address of the Linux host running that container? Specifically, we will ping from C1 to ens160’s IP address of 172.16.211.128.

root@c1:/# ping 172.16.211.128
PING 172.16.211.128 (172.16.211.128) 56(84) bytes of data.
64 bytes from 172.16.211.128: icmp_seq=1 ttl=64 time=1.80 ms
64 bytes from 172.16.211.128: icmp_seq=2 ttl=64 time=0.051 ms
64 bytes from 172.16.211.128: icmp_seq=3 ttl=64 time=0.092 ms
^C
--- 172.16.211.128 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2021ms
rtt min/avg/max/mdev = 0.051/0.646/1.796/0.813 ms

The success is probably not all that surprising, and it may not even feel that satisfying. I mean, we network engineers ping between two hosts all the time, right? So, why bother? Let me break it down for you.

First up, remember that the container and the Linux host are on 2 different layer 2 networks. The container is on 172.17.0.0/16, and the host is on 172.16.211.0/24. We would assume this type of traffic would involve routing. So let’s check the routing table on the involved devices.

The container’s table below shows us that the default route is used to reach the 172.16.211.0/24 network.

root@c1:/# ip route
default via 172.17.0.1 dev eth0 
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.2

And the host’s table shows that there is actually a route for 172.17.0.0/16 through the docker0 interface.

root@expert-cws:~# ip route
default via 172.16.211.2 dev ens160 proto dhcp src 172.16.211.128 metric 100 
172.16.211.0/24 dev ens160 proto kernel scope link src 172.16.211.128 
172.16.211.2 dev ens160 proto dhcp scope link src 172.16.211.128 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1

Nothing seems out of the ordinary, but don’t leave just yet. I swear there is a point to this part of the exploration.
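
To make the two lookups concrete, here is a minimal Python sketch of longest-prefix matching against the routing tables shown above. It is an illustration only, not how the kernel implements routing. The `lookup` helper and the `C1_ROUTES` and `HOST_ROUTES` lists are hypothetical names, the routes are transcribed by hand from the `ip route` output, and details such as metrics and the kernel's local table are ignored.

```python
import ipaddress

def lookup(routes, destination):
    """Return (prefix, device) for the most specific route covering destination."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), device)
        for prefix, device in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    # Longest prefix wins. The default route (0.0.0.0/0) matches everything
    # but has the shortest possible prefix, so it is only a last resort.
    network, device = max(candidates, key=lambda item: item[0].prefixlen)
    return str(network), device

# Routes transcribed from the `ip route` output ("default" is 0.0.0.0/0).
C1_ROUTES = [
    ("0.0.0.0/0", "eth0"),            # default via 172.17.0.1
    ("172.17.0.0/16", "eth0"),
]
HOST_ROUTES = [
    ("0.0.0.0/0", "ens160"),          # default via 172.16.211.2
    ("172.16.211.0/24", "ens160"),
    ("172.16.211.2/32", "ens160"),
    ("172.17.0.0/16", "docker0"),
]

# C1 sending to the host's ens160 address falls through to the default route.
print(lookup(C1_ROUTES, "172.16.211.128"))   # ('0.0.0.0/0', 'eth0')
# The host replying to C1 uses the connected docker0 route.
print(lookup(HOST_ROUTES, "172.17.0.2"))     # ('172.17.0.0/16', 'docker0')
```

The second result is the interesting one for this section: the host's way back to the container is through docker0, not ens160.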

Let’s look at the packets themselves using the Linux tool tcpdump to monitor the icmp traffic on ens160.

root@expert-cws:~# tcpdump -n -i ens160 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes

I started up the capture and then issued another ping from C1 to the Linux host. But I didn’t see any packets captured on interface ens160 despite the pings being successful. See, I told you it might get interesting.

Let’s change our capture to the docker0 interface instead.

root@expert-cws:~# tcpdump -n -i docker0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:51:53.427337 IP 172.17.0.2 > 172.16.211.128: ICMP echo request, id 38, seq 1, length 64
14:51:53.427373 IP 172.16.211.128 > 172.17.0.2: ICMP echo reply, id 38, seq 1, length 64

Okay, that looks better. But, why are we seeing the traffic on the docker0 interface when the destination was the address assigned to the ens160 interface? Well, because the traffic never actually reaches the ens160 interface. The networking stack within the Linux system processes the traffic, and because it is all internal to the system, there is no need for the traffic to make it to the network link/adapter.

Then why does it show up on the docker0 interface at all? Why can’t the networking stack just process it directly and leave all “interfaces” out of it? This is because of the network isolation that is used as part of Docker networking. Recall from Part 1 that when we ran ip link from within the container, we saw only the container interface and not the other interfaces from the host. And when we ran the command from the host, we did NOT see the container interfaces in the list. We saw only the host side of each veth pair. As a reminder, here is the command run on the container host.

root@expert-cws:~# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:75:99:27 brd ff:ff:ff:ff:ff:ff
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
link/ether 02:42:9a:0c:8a:ee brd ff:ff:ff:ff:ff:ff
97: vethb192fa8@if96: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default 
link/ether 46:10:b9:df:52:8b brd ff:ff:ff:ff:ff:ff link-netnsid 0
99: veth055569e@if98: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default 
link/ether 52:07:4f:3e:11:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 1
101: veth3a3ee0b@if100: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default 
link/ether 9e:51:13:75:53:52 brd ff:ff:ff:ff:ff:ff link-netnsid 2
105: vethd8a9fa5@if104: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default 
link/ether d2:86:8f:ab:75:0b brd ff:ff:ff:ff:ff:ff link-netnsid 3

Link number 97 (vethb192fa8@if96) represents the host side of the veth that connects to link number 96 on C1. But where is link number 96 in the list?

Linux network namespaces enter the story

The answer is that link number 96, the eth0 interface for C1, is in a different network namespace from the default one that holds the other links on the host.

Linux namespaces are an abstraction within Linux that allow system resources to be isolated from each other. Namespaces can be set up for many different types of resources including processes, mount points, and networks. In fact, these namespaces are key to how Docker containers run as isolated instances from each other and the host they run on.

We can view the network namespaces on our host with the list namespaces command.

root@expert-cws:~# lsns --type=net
NS         TYPE  NPROCS PID    USER   NETNSID    NSFS COMMAND
4026531992 net   375    1      root   unassigned /run/docker/netns/default /sbin/init maybe-ubiquity
4026532622 net   1      81590  uuidd  unassigned /usr/sbin/uuidd --socket-a
4026532675 net   1      1090   rtkit  unassigned /usr/libexec/rtkit-daemon
4026532749 net   2      134263 expert unassigned /usr/share/code/code --typ
4026532808 net   1      267673 root   0          /run/docker/netns/74fa6636a15f bash
4026532872 net   1      267755 root   1          /run/docker/netns/e12672b07df8 bash
4026532921 net   6      133573 expert unassigned /opt/google/chrome/chrome 
4026532976 net   1      133575 expert unassigned /opt/google/chrome/nacl_he
4026533050 net   1      267840 root   2          /run/docker/netns/5cab1255c9ae bash
4026533115 net   1      268958 root   3          /run/docker/netns/c54dcb1bd674 /bin/bash

Each of the entries with an NSFS path under /run/docker/netns/ and a NETNSID of 0 through 3 represents one of the four containers that we are running now, with the PID column identifying the specific process tied to the unique container. We can determine the PID for a container by inspecting it.

root@expert-cws:~# docker inspect c1 | jq .[0].State.Pid
267673

And with that, we can now run the command to view the network links from within the container’s network namespace.

root@expert-cws:~# nsenter -t 267673 -n ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
96: eth0@if97: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0

And BAM! There we have link number 96.
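
The `name@ifN` notation is what ties the two ends of a veth together across namespaces. As a rough illustration, the following Python sketch pairs them up by parsing `ip link` text. The `parse_links` helper and the regular expression are my own hypothetical names, and the sample strings are shortened copies of the output above, so treat this as a sketch under those assumptions and not a general-purpose parser.

```python
import re

# Matches lines such as "97: vethb192fa8@if96: <...>" and captures the
# link index, the interface name, and the peer's link index.
LINK_RE = re.compile(r"^(\d+): ([^@:\s]+)@if(\d+):")

def parse_links(ip_link_output):
    """Map link index -> (name, peer index) for every name@ifN entry."""
    links = {}
    for line in ip_link_output.splitlines():
        match = LINK_RE.match(line.strip())
        if match:
            index, name, peer = match.groups()
            links[int(index)] = (name, int(peer))
    return links

# Shortened from the host output (default namespace).
HOST_LINKS = """\
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP
97: vethb192fa8@if96: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master docker0
99: veth055569e@if98: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master docker0
"""
# Shortened from the nsenter output (C1's namespace).
C1_LINKS = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
96: eth0@if97: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
"""

host = parse_links(HOST_LINKS)
container = parse_links(C1_LINKS)

for index, (name, peer) in container.items():
    host_name, host_peer = host.get(peer, (None, None))
    # The two ends must point at each other to be one veth pair.
    if host_peer == index:
        print(f"{name} (link {index}) <-> {host_name} (link {peer})")
# prints: eth0 (link 96) <-> vethb192fa8 (link 97)
```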

And this brings us back to the question: why didn’t the host network stack process the ping directly from the container? Why do we see the traffic on the docker0 interface? Because the networking “stack” is really the network namespace. And the container’s network namespace where the ping originated is different from the default network namespace where the IP address for the ens160 interface resides. It is the virtual ethernet “cable” that allows traffic from the container namespace to reach the default namespace, through the docker0 interface. And once the traffic arrives in the docker0 interface, the networking stack can now process the request and send the reply, all through the docker0 interface.

Pinging beyond the gates… er host

So we’ve now seen how network isolation is accomplished with Linux namespaces and the impact on the interfaces involved in the network processing of traffic. For our next test, let’s send traffic outside of the Linux host where our containers are running by pinging Host01 from the network topology.

root@c1:/# ping -c 1 172.16.211.1
PING 172.16.211.1 (172.16.211.1) 56(84) bytes of data.
64 bytes from 172.16.211.1: icmp_seq=1 ttl=63 time=0.271 ms

--- 172.16.211.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.271/0.271/0.271/0.000 ms

I actually sent a single ping packet from the container, and we can see that it was successful. Before sending the ping I started up a packet capture on both the docker0 and ens160 interfaces to capture the traffic along the way and compare the differences as the traffic arrived in the default network namespace from the container and as it was sent out from the host towards its destination (as well as the return trip).

# Capture on the docker0 interface 
root@expert-cws:~# tcpdump -n -i docker0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes

17:11:13.047823 IP 172.17.0.2 > 172.16.211.1: ICMP echo request, id 41, seq 1, length 64
17:11:13.048061 IP 172.16.211.1 > 172.17.0.2: ICMP echo reply, id 41, seq 1, length 64

# Capture on the ens160 interface
root@expert-cws:~# tcpdump -n -i ens160 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes

17:11:13.047856 IP 172.16.211.128 > 172.16.211.1: ICMP echo request, id 41, seq 1, length 64
17:11:13.048024 IP 172.16.211.1 > 172.16.211.128: ICMP echo reply, id 41, seq 1, length 64

Take a look at the output above. In each capture, the first line is the echo request sent from the container, and the second line is the echo reply from the other host. Compare the source address of each request and the destination address of each reply. On the docker0 capture, that address is 172.17.0.2, the IP address assigned to the C1 interface, as would be expected. However, on the ens160 capture, it has been translated to 172.16.211.128, the IP address of the Linux host machine’s ens160 interface.

That is right, our old friend Network Address Translation (NAT) shows up in container networking as well. In fact, so does NAT’s very useful cousin PAT (Port Address Translation), but I’m getting ahead of myself.

Entering the Docker networking story… iptables!

The networks created by Docker to support a bridge-type network are built to be private and to leverage IP address space that is NOT reachable from outside the Docker-managed network. However, many services deployed and managed with Docker do require connectivity beyond the small number of containers making up the service and running on the host. Docker leverages the same network concept used elsewhere to solve this problem, Network (and Port) Address Translation (NAT/PAT). And similar to how we’ve seen Docker leveraging Linux elements like bridges and namespaces, Docker makes use of iptables to perform the address translation and filtering involved here as well.

Before we start looking at how iptables are involved in these traffic flows, I wanted to give a quick caveat. Network traffic processing and flow through the underbelly of Linux is a complicated topic, and iptables is both a powerful and complicated tool. I plan to break down the topic here in the blog to describe and explain what is happening under the hood of Docker networking in as simple and clear a way as possible. But a thorough exploration of iptables and Linux networking would be worthy of several blog posts on their own.

With iptables, rules are created and applied to the processing of network traffic as it is handled by Linux. These rules are applied at different points in the processing of traffic to accomplish a number of different tasks. Rules can be applied:

Before any routing decision is made (PREROUTING).
As traffic destined for the local host arrives (INPUT).
As traffic created by the local host is sent (OUTPUT).
As traffic “passing through” the local host is processed (FORWARD).
After the routing decision is made (POSTROUTING).
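
As a memory aid, the order in which these chains are visited can be written down as a small lookup. This Python sketch is a simplification under my own assumptions: the `TRAVERSAL` table and `chains_for` helper are hypothetical names, locally generated traffic that loops back to the same host is not modeled, and not every iptables table has every chain.

```python
# Built-in chains visited, in order, for each kind of flow. "local" means the
# namespace doing the processing; "remote" means anything outside it.
TRAVERSAL = {
    ("remote", "local"): ["PREROUTING", "INPUT"],
    ("local", "remote"): ["OUTPUT", "POSTROUTING"],
    ("remote", "remote"): ["PREROUTING", "FORWARD", "POSTROUTING"],
}

def chains_for(origin, destination):
    """Return the chains a packet passes through for the given flow."""
    return TRAVERSAL[(origin, destination)]

# Seen from the host's default namespace, a container counts as "remote".
print(chains_for("remote", "local"))    # C1 pinging 172.16.211.128
print(chains_for("remote", "remote"))   # C1 pinging Host01
```

Seen this way, the ping from C1 to Host01 is forwarded traffic, which is why it reaches POSTROUTING on its way out of the host.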

And the rules that are written can do a number of things to the traffic.

Traffic can be blocked/denied.
Traffic can be allowed/permitted.
Traffic can be redirected elsewhere.
Traffic can have its source or destination addresses changed (NAT/PAT).

Rules are added to one of the “tables” that iptables manages. The two tables worth mentioning now are the filter and the nat tables. The filter table creates rules primarily concerned with whether traffic is allowed or blocked, while the nat table has rules related to address translation. Let’s look at the nat table and see if we can find what caused the translation of the ICMP traffic from our example.

root@expert-cws:~# iptables -L -v -t nat
Chain PREROUTING (policy ACCEPT 251 packets, 67083 bytes)
 pkts bytes target     prot opt in     out     source               destination         
   18  1292 DOCKER     all  --  any    any     anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 247 packets, 66747 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 19468 packets, 1264K bytes)
 pkts bytes target     prot opt in     out     source               destination         
    3   252 DOCKER     all  --  any    any     anywhere            !localhost/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 19472 packets, 1264K bytes)
 pkts bytes target     prot opt in     out     source               destination         
   31  1925 MASQUERADE  all  --  any    !docker0  172.17.0.0/16        anywhere            

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         
    7   588 RETURN     all  --  docker0 any     anywhere             anywhere

Look at the MASQUERADE rule in the POSTROUTING chain. This is the rule that caused the translation we saw. Now, let’s break down the parts of the rule that are used to match traffic to process.

protocol = all
- Match traffic of any protocol type
in = any / out = !docker0
- Match traffic coming IN any interface and going OUT any interface other than docker0
- Traffic going OUT docker0 would be sent towards a container
source = 172.17.0.0/16 / destination = anywhere
- Match traffic from any address on the docker0 container network to any address

The “target = MASQUERADE” part describes the action this rule will take. You might be more familiar with actions like DROP or ACCEPT that show up in the filter table, but the NAT table has a different set of targets that indicate the type of translation that will occur. MASQUERADE is a type of source address translation (SNAT) that translates the source network address of the traffic to the address assigned to the interface the traffic has been routed OUT.

Consider the echo request sent from the container against this rule.

An echo request matches the “all protocol.”
The packet came in the docker0 interface (in = any) and will be going out ens160 (out = !docker0).
The source, 172.17.0.2, is within 172.17.0.0/16, and the destination is certainly “anywhere.”

When the traffic was processed against this rule the MASQUERADE target/action was taken to SNAT the source address to the IP address of the ens160 interface — which is exactly what we saw happen.
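
The match-then-translate logic can be modeled in a few lines of Python. This is only a sketch of the single rule as printed in the nat table output (source 172.17.0.0/16, out interface anything but docker0); the `Packet` class, the `masquerade` function, and the `INTERFACE_ADDRESSES` mapping are hypothetical names, and the connection tracking that reverses the translation for replies is left out.

```python
import ipaddress
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    in_iface: str
    out_iface: str

# Addresses assigned to the host interfaces in this lab.
INTERFACE_ADDRESSES = {"ens160": "172.16.211.128", "docker0": "172.17.0.1"}
CONTAINER_NETWORK = ipaddress.ip_network("172.17.0.0/16")

def masquerade(packet):
    """Model of: MASQUERADE all -- any !docker0 172.17.0.0/16 anywhere."""
    from_containers = ipaddress.ip_address(packet.src) in CONTAINER_NETWORK
    leaving_bridge = packet.out_iface != "docker0"
    if from_containers and leaving_bridge:
        # Rewrite the source to the address of the outgoing interface.
        return replace(packet, src=INTERFACE_ADDRESSES[packet.out_iface])
    return packet

echo_request = Packet("172.17.0.2", "172.16.211.1", "docker0", "ens160")
print(masquerade(echo_request).src)   # 172.16.211.128

# Container-to-container traffic leaves via docker0, so it is not translated.
c1_to_c2 = Packet("172.17.0.2", "172.17.0.3", "docker0", "docker0")
print(masquerade(c1_to_c2).src)       # 172.17.0.2
```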

Look out! There is a web (server) ahead!

So far we’ve used some ICMP traffic with ping to look at how containers can reach external networks and hosts. But, what about when a container is running a service like a web server that is designed to be available to external users? Let’s end our discussion with this example. In order to get started, we’re going to need a web server.

There is a multitude of web servers that can be run as Docker containers, but for our exploration here, I’m going to keep it very simple and use the HTTP server that is included with Python and the standard “python:3” Docker image maintained by the Python Software Foundation and Docker.

# Start the container in the background 
root@expert-cws:~# docker run -tid --rm \
    --name web --hostname web \
    -p 172.16.211.128:81:80 \
    python:3 /bin/bash

# Attach to the running container 
root@expert-cws:~# docker attach web

# Start a basic web server 
root@web:/# python -m http.server 80
Serving HTTP on 0.0.0.0 port 80 (http://0.0.0.0:80/) ...

The “docker run” command should be familiar from when we ran commands in Part 1, but there is a new option included. We need to “publish” the container’s ports which need to be made available to external hosts. A container can have no ports published, or many dozens of ports depending on the unique needs of that service.

In the command above, I am publishing port 80 from the container to port 81 on the host server’s IP address of 172.16.211.128.

If I had left off the IP address to publish the service to, Docker would have made the web server available on any/all IP addresses on the underlying host. Leaving off an explicit IP address for publishing a service is common; however, I find being explicit a better strategy. This is somewhat of a personal preference in application design.
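
To spell out how the value passed to -p breaks down, here is a small Python sketch that splits the two forms discussed here. It is an illustration and not Docker's own parser: `parse_publish` is a hypothetical helper, and it deliberately ignores the other forms Docker accepts, such as a protocol suffix, port ranges, or IPv6 addresses.

```python
def parse_publish(spec):
    """Split a simple -p value into (host IP, host port, container port)."""
    parts = spec.split(":")
    if len(parts) == 3:
        host_ip, host_port, container_port = parts
    elif len(parts) == 2:
        host_ip = None  # no address given: published on all host addresses
        host_port, container_port = parts
    else:
        raise ValueError(f"unsupported publish spec: {spec!r}")
    return host_ip, int(host_port), int(container_port)

# The form used in this post: explicit host address, host port 81, container port 80.
print(parse_publish("172.16.211.128:81:80"))   # ('172.16.211.128', 81, 80)
# The more common form without an explicit address.
print(parse_publish("81:80"))                  # (None, 81, 80)
```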

I can now attempt to access the web server from Host01.

Excellent, by browsing to the IP address of the Linux host on port 81 I am greeted with a directory listing from the container where the Python web server is running.

Tracing the web traffic with packets and tables

Let’s finish our exploration today by inspecting the traffic for the incoming web traffic and the translation rules that connect things together.

We need to change up our packet capture commands to capture the web traffic on both the ens160 and docker0 interfaces. As traffic arrives at the Linux host it will be destined to tcp port 81 and translated to tcp port 80, before it is sent out to the container.

# Capture traffic from the Linux host interface
root@expert-cws:~# tcpdump -n -i ens160 'tcp port 81'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes

18:34:59.147085 IP 172.16.211.1.64534 > 172.16.211.128.81: Flags [SEW], seq 3761281905, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 1727838954 ecr 0,sackOK,eol], length 0
18:34:59.147191 IP 172.16.211.128.81 > 172.16.211.1.64534: Flags [S.E], seq 3294439992, ack 3761281906, win 65160, options [mss 1460,sackOK,TS val 3650251894 ecr 1727838954,nop,wscale 7], length 0
.
.


# Capture traffic being sent to the containers 
root@expert-cws:~# tcpdump -n -i docker0 'tcp port 80'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes


18:34:59.147133 IP 172.16.211.1.64534 > 172.17.0.5.80: Flags [SEW], seq 3761281905, win 65535, options [mss 1460,nop,wscale 6,nop,nop,TS val 1727838954 ecr 0,sackOK,eol], length 0
18:34:59.147178 IP 172.17.0.5.80 > 172.16.211.1.64534: Flags [S.E], seq 3294439992, ack 3761281906, win 65160, options [mss 1460,sackOK,TS val 3650251894 ecr 1727838954,nop,wscale 7], length 0
.
.

I have limited the output in the post above to just the start of the request where we can see the translation at work.

In the output above, the first line of each capture is the initial request packet from the web browser to the server, and the second line is the server’s first reply as the session is established. Compare the server-side address in the two captures: 172.16.211.128.81 on ens160 becomes 172.17.0.5.80 on docker0, which is destination address translation (DNAT) at work. The source addresses are left unchanged, and in fact, in the logs below from the container, you can see the IP address of Host01.

root@web:/# python -m http.server 80
Serving HTTP on 0.0.0.0 port 80 (http://0.0.0.0:80/) ...
172.16.211.1 - - [07/Sep/2022 18:30:07] "GET / HTTP/1.1" 200 -

We can once again look at the NAT table using iptables and find the rule that provides this behavior.

root@expert-cws:~# iptables -L -v -t nat -n
Chain PREROUTING (policy ACCEPT 42 packets, 4004 bytes)
 pkts bytes target     prot opt in     out     source               destination         
   38  2712 DOCKER     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 37 packets, 3584 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 15523 packets, 1010K bytes)
 pkts bytes target     prot opt in     out     source               destination         
   17  1092 DOCKER     all  --  *      *       0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT 15530 packets, 1010K bytes)
 pkts bytes target     prot opt in     out     source               destination         
   42  2849 MASQUERADE  all  --  *      !docker0  172.17.0.0/16        0.0.0.0/0           
    0     0 MASQUERADE  tcp  --  *      *       172.17.0.5           172.17.0.5           tcp dpt:80

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination         
   14  1176 RETURN     all  --  docker0 *       0.0.0.0/0            0.0.0.0/0           
    7   448 DNAT       tcp  --  !docker0 *       0.0.0.0/0            172.16.211.128       tcp dpt:81 to:172.17.0.5:80

The DNAT rule at the bottom of the DOCKER chain has a destination of 172.16.211.128 and a translation of “tcp dpt:81 to:172.17.0.5:80”. The rule is applied during both the PREROUTING and OUTPUT phases of network processing because each of those chains has a rule whose target is the DOCKER chain; iptables allows a rule to target another chain.
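
Here is a minimal Python model of the two rules in the nat table's DOCKER chain, evaluated top to bottom. It is a sketch under my own assumptions: `TcpPacket` and `docker_nat_chain` are hypothetical names, the ADDRTYPE match on the jump into the chain is skipped, and the reverse translation applied to reply packets is not modeled.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TcpPacket:
    src: str
    sport: int
    dst: str
    dport: int
    in_iface: str

def docker_nat_chain(packet):
    """Walk the two rules of the nat table's DOCKER chain in order."""
    # Rule 1: RETURN for anything that arrived on docker0.
    if packet.in_iface == "docker0":
        return packet
    # Rule 2: DNAT tcp, in !docker0, dst 172.16.211.128, dpt:81 to:172.17.0.5:80
    if packet.dst == "172.16.211.128" and packet.dport == 81:
        return replace(packet, dst="172.17.0.5", dport=80)
    return packet

# The first packet from the ens160 capture (Host01's browser to the host).
syn = TcpPacket("172.16.211.1", 64534, "172.16.211.128", 81, "ens160")
translated = docker_nat_chain(syn)
print(translated.dst, translated.dport)   # 172.17.0.5 80
print(translated.src, translated.sport)   # 172.16.211.1 64534
```

The source address and port come through untouched, which matches what the web server logged.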

A quick stop at the filter table

There is one final stop in our exploration of the traffic flows I want to make before finishing up. So far our iptables commands have targeted the NAT table (-t nat). Let’s take a look at the filter table where the ACCEPT/DROP rules are found.

root@expert-cws:~# iptables -L -v -t filter -n
Chain INPUT (policy ACCEPT 231K packets, 26M bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
  141 15676 DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
  141 15676 DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
39461  118M ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
   13   952 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
30852 1266K ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    6   504 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT 215K packets, 18M bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    7   448 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.5           tcp dpt:80

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
30852 1266K DOCKER-ISOLATION-STAGE-2  all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
70326  120M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DROP       all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
30852 1266K RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
 pkts bytes target     prot opt in     out     source               destination         
70326  120M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0

Most of the DOCKER-related aspects found in the filter table are there to ensure the network isolation of containers. However, the single ACCEPT rule in the DOCKER chain is key to how services are exposed from a container to the outside world. This rule will ACCEPT tcp port 80 traffic destined for 172.17.0.5 (the web container) that arrives on any interface other than docker0 and goes out interface docker0. This rule uses the container’s actual IP address and port because the filtering happens after the DNAT from the NAT table.
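
The ordering point is easy to check with a tiny Python model of that one ACCEPT rule. This is a sketch and not a full walk of the FORWARD chain: `docker_filter_accepts` is a hypothetical helper, the other FORWARD rules are ignored, and the "before DNAT" case is a thought experiment only, since an untranslated packet would not be routed out docker0 in the first place.

```python
def docker_filter_accepts(dst, dport, in_iface, out_iface):
    """Model of: ACCEPT tcp -- !docker0 docker0 0.0.0.0/0 172.17.0.5 tcp dpt:80."""
    return (
        in_iface != "docker0"
        and out_iface == "docker0"
        and dst == "172.17.0.5"
        and dport == 80
    )

# Destination as the client wrote it, before DNAT.
before_dnat = ("172.16.211.128", 81)
# Destination as the filter table sees it, after DNAT.
after_dnat = ("172.17.0.5", 80)

print(docker_filter_accepts(*after_dnat, in_iface="ens160", out_iface="docker0"))    # True
print(docker_filter_accepts(*before_dnat, in_iface="ens160", out_iface="docker0"))   # False
```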

The light at the end of the default Docker networking journey

And so we find ourselves at the end of this exploration of the default Docker Networking. Looking around the group, I’m glad to see that we didn’t lose anyone along the way, but I know it was a close one. And you might not believe me, but even after another 3,500 words on the topic of Docker networking (a total of over 7,000 between Parts 1 and 2), there is plenty more to explore on the topic. Overlay networks, how DNS works for containers, custom network plugins, and (gasp) Kubernetes networking are all out there for you to explore!

My goal for this short series was to help give a foundation on which you can continue to build your knowledge around container networking and to make the topic less mysterious or daunting for network engineers new to it. It can be very easy to become intimidated when working through container introductions that “just work” but don’t explain “why they work” or “how they work.” If I did my job right, the magic isn’t so magical anymore.

Here are a few links to other resources worth checking out for more information on the topic.

In Season 2 of NetDevOps Live, Matt Johnson joined me to do a deep dive into container networking. His session was fantastic, and I reviewed it when getting ready for this post. I highly recommend it as another great resource.
The Docker documentation on networking is very good. I referenced it quite often when putting this post together.
The man pages for Linux namespaces and iptables are excellent resources about these important technologies that enable Docker networking.
And check out the man page for tcpdump if you’d like to do more packet capturing.

And as always, please let me know what you thought of this post in the comments or over on Twitter. What should I “explore” next here on the blog? Thanks for reading!


Hank Preston

Principal Engineer

Learning and Certifications

93 Comments

Mrigank Shekhar Chaubey says:

September 21, 2022 at 2:52 am

nice port Hank, bookmarked
- Hank Preston says:
  
  September 21, 2022 at 5:34 am
  
  Thanks Mrigank!
Eric says:

September 21, 2022 at 5:46 pm

Excellent follow up from your part 1. I hope you plan to continue the conversation by touching on Kubernetes networking
- Hank Preston says:
  
  September 22, 2022 at 11:31 am
  
  Thanks Eric!
  
  I’m probably going to tackle another topic in the next few blogs, but I would love to dive into Kubernetes networking as well. So many topics, so little time 🙂
- Juan Carlos Bolívar López says:
  
  October 4, 2022 at 8:49 am
  
  It is interesting to implement Docker in a network.
Frank says:

September 22, 2022 at 7:11 am

Linux distros are moving away from iptables to nftables, and Red Hat is promoting their Podman(default to Rootless) to replace insecure-by-default Rootful Docker, I read better IPv6 support: https://www.redhat.com/sysadmin/podman-new-network-stack
- Anonymous says:
  
  September 28, 2022 at 3:50 pm
  
  Can I join you
Yinchuan says:

September 22, 2022 at 7:28 pm

It’s time to check out podman networking
- Hank Preston says:
  
  September 23, 2022 at 5:18 am
  
  I haven’t looked at podman at all yet. Never enough time for everything :-).
- Dilwarsha says:
  
  September 28, 2022 at 3:52 pm
  
  I would like to be successful in my life with your helping also to be in you country
Kevin Diggs says:

October 4, 2022 at 9:07 am

Wow Hank! Thanks for this detailed blog entry.
- Hank Preston says:
  
  October 4, 2022 at 2:12 pm
  
  You are welcome Kevin!
Uvindhu says:

October 4, 2022 at 9:12 am

Nice Work
- Hank Preston says:
  
  October 4, 2022 at 2:12 pm
  
  Thanks so much!
Randy Gates says:

October 4, 2022 at 9:50 am

I look forward to more information – Nice Job with the explanation.
- Hank Preston says:
  
  October 4, 2022 at 2:13 pm
  
  Oh, there’s so much more information to come!
Steven J says:

October 4, 2022 at 11:11 am

Thanks
- Hank Preston says:
  
  October 4, 2022 at 2:13 pm
  
  You are welcome Steven, thanks for reading.
Jack H says:

October 4, 2022 at 12:27 pm

I look forward to more information – Nice Job with the explanation.
- Hank Preston says:
  
  October 4, 2022 at 2:14 pm
  
  You are welcome!
Alim B. says:

October 4, 2022 at 1:32 pm

Thank you for sharing!
- Hank Preston says:
  
  October 4, 2022 at 2:14 pm
  
  You are welcome! Thanks for stopping by to check it out.
Eduardo Montes de Oca Huerta says:

October 4, 2022 at 4:35 pm

Nice work
David Liang says:

October 4, 2022 at 6:10 pm

Nice, lots to learn.
Anonymous says:

October 4, 2022 at 6:19 pm

Good coverage of iptables; as you say, a complex topic.
- Hank Preston says:
  
  October 5, 2022 at 6:03 am
  
  Thanks. It’s a decent start, but there is a lot in that topic.
Brennan Truong says:

October 4, 2022 at 7:16 pm

Good post
Cody says:

October 4, 2022 at 7:58 pm

Nice, this has been a great review of Docker’s networking! Looking forward to the next installment.
- Hank Preston says:
  
  October 5, 2022 at 6:04 am
  
  Glad you found it useful!
Bmes says:

October 4, 2022 at 9:57 pm

Interesting reading, thx for sharing.
- Hank Preston says:
  
  October 5, 2022 at 6:05 am
  
  You are welcome. Thanks for reading.
Dzhon Gakhokidze says:

October 4, 2022 at 10:31 pm

great topic about docker networking
- Hank Preston says:
  
  October 6, 2022 at 5:38 am
  
  Thanks! I love how much “networking” is in container/Docker networking concepts. You just need to dive in.
cesar cabrera says:

October 4, 2022 at 11:30 pm

Interesting read about docker; the first time I read something like this. Nice reading
- Hank Preston says:
  
  October 6, 2022 at 5:39 am
  
  Glad to have had something to bring you in!
Carlos Assunção says:

October 5, 2022 at 12:51 am

Thank you for sharing!
Scott Fenn says:

October 5, 2022 at 1:36 am

Great stuff Hank, Bookmarking this one!
- Hank Preston says:
  
  October 5, 2022 at 6:05 am
  
  I often go back to some of my own posts as reminders of how things work. They make great notes on tech.
Simon D says:

October 5, 2022 at 1:47 am

Thanks Hank
Dominik Pecak says:

October 5, 2022 at 3:20 am

Great content!
Rafael says:

October 5, 2022 at 9:20 am

Nice Information, thanks!
hstoddart says:

October 5, 2022 at 9:30 am

nice work and excellent information.
Ajet says:

October 5, 2022 at 10:52 am

Very detailed blog entry. Well done
- Hank Preston says:
  
  October 6, 2022 at 5:39 am
  
  Thanks! I always worry that my blogs might get too long/technical. Glad they are appreciated.
Jeremy Adkins says:

October 5, 2022 at 1:01 pm

Very technically insightful.
Cristian says:

October 5, 2022 at 3:46 pm

Great information about a fantastic tool! Well done!
- Hank Preston says:
  
  October 6, 2022 at 5:40 am
  
  Docker is a great tool built on top of a lot of other great tools and concepts present in Linux for a long time.
Brad says:

October 5, 2022 at 4:34 pm

Great content as always Hank!
KAMAL KUMAR says:

October 5, 2022 at 8:40 pm

Nice work, great reading it, and excellent information.
Anonymous says:

October 6, 2022 at 12:08 pm

Great article, thanks.
Randy says:

October 6, 2022 at 1:14 pm

I have heard about Docker for so long but never really understood it very well. This blog really helps put it in focus!
- Hank Preston says:
  
  October 10, 2022 at 5:39 am
  
  Glad to have helped Docker become a little clearer!
Brandon says:

October 6, 2022 at 3:50 pm

I enjoyed the technical breadth of this article. Great content that will be useful for future reference.
- Hank Preston says:
  
  October 10, 2022 at 5:40 am
  
  Excellent! I know I’ll be back to this post next time I need to understand or troubleshoot something related to container networking.
Neil M says:

October 7, 2022 at 9:01 am

This is a great read, as I don’t currently understand Docker networking and I just know that our organization and its developers are going to need help to implement/run systems built on it soon. Thanks!
- Hank Preston says:
  
  October 10, 2022 at 5:40 am
  
  Did you check out Part 1 as well (https://blogs.cisco.com/learning/exploring-default-docker-networking-part-1)? It might be useful too.
Tomas Delgado says:

October 7, 2022 at 8:01 pm

Useful info, bookmarked !! (and thank you)
Ibrahim Yussuf says:

October 9, 2022 at 8:50 am

Great content, thanks
Larry Persing says:

October 10, 2022 at 4:33 am

Interesting, thank you for the overview.
David Anderson says:

October 10, 2022 at 6:32 am

Looks like Linux and Cisco APIs in XE or above are now the requirements.
Andres Melian says:

October 10, 2022 at 11:23 pm

Excellent follow up from your part 1
Mohammad Tusi says:

October 11, 2022 at 7:21 am

Thank you.
jonathan cabanillas says:

October 11, 2022 at 10:31 am

Great Blog, thank you
Alessandro Abrahao says:

October 11, 2022 at 3:11 pm

Thanks!!!
Gary Parsons says:

October 12, 2022 at 5:11 pm

The new ways of integrating components in the network are mind-boggling!
- Hank Preston says:
  
  October 17, 2022 at 5:50 am
  
  Every time I learn something new I feel a bit mind boggled. But container networking isn’t any more mind boggling than learning BGP, Frame Relay, or any other new technology that I have had to learn along the way. And less boggling than some. I’m looking at you, multicast routing and QoS 🙂
Rune Johannesen says:

October 13, 2022 at 11:49 pm

You are an inspiration Hank!
Philip Carter says:

October 14, 2022 at 5:57 pm

Interesting reading, thanks for sharing.
Lots to digest, especially as more Cisco products (anything Webex related) are now utilising Docker under the hood. It’s great to be able to gain a better insight/understanding on how it all works…

I’ll be heading back to this blog series a few times to go through it again and understand it better…
- Hank Preston says:
  
  October 17, 2022 at 5:49 am
  
  Thanks Philip. Indeed, Docker is becoming a pervasive technology in all areas of IT. Having an understanding of how it works helps build confidence when working with new technologies.
Muhamed Buzaljko says:

October 17, 2022 at 6:50 am

Very useful, thank you for your time and effort.
Stefani R says:

October 17, 2022 at 11:03 am

Very very useful!!
Ranydvc says:

October 17, 2022 at 1:07 pm

great info!!
Oyewole Fakeye says:

October 20, 2022 at 9:42 am

I would like to be successful in my life with your help. Nice one
Amit Gumber says:

October 20, 2022 at 10:59 pm

Great Learning Content
pp says:

October 21, 2022 at 1:08 am

Great, thanks for sharing
Muhammad Rafi says:

October 21, 2022 at 12:49 pm

Nice one as always @Hank. I am keen to see if we take this to the next level and release a dockerized router or switch 🙂
- Hank Preston says:
  
  October 24, 2022 at 6:55 am
  
  Thanks. I’m curious myself on whether we’ll see significant need and desire for something akin to a traditional router/switch to replace the underlying Linux networking concepts that containers rely on. I could probably debate either side of that question. Time will tell!
Michael says:

October 23, 2022 at 7:01 am

I don’t fully understand this concept in terms of L2 and L3 encapsulation at the hops from the source container to the destination IP address. If the ICMP packet from a container isn’t captured on the ens160 interface, would a payload from a container sent to the ens160 interface reach the ens160 interface as a frame, after Linux networking processes the packet?
- Hank Preston says:
  
  October 24, 2022 at 6:51 am
  
  I agree it doesn’t seem like it should be the case, but as we saw in my testing above in the section “A ping in the dark…”, pings to the IP address assigned to the ens160 interface on the host are NOT captured on the ens160 interface.
  
  I think about it this way.
  
  Capturing traffic on an interface is done somewhere just before/after the traffic is received/sent at the interface. So it happens “below” the IP layer where ICMP is processed.
  
  Because Linux networking can process the ICMP packet, and send the reply, without needing to send it “down the stack towards the actual interface” it never makes it to where tcpdump is processing traffic for capture.
  
  Using your terms of packet/frame – the ICMP “packet” is never encapsulated in a L2 “frame” at the ens160 interface to be captured.
  
  Hopefully that helps you understand it better. I’m forever fascinated by the low level processing of how traffic flows. It isn’t always relevant in day to day work and network engineering, but it can certainly help you understand things better when you need to really get dirty in troubleshooting something.
  - Anonymous says:
    
    October 24, 2022 at 12:24 pm
    
    If packets from C1 were sent to Host 01, then as packets they would egress the ens160 interface?
    - Hank Preston says:
      
      October 24, 2022 at 12:30 pm
      
      Yes, that is correct. Any traffic destined to another host will egress the ens160 interface and be picked up and visible in packet captures. You can see this difference in the blog post section called “Pinging beyond the gates… er host”.
Imran Patel says:

October 25, 2022 at 6:24 am

Very comprehensive, will need to re-read it to fully digest – Thank you!
Joseph Akinyelure says:

October 26, 2022 at 12:43 am

Well presented. Thanks.
Juan says:

October 28, 2022 at 12:38 am

Great intro, I need to dive more into containers soon!
Eric says:

October 30, 2022 at 5:41 am

Great blog, thank you for sharing!
Denis Morin says:

November 7, 2022 at 10:22 am

Great read love the way you break it down. Thanks.
THORN Channa says:

November 7, 2022 at 11:57 pm

That is good guidance.
Jeremy Jensen says:

November 14, 2022 at 12:41 pm

Great information. This helped me out a lot.
Bettsy says:

November 14, 2022 at 5:06 pm

Great information, Thanks!
orkjms says:

November 15, 2022 at 12:33 am

Thank you !
Fulvio Ferreira says:

November 24, 2022 at 3:31 am

Saved to my bookmarks.. Thank you!
Otieno says:

December 12, 2022 at 11:04 pm

Bookmarked for future reference.
Kaleem says:

December 17, 2022 at 7:40 am

Great information, thanks for sharing!
