资源池容器网络(overlay)查障记录

4 分钟读完

一期资源池虚机搭建容器 Swarm 集群,跨节点 Overlay 网络(Vxlan)方案不通,排查过程如下:

  1. 本地环境 Swarm 集群,自建 Overlay 网络,跨节点容器间正常互通;
  2. 资源池虚机(不同计算节点)Swarm 集群,自建 Overlay 网络,跨虚机容器间不通;
  3. 资源池虚机(同一计算节点)Swarm 集群,自建 Overlay 网络,跨虚机容器间正常互通;
  4. 容器互联测试,其所在虚机抓包 -> 宿主机抓包 -> 架顶交换机(CE6855)抓包-异常

问题定位

容器发出包(Vxlan),经过虚机到宿主机,进入架顶交换机后没有流出, 咨询华为工程师提供该交换机对虚机内容器 Overlay 网络支持方案。

解决方案

一期资源池架顶交换机型号为 CE6855,使用配置

port nvo3 mode access

指定端口模式为 Vxlan 网络接入端口

资源池环境使用容器集群的计算节点,接入架顶交换机端口需要批量更新该配置

|| 多场景抓包分析

1- 本地跨物理机 Swarm 集群,跨节点容器通信 [正常]

集群信息

物理机 容器
Swarm01 - 10.50.50.149 ngtest.3 - 10.10.10.12
Swarm03 - 10.50.50.151 ngtest.4 - 10.10.10.8

集群 Overlay 网络:

# docker network inspect myolnet
...
    {
        "Name": "myolnet",
        "Id": "6x6r0bser8twa3qaylqvwv3sz",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.10.10.0/24",
                    "Gateway": "10.10.10.1"
...

从 Swarm03 节点上 ngtest.4 容器对 Swarm01 节点上 ngtest.3 容器做联通测试

# docker exec ngtest.4.ckqgty4hxvffoh7gurgh4g1q7 ping ngtest.3.06186x1njdfvt5szgee261vw5
PING ngtest.3.06186x1njdfvt5szgee261vw5 (10.10.10.12): 56 data bytes
64 bytes from 10.10.10.12: icmp_seq=0 ttl=64 time=0.849 ms
64 bytes from 10.10.10.12: icmp_seq=1 ttl=64 time=0.855 ms
64 bytes from 10.10.10.12: icmp_seq=2 ttl=64 time=0.604 ms
...

两侧分别抓包

Swarm01

# tcpdump -i ens33 host swarm01 and swarm03 -w Swarm3toSwarm1onS1.pcap

Swarm03

# tcpdump -i ens33 host swarm03 and swarm01 -w Swarm3toSwarm1onS3.pcap

查看结果

Swarm01 端

Swarm3toSwarm1onS1

Swarm03 端

Swarm3toSwarm1onS1

端口访问测试

# docker exec -it  ngtest.4.ckqgty4hxvffoh7gurgh4g1q7 wget ngtest.3.06186x1njdfvt5szgee261vw5
--2017-11-05 07:31:45--  http://ngtest.3.06186x1njdfvt5szgee261vw5/
Resolving ngtest.3.06186x1njdfvt5szgee261vw5 (ngtest.3.06186x1njdfvt5szgee261vw5)... 10.10.10.12
Connecting to ngtest.3.06186x1njdfvt5szgee261vw5 (ngtest.3.06186x1njdfvt5szgee261vw5)|10.10.10.12|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 612 [text/html]
Saving to: 'index.html'

index.html                  100%[========================================>]     612  --.-KB/s    in 0s      

2017-11-05 07:31:45 (36.1 MB/s) - 'index.html' saved [612/612]

两侧分别抓包(同上)

查看结果

Swarm01 端

Swarm3toSwarm1onS1wget

Swarm03 端

Swarm3toSwarm1onS1wget

结论: Swarm 集群的 Overlay 网络可以实现跨节点的容器间互通

2- 资源池环境虚机跨计算节点

集群信息

计算节点 虚拟机 容器
compute-h10-10.domain.tld lzy02 - 192.101.0.18 ngmigu.3 - 10.20.20.4
compute-h04-3.domain.tld lzy03 - 192.101.0.19 ngmigu.1 - 10.20.20.6

集群 Overlay 网络

# docker network inspect migunet
...
    {
        "Name": "migunet",
        "Id": "6njyo6ctzjobc0sg3o32sttu0",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.20.20.0/24",
                    "Gateway": "10.20.20.1"
...

容器分布

# docker service ps ngmigu
ID                         NAME      IMAGE  NODE   DESIRED STATE  CURRENT STATE           ERROR
bzgupynir43zwg1qy4deu3q7g  ngmigu.1  nginx  lzy-3  Running        Running 11 minutes ago  
134hva4h0zedar2yuau597t61  ngmigu.2  nginx  lzy-3  Running        Running 11 minutes ago  
ahuxyhjtqzt9ccum86cnezm18  ngmigu.3  nginx  lzy-2  Running        Running 12 minutes ago  
d7yz0pw7989xmzzeatdyt0a1d  ngmigu.4  nginx  lzy-2  Running        Running 12 minutes ago  

网络结构图

overlay-network-architecture

从 lzy03 节点上 ngmigu.1 容器对 lzy02 节点上 ngmigu.3 容器做联通测试

# docker exec ngmigu.1.a6hqpwpr9sycuc6y8h3cpuajf ping ngmigu.3.c1098rxxodhx38iyp4pz8ufg1
PING ngmigu.3.c1098rxxodhx38iyp4pz8ufg1 (10.20.20.4): 56 data bytes

两侧分别抓包

Lzy02

# tcpdump -i eth0 host lzy02 and lzy03 -w Lzy03toLzy02onL2.pcap

Lzy03

# tcpdump -i eth0 host lzy03 and lzy02 -w Lzy03toLzy02onL3.pcap

查看结果

Lzy02 端

Lzy03toLzy02onL2

Lzy03 端

Lzy03toLzy02onL3

=> 从虚机 Lzy03 发出的包没有应答,在 Lzy02 端则没收到包,继续在宿主机上抓包

Compute-h10-10

# ovs-tcpdump -i bond2 -w Ch04-3toCh10-10onC10-10.pcap

Compute-h04-3

# ovs-tcpdump -i bond2 -w Ch04-3toCh10-10onC04-3.pcap

查看结果

Compute-h10-10 端

Ch04-3toCh10-10onC10-10

Compute-h04-3 端

Ch04-3toCh10-10onC04-3

=> 物理机上已发出包,而对端计算节点没有收到,问题可能出在架顶交换机(CE6855)

对交换机入口、出口抓包,结果如下

入口

tor入口

出口

tor出口

结论: 容器发出的 Vxlan 包进入架顶交换机后没有流出,需架顶交换机对携带 Vxlan 报文的网络包提供转发支持。

|| 架顶交换机配置

型号:CE6855

缺省情况下,未指定端口模式为 Vxlan 网络接入端口,即携带 Vxlan 报文目的 UDP 端口号(缺省值为 4789)的普通IP报文不能接入 Vxlan 网络。

命令

port nvo3 mode access

该命令用来指定端口模式为 Vxlan 网络接入端口,从而允许携带 Vxlan 报文目的 UDP 端口号(缺省值为 4789)的普通IP报文接入 Vxlan 网络。

生效后测试

互通成功

后期资源池环境中需要跑容器云的计算节点,其架顶交换机接入端口需要预更改模式

|| Swarm 集群 Overlay 网络

Docker 内置的 Overlay 网络是采用 IETF 标准的 Vxlan 方式

1- Swarm 介绍

swarm-overview

2- Swarm 集群 Overlay 网络方案介绍

overlay-networks-in-swarm-mod

demystifying-docker-overlay-networking

注意事项

在 Swarm 集群中,47897946 两个端口需要放开

Port 4789 UDP for the container overlay network.

Port 7946 TCP/UDP for container network discovery.

|| Swarm 集群搭建及 Overlay 网络配置

0- 升级内核

官网建议对 CentOS7 内核做升级,默认的 3.10 版本会有网段重复的可能

On some older kernels, including kernel 3.10, automatically assigned addresses may overlap with another subnet in your infrastructure.

升级到 4.13 版本

# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm

# yum --enablerepo=elrepo-kernel install  kernel-ml-devel kernel-ml -y

# grub2-set-default 0

# reboot

# uname -r
4.13.11-1.el7.elrepo.x86_64

1- 安装 docker

# yum install docker -y

# systemctl enable docker && systemctl start docker

2- 创建 Swarm 集群

主节点 swarm01

# docker swarm init --advertise-addr 10.50.50.149
Swarm initialized: current node (66bwyy35v7leh93tj281c1kek) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join \
    --token SWMTKN-1-39manpslm0mdcb1489wgthnkksksrp2zskp5hu2h9vchuvfvtc-dhxgzx9qezoja5pu0kzx4t9q9 \
    10.50.50.149:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

加入节点 swarm02

# docker swarm join --token SWMTKN-1-39manpslm0mdcb1489wgthnkksksrp2zskp5hu2h9vchuvfvtc-dhxgzx9qezoja5pu0kzx4t9q9 10.50.50.149:2377
This node joined a swarm as a worker.

查看集群

# docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
1o0evmym4pn83y0jbn11wbdp3    swarm02   Ready   Active        
66bwyy35v7leh93tj281c1kek *  swarm01   Ready   Active        Leader

3- 创建 Overlay 网络

# docker network create -d overlay --subnet 10.10.10.0/24 --gateway 10.10.10.1 migunet
9f3lf5xqjhwn9gvwi2ukcrwsy

# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
b820d6cdf91b        bridge              bridge              local               
e5a12e8eae1b        docker_gwbridge     bridge              local               
9a91d77d8864        host                host                local               
4yw6h5a6pg4i        ingress             overlay             swarm               
9f3lf5xqjhwn        migunet             overlay             swarm               
ba4ea288bebe        none                null                local 

# docker network inspect migunet
[
    {
        "Name": "migunet",
        "Id": "9f3lf5xqjhwn9gvwi2ukcrwsy",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.10.10.0/24",
                    "Gateway": "10.10.10.1"
                }
            ]
        },
        "Internal": false,
        "Containers": null,
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "257"
        },
        "Labels": null
    }
]


注: 自建 Overlay网络,在 worker 节点上暂时还没有,Swarm 会在拉起某 service 所关联的 worker 上创建该网络

This is because new overlay networks are only made available to worker nodes that have containers using the overlay. This reduces the scope of the network gossip protocol and helps with scalability.

4- 创建服务

创建 4 副本的 nginx 服务

# docker service create --replicas 4 --name ngmigu --network migunet -p 80:80 nginx
09s47a3207122hvpwk7iow2q4

# docker service ls
ID            NAME    REPLICAS  IMAGE  COMMAND
09s47a320712  ngmigu  4/4       nginx  

# docker service ps ngmigu
ID                         NAME      IMAGE  NODE     DESIRED STATE  CURRENT STATE           ERROR
dvfae6k948l6dwzl8qm8x3gkd  ngmigu.1  nginx  swarm01  Running        Running 27 seconds ago  
8v8cwdnenpm4ad99gsub4bmtq  ngmigu.2  nginx  swarm02  Running        Running 29 seconds ago  
51klvtfpawag8nu1ak8rxqgku  ngmigu.3  nginx  swarm02  Running        Running 29 seconds ago  
7n6fj3xmuu435v0o74u2yx9yx  ngmigu.4  nginx  swarm01  Running        Running 27 seconds ago

在 swarm01 上查看网络详情

# docker network inspect migunet
...
    "Containers": {
        "dfa3830aabab6226b70fce9c665bcf609d169516fb7ca5634005b2a87249690d": {
            "Name": "ngmigu.4.7n6fj3xmuu435v0o74u2yx9yx",
            "EndpointID": "1acb632571c4d939ee586b243d8a0725c017d9538a22ee593cdee0f1a2136c87",
            "MacAddress": "02:42:0a:0a:0a:06",
            "IPv4Address": "10.10.10.6/24",
            "IPv6Address": ""
        },
        "faceadbc47e4bf0e64954fee4d7e9f399912c32f43eed70e7ef62e1352e62848": {
            "Name": "ngmigu.1.dvfae6k948l6dwzl8qm8x3gkd",
            "EndpointID": "c8579828859bf8b9f2a40f0ed51db8dbed209ae1470841bc2bfcd562830fc834",
            "MacAddress": "02:42:0a:0a:0a:03",
            "IPv4Address": "10.10.10.3/24",
            "IPv6Address": ""
        }
    },
    "Options": {
        "com.docker.network.driver.overlay.vxlanid_list": "257"
    },
...

在 swarm02 上查看网络详情

# docker network inspect migunet
...
    "Containers": {
        "0e3193db5a5fb95466a7e514eb6e7b32d6669f9455782c1c2e51a1bba3278c1d": {
            "Name": "ngmigu.3.51klvtfpawag8nu1ak8rxqgku",
            "EndpointID": "1c30b9cbfcbb24ccf26ea053b5314f4efd563cf8b3a316bd3f2e60390bef8b47",
            "MacAddress": "02:42:0a:0a:0a:05",
            "IPv4Address": "10.10.10.5/24",
            "IPv6Address": ""
        },
        "544204ff5bbc693a2c4df86253470c120b1d45784e94d41cdaf33f673e05ba29": {
            "Name": "ngmigu.2.8v8cwdnenpm4ad99gsub4bmtq",
            "EndpointID": "e43a4080737a06554da175f36756fbfd8e153cfa1de57982e6cdcf9c584467b1",
            "MacAddress": "02:42:0a:0a:0a:04",
            "IPv4Address": "10.10.10.4/24",
            "IPv6Address": ""
        }
    },
    "Options": {
        "com.docker.network.driver.overlay.vxlanid_list": "257"
    },
...

更新时间: