kubernetes II - network policy

Tags: docker

0x00 design

In the kubernetes project, proposals generally live under docs/proposals in the repo; this time I mainly look at the NetworkPolicy-related proposal. Early meeting discussions about NetworkPolicy can be found in [1], [2] gives a case that shows the typical way it is used, and the community discussion was finalized as [3]. The first code landed in patch #25638 and is tracked through issues/22469.
When first getting to know NetworkPolicy, my habit is to start from how to use it. The notes below were put together after the configuration was done; corrections are welcome if anything is off.

0x10 create a cluster

First, set up a local k8s cluster with a single master and a single node. I recommend deploying with cluster/kube-up.sh from the source tree pulled back by go get; I'm lazy, so I wrote a small script as a wrapper that sets a few variables.

#!/bin/bash

iptables -F
iptables -Z
iptables -X

export KUBERNETES_PROVIDER=vagrant
export KUBERNETES_VAGRANT_USE_NFS=true
export ALLOW_PRIVILEGED=true
export VAGRANT_HTTP_PROXY=http://username:password@10.0.58.88:8080
export VAGRANT_HTTPS_PROXY=http://username:password@10.0.58.88:8080

install() {
        if ! rpm -q vagrant-libvirt >/dev/null; then
                dnf install vagrant-libvirt -y
        fi
        pushd $GOPATH/src/github.com/kubernetes/kubernetes
                ./cluster/kube-up.sh
        popd
        exit 0
}

remove() {
        sudo virsh destroy kubernetes_master
        sudo virsh destroy kubernetes_node-1
        sudo virsh undefine kubernetes_master
        sudo virsh undefine kubernetes_node-1
        sudo rm -rf /var/lib/libvirt/images/kubernetes_*
        sudo rm -rf /var/lib/libvirt/qemu/
        sudo rm -rf /var/lib/libvirt/dnsmasq/
        sudo systemctl restart libvirtd.service
        exit 0
}

down() {
        ./cluster/kube-down.sh
        exit 0
}

$1

echo "please input $0 install, $0 remove or $0 down"
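
Usage is just the function name passed as the first argument; the file name kube-wrapper.sh here is hypothetical:

./kube-wrapper.sh install   # provision the master and node VMs via kube-up.sh
./kube-wrapper.sh down      # tear the cluster down with kube-down.sh
./kube-wrapper.sh remove    # destroy the libvirt domains and wipe leftover images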

The VAGRANT_HTTP_PROXY variable is consumed by a vagrant plugin that injects the proxy environment variables into the VM nodes it creates, and using NFS avoids the extra disk usage that rsync would cause. If you also need to pre-configure something odd inside the VM nodes, such as a dnf proxy, you can boot the image directly with

qemu-system-x86_64 /var/lib/libvirt/images/kube-fedora23_vagrant_box_image_0.img  -m 2048
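
For example, a dnf proxy can be appended inside the booted image; a minimal sketch, assuming the stock /etc/dnf/dnf.conf with its [main] section and the same proxy host used in the wrapper above:

cat >>/etc/dnf/dnf.conf <<EOF
proxy=http://username:password@10.0.58.88:8080
EOF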

Edit the configuration files inside the VM, then shut it down, watching out for consistency issues (for example, a .lock file left behind when dnf is interrupted with Ctrl-C). Then re-run my little wrapper to clean up the leftover files and recreate the cluster. If everything is fine, you can see the cluster info:

$ kubectl cluster-info
Kubernetes master is running at https://10.245.1.2
Heapster is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/heapster
KubeDNS is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
Grafana is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
InfluxDB is running at https://10.245.1.2/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
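
Optionally also confirm that the node has registered and is Ready (exact output will vary):

$ kubectl get nodes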

0x20 install network solution

Kubernetes itself doesn’t enforce NetworkPolicy. You’ll need to run a networking solution like Calico, Canal, etc as your network plugin to get NetworkPolicy features.

For the network solution I chose calico, and I chose to configure it by hand and add it to the already existing cluster.

0x21 Prepare the images

Pull the images on both the master and the node:

docker pull docker.io/calico/cni
docker pull docker.io/calico/kube-policy-controller
docker pull quay.io/calico/node

0x22 Prepare the kubelet

Download the necessary command-line tools on the master and the node. loopback is not strictly required on the master, but an extra copy does no harm (I was too lazy to adjust the script).

wget http://www.projectcalico.org/builds/calicoctl
sudo chmod +x calicoctl

wget -N -P /opt/cni/bin https://github.com/projectcalico/calico-cni/releases/download/v1.4.1/calico
wget -N -P /opt/cni/bin https://github.com/projectcalico/calico-cni/releases/download/v1.4.1/calico-ipam
chmod +x /opt/cni/bin/calico /opt/cni/bin/calico-ipam

wget https://github.com/containernetworking/cni/releases/download/v0.3.0/cni-v0.3.0.tgz
tar -zxvf cni-v0.3.0.tgz
cp loopback /opt/cni/bin/
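
A quick sanity check that all three plugins ended up where the kubelet will look for them:

ls -l /opt/cni/bin/calico /opt/cni/bin/calico-ipam /opt/cni/bin/loopback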

Add the CNI configuration file on both the master and the node:

mkdir -p /etc/cni/net.d
cat >/etc/cni/net.d/10-calico.conf <<EOF
{
    "name": "calico-k8s-network",
    "type": "calico",
    "etcd_endpoints": "http://<ETCD_IP>:<ETCD_PORT>",
    "log_level": "info",
    "ipam": {
        "type": "calico-ipam"
    },
    "policy": {
        "type": "k8s"
    }
}
EOF

Because the cluster created by cluster/kube-up.sh runs the kubelet under systemd, its command-line arguments live in /etc/sysconfig/kubelet. Append --network-plugin=cni --cni-conf-dir=/etc/cni/net.d --network-plugin-dir=/etc/cni/net.d to the DAEMON_ARGS variable, then restart the service.
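
A sketch of the change, assuming the systemd unit is named kubelet and that any flags already present in DAEMON_ARGS are kept:

# append to /etc/sysconfig/kubelet (keep whatever is already in DAEMON_ARGS)
DAEMON_ARGS="--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --network-plugin-dir=/etc/cni/net.d"
# then restart the service
systemctl restart kubelet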

0x23 Create the calico node

Run the following on the master and the node respectively, where ETCD_ENDPOINTS is the IP of your master and the port of its etcd:

ETCD_ENDPOINTS=http://10.245.1.2:4379  ./calicoctl node

Then on either machine you can see:

[root@kubernetes-master ~]# ETCD_ENDPOINTS=http://10.245.1.2:4379  ./calicoctl status
calico-node container is running. Status: Up About a minute
Running felix version 1.4.0

IPv4 BGP status
IP: 10.245.1.2    AS Number: 64511 (inherited)
+--------------|-------------------|-------|----------|-------------+
| Peer address |     Peer type     | State |  Since   |     Info    |
+--------------|-------------------|-------|----------|-------------+
|  10.245.1.3  | node-to-node mesh |   up  | 08:22:15 | Established |
+--------------|-------------------|-------|----------|-------------+

0x24 Create the policy-controller

On the development machine, download the policy-controller manifest and modify its ETCD_ENDPOINTS and image fields (reusing the image that is already pulled is faster).

    spec:
      hostNetwork: true
      containers:
        - name: calico-policy-controller
          # Make sure to pin this to your desired version.
          image: calico/kube-policy-controller
          env:
            # Configure the policy controller with the location of 
            # your etcd cluster.
            - name: ETCD_ENDPOINTS
              value: "http://10.245.1.2:4379"

Create the controller on the development machine from the yaml file downloaded above.

$ kubectl create -f policy-controller.yaml

0x25 Verify the configuration

On the development machine, check that the calico-policy-controller is in a normal state.

$ kubectl get pod --namespace=kube-system
NAME                                   READY     STATUS    RESTARTS   AGE
calico-policy-controller-0i4so         1/1       Running   1          23h
heapster-v1.2.0-2582472167-91sw6       4/4       Running   24         4d
kube-dns-v19-dfhx9                     3/3       Running   41         4d
kube-proxy-kubernetes-node-1           1/1       Running   7          4d
kubernetes-dashboard-v1.4.0-gydiz      1/1       Running   22         4d
monitoring-influxdb-grafana-v4-auln8   2/2       Running   15         4d

Also confirm that the calico-node container is healthy on both the master and the node.
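
A quick way to do that on each machine is to look for the container that calicoctl node started; it should show up in docker ps:

docker ps | grep calico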

0x30 test network policy

On the development machine, create a namespace named test, configure per-pod isolation on it, and create a deployment and a service named nginx inside it:

kubectl create ns test # create the test namespace
kubectl annotate ns test "net.beta.kubernetes.io/network-policy={\"ingress\": {\"isolation\": \"DefaultDeny\"}}" --overwrite # per-pod isolation
kubectl --namespace=test run nginx --image=nginx # create the deployment
kubectl --namespace=test expose deployment nginx --port=80 # create the service

With default deny declared on test, you can run the following on the master or the node:

[root@kubernetes-node-1 ~]# ETCD_ENDPOINTS=http://10.245.1.2:4379  ./calicoctl profile show
+--------------------+
|        Name        |
+--------------------+
|   k8s_ns.default   |
| k8s_ns.kube-system |
|    k8s_ns.test     |
+--------------------+
[root@kubernetes-node-1 ~]# ETCD_ENDPOINTS=http://10.245.1.2:4379  ./calicoctl profile k8s_ns.test rule show
Inbound rules:
   1 deny
Outbound rules:
   1 allow

Check whether the service and the pod are running normally:

kubectl --namespace=test get svc,pod
NAME        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
svc/nginx   10.247.114.187   <none>        80/TCP    3h
NAME                       READY     STATUS    RESTARTS   AGE
po/nginx-701339712-wcim9   1/1       Running   0          3h

0x31 Testing

Configure per-pod isolation, then configure the NetworkPolicy selector:

$ kubectl annotate ns test "net.beta.kubernetes.io/network-policy={\"ingress\": {\"isolation\": \"DefaultDeny\"}}" --overwrite
namespace "test" annotated
$ cat network-policy.yaml 
kind: NetworkPolicy
apiVersion: extensions/v1beta1
metadata:
  name: access-nginx
  namespace: test
spec:
  podSelector:
    matchLabels:
      run: nginx
  ingress:
    - from:
      - podSelector:
          matchLabels:
            access: "true"

$ kubectl create -f network-policy.yaml                                                                                 
networkpolicy "access-nginx" created
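
Optionally confirm that the policy object exists before testing (output omitted here):

$ kubectl --namespace=test get networkpolicy access-nginx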

$ kubectl --namespace=test run busybox --rm -ti --labels="access=true" --image=busybox /bin/sh                                
Waiting for pod test/busybox-3554646944-irgeg to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # wget nginx
Connecting to nginx (10.247.114.187:80)
index.html           100% |**********************************************************************************************************************************************************|   612   0:00:00 ETA
/ # Session ended, resume using 'kubectl attach busybox-3554646944-irgeg -c busybox -i -t' command when the pod is running
deployment "busybox" deleted

$ kubectl --namespace=test run busybox --rm -ti --image=busybox /bin/sh        
Waiting for pod test/busybox-3674381263-0solq to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # wget --timeout=1 nginx
Connecting to nginx (10.247.114.187:80)
wget: download timed out
/ # 

Bingo! The pod without the label is isolated, while the one with the label connects normally.
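
If you want to lift the isolation on the namespace later, the annotation can be removed again with kubectl's trailing-dash syntax, for example:

kubectl annotate ns test net.beta.kubernetes.io/network-policy-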

0x40 network policy implementation

reference

