K8S Enterprise Container Cloud 07 - Production Environment Build Plan


This document describes how to build a production Kubernetes cluster, covering Calico, Helm, StorageClass (Rook-Ceph), Ingress (Nginx), and cert-manager (CA & ACME).

Environment Preparation

The cluster consists of 3 Master nodes and 2 Worker nodes.

In addition, Master2 and Master3 also carry the Master.api endpoint, the Ingress egress, and Worker duties.

Keepalived

Master2 and Master3 hold the virtual address (VIP: 10.135.204.48).

Run on wx-cloudnative-master-2 & wx-cloudnative-master-3:

# yum install keepalived -y

# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id CloudNative
}

vrrp_instance VI_1 {
    state BACKUP
    interface bond0
    virtual_router_id 51
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.135.204.48
    }
}

# systemctl restart keepalived
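
Both nodes are configured as BACKUP with equal priority, so VRRP elects the VIP holder by the higher interface address; giving the two nodes different priorities is a common tweak if a deterministic owner is preferred. As a quick check (assuming the bond0 interface from the config above), the node currently holding the VIP will show it:

# ip addr show bond0 | grep 10.135.204.48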

Basic System Configuration on All Nodes

Hostname Resolution

# vim /etc/hosts
10.135.204.2  wx-cloudnative-master-1
10.135.204.44 wx-cloudnative-master-2
10.135.204.45 wx-cloudnative-master-3
10.135.204.46 wx-cloudnative-worker-1
10.135.204.47 wx-cloudnative-worker-2
10.135.204.48 wx-cloudnative-master.api

Security

# systemctl stop firewalld && systemctl disable firewalld

# yum install vim wget ntp -y

# vim /etc/selinux/config
SELINUX=disabled
SELINUXTYPE=targeted

# setenforce 0
# swapoff -a
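
setenforce 0 and swapoff -a only last until the next reboot. The SELinux config file above covers persistence for SELinux; for swap, the entry in /etc/fstab also needs to be commented out, for example with this common one-liner:

# sed -ri 's/.*swap.*/#&/' /etc/fstab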

Switch to Aliyun Mirrors

# mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo_bak
# wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo

# yum clean all && yum makecache
# yum update -y

# timedatectl set-timezone Asia/Shanghai
# timedatectl
# timedatectl set-ntp true

Docker

Install Docker and add domestic registry mirrors

# yum install yum-utils device-mapper-persistent-data lvm2 -y
# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# yum install containerd.io docker-ce docker-ce-cli -y
# systemctl enable docker && systemctl start docker

# cat > /etc/docker/daemon.json <<EOF
{
    "exec-opts": ["native.cgroupdriver=systemd"],
    "log-driver": "json-file",
    "log-opts": {
    "max-size": "100m"
    },
    "storage-driver": "overlay2",
    "registry-mirrors":[
        "https://32lrw8s6.mirror.aliyuncs.com",
        "http://hub-mirror.c.163.com",
        "https://docker.mirrors.ustc.edu.cn",
        "https://registry.docker-cn.com"
    ]
}
EOF

# mkdir -p /etc/systemd/system/docker.service.d
# systemctl daemon-reload
# systemctl restart docker
# docker info
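
To confirm daemon.json took effect (kubeadm recommends the systemd cgroup driver, so the output should show "Cgroup Driver: systemd"):

# docker info | grep -i 'cgroup driver'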

kubeadm kubectl kubelet

Installation

# cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

# cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

# sysctl --system
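
If sysctl --system reports that the net.bridge.* keys do not exist, the br_netfilter kernel module is probably not loaded; load it and make it persistent:

# modprobe br_netfilter
# echo br_netfilter > /etc/modules-load.d/br_netfilter.conf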

# yum install kubeadm-1.19.2 kubelet-1.19.2 kubectl-1.19.2 -y
# systemctl start kubelet
# systemctl enable kubelet
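
A quick sanity check that the installed binaries match the --kubernetes-version used by kubeadm init below:

# kubeadm version -o short
# kubelet --version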

Operations on Master1

kubeadm

Create the Cluster

Flag notes: --control-plane-endpoint sets a stable IP address or DNS name for the control plane, --pod-network-cidr sets the Pod IP range, --service-cidr sets the Service VIP range, and --service-dns-domain sets the Service DNS domain to cloudnative.wx.

# kubeadm init \
  --apiserver-advertise-address 0.0.0.0 \
  --apiserver-bind-port 6443 \
  --cert-dir /etc/kubernetes/pki \
  --control-plane-endpoint wx-cloudnative-master.api \
  --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
  --kubernetes-version 1.19.2 \
  --pod-network-cidr 10.10.0.0/16 \
  --service-cidr 10.20.0.0/16 \
  --service-dns-domain cloudnative.wx \
  --upload-certs

Inspect the Cluster

# mkdir -p $HOME/.kube
# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
# chown $(id -u):$(id -g) $HOME/.kube/config

# kubectl cluster-info
Kubernetes master is running at https://wx-cloudnative-master.api:6443
KubeDNS is running at https://wx-cloudnative-master.api:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
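
At this point the node reports NotReady and the coredns Pods stay Pending, because no CNI plugin has been installed yet; this is resolved in the next step (Calico) and can be confirmed with:

# kubectl get nodes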

Calico

Fetch the YAML

# mkdir calico && cd calico
# wget https://docs.projectcalico.org/manifests/calico.yaml

The current version of the Calico manifest automatically picks up the pod CIDR configured for the cluster.

# kubectl apply -f calico.yaml

Check the running state; coredns has also returned to normal operation.

# kubectl get pods -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-578894d4cd-sfwmx   1/1     Running   0          97s
kube-system   calico-node-cwmkr                          1/1     Running   0          98s
kube-system   coredns-546565776c-bq22h                   1/1     Running   0          4h54m
kube-system   etcd-wx-cloudnative-master-1               1/1     Running   0          4h55m
kube-system   kube-apiserver-wx-cloudnative-master-1     1/1     Running   0          4h55m
kube-system   kube-controller-manager-wx-cloudnative-master-1   1/1     Running   0          4h55m
kube-system   kube-proxy-ksc6b                           1/1     Running   0          4h54m
kube-system   kube-scheduler-wx-cloudnative-master-1     1/1     Running   0          4h55m

Helm

Installation

# wget https://get.helm.sh/helm-v3.3.3-linux-amd64.tar.gz
# tar -zxvf helm-v3.3.3-linux-amd64.tar.gz
# mv linux-amd64/helm /usr/local/bin/helm

# helm version
version.BuildInfo{Version:"v3.3.3",
GitCommit:"55e3ca022e40fe200fbc855938995f40b2a68ce0",
GitTreeState:"clean", GoVersion:"go1.14.9"}

Add the Aliyun chart repository

# helm repo add stable https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
# helm search repo stable
# helm repo update

Note: Helm 3 and later no longer require Tiller.

Master2 & Master3

Join the cluster as Masters. The basic join command below carries only the token and CA cert hash; a control-plane join additionally requires --control-plane and --certificate-key, as shown in the complete command further down.

kubeadm join wx-cloudnative-master.api:6443 --token nr3hgx.29i5i0vfobqzdde2 \
    --discovery-token-ca-cert-hash sha256:9b908cded64d2193c89ab9a3eb54867a93fab59aa0acb56e48e1bac9d985c151

Supplement: if the token has been forgotten, a new one can be generated (and the control-plane certificates re-uploaded):

# kubeadm token create --print-join-command

# kubeadm init phase upload-certs --upload-certs
...
Using certificate key:
9657d13637814316fe8284e0ef167106b13acdad30279dbc29a804f82cedb973

# kubeadm join wx-cloudnative-master.api:6443 \
  --token nr3hgx.29i5i0vfobqzdde2 \
  --discovery-token-ca-cert-hash sha256:9b908cded64d2193c89ab9a3eb54867a93fab59aa0acb56e48e1bac9d985c151 \
  --control-plane --certificate-key 9657d13637814316fe8284e0ef167106b13acdad30279dbc29a804f82cedb973

Remove the NoSchedule taint from the two Master nodes so that they also take on Worker duties.

# kubectl taint nodes wx-cloudnative-master-2 node-role.kubernetes.io/master:NoSchedule-
# kubectl taint nodes wx-cloudnative-master-3 node-role.kubernetes.io/master:NoSchedule-
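
To confirm the taints are gone (both nodes should report "Taints: <none>"):

# kubectl describe nodes wx-cloudnative-master-2 wx-cloudnative-master-3 | grep -i taints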

Worker1 & Worker2

Join the cluster as Workers

# kubeadm join wx-cloudnative-master.api:6443 \
  --token nr3hgx.29i5i0vfobqzdde2 \
  --discovery-token-ca-cert-hash sha256:9b908cded64d2193c89ab9a3eb54867a93fab59aa0acb56e48e1bac9d985c151

Cluster Status

# kubectl get nodes
NAME                      STATUS   ROLES    AGE   VERSION
wx-cloudnative-master-1   Ready    master   70d   v1.19.2
wx-cloudnative-master-2   Ready    master   70d   v1.19.2
wx-cloudnative-master-3   Ready    master   70d   v1.19.2
wx-cloudnative-worker-1   Ready    <none>   70d   v1.19.2
wx-cloudnative-worker-2   Ready    <none>   70d   v1.19.2

Ingress(Nginx)

Architecture

Node Setup

Use NodePorts on Master2 & Master3 as the egress for all Ingress resources.

These two nodes are therefore dedicated to hosting the Ingress Pods, pinned to them with a custom label (ingresscontroller=member).

# kubectl label nodes wx-cloudnative-master-2 ingresscontroller=member
# kubectl label nodes wx-cloudnative-master-3 ingresscontroller=member

# kubectl get nodes --show-labels | grep master
wx-cloudnative-master-1   Ready    master   70d   v1.19.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=wx-cloudnative-master-1,kubernetes.io/os=linux,node-role.kubernetes.io/master=
wx-cloudnative-master-2   Ready    master   70d   v1.19.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,ingresscontroller=member,kubernetes.io/arch=amd64,kubernetes.io/hostname=wx-cloudnative-master-2,kubernetes.io/os=linux,node-role.kubernetes.io/master=
wx-cloudnative-master-3   Ready    master   70d   v1.19.2   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,ingresscontroller=member,kubernetes.io/arch=amd64,kubernetes.io/hostname=wx-cloudnative-master-3,kubernetes.io/os=linux,node-role.kubernetes.io/master=

Nginx Ingress Controller

https://docs.nginx.com/nginx-ingress-controller/

Configure and install; based on the label, the Pods are brought up as DaemonSet members on the designated nodes.
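
If the nginx-stable chart repository has not been added yet, register it first (this assumes the publicly hosted NGINX chart repository is reachable):

# helm repo add nginx-stable https://helm.nginx.com/stable
# helm repo update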

# helm fetch nginx-stable/nginx-ingress
# tar xzvf nginx-ingress-0.7.0.tgz
# cd nginx-ingress

# vim values.yaml
line7 - kind: daemonset
line23 - hostNetwork: true
line87 - nodeSelector: {ingresscontroller: member}
line120 - replicaCount: 2
line141 - setAsDefaultIngress: true
line191 - type: NodePort

Dedicated kube-ingress namespace

# kubectl create namespace kube-ingress

# helm install cloudnativengingress . -n kube-ingress
# kubectl get pods -n kube-ingress -o wide
NAME                                       READY   STATUS    RESTARTS   AGE   IP              NODE                      NOMINATED NODE   READINESS GATES
cloudnativengingress-nginx-ingress-4rdt6   1/1     Running   2          56d   10.135.204.45   wx-cloudnative-master-3   <none>           <none>
cloudnativengingress-nginx-ingress-77p5c   1/1     Running   0          20d   10.135.204.44   wx-cloudnative-master-2   <none>           <none>

# kubectl get ingressclass -A
NAME    CONTROLLER                     PARAMETERS   AGE
nginx   nginx.org/ingress-controller   <none>       24h

# kubectl describe ingressclass nginx
Name:         nginx
Labels:       app.kubernetes.io/managed-by=Helm
Annotations:  ingressclass.kubernetes.io/is-default-class: true
              meta.helm.sh/release-name: cloudnativengingress
              meta.helm.sh/release-namespace: kube-ingress
Controller:   nginx.org/ingress-controller
Events:       <none>
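
Because hostNetwork is enabled in values.yaml, the controller Pods bind ports 80/443 directly on the two labeled nodes; a quick check on wx-cloudnative-master-2 or -3:

# ss -lntp | grep -E ':(80|443) '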

TLS Certificates

cert-manager is used as the cluster's certificate management controller.

Certificate Manager

Architecture

Use the dedicated kube-public namespace (note: kube-public already exists by default in a kubeadm cluster, so the command below may simply report that it already exists).

# kubectl create namespace kube-public

# helm repo add jetstack https://charts.jetstack.io
# helm repo update

# helm fetch jetstack/cert-manager
# tar xzvf cert-manager-v1.0.3.tgz
# cd cert-manager

Use host network ports

# vim values.yaml
line25 - namespace: "kube-public"
line27 - installCRDs: true
line270 - securePort: 10251
line281 - hostNetwork: true

Install

# helm install cert-manager . -n kube-public
NAME: cert-manager
LAST DEPLOYED: Fri Oct 23 17:42:28 2020
NAMESPACE: kube-public
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager has been deployed successfully!

In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).

More information on the different types of issuers and how to configure them
can be found in our documentation:

https://cert-manager.io/docs/configuration/

For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:

https://cert-manager.io/docs/usage/ingress/

Test certificate issuance

# cat <<EOF > test-cert-resources.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  dnsNames:
    - cloudnative.wx
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned
EOF

# kubectl apply -f test-cert-resources.yaml
namespace/cert-manager-test created
issuer.cert-manager.io/test-selfsigned created
certificate.cert-manager.io/selfsigned-cert created


# kubectl describe certificate -n cert-manager-test
...
Spec:
  Dns Names:
    cloudnative.wx
  Issuer Ref:
    Name:       test-selfsigned
  Secret Name:  selfsigned-cert-tls
...
Events:
  Type    Reason     Age   From          Message
  ----    ------     ----  ----          -------
  Normal  Issuing    94s   cert-manager  Issuing certificate as Secret does not exist
  Normal  Generated  94s   cert-manager  Stored new private key in temporary Secret resource "selfsigned-cert-5n9r4"
  Normal  Requested  94s   cert-manager  Created new CertificateRequest resource "selfsigned-cert-vdxz4"
  Normal  Issuing    94s   cert-manager  The certificate has been successfully issued

# kubectl delete -f test-cert-resources.yaml

ACME (public Internet environments)

On its northbound side, cert-manager integrates with various certificate authorities; if the cluster is exposed to the public Internet, ACME can be used.

cert-manager provides two custom resources for defining issuing authorities: Issuer and ClusterIssuer. An Issuer can only issue certificates within its own namespace, while a ClusterIssuer can issue certificates in any namespace.
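
For contrast with the ClusterIssuer used below, a minimal namespaced Issuer looks like this (the name and namespace are only illustrative):

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: selfsigned-issuer   # illustrative name
  namespace: demo           # certificates can only be issued in this namespace
spec:
  selfSigned: {}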

# vim acme-clusterissuer.yaml
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: zeastion@live.cn
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: cloudnative-issure-account-key
    solvers:
    - http01:
        ingress:
          class: nginx

# kubectl apply -f acme-clusterissuer.yaml

# kubectl get clusterissuer -o wide
NAME                  READY   STATUS                                                 AGE
letsencrypt-staging   True    The ACME account was registered with the ACME server   5m19s

# kubectl describe clusterissuer letsencrypt-staging
Name:         letsencrypt-staging
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  cert-manager.io/v1
Kind:         ClusterIssuer
Metadata:
...
  Self Link:         /apis/cert-manager.io/v1/clusterissuers/letsencrypt-staging
  UID:               bbf6287b-d05f-40fa-968c-c2e6378d077c
  Spec:
  Acme:
    Email:            zeastion@live.cn
    Preferred Chain:
    Private Key Secret Ref:
      Name:  cloudnative-issure-account-key
    Server:  https://acme-staging-v02.api.letsencrypt.org/directory
    Solvers:  
      http01:
        Ingress:
          Class:  nginx
Status:
  Acme:
    Last Registered Email:  zeastion@live.cn
    Uri:                    https://acme-staging-v02.api.letsencrypt.org/acme/acct/16302840
  Conditions:
    Last Transition Time:  2020-10-26T07:40:28Z
    Message:               The ACME account was registered with the ACME server
    Reason:                ACMEAccountRegistered
    Status:                True
    Type:                  Ready
Events:                    <none>

Automatically issue certificates for Ingress resources

Official reference:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    # add an annotation indicating the issuer to use.
    cert-manager.io/cluster-issuer: *nameOfClusterIssuer*
    kubernetes.io/tls-acme: "true"
  name: myIngress
...
spec:
...
  tls: # < placing a host in the TLS config will indicate a certificate should be created
  - hosts:
    - *example.com*
    secretName: *myingress-cert* # < cert-manager will store the created certificate in this secret.

CA (internal environments)

For internal environments such as the office network, a self-built CA keeps things simple.

Create the CA private key

# openssl genrsa -out ca.key 2048

Create a self-signed certificate

# openssl req -x509 -new -nodes -key ca.key -subj "/CN=CloudNative" -days 3650 -reqexts v3_req -extensions v3_ca -out ca.crt
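
The resulting CA certificate can be inspected before it is wrapped into a Secret:

# openssl x509 -in ca.crt -noout -subject -dates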

Store the key pair in a Secret resource

# kubectl create secret tls ca-key-pair --cert=ca.crt --key=ca.key --namespace=kube-public

# kubectl describe secret ca-key-pair -n kube-public
Name:         ca-key-pair
Namespace:    kube-public
Labels:       <none>
Annotations:  <none>

Type:  kubernetes.io/tls

Data
====
tls.crt:  1099 bytes
tls.key:  1675 bytes

Create a CA ClusterIssuer that references the Secret above

# vim ca-clusterissuer.yaml
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: ca-clusterissuer
  namespace: kube-public
spec:
  ca:
    secretName: ca-key-pair

# kubectl apply -f ca-clusterissuer.yaml

Inspect the ClusterIssuer resource (a ClusterIssuer is cluster-scoped, which is why Namespace is empty in the output below):

# kubectl describe clusterissuer ca-clusterissuer
Name:         ca-clusterissuer
Namespace:    
Labels:       <none>
Annotations:  <none>
API Version:  cert-manager.io/v1
Kind:         ClusterIssuer
Metadata:
  Creation Timestamp:  2020-12-10T09:55:59Z
  Generation:          1
  Managed Fields:
    API Version:  cert-manager.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
    Manager:      controller
    Operation:    Update
    Time:         2020-12-10T09:55:56Z
    API Version:  cert-manager.io/v1alpha2
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:ca:
          .:
          f:secretName:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2020-12-10T09:55:59Z
  Resource Version:  26235249
  Self Link:         /apis/cert-manager.io/v1/clusterissuers/ca-clusterissuer
  UID:               edc36778-a49e-4e27-bfa5-900a7a40edbb
Spec:
  Ca:
    Secret Name:  ca-key-pair
Status:
  Conditions:
    Last Transition Time:  2020-12-10T09:55:56Z
    Message:               Signing CA verified
    Reason:                KeyPairVerified
    Status:                True
    Type:                  Ready
Events:                    <none>

Automatically issue certificates for Ingress resources

...
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "ca-clusterissuer"
    nginx.org/ssl-services: "*ServiceName*"
...

Kubernetes Dashboard

https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dashboard/dashboard.yaml

https://artifacthub.io/packages/helm/k8s-dashboard/kubernetes-dashboard

Install with Helm

# helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
# helm fetch kubernetes-dashboard/kubernetes-dashboard
# tar xzvf kubernetes-dashboard-2.8.2.tgz
# cd kubernetes-dashboard/

Link to the CA ClusterIssuer

# vim values.yaml
line125 - enabled: true
...
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "ca-clusterissuer"
    nginx.org/ssl-services: "cnk8sdashboard-kubernetes-dashboard"
    #kubernetes.io/tls-acme: 'true'
    #cert-manager.io/cluster-issuer: "letsencrypt-staging"
...
  hosts:
    - k8sdashboard.cloudnative.wx
  tls:
    - secretName: k8sdashboard.cloudnative.wx
      hosts:
        - k8sdashboard.cloudnative.wx
...

Create

# kubectl create namespace kube-dashboard
# helm install cnk8sdashboard . -n kube-dashboard
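
With the cert-manager.io/cluster-issuer annotation in place, ingress-shim should create a Certificate for the TLS secret referenced by the Ingress; its READY state can be checked with:

# kubectl get certificate -n kube-dashboard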

Add an access account

https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md

# vim admin-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kube-dashboard

# vim admin-clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kube-dashboard

# kubectl apply -f admin-serviceaccount.yaml
# kubectl apply -f admin-clusterrolebinding.yaml

Use the token obtained below to access https://k8sdashboard.cloudnative.wx

# kubectl get secret -n kube-dashboard | grep admin
admin-token-qf47f                                 kubernetes.io/service-account-token   3      47m

# kubectl describe secret admin-token-qf47f -n kube-dashboard | grep token:
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IkFVZ0picFhwbVpCSnRFcHN4QmktbkpsNWU3ZmV5aElmZEVtUkd1N1cwM0EifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi10b2tlbi1xZjQ3ZiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJhZG1pbiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjIyOTg0NzliLWZhMjUtNGU4OS04YzZjLWM4Y2VjNWJkZTZmYiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLWRhc2hib2FyZDphZG1pbiJ9.Jo0ORC9MplqB4L3rWd-YrgfkQYrPjkvRUoYgReSQw8p4hzbcqF1rB-o--NtOroyrOoFWt4KK47oHectZeE8skbNrlLRYZGWUudmeu3IBdS5ivu9REi28Wm1eS9st5bkd-JL-iko0SI7xo3DKvo2nu8PxiuUKHiuJqdPUqTGOPhjbE1hFx05RIrRoG1ta3dSrV4SxmkS3HEdanQRSGtcM3zu6W2r9pdYrVwJnEH8Rx5Dz4g0XAy8Oli18pHNpthXo8TmUyrfYa1a7i1z50DCsKhdw9L6rhlPyIG1Akk_g7WqCxhB65KI-ZuW3qhbfBX6TmoQUGyBVFEFakTv_QmG5RA

Ceph (Rook)

Design

Master2, Master3, Worker1, and Worker2 each have 12 data disks of 3 TB; each disk is used as a single OSD.

Install with Helm

# kubectl create namespace rook-ceph

# helm repo add rook-release https://charts.rook.io/release
# helm search repo rook-ceph --versions

# helm install cloudnativeceph -n rook-ceph rook-release/rook-ceph

# helm ls -n rook-ceph
NAME            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
cloudnativeceph rook-ceph       1               2020-10-16 13:53:55.349112672 +0800 CST deployed        rook-ceph-v1.4.5                   
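
Before creating the CephCluster, confirm that the operator and the rook-discover agents it launches are Running:

# kubectl -n rook-ceph get pods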

Create the Ceph Cluster

References:

https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md
https://github.com/rook/rook.github.io/blob/master/docs/rook/v0.7/cluster-crd.md

Download cluster.yaml (https://github.com/rook/rook/blob/release-1.4/cluster/examples/kubernetes/ceph/cluster.yaml)

Edit the nodes field to use the 12 disks sdb through sdm.

# vim rook/cluster.yaml
line72 - provider: host
...
storage:
  useAllNodes: false
  useAllDevices: false
  config:
  nodes:
  - name: "wx-cloudnative-master-2"
    deviceFilter: "^sd[b-m]"
  - name: "wx-cloudnative-master-3"
    deviceFilter: "^sd[b-m]"
  - name: "wx-cloudnative-worker-1"
    deviceFilter: "^sd[b-m]"
  - name: "wx-cloudnative-worker-2"
    deviceFilter: "^sd[b-m]"
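
Rook only prepares disks that carry no partition table or filesystem; a quick pre-check on each storage node (assuming the sdb-sdm device names used in the filter above) is shown here, and any disk still showing an FSTYPE needs to be wiped (e.g. with sgdisk --zap-all) before Rook will use it:

# lsblk -f /dev/sd{b..m}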

Create

# kubectl create -f cluster.yaml

# kubectl get pods -n rook-ceph
NAME                                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-c78s4                                            3/3     Running     9          69d
csi-cephfsplugin-f5d7f                                            3/3     Running     6          69d
csi-cephfsplugin-jhsfg                                            3/3     Running     6          69d
csi-cephfsplugin-provisioner-5c65b94c8d-b2q6k                     6/6     Running     7          21d
csi-cephfsplugin-provisioner-5c65b94c8d-xmpqf                     6/6     Running     17         69d
csi-cephfsplugin-zn42q                                            3/3     Running     9          69d
csi-rbdplugin-64tkq                                               3/3     Running     6          69d
csi-rbdplugin-mvf45                                               3/3     Running     9          69d
csi-rbdplugin-provisioner-569c75558-8xqbs                         6/6     Running     7          21d
csi-rbdplugin-provisioner-569c75558-s68gx                         6/6     Running     20         59d
csi-rbdplugin-vphdj                                               3/3     Running     9          69d
csi-rbdplugin-xshx2                                               3/3     Running     6          69d
rook-ceph-crashcollector-wx-cloudnative-master-2-669bc8587jrl6n   1/1     Running     3          69d
rook-ceph-crashcollector-wx-cloudnative-master-3-767f4c9f9gt2hp   1/1     Running     2          69d
rook-ceph-crashcollector-wx-cloudnative-worker-1-59ff4fcf6972sj   1/1     Running     1          21d
rook-ceph-crashcollector-wx-cloudnative-worker-2-5ffffb554k72nw   1/1     Running     1          21d
rook-ceph-mgr-a-8684b6f644-q2ks8                                  1/1     Running     6          69d
rook-ceph-mon-a-5687786c6d-qbn6n                                  1/1     Running     1          21d
rook-ceph-mon-b-798f7d9c5f-cx6cj                                  1/1     Running     1          21d
rook-ceph-mon-c-58cd98ff65-9pnsx                                  1/1     Running     3          68d
rook-ceph-operator-858bc99959-n82dm                               1/1     Running     3          59d
rook-ceph-osd-0-7d9cc585b5-rbzpr                                  1/1     Running     1          21d
rook-ceph-osd-1-6cc88f6b6d-qcdfb                                  1/1     Running     12         69d
rook-ceph-osd-10-7fd649f67b-fl29p                                 1/1     Running     12         69d
rook-ceph-osd-11-7fd6c55c68-klldb                                 1/1     Running     14         69d
rook-ceph-osd-12-55cbd47d8-9m7dx                                  1/1     Running     1          21d
rook-ceph-osd-13-75989ff544-9w642                                 1/1     Running     1          21d
rook-ceph-osd-14-6fb7d9bb4d-bfxwz                                 1/1     Running     12         69d
rook-ceph-osd-15-6f899f5878-pmgwk                                 1/1     Running     14         69d
rook-ceph-osd-16-76957878c8-8r9pt                                 1/1     Running     1          21d
rook-ceph-osd-17-fb5859cf-j7448                                   1/1     Running     1          21d
rook-ceph-osd-18-68cb5f774c-fxhmk                                 1/1     Running     12         69d
rook-ceph-osd-19-dbbf78f55-fzjx2                                  1/1     Running     1          21d
rook-ceph-osd-2-68559b959f-7wvp8                                  1/1     Running     1          21d
rook-ceph-osd-20-84d687774d-8vj94                                 1/1     Running     14         69d
rook-ceph-osd-21-f585fffbf-28bzv                                  1/1     Running     1          21d
rook-ceph-osd-22-78d996c454-6gf6c                                 1/1     Running     12         69d
rook-ceph-osd-23-5b889cb4f7-t4zm6                                 1/1     Running     1          21d
rook-ceph-osd-24-569b9f9755-9cbsx                                 1/1     Running     14         69d
rook-ceph-osd-25-56c4c54758-cpn96                                 1/1     Running     1          21d
rook-ceph-osd-26-79ffd5f9dc-csft5                                 1/1     Running     12         69d
rook-ceph-osd-27-67fbc74f5c-ct9gw                                 1/1     Running     1          21d
rook-ceph-osd-28-c9747946-zpwms                                   1/1     Running     14         69d
rook-ceph-osd-29-7b5b965487-qxhnj                                 1/1     Running     1          21d
rook-ceph-osd-3-78679764b9-6vh4f                                  1/1     Running     14         69d
rook-ceph-osd-30-d9cf5f999-8d4nn                                  1/1     Running     12         69d
rook-ceph-osd-31-57cd898cd9-qd2nz                                 1/1     Running     1          21d
rook-ceph-osd-32-d6cb85666-jntgr                                  1/1     Running     14         69d
rook-ceph-osd-33-5856cbcf9-rcxbg                                  1/1     Running     1          21d
rook-ceph-osd-34-56df5fdb9-2dhrs                                  1/1     Running     1          21d
rook-ceph-osd-35-7d484d6b7b-sfk9w                                 1/1     Running     12         69d
rook-ceph-osd-36-97bdc7454-pss8l                                  1/1     Running     14         69d
rook-ceph-osd-37-689455458b-m2lkb                                 1/1     Running     1          21d
rook-ceph-osd-38-5cb946d6d8-xgb7c                                 1/1     Running     1          21d
rook-ceph-osd-39-d688d54c9-p695j                                  1/1     Running     12         69d
rook-ceph-osd-4-546b44c79-pmwzb                                   1/1     Running     1          21d
rook-ceph-osd-40-7b4b4d79cb-72rrm                                 1/1     Running     14         69d
rook-ceph-osd-41-66858b66c4-lt92w                                 1/1     Running     1          21d
rook-ceph-osd-42-7d46c5cb7c-d4vj4                                 1/1     Running     1          21d
rook-ceph-osd-43-744787dd9c-pns4s                                 1/1     Running     12         69d
rook-ceph-osd-44-7df764cd6-b2kkr                                  1/1     Running     14         69d
rook-ceph-osd-45-58f74c7b85-k698f                                 1/1     Running     1          21d
rook-ceph-osd-46-8c99fb6cc-cn7mm                                  1/1     Running     12         69d
rook-ceph-osd-47-787f9567f5-jf9pk                                 1/1     Running     14         69d
rook-ceph-osd-5-c79df46bc-rszk4                                   1/1     Running     12         69d
rook-ceph-osd-6-6d85cdd88f-rzv7x                                  1/1     Running     1          21d
rook-ceph-osd-7-65d9865cc7-x7b2k                                  1/1     Running     14         69d
rook-ceph-osd-8-674f88bdb6-g59jr                                  1/1     Running     1          21d
rook-ceph-osd-9-c88985bd8-wplc6                                   1/1     Running     1          21d
rook-ceph-osd-prepare-wx-cloudnative-master-2-6cdtd               0/1     Completed   0          7h21m
rook-ceph-osd-prepare-wx-cloudnative-master-3-q6rsv               0/1     Completed   0          7h21m
rook-ceph-osd-prepare-wx-cloudnative-worker-1-zlnzs               0/1     Completed   0          7h21m
rook-ceph-osd-prepare-wx-cloudnative-worker-2-j6rk7               0/1     Completed   0          7h21m
rook-ceph-tools-7b956ff5dc-bbhrl                                  1/1     Running     3          68d
rook-discover-k4mhh                                               1/1     Running     2          69d
rook-discover-v8g45                                               1/1     Running     4          69d
rook-discover-xxvkf                                               1/1     Running     2          69d
rook-discover-zntr4                                               1/1     Running     3          69d

Check the disk changes

# lsblk -f
NAME                                                                                                 FSTYPE      LABEL           UUID                                   MOUNTPOINT
sda
└─sda1                                                                                               xfs                         555f0ba1-80c2-45a3-9fb6-7a61abea08bd   /
sdb                                                                                                  LVM2_member                 SDcVYN-mJZh-VSa1-82Gx-UntZ-XSf4-g9oQHY
└─ceph--c05a90c0--f9df--424e--bdaa--332b59c5aeb1-osd--data--6aa9bb69--b3ec--44e6--81bc--aae7f808aaeb
sr0                                                                                                  iso9660     CentOS 7 x86_64 2020-04-22-00-51-40-00

Use the toolbox admin pod
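
The toolbox manifest ships with the same Rook release-1.4 examples as cluster.yaml; assuming that layout, it can be fetched with:

# wget -O rook/toolbox.yaml https://github.com/rook/rook/raw/release-1.4/cluster/examples/kubernetes/ceph/toolbox.yaml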

# kubectl create -f rook/toolbox.yaml

# kubectl -n rook-ceph get pod -l "app=rook-ceph-tools"
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-tools-6bdcd78654-l4szw   1/1     Running   0          30s

Check cluster status

# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

# ceph status
  cluster:
    id:     e37f16af-b8c6-4b8c-b88d-e8746e504e65
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 2w)
    mgr: a(active, since 7h)
    osd: 48 osds: 48 up (since 2w), 48 in (since 8w)

  data:
    pools:   2 pools, 33 pgs
    objects: 1.19k objects, 3.4 GiB
    usage:   62 GiB used, 131 TiB / 131 TiB avail
    pgs:     33 active+clean

  io:
    client:   14 KiB/s wr, 0 op/s rd, 0 op/s wr

# ceph osd status
ID  HOST                      USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE      
 0  wx-cloudnative-worker-1  1476M  2792G      0        0       0        0   exists,up  
 1  wx-cloudnative-master-3  1235M  2792G      0        0       0        0   exists,up  
 2  wx-cloudnative-worker-2  1245M  2792G      0        0       0        0   exists,up  
...
46  wx-cloudnative-master-3  1242M  2792G      0        0       0        0   exists,up  
47  wx-cloudnative-master-2  1358M  2792G      0        0       0        0   exists,up

# ceph df
--- RAW STORAGE ---
CLASS  SIZE     AVAIL    USED    RAW USED  %RAW USED
hdd    131 TiB  131 TiB  14 GiB    62 GiB       0.05
TOTAL  131 TiB  131 TiB  14 GiB    62 GiB       0.05

--- POOLS ---
POOL                   ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
device_health_metrics   1  1.8 MiB       48  1.8 MiB      0     41 TiB
replicapool             2  3.4 GiB    1.15k  3.4 GiB      0     41 TiB

# rados df
POOL_NAME                 USED  OBJECTS  CLONES  COPIES  MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS       RD    WR_OPS       WR  USED COMPR  UNDER COMPR
device_health_metrics  1.8 MiB       48       0     144                   0        0         0    3263  3.9 MiB      3312  3.2 MiB         0 B          0 B
replicapool            3.4 GiB     1145       0    3435                   0        0         0  106945  998 MiB  60285206  614 GiB         0 B          0 B

total_objects    1193
total_used       62 GiB
total_avail      131 TiB
total_space      131 TiB

Ceph Dashboard

Configure external access via Ingress

# vim dashboard-external-https.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: rook-ceph-mgr-dashboard
  namespace: rook-ceph
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "ca-clusterissuer"
    nginx.org/ssl-services: "rook-ceph-mgr-dashboard"
    #kubernetes.io/tls-acme: "true"
    #cert-manager.io/cluster-issuer: "letsencrypt-staging"
spec:
  tls:
   - hosts:
     - rook-ceph.cloudnative.wx
     secretName: rook-ceph.cloudnative.wx
  rules:
  - host: rook-ceph.cloudnative.wx
    http:
      paths:
      - path: /
        backend:
          serviceName: rook-ceph-mgr-dashboard
          servicePort: https-dashboard

# kubectl create -f rook/dashboard-external-https.yaml

# kubectl -n rook-ceph get service
NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
csi-cephfsplugin-metrics   ClusterIP   10.20.226.50    <none>        8080/TCP,8081/TCP   69d
csi-rbdplugin-metrics      ClusterIP   10.20.163.28    <none>        8080/TCP,8081/TCP   69d
rook-ceph-mgr              ClusterIP   10.20.102.232   <none>        9283/TCP            69d
rook-ceph-mgr-dashboard    ClusterIP   10.20.179.9     <none>        8443/TCP            69d

# kubectl describe ingress rook-ceph-mgr-dashboard -n rook-ceph
Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
Name:             rook-ceph-mgr-dashboard
Namespace:        rook-ceph
Address:          
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
TLS:
  rook-ceph.cloudnative.wx terminates rook-ceph.cloudnative.wx
Rules:
  Host                      Path  Backends
  ----                      ----  --------
  rook-ceph.cloudnative.wx  
                            /   rook-ceph-mgr-dashboard:https-dashboard (10.135.204.44:8443)
Annotations:                cert-manager.io/cluster-issuer: ca-clusterissuer
                            kubernetes.io/ingress.class: nginx
                            nginx.org/ssl-services: rook-ceph-mgr-dashboard
Events:                     <none>

Visit https://rook-ceph.cloudnative.wx in a browser.

Username: admin. Password: run the following command and use its output:

# kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo

StorageClass

# wget https://github.com/rook/rook/raw/release-1.4/cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml
# mv storageclass.yaml block-storageclass.yaml
# kubectl create -f rook/block-storageclass.yaml

# kubectl get sc
NAME              PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   113m

Mark it as the default StorageClass

# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/rook-ceph-block patched

# kubectl get sc
NAME                        PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block (default)   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   114m
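
As a final smoke test of dynamic provisioning (the PVC name here is only illustrative), create a small claim against the default StorageClass, check that it becomes Bound, then clean it up:

# cat <<EOF > test-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
EOF

# kubectl apply -f test-pvc.yaml
# kubectl get pvc test-pvc
# kubectl delete -f test-pvc.yaml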
