ephemeral-storage
As of the current Kubernetes version (1.8), ephemeral-storage is still alpha, so the feature is disabled by default. To use it, add --feature-gates=LocalStorageCapacityIsolation=true to the args of both apiserver and kubelet, then restart the processes. ephemeral-storage can be used with many kinds of resources; below I only test it with pods.
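Eviction logic of ephemeral-storage
If an emptyDir's usage exceeds its sizeLimit, the pod is evicted.
If a container's usage (logs, plus the image filesystem when there is no separate overlay partition) exceeds its limit, the pod is evicted.
If the pod's total local ephemeral storage usage (all emptyDirs and containers) exceeds the sum of the limits of all containers in the pod, the pod is evicted.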
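The three rules above can be sketched as a small shell function. This is only an illustration of the checks as described here, simplified to one emptyDir and one container; it is not how the kubelet implements them. The name `should_evict` and its arguments are made up for this example.

```shell
# Hypothetical helper that mirrors the three eviction rules above.
# All arguments are integers in the same unit (for example bytes):
#   $1 emptyDir usage      $2 emptyDir sizeLimit
#   $3 container usage     $4 container limit
#   $5 sum of the limits of all containers in the pod
should_evict() {
  # Rule 1: the emptyDir has grown past its sizeLimit
  if [ "$1" -gt "$2" ]; then echo "emptyDir over sizeLimit"; return 0; fi
  # Rule 2: the container has grown past its own limit
  if [ "$3" -gt "$4" ]; then echo "container over limit"; return 0; fi
  # Rule 3: emptyDir plus container usage exceeds the pod-wide total
  if [ $(( $1 + $3 )) -gt "$5" ]; then echo "pod over total limit"; return 0; fi
  echo "within limits"
  return 1
}
```

For example, `should_evict 1 1 3 3 3` passes the first two checks but trips the third, because 1 + 3 is more than the total limit of 3.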
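Mapping data to local storage
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: fooa
    image: busybox  # placeholder; the original manifest did not specify an image
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "3Gi"
    volumeMounts:
    - name: my-empty-dir  # volume names must be lowercase DNS labels
      mountPath: /data
  volumes:
  - name: my-empty-dir
    emptyDir:
      sizeLimit: "1Gi"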
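Without mapping data to local storage
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: fooa
    image: busybox  # placeholder; the original manifest did not specify an image
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "3Gi"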
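However, this thing only takes effect under certain conditions:
At first glance, ephemeral-storage only works for containers whose images are stored on the root partition. In other words, the default "Docker Root Dir: /var/lib/docker" has to be on the root partition. For any reasonable ops engineer, keeping application paths off the root partition is basic practice, and for a self-respecting k8s operator, giving /var/lib/docker its own partition is perfectly normal.
The test results are as follows: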
Docker version: 18.09.8
k8s version: 1.13.8
Docker Root Dir: /var/lib/docker
kubelet --root-dir: default (/var/lib/kubelet)
/var/lib/docker on the root partition: ephemeral-storage works
/var/lib/docker not on the root partition (on its own partition): ephemeral-storage has no effect
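This was a bit discouraging. Could such a useful feature really be unusable? I did not quite believe it, so I turned to GitHub and found a clue: https://github.com/kubernetes/enhancements/issues/361
One of the replies there says: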
The behavior you describe should work regardless of this feature. Make sure you have --root-dir set correctly. Docker reports its root directory to the kubelet, so as long as your images are stored on the same partition that contains /var/lib/docker (or whatever your docker root dir is), this should work correctly.
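This reply seems to contain a slip: /var/lib/docker was probably written by mistake, and it makes more sense with /var/lib/kubelet instead, because /var/lib/kubelet is the default value of --root-dir. Overall, the point is that as long as "Docker Root Dir: /var/lib/docker" and "kubelet --root-dir" are on the same partition, the feature works.
My tests confirmed exactly this.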
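A quick way to check this condition on a node is to compare the mount points that hold the two directories. The sketch below uses only `df -P` and `awk`; the function names `mount_point_of` and `same_partition` are made up for this example, and it assumes mount points without spaces in their names.

```shell
# Print the mount point of the filesystem that holds the given path.
# Prints nothing if the path does not exist.
mount_point_of() {
  df -P "$1" 2>/dev/null | awk 'NR==2 {print $NF}'
}

# Succeed (exit 0) only when both paths exist and sit on the same filesystem.
same_partition() {
  _a=$(mount_point_of "$1")
  _b=$(mount_point_of "$2")
  [ -n "$_a" ] && [ "$_a" = "$_b" ]
}
```

With the defaults used in this article, the check would be `same_partition /var/lib/docker /var/lib/kubelet && echo "same partition"`. Substitute your own Docker Root Dir and kubelet --root-dir if they differ.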
When /var/lib/docker is on its own partition, how do you get kubelet's root-dir onto the same partition as /var/lib/docker? There are two options:
Option 1. Change root-dir
kubectl drain nodename
systemctl stop docker
systemctl stop kubelet
Edit /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf:
add --root-dir=/var/lib/docker/kubelet/
change /var/lib/kubelet/ to /var/lib/docker/kubelet/
Edit /etc/kubernetes/kubelet.conf:
change /var/lib/kubelet/ to /var/lib/docker/kubelet/
mv /var/lib/kubelet /var/lib/docker
systemctl daemon-reload
systemctl start docker
systemctl start kubelet
kubectl uncordon nodename
One leftover issue: after kubelet restarts, the following directory is created again automatically, but kubelet runs normally.
# tree /var/lib/kubelet -L 3
/var/lib/kubelet
└── device-plugins
├── DEPRECATION
├── kubelet_internal_checkpoint
└── kubelet.sock
Option 2. Symlink root-dir into /var/lib/docker
kubectl drain nodename
systemctl stop docker
systemctl stop kubelet
mv /var/lib/kubelet /var/lib/docker
ln -s /var/lib/docker/kubelet /var/lib/kubelet
systemctl start docker
systemctl start kubelet
kubectl uncordon nodename
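PS: before running the mv above, use df to check whether anything under /var/lib/kubelet is still mounted. If so, umount it first and then mv, otherwise mv fails with "Device or resource busy".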
# mv kubelet/ /var/lib/docker
mv: cannot remove ‘kubelet/pods/73a3d42a-b2a5-11e9-8e8d-005056b4f9d3/volumes/kubernetes.io~secret/kube-proxy-token-jccg4’: Device or resource busy
mv: cannot remove ‘kubelet/pods/73a36f7a-b2a5-11e9-8e8d-005056b4f9d3/volumes/kubernetes.io~secret/etcd-certs’: Device or resource busy
mv: cannot remove ‘kubelet/pods/73a36f7a-b2a5-11e9-8e8d-005056b4f9d3/volumes/kubernetes.io~secret/calico-node-token-tzfv8’: Device or resource busy
mv: cannot remove ‘kubelet/pods/e2542d86-ceef-11e9-8e8d-005056b4f9d3/volumes/kubernetes.io~secret/node-exporter-token-5926x’: Device or resource busy
# df -h
tmpfs 20517564 0 20517564 0% /var/lib/kubelet/pods/73a3d42a-b2a5-11e9-8e8d-005056b4f9d3/volumes/kubernetes.io~secret/kube-proxy-token-jccg4
tmpfs 20517564 0 20517564 0% /var/lib/kubelet/pods/73a36f7a-b2a5-11e9-8e8d-005056b4f9d3/volumes/kubernetes.io~secret/etcd-certs
tmpfs 20517564 0 20517564 0% /var/lib/kubelet/pods/73a36f7a-b2a5-11e9-8e8d-005056b4f9d3/volumes/kubernetes.io~secret/calico-node-token-tzfv8
tmpfs 20517564 0 20517564 0% /var/lib/kubelet/pods/e2542d86-ceef-11e9-8e8d-005056b4f9d3/volumes/kubernetes.io~secret/node-exporter-token-5926
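To find those leftover mounts without scanning the df output by eye, something like the following sketch may help. It reads a mounts table in /proc/mounts format and prints every mount point under a given directory, children before their parents so they can be unmounted in that order. The function name `mounts_under` is made up for this example.

```shell
# List mount points located under a directory.
#   $1 directory to search under
#   $2 mounts table to read (defaults to /proc/mounts)
# Output is reverse-sorted so nested mounts come before their parents.
mounts_under() {
  _dir=${1%/}
  _table=${2:-/proc/mounts}
  # Field 2 of each line is the mount point; keep those that start with "$_dir/"
  awk -v p="$_dir/" 'index($2, p) == 1 {print $2}' "$_table" | LC_ALL=C sort -r
}
```

With docker and kubelet stopped, `mounts_under /var/lib/kubelet | while read -r m; do umount "$m"; done` would unmount whatever is left before the mv.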
Test result:
After using dd inside the container to generate a 5G file, the pod finally gets evicted.
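The exact dd invocation is not recorded here, so the following is only a sketch of one way to produce such a file. The helper name `write_zeros` and the target path are made up for this example.

```shell
# Write N MiB of zeros to a file.
#   $1 target file    $2 size in MiB (5120 MiB = 5 GiB)
write_zeros() {
  dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null
}
```

Inside the container, `write_zeros /tmp/fill 5120` writes 5 GiB to the container's writable layer, which should push the pod past its ephemeral-storage limit. The path /tmp/fill is just an example; any location that is not on a mounted volume will do.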
# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ptest-trade-747b894f54-mhrv4 0/1 Evicted 0 3m37s <none> lin-40-16-206.lb.com <none> <none>
ptest-trade-747b894f54-tx847 0/1 Running 0 26s 10.46.206.96 lin-40-16-206.lb.com <none> <none>
# kubectl describe pod ptest-trade-747b894f54-mhrv4
Events:
Warning Evicted 12s kubelet, lin-40-16-206.lb.com Pod ephemeral local storage usage exceeds the total limit of containers 5Gi.
Warning ExceededGracePeriod 2s kubelet, lin-40-16-206.lb.com Container runtime did not kill the pod within specified grace period.
Normal Killing 1s kubelet, lin-40-16-206.lb.com Killing container with id docker://ptest-trade:Need to kill Pod