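Container re-creation fails

Symptoms

  • The docker log contains: caused "lstat /proc/12355/ns/ipc: no such file or directory": unknown
  • The kubelet log contains "Error syncing pod" and "StartContainer" failures
  • Once calico goes down, container networking on the node breaks (other machines delete their routes to the affected node), and monitoring reports coredns as unavailable (a Zabbix DNS check tests connectivity from the node to coredns)

kubelet log: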

E1125 12:01:52.879891   15686 pod_workers.go:186] Error syncing pod 018b823f-eefe-11e8-93f1-e8611f143f2c ("calico-node-9dmkn_kube-system(018b823f-eefe-11e8-93f1-e8611f143f2c)"), skipping: failed to "StartContainer" for "calico-node" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=calico-node pod=calico-node-9dmkn_kube-system(018b823f-eefe-11e8-93f1-e8611f143f2c)"
E1125 12:01:55.879732   15686 pod_workers.go:186] Error syncing pod 69e93844-eefe-11e8-93f1-e8611f143f2c ("dev-router-mjktm_dev(69e93844-eefe-11e8-93f1-e8611f143f2c)"), skipping: failed to "StartContainer" for "nginx-ingress-controller" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=nginx-ingress-controller pod=dev-router-mjktm_dev(69e93844-eefe-11e8-93f1-e8611f143f2c)"
E1125 12:02:06.969298   15686 remote_runtime.go:213] StartContainer "efcc9baa1ddb971592b799dc6eb0a1c0c8d0bbcf4167ae15df464176099dc24d" from runtime service failed: rpc error: code = Unknown desc = failed to start container "efcc9baa1ddb971592b799dc6eb0a1c0c8d0bbcf4167ae15df464176099dc24d": Error response from daemon: OCI runtime create failed: container_linux.go:341: creating new parent process caused "container_linux.go:1713: running lstat on namespace path \"/proc/16130/ns/ipc\" caused \"lstat /proc/16130/ns/ipc: no such file or directory\"": unknown
E1125 12:02:06.969372   15686 kuberuntime_manager.go:744] container start failed: RunContainerError: failed to start container "efcc9baa1ddb971592b799dc6eb0a1c0c8d0bbcf4167ae15df464176099dc24d": Error response from daemon: OCI runtime create failed: container_linux.go:341: creating new parent process caused "container_linux.go:1713: running lstat on namespace path \"/proc/16130/ns/ipc\" caused \"lstat /proc/16130/ns/ipc: no such file or directory\"": unknown
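
To triage log lines like the ones above, it can help to pull out the PID whose namespace path runc failed to lstat and check whether that process still exists on the node. The following is a minimal sketch, not part of the original notes; the function names are hypothetical, and the interpretation that the PID belongs to the pod's sandbox (pause) container is an assumption.

```python
import os
import re

# Matches the innermost runc error, e.g.
#   lstat /proc/16130/ns/ipc: no such file or directory
NS_ERROR_RE = re.compile(r"lstat (/proc/(\d+)/ns/(\w+)): no such file or directory")


def missing_namespace_paths(log_text):
    """Return sorted, de-duplicated (pid, namespace, path) tuples from log text."""
    found = set()
    for match in NS_ERROR_RE.finditer(log_text):
        path, pid, namespace = match.group(1), int(match.group(2)), match.group(3)
        found.add((pid, namespace, path))
    return sorted(found)


def describe(pid, path):
    """Report whether the process and its namespace file exist on this host."""
    if not os.path.exists(f"/proc/{pid}"):
        return f"pid {pid} is gone, so {path} cannot exist"
    # lexists mirrors the lstat call in the error: it does not follow the link.
    if not os.path.lexists(path):
        return f"pid {pid} is alive but {path} is missing"
    return f"{path} exists now (the error may have been transient)"


# Example (hypothetical usage): feed saved kubelet log text to the helpers.
#   for pid, _ns, path in missing_namespace_paths(open("kubelet.log").read()):
#       print(describe(pid, path))
```

Run the check on the affected node, preferably as root, since entries under /proc/<pid>/ns may not be visible to other users. If the PID is gone, the containers in that pod are trying to join a namespace that no longer exists, which matches the repeated StartContainer failures.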

Related links

Related calico issue

This may be a CNI problem.

Follow the calico release notes and apply patch-version upgrades promptly.

Consider upgrading to 3.2.4: https://docs.projectcalico.org/v3.2/releases/

live restore

Enable live restore so that restarting the docker daemon does not affect running containers.
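
Live restore is controlled by the "live-restore" key in the Docker daemon configuration file, usually /etc/docker/daemon.json. As a rough sketch (the helper name is hypothetical, and the file location is an assumption about the node's setup), the key can be merged into an existing configuration without disturbing other settings:

```python
import json


def with_live_restore(daemon_json_text):
    """Return daemon.json text with "live-restore": true, keeping other keys."""
    # An empty or missing file is treated as an empty configuration.
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    config["live-restore"] = True
    return json.dumps(config, indent=2) + "\n"


# Example (hypothetical usage, run as root on the node):
#   path = "/etc/docker/daemon.json"
#   new_text = with_live_restore(open(path).read())
#   open(path, "w").write(new_text)
```

After writing the file, reloading the daemon (for example `systemctl reload docker` on systemd hosts, or sending SIGHUP to dockerd) should apply the setting without restarting containers.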

unregister_netdevice: waiting for lo to become free.

It is not yet known whether this kernel message is related.

02-工程实践/kubernetes/issue/proc_ns_ipc.txt · Last modified: 2020/04/07 06:34 by annhe