In my Kubernetes cluster, pods occasionally get stuck in the "CreateContainerError" state with no other apparent errors. Every time, the problem is fixed simply by deleting the stuck pod and letting the Deployment recreate it. I suspect an NFS PVC issue, but the problem occurs only once every 3-4 months, which makes debugging very difficult.

The real problem is that when this happens, the only way to bring the affected service back is to manually delete the pod. I tried to find a way to handle this in line with the Kubernetes philosophy, but apparently there is no option to force a Deployment to recreate a pod.

My question is: is there a way or a tool to automatically recreate pods that end up in the "CreateContainerError" state?
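
To make the goal concrete, here is a minimal sketch of the kind of automation in question. It only scripts the manual workaround described above, and it rests on some assumptions: `kubectl` is on the PATH and already authenticated against the cluster, and the function names `find_stuck_pods` and `delete_stuck_pods` are made up for illustration. It is not a built-in Kubernetes mechanism.

```python
import json
import subprocess

# Waiting reason that kubectl shows in the STATUS column for the stuck pods.
STUCK_REASON = "CreateContainerError"


def find_stuck_pods(pod_list):
    """Return (namespace, name) for each pod that has a container waiting
    with reason STUCK_REASON. `pod_list` is the parsed output of
    `kubectl get pods -o json`. (Hypothetical helper, for illustration.)"""
    stuck = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        statuses = (status.get("initContainerStatuses") or []) + (
            status.get("containerStatuses") or []
        )
        for container_status in statuses:
            waiting = container_status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == STUCK_REASON:
                meta = pod["metadata"]
                stuck.append((meta["namespace"], meta["name"]))
                break  # one stuck container is enough to flag the pod
    return stuck


def delete_stuck_pods():
    """List pods in all namespaces and delete the stuck ones, so that
    their controller (e.g. a Deployment) recreates them.
    Assumes kubectl is installed and has permission to list and delete pods."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    for namespace, name in find_stuck_pods(json.loads(result.stdout)):
        subprocess.run(
            ["kubectl", "delete", "pod", name, "-n", namespace],
            check=True,
        )
```

Calling `delete_stuck_pods()` periodically (for example from cron) would mimic the manual fix. Note that this sketch deletes any matching pod, including ones that no controller would recreate, so it would probably need to be restricted to specific namespaces or labels.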
