Kubernetes debugging

Prevent new pods from being scheduled on a cluster node:
kubectl cordon <cluster node>
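
Cordoning only blocks new scheduling; pods already running on the node stay where they are, and the node stays cordoned until it is uncordoned. A minimal sketch of toggling this, assuming a hypothetical helper name set_schedulable (it is not part of kubectl):

```shell
# set_schedulable NODE on|off   (hypothetical helper, not a kubectl command)
# "off" cordons the node, "on" makes it schedulable again.
set_schedulable() {
  case "$2" in
    off) kubectl cordon "$1" ;;
    on)  kubectl uncordon "$1" ;;
    *)   echo "usage: set_schedulable NODE on|off" >&2; return 2 ;;
  esac
}
```

Example: set_schedulable worker-1 off (worker-1 is a placeholder node name). A cordoned node shows SchedulingDisabled in the STATUS column of kubectl get nodes.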

Force delete a pod (for example, a stuck StatefulSet pod):
kubectl delete pods <pod> --grace-period=0 --force

Get the pods running on a specific node:
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<node>

Scale a StatefulSet:
kubectl scale statefulsets <stateful-set-name> --replicas=<new-replicas>

Node labels:
Add label:
kubectl label nodes <node-name> <label-key>=<label-value>
Show labels:
kubectl get nodes --show-labels
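
Two related label operations are often needed next to the ones above: listing only the nodes that carry a given label, and removing a label again. A small sketch, with hypothetical function names chosen for illustration:

```shell
# nodes_with_label KEY VALUE: list only the nodes labelled KEY=VALUE
nodes_with_label() {
  kubectl get nodes -l "$1=$2"
}

# remove_node_label NODE KEY: a trailing "-" after the key removes the label
remove_node_label() {
  kubectl label nodes "$1" "$2-"
}
```

Example: nodes_with_label disktype ssd, then remove_node_label worker-1 disktype (worker-1 is a placeholder).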

Use nodeSelector to run a pod only on nodes with a given label:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd
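
To check that the nodeSelector took effect, the manifest can be applied and the pod's node looked up. This is only a sketch: the helper name apply_and_locate and the file name pod.yaml are assumptions, not something from these notes.

```shell
# apply_and_locate FILE POD   (hypothetical helper)
# Creates the pod from FILE, then prints the pod with its NODE column.
apply_and_locate() {
  kubectl apply -f "$1" && kubectl get pod "$2" -o wide
}
```

Example: apply_and_locate pod.yaml nginx. If no node carries the disktype=ssd label, the pod stays in Pending instead of being scheduled.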

For OpenShift use:

oc describe pod <pod-id>  

For vanilla Kubernetes:

kubectl describe pod <pod-id>  

Examine the events at the end of the output. In my case they show: Back-off pulling image "coredns/coredns:latest"

This means the node could not pull the image coredns/coredns:latest from the Internet.

  FirstSeen LastSeen    Count   From                SubObjectPath           Type        Reason      Message
  --------- --------    -----   ----                -------------           --------    ------      -------
  5m        5m      1   {default-scheduler }                        Normal      Scheduled   Successfully assigned coredns-4224169331-9nhxj to
  5m        1m      4   {kubelet}   spec.containers{coredns}    Normal      Pulling     pulling image "coredns/coredns:latest"
  4m        26s     4   {kubelet}   spec.containers{coredns}    Warning     Failed      Failed to pull image "coredns/coredns:latest": Network timed out while trying to connect to https://index.docker.io/v1/repositories/coredns/coredns/images. You may want to check your internet connection or if you are behind a proxy.
  4m        26s     4   {kubelet}                   Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "coredns" with ErrImagePull: "Network timed out while trying to connect to https://index.docker.io/v1/repositories/coredns/coredns/images. You may want to check your Internet connection or if you are behind a proxy."

  4m    2s  7   {kubelet}   spec.containers{coredns}    Normal  BackOff     Back-off pulling image "coredns/coredns:latest"
  4m    2s  7   {kubelet}                   Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "coredns" with ImagePullBackOff: "Back-off pulling image \"coredns/coredns:latest\""
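
The same events can also be listed without the rest of the describe output. A hedged sketch, assuming the events have not yet expired from the cluster; the function name pod_warnings is made up for illustration:

```shell
# pod_warnings POD [NAMESPACE]   (hypothetical helper)
# Lists only Warning events that refer to POD, oldest first.
pod_warnings() {
  kubectl get events -n "${2:-default}" \
    --field-selector "involvedObject.name=$1,type=Warning" \
    --sort-by=.lastTimestamp
}
```

Example: pod_warnings coredns-4224169331-9nhxj kube-system (the namespace here is a guess; use the one the pod actually lives in).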

Additional debugging steps

  1. Identify the node by running 'kubectl get pods -o wide' (or 'oc get pods -o wide').
  2. SSH into the node that cannot pull the Docker image.
  3. Check that the node can resolve the hostname of the Docker registry, for example by pinging it.
  4. Try to pull the Docker image manually on the node.
  5. If you are using a private registry, check that your secret exists and is correct. The secret must also be in the same namespace as the pod. Thanks swenzel
  6. Try to pull the image locally on your own machine.
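
The steps above can be sketched as a few shell helpers. All function names are hypothetical, and the sketch assumes the node uses Docker as its container runtime, as in the events shown earlier; on other runtimes the pull command differs.

```shell
# Step 1: show which node each pod was scheduled on.
pods_with_nodes() {
  kubectl get pods -n "${1:-default}" -o wide
}

# Steps 3 and 4, to be run on the node itself (after ssh):
# check that the registry host resolves and answers, then pull by hand.
check_pull_on_node() {
  ping -c 3 "$1" && docker pull "$2"
}

# Step 5: print the pull secrets the pod references, then confirm
# that the named secret exists in the pod's namespace.
check_pull_secret() {
  kubectl get pod "$1" -n "$2" -o jsonpath='{.spec.imagePullSecrets[*].name}'
  echo
  kubectl get secret "$3" -n "$2"
}
```

Example on the node: check_pull_on_node index.docker.io coredns/coredns:latest. Note that a registry may not answer ping even when it is reachable, so a failed ping is a hint rather than proof.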