For the most part, GKE has just worked for me. I’ve not had many problems, but last night I came across an issue that had me scratching my head for a bit. An update to this site, funnily enough, failed to deploy and ended up with the container stuck in CrashLoopBackOff. So I started with a bit of log debugging:
kubectl describe pod <POD_NAME>
I decided to redeploy another service. Whoops, the same problem. Then I remembered an email from a couple of weeks back about my node version being incompatible with the master. I logged into the console and could see my cluster was three versions behind, so I started the cluster update and found something else to do. After the update, all seemed back to normal until I noticed that a very old deployment for a Postgres service still had the CrashLoopBackOff status. After checking the logs, I inspected the deployment file. Everything looked fine until I noticed an issue with the image:
spec:
  containers:
  - image: postgres
Now, when it comes to pulling in anything external, I’m a real stickler for version numbers, but this one slipped through the net. At some point during the 400+ days that the pod had been running, the Postgres image had been updated from 10.8 to 11.3, and this caused the container to crash on startup because the internal storage format had changed between those versions. After adding the version to the image and redeploying, everything returned to normal.
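For reference, the pinned version of that fragment might look something like the sketch below. An image with no tag resolves to `latest`, which is why a freshly recreated pod picked up the newer major version. The `10.8` tag is an assumption based on the version the data was originally created with, and the container name is purely illustrative.

```yaml
spec:
  containers:
  - name: postgres          # illustrative name, not from the original file
    image: postgres:10.8    # assumed tag: pinned to the version that created the data
```

Pinning the tag means a rescheduled or restarted pod pulls the same Postgres release instead of whatever `latest` points to that day.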