Troubleshooting Kubernetes
- Originally Written: April, 2025
Running low on storage - check your logs
I run a couple of small Rancher (RKE1) Kubernetes clusters for some projects and provide storage with Longhorn. They have run fairly well for a couple of years with minimal maintenance, but recently we had a storage issue. The original cluster was stood up quickly and, as sometimes happens with side projects, monitoring and "day 2" operations were an afterthought that went on the TODO list (lesson learned). Monitoring consisted of logging into the cluster every 3 or 4 months to check that all the pods were running, or waiting until a user of the cluster complained that it was broken.
Initially we saw a lot of evicted pods and some with a status of ContainerStatusUnknown. Longhorn was showing failed nodes and no space available, instead of the healthy state we usually see, as in the following image.
Although I no longer have the logs, we saw errors from the nodes and the kubelets on the failed nodes. We eventually found the culprit: the kubelet, whose JSON log had grown to 50G+ over 2 years and consumed all the space on the node. This lists the Docker container directories over 100M, largest first.
du -sh -t 100M /var/lib/docker/containers/* | sort -hr
Once the container was restarted the cluster returned to a healthy state.
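The search for oversized logs can also be narrowed from container directories to individual log files. The following is a minimal sketch, assuming bash and GNU find, du and sort. The function name large_container_logs and its default arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# List container log files larger than a threshold, biggest first.
# large_container_logs is a hypothetical helper name.
#   $1  directory to search (default: /var/lib/docker/containers)
#   $2  size threshold in find(1) syntax, e.g. 100M or 2k (default: 100M)
large_container_logs() {
  local dir="${1:-/var/lib/docker/containers}"
  local threshold="${2:-100M}"
  # -size +N keeps only files larger than N; du -h adds a human-readable
  # size column that sort -hr can order from largest to smallest.
  find "$dir" -type f -name '*.log' -size "+${threshold}" -exec du -h {} + | sort -hr
}

# Usage on a node (sudo is normally needed to read the Docker directory):
#   sudo bash -c "$(declare -f large_container_logs); large_container_logs"
```

Each line of output is a size followed by the path of one log file, so the container ID can be read straight from the directory name.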
Because I had set the cluster up quickly, I had not configured any log rotation either, on top of the missing monitoring and alerting.
On each of the Kubernetes nodes I created a logrotate file containing the following.
# Specifies the log files to be rotated
/var/lib/docker/containers/*/*.log {
    # Keep up to 7 rotated logs before deleting the oldest
    rotate 7
    # Perform the rotation daily
    daily
    # Compress the rotated log files
    compress
    # Continue processing even if a log file is missing
    missingok
    # Delay compression until the next rotation cycle
    delaycompress
    # Copy the log, then truncate the original in place, so the process keeps writing to the same file
    copytruncate
    # Also rotate ahead of the daily schedule once a log exceeds 200M (checked each time logrotate runs)
    maxsize 200M
}
This is the one-liner I use to create the file and rerun logrotate.
- I needed to use $'' as it lets the string interpret escape sequences like \n for newlines. Otherwise you'll end up with the literal \n text in the file rather than a newline.
- -fv forces the rotation and shows verbose output.
echo $'/var/lib/docker/containers/*/*.log {\n rotate 7\n daily\n compress\n missingok\n delaycompress\n copytruncate\n maxsize 200M\n}' | sudo tee /etc/logrotate.d/docker-container > /dev/null && sudo logrotate -fv /etc/logrotate.d/docker-container
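The difference the $'' quoting makes can be checked in isolation before writing anything under /etc. This is a minimal sketch, assuming bash. The helper name count_lines and the two sample strings are made up for illustration.

```shell
#!/usr/bin/env bash
# count_lines is a hypothetical helper: it prints how many lines a string holds.
count_lines() {
  printf '%s\n' "$1" | wc -l | tr -d ' '
}

plain='rotate 7\ndaily'    # ordinary single quotes: \n stays as two literal characters
ansi=$'rotate 7\ndaily'    # ANSI-C quoting: \n becomes a real newline

count_lines "$plain"   # prints 1
count_lines "$ansi"    # prints 2
```

The plain string would land in the logrotate file as a single line containing a backslash and an n, which is why the one-liner relies on the $'' form.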
Cleaning up Evicted pods or pods with Errors
kubectl get po -A | grep Evicted | awk '{print "kubectl delete po -n", $1, $2}' | bash -x
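The same idea can be split into a review step and a delete step, and extended to other statuses such as Error. This is a minimal sketch, assuming bash and the default column layout of kubectl get po -A (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE). The function name delete_cmds_for_status is made up for illustration and is not a kubectl feature.

```shell
#!/usr/bin/env bash
# Read `kubectl get po -A` output on stdin and print (not run) one delete
# command per pod whose STATUS column equals the requested value.
# delete_cmds_for_status is a hypothetical helper name.
delete_cmds_for_status() {
  local status="${1:-Evicted}"
  # With -A, STATUS is the 4th whitespace-separated field. Comparing that
  # field exactly avoids matching pods that merely have the word in their name.
  awk -v status="$status" '$4 == status {print "kubectl delete po -n", $1, $2}'
}

# Usage: review the printed commands first, then pipe them to bash -x.
#   kubectl get po -A | delete_cmds_for_status Evicted
#   kubectl get po -A | delete_cmds_for_status Error | bash -x
```

Printing the commands before running them gives a chance to spot an unexpected namespace or pod in the list.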