There are different ways to do Kubernetes database backups, so why use pg_dump? There are two benefits to this option: simplicity and consistency. It ships with the standard PostgreSQL distribution, and it makes consistent backups even while the database is in use, without blocking other users from accessing it. In this tutorial, I will explain how to create PostgreSQL backups using this command.
While we are using Velero (https://velero.io) for Kubernetes backups, I felt a bit uneasy about the database backups, so I decided to also have some good old SQL dumps for our Postgres databases. On VMs we have used this type of backup for quite some time; for the databases running in Kubernetes, we didn't, until now.
If you want to understand how this has been created and brush up your Kubernetes knowledge, read on.
At the core is a simple shell script based on pg_dump, which creates a SQL dump for a database. Even if in Kubernetes this is less important, to stay consistent with the style we use on our VMs, we will dump each database into its own file. This means we are not using pg_dumpall; instead, we loop over the list of databases and dump each one with pg_dump.
A few notes on how this may differ from other scripts found online: each database is dumped to its own file, and the image is built so you can also exec into a running pod to test or restore (more on this below).
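To make this concrete, here is a minimal sketch of what such a script can look like. It is not necessarily the exact script from the repository; the /data dump directory and the use of the standard libpq environment variables (PGHOST, PGUSER, PGPASSWORD) are my assumptions:

#!/bin/bash
# pgdump.sh -- sketch: dump every database into its own file
# assumes PGHOST, PGUSER, PGPASSWORD are set (standard libpq variables)
set -euo pipefail

DUMPDIR="${DUMPDIR:-/data}"    # where the dump volume is mounted
STAMP=$(date +%Y%m%d-%H%M)

# list all real (non-template) databases
DBS=$(psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate;")

for DB in $DBS; do
    echo "dumping $DB"
    pg_dump "$DB" > "$DUMPDIR/${DB}-${STAMP}.sql"
done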
The Dockerfile for building the Docker image is very simple: we start from the bitnami/postgresql image, which already has all the PostgreSQL client tools, and add the pgdump.sh script.
Setting a 'do nothing' ENTRYPOINT is useful in this case because you can start a container or a pod from it, exec 'inside' it and test things.
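For illustration, a minimal sketch of such a Dockerfile; the image tag is an assumption, and 'sleep infinity' is one way to implement the do-nothing entrypoint:

FROM bitnami/postgresql:latest
# add the dump script (--chmod requires BuildKit; alternatively chmod it before building)
COPY --chmod=0755 pgdump.sh /pgdump.sh
# 'do nothing' entrypoint: the container just stays up so you can exec into it
ENTRYPOINT ["sleep", "infinity"]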
The steps for building the image and pushing it to the vvang repository on Docker Hub:
docker build -t pgdump:0.5 .
docker tag pgdump:0.5 vvang/pgdump:0.5
docker push vvang/pgdump:0.5
I have tested this on kind (Kubernetes in Docker, a local cluster) and on DigitalOcean managed Kubernetes, so I'll just assume you have kubectl and helm access to a Kubernetes cluster.
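The postgresql chart used below comes from the Bitnami chart repository, so if you don't have it configured yet, add it first:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update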
I have created a namespace pg-helm and I used helm to install postgres in there:
kubectl create ns pg-helm
helm install -n pg-helm --set auth.postgresPassword='p123456' mypg bitnami/postgresql
As you can see in the instructions displayed by the helm command, you can access the postgres server using the service mypg-postgresql (mypg-postgresql.pg-helm from other namespaces), with the user postgres and the password stored in the secret mypg-postgresql:
kubectl -n pg-helm get services
kubectl -n pg-helm get secrets
kubectl get secret --namespace pg-helm mypg-postgresql -o jsonpath="{.data.postgres-password}" | base64 -d
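To quickly verify the connection, you can run a throwaway psql client pod, similar to what the chart's notes suggest (the pgclient pod name is arbitrary):

export POSTGRES_PASSWORD=$(kubectl get secret --namespace pg-helm mypg-postgresql -o jsonpath="{.data.postgres-password}" | base64 -d)
kubectl run pgclient --rm -ti --restart=Never --namespace pg-helm --image bitnami/postgresql --env="PGPASSWORD=$POSTGRES_PASSWORD" --command -- psql -h mypg-postgresql -U postgres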
If you uninstall the helm chart, the PVC won't be deleted, as a safety measure. If you reinstall the chart, the old PVC and the existing postgres data will be reused.
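You can check this at any time by listing the PVCs in the namespace; the claim (typically named something like data-mypg-postgresql-0) survives a helm uninstall:

kubectl -n pg-helm get pvc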
I have decided to use dedicated volumes for the dumps, so first we'll need to create a PVC, see the file pvc.yml.
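A sketch of what pvc.yml can look like; the storage size (and relying on the default storage class) are assumptions you should adjust:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypg-pgdump
  namespace: pg-helm
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi   # assumption, size it for your dumps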
Then, we'll create a deployment (in fact, a simple pod would suffice) which can be used to test the dump script or to do restores. Looking at the file deployment.yml:
- all the objects created (PVC, deployment, cronjob) will have the same name, in this case mypg-pgdump
- the pod uses the volume created by the PVC
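A sketch of such a deployment.yml. The libpq environment variable names are an assumption carried over from the script sketch above; the secret name and key match the helm output we saw earlier:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mypg-pgdump
  namespace: pg-helm
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mypg-pgdump
  template:
    metadata:
      labels:
        app: mypg-pgdump
    spec:
      containers:
        - name: pgdump
          image: vvang/pgdump:0.5
          env:
            - name: PGHOST
              value: mypg-postgresql
            - name: PGUSER
              value: postgres
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: mypg-postgresql
                  key: postgres-password
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: mypg-pgdump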
After you create the PVC and the deployment, you can test things:
kubectl apply -f pvc.yml
kubectl apply -f deployment.yml
kubectl -n pg-helm get pods
# note the mypg-pgdump-... pod name above and use it in the next command
kubectl -n pg-helm exec -ti mypg-pgdump-6cdfc4c966-kq6j2 -- bash
# now you are 'inside' the container
/pgdump.sh      # this will run the backup
ls -la /data    # this will show the dumps
exit
If this works, everything is ok and you can proceed with the cronjob. Otherwise, you can run various debug commands inside the mypg-pgdump pod.
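For example, a few quick checks I'd run from inside the pod, assuming the connection is configured through the libpq environment variables as sketched above:

env | grep ^PG    # are the connection variables set?
psql -l           # can we reach the server and list the databases?
df -h /data       # is the dump volume mounted, with enough free space?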
Kubernetes has a dedicated object for cron jobs, the CronJob, and its spec is pretty similar to a Deployment's. Look at the file cronjob.yml, where in the spec section you will recognize most of it. Only schedule and restartPolicy are new.
The schedule syntax is the same as that of the standard Unix cron daemon (no surprise here). You may change the schedule: line to see some results faster.
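A sketch of cronjob.yml under the same assumptions as the deployment above. Since the image's ENTRYPOINT does nothing, the job has to explicitly run /pgdump.sh; the daily schedule is just an example:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mypg-pgdump
  namespace: pg-helm
spec:
  schedule: "0 2 * * *"    # every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure    # retry on failure instead of looping forever
          containers:
            - name: pgdump
              image: vvang/pgdump:0.5
              command: ["/pgdump.sh"]    # override the 'do nothing' entrypoint
              env:
                - name: PGHOST
                  value: mypg-postgresql
                - name: PGUSER
                  value: postgres
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: mypg-postgresql
                      key: postgres-password
              volumeMounts:
                - name: data
                  mountPath: /data
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: mypg-pgdump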
kubectl apply -f cronjob.yml
kubectl -n pg-helm get cj
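Once the schedule fires, each run creates a Job and a pod; you can inspect them and their logs (the pod name below is a placeholder):

kubectl -n pg-helm get jobs
kubectl -n pg-helm get pods
kubectl -n pg-helm logs <pgdump-job-pod-name>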
And that’s it. Stay tuned for part 2, where we will discuss Helm and how to convert these static YAML files into a Helm chart.
You may find all the files here: https://github.com/viorel-anghel/pgdump-kubernetes.git
Viorel Anghel has 20+ years of experience as an IT professional, taking on various roles such as Systems Architect, Sysadmin, Network Engineer, SRE, DevOps, and Tech Lead. He has a background in Unix/Linux systems administration, high availability, scalability, and change and config management. Viorel is also a Red Hat Certified Engineer and AWS Certified Solutions Architect, working with Docker, Kubernetes, Xen, AWS, GCP, Cassandra, Kafka, and many other technologies. He is the Head of Cloud and Infrastructure at eSolutions.