How to automatically backup a PostgreSQL database to Microsoft Azure
I wanted to backup a PostgreSQL database automatically to some offsite storage/server.
The first thing I had to do was decide where to put the backup. In the end I decided to use Azure as part of the solution. I know people say Azure’s expensive, but you can get a 12 month free trial. So at the time of writing, this solution doesn’t cost anything.
Let’s start with the Azure end of it. There are lots of options in Azure and it can take a bit of time to work where you should store the backup file. I knew nothing about Azure when I started, but I worked out that the best option was to use blob storage.
To set up blob storage I did the following:
- Create a subscription account on Azure - go for the 12 months free trial.
- Create a resource group.
- Then create a storage account.
- Add a container - that’s where your backup files get stored (as blobs).
For each one of the steps above you can pretty much just keep the default settings.
Now let’s look at the database server. I've got PostgreSQL running on Ubuntu. I created a simple script to run the backup and then copy the file across to the container in Azure.
Here’s the full script:
Let’s go through it, section by section.
- We set three variables at the top that used to build the name of our backup file. The name of the backup is the name of the database and the date.
-
The next bit deletes any previous backups. It starts with a ‘find’ that looks for files that already exist in the backup folder. There are some options after the path to the backup folder:
-mtime 3 - find all files modified more than three days ago
-type f - search for files
The next bit deletes the files we’ve found. We run the rm command for each file found by find:
-exec rm -f {} \;
-f means it doesn’t stop to prompt the user or get stuck on non-existent files.
{} - tell rm to run on the files found.
\; - the semicolon ends the command. The backslash is to escape the semicolon so it’s used by the find command and not the shell.
I also log any errors and the standard output to a separate file:
2> /var/lib/postgresql/backups/auto/$FILENAME.delete.log 1>&2
-
This is the part that does the backup:
sudo -u postgres /usr/lib/postgresql/10/bin/pg_dump -v -F c -d $DATABASE -h localhost -p 5432 -U postgres -f /var/lib/postgresql/backups/auto/$FILENAME
We’re running pg_dump as the postgres user. We’re also using these options:
- -v - specifies verbose mode. pg_dump will output object comments and start/stop times to the dump file and also send progress messages to standard error.
- -F c - specifies the backup file format, c means custom.
- -d $Database - the database we want to backup, using the variable we set at the start.
- -h localhost -p 5432 - host & port of the database server.
- -U postgres - the username that's used to connect to the database.
- -f /var/lib/postgresql/backups/auto/$FILENAME - the file where we want our backup to go.
Again, as with the delete, we’re redirecting any errors to a separate log file each time we run the backup.
-
The last of the script copies the backup file to Azure. It uses Azcopy, a free tool that’s available for Windows, MacOS and Linux. Get started with AzCopy
To use Azcopy you need a secure key for your storage account in Azure.
There were 3 other things I needed to do to get the backup script to run:
-
Make the backup script an executable:
sudo chmod +x /home/mydbscripts/dbbackup.sh
- Edit the pg_hba.conf file on the server.
- Give the postgres user account the server permission to write files to the backup directory.
Finally, the last step was to edit /etc/crontab and schedule the backup to run every evening.
sudo nano /etc/crontab and then update the crontab file so the dbbackup.sh script runs every evening.
Every few days I check that the backup has worked and the latest file is safely stored in Azure. From time to time I also pick a backup in Azure, download and then do a restore to a test version of PostgreSQL.