As a part of my own journey of learning within PostgreSQL, I’ve decided that I’m going to take part in PGSQL Phriday as often as I can, just as a way to continue to stretch my knowledge of this platform. Along the way, hopefully, I can help you learn a little too. The topic of this first post is Two Truths and a Lie about PostgreSQL.
Now, the lie could easily be that I’m in any way qualified to talk about this topic, however, I can do a little better than that. Let’s start off with simple, but important information, backups:
- Pg_dump can go to either a straight set of SQL, or, to a compressed digital format
- Pg_dumpall backups can be used to restore over existing databases with data
- You can use Pg_dump to backup individual schemas
Now, in keeping with the game, one of these statements is a lie. Yeah, all the hard core PostgreSQL people have already spotted it. However, those of us just learning, this stuff might not be obvious. Let’s talk about these.
The command to go to Pg_dump is pretty straight forward:
pg_dump databasename > backupfile;
The output from this command, by default, will be a series of SQL files defining the objects and the data within the database. You can then use Pg_restore to restore the database. However, what if you don’t want to deal with lots of individual files? What if you want a big fat formatted file like you get from a SQL Server backup (fair warning, I know SQL Server relatively well, so I use it as a reference point for my PostgreSQL learning)? Ah, then you have to modify the code to use -F, capitalized, for Format. I have a number of choices too:
-Fp – Plain, or the default
-Fd – Directory, each table gets its own directory along with the data in it
-Ft – Tar, a compressed file
-Fc – Custome, a binary compressed file, what I’m looking for.
So, we would modify the output to this:
pg_dump -Fc databasename > backupfile;
As I stated in the article I wrote and linked above, you can use Pg_dumpall to backup all the databases on a given PostgreSQL server with a single command. This is a lot easier than issuing pg_dump multiple times. The basic syntax is crazy simple:
Pg_dumpall > location;
Now, when you’re ready to restore all those databases, you don’t need to sweat that there is data in them. You can simply run the following command:
Pg_restoreall < location;
That will read the files & restore your databases. Nice & simple really. For more details, read about Pg_restoreall here.
Pg_dump Individual Schemas
I really like how well Pg_dump does some functionality. There are so many options for modifying what is getting backed up. One of my favorite is the ability to simply pull one schema, separate from the others. Syntax isn’t all that bad either:
Pg_dump -n 'dev.*' -Fc databasename > backupfile;
You can get really funky with this, because it’s not really just going after schema. Instead, that’s all about pattern matching. It supports regular expressions. You can get quite creative in deciding what gets backed up. The link above for pg_dump has more details.
Which Was the Lie?
It was Pg_restoreall. For those who followed the link to Pg_restoreall, I hope you enjoyed the song. Pg_dumpall is a wrapper around Pg_dump. However, it doesn’t implement every aspect of Pg_dump. So, for example, you can’t run a restore on the database if there’s data within it, unless you create the backup using “clean” or ‘-c’ to ensure drops occur before the restore process (but it drops everything before the restore process, so plan accordingly). You can read more on Pg_dumpall here (honest link this time).
I hope you enjoyed my little contribution to Two Truths and a Lie about PostgreSQL as part of #PGSQLPhriday #001. Follow the link at the top of the page for more about this fun community blogging effort.