There is a war about primary keys in the database world. There are two camps. The first camp believes that primary keys should only ever be created on meaningful information. For example, there is an ISO standard for the abbreviation of state names in the United States. You could create a table for looking up state names and make the primary key that abbreviation because it is guaranteed to be unique. The other camp believes that primary keys should never be created on meaningful information because, meaningful information is subject to change and you don’t want your primary keys to be changing. A changing primary key means changing all the tables that are related to that value. This camp believes that all primary keys should be artificial. As far as I’m concerned, both camps are 100% correct. My own personal preference is towards the artificial key camp, but I have built systems with natural keys and they can work just fine. The examples we’re using in these blog posts are for artificial keys, but that is just me staying in my comfort zone. I am not attempting to advocate for one camp or the other. I’ll leave it to others to have that fight. I just want to make you aware that you will see both types in different databases. You may see both types in the same database. In upcoming examples, I will even show you something that is a combination of both types.
One point in favor of natural keys, just because you have an artificial key, you’re not done. You still must define the natural key and set it up as a candidate, or alternate, key in order to be sure that you’re meeting the business requirements for uniqueness as well as simply uniquely identifying the row. If you do not do this, then it will be possible to violate business rules such as having more than one state abbreviation with identical values. So, while artificial keys are easier to set up and maintain, and can perform better, you have the added overhead of having to also set up and maintain the natural key anyway. Don’t ever forget this point if you do choose to work with artificial keys.
One point in favor of artificial keys, data compliance through mechanisms like the GDPR requires you to change the data going to non-production environments so that privacy is maintained. A natural key could also be personally identifying information. Having to change all primary key and foreign key values across your database could be a massive undertaking. Artificial keys, used appropriately, completely avoid this issue.
Artificial vs. Natural keys is not a hill I wish to die on. Nor should you. There are so many more important things to worry about when building your database. The next post in the series will be on the scripts necessary to define primary keys.