Foreign Key Constraints

From Jonathan Gardner's Tech Wiki
Jump to: navigation, search

Abstract

Never use foreign key constraints. Use foreign keys, however.

Definitions

Key
A column or group of columns that will always have a unique value or combination of values per row. A key can be used to uniquely identify a row, all the time.
Foreign Key
A column or group of columns that contains values that reference the key of another table.
Foreign Key Constraint
A limitation on the data in a database such that no foreign key references a row that does not exist.

Problems

Typically, people learn about foreign keys, then they learn about foreign key constraints, and then they think they want foreign key constraints everywhere.

Arguments Against Foreign Key Constraints

In reality, foreign key constraints are never useful. You should never try to create them, in any system.

People Expect the Constraint to be Broken All the Time

In reality, people understand that foreign keys aren't consistent. If I gave you a random phone number, you do not automatically assume that it works, or that it even references the intended target.

The Idea that You Can Limit Information is Stupid

Think about what foreign key constraints say: "Either this data is consistent, or it doesn't exist." That's a rather broad statement. Inconsistent data exists in reality all the time, and we are fine with that. Trying to enforce arbitrary rules is like sticking your fingers in your ears and saying, "I can't hear you!"

Your Use Cases Demand You Record Inconsistent Data

You are often going to encounter a Catch-22. You need to grant a user permissions, for instance, who has not yet been added to the system.

While you can use proxy keys and other strategies, simply relaxing all the foreign key constraints resolves the Catch-22 much more simply.

Your Code is Robust

By handling the case of missing data early, you solve a whole lot of problems in the beginning. Having a pessimistic view about your code dependencies is always a good idea.

You Write Consistency Checks

You will want to see where your data is inconsistent. This is good practice, just because it helps you understand what your data is doing.

Data consistency checks are not cheap and in very large, rapidly changing data sets, almost impossible to implement.

Scaling

Without foreign key constraints, you can add, delete, or modify data without any regard to alternate data sources or data sizes.

Weak Coupling

Weak Coupling is a good design practice. By making the interaction between systems small and well-defined, it becomes much easier to understand, fix, and improve your system.

Simplicity

There are systems and proposals to make foreign keys work, despite the arguments above. However, all of them result in a much more complicated system. This is a huge burden in developer time, mind share, and documentation. (See Complexity)

Arguments For Foreign Key Constraints

Obviously, I've made up my mind, so I'm going to shoot these arguments down.

My Code Is Simpler

You may be correct that it's easier to assume that the data being referred to exists. However, even with foreign key constraints, this is not always the case. What if you are given a bad ID? Or what if, in between the time you look up the id and actually fetch the data, the data disappears? Since you have to guard against these cases anyway, foreign key constraints do not save you any development time. They only give you a false sense of security.

The Data is Small and Colocated

Just because your data is small and colocated does not imply it will always be so. What happens when the New York office wants their own version of your database? Are you absolutely certain that Los Angelos and New York will never want to refer to each other's data?

We Shouldn't Have Inconsistent Data

The argument is that data should always be consistent.

This is a recipe for disaster. There is no reason in the world why data should be consistent other than to satisfy your obsessive compulsive disorder.

In the real world, real people deal with inconsistent data all the time. Your system should be at least as robust.

UI Gets Easier

The awful truth is foreign key constraints make UI's awful.

To really illustrate this, let me propose we built a system that did enforce foreign key constraints on, let's say, people's phone numbers. If I tried to add data into the system that So-and-so has a particular phone number, and the system discovers that phone number doesn't exist, what would you expect it to do? Forget the data I gave it? Or flag it as non-existant and ask me to come back later and fix it?

If you want the system to remember the wrong data and flag it, then you don't want foreign key constraints.

Non Arguments

These are arguments that have no bearing whatsoever on the discussion.

Foreign Keys are Important

Note that I'm not arguing against foreign keys. Foreign keys are facts of life. I'm arguing against foreign key constraints.

My DBA Said ... Therefore It's True

This is a logical fallacy called argumentum ad verecundium, argument from authority. The message, the argument, needs to be divorced from the arguer so that it an be considered logically.

If you DBA has a reason, ask him what they are, and then evaluate those reasons.

I Don't Get It

If you don't understand the topic we're discussing, that means you can't contribute nor decide what is best. You need to get educated about the topic.

Too many developers take a lackadaisical approach to database design and engineering. Unfortunately, these tend to be the same developers that take a lackadaisical approach to other aspects of their job, and it shows in the end product they produce.