How To Partition Your Data

From Jonathan Gardner's Tech Wiki

Introduction

Why partition your data? Eventually you are going to reach a point where you can't make your database go any faster. Disks have a maximum speed; networks only go so fast. Eventually, you will need 2, 3, or more databases to share the load.

You can play with multi-master replication or master-slave replication, but this is a temporary salve over a larger underlying problem. There is simply too much happening for one or a handful of machines to handle. And as you add more and more nodes to your replication scheme, your entire system is going to become more and more brittle and difficult to manage.

This is why you partition. This is why governments set up district offices. This is why companies have satellite offices. This is why CEOs hire vice presidents. There is simply too much work for one thing to do, and it needs to be spread out.

Assuming your data is organized into tables, there are several options for partitioning.

Vertical Partitioning

What I call vertical partitioning is the business of moving entire tables off the original machine and onto a separate machine. That is, these tables are no longer kept, in any form, on the original machine; they are moved entirely onto the new machine.

Now, this is going to have some side-effects: foreign keys don't work anymore. That is, the database can no longer verify that a referenced row exists, and it certainly can't tell you when a referenced row disappears.

You'll have to adapt your data model. You'll have to account for the situation where there is a foreign key pointing to nowhere. What will this mean? Figure it out.

Ideally, you don't want to partition data that is tightly coupled. But even if you have to, it isn't the end of the world. What you need to do is set up a service layer that coordinates the actions between the two databases. These service layers, properly implemented, are scalable: each machine you add can do just as much work as the previous machine did.

Not only are foreign keys broken, but so are transactions. You have to get rid of this concept of a transaction. Now you are treading in a world where nothing is perfect. Deal with it. Figure out what the correct response is. Figure out the order things have to happen in, and figure out how to continue when things get interrupted.
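One way to cope without cross-database transactions is to make each step idempotent and run the steps in a fixed order, so replaying the whole sequence after an interruption is safe. A minimal sketch of that idea, using plain dicts as stand-ins for the two databases (all names here are illustrative):

```python
def record_order(order_id, order, orders_db: dict, billing_db: dict):
    """Write the order first, then the invoice, in a fixed order.

    Each step checks whether it already ran, so re-running after a
    crash redoes only the missing step instead of duplicating work.
    """
    if order_id not in orders_db:        # idempotent: skip if done
        orders_db[order_id] = order
    if order_id not in billing_db:       # idempotent: skip if done
        billing_db[order_id] = {"amount": order["amount"], "paid": False}
```

Because a crash between the two writes leaves the first one in place, the retry picks up exactly where the interrupted run stopped.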

Horizontal Partitioning

After you've done as much vertical partitioning as you can, it's time to do horizontal partitioning. This is when you take a table, split it into n chunks, and store each chunk on a different server.

The point is that each row will live on exactly one of the <math>n</math> servers. Don't just take the primary key mod <math>n</math>; take the mod of a hash of the primary key, which gives you a random but predictable server. Hashing spreads the rows evenly across the servers; patterned keys taken mod <math>n</math> directly can pile up on a few servers and leave the rest idle.
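A minimal sketch of hash-then-mod placement, assuming a hypothetical cluster of four servers:

```python
import hashlib

NUM_SERVERS = 4  # hypothetical cluster size

def server_for(primary_key) -> int:
    """Map a primary key to a server by hashing, then taking the mod.

    Hashing first scrambles any pattern in the keys, so the mapping
    is effectively random, yet still deterministic for a given key.
    """
    digest = hashlib.sha256(str(primary_key).encode()).hexdigest()
    return int(digest, 16) % NUM_SERVERS

# Sequential keys land on effectively random (but stable) servers:
assignments = [server_for(k) for k in range(1000)]
```

Any stable hash works here; SHA-256 is just a convenient choice from the standard library.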

You are also going to break transactions even more. That's okay, though, because you've already learned to deal with it when you partitioned vertically.

Extending Horizontal Partitions

Eventually, you may need to extend your horizontal partitions. When you do this, always multiply the number of partitions by an integer <math>k</math>. With hash-mod placement, a row's new partition number then equals its old one for roughly <math>1/k</math> of the rows, so a predictable fraction of the rows already on a node stay on that node.
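You can check that fraction empirically: with hash-mod placement, multiplying the partition count by <math>k</math> leaves about <math>1/k</math> of the rows in place. This sketch doubles a hypothetical four-partition layout and counts the keys that keep their partition number:

```python
import hashlib

def partition(key, num_partitions: int) -> int:
    """Hash-then-mod placement, as used for the original layout."""
    h = int(hashlib.sha256(str(key).encode()).hexdigest(), 16)
    return h % num_partitions

old_n, k = 4, 2          # illustrative: double a 4-partition layout
new_n = old_n * k

keys = range(10_000)
stayed = sum(1 for key in keys
             if partition(key, new_n) == partition(key, old_n))
# Roughly 1/k of the keys (about half, here) keep their partition.
```

Growing by a non-integer factor instead would reshuffle nearly every row.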

As you are partitioning, that is, moving rows to their new home, your service layer will have to check both the old and the new home for each row. The old home has to know when the new home takes responsibility for a row, and both have to agree.
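A sketch of that dual-home lookup, with dicts standing in for the two shards (function and variable names are illustrative):

```python
def fetch_row(key, old_shard: dict, new_shard: dict):
    """Return the row for `key`, preferring the new home.

    A row that has already been migrated lives in new_shard; until
    then, the old shard remains authoritative.
    """
    if key in new_shard:
        return new_shard[key]
    return old_shard.get(key)

def migrate_row(key, old_shard: dict, new_shard: dict):
    """Move one row: copy first, then delete, so an interruption
    between the two steps leaves the row readable in at least one
    place."""
    if key in old_shard:
        new_shard[key] = old_shard[key]
        del old_shard[key]
```

The copy-then-delete ordering is what lets both homes agree: a row is never absent from both shards at once.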

Fault Tolerance

If each server independently has a probability <math>f</math> of failing on a given day, and you have <math>n</math> servers, the probability that at least one server fails that day is <math>1 - (1 - f)^n</math>, which is approximately <math>n \times f</math> when <math>f</math> is small. As the number of servers grows, this will become intolerable. How do you keep the chance of system failure down?
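A quick check of the failure math, with illustrative numbers:

```python
# f and n are made-up values for the example.
f = 0.001   # probability that one server fails on a given day
n = 100     # number of servers

# Exact probability that at least one of the n servers fails today,
# next to the small-f approximation n * f:
exact = 1 - (1 - f) ** n    # about 0.0952
approx = n * f              # 0.1
```

For small <math>f</math> the two agree closely, but both grow with <math>n</math>: a hundred reliable servers together fail far more often than one.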

There are a few ways.

First, you can build back-up nodes. These will keep track of what the primary node is doing and be ready to take over when the primary node fails. When the primary node comes back up, it will act as a back-up node. With <math>b</math> back-up nodes per primary, a partition is lost only when the primary and all <math>b</math> back-ups fail at once, so the chance of system failure drops to roughly <math>n \times f^{b+1}</math>.
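The same arithmetic shows what back-ups buy you: a group is lost only when the primary and all <math>b</math> back-ups fail together, with probability <math>f^{b+1}</math> per group. With illustrative numbers:

```python
f = 0.001   # daily failure probability of one node (illustrative)
n = 100     # number of primary nodes
b = 1       # back-up nodes per primary

without_backups = n * f             # about 0.1 per day
with_backups = n * f ** (b + 1)     # about 0.0001 per day
```

A single back-up per primary cuts the daily failure chance by a factor of <math>1/f</math>, at the cost of doubling the hardware.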

There is another way to have back-up nodes: each node can act as a back-up for its neighbors. Although this reduces performance, it increases reliability roughly as much as having that many dedicated back-up nodes would.