NoSQL

From Jonathan Gardner's Tech Wiki

Introduction

See It Must be Crap on Relational Databases Week for an intro. Basically, it appears that a bunch of MySQL users got sick of MySQL and decided that all relational databases are crap and have got to go.

Rebuttal

Let me rebut this.

First, here are some facts:

  1. Relational databases are very old, and they are still going strong. Case in point: filesystems are trending towards a relational database-type interface.
  2. NoSQL isn't new. Back in 2000, it was object databases. Before that, it was hierarchical databases. Today, it appears to be high-performance databases.

Now, the explanation why:

Relational databases are an abstract concept with an excellent implementation. Kind of like basic arithmetic on a microchip, it just works and works well. Whereas basic arithmetic is based on simple mathematical concepts such as adding and subtracting, relational databases are based on simple mathematical concepts such as transform and filter. And the concepts behind this are probably even simpler than the concepts behind basic arithmetic. Let's review them.

  1. Tables have columns and rows.
  2. Columns describe what is in each row.
  3. Each row contains a value for each column.
  4. You can insert, update, select, or delete data from the tables.
  5. You can combine tables in new and interesting ways. This is the "relational" bit.
  6. Any type of data you can describe can be stored in a relational database.
  7. With one exception (recursion), the storage system is remarkably simple and complete. That is, there isn't any type of data you would ever want to store that cannot easily be stored in a relational database.
  8. But even with recursion, the only limitation is the expressiveness of SQL, and there are easy ways to work around that.
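The list above can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module and an in-memory database; the authors and posts tables and their columns are made up for the example and do not come from any real schema.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")

# Tables have columns, and the columns describe what is in each row.
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE posts ("
    " id INTEGER PRIMARY KEY,"
    " author_id INTEGER NOT NULL REFERENCES authors(id),"
    " title TEXT NOT NULL)"
)

# Insert: each row supplies a value for each column.
conn.executemany(
    "INSERT INTO authors (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)
conn.executemany(
    "INSERT INTO posts (id, author_id, title) VALUES (?, ?, ?)",
    [(1, 1, "Why tables"), (2, 1, "Joins"), (3, 2, "Draft")],
)

# Update and delete.
conn.execute("UPDATE posts SET title = ? WHERE id = ?", ("Why joins", 2))
conn.execute("DELETE FROM posts WHERE id = ?", (3,))

# Select, combining two tables with a join (the "relational" bit).
rows = conn.execute(
    "SELECT authors.name, posts.title"
    " FROM authors JOIN posts ON posts.author_id = authors.id"
    " ORDER BY posts.id"
).fetchall()
print(rows)
```

Nothing here depends on SQLite in particular; the same statements should run with little or no change on most relational databases.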

From these basic concepts, you get every possible function you can ever imagine, all in a simple expressive interface.
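As for the recursion caveat in the list above, one possible workaround (my assumption; there are others, such as walking the tree in application code) is a recursive common table expression, which SQLite and several other databases support. The categories table below is hypothetical, and this is a sketch rather than the one true way to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical tree stored as plain rows: each row points at its parent.
conn.execute(
    "CREATE TABLE categories ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES categories(id),"
    " name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO categories (id, parent_id, name) VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "databases"),
        (3, 2, "relational"),
        (4, 1, "languages"),
    ],
)

# Walk the tree from a starting row down through all of its descendants.
subtree = conn.execute(
    """
    WITH RECURSIVE subtree(id, name, depth) AS (
        SELECT id, name, 0 FROM categories WHERE id = ?
        UNION ALL
        SELECT c.id, c.name, s.depth + 1
        FROM categories AS c
        JOIN subtree AS s ON c.parent_id = s.id
    )
    SELECT name, depth FROM subtree ORDER BY depth, id
    """,
    (1,),
).fetchall()
print(subtree)
```

The data itself is still just rows in a table; only the query needed something beyond a plain join.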

With relational databases, data becomes manageable. I mean, if you stored all of your data in a table, it would be so much more accessible than any other storage system known to man.

Agreement

Now, relational databases aren't perfect. All existing implementations are sorely lacking in one or more areas. If you don't know where your favorite relational database stops working well, you don't know it as well as you think. But this is an issue with implementation, not interface.

Unfortunately, the imperfect implementation translates to defects in the interface. Some databases hide this better than others, but all of them have you thinking about stuff besides the tables, columns, and rows of the database. The worst (MySQL, MS Access) have you spending most of your time worrying about the implementation, while the best (PostgreSQL) have you thinking more about relational algebra.

In the end, for your particular data needs, you will find particular edge cases where relational databases do not work for you. By all means, use some other database to handle this particular data, whether it is the filesystem, random access memory, or some other application that doesn't pretend to be a relational database. I, myself, have re-implemented relational database tables in some other system because the implementation was not sufficient for our needs.

I, too, am waiting for something better than the relational database to roll around. Unfortunately, I doubt I will see it in my lifetime, any more than I would have expected to see quantum physics if I had lived in Rutherford's day.

Conclusion

For most development tasks, start with a relational database to store your data. Yes, really. Avoid the filesystem, avoid memory, etc. Just store it in the database. Then, as you identify areas where the relational database is insufficient, use a specialized system for those areas.

In my experience, development means I don't know what the end product is going to look like until I have finished it. That means I want to maximize flexibility. That means a relational database.