Time-Series Data
Intro
It's not uncommon that you want to collect time-series data. What is time-series data? It's simply data organized by time. It might be the number of clicks you get each day, or each hour, or every 5 minutes, or the number of sales made, or the current CTR or whatever.
OpenTSDB
OpenTSDB promises to be a good way of handling this data. It's built on hadoop, which means it is not as trivial as a set of tables in an RDBMS. That's not as bad as you think, since SQL is really not ideal for this kind of data.
Relational Model
People come up with a variety of schemes. Let me list them.
Direct Table
This is something like this:
item_metrics ------------ id (unique PK) item_id (FK to items) datetime timezone interval (NOT in seconds) metric1 metric2 ...
This is fairly useful, but it suffers from a number of problems.
- You need to add columns when you come up with new metrics.
- If you only want some of the metrics, the database usually has to work to retrieve all the data anyway, giving you no savings on the backend.
- When you go to display the data, it's more common for graphing software to want data transposed, IE, a list of metric name - values rather than a list of metric1, metric2, etc... pairs.
You run into another problems as well. What happens when you DB can no longer handle the load of recording and reporting metrics? You'll have to come up with some kind of sharding system. At first, you'll partition the table by datetime. This is reasonable. Then you'll want to break the table so that all the metrics are not stored together. In the extreme case, you're going to end up with a number of tables with the columns (unique id, item id, date time, value), one for each metric.
This leads to the second way of recording metrics, the meta-table table.
Meta-Table Table
Meta-tables are tables that store the information necessary to make a table. It's a fairly mindbending thought experiment for those who haven't encountered it before.
Basically, a table is just a series of rows. Each row has a series of columns. If you created a table that records the row number and the column name and column value, voila! You're storing a table in a table.
In metrics, it looks something like this.