How a Relational Database can yet be helpful with Event Sourcing

Tags: ASP.NET MVC, Knockout.js, Historical Events, Software Architecture, No SQL, Relational Database, Event Sourcing, Greg Young, RDBMS, CQRS, Read Model, Write Model, Event Store, Referential Integrity, Json, Composite Event Ids, Events, Large Event Store, Snapshots

I recently stumbled upon a blog post of with the provoking title “” where he raises the question why relational databases are being used in so many projects without questioning the reason why and if they are required. Using relational databases pull a whole string of consequences into one’s project the largest of which is being forced to use some kind of ORM to fix the mismatch between the object and the relational model. In his follow-up post he clarified his thoughts and concluded that more analysis should be done before deciding for a RDBMS. The single most important point in his post however was missed out by many of his critics. He never spoke out against the use of RDBMSes but the storing of a relational model within one. Needless to say that, not only (but also) because of its title his posts attracted a lot of reactions, including harsh criticism… Anyhow I like Greg’s straightforward and pragmatic way of expressing his view and agree that often decisions which database will be taken are made by people who have no background in software architecture at all.

Provided the decision has been made to use for one’s project architecture even more thought has to be put into what kind of data store will be used. Being confronted with two different models, one for reading and the other for storing events the chance to use 2 different database systems pops up.

Let’s analyze the differences briefly:

Read Model:

  • No need for any kind of relationship between tables
  • Ideally each table belongs to one view in the UI

Write Model:

  • Stores a stream of events
  • Can grow immensely large depending on the number of events
  • There is no relationship between events at all

 

Interestingly more projects seem use RDBs for their read models than on the write side. This may be owed to the fact that data stored in tables and views in a relational database often resemble their presentation in the UI. However there are no relations between tables on the read side and the amount of data in there should be slight which makes up for an ideal use of a document based db. Strengthened by the fact that queries on the read side shall only return data the UI needs and data are expected to be formatted in a way that it can quickly be shown to the user without any further transformation, databases that store their data in exactly this form would be ideal. Say we have a website built on and , the website and its developers could be made very happy and productive, if the data a query service returns were already formatted in . A lot of tedious work is also being relieved from developers on the read model side if they don’t have to map data-to-objects when going out of the db which additionally causes performance. Imagine events coming in to the read model were being transformed once before entering the read store and then wait there in exactly the format the UI expects them until they are queried. Ideal, isn’t it? So no need for a RDB on the read side.

Let’s go over to the write side. Some systems have an immensely large amount of events being stored each day letting the store (whatever that may be) grow bigger and bigger. There is no problem with storing events aligned one after the other, sorted by time and aggregate id and adding snapshots after a certain period of time in any kind of storage technology as long as there is no requirement to rebuild an an aggregate based on a historical period of time. Customers tend to come up with such ideas once they realize it is technically possible and they found out that there is hidden business value in an event store. Additionally there are systems that depend upon continuous historical analysis of certain events. Strategy engines based on machine learning algorithms being trained to make decisions on historical data may be an example of this. So what if we wanted to rebuild aggregates from events in the past quicker.

One option to make this work involves making use of referential constraints a relational database facilitates. Instead of having one single table that stores the vast amount of events we could create a separate table that partitions events based on the time they have been created and the aggregate that applied them. This table could then have its own primary key which could be used as a foreign key on the table that stores the events. This way the events are still being stored in their own table, but the incoming event stream is segmented by time in a tree-like structure. Besides making a time-based index explicit in a separate table the growth rate of this table is calculable: it only depends on the number of aggregates and time, which is steady. An additional gain is that it allows for using the same index in another table – the snapshots for example. Ideally snapshots correlate with this index, that means – one snapshot per primary id. Now retrieving historical events for an aggregate can be done by looking up the respective id(s) in the CompositeEventIds table, which is a very fast operation, and then join the Events table with these id(s). So by this a relational database can be leveraged to increase performance when rebuilding state of aggregates for a specific period of time in the history of the events. The picture below shows a schematic of what I mean.

RDB Event Store 

Relational databases can and will still play a role even in hardcore CQRS end event sourcing systems, however their utilization shifts from storing a relational model that resembles the object model to a way referential integrity is used for sorting and indexing events.

Technorati Tags: ,,,,,,,,,,,,,,,,,,

Windows Live Tags: ASP.NET MVC,Knockout.js,Historical Events,Software Architecture,No SQL,Relational Database,Event Sourcing,Greg Young,RDBMS,CQRS,Read Model,Write Model,Event Store,Referential Integrity,Json,Composite Event Ids,Events,Large Event Store,Snapshots
Add a Comment