Last week I attended MongoDB Day London. Now MongoDB itself is a technology that I’m fairly interested in; I can see where it would have its uses. But the problem is the people! They all talk like this:
- Some problem with relational databases that just doesn’t really exist (or hasn’t existed for a very long time)
An example would be the first speaker, who didn’t like normalized data because it had “bad locality”. Setting aside for a second the difference between a logical and a physical data model, and the existence of the normal forms: if you ever did find that the bottleneck on your joins specifically was seek time, you could pre-compute the join and refresh it whenever anything changed – in Oracle using a materialized view (1996!), a continuous query, the result cache… And that’s on top of the block buffer cache and the query optimizer already being very smart. Another of the same speaker’s examples overlooked the existence of nested tables, claiming that what you can do with them is impossible in “SQL databases”. It’s claimed that MongoDB is more flexible because it doesn’t constrain you to tables. Well, that’s backwards… We don’t work the way we do because tables are a limitation of the technology; we use the relational model because it has sound mathematical underpinnings, and the technology reflects that†. Where’s the rigour in MongoDB’s model?
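As a sketch of what pre-computing a join looks like in Oracle (the `orders`/`customers` schema here is hypothetical, invented for illustration), a fast-refresh-on-commit materialized view keeps the joined result stored together and up to date whenever either base table changes:

```sql
-- Hypothetical schema: the hot query joins orders to customers.
-- MV logs with ROWID are required for fast refresh of a join MV.
CREATE MATERIALIZED VIEW LOG ON customers WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON orders    WITH ROWID;

-- Store the pre-joined rows; refreshed incrementally on each commit.
CREATE MATERIALIZED VIEW orders_with_customer
  REFRESH FAST ON COMMIT
AS
  SELECT o.ROWID AS o_rid, c.ROWID AS c_rid,   -- rowids needed for fast refresh
         o.order_id, o.order_date,
         c.customer_id, c.name
  FROM   orders o
  JOIN   customers c ON c.customer_id = o.customer_id;
```

Queries against `orders_with_customer` then read one pre-joined segment – exactly the “locality” the speaker wanted – without denormalizing the logical model.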
Another speaker claimed that it was far better for each application to have its own database and expose all its data through web services. Sounds good, except you now need another technology – a directory to find all these things, since they aren’t just table names or stored-procedure names in the one place – and you have to manage access control and auditing across them all. And if you need to touch data across several of them, then you’ll need something to coordinate that… We could call it a transaction processing facility, since that’s what IBM called it in 1960. He handwaved over both of those. There were many similar examples.
Another recurring theme was of an organization refreshing its hardware and modifying its architecture, one component of which was introducing MongoDB, yet with all the performance gains attributed to MongoDB. For example, splitting OLTP and OLAP from one database into two, and introducing a delay of a few minutes between data coming in and being available for reporting. Well, that will give you a massive performance boost in any database! If you can tolerate the delay, of course. But if you could, why not build it that way in the first place (or, having built it, why complain that it’s slower than you’d like)? And if you can’t tolerate the delay, then you can’t do this at all. In the roadmap they are promising point-in-time recovery in a future release. Oracle had that in 1988, when I had just left primary school.
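The same OLTP/OLAP split, with the same few-minutes staleness, can be had inside one Oracle database with a periodically refreshed materialized view – a sketch only, with a made-up `orders` table and five-minute interval standing in for whatever delay the organization decided it could tolerate:

```sql
-- Hypothetical reporting summary, rebuilt every 5 minutes.
-- Reporting queries hit this MV and never touch the OLTP tables,
-- accepting the same staleness the two-database split introduced.
CREATE MATERIALIZED VIEW sales_report_mv
  REFRESH COMPLETE
  START WITH SYSDATE
  NEXT SYSDATE + 5/1440          -- 5 minutes, as a fraction of a day
AS
  SELECT product_id,
         TRUNC(order_date) AS order_day,
         SUM(amount)       AS total_amount
  FROM   orders
  GROUP  BY product_id, TRUNC(order_date);
```

The point being that the speed-up came from decoupling reporting from OLTP, not from the database engine underneath.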
So anyway, since it’s free‡, there’s no reason not to evaluate MongoDB and see if it suits your use cases. But just remember that these kids think they’re solving problems that IBM (et al.) solved, in some cases quite literally before they were born, and the features are probably already there in your existing database/technology stack (I have used Oracle for my rebuttals just because I am most familiar with it, but I expect the same is true of SQL Server and DB2 as well). Talk to your friendly local DBA…
† I personally predict that in a few years there will be a lot of work re-normalizing the data in MongoDB and its rivals so it can actually be useful. That’s reason enough to become an expert in it. Around 2001, the company I had just joined had completed a massive engineering effort to get off Versant and (back) onto Oracle… All this object-database stuff gives me massive déjà vu of the 1990s, when they were all the rage.
‡ In the same way that Oracle is also free for evaluation purposes. No one would deny that Oracle is expensive in production! But there is no such thing as cheap or expensive in business; there is only worth the money or not.