MongoDB Days

Last week I attended MongoDB Day London. Now MongoDB itself is a technology that I’m fairly interested in; I can see where it would have its uses. But the problem is the people! They all talk like this:

  1. Some problem that just doesn’t really exist (or hasn’t existed for a very long time) with relational databases
  2. MongoDB
  3. Profit!

An example would be the first speaker, who didn’t like normalized data because it had “bad locality”. Now ignoring for a second the difference between a logical and a physical data model, and the existence of the normal forms, if you ever did find that the bottleneck on your joins specifically was seek time, you could pre-compute the join and refresh it whenever anything changed – in Oracle using a materialized view (1996!), a continuous query, the result cache… And that’s on top of the block buffer cache and the query optimizer already being very smart. Another of the same speaker’s examples overlooked the existence of nested tables, saying what you could do with them is impossible in “SQL databases”. It’s claimed that MongoDB is more flexible because it doesn’t constrain you to tables. Well that’s backwards… We don’t work the way we do because tables are a limitation of the technology; we use the relational model because it has sound mathematical underpinnings, and the technology reflects that†. Where’s the rigour in MongoDB’s model?

Another speaker claimed that it was far better for each application to have its own database, and expose all its data through web services. Sounds good, except you now need another technology, a directory to find all these things, since they aren’t just table names or stored procedure names in the one place, and manage access control and auditing. And if you need to touch data across several of them then you’ll need something to coordinate that… We could call it a transaction processing facility, since that’s what IBM called it in 1960. He handwaved over both of those. There were many similar examples.

Another recurring theme was of an organization refreshing its hardware and modifying its architecture, one component of which was introducing MongoDB, yet all the performance gains were attributed to it. For example, splitting OLTP and OLAP from one database into two, and introducing a delay of a few minutes between data coming in and being available for reporting. Well that will give you a performance boost in any database! If you can tolerate the delay, of course. But if you could, why build it that way in the first place (or having built it, complain that it’s slower than you’d like), and if you can’t, then you can’t do this. In the roadmap they are promising point-in-time recovery in a future release. Oracle had that in 1988, when I had just left primary school.

So anyway, since it’s free‡, there’s no reason not to evaluate MongoDB, and see if it suits your use cases. But just remember that these kids think they’re solving problems that IBM (et al) solved quite literally before they were born in some cases, and the features are probably already there in your existing database/technology stack (I have used Oracle for my rebuttals just because I am most familiar with it, but I expect the same is true for SQL Server and DB2 as well). Talk to your friendly local DBA…

† I personally predict that in a few years there will be a lot of work re-normalizing the data in MongoDB and its rivals so it can actually be useful. That’s reason enough to become an expert in it. In about 2001, the company I joined then had just completed a massive engineering effort to get off Versant and (back) into Oracle… All this object-database stuff gives me massive deja vu for the 1990s when they were all the rage.

‡ In the same way that Oracle is also free for evaluation purposes. No-one would deny that Oracle is expensive in production! But there is no such thing as cheap or expensive in business, there is only worth the money or not.

Posted in MongoDB, Oracle, Random thoughts

The Grand Challenges

What would you say are the greatest challenges facing modern computing? Protein folding to discover new pharmaceuticals? Sifting the vast quantities of sensor readings from the Large Hadron Collider? Rendering movies so lifelike that human actors are obsolete?

Well if you are Apple, I’d say your greatest challenges were scrolling a document, responding to a mouse click and keeping up with a user typing. Hell, you can’t even manage it for this blog post…

Posted in Random thoughts

Those Who Forget History Are Doomed To Repeat It…

… first as tragedy, then as farce.

There is a strange attitude among many in this industry towards what are contemptuously referred to as legacy systems. No-one would ever articulate this of course, because when you say it out loud it sounds ridiculous, but the implicit belief is that in the 70s, 80s and 90s they had smartphones and AJAX† and Ruby on Rails and Chrome (and a long shopping list of “modern” technologies), but because they were stupid they chose to use dial-up modems, and dumb terminals, and program in FORTRAN. And because they were stupid, we have nothing to learn from them.

On a similar note, it is a common refrain to hear that newspapers are obsolete. And as a business model, that may well be true – but since the 1980s, newspaper publishing has defined mission-critical computing. Come hell or high water, the paper has to be on the newsstands in the morning. The desktops, the networks, the servers, the presses, the logistics, and all the software and IT have to work. Nowadays, the most anyone can hope for is that most software mostly works, most of the time. If a browser or an operating system crashes or freezes, you take it in your stride and restart it. If a website is down, you might try again later, or you might just not bother. Even major bits of infrastructure are unreliable. The skills required to do serious computing are simply decaying. Individuals such as myself retain and practice the old ways, but I don’t think that when the last newspaper switches off its presses, that talent will go on to make its website five-9s reliable… Even the best civil engineer can’t build a castle on a fetid swamp. We’ll have to nuke it from orbit – only way to be sure – and start again.

† I recently used AJAX to build the interface for one of my projects. Even with bolt-ons like Comet, it’s pretty crude and feeble compared to Tcl/Tk… From the last century. I’m sure by the end of this decade it will be as-good‡, only to be swept away by some shiny new thing, and we’ll be back to square one in terms of getting useful work done. Since the 80s, every decade in computing has been a shallow copy of the previous decade. Eventually it will go full circle, like pocketwatches being replaced by wristwatches, to be replaced by clocks on mobile phones in your pocket…

‡ And yet, actually no more powerful or easy to use than Curses, for either developers or end users.

Posted in Random thoughts

Quick histograms

Having come back to actively working on OCI*ML recently, it’s time I cracked on with some more features (I have been promising LOBs for a long time, sorry to anyone who’s still waiting). Just to get warmed up, inspired by spark, I have added a quick histogram function, similar to quick query, for interactive use. It takes a query that returns two columns, a label and a number, for example a simple view:

SQL> create view v1 as
select object_type, count(1) as howmany from user_objects group by object_type;


The histogram automatically scales to the width of the current window.
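
The scaling is the only mildly interesting part. As a rough sketch of the idea (in Python rather than OCaml, and not the actual OCI*ML code; the function name `quick_histogram`, its parameters and the sample rows are all made up for illustration), each bar is sized relative to the largest count and to whatever space is left once the labels and counts have been allowed for:

```python
import shutil


def quick_histogram(rows, width=None, bar_char="#"):
    """Render (label, count) rows as text bars that fit in `width` columns.

    Hypothetical helper for illustration only; assumes non-negative counts.
    """
    if width is None:
        # Ask the terminal how wide the current window is.
        width = shutil.get_terminal_size().columns
    rows = list(rows)
    if not rows:
        return []
    label_w = max(len(str(label)) for label, _ in rows)
    count_w = max(len(str(count)) for _, count in rows)
    # Space left for the bar after "label | " and " count".
    bar_w = max(1, width - label_w - count_w - 4)
    biggest = max(count for _, count in rows)
    lines = []
    for label, count in rows:
        # Scale each bar relative to the largest value.
        n = round(count / biggest * bar_w) if biggest > 0 else 0
        lines.append(f"{str(label):<{label_w}} | {bar_char * n} {count}")
    return lines


if __name__ == "__main__":
    # Sample data standing in for the rows the view above might return.
    sample = [("TABLE", 40), ("INDEX", 20), ("VIEW", 1)]
    print("\n".join(quick_histogram(sample)))
```

Passing `width` explicitly is handy when the output is not going to a terminal, since the detected size then falls back to a default.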

Also, I have been reading Jordan Mechner’s book The Making Of Prince Of Persia†. It’s both fascinating and inspiring. Just before that, I read The Future Was Here, the story of the Commodore Amiga‡. The book is made even more poignant by my Mac inexplicably showing the beach ball as I scroll through a simple web page, or the mighty RHEL servers at work being unable to keep up with my typing. The future is still back in the 80s.

† The original code is also on GitHub.

‡ I have an A500+ on my desk right now, the best of them IMHO. I might write a post comparing it with the Atari STE, and the BBC with the C64, in the cold light of day as an experienced adult. I have a fine collection of classic machines now, often acquired broken with the intention of repairing them myself. Another time-sink from OCaml work…

Posted in Ocaml, OCIML, Operation Foothold, Random thoughts

OCaml 4 beta

OCaml 4 beta 2 has been released, and so I quickly tested OCI*ML with it. Only a couple of minor tweaks were necessary, due to the following changes:

  • Some .cmi for toplevel internals that used to be installed in `ocamlc -where` are now to be found in `ocamlc -where`/compiler-libs. Add “-I +compiler-libs” where needed.
  • Warning 28 is now enabled by default.

The impact of these was that the toplevel prompt wasn’t working in the shell, and there was one non-fatal warning when compiling, so nothing that would have broken any code, but it’s good to be up-to-date. The necessary changes have been checked in on GitHub.

Speaking of which, bearing in mind the LinkedIn débâcle, I have a password generator too…

Posted in Haskell, Ocaml, OCIML

Raspberry Pi: Who is it really for?

There is a lot of discussion on the Internet today about the Raspberry Pi. This is a project over which I am deeply ambivalent. On the one hand, exposure to computing at an early age was enormously influential on my life. I was fortunate enough to have a BBC Micro, a machine that was, and indeed still is, extremely capable†, and easy to program. Not just in the sense that BBC Basic was an excellent language; the entire machine was easy to understand – you could maintain a pretty good mental model of “where” everything was in its 32k RAM, and what happened when and why. Everyone wrote their own simple games (today’s Angry Birds is just a modern twist on the artillery game, perfectly do-able by a keen 10-year-old in those days). Whether on the BBC or the C64 or whatever, dabbling in BASIC programming was not unusual, even for kids who mainly played games, and using a home computer mainly for programming wasn’t unusual either. Magazines had annotated listings to type in, and hardware projects interfacing with or even modifying machines. Schoolchildren in the 80s made the UK the software powerhouse it is today.

A lot of that was lost in the intervening years. It was certainly possible to program the “next generation”, the 16-bit home computers like the Atari ST and Commodore Amiga, of course. It just wasn’t what they were for. The creators of the BBC went to great lengths to include a fantastic BASIC dialect, whereas ST Basic was notoriously bug-ridden, and Devpac was a third-party product that you had to pay for. I don’t know so much about the Amiga world, but on the ST, the barriers to entry for programmers were certainly higher. Not insurmountably so, for one such as me who had grown up on the BBC and took programmability for granted, but I wonder how accessible I would have found it if it were my first machine. Certainly there was more to learn in order to, as on the BBC, produce a “professional” looking program, one that would operate with GEM for example. But, whether it was truly an unexpected emergent property of the more advanced machines, or the general zeitgeist, programming fell out of the mainstream. Games were the normal use, and consoles such as the Sega Mega Drive replaced 8-bit micros in some households.

Since then (fast-forwarding over most of the ’90s and all of the ’00s) programming has simultaneously gotten easier and less accessible. How is this possible? Abstraction. Moving further from the machine, or placing more and more layers between the programmer or user and the machine (once those terms were nearly synonymous). It is very easy for a user to write a macro in VBA that gets a lot of useful work done, and I would never advocate taking this type of programming away for that reason. But it encourages thinking of the machine as a “black box”: it is difficult to reason about what it is actually doing, and that in turn discourages the very powerful mindset, needed to be an actual programmer, that it’s all just code, all the way down. This is not necessarily intended as a value judgement; for many people computers are just tools and that’s fine, for them the macro-style approach is highly productive.

But someone has to make the tools, and the question is, is the Raspberry Pi going to nurture a new generation of tool-makers? Abstraction is useful because it allows one to do work without repetitive detail and focus on the problem domain. But I argue that abstraction should only be introduced once the fundamentals are understood. Learning on a machine like the BBC teaches that: not so much the details, which become obsolete, but the concepts (e.g. I do my real work in Python on Linux on x64, not BBC Basic on a Model B, but I use dis and GDB regularly). The Raspberry Pi has video, ethernet, USB, 256M RAM. It comes with a GUI and a web browser. It has more in common with a PlayStation than it has with a BBC Micro (and I make the same criticism of the OLPC). I know a lot of people of my age are excited about getting one and using it as a cheap embedded controller, like an Arduino. But I don’t think it’s a useful teaching tool, or at least, no more useful than a common PC. For that, you’d want something like a FIGnition. I honestly don’t know why that project has been relegated to the sidelines while the Pi gets all the press.
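
To make the point about dis concrete, here is a minimal sketch of peeling back one layer of abstraction in Python using the standard-library `dis` module. The function `add` and the helper `opcode_names` are invented for this example, and the exact opcodes printed differ between Python versions, so treat the listing as illustrative rather than definitive:

```python
import dis


def add(a, b):
    # One line of "high-level" code...
    return a + b


def opcode_names(func):
    """Return the names of the bytecode instructions `func` compiles to."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # ...and the stack-machine instructions the interpreter actually runs.
    dis.dis(add)
    print(opcode_names(add))
```

Even for a one-line function there are loads, a binary operation and a return underneath, which is the same sort of mental model the BBC gave you for free.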

† Thought experiment: If you had a BBC clocked at 2 GHz, what “real work” could you not do on it? What would you need to add? What about a BBC with a 65816 instead of a 6502, giving it 16M RAM, and ADFS access to modern storage – but fundamentally the same OS, language(s), the same model of computation, switch it on, BASIC > prompt and off you go? I am struggling to think of anything…

Posted in BBC, Linux, Random thoughts

Illiquid Markets

Some crazy prices on eBay:

  • RAMAMP RA20+44 for £500. This (I believe) is a board that adds 32K to a standard Model B, which can be used as sideways or shadow RAM. A shadow RAM only Aries B20 goes for £20, same again for a B12 which can take sideways RAM. To put that into perspective, the BBC B when new cost £399.
  • Electron +3 for £200, a disk drive for the Acorn Electron, for 5× what an Electron itself goes for.

The retro stuff I bought I use (well, play with!), but at these prices I can only assume people are keeping them in glass display cabinets… I could understand if someone desperately needed some data off old disks for sentimental reasons, but the price for the RAMAMP, given that it lives inside the case and there are functionally identical substitutes (e.g. Aries B32, or B20+B12) for a fraction of the price, is completely inexplicable.


Posted in BBC, Random thoughts