Following on from a previous post, here is some simple database code in hopefully idiomatic Haskell. My strategy is
- Use the Reader monad to pass the database connection around the functions that use it. The description, computations which read values from a shared environment seems ideal, and could be extended to carry a record containing any configuration data only known at runtime. Using this approach for a database connection tho’ has a subtle and interesting property: even tho’ the variable that represents the database connection is read-only, the connection itself is not! So a function can perform
INSERTon the database, but it can’t (for example) point the connection at a different database and “fool” the next function. Tho’ I can’t offhand think of a case in which I might want to do that.
- Use the Writer monad to return the result set to the calling function. This is described as computations which produce a stream of data in addition to the computed values. This is effectively an append-only list; a function can neither access nor modify results returned from earlier functions (i.e. in SQL terms a
UNION). Again I can’t think of a case in which I’ve necessarily needed to do that, but it’s a nice property to guarantee. If I did want to do either of those things I could of course just use the State monad (computations which maintain state) to carry around a record containing everything, but there is separation of concerns to consider.
Here is the code. We start as ever with the imports. Haskell allows forward declarations (unlike OCaml) so the main function is here too.
module Main where import Control.Monad.Reader import Control.Monad.Writer import Database.HDBC import Database.HDBC.Sqlite3 main:: IO () main = handleSqlError $ do lda <- connectSqlite3 "test.db" rs <- runDbApp lda $ do deleteAllRows getRows -- returns nothing doInsert ["Huw", "Pugh", "Barney McGrew"] getDbTime getRows mapM_ putStrLn rs
The main function is almost spookily clean; all the plumbing of passing the database connection created on line 10 and getting back the result set on line 11 has been completed hidden inside the monads. The
do block executes all the functions within in sequence. Desugaring that would let us write the main body of the application on a single line just as cleanly:
rs <- runDbApp lda $ deleteAllRows >> getRows >> doInsert ["Huw", "Pugh", "Barney McGrew"] >> getDbTime >> getRows
There is definitely something to be said for this style, at the top level of a program anyway, it allows very, very simple composition of “building blocks” to make an application that executes as a pipeline, building up a final result as it goes. Next is the function that sets up the nested Reader and Writer.
-- set up a monadic context for database-using functions runDbApp::(Monad m) => r -> ReaderT r (WriterT w m) a -> m w runDbApp lda f = do (a, rs) <- runWriterT (runReaderT f lda) return rs
This function uses monad transformers, nesting the Reader within the Writer to create a new combined monad with properties of both in which function
f will execute (
f being the entire
do block above, composed) . The database connection
lda is the cargo of the Reader, but it could have easily been a complex record containing, for example, parsed command line arguments. The Writer actually returns a pair, the final result of a computation, and its “log” of the events that occurred. As it is this log that we are interested in,
a is discarded.
getDbTime::MonadIO m => ReaderT Connection (WriterT [String] m) () getDbTime = do lda <- ask -- get the database handle from the Reader rs <- liftIO $ quickQuery lda "select datetime ('now')"  let rs' = map rowToString rs tell rs' -- store the results in the Writer (one row of [Char])
This is where things start to get a bit more complicated. To get the time, a one row result set, from the database, the connection is requested from the Reader monad with
ask, and as the query involves I/O this is “lifted” into the context of the IO monad to run, then the result extracted back into the current context with ←. Finally
tell invokes the Writer monad. It can be seen that this is more complex than say the OCaml equivalent,
lda is not simply a locally bound variable passed in as an argument to the function.
getRows::ReaderT Connection (WriterT [String] IO) () getRows = do rs <- (\lda -> liftIO $ do sth <- prepare lda "select c1 from t1" execute sth  fetchAllRows sth) =<< ask tell (map rowToString rs)
This function returns multiple rows. Here I am trying a slighly different structure in an attempt to get some of the elegance of
main into the actual functions that do the work. This is complicated by the fact that I don’t think HDBC was designed to be used monadically; functions that are typically have their argument that would be a monadic value last, e.g. randomR was obviously meant to be used with the State monad, as its state is the last argument it is very easy to curry it and get the monad to apply it in pointfree style.
doInsert xs = do ask >>= (\lda -> liftIO $ prepare lda "insert into t1 values (?)" >>= (flip executeMany $ map stringToRow xs) >> commit lda)
Here is another attempt.
prepare expects the stateful thing, the connection, as its first parameter of two and returns a statement handle. Similarly executeMany expects that statement handle as its first parameter, and
commit wants the output of the step prior to the step immediately before it. Some monkeying around with a λ function and
flip gets everything into the right order for
>>= (take the result of one function and pipe it to the next, like the Unix shell’s
|) and preserves
lda by name to be used twice.
dbfunc f b c = (\a -> liftIO $ f a b c) deleteAllRows = do ask >>= dbfunc run "delete from t1" 
This is as elegant as I can get it, abstracting out the reordering and lifting to a reusable helper function. But – I happen to know because I wrote this program that I have a
COMMIT in the near future. In real code, if this was meant to be transactional in and of itself, I would have to keep hold of my connection somehow so I can use it for
commit lda, either with ask, or by making the commit part of
dbfunc, which would limit me to single-statement transactions (hardly better than setting
AUTOCOMMIT ON). Finally a couple of helper functions to handle casts between datatypes:
-- cast a single-column result set from [SqlValue] to String rowToString [x] = (fromSql x)::String -- cast a single value into a [SqlValue] (a row) stringToRow x = [toSql x] -- end of file
gaius@debian:~/Projects/MonadDb$ ls MonadDb.hs gaius@debian:~/Projects/MonadDb$ echo "create table t1 (c1 varchar2(32));"|sqlite3 test.db gaius@debian:~/Projects/MonadDb$ ghc --make MonadDb.hs -hide-package monads-fd [1 of 1] Compiling Main ( MonadDb.hs, MonadDb.o ) Linking MonadDb ... gaius@debian:~/Projects/MonadDb$ ./MonadDb 2010-09-19 10:30:09 Huw Pugh Barney McGrew gaius@debian:~/Projects/MonadDb$
Conclusion: this would have been a lot more elegant had the HDBC API been designed with monads in mind up-front, but I suppose it was a tradeoff between that and making it radically dissimilar in convention to every other database API (e.g. the convention of
sth being the first parameter is the same in OCaml’s Occi module, and in Oratcl and that has made it easier for me to learn).
The monadic style could be said to simply shift complexity away from the main function and into the worker functions. That’s not necessarily a bad thing; it depends who your audience is. If
main will usually be written by people who would otherwise write a “one-off” shell script and the function library is maintained as a project in its own right, then that’s perfectly OK, an excellent use of this technology, even. If the same person/people will be writing an entire application this way then the benefits are less clear – perhaps they will become clearer at vast scale? A “large” program for me, for the kind of work I do, would be a thousand new lines of Python + various libraries (some of which I have written, and sometimes discover unexpected behaviors when “reusing”). The more can be put into thoroughly engineered libraries the more leverage Haskell gives in this scenario.
Thanks to Daniel Fischer from the Haskell list for help with this