Extprocs in OCaml

An external procedure or EXTPROC is a technique for calling C or other native code from SQL in Oracle. Why might you want to do this?

  1. To interface to a 3rd party system for which there is only a C API, or in preference to using a bloated XML or similar gateway
  2. To take advantage of well-tested libraries rather than re-implementing their algorithms in PL/SQL
  3. To do intensive computation in optimized native code

Point 3 is slightly less relevant these days (point 2 notwithstanding) in that we have true native compilation of PL/SQL in Oracle 11g†, but a colleague of mine recently expressed an interest in calling CUDA code from SQL, a feature that is already present in Postgres. My own interest is at a slight tangent to this: OPERATION FOOTHOLD is about putting the necessary infrastructure in place to remove barriers to wider adoption of OCaml in the enterprise. In this post I will present effectively the opposite of OCI*ML: rather than calling Oracle from OCaml, I will call OCaml from within Oracle (via C).

Firstly the OCaml code. As we are just demonstrating the technique, we will just do something simple: add two integer numbers together and return the result. The OCaml code implements both this function and the necessary plumbing to talk to C.

(* The OCaml function to make available to SQL *)
let caml_add_two x y = x + y

(* Register this function with C *)
let () =
  Callback.register "caml_add_two" caml_add_two

Next the C stub file:

#include <caml/mlvalues.h>
#include <caml/callback.h>
#include "extproc_add.h"

value* caml_add_two = NULL;

int add_two(int x, int y) {
  /* if the shared lib is being loaded for the first time, call the OCaml initialization */
  if (caml_add_two == NULL) {
    caml_startup(NULL); /* no argv */
    caml_add_two = caml_named_value("caml_add_two");
  }

  return Int_val(caml_callback2(*caml_add_two, Val_int(x), Val_int(y)));
}

We can assume that the shared object will be loaded on first invocation, and be unloaded when that session exits. The functions Int_val() and Val_int() marshall the arguments across the OCaml/C interface. Note that there is no OCI code necessary, as we are not using a context. The objective is to do processing outside of the database, not simply to invoke OCI code from within the database (altho’ this is actually more complicated to do! And would probably be necessary for doing bulk data processing).

Compile it with the Makefile:

# Makefile for OCaml Extproc example

all:	test.c libextproc_add.so
	gcc test.c -o test -L. -lextproc_add -L`ocamlopt -where` -ldl -lcamlrun -lcurses -lm

libextproc_add.so:	c_extproc_add.o caml_extproc_add.o
	gcc -shared -Wl,-E -o libextproc_add.so c_extproc_add.o caml_extproc_add.o -L`ocamlopt -where` -ldl -lcamlrun -lcurses -lm

c_extproc_add.o:	extproc_add.c
	gcc -g3 -o c_extproc_add.o -I`ocamlopt -where` -c extproc_add.c

caml_extproc_add.o:	extproc_add.ml
	ocamlc -output-obj -o caml_extproc_add.o extproc_add.ml
clean:
	rm *.o *.cm* test *.so


test.c is a trivial stub program to test that the shared object works correctly, (see GitHub for the complete project). Next is the configuration of TNSNAMES.ORA (I am using Oracle XE on Debian):

XE =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = debian)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = XE)
    )
  )

EXTPROC_CONNECTION_DATA =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC_FOR_XE))
    )
    (CONNECT_DATA =
      (SID = PLSExtProc)
      (PRESENTATION = RO)
    )
  )

And LISTENER.ORA:

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = PLSExtProc)
      (ORACLE_HOME = /usr/lib/oracle/xe/app/oracle/product/10.2.0/server)
      (PROGRAM = extproc)
      (ENVS="LD_LIBRARY_PATH=/usr/lib/ocaml:/home/gaius/Projects/extproc-ocaml,EXTPROC_DLLS=/home/gaius/Projects/extproc-ocaml/libextproc_add.so")
    )
  )

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC_FOR_XE))
      (ADDRESS = (PROTOCOL = TCP)(HOST = debian)(PORT = 1521))
    )
  )

DEFAULT_SERVICE_LISTENER = (XE)

Change exact pathnames to match your local system. Finally, at the SQL*Plus™ prompt, execute the following script as SYS:

conn / as sysdba
create or replace library gaius.extproc_add as '/home/gaius/Projects/extproc-ocaml/libextproc_add.so';
/
grant execute on gaius.extproc_add to gaius;

conn gaius/abc123

create or replace function add_two(x in binary_integer, y in binary_integer) return binary_integer as language c name "add_two" library extproc_add;
/

select add_two(1,2) from dual;

Note the use of the datatype BINARY_INTEGER rather than NUMBER, and how the function name is the name of the function in the C code.

Executing the example
This is trivial but it demonstrates the basics of invoking OCaml code from within SQL on the Oracle database, and will serve as a framework for future development. The C code involved is basically boilerplate‡; the application should be considered as just using PL/SQL and OCaml as languages (if the number of different languages involved is a concern!). We can therefore add to the list point 4: to implement a particular algorithm in a better-suited (more concise or expressive) language than Ada PL/SQL. An example of that might be a recursive algorithm. The C stub can reference any number of OCaml functions, in any number of .o files. Of course, in the real world, the advantages of doing so must outweigh the costs, i.e. the total complexity of (external procedure + OCaml) must be < (just writing it in PL/SQL).

One final tip: it is possible to debug external procedures with gdb. The steps are as follows:

  1. Compile with -g to insert debugging symbols
  2. Find the PID of the tnslsnr and attach with gdb --pid=...
  3. Set the option to follow a child process and enter the breakpoint(s) you want:
    (gdb) set follow-fork-mode child
    (gdb) break extproc_add.c:18
    (gdb) cont

    Don’t do this on a live system as the listener will be frozen at this point!

  4. Execute the external procedure in sqlplus in another window, and when you hit the breakpoint, you will drop into the debugger.

I used this technique to debug an ORA-28576 I got when testing. It seems that DEBUG_EXTPROC does not work on Oracle XE, but it isn’t really necessary, this technique works just as well. More about this in Unix Diagnostic Tools for DBAs.

† Or we can use TimesTen as a compute engine to offload this work from the database. Or OCI code and the client result cache.

‡ Unless there is a lot of data in which case OCI code will be needed e.g. if using Oracle native datatypes such as NUMBER, which would need marshalling from Oracle → C using OCINumberToInt() then Val_int() to marshall it C → OCaml. But to keep this example simple I just used BINARY_INTEGER.

About Gaius

Jus' a good ol' boy, never meanin' no harm
This entry was posted in Linux, Ocaml, Operation Foothold, Oracle. Bookmark the permalink.

2 Responses to Extprocs in OCaml

  1. Pingback: OCaml bindings for Coherence with SWIG | So I decided to take my work back underground

  2. Pingback: Learning the Wrong Lessons | So I decided to take my work back underground

Leave a comment