Comparing Python and Ocaml

I wrote some Python code a while ago for neatly formatting tabular output, for generating reports on the command line or from cron jobs to email. Typically this would be populated with a result set from a query “joined” with some computation done in the code. To make OCaml usable for this type of report (or for the next version of an existing report), I reimplemented it; the two versions are shown side by side below.

The first thing to notice is that they have exactly the same number of lines, yet there is more OCaml code by volume. Admittedly this is probably not very idiomatic OCaml on my part (and it is in the Camlp4 syntax). Tokens in OCaml tend to be much longer, e.g. Array.length or String.length instead of an (overloaded) len. Does this make it more legible or maintainable? I’m not convinced, at least not for a snippet this size, but I’ve yet to use OCaml “in the large”. I get type safety, but I know that I will only be dealing with strings, ints or floats (dates are effectively strings for the purposes of this), and Python does the casts “for free”.
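As a small illustration of the “casts for free” point, here is a Python 3 sketch that is not part of the report code itself: the row contents and the helper name cell_width are made up for the example. It relies only on str() accepting any value and on len being overloaded for lists and strings.

```python
# Illustrative only: shows why the Python version needs no explicit casts.
row = ["2010-06-01", 42, 3.5, None]   # hypothetical mixed-type row

def cell_width(value):
    """Width of a cell once rendered as text (str() accepts any type)."""
    return len(str(value))

widths = [cell_width(v) for v in row]

# len() works on the list itself and on each rendered string;
# "%*s" takes its field width from the argument tuple and calls str() on the value.
line = " ".join("%*s" % (w + 2, v) for w, v in zip(widths, row))
print(len(row), widths)
print(line)
```

The OCaml equivalent would have to convert each value explicitly (for instance with string_of_int or string_of_float) before it could be stored in an array of strings, which is where the extra type safety comes from.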

Anyway, this is part of a sub-project to put all the scaffolding in place so that OCaml can be a “drop-in” replacement for Python for my work. Most importantly, the compiled OCaml code has fewer external dependencies.

Python (assumes from sys import stdout at the top of the module):

class Report:
    def __init__(self, cols=[]):
        "Takes column headings as an argument"
        self.widths     = []
        self.columns    = cols
        self.records    = []
        
        for c in self.columns:
            self.widths.append(len(c))
            
    def addRow(self, r=[]):
        "Add one row, padding or truncating if necessary"
        row = list(r)   # copy, so neither the caller's list nor the shared default is mutated
        
        if len(row) > len(self.columns):
            row = row[:len(self.columns)]
            
        while len(row) < len(self.columns):
            row.append(None)
            
        for x in range(len(row)):
            if len(str(row[x])) > self.widths[x]:
                self.widths[x] = len(str(row[x]))

        self.records.append(row)

    def printReport(self, o=stdout):
        "Generate the report of column headings, a divider, then the rows"
        h, d = [], []
        for c in self.widths:
            h.append('%%%ds' % c)   # one right-aligned field per column, e.g. '%12s'
            d.append('-' * c)

        fmt = ' '.join(h)
        print(fmt % tuple(self.columns), file=o)
        print(fmt % tuple(d), file=o)

        for r in self.records:
            print(fmt % tuple(r), file=o)
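
For reference, the core of the class above (track the widest cell per column, then build one right-aligned format string) can be condensed into a standalone Python 3 function. This is only a sketch for illustration, not the code used in the actual reports; the name format_table and the sample data are assumptions.

```python
def format_table(columns, rows):
    """Return the report as a list of lines: headings, divider, then rows."""
    n = len(columns)
    # Truncate long rows and pad short ones with None, as addRow does.
    rows = [(list(r)[:n] + [None] * n)[:n] for r in rows]
    # Each column is as wide as its widest cell, heading included.
    widths = [len(c) for c in columns]
    for r in rows:
        widths = [max(w, len(str(v))) for w, v in zip(widths, r)]
    # '%%%ds' % 5 yields '%5s': a right-aligned field five characters wide.
    fmt = " ".join("%%%ds" % w for w in widths)
    divider = tuple("-" * w for w in widths)
    return [fmt % tuple(columns), fmt % divider] + [fmt % tuple(r) for r in rows]

lines = format_table(["id", "name"], [[1, "alice"], [22, "bob", "extra"], [333]])
print("\n".join(lines))
```

The second sample row has one column too many and the third has one too few, so the sketch exercises both the truncation and the padding path.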

OCaml (assumes open Printf at the top of the file):

class report (header : array string) =
  object (self)
    (** Array holding the width of each column *)
    value mutable widths = ([||] : array int);
    (** List of arrays each of which is one row in the report *)
    value mutable rows = ([] : list (array string));
    (** Initialize widths to be the same as the widths of the header *)
    initializer
      do {
        widths := Array.make (Array.length header) 0; self#set_widths header
      };
    method private set_widths x =
      for i = 0 to min (Array.length x) (Array.length header) - 1 do {
        if widths.(i) < String.length x.(i) then
          widths.(i) := String.length x.(i)
        else ()
      };
    (** Return r.(i) padded with spaces to widths.(i) *)
    method private pad_column r i =
      sprintf "%*s" widths.(i) r.(i);
    method private print_row chan r =
      do {
        for i = 0 to min (Array.length r) (Array.length header) - 1 do {
          output_string chan (self#pad_column r i ^ " ")
        };
        output_string chan "\n"
      };
    (** Add a row to this report - column widths will automatically adjust @param r an array of string values*)
    method add_row r = do { self#set_widths r; rows := [r :: rows] };
    (** Generate the report to STDOUT @param chan an optional out_channel *)
    method print_report ?(chan = Pervasives.stdout) () =
      do {
        self#print_row chan header;
        self#print_row chan
          (Array.init (Array.length widths)
             (fun i -> String.make widths.(i) '-'));
        List.iter (fun x -> self#print_row chan x) (List.rev rows)(* get the rows in order added *) 
      };
  end;

About Gaius

Jus' a good ol' boy, never meanin' no harm
This entry was posted in Ocaml, Python.

11 Responses to Comparing Python and Ocaml

  1. Hez says:

    One small change to be slightly more idiomatic would be to change lines 35 and 36 of the OCaml snippet to:

    (Array.map (fun width -> String.make width '-') widths);

    It saves a manual call to Array.length and results in code which is a bit clearer once you are familiar with functions like map.

    You can also take advantage of currying on line 37:

    List.iter (self#print_row chan) (List.rev rows)

    That is perhaps less idiomatic and more a matter of personal preference.
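
For readers coming from the Python side, both suggestions have rough analogues there: mapping over the values directly instead of indexing by position, and fixing a function's leading argument ahead of time (implicit through currying in OCaml, explicit with functools.partial in Python). The names print_row, widths and rows below are stand-ins chosen for this sketch, not code from the post.

```python
import io
from functools import partial

widths = [2, 6, 8]
rows = [["01", "abcdef", "ghfig"], ["02", "abcdef", "ghfidfsg"]]

# Index-based construction, in the spirit of Array.init with an explicit length...
dashes_by_index = ["-" * widths[i] for i in range(len(widths))]
# ...versus mapping over the values themselves, in the spirit of Array.map.
dashes_by_map = ["-" * w for w in widths]

def print_row(chan, row):
    """Write one right-aligned row to the file-like object chan."""
    chan.write(" ".join("%*s" % (w, v) for w, v in zip(widths, row)) + "\n")

out = io.StringIO()
# partial(print_row, out) plays the role of (self#print_row chan).
emit = partial(print_row, out)
emit(dashes_by_map)
for r in rows:
    emit(r)
print(out.getvalue(), end="")
```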

  2. gaiush says:

    Thanks for the tips – the latter one is what the Haskell community call “point-free” style I believe.

    Do you have any thoughts about Camlp4 vs “regular” syntax?

    • Hez says:

      A bit of terminology – the syntax you are using is called the revised syntax. Camlp4 is the tool which allows syntax extensions to OCaml, including the revised vs original OCaml syntaxes.

      The revised syntax cleans up some potential inconsistencies in the original syntax. It does not seem to be very commonly used, but camlp4 can translate code from revised to original and back so you can jump back and forth between the two with some effort.

  3. report2 and report3 are maybe a little bit more typical for OCaml. We use lists more often than arrays. We also avoid keeping state in the object (like widths) when we can compute it at the end.
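
The idea of not storing widths and instead folding over the rows at print time can be sketched in Python 3 as follows. This is an assumed analogue rather than a translation of any code in this thread: column_widths is a made-up name, and itertools.zip_longest stands in for a two-list map that pads the shorter list with a default.

```python
from functools import reduce
from itertools import zip_longest

def column_widths(table):
    """Fold over the rows, keeping the widest cell seen in each column."""
    def widen(widths, row):
        # zip_longest pads the shorter side with None; treat that as ""
        # for a missing cell and as 0 for a column not seen before.
        return [max(len(s or ""), w or 0) for s, w in zip_longest(row, widths)]
    return reduce(widen, table, [])

header = ["a", "b", "c"]
rows = [["01", "abcdef", "ghfig"], ["z", "z", "z", "z"]]
widths = column_widths([header] + rows)
print(widths)
```

Because the fold starts from an empty list, a row with four cells under a three-column header simply adds a fourth width instead of raising an error.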


  4. The code in my previous comment is not well displayed, here is a better version (I hope).

    BTW, I have converted the revised syntax to standard OCaml syntax.

    class report (header: string array) =
      object (self)
        (** Array holding the width of each column *)
        val mutable widths = ([||] : int array)
    
        (** List of arrays each of which is one row in the report *)
        val mutable rows = ([] : (string array) list)
    
        (** Initialize widths to be the same as the widths of the header *)
        initializer
          begin
            widths <- Array.make (Array.length header) 0; self#set_widths header
          end
    
        method private set_widths x =
          for i = 0 to min (Array.length x) (Array.length header) - 1 do 
            if widths.(i) < String.length x.(i) then
              widths.(i) <- String.length x.(i)
          done
    
        (** Return r.(i) padded with spaces to widths.(i) *)
        method private pad_column r i =
          Printf.sprintf "%*s" widths.(i) r.(i)
    
        method private print_row chan r =
          for i = 0 to min (Array.length r) (Array.length header) - 1 do
            output_string chan (self#pad_column r i ^ " ");
          done;
          output_string chan "\n"
    
        (** Add a row to this report - column widths will automatically adjust @param r an array of string values*)
        method add_row r = 
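          self#set_widths r; 
          rows <- r :: rows
    
        (** Generate the report to STDOUT @param chan an optional out_channel *)
        method print_report ?(chan = Pervasives.stdout) () =
          self#print_row chan header;
          self#print_row chan
            (Array.init (Array.length widths)
               (fun i -> String.make widths.(i) '-'));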
          List.iter (fun x -> self#print_row chan x) (List.rev rows)(* get the rows in order added *);
          ()
    
      end
    
    (* A little long, but support adding data of different size e.g. 4 cols with 3
     * headers
     *)
    class report2 (header: string array) = 
    object (self)
             
      (** List of arrays each of which is one row in the report *)
      val mutable rows = ([] : (string array) list)
    
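      method add_row r = 
        rows <- r :: rows
    
      method print_report ?(chan = Pervasives.stdout) () =
        begin
          (* Like List.map2, but pads the shorter list with a default value *)
          let map2 f lst1 dflt1 lst2 dflt2 =
            let rec map2' =
              function
                | e1 :: tl1, e2 :: tl2 ->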
                    let e = f e1 e2 in
                      e :: map2' (tl1, tl2)
                | [], e2 :: tl2 ->
                    let e = f dflt1 e2 in
                      e :: map2' ([], tl2)
                | e1 :: tl1, [] ->
                    let e = f e1 dflt2  in 
                      e :: map2' (tl1, [])
                | [], [] -> 
                    []
            in
              map2' (lst1, lst2)
          in
    
          let lst = 
            List.map Array.to_list (header :: List.rev rows)
          in
    
          let widths = 
            List.fold_left 
              (fun widths row ->
                 map2 
                   (fun s len -> max (String.length s) len)
                   row ""
                   widths 0)
              []
              lst
          in
    
          let pads = 
            List.map (fun len -> String.make (len + 1) ' ') widths
          in
    
          let lst = 
            (* Introduce header split *)
            match lst with 
              | hdr :: data ->
                  hdr 
                  :: 
                  (List.map (fun len -> String.make len '-') widths)
                  ::
                  data
              | [] ->
                  []
          in
    
            List.iter 
              (fun row ->
                 let _u : unit list = 
                   map2
                     (fun s (pad, len) ->
                        let s_len = String.length s in
                          String.fill pad 0 len ' ';
                          String.blit s 0 pad (len - s_len) s_len;
                          output_string chan pad)
                     row ""
                     (List.combine pads widths) ("", 0)
                 in
                   output_string chan "\n")
              lst;
            flush chan;
            ()
    
        end
    end
    
    (* Same version as report, cannot add extra cols
     *)
    class report3 (header: string array) = 
    object (self)
             
      (** List of arrays each of which is one row in the report *)
      val mutable rows = ([] : (string array) list)
    
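      method add_row r = 
        rows <- r :: rows
    
      method print_report ?(chan = Pervasives.stdout) () =
        begin
          let lst = 
            List.map Array.to_list (header :: List.rev rows)
          in
    
          let widths = 
            List.fold_left 
              (fun widths row ->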
                 List.map2 
                   (fun s len -> max (String.length s) len)
                   row widths)
              (List.map (fun _ -> 0) (Array.to_list header))
              lst
          in
    
          let pads = 
            List.map (fun len -> String.make (len + 1) ' ') widths
          in
    
          let lst = 
            (* Introduce header split *)
            match lst with 
              | hdr :: data ->
                  hdr 
                  :: 
                  (List.map (fun len -> String.make len '-') widths)
                  ::
                  data
              | [] ->
                  []
          in
    
            List.iter 
              (fun row ->
                 List.iter2
                   (fun s (pad, len) ->
                      let s_len = String.length s in
                        String.fill pad 0 len ' ';
                        String.blit s 0 pad (len - s_len) s_len;
                        output_string chan pad)
                   row 
                   (List.combine pads widths);
                   output_string chan "\n")
              lst;
            flush chan;
            ()
    
        end
    end
    
    let data = 
      [
        [|"01"; "abcdef"; "ghfig"|];
        [|"02"; "abcdef"; "ghfidfsg"|];
        [|"03"; "abcdef"; "ghfig"|];
        [|"04"; "abcd"; "ghfazeig"|];
        [|"05"; "abcdefffff"; "gherfig"|];
      ]
    
    let () = 
      let rprt = new report [|"a"; "b"; "c"|] in
      let () = 
        List.iter rprt#add_row data;
        rprt#print_report ()
      in
    
      let rprt = new report2 [|"a"; "b"; "c"|] in
      let () = 
        List.iter rprt#add_row data;
        rprt#add_row [|"z"; "z"; "z"; "z"|];
        rprt#print_report ()
      in
    
      let rprt = new report3 [|"a"; "b"; "c"|] in
      let () = 
        List.iter rprt#add_row data;
        rprt#print_report ()
      in
        ()
    
  5. ChriS says:

    Here is a short version (which also accepts a row of 4 columns with a header of 3). I wrote the code quickly; I hope it respects your specs.

    Test:

    let () =
      let data = [
        [|"01"; "abcdef"; "ghfig"|];
        [|"02"; "abcdef"; "ghfidfsg"|];
        [|"03"; "abcdef"; "ghfig"|];
        [|"04"; "abcd"; "ghfazeig"|];
        [|"05"; "abcdefffff"; "gherfig"|];
        [|"z"; "z"; "z"; "z"|];
      ] in
      let rprt = new report [|"a"; "b"; "c"|] in
      List.iter rprt#add_row data;
      rprt#print_report ()
    

  6. ChriS says:

    Hell, the system ate part of the code (because of the <- ?). Let us try again (a preview option would be welcome!)

    open Printf
    
    class report (header : string array) =
    object (self)
      (** Array holding the width of each column, initially set to the
          lengths of the header. *)
      val widths = Array.map String.length header
      (** List of rows in the report, in reverse order. *)
      val mutable rows = ([] : (string array) list)
    
      (** Add a row to this report - column widths will automatically adjust
          @param r an array of string values*)
      method add_row r =
        let lw = Array.length widths in
        let r = if Array.length r > lw then Array.sub r 0 lw else r in
        Array.iteri (fun i ri -> widths.(i) <- max widths.(i) (String.length ri)) r;
        rows <- r :: rows
    
      method private print_row chan r =
        Array.iteri (fun i ri -> fprintf chan "%*s " widths.(i) ri) r;
        fprintf chan "\n"
    
      (** Generate the report to STDOUT
          @param chan an optional out_channel *)
      method print_report ?(chan = Pervasives.stdout) () =
        self#print_row chan header;
        self#print_row chan (Array.map (fun wi -> (String.make wi '-')) widths);
        List.iter (fun r -> self#print_row chan r) (List.rev rows)
    end
    • Gaius says:

      Thanks!

      When I first wrote it (in Python) I couldn’t think of a case in which a row with more columns than the header wouldn’t be a bug (e.g. accidentally splitting on a space or a comma in some result from the DB), so I truncated, which lets a potentially lengthy report generation continue without crashing. I could, however, think of cases where there would be fewer columns (e.g. a computation that returned no meaningful result), so in that case I padded.
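
That policy (truncate rows that are too long, pad rows that are too short) can be isolated in a few lines of Python 3. This is a sketch under the assumptions stated in the reply above; normalize_row and its filler parameter are hypothetical names, not part of the original class.

```python
def normalize_row(row, ncols, filler=None):
    """Return a copy of row with exactly ncols cells."""
    cells = list(row)[:ncols]                 # extra columns are assumed to be a bug: drop them
    cells += [filler] * (ncols - len(cells))  # missing values are legitimate: pad them
    return cells

too_long = ["a", "b", "c", "d"]
print(normalize_row(too_long, 3))
print(normalize_row(["a"], 3))
```

Working on a copy means the caller's list is left untouched, so the same row can safely be reused after it has been added to a report.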

  7. ChriS says:

    BTW, do not put types in your comments: they are already in the code (or can be displayed after compilation — e.g. with C-c C-t in Emacs) and you may want to change the data structures later… with the risk of forgetting to change your comments.

  8. Pingback: Using OCaml with Oracle (2) | So I decided to take my work back underground
