Analyzing logfiles (1)

One powerful technique I often use is to quickly load application or system logs into throwaway SQLite databases for analysis. This means I can easily run queries such as, how many times has event X occurred broken down by hour, or how how does the average of this metric on that box compare with its peers. Other approaches I have seen people use for this type of analysis include Awk, Excel, and printing out every Orca graph, laying them out on the boardroom table and just staring at them. These approaches rapidly become unmanageable once you need to correlate many machines, different types of logfiles, etc. In this example I will be loading sar output but it can obviously be any regularly-structured file.

Here is the project directory:

gaius@debian:~/Projects/Sar2db$ ls
Makefile  sar2db.ml  sqlite3_overlay.ml

And a slightly more sophisticated Makefile:

# Makefile

TARGET=sar2db
EXTRAS=sqlite3_overlay.ml

$(TARGET): $(TARGET).ml $(EXTRAS)
	ocamlfind ocamlopt -package sqlite3 -package unix -package calendar -linkpkg $(EXTRAS) $< -annot -o $(TARGET)

clean:
	rm -f $(TARGET) *.o *.cm* *~ *.db

# End of file

The OCaml code starts as usual, opening the main modules I am using to move their functions into the default namespace:

(* A program to load sar output into SQLite for analysis *)

open Unix
open Sqlite3
open Sqlite3_overlay
open Scanf
open Printf

(* Parse command line arguments *)
let (infile, dbname, tabname) = ((ref ""), (ref ""), (ref ""))
let _ = Arg.parse
    [("-i", Arg.Set_string infile, "Input file name (default STDIN)");
     ("-o", Arg.Set_string dbname, "New SQLite database name (default sar.db)");
     ("-t", Arg.Set_string dbname, "Table name (default sardata)")]
    (fun x -> raise (Arg.Bad (sprintf "%s: unexpected argument '%s'" Sys.argv.(0) x)))
    (sprintf "Usage: %s [-o database]" Sys.argv.(0));

Scanf is a module that is the opposite of Printf: rather than producing output based on a format string, it instead uses a format string to parse a line of input. Note that I am overlaying some Sqlite3 functions with my own; I cannot for the life of me figure out how to do INSERT with bind variables with this library. I know it is possible in the underlying C API. So I am going to fake it for now. Obviously that abandons the performance enhancement of using prepared statements in the first place, but it keeps the main body of the code nice and clean, and I’m sure I’ll figure it out eventually.

(* main *)
try
    (* Open the database (implicit create) and create the table (if it does not already exist *)
    let dbname = (match !dbname with |"" -> "sar.db" |_ -> !dbname) in
    let chan = (match !infile with |"" -> Pervasives.stdin |_ -> open_in !infile) in
    let lda = db_open dbname in
    let tabname = (match !tabname with |"" -> "sardata"|_ -> !tabname) in

This part of the code actually implements the defaults for the command line arguments. If the database name is not specified in defaults to sar.db, similarly the table name defaults to sardata. This flexibility allows many different datasets to coexist for cross referencing.

        let sql_create = sprintf
        "create table %s (
            sample_time     datetime,
            sample_host     varchar2(32),
            sample_cpu      varchar2(4),
            pc_user         number,
            pc_nice         number,
            pc_system       number,
            pc_iowait       number,
            pc_steal        number,
            pc_idle         number)" tabname in

        let sql_index = sprintf "create unique index ix_sar on %s (sample_time, sample_host)" tabname in
            List.iter (fun stmt -> ignore (exec lda stmt) ) [sql_create; sql_index];

Next create the table and an index on it for performance. If either already exists it’s fine, the new data will simply be appended to what is there, so any errors executing the SQL are ignored. Before using this technique it is important to verify that the SQL is actually valid! The most obvious flaw is that I am performing no verification that the table name passed in on the command line is a valid one. However I have tested that provided that it is, this block of code will perform as intended.

        (* Read the header to get the hostname and the start date *)
        let (sample_host, start_date) =
        let first_line = input_line chan in
            (sscanf first_line "%_s %_s (%s@) %02d/%02d/%04d"
            (fun host mon dd yyyy ->
                (host, fst (mktime {tm_year = yyyy - 1900; tm_mday = dd; tm_mon = mon - 1;
                            tm_sec=0; tm_min=0; tm_hour=0; tm_wday=0; tm_yday=0; tm_isdst=false}) ))) in

Before going on, it would perhaps be useful to note the format of sar output:

gaius@debian:~/Projects/Sar2db$ sar|head -5
Linux 2.6.32-5-686 (debian) 	09/12/2010

12:00:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
12:01:01 AM       all     51.33      0.00      6.83     15.54      0.00     26.30
12:02:01 AM       all     30.36      0.00      6.11     21.08      0.00     42.45
12:03:01 AM       all     43.72      0.00      8.41     21.31      0.00     26.57

There is a date in the header, and the individual rows only have the time. Therefore to support gathering data over several days, it will be necessary to add the date and the time together. Also in the header is the name of the machine. The ) character is special in the format string, hence escaping it with the @ in order to perform a match. There are more concise methods available for date parsing, but in this program times and dates are mixed in with other data, so it makes sense to parse them all together. It takes as arguments a string to match against, a format string to tell it what to expect and a function to execute with the values. Format specifiers with an underscore are matched but then discarded and not passed to the function, or alternately I could have declared the function (fun _ _ host mon dd yyyy -> ...). The function returns a tuple containing the host name and the start date.

The final block of code in the module implements an “infinite” loop, broken out of when an exception is raised. Note (line 18) that we are already in a try block; this is the block that will exit the program when the end of the input is reached, either a file or standard input. The try block here is to loop again when the format string is not matched (periodically there will be a repeat of the header in the sar output, which we will ignore, as the format will not match). When nesting like this it is helpful to annotate which with belongs to which try. Indentation is not significant in OCaml as it is in Haskell and Python.

        (* Infinite loop to read the file, exiting when End_of_file *)
        let ins_stmt = prepare lda (sprintf "insert into %s values (?,?,?,?,?,?,?,?,?)" tabname) in
        while true do
            let next_line = input_line chan in
            try
                sscanf next_line "%02d:%02d:%02d %s %s %f %f %f %f %f %f"
                (fun hh mi ss ampm cpu pc_user pc_nice pc_sys pc_io pc_steal pc_idle ->
                    (* Add the start_date to the sample_time *)
                    let hh24 = match hh with  |12 -> 0.   |_ -> float hh in
                    let hh' = match ampm with |"AM" -> 0. |_    -> 12. in
                    let sample_time = start_date +. ((hh24 +. hh') *. 3600.) +. (float mi *. 60.) +. float ss in
                        bind ins_stmt 1 (to_date sample_time);

                    bind ins_stmt 2 (Data.TEXT sample_host);
                    bind ins_stmt 3 (Data.TEXT cpu);
                    Array.iteri (fun i x -> bind ins_stmt (i+4) (Data.FLOAT x)) [|pc_user; pc_nice; pc_sys; pc_io; pc_steal; pc_idle|];

                    match (execP ins_stmt) with
                        |Rc.OK -> ()
                        |_ -> print_endline (errmsg lda);
                )

            with Scan_failure e -> () |End_of_file -> () (* sscanf *)
        done
with
    Sys_error e     -> eprintf "%s: %s\n" Sys.argv.(0) e
    |End_of_file    -> () (* input_line *)

(* end of file *)

The addition of a time to a date is based on the example in PLEAC. I have not yet decided whether to use CalendarLib or OCamlnet’s Netdate in future. Probably the latter as it cuts down on the dependencies. Similarly, I’ll look into Pcre as it’s a prereq for Ocamlnet anyway, and I am very familiar with regexps from Python (that is how I do it usually).

Note that each different data type is bound slightly differently. Since there are many consecutive values of the same type, they can be bound in a loop with iteri. They are cast using constructors from the Sqlite3.Data module to Data.t. Finally the SQL can be executed (execP is my own function for prepared statements, as noted earlier).

It’s possible to very quickly and easily gather data from a variety of machines, using SSH key-pair and/or NFS storage. For example:

for n in `seq 1 10`
do
    ssh host${n} "sar|/shared/sar2db -o /shared/sar.db"
done

Now using SQL’s GROUP BY syntax, it is easy to analyze the performance of a cluster.

Finally here is the code for my “fake” prepared statements.

(* Overlays on Sqlite module. Highly inefficient! *)

open Sqlite3
open CalendarLib

(* expose only the interface *)
module type SQLITE3_OVERLAY =
sig
	type prepared
	val prepare : Sqlite3.db -> string -> prepared
	val execP 	: prepared -> Sqlite3.Rc.t
	val bind 	: prepared -> int -> Sqlite3.Data.t -> unit
	val to_date : float -> Sqlite3.Data.t
end

(* a "fake" prepared statement. contains original sql and values bound into it *)
type prepared = {lda        :db;            (* database handle *)
                sql         :string;        (* original SQL *)
                bindvars    :Data.t array;  (* variables to bind *)
                positions   :int list}      (* indexes of ? in sql *)

(** return a list containing all the indexes of ? in a string *)
let findq sql =
    let rec findq' sql offset acc =
        try
            let pos = String.index_from sql offset '?' in
                findq' sql (pos+1) (pos::acc)
        with
            Not_found -> List.rev acc in
    findq' sql 0 []

(** construct a prepared statement record *)
let prepare lda sql =
    { lda = lda; sql=sql; positions=(findq sql); bindvars=Array.make (List.length (findq sql)) (Data.NULL)}

(** insert a string at pos into a another string *)
let splice str pos x =
    let before = String.sub str 0 pos in
    let after = String.sub str (pos + 1) ((String.length str) - (pos + 1)) in
        before ^ x ^ after

(** substitute the bind vars of a prepared into it's sql and execute it *)
let execP stmt =
    let sql' = ref "" in
        sql' := String.copy stmt.sql;
        for i = (Array.length stmt.bindvars) - 1 downto 0 do (* backwards to not invalidate findq results *)
            sql' := splice !sql' (List.nth stmt.positions i) ("'" ^ (Data.to_string stmt.bindvars.(i)) ^ "'")
        done;

        exec stmt.lda !sql'

(** bind a value into the statement at pos - indexes start from 1 *)
let bind stmt pos value =  stmt.bindvars.(pos - 1) <- value

(** Convert Epoch time to Data.TEXT in ISO 8601 format *)
let to_date e = Data.TEXT (Printer.CalendarPrinter.sprint "%Y%m%dT%H:%M:%S" (Calendar.from_unixfloat e))

(* End of file *)

It simply builds literal SQL by substituting a string in position of a ? indexed from 1 (see line 46 in sar2db.ml). I am overlaying the prepare and bind functions and adding execP and to_date (as there is no Data.DATETIME constructor). findq and splice are only used internally so they are hidden. Sqlite implicitly handles the type conversions from strings, but I am fully type safe until the last step.

About these ads

About Gaius

Jus' a good ol' boy, never meanin' no harm
This entry was posted in Linux, Ocaml, SQLite. Bookmark the permalink.

3 Responses to Analyzing logfiles (1)

  1. Pingback: Dictionary format strings | So I decided to take my work back underground

  2. Pingback: Scope in database code (2) | So I decided to take my work back underground

  3. Pingback: Calling Unix libraries from OCaml | So I decided to take my work back underground

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s