The json module

The json module is a dissection module: it takes an Orchids event, parses its last field (a text string, of type str, which should be in JSON format), and returns a refined Orchids event with additional fields.

JSON stands for JavaScript Object Notation. It has become a popular alternative to XML for describing objects, possibly nested. The Linux journalctl utility, for example, has a native JSON export format.

The json module is still in a preliminary version. Its intended use is to parse data embedded in, say, the .syslog.msg field of an event already dissected once by the syslog module. For example, if you know that messages reported by syslog are in JSON format when the .syslog.prog field equals js_reporter (that name is merely for the sake of the example), then you would write something like the following in $OCONF/orchids-inputs.conf, assuming the syslog data is read by the textfile module from the file blah.log:

INPUT		textfile	"blah.log"
DISSECT syslog	textfile	"blah.log"
DISSECT json    syslog		js_reporter

Configuration options

None (yet).

Fields

The fields provided by the json module are of a somewhat peculiar nature. Most other modules document a fixed set of fields, whereas the json module has fields that may vary dynamically as events are obtained. To accommodate this, all those run-time fields are collected in one single static field, .json.fields, which is an array mapping the names of the dynamic fields to their values. This will be clearer with the examples given below. We shall also explain the role of the .json.remainder field below.

Field            Type   Mono?  Description
.json.remainder  str           rest of message
.json.fields     [str          array of dynamic fields

(The type [str means array of strings. An array can be indexed by any basic type. Here the indices will be strings as well.)

We start with a simple example.  Imagine the following text, in JSON format:

{"syslog_time":"2017-05-03T20:41:14.342405+02:00","syslog_host":"darkstar","process_id":"237","path":"/tmp/exploit"}

The json module will dissect that by letting .json.fields be the array that maps:

  • "syslog_time" to the value "2017-05-03T20:41:14.342405+02:00";
    note that the latter is not a ctime value: to obtain one, use the ctime_from_str primitive, for example;
  • "syslog_host" to the value "darkstar";
  • "process_id" to the value "237";
    note that the latter is not a uint value: to obtain one, use the uint_from_str primitive, for example;
  • "path" to the value "/tmp/exploit".
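
To make the point about string values concrete, here is a small sketch in Python (used purely for illustration; it is not part of Orchids). It parses the text above with Python's standard json library and converts two of the values. The calls int and datetime.fromisoformat are only Python analogues of the uint_from_str and ctime_from_str primitives, not their implementation.

```python
import json
from datetime import datetime

text = ('{"syslog_time":"2017-05-03T20:41:14.342405+02:00",'
        '"syslog_host":"darkstar","process_id":"237","path":"/tmp/exploit"}')

# Every value arrives as a string, as in .json.fields.
fields = json.loads(text)

# Explicit conversions, analogous in spirit to uint_from_str and
# ctime_from_str (these Python calls are stand-ins, not Orchids primitives).
process_id = int(fields["process_id"])
timestamp = datetime.fromisoformat(fields["syslog_time"])
```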

The JSON format also allows one to nest objects, as in the "attributes" subobject below:

{"syslog_time":"2017-05-03T20:41:14.342405+02:00","syslog_host":"darkstar","process_id":"237","path":"/tmp/exploit","attributes":{"length":"2","referer":"apache","object.class":"None"}}

in which case the .json.fields array will contain the following additional fields:

  • "attributes.length", mapped to the value "2";
  • "attributes.referer", mapped to the value "apache";
  • "attributes.object.class", mapped to the value "None".
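
The naming scheme for nested objects can be sketched in a few lines of Python. This is only an illustration of the dotted names described above, not the module's actual code; the function name flatten_objects is hypothetical.

```python
import json

def flatten_objects(obj, prefix=""):
    """Flatten nested JSON objects into one dict with dotted names."""
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # A subobject contributes its own fields under the dotted prefix.
            fields.update(flatten_objects(value, name))
        else:
            fields[name] = value
    return fields

text = ('{"path":"/tmp/exploit","attributes":'
        '{"length":"2","referer":"apache","object.class":"None"}}')
fields = flatten_objects(json.loads(text))
```

Note that a key which itself contains a dot, such as "object.class", yields a name ("attributes.object.class") that looks the same as one obtained from a further level of nesting.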

Still more complicated, a JSON object may contain lists of sub-objects enclosed in square brackets (‘[’ … ‘]’), as in the value of the "menu" subobject below.

{"syslog_time":"2017-05-03T20:41:14.342405+02:00","syslog_host":"darkstar","process_id":"237","path":"/tmp/exploit","attributes":{"length":"2","referer":"apache","object.class":"None"},"menu":[{"value":"New","onclick":"createNewDoc()"},{"value":"Open","onclick":"OpenDoc()"},{"value":"Close","onclick":"CloseDoc()"}]}

in which case the .json.fields array will contain the following additional fields:

  • "menu(0).value", mapped to "New";
  • "menu(0).onclick", mapped to "createNewDoc()";
  • "menu(1).value", mapped to "Open";
  • "menu(1).onclick", mapped to "OpenDoc()";
  • "menu(2).value", mapped to "Close";
  • "menu(2).onclick", mapped to "CloseDoc()".

Note how lists of objects are handled as though we had implicit subobjects, numbered 0, 1, 2, etc. The syntax of fields is meant to be compatible with the accessors of the prelude module, so that you are not forced to get used to a new syntax when you switch modules.
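
The numbering of list elements can be sketched in Python as follows. This is a minimal illustration of the naming scheme, not the module's implementation; the function name flatten_with_lists is hypothetical, and the treatment of lists whose elements are not objects (named "x(0)", "x(1)", and so on) is an assumption of the sketch.

```python
import json

def flatten_with_lists(value, name="", fields=None):
    """Flatten nested JSON objects and lists into one dict of named values."""
    if fields is None:
        fields = {}
    if isinstance(value, dict):
        for key, sub in value.items():
            flatten_with_lists(sub, f"{name}.{key}" if name else key, fields)
    elif isinstance(value, list):
        # List elements behave like implicit subobjects numbered 0, 1, 2, ...
        for index, sub in enumerate(value):
            flatten_with_lists(sub, f"{name}({index})", fields)
    else:
        fields[name] = value
    return fields

text = ('{"path":"/tmp/exploit","menu":['
        '{"value":"New","onclick":"createNewDoc()"},'
        '{"value":"Open","onclick":"OpenDoc()"},'
        '{"value":"Close","onclick":"CloseDoc()"}]}')
fields = flatten_with_lists(json.loads(text))
```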

Finally, it may be that the JSON message ends in some junk, as exemplified at the end of the following.

{"syslog_time":"2017-05-03T20:41:14.342405+02:00","syslog_host":"darkstar","process_id":"237","path":"/tmp/exploit","attributes":{"length":"2","referer":"apache","object.class":"None"},"menu":[{"value":"New","onclick":"createNewDoc()"},{"value":"Open","onclick":"OpenDoc()"},{"value":"Close","onclick":"CloseDoc()"}]}and then some junk

In that case, the .json.remainder field will contain "and then some junk".

In all previous examples, .json.remainder simply contained the empty string "".
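
The split between the parsed object and the remainder can be imitated in Python with the standard library's json.JSONDecoder.raw_decode, which parses one JSON value at the start of a string and reports where it ended. This is only a sketch of the behaviour described above, not the way the json module is implemented; the function name split_json is hypothetical.

```python
import json

def split_json(text):
    """Return (parsed_object, remainder) for text that starts with JSON."""
    # raw_decode returns the parsed value and the index just past its end.
    obj, end = json.JSONDecoder().raw_decode(text)
    return obj, text[end:]

obj, remainder = split_json('{"path":"/tmp/exploit"}and then some junk')
clean_obj, clean_remainder = split_json('{"path":"/tmp/exploit"}')
```

Here remainder plays the role of .json.remainder for a message with trailing junk, and clean_remainder is the empty string, as in the earlier examples.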