The event dispatcher

Once Orchids has completed all initializations, it calls event_dispatcher_main_loop(), which repeatedly collects events from various sources, and launches threads to monitor signatures.  The event_dispatcher_main_loop() function can be found in file evt_mgr.c.  Here is how it works.

The event_dispatcher_main_loop() function runs an infinite loop.  There are two kinds of things that it is waiting for:

  • First, it is waiting on input coming from a certain set of sockets; events received on an Internet socket will be read from there;
  • Second, it is also looking for actions that are scheduled to be done at a later time, and are kept in a priority queue (ctx->rtactionlist, where ctx is the Orchids context); this is a general purpose mechanism that is used to poll text files, to register timeouts, and various other things.

Sockets

The code that waits for input on a socket looks like this:

retval = select(ctx->maxfd + 1, &rfds, NULL, NULL, wait_time_ptr);

Once Orchids detects activity on a file descriptor, retval will contain a positive number: the number of file descriptors that are ready for input.  It will then sweep through the list ctx->realtime_handler_list to find the correct callback(s) that must be called to deal with input on those file descriptors that are ready. These callbacks are registered through the add_input_descriptor() call, in orchids_api.c.
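
The sweep over the handler list can be pictured with the following minimal sketch. It is not the Orchids code: the demo_handler_t record and the demo_* functions are hypothetical stand-ins for the entries of ctx->realtime_handler_list, and the real list nodes certainly carry more information.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical handler record: one callback per registered descriptor. */
typedef struct demo_handler {
  int fd;
  int (*cb)(int fd, void *data);
  void *data;
  struct demo_handler *next;
} demo_handler_t;

/* Put every registered descriptor into rfds and return the highest one
   (the role played by ctx->maxfd), or -1 if the list is empty. */
int demo_fill_fdset(const demo_handler_t *list, fd_set *rfds)
{
  int maxfd = -1;

  assert(rfds != NULL);
  FD_ZERO(rfds);
  for (const demo_handler_t *h = list; h != NULL; h = h->next) {
    FD_SET(h->fd, rfds);
    if (h->fd > maxfd)
      maxfd = h->fd;
  }
  return maxfd;
}

/* After select() returns a positive value, call the callback of every
   handler whose descriptor is still set in rfds.
   Returns the number of callbacks invoked. */
int demo_dispatch_ready(const demo_handler_t *list, fd_set *rfds)
{
  int called = 0;

  assert(rfds != NULL);
  for (const demo_handler_t *h = list; h != NULL; h = h->next) {
    if (FD_ISSET(h->fd, rfds)) {
      h->cb(h->fd, h->data);
      called++;
    }
  }
  return called;
}

/* Example callback: counts how many times it was invoked. */
int demo_count_cb(int fd, void *data)
{
  (void) fd;
  *(int *) data += 1;
  return 0;
}
```

In a real loop, demo_fill_fdset() would run before each select() call, since select() overwrites the set it is given, and demo_dispatch_ready() would run right after it.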

It may also happen that retval equals 0.  In that case, no input was ready, and the timeout, given by wait_time_ptr, has been reached.  This timeout is always equal to the shortest delay before the next action in the priority queue must be triggered: see the next section on the rtactionlist priority queue.
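
To make the timeout computation concrete, here is a small sketch of how such a wait time could be derived from the date of the earliest scheduled action. The function demo_wait_time() is hypothetical, and returning NULL (block indefinitely) when nothing is scheduled is an assumption of mine, not something taken from evt_mgr.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Compute next - now into *out, clamped at zero, and return out so it can
   be passed as the last argument of select().
   If next is NULL (assumption: the queue is empty), return NULL, which
   makes select() wait without a timeout. */
struct timeval *demo_wait_time(const struct timeval *now,
                               const struct timeval *next,
                               struct timeval *out)
{
  long sec, usec;

  assert(now != NULL && out != NULL);
  if (next == NULL)
    return NULL;

  sec = (long) (next->tv_sec - now->tv_sec);
  usec = (long) (next->tv_usec - now->tv_usec);
  if (usec < 0) {            /* borrow one second */
    sec--;
    usec += 1000000L;
  }
  if (sec < 0) {             /* the action is already due: do not wait */
    sec = 0;
    usec = 0;
  }
  out->tv_sec = sec;
  out->tv_usec = usec;
  return out;
}
```

With a zero timeout, select() returns immediately, which is what lets an action that is already due run on the very next iteration of the loop.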

The file descriptors on which select() is waiting are registered through the add_input_source() function call, in orchids_cfg.c.  You shouldn’t call it directly: it is meant to be called when Orchids reads an INPUT directive at configuration time, see below.

Let us take an example to see how all this configuration takes place.  In the configuration file orchids-inputs.conf (in the distribution, look into dist/; otherwise, look typically into /usr/local/etc/orchids/), you should find a line of the following form:

INPUT            textfile    /var/log/messages

This instructs Orchids to open the file /var/log/messages and feed it as input to the mod_textfile module. It also registers /var/log/messages as an indexing key, and I will talk about it in another post, dealing with so-called dissectors.

When Orchids does all its configuration stuff, it will parse the command INPUT. The table config_dir_g[], in orchids_cfg.c, lists all the configuration commands that Orchids recognizes. One of the lines there reads:

  { "INPUT", add_input_source, "Add an input source module"},

This tells Orchids to just call add_input_source() when it sees a line starting with INPUT.

Now add_input_source() will read the rest of the line, take the next word (textfile here) as the base name of a module (here, mod_textfile), and it will look for a configuration directive provided by this module of the exact same name INPUT.

As it happens, the mod_textfile module declares exactly such a directive. Indeed, in its table textfile_dir[] of recognized directives (which Orchids knows about because the declared module structure input_module_t mod_textfile contains a pointer to it), we find the following line:

  { "INPUT", add_input_file, "Add a file as input source" },
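
The two-level lookup (first INPUT in the global table, then INPUT again in the table of the module named on the line) can be sketched as follows. Everything prefixed with demo_ is hypothetical: the real handlers have a different signature, and I am assuming tables terminated by a NULL name purely for the sake of the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  const char *name;
  int (*handler)(const char *args);
  const char *help;
} demo_directive_t;

typedef struct {
  const char *name;              /* base name, e.g. "textfile" */
  const demo_directive_t *dirs;  /* table terminated by a NULL name */
} demo_module_t;

const char *demo_last_file = NULL;   /* records what the module received */

static size_t word_len(const char *s) { return strcspn(s, " \t"); }
static const char *skip_blanks(const char *s) { return s + strspn(s, " \t"); }

/* Look up a word of the given length in a directive table. */
const demo_directive_t *demo_find_directive(const demo_directive_t *table,
                                            const char *word, size_t len)
{
  for (const demo_directive_t *d = table; d->name != NULL; d++)
    if (strlen(d->name) == len && strncmp(d->name, word, len) == 0)
      return d;
  return NULL;
}

/* Stand-in for the module-local add_input_file(). */
int demo_add_input_file(const char *args)
{
  demo_last_file = args;
  return args[0] != '\0' ? 0 : -1;
}

const demo_directive_t demo_textfile_dir[] = {
  { "INPUT", demo_add_input_file, "Add a file as input source" },
  { NULL, NULL, NULL }
};

const demo_module_t demo_modules[] = {
  { "textfile", demo_textfile_dir },
  { NULL, NULL }
};

/* Stand-in for add_input_source(): the next word names the module, and the
   module's own INPUT directive receives the rest of the line. */
int demo_add_input_source(const char *args)
{
  size_t len;

  assert(args != NULL);
  len = word_len(args);
  for (const demo_module_t *m = demo_modules; m->name != NULL; m++) {
    if (strlen(m->name) == len && strncmp(m->name, args, len) == 0) {
      const demo_directive_t *d = demo_find_directive(m->dirs, "INPUT", 5);
      return d != NULL ? d->handler(skip_blanks(args + len)) : -1;
    }
  }
  return -1;   /* no such module */
}

const demo_directive_t demo_config_dir[] = {
  { "INPUT", demo_add_input_source, "Add an input source module" },
  { NULL, NULL, NULL }
};

/* Dispatch one configuration line through the global table. */
int demo_parse_line(const char *line)
{
  const demo_directive_t *d;
  size_t len;

  line = skip_blanks(line);
  len = word_len(line);
  d = demo_find_directive(demo_config_dir, line, len);
  return d != NULL ? d->handler(skip_blanks(line + len)) : -1;
}
```

Feeding the example line to demo_parse_line() ends up in demo_add_input_file() with /var/log/messages as its argument, mirroring the chain described above.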

The function add_input_file() is local to the mod_textfile module. When instructed to open a SOCK_UNIX socket, it connects to it; when told to read from a pipe, it opens it; in both cases, it obtains a file descriptor, and uses it to call add_input_descriptor() and register it as one of the possible sources of input that select() should wait for.

There is a final case that add_input_file() deals with: opening a regular file.  It makes no sense to select() on a regular file.  Instead, add_input_file() will add the file descriptor to a local list of polled files, with a prescribed polling interval.  At post-configuration time (namely, after all modules have been loaded), Orchids will call the module’s textfile_postconfig() function (a pointer to that function is installed in the declared module structure input_module_t mod_textfile). This in turn calls register_rtcallback() to install a callback that will be called after the polling period to read the contents of the file in.

The rtactionlist priority queue

The other thing that event_dispatcher_main_loop() deals with is a priority queue of actions to be done later.  Each comes with a time at which the action should be triggered, and a callback to invoke at that time (and additional data to pass to the callback).

There are several places in the infinite loop implemented by event_dispatcher_main_loop() where it pulls the next due action out of the queue.  They all have the form:

he = get_next_rtaction(ctx);

Right after that, Orchids will execute the callback he->cb if there is one. In the example of mod_textfile, for instance, with key /var/log/messages, since the latter is a regular file, the module’s post-configuration function will have registered a callback that reads the contents of the file.  This callback is called rtaction_read_files(), and what it does is:

  • read in as much new text as was appended to the file and put it in some internal buffers
  • try to read one complete line (terminated by a newline \n) and package it as an Orchids event (a list of fields)
  • re-register itself to read the next line, or read some more text in, as needed, later.

Since my point here is to talk about the priority queue, let us look at the last point first.  What it does, concretely, is to call:

register_rtaction(ctx, he);

with the same he as the one we got from our earlier call to get_next_rtaction().  This is normal: we reschedule ourselves to be called later by the infinite loop in the event_dispatcher_main_loop() function.

However, we do the following subtle thing.  If we haven’t reached the end of the file yet, namely if we still have a few lines to parse, or if we haven’t read in the whole new contents of the file, then we do not change he->date: this way, the date at which we are rescheduled is already due.  The effect is that the loop in event_dispatcher_main_loop() will call us back again immediately to read some new lines.  This is the proper way to do it, rather than looping inside the callback until the file is exhausted: the infinite loop may have other things to do, e.g., reading urgent input from other sources, and you should give Orchids the chance of dealing with them.

Conversely, if we have reached the (current) end of the file, we add the standard value of the polling interval to the date (if (eof) he->date.tv_sec += cfg->poll_period), so as to be called back in that amount of time.  Before that, we must call gettimeofday(&he->date, NULL); see this post.
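
Put together, the date update amounts to something like the following sketch. The demo_entry_t type is a hypothetical stand-in for heap_entry_t that keeps only the date field, and poll_period plays the role of cfg->poll_period.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for heap_entry_t: only the date matters here. */
typedef struct {
  struct timeval date;
} demo_entry_t;

/* Decide when the callback should run again.
   - Not at end of file: keep the old date, which is already due, so the
     main loop calls us back on its next iteration.
   - At end of file: restart from the current time and add the polling
     period. */
void demo_next_date(demo_entry_t *he, int eof, long poll_period)
{
  assert(he != NULL);
  if (eof) {
    gettimeofday(&he->date, NULL);
    he->date.tv_sec += poll_period;
  }
}
```

After this, the real callback would hand he back to register_rtaction(ctx, he), as described above.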

The priority queue is implemented as a skew heap, a self-adjusting binary heap in which insertion and removal of the earliest entry both take logarithmic amortized time.
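
For readers who have not met skew heaps before, here is a textbook sketch of the data structure, keyed by a plain long instead of a date. It is not the Orchids implementation, and the demo_* names are mine. Everything rests on one merge operation that swaps the children of each node it visits.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_node {
  long key;
  struct demo_node *left, *right;
} demo_node_t;

/* Merge two skew heaps and return the root of the result.
   The smaller root wins; the other heap is merged into its right subtree,
   and the two children are then swapped. */
demo_node_t *demo_skew_merge(demo_node_t *a, demo_node_t *b)
{
  demo_node_t *merged;

  if (a == NULL)
    return b;
  if (b == NULL)
    return a;
  if (b->key < a->key) {       /* make sure a holds the smaller key */
    demo_node_t *tmp = a;
    a = b;
    b = tmp;
  }
  merged = demo_skew_merge(a->right, b);
  a->right = a->left;          /* the swap that keeps the heap balanced */
  a->left = merged;            /* in the amortized sense */
  return a;
}

/* Insert a single node: merge the heap with a one-node heap. */
demo_node_t *demo_skew_insert(demo_node_t *heap, demo_node_t *n)
{
  assert(n != NULL);
  n->left = n->right = NULL;
  return demo_skew_merge(heap, n);
}

/* Remove and return the node with the smallest key, or NULL if empty. */
demo_node_t *demo_skew_pop(demo_node_t **heap)
{
  demo_node_t *min;

  assert(heap != NULL);
  min = *heap;
  if (min != NULL) {
    *heap = demo_skew_merge(min->left, min->right);
    min->left = min->right = NULL;
  }
  return min;
}
```

Nodes are allocated by the caller, which matches the way a heap entry is handed to the queue and handed back by it in the text above.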

Most of the time, the above behavior is typical: your callback should reschedule itself by calling register_rtaction() as above.  If you do not wish to be rescheduled, you should free the memory used by he, by calling gc_base_free(ctx->gc_ctx, he).  Otherwise you will introduce a memory leak.

To install such delayed callbacks, conversely, the obvious way is to allocate a heap_entry_t, put a pointer to it into some local variable he, and to call register_rtaction(ctx, he). You would also have to fill in the heap_entry_t structure by hand. A better way is to call register_rtcallback(), which does most of the job for you: it takes an Orchids context ctx, a callback, two pieces of data that you wish to be forwarded to the callback (one meant to be garbage-collectable, the other one meant to be allocated by hand), a delay (as a time_t) before the callback is triggered, and a priority (of type int).

The rtactionlist priority queue is managed in such a way that the first heap entries that will be returned by get_next_rtaction() are those that are scheduled earliest, and among all those which are scheduled at the same date, those that have the highest priority.
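
This ordering boils down to a comparison function of the following kind. The sketch uses a hypothetical demo_sched_t record, and it assumes that a larger integer means a higher priority, which the text above does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical scheduling record: a date and a priority. */
typedef struct {
  struct timeval date;
  int pri;
} demo_sched_t;

/* Return nonzero if a must come out of the queue before b:
   earliest date first, and for equal dates, highest priority first
   (assumption: larger pri means higher priority). */
int demo_sched_before(const demo_sched_t *a, const demo_sched_t *b)
{
  assert(a != NULL && b != NULL);
  if (a->date.tv_sec != b->date.tv_sec)
    return a->date.tv_sec < b->date.tv_sec;
  if (a->date.tv_usec != b->date.tv_usec)
    return a->date.tv_usec < b->date.tv_usec;
  return a->pri > b->pri;
}
```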

But what does Orchids do next?

We have seen how Orchids was able to read data from several sources, sockets, or polled files, and how the rtactionlist priority queue could be used to schedule tasks to be done later, or even immediately.

However, what does it do next?  In the example of mod_textfile, I said that the rtaction_read_files() callback did three things:

  • read in as much new text as was appended to the file and put it in some internal buffers
  • try to read one complete line and package it as an Orchids event
  • re-register itself to read the next line, or read some more text in, as needed, later.

But I only commented on the last one.  The first point consists in handling a partially filled buffer of text, possibly containing characters from an incomplete line already present in the buffer, and advancing pointers as we progress in finding lines.
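
To give an idea of what this buffer handling involves, here is a small sketch of a line buffer that accepts text in arbitrary chunks and hands out complete lines one at a time. It is a simplified illustration with a fixed-size buffer and hypothetical demo_* names, not the code of mod_textfile.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
  char data[256];
  size_t start;   /* index of the first byte not yet handed out */
  size_t end;     /* one past the last valid byte */
} demo_linebuf_t;

/* Append up to n bytes of new text.  Unread bytes (possibly the beginning
   of an incomplete line) are first moved to the front to make room.
   Returns the number of bytes actually stored. */
size_t demo_linebuf_append(demo_linebuf_t *b, const char *src, size_t n)
{
  size_t room;

  assert(b != NULL && src != NULL);
  if (b->start > 0) {
    memmove(b->data, b->data + b->start, b->end - b->start);
    b->end -= b->start;
    b->start = 0;
  }
  room = sizeof b->data - b->end;
  if (n > room)
    n = room;
  memcpy(b->data + b->end, src, n);
  b->end += n;
  return n;
}

/* If a complete line (terminated by '\n') is available, return a pointer
   to it, store its length without the newline in *len, and advance past
   it.  Otherwise return NULL and leave the buffer untouched.
   The returned text is not NUL-terminated. */
const char *demo_linebuf_next(demo_linebuf_t *b, size_t *len)
{
  const char *line;
  char *nl;

  assert(b != NULL && len != NULL);
  line = b->data + b->start;
  nl = memchr(line, '\n', b->end - b->start);
  if (nl == NULL)
    return NULL;
  *len = (size_t) (nl - line);
  b->start += *len + 1;
  return line;
}
```

A NULL result from demo_linebuf_next() corresponds to the situation where the callback needs to read more text in before it can produce another event.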

The second point is more mysterious.  I’ll talk about it in another post.  Roughly, Orchids packages the line just read as a record, and feeds it to the event injector.  The latter will try to dissect the line, repeatedly, using several modules if necessary.  When it reaches a point where no further dissection can be done, it launches the Orchids engine, creating and advancing threads, executing bytecode, in the hope of finding a match.