The ovm_regex_t type

The ovm_regex_t type is the OrchIDS type of regular expression matchers.

It is defined this way in src/lang.h:

typedef struct ovm_regex_s ovm_regex_t;
struct ovm_regex_s
{
  gc_header_t gc;
  char     *regex_str;
  regex_t   regex;
  int       splits;
};

The regex_str field is, in general, a NUL-terminated C string containing the text of the regular expression itself.  This string must be allocated using Xmalloc(); in particular, it is not a pointer to an ovm_str_t object.

The regex field is meant to contain the compiled regular expression matcher itself, as produced by the regcomp function from the standard regex library.

The splits field is meant to contain the number of subexpressions to match.  These are the subexpressions delimited by parentheses in the regular expression (the ones that can be referred back to using \\1, \\2, …).
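The relationship between the three fields can be illustrated with plain POSIX calls.  The following is a minimal sketch, not the OrchIDS code: struct demo_regex, demo_regex_init() and demo_regex_clear() are hypothetical names, the gc header is left out so that the example compiles outside the OrchIDS tree, malloc() stands in for Xmalloc(), and the use of REG_EXTENDED and of re_nsub to fill splits are assumptions.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ovm_regex_t, without the gc_header_t field. */
struct demo_regex
{
  char     *regex_str;  /* private copy of the pattern text */
  regex_t   regex;      /* compiled matcher, filled in by regcomp() */
  int       splits;     /* number of parenthesized subexpressions */
};

/* Fill the three fields from a pattern.
   Returns 0 on success, or a regcomp() error code on failure
   (REG_ESPACE if the copy of the pattern cannot be allocated). */
static int demo_regex_init(struct demo_regex *r, const char *pattern)
{
  size_t len;
  int err;

  assert(r != NULL && pattern != NULL);

  len = strlen(pattern) + 1;
  r->regex_str = malloc(len);  /* assumption: OrchIDS code would call Xmalloc() */
  if (r->regex_str == NULL)
    return REG_ESPACE;
  memcpy(r->regex_str, pattern, len);

  /* Assumption: extended syntax; the flags OrchIDS really uses may differ. */
  err = regcomp(&r->regex, pattern, REG_EXTENDED);
  if (err != 0)
    {
      free(r->regex_str);
      r->regex_str = NULL;
      return err;
    }

  /* POSIX stores the number of parenthesized subexpressions in re_nsub. */
  r->splits = (int) r->regex.re_nsub;
  return 0;
}

/* Release what demo_regex_init() acquired. */
static void demo_regex_clear(struct demo_regex *r)
{
  regfree(&r->regex);
  free(r->regex_str);
  r->regex_str = NULL;
}
```

With the pattern "([a-z]+)=([0-9]+)", demo_regex_init() leaves splits equal to 2.  A caller that wants every submatch then passes an array of splits+1 regmatch_t entries to regexec(), since entry 0 describes the whole match.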

The type ovm_regex_t is a type of garbage-collectable data. To allocate a new object of type ovm_regex_t, the following low-level function is provided:

ovm_var_t *ovm_regex_new(gc_t *gc_ctx);

This creates a new ovm_regex_t object, with a NULL regex_str field and an uninitialized regex field. Calling res the result, one always has TYPE(res)==T_REGEX.  One can access the regex_str, regex and splits fields using REGEXSTR(res), REGEX(res), and REGEXNUM(res) respectively; each of these can be used both to read the field and to write into it.

The return type of ovm_regex_new() is the universal type ovm_var_t instead of ovm_regex_t, for practical reasons.

An example of how to use ovm_regex_new() is given by the issdl_regex_from_str() function, in src/lang.c, which implements the OrchIDS regex_from_str() primitive.

The result of ovm_regex_new() is created white, and must be gc_touch()ed before storing it into a garbage-collectable object.