Regular expressions module
In header <scn/regex.h>
scnlib doesn't do the regex processing itself, but delegates that task to an external regex engine. This behavior is controlled by the CMake option SCN_REGEX_BACKEND, which defaults to std (use std::). Other possible options are Boost and re2.
The exact feature set, syntax, and semantics of the regular expressions may differ between the different backends. See the documentation for each to learn more about the supported syntax. In general:
- std is available without external dependencies, but doesn't support named captures, has limited support for flags, and is slow.
- Boost has the largest feature set, but is slow.
- re2 is fast, but doesn't support all regex features.
Feature | std | Boost | re2 |
---|---|---|---|
Named captures | No | Yes | Yes |
Wide strings (wchar_t) as input | Yes | Yes | No |
Unicode character classes (e.g. \pL) | No | Yes-ish [1] | Yes |
Character classes (like this: [[:alpha:]]) match non-ASCII | No | Depends [2] | No |
[1][2]: The behavior of Boost.Regex depends on whether it is using ICU. If it is, character classes like \pL and [[:alpha:]] can match non-ASCII characters. Otherwise, only ASCII characters are matched.
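The "Named captures: No" entry for the std backend can be checked against the standard library directly. The following is a minimal sketch, assuming the std backend is built on the C++ standard <regex> header; it calls the standard library rather than scnlib, and the helper names std_regex_accepts and first_group are hypothetical.

```cpp
#include <regex>
#include <string>

// Returns true if std::regex accepts the pattern, false if construction
// throws std::regex_error (unsupported or malformed syntax).
bool std_regex_accepts(const std::string& pattern) {
    try {
        std::regex re(pattern);  // default grammar: ECMAScript
        (void)re;
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

// Fallback sketch: read a plain numbered group where a named group
// would otherwise be used. Returns an empty string if nothing matched.
std::string first_group(const std::string& input, const std::string& pattern) {
    std::smatch m;
    if (std::regex_search(input, m, std::regex(pattern)) && m.size() > 1) {
        return m[1].str();
    }
    return {};
}
```

With this engine, a pattern such as (?<year>[0-9]+) is rejected when the regex is constructed, while the numbered form ([0-9]+) is accepted and its capture can be read by index.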
To do regex matching, the scanned type must either be a string (std:: or std::), or scn::. Due to limitations of the underlying regex engines, the source must be contiguous.
Flag | Description | Support |
---|---|---|
/m | multiline: ^ matches the beginning of a line, and $ the end of a line. | Supported by |
/s | singleline: . matches a newline. | Supported by |
/i | icase: Matches are case-insensitive. | Supported by everyone: std, Boost, and re2. |
/n | nosubs: Subexpressions aren't matched and stored separately. | Supported by everyone: std, Boost, and re2. |
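The /i and /n flags appear to correspond to the icase and nosubs options of the C++ standard <regex> header. The following is a minimal sketch of what those two options do at the engine level, assuming the std backend is built on that header; it uses the standard library directly rather than scnlib, and the helper names matches_icase and capture_groups are hypothetical.

```cpp
#include <regex>
#include <string>

// icase: the whole input must match the pattern, ignoring letter case.
bool matches_icase(const std::string& input, const std::string& pattern) {
    const std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    return std::regex_match(input, re);
}

// nosubs: parenthesized groups are treated as non-capturing, so the
// compiled pattern reports zero marked subexpressions.
unsigned capture_groups(const std::string& pattern, bool nosubs) {
    std::regex::flag_type flags = std::regex::ECMAScript;
    if (nosubs) {
        flags = flags | std::regex::nosubs;
    }
    return std::regex(pattern, flags).mark_count();
}
```

Here matches_icase("HeLLo", "hello") succeeds, and capture_groups("(a)(b)", true) reports no groups, whereas the same pattern without nosubs reports two.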
Classes
- template <typename CharT> class scn::basic_regex_match
- template <typename CharT> class scn::basic_regex_matches