Regular expressions module

In header <scn/regex.h>

scnlib doesn't do the regex processing itself, but delegates that task to an external regex engine. The engine is selected with the CMake option SCN_REGEX_BACKEND, which defaults to std (meaning std::regex). The other possible values are Boost and re2.

The exact feature set, syntax, and semantics of the regular expressions may differ between the different backends. See the documentation for each to learn more about the supported syntax. In general:

  • std is available without external dependencies, but doesn't support named captures, has limited support for flags, and is slow.
  • Boost has the largest feature set, but is slow.
  • re2 is fast, but doesn't support all regex features.
Regex backend feature comparison
Feature | std | Boost | re2
Named captures | No | Yes | Yes
Wide strings (wchar_t) as input | Yes | Yes | No
Unicode character classes (e.g. \pL) | No | Yes-ish [1] | Yes
Character classes (such as [[:alpha:]]) match non-ASCII | No | Depends [2] | No

[1][2]: The behavior of Boost.Regex depends on whether it is using ICU. If it is, character classes like \pL and [[:alpha:]] can match non-ASCII characters. Otherwise, only ASCII characters are matched.
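
The "Named captures" row for the std backend can be checked against std::regex directly, without scnlib. The following is a minimal sketch under that assumption; std_regex_accepts is a hypothetical helper introduced only for illustration and is not part of scnlib.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of scnlib): reports whether std::regex,
// using its default ECMAScript grammar, can compile the given pattern.
inline bool std_regex_accepts(const std::string& pattern) {
    try {
        std::regex re(pattern);  // throws std::regex_error on unsupported syntax
        (void)re;
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

With the common standard library implementations, std_regex_accepts("(?<word>[a-z]+)") returns false, because the named-capture syntax is rejected when the pattern is compiled, while the unnamed equivalent std_regex_accepts("([a-z]+)") returns true.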

To do regex matching, the scanned type must be either a string (std::basic_string or std::basic_string_view) or scn::basic_regex_matches. Due to limitations of the underlying regex engines, the source must be contiguous.

Possible flags for regex scanning
Flag | Description | Support
/m | multiline: ^ matches the beginning of a line, and $ the end of a line. | Supported by Boost and re2. For std, uses std::regex_constants::multiline, which was introduced in C++17, but isn't implemented by MSVC.
/s | singleline: . matches a newline. | Supported by Boost and re2, not by std.
/i | icase: Matches are case-insensitive. | Supported by all backends: std, Boost, and re2.
/n | nosubs: Subexpressions aren't matched and stored separately. | Supported by all backends: std, Boost, and re2.
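
To make the effect of icase and nosubs concrete, here is a small sketch that uses std::regex directly (the engine behind the std backend), not scnlib itself. The helper names matches_ignoring_case and stored_match_count are hypothetical and exist only for this illustration; how scnlib maps its flags onto the engine internally is not shown here.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helpers for illustration; they call std::regex directly.

// icase: the whole input must match the pattern, ignoring letter case.
inline bool matches_ignoring_case(const std::string& input,
                                  const std::string& pattern) {
    const std::regex re(pattern, std::regex_constants::ECMAScript |
                                     std::regex_constants::icase);
    return std::regex_match(input, re);
}

// Returns the number of entries in the match results (the whole match
// plus any stored subexpressions), or 0 if the input does not match.
inline std::size_t stored_match_count(const std::string& input,
                                      const std::string& pattern,
                                      bool nosubs) {
    std::regex_constants::syntax_option_type flags =
        std::regex_constants::ECMAScript;
    if (nosubs) {
        // nosubs: parenthesized groups are not stored separately.
        flags = flags | std::regex_constants::nosubs;
    }
    const std::regex re(pattern, flags);
    std::smatch m;
    if (!std::regex_match(input, m, re)) {
        return 0;
    }
    return m.size();
}
```

For example, matches_ignoring_case("HeLLo", "hello") returns true. For the input "abc123" and the pattern "([a-z]+)([0-9]+)", stored_match_count returns 3 without nosubs (the whole match and two subexpressions) and 1 with nosubs (the whole match only).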

Classes

template <typename CharT>
class scn::basic_regex_match
template <typename CharT>
class scn::basic_regex_matches