String::Random::Regexp::regxstring - Generate random strings from a
    regular expression


    Version 1.04


    This module generates random strings from a regular expression by
    bridging, via XS, to the regxstring C++ library by daidodo.

        use String::Random::Regexp::regxstring;
        my $strings = generate_random_strings(
        # generates 3 random strings based on the regexp
        #   3F3YR2W22947580xxx
        #   N5HHM8LW0K59719xxx957
        #   G2DQL6JF1E91086xxx
        # or provide it with a Regexp object
        my $strings = generate_random_strings(
        # or enable debug
        my $strings = generate_random_strings(


      * generate_random_strings : generates random strings. This sub is
      exported by default.



        my $strings = generate_random_strings($regexp [, $N, $debug])


      * $regexp : a regular expression either as a string or as a Regexp
      object created via e.g. qr/.../

      * $N : optionally specify the number of random strings to generate.
      Default is 1.

      * $debug : optionally set to 1 to enable debug output. It is turned
      off by default.

    Given a regular expression, this subroutine will generate $N random
    strings which are guaranteed to be matched by the specified regular
    expression.

    The generated random strings are returned as an ARRAY ref.

    undef is returned on error, e.g. when no regular expression was
    specified or when the number of random strings to generate is not
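
    The return contract described above (an ARRAY ref on success, undef on
    error, every string matching the regular expression) can be checked
    with a small helper. This is a minimal sketch: looks_valid is a
    hypothetical name and not part of the module, and hand-written sample
    strings stand in for the module's output so that the code runs without
    the module installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): checks a value returned by
# generate_random_strings() against the documented contract.
# Returns 1 if $strings is an ARRAY ref whose elements all match $regexp,
# and 0 otherwise (including the undef-on-error case).
sub looks_valid {
    my ($regexp, $strings) = @_;
    return 0 unless defined $strings;            # undef signals an error
    return 0 unless ref($strings) eq 'ARRAY';    # results come as an ARRAY ref
    for my $s (@$strings) {
        return 0 unless $s =~ $regexp;
    }
    return 1;
}

# With the module installed, the call would look like this:
#   use String::Random::Regexp::regxstring;
#   my $strings = generate_random_strings(qr/^[A-C]\d{3}$/, 5);
# Here hand-written sample strings stand in for the module's output.
my $sample = ['A123', 'C007'];
print looks_valid(qr/^[A-C]\d{3}$/, $sample) ? "ok\n" : "not ok\n";
print looks_valid(qr/^[A-C]\d{3}$/, undef)   ? "ok\n" : "not ok\n";
```

    The first call prints "ok" and the second prints "not ok", since undef
    is treated as the error case.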

THE C++ LIBRARY regxstring by daidodo

    regxstring is a C++ library by daidodo which produces random strings
    from a regular expression. According to the author, "... most Perl 5
    supported regular expressions are also supported by regxstring, as
    showing bellow:"

        Meta-character(s)   Description
        \                   Quote the next meta-character
        ^                   Match the beginning of the line
        $                   Match the end of the line (or before newline at the end)
        ?                   Match 1 or 0 times
        +                   Match 1 or more times
        *                   Match 0 or more times
        {n}                 Match exactly n times
        {n,}                Match at least n times
        {n,m}               Match at least n but not more than m times
        .                   Match any character (except newline)
        (pattern)           Grouping
        (?:pattern)         This is for clustering, not capturing; it groups sub-expressions like "()", but doesn't make back-references as "()" does
        (?=pattern)         A zero-width positive look-ahead assertion, e.g., \w+(?=\t) matches a word followed by a tab, without including the tab
        (?!pattern)         A zero-width negative look-ahead assertion, e.g., foo(?!bar) matches any occurrence of "foo" that isn't followed by "bar"
        |                   Alternation
        [xyz]               Matches a single character that is contained within the brackets
        [^xyz]              Matches a single character that is not contained within the brackets
        [a-z]               Matches a single character that is in a given range
        [^a-z]              Matches a single character that is not in a given range
        \f                  Form feed
        \n                  Newline
        \r                  Return
        \t                  Tab
        \v                  Vertical white space
        \d                  Digits, [0-9]
        \D                  Non-digits, [^0-9]
        \s                  Space and tab, [ \t\r\n\f]
        \S                  Non-white space characters, [^ \t\r\n\f]
        \w                  Alphanumeric characters plus '_', [0-9a-zA-Z_]
        \W                  Non-word characters, [^0-9a-zA-Z_]
        \N                  Matches what the Nth marked sub-expression matched, where N is a digit from 1 to 9

    The library provides an executable which may be run from the command
    line. It takes a regular expression from the standard input and dumps
    the random strings.


    There are at least two alternative modules on CPAN which I have tested:
    String::Random and Regexp::Genex. Both fail with rudimentary regular
    expressions.

    The former does not support groups and therefore all parentheses have
    to be removed from the regular expression first. But this is not a
    trivial task. For example:

      use String::Random qw/random_regex/;
      print random_regex('[A-HN-SW]\d{7}[A-J]ES[A-HN-SW]\d{7}[A-J](?:xx)?');
      # '(' not implemented.  treating literally.

    The latter fails randomly on large regular expressions, e.g.
    [A-HN-SW]\d{7}[A-J]xxx but succeeds with the shorter


    Andreas Hadjiprocopis, <bliako at>




    The XS function for generating random strings accepts the input regular
    expression as a string. This means that if a Regexp object is supplied
    to generate_random_strings, the regular expression must first be
    extracted as a string. This is done by stringifying the Regexp object,
    e.g. my $str = "".qr/abc/. However, stringification encloses the
    regular expression within (?^: and ). For example:

      print "".qr/^(abc)/
      # prints (?^:^(abc))

    Currently, the subroutine will remove this "enclosure". It remains to
    be seen whether this is 100% successful.
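
    As an illustration of removing the enclosure, the sketch below uses
    regexp_pattern from the core re module, which returns the pattern and
    its flags separately. This is not necessarily how the module does it
    internally; bare_pattern is a hypothetical helper name introduced only
    for this example.

```perl
use strict;
use warnings;
use re qw(regexp_pattern);    # core module

# Hypothetical helper: returns the bare pattern of a Regexp object, without
# the (?^...: ) enclosure added by stringification.
# Plain strings are returned unchanged.
sub bare_pattern {
    my ($regexp) = @_;
    return $regexp unless ref($regexp) eq 'Regexp';
    # In list context regexp_pattern() returns (pattern, flags).
    my ($pattern, $flags) = regexp_pattern($regexp);
    return $pattern;
}

print "" . qr/^(abc)/, "\n";             # stringified form, with the enclosure
print bare_pattern(qr/^(abc)/), "\n";    # ^(abc)
print bare_pattern('^(abc)'), "\n";      # ^(abc)
```

    Note that this sketch discards any flags (such as /i) attached to the
    Regexp object, so it only suits patterns that do not depend on them.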

    I have not tested the statistical distribution of the results for
    regular expressions like a|b|c|d. Ideally, each alternative should
    appear equally often.
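
    A rough frequency check for such a case could be sketched as follows.
    The tally helper is hypothetical and not part of the module, and Perl's
    own rand() stands in for the generator so that the code runs without the
    module installed; it says nothing about the actual distribution produced
    by regxstring.

```perl
use strict;
use warnings;

# Hypothetical helper: counts how often each distinct string occurs.
sub tally {
    my ($strings) = @_;
    my %count;
    $count{$_}++ for @$strings;
    return \%count;
}

# With the module installed, the sample would come from e.g.
#   my $strings = generate_random_strings('a|b|c|d', 10000);
# Here Perl's own rand() stands in for the generator.
my @alternatives = qw(a b c d);
my $n            = 10000;
my $strings = [ map { $alternatives[ int rand @alternatives ] } 1 .. $n ];

# Print the observed relative frequency of each alternative.
my $count = tally($strings);
for my $alt (@alternatives) {
    printf "%s: %.3f\n", $alt, ($count->{$alt} // 0) / $n;
}
```

    For a uniform generator each relative frequency should be close to
    0.25; a large, persistent deviation would suggest a bias.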


    See also Mock::Data::Regex, which is implemented in pure Perl.


    The core functionality of this module is provided by regxstring, a C++
    library for generating random strings from regular expressions. The
    author is DoZerg / daidodo. The licence is Apache v2.0.

    The source code of this library is included in the current module.

    I have provided C++ harness code, the XS interface and the Perl module.


    This software (except the C++ files) is Copyright (c) 2024 by Andreas
    Hadjiprocopis.

    This is free software, licensed under:

      The Artistic License 2.0 (GPL Compatible)

    The C++ files are Copyright (c) by daidodo and are licensed under
    Apache v2.0.