NAME
    Data::Sah - Fast and featureful data structure validation

VERSION
    This document describes version 0.916 of Data::Sah (from Perl
    distribution Data-Sah), released on 2024-02-16.

SYNOPSIS
    Non-OO interface:

     use Data::Sah qw(
         normalize_schema
         gen_validator
     );

     my $v;

     # generate a validator for schema
     $v = gen_validator(["int*", min=>1, max=>10]);

     # validate your data using the generated validator
     say "valid" if $v->(5);     # valid
     say "valid" if $v->(11);    # invalid
     say "valid" if $v->(undef); # invalid
     say "valid" if $v->("x");   # invalid

     # generate validator which reports error message string
     $v = gen_validator(["int*", min=>1, max=>10],
                        {return_type=>'str_errmsg', lang=>'id_ID'});
     # ditto but the error message will be in Indonesian
     $v = gen_validator(["int*", min=>1, max=>10],
                        {return_type=>'str_errmsg', lang=>'id_ID'});
     say $v->(5);  # ''
     say $v->(12); # 'Data tidak boleh lebih besar dari 10'
                   # (in English: 'Data must not be larger than 10')

     # normalize a schema
     my $nschema = normalize_schema("int*"); # => ["int", {req=>1}, {}]
     normalize_schema(["int*", min=>0]); # => ["int", {min=>0, req=>1}, {}]

    OO interface (more advanced usage):

     use Data::Sah;
     my $sah = Data::Sah->new;

     # get perl compiler
     my $pl = $sah->get_compiler("perl");

     # compile schema into Perl code
     my $cd = $pl->compile(schema => ["int*", min=>0]);
     say $cd->{result};

    will print something like:

     # req #0
     (defined($data))
     &&
     # check type 'int'
     (Scalar::Util::Numeric::isint($data))
     &&
     (# clause: min
     ($data >= 0))

    To see the full validator code (with "sub {}" and all), you can do
    something like:

     % LOG_SAH_VALIDATOR_CODE=1 TRACE=1 perl -MLog::ger::LevelFromEnv -MLog::ger::Output=Screen -MData::Sah=gen_validator -E'gen_validator(["int*", min=>0])'

    which will print log message like:

     normalized schema=['int',{min => 0,req => 1},{}]
     validator code:
        1|do {
        2|    require Scalar::Util::Numeric;
        3|    sub {
        4|        my ($data) = @_;
        5|        my $_sahv_res =
         |
        7|            # req #0
        8|            (defined($data))
         |
       10|            &&
         |
       12|            # check type 'int'
       13|            (Scalar::Util::Numeric::isint($data))
         |
       15|            &&
         |
       17|            (# clause: min
       18|            ($data >= 0));
         |
       20|        return($_sahv_res);
       21|    }}

DESCRIPTION
    This distribution, "Data-Sah", implements compilers for producing Perl
    and JavaScript validators, as well as translatable human description
    text from Sah schemas. Compiler approach is used instead of interpreter
    for faster speed.

    The generated validator code can run without the "Data::Sah::*" modules.

STATUS
    Some features are not implemented yet:

    *   def/subschema

    *   obj: meths, attrs properties

    *   .prio, .err_msg, .ok_err_msg attributes

    *   .result_var attribute

    *   BaseType: more forms of if clause

        Only the basic form of the "if" clause is implemented.

    *   BaseType: postfilters

    *   BaseType: prefilters.temp

    *   BaseType: check, prop, check_prop clauses

    *   HasElems: each_index, check_each_elem, check_each_index, exists
        clauses

    *   HasElems: len, elems, indices properties

    *   hash: check_each_key, check_each_value, allowed_keys_re,
        forbidden_keys_re clauses

    *   array: uniq clauses

    *   human compiler: markdown output

VARIABLES
  $Log_Validator_Code (bool, default: 0)
MODULE ORGANIZATION
    Data::Sah::Type::* roles specify Sah types, e.g. "Data::Sah::Type::bool"
    specifies the bool type. It can also be used to name distributions that
    introduce new types, e.g. "Data-Sah-Type-complex" which introduces
    complex number type.

    Data::Sah::FuncSet::* roles specify bundles of functions, e.g.
    <Data::Sah::FuncSet::Core> specifies the core/standard functions.

    Data::Sah::Compiler::$LANG:: namespace is for compilers. Each compiler
    might further contain "::TH::*" (type handler) and "::FSH::*" (function
    handler) subnamespaces to implement appropriate functionalities, e.g.
    Data::Sah::Compiler::perl::TH::bool is the bool type handler for the
    Perl compiler, Data::Sah::Compiler::perl::FSH::Core is the Core funcset
    handler for Perl compiler.

    Data::Sah::Coerce::$LANG::To_$TARGET_TYPE::From_$SOURCE_TYPE::$DESCRIPTI
    ON contains coercion rules.

    Data::Sah::Filter::$LANG::$TOPIC::$DESCRIPTION contains filtering rules.

    Data::Sah::Value::$LANG::$TOPIC::$DESCRIPTION contains value codes.

    Data::Sah::TypeX::$TYPENAME::$CLAUSENAME namespace can be used to name
    distributions that extend an existing Sah type by introducing a new
    clause for it. See Data::Sah::Manual::Extending for an example.

    Data::Sah::Lang::$LANGCODE namespaces are for modules that contain
    translations. They are further organized according to the organization
    of other Data::Sah modules, e.g. Data::Sah::Lang::en_US::Type::int or
    "Data::Sah::Lang::en_US::TypeX::str::is_palindrome".

    Sah::Schema:: namespace is reserved for modules that contain schemas in
    their $schema package variables. For example, Sah::Schema::posint.

    Sah::Schemas::* are module names for distributions that bundle several
    "Sah::Schema::*" modules. For example Sah::Schemas::Int contains various
    schemas for integers such as Sah::Schema::uint, Sah::Schema::int8, and
    so on.

    Sah::SchemaR:: namespace is reserved to store resolved result of schema.
    For example, Sah::Schema::unix::local_username contains the definition
    for the schema "unix::local_username" which is "unix::username" with
    some additional coerce rules. "unix::username" in turn is defined in
    Sah::Schema::unix::username which is base type "str" with some clauses
    like minimum and maximum length as well as regular expression for valid
    pattern. To find out the base type of a schema (which might be defined
    based on another schema), one has to perform one to several lookups to
    "Sah::Schema::*" modules. A "Sah::SchemaR::*" module, however, contains
    the "resolved" result of the definition, so by looking at
    Sah::SchemaR::unix::local_username one can know that the schema
    eventually is based on the base type "str". See
    Dist::Zilla::Plugin::Sah::Schemas.

    Sah::SchemaV:: namespace is reserved to store generated schema validator
    code. See Dist::Zilla::Plugin::Rinci::GenValidator.

FUNCTIONS
    None exported by default.

  normalize_clset (function)
    Usage:

     # as function
     my $nclset = Data::Sah::normalize_clset($clset[, \%opts]); # => hash

    Convert a clause set to its normalized form, e.g. change
    "{"!match"=>"abc"}" into "{"match"=>"abc", "match.op"=>"not"}". Produce
    a shallow copy of the input clause set hash.

  normalize_schema (function)
    Usage:

     # as function
     my $nschema = normalize_schema($schema); # => ARRAY

    Convert $schema to its normalized form, i.e. the two-element array form.
    See Sah for more information about schema forms. Produces a new copy of
    arrayref as well as clause set hashref, even if the input $schema is
    already in array form. Implemented by Data::Sah::Normalize.

    Can also be used as a method, see "normalize_clset (method)".

  gen_validator (function)
    Usage:

     $code_or_str = gen_validator($schema, \%opts); # => CODE (or STR)

    Generate validator code for $schema (or, if "source" option is set to
    true, a source code string).

    Can also be used as a method, see "gen_validator (method)".

    Known options (unknown options will be passed to Perl schema compiler,
    see Data::Sah::Compiler::perl):

    *   accept_ref => BOOL (default: 0)

        Normally the generated validator accepts data, as in:

         $res = $vdr->($data);
         $res = $vdr->(42);

        If this option is set to true, validator accepts reference to data
        instead, as in:

         $res = $vdr->(\$data);

        This allows $data to be modified by the validator (mainly, to set
        default value specified in schema). For example:

         my $data;
         my $vdr = gen_validator([int => {min=>0, max=>10, default=>5}],
                                 {accept_ref=>1});
         my $res = $vdr->(\$data);
         say $res;  # => 1 (success)
         say $data; # => 5

    *   source => BOOL (default: 0)

        If set to 1, return source code string instead of compiled
        subroutine. Usually only needed for debugging (but see also
        $Log_Validator_Code and "LOG_SAH_VALIDATOR_CODE" if you want to log
        validator source code).

ATTRIBUTES
  compilers => HASH
    A mapping of compiler name and compiler ("Data::Sah::Compiler::*")
    objects.

METHODS
  new
    Usage:

     my $sah = Data::Sah->new;

    Create a new Data::Sah instance.

  get_compiler
    Usage:

     my $comp = $sah->get_compiler($name);

    Get compiler object. "Data::Sah::Compiler::$name" will be loaded first
    and instantiated if not already so. After that, the compiler object is
    cached.

    Example:

     my $plc = $sah->get_compiler("perl"); # loads Data::Sah::Compiler::perl

  normalize_schema (method)
    Usage:

     # as method
     my $nschema = $sah->normalize_schema($schema);

    See "normalize_schema (function)", see "normalize_schema (function)" for
    more details on arguments.

  normalize_clset (method)
    Usage:

     # as method
     my $nclset = $sah->normalize_clset($clset[, \%opts]); # => hash

    Can also be used as function, see "normalize_clset (function)" for more
    details on arguments.

  normalize_var
     my $nvarname = $sah->normalize_var($var);

    Normalize a variable name in expression into its fully
    qualified/absolute form.

    Not yet implemented (pending specification).

    For example:

     [int => {min => 10, 'max=' => '2*$min'}]

    $min in the above expression will be normalized as "schema:clauses.min".

  gen_validator
     # as method
     my $vdr = $sah->gen_validator($schema [ , \%opts ]); # => coderef

     # as function
     my $vdr = gen_validator($schema [ , \%opts ]); # => coderef

    Can also be used as a function, see "gen_validator (function)" for more
    details on arguments.

FAQ
    See also Sah::FAQ.

  Comparison to {JSON::Schema, Data::Rx, Data::FormValidator, ...}?
    See Sah::FAQ.

  Why is it so slow?
    You probably do not reuse the compiled schema, e.g. you continually
    destroy and recreate Data::Sah object, or repeatedly recompile the same
    schema. To gain the benefit of compilation, you need to keep the
    compiled result and use the generated Perl code repeatedly.

  Can I generate another schema dynamically from within the schema?
    For example:

     // if first element is an integer, require the array to contain only integers,
     // otherwise require the array to contain only strings.
     ["array", {"min_len": 1, "of=": "[is_int($_[0]) ? 'int':'str']"}]

    Currently no, Data::Sah does not support expression on clauses that
    contain other schemas. In other words, dynamically generated schemas are
    not supported. To support this, if the generated code needs to run
    independent of Data::Sah, it needs to contain the compiler code itself
    (or an interpreter) to compile or evaluate the generated schema.

    However, an eval_schema() Sah function which uses Data::Sah can be
    trivially declared and target the Perl compiler.

  How to display the validator code being generated?
    Use the "source => 1" option in gen_validator().

    If you use the OO interface, e.g.:

     # generate perl code
     my $cd = $plc->compile(schema=>..., ...);

    then the generated code is in "$cd->{result}" and you can just print it.

    If you generate validator using gen_validator(), you can set environment
    LOG_SAH_VALIDATOR_CODE or package variable $Log_Validator_Code to true
    and the generated code will be logged at trace level using Log::ger. The
    log can be displayed using, e.g., Log::ger::Output::Screen:

     % LOG_SAH_VALIDATOR_CODE=1 TRACE=1 \
       perl -MLog::ger::LevelFromEnv -MLog::ger::Output=Screen \
       -MData::Sah=gen_validator -e '$sub = gen_validator([int => min=>1, max=>10])'

    Sample output:

     normalized schema=['int',{max => 10,min => 1},{}]
     schema already normalized, skipped normalization
     validator code:
        1|do {
        2|    require Scalar::Util::Numeric;
        3|    sub {
        4|        my ($data) = @_;
        5|        my $_sahv_res =
         |
        7|            # skip if undef
        8|            (!defined($data) ? 1 :
         |
       10|            (# check type 'int'
       11|            (Scalar::Util::Numeric::isint($data))
         |
       13|            &&
         |
       15|            (# clause: min
       16|            ($data >= 1))
         |
       18|            &&
         |
       20|            (# clause: max
       21|            ($data <= 10))));
         |
       23|        return($_sahv_res);
       24|    }}

    Lastly, you can also use validate-with-sah CLI utility from the
    App::SahUtils distribution (use the "--show-code" option).

  How to show the validation error message? The validator only returns true/false!
    Pass the "return_type=>"str_errmsg"" to get an error message string on
    error, or "return_type=>"hash_details"" to get a hash of detailed error
    messages. Note also that the error messages are translateable (e.g. use
    "LANG" or "lang=>..." option. For example:

     my $v = gen_validator([int => between => [1,10]], {return_type=>"str_errmsg"});
     say "$_: ", $v->($_) for 1, "x", 12;

    will output:

     1:
     "x": Input is not of type integer
     12: Must be between 1 and 10

  How to show all the error and warning messages?
    If you pass "return_type=>"hash_details"" then the generated validator
    code can return a hashref containing all the errors (in the "errors"
    key) and warnings (in the "warnings" key) instead of just a boolean
    (when "return_type=>"bool_valid"") or a string containing the first
    encountered error message (when "return_type=>"str_errmsg"") .

  How to get the data value with the default filled in, or coercion done?
    If you use "return_type=>"hash_details"", the generated validator code
    will also return the input data after the default is filled in or
    coercion is done in the "value" key of the result hashref. Or, if you do
    not need a validator that checks for all errors/warnings, you can use
    "return_type=>"bool_valid+val"" or "return_type=>"str_errmsg+val"". For
    example:

     my $v = gen_validator(["date", {"x.perl.coerce_to"=>"DateTime"}],
                           {return_type=>"str_errmsg+val"});

     my ($err, $val) = @{ $v->("2016-05-14") };

    The validator will return an error message string (or an empty string if
    validation succeeds) as well as the final value. In the example above,
    $val will contain a DateTime object. This is convenient because the
    final value is what is usually used further after validation process.

  What does the "@..." prefix that is sometimes shown on the error message mean?
    It shows the path to data item that fails the validation, e.g.:

     my $v = gen_validator([array => of => [int=>min=>5], {return_type=>"str_errmsg"});
     say $v->([10, 5, "x"]);

    prints:

     @[2]: Input is not of type integer

    which means that the third element (subscript 2) of the array fails the
    validation. Another example:

     my $v = gen_validator([array => of => [hash=>keys=>{a=>"int"}]]);
     say $v->([{}, {a=>1.1}]);

    prints:

     @[1][a]: Input is not of type integer

    Note that for validator that returns full result hashref
    ("return_type=>"hash_details"") the error messages in the "errors" key
    are also keyed with data path, albeit in a slightly different format
    (i.e. slash-separated, e.g. 2 and "1/a") for easier parsing.

  How to show the process of validation by the compiled code?
    If you are generating Perl code from schema, you can pass "debug=>1"
    option so the code contains logging (Log::ger-based) and other debugging
    information, which you can display. For example:

     % TRACE=1 perl -MLog::ger::LevelFromEnv -MLog::ger::Output=Screen \
       -MData::Sah=gen_validator -E'
       $v = gen_validator([array => of => [hash => {req_keys=>["a"]}]],
                          {return_type=>"str_errmsg", debug=>1});
       say "Validation result: ", $v->([{a=>1}, "x"]);'

    will output:

     ...
     [spath=[]]skip if undef ...
     [spath=[]]check type 'array' ...
     [spath=['of']]clause: {"of":["hash",{"req_keys":["a"]}]} ...
     [spath=['of']]skip if undef ...
     [spath=['of']]check type 'hash' ...
     [spath=['of','req_keys']]clause: {"req_keys":["a"]} ...
     [spath=['of']]skip if undef ...
     [spath=['of']]check type 'hash' ...
     Validation result: [spath=of]@1: Input is not of type hash

  What else can I do with the compiled code?
    Data::Sah offers some options in code generation. Beside compiling the
    validator code into a subroutine, there are also some other options.
    Examples:

    *   Dist::Zilla::Plugin::Rinci::Validate

        This plugin inserts the generated code (without the "sub { ... }"
        wrapper) to validate the content of %args right before "#
        VALIDATE_ARG" or "# VALIDATE_ARGS" like below:

         $SPEC{foo} = {
             args => {
                 arg1 => { schema => ..., req=>1 },
                 arg2 => { schema => ... },
             },
             ...
         };
         sub foo {
             my %args = @_; # VALIDATE_ARGS
         }

        The schemas will be retrieved from the Rinci metadata ($SPEC{foo}
        above). This means, subroutines in your built distribution will do
        argument validation.

    *   Perinci::Sub::Wrapper

        This module is part of the Perinci family. What the module does is
        basically wrap your subroutine with a wrapper code that can include
        validation code (among others). This is a convenient way to add
        argument validation to an existing subroutine/code.

ENVIRONMENT
  LOG_SAH_VALIDATOR_CODE => bool
    If set to true, will log (using Log::ger, at the trace level) the
    validator code being generated. See "SYNOPSIS" or "FAQ" for example on
    how to see this log message.

HOMEPAGE
    Please visit the project's homepage at
    <https://metacpan.org/release/Data-Sah>.

SOURCE
    Source repository is at <https://github.com/perlancar/perl-Data-Sah>.

SEE ALSO
    Data::Sah::Tiny, Params::Sah

   Other interpreted validators
    Params::Validate is very fast, although minimal. Data::Rx, Kwalify,
    Data::Verifier, Data::Validator, JSON::Schema, Validation::Class.

    For Moo/Mouse/Moose stuffs: Moose type system, MooseX::Params::Validate,
    among others.

    Form-oriented: Data::FormValidator, FormValidator::Lite, among others.

   Other compiled validators
    Type::Tiny

    Params::ValidationCompiler

AUTHOR
    perlancar <perlancar@cpan.org>

CONTRIBUTORS
    *   mauke <lukasmai.403@gmail.com>

    *   Michal SedlÃ¡k <sedlakmichal@gmail.com>

    *   Steven Haryanto <stevenharyanto@gmail.com>

    *   Steven Haryanto <steven@masterweb.net>

    *   Szymon NieznaÅski <s.nez@member.fsf.org>

CONTRIBUTING
    To contribute, you can send patches by email/via RT, or send pull
    requests on GitHub.

    Most of the time, you don't need to build the distribution yourself. You
    can simply modify the code, then test via:

     % prove -l

    If you want to build the distribution (e.g. to try to install it locally
    on your system), you can install Dist::Zilla,
    Dist::Zilla::PluginBundle::Author::PERLANCAR,
    Pod::Weaver::PluginBundle::Author::PERLANCAR, and sometimes one or two
    other Dist::Zilla- and/or Pod::Weaver plugins. Any additional steps
    required beyond that are considered a bug and can be reported to me.

COPYRIGHT AND LICENSE
    This software is copyright (c) 2024, 2022, 2021, 2020, 2019, 2018, 2017,
    2016, 2015, 2014, 2013, 2012 by perlancar <perlancar@cpan.org>.

    This is free software; you can redistribute it and/or modify it under
    the same terms as the Perl 5 programming language system itself.

BUGS
    Please report any bugs or feature requests on the bugtracker website
    <https://rt.cpan.org/Public/Dist/Display.html?Name=Data-Sah>

    When submitting a bug or request, please include a test-file or a patch
    to an existing test-file that illustrates the bug or desired feature.