Examples¶

These are self-contained examples showing validatedata solving real problems. Each one reflects a workflow you might actually have — no boilerplate, no contrived data.

Fast validation with validator()¶

When you only need a boolean pass/fail result (no error messages), use validator(). It compiles a rule into a callable that returns True or False with minimal overhead. It's faster than Pydantic v2 and msgspec on invalid data dicts.

The advantage on invalid data comes from early-exit optimisations: validator() can stop at the first failing check, whereas libraries that report errors often have to validate every field and collect the messages.
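
To make the early-exit point concrete, here is a plain-Python sketch contrasting a boolean check that stops at the first failing field with an error-collecting check that must visit every field. It is not validatedata's implementation; the CHECKS table and both helper functions are hypothetical stand-ins for compiled rules.

```python
from typing import Any, Callable, Dict, List

Check = Callable[[Any], bool]

# Hypothetical per-field checks, standing in for compiled rules.
CHECKS: Dict[str, Check] = {
    "username": lambda v: isinstance(v, str) and 3 <= len(v) <= 32,
    "email": lambda v: isinstance(v, str) and "@" in v,
    "age": lambda v: isinstance(v, int) and v >= 18,
}


def is_valid(data: Dict[str, Any], calls: List[str]) -> bool:
    """Boolean-only validation: stop at the first failing field."""
    for field, check in CHECKS.items():
        calls.append(field)  # record which fields were actually checked
        if not check(data.get(field)):
            return False  # early exit, later fields are never visited
    return True


def collect_errors(data: Dict[str, Any], calls: List[str]) -> List[str]:
    """Error-reporting validation: every field is checked."""
    errors = []
    for field, check in CHECKS.items():
        calls.append(field)
        if not check(data.get(field)):
            errors.append(f"{field}: invalid value")
    return errors


bad = {"username": "a", "email": "nope", "age": 15}
fast_calls: List[str] = []
full_calls: List[str] = []
ok = is_valid(bad, fast_calls)            # checks only 'username'
errors = collect_errors(bad, full_calls)  # checks all three fields
```

On this input the boolean path does one check while the reporting path does three, which is the kind of gap the paragraph above describes.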

from validatedata import validator

# Single value – pipe syntax
is_valid_username = validator('str|min:3|max:32')
is_valid_username('alice')      # True
is_valid_username('a')          # False

# Multiple fields – flat dict rule
validate_user = validator({
    'username': 'str|min:3|max:32',
    'email':    'email',
    'age':      'int|min:18'
})

validate_user({'username': 'bob', 'email': 'bob@example.com', 'age': 25})   # True
validate_user({'username': 'bob', 'email': 'bob@example.com', 'age': 15})   # False

# Parameterized containers
is_str_list = validator('list[str]')
is_str_list(['a', 'b', 'c'])    # True
is_str_list(['a', 1, 'c'])      # False

is_str_or_int_list = validator('list[str,int]')
is_str_or_int_list(['a', 1, 'c'])   # True

User registration¶

A typical sign-up form: username, email, password with strength requirements, and an optional phone number.

from validatedata import validate_data

rule = {
    'username': 'str|strip|min:3|max:32|re:^[\\w.-]+$|msg:username must be 3–32 characters: letters, digits, underscores, dots, or hyphens only',
    'email':    'email|msg:please enter a valid email address',
    'password': 'str|min:8|re:(?=.*[A-Z])(?=.*\\d).+|msg:password must be at least 8 characters with one uppercase letter and one digit',
    'phone':    'phone|nullable',
}

result = validate_data(
    data={
        'username': 'alice_99',
        'email':    'alice@example.com',
        'password': 'Secure123',
        'phone':    None,
    },
    rule=rule,
)

if result.ok:
    print('registration accepted')
else:
    # errors are grouped per field — easy to map back to form inputs
    for group in result.errors:
        if group:
            print(group[0])

The phone field is nullable so submitting the form without it passes. Everything else is required and validated in a single call.
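
If you want to sanity-check the two regular expressions from the rule before wiring them into a form, the standard re module is enough. The sketch below is an approximation under stated assumptions: it applies the patterns with re.match, while how validatedata applies re: internally is not described here, and the helper names username_ok and password_ok are hypothetical.

```python
import re

# Same patterns as in the rule above, written as raw strings.
USERNAME_RE = re.compile(r"^[\w.-]+$")
PASSWORD_RE = re.compile(r"(?=.*[A-Z])(?=.*\d).+")


def username_ok(value: str) -> bool:
    """Mimic 'str|strip|min:3|max:32|re:...' with plain Python."""
    value = value.strip()
    return 3 <= len(value) <= 32 and USERNAME_RE.match(value) is not None


def password_ok(value: str) -> bool:
    """Mimic 'str|min:8|re:...': length plus one uppercase letter and one digit."""
    return len(value) >= 8 and PASSWORD_RE.match(value) is not None


checks = {
    "alice_99": username_ok("alice_99"),        # underscore is matched by \w
    "alice smith": username_ok("alice smith"),  # space is rejected
    "Secure123": password_ok("Secure123"),
    "secure123": password_ok("secure123"),      # no uppercase letter
}
```

Note that \w matches underscores as well as letters and digits, which is why alice_99 is accepted.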

Flask route with the decorator¶

Validate incoming JSON before the rest of your route body runs. Call validate_data at the top of the route; on failure, return result.errors to the client with a 422 status.

from flask import Flask, request, jsonify
from validatedata import validate, validate_data, ValidationError

app = Flask(__name__)

signup_rule = {
    'username': 'str|strip|min:3|max:32',
    'email':    'email',
    'password': 'str|min:8|re:(?=.*[A-Z])(?=.*\\d).+',
}

@app.route('/signup', methods=['POST'])
def signup():
    body = request.get_json()

    result = validate_data(body, signup_rule)
    if not result.ok:
        return jsonify({'errors': result.errors}), 422

    # body is clean — proceed
    user = create_user(body['username'], body['email'], body['password'])
    return jsonify({'id': user.id}), 201

Or register a Flask error handler and use raise_exceptions=True to keep the route body completely free of validation logic:

from validatedata import ValidationError

@app.errorhandler(ValidationError)
def handle_validation_error(e):
    return jsonify({'errors': str(e)}), 422

@app.route('/signup', methods=['POST'])
@validate(signup_rule, raise_exceptions=True)
def signup(username, email, password):
    user = create_user(username, email, password)
    return jsonify({'id': user.id}), 201

Application config file¶

Validate a config dict loaded from YAML, TOML, or environment variables before your app starts. Mirror-structure rules match the shape of the config exactly — no structural boilerplate required.

import yaml
from validatedata import validate_data

with open('config.yaml') as f:
    config = yaml.safe_load(f)

# config.yaml looks like:
#
# app:
#   name: MyService
#   version: 1.4.0
#   debug: false
#
# database:
#   host: 127.0.0.1
#   port: 5432
#   name: mydb
#
# server:
#   host: 0.0.0.0
#   port: 8080

rule = {
    'app': {
        'name':    'str|min:1',
        'version': 'semver',
        'debug':   'bool',
    },
    'database': {
        'host': 'ip',
        'port': 'int|between:1,65535',
        'name': 'str|min:1',
    },
    'server': {
        'host': 'ip',
        'port': 'int|between:1024,65535',
    },
}

result = validate_data(data=config, rule=rule, mutate=True)

if not result.ok:
    for error in result.errors:
        print(f'Config error: {error}')
    raise SystemExit('Invalid configuration — aborting startup')

# result.data is a dict with the same shape as config —
# use it directly so any transforms (e.g. strip on string fields) are applied
app_config = result.data

Bad config fails loudly at startup with a clear field path (e.g. database.port: invalid integer) rather than surfacing as a cryptic runtime error later. Passing mutate=True means result.data gives you back the validated — and optionally transformed — config in exactly the same nested structure, ready to use without re-reading the original dict.
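
When the config comes from environment variables rather than a file, every value arrives as a string, so the flat variables need to be reshaped and converted before a mirror-structure rule can be applied. The following is a minimal sketch using only the standard library; the variable names in ENV_LAYOUT and the config_from_env helper are hypothetical and not part of validatedata.

```python
from typing import Any, Dict, Mapping

# Hypothetical mapping: environment variable -> (section, key, converter).
ENV_LAYOUT = {
    "APP_NAME": ("app", "name", str),
    "APP_VERSION": ("app", "version", str),
    "APP_DEBUG": ("app", "debug", lambda s: s.strip().lower() in {"1", "true", "yes"}),
    "DATABASE_HOST": ("database", "host", str),
    "DATABASE_PORT": ("database", "port", int),
    "DATABASE_NAME": ("database", "name", str),
    "SERVER_HOST": ("server", "host", str),
    "SERVER_PORT": ("server", "port", int),
}


def config_from_env(environ: Mapping[str, str]) -> Dict[str, Dict[str, Any]]:
    """Build a nested config dict from flat environment variables.

    Values that cannot be converted are kept as raw strings, so the
    validation step can reject them instead of this function crashing.
    """
    config: Dict[str, Dict[str, Any]] = {}
    for name, (section, key, convert) in ENV_LAYOUT.items():
        if name not in environ:
            continue  # missing variables are simply omitted
        raw = environ[name]
        try:
            value = convert(raw)
        except ValueError:
            value = raw
        config.setdefault(section, {})[key] = value
    return config


example_env = {
    "APP_NAME": "MyService",
    "APP_VERSION": "1.4.0",
    "APP_DEBUG": "false",
    "DATABASE_HOST": "127.0.0.1",
    "DATABASE_PORT": "5432",
    "DATABASE_NAME": "mydb",
    "SERVER_HOST": "0.0.0.0",
    "SERVER_PORT": "8080",
}
config = config_from_env(example_env)
```

In a real application you would pass os.environ instead of example_env and then hand the resulting dict to validate_data with the rule shown above.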

Bulk data import (high‑performance)¶

When processing thousands or millions of rows, you often only need to know whether each row is valid or not. Use validator() for maximum throughput.

from validatedata import validator

# Pre‑compile rule once
row_validator = validator([
    'str|strip|min:1|max:128',    # name
    'email',                       # email
    'int|min:0',                   # age
    'str|in:active,inactive',      # status
])

rows = [
    ['Alice',  'alice@example.com',  30, 'active'],
    ['',       'bob@example.com',    25, 'active'],    # blank name
    ['Carol',  'not-an-email',       28, 'active'],    # bad email
    ['Dave',   'dave@example.com',  -1,  'pending'],   # bad age, bad status
]

bad_rows = []

for i, row in enumerate(rows):
    if not row_validator(row):
        # No error messages — just a boolean reject
        bad_rows.append(i + 1)   # row numbers

if bad_rows:
    print(f"Invalid rows: {bad_rows}")
else:
    write_to_database(rows)

If you need error messages (e.g., for a validation report), use validate_data_fast or validate_data instead. For pure pass/fail checks, validator() is the fastest of the three.
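
The loop above rejects the whole batch when any row is bad. If you would rather write the good rows and report the rest, a small partition helper works with any boolean callable. This is a sketch under stated assumptions: partition_rows is a hypothetical helper, and stand_in_validator is a hand-written substitute for the compiled row_validator so the example runs without the library.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Row = Sequence[object]


def partition_rows(
    rows: Iterable[Row], is_valid: Callable[[Row], bool]
) -> Tuple[List[Row], List[int]]:
    """Split rows into valid rows and 1-based numbers of rejected rows."""
    good: List[Row] = []
    bad_numbers: List[int] = []
    for number, row in enumerate(rows, start=1):
        if is_valid(row):
            good.append(row)
        else:
            bad_numbers.append(number)
    return good, bad_numbers


def stand_in_validator(row: Row) -> bool:
    """Hand-written approximation of the row rule, for illustration only."""
    name, email, age, status = row
    return (
        isinstance(name, str) and 1 <= len(name.strip()) <= 128
        and isinstance(email, str) and "@" in email
        and isinstance(age, int) and age >= 0
        and status in {"active", "inactive"}
    )


rows = [
    ["Alice", "alice@example.com", 30, "active"],
    ["", "bob@example.com", 25, "active"],      # blank name
    ["Carol", "not-an-email", 28, "active"],    # bad email
    ["Dave", "dave@example.com", -1, "pending"],  # bad age, bad status
]

good_rows, bad_row_numbers = partition_rows(rows, stand_in_validator)
```

With the library installed, passing row_validator in place of stand_in_validator should give the same split without any other change.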

Bulk data import (with full error messages)¶

When you need to report why a row failed, use validate_data_fast (or validate_data). validate_data_fast still produces error messages and is much faster than validate_data.

from validatedata import validate_data_fast

row_rule = [
    'str|strip|min:1|max:128',    # name
    'email',                       # email
    'int|min:0',                   # age
    'str|in:active,inactive',      # status
]

rows = [
    ['Alice',  'alice@example.com',  30, 'active'],
    ['',       'bob@example.com',    25, 'active'],
    ['Carol',  'not-an-email',       28, 'active'],
    ['Dave',   'dave@example.com',  -1,  'pending'],
]

bad_rows = []

for i, row in enumerate(rows):
    result = validate_data_fast(row, row_rule)
    if not result.ok:
        bad_rows.append({'row': i + 1, 'errors': result.errors})

if bad_rows:
    for entry in bad_rows:
        print(f"Row {entry['row']}: {entry['errors']}")
else:
    write_to_database(rows)

Running through all rows before writing means you can return a full report to the user — not just the first bad row.

Conditional fields on a checkout form¶

Delivery method determines which fields are required. depends_on skips validation on a field entirely when the condition isn’t met.

from collections import OrderedDict
from validatedata import validate_data

rule = {
    'delivery_method': 'str|in:pickup,delivery',
    'address': {
        'type':       'str',
        'range':      (10, 'any'),
        'depends_on': {'field': 'delivery_method', 'value': 'delivery'},
        'message':    'a delivery address is required',
    },
    'promo_code': {
        'type':     'str',
        'length':   8,
        'nullable': True,
        'message':  'promo code must be exactly 8 characters',
    },
}

# pickup — address is skipped, promo code is optional
result = validate_data(
    data=OrderedDict([
        ('delivery_method', 'pickup'),
        ('address',         None),
        ('promo_code',      None),
    ]),
    rule=rule,
)
result.ok  # True

# delivery without address — fails
result = validate_data(
    data=OrderedDict([
        ('delivery_method', 'delivery'),
        ('address',         None),
        ('promo_code',      None),
    ]),
    rule=rule,
)
result.ok     # False
result.errors # [[], ['a delivery address is required', 'a delivery address is required'], []]
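
The condition that depends_on expresses can be restated in a few lines of plain Python. This is only a restatement of the behaviour described above, not the library's internals, and condition_met is a hypothetical helper name.

```python
from typing import Any, Mapping


def condition_met(field_rule: Mapping[str, Any], data: Mapping[str, Any]) -> bool:
    """Return True when the field should be validated at all."""
    dependency = field_rule.get("depends_on")
    if dependency is None:
        return True  # no condition: the field is always validated
    # Validate only when the referenced field holds the expected value.
    return data.get(dependency["field"]) == dependency["value"]


address_rule = {
    "type": "str",
    "depends_on": {"field": "delivery_method", "value": "delivery"},
}

pickup = {"delivery_method": "pickup", "address": None}
delivery = {"delivery_method": "delivery", "address": None}

skip_for_pickup = not condition_met(address_rule, pickup)    # address is skipped
check_for_delivery = condition_met(address_rule, delivery)   # address is validated
```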

Normalising data before saving¶

Use transforms with mutate=True to clean user input in the same pass as validation. The decorated function receives the cleaned values, so no separate sanitisation step is needed.

from validatedata import validate

@validate(
    rule={
        'username': 'str|strip|lower|min:3|max:32',
        'bio':      'str|strip|max:280|nullable',
        'website':  'url|nullable',
    },
    mutate=True,
)
def update_profile(username, bio, website):
    # username is already stripped and lowercased
    # bio is stripped, website is validated
    db.update(username=username, bio=bio, website=website)
    return 'profile updated'

update_profile(
    username='  Alice_99  ',
    bio='  Building things.  ',
    website='https://alice.dev',
)
# saves username='alice_99', bio='Building things.'  — whitespace stripped

Input arrives messy, your function receives it clean. No intermediate variables, no separate call to .strip() or .lower().

The same thing works with validate_data() directly. When the input is a dict, result.data is returned as a dict with the same keys — so you can use the cleaned values straight away without tracking positional order:

from validatedata import validate_data

result = validate_data(
    data={
        'username': '  Alice_99  ',
        'bio':      '  Building things.  ',
        'website':  'https://alice.dev',
    },
    rule={
        'username': 'str|strip|lower|min:3|max:32',
        'bio':      'str|strip|max:280|nullable',
        'website':  'url|nullable',
    },
    mutate=True,
)

if result.ok:
    db.update(**result.data)
    # result.data == {
    #     'username': 'alice_99',
    #     'bio':      'Building things.',
    #     'website':  'https://alice.dev',
    # }

result.data mirrors the shape of the input dict exactly — nested dicts are preserved, so a config-shaped input comes back as a config-shaped output.

Auto‑validation of a module¶

Place the autovalidate() call at the bottom of any module whose functions you want validated automatically, after the functions have been defined.

# myapp/domain.py
from validatedata import autovalidate

def create_order(user_id: int, product_ids: list[str]) -> dict:
    # function body
    return {'order_id': 123}

def calculate_total(items: list[dict], discount: float) -> float:
    # function body
    return 99.95

# Auto‑validate at module load
autovalidate(module=__name__, raise_exceptions=True)

Now every call to these functions checks argument types automatically.
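
For intuition about what call-time checking from annotations involves, here is a deliberately simplified decorator built on inspect and typing. It is not how validatedata implements autovalidate; check_simple_hints is a hypothetical name, it only handles plain classes such as int or str, and raising TypeError is a choice made for this sketch.

```python
import functools
import inspect
from typing import get_origin, get_type_hints


def check_simple_hints(fn):
    """Check arguments against plain-class annotations at call time."""
    hints = get_type_hints(fn)
    signature = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Skip missing hints and parameterized generics such as list[str].
            if expected is None or get_origin(expected) is not None:
                continue
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{fn.__name__}() argument {name!r} expected "
                    f"{expected.__name__}, got {type(value).__name__}"
                )
        return fn(*args, **kwargs)

    return wrapper


@check_simple_hints
def add(x: int, y: int) -> int:
    return x + y


total = add(2, 3)  # passes the check and returns normally
```

Calling add("2", 3) raises TypeError before the function body runs, which is the same general idea as validating at call time.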

Validating a whole package¶

Add this to your package’s __init__.py:

# mypackage/__init__.py
from validatedata import autovalidate_package

# Validate all modules except tests, and also auto‑register any class
# whose name ends with 'Model' as a custom type.
autovalidate_package(
    package=__name__,
    include=["mypackage.*"],
    exclude=["mypackage.tests.*"],
    auto_register_types=True,
    raise_exceptions=False,
)

Now all functions with type hints in your package are validated at call time, and any *Model classes become usable as types in rules.

Using the fast validator with error messages¶

Replace validate_data with validate_data_fast for a speed boost:

from validatedata import validate_data_fast

result = validate_data_fast(
    data={'username': '  alice  ', 'score': 95},
    rule={
        'username': 'str|strip|min:3',
        'score': 'int|min:0|max:100',
    },
    mutate=True,
)

if result.ok:
    print(result.data)   # {'username': 'alice', 'score': 95}
else:
    print(result.errors)

Enforcing type hints with enforce_hints¶

Use enforce_hints=True to catch functions that accidentally lack type annotations.

from validatedata import autovalidate

def no_hints(a, b):   # Missing type hints
    return a + b

def with_hints(x: int, y: int) -> int:
    return x + y

# The following line would raise TypeError because 'no_hints' has no annotations
# autovalidate(enforce_hints=True)

# Instead, skip it explicitly
autovalidate(ignore=["__main__.no_hints"], enforce_hints=True)
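
Before turning on enforce_hints=True in an existing module, it can help to list which functions would trip it. The sketch below uses only inspect and is an approximation: functions_missing_hints is a hypothetical helper, and it treats a function as unannotated if any parameter lacks an annotation, which may differ from the library's own criterion.

```python
import inspect
from typing import Any, Dict, List


def functions_missing_hints(namespace: Dict[str, Any], module_name: str) -> List[str]:
    """Return dotted names of functions with at least one unannotated parameter."""
    missing = []
    for name, obj in namespace.items():
        # Only consider functions defined in the module being inspected.
        if not inspect.isfunction(obj) or obj.__module__ != module_name:
            continue
        params = inspect.signature(obj).parameters.values()
        if any(p.annotation is inspect.Parameter.empty for p in params):
            missing.append(f"{module_name}.{name}")
    return sorted(missing)


def no_hints(a, b):   # missing type hints
    return a + b


def with_hints(x: int, y: int) -> int:
    return x + y


missing = functions_missing_hints(
    {"no_hints": no_hints, "with_hints": with_hints},
    no_hints.__module__,
)
```

The resulting names use the same module.function form as the ignore list above, so they can be reviewed and either annotated or ignored explicitly.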

Using a custom decorator¶

You can replace validate_types with your own decorator, for example to add logging.

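import functools

from validatedata import autovalidate

def log_and_validate(fn):
    # functools.wraps keeps fn's name and docstring on the wrapper
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        print(f"Calling {fn.__name__}")
        # as written, this wrapper only logs; add your own checks here
        return fn(*args, **kwargs)
    return wrapper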

autovalidate(decorator=log_and_validate)

FastModel for structured data¶

When you have a recurring data shape, define a model once and reuse it.

from validatedata import FastModel, Rule

class Product(FastModel):
    name: str = Rule(min=3, max=100)
    price: float = Rule(min=0)
    tags: list[str] = Rule(default=[], init_new=True, max_items=10)

# Create and validate
product = Product(name="Widget", price=19.99, tags=["new", "featured"])

# Serialize to dict
data = product.to_dict()

# Reconstruct from dict (fast path)
product2 = Product.from_dict(data, validate="check")