The verifier runs server-side after a human submits, before the task lands in
COMPLETED. It does two things in one LLM call:
- Quality check — was the response acceptable per your instructions?
- NL parsing (optional) — if the human replied in free text (Slack thread, email body), extract structured data into the response schema.
If the check fails, the submission is REJECTED (non-terminal), the reason is shown to the reviewer, and they get to retry. After max_attempts, it’s VERIFICATION_EXHAUSTED (terminal) and the agent gets a typed error.
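For orientation, here are those verdict states as a plain Python sketch. The state names mirror this page, but the enum itself is purely illustrative; the SDK does not necessarily expose one.

```python
from enum import Enum

# Illustrative only: the names come from the docs, but this enum is not the SDK's API.
class TaskState(Enum):
    COMPLETED = "completed"                   # verifier accepted the submission; task is done
    REJECTED = "rejected"                     # non-terminal: reviewer sees the reason and may retry
    VERIFICATION_EXHAUSTED = "exhausted"      # terminal: max_attempts rejections; agent gets a typed error

TERMINAL_STATES = {TaskState.COMPLETED, TaskState.VERIFICATION_EXHAUSTED}
```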
BYOK — bring your own key
Each provider needs two things on the server: the SDK extra installed, and the API key exported as an env var.

| Provider | Install | Required env var | Default model |
|---|---|---|---|
| claude | pip install "awaithumans[verifier-claude]" | ANTHROPIC_API_KEY | claude-sonnet-4-5 |
| openai | pip install "awaithumans[verifier-openai]" | OPENAI_API_KEY | gpt-4o-2024-11-20 |
| gemini | pip install "awaithumans[verifier-gemini]" | GEMINI_API_KEY | gemini-2.0-flash |
| azure | pip install "awaithumans[verifier-azure]" | AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT | (operator-supplied deployment name) |
Override the env var name with api_key_env= if you’d rather not use the default — e.g. claude_verifier(api_key_env="ACME_CLAUDE_KEY").
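A minimal sketch of the override; claude_verifier and api_key_env= are from this page, while the import path is an assumption:

```python
# Sketch: the import path is an assumption; claude_verifier and api_key_env= are documented above.
from awaithumans import claude_verifier

default_cfg = claude_verifier()                              # server reads ANTHROPIC_API_KEY
custom_cfg = claude_verifier(api_key_env="ACME_CLAUDE_KEY")  # server reads ACME_CLAUDE_KEY instead
```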
If the key isn’t set, the server returns HTTP 500 VERIFIER_API_KEY_MISSING with a docs link to the troubleshooting page. Set the key, restart the server, and the human can resubmit the failed task — the verifier didn’t burn an attempt.
Quickstart
claude_verifier(...) doesn’t take an API key — it just declares the config. The agent ships that config to the server, and the server reads ANTHROPIC_API_KEY from its own env.
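A minimal quickstart sketch. claude_verifier and await_human(verifier=...) are named on this page; the import path, the instructions= and max_attempts= keywords, and the prompt argument are assumptions:

```python
# Sketch only. claude_verifier and await_human(verifier=...) come from the docs;
# the import path, instructions=, max_attempts=, and the prompt argument are assumptions.
from awaithumans import await_human, claude_verifier

verifier = claude_verifier(
    instructions="Approve only if a refund amount and an order ID are both present.",
    max_attempts=3,  # default per the docs
)

# No API key in the config: the server reads ANTHROPIC_API_KEY from its own environment.
result = await_human(
    "Please review this refund request.",
    verifier=verifier,
)
print(result)
```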
Other providers
Every provider produces the same thing: a VerifierConfig you pass to await_human(verifier=...).
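A sketch of the swap, with the caveat that only claude_verifier is named on this page; parallel helpers for the other providers are assumptions:

```python
from awaithumans import await_human, claude_verifier   # documented on this page
# from awaithumans import gemini_verifier              # assumed: a parallel helper per provider

# Whichever provider you pick, the result is a VerifierConfig handed to await_human(verifier=...).
result = await_human("Please review this refund request.", verifier=claude_verifier())
```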
NL parsing
When the human replies in free text (Slack thread reply, email body), the verifier extracts structured data into the response schema. No code changes — the server detects raw text vs. structured form and runs the verifier in NL-parsing mode. The operator’s instructions should anticipate the likely free-text replies: e.g. an approval in a Slack thread gets parsed into {approved: true, notes: "looks legit"}.
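A sketch of instructions written with free-text replies in mind. The instructions= keyword and the exact schema wording are assumptions; the {approved, notes} fields and the NL-parsing behaviour are from this page:

```python
# Sketch: instructions= and the schema wording are assumptions; NL-parsing behaviour is documented above.
from awaithumans import claude_verifier

verifier = claude_verifier(
    instructions=(
        "The response schema is {approved: bool, notes: str}. "
        "Reviewers often reply in a Slack thread with plain text such as 'looks legit' or 'no, wrong amount'. "
        "Treat clear approval language as approved=true and put the reviewer's own words in notes. "
        "Reject if the reply is ambiguous about approval."
    ),
)
```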
redact_payload
If the task’s redact_payload=True, the verifier is skipped entirely. The operator marked the payload sensitive; we don’t ship it to a third-party LLM regardless of verifier config.
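A short sketch of the interaction, assuming redact_payload is passed as a keyword on the await_human call (this page only says it is set on the task):

```python
# Sketch: redact_payload as a keyword argument is an assumption; the skip behaviour is documented above.
from awaithumans import await_human, claude_verifier

result = await_human(
    "Review this customer's bank statement.",
    redact_payload=True,            # payload marked sensitive by the operator
    verifier=claude_verifier(),     # configured, but skipped: the payload never reaches a third-party LLM
)
```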
Error handling
Provider failures (vendor outage, missing API key, network blip) propagate as typed errors WITHOUT consuming a max_attempts slot. The human gets a fresh shot once the operator fixes the config.
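A sketch of what the agent might catch. The exception class names are assumptions; the VERIFIER_API_KEY_MISSING code and the no-attempt-consumed behaviour are from this page:

```python
# Sketch: the exception classes and their module path are assumptions;
# the error codes and retry semantics are from the docs.
from awaithumans import await_human, claude_verifier
from awaithumans.errors import VerifierConfigError, VerificationExhausted  # hypothetical names

try:
    result = await_human("Please review this refund request.", verifier=claude_verifier())
except VerifierConfigError as err:
    # e.g. VERIFIER_API_KEY_MISSING: fix the server env and resubmit; no max_attempts slot was consumed.
    print("operator action needed:", err)
except VerificationExhausted:
    # Terminal: the reviewer was rejected max_attempts times.
    print("verification exhausted")
```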
Best practices
- Keep instructions narrow. “Reject if X” / “Require Y when Z” — not “be a good reviewer.” Narrow rules are easier to debug when the verifier rejects something you didn’t expect (see the sketch after this list).
- Set max_attempts low. The default is 3. For high-stakes decisions, lower it (to 1) so a single rejection fails closed rather than letting the human grind through retries.
- Test the prompt. Run a few representative submissions through it before relying on the verifier in production. The dashboard’s audit trail records every verifier verdict so you can inspect them.
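The sketch referenced above: the same policy written as a narrow rule and as a vague one. The strings are illustrative only:

```python
# Narrow, rule-shaped instructions are easy to trace when a rejection surprises you.
narrow = (
    "Reject if the refund amount exceeds $500. "
    "Require a ticket ID when the reason is 'fraud'."
)

# Vague instructions make every rejection a mystery.
vague = "Be a good reviewer and use your judgement."
```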
Cost
The verifier fires on every submission. At ~1k input tokens + ~100 output per call:

- Claude Sonnet: ~$0.005 / call
- GPT-4o: ~$0.01 / call
- Gemini 2.0 Flash: ~$0.0005 / call
For high-volume workloads, the cheapest option is gemini-2.0-flash.
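To turn the per-call figures into a budget, a back-of-the-envelope sketch using the approximate prices quoted above (not live rates):

```python
# Rough monthly cost per provider, using the approximate per-call figures quoted above.
per_call_usd = {
    "claude-sonnet": 0.005,
    "gpt-4o": 0.01,
    "gemini-2.0-flash": 0.0005,
}

submissions_per_day = 200  # example volume; every submission triggers one verifier call

for model, cost in per_call_usd.items():
    monthly = cost * submissions_per_day * 30
    print(f"{model}: ~${monthly:.2f}/month")
```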
Runnable examples
| Example | Language | Walks through |
|---|---|---|
| examples/verifier-py/ | Python | All three verifier paths: pass, reject + retry, exhaust |
| examples/verifier/ | TypeScript | Same flow, TS SDK |
Export ANTHROPIC_API_KEY, run python refund.py (or npx tsx), and walk the dashboard through each path. A useful template for fixture-driven tests of your own verifier prompts.
Where to next
- Testing — patterns for testing verifier paths in CI
- Idempotency — what happens to in-flight tasks across verifier retries
- Troubleshooting — every verifier error code with the matching fix