Wall 1 — Authorization
Agents can reason. They cannot be trusted to decide alone: consequence, liability, and accountability require a human signature. This wall gets higher as agents get more capable, not lower: a more powerful agent doing more autonomous work means a bigger blast radius when it’s wrong, which means more reason to gate the consequential calls.
Examples:
- A CFO agent pauses before wiring $2M to a new vendor.
- A medical agent routes a dosage decision to the on-call physician.
- KYC: the model says 73% likely match; the regulator says you can’t reject without a manual review.
- Refunds over $1k: the model says fraud; the human knows the customer just had a bad week.
- Content moderation escalations: the model says borderline hate speech; the policy team owns the line.
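The refund example can be sketched as a threshold gate. This is a minimal illustration, not the awaithumans API: `Refund`, `decide_refund`, `ask_human`, and the threshold constant are hypothetical names, and the $1k figure is taken from the bullet above.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical threshold, borrowed from the "refunds over $1k" example.
APPROVAL_THRESHOLD_USD = 1_000


@dataclass
class Refund:
    customer_id: str
    amount_usd: float
    model_verdict: str  # "fraud" or "ok", as produced by the agent


def decide_refund(refund: Refund, ask_human: Callable[[Refund], bool]) -> bool:
    """Return True if the refund should be paid.

    Small refunds follow the model's verdict. Anything over the threshold
    is routed to a human, whose answer overrides the model.
    """
    if refund.amount_usd > APPROVAL_THRESHOLD_USD:
        return ask_human(refund)
    return refund.model_verdict != "fraud"


# Stand-in reviewer: records what it was shown and approves it.
seen_by_human: List[str] = []


def approving_reviewer(refund: Refund) -> bool:
    seen_by_human.append(refund.customer_id)
    return True


small = decide_refund(Refund("c-1", 40.0, "fraud"), approving_reviewer)
large = decide_refund(Refund("c-2", 2_500.0, "fraud"), approving_reviewer)
```

The model flags both refunds as fraud, but only the small one is decided by the model alone; the large one reaches the reviewer, who can overrule it.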
Wall 2 — Reality
The world exists outside the model’s context window. No amount of intelligence closes the gap between what the model knows and what’s actually happening on the ground. This isn’t a training problem; it’s a physics problem.
Examples:
- A logistics agent waits for confirmation the package was actually picked up.
- A real-estate agent needs a human to walk the property before listing.
- Transaction reconciliation: the payment provider didn’t ack — was the transfer applied?
- Distributed-system inconsistency: order shows as shipped in one DB and not-shipped in another.
- Vendor outage during a long-running workflow: did the workflow’s last side effect take?
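One way to make the missing-ack case explicit is to give "unknown" its own state instead of folding it into success or failure. The sketch below assumes a provider response modeled as `True`, `False`, or `None` (no ack); `Outcome`, `classify_transfer`, and `next_step` are illustrative names, not part of any real payment API.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    APPLIED = "applied"
    NOT_APPLIED = "not_applied"
    UNKNOWN = "unknown"


def classify_transfer(provider_ack: Optional[bool]) -> Outcome:
    """Map a provider response to an outcome.

    None models a missing ack (timeout, outage): the transfer may or may
    not have been applied, so the code refuses to guess.
    """
    if provider_ack is None:
        return Outcome.UNKNOWN
    return Outcome.APPLIED if provider_ack else Outcome.NOT_APPLIED


def next_step(outcome: Outcome) -> str:
    """Pick the workflow's next action for a given outcome."""
    if outcome is Outcome.APPLIED:
        return "continue"
    if outcome is Outcome.NOT_APPLIED:
        return "retry"
    # A blind retry could pay twice; a human checks the ground truth.
    return "ask_human"
```

The design choice here is that only a confirmed failure is retried automatically; an unconfirmed side effect is handed to a person who can check what actually happened.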
Wall 3 — Presence
Software 2.0 will be headless: agents navigating the internet autonomously. But the physical world wasn’t built for agents and won’t be rebuilt overnight. Until it catches up, agents need humans to be their hands.
Examples:
- An agent needs a wet signature on a legal document before filing.
- An agent managing a retail store needs someone to restock a shelf.
- KYC ID-photo verification: a human compares face to document.
- Pickup-and-delivery: someone has to physically grab the thing.
- Phone calls to vendors who don’t have APIs.
- Visits to physical locations: inspections, audits, walkthroughs.
Presence tasks are routed with assign_to, for example assign_to: { capability: "pickup-and-deliver", region: "SF" }. For v0.1 the presence wall is just “humans on your team, routed via assign_to.” The post-Phase-3 marketplace expansion targets the broader case where the embodied work is sourced from outside your team.
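To make capability-and-region routing concrete, here is a minimal sketch of how such a filter could resolve to people on a team. It is an assumption about the matching logic, not the library’s implementation; `TeamMember`, `eligible`, and the sample team are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class TeamMember:
    name: str
    region: str
    capabilities: FrozenSet[str]


def eligible(team: List[TeamMember], capability: str, region: str) -> List[str]:
    """Names of team members who have the capability in the given region."""
    return [
        m.name
        for m in team
        if m.region == region and capability in m.capabilities
    ]


team = [
    TeamMember("ana", "SF", frozenset({"pickup-and-deliver", "wet-signature"})),
    TeamMember("ben", "NYC", frozenset({"pickup-and-deliver"})),
    TeamMember("cy", "SF", frozenset({"phone-call"})),
]

# Mirrors assign_to: { capability: "pickup-and-deliver", region: "SF" }
matches = eligible(team, "pickup-and-deliver", "SF")
```

An empty result is the interesting case: nobody on the team can do the work in that region, which is the gap the marketplace expansion mentioned above is meant to cover.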
The wall doesn’t go away; it shrinks unevenly. Some industries (digital-first SaaS) feel it less. Others (logistics, real estate, regulated finance, healthcare) feel it every day. Wherever your stack lives on that spectrum, the wall is where awaithumans plugs in.
Why this matters for your stack
If you treat HITL as a temporary hack — a Slack channel where everyone yells, a spreadsheet someone updates by hand — you’ll outgrow it within months and have to rip-and-replace. If you treat it as permanent infrastructure with a clean primitive (await_human()), the same code that powers your scrappy v1 review queue still works when:
- You add your second reviewer (just assign_to=...)
- You add a fourth notification channel (just register it)
- You add an AI verifier (just pass verifier=)
- You move to durable workflows (swap to the Temporal adapter)
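The idea that the call site stays stable as requirements grow can be illustrated with a local stand-in. `await_human_sketch` below is hypothetical and deliberately not named `await_human()`: its signature, the `respond` callback, and the `fake_reviewer` helper are assumptions made for this example, not the awaithumans API.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

Responder = Callable[[str, Optional[str]], Awaitable[Any]]
Verifier = Callable[[Any], bool]


async def await_human_sketch(
    task: str,
    respond: Responder,
    assign_to: Optional[str] = None,
    verifier: Optional[Verifier] = None,
) -> Any:
    """Hypothetical stand-in for an await_human()-style primitive."""
    answer = await respond(task, assign_to)
    # The optional verifier checks the human's answer before it is returned.
    if verifier is not None and not verifier(answer):
        raise ValueError(f"verifier rejected answer: {answer!r}")
    return answer


async def fake_reviewer(task: str, assignee: Optional[str]) -> dict:
    # Simulates a human reply; a real system would wait on a queue or UI.
    await asyncio.sleep(0)
    return {"task": task, "approved": True, "by": assignee or "anyone"}


async def main() -> list:
    # Scrappy v1: one call, no routing, no verification.
    v1 = await await_human_sketch("approve refund", fake_reviewer)
    # Later: same primitive, with a named reviewer and a verifier added.
    v2 = await await_human_sketch(
        "approve refund",
        fake_reviewer,
        assign_to="second-reviewer",
        verifier=lambda a: isinstance(a.get("approved"), bool),
    )
    return [v1, v2]


results = asyncio.run(main())
```

Both calls go through the same function; routing and verification arrive as optional keyword arguments, so the v1 call site does not need to be rewritten.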