verify_document() supports three flows. Most customers use Flow A or B; Flow C is the quality loop on top.
Decision table
| Your situation | Use | Why |
|---|---|---|
| You already have an extractor (your code, your model, your OCR). You want a human to verify the output. | Flow A | Reviewer sees your extraction and corrects the cells the model got wrong. |
| You don’t have an extractor pipeline but you have an OpenAI / Anthropic / Reducto / Azure DI key. | Flow B | The SDK runs the model on your machine using your credentials. Reviewer sees the model output and corrects it. |
| You want an AI verifier to recheck what the human typed. | Flow C (compose with A or B) | If the verifier disagrees with the human, the task re-routes back to a human. |
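The table comes down to two questions: where the extraction comes from, and whether an AI verifier should recheck the human. A minimal sketch of that logic follows. choose_flow and its parameters are illustrative names, not part of the SDK.

```python
def choose_flow(has_extractor: bool, has_provider_key: bool,
                want_ai_verifier: bool = False) -> str:
    """Map the decision table to a flow label (illustrative helper, not an SDK call)."""
    if has_extractor:
        base = "A"   # you supply the extraction; a human verifies it
    elif has_provider_key:
        base = "B"   # the SDK runs a model locally with your key; a human verifies it
    else:
        raise ValueError("need either your own extractor or a provider key")
    # Flow C is not standalone: it composes with A or B.
    return f"{base}+C" if want_ai_verifier else base


print(choose_flow(has_extractor=True, has_provider_key=False))    # A
print(choose_flow(False, True, want_ai_verifier=True))            # B+C
```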
Flow A: Human only
You provide the data. The reviewer confirms or corrects it against the document.
The reviewer sees:
- The document fragments (five masked views per page).
- A response form pre-filled with my_extraction.
They edit, then submit.
Flow B: Model then human
You don’t have your own extractor yet. The SDK runs one on your machine using your provider credentials, then sends both the document and the extracted result to the reviewer.
Your provider credentials never leave your machine. The SDK calls OpenAI / Anthropic / etc. directly from your process; only the extracted result + the encrypted fragments go to AwaitVerify.
Flow C: Human then model (AI verifier loop)
After the human submits, an AI verifier rechecks the response. If it flags a problem, the task re-routes to another human.
Pass prior_extraction= (or extraction=) and verifier=; the SDK runs the extraction, the human verifies, then the AI verifier rechecks.
Best for: high-stakes review where a single human reviewer is not enough (regulated compliance, fraud detection).
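The loop can be modeled in a few lines. This is a conceptual sketch, not the SDK's implementation: run_verifier_loop, ask_human, ai_verifier_ok, and max_rounds are hypothetical names, and the cap on rounds is an assumption added so the example terminates.

```python
from typing import Callable, Dict, Tuple

Response = Dict[str, str]


def run_verifier_loop(
    ask_human: Callable[[int], Response],
    ai_verifier_ok: Callable[[Response], bool],
    max_rounds: int = 3,
) -> Tuple[Response, int]:
    """Collect a human response, recheck it, and re-route while the verifier flags it."""
    for round_no in range(1, max_rounds + 1):
        response = ask_human(round_no)     # a human reviewer submits
        if ai_verifier_ok(response):       # the AI verifier rechecks the submission
            return response, round_no
        # The verifier flagged a problem: the task goes to another human.
    raise RuntimeError(f"verifier still disagreed after {max_rounds} rounds")


# Toy run: the first reviewer mistypes the total, the second gets it right.
answers = {1: {"total": "1O0.00"}, 2: {"total": "100.00"}}
result, rounds = run_verifier_loop(
    ask_human=lambda n: answers[n],
    ai_verifier_ok=lambda r: r["total"].replace(".", "").isdigit(),
)
print(result, rounds)  # {'total': '100.00'} 2
```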
What happens between Submit and your return value
The SDK long-polls the managed backend. When the reviewer submits:
- OSS server records the response, fires the callback to managed.
- Managed marks the task completed, destroys the wrapped DEK, deletes the encrypted fragments from blob storage.
- Your next poll call returns the typed response.
- Managed nulls its copy of the response in the same transaction.
Once verify_document() returns, the response content lives only inside your Python process. Security details →
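To make the ordering concrete, here is a toy in-memory model of the managed side of that sequence. Everything in it (ManagedTask, its fields, and its methods) is a hypothetical stand-in for illustration; the real backend and its storage are not described on this page.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ManagedTask:
    """Toy stand-in for the managed backend's record of one task."""
    status: str = "pending"
    wrapped_dek: Optional[bytes] = b"wrapped-key"
    fragments: List[bytes] = field(default_factory=lambda: [b"frag-1", b"frag-2"])
    response: Optional[Dict[str, str]] = None

    def on_callback(self, response: Dict[str, str]) -> None:
        # Callback from the OSS server after the reviewer submits.
        self.response = response
        self.status = "completed"
        self.wrapped_dek = None     # destroy the wrapped DEK
        self.fragments.clear()      # delete the encrypted fragments

    def poll(self) -> Optional[Dict[str, str]]:
        # Hand the response to the SDK and null the backend's copy in one step.
        response, self.response = self.response, None
        return response


task = ManagedTask()
assert task.poll() is None                     # nothing submitted yet
task.on_callback({"invoice_total": "100.00"})
received = task.poll()                         # the SDK's next poll gets the response
print(received)                                # {'invoice_total': '100.00'}
print(task.response, task.wrapped_dek, task.fragments)  # None None []
```

After the poll, the only copy of the response is the one held by the caller, which mirrors the guarantee described above.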
Timeouts and retries
timeout_seconds defaults to 48 hours and caps at 30 days. If no human submits before the timeout:
- Standard priority: task times out, your verify_document() raises VerifyTimeoutError. You’re not billed.
- Express priority: same behavior but with a faster (configurable) SLA target. Pricing →
To retry, call verify_document() again with the same arguments. The SDK doesn’t dedupe automatically; if you need idempotency, pass a stable idempotency_key= (the managed backend supports this on the /tasks endpoint).
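One way to get a stable key is to hash the inputs that define the task and reuse that key on every retry. The sketch below assumes only what this page states: verify_document() takes timeout_seconds= and idempotency_key=, and raises VerifyTimeoutError on timeout. The helper names (stable_key, verify_with_retry), the attempts parameter, the positional document argument, and the local VerifyTimeoutError stand-in are all hypothetical. In real code you would import the error from the SDK and pass the SDK's verify_document as verify.

```python
import hashlib
import json
from typing import Any, Callable, Dict

DEFAULT_TIMEOUT_SECONDS = 48 * 60 * 60     # 48 hours, the documented default
MAX_TIMEOUT_SECONDS = 30 * 24 * 60 * 60    # 30 days, the documented cap


class VerifyTimeoutError(Exception):
    """Local stand-in so the sketch runs without the SDK installed."""


def stable_key(document: bytes, params: Dict[str, Any]) -> str:
    """Derive the same key for the same document and JSON-serializable params."""
    digest = hashlib.sha256(document)
    digest.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def verify_with_retry(verify: Callable[..., Any], document: bytes,
                      params: Dict[str, Any], attempts: int = 2,
                      timeout_seconds: int = DEFAULT_TIMEOUT_SECONDS) -> Any:
    """Call verify again with the same arguments and key after a timeout."""
    # Local guard only; how the SDK itself treats larger values is not specified here.
    if timeout_seconds > MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout_seconds exceeds the 30-day cap")
    key = stable_key(document, params)
    for attempt in range(attempts):
        try:
            return verify(document, timeout_seconds=timeout_seconds,
                          idempotency_key=key, **params)
        except VerifyTimeoutError:
            if attempt == attempts - 1:
                raise   # out of attempts: surface the timeout to the caller


# Demo with a fake verify function that times out once, then succeeds.
calls = []

def fake_verify(document: bytes, **kwargs: Any) -> Dict[str, bool]:
    calls.append(kwargs["idempotency_key"])
    if len(calls) == 1:
        raise VerifyTimeoutError("no reviewer submitted in time")
    return {"ok": True}

demo_params = {"priority": "standard"}   # hypothetical keyword, for the demo only
result = verify_with_retry(fake_verify, b"%PDF-1.7 demo", demo_params)
print(result, calls[0] == calls[1])      # {'ok': True} True
```

Because the key depends only on the document bytes and the parameters, both attempts send the same key, so a backend that honors it can recognize the retry as the same task.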
Where to go next
Response schemas
What Pydantic shapes the reviewer can edit comfortably (and what falls back to JSON).
Providers
Flow B providers: OpenAI, Anthropic, Azure, Reducto, Docling, PaddleOCR.