Security & document handling

How documents are handled

Documents arrive through controlled intake channels. Each one is classified, validated and cross-checked before anything moves forward — nothing is silently auto-posted.

Controlled intake

Documents arrive through defined channels — UI upload, email, API, webhook, or ZIP batch. The intake path for each workflow is agreed upon during the workflow review, before onboarding begins.

Classification and validation

Each document is classified by type, validated against the expected structure, and cross-checked against the other documents in the same operation. A document that does not pass this step does not proceed.
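As a rough illustration of this step, the sketch below checks two already-classified documents against an expected structure and then cross-checks the values they share. It is not SunnyExtract's implementation: the document types, the `REQUIRED_FIELDS` table, and every function name are assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical required fields per document type (illustrative only).
REQUIRED_FIELDS = {
    "invoice": {"invoice_number", "po_number", "total"},
    "purchase_order": {"po_number", "total"},
}


@dataclass
class Document:
    doc_type: str  # result of the classification step
    fields: dict   # extracted field name -> value


def validate_structure(doc: Document) -> list[str]:
    """Return a list of problems; an empty list means the check passed."""
    required = REQUIRED_FIELDS.get(doc.doc_type)
    if required is None:
        return [f"unknown document type: {doc.doc_type}"]
    missing = sorted(required - set(doc.fields))
    return [f"missing field: {name}" for name in missing]


def cross_check(invoice: Document, order: Document) -> list[str]:
    """Compare values that must agree across documents in one operation."""
    problems = []
    for name in ("po_number", "total"):
        if invoice.fields.get(name) != order.fields.get(name):
            problems.append(f"{name} differs between invoice and purchase order")
    return problems


def may_proceed(invoice: Document, order: Document) -> tuple[bool, list[str]]:
    """A pair of documents proceeds only if every check comes back clean."""
    problems = validate_structure(invoice) + validate_structure(order)
    problems += cross_check(invoice, order)
    return (not problems, problems)
```

In this sketch a single mismatch, such as a different total on the purchase order, is enough to stop the pair from proceeding, and the returned list says why.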

Nothing posted silently

Extracted data reaches downstream systems only after it passes validation and, where required, human review. Uncertain or inconsistent cases are held in an exception queue, not forwarded.

Access & workflow review

During early access, every workflow is reviewed before onboarding. We do not open access to a workflow we have not examined. The review covers:

What documents are processed — types, formats, sources, and any edge cases the workflow regularly encounters.

Where documents come from — the intake channel and how documents arrive in practice.

What data must be extracted — the specific fields, values and structures the operation depends on.

What systems receive the output — the destination, format, and any transformation the data needs before it arrives.

What cases require human review — the exception criteria that must route to a person before data is forwarded.

Traceability

Every extracted value traces back to its source document. Validation signals and an audit trail accompany the data at every stage. If a downstream team questions a value, you can show them where it came from and how it was validated — not just what the final record says.

Source linkage on every value

Extracted fields maintain a direct reference to the source document and the region they came from. Any record in your system can be followed back to the document it originated from.

Validation signals

Each field in the output carries a signal explaining why it passed or why it was flagged — format, cross-document consistency, expected range, or structural position.
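One way to picture a field that carries both its source linkage and its validation signals is the data structure below. This is a minimal sketch under stated assumptions: the class names, the bounding-box representation of a region, and the check labels are hypothetical and do not describe the actual output schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    """Hypothetical pointer to the region a value was read from."""
    document_id: str
    page: int
    bbox: tuple[float, float, float, float]  # x0, y0, x1, y1 on the page


@dataclass(frozen=True)
class ValidationSignal:
    """One check that ran on a field, with its outcome and a short reason."""
    check: str    # e.g. "format", "cross_document", "expected_range"
    passed: bool
    detail: str


@dataclass
class ExtractedField:
    name: str
    value: str
    source: SourceRef
    signals: list[ValidationSignal] = field(default_factory=list)

    @property
    def flagged(self) -> bool:
        # A field is flagged as soon as any one of its checks failed.
        return any(not s.passed for s in self.signals)

    def failed_checks(self) -> list[str]:
        return [s.check for s in self.signals if not s.passed]
```

Keeping the signals on the field itself, rather than in a separate report, means a reviewer or downstream team can see the reason for a flag next to the value it applies to.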

Audit trail per document

Every processed document produces a record: when it arrived, what was extracted, which validations ran, whether it was flagged, who reviewed it, and what decision was made.
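The record described above could be modeled roughly as follows. The field names mirror the items listed in the paragraph, but the `AuditRecord` class, its methods, and the JSON serialization are assumptions chosen for illustration, not the product's actual record format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


def now_iso() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class AuditRecord:
    document_id: str
    arrived_at: str          # when the document arrived (ISO 8601)
    extracted: dict          # what was extracted
    validations_run: list    # which validations ran
    flagged: bool            # whether the document was flagged
    reviewed_by: Optional[str] = None  # who reviewed it, if anyone
    decision: Optional[str] = None     # what decision was made

    def record_review(self, reviewer: str, decision: str) -> None:
        """Attach the reviewer and their decision to the record."""
        self.reviewed_by = reviewer
        self.decision = decision

    def to_json(self) -> str:
        """Serialize the record so it can be stored or exported."""
        return json.dumps(asdict(self), sort_keys=True)
```

A record for a clean document would simply keep `reviewed_by` and `decision` empty, which makes it easy to tell reviewed cases from ones that passed on validation alone.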

Learn more about traceability

The Traceability page covers the full pipeline — from source linkage through exception routing to the complete audit record.

Human review and exception handling

Risky, unclear, or inconsistent cases are routed to a person — not pushed straight into downstream systems. The exception path is a deliberate design decision, not an afterthought.

Documents with ambiguous content, conflicting values, or low-confidence fields are separated from clean cases and placed in an exception queue. The two paths do not mix.

Reviewers see each exception alongside the source document, the extracted values, and the specific signals that triggered the flag. They make a decision; the system records it.

Only records that passed validation or explicit human review reach your exports and integrations. The exception queue is a gate, not a log.

Exception criteria, meaning what constitutes a case that must wait for human review, are defined per workflow during the workflow review, not applied generically.
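The gate described in this section can be sketched as a small routing function. Everything here is hypothetical: the confidence threshold stands in for whatever criteria a given workflow agrees on, and the names `Route`, `WorkflowCriteria`, `Record`, and `route` are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Route(Enum):
    EXPORT = "export"
    EXCEPTION_QUEUE = "exception_queue"


@dataclass
class WorkflowCriteria:
    # Hypothetical per-workflow threshold agreed during the review.
    min_confidence: float


@dataclass
class Record:
    record_id: str
    validation_passed: bool
    lowest_confidence: float               # lowest field confidence, 0.0 to 1.0
    review_decision: Optional[str] = None  # "approved", "rejected", or None


def route(record: Record, criteria: WorkflowCriteria) -> Route:
    """Send a record to export only if it is clean or explicitly approved."""
    if record.review_decision == "approved":
        return Route.EXPORT
    if record.review_decision == "rejected":
        return Route.EXCEPTION_QUEUE
    needs_review = (
        not record.validation_passed
        or record.lowest_confidence < criteria.min_confidence
    )
    return Route.EXCEPTION_QUEUE if needs_review else Route.EXPORT
```

Because the criteria object is passed in per call, two workflows can hold the same record to different thresholds, which matches the idea that exception criteria are set per workflow rather than globally.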

Infrastructure approach

Data is processed in controlled infrastructure. We are honest about what that means in practice for early-access customers.

Controlled processing environment

Document processing runs in a controlled environment. Access to the processing infrastructure is restricted, and the infrastructure is not exposed to the public internet beyond the defined intake channels.

Dedicated capacity for high-volume teams

Dedicated processing lanes and reserved capacity are available for teams with high-volume or time-sensitive workflows. These are configured per engagement during the workflow review.

Specifics defined per engagement

Data residency, retention, and infrastructure specifics are discussed and agreed as part of the workflow review — before any documents are processed. There is no single configuration applied uniformly to all customers.

What we do not claim yet

Honesty about the current state of the product is part of how we build trust with the teams that use it.

SunnyExtract is currently in a controlled early-access phase.

We do not claim:

Universal automation — not every document type or workflow is supported today.
Guaranteed extraction accuracy — outputs are validated and human-reviewed where required, but no system guarantees 100% accuracy on all documents in all conditions.
Instant self-serve onboarding for every workflow — each workflow is reviewed before access is granted.
Certified compliance frameworks before they are formally completed — we do not list certifications we do not hold.
Production integration without workflow review — no workflow goes live without first being examined with the customer team.

Instead, we review each workflow before onboarding and define the right intake, validation and output path together.

Review your workflow with us before committing

Every onboarding starts with a workflow review. We look at what documents you process, where they come from, what data must come out, and what cases need a human in the loop — before anything is built or connected.

Request private workflow review