Outputs — SunnyExtract

Seven output types, one consistent pipeline

Each output serves a specific role: some are for your systems, some for your team, some for audit. Together they give you full visibility into what was extracted, how it was validated, and where a human still needs to decide.

Structured JSON

Normalized, schema-consistent records ready for programmatic consumption. Fields are typed, named predictably and stripped of the formatting noise present in the original document.
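To make "typed and predictably named" concrete, here is a minimal sketch of how a loosely formatted read might become a normalized record. The `InvoiceRecord` shape, the field names and the input labels are assumptions chosen for illustration; they are not SunnyExtract's actual schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date
from decimal import Decimal


# Hypothetical record shape: field names are illustrative only.
@dataclass
class InvoiceRecord:
    invoice_number: str
    issue_date: date
    total: Decimal
    currency: str


def normalize(raw: dict) -> InvoiceRecord:
    """Convert loosely formatted strings into typed fields."""
    # Assumes the source date is written day/month/year.
    day, month, year = raw["Invoice Date"].strip().split("/")
    # Strip currency symbols and thousands separators before parsing.
    amount = raw["Total"].replace("$", "").replace(",", "").strip()
    return InvoiceRecord(
        invoice_number=raw["Invoice No."].strip(),
        issue_date=date(int(year), int(month), int(day)),
        total=Decimal(amount),
        currency=raw["Currency"].strip().upper(),
    )


def to_json(record: InvoiceRecord) -> str:
    """Serialize with ISO dates and string decimals so no precision is lost."""
    payload = asdict(record)
    payload["issue_date"] = record.issue_date.isoformat()
    payload["total"] = str(record.total)
    return json.dumps(payload, sort_keys=True)


raw = {
    "Invoice No.": " INV-0042 ",
    "Invoice Date": "03/02/2024",
    "Total": "$1,250.00",
    "Currency": "usd",
}
record = normalize(raw)
print(to_json(record))
```

The amount is kept as a decimal string rather than a float so that downstream systems do not inherit rounding errors.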

CSV Exports

Flat-file exports formatted for spreadsheet review and bulk import workflows. Column headers and date formats are consistent across batches, so files load cleanly without manual cleanup.

Audit Trail

A structured record of what was extracted, what was validated, and what was changed during processing. Supports compliance reviews without requiring access to the internal pipeline.

Original Files

Every source document is preserved and linked to its extracted data record. When a reviewer needs to verify a field, they reach the original in one step, without searching a file archive.

Exception Lists

A prioritized list of the cases that could not be resolved automatically and require a human decision. Each exception includes the reason it was flagged so reviewers can act immediately.
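As a rough illustration, an exception list can be thought of as a sorted queue in which every entry carries its own reason. The `ExceptionItem` fields and the numeric severity scale below are hypothetical; the real ranking criteria are not described here.

```python
from dataclasses import dataclass


# Hypothetical exception shape: names and the severity scale are assumptions.
@dataclass
class ExceptionItem:
    document_id: str
    field: str
    reason: str
    severity: int  # higher means more urgent


def prioritize(items):
    """Most severe first; ties are broken by document id for a stable order."""
    return sorted(items, key=lambda e: (-e.severity, e.document_id))


items = [
    ExceptionItem("doc-003", "total", "total does not match sum of line items", 3),
    ExceptionItem("doc-001", "issue_date", "date is unreadable in source", 2),
    ExceptionItem("doc-002", "total", "currency symbol missing", 3),
]

queue = prioritize(items)
for e in queue:
    print(f"[{e.severity}] {e.document_id} {e.field}: {e.reason}")
```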

Validation Notes

Field-level notes that explain why a value passed validation or why it was flagged for review. Reviewers see reasoning alongside the data — not just a pass/fail result.
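One way to picture a validation note is as a small record returned by each check, holding the outcome together with the reasoning. The sketch below assumes a single hypothetical rule (a stated total must equal the sum of its line items); the `ValidationNote` shape and the `check_total` function are illustrative, not part of the product.

```python
from dataclasses import dataclass


# Hypothetical note shape: the status values are assumptions for this sketch.
@dataclass
class ValidationNote:
    field: str
    status: str  # "passed" or "flagged"
    note: str


def check_total(line_items, stated_total, tolerance=0.01):
    """Compare the stated total with the sum of line items and explain the result."""
    computed = round(sum(line_items), 2)
    if abs(computed - stated_total) <= tolerance:
        return ValidationNote(
            "total",
            "passed",
            f"stated total {stated_total:.2f} matches sum of line items {computed:.2f}",
        )
    return ValidationNote(
        "total",
        "flagged",
        f"stated total {stated_total:.2f} differs from sum of line items {computed:.2f}",
    )


ok = check_total([100.0, 25.5], 125.5)
bad = check_total([100.0, 25.5], 130.0)
print(ok.status, "-", ok.note)
print(bad.status, "-", bad.note)
```

Because the note is produced by the same check that decides the outcome, a reviewer reads the reason without re-running anything.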

System-Ready Payloads

Structured payloads shaped to match the field names, types and constraints your internal systems expect. The mapping is defined during onboarding so imports work without transformation scripts.
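A mapping of this kind can be sketched as a lookup table from extracted field names to the names and types a target system expects. Everything in `FIELD_MAP` below (the target names `InvoiceRef`, `AmountGross` and `DocDate`, and the chosen converters) is a made-up example of what might be agreed during onboarding.

```python
# Hypothetical mapping: extracted field -> (target field name, type converter).
FIELD_MAP = {
    "invoice_number": ("InvoiceRef", str),
    "total": ("AmountGross", float),
    "issue_date": ("DocDate", str),
}


def to_payload(record: dict) -> dict:
    """Reshape an extracted record into the target system's field names and types."""
    payload = {}
    for source, (target, convert) in FIELD_MAP.items():
        if source not in record:
            # Fail loudly rather than send an incomplete payload downstream.
            raise KeyError(f"missing required field: {source}")
        payload[target] = convert(record[source])
    return payload


record = {
    "invoice_number": "INV-0042",
    "total": "1250.00",
    "issue_date": "2024-02-03",
    "notes": "not mapped, so not sent",
}
payload = to_payload(record)
print(payload)
```

Fields that are absent from the mapping are dropped, and a missing mapped field raises an error instead of producing a partial payload.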

Outputs are validated data, not text recognition results

OCR and text recognition are only one input into the SunnyExtract pipeline, not its output. By the time data reaches your systems it has been classified, cross-checked against expected values, and reviewed for consistency. What you receive is not a raw read of the document; it is a verified, structured record with a full trail of how it got there.

Designed for review, not blind import

Every output format is built around one principle: your team should be able to verify any value before it enters a downstream system. Nothing is a black box.

Each extracted field links back to the exact location in the source document, so reviewers can confirm any value in seconds rather than re-reading the whole document.

Exception lists surface only the cases that need attention, not the full batch, so your team spends its effort where it matters instead of scanning clean records.

Validation notes give reviewers the context they need to approve or correct a value without re-running any analysis — the reasoning travels with the data.

The audit trail is a separate, human-readable record — not embedded metadata — so it can be shared with compliance teams without granting access to the processing pipeline.

See what your documents produce

We review every early access request manually and walk through a real document workflow with you before any data moves.

Request private workflow review