Launch HN: Extend (YC W23) – Turn your messiest documents into data

extend.ai

60 points by kbyatnal 4 days ago


Hey HN! We’re Kushal and Eli, co-founders of Extend (https://www.extend.ai/). Extend is a toolkit for AI teams to ingest any kind of messy document (e.g. PDFs, images, Excel files) and build incredible products.

We built Extend to handle the hardest documents that break most pipelines. You can see some examples here in our demo (no signup required): https://dashboard.extend.ai/demo

I know you're probably thinking “not another document API startup”. Unfortunately, the problem just isn’t solved yet!

I’ve personally spent months struggling to build reliable document pipelines at a previous job. The long tail of edge cases is endless: massive tables split across pages, 100+ page files, messy handwriting, scribbled signatures, checkboxes represented in 10 different formats, multiple file types… the list just keeps going. After seeing countless other teams during our time in YC run into these same issues, we started building Extend.

We initially launched with a set of APIs for engineers to parse, classify, split, and extract documents. That started to take off, and soon we were deployed in production at companies building everything from medical agents, to real-time bank account onboarding, to mortgage automation. Over time, we’ve worked closely with these teams and seen first-hand how large the gap is between raw OCR/model outputs and a production-ready pipeline (LLMs and VLMs aren’t magic).

Unlike other solutions in the space, we're specifically focused on three core areas: (1) the computer vision layer, (2) LLM context engineering, and (3) the surrounding product tooling. The combination of all three is what we think it takes to hit 99% accuracy and maintain it at scale.

For instance, to parse messy handwriting, we built an agentic OCR correction layer that uses a VLM to review and correct low-confidence OCR output. To tackle multi-page tabular data, we built a semantic chunking engine that detects the optimal boundaries within a document so models can excel with smaller context inputs.
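
To make the confidence-gated review idea concrete, here is a minimal sketch. It is an illustration only, not Extend's implementation: `OcrToken`, `correct_low_confidence`, the 0.8 threshold, and `stub_reviewer` are all hypothetical, and the reviewer is a stand-in for whatever VLM call a real pipeline would make.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class OcrToken:
    text: str
    confidence: float  # assumed to be in [0.0, 1.0], as reported by the OCR engine


def correct_low_confidence(
    tokens: List[OcrToken],
    reviewer: Callable[[str, str], str],
    threshold: float = 0.8,  # hypothetical cutoff; tune per document type
) -> List[OcrToken]:
    """Send only tokens below `threshold` to `reviewer`; keep the rest unchanged."""
    corrected = []
    for i, tok in enumerate(tokens):
        if tok.confidence < threshold:
            # Give the reviewer a small window of neighbouring text as context.
            context = " ".join(t.text for t in tokens[max(0, i - 2): i + 3])
            corrected.append(OcrToken(reviewer(tok.text, context), tok.confidence))
        else:
            corrected.append(tok)
    return corrected


def stub_reviewer(text: str, context: str) -> str:
    # Stand-in for a VLM call: fixes one known misread, otherwise returns the input.
    return {"1nvoice": "Invoice"}.get(text, text)


tokens = [OcrToken("1nvoice", 0.4), OcrToken("total", 0.95)]
result = correct_low_confidence(tokens, stub_reviewer)
```

The point of the gate is cost: only the uncertain fraction of tokens pays for a second model pass, while high-confidence text is left untouched.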

We also shipped a prompt optimization agent to automate the endless prompt engineering whack-a-mole teams spend time on. It’s built as a background agent to replicate the best prompter on your team, and runs in a loop with access to a set of tools (view files, run evals, analyze results, and update schemas).
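
The loop described above can be pictured as a propose, evaluate, keep-the-best cycle. The sketch below is a simple hill climb under stated assumptions, not the actual agent: `optimize_prompt`, `toy_propose`, `toy_evaluate`, and the `REQUIRED` field list are hypothetical, and the toy functions stand in for an LLM proposing edits and a real eval suite scoring them.

```python
from typing import Callable, Tuple

# Hypothetical fields the extraction prompt should mention.
REQUIRED = ["invoice number", "total", "due date"]


def toy_evaluate(prompt: str) -> float:
    """Stand-in for running evals: fraction of required fields the prompt covers."""
    return sum(field in prompt for field in REQUIRED) / len(REQUIRED)


def toy_propose(prompt: str, score: float) -> str:
    """Stand-in for an LLM edit: add an instruction for the first missing field."""
    for field in REQUIRED:
        if field not in prompt:
            return prompt + f" Extract the {field}."
    return prompt


def optimize_prompt(
    initial: str,
    propose: Callable[[str, float], str],
    evaluate: Callable[[str], float],
    max_rounds: int = 5,
) -> Tuple[str, float]:
    """Keep a candidate only if it scores strictly higher than the current best."""
    best, best_score = initial, evaluate(initial)
    for _ in range(max_rounds):
        candidate = propose(best, best_score)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


best_prompt, best_score = optimize_prompt("Read the document.", toy_propose, toy_evaluate)
```

Accepting only strict improvements keeps a bad proposal from regressing the prompt, which is the property that makes an unattended background loop safe to run.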

The most surprising part of this whole experience has been seeing how many crazy PDF formats are out there! We've run into everything from supermarket inventory magazines and pesticide labels to construction blueprints and satellite manufacturing plans.

Everything described above is live today. You can see it in action here (no signup): https://dashboard.extend.ai/demo. To upload your own files, you can log in and do so (we’re adding free usage credits to all accounts that sign up today).

We’re excited to be sharing with HN! We’d love to hear about your experiences building document pipelines. Please try it out, and share any and all feedback with us (e.g. hard documents that didn’t work, feature requests).

airstrike - 4 days ago

Congrats on the launch! It looks really cool.

> Unlike other solutions in the space, we're specifically focused on three core areas: (1) the computer vision layer, (2) LLM context engineering, and (3) the surrounding product tooling.

I assume the goal is to continue to serve this via an API? That would be immensely helpful to teams building other products around these capabilities.

pratikshelar871 - 3 days ago

$300+ for a starter plan targeted at startups seems like a missed opportunity. Startups might find it a high barrier to trying the product. You are solving a good problem, but the pricing seems too high.

nextworddev - 3 days ago

I highly recommend that companies keep it simple and use n8n with Gemini for OCR. You will save money and get 90%+ of the same functionality as products like this.

constantinum - 3 days ago

Other players:

1. Trellis (YC W24)
2. Roe AI (YC W24)
3. Omni AI (YC W24)
4. Reductor (YC W24)

Other players (extended):

1. Unstract: Open-source ETL for documents (https://github.com/Zipstack/unstract)
2. Datalab: Makers of Surya/Marker
3. Unstructured.io

arvind_k - 3 days ago

At Zipphy, I worked on solving similar problems in on-prem environments — building an OCR + NLP + CV pipeline to generate spatial layouts and classify documents at scale.

One persistent challenge was generalizing across “wild” PDFs, especially multi-page tables.

Your mention of agentic OCR correction and semantic chunking really caught my attention. I’m curious — how did you architect those to stay consistent across diverse layouts without relying on massive rule sets?

FitchApps - 4 days ago

Very cool. Are there any checks for accuracy / data verification? How accurate is your solution when it comes to messy table parsing or handwriting?

FabioFleitas - 4 days ago

We've been using Extend for over a year and have been super happy with the product and the accuracy of the data extraction.

aaa29292 - 4 days ago

on the pricing page, what in the world is performance optimized vs cost optimized???

https://docs.extend.ai/2025-04-21/product/general/how-credit...

Are those just different SLAs or different APIs or what?

asdev - 4 days ago

Have you run your pipeline against an open benchmark like https://github.com/opendatalab/OmniDocBench?

nibab - 4 days ago

at ng3n.ai i've been using datalab.to for document processing. currently it's mostly for conversion to markdown and some extraction.

ng3n is more of a grid-like workflow solution on top of documents. it's a user-facing application geared towards non-technical users that have processing needs.

if there are all these new problems that became solvable, what exactly are they?

i'd be interested in replacing datalab with extend, but i'm not sure what avenues that opens for ng3n. would be very curious to learn!

wunderlust - 4 days ago

For some reason "turn your messiest data into documents" makes more sense.

nextworddev - 4 days ago

Just how many IDP / document processing “AI” startups are out there?