Show HN: OCR Arena – A playground for OCR models

ocrarena.ai

33 points by kbyatnal 3 days ago


I built OCR Arena as a free playground for the community to compare leading foundation VLMs and open-source OCR models side-by-side.

Upload any doc, measure accuracy, and (optionally) vote for the models on a public leaderboard.

It currently has Gemini 3, dots.ocr, DeepSeek, GPT5, olmOCR 2, Qwen, and a few others. If there are any others you'd like included, let me know!

ArcaneMoose - 18 minutes ago

I've been really impressed with this model specifically because of how insanely cheap it is: https://replicate.com/ibm-granite/granite-vision-3.3-2b

I didn't expect IBM to be making relevant AI models, but this thing is priced at $1 per 4,000,000 output tokens... I'm using it to transcribe handwritten input text, and it works very well and is super fast.
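
For scale, the quoted rate works out to $0.25 per million output tokens. A minimal sketch of the arithmetic follows; the 800-output-tokens-per-page figure is purely an illustrative assumption, not something taken from the comment or from the model's pricing page.

```python
from fractions import Fraction

# Quoted in the comment above: $1 per 4,000,000 output tokens.
PRICE_USD = Fraction(1)
TOKENS_PER_PRICE = 4_000_000

# Hypothetical figure for illustration only: output tokens per transcribed page.
ASSUMED_TOKENS_PER_PAGE = 800


def cost_usd(output_tokens: int) -> Fraction:
    """Cost in USD for a given number of output tokens at the quoted rate."""
    return PRICE_USD * output_tokens / TOKENS_PER_PRICE


per_million = cost_usd(1_000_000)
pages_per_dollar = 1 / cost_usd(ASSUMED_TOKENS_PER_PAGE)

print(f"${float(per_million):.2f} per 1M output tokens")
print(f"{int(pages_per_dollar)} pages per dollar at the assumed page size")
```

Input-token and image costs are not covered by the quoted figure, so this only bounds the output side of the bill.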

zzleeper - 41 minutes ago

Love this! Would have liked to see something like Textract as a pre-LLM benchmark (but of course that's expensive), and also a distinction between handwritten and printed text.

But still, this is incredibly useful!

krashidov - 36 minutes ago

I would be curious to see how Sonnet does. Anthropic's models are pretty solid when it comes to PDFs.

fzysingularity - an hour ago

FYI, one of the models in the battle was pretty slow to load. Are these also being rated on latency, or just quality?

ianhawes - an hour ago

Please add Chandra by Datalab

arathis - an hour ago

Claude would be good!

dang - 2 hours ago

[under-the-rug stub]

[see https://news.ycombinator.com/item?id=45988611 for explanation]