Baidu's "Unlimited OCR" processes dozens of document pages in one pass by treating memory like human forgetting
Baidu’s new OCR model processes dozens of document pages in a single pass while keeping memory usage constant, outperforming existing systems limited to about ten pages.

- Baidu’s Unlimited OCR processes dozens of document pages in a single pass, unlike traditional systems limited to about ten pages.
- The model uses a modified attention mechanism to keep memory usage flat, regardless of input size.
- It currently ranks first on a major OCR benchmark.
- The technology could streamline document digitization in industries such as legal, healthcare, and finance.
Baidu has introduced "Unlimited OCR," an optical character recognition (OCR) system built to read long documents in one pass. Traditional OCR models struggle with memory constraints when processing multiple pages. Baidu's approach instead treats memory like human forgetting: a modified attention mechanism keeps memory consumption flat regardless of how many pages are processed in a single pass.
The model can handle dozens of pages at once, well beyond the roughly ten pages that existing systems typically manage. The gain is not only theoretical: the model currently holds the top position on a major OCR benchmark. The technology could streamline document processing in industries such as legal, healthcare, and finance, where large volumes of text need to be digitized efficiently.
The key is that the adjusted attention mechanism dynamically forgets irrelevant information, much as human memory does. Because memory usage does not grow with input size, the system scales to large document-processing workloads.
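The article does not describe how the modified attention mechanism decides what to forget. As a rough sketch of the general idea only, the Python below assumes the simplest possible rule: a fixed-size cache of attention keys and values that drops the oldest entry whenever a new one arrives. The class name `BoundedAttentionCache`, the `max_entries` parameter, and the oldest-first eviction policy are all hypothetical and are not taken from Baidu's model.

```python
import math
from collections import deque


class BoundedAttentionCache:
    """Toy attention memory that keeps at most `max_entries` key/value pairs.

    Hypothetical illustration: oldest-first eviction stands in for whatever
    forgetting rule the real model uses, which the article does not specify.
    """

    def __init__(self, max_entries):
        # deque(maxlen=...) drops the oldest item automatically once full.
        self.keys = deque(maxlen=max_entries)
        self.values = deque(maxlen=max_entries)

    def append(self, key, value):
        self.keys.append(key)
        self.values.append(value)

    def attend(self, query):
        """Scaled dot-product attention over the entries still remembered."""
        scale = math.sqrt(len(query))
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in self.keys
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        dim = len(self.values[0])
        return [
            sum(w * v[i] for w, v in zip(weights, self.values)) / total
            for i in range(dim)
        ]


def simulate(num_tokens, max_entries, dim=4):
    """Feed `num_tokens` dummy tokens and record the cache size after each."""
    cache = BoundedAttentionCache(max_entries)
    sizes = []
    for t in range(num_tokens):
        vec = [math.sin(t + i) for i in range(dim)]  # deterministic dummy data
        cache.append(vec, vec)
        cache.attend(vec)
        sizes.append(len(cache.keys))
    return sizes


sizes = simulate(num_tokens=1000, max_entries=64)
print(max(sizes), sizes[-1])  # prints: 64 64
```

With a cap of 64 entries, the cache holds 64 key/value pairs after 1,000 tokens, the same as after 64 tokens. A cache without a cap would hold 1,000, which is the growth that a forgetting rule of any kind is meant to avoid.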
- Offers a scalable solution for OCR tasks with constant memory usage, enabling more efficient document processing pipelines.
- Reduces processing time and costs for large-scale document digitization, improving operational efficiency.
- Highlights Baidu’s leadership in AI-driven OCR, potentially boosting confidence in its broader AI portfolio.
- Demonstrates how AI can mimic human cognitive processes to solve practical problems.
- OCR: Optical character recognition, a technology that converts images of documents into editable, searchable text.
- Attention mechanism: A component in AI models that helps them focus on relevant parts of the input while ignoring irrelevant details.

