@athoik (Contributor) commented Oct 20, 2025

The rapidocr_onnxruntime package is no longer actively maintained (see RapidAI/RapidOCR#579).

This commit migrates the document loader parsers to use the maintained rapidocr package.

Changes include:

  • Replacing imports of rapidocr_onnxruntime with rapidocr
  • Updating OCR result handling from a tuple (result, _) to a single RapidOCROutput object
  • Using result.txts for text extraction
  • Updating import error messages accordingly

This aligns the image and PDF parsers with the latest RapidOCR API.
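
As a rough illustration of the result-handling change listed above, here is a minimal sketch rather than the actual parser code from this PR. The helper names (`text_from_legacy_result`, `text_from_new_result`, `load_ocr_engine`), the newline join, the wording of the import error, and the `FakeOCROutput` stand-in are all assumptions made for the example. It also assumes each legacy result entry has the form `[box, text, score]`.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


def text_from_legacy_result(ocr_return: tuple) -> str:
    """Old shape: rapidocr_onnxruntime returned a (result, elapse) tuple.

    Each entry in `result` is assumed to look like [box, text, score].
    """
    result, _ = ocr_return
    if not result:
        return ""
    return "\n".join(entry[1] for entry in result)


def text_from_new_result(ocr_output: Any) -> str:
    """New shape: a single output object exposing the texts as `.txts`."""
    txts: Optional[Sequence[str]] = getattr(ocr_output, "txts", None)
    if not txts:
        # Nothing was recognized (or the attribute is missing).
        return ""
    return "\n".join(txts)


def load_ocr_engine() -> Any:
    """Lazily import rapidocr; the error message wording is an assumption."""
    try:
        from rapidocr import RapidOCR
    except ImportError as exc:
        raise ImportError(
            "`rapidocr` package not found, please install it with "
            "`pip install rapidocr`"
        ) from exc
    return RapidOCR()


@dataclass
class FakeOCROutput:
    """Hypothetical stand-in for RapidOCROutput, used only for illustration."""

    txts: Optional[tuple] = None
```

With the real package installed, usage would presumably look like `text_from_new_result(load_ocr_engine()(image))`, where `image` is whatever input the engine accepts. Guarding against an empty `txts` matters because an image with no detectable text may yield no text entries at all.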

@mdrxy merged commit 70c1331 into langchain-ai:main on Nov 5, 2025
9 checks passed