
fix: normalize MinerU 2.0 field names for backward compatibility (#89) #202

Merged
LarFii merged 1 commit into HKUDS:main from sotastack:fix/mineru-2-field-names on Feb 20, 2026


@teamauresta
Contributor

MinerU 2.0 renamed img_caption -> image_caption and img_footnote -> image_footnote. This PR adds field-name normalization in the parser, applied right after the MinerU JSON output is loaded, so that both the old and the new field names are present and all downstream code works regardless of MinerU version.
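As an illustration of the idea described above (not the merged code itself), the normalization could be sketched as follows. The function name `normalize_mineru_fields`, the `FIELD_ALIASES` table, and the assumption that the loaded MinerU output is a list of dict blocks are all hypothetical; only the two field renames come from this PR.

```python
from typing import Any, Dict, List

# Old name -> new name. Only these two renames are stated in the PR;
# the table itself is a hypothetical helper for this sketch.
FIELD_ALIASES: Dict[str, str] = {
    "img_caption": "image_caption",
    "img_footnote": "image_footnote",
}


def normalize_mineru_fields(content_list: List[Any]) -> List[Any]:
    """Mirror old and new field names so either spelling can be read.

    Assumes the parsed MinerU JSON is a list of dict blocks. Blocks are
    updated in place and the same list is returned for convenience.
    """
    for block in content_list:
        if not isinstance(block, dict):
            continue  # leave unexpected entries untouched
        for old_name, new_name in FIELD_ALIASES.items():
            if old_name in block and new_name not in block:
                block[new_name] = block[old_name]
            elif new_name in block and old_name not in block:
                block[old_name] = block[new_name]
            # If both are present, neither is overwritten.
    return content_list


if __name__ == "__main__":
    # One block in the old style, one in the new style.
    blocks = [
        {"type": "image", "img_caption": ["Figure 1"]},
        {"type": "image", "image_footnote": ["Source: survey"]},
    ]
    for block in normalize_mineru_fields(blocks):
        print(sorted(block))
```

Copying in both directions, rather than renaming, means code written against either MinerU version can read the field it expects, at the cost of a duplicated reference per block.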


@LarFii
Collaborator

LarFii commented Feb 18, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 🎉


@LarFii LarFii merged commit 17714fc into HKUDS:main Feb 20, 2026
1 check passed
@teamauresta teamauresta deleted the fix/mineru-2-field-names branch February 23, 2026 03:36