feat: support environment variables in parsers #210

Open
wkpark wants to merge 2 commits into HKUDS:main from wkpark:feat-env-vars

@wkpark (Contributor) commented Feb 21, 2026

Support environment variables in parsers.

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)

Additional Notes

You can now set environment variables for the MinerU and Docling parsers.

For example, set CUDA_VISIBLE_DEVICES=1 to make the parser run on a specific GPU. This keeps the parser from slowing down your main AI model running on the first GPU.

# Use the second GPU (ID: 1) for this parsing task
await rag.process_document_complete(
    file_path="manual.pdf",
    env={"CUDA_VISIBLE_DEVICES": "1"}
)
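
To illustrate what passing `env` is expected to do, here is a minimal sketch of merging per-call overrides into the parent environment before launching a subprocess. The helper names `run_with_env` and `child_sees` are hypothetical and are not taken from this PR; the actual parser code may differ.

```python
import os
import subprocess
import sys
from typing import Mapping, Optional, Sequence


def run_with_env(
    cmd: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> subprocess.CompletedProcess:
    """Run cmd with the parent's environment plus any overrides in env.

    Hypothetical helper: the overrides apply to the child process only,
    so os.environ in the calling process is left untouched.
    """
    merged = os.environ.copy()
    if env:
        merged.update(env)
    return subprocess.run(
        list(cmd), env=merged, capture_output=True, text=True, check=True
    )


def child_sees(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of variable `name` as seen inside a child Python process."""
    code = f"import os; print(os.environ.get({name!r}))"
    return run_with_env([sys.executable, "-c", code], env=env).stdout.strip()
```

Copying `os.environ` first matters: passing only the override dict as `env=` to `subprocess.run` would drop `PATH` and every other inherited variable from the child.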


@LarFii (Collaborator) commented Feb 24, 2026

Thanks for the contribution.

One potential regression to consider:

  • In MineruParser._run_mineru_command, the new **kwargs causes unknown arguments (other than env) to be silently ignored instead of failing fast. Previously, unexpected arguments raised a TypeError, which was easier to catch during integration.
  • This could lead to “runs successfully but config not applied” scenarios.

Suggestion:

  1. Validate and explicitly reject unsupported kwargs (or at least log a warning).
  2. Add light validation for env (e.g., ensure mapping of string keys/values).
  3. Add tests for:
    • env propagation to subprocess
    • behavior when unknown kwargs are passed
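
Suggestions 1 and 2 above could look roughly like the sketch below. The names `validate_env` and `run_parser_command` are hypothetical placeholders, not the actual methods in this PR, and the function stops short of launching anything.

```python
from collections.abc import Mapping
from typing import Any, Dict, Optional


def validate_env(env: Optional[Any]) -> Dict[str, str]:
    """Return env as a plain dict, or raise TypeError if it is malformed."""
    if env is None:
        return {}
    if not isinstance(env, Mapping):
        raise TypeError(f"env must be a mapping, got {type(env).__name__}")
    for key, value in env.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(
                f"env keys and values must be strings, got {key!r}: {value!r}"
            )
    return dict(env)


def run_parser_command(input_path: str, *, env: Optional[Any] = None, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for a parser entry point that fails fast."""
    if kwargs:
        # Mirror Python's own behavior for unexpected keyword arguments.
        unknown = ", ".join(sorted(kwargs))
        raise TypeError(f"unsupported keyword argument(s): {unknown}")
    # A real implementation would build and run the command here.
    return {"input_path": input_path, "env": validate_env(env)}
```

Raising `TypeError` for unknown keywords keeps the behavior callers had before `**kwargs` was introduced, so a misspelled option surfaces at the call site rather than as a silently unapplied setting.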

@wkpark wkpark force-pushed the feat-env-vars branch 2 times, most recently from a6cde1f to 69a2d75 Compare February 24, 2026 13:15
@wkpark wkpark marked this pull request as draft February 24, 2026 13:20
- Add strict validation for the 'env' parameter to ensure it is a dictionary of strings.
- Implement fail-fast behavior in MineruParser for unsupported keyword arguments.
- Add a new test for environment propagation and argument validation.
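
A test for environment propagation along these lines might mock `subprocess.run` and inspect the `env` it receives. This is only a sketch: `run_parser` and the command `parser-cli` are hypothetical stand-ins, not the test or code added in this PR.

```python
import os
import subprocess
from unittest import mock


def run_parser(cmd, env=None):
    """Hypothetical stand-in for the parser's subprocess call."""
    merged = os.environ.copy()
    merged.update(env or {})
    return subprocess.run(cmd, env=merged, check=True)


def test_env_reaches_subprocess():
    # Patch subprocess.run so no real process is launched.
    with mock.patch("subprocess.run") as fake_run:
        run_parser(["parser-cli", "manual.pdf"], env={"CUDA_VISIBLE_DEVICES": "1"})

    # The mock records the positional and keyword arguments of the call.
    args, kwargs = fake_run.call_args
    assert args[0] == ["parser-cli", "manual.pdf"]
    assert kwargs["env"]["CUDA_VISIBLE_DEVICES"] == "1"
```
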
@wkpark wkpark marked this pull request as ready for review February 24, 2026 13:27
