@yethikrishna

This commit introduces the necessary files and configurations to enable deploying the Marker application as a Job on Choreo.

Key changes include:

  • A Dockerfile to containerize the application, including system dependencies like Tesseract 5 and Ghostscript, and Python dependencies managed by Poetry.
  • A choreo.yaml file defining the Choreo Job component, specifying Docker build, container command, environment variables, and I/O handling.
  • A run_conversion.sh wrapper script that acts as the Docker entrypoint, parsing environment variables for input/output paths and parameters, and executing the appropriate Marker conversion script (convert_single.py or convert.py).
  • Updated .gitignore to ignore common local test output directories.

These changes facilitate running Marker for PDF-to-Markdown conversion in a Choreo-managed job environment.
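
As a rough illustration of the wrapper logic described above, the sketch below picks between the two conversion scripts based on whether the input path is a directory. This is not the `run_conversion.sh` from this pull request: the variable names `INPUT_PATH` and `OUTPUT_PATH`, the default paths, the helper function names, and the argument order passed to the Marker scripts are all assumptions made for the example. It prints the command as a dry run instead of executing it.

```shell
#!/bin/sh
# Hypothetical sketch of a run_conversion.sh-style entrypoint.
# INPUT_PATH, OUTPUT_PATH, the default paths, and the argument order
# are assumptions for illustration, not taken from the pull request.
set -eu

# Print the name of the conversion script that matches the input kind.
select_script() {
  if [ -d "$1" ]; then
    echo "convert.py"          # a directory: batch conversion
  else
    echo "convert_single.py"   # anything else is treated as a single PDF
  fi
}

# Print the full command line for the given input and output paths.
build_command() {
  echo "python $(select_script "$1") $1 $2"
}

# Dry run: show the command that would be executed, falling back to
# placeholder paths when the environment variables are not set.
build_command "${INPUT_PATH:-/data/input.pdf}" "${OUTPUT_PATH:-/data/output}"
```

A real entrypoint would likely replace the final dry-run line with an `exec` of the selected script, and fail fast when a required variable is missing rather than falling back to placeholder paths.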
