A Markdown text splitter for modular docs and maximum flexibility.
SplitmeAI is a Python module for managing large Markdown files, particularly when building and maintaining structured static documentation sites with tools such as MkDocs.
Key Features:
- Section Splitting: Breaks down large Markdown files into smaller, manageable sections based on specified heading levels.
- Hierarchy Preservation: Maintains parent heading context within each split file.
- Filename Sanitization: Generates clean, unique filenames for each section, ensuring compatibility and readability.
- Reference Link Management: Extracts and appends reference-style links used within each section.
- Reference Link Conversion: Converts all inline links to reference-style links for improved readability and maintainability.
- Link Validation: Checks every link in a Markdown file and reports which ones are broken.
- Thematic Break Handling: Recognizes and handles thematic breaks (---, ***, ___) for intelligent content segmentation.
- MkDocs Integration: Automatically generates an mkdocs.yml configuration file based on the split sections.
- CLI Support: Provides a user-friendly Command-Line Interface for seamless operation.
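To illustrate the Filename Sanitization feature listed above, here is a minimal sketch of how a heading could be turned into a clean, unique filename. It is not SplitmeAI's actual implementation: the function name `sanitize_filename`, the hyphenated slug format, and the numeric suffix for duplicates are all assumptions made for this example.

```python
import re
import unicodedata


def sanitize_filename(heading: str, used: set[str]) -> str:
    """Turn a heading into a lowercase, hyphenated, unique .md filename.

    Hypothetical helper for illustration; not part of the splitme-ai API.
    """
    # Decompose accented characters, then drop anything that is not ASCII.
    text = unicodedata.normalize("NFKD", heading)
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text).strip("-") or "section"
    # Append a numeric suffix until the name has not been used before.
    candidate, n = slug, 1
    while candidate in used:
        n += 1
        candidate = f"{slug}-{n}"
    used.add(candidate)
    return f"{candidate}.md"


used: set[str] = set()
names = [
    sanitize_filename(h, used)
    for h in ["Getting Started!", "Getting Started!", "API / CLI"]
]
print(names)
```

Tracking used names in a shared set keeps two sections with the same heading from overwriting each other on disk.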
Install from PyPI using one of the package managers below.
Use pip (recommended for most users):
pip install -U splitme-ai
Install in an isolated environment with pipx:
pipx install splitme-ai
For the fastest installation use uv:
uv tool install splitme-ai
Let's take a look at some examples of how to use the splitme-ai CLI.
Example 1: Split a Markdown file on heading level 2 (default setting):
splitme-ai \
--split.i docs/examples/data/README-AI.md \
--split.settings.o docs/examples/output-h2
Example 2: Split on heading level 2 and generate an mkdocs.yml configuration file:
splitme-ai \
--split.i docs/examples/data/README-AI.md \
--split.settings.o docs/examples/output-h2 \
--split.settings.mkdocs
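For readers unfamiliar with what a generated mkdocs.yml contains, the sketch below renders a minimal configuration with one `nav` entry per split file. The function `build_mkdocs_config`, the site name, and the page titles are hypothetical; the file SplitmeAI actually writes may include other keys.

```python
def build_mkdocs_config(site_name: str, pages: list[tuple[str, str]]) -> str:
    """Render a minimal mkdocs.yml with one nav entry per split file.

    Hypothetical helper for illustration; not part of the splitme-ai API.
    """
    lines = [f"site_name: {site_name}", "nav:"]
    for title, filename in pages:
        # Quote titles so characters such as ':' remain valid YAML.
        safe_title = title.replace('"', '\\"')
        lines.append(f'  - "{safe_title}": {filename}')
    return "\n".join(lines) + "\n"


config = build_mkdocs_config(
    "README-AI",
    [("Introduction", "introduction.md"), ("Quick Start", "quick-start.md")],
)
print(config)
```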
Example 3: Split on heading level 3:
splitme-ai \
--split.i docs/examples/data/README-AI.md \
--split.settings.o docs/examples/output-h3 \
--split.settings.hl "###"
Example 4: Split on heading level 4:
splitme-ai \
--split.i docs/examples/data/README-AI.md \
--split.settings.o docs/examples/output-h4 \
--split.settings.hl "####"
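The heading-level examples above can be pictured with a small sketch of splitting on a heading marker while recording parent headings. This is an assumption-laden illustration, not the library's algorithm: `split_on_heading` is a hypothetical name, only ATX headings (`#` style) are recognized, and content that appears before the first matching heading is dropped.

```python
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def split_on_heading(text: str, marker: str = "##") -> list[tuple[str, list[str], str]]:
    """Split text at headings of the marker's level.

    Returns (title, parent_headings, body) tuples. Hypothetical helper.
    """
    level = len(marker)
    sections: list[tuple[str, list[str], str]] = []
    parents: dict[int, str] = {}  # latest heading seen at each shallower level
    current = None  # (title, parent titles, body lines) of the open section
    in_fence = False

    def close() -> None:
        if current is not None:
            sections.append((current[0], current[1], "\n".join(current[2])))

    for line in text.splitlines():
        # Ignore '#' lines inside fenced code blocks.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            depth, title = len(match.group(1)), match.group(2)
            if depth == level:
                close()
                current = (title, [parents[k] for k in sorted(parents)], [line])
                continue
            if depth < level:
                # A shallower heading ends the open section and resets context.
                close()
                current = None
                parents = {k: v for k, v in parents.items() if k < depth}
                parents[depth] = title
                continue
        # Deeper headings and ordinary lines stay inside the open section.
        if current is not None:
            current[2].append(line)
    close()
    return sections


doc = "# Guide\nintro\n## Install\npip\n### Extra\nmore\n## Usage\nrun\n"
sections = split_on_heading(doc, "##")
for title, parent_titles, body in sections:
    print(title, parent_titles)
```

Passing `"###"` instead of `"##"` would split one level deeper, with both the level-1 and level-2 headings carried along as parent context.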
Example 5: Convert inline links to reference-style links:
splitme-ai --reflinks.i tests/data/pydantic.md --reflinks.o with_reflinks.md
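As a rough picture of what converting inline links to reference-style links involves, consider the sketch below. It assumes numeric reference labels and a deliberately simplified link pattern (no link titles, nested brackets, or code spans); the labels and layout SplitmeAI produces may differ, and `to_reference_links` is a hypothetical name.

```python
import re

# Matches [text](url) but skips images such as ![alt](url).
INLINE_LINK_RE = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def to_reference_links(markdown: str) -> str:
    """Rewrite inline links as [text][n] and append the definitions.

    Hypothetical helper for illustration; not part of the splitme-ai API.
    """
    refs: dict[str, int] = {}  # url -> reference number

    def replace(match: re.Match) -> str:
        text, url = match.group(1), match.group(2)
        # Reuse the existing number when the same URL appears again.
        number = refs.setdefault(url, len(refs) + 1)
        return f"[{text}][{number}]"

    body = INLINE_LINK_RE.sub(replace, markdown)
    if not refs:
        return markdown
    definitions = "\n".join(f"[{n}]: {url}" for url, n in refs.items())
    return f"{body.rstrip()}\n\n{definitions}\n"


source = (
    "See the [docs](https://docs.pydantic.dev/) and "
    "[install](https://docs.pydantic.dev/install/) pages, "
    "or the [docs](https://docs.pydantic.dev/) again.\n"
)
converted = to_reference_links(source)
print(converted)
```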
Example 6: Validate all links in a Markdown file:
splitme-ai --validate-links.i tests/data/pydantic.md
The output reports whether each link is valid or broken:
Scanning markdown file tests/data/pydantic.md for broken links...
Markdown Link Check Results:
--------------------------------------------------------------------------------
✓ Line 2: [![CI](https://img.shields.io/github/actions/workflow/status/pydantic/pydantic/ci.yml?branch=main&logo=github&label=CI)
✓ Line 3: [![Coverage](https://coverage-badge.samuelcolvin.workers.dev/pydantic/pydantic.svg)
✓ Line 4: [![pypi](https://img.shields.io/pypi/v/pydantic.svg)
✓ Line 5: [![CondaForge](https://img.shields.io/conda/v/conda-forge/pydantic.svg)
✓ Line 6: [![downloads](https://static.pepy.tech/badge/pydantic/month)
✓ Line 7: [![versions](https://img.shields.io/pypi/pyversions/pydantic.svg)
✓ Line 8: [![license](https://img.shields.io/github/license/pydantic/pydantic.svg)
✓ Line 9: [![Pydantic v2](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/pydantic/pydantic/main/docs/badge/v2.json)
✓ Line 18: [Learn more](https://pydantic.dev/articles/logfire-announcement)
✓ Line 24: [pydantic V1.10 Documentation](https://docs.pydantic.dev/)
✓ Line 24: [`1.10.X-fixes` git branch](https://github.com/pydantic/pydantic/tree/1.10.X-fixes)
✓ Line 28: [documentation](https://docs.pydantic.dev/)
✓ Line 34: [Install](https://docs.pydantic.dev/install/)
Summary: 0 broken links out of 13 total links.
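A report like the one above boils down to finding each link, remembering its line number, and checking the URL. The sketch below shows that flow under stated assumptions: `check_links` is a hypothetical function, the link pattern is simplified, and the reachability check is passed in as a callable so the example runs without network access. A real checker would issue an HTTP request there instead.

```python
import re
from typing import Callable

LINK_RE = re.compile(r"\[([^\]]*)\]\((\S+?)\)")


def check_links(
    markdown: str, is_reachable: Callable[[str], bool]
) -> list[tuple[int, str, bool]]:
    """Return (line_number, url, ok) for every inline link found.

    Hypothetical helper for illustration; not part of the splitme-ai API.
    """
    results = []
    for line_no, line in enumerate(markdown.splitlines(), start=1):
        for match in LINK_RE.finditer(line):
            url = match.group(2)
            results.append((line_no, url, is_reachable(url)))
    return results


sample = (
    "# Title\n"
    "[documentation](https://docs.pydantic.dev/)\n"
    "\n"
    "[missing](https://example.invalid/page)\n"
)
# Stand-in checker: treat the .invalid host as unreachable.
results = check_links(sample, lambda url: "example.invalid" not in url)
broken = sum(1 for _, _, ok in results if not ok)
summary = f"Summary: {broken} broken links out of {len(results)} total links."
print(summary)
```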
View the output of all examples above here.
Note: Explore the [Official Documentation][docs] for more detailed guides and examples.
Roadmap:
- Implement reference link conversion and management.
- Enhance CLI usability and user experience.
- Integrate AI-powered content analysis and segmentation.
- Add robust chunking and splitting algorithms for LLM applications.
- Add support for additional static site generators.
- Add support for additional input and output formats.
Contributions are welcome! For bug reports, feature requests, or questions, please open an issue or submit a pull request on GitHub.
Copyright © 2024-2025 splitme-ai.
Released under the MIT license.