Skip to content

eli64s/splitme-ai

Repository files navigation

SplitMe-AI Logo

Break down your docs. Build up your knowledge.

A Markdown text splitter for modular docs and maximum flexibility.

separator

What is SplitmeAI?

SplitmeAI is a Python module that addresses challenges in managing large Markdown files, particularly when creating and maintaining structured static documentation sites such as Mkdocs.

Key Features:

  • Section Splitting: Breaks down large Markdown files into smaller, manageable sections based on specified heading levels.
  • Hierarchy Preservation: Maintains parent heading context within each split file.
  • Filename Sanitization: Generates clean, unique filenames for each section, ensuring compatibility and readability.
  • Reference Link Management: Extracts and appends reference-style links used within each section.
  • Reference Link Conversion: Convert all inline links to reference-style links for improved readability and maintainability.
  • Link Validation: Checks and validates all links within a Markdown file for accuracy and integrity.
  • Thematic Break Handling: Recognizes and handles line breaks (---, ***, ___) for intelligent content segmentation.
  • MkDocs Integration: Automatically generates an mkdocs.yml configuration file based on the split sections.
  • CLI Support: Provides a user-friendly Command-Line Interface for seamless operation.

Quick Start

Installation

Install from PyPI using your preferred package manager listed below.

 pip

Use pip (recommended for most users):

pip install -U splitme-ai

 pipx

Install in an isolated environment with pipx:

❯ pipx install splitme-ai

 uv

For the fastest installation use uv:

❯ uv tool install splitme-ai

Usage

Using the CLI

Let's take a look at some examples of how to use the splitme-ai CLI.

Splitting a Markdown File

Example 1: Split a Markdown file on heading level 2 (default setting):

splitme-ai \
    --split.i docs/examples/data/README-AI.md \
    --split.settings.o docs/examples/output-h2

Example 2: Split on heading level 2 and generate an mkdocs.yml configuration file:

splitme-ai \
    --split.i docs/examples/data/README-AI.md \
    --split.settings.o docs/examples/output-h2 \
    --split.settings.mkdocs

Example 3: Split on heading level 3:

splitme-ai \
    --split.i docs/examples/data/README-AI.md \
    --split.settings.o docs/examples/output-h3 \
    --split.settings.hl "###"

Example 4: Split on heading level 4:

splitme-ai \
    --split.i docs/examples/data/README-AI.md \
    --split.settings.o docs/examples/output-h4 \
    --split.settings.hl "####"
Converting Reference Links

Example 5: Convert inline links to reference-style links:

splitme-ai --reflinks.i tests/data/pydantic.md --reflinks.o with_reflinks.md
Validating Links

Example 6: Validate all links in a Markdown file:

splitme-ai --validate-links.i tests/data/pydantic.md

The output will display the results of whether the links are valid or broken.

Scanning markdown file tests/data/pydantic.md for broken links...

Markdown Link Check Results:
--------------------------------------------------------------------------------
✓ Line 2: [![CI](https://img.shields.io/github/actions/workflow/status/pydantic/pydantic/ci.yml?branch=main&logo=github&label=CI)
✓ Line 3: [![Coverage](https://coverage-badge.samuelcolvin.workers.dev/pydantic/pydantic.svg)
✓ Line 4: [![pypi](https://img.shields.io/pypi/v/pydantic.svg)
✓ Line 5: [![CondaForge](https://img.shields.io/conda/v/conda-forge/pydantic.svg)
✓ Line 6: [![downloads](https://static.pepy.tech/badge/pydantic/month)
✓ Line 7: [![versions](https://img.shields.io/pypi/pyversions/pydantic.svg)
✓ Line 8: [![license](https://img.shields.io/github/license/pydantic/pydantic.svg)
✓ Line 9: [![Pydantic v2](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/pydantic/pydantic/main/docs/badge/v2.json)
✓ Line 18: [Learn more](https://pydantic.dev/articles/logfire-announcement)
✓ Line 24: [pydantic V1.10 Documentation](https://docs.pydantic.dev/)
✓ Line 24: [`1.10.X-fixes` git branch](https://github.com/pydantic/pydantic/tree/1.10.X-fixes)
✓ Line 28: [documentation](https://docs.pydantic.dev/)
✓ Line 34: [Install](https://docs.pydantic.dev/install/)

Summary: 0 broken links out of 13 total links.

View the output of all examples above here.

Note

Explore the [Official Documentation][docs] for more detailed guides and examples.


Roadmap

  • Implement reference link conversion and management.
  • Enhance CLI usability and user experience.
  • Integrate AI-powered content analysis and segmentation.
  • Add robust chunking and splitting algorithms for LLM applications.
  • Add support for additional static site generators.
  • Add support for additional input and output formats.

Contributing

Contributions are welcome! For bug reports, feature requests, or questions, please open an issue or submit a pull request on GitHub.


License

Copyright © 2024-2025 splitme-ai.
Released under the MIT license.

separator