Repo2Text is a Python-based command-line utility designed to transform entire code repositories into a human-readable text format, making them suitable for processing by large language models (LLMs). The tool respects .gitignore
rules, provides a clean hierarchical view of the repository, and offers additional features like binary file exclusion, clipboard integration, and optional output file creation.
- Generates a hierarchical tree representation of the project structure.
- Excludes files and directories ignored by .gitignore.
- Automatically identifies and omits binary files to prevent irrelevant data from cluttering the output.
- Automatically copies the formatted repository content to the clipboard for easy sharing or further processing.
- Indicates empty files explicitly.
- Limits displayed content to the first 20 lines per file in the console for concise previews.
- Provides a complete copy of all file contents to the clipboard.
- Uses colorama for color-coded console messages, improving readability.
- Saves the formatted output to a specified file if needed.
- Displays the total time taken for the operation.
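The console preview behavior listed above (first 20 lines per file, explicit marking of empty files) can be sketched as follows. This is a minimal illustration, not the tool's actual code: the function name `console_preview` and the marker strings are hypothetical.

```python
PREVIEW_LINES = 20  # matches the 20-line console limit described above


def console_preview(content: str, limit: int = PREVIEW_LINES) -> str:
    """Return at most `limit` lines of `content` for console display."""
    if content == "":
        # Hypothetical marker text; the tool's real wording may differ.
        return "[empty file]"
    lines = content.splitlines()
    if len(lines) <= limit:
        return "\n".join(lines)
    omitted = len(lines) - limit
    # Keep the first `limit` lines and note how many were left out.
    return "\n".join(lines[:limit] + [f"... ({omitted} more lines)"])
```

Only the console view is truncated in this sketch; the full text would still go to the clipboard or output file unchanged.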
- Python 3.6 or higher must be installed on your system.
- Clone the repository:
  git clone <repo_url>
  cd repo2text
- Install the package:
  pip install -e .
Run the following command to process a repository:
repo2text [root_dir] [-o OUTPUT]
- root_dir: The root directory of the project to process. Defaults to the current directory (.).
- -o, --output: Path to save the formatted output file.
To process the current directory and save the output to a file:
repo2text . -o output.txt
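For readers curious how such a command line could be parsed, here is a minimal sketch using the standard library's argparse. It mirrors the documented arguments (an optional root_dir defaulting to "." and -o/--output), but the function name `build_parser` and the help strings are assumptions, not taken from the project's source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented `repo2text [root_dir] [-o OUTPUT]` usage."""
    parser = argparse.ArgumentParser(
        prog="repo2text",
        description="Convert a repository into text suitable for LLMs.",
    )
    # Optional positional argument: falls back to the current directory.
    parser.add_argument(
        "root_dir",
        nargs="?",
        default=".",
        help="Root directory of the project to process (default: .)",
    )
    # Optional output path: None means no file is written.
    parser.add_argument(
        "-o",
        "--output",
        default=None,
        help="Path to save the formatted output file",
    )
    return parser
```

Calling `build_parser().parse_args([".", "-o", "output.txt"])` would correspond to the example command above.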
- Load .gitignore Rules:
  - Combines custom .gitignore patterns with default ignore rules to exclude unnecessary files.
- Build Project Tree:
  - Recursively traverses directories to create a clear hierarchical representation.
- Collect Files:
  - Includes only non-binary, non-ignored files.
- Format Content:
  - Prepares the project tree and file contents in a structured text format.
- Clipboard and File Output:
  - Copies the result to the clipboard and optionally writes it to a file.
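The first two steps above (loading ignore rules and building the tree) can be sketched roughly as below. To stay dependency-free, this illustration swaps in the standard library's fnmatch for pathspec, so it matches entry names against simple glob patterns only and does not implement full .gitignore semantics. The function names and the default pattern list are assumptions.

```python
import fnmatch
import os
from typing import List

# Assumed defaults for illustration; the tool's actual default rules may differ.
DEFAULT_IGNORES = [".git", "__pycache__", "*.pyc"]


def load_ignore_patterns(root: str) -> List[str]:
    """Combine default patterns with non-comment lines from root/.gitignore."""
    patterns = list(DEFAULT_IGNORES)
    path = os.path.join(root, ".gitignore")
    if os.path.isfile(path):
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#"):
                    # Drop a trailing slash so "build/" matches the name "build".
                    patterns.append(line.rstrip("/"))
    return patterns


def is_ignored(name: str, patterns: List[str]) -> bool:
    """Simplified check: match the entry name against each glob pattern."""
    return any(fnmatch.fnmatch(name, p) for p in patterns)


def build_tree(root: str, patterns: List[str], prefix: str = "") -> List[str]:
    """Recursively list non-ignored entries as indented lines, directories first."""
    lines: List[str] = []
    entries = sorted(os.scandir(root), key=lambda e: (e.is_file(), e.name))
    for entry in entries:
        if is_ignored(entry.name, patterns):
            continue
        if entry.is_dir():
            lines.append(f"{prefix}{entry.name}/")
            lines.extend(build_tree(entry.path, patterns, prefix + "  "))
        else:
            lines.append(f"{prefix}{entry.name}")
    return lines
```

Joining the returned lines with newlines gives a plain indented tree. A real implementation would also need to handle negation patterns, anchored paths, and nested .gitignore files, which is what pathspec is for.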
- Warns if .gitignore is missing or empty.
- Skips files that cannot be read, marking them as inaccessible.
- Omits binary files automatically.
- Provides clear error messages for invalid input or file I/O issues.
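One way the binary and unreadable-file handling described above might look is sketched here. The NUL-byte heuristic, the function names, and the "[inaccessible]" marker are all assumptions chosen for illustration; the project may detect binary files differently.

```python
from typing import Optional


def looks_binary(path: str, sample_size: int = 1024) -> bool:
    """Heuristic: treat a file as binary if its first bytes contain a NUL byte."""
    with open(path, "rb") as fh:
        return b"\0" in fh.read(sample_size)


def read_text_or_marker(path: str) -> Optional[str]:
    """Return the file's text, None for binary files, or a marker if unreadable."""
    try:
        if looks_binary(path):
            return None  # caller omits binary files from the output
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except (OSError, UnicodeDecodeError):
        # Hypothetical marker; covers missing files, permission errors,
        # and files that are not valid UTF-8.
        return "[inaccessible]"
```

Returning a marker instead of raising lets the rest of the run continue, so a single unreadable file does not abort the whole export.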
- colorama: For colorized console output.
- pathspec: To handle .gitignore patterns.
- pyperclip: For clipboard integration.
Install dependencies automatically with the pip install -e . command.
We welcome contributions! To contribute:
- Fork the repository.
- Create a feature branch.
- Submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
For issues or feature requests, please open an issue on the GitHub repository.