This repo is a personal project that solves one specific problem: preparing Excel/CSV data so that Large Language Models (LLMs) can process it efficiently and use it effectively.
- Excel Data Processing: Prepares and processes Excel/CSV data specifically for LLM usage.
- LLM Agent Integration: Handles LLM agent workflows and interactions with processed data.
- LLM-Specific Tasks: Optimizes the way LLMs interact with structured data like Excel files for better results.
Out of Scope:
- Conversation UI (e.g., Streamlit)
- General-purpose Excel reading tools (this assumes you have the data ready to go).
- Broad data aggregation or non-LLM-related processing tasks.
This project plans to use:
- Python: The backbone of it all.
- Poetry: For dependency management and keeping the project organized.
- LangChain: To streamline LLM workflows.
- Python 3.8 or newer.
- Poetry installed on your system.
- Clone the repository:
git clone https://github.com/yourusername/llm-data-injection.git
cd llm-data-injection
- Install dependencies:
poetry install
- Prepare your Excel file (this repo assumes clean data that is ready for processing).
- Use the provided scripts to preprocess the data and inject it into the LLM pipeline.
- Run the LLM agent to handle queries, summaries, or any task you define.
- You’re focused on integrating Excel data with LLM agents, not on building UI or generic tools.
- You want streamlined scripts for Excel-to-LLM workflows.
- You need a specialized solution, not a generic one.
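As a rough illustration of the "preprocess the data and inject it into the LLM pipeline" step, here is a minimal sketch that turns CSV text into a Markdown table and wraps it in a prompt. It is not the repo's actual script: the function names `csv_to_markdown` and `build_prompt`, the `max_rows` cap, and the prompt wording are all assumptions made for this example. It uses only the standard library and does not escape `|` characters inside cells.

```python
import csv
import io


def csv_to_markdown(csv_text: str, max_rows: int = 20) -> str:
    """Render CSV text as a Markdown table, keeping at most max_rows data rows.

    Hypothetical helper for illustration; not part of this repo's scripts.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, data = rows[0], rows[1 : 1 + max_rows]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in data:
        # Pad or trim ragged rows so every line has the same column count.
        cells = (row + [""] * len(header))[: len(header)]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


def build_prompt(table: str, question: str) -> str:
    """Combine the rendered table and a question into one prompt string.

    The prompt wording is an assumption, not a recommended template.
    """
    return (
        "Answer the question using only the table below.\n\n"
        f"{table}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    sample = "region,sales\nNorth,120\nSouth,95\n"
    print(build_prompt(csv_to_markdown(sample), "Which region sold more?"))
```

The resulting string could then be passed to whatever LLM client or agent you use. Capping the row count is one simple way to keep large sheets within a model's context window, at the cost of dropping the remaining rows.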
Feel free to contribute. Fork the repo, make your changes, and submit a pull request!
Here’s the narrowed-down focus:
- Build rock-solid Excel data preprocessing for LLMs.
- Develop and refine LLM agent workflows.
- Add optimizations for common LLM tasks involving structured data.