
Vector Embedding and RAG with LLM Integration in Rust

This repository provides a Rust-based system for managing vector embeddings and querying them through a LanceDB vector database. Retrieved context is passed to an LLM for chat (currently compatible with Ollama). The system handles embedding generation, storage, querying, and interactive chat sessions.

Table of Contents

  • Overview
  • Features
  • Getting Started
  • Usage
  • Configuration
  • Embedding and Querying
  • Chat Integration
  • Testing
  • Code Structure
  • Contributing
  • License

Overview

The system is composed of several modules that handle different aspects of the embedding, querying, and chat processes:

  • Commands: Handles command-line arguments and subcommands.
  • Config: Manages configuration settings for embedding requests, database connections, and chat interactions.
  • Constants: Provides constant values used throughout the application.
  • Embedding: Contains logic for generating embeddings and persisting them to the database.
  • VectorDB: Handles interactions with the LanceDB vector database for storing and querying vector embeddings.
  • Chat: Integrates with the Ollama LLM model to provide interactive chat functionalities based on retrieved embeddings.
  • TODO: Add tests, PDF support, and an agent refactor.

Features

  • File Type Support: The tool supports multiple file types including Rust (rs), Python (py), C++ (cpp), Java (java), JavaScript (js), TypeScript (ts), and text files. (TODO: Add PDF support)
  • LanceDB Integration (a minimal query sketch follows this list):
    • Create and manage vector tables in LanceDB.
    • Insert and update records in LanceDB tables.
    • Query the nearest vectors in LanceDB tables.
  • Embedding:
    • Generate embeddings for text data using an external embedding service.
    • Store embeddings in LanceDB tables.
  • CLI Interface: Command-line interface for easy interaction with the tool.
  • Database Persistence: Embeddings are stored on disk in a LanceDB database.
  • Querying: Query the database to find nearest neighbors based on vector embeddings.
  • Chat Integration: Interact with the Ollama LLM model using embeddings retrieved from the database.
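For orientation, here is a minimal sketch of the kind of nearest-neighbor lookup this tool performs, written against the public lancedb Rust crate (the crate's async API plus the tokio and futures crates are assumed; the database and table names are placeholders, and the repository's own query code may differ):

    use futures::TryStreamExt;
    use lancedb::query::{ExecutableQuery, QueryBase};

    #[tokio::main]
    async fn main() -> lancedb::Result<()> {
        // Connect to (or create) a LanceDB database directory on disk.
        let db = lancedb::connect("data/sample_db").execute().await?;

        // Open an existing table of embeddings.
        let table = db.open_table("sample_table").execute().await?;

        // Retrieve the 3 nearest neighbors of a query vector; the dimension
        // must match the stored embeddings (768 for nomic-embed-text).
        let query_vector = vec![0.1f32; 768];
        let batches: Vec<_> = table
            .query()
            .nearest_to(query_vector)?
            .limit(3)
            .execute()
            .await?
            .try_collect()
            .await?;

        println!("retrieved {} record batch(es)", batches.len());
        Ok(())
    }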

Getting Started

Prerequisites

  • Rust (latest stable version)
  • LanceDB (embedded vector database; no separate server required)
  • Docker (for running tests)
  • A running Ollama service with nomic-embed-text or a similar embedding model pulled (a quick health-check sketch follows this list)
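You can sanity-check the Ollama prerequisite programmatically. A minimal sketch, assuming the reqwest, serde_json, and tokio crates and Ollama's default address (GET /api/tags lists the locally available models):

    // List locally available Ollama models to confirm the service is up
    // and the embedding model has been pulled.
    #[tokio::main]
    async fn main() -> Result<(), reqwest::Error> {
        let models: serde_json::Value = reqwest::get("http://localhost:11434/api/tags")
            .await?
            .json()
            .await?;
        println!("available models: {models}");
        Ok(())
    }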

Installation

  1. Clone the repository:

    git clone https://github.com/rupeshtr78/rag-agent-rust.git
    cd rag-agent-rust
  2. Build the project (Cargo fetches dependencies automatically):

    cargo build

Running the Application

  1. Ensure the Ollama service is running with the specified model.

  2. Run the application:

    cargo run -- --help

Usage

Commands:
  version      Get the version of the application
  load         Load a directory of files into the lance vector database
  lance-query  Query the Lance Vector Database
  rag-query    Query the Lance Vector Database and chat with the AI
  generate     Chat with the AI
  exit         Exit the application
  man          
  help         Print this message or the help of the given subcommand(s)

Options:
  -l, --log-level <LOG_LEVEL>    [possible values: debug, info, warn, error, off]
  -g, --agent-mode <AGENT_MODE>  Select to run in agent mode [possible values: true, false]
  -h, --help                     Print help
  -V, --version                  Print version

Commands

The application supports various commands and subcommands. Use the --help flag to see available options:

# Run in interactive CLI (agent) mode
cargo run -- -g true

# Agent mode with debug logging
cargo run -- -g true -l debug

# Generate embeddings for a directory of files and store them in the database
cargo run -- load -p sample/

# Query the database and chat with the AI using the retrieved context
cargo run -- rag-query -t sample_table -d sample_db -i "what is temperature"

# Start an interactive chat session (generate subcommand)
cargo run -- generate -p "what is mirostat"

Configuration

Configuration settings for embedding requests, database connections, and chat interactions are managed in src/app/config.rs. You can modify these settings as needed.
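As a purely hypothetical illustration of what those settings cover (the struct and field names below are invented for this README, not taken from the repository), a configuration along these lines is typical:

    // Hypothetical shape of the configuration; see src/app/config.rs
    // for the actual fields and defaults.
    pub struct AppConfig {
        /// Base URL of the Ollama service used for embeddings and chat.
        pub ollama_url: String, // e.g. "http://localhost:11434"
        /// Embedding model requested from Ollama.
        pub embed_model: String, // e.g. "nomic-embed-text"
        /// Directory of the LanceDB database.
        pub db_path: String, // e.g. "data/sample_db"
        /// Default table for storing and querying embeddings.
        pub table_name: String,
        /// Number of nearest neighbors retrieved for RAG queries.
        pub top_k: usize,
    }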

Embedding and Querying

  • Generate Embeddings: Use the run_embedding function to generate embeddings and persist them to the database (a sketch of the underlying embedding call follows this list).
  • Query Embeddings: Use the run_query function to query the database for the nearest neighbors of a query vector.
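Under the hood, generating an embedding amounts to one HTTP call per chunk of text. A minimal sketch of such a call, assuming the reqwest, serde_json, and tokio crates and Ollama's /api/embeddings endpoint (the repository's run_embedding may be structured differently):

    use serde_json::json;

    // Request an embedding vector for one piece of text from Ollama.
    async fn embed(text: &str) -> Result<Vec<f32>, reqwest::Error> {
        let resp: serde_json::Value = reqwest::Client::new()
            .post("http://localhost:11434/api/embeddings")
            .json(&json!({ "model": "nomic-embed-text", "prompt": text }))
            .send()
            .await?
            .json()
            .await?;
        // The response carries the vector under the "embedding" key.
        let vector = resp["embedding"]
            .as_array()
            .map(|a| a.iter().filter_map(|v| v.as_f64()).map(|v| v as f32).collect())
            .unwrap_or_default();
        Ok(vector)
    }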

Chat Integration

  • Interactive Chat: Use the rag-query command for context-aware chat, where embeddings retrieved from the database ground the model's responses, or the generate command for a plain chat session with the Ollama LLM (a prompt-assembly sketch follows).
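The context-aware step itself is simple: the retrieved chunks are spliced into the prompt before it is sent to the model. A minimal sketch, assuming the reqwest, serde_json, and tokio crates, Ollama's /api/generate endpoint, and a placeholder model name (the repository's chat code may assemble the prompt differently):

    use serde_json::json;

    // Ask the model a question, grounded in context retrieved from LanceDB.
    async fn rag_answer(context: &str, question: &str) -> Result<String, reqwest::Error> {
        let prompt = format!(
            "Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
        );
        let resp: serde_json::Value = reqwest::Client::new()
            .post("http://localhost:11434/api/generate")
            .json(&json!({ "model": "llama3", "prompt": prompt, "stream": false }))
            .send()
            .await?
            .json()
            .await?;
        Ok(resp["response"].as_str().unwrap_or_default().to_string())
    }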

Testing

The test suite requires a running Ollama service with an embedding model and an LLM model available in the expected configuration. (TODO: tests are still missing.)

    cargo test

Code Structure

  • code_loader.rs: Handles file type detection and content loading.
  • load_lancedb.rs: Manages LanceDB table creation, insertion, and querying.
  • query.rs: Contains logic for running queries on LanceDB tables.
  • chat.rs: Implements chat functionalities using the Ollama LLM model.
  • main.rs: Entry point for the CLI application.

Contributing

Contributions are welcome! Please read the CONTRIBUTING.md file for details on how to contribute to this project.

License

This project is licensed under the MIT License - see the LICENSE file for details.
