From 6f81c531986c8aa0bd7fe2d18df4fe8e1ad4cecd Mon Sep 17 00:00:00 2001 From: datalayer0 Date: Wed, 12 Feb 2025 17:26:25 +0300 Subject: [PATCH 1/3] docs: update README.md with comprehensive documentation --- README.md | 164 +++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 162 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 14e0917..20e4fbc 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,171 @@ # Django Assistant Bot +A powerful Django-based framework for building AI-powered chat assistants with support for multiple AI models and platforms. + +## Features + +- Support for multiple AI providers (OpenAI, Groq, Ollama, Transformers) +- Built-in Telegram platform integration +- Extensible architecture for adding new platforms +- Document processing and RAG (Retrieval-Augmented Generation) capabilities +- Multi-language support +- Dialog management and persistence +- Resource management system +- Asynchronous message processing +- Debug mode for development +- Whitelist access control + ## Installation ```bash pip install . ``` -# Example +## Project Structure + +The project is organized into several Django apps: + +### admin +Administrative interface and management commands. + +### ai +Core AI functionality: +- `dialog.py`: AI dialog management +- `domain.py`: Core domain models +- `embedders/`: Various embedding providers (GPU, Ollama, OpenAI, Transformers) +- `providers/`: AI model providers implementation +- `services/`: AI-related services + +### bot +Main bot functionality: +- `assistant_bot.py`: Core bot implementation +- `chat_completion.py`: Chat completion handling +- `platforms/`: Platform-specific implementations (e.g., Telegram) +- `services/`: Bot-related services +- `schemas/`: JSON schemas for various bot operations + +### loading +Data loading functionality: +- CSV data loading +- Management commands for data import + +### processing +Document processing: +- Wiki document processing +- Document splitting and formatting +- Question generation and merging +- Custom processing steps + +### rag +Retrieval-Augmented Generation: +- Search services +- Document retrieval + +### storage +Data storage: +- Document storage +- API endpoints +- Database models + +### utils +Utility functions: +- Database utilities +- Debugging tools +- JSON schema handling +- Language processing +- Throttling + +## Configuration + +The bot can be configured through Django settings and environment variables. Key settings include: + +- `DEFAULT_AI_MODEL`: Default AI model to use +- `DIALOG_FAST_AI_MODEL`: Model for quick responses +- `DIALOG_STRONG_AI_MODEL`: Model for complex processing + +## Usage + +1. Install the package: +```bash +pip install . +``` + +2. Add required apps to INSTALLED_APPS in your Django settings: +```python +INSTALLED_APPS = [ + ... + 'assistant.admin', + 'assistant.ai', + 'assistant.bot', + 'assistant.loading', + 'assistant.processing', + 'assistant.rag', + 'assistant.storage', +] +``` + +3. Configure your AI providers in settings.py: +```python +DEFAULT_AI_MODEL = 'your-default-model' +DIALOG_FAST_AI_MODEL = 'fast-model' +DIALOG_STRONG_AI_MODEL = 'strong-model' +``` + +4. Set up your platform credentials (e.g., Telegram bot token) in environment variables. + +5. Run migrations: +```bash +python manage.py migrate +``` + +## Example Implementation + +See the `example` directory for a complete working example of a bot implementation. + +## Commands + +The bot supports various commands: + +- `/start`: Start a new conversation +- `/help`: Show help message +- `/continue`: Continue the previous response +- `/new`: Start a new dialog +- `/model `: Switch AI model +- `/models`: List available AI models +- `/debug`: Show debug information +- `/doc `: Show document content +- `/wiki `: Show wiki document + +## Development + +### Adding New AI Providers + +1. Create a new provider in `assistant.ai.providers` +2. Implement the required interface +3. Register the provider in settings + +### Adding New Platforms + +1. Create a new platform implementation in `assistant.bot.platforms` +2. Implement the platform interface +3. Register the platform in your bot configuration + +## Dependencies + +Key dependencies include: +- Django 4.2.13 +- djangorestframework 3.15.1 +- python-telegram-bot 21.1.1 +- openai 1.28.1 +- groq 0.6.0 +- ollama 0.4.4 +- celery 5.4.0 +- pgvector 0.2.5 (for vector embeddings) + +## License + +This project is licensed under the MIT License. + +## Author -See the example in the `example` directory. \ No newline at end of file +Aleksandr Fedotov (a_fedotov89@mail.ru) \ No newline at end of file From 7c7c860985dee855d0f9abbeed42693ff6db90e2 Mon Sep 17 00:00:00 2001 From: datalayer0 Date: Wed, 12 Feb 2025 17:31:05 +0300 Subject: [PATCH 2/3] docs: enhance documentation with detailed module descriptions --- README.md | 319 +++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 233 insertions(+), 86 deletions(-) diff --git a/README.md b/README.md index 20e4fbc..564511c 100644 --- a/README.md +++ b/README.md @@ -26,141 +26,288 @@ pip install . The project is organized into several Django apps: ### admin -Administrative interface and management commands. +Administrative interface and management commands: +- Management commands for queue processing +- Admin interface customization +- Bot configuration management ### ai Core AI functionality: -- `dialog.py`: AI dialog management -- `domain.py`: Core domain models -- `embedders/`: Various embedding providers (GPU, Ollama, OpenAI, Transformers) -- `providers/`: AI model providers implementation -- `services/`: AI-related services + +#### dialog.py +Handles AI dialog management: +- Message context management +- AI provider integration +- Response handling +- Token calculation +- Context size management + +#### domain.py +Core domain models for AI interaction: +- Message structure +- AI Response format +- Common data types + +#### embedders/ +Various embedding providers: +- `gpu_service.py`: GPU-accelerated embeddings +- `ollama.py`: Ollama embeddings integration +- `openai.py`: OpenAI embeddings +- `transformers.py`: Hugging Face Transformers embeddings + +#### providers/ +AI model providers implementation: +- `base.py`: Base provider interface +- `gpu_service.py`: GPU service integration +- `groq.py`: Groq API integration +- `ollama.py`: Ollama models support +- `openai.py`: OpenAI API integration +- `transformers.py`: Local transformer models + +#### services/ +AI-related services: +- `ai_service.py`: Core AI service functionality + - Provider management + - Response formatting + - Error handling ### bot Main bot functionality: -- `assistant_bot.py`: Core bot implementation -- `chat_completion.py`: Chat completion handling -- `platforms/`: Platform-specific implementations (e.g., Telegram) -- `services/`: Bot-related services -- `schemas/`: JSON schemas for various bot operations + +#### assistant_bot.py +Core bot implementation: +- Message handling +- Command processing +- State management +- Dialog control +- Platform integration + +#### chat_completion.py +Chat completion handling: +- Message generation +- Context management +- Model selection +- Response formatting + +#### platforms/ +Platform-specific implementations: +- Telegram integration with markdown support +- Extensible base for other platforms + +#### services/ +Bot-related services: +- `dialog_service.py`: Dialog management +- `instance_service.py`: Bot instance handling +- `schema_service.py`: JSON schema validation +- `context_service/`: Context management + +#### schemas/ +JSON schemas for various operations: +- Context checking +- Document selection +- Question handling +- Topic classification +- Search operations ### loading Data loading functionality: -- CSV data loading -- Management commands for data import +- CSV data import +- Data validation +- Format conversion +- Import management commands ### processing Document processing: -- Wiki document processing -- Document splitting and formatting -- Question generation and merging + +#### documents/ +Document processing pipeline: +- `processor.py`: Main document processor - Custom processing steps +#### schemas/ +Processing operation schemas: +- Document questions generation +- Document formatting +- Question merging +- Document splitting +- Section extraction + +#### wiki.py +Wiki document processing: +- Content extraction +- Structure analysis +- Metadata handling + ### rag Retrieval-Augmented Generation: -- Search services -- Document retrieval + +#### services/ +- `search_service.py`: Vector-based document search + - Similarity scoring + - Result ranking + - Context retrieval ### storage Data storage: + +#### models.py +Database models: - Document storage -- API endpoints -- Database models +- Wiki document management +- Embedding storage +- Metadata management + +#### api/ +REST API implementation: +- `filters.py`: Query filtering +- `pagination.py`: Result pagination +- `serializers.py`: Data serialization +- `views.py`: API endpoints ### utils Utility functions: -- Database utilities -- Debugging tools -- JSON schema handling -- Language processing -- Throttling + +#### db.py +Database utilities: +- Connection management +- Query optimization +- Transaction handling + +#### debug.py +Debugging tools: +- Performance monitoring +- Error tracking +- Debug logging + +#### json_schema.py +JSON schema utilities: +- Schema validation +- Format checking +- Error reporting + +#### language.py +Language processing: +- Language detection +- Text processing +- Character encoding + +#### repeat_until.py +Retry mechanism: +- Operation retrying +- Error handling +- Timeout management + +#### throttle.py +Rate limiting: +- API call throttling +- Request queuing +- Limit enforcement ## Configuration -The bot can be configured through Django settings and environment variables. Key settings include: +### Environment Variables +```bash +# AI Provider Settings +OPENAI_API_KEY=your_openai_key +GROQ_API_KEY=your_groq_key +OLLAMA_HOST=http://localhost:11434 -- `DEFAULT_AI_MODEL`: Default AI model to use -- `DIALOG_FAST_AI_MODEL`: Model for quick responses -- `DIALOG_STRONG_AI_MODEL`: Model for complex processing +# Database Settings +DATABASE_URL=postgresql://user:pass@localhost/dbname -## Usage +# Redis Settings (for Celery) +REDIS_URL=redis://localhost:6379/0 -1. Install the package: -```bash -pip install . +# Telegram Bot Settings +TELEGRAM_BOT_TOKEN=your_bot_token ``` -2. Add required apps to INSTALLED_APPS in your Django settings: +### Django Settings ```python -INSTALLED_APPS = [ - ... - 'assistant.admin', - 'assistant.ai', - 'assistant.bot', - 'assistant.loading', - 'assistant.processing', - 'assistant.rag', - 'assistant.storage', -] +# AI Models Configuration +DEFAULT_AI_MODEL = 'gpt-3.5-turbo' +DIALOG_FAST_AI_MODEL = 'gpt-3.5-turbo' +DIALOG_STRONG_AI_MODEL = 'gpt-4' + +# Vector Search Settings +VECTOR_SIMILARITY_THRESHOLD = 0.8 +MAX_SEARCH_RESULTS = 5 + +# Bot Settings +BOT_MESSAGE_TIMEOUT = 60 +BOT_MAX_RETRIES = 3 ``` -3. Configure your AI providers in settings.py: +## Development + +### Adding New AI Providers + +1. Create a new provider class in `assistant.ai.providers`: ```python -DEFAULT_AI_MODEL = 'your-default-model' -DIALOG_FAST_AI_MODEL = 'fast-model' -DIALOG_STRONG_AI_MODEL = 'strong-model' +from assistant.ai.providers.base import AIProvider + +class NewProvider(AIProvider): + async def get_response(self, messages, max_tokens=1024, json_format=False): + # Implementation + pass ``` -4. Set up your platform credentials (e.g., Telegram bot token) in environment variables. +2. Register in `ai_service.py`: +```python +PROVIDERS = { + 'new-provider': NewProvider, +} +``` -5. Run migrations: -```bash -python manage.py migrate +### Adding New Platforms + +1. Create platform implementation: +```python +from assistant.bot.domain import BotPlatform + +class NewPlatform(BotPlatform): + async def send_message(self, chat_id, text, **kwargs): + # Implementation + pass ``` -## Example Implementation +2. Register in your bot configuration. -See the `example` directory for a complete working example of a bot implementation. +## Testing -## Commands +Run tests with pytest: +```bash +pytest +``` -The bot supports various commands: +Test coverage: +```bash +pytest --cov=assistant +``` -- `/start`: Start a new conversation -- `/help`: Show help message -- `/continue`: Continue the previous response -- `/new`: Start a new dialog -- `/model `: Switch AI model -- `/models`: List available AI models -- `/debug`: Show debug information -- `/doc `: Show document content -- `/wiki `: Show wiki document +## Deployment -## Development +1. Install requirements: +```bash +pip install -r requirements.txt +``` -### Adding New AI Providers +2. Configure environment variables -1. Create a new provider in `assistant.ai.providers` -2. Implement the required interface -3. Register the provider in settings +3. Run migrations: +```bash +python manage.py migrate +``` -### Adding New Platforms +4. Start Celery worker: +```bash +celery -A your_project worker -l info +``` -1. Create a new platform implementation in `assistant.bot.platforms` -2. Implement the platform interface -3. Register the platform in your bot configuration - -## Dependencies - -Key dependencies include: -- Django 4.2.13 -- djangorestframework 3.15.1 -- python-telegram-bot 21.1.1 -- openai 1.28.1 -- groq 0.6.0 -- ollama 0.4.4 -- celery 5.4.0 -- pgvector 0.2.5 (for vector embeddings) +5. Run the bot: +```bash +python manage.py runbot +``` ## License From cde979fb7769454b5f83238a3b5e8b9e7c1f9fcc Mon Sep 17 00:00:00 2001 From: datalayer0 Date: Wed, 12 Feb 2025 17:33:57 +0300 Subject: [PATCH 3/3] docs: add RAG system examples and usage documentation --- README.md | 157 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 157 insertions(+) diff --git a/README.md b/README.md index 564511c..7a08282 100644 --- a/README.md +++ b/README.md @@ -109,6 +109,97 @@ JSON schemas for various operations: - Topic classification - Search operations +### Domain Models + +#### Update +Represents an incoming update from the platform: +```python +@dataclasses.dataclass +class Update: + chat_id: str # Chat identifier + message_id: Optional[int] # Message identifier + text: Optional[str] # Message text + photo: Optional[Photo] # Photo data if present + user: Optional[User] # User information + callback_query: Optional[CallbackQuery] # Callback data +``` + +#### Answer Types +Two types of responses are supported: + +1. SingleAnswer: +```python +answer = SingleAnswer( + text="Response text", + thinking="Internal thought process", + image_url="Optional image URL", + is_markdown=True, + buttons=[[Button("Click me", "/command")]], + state={"key": "value"}, + usage=[{"model": "gpt-4", "tokens": 150}] +) +``` + +2. MultiPartAnswer for complex responses: +```python +multi_answer = MultiPartAnswer([ + SingleAnswer(text="Part 1"), + SingleAnswer(text="Part 2") +]) +``` + +### Command System Examples + +1. Basic command: +```python +@AssistantBot.command(r'/start') +async def command_start(self, match, message_id): + return SingleAnswer("Bot started!") +``` + +2. Command with parameters: +```python +@AssistantBot.command(r'/search\s+(.*)') +async def command_search(self, match, message_id): + query = match.group(1) + return SingleAnswer(f"Searching for: {query}") +``` + +### State Management Examples + +1. Updating state: +```python +await self.update_state({ + 'current_mode': 'search', + 'last_query': 'example', + 'results_count': 5 +}) +``` + +2. Reading state: +```python +current_mode = self.instance.state.get('current_mode') +if current_mode == 'search': + # Handle search mode +``` + +### Resource Management Examples + +1. Loading localized messages: +```python +# messages/en/welcome.txt +resource_manager = ResourceManager(codename='mybot', language='en') +welcome_text = resource_manager.get_message('welcome.txt') +``` + +2. Error handling: +```python +try: + text = resource_manager.get_phrase('key') +except NoMessageFound: + text = "Default message" +``` + ### loading Data loading functionality: - CSV data import @@ -147,6 +238,72 @@ Retrieval-Augmented Generation: - Result ranking - Context retrieval +### RAG Examples + +#### Document Search and Retrieval +```python +from assistant.rag.services.search_service import SearchService +from assistant.storage.models import Document + +# Initialize search service +search_service = SearchService() + +# Search for relevant documents +results = await search_service.search( + query="How to configure logging?", + limit=3, + similarity_threshold=0.8 +) + +# Process search results +for doc in results: + print(f"Document: {doc.name}") + print(f"Similarity: {doc.similarity_score}") + print(f"Content: {doc.content[:200]}...") +``` + +#### Integrating RAG with Bot Responses +```python +async def handle_user_query(query: str) -> str: + # Search for relevant documents + docs = await search_service.search(query) + + # Build context from documents + context = "\n".join([ + f"Document '{doc.name}':\n{doc.content}" + for doc in docs + ]) + + # Create messages with context + messages = [ + { + "role": "system", + "content": f"Use this context to answer questions:\n{context}" + }, + { + "role": "user", + "content": query + } + ] + + # Get AI response with context + response = await ai_service.get_response(messages) + return response.text +``` + +#### Vector Search Configuration +```python +# settings.py +VECTOR_SEARCH_SETTINGS = { + 'model': 'text-embedding-3-small', # Embedding model + 'dimensions': 1536, # Vector dimensions + 'metric': 'cosine', # Similarity metric + 'index_type': 'hnsw', # Index type for pgvector + 'ef_search': 100, # HNSW search parameter + 'm': 16 # HNSW graph parameter +} +``` + ### storage Data storage: