
support for open ai cache #6

Open
bixia wants to merge 2 commits into BradMoonUESTC:main from bixia:support-openai-cache

Conversation


@bixia bixia commented Dec 24, 2024

Improve OpenAI API Response Caching System

Overview

This PR enhances the caching system for OpenAI API calls to improve performance and reduce API costs. The changes provide consistent caching behavior across different API endpoints (OpenAI, Azure, Claude) and request types (chat completions, embeddings).

Key Changes

Cache Implementation

  • Unified caching approach for all API endpoints (OpenAI, Azure, Claude)
  • Consistent cache key generation using request data
  • JSON serialization for embedding responses
  • Zero-vector fallback for embedding errors
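
The PR text does not include the implementation, but the cache key and serialization scheme described above could be sketched roughly as follows. The function names (`make_cache_key`, `serialize_embedding`, `deserialize_embedding`) are hypothetical, not taken from the PR:

```python
import hashlib
import json

def make_cache_key(endpoint: str, request_data: dict) -> str:
    """Derive a deterministic cache key from the endpoint name and request payload.

    sort_keys=True makes the JSON serialization order-independent, so the same
    logical request hashes to the same key regardless of dict insertion order.
    """
    payload = json.dumps(request_data, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(f"{endpoint}:{payload}".encode("utf-8")).hexdigest()

def serialize_embedding(vector: list) -> str:
    """Store embedding vectors as JSON text so any key-value backend can hold them."""
    return json.dumps(vector)

def deserialize_embedding(raw: str) -> list:
    """Restore a JSON-serialized embedding vector to a Python list."""
    return json.loads(raw)
```

Because the key covers both the endpoint and the full request data, identical prompts sent to different endpoints (e.g. OpenAI vs. Claude) get separate cache entries.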

Supported Endpoints

  • OpenAI chat completions
  • Azure OpenAI completions
  • Claude chat completions
  • OpenAI embeddings
  • Custom embeddings service

Error Handling

  • Improved error handling with fallback responses
  • Zero-vector returns for embedding failures
  • Cache miss handling with proper error messages
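
A zero-vector fallback like the one described could look like this minimal sketch. The wrapper name, the `fetch_fn` parameter, and the embedding dimension are assumptions for illustration, not the PR's actual code:

```python
import logging

EMBEDDING_DIM = 1536  # assumed dimension (e.g. OpenAI text-embedding-ada-002)

def safe_get_embedding(text: str, fetch_fn) -> list:
    """Call fetch_fn(text) and fall back to a zero vector on any failure,
    so downstream similarity code always receives a vector of the right shape."""
    try:
        return fetch_fn(text)
    except Exception as exc:
        logging.warning("embedding request failed, returning zero vector: %s", exc)
        return [0.0] * EMBEDDING_DIM
```

Returning a correctly shaped zero vector keeps batch pipelines running through transient API failures, at the cost of one meaningless similarity score rather than a crash.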

Benefits

  • Reduced API costs through efficient response caching
  • Improved performance for repeated queries
  • Consistent caching behavior across all endpoints
  • Better error recovery and fallback mechanisms

Testing

The changes have been tested with:

  • Standard OpenAI endpoints
  • Azure OpenAI endpoints
  • Claude API endpoints
  • Custom embedding services
  • Error scenarios and fallbacks

Usage Example

```python
# The cache is automatically used for all API calls
response = common_ask(prompt)  # will use the cache if available

# Embeddings are also cached
embedding = common_get_embedding(text)  # cached with proper JSON serialization
```

Notes

  • No database schema changes required
  • Backwards compatible with existing cache entries
  • Thread-safe implementation
  • Proper JSON serialization for embedding vectors
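
The notes claim the implementation is thread-safe; one common way to get that for an in-process cache is a lock around the lookup/store path. This is a hypothetical sketch (class and method names are not from the PR), and a real backend might be a database rather than a dict:

```python
import threading

class ThreadSafeCache:
    """Minimal lock-guarded in-memory cache."""

    def __init__(self):
        self._lock = threading.Lock()
        self._store = {}

    def get(self, key):
        with self._lock:
            return self._store.get(key)

    def set(self, key, value):
        with self._lock:
            self._store[key] = value

    def get_or_compute(self, key, compute):
        # Check under the lock, but run compute() outside it so a slow
        # API call does not block other threads, then store the result.
        cached = self.get(key)
        if cached is not None:
            return cached
        value = compute()
        self.set(key, value)
        return value
```

Note that two threads racing on the same missing key may both call `compute()`; for an idempotent, cacheable API response that is wasteful but harmless.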

Related Issues

  • Reduces API costs through caching
  • Improves response times for repeated queries
  • Provides consistent behavior across different API endpoints

Future Improvements

  • Add cache expiration policies
  • Implement cache size limits
  • Add cache statistics tracking
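
The proposed expiration policy could be as simple as storing an expiry timestamp alongside each value; a hypothetical TTL check (not part of this PR) might look like:

```python
import time

def make_entry(value, ttl_seconds: float) -> dict:
    """Store a value together with its expiry timestamp (monotonic clock)."""
    return {"value": value, "expires_at": time.monotonic() + ttl_seconds}

def get_if_fresh(entry):
    """Return the cached value, or None if the entry is absent or its TTL elapsed."""
    if entry is None or time.monotonic() >= entry["expires_at"]:
        return None
    return entry["value"]
```

Treating an expired entry as a cache miss lets the existing miss-handling path refetch and overwrite it, so no separate eviction sweep is strictly required.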

@K2
Collaborator

K2 commented Feb 23, 2025

@bixia Cool stuff! I'd love to merge something like this, but I'm too new here to check it in just yet. I wrote a similar cache in my dev fork that will probably get you a limited amount of this functionality. I have a bit of testing to finish up, and then the current stack will have everything on async, with some more formalized pipelines that guide flow through the taskmgr/aimgr. That will let you play with scripts and make run-time testing/changes very easy. I hope to have some of the new user-defined graph queries ready in a couple of days as well.

I would ping @BradMoonUESTC for this one, however. If you need a cache to help with suspending/resuming sessions (and to save cost too), the implementation I wrote is at https://github.com/K2/finite-monkey-engine/blob/dev/services/cache.py. It's a very simple Python object cache over the translation engine. If I can get it working well with everything else, there will be automatic CN/EN translation that I hope works well for some people.

