@bixia Cool stuff! I'd love to merge something like this, but I'm too new here to check it in just yet. I wrote a similar cache in my dev fork that will probably get you a limited amount of this functionality. I have a bit of testing to finish up, and then I'll have the current stack plus everything on async, with some more formalized pipelines that guide flow through the taskmgr/aimgr. This will let you play with scripts and make run-time testing/changes very easily, and I hope to have some of the new user-defined graph queries in a couple of days as well. I would ping @BradMoonUESTC for this one, though. If you need a cache to help with suspending/resuming sessions (and to save cost), the implementation I did is at https://github.com/K2/finite-monkey-engine/blob/dev/services/cache.py: it's a very simple Python object cache over the translation engine. I hope I can get it working well with everything else; there will be automatic CN/EN translation that I hope works well for some people.
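The "very simple python object cache" for suspend/resume described above could look roughly like the following sketch. The class and method names here are hypothetical, not the actual API in `services/cache.py`; the point is just that results are pickled to disk keyed by a string, so a session can resume without repeating API calls.

```python
import pickle
from pathlib import Path


class SessionCache:
    """Hypothetical sketch of a disk-backed object cache that lets a
    session be suspended and resumed without re-calling the API."""

    def __init__(self, cache_dir="cache"):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        return self.cache_dir / f"{key}.pkl"

    def get(self, key: str):
        # Return the cached object, or None on a miss.
        path = self._path(key)
        if path.exists():
            with path.open("rb") as f:
                return pickle.load(f)
        return None

    def put(self, key: str, value) -> None:
        # Persist the object so a later run can pick it up.
        with self._path(key).open("wb") as f:
            pickle.dump(value, f)
```

Because entries live on disk rather than in memory, restarting the process (or resuming a suspended session) keeps the cache warm.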
Improve OpenAI API Response Caching System
Overview
This PR enhances the caching system for OpenAI API calls to improve performance and reduce API costs. The changes provide consistent caching behavior across different API endpoints (OpenAI, Azure, Claude) and request types (chat completions, embeddings).
Key Changes
Cache Implementation
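The PR's actual implementation is not reproduced here, but the core of a consistent cross-endpoint cache is a deterministic key. A minimal sketch (the helper name and the choice to ignore `stream` are assumptions, not the PR's code): hash the provider, endpoint, and sorted request parameters so that logically identical requests map to the same entry regardless of argument order.

```python
import hashlib
import json


def make_cache_key(provider: str, endpoint: str, params: dict) -> str:
    """Build a deterministic cache key for an API request (illustrative).

    Parameters are sorted so dict ordering doesn't change the key, and
    transport-only flags like `stream` are excluded, since a streamed and
    a non-streamed request for the same prompt should share a cache entry.
    """
    normalized = {k: params[k] for k in sorted(params) if k != "stream"}
    payload = json.dumps(
        {"provider": provider, "endpoint": endpoint, "params": normalized},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Keying on provider and endpoint as well as the parameters is what lets one cache serve OpenAI, Azure, and Claude calls without collisions.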
Supported Endpoints
As noted in the overview: OpenAI, Azure, and Claude endpoints, covering chat completion and embedding request types.
Error Handling
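A reasonable policy for a cache in the request path (sketched below with hypothetical names, since the PR body does not show the code) is that a cache failure must never break the API call: a failed read is treated as a miss, and a failed write is non-fatal.

```python
def get_or_call(cache, key, call_api):
    """Illustrative error-handling wrapper: degrade to the live API call
    whenever the cache misbehaves, instead of propagating cache errors."""
    try:
        hit = cache.get(key)
        if hit is not None:
            return hit
    except Exception:
        pass  # broken cache read: behave as a cache miss
    result = call_api()
    try:
        cache.put(key, result)
    except Exception:
        pass  # failing to store the result is non-fatal
    return result
```

The trade-off is that cache outages silently cost extra API calls, which is usually preferable to failing user requests.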
Benefits
Testing
The changes have been tested with:
Usage Example
Notes
Related Issues
Future Improvements