A lightweight Google Gemini API-compatible proxy server that allows you to call OpenAI-compatible LLM services using the Gemini API format.
Currently, Gemini CLI cannot easily use models other than Gemini. This Python tool was developed to meet this need.
Usage Steps:
- Modify
config.jsonand fill in your Kimi API key - Install dependencies and run
python gemini_proxy_for_kimi.py - Set environment variables:
export GOOGLE_GEMINI_BASE_URL=http://localhost:8000/ export GEMINI_API_KEY=sk-1234
- In Gemini CLI, use
/authand select "Use Gemini API Key"
β οΈ Important Note: The current version has only been fully tested and optimized on Moonshot Kimi. Other OpenAI-compatible services require your own testing and adjustments.
δΈζζζ‘£ | Chinese Documentation
- π Complete API Compatibility - Supports all Gemini API endpoints
- π Intelligent Format Conversion - Seamless Gemini β OpenAI format conversion
- π Streaming Response Support - Complete Server-Sent Events streaming processing
- π οΈ Function Calling Support - Bidirectional conversion of tool calls
- π£οΈ Multi-turn Conversations - Complete conversation history handling
- π Model Mapping - Flexible model name mapping configuration
- π Detailed Logging - Configurable access logs and detailed request logs
- βοΈ Configuration Files - Unified management through JSON configuration files
- Python 3.8+
- Dependencies:
fastapi,uvicorn,openai
pip install fastapi uvicorn openaiCreate a config.json file:
{
"openai_api_key": "sk-your-kimi-api-key",
"openai_base_url": "https://api.moonshot.cn/v1",
"model_mapping": {
"gemini-2.5-pro": "kimi-k2-0711-preview",
"gemini-2.5-flash": "moonshot-v1-auto",
},
"default_openai_model": "kimi-k2-0711-preview",
"server": {
"host": "0.0.0.0",
"port": 8000,
"log_level": "info"
},
"logging": {
"enable_detailed_logs": true,
"enable_access_logs": true,
"log_directory": "logs"
}
}python gemini_proxy_for_kimi.pyThe service will start at http://0.0.0.0:8000.
curl -X POST http://localhost:8000/v1beta/models/gemini-2.5-pro:generateContent \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [{"text": "Hello, how are you?"}]
}],
"generationConfig": {
"temperature": 0.7,
"maxOutputTokens": 200
}
}'curl -X POST http://localhost:8000/v1beta/models/gemini-2.5-pro:streamGenerateContent \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [{"text": "Write a short story"}]
}]
}'curl -X POST http://localhost:8000/v1beta/models/gemini-2.5-pro:countTokens \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [{"text": "Count tokens for this text"}]
}]
}'curl -X POST http://localhost:8000/v1beta/models/gemini-2.5-pro:generateContent \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"parts": [{"text": "What is the weather like in Beijing?"}]
}],
"tools": [{
"functionDeclarations": [{
"name": "get_weather",
"description": "Get weather information for a city",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"},
"date": {"type": "string", "description": "Date in YYYY-MM-DD format"}
},
"required": ["city"]
}
}]
}]
}'{
"openai_api_key": "sk-your-kimi-api-key",
"openai_base_url": "https://api.moonshot.cn/v1",
"model_mapping": {
"gemini-2.5-pro": "kimi-k2-0711-preview",
"gemini-2.5-flash": "moonshot-v1-auto",
},
"default_openai_model": "kimi-k2-0711-preview",
"server": {
"host": "0.0.0.0",
"port": 8000,
"log_level": "info"
},
"logging": {
"enable_detailed_logs": false,
"enable_access_logs": true,
"log_directory": "logs"
}
}| Option | Description | Default |
|---|---|---|
openai_api_key |
OpenAI API key | Required |
openai_base_url |
OpenAI API base URL | https://api.openai.com/v1 |
model_mapping |
Gemini to OpenAI model mapping | {} |
default_openai_model |
Default OpenAI model | gpt-3.5-turbo |
server.host |
Listen address | 0.0.0.0 |
server.port |
Listen port | 8000 |
server.log_level |
Log level | info |
logging.enable_detailed_logs |
Enable detailed request logs | false |
logging.enable_access_logs |
Enable access logs | true |
logging.log_directory |
Log directory | logs |
{
"openai_api_key": "sk-xxx",
"openai_base_url": "https://api.moonshot.cn/v1",
"model_mapping": {
"gemini-2.5-pro": "kimi-k2-0711-preview",
"gemini-2.5-flash": "kimi-k2-0711-preview"
}
}Note: The following services are theoretically compatible but require your own testing and adjustments. You can enable enable_detailed_logs in the config file to output detailed request and response information for targeted adaptation.
{
"openai_api_key": "sk-xxx",
"openai_base_url": "https://api.openai.com/v1",
"model_mapping": {
"gemini-2.5-pro": "gpt-4o",
"gemini-2.5-flash": "gpt-4o-mini"
}
}{
"openai_api_key": "your-azure-key",
"openai_base_url": "https://your-resource.openai.azure.com/openai/deployments/your-deployment",
"model_mapping": {
"gemini-2.5-pro": "gpt-4",
"gemini-2.5-flash": "gpt-35-turbo"
}
}{
"openai_api_key": "sk-xxx",
"openai_base_url": "https://api.deepseek.com/v1",
"model_mapping": {
"gemini-2.5-pro": "deepseek-chat",
"gemini-2.5-flash": "deepseek-chat"
}
}{
"openai_api_key": "your-zhipu-key",
"openai_base_url": "https://open.bigmodel.cn/api/paas/v4",
"model_mapping": {
"gemini-2.5-pro": "glm-4",
"gemini-2.5-flash": "glm-4-flash"
}
}{
"openai_api_key": "ollama",
"openai_base_url": "http://localhost:11434/v1",
"model_mapping": {
"gemini-2.5-pro": "llama3:8b",
"gemini-2.5-flash": "llama3:8b"
}
}If you need to adapt other OpenAI-compatible services:
- Clone the project:
git clone - Modify configuration: Update API endpoints and model mappings in
config.json - Test functionality: Focus on testing streaming responses, function calls, multi-turn conversations. You can enable enable_detailed_logs in the config file to output detailed request and response information for targeted adaptation.
- Optimize code: Adjust conversion logic based on target service characteristics
Concise access logs showing basic information for each request:
π POST /v1beta/models/gemini-2.5-pro:generateContent - 200 - Model: gemini-2.5-pro - ID: abc12345 - 2.341s
π POST /v1beta/models/gemini-2.5-pro:streamGenerateContent - 200 - Model: gemini-2.5-pro(stream) - ID: def67890 - 5.123s
Complete request/response conversion process (optional):
1_GEMINI_REQUEST- Original Gemini request2_OPENAI_REQUEST- Converted OpenAI request3_OPENAI_RESPONSE- Raw OpenAI response4_GEMINI_RESPONSE- Final Gemini response
- Install Gunicorn:
pip install gunicorn- Create startup script
start.sh:
#!/bin/bash
gunicorn gemini_proxy_for_kimi:app \
--worker-class uvicorn.workers.UvicornWorker \
--workers 4 \
--bind 0.0.0.0:8000 \
--access-logfile - \
--error-logfile - \
--log-level info- Run:
chmod +x start.sh
./start.shCreate /etc/systemd/system/gemini-proxy.service:
[Unit]
Description=Gemini API Proxy
After=network.target
[Service]
Type=exec
User=your-user
Group=your-group
WorkingDirectory=/path/to/your/app
ExecStart=/path/to/your/venv/bin/gunicorn gemini_proxy_for_kimi:app \
--worker-class uvicorn.workers.UvicornWorker \
--workers 4 \
--bind 0.0.0.0:8000
Restart=always
RestartSec=3
[Install]
WantedBy=multi-user.targetStart the service:
sudo systemctl daemon-reload
sudo systemctl enable gemini-proxy
sudo systemctl start gemini-proxyCreate Nginx configuration:
server {
listen 80;
server_name your-domain.com;
location / {
proxy_pass http://127.0.0.1:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Support streaming responses
proxy_buffering off;
proxy_cache off;
# Increase timeout
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
}
}-
Worker Count: Set based on CPU cores, recommended
2 * CPU_CORES + 1 -
Memory Optimization:
# Limit memory usage
gunicorn --max-requests 1000 --max-requests-jitter 100 ...-
Connection Pool: Add connection pool settings in configuration
-
Caching: Consider adding Redis caching layer for responses
curl http://localhost:8000/healthReturns:
{"status": "healthy", "service": "gemini-proxy"}Use logrotate to manage log files:
# /etc/logrotate.d/gemini-proxy
/path/to/your/app/logs/*.log {
daily
rotate 30
compress
delaycompress
missingok
create 644 your-user your-group
postrotate
systemctl reload gemini-proxy
endscript
}Access logs include:
- Request method and path
- Response status code
- Model used
- Request ID
- Response time
# Enable detailed logs
# Set in config.json
{
"logging": {
"enable_detailed_logs": true,
"enable_access_logs": true
}
}
# Start development server
python gemini_proxy_for_kimi.py- View detailed logs: Enable
enable_detailed_logsto see complete conversion process - Model mapping testing: Test mapping with different Gemini model names
- Streaming response debugging: Observe SSE data streams
- Function call debugging: Check format conversion of tool calls
Issues and Pull Requests are welcome!
git clone
cd gemini-proxy
pip install -r requirements.txt- Follow PEP 8 code style
- Add appropriate type annotations
- Write clear docstrings
- Ensure backward compatibility
This project is licensed under the MIT License. See LICENSE file for details.
A: Supports generateContent, streamGenerateContent, countTokens, and health endpoints.
A: Currently only fully tested on Moonshot Kimi. Other OpenAI-compatible services are theoretically usable but require your own testing and adjustments.
A: 1) Clone the project code 2) Modify config.json configuration 3) Focus on testing streaming responses, function calls, etc. 4) Adjust code logic as needed
A: Different LLM services have variations in API details, response formats, error handling, etc., requiring targeted testing and optimization. Currently, efforts are focused on complete Kimi adaptation.
A: Check if the client correctly handles text/event-stream format and ensure the network environment supports SSE.
A: Increase Gunicorn worker count, use load balancers, consider adding caching layers.
β If this project helps you, please give us a Star!