Commit 04c327a

docs: update README to enhance clarity and detail on features and usage
1 parent e4a6676 commit 04c327a

1 file changed

Lines changed: 140 additions & 88 deletions

README.md
@@ -1,121 +1,173 @@
11
# Gcli-Nexus
22

3-
> High-performance Gemini CLI reverse proxy that talks to the raw Cloud Code Gemini endpoints while presenting Gemini-native responses.
3+
[![GitHub Release](https://img.shields.io/github/v/release/Yoo1tic/gcli-nexus?style=flat&logo=github&color=blue)](https://github.com/Yoo1tic/gcli-nexus/releases/latest)
4+
[![License](https://img.shields.io/badge/License-MIT-green?style=flat)](LICENSE)
45

5-
## Highlights
6+
**Gcli-Nexus is a high-performance Rust adapter that bridges Gemini CLI (Cloud Code) to the Standard Gemini API.**
67

7-
- **Gemini-native proxy for the official CLI**: accepts `/v1beta/models/{model}:generateContent` and `:streamGenerateContent` payloads from `geminicli` while converting upstream CLI envelopes back into the standard Gemini response shape.
8-
- **SSE-friendly normalization**: streaming events land in the Gemini-native `candidates/usageMetadata/modelVersion` shape so dashboards and SDKs can consume them directly.
9-
- **Credential pool with actor scheduling**: a `ractor`-driven worker manages Google OAuth credentials stored in SQLite, separates “big” and “tiny” model queues, cools down projects that hit 429, and refreshes tokens only when a credential is near expiry or fails authentication.
10-
- **Operable out of the box**: `.env` configuration via Figment/dotenvy, SQLite (`data.db`) that bootstraps automatically, structured tracing, and a `mimalloc` global allocator for predictable latency.
11-
- **One-click browser auth**: hitting `/auth` in a browser jumps straight to Google OAuth for login/consent.
8+
It acts as a headless protocol bridge: feed it raw Google OAuth credentials and it exposes a drop-in `/v1beta/models` interface. It normalizes proprietary CLI streams into standard JSON/SSE that works with LangChain, curl, and modern AI clients.
129

13-
## Quick start
10+
### Highlights
11+
12+
- **Protocol Standardization**: Exposes native Gemini API endpoints (`:generateContent`, `:streamGenerateContent`) backed by Cloud Code credentials.
13+
- **Actor-Driven Concurrency**: Built on `ractor` for zero-lock scheduling, enabling high throughput with minimal resource overhead.
14+
- **Headless & Self-Healing**: Zero management UI. Live traffic surfaces invalid tokens, which are pulled from rotation and repaired asynchronously.
15+
- **Hot-Swapping**: Scale capacity instantly via the `/auth` endpoint without restarting the service or dropping connections.
16+
- **Portable**: Ships as a single static binary (Linux/macOS/Windows) or a lightweight Docker container.
17+
18+
## API Endpoints
19+
20+
Authentication requires the `x-goog-api-key` header (or `?key=` query parameter).
21+
22+
| Endpoint | Method | Auth | Description |
23+
| :--------------------------------------------- | :----- | :--- | :---------------------------------------------------------------- |
24+
| `/v1beta/models/{model}:generateContent` | `POST` || **Core Interface**. Standard chat completion (unary). |
25+
| `/v1beta/models/{model}:streamGenerateContent` | `POST` || **Core Interface**. Standard chat completion (streaming). |
26+
| `/v1beta/models` | `GET` || Lists supported models in standard Gemini JSON format. |
27+
| `/v1beta/openai/models` | `GET` || Lists supported models in OpenAI-compatible format. |
28+
| `/auth` | `GET` || **Hot-Swapping**. Initiates OAuth flow to inject new credentials. |
29+
| `/oauth2callback` | `GET` || Internal callback handler for Google OAuth redirects. |
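
The key check described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual handler: `extract_key` and `is_authorized` are hypothetical names, the header-before-query precedence is an assumption, and URL decoding and constant-time comparison are left out.

```rust
/// Return the API key from the `x-goog-api-key` header value if present,
/// otherwise from a `key=` pair in the raw query string.
/// (Hypothetical helper; precedence of header over query is an assumption.)
fn extract_key<'a>(header: Option<&'a str>, query: Option<&'a str>) -> Option<&'a str> {
    header.filter(|h| !h.is_empty()).or_else(|| {
        query?
            .split('&')
            .find_map(|pair| pair.strip_prefix("key="))
            .filter(|k| !k.is_empty())
    })
}

/// Compare the presented key with the configured `NEXUS_KEY` value.
fn is_authorized(presented: Option<&str>, nexus_key: &str) -> bool {
    presented == Some(nexus_key)
}

fn main() {
    // No header, key supplied as a query parameter.
    let key = extract_key(None, Some("alt=sse&key=pwd"));
    println!("authorized: {}", is_authorized(key, "pwd"));
}
```

A production check would also percent-decode the query value and compare keys in constant time.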
30+
31+
## Quick Start
1432

1533
### Prerequisites
1634

17-
- Google Cloud projects that already have Gemini CLI access; export each account as the JSON blob that contains `client_id`, `client_secret`, `refresh_token`, `project_id`, etc.
18-
- For the prebuilt binary: Linux host with SQLite available (no Rust toolchain required).
19-
- For containers: Docker + docker compose. Building from source remains possible with Rust 1.78+ if needed.
20-
21-
### Run the prebuilt binary
22-
23-
1. Copy the sample environment file and fill in secrets:
24-
```bash
25-
cp .env.example .env
26-
# edit NEXUS_KEY plus (optionally) DATABASE_URL, BIGMODEL_LIST, PROXY...
27-
```
28-
2. Drop every Gemini credential JSON into the folder referenced by `CRED_PATH` (default `./credentials`). On startup the actor will normalize, refresh, and persist them into SQLite. Additions today require a restart to be ingested.
29-
3. Download the latest release binary for your platform, make it executable, and run it from the project root:
30-
```bash
31-
chmod +x gcli-nexus
32-
./gcli-nexus
33-
```
34-
The server binds `0.0.0.0:8188`. Logs reveal how many credentials were activated and whether a proxy is in use.
35-
36-
### Run with docker compose
37-
38-
1. Copy the compose template and set secrets:
39-
```bash
40-
cp docker-compose.yml.example docker-compose.yml
41-
# edit NEXUS_KEY and other options in docker-compose.yml
42-
```
43-
2. Ensure local folders exist for persistence and credentials:
44-
```bash
45-
mkdir -p data credentials
46-
# place credential JSON files under ./credentials
47-
```
48-
3. Start the stack:
49-
```bash
50-
docker compose up -d
51-
```
52-
The service listens on `0.0.0.0:8188` and stores SQLite data under `./data`.
35+
- **Google Account**: A Google account with access to Gemini CLI (Cloud Code).
36+
- **Environment**:
37+
- **Docker** (Recommended) for containerized deployment.
38+
- **Linux Host** with SQLite if running the binary directly.
5339

54-
## Configuration
40+
### 1. Start the Service
41+
42+
You can start Gcli-Nexus immediately with an empty credential pool.
43+
44+
#### Option A: Docker Compose (Recommended)
45+
46+
1. **Setup Directories**:
47+
48+
```bash
49+
mkdir -p gcli-nexus/data
50+
cd gcli-nexus
51+
```
52+
53+
2. **Create Compose File**:
54+
Copy `docker-compose.yml.example` to `docker-compose.yml` (or create a new one), then set `NEXUS_KEY` and any other options in it.
55+
56+
3. **Launch**:
57+
58+
```bash
59+
docker compose up -d
60+
```
61+
62+
#### Option B: Prebuilt Binary
63+
64+
1. **Prepare Environment**:
65+
66+
```bash
67+
cp .env.example .env
68+
# Edit .env to set NEXUS_KEY and MODEL_LIST
69+
```
70+
71+
2. **Run**:
72+
73+
```bash
74+
chmod +x gcli-nexus
75+
./gcli-nexus
76+
```
77+
78+
The server binds to `0.0.0.0:8188` by default.
79+
80+
### 2. Onboard Credentials (Instant & Dynamic)
81+
82+
Gcli-Nexus supports **Hot-Swapping**. You can add credentials at runtime without restarting the service.
5583

56-
| Env var | Required | Default | Description |
57-
| ------------------------------------ | -------- | ------------------ | --------------------------------------------------------------------------------------------------------------------- |
58-
| `NEXUS_KEY` | Yes | _none_ | Shared secret checked on every request via `x-goog-api-key`, `Authorization: Bearer`, or `?key=`. |
59-
| `DATABASE_URL` | No | `sqlite://data.db` | SQLite DSN; the actor creates the file/migrations automatically. |
60-
| `LOGLEVEL` | No | `info` | Tracing level (`error`, `warn`, `info`, `debug`, `trace`). `RUST_LOG` still works as a fallback. |
61-
| `BIGMODEL_LIST` | No | `[]` | JSON array of model names treated as “big”. They get their own queue/cooldown bucket to avoid starving lighter chats. |
62-
| `CRED_PATH` | No | unset | Directory that is scanned once during startup for credential JSON; leave unset to rely purely on SQLite contents. |
63-
| `OAUTH_TPS` | No | `10` | OAuth refresh requests per second; refresh buffer/burst sizes are derived as `OAUTH_TPS * 2`. |
64-
| `GEMINI_RETRY_MAX_TIMES` | No | `3` | Max retry attempts for Gemini CLI upstream calls. |
65-
| `ENABLE_MULTIPLEXING` | No | `false` | Allow outbound reqwest clients to use HTTP/2 multiplexing; keep `false` to force HTTP/1-only behavior. |
66-
| `PROXY` | No | unset | Outbound HTTP proxy applied to both the Gemini caller and the OAuth refresh client (supports HTTP/SOCKS). |
67-
| `DATABASE_URL`, `PROXY`, `CRED_PATH` ||| Accept absolute or relative paths; Figment merges `.env` values automatically. |
84+
#### Method A: Browser-Based Auto Ingestion (Easiest)
6885

69-
### Credential lifecycle
86+
1. Navigate to `http://<your-server-ip>:8188/auth` in your browser.
87+
2. Complete the Google OAuth login flow.
88+
3. **Done.** The credential is automatically captured, persisted to SQLite, and **immediately injected** into the scheduling queue.
89+
4. Repeat for as many accounts as needed.
7090

71-
1. **Ingestion**: Each JSON file is parsed via `GoogleCredential::from_payload`, refreshed immediately, and upserted into SQLite. Duplicate `project_id`s are replaced atomically.
72-
2. **Queues**: Active credentials are pushed into both the “big” and “tiny” queues; requests choose a queue based on whether `model` matches `BIGMODEL_LIST`.
73-
3. **Rate limits**: When a 429 response contains `quotaResetTimeStamp`, the actor parks the credential for that many seconds before putting it back in queue.
74-
4. **Refresh flow**: 401/403 responses trigger `ReportInvalid` → refresh pipeline → DB update → re-enqueue. Failing refreshes disable the credential (status=false).
75-
5. **Persistence**: Because the DB is authoritative, restarts reuse the latest access tokens/expiry timestamps without re-reading every JSON file.
91+
#### Method B: Manual JSON File (Legacy)
7692

77-
## API usage
93+
If you already have credential JSON files (containing `project_id` and `refresh_token`), place them into the `credentials/` directory.
7894

79-
### Authentication
95+
- **Docker**: Place files in the mapped `./credentials` volume.
96+
- **Binary**: Place files in the directory referenced by `CRED_PATH`.
8097

81-
- Send `x-goog-api-key: <NEXUS_KEY>` (preferred).
82-
- Or append `?key=<NEXUS_KEY>` to the request URL.
83-
- Visit `/auth` in a browser to be redirected to Google OAuth for login/consent.
98+
_Note: Files added manually usually require a restart to be ingested, whereas Method A is instant._
99+
100+
### Credential JSON Format (For Method B)
101+
102+
```json
103+
{
104+
"project_id": "my-gcp-project",
105+
"refresh_token": "1//0gExampleRefreshToken"
106+
}
107+
```
84108

85-
### Generate content (non-streaming)
109+
_Only `project_id` and `refresh_token` are strictly required. Missing fields (like `access_token`) are automatically filled during the first refresh._
110+
111+
### Usage
112+
113+
Gcli-Nexus exposes a standard Gemini-compatible surface.
114+
115+
**Generate Content:**
86116

87117
```bash
88118
curl -X POST http://localhost:8188/v1beta/models/gemini-2.5-pro:generateContent \
89119
-H "x-goog-api-key: $NEXUS_KEY" \
90120
-H "Content-Type: application/json" \
91121
-d '{
92-
"contents":[{"role":"user","parts":[{"text":"hello from gcli-nexus"}]}]
122+
"contents":[{"role":"user","parts":[{"text":"Hello World"}]}]
93123
}'
124+
94125
```
95126

96-
### Streaming
127+
## Configuration
97128

98-
```bash
99-
curl --no-buffer -X POST \
100-
http://localhost:8188/v1beta/models/gemini-2.5-pro:streamGenerateContent \
101-
-H "x-goog-api-key: $NEXUS_KEY" \
102-
-H "Content-Type: application/json" \
103-
-d '{"contents":[{"role":"user","parts":[{"text":"stream"}]}]}'
104-
```
129+
| Env var | Required | Default | Description |
130+
| ------------------------ | -------- | ------------------------------------------------------------ | -------------------------------------------------------------------------------- |
131+
| `LOGLEVEL` | No | `info` | Logging verbosity for tracing (e.g. `error`, `warn`, `info`, `debug`, `trace`). |
132+
| `LISTEN_ADDR` | No | `0.0.0.0` | HTTP server listen address. |
133+
| `LISTEN_PORT` | No | `8188` | HTTP server listen port. |
134+
| `NEXUS_KEY` | Yes | `pwd` | Required Nexus API key used to authorize inbound requests. |
135+
| `MODEL_LIST` | No | `"[gemini-2.5-flash, gemini-2.5-pro, gemini-3-pro-preview]"` | JSON array of Gemini models. |
136+
| `CRED_PATH` | No | `./credentials` | Optional directory containing credential JSON files to preload. |
137+
| `OAUTH_TPS` | No | `5` | OAuth refresh requests per second (TPS) for the refresh worker. |
138+
| `ENABLE_MULTIPLEXING` | No | `false` | Allow reqwest clients to use HTTP/2 multiplexing. Leave `false` to force HTTP/1. |
139+
| `GEMINI_RETRY_MAX_TIMES` | No | `3` | Max retry attempts for Gemini CLI upstream calls. |
140+
| `PROXY` | No | unset | Optional outbound HTTP proxy (`scheme://user:pass@host:port`). Remove if unused. |
141+
142+
## Technical Details
143+
144+
### 1. Dynamic Scalability (Hot-Swapping)
145+
146+
Adding capacity is instantaneous.
147+
148+
- **Zero-Touch Ingestion**: Visit the `/auth` endpoint to authenticate a new account. The credential is automatically persisted to the database and **immediately injected** into the scheduling loop.
149+
- **No Restarts**: Scale your pool from 1 to 1,000 credentials at runtime without dropping a single connection.
150+
151+
### 2. Traffic-Driven Maintenance
152+
153+
We don't run expensive background cron jobs to check for expired tokens. Instead, we use live traffic as a probe.
154+
155+
- **Lazy Self-Healing**: A credential's validity is verified only when a request hits the proxy. Invalid tokens (401/403) are instantly quarantined and repaired asynchronously.
156+
- **Auto-Convergence**: The higher the concurrency, the faster the system converges to a clean state.
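
The quarantine-and-repair idea can be sketched with two collections. This is an illustrative model under stated assumptions: the `Pool` type, its method names, and the `u32` credential IDs are hypothetical, and 429 handling is omitted here.

```rust
use std::collections::VecDeque;

/// Hypothetical pool: `active` serves traffic, `quarantine` awaits an OAuth refresh.
#[derive(Default)]
struct Pool {
    active: VecDeque<u32>,
    quarantine: Vec<u32>,
}

impl Pool {
    /// Called with the upstream HTTP status after a request used `cred`.
    fn report(&mut self, cred: u32, status: u16) {
        match status {
            // Auth failure: pull the credential out of rotation for repair.
            401 | 403 => self.quarantine.push(cred),
            // Anything else: the credential stays usable and is re-queued.
            _ => self.active.push_back(cred),
        }
    }

    /// Called by the refresh worker once a quarantined credential is repaired.
    fn repaired(&mut self, cred: u32) {
        if let Some(pos) = self.quarantine.iter().position(|&c| c == cred) {
            self.quarantine.swap_remove(pos);
            self.active.push_back(cred);
        }
    }
}

fn main() {
    let mut pool = Pool::default();
    pool.report(1, 200); // healthy credential goes back into rotation
    pool.report(2, 401); // invalid token is quarantined
    pool.repaired(2); // refresh succeeded, credential rejoins the queue
    println!("active = {:?}, quarantined = {:?}", pool.active, pool.quarantine);
}
```

No timer or background scan appears in this model: a credential's state only changes when a request reports a status or a refresh completes.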
157+
158+
### 3. Zero-Lock Concurrency
159+
160+
Built on the **Actor Model (Ractor)**, Gcli-Nexus eliminates the mutex contention that plagues traditional multi-threaded proxies.
105161

106-
### Error semantics
162+
- **In-Memory Scheduling**: The critical path (Client -> Actor -> Client) is purely single-threaded and non-blocking.
163+
- **Decoupled IO**: Database writes (SQLite WAL) and OAuth refreshes are offloaded to detached workers, ensuring the proxy latency remains stable under load.
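
The actor pattern itself can be shown without any framework. The sketch below deliberately uses plain `std` threads and channels as a stand-in for `ractor`, whose API is not reproduced here; the `Msg` variants, `spawn_scheduler`, and `lease` are hypothetical names. One thread owns the queue, so callers exchange messages instead of sharing a mutex.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Messages understood by the hypothetical scheduler actor.
enum Msg {
    /// Add a credential to the pool (what `/auth` would trigger).
    Add(u32),
    /// Borrow the next credential; the reply travels over its own channel.
    Lease(Sender<Option<u32>>),
    /// Return a credential after the upstream call finishes.
    Release(u32),
}

/// Spawn the actor: a single thread owns the queue, so no lock is needed.
fn spawn_scheduler() -> Sender<Msg> {
    let (tx, rx) = channel::<Msg>();
    thread::spawn(move || {
        let mut queue: VecDeque<u32> = VecDeque::new();
        // The loop ends once every Sender has been dropped.
        for msg in rx {
            match msg {
                Msg::Add(id) | Msg::Release(id) => queue.push_back(id),
                Msg::Lease(reply) => {
                    // Ignore send errors: the requester may have gone away.
                    let _ = reply.send(queue.pop_front());
                }
            }
        }
    });
    tx
}

/// Client-side helper: ask the actor for a credential and wait for the answer.
fn lease(scheduler: &Sender<Msg>) -> Option<u32> {
    let (reply_tx, reply_rx) = channel();
    scheduler.send(Msg::Lease(reply_tx)).ok()?;
    reply_rx.recv().ok().flatten()
}

fn main() {
    let scheduler = spawn_scheduler();
    scheduler.send(Msg::Add(7)).unwrap();
    let cred = lease(&scheduler);
    println!("leased: {:?}", cred);
    if let Some(id) = cred {
        scheduler.send(Msg::Release(id)).unwrap();
    }
}
```

This version blocks the caller on `recv` for simplicity; an async actor runtime would await the reply instead.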
107164

108-
- `401/403` from upstream map to a temporary `502/500` locally after a refresh attempt; the credential is refreshed before reuse.
109-
- `429` returns upstream headers/body untouched while the offending credential cools down.
110-
- `503` with `{"error":"no available credential"}` means all queues are empty or cooling—add more credentials or wait for cooldowns.
165+
### 4. Precision Rate Limiting
111166

112-
## Operations
167+
Handling upstream rate limits (429) is a scheduling problem, not an error-handling problem.
113168

114-
- **Logging**: Structured tracing goes to stdout; set `LOGLEVEL=debug` for detailed actor logs (queue lengths, refresh states). Use `RUST_LOG` for per-module overrides.
115-
- **Database**: `data.db` lives at the path inside `DATABASE_URL`; backup the file periodically if you care about history.
116-
- **Proxying**: Set `PROXY` (e.g. `http://127.0.0.1:1080`) if your network requires outbound proxying; both Gemini traffic and OAuth refresh calls use it.
117-
- **Credential rotation**: Update the JSON file, restart the binary, or seed SQLite manually; the actor upserts by `project_id`.
118-
- **Security**: Treat `.env`, `credentials/*.json`, and `data.db` as sensitive—they contain refresh and access tokens.
169+
- **The Waiting Room**: Rate-limited credentials are parked in a **Binary Heap**.
170+
- **O(1) Wakeups**: We strictly avoid polling. The earliest reset time always sits at the top of the heap, so the next wakeup is found with an O(1) peek and credentials return to the active queue as soon as their quota resets.
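
A waiting room of this kind can be sketched with the standard library's `BinaryHeap`. The `WaitingRoom` type, its methods, and the millisecond timestamps are assumptions made for illustration; the project's real data structures are not shown in this README.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, VecDeque};

/// Hypothetical credential handle; the real type is not shown in this README.
type CredId = u32;

/// Parks rate-limited credentials until their quota reset time (milliseconds).
struct WaitingRoom {
    // `Reverse` turns the max-heap into a min-heap keyed on reset time.
    parked: BinaryHeap<Reverse<(u64, CredId)>>,
}

impl WaitingRoom {
    fn new() -> Self {
        Self { parked: BinaryHeap::new() }
    }

    /// Park a credential that received a 429 until `reset_at_ms`.
    fn park(&mut self, id: CredId, reset_at_ms: u64) {
        self.parked.push(Reverse((reset_at_ms, id)));
    }

    /// Earliest reset time, found with an O(1) peek; a timer can sleep until then.
    fn next_wakeup(&self) -> Option<u64> {
        self.parked.peek().map(|Reverse((at, _))| *at)
    }

    /// Move every credential whose reset time has passed back into `active`.
    fn reclaim(&mut self, now_ms: u64, active: &mut VecDeque<CredId>) {
        while let Some(Reverse((at, id))) = self.parked.peek().copied() {
            if at > now_ms {
                break;
            }
            self.parked.pop();
            active.push_back(id);
        }
    }
}

fn main() {
    let mut room = WaitingRoom::new();
    let mut active = VecDeque::new();
    room.park(1, 5_000);
    room.park(2, 1_000);
    room.reclaim(1_500, &mut active);
    println!("active = {:?}, next wakeup = {:?}", active, room.next_wakeup());
}
```

Finding the next wakeup is a constant-time peek, while each park or reclaim costs O(log n) in the number of parked credentials.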
119171

120172
## License
121173
