| 1 | +# How It Works |
| 2 | + |
| 3 | +This page explains the internal architecture and flow of the asyncio-https-proxy library, showing how components interact to provide HTTPS interception and request/response handling. |
| 4 | + |
| 5 | +## Architecture Overview |
| 6 | + |
| 7 | +The asyncio-https-proxy library follows a handler-based architecture where each client connection spawns a new handler instance that manages the complete request/response lifecycle. |
| 8 | + |
| 9 | +### Core Components |
| 10 | + |
| 11 | +#### Server Layer |
| 12 | +- **`start_proxy_server()`** (`server.py`) - Main entry point for starting the proxy server |
| 13 | +- **Connection Handler** (`server.py`) - Manages incoming client connections and creates handler instances |
| 14 | + |
| 15 | +#### Handler Layer |
| 16 | +- **[`HTTPSProxyHandler`](reference/https_proxy_handler.md)** (`https_proxy_handler.py`) - Base class providing lifecycle hooks for customization |
| 17 | +- **[`HTTPSForwardProxyHandler`](reference/https_forward_proxy_handler.md)** (`https_forward_proxy_handler.py`) - Ready-to-use implementation for forwarding requests |
| 18 | +- **Custom Handlers** - User implementations extending `HTTPSProxyHandler` |
| 19 | + |
| 20 | +#### HTTP Layer |
| 21 | +- **[`HTTPRequest`](reference/http_request.md)** (`http_request.py`) - Parses and represents HTTP requests |
| 22 | +- **[`HTTPResponse`](reference/http_response.md)** (`http_response.py`) - Handles HTTP response data and streaming |
| 23 | +- **[`HTTPHeader`](reference/http_header.md)** (`http_header.py`) - Utilities for HTTP header manipulation |
| 24 | + |
| 25 | +#### TLS Layer |
| 26 | +- **[`TLSStore`](reference/tls_store.md)** (`tls_store.py`) - Manages dynamic certificate generation and caching |
| 27 | + |
| 28 | +#### Component Relationships |
| 29 | + |
| 30 | +- **Server** creates new **Handler** instances for each client connection |
| 31 | +- **Handlers** use **HTTP components** to parse and manipulate requests/responses |
| 32 | +- **Handlers** interact with **TLS Store** for certificate management during HTTPS interception |
| 33 | +- **Custom Handlers** extend the base **HTTPSProxyHandler** to implement specific behavior |
| 34 | + |
| 35 | + |
| 36 | +## Key Design Principles |
| 37 | + |
| 38 | +### 1. Handler-Based Architecture |
| 39 | +Each client connection gets its own handler instance, providing isolation and allowing per-connection customization. |
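The one-handler-per-connection idea can be sketched with the standard library alone. This is an illustrative sketch, not the library's server code: `ConnectionHandler`, `serve_once`, and the reply format are hypothetical names invented for the example.

```python
import asyncio


class ConnectionHandler:
    """Hypothetical per-connection handler (not the library's class)."""

    def __init__(self, connection_id: int) -> None:
        self.connection_id = connection_id
        self.bytes_seen = 0  # state owned by this connection only

    async def handle(
        self, reader: asyncio.StreamReader, writer: asyncio.StreamWriter
    ) -> None:
        data = await reader.readline()
        self.bytes_seen += len(data)
        reply = f"handler {self.connection_id} saw {self.bytes_seen} bytes\n"
        writer.write(reply.encode())
        await writer.drain()
        writer.close()
        await writer.wait_closed()


async def serve_once(messages: list[bytes]) -> list[str]:
    """Start a server on an ephemeral port and send each message on its own connection."""
    counter = 0

    async def on_connect(reader, writer):
        nonlocal counter
        counter += 1
        # A fresh handler instance per connection keeps state isolated.
        await ConnectionHandler(counter).handle(reader, writer)

    server = await asyncio.start_server(on_connect, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    replies = []
    async with server:
        for message in messages:
            reader, writer = await asyncio.open_connection("127.0.0.1", port)
            writer.write(message)
            await writer.drain()
            replies.append((await reader.readline()).decode().strip())
            writer.close()
            await writer.wait_closed()
    return replies
```

Because each connection constructs its own handler, the byte counter of one connection never leaks into another.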
| 40 | + |
| 41 | +### 2. Asyncio Native |
| 42 | +All I/O uses async/await, so a single event loop can serve many connections concurrently without blocking on any one of them. |
| 43 | + |
| 44 | +### 3. Streaming Support |
| 45 | +Both requests and responses are processed as streams, so large payloads can be relayed chunk by chunk instead of being buffered in memory in full. |
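A minimal sketch of chunked relaying, assuming a plain `asyncio.StreamReader` as the source. The names `iter_chunks`, `relay`, and `CHUNK_SIZE` are made up for this example and are not taken from the library.

```python
import asyncio
from typing import AsyncIterator

CHUNK_SIZE = 4096  # assumed chunk size for this sketch


async def iter_chunks(
    reader: asyncio.StreamReader, chunk_size: int = CHUNK_SIZE
) -> AsyncIterator[bytes]:
    """Yield the stream in bounded pieces until EOF."""
    while True:
        chunk = await reader.read(chunk_size)
        if not chunk:  # empty bytes means EOF
            break
        yield chunk


async def relay(reader: asyncio.StreamReader, sink: list[bytes]) -> int:
    """Pass chunks to the sink one at a time and return the total byte count."""
    total = 0
    async for chunk in iter_chunks(reader):
        # A real proxy would write to the client here and await drain().
        sink.append(chunk)
        total += len(chunk)
    return total


async def demo() -> tuple[int, int]:
    reader = asyncio.StreamReader()
    reader.feed_data(b"x" * 10_000)
    reader.feed_eof()
    sink: list[bytes] = []
    total = await relay(reader, sink)
    return total, max(len(chunk) for chunk in sink)
```

At most one chunk of `CHUNK_SIZE` bytes is held by the relay at a time, regardless of how large the payload is.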
| 46 | + |
| 47 | +### 4. Certificate Management |
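| 48 | +Certificates are generated on demand and signed by the proxy's root CA, so any host can be intercepted without pre-generating per-host certificates. Clients must trust that root CA for certificate verification to succeed. |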
| 49 | + |
| 50 | +### 5. Extensible Design |
| 51 | +The base `HTTPSProxyHandler` can be extended for custom behavior, while `HTTPSForwardProxyHandler` provides a ready-to-use forwarding implementation. |
| 52 | + |
| 53 | +## Handler Lifecycle |
| 54 | + |
| 55 | +Each handler instance follows a well-defined lifecycle with extensible hook points: |
| 56 | + |
| 57 | +```mermaid |
| 58 | +stateDiagram-v2 |
| 59 | + [*] --> Created: New connection |
| 60 | + Created --> Connected: on_client_connected() |
| 61 | + Connected --> RequestReceived: Parse request |
| 62 | + RequestReceived --> RequestProcessed: on_request_received() |
| 63 | + RequestProcessed --> ResponseReceived: Forward to target |
| 64 | + ResponseReceived --> ResponseProcessed: on_response_received() |
| 65 | + ResponseProcessed --> Streaming: Stream response chunks |
| 66 | + Streaming --> Complete: on_response_complete() |
| 67 | + Complete --> [*]: Connection closed |
| 68 | + |
| 69 | + note right of RequestProcessed |
| 70 | + Handler can modify |
| 71 | + request before forwarding |
| 72 | + end note |
| 73 | + |
| 74 | + note right of ResponseProcessed |
| 75 | + Handler can modify |
| 76 | + response before streaming |
| 77 | + end note |
| 78 | +``` |
| 79 | + |
| 80 | +### Handler Hook Points |
| 81 | + |
| 82 | +The `HTTPSProxyHandler` base class provides several hook methods for customization: |
| 83 | + |
| 84 | +- **`on_client_connected()`**: Called after the connection has been successfully established |
| 85 | +- **`on_request_received()`**: Called after a complete HTTP request has been parsed |
| 86 | +- **`on_response_received()`**: Called when the response headers are received from the target |
| 87 | +- **`on_response_complete()`**: Called after the complete response has been streamed to the client |
| 88 | + |
| 89 | +These hooks enable: |
| 90 | +- Request/response logging and analytics |
| 91 | +- Content modification and filtering |
| 92 | +- Custom authentication and authorization |
| 93 | +- Traffic shaping and rate limiting |
| 94 | +- Security scanning and threat detection |
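The hook pattern listed above can be illustrated with a stand-in class. Only the four hook names come from this page; `SketchHandler`, `run_lifecycle`, the argument-free signatures, and `LoggingHandler` are assumptions made for the sketch and do not reproduce the real `HTTPSProxyHandler` API.

```python
import asyncio


class SketchHandler:
    """Stand-in base class; only the hook names are taken from this page."""

    async def on_client_connected(self) -> None: ...
    async def on_request_received(self) -> None: ...
    async def on_response_received(self) -> None: ...
    async def on_response_complete(self) -> None: ...

    async def run_lifecycle(self) -> None:
        """Hypothetical driver that fires the hooks in the documented order."""
        await self.on_client_connected()
        # ... the request would be parsed here ...
        await self.on_request_received()
        # ... forward to the target and read the response headers ...
        await self.on_response_received()
        # ... stream the body chunks to the client ...
        await self.on_response_complete()


class LoggingHandler(SketchHandler):
    """Overrides two hooks and leaves the others as no-ops."""

    def __init__(self) -> None:
        self.events: list[str] = []

    async def on_request_received(self) -> None:
        self.events.append("request")

    async def on_response_complete(self) -> None:
        self.events.append("complete")


def run_demo() -> list[str]:
    handler = LoggingHandler()
    asyncio.run(handler.run_lifecycle())
    return handler.events
```

A subclass overrides only the hooks it cares about; the base class keeps the rest as no-ops, which is what makes logging, filtering, or rate limiting easy to bolt on.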
| 95 | + |
| 96 | + |
| 97 | +## Request Flow |
| 98 | + |
| 99 | +### HTTP Request Flow |
| 100 | + |
| 101 | +```mermaid |
| 102 | +sequenceDiagram |
| 103 | + participant Client |
| 104 | + participant Proxy |
| 105 | + participant Handler |
| 106 | + participant Target |
| 107 | + |
| 108 | + Client->>Proxy: HTTP CONNECT or Direct Request |
| 109 | + Proxy->>Handler: Create Handler Instance |
| 110 | + |
| 111 | + alt HTTP Request |
| 112 | + Handler->>Handler: Parse HTTP Request |
| 113 | + Handler->>Handler: on_request_received() |
| 114 | + Handler->>Target: Forward Request |
| 115 | + Target->>Handler: Response |
| 116 | + Handler->>Handler: on_response_received() |
| 117 | + Handler->>Client: Stream Response |
| 118 | + Handler->>Handler: on_response_complete() |
| 119 | + else HTTPS Request |
| 120 | + Handler->>Client: 200 Connection Established |
| 121 | + Note over Client,Handler: TLS Handshake with Generated Certificate |
| 122 | + Client->>Handler: Encrypted HTTP Request |
| 123 | + Handler->>Handler: Decrypt & Parse Request |
| 124 | + Handler->>Handler: on_request_received() |
| 125 | + Handler->>Target: Forward Request (with TLS) |
| 126 | + Target->>Handler: Response |
| 127 | + Handler->>Handler: on_response_received() |
| 128 | + Handler->>Client: Encrypt & Stream Response |
| 129 | + Handler->>Handler: on_response_complete() |
| 130 | + end |
| 131 | +``` |
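The branch between the two paths in the diagram above depends on the request line. The following sketch shows one way to tell a `CONNECT` tunnel request from a direct HTTP request; `classify_request_line` is a hypothetical helper and not the library's parser.

```python
from urllib.parse import urlsplit


def classify_request_line(line: str) -> tuple[str, str, int]:
    """Return (mode, host, port) for a proxy request line.

    Assumes a well-formed line; CONNECT targets must be host:port.
    """
    method, target, _version = line.strip().split(" ", 2)
    if method.upper() == "CONNECT":
        # Authority-form target, e.g. "example.com:443"
        host, _, port = target.rpartition(":")
        return "https", host, int(port)
    # Absolute-form target, e.g. "http://example.com/index.html"
    parts = urlsplit(target)
    return "http", parts.hostname or "", parts.port or 80
```

For example, `classify_request_line("CONNECT example.com:443 HTTP/1.1")` selects the HTTPS path, while a `GET http://example.com/ HTTP/1.1` line selects the plain HTTP path with port 80 as the default.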
| 132 | + |
| 133 | +### HTTPS Interception Flow |
| 134 | + |
| 135 | +```mermaid |
| 136 | +sequenceDiagram |
| 137 | + participant Client |
| 138 | + participant Proxy |
| 139 | + participant TLSStore |
| 140 | + participant Handler |
| 141 | + participant Target |
| 142 | + |
| 143 | + Client->>Proxy: CONNECT example.com:443 |
| 144 | + Proxy->>Handler: Create Handler Instance |
| 145 | + Handler->>Client: 200 Connection Established |
| 146 | + |
| 147 | + Note over Client,Handler: Client initiates TLS handshake |
| 148 | + Client->>Handler: ClientHello |
| 149 | + Handler->>TLSStore: Get/Generate Certificate for example.com |
| 150 | + TLSStore->>TLSStore: Generate Certificate if needed |
| 151 | + TLSStore->>Handler: Return Certificate & Private Key |
| 152 | + Handler->>Client: ServerHello + Certificate |
| 153 | + |
| 154 | + Note over Client,Handler: TLS Handshake Complete |
| 155 | + Client->>Handler: Encrypted HTTP Request |
| 156 | + Handler->>Handler: Decrypt Request |
| 157 | + Handler->>Handler: Parse HTTP Request |
| 158 | + Handler->>Handler: on_request_received() |
| 159 | + |
| 160 | + Handler->>Target: Forward Request (new TLS connection) |
| 161 | + Target->>Handler: Response |
| 162 | + Handler->>Handler: on_response_received() |
| 163 | + Handler->>Handler: Process Response Chunks |
| 164 | + Handler->>Client: Encrypt & Send Response |
| 165 | + Handler->>Handler: on_response_complete() |
| 166 | +``` |
| 167 | + |
| 168 | + |
| 169 | +## TLS Certificate Management |
| 170 | + |
| 171 | +The TLS Store generates a certificate for each unique hostname on demand, signs it with the root CA, and caches the certificate and private key for reuse: |
| 174 | + |
| 175 | +```mermaid |
| 176 | +graph TD |
| 177 | + Request[Certificate Request for hostname] --> Check{Certificate Exists in Cache?} |
| 178 | + Check -->|Yes| Return[Return Cached Certificate & Key] |
| 179 | + Check -->|No| Generate[Generate New Certificate] |
| 180 | + Generate --> SAN[Add Subject Alternative Names] |
| 181 | + SAN --> Sign[Sign with Root CA Certificate] |
| 182 | + Sign --> Cache[Cache Certificate & Private Key] |
| 183 | + Cache --> Return |
| 184 | + |
| 185 | + subgraph "Certificate Components" |
| 186 | + CA[Root CA Certificate<br/>Self-signed authority] |
| 187 | + Cert[Host Certificate<br/>Signed by CA] |
| 188 | + Key[Private Key<br/>For TLS handshake] |
| 189 | + end |
| 190 | + |
| 191 | + Generate --> CA |
| 192 | + Sign --> Cert |
| 193 | + Sign --> Key |
| 194 | +``` |
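The cache-or-generate step in the flowchart can be sketched as follows. This is not the `TLSStore` implementation: `CertCache`, `CertPair`, and `fake_generate` are hypothetical, and the generator is a placeholder that returns labelled bytes instead of building and signing a real X.509 certificate.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CertPair:
    cert_pem: bytes
    key_pem: bytes


class CertCache:
    """Get-or-generate cache keyed by hostname (illustrative only)."""

    def __init__(self, generate: Callable[[str], CertPair]) -> None:
        self._generate = generate
        self._cache: dict[str, CertPair] = {}

    def get(self, hostname: str) -> CertPair:
        key = hostname.lower()  # DNS names are case-insensitive
        pair = self._cache.get(key)
        if pair is None:
            # Cache miss: generate once, then reuse for later handshakes.
            pair = self._generate(key)
            self._cache[key] = pair
        return pair


def fake_generate(hostname: str) -> CertPair:
    """Placeholder generator; a real one would add SANs and sign with the CA."""
    return CertPair(
        cert_pem=f"cert-for-{hostname}".encode(),
        key_pem=f"key-for-{hostname}".encode(),
    )
```

Injecting the generator keeps the caching logic separate from certificate construction, so the expensive key generation and signing runs at most once per hostname.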