
Commit 831265f

Merge pull request #12 from project-aethermesh/feat/use-public-endpoints
feat: introduce a new public-first load-balancing strategy
2 parents 82d8b37 + d318b0a commit 831265f

9 files changed: 202 additions, 63 deletions


.env.example

Lines changed: 4 additions, 0 deletions

```diff
@@ -50,6 +50,10 @@ PROXY_MAX_RETRIES=3
 PROXY_TIMEOUT=15
 ## Timeout per individual retry attempt in seconds
 PROXY_TIMEOUT_PER_TRY=5
+## Prioritize public endpoints over primary endpoints (true/false)
+PUBLIC_FIRST=false
+## Number of attempts to make at public endpoints before trying primary/fallback
+PUBLIC_FIRST_ATTEMPTS=2

 # Logging Configuration
 ## Log level: debug, info, warn, error, fatal, panic
```

README.md

Lines changed: 48 additions, 4 deletions

```diff
@@ -5,7 +5,8 @@ A lightweight, low-latency RPC load balancer written in Go. It is designed to ma
 ## Features

 - **Round-Robin Load Balancing**: Distributes requests to available endpoints in a round-robin manner, prioritizing those with fewer requests in the last 24 hours.
-- **Intelligent Retry Logic**: Configurable retry attempts with priority-based endpoint selection (first primary endpoints, then fallbacks).
+- **Intelligent Retry Logic**: Configurable retry attempts with priority-based endpoint selection (primary endpoints, fallbacks, and optional public-first mode).
+- **Public-First Mode**: Optional prioritization of public RPC endpoints to reduce costs while maintaining reliability.
 - **Flexible Timeout Control**: Separate timeouts for overall requests and individual retry attempts.
 - **Rate Limit Recovery**: Safe rate limit detection and recovery with exponential backoff strategies per endpoint, to avoid making things worse when a provider is rate-limiting you.
 - **Health Checks**: Regularly checks the health of upstream endpoints and updates their status in Redis.
```
```diff
@@ -109,10 +110,13 @@ The load balancer implements intelligent retry logic with configurable timeouts:
 ### How Retries Work

-1. **Priority-based selection**: Always tries primary endpoints first, then fallbacks.
+1. **Priority-based selection**: Endpoint selection follows these priorities:
+   - **Normal mode**: primary → fallback → public
+   - **Public-first mode** (`PUBLIC_FIRST=true`): public → primary → fallback
 2. **Configurable attempts**: Retries up to `PROXY_MAX_RETRIES` times.
-3. **Endpoint rotation**: Removes failed endpoints from the retry pool to avoid repeated failures.
-4. **Dual timeout control**: There are 2 settings that control how long requests take:
+3. **Public endpoint limiting**: When `PUBLIC_FIRST=true`, attempts to reach public endpoints are limited to the value of `PUBLIC_FIRST_ATTEMPTS`, after which the proxy tries using a primary or fallback endpoint.
+4. **Endpoint rotation**: Removes failed endpoints from the retry pool to avoid repeated failures.
+5. **Dual timeout control**: There are 2 settings that control how long requests take:
    - **Total request timeout** (`PROXY_TIMEOUT`): Maximum time for the entire request (this is what the end user "sees").
    - **Per-try timeout** (`PROXY_TIMEOUT_PER_TRY`): Maximum time per individual request sent from the proxy to each endpoint.
```

```diff
@@ -141,6 +145,8 @@ The load balancer implements intelligent retry logic with configurable timeouts:
 | `--proxy-retries` | `3` | Maximum number of retries for proxy requests |
 | `--proxy-timeout` | `15` | Total timeout for proxy requests in seconds |
 | `--proxy-timeout-per-try` | `5` | Timeout per individual retry attempt in seconds |
+| `--public-first` | `false` | Prioritize public endpoints over primary endpoints |
+| `--public-first-attempts` | `2` | Number of attempts to make at public endpoints before trying primary/fallback |
 | `--redis-host` | `localhost` | Redis server hostname |
 | `--redis-pass` | - | Redis server password |
 | `--redis-port` | `6379` | Redis server port |
```
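The new flags combine with the existing ones in the usual way; a possible invocation might look like the following (the binary name `aetherlay` is assumed here, and is not confirmed by this diff):

```shell
# Hypothetical invocation — binary name assumed, flags taken from the table above
./aetherlay --public-first --public-first-attempts=2 --proxy-retries=3

# Equivalent configuration via environment variables
PUBLIC_FIRST=true PUBLIC_FIRST_ATTEMPTS=2 ./aetherlay
```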
```diff
@@ -170,6 +176,8 @@ The load balancer implements intelligent retry logic with configurable timeouts:
 | `PROXY_MAX_RETRIES` | `3` | Maximum number of retries for proxy requests |
 | `PROXY_TIMEOUT` | `15` | Total timeout for proxy requests in seconds |
 | `PROXY_TIMEOUT_PER_TRY` | `5` | Timeout per individual retry attempt in seconds |
+| `PUBLIC_FIRST` | `false` | Prioritize public endpoints over primary and fallback endpoints |
+| `PUBLIC_FIRST_ATTEMPTS` | `2` | Number of attempts to make at public endpoints before trying with a primary/fallback |
 | `REDIS_HOST` | `localhost` | Redis server hostname |
 | `REDIS_PASS` | - | Redis server password |
 | `REDIS_PORT` | `6379` | Redis server port |
```
````diff
@@ -215,6 +223,42 @@ For production deployments with multiple load balancer pods, use the standalone
 - **Resource Efficiency**: Reduces RPC endpoint usage
 - **Better Separation of Concerns**: Health monitoring isolated from request handling

+## Public-First Mode
+
+Ætherlay supports a "public-first" mode that prioritizes public RPC endpoints over primary and fallback endpoints to help reduce costs while maintaining reliability.
+
+### How Public-First Mode Works
+
+1. **Enable public-first**: Set `PUBLIC_FIRST=true` (or use the `--public-first` CLI flag)
+2. **Configure attempts**: Set `PUBLIC_FIRST_ATTEMPTS` to control how many public endpoints to try (default: 2)
+3. **Endpoint hierarchy**:
+   - **When enabled**: public → primary → fallback
+   - **When disabled**: primary → fallback → public
+
+### Configuration Example
+
+In your `endpoints.json`, mark endpoints with `"role": "public"`:
+
+```json
+{
+  "mainnet": {
+    "publicnode-1": {
+      "provider": "publicnode",
+      "role": "public",
+      "type": "archive",
+      "http_url": "https://ethereum-rpc.publicnode.com",
+      "ws_url": "wss://ethereum-rpc.publicnode.com"
+    },
+    "alchemy-1": {
+      "provider": "alchemy",
+      "role": "primary",
+      "type": "archive",
+      "http_url": "https://eth-mainnet.g.alchemy.com/v2/${ALCHEMY_API_KEY}"
+    }
+  }
+}
+```
+
 ## Rate Limit Recovery

 Ætherlay includes intelligent rate limit detection and recovery mechanisms to handle upstream provider rate limits gracefully. This system automatically detects when endpoints are rate-limited and implements recovery strategies to restore service.
````

configs/endpoints-example.json

Lines changed: 6 additions, 6 deletions

```diff
@@ -2,7 +2,7 @@
   "mainnet": {
     "llama-1": {
       "provider": "llama",
-      "role": "primary",
+      "role": "public",
       "type": "full",
       "http_url": "https://eth.llamarpc.com",
       "rate_limit_recovery": {
@@ -16,7 +16,7 @@
     },
     "drpc-1": {
       "provider": "drpc",
-      "role": "fallback",
+      "role": "primary",
       "type": "full",
       "http_url": "https://eth.drpc.org",
       "ws_url": "wss://eth.drpc.org"
@@ -32,13 +32,13 @@
   "arbitrum": {
     "drpc-1": {
       "provider": "drpc",
-      "role": "primary",
+      "role": "public",
       "type": "full",
       "http_url": "https://arbitrum.drpc.org"
     },
     "publicnode-1": {
       "provider": "public_node",
-      "role": "fallback",
+      "role": "public",
       "type": "archive",
       "http_url": "https://arbitrum-one-rpc.publicnode.com",
       "ws_url": "wss://arbitrum-one-rpc.publicnode.com"
@@ -47,7 +47,7 @@
   "base": {
     "alchemy-test": {
       "provider": "alchemy",
-      "role": "primary",
+      "role": "public",
       "type": "archive",
       "http_url": "https://base-mainnet.g.alchemy.com/v2/${ALCHEMY_API_KEY}",
       "ws_url": "wss://base-mainnet.g.alchemy.com/v2/${ALCHEMY_API_KEY}",
@@ -62,7 +62,7 @@
     },
     "infura-staging": {
       "provider": "infura",
-      "role": "primary",
+      "role": "fallback",
       "type": "archive",
       "http_url": "https://base-mainnet.infura.io/v3/${INFURA_API_KEY}",
       "ws_url": "wss://base-mainnet.infura.io/ws/v3/${INFURA_API_KEY}",
```

internal/config/config.go

Lines changed: 18 additions, 0 deletions

```diff
@@ -119,6 +119,24 @@ func (c *Config) GetFallbackEndpoints(chain string) []Endpoint {
 	return fallbackEndpoints
 }

+// GetPublicEndpoints returns all public endpoints for a chain.
+// Public endpoints are free/public RPC nodes that can be prioritized when PUBLIC_FIRST is enabled.
+// Returns nil if the chain doesn't exist or has no public endpoints.
+func (c *Config) GetPublicEndpoints(chain string) []Endpoint {
+	endpoints, exists := c.Endpoints[chain]
+	if !exists {
+		return nil
+	}
+
+	var publicEndpoints []Endpoint
+	for _, endpoint := range endpoints {
+		if endpoint.Role == "public" {
+			publicEndpoints = append(publicEndpoints, endpoint)
+		}
+	}
+	return publicEndpoints
+}
+
 // DefaultRateLimitRecovery returns the default rate limit recovery configuration
 func DefaultRateLimitRecovery() RateLimitRecovery {
 	return RateLimitRecovery{
```

internal/config/config_test.go

Lines changed: 2 additions, 2 deletions

```diff
@@ -33,8 +33,8 @@ func TestLoadConfig(t *testing.T) {
 		t.Errorf("Expected provider 'llama', got '%s'", llamaEndpoint.Provider)
 	}

-	if llamaEndpoint.Role != "primary" {
-		t.Errorf("Expected role 'primary', got '%s'", llamaEndpoint.Role)
+	if llamaEndpoint.Role != "public" {
+		t.Errorf("Expected role 'public', got '%s'", llamaEndpoint.Role)
 	}
 }
```

internal/helpers/helpers.go

Lines changed: 8 additions, 0 deletions

```diff
@@ -25,6 +25,8 @@ type Config struct {
 	ProxyMaxRetries     int
 	ProxyTimeout        int
 	ProxyTimeoutPerTry  int
+	PublicFirst         bool
+	PublicFirstAttempts int
 	RedisHost           string
 	RedisPass           string
 	RedisPort           string
@@ -52,6 +54,8 @@ func ParseFlags() *Config {
 	flag.IntVar(&config.ProxyMaxRetries, "proxy-retries", 3, "Maximum number of retries for proxy requests")
 	flag.IntVar(&config.ProxyTimeout, "proxy-timeout", 15, "Timeout for proxy requests in seconds")
 	flag.IntVar(&config.ProxyTimeoutPerTry, "proxy-timeout-per-try", 5, "Timeout per individual retry attempt in seconds")
+	flag.BoolVar(&config.PublicFirst, "public-first", false, "Prioritize public endpoints over primary endpoints")
+	flag.IntVar(&config.PublicFirstAttempts, "public-first-attempts", 2, "Number of attempts to make at public endpoints before trying primary/fallback")
 	flag.StringVar(&config.RedisHost, "redis-host", "localhost", "Redis host")
 	flag.StringVar(&config.RedisPass, "redis-pass", "", "Redis password")
 	flag.StringVar(&config.RedisPort, "redis-port", "6379", "Redis port")
@@ -117,6 +121,8 @@ func (c *Config) LoadConfiguration() *LoadedConfig {
 		ProxyMaxRetries:     c.GetIntValue("proxy-retries", c.ProxyMaxRetries, "PROXY_MAX_RETRIES", 3),
 		ProxyTimeout:        c.GetIntValue("proxy-timeout", c.ProxyTimeout, "PROXY_TIMEOUT", 15),
 		ProxyTimeoutPerTry:  c.GetIntValue("proxy-timeout-per-try", c.ProxyTimeoutPerTry, "PROXY_TIMEOUT_PER_TRY", 5),
+		PublicFirst:         c.GetBoolValue("public-first", c.PublicFirst, "PUBLIC_FIRST", false),
+		PublicFirstAttempts: c.GetIntValue("public-first-attempts", c.PublicFirstAttempts, "PUBLIC_FIRST_ATTEMPTS", 2),
 		RedisHost:           c.GetStringValue("redis-host", c.RedisHost, "REDIS_HOST", "localhost"),
 		RedisPass:           c.GetStringValue("redis-pass", c.RedisPass, "REDIS_PASS", ""),
 		RedisPort:           c.GetStringValue("redis-port", c.RedisPort, "REDIS_PORT", "6379"),
@@ -142,6 +148,8 @@ type LoadedConfig struct {
 	ProxyMaxRetries     int
 	ProxyTimeout        int
 	ProxyTimeoutPerTry  int
+	PublicFirst         bool
+	PublicFirstAttempts int
 	RedisHost           string
 	RedisPass           string
 	RedisPort           string
```
