# Gubernator

- Gubernator is a rate limiting service which calculates rate limits
- via a configurable algorithm.
+ Gubernator is a distributed, high performance, cloud native and stateless rate
+ limiting service designed to support many different rate limiting scenarios.
+
+ #### Scenarios
+ * Meter ingress traffic
+ * Meter egress traffic
+ * Limit bursts on network queues
+ * Enforce capacity limits on network services

## Architecture overview

![gubernator arch image](/architecture.png)

- Gubernator is designed to run as a cluster of peers which utilize an
- in memory cache of all the currently active rate limits, no data is
- ever synced to disk. Since most ingress HTTP rate limit durations are
- held for only a few seconds losing the in memory cache during a reboot
- or scheduled downtime isn't a huge deal. For Gubernator we choose
- performance over accuracy as it's acceptable for a small subset of
- traffic to be allowed to over request for a short period of time.
- Gubernator could be expanded in the future to store rate limits with
- longer durations to disk, but currently this is not supported.
-
- When a rate limit request is made to Gubernator the request is keyed and
- a consistent hashing algorithm is applied to determine which of the
- peers will be the coordinator for the rate limit request. Choosing a
- single coordinator for a rate limit makes atomic increments of counts very fast
- and avoids the complexity and latency involved in distributing counts consistently
- across a cluster of peers. Although simple and performant this design can be
- susceptible to a thundering herd of requests since a single coordinator is responsible
- for possibly hundreds of thousands of requests to a rate limit. To combat this the
- server can take multiple requests within a specified window and batch the requests
- into a single peer request, thus reducing the total number of requests to a single
+ Gubernator is designed to run as a distributed cluster of peers which utilize
+ an in-memory cache of all the currently active rate limits; as such, no data is
+ ever synced to disk. Since most network-based rate limit durations are held for
+ only a few seconds, losing the in-memory cache during a reboot or scheduled
+ downtime isn't a huge deal. For Gubernator we choose performance over accuracy,
+ as it's acceptable for a small subset of traffic to be allowed to over-request
+ for a short period of time (usually milliseconds) in the case of cache loss.
+
+ When a rate limit request is made to Gubernator, the request is keyed and a
+ consistent hashing algorithm is applied to determine which of the peers will be
+ the owner of the rate limit request. Choosing a single owner for a rate limit
+ makes atomic increments of counts very fast and avoids the complexity and
+ latency involved in distributing counts consistently across a cluster of peers.
+
+ Although simple and performant, this design can be susceptible to a thundering
+ herd of requests, since a single owner is responsible for possibly
+ hundreds of thousands of requests to a rate limit. To combat this, peers can
+ take multiple requests within a specified window and batch the requests into a
+ single peer request, thus reducing the total number of requests to a single
Gubernator peer tremendously.
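
As a rough illustration of the owner selection described above, the Go sketch below places peers on a hash ring and maps a `name` and `unique_key` pair to one of them. The `Ring` type, the `NewRing` and `Owner` helpers, the peer addresses, the key format, and the use of FNV-1a are all assumptions made for this example; none of them is taken from Gubernator's own implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Ring is a hypothetical consistent hash ring. Each peer sits on the ring
// at the hash of its address (hash collisions between peers are ignored
// to keep the sketch short).
type Ring struct {
	points []uint32          // sorted hash positions
	peers  map[uint32]string // hash position -> peer address
}

// hashKey hashes a string with FNV-1a (an assumed choice of hash).
func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// NewRing places every peer on the ring.
func NewRing(peers []string) *Ring {
	r := &Ring{peers: make(map[uint32]string)}
	for _, p := range peers {
		pt := hashKey(p)
		r.points = append(r.points, pt)
		r.peers[pt] = p
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Owner returns the first peer at or after the key's hash, wrapping
// around to the start of the ring when the hash is past the last peer.
func (r *Ring) Owner(name, uniqueKey string) string {
	h := hashKey(name + "_" + uniqueKey)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.peers[r.points[i]]
}

func main() {
	ring := NewRing([]string{"peer-a:9041", "peer-b:9041", "peer-c:9041"})
	fmt.Println(ring.Owner("requests_per_sec", "account_id=123|source_ip=172.0.0.1"))
}
```

Because every peer builds the same ring from the same peer list, each of them resolves a given key to the same owner without any coordination.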

To ensure each peer in the cluster accurately calculates the correct hash
@@ -38,18 +43,24 @@ simplify deployment.

## Gubernator Operation

- Gubernator does not read from a list of pre-configured rate limits.
- Instead each request to a peer includes the rate limit to be applied to the request.
- This allows clients who understand their rate limit problem space to create and apply
- new rate limit configurations without the need of an out of band process to configure
- the rate limiting service.
+ Unlike other generic rate limit service implementations, Gubernator does not have
+ the concept of pre-configured rate limits that clients make requests against.
+ Instead, each request to the service includes the rate limit config to be
+ applied to the request. This allows clients the flexibility to govern their
+ rate limit problem domain without the need to coordinate rate limit
+ configuration deployments with Gubernator.

- The rate limit configuration is stored with the current rate limit in the local cache of
- the coordinator owner. Rate limits and their configuration that are stored in the local
- cache will only exist for the specified duration of the rate limit configuration. After
- the duration time has expired, and if the rate limit was not requested again within the
- duration it is dropped from the cache. Subsequent requests for the same unique_key will
- recreate the config and rate limit in the cache and the cycle will repeat.
+ When a client or service makes a request to Gubernator, the rate limit config is
+ provided with each request by the client. The rate limit configuration is then
+ stored with the current rate limit status in the local cache of the rate limit
+ owner. Rate limits and their configuration that are stored in the local cache
+ will only exist for the specified duration of the rate limit configuration.
+ After the duration time has expired, and if the rate limit was not requested
+ again within the duration, it is dropped from the cache. Subsequent requests for
+ the same `name` and `unique_key` pair will recreate the config and rate limit
+ in the cache and the cycle will repeat. Subsequent requests with different
+ configs will overwrite the previous config and will apply the new config
+ immediately.
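
The cache lifecycle described above can be sketched in Go as follows. This is only an illustration under stated assumptions: the `Cache`, `Config`, and `entry` types and the `Hit` helper are hypothetical, expired entries are replaced lazily on the next request rather than swept, and a config change is assumed to keep the current hit count while restarting the duration.

```go
package main

import "fmt"

// Config is the rate limit config a client sends with every request.
type Config struct {
	Limit    int64
	Duration int64 // milliseconds
}

// entry is what the owner keeps in its local cache for one rate limit.
type entry struct {
	cfg       Config
	hits      int64
	expiresAt int64 // time in milliseconds
}

// Cache is a hypothetical in-memory cache keyed by name and unique_key.
type Cache struct {
	items map[string]*entry
}

func NewCache() *Cache { return &Cache{items: map[string]*entry{}} }

// Hit records hits at time now and returns the remaining count.
func (c *Cache) Hit(now int64, name, uniqueKey string, cfg Config, hits int64) int64 {
	key := name + "_" + uniqueKey
	e, ok := c.items[key]
	switch {
	case !ok || now >= e.expiresAt:
		// Unknown or expired: (re)create the entry from the request's config.
		e = &entry{cfg: cfg, expiresAt: now + cfg.Duration}
		c.items[key] = e
	case e.cfg != cfg:
		// Same key, different config: the new config applies immediately.
		// Keeping the hit count here is an assumption of this sketch.
		e.cfg = cfg
		e.expiresAt = now + cfg.Duration
	}
	if e.hits+hits <= e.cfg.Limit {
		e.hits += hits
	}
	if e.hits > e.cfg.Limit {
		return 0
	}
	return e.cfg.Limit - e.hits
}

func main() {
	c := NewCache()
	cfg := Config{Limit: 10, Duration: 1000}
	fmt.Println(c.Hit(0, "requests_per_sec", "account_id=123", cfg, 1))
	fmt.Println(c.Hit(10, "requests_per_sec", "account_id=123", cfg, 1))
	// After the duration has passed the entry is recreated from scratch.
	fmt.Println(c.Hit(1000, "requests_per_sec", "account_id=123", cfg, 1))
}
```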

An example rate limit request sent via GRPC might look like the following
```yaml
@@ -58,7 +69,7 @@ rate_limits:
# other applications that might also use the same unique_key
name: requests_per_sec
# A unique_key that identifies this rate limit request
- unique_key: account_id=24b00c590856900ee961b275asdfd|source_ip=172.0.0.1
+ unique_key: account_id=123|source_ip=172.0.0.1
# The number of hits we are requesting
hits: 1
# The total number of requests allowed for this rate limit
@@ -69,6 +80,11 @@ rate_limits:
# 0 = Token Bucket
# 1 = Leaky Bucket
algorithm: 0
+ # The behavior of the rate limit in gubernator.
+ # 0 = BATCHING (Enables batching of requests to peers)
+ # 1 = NO_BATCHING (Disables batching)
+ # 2 = GLOBAL (Enable global caching for this rate limit)
+ behavior: 0
```

An example response would be
@@ -88,34 +104,84 @@ rate_limits:
# Additional metadata about the request the client might find useful
metadata:
# This is the name of the coordinator that rate limited this request
- "owner": "api-n03.staging.us-east-1.definbox.com:9041"
+ "owner": "api-n03.staging.us-east-1.mailgun.org:9041"
```

#### Rate limit Algorithm
Gubernator currently supports 2 rate limit algorithms.

- 1. [Token Bucket](https://en.wikipedia.org/wiki/Token_bucket) is useful for rate limiting very
- bursty traffic. The downside to token bucket is that once you have hit the limit no more requests
- are allowed until the configured rate limit duration resets the bucket to zero.
- 2. [Leaky Bucket](https://en.wikipedia.org/wiki/Leaky_bucket) is useful for metering traffic
- at a consistent rate, as the bucket leaks at a consistent rate allowing traffic to continue
- without the need to wait for the configured rate limit duration to reset the bucket to zero.
+ 1. **Token Bucket** implementation starts with an empty bucket, then each `Hit`
+ adds a token to the bucket until the bucket is full. Once the bucket is
+ full, requests will return `OVER_LIMIT` until the `reset_time` is reached, at
+ which point the bucket is emptied and requests will return `UNDER_LIMIT`.
+ This algorithm is useful for enforcing very bursty limits (e.g. applications
+ where a single request can add more than 1 `hit` to the bucket, or non-network
+ based queuing systems). The downside to this implementation is that once you
+ have hit the limit, no more requests are allowed until the configured rate
+ limit duration resets the bucket to zero.
+
+ 2. [Leaky Bucket](https://en.wikipedia.org/wiki/Leaky_bucket) is implemented
+ similarly to **Token Bucket**, where `OVER_LIMIT` is returned when the bucket
+ is full. However, tokens leak from the bucket at a consistent rate which is
+ calculated as `duration / limit`. This algorithm is useful for metering, as
+ the bucket leaks allowing traffic to continue without the need to wait for
+ the configured rate limit duration to reset the bucket to zero.
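
To make the difference between the two algorithms concrete, here is a minimal Go sketch that follows the descriptions above. The `TokenBucket` and `LeakyBucket` types, their fields, and the `Hit` method are hypothetical names for this example, time is passed in explicitly as milliseconds, and rejecting a request that would overfill the bucket without counting it is an assumption rather than documented behavior.

```go
package main

import "fmt"

// Status mirrors the UNDER_LIMIT / OVER_LIMIT results named above.
type Status int

const (
	UnderLimit Status = iota
	OverLimit
)

// TokenBucket fills with hits and rejects once full, until the reset time.
// Limit must be greater than zero; Duration is in milliseconds.
type TokenBucket struct {
	Limit, Duration int64
	count, resetAt  int64
}

// Hit adds hits tokens at time now (milliseconds).
func (b *TokenBucket) Hit(now, hits int64) Status {
	if now >= b.resetAt {
		// The window has ended: empty the bucket and start a new window.
		b.count = 0
		b.resetAt = now + b.Duration
	}
	if b.count+hits > b.Limit {
		return OverLimit
	}
	b.count += hits
	return UnderLimit
}

// LeakyBucket drains one token every Duration/Limit milliseconds.
// Limit must be greater than zero; Duration is in milliseconds.
type LeakyBucket struct {
	Limit, Duration int64
	count, lastLeak int64
}

// Hit first leaks the tokens that drained since the last leak, then adds hits.
func (b *LeakyBucket) Hit(now, hits int64) Status {
	interval := b.Duration / b.Limit // milliseconds per leaked token
	if interval < 1 {
		interval = 1
	}
	if leaked := (now - b.lastLeak) / interval; leaked > 0 {
		b.count -= leaked
		b.lastLeak += leaked * interval
	}
	if b.count <= 0 {
		// An empty bucket has nothing to leak, so restart the leak clock.
		b.count = 0
		b.lastLeak = now
	}
	if b.count+hits > b.Limit {
		return OverLimit
	}
	b.count += hits
	return UnderLimit
}

func main() {
	tb := &TokenBucket{Limit: 2, Duration: 1000}
	lb := &LeakyBucket{Limit: 2, Duration: 1000}
	for _, now := range []int64{0, 1, 2, 500, 1000} {
		fmt.Printf("t=%dms token=%d leaky=%d\n", now, tb.Hit(now, 1), lb.Hit(now, 1))
	}
}
```

With a limit of 2 per 1000 ms, the token bucket stays over the limit until the full 1000 ms have passed, while the leaky bucket accepts a new hit every 500 ms.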
+
+ ## Global Limits
+ Since Gubernator rate limits are hashed and handled by a single peer in the
+ cluster, rate limits that apply to every request in a data center would result
+ in the rate limit request being handled by a single peer for the entirety of
+ the data center. For example, consider a rate limit with
+ `name=requests_per_datacenter` and a `unique_key=us-east-1`. Now imagine that a
+ request is made to Gubernator with this rate limit for every HTTP request that
+ enters the `us-east-1` data center. This could be hundreds of thousands,
+ potentially millions of requests per second that are all hashed and handled by
+ a single peer in the cluster. Because of this potential scaling issue,
+ Gubernator introduces a configurable `behavior` called `GLOBAL`.
+
+ When a rate limit is configured with `behavior=GLOBAL`, the rate limit request
+ that is received from a client will not be forwarded to the owning peer but
+ will be answered from an internal cache handled by the peer. `Hits` toward the
+ rate limit will be batched by the receiving peer and sent asynchronously to the
+ owning peer, where the hits will be totaled and `OVER_LIMIT` calculated. It
+ is then the responsibility of the owning peer to update each peer in the
+ cluster with the current status of the rate limit, such that peer internal
+ caches routinely get updated with the most current rate limit status.
+
+ ##### Side effects of global behavior
+ Since `Hits` are batched and forwarded to the owning peer asynchronously, the
+ immediate response to the client will not include the most accurate `remaining`
+ count, as that count will only get updated after the async call to the owner
+ peer is complete and the owning peer has had time to update all the peers in
+ the cluster. As a result, the use of `GLOBAL` allows for greater scale but at
+ the cost of consistency. Using `GLOBAL` also increases the amount of traffic
+ per rate limit request. `GLOBAL` should only be used for extremely high volume
+ rate limits that don't scale well with the traditional non `GLOBAL` behavior.
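
The trade-off described above can be sketched from the point of view of a non-owning peer. In this Go example the `GlobalPeer` and `CachedStatus` types and the `GetRateLimit`, `Flush`, and `UpdateFromOwner` helpers are hypothetical, and `Flush` is called explicitly where a real service would presumably send batches on a timer; it is a sketch of the idea, not Gubernator's code.

```go
package main

import (
	"fmt"
	"sync"
)

// CachedStatus is the last status the owning peer broadcast for a rate limit.
type CachedStatus struct {
	Remaining int64
	OverLimit bool
}

// GlobalPeer is a hypothetical non-owning peer handling GLOBAL rate limits.
type GlobalPeer struct {
	mu      sync.Mutex
	cache   map[string]CachedStatus // filled by broadcasts from the owner
	pending map[string]int64        // hits not yet sent to the owner
}

func NewGlobalPeer() *GlobalPeer {
	return &GlobalPeer{cache: map[string]CachedStatus{}, pending: map[string]int64{}}
}

// GetRateLimit answers from the local cache and queues the hits. The
// answer may be stale (or a zero value if the owner has not broadcast yet).
func (p *GlobalPeer) GetRateLimit(key string, hits int64) CachedStatus {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.pending[key] += hits
	return p.cache[key]
}

// Flush hands the batched hit totals to send, which stands in for the
// asynchronous call to the owning peer, and clears the queue.
func (p *GlobalPeer) Flush(send func(key string, hits int64)) {
	p.mu.Lock()
	batch := p.pending
	p.pending = map[string]int64{}
	p.mu.Unlock()
	for k, h := range batch {
		send(k, h)
	}
}

// UpdateFromOwner stores a status broadcast by the owning peer.
func (p *GlobalPeer) UpdateFromOwner(key string, s CachedStatus) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.cache[key] = s
}

func main() {
	const key = "requests_per_datacenter_us-east-1"
	peer := NewGlobalPeer()
	peer.UpdateFromOwner(key, CachedStatus{Remaining: 100})
	for i := 0; i < 3; i++ {
		// Every answer reports the same cached count until the owner updates it.
		fmt.Println(peer.GetRateLimit(key, 1).Remaining)
	}
	peer.Flush(func(k string, hits int64) { fmt.Println("send to owner:", k, hits) })
}
```

Three client requests become a single message to the owner, which is where the extra scale comes from, and the stale `Remaining` values are the consistency cost.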
+
+ ## Performance
+ TODO: Show some performance metrics of gubernator running in production

## API
- All Methods implement in GRPC are exposed to HTTP via the
+ All methods are accessed via GRPC but are also exposed via HTTP using the
[GRPC Gateway](https://github.com/grpc-ecosystem/grpc-gateway)

#### Health Check
Health check returns `unhealthy` in the event a peer is reported by etcd as `up` but the server
instance is unable to contact the peer via its advertised address.

+ ###### GRPC
+ ```grpc
+ rpc HealthCheck (HealthCheckReq) returns (HealthCheckResp)
+ ```
+
+ ###### HTTP
```
GET /v1/HealthCheck
```

Example response:

- ```javascript
+ ```json
{
"status": "healthy",
"peer_count": 3
@@ -127,18 +193,24 @@ Rate limits can be applied or retrieved using this interface. If the client
makes a request to the server with `hits: 0` then the current state of the rate
limit is retrieved but not incremented.

+ ###### GRPC
+ ```grpc
+ rpc GetRateLimits (GetRateLimitsReq) returns (GetRateLimitsResp)
+ ```
+
+ ###### HTTP
```
POST /v1/GetRateLimits
```

Example Payload
- ```javascript
+ ```json
{
- "rate_limits": [
+ "requests": [
{
+ "name": "requests_per_sec",
+ "unique_key": "account.id=1234",
"hits": 1,
- "namespace": "my-app",
- "unique_key": "domain.id=1234",
"duration": 60000,
"limit": 10
}
@@ -148,10 +220,11 @@ Example Payload

Example response:

- ```javascript
+ ```json
{
- "rate_limits": [
+ "responses": [
{
+ "status": 0,
"limit": "10",
"remaining": "7",
"reset_time": "1551309219226",
@@ -161,6 +234,10 @@ Example response:
```
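
For readers who want to call the HTTP endpoint from Go, a minimal client sketch is shown here. The struct and function names (`RateLimitReq`, `RateLimitResp`, `GetRateLimits`, `stubServer`) are hypothetical and only mirror the JSON fields in the example payload and response; the stub server simply replays the example response so the sketch runs without a Gubernator instance, and a real deployment's base URL would need to be substituted.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// RateLimitReq mirrors one entry of "requests" in the example payload.
type RateLimitReq struct {
	Name      string `json:"name"`
	UniqueKey string `json:"unique_key"`
	Hits      int64  `json:"hits"`
	Duration  int64  `json:"duration"`
	Limit     int64  `json:"limit"`
}

// RateLimitResp mirrors the fields shown in the example response, where
// the counters arrive as JSON strings.
type RateLimitResp struct {
	Status    int    `json:"status"`
	Limit     string `json:"limit"`
	Remaining string `json:"remaining"`
	ResetTime string `json:"reset_time"`
}

// GetRateLimits posts the requests to baseURL + "/v1/GetRateLimits".
func GetRateLimits(baseURL string, reqs []RateLimitReq) ([]RateLimitResp, error) {
	body, err := json.Marshal(map[string][]RateLimitReq{"requests": reqs})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(baseURL+"/v1/GetRateLimits", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected HTTP status: %s", resp.Status)
	}
	var out struct {
		Responses []RateLimitResp `json:"responses"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return out.Responses, nil
}

// stubServer stands in for a Gubernator peer and replays the example response.
func stubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/v1/GetRateLimits" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"responses":[{"status":0,"limit":"10","remaining":"7","reset_time":"1551309219226"}]}`)
	}))
}

func main() {
	srv := stubServer()
	defer srv.Close()

	resps, err := GetRateLimits(srv.URL, []RateLimitReq{{
		Name: "requests_per_sec", UniqueKey: "account.id=1234",
		Hits: 1, Duration: 60000, Limit: 10,
	}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", resps)
}
```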


+ ## Installation
+ TODO: Show how to run gubernator in a docker container with just environs
+
+
## Development with Docker Compose
Gubernator uses etcd to keep track of all its peers. This peer list is
used by the consistent hash to calculate which peer is the coordinator
@@ -173,13 +250,11 @@ You will need to be on the VPN to pull docker images from the repository.
# Start the containers
$ docker-compose up -d

- # Run radar to create the configs in etcd (https://github.com/mailgun/radar)
- push_configs --etcd-endpoint localhost:2379 --env-names test,dev
-
# Run gubernator
- export ETCD3_ENDPOINT=localhost:2379
- export MG_ENV=dev
$ cd golang
$ go run ./cmd/gubernator --config config.yaml
```

+ ### What kind of name is Gubernator?
+ Gubernator is the [English pronunciation of governor](https://www.google.com/search?q=how+to+say+governor+in+russian&oq=how+to+say+govener+in+russ)
+ in Russian; also, it sounds cool.