This repository was archived by the owner on Apr 19, 2024. It is now read-only.

Commit 5eea10b

Merge pull request #13 from mailgun/thrawn/develop

PIP-574: Support global rate limits

2 parents 15558cd + 009a3e7, commit 5eea10b

23 files changed: +640 −162 lines changed

Dockerfile (-3)

@@ -1,6 +1,3 @@
-# TODO: https://stackoverflow.com/questions/1722807/how-to-convert-git-urls-to-http-urls
-# TODO: https://medium.com/@tonistiigi/build-secrets-and-ssh-forwarding-in-docker-18-09-ae8161d066
-
 # Build image
 FROM golang:1.11.2 as build
 
README.md (+129 −54)

@@ -1,32 +1,37 @@
 # Gubernator
 
-Gubernator is a rate limiting service which calculates rate limits
-via a configurable algorithm.
+Gubernator is a distributed, high performance, cloud native and stateless rate
+limiting service designed to support many different rate limiting scenarios.
+
+#### Scenarios
+* Meter ingress traffic
+* Meter egress traffic
+* Limit bursts on network queues
+* Enforce capacity limits on network services
 
 ## Architecture overview
 
 ![gubernator arch image](/architecture.png)
 
-Gubernator is designed to run as a cluster of peers which utilize an
-in memory cache of all the currently active rate limits, no data is
-ever synced to disk. Since most ingress HTTP rate limit durations are
-held for only a few seconds losing the in memory cache during a reboot
-or scheduled downtime isn't a huge deal. For Gubernator we choose
-performance over accuracy as it's acceptable for a small subset of
-traffic to be allowed to over request for a short period of time.
-Gubernator could be expanded in the future to store rate limits with
-longer durations to disk, but currently this is not supported.
-
-When a rate limit request is made to Gubernator the request is keyed and
-a consistent hashing algorithm is applied to determine which of the
-peers will be the coordinator for the rate limit request. Choosing a
-single coordinator for a rate limit makes atomic increments of counts very fast
-and avoids the complexity and latency involved in distributing counts consistently
-across a cluster of peers. Although simple and performant this design can be
-susceptible to a thundering herd of requests since a single coordinator is responsible
-for possibly hundreds of thousands of requests to a rate limit. To combat this the
-server can take multiple requests within a specified window and batch the requests
-into a single peer request, thus reducing the total number of requests to a single
+Gubernator is designed to run as a distributed cluster of peers which utilize
+an in memory cache of all the currently active rate limits, as such no data is
+ever synced to disk. Since most network based rate limit durations are held for
+only a few seconds losing the in memory cache during a reboot or scheduled
+downtime isn't a huge deal. For Gubernator we choose performance over accuracy
+as it's acceptable for a small subset of traffic to be allowed to over request
+for a short period of time (usually milliseconds) in the case of cache loss.
+
+When a rate limit request is made to Gubernator the request is keyed and a
+consistent hashing algorithm is applied to determine which of the peers will be
+the owner of the rate limit request. Choosing a single owner for a rate limit
+makes atomic increments of counts very fast and avoids the complexity and
+latency involved in distributing counts consistently across a cluster of peers.
+
+Although simple and performant this design can be susceptible to a thundering
+herd of requests since a single coordinator is responsible for possibly
+hundreds of thousands of requests to a rate limit. To combat this peers can
+take multiple requests within a specified window and batch the requests into a
+single peer request, thus reducing the total number of requests to a single
 Gubernator peer tremendously.
 
 To ensure each peer in the cluster accurately calculates the correct hash
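The keyed-request-to-owner flow described in this hunk can be sketched in Go. This is a minimal illustration of hash-based peer selection only: Gubernator's real implementation uses a consistent hash ring (plain modulo is shown here just to convey the idea), and the key format and peer names are assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickOwner hashes the rate limit key so that every node in the
// cluster independently agrees on which peer owns that key.
// Plain modulo is used purely for illustration; a consistent hash
// minimizes re-mapping when the peer list changes.
func pickOwner(name, uniqueKey string, peers []string) string {
	h := fnv.New32a()
	h.Write([]byte(name + "_" + uniqueKey))
	return peers[int(h.Sum32())%len(peers)]
}

func main() {
	peers := []string{"peer-0:81", "peer-1:81", "peer-2:81"}
	// Every peer computes the same owner for the same key, so counts
	// for one rate limit are always incremented in one place.
	fmt.Println(pickOwner("requests_per_sec", "account_id=123", peers))
}
```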
@@ -38,18 +43,24 @@ simplify deployment.
 
 ## Gubernator Operation
 
-Gubernator does not read from a list of pre-configured rate limits.
-Instead each request to a peer includes the rate limit to be applied to the request.
-This allows clients who understand their rate limit problem space to create and apply
-new rate limit configurations without the need of an out of band process to configure
-the rate limiting service.
+Unlike other generic rate limit service implementations, Gubernator does not have
+the concept of a pre-configured rate limit that clients make requests against.
+Instead each request to the service includes the rate limit config to be
+applied to the request. This allows clients the flexibility to govern their
+rate limit problem domain without the need to coordinate rate limit
+configuration deployments with Gubernator.
 
-The rate limit configuration is stored with the current rate limit in the local cache of
-the coordinator owner. Rate limits and their configuration that are stored in the local
-cache will only exist for the specified duration of the rate limit configuration. After
-the duration time has expired, and if the rate limit was not requested again within the
-duration it is dropped from the cache. Subsequent requests for the same unique_key will
-recreate the config and rate limit in the cache and the cycle will repeat.
+When a client or service makes a request to Gubernator the rate limit config is
+provided with each request by the client. The rate limit configuration is then
+stored with the current rate limit status in the local cache of the rate limit
+owner. Rate limits and their configuration that are stored in the local cache
+will only exist for the specified duration of the rate limit configuration.
+After the duration time has expired, and if the rate limit was not requested
+again within the duration it is dropped from the cache. Subsequent requests for
+the same `name` and `unique_key` pair will recreate the config and rate limit
+in the cache and the cycle will repeat. Subsequent requests with different
+configs will overwrite the previous config and will apply the new config
+immediately.
 
 An example rate limit request sent via GRPC might look like the following
 ```yaml
@@ -58,7 +69,7 @@ rate_limits:
 # other applications that might also use the same unique_key
 name: requests_per_sec
 # A unique_key that identifies this rate limit request
-unique_key: account_id=24b00c590856900ee961b275asdfd|source_ip=172.0.0.1
+unique_key: account_id=123|source_ip=172.0.0.1
 # The number of hits we are requesting
 hits: 1
 # The total number of requests allowed for this rate limit
@@ -69,6 +80,11 @@ rate_limits:
 # 0 = Token Bucket
 # 1 = Leaky Bucket
 algorithm: 0
+# The behavior of the rate limit in gubernator.
+# 0 = BATCHING (Enables batching of requests to peers)
+# 1 = NO_BATCHING (Disables batching)
+# 2 = GLOBAL (Enable global caching for this rate limit)
+behavior: 0
 ```
 
 An example response would be
@@ -88,34 +104,84 @@ rate_limits:
 # Additional metadata about the request the client might find useful
 metadata:
 # This is the name of the coordinator that rate limited this request
-"owner": "api-n03.staging.us-east-1.definbox.com:9041"
+"owner": "api-n03.staging.us-east-1.mailgun.org:9041"
 ```
 
 #### Rate limit Algorithm
 Gubernator currently supports 2 rate limit algorithms.
 
-1. [Token Bucket](https://en.wikipedia.org/wiki/Token_bucket) is useful for rate limiting very
-bursty traffic. The downside to token bucket is that once you have hit the limit no more requests
-are allowed until the configured rate limit duration resets the bucket to zero.
-2. [Leaky Bucket](https://en.wikipedia.org/wiki/Leaky_bucket) is useful for metering traffic
-at a consistent rate, as the bucket leaks at a consistent rate allowing traffic to continue
-without the need to wait for the configured rate limit duration to reset the bucket to zero.
+1. **Token Bucket** implementation starts with an empty bucket, then each `Hit`
+adds a token to the bucket until the bucket is full. Once the bucket is
+full, requests will return `OVER_LIMIT` until the `reset_time` is reached at
+which point the bucket is emptied and requests will return `UNDER_LIMIT`.
+This algorithm is useful for enforcing very bursty limits. (IE: Applications
+where a single request can add more than 1 `hit` to the bucket; or non network
+based queuing systems.) The downside to this implementation is that once you
+have hit the limit no more requests are allowed until the configured rate
+limit duration resets the bucket to zero.
+
+2. [Leaky Bucket](https://en.wikipedia.org/wiki/Leaky_bucket) is implemented
+similarly to **Token Bucket** where `OVER_LIMIT` is returned when the bucket
+is full. However tokens leak from the bucket at a consistent rate which is
+calculated as `duration / limit`. This algorithm is useful for metering, as
+the bucket leaks allowing traffic to continue without the need to wait for
+the configured rate limit duration to reset the bucket to zero.
+
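The leaky bucket's leak rate described above (`duration / limit`) can be checked with a small sketch. The `leaked` helper is hypothetical, written only to illustrate the formula; it is not the repository's code.

```go
package main

import "fmt"

// leaked returns how many tokens drain from a leaky bucket after
// elapsedMs, given a leak rate of one token per (durationMs / limit)
// milliseconds. Hypothetical helper illustrating the formula only.
func leaked(elapsedMs, durationMs, limit int64) int64 {
	return elapsedMs / (durationMs / limit)
}

func main() {
	// limit=10 over duration=60000ms => one token leaks every 6000ms,
	// so after 18 seconds three requests' worth of capacity returns.
	fmt.Println(leaked(18000, 60000, 10))
}
```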
+## Global Limits
+Since Gubernator rate limits are hashed and handled by a single peer in the
+cluster, rate limits that apply to every request in a data center would result
+in the rate limit request being handled by a single peer for the entirety of
+the data center. For example, consider a rate limit with
+`name=requests_per_datacenter` and a `unique_id=us-east-1`. Now imagine that a
+request is made to Gubernator with this rate limit for every http request that
+enters the `us-east-1` data center. This could be hundreds of thousands,
+potentially millions of requests per second that are all hashed and handled by
+a single peer in the cluster. Because of this potential scaling issue
+Gubernator introduces a configurable `behavior` called `GLOBAL`.
+
+When a rate limit is configured with `behavior=GLOBAL` the rate limit request
+that is received from a client will not be forwarded to the owning peer but
+will be answered from an internal cache handled by the peer. `Hits` toward the
+rate limit will be batched by the receiving peer and sent asynchronously to the
+owning peer where the hits will be totaled and `OVER_LIMIT` calculated. It
+is then the responsibility of the owning peer to update each peer in the
+cluster with the current status of the rate limit, such that peer internal
+caches routinely get updated with the most current rate limit status.
+
+##### Side effects of global behavior
+Since `Hits` are batched and forwarded to the owning peer asynchronously, the
+immediate response to the client will not include the most accurate `remaining`
+counts, as that count will only get updated after the async call to the owner
+peer is complete and the owning peer has had time to update all the peers in
+the cluster. As a result the use of `GLOBAL` allows for greater scale but at
+the cost of consistency. Using `GLOBAL` also increases the amount of traffic
+per rate limit request. `GLOBAL` should only be used for extremely high volume
+rate limits that don't scale well with the traditional non `GLOBAL` behavior.
+
+## Performance
+TODO: Show some performance metrics of gubernator running in production
 
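The batch-then-forward idea behind `GLOBAL` can be modeled in a few lines. The `update` type and `batch` function below are illustrative only and are not part of Gubernator's API; they show how many locally answered hits collapse into a single asynchronous report to the owning peer.

```go
package main

import "fmt"

// update models the single asynchronous report the receiving peer
// sends to the owning peer (illustrative type, not Gubernator's API).
type update struct {
	key  string
	hits int64
}

// batch collapses many locally answered hits against one key into a
// single update for the owner, which is the core of GLOBAL behavior:
// the owner sees one report instead of one request per hit.
func batch(key string, hits []int64) update {
	u := update{key: key}
	for _, h := range hits {
		u.hits += h
	}
	return u
}

func main() {
	// Five client requests handled locally become one report.
	u := batch("requests_per_datacenter_us-east-1", []int64{1, 1, 1, 1, 1})
	fmt.Println(u.key, u.hits)
}
```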
 ## API
-All Methods implement in GRPC are exposed to HTTP via the
+All methods are accessed via GRPC but are also exposed via HTTP using the
 [GRPC Gateway](https://github.com/grpc-ecosystem/grpc-gateway)
 
 #### Health Check
 Health check returns `unhealthy` in the event a peer is reported by etcd as `up` but the server
 instance is unable to contact the peer via its advertised address.
 
+###### GRPC
+```grpc
+rpc HealthCheck (HealthCheckReq) returns (HealthCheckResp)
+```
+
+###### HTTP
 ```
 GET /v1/HealthCheck
 ```
 
 Example response:
 
-```javascript
+```json
 {
 "status": "healthy",
 "peer_count": 3
@@ -127,18 +193,24 @@ Rate limits can be applied or retrieved using this interface. If the client
 makes a request to the server with `hits: 0` then the current state of the rate
 limit is retrieved but not incremented.
 
+###### GRPC
+```grpc
+rpc GetRateLimits (GetRateLimitsReq) returns (GetRateLimitsResp)
+```
+
+###### HTTP
 ```
 POST /v1/GetRateLimits
 ```
 
 Example Payload
-```javascript
+```json
 {
-"rate_limits": [
+"requests":[
 {
+"name": "requests_per_sec",
+"unique_key": "account.id=1234",
 "hits": 1,
-"namespace": "my-app",
-"unique_key": "domain.id=1234",
 "duration": 60000,
 "limit": 10
 }
@@ -148,10 +220,11 @@ Example Payload
 
 Example response:
 
-```javascript
+```json
 {
-"rate_limits": [
+"responses":[
 {
+"status": 0,
 "limit": "10",
 "remaining": "7",
 "reset_time": "1551309219226",
@@ -161,6 +234,10 @@ Example response:
 ```
 
 
+## Installation
+TODO: Show how to run gubernator in a docker container with just environs
+
+
 ## Development with Docker Compose
 Gubernator uses etcd to keep track of all its peers. This peer list is
 used by the consistent hash to calculate which peer is the coordinator
@@ -173,13 +250,11 @@ You will need to be on the VPN to pull docker images from the repository.
 # Start the containers
 $ docker-compose up -d
 
-# Run radar to create the configs in etcd (https://github.com/mailgun/radar)
-push_configs --etcd-endpoint localhost:2379 --env-names test,dev
-
 # Run gubernator
-export ETCD3_ENDPOINT=localhost:2379
-export MG_ENV=dev
 $ cd golang
 $ go run ./cmd/gubernator --config config.yaml
 ```
 
+### What kind of name is Gubernator?
+Gubernator is the [english pronunciation of governor](https://www.google.com/search?q=how+to+say+governor+in+russian&oq=how+to+say+govener+in+russ)
+in Russian, also it sounds cool.

docker-compose.yaml (-6)

@@ -14,9 +14,3 @@ services:
       -initial-cluster-state new
     ports:
       - "2379:2379"
-  metrics:
-    image: registry.postgun.com:5000/kubegun/metrics:latest
-    ports:
-      - "8125:8125/udp"
-      - "8081:81"
-      - "8080:80"

go.mod (+1)

@@ -8,6 +8,7 @@ require (
 	github.com/coreos/go-semver v0.3.0 // indirect
 	github.com/coreos/go-systemd v0.0.0-20190321100706-95778dfbb74e // indirect
 	github.com/coreos/pkg v0.0.0-20180928190104-399ea9e2e55f // indirect
+	github.com/davecgh/go-spew v1.1.1
 	github.com/dgrijalva/jwt-go v3.2.0+incompatible // indirect
 	github.com/fatih/structs v1.1.0 // indirect
 	github.com/ghodss/yaml v1.0.0

golang/algorithms.go (+10 −1)

@@ -99,7 +99,11 @@ func leakyBucket(c cache.Cache, r *RateLimitReq) (*RateLimitResp, error) {
 		b.LimitRemaining = b.Limit
 	}
 
-	b.TimeStamp = now
+	// Only update the TS if client is incrementing the hit
+	if r.Hits != 0 {
+		b.TimeStamp = now
+	}
+
 	rl := &RateLimitResp{
 		Limit:     b.Limit,
 		Remaining: b.LimitRemaining,
@@ -127,6 +131,11 @@ func leakyBucket(c cache.Cache, r *RateLimitReq) (*RateLimitResp, error) {
 		return rl, nil
 	}
 
+	// Client is only interested in retrieving the current status
+	if r.Hits == 0 {
+		return rl, nil
+	}
+
 	b.LimitRemaining -= r.Hits
 	rl.Remaining = b.LimitRemaining
 	c.UpdateExpiration(r.HashKey(), now*r.Duration)
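The intent of this change, `hits: 0` acting as a read-only peek, can be modeled in isolation. The `bucket` type below is a reduced sketch, not the real cache-backed `leakyBucket`: a zero-hit request must neither advance the timestamp nor consume capacity.

```go
package main

import "fmt"

// bucket is a reduced model of the fix above: when hits == 0 the
// client only peeks at the bucket, so neither the timestamp nor the
// remaining count is mutated (illustrative, not the real code).
type bucket struct {
	remaining int64
	timeStamp int64
}

// take applies hits at time now and returns the remaining count.
func (b *bucket) take(hits, now int64) int64 {
	if hits != 0 {
		b.timeStamp = now // only advance the clock on a real hit
		b.remaining -= hits
	}
	return b.remaining
}

func main() {
	b := &bucket{remaining: 10, timeStamp: 100}
	fmt.Println(b.take(0, 200)) // peek: remaining unchanged
	fmt.Println(b.take(1, 300)) // real hit: remaining decremented
}
```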

golang/cache/types.go (+1 −1)

@@ -1,7 +1,7 @@
 package cache
 
 // Interface accepts any cache which returns cache stats
-type CacheStats interface {
+type Stater interface {
 	Stats(bool) Stats
 }
 

golang/cluster/cluster.go (+18 −1)

@@ -7,6 +7,7 @@ import (
 	"github.com/sirupsen/logrus"
 	"google.golang.org/grpc"
 	"net"
+	"time"
 )
 
 type instance struct {
@@ -39,6 +40,16 @@ func GetPeer() string {
 	return gubernator.RandomPeer(peers)
 }
 
+// Returns a specific peer
+func PeerAt(idx int) string {
+	return peers[idx]
+}
+
+// Returns a specific instance
+func InstanceAt(idx int) *instance {
+	return instances[idx]
+}
+
 // Start a local cluster of gubernator servers
 func Start(numInstances int) error {
 	var addresses []string
@@ -53,7 +64,13 @@ func StartWith(addresses []string) error {
 	for _, address := range addresses {
 		srv := grpc.NewServer()
 
-		guber, err := gubernator.New(gubernator.Config{GRPCServer: srv})
+		guber, err := gubernator.New(gubernator.Config{
+			GRPCServer: srv,
+			Behaviors: gubernator.BehaviorConfig{
+				GlobalSyncWait: time.Millisecond * 50, // Suitable for testing but not production
+				GlobalTimeout:  time.Second,
+			},
+		})
 		if err != nil {
 			return errors.Wrap(err, "while creating new gubernator instance")
 		}
