diff --git a/.gitignore b/.gitignore index 3cec0c56..ce3f52e1 100644 --- a/.gitignore +++ b/.gitignore @@ -3,3 +3,4 @@ __pycache__ *.pyc gubernator.egg-info/ +.DS_Store diff --git a/CHANGELOG b/CHANGELOG index a1936bbd..3397b792 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added * Support for prometheus monitoring * Support for environment based config +* Support for kubernetes peer discovery ## [0.4.0] - 2019-07-16 ### Added diff --git a/README.md b/README.md index fea41d94..b6d7d006 100644 --- a/README.md +++ b/README.md @@ -1,74 +1,38 @@ # Gubernator Gubernator is a distributed, high performance, cloud native and stateless rate -limiting service designed to support many different rate limiting scenarios. - -#### Scenarios -* Meter ingress traffic -* Meter egress traffic -* Limit bursts on network queues -* Enforce capacity limits on network services - -## Architecture overview - -![gubernator arch image](/architecture.png) - -Gubernator is designed to run as a distributed cluster of peers which utilize -an in memory cache of all the currently active rate limits, as such no data is -ever synced to disk. Since most network based rate limit durations are held for -only a few seconds losing the in memory cache during a reboot or scheduled -downtime isn't a huge deal. For Gubernator we choose performance over accuracy -as it's acceptable for a small subset of traffic to be allowed to over request -for a short period of time (usually milliseconds) in the case of cache loss. - -When a rate limit request is made to Gubernator the request is keyed and a -consistent hashing algorithm is applied to determine which of the peers will be -the owner of the rate limit request. Choosing a single owner for a rate limit -makes atomic increments of counts very fast and avoids the complexity and -latency involved in distributing counts consistently across a cluster of peers. - -Although simple and performant this design can be susceptible to a thundering -herd of requests since a single coordinator is responsible for possibly -hundreds of thousands of requests to a rate limit. To combat this peers can -take multiple requests within a specified window and batch the requests into a -single peer request, thus reducing the total number of requests to a single -Gubernator peer tremendously. - -To ensure each peer in the cluster accurately calculates the correct hash -for a rate limit key, the list of peers in the cluster must be distributed -to each peer in the cluster in a timely and consistent manner. Currently -Gubernator uses Etcd to distribute the list of peers, This could be later -expanded to a consul or a custom consistent implementation which would further -simplify deployment. - -## Gubernator Operation - -Unlike other generic rate limit service implementations, Gubernator does not have -the concept of pre-configured rate limit that clients make requests against. -Instead each request to the service includes the rate limit config to be -applied to the request. This allows clients the flexibility to govern their -rate limit problem domain without the need to coordinate rate limit -configuration deployments with Gubernator. - -When a client or service makes a request to Gubernator the rate limit config is -provided with each request by the client. The rate limit configuration is then -stored with the current rate limit status in the local cache of the rate limit -owner. 
Rate limits and their configuration that are stored in the local cache
-will only exist for the specified duration of the rate limit configuration.
-After the duration time has expired, and if the rate limit was not requested
-again within the duration it is dropped from the cache. Subsequent requests for
-the same `name` and `unique_key` pair will recreate the config and rate limit
-in the cache and the cycle will repeat. Subsequent requests with different
-configs will overwrite the previous config and will apply the new config
-immediately.
+limiting service.
+
+
+#### Features of Gubernator
+* Gubernator evenly distributes rate limit requests across the entire cluster,
+  which means you can scale the system by simply adding more nodes.
+* Gubernator doesn’t rely on external caches like memcache or redis, so
+  there is no deployment synchronization with a dependent service. This makes
+  dynamically growing or shrinking the cluster in an orchestration system like
+  kubernetes or nomad trivial.
+* Gubernator holds no state on disk; its configuration is passed to it by the
+  client on a per-request basis.
+* Gubernator provides both GRPC and HTTP access to its API.
+* Can be run as a sidecar to services that need rate limiting or as a separate service.
+* Can be used as a library to implement a domain specific rate limiting service.
+* Supports optional eventually consistent rate limit distribution for extremely
+  high throughput environments. (See GLOBAL behavior in [architecture.md](/architecture.md))
+* Gubernator is the English pronunciation of governor in Russian; it also sounds cool.
+
+### Stateless configuration
+Gubernator is stateless in that it doesn’t require disk space to operate. No
+configuration or cache data is ever synced to disk. This is because every
+request to gubernator includes the config for the rate limit. At first you
+might think this an unnecessary overhead to each request. However, in reality a
+rate limit config is made up of only four 64-bit integers.
 
 An example rate limit request sent via GRPC might look like the following
 
 ```yaml
 rate_limits:
-    # Scopes the unique_key to your application to avoid collisions with
-    # other applications that might also use the same unique_key
-  - name: requests_per_sec
-    # A unique_key that identifies this rate limit request
+    # Scopes the request to a specific rate limit
+  - name: requests_per_sec
+    # A unique_key that identifies this instance of a rate limit request
     unique_key: account_id=123|source_ip=172.0.0.1
     # The number of hits we are requesting
     hits: 1
@@ -87,27 +51,27 @@ rate_limits:
     behavior: 0
 ```
 
-And example response would be
+An example response would be
 
 ```yaml
 rate_limits:
-    # The status of the rate limit. OK = 0, OVER_LIMIT = 1
-    status: 0,
-    # The current configured limit
-    limit: 10,
-    # The number of requests remaining
-    remaining: 7,
-    # A unix timestamp in milliseconds of when the bucket will reset, or if
-    # OVER_LIMIT is set it is the time at which the rate limit will no
-    # longer return OVER_LIMIT.
-    reset_time: 1551309219226,
-    # Additional metadata about the request the client might find useful
-    metadata:
-      # This is the name of the coordinator that rate limited this request
-      "owner": "api-n03.staging.us-east-1.mailgun.org:9041"
+    # The status of the rate limit. OK = 0, OVER_LIMIT = 1
+  - status: 0,
+    # The current configured limit
+    limit: 10,
+    # The number of requests remaining
+    remaining: 7,
+    # A unix timestamp in milliseconds of when the bucket will reset, or if
+    # OVER_LIMIT is set it is the time at which the rate limit will no
+    # longer return OVER_LIMIT.
+    reset_time: 1551309219226,
+    # Additional metadata about the request the client might find useful
+    metadata:
+      # This is the name of the coordinator that rate limited this request
+      "owner": "api-n03.staging.us-east-1.mailgun.org:9041"
 ```
 
-#### Rate limit Algorithm
+### Rate limit Algorithm
 Gubernator currently supports 2 rate limit algorithms.
 
 1. **Token Bucket** implementation starts with an empty bucket, then each `Hit`
@@ -127,47 +91,38 @@ Gubernator currently supports 2 rate limit algorithms.
    the bucket leaks allowing traffic to continue without the need to wait for
    the configured rate limit duration to reset the bucket to zero.
 
-## Global Limits
-Since Gubernator rate limits are hashed and handled by a single peer in the
-cluster. Rate limits that apply to every request in a data center would result
-in the rate limit request being handled by a single peer for the entirety of
-the data center. For example, consider a rate limit with
-`name=requests_per_datacenter` and a `unique_id=us-east-1`. Now imagine that a
-request is made to Gubernator with this rate limit for every http request that
-enters the `us-east-1` data center. This could be hundreds of thousands,
-potentially millions of requests per second that are all hashed and handled by
-a single peer in the cluster. Because of this potential scaling issue
-Gubernator introduces a configurable `behavior` called `GLOBAL`.
-
-When a rate limit is configured with `behavior=GLOBAL` the rate limit request
-that is received from a client will not be forwarded to the owning peer but
-will be answered from an internal cache handled by the peer. `Hits` toward the
-rate limit will be batched by the receiving peer and sent asynchronously to the
-owning peer where the hits will be totaled and `OVER_LIMIT` calculated. It
-is then the responsibility of the owning peer to update each peer in the
-cluster with the current status of the rate limit, such that peer internal
-caches routinely get updated with the most current rate limit status.
-
-##### Side effects of global behavior
-Since `Hits` are batched and forwarded to the owning peer asynchronously, the
-immediate response to the client will not include the most accurate `remaining`
-counts. As that count will only get updated after the async call to the owner
-peer is complete and the owning peer has had time to update all the peers in
-the cluster. As a result the use of `GLOBAL` allows for greater scale but at
-the cost of consistency. Using `GLOBAL` also increases the amount of traffic
-per rate limit request. `GLOBAL` should only be used for extremely high volume
-rate limits that don't scale well with the traditional non `GLOBAL` behavior.
-
-## Performance
-TODO: Show some performance metrics of gubernator running in production
-
-## API
+### Performance
+In our production environment, for every request to our API we send 2 rate
+limit requests to gubernator for rate limit evaluation: one to rate limit the
+HTTP request itself and the other to rate limit the number of recipients a user
+can send an email to within the specified duration. Under this setup a single
+gubernator node fields over 2,000 requests a second with most batched responses
+returned in under 1 millisecond.
+
+![requests graph](/images/requests-graph.png)
+
+Peer requests forwarded to owning nodes typically respond in under 30 microseconds.
+
+![peer requests graph](/images/peer-requests-graph.png)
+
+NOTE: The above graphs only report the slowest request within the 1 second
+sample time, so you are seeing the slowest requests gubernator fields to clients.
+
+Gubernator allows users to choose non-batching behavior which would further
+reduce latency for client rate limit requests. However, because of throughput
+requirements our production environment uses `Behavior=BATCHING` with the
+default 500 microsecond window. In production we have observed batch sizes of
+1,000 during peak API usage. Other users who don’t have the same high traffic
+demands could disable batching and would see lower latencies but at the cost of
+throughput.
+
+### API
 All methods are accessed via GRPC but are also exposed via HTTP using the
 [GRPC Gateway](https://github.com/grpc-ecosystem/grpc-gateway)
 
 #### Health Check
-Health check returns `unhealthy` in the event a peer is reported by etcd as `up` but the server
-instance is unable to contact the peer via it's advertised address.
+Health check returns `unhealthy` in the event a peer is reported by etcd or kubernetes
+as `up` but the server instance is unable to contact that peer via its advertised address.
 
 ###### GRPC
 ```grpc
@@ -233,28 +188,57 @@ Example response:
 }
 ```
 
+### Deployment
+NOTE: Gubernator uses etcd or kubernetes to discover peers and establish a cluster. If you
+don't have either, the docker-compose method is the simplest way to try gubernator out.
 
-## Installation
-TODO: Show how to run gubernator in a docker container with just environs
+##### Docker with existing etcd cluster
+```bash
+$ docker run -p 8081:81 -p 8080:80 -e GUBER_ETCD_ENDPOINTS=etcd1:2379,etcd2:2379 \
+    thrawn01/gubernator:latest
+
+# Hit the API at localhost:8080 (GRPC is at 8081)
+$ curl http://localhost:8080/v1/HealthCheck
+```
+
+##### Docker compose
+The docker compose file includes a local etcd server and 2 gubernator instances.
+```bash
+# Download the docker-compose file
+$ curl -O https://raw.githubusercontent.com/mailgun/gubernator/master/docker-compose.yaml
+# Edit the compose file to change the environment config variables
+$ vi docker-compose.yaml
 
-## Development with Docker Compose
-Gubernator uses etcd to keep track of all it's peers. This peer list is
-used by the consistent hash to calculate which peer is the coordinator
-for a rate limit, the docker compose file starts a single instance of
-etcd which is suitable for testing the server locally.
+# Run the docker container
+$ docker-compose up -d
 
-You will need to be on the VPN to pull docker images from the repository.
+# Hit the API at localhost:8080 (GRPC is at 8081)
+$ curl http://localhost:8080/v1/HealthCheck
 ```
 
+##### Kubernetes
 ```bash
-# Start the containers
-$ docker-compose up -d
+# Download the kubernetes deployment spec
+$ curl -O https://raw.githubusercontent.com/mailgun/gubernator/master/k8s-deployment.yaml
 
-# Run gubernator
-$ cd golang
-$ go run ./cmd/gubernator --config config.yaml
+# Edit the deployment file to change the environment config variables
+$ vi k8s-deployment.yaml
+
+# Create the deployment (includes headless service spec)
+$ kubectl create -f k8s-deployment.yaml
 ```
 
-### What kind of name is Gubernator?
-Gubernator is the [english pronunciation of governor](https://www.google.com/search?q=how+to+say+governor+in+russian&oq=how+to+say+govener+in+russ)
-in Russian, also it sounds cool.
+### Configuration
+Gubernator is configured via environment variables with an optional `--config` flag
+which takes a file of key/values and places them into the local environment before startup.
+
+See the `example.conf` for all available config options and their descriptions.
+
+
+### Architecture
+See [architecture.md](/architecture.md) for a full description of the architecture and the inner
+workings of gubernator.
+
+
+
diff --git a/architecture.md b/architecture.md
new file mode 100644
index 00000000..5fcbba98
--- /dev/null
+++ b/architecture.md
@@ -0,0 +1,92 @@
+## Gubernator Architecture
+
+![architecture diagram](/images/architecture.png)
+
+Gubernator is designed to run as a distributed cluster of peers which utilize
+an in-memory cache of all the currently active rate limits, as such no data is
+ever synced to disk. Since most network based rate limit durations are held for
+only a few seconds, losing the in-memory cache during a reboot or scheduled
+downtime isn't a huge deal. For Gubernator we choose performance over accuracy
+as it's acceptable for a small subset of traffic to over-request for a short
+period of time (usually seconds) in the case of cache loss.
+
+When a rate limit request is made to Gubernator the request is keyed and a
+consistent hashing algorithm is applied to determine which of the peers will be
+the owner of the rate limit request. Choosing a single owner for a rate limit
+makes atomic increments of counts very fast and avoids the complexity and
+latency involved in distributing counts consistently across a cluster of peers.
+
+Although simple and performant, this design could be susceptible to a thundering
+herd of requests since a single coordinator is responsible for possibly
+hundreds of thousands of requests to a rate limit. To combat this, clients can
+request `Behavior=BATCHING` which allows peers to take multiple requests within
+a specified window (default is 500 microseconds) and batch the requests into a
+single peer request, thus reducing the total number of over-the-wire requests
+to a single Gubernator peer tremendously.
+
+To ensure each peer in the cluster accurately calculates the correct hash for a
+rate limit key, the list of peers in the cluster must be distributed to each
+peer in the cluster in a timely and consistent manner. Currently Gubernator
+supports using etcd or the kubernetes endpoints API to discover gubernator
+peers.
+
+## Gubernator Operation
+When a client or service makes a request to Gubernator, the rate limit config
+is provided with each request by the client. The rate limit configuration is
+then stored with the current rate limit status in the local cache of the rate
+limit owner. Rate limits and their configuration that are stored in the local
+cache will only exist for the specified duration of the rate limit
+configuration. After the duration time has expired, and if the rate limit was
+not requested again within the duration, it is dropped from the cache.
+Subsequent requests for the same `name` and `unique_key` pair will recreate the
+config and rate limit in the cache and the cycle will repeat. Subsequent
+requests with different configs will overwrite the previous config and will
+apply the new config immediately.
+
+## Global Behavior
+Since Gubernator rate limits are hashed and handled by a single peer in the
+cluster, rate limits that apply to every request in a data center would result
+in the rate limit request being handled by a single peer for the entirety of
+the data center. For example, consider a rate limit with
+`name=requests_per_datacenter` and a `unique_id=us-east-1`. Now imagine that a
+request is made to Gubernator with this rate limit for every HTTP request that
+enters the `us-east-1` data center. This could be hundreds of thousands,
+potentially millions of requests per second that are all hashed and handled by
+a single peer in the cluster. Because of this potential scaling issue,
+Gubernator introduces a configurable behavior called `GLOBAL`.
+
+When a rate limit is configured with `behavior=GLOBAL`, the rate limit request
+that is received from a client will not be forwarded to the owning peer but
+will be answered from an internal cache handled by the peer who received the
+request. Hits toward the rate limit will be batched by the receiving peer and
+sent asynchronously to the owning peer where the hits will be totaled and
+`OVER_LIMIT` calculated. It is then the responsibility of the owning peer to
+update each peer in the cluster with the current status of the rate limit, such
+that peer internal caches routinely get updated with the most current rate
+limit status from the owner.
+
+#### Side effects of global behavior
+Since Hits are batched and forwarded to the owning peer asynchronously, the
+immediate response to the client will not include the most accurate remaining
+counts, as that count will only get updated after the async call to the owner
+peer is complete and the owning peer has had time to update all the peers in
+the cluster. As a result, the use of `GLOBAL` allows for greater scale but at
+the cost of consistency. Using `GLOBAL` can increase the amount of traffic per
+rate limit request if the cluster is large enough. `GLOBAL` should only be used
+for extremely high volume rate limits that don't scale well with the traditional
+non `GLOBAL` behavior.
+
+## Gubernator as a library
+If you are using Golang, you can use gubernator as a library. This is useful if
+you wish to implement a rate limit service with your own company specific model
+on top. We do this internally here at mailgun with a service we creatively
+called `ratelimits` which keeps track of the limits imposed on a per-account
+basis. In this way you can utilize the power and speed of gubernator but still
+layer business logic and integrate domain specific problems into your rate
+limiting service.
+
+When you use the library, your service becomes a full member of the cluster
+participating in the same consistent hashing and caching as a standalone
+gubernator server would. All you need to do is provide the GRPC server instance
+and tell gubernator where the peers in your cluster are located.
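+
+For illustration only, here is a minimal, self-contained sketch of the
+owner-selection idea described above. It is not Gubernator's actual
+implementation and all names in it are hypothetical; it only shows how hashing
+a rate limit key onto a ring of peers lets every node pick the same owner.
+
+```go
+package main
+
+import (
+	"fmt"
+	"hash/fnv"
+	"sort"
+)
+
+// hashRing is a toy consistent hash ring. A production peer picker is more
+// sophisticated (virtual nodes, replication); this only shows the idea.
+type hashRing struct {
+	points []uint32          // sorted hash points on the ring
+	peers  map[uint32]string // hash point -> peer address
+}
+
+func newHashRing(peers []string) *hashRing {
+	r := &hashRing{peers: map[uint32]string{}}
+	for _, p := range peers {
+		h := hashKey(p)
+		r.points = append(r.points, h)
+		r.peers[h] = p
+	}
+	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
+	return r
+}
+
+// owner returns the peer responsible for a rate limit key. Every peer with the
+// same peer list picks the same owner, so all counts stay on a single node.
+func (r *hashRing) owner(key string) string {
+	h := hashKey(key)
+	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
+	if i == len(r.points) { // wrap around the ring
+		i = 0
+	}
+	return r.peers[r.points[i]]
+}
+
+func hashKey(s string) uint32 {
+	f := fnv.New32a()
+	f.Write([]byte(s))
+	return f.Sum32()
+}
+
+func main() {
+	ring := newHashRing([]string{"gubernator-0:81", "gubernator-1:81", "gubernator-2:81"})
+	// The key is derived from the request's name and unique_key, as in the
+	// README example.
+	key := "requests_per_sec_account_id=123|source_ip=172.0.0.1"
+	fmt.Println("owning peer:", ring.owner(key))
+}
+```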
+
diff --git a/cmd/gubernator/config.go b/cmd/gubernator/config.go
index 926956dc..9dbbd78c 100644
--- a/cmd/gubernator/config.go
+++ b/cmd/gubernator/config.go
@@ -229,7 +229,7 @@ func fromEnvFile(configFile string) error {
 		}
 
 		logrus.Debugf("config: [%d] '%s'", i, line)
-		parts := strings.Split(line, "=")
+		parts := strings.SplitN(line, "=", 2)
 		if len(parts) != 2 {
 			return errors.Errorf("malformed key=value on line '%d'", i)
 		}
diff --git a/example.conf b/example.conf
index 9263619a..5ae24505 100644
--- a/example.conf
+++ b/example.conf
@@ -8,31 +8,11 @@ GUBER_GRPC_ADDRESS=0.0.0.0:81
 # The address HTTP requests will listen on
 GUBER_HTTP_ADDRESS=0.0.0.0:80
 
-# Max size of the cache; The cache size will never grow beyond this size.
+# Max size of the cache; this is the cache that holds
+# all the rate limits. The cache size will never grow
+# beyond this size.
 GUBER_CACHE_SIZE=50000
 
-############################
-# Etcd Config
-############################
-
-# A Comma separate list of etcd nodes
-GUBER_ETCD_ENDPOINTS=localhost:2379
-
-# The address peers will connect too
-# Should be the same as grpc-listen-address unless you are running behind
-# a NAT or running in a docker container without host networking
-GUBER_ETCD_ADVERTISE_ADDRESS=localhost:81
-
-# The prefix gubernator will use to register peers under in etcd
-#GUBER_ETCD_KEY_PREFIX=/gubernator-peers
-
-# How long etcd client will wait for a response when initial dialing a node
-#GUBER_ETCD_DIAL_TIMEOUT=5s
-
-# Authentication
-#GUBER_ETCD_USER=
-#GUBER_ETCD_PASSWORD=
-
 ############################
 # Behavior Config
@@ -57,6 +37,47 @@ GUBER_ETCD_ADVERTISE_ADDRESS=localhost:81
 #GUBER_GLOBAL_SYNC_WAIT=500ns
 
+############################
+# Kubernetes Config
+############################
+
+# The namespace the gubernator instances were deployed into
+#GUBER_K8S_NAMESPACE=default
+
+# Should be set to the IP of the pod the gubernator instance is running in.
+# This allows gubernator to know which of the peers it discovers is itself.
+#GUBER_K8S_POD_IP=
+
+# Should be set to the port number of the pod, as defined by `containerPort` in the pod spec.
+#GUBER_K8S_POD_PORT=
+
+# The selector used when listing the endpoints API to find peers.
+#GUBER_K8S_ENDPOINTS_SELECTOR=app=gubernator
+
+
+############################
+# Etcd Config
+############################
+
+# A comma separated list of etcd nodes
+GUBER_ETCD_ENDPOINTS=localhost:2379
+
+# The address peers will connect to
+# Should be the same as grpc-listen-address unless you are running behind
+# a NAT or running in a docker container without host networking
+GUBER_ETCD_ADVERTISE_ADDRESS=localhost:81
+
+# The prefix gubernator will use to register peers under in etcd
+#GUBER_ETCD_KEY_PREFIX=/gubernator-peers
+
+# How long the etcd client will wait for a response when initially dialing a node
+#GUBER_ETCD_DIAL_TIMEOUT=5s
+
+# Authentication
+#GUBER_ETCD_USER=
+#GUBER_ETCD_PASSWORD=
+
+
 ############################
 # Etcd TLS Config
 ############################
@@ -72,3 +93,5 @@ GUBER_ETCD_ADVERTISE_ADDRESS=localhost:81
 
 # Skip CERT verification
 #GUBER_ETCD_TLS_SKIP_VERIFY=true
+
+
diff --git a/architecture.png b/images/architecture.png
similarity index 100%
rename from architecture.png
rename to images/architecture.png
diff --git a/images/peer-requests-graph.png b/images/peer-requests-graph.png
new file mode 100644
index 00000000..79a87be6
Binary files /dev/null and b/images/peer-requests-graph.png differ
diff --git a/images/requests-graph.png b/images/requests-graph.png
new file mode 100644
index 00000000..c0d66226
Binary files /dev/null and b/images/requests-graph.png differ
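
A side note on the `cmd/gubernator/config.go` change above: using `strings.SplitN(line, "=", 2)` instead of `strings.Split` keeps any `=` that appears inside a value intact (for example `GUBER_K8S_ENDPOINTS_SELECTOR=app=gubernator`), rather than rejecting the line as malformed. Below is a minimal, self-contained sketch of that env-file loading approach; it is simplified relative to the real loader (which also logs each line), and the file name used in `main` is only an example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// loadEnvFile reads key=value pairs from a config file and places them into
// the process environment, mirroring the approach in cmd/gubernator/config.go.
func loadEnvFile(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for i := 1; scanner.Scan(); i++ {
		line := strings.TrimSpace(scanner.Text())
		// Skip blank lines and comments such as those in example.conf.
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// SplitN with a limit of 2 keeps any '=' inside the value intact,
		// e.g. GUBER_K8S_ENDPOINTS_SELECTOR=app=gubernator.
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			return fmt.Errorf("malformed key=value on line %d", i)
		}
		if err := os.Setenv(strings.TrimSpace(parts[0]), strings.TrimSpace(parts[1])); err != nil {
			return err
		}
	}
	return scanner.Err()
}

func main() {
	if err := loadEnvFile("example.conf"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("GUBER_GRPC_ADDRESS =", os.Getenv("GUBER_GRPC_ADDRESS"))
}
```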