This document contains instructions on how to deploy Sleeper.
If you just want a local instance for testing, see the documentation on deploying to localstack. This has very limited functionality compared to a deployed instance.
You will need to get your environment set up correctly so that you can deploy a Sleeper instance to AWS and then interact with it. See getting started for how to install the Sleeper CLI. The information below provides more detail on how to create an environment to deploy Sleeper into, and how to get set up to deploy into AWS.
Currently it's necessary to build Sleeper before any deployment. With the sleeper environment setup described in the getting started guide, you get an EC2 with the Sleeper CLI installed, and the Git repository checked out. Once this is deployed, you can connect to it and build Sleeper like this:
sleeper environment connect # Get a shell in the EC2 you deployed
sleeper builder # Get a shell in a builder Docker container (hosted in the EC2)
cd sleeper # Change directory to the root of the Git repository
./scripts/build/build.sh
If you used the system test deployment described in the getting started guide, you will have already built Sleeper.
The Sleeper CLI runs commands inside a Docker container. This way you can avoid needing to install anything other than Docker on your machine.
The sleeper builder command gets you a shell inside a container with all the dependencies required to build and deploy an instance of Sleeper. Note that when you run this inside an environment EC2, the Sleeper Git repository will have been cloned into the working directory of the container. If you are not using an environment EC2, you will need to manually clone the repository. If you deploy from outside of AWS this will involve lengthy uploads of build artifacts, which you can avoid with the environment EC2.
If you have the AWS CLI installed on the host, the container will use your configuration from the host. Otherwise, any configuration you set in the container will be persisted in the host home directory. AWS authentication environment variables will be propagated to the container as well.
The host Docker environment will be propagated to the container via the Docker socket.
The files generated for the Sleeper instance will be persisted in the host home directory under ~/.sleeper, so that if you run the Docker container multiple times you will still have details of the last Sleeper instance you worked with.
If you add a command on the end, you can run a specific script like this:
sleeper builder sleeper/scripts/test/deployAll/deployTest.sh myinstanceid myvpc mysubnet
If you configure AWS on your machine or in the environment EC2 outside of the Sleeper CLI, that configuration will be passed on to any Sleeper CLI commands.
The following configuration should allow the SDKs, the CLI and CDK to all access AWS:
~/.aws/credentials:
[named-profile-123456789]
aws_access_key_id = abcd12345
aws_secret_access_key = defg12345
aws_session_token = hijK12345
~/.aws/config:
[profile named-profile-123456789]
region = eu-west-2
~/.bashrc:
export AWS_PROFILE=named-profile-123456789
export AWS_REGION=eu-west-2
Also see the AWS IAM guide for CLI access.
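As a quick check that your credentials and region are being picked up, you can use standard AWS CLI commands (these are not Sleeper-specific):
aws sts get-caller-identity # Confirms which account and identity you are authenticated as
aws configure list # Shows the profile, region and credential source in use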
To deploy Sleeper into your AWS account you will need to have bootstrapped CDK in the account. Bootstrapping installs all the resources that CDK needs to do deployments. Note that bootstrapping CDK is a one-time action for the account that is nothing to do with Sleeper itself. See this link for guidance on how to bootstrap CDK in your account. Note that the cdk bootstrap command should not be run from inside the sleeper directory. You can run cdk bootstrap in a Sleeper CLI Docker container, as described in getting started.
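For reference, a typical bootstrap invocation looks like the following, assuming your AWS account ID is 123456789012 and you are deploying to eu-west-2 (substitute your own values):
cdk bootstrap aws://123456789012/eu-west-2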
When deploying Sleeper, depending on the stacks you need, it will deploy a few Lambda functions into your account. Some of these Lambda functions are configured to run with reserved concurrency of 1. In order to allow this you will need to make sure you have enough free reserved concurrency in your account.
You will need a reserved account concurrency of at most 6 for all the Sleeper stacks to be deployed. To check how much you have available, go to the Lambda section in your AWS Console and check the dashboard. It should say at the top "full account concurrency = X" (usually 1000) and "unreserved account concurrency = Y". AWS will not let you reserve the last 100 of your limit, so after reserving 6 more your unreserved concurrency must still be at least 100. In other words, if Y is less than 106 you won't be able to deploy Sleeper, and you will have to see if you can adjust your existing lambdas to free some up.
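You can also check these numbers from the command line with a standard AWS CLI call (not part of the Sleeper scripts):
aws lambda get-account-settings # Look at AccountLimit.ConcurrentExecutions and AccountLimit.UnreservedConcurrentExecutions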
You're now ready to build and deploy Sleeper.
Please follow the getting started guide to set up an environment suitable for deploying Sleeper. This also assumes you have installed the Sleeper CLI. This section adds more detail on the tools used to set up this environment.
The environment EC2 described here is provided to allow easy deployment, especially for development. It's a machine that can build Sleeper within AWS, and it avoids lengthy uploads of built artifacts into AWS, particularly jars and Docker images. In the future we may add support for prebuilt artifacts, in which case the EC2 will not be needed to deploy Sleeper.
The EC2 is deployed with admin access to your AWS account. A production instance of Sleeper is likely to need some extra security setup, and you may wish to avoid deploying an EC2 with admin access to a production AWS account.
For general administration of an existing Sleeper instance it is not necessary to connect to an environment EC2.
If you run sleeper environment, you'll get a shell inside a Docker container where you can run aws, cdk and Sleeper environment commands directly, without prefixing with sleeper.
You can use aws commands there to set the AWS account, region and authentication. You can also set AWS environment variables or configuration on the host machine, which will be propagated to the Docker container when you use sleeper commands.
Note that sleeper environment commands are not intended to be run from inside an environment EC2. When you connect to an EC2, this will be in a fresh context that is not aware of environments you have deployed or added. You can still use it to run aws and cdk commands, although it may be more convenient to use sleeper builder for this.
You can deploy either the VPC or the EC2 independently, or specify an existing VPC to deploy the EC2 to. You must specify a unique environment ID when deploying an environment. Parameters after the environment ID will be passed to a cdk deploy --all command.
# Deploy EC2 in a new VPC
sleeper environment deploy <environment-id>
# Only deploy VPC (running this with an existing environment will remove the EC2)
sleeper environment deploy <environment-id> -c deployEc2=false
# Deploy EC2 in an existing VPC
sleeper environment deploy <environment-id> -c vpcId=[vpc-id]
# Deploy with nightly system test automation (set nightlyTestDeployId to your own 2-character value)
sleeper environment deploy <environment-id> -c nightlyTestsEnabled=true -c nightlyTestDeployId=my
You can add an environment that was previously deployed like this:
sleeper environment add <environment-id>
Whether you deployed or added an environment, you can connect to the deployed EC2 like this when it is running:
sleeper environment connect
This will SSH into the machine with EC2 Instance Connect and SSM Session Manager, and create a Linux screen session. If you do not explicitly exit this session, you will reconnect to the same screen session next time you connect to the EC2. If multiple connections are made to the EC2 as the same user, this will take over the screen session and disconnect the previous connection.
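These are standard Linux screen commands rather than anything Sleeper-specific, but they can be useful once connected:
screen -ls # List screen sessions on the EC2
screen -r # Reattach to a detached session
# Press Ctrl-A then d to detach from the current session without ending it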
You can replace the screen command by adding your own parameters to pass to ssh, like this:
sleeper environment connect bash
You can switch environments like this:
sleeper environment list
sleeper environment set <environment-id>
You can tear down the deployed environment like this:
sleeper environment destroy <environment-id>
Parameters after the environment ID will be passed to a cdk destroy command.
When you deploy or add an environment, you will connect to the EC2 as the default user for the machine. This may not be desirable if you want to share the EC2, or if you want to automate system tests to run as that user.
From outside the EC2, you can manage operating system users on the EC2 like this:
sleeper environment adduser <username>
sleeper environment setuser <username>
sleeper environment deluser <username>
When you add a new user or set your user, further invocations of sleeper environment connect will connect as that user. When you add a new user, a fresh instance of the Sleeper Git repository will be checked out for that user, accessible through sleeper builder as that user.
There's no authorisation that links your identity to a particular user. Anyone with access to the EC2 can connect as any user.
There are two ways to deploy Sleeper: you can use the automated scripts or a more manual approach.
Either approach should be done from within an EC2 instance set up as described above, to avoid lengthy uploads of large jar files and Docker images.
The automated deployment creates an instance of Sleeper either from your own configuration files, or from templates. This also pre-populates certain properties for you, e.g. from your AWS configuration, and handles uploading the necessary deployment artifacts to AWS.
Please ensure Sleeper has been built successfully before using this. This guide assumes you are in a sleeper builder container in an EC2 deployed with sleeper environment.
Properties that are set to "changeme" in the templates will be overwritten and should not be set manually during automated deployment.
You can find the template files here. It is recommended that you change these templates to configure Sleeper in the way that you want before you run the automated script. At the very least you will want to change the tags.template file. See the Configuration section below for further details. In that guide, ignore the properties that are set to "changeme" in the templates as they are overwritten by the automated deployment.
If you deploy from the templates, it will create an instance with no tables:
cd scripts
editor templates/instanceproperties.template
editor templates/tags.template
./deploy/deployNew.sh <instance-id> <vpc-id> <subnet-ids>
Here vpc-id and subnet-ids are the ids of the VPC and subnets that some components of Sleeper will be deployed into. Multiple subnet ids can be specified with commas in between, e.g. subnet-a,subnet-b.
You can also create your own configuration, including tables, and deploy that:
cd scripts
mkdir -p my-instance/tables/my-table
cp templates/instanceproperties.template my-instance/instance.properties
cp templates/tags.template my-instance/tags.properties
cp templates/tableproperties.template my-instance/tables/my-table/table.properties
cp templates/schema.template my-instance/tables/my-table/schema.json
# Edit configuration files as above
./deploy/deployNew.sh <instance-id> <vpc-id> <subnet-ids> ./my-instance/instance.properties
This script will upload the necessary jars to a bucket in S3 and push the Docker container images to repositories in ECR.
The deployment scripts will create all of the required configuration files in a folder called generated in the scripts directory.
For Sleeper to be deployed manually, some resources have to be uploaded to AWS first. The jar files need to be uploaded to an S3 bucket, and some Docker images need to be uploaded to an ECR repository.
There are multiple Docker images that need to be built and pushed to ECR repositories, depending on the stacks you want to deploy. There's one for ingest, one for compaction and two for bulk import (for EKS and EMR Serverless). If you don't intend to use the bulk import stacks, you don't need to upload those images. There's also an image for data generation for system tests.
Next, create some environment variables for convenience:
INSTANCE_ID=<insert-a-unique-id-for-the-sleeper-instance-here>
VERSION=$(cat "./scripts/templates/version.txt")
DOCKER_REGISTRY=<insert-your-account-id-here>.dkr.ecr.eu-west-2.amazonaws.com
REPO_PREFIX=${DOCKER_REGISTRY}/${INSTANCE_ID}
DOCKER_BASE_DIR=./scripts/docker
Then log in to ECR:
aws ecr get-login-password --region eu-west-2 | docker login --username AWS --password-stdin ${DOCKER_REGISTRY}
Upload the container for ingest:
TAG=$REPO_PREFIX/ingest:$VERSION
aws ecr create-repository --repository-name $INSTANCE_ID/ingest
docker build -t $TAG $DOCKER_BASE_DIR/ingest
docker push $TAG
Upload the container for compaction:
TAG=$REPO_PREFIX/compaction-job-execution:$VERSION
aws ecr create-repository --repository-name $INSTANCE_ID/compaction-job-execution
docker build -t $TAG $DOCKER_BASE_DIR/compaction-job-execution
docker push $TAG
If you will be using bulk import on EMR Serverless then upload the container as follows:
TAG=$REPO_PREFIX/bulk-import-runner-emr-serverless:$VERSION
aws ecr create-repository --repository-name $INSTANCE_ID/bulk-import-runner-emr-serverless
docker build -t $TAG $DOCKER_BASE_DIR/bulk-import-runner-emr-serverless
docker push $TAG
If you will be using the experimental bulk import using EKS then upload the container as follows (note this container will take around 35 minutes to build and it is not needed for bulk importing data using EMR):
TAG=$REPO_PREFIX/bulk-import-runner:$VERSION
aws ecr create-repository --repository-name $INSTANCE_ID/bulk-import-runner
docker build -t $TAG $DOCKER_BASE_DIR/bulk-import-runner
docker push $TAG
If you will be using the data generation that's used in system tests then upload the container as follows:
TAG=$REPO_PREFIX/system-test:$VERSION
aws ecr create-repository --repository-name $INSTANCE_ID/system-test
docker build -t $TAG $DOCKER_BASE_DIR/system-test
docker push $TAG
If you'd like to run operations in AWS Graviton-based instances, on the ARM64 architecture, you can use Docker BuildX to build multiplatform images.
These commands will create or recreate a builder:
docker buildx rm sleeper || true
docker buildx create --name sleeper --use
This command should replace the docker build and docker push commands documented above:
docker buildx build --platform linux/amd64,linux/arm64 -t $TAG --push $DOCKER_BASE_DIR/<image directory>
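For example, to build and push a multiplatform compaction image, reusing the TAG and DOCKER_BASE_DIR variables set earlier (this mirrors the single-platform commands above and assumes the ECR repository has already been created):
TAG=$REPO_PREFIX/compaction-job-execution:$VERSION
docker buildx build --platform linux/amd64,linux/arm64 -t $TAG --push $DOCKER_BASE_DIR/compaction-job-execution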
We need to upload jars to an S3 bucket so that they can be used by various resources. The code below assumes you start in the project root directory, and you've already built the system with scripts/build/buildForTest.sh.
INSTANCE_ID=<insert-a-unique-id-for-the-sleeper-instance-here>
JARS_BUCKET=sleeper-${INSTANCE_ID}-jars
REGION=<insert-the-AWS-region-you-want-to-use-here>
./scripts/deploy/syncJars.sh $JARS_BUCKET $REGION
Before we can use CDK to deploy Sleeper, we need to create some configuration files:
- An instance.properties file - containing information about your Sleeper instance, as well as default values used by tables if not specified.
- A table.properties file which contains information about a table and a link to its schema file.
- A schema.json file which describes the data stored in a Sleeper table.
- A tags.properties file which lists the tags you want all of your Sleeper infrastructure to be tagged with.
There's an example of a basic instance properties file here and an example of a full instance properties file here. This latter file shows all the instance properties that you can set. Whichever of these two files you use as your starting point, you will need to set sensible values for the following properties:
- sleeper.id
- sleeper.jars.bucket - if you followed the steps above for uploading the jars this needs to be set to sleeper-${INSTANCE_ID}-jars
- sleeper.account
- sleeper.region
- sleeper.vpc
- sleeper.subnets - multiple subnet ids can be specified with commas in between, e.g. subnet-a,subnet-b
- sleeper.retain.infra.after.destroy - set to false to cause resources such as the S3 buckets and Dynamo tables to be destroyed after running CDK destroy
You will also need to set values for whichever ECR repositories you have uploaded Docker images to. These should be set to the ECR repository name, e.g. my-instance-id/ingest. An example snippet follows this list.
- sleeper.ingest.repo
- sleeper.compaction.repo
- sleeper.bulk.import.emr.serverless.repo
- sleeper.bulk.import.eks.repo
- sleeper.systemtest.repo
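As an illustration, the relevant part of an instance.properties file might look like the following. The values here are placeholders, not taken from the templates; substitute your own instance ID, account, VPC and subnets, and only set the repo properties for images you actually pushed:
sleeper.id=my-instance-id
sleeper.jars.bucket=sleeper-my-instance-id-jars
sleeper.account=123456789012
sleeper.region=eu-west-2
sleeper.vpc=vpc-0123456789abcdef0
sleeper.subnets=subnet-a,subnet-b
sleeper.retain.infra.after.destroy=false
sleeper.ingest.repo=my-instance-id/ingest
sleeper.compaction.repo=my-instance-id/compaction-job-execution
sleeper.bulk.import.emr.serverless.repo=my-instance-id/bulk-import-runner-emr-serverless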
To include a table in your instance, your table.properties file must be next to your instance.properties file. You can add more than one table by creating a tables directory, with a subfolder for each table.
See tables for more information on creating and working with Sleeper tables.
Each table will also need a schema.json file next to the table.properties file.
See create a schema for how to create a schema.
You can optionally create a tags.properties file next to your instance.properties, to apply tags to AWS resources deployed by Sleeper. An example tags.properties file can be found here.
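As a rough illustration, a tags.properties file is a simple list of tag names and values in Java properties format (the tag names below are made up; use whatever tags your organisation requires):
Owner=my-team
Project=sleeper-demo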
Here's a full example with two tables:
instance.properties
tags.properties
tables/table-1/table.properties
tables/table-1/schema.json
tables/table-2/table.properties
tables/table-2/schema.json
Note that if you do not set the property sleeper.retain.infra.after.destroy to false when deploying, then however you choose to tear down Sleeper later on, you will also need to destroy some further S3 buckets and DynamoDB tables manually, because by default they are retained.
You may optionally want to predefine your split points for a given table. You can do this by setting the sleeper.table.splits.file property in the table properties file. There's an example of this in the full example. If you decide not to set this, your state store will be initialised with a single root partition. Note that pre-splitting a table is important for any large-scale use of Sleeper, and is essential for running bulk import jobs.
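As a sketch, this might look like the following. The property name comes from the guide above; the split point values are placeholders assuming a string row key, and the exact file format is shown in the full example.
table.properties:
sleeper.table.splits.file=splits.txt
splits.txt (one split point per line):
aaa
mmm
zzz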
Now you have your configuration in place and your environment set up, you can deploy your Sleeper instance using AWS CDK.
INSTANCE_PROPERTIES=/path/to/instance.properties
VERSION=$(cat "./scripts/templates/version.txt")
cdk -a "java -cp scripts/jars/cdk-${VERSION}.jar sleeper.cdk.SleeperCdkApp" deploy -c propertiesfile=${INSTANCE_PROPERTIES} -c newinstance=true "*"
To avoid having to explicitly give approval for deploying all the stacks, add "--require-approval never" to the command.
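For example, the same deploy command with approval prompts suppressed:
cdk -a "java -cp scripts/jars/cdk-${VERSION}.jar sleeper.cdk.SleeperCdkApp" deploy --require-approval never -c propertiesfile=${INSTANCE_PROPERTIES} -c newinstance=true "*"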
If you'd like to include data generation for system tests, use the system test CDK app instead.
INSTANCE_PROPERTIES=/path/to/instance.properties
VERSION=$(cat "./scripts/templates/version.txt")
cdk -a "java -cp scripts/jars/system-test-${VERSION}-utility.jar sleeper.systemtest.cdk.SystemTestApp" deploy -c propertiesfile=${INSTANCE_PROPERTIES} -c newinstance=true "*"
By default all the stacks are deployed. However, if you don't need them, you can customise which stacks are deployed.
Mandatory components are the configuration bucket and data bucket, the index of Sleeper tables, the state store, policies and roles to interact with the instance, and the TopicStack which creates an SNS topic used by other stacks to report errors.
That leaves the following stacks as optional:
- CompactionStack - for running compactions (in practice this is essential)
- GarbageCollectorStack - for running garbage collection (in practice this is essential)
- IngestStack - for ingesting files using the "standard" ingest method
- PartitionSplittingStack - for splitting partitions when they get too large
- QueryStack - for handling queries via SQS
- WebSocketQueryStack - for handling queries via a web socket
- KeepLambdaWarmStack - for sending dummy queries to avoid waiting for lambdas to start up during queries
- EmrServerlessBulkImportStack - for running bulk import jobs using Spark running on EMR Serverless
- EmrStudioStack - to create an EMR Studio containing the EMR Serverless application
- EmrBulkImportStack - for running bulk import jobs using Spark running on an EMR cluster that is created on demand
- PersistentEmrBulkImportStack - for running bulk import jobs using Spark running on a persistent EMR cluster, i.e. one that is always running (and therefore always costing money). By default, this uses EMR's managed scaling to scale up and down on demand.
- IngestBatcherStack - for gathering files to be ingested or bulk imported in larger jobs
- TableMetricsStack - for creating CloudWatch metrics showing statistics such as the number of records in a table over time
- DashboardStack - to create a CloudWatch dashboard showing recorded metrics
The following stacks are optional and experimental:
- AthenaStack - for running SQL analytics over the data
- EksBulkImportStack - for running bulk import jobs using Spark running on EKS
By default most of the optional stacks are included, but to customise which are deployed, set the sleeper.optional.stacks instance property to a comma separated list of stack names, for example:
sleeper.optional.stacks=CompactionStack,IngestStack,QueryStack
Note that the system test stacks do not need to be specified. They will be included if you use the system test CDK app.
There are clients and scripts in the scripts/deploy and scripts/utility directories that can be used to adjust an existing instance.
See the usage guide for information on how to interact with the instance. The admin client described there can be used to adjust the configuration of an instance by setting instance properties.
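For reference, launching the admin client usually looks something like the following. The script name here is an assumption based on the scripts/utility directory mentioned above, so check that directory for the exact name:
./scripts/utility/adminClient.sh ${INSTANCE_ID}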
See the tables documentation for how to add/edit Sleeper tables.
The deployExisting.sh script can be used to bring an existing instance up to date. This will upload any jars that have changed, update all the docker images, and perform a cdk deploy.
./scripts/deploy/deployExisting.sh <instance-id>
We are planning to add support to this script for declarative deployment, so that you can set your full instance and tables configuration in a folder structure and pass it to this script to apply any changes. Currently such changes must be done with the admin client.
If there is no ingest in progress, and all compactions have completed, then Sleeper will go to sleep, i.e. the only significant ongoing charges are for data storage. However, there are several lambda functions that are scheduled to run periodically using EventBridge rules. These lambda functions look for work to do, such as compactions to run. The execution of these should have very small cost, but it is best practice to pause the system, i.e. turn these rules off, if you will not be using it for a while. Note that the system can still be queried when it is paused.
# Pause the System
./scripts/utility/pauseSystem.sh ${INSTANCE_ID}
# Restart the System
./scripts/utility/restartSystem.sh ${INSTANCE_ID}
Once you're finished with your Sleeper instance, you can delete it, i.e. remove all the resources associated with it.
Again there are two options for teardown: the automatic and the manual option. The automatic option will only work if you deployed Sleeper automatically and you still have the generated folder in the project root directory. If you do, you can simply run:
./scripts/deploy/tearDown.sh
To delete the resources manually use the following commands from the project root directory:
INSTANCE_PROPERTIES=/path/to/instance.properties
VERSION=$(cat "./scripts/templates/version.txt")
cdk -a "java -cp scripts/jars/cdk-${VERSION}.jar sleeper.cdk.SleeperCdkApp" \
destroy -c propertiesfile=${INSTANCE_PROPERTIES} -c validate=false "*"
To delete the jars bucket and all the jars in it:
aws s3 rb s3://${JARS_BUCKET} --force
To delete the ECR repositories use the following commands, where INSTANCE_ID is the id of your Sleeper instance.
aws ecr delete-repository --repository-name=${INSTANCE_ID}/ingest --force
aws ecr delete-repository --repository-name=${INSTANCE_ID}/compaction-job-execution --force
aws ecr delete-repository --repository-name=${INSTANCE_ID}/bulk-import-runner --force
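If you also pushed the EMR Serverless bulk import or system test images earlier, delete those repositories in the same way:
aws ecr delete-repository --repository-name=${INSTANCE_ID}/bulk-import-runner-emr-serverless --force
aws ecr delete-repository --repository-name=${INSTANCE_ID}/system-test --force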