Defines the infrastructure for the KM NLP Project.
Based on https://gist.github.com/jacobtomlinson/ee5ba79228e42bcc9975faf0179c3d1a
Input
- cluster ARN

Infrastructure
- IAM roles
  - dask-fargate-execution
  - dask-fargate-task
- Security groups
  - dask
    - Allow 8786-8787 from anything that needs to communicate with dask
    - Ephemeral port access from itself? (See the sketch after this list.)
- CloudWatch log group
  - dask
- ECS tasks
  - dask-scheduler
  - dask-worker
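For illustration, the intended security-group rules might look like the following boto3 sketch. This is not the project's CDKTF code; the group ID and client CIDR are placeholders.

```python
# Sketch of the "dask" security group ingress rules (placeholder IDs/CIDRs).
import boto3

ec2 = boto3.client("ec2")
sg_id = "sg-0example"  # placeholder: the "dask" security group

ec2.authorize_security_group_ingress(
    GroupId=sg_id,
    IpPermissions=[
        {   # Scheduler (8786) and dashboard (8787) from clients that talk to dask.
            "IpProtocol": "tcp",
            "FromPort": 8786,
            "ToPort": 8787,
            "IpRanges": [{"CidrIp": "10.0.0.0/16", "Description": "dask clients (placeholder CIDR)"}],
        },
        {   # Ephemeral/worker ports from the group itself.
            "IpProtocol": "tcp",
            "FromPort": 0,
            "ToPort": 65535,
            "UserIdGroupPairs": [{"GroupId": sg_id, "Description": "self"}],
        },
    ],
)
```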
Based on the dask-cloudprovider source:
- Create cluster in ECS
- Create execution role (line 1061)
  - ecs-tasks assume-role trust and attached policies
- Create task role
- Create CloudWatch logs group
- Create security group
  - 8786 and 8787 open externally
  - 0-65535 from the group itself
- Create the scheduler task definition
- Create the worker task definition
Scaling up
- The scheduler task is started
- An ECS task is run
- TODO: continue here. It should just be starting the workers (see the sketch below).
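As a rough illustration of that scale-up path, launching one additional worker against the pre-provisioned task definition could look like the following boto3 sketch. Every ARN, ID, and address below is a placeholder, and this is a sketch of the idea rather than the project's (or dask-cloudprovider's) actual code.

```python
# Launch one dask worker as a Fargate task pointed at a running scheduler.
import boto3

ecs = boto3.client("ecs")

ecs.run_task(
    cluster="arn:aws:ecs:us-east-1:123456789012:cluster/example",  # the input cluster ARN (placeholder)
    taskDefinition="dask-worker",   # the worker task definition created above
    launchType="FARGATE",
    count=1,                        # one task per worker to add
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0example"],     # placeholder subnet IDs
            "securityGroups": ["sg-0example"],  # the dask security group (placeholder)
            "assignPublicIp": "ENABLED",
        }
    },
    overrides={
        "containerOverrides": [
            {
                "name": "dask-worker",
                # Point the worker at the scheduler on port 8786 (placeholder address).
                "command": ["dask-worker", "tcp://10.0.1.23:8786"],
            }
        ]
    },
)
```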
Developer setup
- Install cdktf and terraform: `brew install cdktf terraform` on the Mac.
- Check out the code.
- Create/activate your Python environment of choice.
- Install uv: `pip install uv`
- Install dependencies: `uv pip install -r pyproject.toml`
- Install dev dependencies: `uv pip install -r pyproject.toml --extra dev`
- Run `pre-commit install` to install the pre-commit hooks.
- Configure your editor for realtime linting:
  - For VS Code:
    - Set the correct Python environment for the workspace via `ctrl+shift+P` > `Python: Select Interpreter`.
    - Install the Pylance and Ruff extensions.
- Make changes.
- Verify linting passes: `scripts/lint.sh`
- Verify tests pass: `scripts/test.sh`
- Commit and push your changes.
Deploying

Assumes you've done the developer setup.

- Copy `.env.template` to `.env` and modify as needed. At the very least, you will likely need to un-comment the `AWS_PROFILE` value. (An example `.env` follows these steps.)
- Connect as necessary to AWS to get credentials, or log in with your profile.
- Start the dask cluster in AWS from the demo-app project. (Instructions are in that project's README.) NOTE: Do not hit enter when it finishes; that will tear down the cluster.
- Once the cluster starts, copy the TCP address of the scheduler and paste it as the value of `DASK_ADDRESS` in your `.env` file.
- Run `scripts/deploy.sh`.
- Check whether the service is up at https://demo.kmnlp.element84.com/
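A minimal `.env` might end up looking like the snippet below. Both values are placeholders: use your own profile name, and the scheduler address you copied from the cluster you started above.

```
AWS_PROFILE=your-profile-name
DASK_ADDRESS=tcp://10.0.1.23:8786
```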
If the page doesn't resolve, either your IP address isn't in the allow list, or the deploy created a new load balancer (rather than updating an existing one).

To add your IP address to the allow list (or script it, as sketched after these steps):
- Get your public IP address.
- Go to the AWS Console.
- Use search to go to Systems Manager.
- Click on "Parameter Store".
- Select `e84-kmnlp-demo-chainlit-allowed-cidrs`.
- Click "Edit".
- Add your IP address, appended with `/32`, to the value. (Individual items are comma-separated.)
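The same update could also be done with boto3, along the lines of the sketch below. It assumes the parameter is a plain String of comma-separated CIDRs; it is not part of the project's scripts.

```python
# Append <your public IP>/32 to the allowed-CIDRs SSM parameter.
import boto3
import urllib.request

PARAM = "e84-kmnlp-demo-chainlit-allowed-cidrs"

# checkip.amazonaws.com returns your public IP as plain text.
my_ip = urllib.request.urlopen("https://checkip.amazonaws.com").read().decode().strip()

ssm = boto3.client("ssm")
current = ssm.get_parameter(Name=PARAM)["Parameter"]["Value"]

cidrs = [c.strip() for c in current.split(",") if c.strip()]
new_cidr = f"{my_ip}/32"
if new_cidr not in cidrs:
    cidrs.append(new_cidr)
    ssm.put_parameter(Name=PARAM, Value=",".join(cidrs), Type="String", Overwrite=True)
```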
To point the DNS record at a newly created load balancer:
- Log in to the AWS Console.
- Search for "Load balancers" and select the result labeled "EC2 Feature".
- Select the relevant load balancer. It should be named `demo-kmnlp-chainlit-alb`.
- Copy the DNS name.
- Navigate to Route 53 in the AWS Console. (I like to do this in a separate tab.)
- Select "Hosted Zones".
- Select `kmnlp.element84.com`.
- Select the CNAME record whose record name is `demo.kmnlp.element84.com`.
- Click "Edit record".
- Replace the value with the DNS name you copied from the load balancer.
- Hit save.
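If you'd rather script this DNS update, a boto3 sketch is below. It is a sketch only; the load balancer, zone, and record names come from the steps above.

```python
# Point demo.kmnlp.element84.com at the ALB the deploy created.
import boto3

elbv2 = boto3.client("elbv2")
route53 = boto3.client("route53")

# Look up the load balancer's DNS name.
alb = elbv2.describe_load_balancers(Names=["demo-kmnlp-chainlit-alb"])["LoadBalancers"][0]
dns_name = alb["DNSName"]

# Find the hosted zone for kmnlp.element84.com.
zone = route53.list_hosted_zones_by_name(DNSName="kmnlp.element84.com")["HostedZones"][0]

# Upsert the CNAME record with the new DNS name.
route53.change_resource_record_sets(
    HostedZoneId=zone["Id"],
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "demo.kmnlp.element84.com",
                    "Type": "CNAME",
                    "TTL": 300,
                    "ResourceRecords": [{"Value": dns_name}],
                },
            }
        ]
    },
)
```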
In order for the deploy to work, there must be a certificate for the domain demo.kmnlp.element84.com. If this certificate disappears for whatever reason, or if the domain name changes:
- Go to the AWS Console.
- Use search to go to Certificate Manager.
- Click on "Request".
- Select "Request a public certificate" and click "Next".
- Enter the fully qualified domain name (e.g. demo.kmnlp.element84.com).
- For the validation method, use DNS validation.
- The default key algorithm should be fine.
- Click "Request".
- From the resulting certificate, in the Domains section, there is a button to "Create records in Route 53". Click it.
- Follow the steps presented there to create the relevant CNAME record.
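The equivalent request can also be made with boto3, roughly as sketched below. DNS validation still has to be completed afterwards via the Route 53 record that ACM provides.

```python
# Request a DNS-validated public certificate for the demo domain.
import boto3

acm = boto3.client("acm")

cert = acm.request_certificate(
    DomainName="demo.kmnlp.element84.com",
    ValidationMethod="DNS",
)
print(cert["CertificateArn"])
# acm.describe_certificate(CertificateArn=...) returns the CNAME record ACM
# expects you to create in Route 53 to complete validation.
```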
The deploy also needs a VPC. The VPC's ID is hard-coded in the common.py file. Currently, we assume that the VPC spans multiple availability zones and has both public and private subnets in each zone.
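A quick boto3 sketch for sanity-checking that assumption against a VPC is below. The VPC ID is a placeholder (the real one lives in common.py), and treating `MapPublicIpOnLaunch` as "public" is a rough heuristic, not how the project classifies subnets.

```python
# Count public/private subnets per availability zone for a given VPC.
from collections import defaultdict

import boto3

VPC_ID = "vpc-0example"  # placeholder; the real ID is hard-coded in common.py

ec2 = boto3.client("ec2")
subnets = ec2.describe_subnets(
    Filters=[{"Name": "vpc-id", "Values": [VPC_ID]}]
)["Subnets"]

by_az = defaultdict(lambda: {"public": 0, "private": 0})
for subnet in subnets:
    kind = "public" if subnet["MapPublicIpOnLaunch"] else "private"
    by_az[subnet["AvailabilityZone"]][kind] += 1

for az, counts in sorted(by_az.items()):
    print(az, counts)
```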