feat(pulumi): add AWS infrastructure code and deployment guide #1052
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,389 @@ | ||
| # Antenna Backend - Deployment & Infrastructure Guide | ||
|
|
||
| This document describes the AWS infrastructure and deployment pipeline for the Antenna backend. | ||
| The system runs on AWS Elastic Beanstalk (ECS-based multicontainer) using Docker, Celery, ElastiCache Redis (TLS), RDS PostgreSQL, S3, ECR, and Sentry. | ||
| It is intended for maintainers and contributors who need to understand, update, or reproduce the deployed environment. | ||
|
|
||
| --- | ||
|
|
||
| ## 1. Overview | ||
|
|
||
| The Antenna backend is a Django application deployed using: | ||
|
|
||
| - **Elastic Beanstalk (ECS-based multicontainer)** running Docker | ||
| - **ECR** for storing container images | ||
| - **RDS PostgreSQL** as the application database | ||
| - **ElastiCache Redis (TLS)** for Celery broker + Django cache | ||
| - **Dockerized services** (Django, Celery Worker, Celery Beat, Flower, AWS CLI helper) | ||
| - **S3** as static storage backend | ||
| - **IAM** roles for instance profiles and service roles | ||
| - **CloudWatch** for logs, health monitoring, ECS task metrics | ||
| - **Default VPC** with public and private subnets | ||
|
|
||
| --- | ||
|
|
||
| ## 2. Repository Structure (Deployment-Relevant) | ||
|
|
||
| - /.ebextensions/00_setup.config : EB environment variables and settings | ||
| - /.ebignore : Exclusion list for EB deployment bundle | ||
| - /Dockerrun.aws.json : Multi-container EB deployment config | ||
|
|
||
| --- | ||
|
|
||
| ## 3. Deployment Architecture | ||
|
|
||
| ### 3.1. Elastic Beanstalk (EB) | ||
|
|
||
| - Platform: ECS on Amazon Linux 2 (Multicontainer Docker) | ||
| - Deployment bundle includes: | ||
| - `Dockerrun.aws.json` (v2) | ||
| - `.ebextensions/00_setup.config` | ||
| - Environment type: | ||
| - Single-instance environment (used for development/testing to reduce cost). | ||
| - Can be upgraded later to a load-balanced environment for production. | ||
|
Contributor
Fix fragment sentence. This line lacks a subject. Restructure to: "This environment can be upgraded later to a load-balanced environment for production." (LanguageTool [style], MISSING_IT_THERE: to form a complete sentence, be sure to include a subject.)
||
| - **Instance Configuration** | ||
| - Architecture: `x86_64` | ||
| - Instance types (preferred order): | ||
| - `t3.large` | ||
| - `t3.small` | ||
| - Capacity type: **On-Demand instances** | ||
|
|
||
| - **Auto Scaling Group** | ||
| - Uses a **single-instance ASG** (managed automatically by Elastic Beanstalk) | ||
| - EB performs health checks on the instance | ||
|
|
||
| - **Security Groups** | ||
| - EB-managed instance security group (default inbound + outbound rules) | ||
| - Additional outbound egress security group | ||
| *(originally created for App Runner, now reused for EB networking)* | ||
|
|
||
| - **Enhanced health reporting** | ||
| - Real-time system + application monitoring | ||
| - Free custom metric: `EnvironmentHealth` | ||
|
|
||
| - **Health Event Streaming** | ||
| - Log streaming to CloudWatch Logs: Enabled | ||
| - Retention: 7 days | ||
| - Lifecycle: Keep logs after terminating environment | ||
|
|
||
| - **Managed Platform Updates** | ||
| - Enabled | ||
| - Weekly maintenance window: Thursday @ 22:40 UTC | ||
| - Update level: Apply **minor and patch** updates | ||
| - Instance replacement: Enabled (EB replaces the instance if no other updates apply) | ||
|
|
||
| - **Rolling Updates & Deployments** | ||
| - Deployment policy: All at once | ||
| - Batch size type: Percentage | ||
| - Rolling updates: Disabled (not needed for single instance) | ||
| - **Deployment preferences:** | ||
| - Ignore health check: `False` | ||
| - Health threshold: `OK` | ||
| - Command timeout: `600 seconds` | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| ### 3.2. Docker Containers | ||
|
|
||
| EB ECS runs the following containers: | ||
|
|
||
| 1. **django** - web application (the container listens on port 5000, which is exposed as port 80 on the Elastic Beanstalk host) | ||
| 2. **celeryworker** - asynchronous task worker | ||
| 3. **celerybeat** - scheduled task runner | ||
| 4. **flower** - Celery monitoring UI (port 5555) | ||
| 5. **awscli** - lightweight helper container for internal AWS commands | ||
|
|
||
| --- | ||
|
|
||
| ### 3.3. ECR Repositories Used | ||
|
|
||
| All application containers pull from: | ||
|
|
||
| - **antenna-backend** | ||
| `<ECR_URI>/antenna-backend` | ||
|
|
||
| The AWS CLI helper container pulls from: | ||
|
|
||
| - **antenna-awscli** | ||
| `<ECR_URI>/antenna-awscli` | ||
|
|
||
| Both repositories are **mutable** and **AES-256 encrypted**. | ||
|
|
||
| --- | ||
|
|
||
| ## 4. Environment Variables | ||
|
|
||
| In this setup, **all required environment variables—including secrets—are defined inside** | ||
| `.ebextensions/00_setup.config`. | ||
|
|
||
| Elastic Beanstalk automatically reads the values from this file and writes them into its | ||
| **Environment Properties** at deployment time. | ||
| This ensures a fully automated bootstrap with no manual EB console entry. | ||
|
|
||
| The deployment uses the following environment variables, grouped by category: | ||
|
|
||
| ### Django | ||
| - `DJANGO_SETTINGS_MODULE` | ||
| - `DJANGO_SECRET_KEY` | ||
| - `DJANGO_ALLOWED_HOSTS` | ||
| - `DJANGO_SECURE_SSL_REDIRECT` | ||
| - `DJANGO_ADMIN_URL` | ||
| - `DJANGO_DEBUG` | ||
| - `EB_HEALTHCHECK` | ||
|
|
||
| ### AWS / S3 | ||
| - `DJANGO_AWS_ACCESS_KEY_ID` | ||
| - `DJANGO_AWS_SECRET_ACCESS_KEY` | ||
| - `DJANGO_AWS_STORAGE_BUCKET_NAME` | ||
| - `DJANGO_AWS_S3_REGION_NAME` | ||
|
|
||
| ### Database (RDS) | ||
| - `POSTGRES_DB` | ||
| - `POSTGRES_USER` | ||
| - `POSTGRES_PASSWORD` | ||
| - `POSTGRES_HOST` | ||
| - `POSTGRES_PORT` | ||
| - `DATABASE_URL` | ||
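The guide lists both the individual `POSTGRES_*` variables and `DATABASE_URL` but does not say how they relate. A minimal sketch, assuming `DATABASE_URL` is meant to carry the same values in the conventional `postgres://` URL form, is shown below. The helper name `build_database_url` is hypothetical and is not part of the Antenna codebase.

```python
from urllib.parse import quote


def build_database_url(env):
    """Assemble a PostgreSQL URL from the POSTGRES_* variables (assumed mapping)."""
    # Percent-encode credentials so characters such as '@' or '/' in a
    # password cannot be mistaken for URL delimiters.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_HOST"]
    port = env.get("POSTGRES_PORT", "5432")  # 5432 matches the RDS security group rule
    name = quote(env["POSTGRES_DB"], safe="")
    return f"postgres://{user}:{password}@{host}:{port}/{name}"
```

Deriving the URL from the individual parts in one place avoids the two representations drifting apart when a password is rotated.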
|
|
||
| ### Redis / Celery | ||
| - `REDIS_URL` | ||
| - `CELERY_BROKER_URL` | ||
|
|
||
| ### Third-Party Integrations | ||
| - `SENDGRID_API_KEY` | ||
| - `SENTRY_DSN` | ||
|
|
||
| --- | ||
|
|
||
| ## 5. AWS Infrastructure Components | ||
|
|
||
| ### 5.1. RDS (PostgreSQL) | ||
|
|
||
| - **Engine:** PostgreSQL | ||
| - **Instance class:** `db.t4g.small` | ||
| - **Availability Zone:** Single-AZ | ||
|
|
||
| - **Networking:** | ||
| - Runs inside the **default VPC** | ||
| - RDS subnet group uses **public subnets** | ||
| - Instance is currently configured as **publicly accessible** (to be made private; see Future Improvements) | ||
|
|
||
|
||
| - **Endpoint:** *(redacted for security)* | ||
|
|
||
| - **Security group:** | ||
| - Inbound port **5432** allowed from the EB instance SG | ||
| - Outbound allowed to `0.0.0.0/0` | ||
|
|
||
| --- | ||
|
|
||
| ### 5.2. ElastiCache (Redis) | ||
|
|
||
| - **Engine:** Redis 7.1 | ||
| - **Node type:** `cache.t4g.micro` | ||
| - **Cluster mode:** Disabled (single node) | ||
| - **Multi-AZ:** Disabled | ||
| - **Auto-failover:** Disabled | ||
|
|
||
| - **Security:** | ||
| - Encryption in transit: **Enabled** | ||
| - Encryption at rest: **Enabled** | ||
| - Redis URL requires: | ||
| - `rediss://` (TLS) | ||
| - `ssl_cert_reqs=none` for Celery/Django clients | ||
| - Inbound port **6379** allowed only from the EB instance SG | ||
|
|
||
| - **Networking:** | ||
| - Deployed into private subnets (via its subnet group) | ||
| - Runs within the same VPC as EB and RDS | ||
|
|
||
| --- | ||
|
|
||
| ### 5.3. Elastic Beanstalk EC2 Instance & IAM Roles | ||
|
|
||
| - **Instance type:** `t3.large` | ||
| - **Instance profile:** `aws-elasticbeanstalk-ec2-role` | ||
| - **Service role:** `aws-elasticbeanstalk-service-role` | ||
| - Create an EC2 key pair in your AWS account and attach it to the EB environment when launching the backend. (Each developer should use their own key pair.) | ||
| - **Public IP:** Assigned | ||
| - **Security groups:** | ||
| - EB default instance SG | ||
| - Outbound-only egress SG | ||
|
|
||
|
|
||
| ### 5.4. IAM Roles and Policies | ||
|
|
||
| **1. EC2 Instance Profile – `aws-elasticbeanstalk-ec2-role`** | ||
| Attached AWS-managed policies (default from EB): | ||
| - `AWSElasticBeanstalkWebTier` | ||
| - `AWSElasticBeanstalkWorkerTier` | ||
| - `AmazonEC2ContainerRegistryReadOnly` (ECR pull) | ||
| - `CloudWatchAgentServerPolicy` (log streaming) | ||
| - S3 read/write access granted through `AWSElasticBeanstalkWebTier` | ||
| (used for EB deployment bundles, log archives, temp artifacts) | ||
|
|
||
| This role is used **by the EC2 instance itself**. | ||
| It allows the instance to: | ||
| - Pull container images from ECR | ||
| - Upload logs to CloudWatch | ||
| - Read/write to the EB S3 bucket | ||
| - Communicate with ECS agent inside the EB environment | ||
|
|
||
| --- | ||
|
|
||
| **2. Service Role – `aws-elasticbeanstalk-service-role`** | ||
| Attached AWS-managed policies (default from EB): | ||
| - `AWSElasticBeanstalkEnhancedHealth` | ||
| - `AWSElasticBeanstalkService` | ||
|
|
||
| This role is used **by the Elastic Beanstalk service**, not the EC2 instance. | ||
| It allows EB to: | ||
| - Manage environment health monitoring | ||
| - Launch/update/terminate EC2 instances | ||
| - Interact with Auto Scaling | ||
| - Register container tasks and update ECS configuration | ||
|
|
||
| --- | ||
|
|
||
| ### Notes on Security / Least Privilege | ||
|
|
||
| The current roles use **Elastic Beanstalk’s default managed policies**, which are intentionally broad to ensure environments deploy successfully. | ||
|
|
||
| For a production-grade hardened setup, these should eventually be adjusted toward **least privilege**, including: | ||
|
|
||
| - Restricting S3 access to only specific buckets | ||
| - Restricting ECR access to only required repositories | ||
| - Minimizing CloudWatch permissions | ||
| - Adding explicit denies on unneeded services | ||
|
|
||
| This is recommended once the deployment architecture has stabilized, so it is left as future work. | ||
|
|
||
|
|
||
|
|
||
| --- | ||
|
|
||
| ### 5.5. Networking (EB Environment) | ||
|
|
||
| - **VPC:** default VPC | ||
| - **Subnets:** | ||
| - EB instance runs in a **public subnet** | ||
| - Redis runs in **private subnets** (via its subnet group); RDS currently uses **public subnets** (see 5.1) | ||
| - **Public access:** | ||
| - EB EC2 instance receives a public IP | ||
| - No load balancer (single-instance environment) | ||
| - **Connectivity:** | ||
| - EB instance can reach RDS & Redis via SG rules | ||
| - Internet connectivity available through AWS default routing | ||
|
|
||
| --- | ||
|
|
||
| ## 6. .ebextensions Configuration | ||
|
|
||
| `00_setup.config` handles: | ||
|
|
||
| - Loading environment variables into EB | ||
| - Setting health check path: `/api/v2/` | ||
| - Disabling SSL redirects during health checks (`EB_HEALTHCHECK=1`) | ||
| - Running Django migrations via Docker: | ||
| `docker exec $(docker ps -q -f name=django) python manage.py migrate --noinput` | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| ## 7. Deployment Workflow | ||
|
|
||
| ### Step 1 — Build and push image to ECR | ||
|
|
||
| `docker build -t antenna-backend .` | ||
| `docker tag antenna-backend:latest <ECR_URI>/antenna-backend:v10` | ||
|
||
| `docker push <ECR_URI>/antenna-backend:v10` | ||
|
|
||
| ### Step 2 — Update Dockerrun.aws.json | ||
|
|
||
| Update the image tag of each `antenna-backend` container: | ||
|
|
||
| `"image": "<ECR_URI>/antenna-backend:v10"` | ||
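Editing the tag by hand is easy to get wrong when several containers share one image. The sketch below shows one way to update every matching `image` entry in a version 2 `Dockerrun.aws.json`, which keeps its containers under `containerDefinitions`. The function names are hypothetical, and the repository string passed in is a placeholder you would replace with your own ECR repository URI.

```python
import json


def split_image(image):
    """Split 'repo:tag' into (repo, tag); the tag is '' when absent."""
    # Only look for ':' in the last path segment so a registry port
    # (e.g. 'localhost:5000/app') is not mistaken for a tag.
    if ":" in image.rsplit("/", 1)[-1]:
        repo, _, tag = image.rpartition(":")
        return repo, tag
    return image, ""


def set_image_tag(dockerrun, repository, tag):
    """Point every container using `repository` at `tag`; return the count changed."""
    changed = 0
    for container in dockerrun.get("containerDefinitions", []):
        repo, _ = split_image(container.get("image", ""))
        if repo == repository:
            container["image"] = f"{repository}:{tag}"
            changed += 1
    return changed


def update_file(path, repository, tag):
    """Rewrite the Dockerrun file at `path` in place."""
    with open(path) as fh:
        dockerrun = json.load(fh)
    changed = set_image_tag(dockerrun, repository, tag)
    with open(path, "w") as fh:
        json.dump(dockerrun, fh, indent=2)
        fh.write("\n")
    return changed
```

Returning the number of changed containers gives a cheap sanity check: a result of zero usually means the repository string did not match anything in the file.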
|
|
||
| ### Step 3 — Create EB bundle | ||
|
|
||
| `zip -r deploy.zip Dockerrun.aws.json .ebextensions .ebignore` | ||
|
|
||
|
|
||
| ### Step 4 — Deploy to Elastic Beanstalk | ||
|
|
||
| - EB Console → Environment → Upload & Deploy | ||
| - Upload `deploy.zip` | ||
| - Wait for ECS tasks to start | ||
|
|
||
| ### Step 5 — Validate Deployment | ||
|
|
||
| - `/api/v2/` returns `200` | ||
| - Django container remains healthy | ||
| - Celery worker connects to Redis successfully | ||
|
||
| - Celery Beat schedules run successfully | ||
| - Flower UI loads on port 5555 (if security groups permit) | ||
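The first check in the list above can be scripted. The following is a minimal sketch using only the standard library; the function name `api_is_healthy` and the injectable `opener` parameter are assumptions made for illustration, and the base URL is whatever address your EB environment exposes.

```python
from urllib.error import URLError
from urllib.request import urlopen


def api_is_healthy(base_url, opener=urlopen, timeout=10):
    """Return True when GET <base_url>/api/v2/ answers with HTTP 200."""
    url = f"{base_url.rstrip('/')}/api/v2/"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # urlopen raises HTTPError (a URLError subclass) for 4xx/5xx
        # responses and URLError/OSError for connection problems.
        return False
```

Passing the opener as a parameter keeps the check testable without network access. The Celery and Flower checks are not covered here because the guide does not describe endpoints for them.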
|
|
||
| --- | ||
|
|
||
| ## 8. Common Issues & Fixes | ||
|
|
||
| ### Redis SSL Errors | ||
|
|
||
| ElastiCache has in-transit encryption enabled, so clients must connect over TLS. Missing SSL arguments cause: | ||
|
|
||
| `ssl.SSLCertVerificationError` | ||
|
|
||
| **Fix:** | ||
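| Use the `rediss://` scheme and append `ssl_cert_reqs=none` to the Redis URL. Note that `ssl_cert_reqs=none` disables certificate verification. | ||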
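As a sketch of the fix above, the helper below normalizes a Redis URL so it uses the `rediss://` scheme and carries an `ssl_cert_reqs` query parameter. The name `ensure_tls_redis_url` is hypothetical; the code only manipulates the URL string with the standard library and makes no claim about how the project builds `REDIS_URL` or `CELERY_BROKER_URL`.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def ensure_tls_redis_url(url, cert_reqs="none"):
    """Return `url` with the rediss:// scheme and an ssl_cert_reqs parameter."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    # Keep an explicit ssl_cert_reqs value if the caller already set one.
    query.setdefault("ssl_cert_reqs", cert_reqs)
    return urlunsplit(
        ("rediss", parts.netloc, parts.path, urlencode(query), parts.fragment)
    )
```

Because an existing `ssl_cert_reqs` value is preserved, the same helper still works if the setup later moves to a stricter value with proper certificate verification.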
|
||
|
|
||
| ### Health Check Redirect Loops | ||
|
|
||
| EB health checks are sent over plain HTTP, so an unconditional redirect to HTTPS in Django prevents them from ever receiving a `200`. | ||
|
|
||
| **Fix:** | ||
| Set `EB_HEALTHCHECK=1` and temporarily disable SSL redirect for health checks. | ||
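The guide does not show how the Django settings consume `EB_HEALTHCHECK`. One plausible shape, offered as an assumption rather than the project's actual code, is a small resolver whose result is assigned to Django's `SECURE_SSL_REDIRECT` setting. The helper names `env_flag` and `resolve_ssl_redirect` are hypothetical.

```python
def env_flag(env, name, default=False):
    """Interpret an environment value such as '1', 'true' or 'yes' as a boolean."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def resolve_ssl_redirect(env):
    """Decide the value for SECURE_SSL_REDIRECT from the environment."""
    # EB_HEALTHCHECK=1 wins: health checks arrive over plain HTTP and must
    # not be redirected, whatever DJANGO_SECURE_SSL_REDIRECT says.
    if env_flag(env, "EB_HEALTHCHECK"):
        return False
    return env_flag(env, "DJANGO_SECURE_SSL_REDIRECT", default=True)
```

In a settings module this would be used as `SECURE_SSL_REDIRECT = resolve_ssl_redirect(os.environ)`. Defaulting to `True` when neither variable is set keeps the safer behaviour as the fallback.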
|
|
||
| ### Early Migrations Failure | ||
|
|
||
| EB sometimes runs migrations before services are ready. | ||
|
|
||
| **Fix:** | ||
| The migration command in `.ebextensions` is configured to ignore failures and retry. | ||
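The retry behaviour described above can be illustrated with a small loop. This is a sketch under assumptions, not the contents of `00_setup.config`: the attempt count, the delay, and the function names are all hypothetical, and the command list mirrors the `manage.py migrate --noinput` invocation shown in section 6.

```python
import subprocess
import time

# Same migrate invocation as in section 6, without the docker exec wrapper.
MIGRATE_COMMAND = ["python", "manage.py", "migrate", "--noinput"]


def run_with_retry(run, attempts=5, delay=10.0, sleep=time.sleep):
    """Call run() until it returns 0 or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        if run() == 0:
            return True
        if attempt < attempts:
            # Give the database and other services time to come up.
            sleep(delay)
    return False


def migrate():
    """Run the migration command once and return its exit code."""
    return subprocess.run(MIGRATE_COMMAND).returncode
```

A caller would invoke `run_with_retry(migrate)` and decide what to do with a `False` result; ignoring it, as the guide describes, lets the deployment continue even if every attempt failed.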
|
|
||
| --- | ||
|
|
||
| ## 9. Future Improvements | ||
|
|
||
| To harden the deployment and move toward a production-grade architecture, the following enhancements are recommended: | ||
|
|
||
|
||
| - **Move secrets to AWS Secrets Manager** | ||
| Centralize all sensitive variables (DB password, Redis URL, Django secret key, Sentry key, SendGrid, etc.) and replace `.ebextensions` injection with runtime retrieval. | ||
|
|
||
| - **Enable ElastiCache Multi-AZ + Auto-Failover** | ||
| Improves high availability for Celery and Django caching; eliminates single-node Redis failure risks. | ||
|
|
||
| - **Restrict RDS and Redis to private-only access** | ||
| Disable public accessibility on RDS and ensure Redis remains reachable only via EB’s security group. | ||
|
|
||
| - **IAM hardening and least-privilege review** | ||
| Replace broad EB-managed policies with reduced IAM policies scoped only to required S3, ECR, CloudWatch, and ECS resources. | ||
|
|
||
| - **Add CI/CD pipeline (GitHub Actions -> ECR -> EB)** | ||
| Automate build, tag, push of images and deployments to Elastic Beanstalk for consistent, reproducible releases. | ||
|
|
||
| - **Add staging environment** | ||
| Separate EB environment (staging) for testing migrations, image builds, and infrastructure changes before production. | ||
|
|
||
| - **Migrate to load-balanced EB environment (optional)** | ||
| Enables rolling deployments, zero-downtime updates, and better scalability. | ||
|
|
||
| - **Enable RDS Multi-AZ + automated backups** | ||
| Ensures database failover and improves disaster recovery readiness. | ||
|
|
||
| - **Add health checks for Celery worker & beat** | ||
| Custom EB or CloudWatch alarms to alert on worker failures, broker connectivity issues, or long task queues. | ||
|
|
||
|
|
||
| --- | ||
|
|
||
| _End of documentation._ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
|
|
||
|
||
can we add example versions of these files?
Absolutely! Some of these files may contain secrets, so I’ll sanitize the sensitive parts and upload minimal example versions.