Changes from 9 commits
Commits
Show all changes
66 commits
4dbdd47
docs(infra): add aws-infra folder with initial pipeline documentation
bhagyashreewagh Nov 13, 2025
e91ea76
docs(infra): add Elastic Beanstalk PassRole policy under aws-infra/po…
bhagyashreewagh Nov 13, 2025
176badd
Delete aws-infra/aws-infra/policies directory
bhagyashreewagh Nov 13, 2025
daebb37
Create iam_elastic_beanstalk_policy.json
bhagyashreewagh Nov 13, 2025
1fa6843
docs(readme): add complete AWS deployment and infrastructure document…
bhagyashreewagh Nov 13, 2025
e449df2
docs: add full AWS deployment and infrastructure guide to README
bhagyashreewagh Nov 14, 2025
68fdadb
Update README.md
bhagyashreewagh Nov 21, 2025
9cd267c
Update README.md
bhagyashreewagh Nov 21, 2025
31fc8b4
Update README.md
bhagyashreewagh Nov 21, 2025
217e619
docs(readme): update deployment guide with clearer examples and confi…
bhagyashreewagh Dec 5, 2025
87c8b92
Correct grammatical error in README.md
bhagyashreewagh Dec 7, 2025
2e41a6d
Update README.md
bhagyashreewagh Dec 8, 2025
4c02615
Update README.md
bhagyashreewagh Dec 8, 2025
68b8002
Create aws-architecture.png
bhagyashreewagh Dec 8, 2025
44c10a2
Add files via upload
bhagyashreewagh Dec 8, 2025
cf6c84f
Rename White Beige Minimal Flowchart Diagram Graph (1).png to aws_arc…
bhagyashreewagh Dec 8, 2025
e546840
Delete aws-infra/images/aws-architecture.png
bhagyashreewagh Dec 8, 2025
366a7dd
Update README.md
bhagyashreewagh Dec 8, 2025
7e75828
Add files via upload
bhagyashreewagh Dec 8, 2025
b754870
Delete aws-infra/images/aws_architecture.png
bhagyashreewagh Dec 8, 2025
eac6e24
Rename White Beige Minimal Flowchart Diagram Graph (2).png to aws_arc…
bhagyashreewagh Dec 8, 2025
a0cbebe
Update README.md
bhagyashreewagh Dec 8, 2025
c30fe43
Add files via upload
bhagyashreewagh Dec 8, 2025
c5dd9fc
Rename White Beige Minimal Flowchart Diagram Graph.svg to aws_archite…
bhagyashreewagh Dec 8, 2025
1b0e8eb
Delete aws-infra/images/aws_architecture.png
bhagyashreewagh Dec 8, 2025
68003e4
Update README.md
bhagyashreewagh Dec 8, 2025
20afc50
Rename aws_architecture.svg to aws_architecture_backend.svg
bhagyashreewagh Dec 8, 2025
cb7947c
Add files via upload
bhagyashreewagh Dec 8, 2025
ea75cf1
Delete aws-infra/images/2.svg
bhagyashreewagh Dec 8, 2025
46bfdbd
Add files via upload
bhagyashreewagh Dec 8, 2025
c2fe428
Add Antenna UI deployment guide to README
bhagyashreewagh Dec 8, 2025
91494df
Revise Antenna deployment documentation
bhagyashreewagh Dec 8, 2025
e67b5b4
Fix header formatting in README.md
bhagyashreewagh Dec 8, 2025
d606342
Revise README for backend services and security notes
bhagyashreewagh Dec 9, 2025
03a3ba3
Add Elastic Beanstalk configuration template
bhagyashreewagh Dec 10, 2025
9b6bd3a
Add Dockerrun.aws.json_template for AWS deployment
bhagyashreewagh Dec 10, 2025
f9fab02
Create storage.py
bhagyashreewagh Dec 10, 2025
9803c04
Merge branch 'RolnickLab:main' into feat/aws-pipeline
bhagyashreewagh Dec 10, 2025
35fb523
Refactor AWS S3 and MinIO configuration logic
bhagyashreewagh Dec 11, 2025
fca0e89
Add experimental warning to AWS deployment README
bhagyashreewagh Dec 26, 2025
6658ea1
Update storage.py
bhagyashreewagh Dec 26, 2025
ade4e95
Update storage.py
bhagyashreewagh Dec 26, 2025
c3bfea5
Add secrets manager implementation for AWS
bhagyashreewagh Jan 30, 2026
1fbe45a
Add dependencies to requirements.txt for AWS infra
bhagyashreewagh Jan 30, 2026
2d4a3e2
Add Redis ElastiCache configuration with Pulumi
bhagyashreewagh Jan 30, 2026
f1fc02d
Add RDS instance configuration with monitoring
bhagyashreewagh Jan 30, 2026
70cc82c
Add Docker image build and push logic for AWS ECR
bhagyashreewagh Jan 30, 2026
bc3aaa9
Add IAM roles and policies for ECS and EB
bhagyashreewagh Jan 30, 2026
3f2e14f
Add script to create AWS ECR repositories
bhagyashreewagh Jan 30, 2026
7c89e94
Add AWS Elastic Beanstalk infrastructure setup
bhagyashreewagh Jan 30, 2026
31c6e6d
Add CloudFront configuration and S3 bucket setup
bhagyashreewagh Jan 30, 2026
791f493
Initialize AWS infrastructure deployment script
bhagyashreewagh Jan 30, 2026
efe3363
Delete aws-infra/policies directory
bhagyashreewagh Jan 30, 2026
ba59265
Add S3 bucket configuration with ownership and encryption
bhagyashreewagh Jan 30, 2026
9f5a568
Import S3 modules in storage init file
bhagyashreewagh Jan 30, 2026
7b341ff
Add S3 bucket policies and public access settings
bhagyashreewagh Jan 30, 2026
b756454
Add installation guide for Pulumi on AWS
bhagyashreewagh Jan 30, 2026
639b482
Add VPC configuration using AWS default VPC
bhagyashreewagh Jan 30, 2026
880066a
Implement AWS subnets for Redis and RDS
bhagyashreewagh Jan 30, 2026
a7ebde5
Add security groups for EB, RDS, and Redis
bhagyashreewagh Jan 30, 2026
7f89036
Remove specific IPs from security group rules
bhagyashreewagh Jan 30, 2026
69d260c
Create private route table for RDS subnets
bhagyashreewagh Jan 30, 2026
671d254
Update installation.md
bhagyashreewagh Jan 30, 2026
bc3de5e
Update eb.py
bhagyashreewagh Jan 30, 2026
22aa1cc
Update s3.py
bhagyashreewagh Jan 30, 2026
d5930d4
Rename security_groups.py to security_group.py
bhagyashreewagh Feb 9, 2026
389 changes: 389 additions & 0 deletions aws-infra/README.md
@@ -0,0 +1,389 @@
# Antenna Backend - Deployment & Infrastructure Guide

This document describes the AWS infrastructure and deployment pipeline for the Antenna backend.
The system runs on AWS Elastic Beanstalk (ECS-based multicontainer) using Docker, Celery, ElastiCache Redis (TLS), RDS PostgreSQL, S3, ECR, and Sentry.
It is intended for maintainers and contributors who need to understand, update, or reproduce the deployed environment.

---

## 1. Overview

The Antenna backend is a Django application deployed using:

- **Elastic Beanstalk (ECS-based multicontainer)** running Docker
- **ECR** for storing container images
- **RDS PostgreSQL** as the application database
- **ElastiCache Redis (TLS)** for Celery broker + Django cache
- **Dockerized services** (Django, Celery Worker, Celery Beat, Flower, AWS CLI helper)
- **S3** as static storage backend
- **IAM** roles for instance profiles and service roles
- **CloudWatch** for logs, health monitoring, ECS task metrics
- **Default VPC** with public and private subnets

---

## 2. Repository Structure (Deployment-Relevant)

- /.ebextensions/00_setup.config : EB environment variables and settings
- /.ebignore : Exclusion list for EB deployment bundle
- /Dockerrun.aws.json : Multi-container EB deployment config
Review comment (Collaborator):

Can we add example versions of these files?

Reply (Author):

Absolutely! Some of these files may contain secrets, so I'll sanitize the sensitive parts and upload minimal example versions.


---

## 3. Deployment Architecture

### 3.1. Elastic Beanstalk (EB)

- Platform: ECS on Amazon Linux 2 (Multicontainer Docker)
- Deployment bundle includes:
  - `Dockerrun.aws.json` (v2)
  - `.ebextensions/00_setup.config`
- Environment type:
  - Single-instance environment (used for development/testing to reduce cost).
  - Can be upgraded later to a load-balanced environment for production.
Review comment (Contributor):

⚠️ Potential issue | 🟡 Minor

Fix fragment sentence.

This line lacks a subject. Restructure to: "This environment can be upgraded later to a load-balanced environment for production."


- **Instance Configuration**
  - Architecture: `x86_64`
  - Instance types (preferred order):
    - `t3.large`
    - `t3.small`
  - Capacity type: **On-Demand instances**

- **Auto Scaling Group**
  - Uses a **single-instance ASG** (managed automatically by Elastic Beanstalk)
  - EB performs health checks on the instance

- **Security Groups**
  - EB-managed instance security group (default inbound + outbound rules)
  - Additional outbound egress security group
    *(originally created for App Runner, now reused for EB networking)*

- **Enhanced health reporting**
  - Real-time system + application monitoring
  - Free custom metric: `EnvironmentHealth`

- **Health Event Streaming**
  - Log streaming to CloudWatch Logs: Enabled
  - Retention: 7 days
  - Lifecycle: Keep logs after terminating environment

- **Managed Platform Updates**
  - Enabled
  - Weekly maintenance window: Thursday @ 22:40 UTC
  - Update level: Apply **minor and patch** updates
  - Instance replacement: Enabled (EB replaces the instance if no other updates apply)

- **Rolling Updates & Deployments**
  - Deployment policy: All at once
  - Batch size type: Percentage
  - Rolling updates: Disabled (not needed for single instance)
  - **Deployment preferences:**
    - Ignore health check: `False`
    - Health threshold: `OK`
    - Command timeout: `600 seconds`


---

### 3.2. Docker Containers

EB ECS runs the following containers:

1. **django** - web application (the container listens on port 5000, which is exposed as port 80 on the Elastic Beanstalk host)
2. **celeryworker** - asynchronous task worker
3. **celerybeat** - scheduled task runner
4. **flower** - Celery monitoring UI (port 5555)
5. **awscli** - lightweight helper container for internal AWS commands
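
As a rough illustration of how the five containers above could be declared, the following Python sketch assembles a minimal `Dockerrun.aws.json` (v2) document. The memory values, the `essential` flags, the Flower host port, the shared tag, and the helper name `container` are assumptions made for illustration, not the project's actual configuration. Real definitions would also need commands, environment settings, and links.

```python
import json

# Placeholders: substitute the real registry URI and image tag.
ECR_URI = "<ECR_URI>"
TAG = "v10"  # example tag only


def container(name, image, memory, ports=None, essential=True):
    """Build one containerDefinitions entry (memory in MiB is an assumed value)."""
    definition = {
        "name": name,
        "image": image,
        "essential": essential,
        "memory": memory,
    }
    if ports:
        definition["portMappings"] = [
            {"hostPort": host, "containerPort": cont} for host, cont in ports
        ]
    return definition


backend = f"{ECR_URI}/antenna-backend:{TAG}"
dockerrun = {
    "AWSEBDockerrunVersion": 2,
    "containerDefinitions": [
        # Container port 5000 is published as port 80 on the EB host.
        container("django", backend, 1024, ports=[(80, 5000)]),
        container("celeryworker", backend, 1024),
        container("celerybeat", backend, 256),
        container("flower", backend, 256, ports=[(5555, 5555)], essential=False),
        container("awscli", f"{ECR_URI}/antenna-awscli:{TAG}", 128, essential=False),
    ],
}

print(json.dumps(dockerrun, indent=2))
```

Generating the file from one place keeps the image tag consistent across the containers that share the `antenna-backend` image.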

---

### 3.3. ECR Repositories Used

All application containers pull from:

- **antenna-backend**
`<ECR_URI>/antenna-backend`

The AWS CLI helper container pulls from:

- **antenna-awscli**
`<ECR_URI>/antenna-awscli`

Both repositories are **mutable** and **AES-256 encrypted**.

---

## 4. Environment Variables

In this setup, **all required environment variables—including secrets—are defined inside**
`.ebextensions/00_setup.config`.

Elastic Beanstalk automatically reads the values from this file and writes them into its
**Environment Properties** at deployment time.
This ensures a fully automated bootstrap with no manual EB console entry.

The deployment uses the following environment variables across these categories:

### Django
- `DJANGO_SETTINGS_MODULE`
- `DJANGO_SECRET_KEY`
- `DJANGO_ALLOWED_HOSTS`
- `DJANGO_SECURE_SSL_REDIRECT`
- `DJANGO_ADMIN_URL`
- `DJANGO_DEBUG`
- `EB_HEALTHCHECK`

### AWS / S3
- `DJANGO_AWS_ACCESS_KEY_ID`
- `DJANGO_AWS_SECRET_ACCESS_KEY`
- `DJANGO_AWS_STORAGE_BUCKET_NAME`
- `DJANGO_AWS_S3_REGION_NAME`

### Database (RDS)
- `POSTGRES_DB`
- `POSTGRES_USER`
- `POSTGRES_PASSWORD`
- `POSTGRES_HOST`
- `POSTGRES_PORT`
- `DATABASE_URL`

### Redis / Celery
- `REDIS_URL`
- `CELERY_BROKER_URL`

### Third-Party Integrations
- `SENDGRID_API_KEY`
- `SENTRY_DSN`
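
Because every variable is injected at deployment time, a missing entry tends to surface only when a container starts. A minimal sketch of a startup check is shown below. The choice of which variables count as required and the helper name `missing_vars` are assumptions, not part of the project.

```python
import os

# Assumed subset of the variables listed above that the app cannot start without.
REQUIRED_VARS = [
    "DJANGO_SETTINGS_MODULE",
    "DJANGO_SECRET_KEY",
    "DJANGO_ALLOWED_HOSTS",
    "DJANGO_AWS_STORAGE_BUCKET_NAME",
    "DJANGO_AWS_S3_REGION_NAME",
    "DATABASE_URL",
    "REDIS_URL",
    "CELERY_BROKER_URL",
]


def missing_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_vars()` early in container startup and failing fast when the list is non-empty gives a clearer error than a later connection failure.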

---

## 5. AWS Infrastructure Components

### 5.1. RDS (PostgreSQL)

- **Engine:** PostgreSQL
- **Instance class:** `db.t4g.small`
- **Availability Zone:** Single-AZ

- **Networking:**
  - Runs inside the **default VPC**
  - RDS subnet group uses **public subnets**
  - Instance is configured as **publicly accessible** (this should be changed to private; see Section 9)

- **Endpoint:** *(redacted for security)*

- **Security group:**
  - Inbound port **5432** allowed from the EB instance SG
  - Outbound allowed to `0.0.0.0/0`
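
The `DATABASE_URL` variable from Section 4 can be derived from the individual `POSTGRES_*` values. The sketch below shows one way to do that while percent-encoding the credentials. The `postgres://` scheme, the default port, and the function name `build_database_url` are assumptions, and the host in the example is hypothetical.

```python
from urllib.parse import quote


def build_database_url(env):
    """Compose a connection URL from POSTGRES_* values, escaping credentials."""
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    host = env["POSTGRES_HOST"]
    port = env.get("POSTGRES_PORT", "5432")
    name = env["POSTGRES_DB"]
    return f"postgres://{user}:{password}@{host}:{port}/{name}"


example = {
    "POSTGRES_USER": "antenna",
    "POSTGRES_PASSWORD": "p@ss/word",        # special characters must be escaped
    "POSTGRES_HOST": "db.example.internal",  # hypothetical endpoint
    "POSTGRES_DB": "antenna",
}
print(build_database_url(example))
```

Escaping matters because characters such as `@` or `/` in a password would otherwise be parsed as URL delimiters.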

---

### 5.2. ElastiCache (Redis)

- **Engine:** Redis 7.1
- **Node type:** `cache.t4g.micro`
- **Cluster mode:** Disabled (single node)
- **Multi-AZ:** Disabled
- **Auto-failover:** Disabled

- **Security:**
  - Encryption in transit: **Enabled**
  - Encryption at rest: **Enabled**
  - Redis URL requires:
    - `rediss://` (TLS)
    - `ssl_cert_reqs=none` for Celery/Django clients
  - Inbound port **6379** allowed only from the EB instance SG

- **Networking:**
  - Deployed into private subnets (via its subnet group)
  - Runs within the same VPC as EB and RDS
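
The TLS requirements above translate into a URL of a specific shape. The following sketch builds such a URL for `REDIS_URL` and `CELERY_BROKER_URL`. The function name `tls_redis_url`, the default database index, and the example host are assumptions.

```python
from urllib.parse import urlencode


def tls_redis_url(host, port=6379, db=0, cert_reqs="none"):
    """Build a rediss:// URL carrying the ssl_cert_reqs query parameter."""
    query = urlencode({"ssl_cert_reqs": cert_reqs})
    return f"rediss://{host}:{port}/{db}?{query}"


# Hypothetical ElastiCache endpoint, for illustration only.
print(tls_redis_url("redis.example.internal"))
```

Note that `ssl_cert_reqs=none` keeps the connection encrypted but skips certificate verification, which is the setting this guide documents.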

---

### 5.3. Elastic Beanstalk EC2 Instance & IAM Roles

- **Instance type:** `t3.large`
- **Instance profile:** `aws-elasticbeanstalk-ec2-role`
- **Service role:** `aws-elasticbeanstalk-service-role`
- **Key pair:** Create an EC2 key pair in your AWS account and attach it to the EB environment when launching the backend. Each developer should use their own key pair.
- **Public IP:** Assigned
- **Security groups:**
  - EB default instance SG
  - Outbound-only egress SG


### 5.4. IAM Roles and Policies

**1. EC2 Instance Profile – `aws-elasticbeanstalk-ec2-role`**
Attached AWS-managed policies (default from EB):
- `AWSElasticBeanstalkWebTier`
- `AWSElasticBeanstalkWorkerTier`
- `AmazonEC2ContainerRegistryReadOnly` (ECR pull)
- `CloudWatchAgentServerPolicy` (log streaming)
- S3 read/write access granted through `AWSElasticBeanstalkWebTier`
(used for EB deployment bundles, log archives, temp artifacts)

This role is used **by the EC2 instance itself**.
It allows the instance to:
- Pull container images from ECR
- Upload logs to CloudWatch
- Read/write to the EB S3 bucket
- Communicate with the ECS agent inside the EB environment

---

**2. Service Role – `aws-elasticbeanstalk-service-role`**
Attached AWS-managed policies (default from EB):
- `AWSElasticBeanstalkEnhancedHealth`
- `AWSElasticBeanstalkService`

This role is used **by the Elastic Beanstalk service**, not the EC2 instance.
It allows EB to:
- Manage environment health monitoring
- Launch/update/terminate EC2 instances
- Interact with Auto Scaling
- Register container tasks and update ECS configuration

---

### Notes on Security / Least Privilege

The current roles use **Elastic Beanstalk’s default managed policies**, which are intentionally broad to ensure environments deploy successfully.

For a production-grade hardened setup, these should eventually be adjusted toward **least privilege**, including:

- Restricting S3 access to only specific buckets
- Restricting ECR access to only required repositories
- Minimizing CloudWatch permissions
- Adding explicit denies on unneeded services

This hardening is recommended once the deployment architecture has stabilized, so it is left as future work.
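
To make the least-privilege direction more concrete, the sketch below generates a policy document that scopes S3 access to one bucket and ECR pulls to named repositories. It is a starting point under stated assumptions, not the policy used by this deployment: the bucket name, account ID, region, and function name `scoped_instance_policy` are hypothetical, and a real instance profile would still need the CloudWatch and ECS permissions described above.

```python
import json


def scoped_instance_policy(bucket, repo_arns):
    """Return an IAM policy dict limited to one bucket and the given ECR repos."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListAppBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "ReadWriteAppObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                # The authorization token action does not support resource scoping.
                "Sid": "EcrAuth",
                "Effect": "Allow",
                "Action": ["ecr:GetAuthorizationToken"],
                "Resource": "*",
            },
            {
                "Sid": "EcrPull",
                "Effect": "Allow",
                "Action": [
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage",
                ],
                "Resource": list(repo_arns),
            },
        ],
    }


# Hypothetical account, region, and bucket values.
prefix = "arn:aws:ecr:us-west-2:123456789012:repository"
policy = scoped_instance_policy(
    "antenna-example-bucket",
    [f"{prefix}/antenna-backend", f"{prefix}/antenna-awscli"],
)
print(json.dumps(policy, indent=2))
```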



---

### 5.5. Networking (EB Environment)

- **VPC:** default VPC
- **Subnets:**
  - EB instance runs in a **public subnet**
  - Redis runs in **private subnets** (via its subnet group)
  - RDS currently uses **public subnets** (see 5.1)
- **Public access:**
  - EB EC2 instance receives a public IP
  - No load balancer (single-instance environment)
- **Connectivity:**
  - EB instance can reach RDS & Redis via SG rules
  - Internet connectivity available through AWS default routing

---

## 6. .ebextensions Configuration

`00_setup.config` handles:

- Loading environment variables into EB
- Setting health check path: `/api/v2/`
- Disabling SSL redirects during health checks (`EB_HEALTHCHECK=1`)
- Running Django migrations via Docker:

      docker exec $(docker ps -q -f name=django) python manage.py migrate --noinput


---

## 7. Deployment Workflow

### Step 1 — Build and push image to ECR

    docker build -t antenna-backend .
    docker tag antenna-backend:latest <ECR_URI>:v10
    docker push <ECR_URI>:v10

Review comment (Collaborator):

Is `v10` an example version number? If so, maybe add a small note about how to think about and use container versions.

Reply (Author):

Yes, `v10` is just an example. I'll add a note explaining how contributors should choose version numbers so it's clear and consistent.
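
Since `v10` is only an example, one possible convention is a monotonically increasing `vN` tag. The sketch below picks the next tag from the tags already present in the repository. This scheme and the helper names `next_version_tag` and `image_ref` are assumptions, because the guide does not define a versioning policy.

```python
import re

_VERSION_TAG = re.compile(r"^v(\d+)$")


def next_version_tag(existing_tags):
    """Return the next vN tag, ignoring tags that do not match the vN pattern."""
    numbers = [
        int(match.group(1))
        for tag in existing_tags
        if (match := _VERSION_TAG.match(tag))
    ]
    return f"v{max(numbers, default=0) + 1}"


def image_ref(repository_uri, tag):
    """Join a repository URI and a tag into a full image reference."""
    return f"{repository_uri}:{tag}"


tag = next_version_tag(["v9", "v10", "latest"])
print(image_ref("<ECR_URI>", tag))
```

Using a new tag for every build, rather than overwriting an existing one, makes it possible to roll back by pointing `Dockerrun.aws.json` at an earlier tag.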

### Step 2 — Update Dockerrun.aws.json

Update the tag:

    "image": "<ECR_URI>:v10"

### Step 3 — Create EB bundle

    zip -r deploy.zip Dockerrun.aws.json .ebextensions .ebignore
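
Where the `zip` CLI is unavailable, the same bundle can be produced from Python. This is a sketch assuming the three bundle members sit in a single root directory. The function name `build_bundle` is hypothetical, and unlike `zip -r` it stores files only, without directory entries.

```python
import zipfile
from pathlib import Path

BUNDLE_MEMBERS = ("Dockerrun.aws.json", ".ebextensions", ".ebignore")


def build_bundle(root, output="deploy.zip", members=BUNDLE_MEMBERS):
    """Zip the deployment files under root, keeping paths relative to root."""
    root = Path(root)
    output_path = root / output
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for member in members:
            path = root / member
            if path.is_dir():
                # Add every file below the directory, in a stable order.
                for file in sorted(p for p in path.rglob("*") if p.is_file()):
                    bundle.write(file, file.relative_to(root).as_posix())
            elif path.is_file():
                bundle.write(path, member)
            else:
                raise FileNotFoundError(f"missing bundle member: {path}")
    return output_path
```

Failing on a missing member is deliberate: an incomplete bundle would otherwise only be noticed after the upload.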


### Step 4 — Deploy to Elastic Beanstalk

- EB Console → Environment → Upload & Deploy
- Upload `deploy.zip`
- Wait for ECS tasks to start

### Step 5 — Validate Deployment

- `/api/v2/` returns `200`
- Django container remains healthy
- Celery worker connects to Redis successfully
- Celery Beat schedules run successfully
- Flower UI loads on port 5555 (if security groups permit)
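
The first check in this list can be scripted. The sketch below requests `/api/v2/` and reports whether it returns `200`. The helper names `http_status` and `api_is_healthy` are hypothetical, and the base URL must be replaced with the environment's own address.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def http_status(url, timeout=10):
    """Return the HTTP status for url, or None if the host is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:  # non-2xx responses still carry a status code
        return error.code
    except URLError:
        return None


def api_is_healthy(base_url, fetch=http_status):
    """True when the API root under base_url answers with 200."""
    return fetch(base_url.rstrip("/") + "/api/v2/") == 200
```

The `fetch` parameter exists so the check can be exercised without a network connection, for example `api_is_healthy("http://example.invalid", fetch=lambda url: 200)`.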

---

## 8. Common Issues & Fixes

### Redis SSL Errors

ElastiCache requires TLS. Missing SSL arguments cause:

    ssl.SSLCertVerificationError

**Fix:**
Use `rediss://` and `ssl_cert_reqs=none`.

### Health Check Redirect Loops

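EB health checks reach the instance over plain HTTP. When `DJANGO_SECURE_SSL_REDIRECT` is enabled, Django answers them with a redirect to HTTPS instead of `200`, so the check does not pass.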

**Fix:**
Set `EB_HEALTHCHECK=1` and temporarily disable SSL redirect for health checks.
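
How the application settings consume `EB_HEALTHCHECK` is not shown in this guide. A minimal sketch of one plausible approach is given below, where the flag overrides the value that would feed Django's `SECURE_SSL_REDIRECT` setting. The helper names `env_flag` and `ssl_redirect_enabled` and the accepted truthy strings are assumptions.

```python
def env_flag(env, name, default=False):
    """Interpret common truthy strings ("1", "true", "yes", "on") as True."""
    value = env.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def ssl_redirect_enabled(env):
    """Disable the HTTPS redirect whenever EB_HEALTHCHECK is set."""
    if env_flag(env, "EB_HEALTHCHECK"):
        return False
    return env_flag(env, "DJANGO_SECURE_SSL_REDIRECT", default=True)
```

In a settings module this could be used as `SECURE_SSL_REDIRECT = ssl_redirect_enabled(os.environ)`.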

### Early Migrations Failure

EB sometimes runs migrations before services are ready.

**Fix:**
The migration command in `.ebextensions` is configured to ignore failures and retry.
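
The retry behaviour can be sketched as a small wrapper that re-runs a command until it succeeds or the attempts run out. This is not the project's actual `.ebextensions` logic. The function name `run_with_retries`, the attempt count, and the delay are assumptions.

```python
import subprocess
import time


def run_with_retries(command, attempts=5, delay=10.0,
                     run=subprocess.run, sleep=time.sleep):
    """Run command until it exits with 0; return True on success, else False."""
    for attempt in range(1, attempts + 1):
        result = run(command, check=False)
        if result.returncode == 0:
            return True
        if attempt < attempts:
            sleep(delay)  # give dependent services time to become ready
    return False
```

A call such as `run_with_retries(["python", "manage.py", "migrate", "--noinput"])` would tolerate a database that is not yet accepting connections on the first attempt.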

---

## 9. Future Improvements

To harden the deployment and move toward a production-grade architecture, the following enhancements are recommended:

- **Move secrets to AWS Secrets Manager**
Centralize all sensitive variables (DB password, Redis URL, Django secret key, Sentry key, SendGrid, etc.) and replace `.ebextensions` injection with runtime retrieval.

- **Enable ElastiCache Multi-AZ + Auto-Failover**
Improves high availability for Celery and Django caching; eliminates single-node Redis failure risks.

- **Restrict RDS and Redis to private-only access**
Disable public accessibility on RDS and ensure Redis remains reachable only via EB’s security group.

- **IAM hardening and least-privilege review**
Replace broad EB-managed policies with reduced IAM policies scoped only to required S3, ECR, CloudWatch, and ECS resources.

- **Add CI/CD pipeline (GitHub Actions -> ECR -> EB)**
Automate build, tag, push of images and deployments to Elastic Beanstalk for consistent, reproducible releases.

- **Add staging environment**
Separate EB environment (staging) for testing migrations, image builds, and infrastructure changes before production.

- **Migrate to load-balanced EB environment (optional)**
Enables rolling deployments, zero-downtime updates, and better scalability.

- **Enable RDS Multi-AZ + automated backups**
Ensures database failover and improves disaster recovery readiness.

- **Add health checks for Celery worker & beat**
Custom EB or CloudWatch alarms to alert on worker failures, broker connectivity issues, or long task queues.
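
For the Secrets Manager item at the top of this list, runtime retrieval could look roughly like the sketch below. It assumes the secret is stored as a single JSON object and that the caller supplies a Secrets Manager client such as the one returned by `boto3.client("secretsmanager")`. The secret name and the helper names `load_secrets` and `export_to_env` are hypothetical.

```python
import json


def load_secrets(client, secret_id):
    """Fetch a JSON-encoded secret and return it as a dict."""
    response = client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


def export_to_env(secrets, env):
    """Copy secrets into env without overwriting values that are already set."""
    for key, value in secrets.items():
        env.setdefault(key, str(value))
    return env


# Usage (requires boto3 and AWS credentials; "antenna/backend" is a made-up name):
#   import boto3, os
#   client = boto3.client("secretsmanager")
#   export_to_env(load_secrets(client, "antenna/backend"), os.environ)
```

Passing the client in as an argument keeps the functions testable with a stub and leaves region and credential handling to the caller.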


---

_End of documentation._
1 change: 1 addition & 0 deletions aws-infra/policies/iam_elastic_beanstalk_policy.json
@@ -0,0 +1 @@

Review comment (Contributor):

⚠️ Potential issue | 🔴 Critical

File is empty and contains no valid JSON policy content.

The IAM policy file is currently empty and fails JSON validation. A valid Elastic Beanstalk IAM policy should follow the structure with "Version" and "Statement" keys, but this file contains none.

Since the PR adds comprehensive AWS infrastructure documentation, this policy file is critical infrastructure-as-code. An empty file cannot be used for actual deployment and contradicts the PR's intent.

Do you want me to generate a starter IAM policy tailored to the Antenna Elastic Beanstalk deployment? Based on the PR objectives mentioning an ECS-based multi-container setup with RDS and ElastiCache, I can provide a policy that aligns with those resources. Please clarify:

  1. Intended scope: Is this policy for the EC2 instance role (what EB instances assume) or for deployment/CI-CD users?
  2. Least-privilege requirement: Should the policy follow least-privilege with scoped resources, or use broader wildcards for initial setup?
  3. Additional integrations: Does it need permissions for ECR, RDS, ElastiCache, CloudWatch, or other services mentioned in the infrastructure guide?
🧰 Tools
🪛 Biome (2.1.2)

[error] 1-1: Expected an array, an object, or a literal but instead found the end of the file.

Expected an array, an object, or a literal here.

(parse)

🤖 Prompt for AI Agents
In aws-infra/policies/iam_elastic_beanstalk_policy.json around line 1, the file
is empty and must be replaced with a valid IAM policy JSON (including "Version"
and "Statement"). Fix by creating a proper JSON policy for the intended role:
decide whether this is an EC2/Elastic Beanstalk instance profile or a
CI/CD/deployer role, pick least-privilege vs wildcard scope, and either (A)
reference AWS-managed Elastic Beanstalk policies (e.g.,
AWSElasticBeanstalkWebTier/WorkerTier plus any needed service policies like
AmazonEC2ContainerRegistryReadOnly, CloudWatchLogsFullAccess, or scoped
RDS/ElastiCache permissions) or (B) author a custom policy JSON with a Version
and Statements granting only the required actions (S3 get/put for deployments,
ecs/ecr describe/pull if ECS tasks used, logs:CreateLogStream/PutLogEvents,
ec2/autoscaling describe and tagging, and scoped RDS/ElastiCache access if
needed). Ensure the final file is valid JSON and add comments in PR describing
scope and resource ARNs used for least-privilege.