DevOps course refactor #1737

Open
wants to merge 6 commits into
base: master
56 changes: 13 additions & 43 deletions devops/README.md
Original file line number Diff line number Diff line change
@@ -2,19 +2,17 @@

## The goal

The course aims to offer in-depth knowledge of DevOps principles and the essential AWS services needed for efficient automation and infrastructure management. Participants will gain practical skills in setting up, deploying, and managing Kubernetes clusters on AWS using tools such as K3s, Terraform, and Jenkins, along with monitoring tools.

## Prerequisites

- The application code should have unit tests and should be runnable (`npm test` or similar).
- Students should have a SonarQube account and be able to run the scanner against their code (`sonar-scanner -Dsonar.projectKey=Your_Project_Key -Dsonar.sources=. -Dsonar.host.url=https://sonarcloud.io/`).
- A Dockerfile for the application.
- Basic knowledge of cloud computing and networking.
- A personal laptop.
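The Dockerfile prerequisite might look like the following sketch for a Node.js application with `npm test`-style unit tests; the base image, ports, and commands are assumptions to adapt to your own application:

```dockerfile
# Hypothetical sketch — adjust the base image, paths, and commands to your app.
FROM node:20-alpine
WORKDIR /app
# Copy manifests first to take advantage of Docker layer caching.
COPY package*.json ./
RUN npm ci
COPY . .
# Optionally run unit tests at build time so a broken build fails early.
RUN npm test
EXPOSE 8080
CMD ["npm", "start"]
```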

## Module 1: Configuration and Resources

### Part 1 (configuration)

- Developing the architecture of the infrastructure with private and public networks.
- Installing the AWS CLI.
- Installing and configuring Terraform.
- Configuring access to AWS via Terraform (API keys, IAM roles).
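The provider configuration step could be sketched as below; the region and version constraint are assumptions, and credentials are expected to come from the environment rather than being hard-coded:

```hcl
# Sketch only — region and version constraint are placeholders, not course values.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "eu-central-1" # assumed region; pick your own
  # Credentials are resolved from the environment (AWS_ACCESS_KEY_ID /
  # AWS_SECRET_ACCESS_KEY), a shared credentials profile, or an assumed IAM role.
}
```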
@@ -46,31 +44,7 @@ The course aims to offer in-depth knowledge of DevOps principles and essential A

## Module 2: Cluster Configuration and Creation

### Part 1 (cluster configuration)

#### Option 1 (kops config)

- Installing Kops on your workstation.
- Creating an IAM user for Kops with the necessary permissions.
- Configuring access to AWS using IAM credentials.

#### Option 2 (k3s config)

- Installing K3s on your local machine.
- Creating an IAM user with the necessary permissions specifically for managing the K3s environment.
- Configuring AWS IAM credentials on the workstation.

### Part 2 (create cluster)

#### Option 1 (kops installation)

- Prepare cluster configuration with kops command.
- Apply kops configuration.
- Validate the cluster.
- Create an account in Kubernetes for Jenkins.

#### Option 2 (k3s installation)

- Installing K3s on your EC2 instances.
- Prepare the K3s cluster configuration.
- Applying the K3s configuration using Terraform and K3s setup commands.
- Validating the cluster to ensure it's correctly configured and operational.
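The validation step can be as simple as checking node and pod status from the server node; the exact invocation depends on how K3s was installed, so treat this as a sketch:

```shell
# On the K3s server node (paths assume a default K3s install).
sudo k3s kubectl get nodes -o wide   # all nodes should report Ready
sudo k3s kubectl get pods -A         # system pods should be Running
```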
@@ -89,42 +63,38 @@ The course aims to offer in-depth knowledge of DevOps principles and essential A
- Installing the necessary plugins in Jenkins (SonarQube, Docker).
- Set up the Kubernetes-related plugins in Jenkins, such as the Kubernetes plugin, and configure them with the endpoints and credentials for your cluster.

### Part 2 (Create Pipeline)
### Part 2 (Create Helm chart)

- Create a Helm chart for the [given application](./flask_app). The chart should contain templates for all necessary Kubernetes resources, such as Deployments, together with health checks (liveness and readiness probes).
- Check that the application works as expected.
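The probe wiring in the Deployment template might look like this fragment; the port and paths are assumptions to adjust to the Flask app:

```yaml
# Fragment of templates/deployment.yaml — a sketch, not a complete template.
containers:
  - name: {{ .Chart.Name }}
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
    ports:
      - containerPort: 8080       # assumed application port
    livenessProbe:                # restart the container if the app hangs
      httpGet:
        path: /
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:               # gate traffic until the app responds
      httpGet:
        path: /
        port: 8080
      initialDelaySeconds: 3
      periodSeconds: 5
```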

### Part 3 (Create Pipeline)

- Create pipeline, add steps:
1. Build application.
2. Unit tests.
3. SonarQube check.
4. Build and push docker image to ECR.
5. Deploy docker image to Kubernetes cluster.
- Store your Helm chart in a source control system accessible from Jenkins.
- In the deployment stage of your Jenkinsfile, add steps to deploy the application using Helm.
- After the deployment, verify that the application is running as expected, e.g. by checking the status of the Kubernetes Deployment, running integration tests, or hitting a health check endpoint.
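The pipeline steps above could be sketched as a declarative Jenkinsfile; every identifier here (server names, registry URL, chart path) is a placeholder assumption, not part of the course material:

```groovy
// Sketch of the pipeline — all names and URLs are hypothetical placeholders.
pipeline {
    agent any
    environment {
        ECR_REGISTRY = '<account-id>.dkr.ecr.<region>.amazonaws.com' // placeholder
        IMAGE = "${ECR_REGISTRY}/flask-app:${env.BUILD_NUMBER}"
    }
    stages {
        stage('Build')      { steps { sh 'pip install -r requirements.txt' } }
        stage('Unit tests') { steps { sh 'pytest' } } // assumes pytest-based tests
        stage('SonarQube') {
            // 'sonarqube' is the server name configured in Jenkins (assumed).
            steps { withSonarQubeEnv('sonarqube') { sh 'sonar-scanner' } }
        }
        stage('Build & push image') {
            steps { sh "docker build -t ${IMAGE} . && docker push ${IMAGE}" }
        }
        stage('Deploy') {
            // Chart path and release name are assumptions.
            steps { sh "helm upgrade --install flask-app ./helm/flask-app --set image.tag=${env.BUILD_NUMBER}" }
        }
    }
}
```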

## Module 4: Monitoring with Prometheus and Grafana

### Prometheus
- Using Helm to install Prometheus in Kubernetes.
- Configuring Prometheus to collect metrics from the cluster.
- Creating and configuring Service Monitor to track services in the cluster.
- Configuring alert rules in Prometheus for monitoring critical events.
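An alert rule for a critical event could be defined as a `PrometheusRule` resource (assuming the Prometheus Operator / kube-prometheus-stack chart is used); the metric expression and threshold are illustrative assumptions:

```yaml
# Sketch of a PrometheusRule — names and thresholds are illustrative.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cluster-critical-alerts
spec:
  groups:
    - name: node.rules
      rules:
        - alert: NodeHighMemory
          # Fires when node memory usage stays above 90% (node-exporter metrics).
          expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.9
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Node memory usage above 90% for 5 minutes"
```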

### Grafana
- Deploying Grafana in Kubernetes using Helm.
- Setting up secure access to Grafana via Ingress or LoadBalancer.
- Configuring Grafana to connect to Prometheus as a data source.
- Importing or creating dashboards to visualize metrics from Prometheus.
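The data source connection can be provisioned through the Grafana Helm chart values instead of the UI; the Prometheus service URL below is an assumption that depends on your release name and namespace:

```yaml
# Sketch of Grafana chart values for data source provisioning.
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-server.monitoring.svc.cluster.local # assumed URL
        access: proxy
        isDefault: true
```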

### Alerting Management
- Conducting tests to verify the collection of metrics and their display in Grafana.
- Simulating failures or high loads to test configured alerts.

## Useful links

- [Kubernetes with Kops](https://blog.kubecost.com/blog/kubernetes-kops/)
- [K3s AWS Terraform Cluster](https://garutilorenzo.github.io/k3s-aws-terraform-cluster/)
8 changes: 8 additions & 0 deletions devops/flask_app/README.md
@@ -0,0 +1,8 @@
To run this application, use a Docker base image with Python 3.9+.
Install the requirements with `pip install -r requirements.txt`.

Run the application with:
```
FLASK_APP=main.py flask run --host=0.0.0.0 --port=8080
```
8 changes: 8 additions & 0 deletions devops/flask_app/main.py
@@ -0,0 +1,8 @@
from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello():
return 'Hello, World!'
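If the Helm chart from Module 3 uses HTTP probes, the app can expose a dedicated endpoint for them. This is a sketch extending the app above; `/healthz` is a conventional name, not something the course mandates:

```python
from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello():
    return 'Hello, World!'


@app.route('/healthz')
def healthz():
    # Liveness/readiness probes only need a 200 response.
    return 'ok', 200
```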
1 change: 1 addition & 0 deletions devops/flask_app/requirements.txt
@@ -0,0 +1 @@
Flask
41 changes: 25 additions & 16 deletions devops/modules/1_basic-configuration/task_1.md
@@ -1,15 +1,18 @@
# Task 1: AWS Account Configuration

![task_1 schema](../../visual_assets/task_1.png)
## Objective

In this task, you will:

- Install and configure the required software on your local computer
- Set up an AWS account with the necessary permissions and security configurations
- Deploy S3 buckets for Terraform states
- Create a Github Actions workflow to deploy infrastructure in AWS

Additional tasks:
- Create a federation with your AWS account for Github Actions
- Create an IAM role for Github Actions


## Steps

@@ -43,10 +46,11 @@ In this task, you will:

5. **Create a bucket for Terraform states**

- Locking the Terraform state via DynamoDB is not required in this task, but is recommended as a best practice.
- [Managing Terraform states Best Practices](https://spacelift.io/blog/terraform-s3-backend)
- [Terraform backend S3](https://developer.hashicorp.com/terraform/language/backend/s3)
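The backend configuration referenced above could be sketched as follows; the bucket name, key, and region are placeholders, not values from the course:

```hcl
# Sketch of an S3 backend block — bucket/key/region are placeholders.
terraform {
  backend "s3" {
    bucket  = "my-terraform-states" # assumed bucket name
    key     = "basic-configuration/terraform.tfstate"
    region  = "eu-central-1"        # assumed region
    encrypt = true
    # Optional but recommended: state locking via DynamoDB.
    # dynamodb_table = "terraform-locks"
  }
}
```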

6. **Create an IAM role for Github Actions**
6. **Create an IAM role for Github Actions (Additional task) πŸ’«**

- Create an IAM role `GithubActionsRole` with the same permissions as in step 2:
- AmazonEC2FullAccess
@@ -58,17 +62,17 @@ In this task, you will:
- AmazonEventBridgeFullAccess
- [Terraform resource](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role)

7. **Configure an Identity Provider and Trust policies for Github Actions**
7. **Configure an Identity Provider and Trust policies for Github Actions (Additional task) πŸ’«**

- Update the `GithubActionsRole` IAM role with a Trust policy following the guides below
- [IAM roles terms and concepts](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html#id_roles_terms-and-concepts)
- [Github tutorial](https://docs.github.com/en/actions/security-for-github-actions/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services)
- [AWS documentation on OIDC providers](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-idp_oidc.html#idp_oidc_Create_GitHub)
- `GitHubOrg` is a Github `username` in this case

8. **Create a Github Actions workflow for deployment via Terraform**
- The workflow should have 3 jobs that run on pull request and push to the default branch:
- `terraform-check` with format checking using [terraform fmt](https://developer.hashicorp.com/terraform/cli/commands/fmt)
- `terraform-plan` for planning deployments [terraform plan](https://developer.hashicorp.com/terraform/cli/commands/plan)
- `terraform-apply` for deploying [terraform apply](https://developer.hashicorp.com/terraform/cli/commands/apply)
- [terraform init](https://developer.hashicorp.com/terraform/cli/commands/init)
@@ -77,20 +81,24 @@ In this task, you will:
- [Configure AWS Credentials](https://github.com/aws-actions/configure-aws-credentials)
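The three-job workflow described in step 8 could be sketched as below; the role ARN, region, and action versions are assumptions to verify against your setup:

```yaml
# Sketch of the workflow — role ARN and region are placeholders.
name: terraform
on:
  push:
    branches: [main]
  pull_request:

permissions:
  id-token: write   # required for OIDC federation (additional task)
  contents: read

jobs:
  terraform-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive

  terraform-plan:
    needs: terraform-check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::<account-id>:role/GithubActionsRole # placeholder
          aws-region: eu-central-1 # assumed region
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init && terraform plan

  terraform-apply:
    if: github.ref == 'refs/heads/main'
    needs: terraform-plan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::<account-id>:role/GithubActionsRole # placeholder
          aws-region: eu-central-1
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init && terraform apply -auto-approve
```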

## Submission

- Create a branch `task_1` from `main` branch in your repository.
- [Create a Pull Request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) (PR) from `task_1` branch to `main`.
- Provide the code for Terraform and GitHub Actions in the PR.
- Provide screenshots of `aws --version` and `terraform version` in the PR description.
- Provide a link to the Github Actions workflow run in the PR description.
- Provide the Terraform plan output with S3 bucket (and possibly additional resources) creation in the PR description.

## Evaluation Criteria (100 points for covering all criteria)

1. **MFA User configured (10 points)**

   - A screenshot of the non-root account secured by MFA is provided (ensure sensitive information is not shared)

2. **Bucket and GithubActionsRole IAM role configured (20 points)**

- Terraform code is created and includes:
- A bucket for Terraform states
- IAM role with correct Identity-based and Trust policies
- Provider initialization
- Creation of S3 Bucket

3. **Github Actions workflow is created (30 points)**

@@ -103,11 +111,12 @@ Ensure that the AWS CLI and Terraform installations are verified using `aws --ve

5. **Verification (10 points)**

   - Terraform plan is executed successfully

6. **Additional Tasks (20 points) πŸ’«**
- **Documentation (5 points)**
  - Document the infrastructure setup and usage in a README file.
- **Submission (5 points)**
- A GitHub Actions (GHA) pipeline is passing
- **Secure authorization (10 points)**
- IAM role with correct Identity-based and Trust policies used to connect GitHubActions to AWS.
26 changes: 14 additions & 12 deletions devops/modules/1_basic-configuration/task_2.md
@@ -1,4 +1,5 @@
# Task 2: Basic Infrastructure Configuration
![task_2 schema](../../visual_assets/task_2.png)

## Objective

@@ -27,19 +28,20 @@ In this task, you will write Terraform code to configure the basic networking in
- Execute `terraform plan` to ensure the configuration is correct.
- Provide a resource map screenshot (VPC -> Your VPCs -> your_VPC_name -> Resource map).

4. **Additional Tasks πŸ’«**
- Implement security groups.
- Create a bastion host for secure access to the private subnets.
   - Organize NAT for the private subnets, so that instances in them can connect to the outside world:
     - Simpler way: create a NAT Gateway
     - Cheaper way: configure a NAT instance in the public subnet
- Document the infrastructure setup and usage in a README file.
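The "simpler way" above could look like this Terraform sketch; all resource names (`aws_vpc.main`, `aws_subnet.public`) refer to hypothetical resources in your own configuration:

```hcl
# Sketch: NAT Gateway in a public subnet plus a private route table.
resource "aws_eip" "nat" {
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id # assumes a public subnet resource
}

resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id # assumes aws_vpc.main exists

  # Send all non-local traffic from private subnets through the NAT Gateway.
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }
}
# Route table associations with the private subnets are assumed elsewhere.
```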

## Submission
- Create `task_2` branch from `main` in your repository.
- [Create a Pull Request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request) (PR) with the Terraform code in your repository from `task_2` to `main`.
- Provide screenshots of a resource map screenshot (VPC -> Your VPCs -> your_VPC_name -> Resource map) in the PR description.
- (Optional) Set up a GitHub Actions (GHA) pipeline for the Terraform code.

## Evaluation Criteria (100 points for covering all criteria)

1. **Terraform Code Implementation (50 points)**
@@ -51,7 +53,7 @@ In this task, you will write Terraform code to configure the basic networking in
- Internet Gateway
- Routing configuration:
- Instances in all subnets can reach each other
     - Instances in public subnets can reach addresses outside the VPC and vice versa

2. **Code Organization (10 points)**

@@ -63,14 +65,14 @@ In this task, you will write Terraform code to configure the basic networking in
- Terraform plan is executed successfully.
- A resource map screenshot is provided (VPC -> Your VPCs -> your_VPC_name -> Resource map).

4. **Additional Tasks (30 points) πŸ’«**
- **Security Groups and Network ACLs (5 points)**
- Implement security groups and network ACLs for the VPC and subnets.
- **Bastion Host (5 points)**
- Create a bastion host for secure access to the private subnets.
- **NAT is implemented for private subnets (10 points)**
     - Organize NAT for the private subnets in either the simpler or the cheaper way
     - Instances in private subnets should be able to reach addresses outside the VPC
- **Documentation (5 points)**
- Document the infrastructure setup and usage in a README file.
- **Submission (5 points)**
11 changes: 2 additions & 9 deletions devops/modules/2_cluster-configuration/README.md
@@ -6,16 +6,9 @@ In this module you need to configure a K8s cluster on top of the network infrast

## K8s deployment and configuration

There are multiple ways of deploying a K8s cluster on AWS. In this course you're supposed to use k3s (https://k3s.io/).

Make sure you're using AWS EC2 instance types from the Free Tier to avoid additional expenses (see https://aws.amazon.com/free for more details).

**This task is considered as done if all the conditions below are met:**
