aiwithqasim/cloud-data-engineering

CLOUD DATA ENGINEERING


Welcome to the Cloud Data Engineering course! This comprehensive 6–8 month journey is designed to equip you with the necessary skills to become a proficient Data Engineer, focusing on cloud-based technologies, data acquisition, modeling, warehousing, and orchestration.

Our curriculum is divided into 5 modules that include hands-on projects, assignments, and real-world case studies to ensure a practical understanding of the technologies covered.


This repository contains the roadmap for Data Engineering. Since Data Engineering is a broad field, the course concentrates on a core set of tools: Python, SQL, Snowflake, Airflow, Kafka, and AWS.


📑 Table of Contents

  1. Course Overview
  2. Understanding Data Engineering
  3. Module 1: Data Acquisition
  4. Module 2: Data Modeling
  5. Module 3: Cloud Data Warehousing
  6. Module 4: Data Orchestration & Streaming
  7. Module 5: Architecting AWS Data Engineering Projects
  8. Why These Technologies?
  9. Final Notes

🚀 Course Overview

This course is meticulously crafted to cover all facets of Cloud Data Engineering.
You'll learn everything from the basics of data acquisition and transformation to advanced cloud-based data warehousing, orchestration, and streaming techniques.

The course is structured to build your skills progressively, ensuring you are job-ready to tackle complex data engineering challenges by the end.

📖 Understanding Data Engineering

Before diving deep, one should know:

  • What is Data Engineering?
  • What is the scope of Data Engineering in 2025 and beyond?
  • What tools are required for a modern Data Engineer?

📂 Understanding Data Engineering (PPT)

📦 Module 1: Data Acquisition (Python, Docker, Git)

Overview

The focus of this module is on acquiring, manipulating, and processing data from various sources.
You’ll set up your data engineering environment, explore Python, manage projects with Git, and gain hands-on experience with web scraping using BeautifulSoup and Selenium.

➡️ Includes projects like:

  • ETL with Python
  • Netflix Data Analysis
  • GitHub History (Scala)
  • Security Log Analysis, etc.
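
As a rough illustration of what an "ETL with Python" project involves, here is a minimal sketch using only the standard library. The sample rows, the `titles` table, and the function names are hypothetical and not taken from the course material; SQLite stands in for whatever target database a real project would use.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read a file or an API response.
RAW_CSV = """title,type,release_year
Show A,TV Show,2019
Movie B,Movie,
Movie C,Movie,2021
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows without a release year and cast the year to int."""
    cleaned = []
    for row in rows:
        year = row["release_year"].strip()
        if not year:
            continue  # skip incomplete records
        cleaned.append((row["title"].strip(), row["type"].strip(), int(year)))
    return cleaned


def load(records, conn):
    """Insert cleaned records into a SQLite table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS titles "
        "(title TEXT, type TEXT, release_year INTEGER)"
    )
    conn.executemany("INSERT INTO titles VALUES (?, ?, ?)", records)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM titles").fetchone()[0]


def run_etl(text=RAW_CSV):
    """Run extract -> transform -> load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        return load(transform(extract(text)), conn)
    finally:
        conn.close()
```

Calling `run_etl()` loads the two complete rows and skips the one with a missing year. Keeping the three stages as separate functions makes each one easy to test on its own.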

🗄️ Module 2: Data Modeling (SQL)

Dive into database design, SQL querying, optimization, and ETL pipelines.

📌 Covers:

  • SQL Server setup
  • Joins, aggregations, window functions
  • Stored procedures, triggers, optimization
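
To make the join, aggregation, and window-function topics concrete, here is a small sketch. It runs against an in-memory SQLite database through Python's `sqlite3` module rather than SQL Server, purely so that it is self-contained; the `customers` and `orders` tables and their contents are hypothetical. Window functions require SQLite 3.25 or newer.

```python
import sqlite3


def _connect_with_sample_data():
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bo');
        INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 70.0), (3, 2, 20.0);
    """)
    return conn


def totals_per_customer():
    """Join + GROUP BY: one summary row per customer."""
    conn = _connect_with_sample_data()
    rows = conn.execute("""
        SELECT c.name, COUNT(*) AS order_count, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()
    conn.close()
    return rows


def ranked_orders():
    """Join + window functions: every order keeps its row, plus per-customer context."""
    conn = _connect_with_sample_data()
    rows = conn.execute("""
        SELECT c.name,
               o.amount,
               SUM(o.amount) OVER (PARTITION BY c.id) AS customer_total,
               RANK() OVER (PARTITION BY c.id ORDER BY o.amount DESC) AS amount_rank
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        ORDER BY c.name, amount_rank
    """).fetchall()
    conn.close()
    return rows
```

The difference worth noticing is that `GROUP BY` collapses each customer's orders into one row, while the `OVER (PARTITION BY ...)` clauses keep every order and attach the per-customer total and rank alongside it.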

➡️ Includes projects like:

  • ETL pipeline with Python + Pandas + SQL

❄️ Module 3: Cloud Data Warehousing (Snowflake)

Master Snowflake Cloud Data Warehousing through hands-on badges, a Udemy masterclass, and real-time projects.

📌 Includes official Snowflake badges:

  • Data Warehousing Workshop
  • Collaboration & Marketplace
  • Data Application Builders
  • Data Lake Workshop
  • Data Engineering Workshop

➡️ Includes projects like:

  • Snowflake Real Time Data Warehouse For Beginners
  • Batch pipeline using AWS S3, Lambda, EventBridge, and Snowflake for currency exchange rates
  • Real-time Snowflake Data Warehouse, Change Data Capture with AWS

⏳ Module 4: Data Orchestration & Streaming (Airflow, Kafka)

  • Apache Airflow for orchestration of ETL pipelines
  • Apache Kafka for real-time data streaming and decoupling producers/consumers
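
The idea of decoupling producers from consumers can be sketched without Kafka itself. The toy model below is an assumption-laden illustration, not the Kafka client API: `TopicLog` and `Consumer` are hypothetical classes that mimic an append-only log in which each consumer tracks its own read offset.

```python
class TopicLog:
    """In-memory, append-only log standing in for a single Kafka partition."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Producer side: append a record and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Consumer side: return up to max_records starting at the given offset."""
        return self._records[offset:offset + max_records]


class Consumer:
    """Keeps its own offset, so consumers do not affect each other or the producer."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self, max_records=10):
        """Fetch the next batch and advance this consumer's offset."""
        batch = self._log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


def demo():
    """Two consumers read the same log at different speeds."""
    log = TopicLog()
    log.append("tick-1")
    log.append("tick-2")

    fast, slow = Consumer(log), Consumer(log)
    first_fast = fast.poll()               # reads everything available so far
    log.append("tick-3")                   # the producer never waits for consumers
    first_slow = slow.poll(max_records=1)  # a slow consumer starts from offset 0
    second_fast = fast.poll()              # the fast consumer picks up only new data
    return first_fast, first_slow, second_fast
```

Because records stay in the log and each consumer only moves its own offset, a slow consumer neither blocks the producer nor causes a faster consumer to miss or re-read data.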

➡️ Includes projects like:

  • Twitter Data Pipeline
  • Stock Market Analysis
  • Airflow on AWS EC2

☁️ Module 5: Architecting AWS Data Engineering Projects

Dive deep into the AWS ecosystem for data engineering:

📌 Covers:

  • S3, Redshift, Glue, Athena, Lambda, Kinesis, RDS, EMR

➡️ Projects:

  • Batch Data Pipeline (S3 + Lambda + CloudWatch)
  • ETL pipeline with Glue & Athena
  • Real-time streaming with Kinesis
  • End-to-End AWS Data Engineering
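
For the S3 + Lambda batch pipeline, the entry point is typically a handler that receives an S3 event notification. The sketch below only parses that event and does not call AWS; the bucket and key in the sample event are hypothetical, and the processing step is left as a comment because the course's actual implementation is not shown here.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect (bucket, key) pairs from an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append({"bucket": bucket, "key": key})
    # A real pipeline would fetch and process each object here (e.g. with boto3).
    return {"processed": len(objects), "objects": objects}


# Hypothetical event, trimmed to the fields the handler reads.
SAMPLE_EVENT = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-raw-bucket"},
                "object": {"key": "raw/rates+2024.csv"},
            }
        }
    ]
}
```

Invoking `lambda_handler(SAMPLE_EVENT, None)` locally is a cheap way to check the parsing logic before deploying. Anything the handler prints or logs ends up in CloudWatch Logs once it runs in Lambda.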

❓ Why These Technologies?

The chosen technologies (Python, SQL, Snowflake, Airflow, Kafka, AWS) are among the most in-demand in industry, so working through them prepares you for data engineering roles by the end of this course.

Each module builds on the previous one, reinforcing theory with practical projects.

📝 Final Notes

Throughout this course, you will engage in hands-on projects, assignments, and case studies that simulate real-world data engineering challenges.

⚡ Get ready to embark on this exciting journey of becoming a proficient Cloud Data Engineer! 🚀

