This repo contains details related to data engineering tech stacks in GCP
Updated Jan 20, 2025 - Jupyter Notebook
Apache Beam demo projects
Batch ETL on GCP using Cloud Composer, Google Cloud Storage, Dataflow, and Cloud Build
For those who want to create a data pipeline with Apache Beam, Google Dataflow, and BigQuery.
🤖 Apache Beam RunInference API sample
Apache Beam and EDA Projects: Showcasing real-time data processing with Apache Beam, interactive visualizations with D3.js, and automated EDA with Sweetviz and PyCaret. Includes Jupyter notebooks and outputs for learning and exploration.
Set up your Python development environment, get the Apache Beam SDK for Python, and run an example pipeline.
Read the step-by-step tutorial here: https://frazynondo.medium.com/etl-with-gcp-part-i-apache-beam-eclipse-gcs-and-bigquery-dc9529ee7f19
Sample code to build big data pipelines (batch and streaming) using Apache Beam in Python
This is a small Dataflow job that receives a message via Pub/Sub every time someone accesses a shortened URL. It accumulates the items using a fixed time window, groups them by ID, and updates Firestore with the number of clicks.
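The fixed-window grouping described above can be illustrated with a minimal plain-Python sketch. It is not the repository's code and does not use Apache Beam, Pub/Sub, or Firestore. The `Click` type, the `url_id` field, and the 60-second window size are all hypothetical, chosen only to show how events fall into fixed windows and get counted per ID.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Click(NamedTuple):
    url_id: str       # hypothetical identifier of the shortened URL
    timestamp: float  # event time, in seconds


def window_start(timestamp: float, size: float) -> float:
    """Return the start of the fixed window that contains `timestamp`."""
    return timestamp - (timestamp % size)


def count_clicks(clicks: Iterable[Click], size: float = 60.0) -> dict:
    """Count clicks per (window start, url_id) pair."""
    counts = Counter()
    for click in clicks:
        counts[(window_start(click.timestamp, size), click.url_id)] += 1
    return dict(counts)


# Assumed sample events: three fall in the window starting at 0 s, one at 60 s.
clicks = [
    Click("abc", 5.0),
    Click("abc", 42.0),
    Click("xyz", 59.9),
    Click("abc", 61.0),
]
result = count_clicks(clicks, size=60.0)
```

In a real streaming job, each per-window count would then be applied as an increment to the stored click total for that ID. That write step is left out here.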
Personal Apache Beam studies repository
Projects built while learning about databases
Project to analyze images using different computer vision algorithms, in order to extract all the information from an image
Data Engineering Using Google Cloud Platform and Mage