This repository contains an ETL (Extract, Transform, Load) pipeline that retrieves weather data from the OpenWeatherMap API, stores it in a PostgreSQL database, and uses Apache Airflow for orchestration.
The ETL pipeline performs the following steps:
- Extract: The pipeline calls the OpenWeatherMap API to retrieve weather data for a specified location.
- Transform: The extracted data is cleaned and converted into a consistent structure that is ready to load.
- Load: The transformed data is stored in a PostgreSQL database for future analysis and visualization.
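
The three steps above can be sketched as plain Python functions. This is a minimal illustration, not the repository's actual code: the function names, the selected fields, and the `weather_observations` table are assumptions made for the example. The endpoint and response fields follow OpenWeatherMap's current-weather API, and `load` expects any DB-API cursor that accepts named `%(name)s` parameters, such as one from psycopg2.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Current-weather endpoint of the OpenWeatherMap API.
API_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_request_url(city, api_key):
    """Extract step: build the request URL for one location."""
    query = urlencode({"q": city, "appid": api_key, "units": "metric"})
    return f"{API_URL}?{query}"


def transform(payload):
    """Transform step: flatten the nested API response into one flat row."""
    return {
        "city": payload["name"],
        "observed_at": datetime.fromtimestamp(payload["dt"], tz=timezone.utc),
        "temperature_c": payload["main"]["temp"],
        "humidity_pct": payload["main"]["humidity"],
        "description": payload["weather"][0]["description"],
    }


# Hypothetical table and column names; adjust them to the real schema.
INSERT_SQL = (
    "INSERT INTO weather_observations "
    "(city, observed_at, temperature_c, humidity_pct, description) "
    "VALUES (%(city)s, %(observed_at)s, %(temperature_c)s, "
    "%(humidity_pct)s, %(description)s)"
)


def load(cursor, row):
    """Load step: insert one row through a DB-API cursor."""
    cursor.execute(INSERT_SQL, row)


# A trimmed sample response, used here instead of a live API call.
sample = {
    "name": "London",
    "dt": 1700000000,
    "main": {"temp": 11.5, "humidity": 82},
    "weather": [{"description": "light rain"}],
}
row = transform(sample)
```

Keeping the transform step free of network and database access means it can be checked against a saved response like `sample` without an API key or a running database.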
The pipeline is orchestrated with Apache Airflow, a platform for programmatically authoring, scheduling, and monitoring workflows. An Airflow DAG (Directed Acyclic Graph) defines the tasks and their dependencies so that they run in the correct order.
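
To illustrate how task dependencies determine execution order, the sketch below resolves a three-task graph with Python's standard-library `graphlib`. It is not Airflow code and does not use Airflow's API; the task names are assumptions that mirror the ETL steps described earlier.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Map each task to the set of tasks that must finish before it starts.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# A topological sort yields an order that respects every dependency.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)  # ['extract', 'transform', 'load']
```

If the dependencies formed a cycle, for example `extract` waiting on `load`, `static_order()` would raise `graphlib.CycleError`. That is why the graph has to be acyclic: a cycle has no valid execution order.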
Before running the pipeline, make sure you have the following set up:
- OpenWeatherMap API Key: Create an account on the OpenWeatherMap website and generate an API key.
- PostgreSQL Database: Set up a PostgreSQL database where the weather data will be stored.
- Apache Airflow: Install and configure Apache Airflow according to the official documentation.
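
One way to confirm the first two prerequisites before a run is to read them from environment variables and fail early if any are missing. The sketch below assumes that approach; the variable names `OPENWEATHERMAP_API_KEY` and `WEATHER_DATABASE_URL` and the `Settings` class are hypothetical and not defined by this repository.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    database_url: str


# Hypothetical variable names; rename them to match your deployment.
REQUIRED_VARS = ("OPENWEATHERMAP_API_KEY", "WEATHER_DATABASE_URL")


def load_settings(env=os.environ):
    """Return the settings, or raise if a required variable is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        api_key=env["OPENWEATHERMAP_API_KEY"],
        database_url=env["WEATHER_DATABASE_URL"],
    )
```

Calling `load_settings()` at the start of a run reports a missing key or connection string immediately, rather than partway through the extract or load step.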