arrismo/epic-etf

epic-etf

ETF Performance Insights & Comparison using DuckDB and Power BI

This project creates a pipeline for analyzing ETF (Exchange-Traded Fund) performance data. It pulls historical ETF price data from Kaggle, processes it using DuckDB, and visualizes the results in Power BI.

The analysis covers two groups of ETFs:

  1. Bond ETFs (AGG, BIL, BND, EDV, IEF, SGOV, SHV, SHY, TLH, TLT, VGIT)
  2. Equity ETFs (IEFA, IEMG, IJH, IJR, IVV, IWF, IWM, QQQ, RSP, SCHD, SPY, etc.)

Data Pipeline

  1. Data Import
  • Uses the Kaggle API to fetch 10 years of historical ETF data
  • Downloads both the bond and equity ETF datasets
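
The import step could look roughly like the sketch below, which uses the official `kaggle` Python package. The function name, the `data` destination directory, and the dataset slug passed in are all hypothetical; the README does not name the Kaggle datasets it pulls from. Kaggle API credentials are assumed to be configured already.

```python
def download_kaggle_dataset(dataset_slug, dest_dir="data"):
    """Download and unzip one Kaggle dataset into dest_dir (illustrative helper)."""
    # Kaggle dataset slugs have the form "owner/dataset-name".
    if dataset_slug.count("/") != 1:
        raise ValueError(f"expected 'owner/dataset-name', got {dataset_slug!r}")

    # Imported lazily because the kaggle package expects API credentials
    # (~/.kaggle/kaggle.json or KAGGLE_USERNAME / KAGGLE_KEY) to be present.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()
    # unzip=True extracts the CSV files instead of leaving a .zip archive.
    api.dataset_download_files(dataset_slug, path=dest_dir, unzip=True)
    return dest_dir
```

Calling the helper once per dataset (one for bonds, one for equities) would mirror the two downloads described above.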

  2. Database Creation
  • Creates a DuckDB database (bonds.db)
  • Imports the CSV data into database tables
  • Creates two main tables:
      bonds: raw bond ETF data
      equity: raw equity ETF data

  3. Data Transformation

The SQL transformations calculate:
  • Daily price changes
  • Monthly and yearly aggregations
  • Percentage changes
  • Cumulative returns since inception

Creates two analysis tables:
  • bonds_analysis
  • equity_analysis
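
To make the arithmetic behind these metrics concrete, here is a small plain-Python sketch of the daily change, percentage change, and cumulative return calculations. It is not the project's SQL, which is not shown in this README; the function name and output field names are illustrative, and the input is assumed to be one ETF's closing prices ordered oldest first.

```python
def daily_metrics(closes):
    """Compute per-day change, percent change, and cumulative return.

    closes: closing prices for a single ETF, ordered oldest first.
    """
    if not closes:
        return []
    first = closes[0]  # price at inception, the base for cumulative return
    rows = []
    prev = None
    for close in closes:
        if prev is None:
            change = None       # no previous day to compare against
            pct_change = None
        else:
            change = close - prev
            pct_change = (close - prev) / prev * 100
        rows.append({
            "close": close,
            "change": change,
            "pct_change": pct_change,
            # Total percentage gain or loss relative to the first price.
            "cumulative_return_pct": (close / first - 1) * 100,
        })
        prev = close
    return rows
```

For prices of 100, 110, and 99, the daily percentage changes are +10% and -10%, while the cumulative return ends at about -1%, which shows why the two measures are tracked separately.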

  4. Data Export
  • Exports the processed data to CSV files in the cleaned_data directory
  • The files are then ready for import into Power BI

  5. Visualization
  • Connect Power BI to the exported CSV files
  • Create interactive dashboards and reports

Example Visualizations

[Screenshots of two example dashboards]

Future Improvements

  1. Include Airflow for orchestration
  2. Include machine learning for predicting stock prices
