ETF Performance Insights & Comparison using DuckDB and Tableau
This project creates a pipeline for analyzing ETF (Exchange-Traded Fund) performance data. It pulls historical ETF price data from Kaggle, processes it using DuckDB, and visualizes the results in Power BI.
- Bond ETFs (AGG, BIL, BND, EDV, IEF, SGOV, SHV, SHY, TLH, TLT, VGIT)
- Equity ETFs (IEFA, IEMG, IJH, IJR, IVV, IWF, IWM, QQQ, RSP, SCHD, SPY, etc.)
- Data Import
Uses Kaggle API to fetch 10 years of historical ETF data Downloads both bond and equity ETF datasets
- Database Creation
Creates a DuckDB database (bonds.db) Imports CSV data into database tables Creates two main tables:
bonds: Raw bond ETF data equity: Raw equity ETF data
- Data Transformation The SQL transformations calculate:
- Daily price changes
- Monthly and yearly aggregations
- Percentage changes
- Cumulative returns since inception
Creates two analysis tables:
bonds_analysis equity_analysis
- Data Export
Exports processed data to CSV files in the cleaned_data directory Ready for import into Power BI
- Visualization
Connect Power BI to the exported CSV files Create interactive dashboards and reports
- Include Airflow for orchestration
- Include Machine learning for predicting stock prices