Investigating Climate Change Through Buoy Data
A science fair project by Jason Fischer
Copyright 2019
In the media, there is a lot of discussion about global warming, but not as much information on how it is computed. I was interested in this, so I decided to do a science experiment to explore how to calculate climate change. After studying data from as far back as 1880, NASA’s Earth Observatory concludes that the average global temperature is rising by between 0.15 and 0.20℃ per decade.
The National Oceanic and Atmospheric Administration (NOAA) has buoys out at sea that monitor air temperature (referred to below as ATMP), water temperature (referred to below as WTMP), and other values; these two are the important ones for this project. The buoys are spread across the world, and their data has been published from as far back as 1979. The data is freely available on the web as text files. Using the data from NOAA, I can use code to determine the average, minimum, maximum, and other statistics over time for each buoy. I have selected four buoys off the coast of the continental United States to analyze: one off California, one off Louisiana, one off Maine, and one off Virginia.
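As a rough illustration of the kind of summary described above, the following Python sketch computes the mean, minimum, and maximum of one column from a small block of buoy-style text. It is not the project's preprocessor: the sample rows are made up, and the header layout, the 999.0 missing-value placeholder, and the `summarize` helper are assumptions chosen for illustration.

```python
import statistics

# Made-up sample in an assumed layout: a row of column names, a row of
# units, then whitespace-separated readings.
SAMPLE = """\
#YY  MM DD hh mm ATMP  WTMP
#yr  mo dy hr mn degC  degC
2018 01 01 00 50 10.2  12.5
2018 01 01 01 50  9.8  12.4
2018 01 01 02 50 999.0 12.6
2018 01 01 03 50 11.0  999.0
"""

MISSING = 999.0  # assumed placeholder for a missing reading


def summarize(text, column):
    """Return (mean, min, max) of one column, skipping missing readings."""
    lines = text.splitlines()
    header = lines[0].lstrip("#").split()
    index = header.index(column)
    values = []
    for line in lines[1:]:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the units row
        value = float(line.split()[index])
        if value != MISSING:
            values.append(value)
    return statistics.mean(values), min(values), max(values)


for name in ("ATMP", "WTMP"):
    mean, low, high = summarize(SAMPLE, name)
    print(f"{name}: mean={mean:.2f} min={low} max={high}")
```

Dropping the placeholder values before averaging matters: a single 999.0 left in the data would pull the mean far away from any realistic temperature.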
The project has been modified to use Data Workspaces to track experiments and data lineage.
This project has the following subdirectories:
- data: this directory contains the raw text files for buoys in subdirectories by buoy number.
- intermediate-data: this directory contains the preprocessed data for each buoy in CSV format, one file per buoy.
- code: the code for this project, including the preprocessor script and Jupyter notebooks.
- results: the output charts for each buoy showing temperature change per decade, along with lineage metadata.
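The results are expressed as temperature change per decade. One common way to obtain such a number is to fit a straight line to yearly mean temperatures and scale the slope from degrees per year to degrees per decade. The sketch below shows that calculation with an ordinary least-squares fit; the `trend_per_decade` helper and the yearly values are hypothetical, and the notebooks in this project may compute their trend differently.

```python
def trend_per_decade(years, temps):
    """Least-squares slope of temps against years, in degrees per decade."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    # Slope = covariance(years, temps) / variance(years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return 10.0 * sxy / sxx  # degrees per year -> degrees per decade


# Made-up yearly means that rise by exactly 0.02 degrees C per year.
years = list(range(2000, 2010))
temps = [15.0 + 0.02 * (year - 2000) for year in years]

print(round(trend_per_decade(years, temps), 3))  # 0.2
```

With real buoy data the yearly means are noisy, so the fitted slope depends on which years are included and how missing readings are handled.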
To set this up as a data workspace in its own Conda environment, run:
pip install dataworkspaces
dws clone git@github.com:data-workspaces/buoy-data-analysis.git
cd buoy-data-analysis
conda env create -f environment.yml
conda activate buoy-data-analysis
To run the experiments and take a snapshot of each buoy's results, run the following commands from the workspace root (the buoy-data-analysis directory):
# preprocess the data
for buoy in 42040 44005 44014 46026; do python3 ./code/preprocess.py $buoy; done
# run notebooks and take snapshot after each notebook
cd code
for buoy in 42040 44005 44014 46026
do
  jupyter nbconvert --to notebook --execute anomaly-analysis-buoy-$buoy.ipynb --output anomaly-analysis-buoy-$buoy.ipynb
  dws snapshot -m "results from buoy $buoy"
done
Here are some Data Workspaces commands you might try running in this workspace.
To see the snapshot history:
dws report history
To see the lineage for the snapshot buoy-44014-better-slope:
dws report lineage --snapshot buoy-44014-better-slope
After running a notebook, but before taking a snapshot, you can see the current lineage as follows:
dws report lineage