SAWS is designed to serve as a centralized database of drivers and barriers to the adoption of alternative, non-traditional water sources across the contiguous United States. Our database spans a diverse set of influential factors in this space, touching on the political, economic, social, technological, environmental, and legal considerations of water supply portfolio planning. This repository contains the functions necessary to retrieve, clean, reformat, and compile raw datasets into the SAWS database. For a more interactive data exploration and query experience, please visit the SAWS web app. This repository only supports database creation and updating.
In this section, we provide details on the organizational structure of the SAWS repository and how to use it. Understanding some of these details, however, requires first understanding the structure of the SAWS database, which can be broken down into three levels, as illustrated in Figure 1. Level 1 datasets are those that have been pulled directly (with minimal processing) from a public source and reorganized into the standardized SAWS format. Level 2 datasets are those that have been newly developed for the purposes of SAWS using information from Level 1 datasets. Level 3 datasets represent the highest-level viability metrics, which are used to compare viability across water supplies and geographic regions. Note that the database contained in this repository only includes Level 1 and Level 2 datasets. Level 3 datasets can be created using the SAWS web application.
Figure 1. Diagram of the SAWS database structure.
The project_utils.py script contains global variables and functions to be used throughout the SAWS project.
Most Level 1 datasets, along with all Level 2 datasets, are provided to users at six geospatial resolutions – county, state, HUC-2, HUC-4, HUC-6, and HUC-8 – which are hereafter referred to collectively as "SAWS-supported resolutions." The shapefiles/ folder contains the official SAWS .geojson files corresponding to each resolution as well as the code used to generate them. For details on shapefile processing, see shapefiles_generation.md.
The dataset_builder/ directory contains a Python package that supports development of datasets of all levels. The Dataset class is used to set up a SAWS dataset and assign the appropriate metadata. The processing module contains functions that can be used to load and clean a variety of data. The transformation module contains functions that can be used to convert data between geospatial resolutions. For information on how these conversions are conducted, see geo_conversions.md.
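
To make the idea of converting between resolutions concrete, here is a minimal sketch of one simple case: rolling HUC-8 values up to a coarser HUC level. It relies only on the fact that hydrologic unit codes are nested, so the first 2, 4, or 6 digits of a HUC-8 code identify its parent unit. The function name `aggregate_huc` is hypothetical and is not part of the dataset_builder package. Summing is an assumption that only suits additive quantities such as volumes, and the methods SAWS actually uses (including conversions involving counties and states) are described in geo_conversions.md and may differ.

```python
from collections import defaultdict


def aggregate_huc(values_by_huc8, target_digits):
    """Sum HUC-8 values into coarser hydrologic units.

    Hypothetical helper for illustration only. It assumes the values are
    additive (e.g. volumes), which is not true of every SAWS variable.
    """
    if target_digits not in (2, 4, 6):
        raise ValueError("target_digits must be 2, 4, or 6")

    totals = defaultdict(float)
    for huc8, value in values_by_huc8.items():
        # HUC codes are kept as strings so leading zeros are preserved.
        if len(huc8) != 8 or not huc8.isdigit():
            raise ValueError(f"not an 8-digit HUC code: {huc8!r}")
        # The parent unit's code is a prefix of the child's code.
        totals[huc8[:target_digits]] += value
    return dict(totals)


# Made-up example values, not SAWS data.
huc8_values = {"18010101": 1.0, "18010102": 2.0, "18020001": 4.0}
huc4_values = aggregate_huc(huc8_values, 4)
huc2_values = aggregate_huc(huc8_values, 2)
```

Intensive quantities (for example prices or depths) would need a weighted average instead of a sum, which this sketch does not attempt.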
The documentation/ directory contains markdown files that describe how the database is built and processed. See the table below for a description of each file.
| File | Description |
|---|---|
| Reading a SAWS Dataset | instructions for interpreting SAWS dataset structures and nomenclature |
| Generating Geographic Boundary Shapefiles for SAWS-Supported Resolutions | sources and methods used for generating shapefiles for SAWS-supported resolutions |
| Geospatial Conversions | methods for converting between geospatial resolutions |
The images/ directory contains figures used throughout the documentation in this repository. It does not contain any figures from the database itself.
The v1/ directory contains all of the metadata, data, and documentation for the v1.0 database, as well as the code used to generate it. The directory contains one folder per dataset, labeled by the dataset's reference code. Each dataset folder is structured as follows:
└── [DATASET CODE]/
    ├── build-[DATASET CODE].py -- script used to build dataset, metadata, and documentation
    ├── [DATASET CODE]-metadata.yaml -- dataset metadata to be used for visualization
    ├── [DATASET CODE]-documentation.md -- markdown file with information on data sourcing and processing
    └── data/ -- dataset files at all available resolutions
        ├── county-[DATASET CODE].csv
        ├── huc2-[DATASET CODE].csv
        ├── huc4-[DATASET CODE].csv
        ├── huc6-[DATASET CODE].csv
        ├── huc8-[DATASET CODE].csv
        ├── state-[DATASET CODE].csv
        OR
        └── overlay-[DATASET CODE].geojson
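
The naming convention in this layout can be expressed as a small path helper, which may be useful for checking that a dataset folder is complete. The sketch below is illustrative only: the function names `expected_dataset_paths` and `missing_paths` are hypothetical and do not exist in this repository. It also assumes a dataset ships either all six resolution files or a single overlay file, whereas some datasets may provide only a subset of resolutions.

```python
from pathlib import Path

# Resolution prefixes as they appear in the data/ file names above.
RESOLUTIONS = ("county", "huc2", "huc4", "huc6", "huc8", "state")


def expected_dataset_paths(v1_root, code, overlay=False):
    """Return the paths implied by the documented layout for one dataset code.

    Hypothetical helper: assumes either six CSVs or one overlay GeoJSON.
    """
    folder = Path(v1_root) / code
    paths = [
        folder / f"build-{code}.py",
        folder / f"{code}-metadata.yaml",
        folder / f"{code}-documentation.md",
    ]
    if overlay:
        paths.append(folder / "data" / f"overlay-{code}.geojson")
    else:
        paths.extend(folder / "data" / f"{res}-{code}.csv" for res in RESOLUTIONS)
    return paths


def missing_paths(v1_root, code, overlay=False):
    """List the expected paths that are not present on disk."""
    return [p for p in expected_dataset_paths(v1_root, code, overlay) if not p.exists()]


svi_paths = expected_dataset_paths("v1", "svi")
coast_paths = expected_dataset_paths("v1", "coast", overlay=True)
```

Calling `missing_paths("v1", "svi")` from the repository root would then report any file the convention expects but the folder lacks.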
| Category | Title | Reference Code | Processing Script | Documentation |
|---|---|---|---|---|
| Social and Demographic Factors | Social Vulnerability Index | svi | build-svi.py | svi-documentation.md |
| Social and Demographic Factors | Population Projections | pop | build-pop.py | pop-documentation.md |
| Social and Demographic Factors | Disadvantaged Communities | cejst | build-cejst.py | cejst-documentation.md |
| Water Use | Historical Water Withdrawals | histwwd | build-histwwd.py | histwwd-documentation.md |
| Water Use | Water Footprints of Production | wfp | build-wfp.py | wfp-documentation.md |
| Geographic and Hydrogeologic Features | Subsurface Lithology | litho | build-litho.py | litho-documentation.md |
| Geographic and Hydrogeologic Features | Aquifer Properties | aqprop | build-aqprop.py | aqprop-documentation.md |
| Geographic and Hydrogeologic Features | Coastlines | coast | build-coast.py | coast-documentation.md |
| Geographic and Hydrogeologic Features | Proximity to Coastline | coastprox | build-coastprox.py | coastprox-documentation.md |
| Geographic and Hydrogeologic Features | Major Fault Lines | fault | build-fault.py | fault-documentation.md |
| Geographic and Hydrogeologic Features | Proximity to Fault Line | faultprox | build-faultprox.py | faultprox-documentation.md |
| Fresh Surface and Groundwater | Depth to Fresh Groundwater | d2fgw | build-d2fgw.py | d2fgw-documentation.md |
| Fresh Surface and Groundwater | Baseflow | bsflow | build-bsflow.py | bsflow-documentation.md |
| Fresh Surface and Groundwater | Operational Cost of Fresh Groundwater Extraction | fgwexcost | build-fgwexcost.py | fgwexcost-documentation.md |
| Fresh Surface and Groundwater | Soil Moisture | soilmoi | build-soilmoi.py | soilmoi-documentation.md |
| Fresh Surface and Groundwater | Evapotranspiration | et | build-et.py | et-documentation.md |
| Fresh Surface and Groundwater | Snow Water Equivalent | snoweq | build-snoweq.py | snoweq-documentation.md |
| Fresh Surface and Groundwater | Energy Requirements for Fresh Groundwater Extraction | fgwexen | build-fgwexen.py | fgwexen-documentation.md |
| Fresh Surface and Groundwater | Streamflow | stflow | build-stflow.py | stflow-documentation.md |
| Fresh Surface and Groundwater | Quickflow | qkflow | build-qkflow.py | qkflow-documentation.md |
| Energy and Electricity | Industrial Electricity Prices | elecprice | build-elecprice.py | elecprice-documentation.md |
| Energy and Electricity | Electricity Generation | elecgen | build-elecgen.py | elecgen-documentation.md |
| Energy and Electricity | Industrial Electricity Emissions Intensity | gridco2 | build-gridco2.py | gridco2-documentation.md |
| Brackish Groundwater | Depth to Brackish Groundwater | d2bgw | build-d2bgw.py | d2bgw-documentation.md |
| Brackish Groundwater | Operational Cost of Brackish Groundwater Extraction | bgwexcost | build-bgwexcost.py | bgwexcost-documentation.md |
| Brackish Groundwater | Specific Energy Consumption of Brackish Groundwater Desalination | bgwsec | build-bgwsec.py | bgwsec-documentation.md |
| Brackish Groundwater | Brackish Groundwater Composition | bgwcomp | build-bgwcomp.py | bgwcomp-documentation.md |
| Brackish Groundwater | Levelized Cost of Brackish Groundwater Desalination | bgwlcow | build-bgwlcow.py | bgwlcow-documentation.md |
| Brackish Groundwater | Brackish Groundwater Availability | bgwavail | build-bgwavail.py | bgwavail-documentation.md |
| Brackish Groundwater | Energy Requirements for Brackish Groundwater Extraction | bgwexen | build-bgwexen.py | bgwexen-documentation.md |
| Brackish Groundwater | Problematic Scalants for Brackish Groundwater Desalination | bgwscale | build-bgwscale.py | bgwscale-documentation.md |
| Water Rights, Law, and Policy | Water Rights Allocation Doctrines | wrdoc | build-wrdoc.py | wrdoc-documentation.md |
| Water Rights, Law, and Policy | Drinking Water Quality Standards | mcl | build-mcl.py | mcl-documentation.md |
| Water Rights, Law, and Policy | Western Water Rights Allocations | hardwr | build-hardwr.py | hardwr-documentation.md |
| Water Rights, Law, and Policy | Water Supply Adoption Readiness Levels | arl | build-arl.py | arl-documentation.md |
| Water Service Administration and Infrastructure | UIC Injection Well Inventory | uicinv | build-uicinv.py | uicinv-documentation.md |
| Water Service Administration and Infrastructure | Public Water Service Areas | wsa | build-wsa.py | wsa-documentation.md |
| Water Service Administration and Infrastructure | Municipal Desalination Facilities | desal | build-desal.py | desal-documentation.md |
| Climate and Water Stress | Drought Risk Index | dri | build-dri.py | dri-documentation.md |
| Climate and Water Stress | Water Supply Sustainability Risk Index | wssri | build-wssri.py | wssri-documentation.md |
| Climate and Water Stress | Temperature and Precipitation Projections | climate | build-climate.py | climate-documentation.md |
| Water Quality and Composition | TDS Exceedance Probabilities | tdsprob | build-tdsprob.py | tdsprob-documentation.md |
- Caroline Adkins, Stanford University, [email protected]
- Carson Tucker, Stanford University, [email protected]
- Emily Winn, Stanford University, [email protected]
- Meerashree Sundara Raju, Stanford University, [email protected]
- Meagan Mauter, Stanford University, [email protected]
Manuscript in preparation.
This project is licensed under the MIT License. Please see the LICENSE.md file for details.
The development of SAWS was made possible through the generous support of the National Science Foundation and ExxonMobil. We are especially grateful to ExxonMobil not only for funding this work but also for providing expert guidance and continued support throughout the project. We thank Lexunit for their collaboration in developing the SAWS web application and for their dedication to building an accessible, user-friendly platform. We also extend our sincere appreciation to the many academic, government, and industry experts who provided valuable input on data selection, processing methodologies, and indicators of viability. Their insights were instrumental in shaping the foundation of this tool.
This material is based upon work supported by ExxonMobil as well as the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-2146755. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or ExxonMobil. Any reference to Exxon Mobil Corporation or its affiliates (“ExxonMobil”) funding, work with, or collaboration with the noted third-party organizations does not constitute or imply an endorsement by ExxonMobil of any or all of the positions of such organizations.
Please contact Caroline Adkins ([email protected]) with any questions.

