33[ ![ CI] ( https://github.com/Upstream-Tech/delineator/actions/workflows/ci.yml/badge.svg )] ( https://github.com/Upstream-Tech/delineator/actions/workflows/ci.yml )
44
55A set of Python scripts for delineating watersheds or drainage basins
6- using data from [ MERIT-Hydro] ( https://doi.org/10.1029/2019WR024873 ) and
7- [ MERIT-Basins] ( https://doi.org/10.1029/2019WR025287 ) .
8- The script outputs geospatial data for subbasins and river reaches,
9- and creates a river network graph representation,
10- which can be useful for watershed modeling or machine learning applications.
6+ using data from [ MERIT-Hydro] ( https://doi.org/10.1029/2019WR024873 ) and
7+ [ MERIT-Basins] ( https://doi.org/10.1029/2019WR025287 ) .
8+ The script outputs geospatial data for subbasins and river reaches,
9+ and creates a river network graph representation,
10+ which can be useful for watershed modeling or machine learning applications.
1111
1212This script also lets you subdivide the watershed at specific locations, such as gages, if you provide
1313additional points that fall inside the main watershed, as shown in red in the image below.
1414
1515![ Subbasins illustration] ( img/subbasins_map.jpg )
1616
1717
18- These scripts are a heavily modified fork of [ delineator.py] ( https://github.com/mheberger/delineator ) .
18+ These scripts are a heavily modified fork of [ delineator.py] ( https://github.com/mheberger/delineator ) .
1919
2020
2121# Outputs:
2222
2323Geodata in a variety of formats -- shapefile, geopackage, GeoJSON, etc., for:
2424
25- * sub-basin polygons
26- * sub-basin outlet points
27- * river reaches
25+ * sub-basin polygons
26+ * sub-basin outlet points
27+ * river reaches
2828
2929and, optionally:
3030
@@ -33,18 +33,18 @@ and, optionally:
3333The network graph can be saved in a variety of formats -- Python NetworkX Graph object (in a pickle file),
3434JSON, GML, XML, etc.
3535
36- You can also customize the size of the subbasins to make them larger; see * Outputing larger subbasins* below.
36+ You can also customize the size of the subbasins to make them larger; see * Outputting larger subbasins* below.
3737
3838
3939# Using these scripts
4040
41- This repository includes sample data covering Iceland. To delineate watersheds in other
42- parts of the world, you will need to download datasets from MERIT-Hydro and MERIT-Basins.
41+ This repository includes sample data covering Iceland. To delineate watersheds in other
42+ parts of the world, you will need to download datasets from MERIT-Hydro and MERIT-Basins.
4343Instructions on how to get the data and run the script are provided below.
4444
4545To get started, download the latest release from this GitHub repository (or fork the repository).
4646
47- These scripts require Python 3.12 or later.
47+ These scripts require Python 3.11 or later.
4848
4949## Setup
5050
@@ -98,14 +98,14 @@ uv run ruff format .
9898To run tests (once added):
9999``` bash
100100uv run pytest
101- ```
101+ ```
102102
103103
104104# Overview of using ` subbasins.py `
105105
106106The major steps are the following, with more detailed instructions below.
107- To simply try the program with the sample data provided, you can skip to step 5.
108- When you are ready to try with your own locations, you will need to download additional
107+ To simply try the program with the sample data provided, you can skip to step 5.
108+ When you are ready to try with your own locations, you will need to download additional
109109data as described in steps 1 and 2.
110110
1111111 . [ Export env vars pointing to raster and vector data] ( #step_env )
@@ -114,9 +114,9 @@ data as described in steps 1 and 2.
1141141 . [ Review output] ( #step_review )
1151151 . [ Run again to fix mistakes] ( #step_repeat )
116116
117- Before you begin downloading the data in steps 1 and 2, determine which files you need based on your region of interest.
118- The data files are organized into continental-scale river basins, or Pfafstetter Level 2 basins.
119- There are 61 of these basins in total. Basins are identified by a 2-digit code, with values from 11 to 91.
117+ Before you begin downloading the data in steps 1 and 2, determine which files you need based on your region of interest.
118+ The data files are organized into continental-scale river basins, or Pfafstetter Level 2 basins.
119+ There are 61 of these basins in total. Basins are identified by a 2-digit code, with values from 11 to 91.
120120
121121![ MERIT Level 2 Basins] ( img/merit_level2_basins.jpg )
122122MERIT Level 2 megabasins
@@ -145,36 +145,36 @@ export MEGABASINS_PATH="https://example.com/file.shp"
145145
146146## <a name =" step_csv " >Create a CSV file with your desired watershed outlet points</a >
147147
148- The script reads information about your desired watershed outlet points from a
149- plain-text comma-delimited (CSV) file. Edit this file carefully, as the script will
150- not run if this file is not formatted correctly.
148+ The script reads information about your desired watershed outlet points from a
149+ plain-text comma-delimited (CSV) file. Edit this file carefully, as the script will
150+ not run if this file is not formatted correctly.
151151
152152The CSV file ** must** contain these 4 required fields or columns.
153153
154154- ** id** - _ required_ : a unique identifier for your watershed or outlet point,
155- an alphanumeric string. The id may be any length, but shorter is better.
156- The script uses the id as the filename for output, so avoid using any
157- forbidden characters. On Linux, do not use the forward slash /.
155+ an alphanumeric string. The id may be any length, but shorter is better.
156+ The script uses the id as the filename for output, so avoid using any
157+ forbidden characters. On Linux, do not use the forward slash /.
158158On Windows, the list of forbidden characters is slightly longer (` \< \> : " / \ | ? \* ` ).
159159Also, do not use id = 0, since the convention is that 0 is reserved for discharging to
160160the ocean.
161161
162162- ** lat** - _ required_ : latitude in decimal degrees of the watershed outlet.
163- Avoid using a whole number without a decimal in the first row.
163+ Avoid using a whole number without a decimal in the first row.
164164For example, use 23.0 instead of 23.
165165
166166- ** lng** - _ required_ : longitude in decimal degrees
167167
168- - ** is_outlet** : If the gage is a watershed outlet, enter ` true ` or ` True `
168+ - ** is_outlet** : If the gage is a watershed outlet, enter ` true ` or ` True `
169169(capitalization does not matter). For intermediate
170170 upstream points that will be sub-basin outlets, enter ` false ` or ` False ` .
171171
172- All latitude and longitude coordinates should be in decimal degrees
172+ All latitude and longitude coordinates should be in decimal degrees
173173(EPSG: 4326, [ https://spatialreference.org/ref/epsg/4326/ ] ( https://spatialreference.org/ref/epsg/4326/ ) ).
174174
175175- The order of the columns does not matter, but the names must be exactly as shown above.
176176
177- - If you are delineating more than one main watershed, put any subbasin
177+ - If you are delineating more than one main watershed, put any subbasin
178178 outlets immediately after the main outlet, as in the following example:
179179
180180| id | lat | lng | name | is_outlet |
@@ -186,15 +186,15 @@ All latitude and longitude coordinates should be in decimal degrees
186186| algoso | 41.455 | -6.591 | "EN 219 Crossing near Algoso" | false |
187187
188188In this example, there are two * main* outlets. The first, "foz-tua," has two subbasin
189- outlets. The second, "baixo-sabor," has one subbasin outlet.
189+ outlets. The second, "baixo-sabor," has one subbasin outlet.
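
The CSV rules above (required columns, reserved id 0, forbidden filename characters, and subbasin outlets listed right after their main outlet) can be checked before running the delineation. The following is only a minimal sketch using the Python standard library, not the script's own parser: the function names `parse_bool` and `load_outlets` are hypothetical, and every coordinate and id in the sample except the `algoso` row is a made-up placeholder.

```python
import csv
import io

REQUIRED_COLUMNS = {"id", "lat", "lng", "is_outlet"}
# Characters that are not allowed in Windows filenames; "/" also covers Linux.
FORBIDDEN_CHARS = set('<>:"/\\|?*')


def parse_bool(text):
    """Interpret 'true'/'false' in any capitalization."""
    value = text.strip().lower()
    if value not in ("true", "false"):
        raise ValueError(f"is_outlet must be true or false, got {text!r}")
    return value == "true"


def load_outlets(handle):
    """Validate rows and group each main outlet with the subbasin outlets that follow it."""
    reader = csv.DictReader(handle, skipinitialspace=True)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")

    groups = []
    seen_ids = set()
    for row in reader:
        outlet_id = row["id"].strip()
        if outlet_id == "0":
            raise ValueError("id 0 is reserved for discharging to the ocean")
        if FORBIDDEN_CHARS & set(outlet_id):
            raise ValueError(f"id {outlet_id!r} contains a forbidden filename character")
        if outlet_id in seen_ids:
            raise ValueError(f"duplicate id {outlet_id!r}")
        seen_ids.add(outlet_id)

        lat, lng = float(row["lat"]), float(row["lng"])
        if not (-90 <= lat <= 90 and -180 <= lng <= 180):
            raise ValueError(f"coordinates out of range for id {outlet_id!r}")

        point = {"id": outlet_id, "lat": lat, "lng": lng,
                 "is_outlet": parse_bool(row["is_outlet"])}
        if point["is_outlet"]:
            groups.append([point])        # a main outlet starts a new group
        elif not groups:
            raise ValueError("a subbasin outlet appears before any main outlet")
        else:
            groups[-1].append(point)      # belongs to the preceding main outlet
    return groups


# Placeholder data: only the 'algoso' row comes from the example table.
SAMPLE_CSV = """id,lat,lng,name,is_outlet
foz-tua,41.2,-7.4,"Main outlet (placeholder coordinates)",true
sub-a,41.4,-7.2,"Hypothetical subbasin outlet",false
sub-b,41.5,-7.1,"Hypothetical subbasin outlet",false
baixo-sabor,41.2,-7.0,"Main outlet (placeholder coordinates)",true
algoso,41.455,-6.591,"EN 219 Crossing near Algoso",false
"""

groups = load_outlets(io.StringIO(SAMPLE_CSV))
for group in groups:
    main, subs = group[0], group[1:]
    print(main["id"], "->", [p["id"] for p in subs])
```

Grouping by row order mirrors the convention described above: a row with `is_outlet` set to false is attached to the most recent main outlet, so a misplaced row is caught early instead of silently joining the wrong watershed.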
190190
191191## <a name =" step_run " >Delineate watersheds</a >
192192
193193Delineation can be run from the command line, or from Python. Either way, you will need to specify a few arguments:
194194
195195- ` input_csv ` (required) - Input CSV filename, for example ` outlets.csv `
196- - ` output_prefix ` (required) - Output prefix, a string. The output files will start with this string. For
197- example,
196+ - ` output_prefix ` (required) - Output prefix, a string. The output files will start with this string. For
197+ example,
198198if you provide 'shasta', the script will produce ` shasta_subbasins.shp ` , ` shasta_outlets.shp ` , etc.
199199- ` config_vals ` (optional) - Override default settings, for example to turn ` VERBOSE ` logging on or off.
200200
@@ -213,36 +213,36 @@ Alternatively, you can call the delineation routine from Python:
213213
214214## <a name =" step_review " >Review results</a >
215215
216- The script can output several different geodata formats,
217- as long as the format is supported by ` GeoPandas ` . Shapefiles are popular,
218- but we recommend ** GeoPackage** , as it is a more modern and open format.
216+ The script can output several different geodata formats,
217+ as long as the format is supported by ` GeoPandas ` . Shapefiles are popular,
218+ but we recommend ** GeoPackage** , as it is a more modern and open format.
219219** Feather** is another lightweight, portable data format.
220- To get a full list of available formats, follow the directions
220+ To get a full list of available formats, follow the directions
221221[ here] ( https://geopandas.org/en/stable/docs/user_guide/io.html#writing-spatial-data )
222222(see Supported Drivers).
223223
224224
225225## <a name =" step_repeat " >Run again to fix any mistakes</a >
226226
227- Automated watershed delineation is often incorrect.
228- The good news is that errors can often be fixed by slightly moving the
227+ Automated watershed delineation is often incorrect.
228+ The good news is that errors can often be fixed by slightly moving the
229229location of your watershed outlets.
230230
231- Repeat the above steps to create a new outlets CSV file, or modify your existing file,
231+ Repeat the above steps to create a new outlets CSV file, or modify your existing file,
232232using revised coordinates. The script will automatically overwrite existing files
233233without any warning, so first make sure to back up anything you want to save.
234234
235235
236236# Simplification
237237
238238The Python routine used to simplify the subbasin polygons (` topojson ` ) is not perfect.
239- Sometimes, the output will contain weird overlaps and slivers. If appearances matter,
239+ Sometimes, the output will contain weird overlaps and slivers. If appearances matter,
240240we recommend setting ` SIMPLIFY ` to ` False ` and using external software for simplification.
241241
242- * Mapshaper* works well. You can use the [ web version] ( https://mapshaper.org/ ) , or you can install and run it from the command line.
243- Note that mapshaper will only accept shapefiles or geojson as input, and not geopackages or feather files.
242+ * Mapshaper* works well. You can use the [ web version] ( https://mapshaper.org/ ) , or you can install and run it from the command line.
243+ Note that mapshaper will only accept shapefiles or geojson as input, and not geopackages or feather files.
244244
245- As an alternative, GIS software like QGIS (free) or ArcGIS (commercial) do the job nicely.
245+ As an alternative, GIS software such as QGIS (free) or ArcGIS (commercial) does the job nicely.
246246
247247
248248# Outputting larger subbasins
@@ -252,22 +252,22 @@ larger subbasins, set `CONSOLIDATE` to `True`.
252252Then, set a value for ` MAX_AREA ` in km². This sets the upper limit on the size of subbasins.
253253The script will merge unit catchments such that the overall structure
254254and connectivity of the drainage network is maintained. This example shows the subbasins
255- for the Yellowstone River with different values of ` MAX_AREA ` .
255+ for the Yellowstone River with different values of ` MAX_AREA ` .
256256
257257![ Subbasin consolidation illustration] ( img/consolidation.jpg )
258258
259- By "rediscretizing" the subbasins with this option, you can reduce their number while increasing their size.
259+ By "rediscretizing" the subbasins with this option, you can reduce their number while increasing their size.
260260This means that your hydrologic model will be smaller and simpler and probably run faster! 😊
261261
262- The simplification routine also appears to
262+ The consolidation routine also appears to
263263make the subbasin sizes somewhat more homogeneous. The area of MERIT-Basins unit catchments
264264is highly variable, and highly skewed, with many very small unit catchments.
265265 After consolidation, the distribution
266- of subbasin areas tends to be more tightly clustered around the mean,
266+ of subbasin areas tends to be more tightly clustered around the mean,
267267as indicated by a lower coefficient
268268of variation (standard deviation divided by the mean). Here are the results of a
269269little experiment in using different values of ` MAX_AREA ` for the Yellowstone River
270- basin in North America. Statistics are for the subbasin areas in km².
270+ basin in North America. Statistics are for the subbasin areas in km².
271271
272272
273273| MAX_AREA | count | median | mean | std. dev. | CV | skewness |
@@ -285,9 +285,9 @@ basin in North America. Statistics are for the subbasin areas in km².
285285
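For readers who want to compute the same kind of summary for their own output, here is a minimal sketch of the statistics named in the table header above. It assumes population (not sample) standard deviation and the simple moment-based skewness; the README does not say which definitions were used for the table, and the function name `summarize_areas` and the example areas are hypothetical.

```python
import statistics


def summarize_areas(areas):
    """Return count, median, mean, standard deviation, CV, and skewness of areas in km²."""
    mean = statistics.fmean(areas)
    std = statistics.pstdev(areas)  # population standard deviation (an assumption)
    # Moment-based skewness: mean cubed deviation divided by std cubed.
    third_moment = sum((a - mean) ** 3 for a in areas) / len(areas)
    return {
        "count": len(areas),
        "median": statistics.median(areas),
        "mean": mean,
        "std": std,
        "cv": std / mean,  # coefficient of variation
        "skewness": third_moment / std ** 3,
    }


# Made-up subbasin areas in km², with many small catchments and one large one.
example_areas = [12.0, 15.0, 18.0, 22.0, 30.0, 410.0]
stats = summarize_areas(example_areas)
print({name: round(value, 3) for name, value in stats.items()})
```

A lower `cv` after consolidation would indicate that subbasin areas are clustered more tightly around the mean.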
286286More error handling of stress cases is forthcoming.
287287For example, the MERIT-Basins dataset
288- has some gaps and slivers of missing data; if your outlet points falls
289- into one of these locations, it may cause problems. In this case, simply nudge the
290- location by changing the latitude and/or longitude slightly.
288+ has some gaps and slivers of missing data; if your outlet point falls
289+ into one of these locations, it may cause problems. In this case, simply nudge the
290+ location by changing the latitude and/or longitude slightly.
291291
292292As another example, if you put two points in your input file that are the same, or very close
293293to one another, the script will fail and the error messages will not be very helpful. Please
@@ -300,10 +300,10 @@ make sure all of your points have a little space between them!
300300
301301To report any bugs, you can create an Issue on this GitHub page.
302302
303- This code is open source, so if you are motivated to make any
304- modifications, additions, or bug fixes, you can make a pull request on GitHub.
303+ This code is open source, so if you are motivated to make any
304+ modifications, additions, or bug fixes, you can make a pull request on GitHub.
305305
306306
307- # Acknowledgments
307+ # Acknowledgments
308308
309- Thanks to Matthew Heberger who wrote the [ original code] ( https://github.com/mheberger/delineator ) built upon here.
309+ Thanks to Matthew Heberger, who wrote the [ original code] ( https://github.com/mheberger/delineator ) built upon here.