Cascading workflow with spatial binning function.
This Hadoop-based Cascading workflow takes zip code locations in the continental US (not a very big dataset, BTW; this is just a PoC :-),
overlays them with a set of hexagonal cells in an Albers equal-area conic projection,
and produces a set of spatial density bins.
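To make the binning idea concrete, here is a minimal sketch in plain Python. It is not the code this project runs: it swaps in a hand-written ray-casting point-in-polygon test and a brute-force loop over cells. The names `point_in_polygon` and `bin_points`, the square cells, and the sample points are all made up for illustration, and it assumes the points are already in the same projected coordinates as the cells.

```python
from collections import Counter


def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def bin_points(points, cells):
    """Count points per cell id; cells maps id -> ring of vertices."""
    counts = Counter()
    for x, y in points:
        for cell_id, ring in cells.items():
            if point_in_polygon(x, y, ring):
                counts[cell_id] += 1
                break  # a point is counted in at most one cell
    return counts


# Hypothetical cells (unit squares for brevity) and points.
cells = {
    136: [(0, 0), (1, 0), (1, 1), (0, 1)],
    137: [(1, 0), (2, 0), (2, 1), (1, 1)],
}
points = [(0.2, 0.5), (0.7, 0.1), (1.5, 0.5), (5.0, 5.0)]
density = bin_points(points, cells)
```

The last point falls outside every cell and is simply dropped. A real run over many cells would want a spatial index rather than this nested loop.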
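This project depends on the Esri Geometry API for Java and the Shapefile library. Build both and install them into your local Maven repository:
$ git clone https://github.com/Esri/geometry-api-java.git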
$ cd geometry-api-java
$ mvn install
$ git clone https://github.com/mraad/Shapefile.git
$ cd Shapefile
$ mvn install
I've placed some sample data in the data folder. I'm assuming that you have a Hadoop cluster;
if you do not have one, you can download the Cloudera Quick Start VM. Copy the sample data into HDFS:
$ hadoop fs -put data/zipcodes.tsv zipcodes.tsv
$ hadoop fs -put data/hexalbers.shp hexalbers.shp
Package the job, run it on the cluster, and view the output:
$ mvn package
$ hadoop jar target/CascadingSpatial-1.0-job.jar zipcodes.tsv hexalbers.shp output
$ hadoop fs -cat output/part* | more
ORIGID,POPULATION
136,3
137,1
188,17
189,13
213,1
214,2
263,2
264,8
265,7
266,3
...
Save the output to a local file:
$ hadoop fs -cat output/part* > density.csv
In ArcGIS for Desktop, add density.csv as a table and join it with the hexalbers
layer, then symbolize the layer on the POPULATION field.
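If you want to check the join outside of ArcGIS first, the sketch below does the same thing with plain Python dictionaries. It assumes ORIGID is the join key and that a header line may or may not be present in the concatenated part files. The helper names `load_density` and `join_density` and the list of cell ids are hypothetical, and the sample rows are just the first rows of the output shown above, inlined so the sketch runs on its own.

```python
import csv
import io

# First rows of the job output shown above, inlined for a standalone run.
sample = "ORIGID,POPULATION\n136,3\n137,1\n188,17\n"


def load_density(handle):
    """Map ORIGID -> POPULATION, skipping header or malformed lines."""
    density = {}
    for row in csv.reader(handle):
        if len(row) != 2 or not row[0].isdigit():
            continue
        density[int(row[0])] = int(row[1])
    return density


def join_density(cell_ids, density):
    """Attach a count to every cell id; unmatched cells get 0 (a choice of this sketch)."""
    return {cell_id: density.get(cell_id, 0) for cell_id in cell_ids}


density = load_density(io.StringIO(sample))
# Hypothetical cell ids standing in for the hexalbers layer.
joined = join_density([135, 136, 137, 188], density)
```

To run it against the real file, pass `open("density.csv", newline="")` to `load_density` instead of the inlined sample.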