[](https://gitter.im/geospark-datasys/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

-GeoSpark is listed as **Infrastructure Project** on **Apache Spark Official Third Party Project Page** ([http://spark.apache.org/third-party-projects.html](http://spark.apache.org/third-party-projects.html))
-
-GeoSpark is a cluster computing system for processing large-scale spatial data. GeoSpark extends Apache Spark with a set of out-of-the-box Spatial Resilient Distributed Datasets (SRDDs) that efficiently load, process, and analyze large-scale spatial data across machines. GeoSpark provides APIs for Apache Spark programmers to easily develop their spatial analysis programs with Spatial Resilient Distributed Datasets (SRDDs), which have in-house support for geometrical and spatial queries (Range, K Nearest Neighbors, Join).
+GeoSpark is listed as **Infrastructure Project** on [**Apache Spark Official Third Party Project Page**](http://spark.apache.org/third-party-projects.html)

+GeoSpark is a cluster computing system for processing large-scale spatial data. GeoSpark extends Apache Spark with a set of out-of-the-box Spatial Resilient Distributed Datasets (SRDDs) that efficiently load, process, and analyze large-scale spatial data across machines. GeoSpark provides APIs for Apache Spark programmers to easily develop their spatial analysis programs with Spatial Resilient Distributed Datasets (SRDDs), which have in-house support for geometrical and spatial queries (Range, K Nearest Neighbors, Join).

-GeoSpark artifacts are hosted in Maven Central. You can add a Maven dependency with the following coordinates:

-The following version supports Apache Spark 2.X versions:

-```
-groupId: org.datasyslab
-artifactId: geospark
-version: 0.5.0
-```
+GeoSpark artifacts are hosted in Maven Central: [**Maven Central Coordinates**](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Maven-Central-Coordinates)

-The following version supports Apache Spark 1.X versions:

-```
-groupId: org.datasyslab
-artifactId: geospark
-version: 0.5.0-spark-1.x
-```

-##Version information ([Full List](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Full-Version-Release-notes))
+# Version information ([more](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Full-Version-Release-notes))
| 0.5.0| **Major updates:** We are pleased to announce the initial version of [Babylon](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon) a large-scale in-memory geospatial visualization system extending GeoSpark. Babylon and GeoSpark are integrated together. You can just import GeoSpark and enjoy! More details are available here: [Babylon GeoSpatial Visualization](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon);
-| 0.4.0|**Major updates:** ([Example](https://github.com/DataSystemsLab/GeoSpark/blob/master/src/main/java/org/datasyslab/geospark/showcase/Example.java)) 1. Refactor constructor API usage. 2. Simplify Spatial Join Query API. 3. Add native support for LineStringRDD; **Functionality enhancement:** 1. Release the persist function back to users. 2. Add more exception explanations.|
-
-##News
-* GeoSpark Gitter Chat is now online! Chat with our GeoSpark users and ask questions!
-* **Babylon Visualization Framework** on GeoSpark is now available!
-Babylon is a large-scale in-memory geospatial visualization system. More details are available here: [Babylon GeoSpatial Visualization](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon)
Note: Scala can call Java APIs seamlessly. That means GeoSpark Scala users use the same APIs as GeoSpark Java users.
-
-Please refer to [GeoSpark Scala and Java API Usage](http://www.public.asu.edu/~jiayu2/geospark/javadoc/)
-
+|0.5.1|**Bug fix:** (1) GeoSpark: Fix inaccurate KNN result when K is large; (2) GeoSpark: Replace incompatible Spark API call [Issue #55](https://github.com/DataSystemsLab/GeoSpark/issues/55); (3) Babylon: Remove JPG output format temporarily due to the lack of OpenJDK support|
+| 0.5.0|**Major updates:** We are pleased to announce the initial version of [Babylon](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon), a large-scale in-memory geospatial visualization system extending GeoSpark. Babylon and GeoSpark are integrated together. You can just import GeoSpark and enjoy! More details are available here: [Babylon GeoSpatial Visualization](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon)|

+# Important features ([more](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Important-Features))
## Spatial Resilient Distributed Datasets (SRDDs)
-
-GeoSpark extends RDDs to form Spatial RDDs (SRDDs) and efficiently partitions SRDD data elements across machines and introduces novel parallelized spatial (geometric operations that follow the Open Geospatial Consortium (OGC) standard) transformations and actions (for SRDD) that provide a more intuitive interface for users to write spatial data analytics programs. Moreover, GeoSpark extends the SRDD layer to execute spatial queries (e.g., Range query, KNN query, and Join query) on large-scale spatial datasets. After geometrical objects are retrieved in the Spatial RDD layer, users can invoke spatial query processing operations provided in the Spatial Query Processing Layer of GeoSpark which runs over the in-memory cluster, decides how spatial object-relational tuples could be stored, indexed, and accessed using SRDDs, and returns the spatial query results required by the user.
Comma-Separated Values (**FileDataSplitter.CSV**), Tab-Separated Values (**FileDataSplitter.TSV**), Well-Known Text (**FileDataSplitter.WKT**), and GeoJSON (**FileDataSplitter.GeoJSON**) as the input formats. Users only need to specify the input format as the Splitter and, if necessary, the start and end offsets of the spatial fields in one row when calling the constructors.
-
-**User-supplied input format mapper**
-
-Examples: [user-supplied input format mapper](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/geospark/showcase)
GeoSpark supports R-Tree (**GridType.RTREE**) and Voronoi diagram (**GridType.VORONOI**) spatial partitioning methods. Spatial partitioning repartitions an RDD according to its objects' spatial locations. A spatial join on a spatially partitioned RDD will be very fast.
+**Native input format support**: CSV, TSV, WKT, GeoJSON

-### Spatial Index
+**User-supplied input format mapper**: Any input format

-GeoSpark supports two Spatial Indexes, Quad-Tree (**IndexType.QUADTREE**) and R-Tree (**IndexType.RTREE**). Quad-Tree doesn't support Spatial K Nearest Neighbors query.
+Supported Spatial Indexes: Quad-Tree and R-Tree. Quad-Tree doesn't support Spatial K Nearest Neighbors query.

-GeoSpark currently provides native support for Inside, Overlap, DatasetBoundary, Minimum Bounding Rectangle and Polygon Union in SRDDs following [Open Geospatial Consortium (OGC) standard](http://www.opengeospatial.org/standards).
+## Geometrical operation
+Inside, Overlap, DatasetBoundary, Minimum Bounding Rectangle, Polygon Union

-### Spatial Operation
+## Spatial Operation
+Spatial Range Query, Spatial Join Query, and Spatial K Nearest Neighbors Query.

-GeoSpark so far provides **Spatial Range Query**, **Spatial Join Query**, and **Spatial K Nearest Neighbors Query**.
Jia Yu, Jinxuan Wu, Mohamed Sarwat. ["A Demonstration of GeoSpark: A Cluster Computing Framework for Processing Big Spatial Data"](). (demo paper) In Proceeding of IEEE International Conference on Data Engineering ICDE 2016, Helsinki, FI, May 2016

Jia Yu, Jinxuan Wu, Mohamed Sarwat. ["GeoSpark: A Cluster Computing Framework for Processing Large-Scale Spatial Data"](http://www.public.asu.edu/~jiayu2/geospark/publication/GeoSpark_ShortPaper.pdf). (short paper) In Proceeding of the ACM International Conference on Advances in Geographic Information Systems ACM SIGSPATIAL GIS 2015, Seattle, WA, USA November 2015


-##Acknowledgement
+# Acknowledgement

GeoSpark makes use of JTS Plus (An extended JTS Topology Suite Version 1.14) for some geometrical computations.

Please refer to [JTS Topology Suite website](http://tsusiatsoftware.net/jts/main.html) and [JTS Plus](https://github.com/jiayuasu/JTSplus) for more details.



-##Contact
+# Contact

-###Questions
+## Questions

* Please join [](https://gitter.im/geospark-datasys/Lobby?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
Please visit [GeoSpark project website](http://geospark.datasyslab.org) for the latest news and releases.

-###Data Systems Lab
+## Data Systems Lab
GeoSpark is one of the projects under [Data Systems Lab](http://www.datasyslab.org/) at Arizona State University. The mission of Data Systems Lab is designing and developing experimental data management systems (e.g., database systems).

-## Thanks for the help from GeoSpark community
-We appreciate the help and suggestions from the following GeoSpark users (The list is growing..):
-
-* @gaufung
-* @lrojas94
-* @mdespriee
-* @sabman
-* @samchorlton
-* @Tsarazin
-* @TBuc
-* ...
+# Thanks for the help from GeoSpark community
+We appreciate the help and suggestions from GeoSpark users: [**Thanks List**](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Community-Thanks-List)
src/main/java/org/datasyslab/babylon/README.md (+7 -5)
@@ -12,7 +12,7 @@

## Main Features

-### Extensible Visualization operator
+### Extensible Visualization operator (just like playing LEGO bricks)!

* Support super high resolution image generation: parallel map image rendering
* Visualize Spatial RDD and Spatial Queries (Spatial Range, Spatial K Nearest Neighbors, Spatial Join)
@@ -22,7 +22,7 @@
### Overlay Operator
Overlay one map layer with many other map layers!

-### Various Image filter
+### Various Image Filter
* Gaussian Blur
* Box Blur
* Emboss
@@ -34,10 +34,12 @@ You also can build your new image filter by easily extending the photo filter!

### Various Image Type
* PNG
-* JPEG
+* JPG
* GIF
+* More!
+
+You also can support your desired image type by easily extending the image generator! (JPG format is temporarily unavailable due to the lack of OpenJDK support)

-You also can support your desired image type by easily extending the photo filter!


### Current Visualization effect
@@ -50,7 +52,7 @@ You also can support your desired image type by easily extending the photo filte
You also can build your new self-designed effects by easily extending the visualization operator!

### Example
-Here is [a runnable single-machine example](https://github.com/jiayuasu/GeoSpark/blob/master/src/main/java/org/datasyslab/babylon/showcase/Example.java). You can clone this repository and directly run it on your local machine!
+Here is [a runnable single-machine example](https://github.com/DataSystemsLab/GeoSpark/blob/master/src/main/java/org/datasyslab/babylon/showcase/Example.java). You can clone this repository and directly run it on your local machine!

### Scala and Java API
Please refer to [Babylon Scala and Java API](http://www.public.asu.edu/~jiayu2/geospark/javadoc/latest/).