-
-
Notifications
You must be signed in to change notification settings - Fork 372
Log of steps to reproduce #291
-
got the data: http://www.geofabrik.de/data/download.html
-
Find out how to install the data into the database (fail is the first 5 or 6 possible links don’t look promising)
- googling strings:
- how to install osm into ubuntu tutorial found!!
- reading this tutorial http://www.slideshare.net/MarcHuang1/osm-installation-en, quite long but found stuff
- my steps
createdb -E UTF8 mexico
psql mexico
create extension postgis;
sudo apt-get install osm2pgsql
osm2pgsql -help
osm2pgsql --slim -a -d mexico mexico-latest.osm.pbf
I have the osm data now, could view it using qgis, and:
\dS planet_osm_roads
Table "public.planet_osm_roads"
Column | Type | Modifiers
--------------------+-----------------------------+-----------
osm_id | bigint |
oneway | text |
way | geometry(LineString,900913) |
lots of columns!!!! I see this very interesting http://wiki.openstreetmap.org/wiki/Way
I am analysing the data I get:
select oneway,count(*) from planet_osm_roads group by oneway;
oneway | count
--------+-------
| 28491
no | 3231
1 | 8
-1 | 874
yes | 45272
no;yes | 1
(6 rows)
explicitly there are more 1 way roads than 2 way roads
select oneway,count(*) from planet_osm_line group by oneway;
oneway | count
--------+--------
| 737887
no | 29602
1 | 15
-1 | 3291
yes | 89727
no;yes | 2
http://wiki.openstreetmap.org/wiki/Key:oneway 737887 records don’t have info about direction in lines
- Also analysing with Qgis:
I can see the street where I live, but has no name.
Select count(*) from planet_osm_roads where name is NULL;
count
-------
42781
select count(*) from planet_osm_roads where name is not NULL;
count
-------
35096
select count(*) from planet_osm_line where name is NULL;
count
--------
599804
-
Thinking on what I need and can do:
-
queries with the name involved its out of the question.
-
If I want to route cars its directed graph, if I want to route pedestrians is undirected.
-
I want to follow the structure of http://docs.pgrouting.org/2.0/en/doc/src/developer/sampledata.html#sampledata because they use that in the majority of the functions, so it is wise thing to follow the structure:
osm_id source target cost reverse_cost way
the geometry is there but with name way so to avoid doingway as the_geom
I need to change the name.Better: in a new table, with only the data relevant for pgRouting, so I will copy the structure in the sample data, X,Y I can get them later based on the geometry. except that I will have an additional column osm_id bigint to reference the original data if necessary.
-
I will use only planet_osm_line has 860,524 records
-
to simplify the problem (for issue purposes) Can I ignore the fact that there are railways, waterways, etc. and think of them as roads?. (maybe its on those that the NULL applies, going to check this) decided to get rid of them. 2 options:
- logically delete them with: using
set cost = -1, reverse cost = -1
- physically deleting them, (opting for this)
- logically delete them with: using
-
create table deleted as select * from planet_osm_line
where ("waterway" is not null or "aerialway" is not null
or "aeroway" is not null or "railway" is not null
or "route" is not null or boundary is not null);
delete from planet_osm_line
where ("waterway" is not null or "aerialway" is not null
or "aeroway" is not null or "railway" is not null
or "route" is not null or boundary is not null);
source and target columns are going o be filled up by pgRouting so I have to decide what cost, and reverse cost means. Based on the length of the geometry which is a distance, very practical, because in case I want to base the costs on time, an operation cost/speed and give that as cost.
- Have 4 cases:
-
Tagged as one way: "oneway" = 'yes' or "oneway" = '1'
``` cost = length reverse_cost = -1 ```
-
Tagged as one way with geometry reversed: "oneway" = '-1'
``` cost = -1 reverse_cost = length ```
-
Tagged as two way: "oneway" = 'no'
``` cost = length reverse_cost = length ```
-
Not Tagged and contradictory tagging: "oneway" is null or "oneway" = 'no;yes' , this is a problem,
-
select count(*) from planet_osm_line where ("oneway" is null or "oneway" = 'no;yes');
708488
select count(*) from planet_osm_line where not ("oneway" is null or "oneway" = 'no;yes');
122533
The mayority of the lines are not tagged, so the decision on how I am going to handle cost and reverse_cost I will make it by inspecting the data, and some numbers and some reasoning
-
by inspection: I compared the places I know in the Qgis map I made, vs google maps, so by visual inspection the untaged streets are one way.
-
some numbers:
The following table, 14.75% of the data is tagged, 85% is not. So what "guessing" combination for cost, reverse_causes the least error? One way streets are 73%, with reverse geometry is 2%, and two way is 24%, I those percentages hold on the untagged streets, and choose two-way then I 64% of the data is going to be wrong, if I choose reversed one way 82% is going to be wrong, if I choose one-way 22% is going to be wrong. So best is to choose 1 way.
| ------ |--------- |---------| -------- | ----- | ------- | --------- |
| oneway | # of | % of | tag | # of | % of | % of |
| status | records | records | | recs | recs |total recs |
| | A | B/T | | C | C/A*100 | C/T*100 |
| ------ |--------- |---------| -------- | ----- | ------- | --------- |
| | | | one way | 89685 | 73.2% | 10.8% |
| tagged | 122533 | 14.75% | rev. one | 3281 | 2.7% | 0.3% |
| | | | two way | 29567 | 24.1% | 3.5% |
| ------ |--------- |---------| -------- | ----- | ------- | --------- |--------- |
| | estimated|
| ------ |--------- |---------| -------- | ----- | ------- | --------- |--------- |
| | | | one way | | 73.2% | 62.4% | 518613 |
| un | 708488 | 85.25% | rev. one | | 2.7% | 2.3% | 19129 |
| tagged | | | two way | | 24.1% | 20.5% | 145240 |
| ------ |--------- |---------| -------- | ----- | ------- | --------- |--------- |
| Total | T=831021 | 100% | | 99.8% |
| ------ |--------- |---------| -------- | ----- | ------- | --------- |
-
By reasoning
If the data is most of the time collected by people on cars, then osm will receive the sequence of points in the order the car is circulating, so the direction of the geometry indicates the direction of a circulating car. So, going in the direction of the geometry goes with the flow of the cars. And to see if its two way, maybe not enough valid data osm has received to verify its two way.
Decision: I'll treat un-tagged as one way streets as follows
cost = length
reverse_cost = -1
With that, 88%+- of the records are expect to be correct.