I’m currently working on a project involving geographic locations, and our program needs to state whether location A is contained in location B. Currently, we do it using the administrative codes in GeoNames, e.g. New York City is contained in the USA because it has the same country code as the USA. However, this method does not always work due to missing data, and we are looking into other methods. If you can provide information regarding any of the following, that would help a lot:
How does most geocoding software look up hierarchy information? Does it use the administrative codes or look into polygons?
How fast is it to check whether polygon A is contained within or intersects with polygon B using PostGIS or Lucene? I have never worked with polygons. Do you know of any tutorials explaining how to use them?
Are there resources that make polygon information about geographic locations available for free? I think OpenStreetMap provides it, but planet.osm is over 900 GB in size and our capacity is currently ~30 GB. We don’t need extensive information about streets and addresses, but we need to establish the hierarchy up to city/village level. I also looked into DBpedia, but it appears to contain a lot less info than GeoNames.
Thanks a lot!
Answer
Here are a few thoughts on your questions:
How does most geocoding software look up hierarchy information? Does it use the administrative codes or look into polygons?
It’s nearly impossible to say how most software works, but I can tell you that if a tool relies only on data like zip codes or administrative codes, instead of checking whether a location actually lies inside a given area, there is no need to bother with a GIS at all. Of course, working with codes is much faster, but it has its limitations when it comes to spatial operations such as covers, touches, overlaps, intersects, etc.
How fast is it to check whether polygon A is contained within or intersects with polygon B using PostGIS or Lucene? I have never worked with polygons. Do you know of any tutorials explaining how to use them?
Using PostGIS it is absolutely painless.
Example: Consider the following BBOX: POLYGON((14.45 35.87,14.56 35.87,14.56 35.80,14.45 35.80,14.45 35.87))
This example checks whether POINT(14.48 35.85) is inside the given polygon using the function ST_Within:
db=# SELECT ST_Within('POINT(14.48 35.85)'::GEOMETRY,'POLYGON((14.45 35.87,14.56 35.87,14.56 35.80,14.45 35.80,14.45 35.87))'::GEOMETRY);
 st_within
-----------
 t
(1 row)
Now the same experiment using POINT(14.35 35.95), which is outside the given polygon:
db=# SELECT ST_Within('POINT(14.35 35.95)'::GEOMETRY,'POLYGON((14.45 35.87,14.56 35.87,14.56 35.80,14.45 35.80,14.45 35.87))'::GEOMETRY);
 st_within
-----------
 f
(1 row)
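Since the question asks about polygon A against polygon B rather than a point, here is a minimal sketch of the same kind of check with two polygons. The inner polygon is made up for illustration (it is not from any dataset) and was chosen to lie entirely inside the BBOX used above. ST_Within, ST_Contains and ST_Intersects are standard PostGIS functions; the aliases outer_box and inner_box are just names picked for this example.

```sql
-- outer_box: the BBOX from the examples above.
-- inner_box: a smaller, hypothetical polygon lying entirely inside it.
SELECT
  ST_Within(inner_box, outer_box)     AS inner_within_outer,   -- is A inside B?
  ST_Contains(outer_box, inner_box)   AS outer_contains_inner, -- does B contain A?
  ST_Intersects(inner_box, outer_box) AS boxes_intersect       -- do A and B share any point?
FROM (
  SELECT
    'POLYGON((14.45 35.87,14.56 35.87,14.56 35.80,14.45 35.80,14.45 35.87))'::GEOMETRY AS outer_box,
    'POLYGON((14.47 35.85,14.50 35.85,14.50 35.82,14.47 35.82,14.47 35.85))'::GEOMETRY AS inner_box
) AS shapes;
```

With these two polygons all three columns should come back as t. ST_Within(A, B) and ST_Contains(B, A) are the same test with the arguments swapped, so which one to use is mostly a matter of readability.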
Are there resources that make polygon information about geographic locations available for free? I think OpenStreetMap provides it, but planet.osm is over 900 GB in size and our capacity is currently ~30 GB. We don’t need extensive information about streets and addresses, but we need to establish the hierarchy up to city/village level. I also looked into DBpedia, but it appears to contain a lot less info than GeoNames.
It really depends on your requirements (granularity, accuracy, coverage, etc.). There are many free sources of shapefiles on the web, such as:
- Cartographic Boundary Shapefiles – State Legislative Districts
- World Borders Dataset
- Statsilk – Free country shapefile maps
If you’re wondering how to import shapefiles into PostGIS, check this answer.
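Once boundary polygons are loaded, a containment lookup over whole tables could look roughly like the sketch below. The table and column names (boundaries, places, name, geom) are assumptions made up for this example, not something the import produces by default, and SRID 4326 (WGS 84) is assumed for both tables. The GiST index is what usually keeps such queries reasonably fast, because ST_Within first compares bounding boxes through the index before doing the exact geometry test.

```sql
-- Hypothetical schema: names and columns are assumptions for illustration only.
CREATE TABLE boundaries (
  id   serial PRIMARY KEY,
  name text,
  geom geometry(MultiPolygon, 4326)  -- e.g. countries or regions from a shapefile
);

CREATE TABLE places (
  id   serial PRIMARY KEY,
  name text,
  geom geometry(Point, 4326)         -- e.g. cities and villages as points
);

-- Spatial index on the polygons, used by ST_Within for its bounding box pre-filter.
CREATE INDEX boundaries_geom_idx ON boundaries USING GIST (geom);

-- For every place, list the boundaries that contain it.
SELECT p.name AS place, b.name AS contained_in
FROM places p
JOIN boundaries b ON ST_Within(p.geom, b.geom);
```

How fast this is in practice depends on the number and complexity of the polygons, so it is worth measuring on your own data.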
You can use this website here to visualise your WKT (Well-Known Text) literals.