About OpenGeo
OpenGeo provides commercial open source software for internet mapping and geospatial
application development. We are a social enterprise dedicated to the growth and support of
open source software.
License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States
License. Feel free to use this material, but we ask that you please retain the OpenGeo
branding, logos and style.
Note
The PostGIS database has been installed with unrestricted access for local users (users
connecting from the same machine as the database is running). That means that it will
accept any password you provide. If you need to connect from a remote computer, the
password for the postgres user has been set to postgres.
1. First, we need to start up the Suite (which will start both PostGIS and GeoServer). Click
the green Start button at the top right corner of the Dashboard.
2. The first time the Suite starts, it initializes a data area and sets up template databases.
This can take a couple minutes. Once the Suite has started, you can click the Manage
option under the PostGIS component to start the pgAdmin utility.
1 of 11 07/02/2011 10:35
OpenGeo : Introduction to an Open Source Geostack : PostGIS http://workshops.opengeo.org/stack-intro/postgis.html
3. If this is the first time you have run pgAdmin, you should have a server entry for PostGIS
(localhost:54321) already configured in pgAdmin. Double click the entry, and enter
anything you like at the password prompt to connect to the database.
Note
If you have a previous installation of PgAdmin on your computer, you will not have an
entry for (localhost:54321). You will need to create a new connection. Go to File >
Add Server, and register a new server at localhost and port 54321 (note the
non-standard port number) in order to connect to the PostGIS bundled with the
OpenGeo Suite.
Creating a Database
PostgreSQL has the notion of a template database that can be used to initialize a new
database – the new database automatically gets a copy of everything from the template. When
you installed PostGIS, a spatially enabled database called template_postgis was created.
If we use template_postgis as a template when creating our new database, the new
database will be spatially enabled.
1. Open the Databases tree item and have a look at the available databases. The
postgres database is the user database for the default postgres user and is not too
interesting to us. The template_postgis database is what we are going to use to
create spatial databases.
3. Fill in the New Database form as shown below and click OK.
   Name:      postgis
   Owner:     postgres
   Encoding:  UTF8
   Template:  template_postgis
4. Select the new postgis database and open it up to display the tree of objects. You’ll
see the public schema, and under that a couple of PostGIS-specific metadata tables –
geometry_columns and spatial_ref_sys – which we will discuss later.
5. Click on the SQL query button indicated below (or go to Tools > Query Tool).
6. Enter the following query into the query text field:
SELECT postgis_full_version();
7. Click the Play button in the toolbar (or press F5) to “Execute the query”. The query will
return a version string describing the installed PostGIS build, confirming that PostGIS is
properly enabled in the database.
8. You have successfully created a PostGIS spatial database! Now run a spatial calculation
just to make sure. Copy the following into the SQL window:
Our first spatial query constructs a diagonal line across a one-unit square. The length of
that line is sqrt(2), or 1.4142.
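The query text itself is missing from this copy. A minimal sketch consistent with the description, assuming the diagonal runs from (0 0) to (1 1):

```sql
-- Length of the diagonal of a one-unit square.
-- The endpoints (0 0) and (1 1) are an assumption chosen to match the description.
SELECT ST_Length(ST_GeomFromText('LINESTRING(0 0, 1 1)'));
```

No SRID is given here because the calculation is purely planar and unitless.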
We will load our example data into PostGIS using the pgShapeLoader tool, which converts
shapefiles into PostGIS tables.
1. From the PgAdmin Plugins menu, select PostGIS Shapefile and DBF loader.
The loader will start with the connection information for your current pgAdmin database.
Click the “Test connection...” button to ensure you can connect to the database.
2. Now, click on the button in the “Shape File” area, and browse to the data directory.
Select the “school_pt.shp” file, and click “Open”.
5. Repeat the process for “road_ln.shp” and “taxlot_ply.shp”. These are much larger files. To
make the load process go faster, open the “Options...” dialogue and turn on the “Load using
COPY rather than INSERT” option before running the import.
PostGIS ships with a command-line utility for loading shape files into the database, called
shp2pgsql, as well as a utility for exporting tables to shape files, called pgsql2shp.
If you completed the process with PostGIS Shapefile and DBF loader above, you do not
need to run these commands – the data is already loaded into your database.
Enter the workshop data directory, set the PATH environment variable to include the
PostgreSQL executables directory, and then run the data loading commands. shp2pgsql
converts the shape file into a SQL text file suitable for loading into the database. psql loads
the text file into the target database.
SPATIAL_REF_SYS
The SPATIAL_REF_SYS table contains information about “spatial reference systems”:
combinations of geographic systems (ellipsoids, datums) and projected systems (projections,
parameters) that are used for real-world mapping. “Transverse Mercator” is an example of a
projection, and WGS84 is an example of a spheroid, but “UTM Zone 10 North, NAD 83” is an
example of a full spatial reference system.
       Table "public.spatial_ref_sys"
  Column   |          Type           | Modifiers
-----------+-------------------------+-----------
 srid      | integer                 | not null
 auth_name | character varying(256)  |
 auth_srid | integer                 |
 srtext    | character varying(2048) |
 proj4text | character varying(2048) |
Indexes:
    "spatial_ref_sys_pkey" PRIMARY KEY, btree (srid)
Each row in the SPATIAL_REF_SYS table corresponds to one spatial reference system. The
srid column is the unique identifier, and is considered “internal” to the database. The
auth_name and auth_srid are the external authority and authority number. The authority is
usually “EPSG” and the table that ships with PostGIS matches the srid to the auth_srid for
convenience.
The srtext is the OGC “well-known text” representation of the spatial reference system. The
proj4text is the representation consumed by the Proj.4 reprojection library PostGIS uses to
provide on-the-fly reprojection. Because only the proj4text is used internally by PostGIS, it
is usually safe to omit the srtext when adding new entries, but be aware that external
programs may use the srtext to determine the projection of a particular table.
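To see what one of these rows looks like, a query along the following lines could be used. It is only a sketch, and it assumes SRID 2270, which this workshop uses for its data tables:

```sql
-- Inspect a single spatial reference system entry.
-- 2270 is the SRID of the workshop data tables; substitute any SRID of interest.
SELECT srid, auth_name, auth_srid, proj4text
FROM spatial_ref_sys
WHERE srid = 2270;
```

The srtext column is left out only to keep the output narrow.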
GEOMETRY_COLUMNS
The GEOMETRY_COLUMNS table contains information about the spatial columns in a database.
            Table "public.geometry_columns"
      Column       |          Type          | Modifiers
-------------------+------------------------+-----------
 f_table_catalog   | character varying(256) | not null
 f_table_schema    | character varying(256) | not null
 f_table_name      | character varying(256) | not null
 f_geometry_column | character varying(256) | not null
 coord_dimension   | integer                | not null
 srid              | integer                | not null
 type              | character varying(30)  | not null
Each row in the table corresponds to one spatial column. Tables may have multiple spatial
columns. Client software such as QGIS and uDig often uses the GEOMETRY_COLUMNS table to
figure out which columns to display to the end user as “layers” suitable for viewing on a map.
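A client could discover the available layers with a query roughly like this one. It is a sketch that uses only the columns listed in the table definition above:

```sql
-- List every registered spatial column, one row per column.
SELECT f_table_schema, f_table_name, f_geometry_column, coord_dimension, srid, type
FROM geometry_columns
ORDER BY f_table_schema, f_table_name;
```

If the workshop shapefiles have been loaded and registered, the school_pt, road_ln and taxlot_ply tables would be expected to appear in the result.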
Note that the GEOMETRY_COLUMNS table is not automatically updated as you create and drop
tables. You must manually keep it up to date.
One way to keep the table up-to-date is to religiously use the AddGeometryColumn()
function when managing DDL in spatial tables. This function takes in all the information
necessary to create a new column, performs the creation, and adds a metadata record:
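SELECT AddGeometryColumn(
  'public',       -- schema name
  'mytable',      -- table name
  'mygeocolumn',  -- name of the new geometry column
  4326,           -- SRID
  'POLYGON',      -- geometry type
  2               -- coordinate dimension
);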
Another way to keep the table up-to-date is to use helper functions. PostGIS 1.4 and higher
provide the Populate_Geometry_Columns() function, which checks for validity and also
fills in missing entries.
-- PostGIS 1.4
SELECT Populate_Geometry_Columns();
populate_geometry_columns
-------------------------------------------
probed:3 inserted:3 conflicts:0 deleted:0
(1 row)
Spatial Queries
We will now construct some queries of our spatial database, using “spatial SQL” functions
provided by PostGIS (and any other SFSQL spatial database). For a reference list of functions
we will be using, see the PostGIS Functions section.
Measuring
The taxlot_ply table contains 91,343 parcel polygons. It also includes a large number of
attributes about each parcel, including:
We can use the ST_Area() function in combination with these attributes to ask some
questions of the taxlot_ply table. Open the pgAdmin SQL window and enter the following
queries into the database.
Answer: 1772888
Answer: 27176
Answer: 0.41
What is the value per square foot of all parcels held by out-of-state owners?
Answer: 0.38
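The queries behind these answers are missing from this copy. The sketch below only illustrates how ST_Area() can be combined with attribute columns, and it is not claimed to reproduce any of the answers above. It assumes the landvalue and impvalue columns that appear in the spatial join query later in this section, the the_geom geometry column, and data measured in feet:

```sql
-- Total assessed value divided by total parcel area.
-- Assumes landvalue, impvalue and the_geom exist on taxlot_ply and that
-- the coordinate units are feet, so ST_Area() returns square feet.
SELECT
  Sum(landvalue + impvalue) / Sum(ST_Area(the_geom)) AS value_per_sqft
FROM taxlot_ply;
```

Restricting the calculation to a subset of owners would mean adding a WHERE clause on whichever attribute column identifies them.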
Measurement is not limited to areas. We can also use linear measurements to characterize the
roads in the county.
SELECT
  Sum(ST_Length(the_geom)) / 5280 AS miles,
  Count(*) AS nsegments,
  cfcc
FROM road_ln
GROUP BY cfcc
ORDER BY cfcc;
Sub-setting
So far, our queries have calculated one metric or a summary against every record in the
database. Databases are commonly used to store very large tables – larger than can be stored
in memory – and efficiently access sub-sets of those tables.
First, let’s find out the coordinates of the first school in our school_pt table:
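The query for this step is missing from this copy. A minimal sketch, assuming the school_pt table has the name and the_geom columns used elsewhere in this section:

```sql
-- Text representation of one school point.
-- Without an ORDER BY, "first" simply means whichever row the database returns first.
SELECT name, ST_AsText(the_geom)
FROM school_pt
LIMIT 1;
```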
Now, let’s take that point, and find the average property value in a one-mile (5280 foot) radius.
Answer: 161,094
The ST_GeomFromText() function is used to build a geometry object from the text
representation of a point. Note that the SRID is also set to 2270 at the same time, to
match the SRID of our data tables.
The ST_DWithin() function is then used to test every geometry against the query
point, and return true only if the geometry was within 5280 units (feet).
Finally, only those records that passed the distance test were fed into the calculation of
the average property value: total value divided by number of properties.
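Put together, the query described above might look like the following sketch. The point coordinates are placeholders rather than real workshop values, and the landvalue and impvalue columns are assumed from the spatial join query later in this section:

```sql
-- Average property value within one mile (5280 feet) of a point.
-- POINT(4250000 250000) is a placeholder: substitute the coordinates
-- returned by ST_AsText() for the school of interest.
SELECT
  Sum(landvalue + impvalue) / Count(*) AS avg_property_value
FROM taxlot_ply
WHERE ST_DWithin(
  the_geom,
  ST_GeomFromText('POINT(4250000 250000)', 2270),
  5280
);
```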
Spatial Indexes
The PostGIS spatial index is an r-tree index, implemented on top of PostgreSQL’s GiST access
method infrastructure.
An “r-tree” (and any other spatial index) works by sorting the bounding boxes of features into a
quickly searchable tree. Because the features themselves are not indexed, just the bounding
boxes, all queries that use spatial indexes must proceed in two phases. First, the spatial index
is used to generate a subset of records that might match a spatial condition; then, an exact test
is used on just that subset to produce the final output set.
The “r-tree” index uses nested rectangles (in the two-dimensional case; cubes and hypercubes
are used for higher dimensions) to sort the features into a quickly searchable tree.
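The two phases can also be written out explicitly. In this sketch the && operator performs the bounding-box test and ST_Distance() performs the exact test. The point coordinates are placeholders, and spelling the phases out this way is an illustration rather than how the workshop queries are written:

```sql
SELECT Count(*)
FROM taxlot_ply
WHERE
  -- Phase 1: bounding-box overlap test, which can use the spatial index.
  the_geom && ST_Expand(ST_GeomFromText('POINT(4250000 250000)', 2270), 5280)
  -- Phase 2: exact distance test on the rows that pass the bounding-box test.
  AND ST_Distance(the_geom, ST_GeomFromText('POINT(4250000 250000)', 2270)) < 5280;
```

ST_Expand() grows the point's bounding box by 5280 units in every direction, so the first condition selects every parcel whose bounding box could possibly fall within range.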
2. Run the average property query, and see how fast it executes:
3. Now, add the spatial indexes back onto your tables, and run the query again.
The unindexed query logs an execution time of over 1000ms, while with the indexes, a time of
less than 50ms is achieved.
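The statements for removing and restoring the indexes are missing from this copy. A sketch of what they could look like follows. The index name is an assumption, so check the actual name in the pgAdmin object tree before running it:

```sql
-- Remove the spatial index so the query falls back to a sequential scan.
-- taxlot_ply_the_geom_gist is an assumed name, not confirmed by the workshop.
DROP INDEX IF EXISTS taxlot_ply_the_geom_gist;

-- ... run the average property value query here and note the execution time ...

-- Recreate the spatial index using the GiST access method.
CREATE INDEX taxlot_ply_the_geom_gist ON taxlot_ply USING GIST (the_geom);

-- Refresh planner statistics for the table.
ANALYZE taxlot_ply;
```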
Spatial Joins
With spatial indexes in place, we can perform spatial joins quickly, taking information from two
tables and combining it based on the spatial relationship between their geometries.
Our last query determined the average property value within a one-mile radius of a single
school. We can use a spatial join to determine the property value within a one-mile radius
for all schools. Or, to keep the result set smaller, just the high schools.
SELECT
  s.name AS school_name,
  Sum(t.landvalue + t.impvalue) / Count(*) AS avg_property_value
FROM taxlot_ply t, school_pt s
WHERE
  ST_DWithin(t.the_geom, s.the_geom, 5280)
  AND s.type = 'High School'
GROUP BY s.name
ORDER BY avg_property_value DESC;
Conclusion
These have been just a few examples of using spatial SQL to query a database. In the
remaining sections of the workshop, most of the querying will happen behind the scenes, as
tools like GeoServer pull data from the database.
However, the analytical and querying power of the spatial database remains easily available
through scripting languages and direct user tools like pgAdmin, whenever you need to quickly
analyze data or automate geospatial tasks.