
Susan Williams

GEOG 586
November 2013

Project 4: Point Pattern Analysis


Introduction
While the previous lesson discussed the importance of understanding the underlying process in order to interpret results accurately, this lesson delved into the effects that varying parameters can have on a point pattern analysis. Beyond that, certain conditions must be met before such an analysis should even be attempted. For example, each point should represent a single object, and points should mark the true locations of events rather than area centroids or some other representative location. Furthermore, the events should be mapped on a plane using a low-distortion projection so that distances between points are preserved, and the study area extent should be chosen carefully so that it does not unduly influence the results of the analysis. Finally, the points should be a census that includes all relevant entities rather than a sample, although this is often difficult to achieve in practice (O'Sullivan and Unwin, 2010).
To explore point pattern analysis, we used 1982 St. Louis crime event data to perform a series of analyses investigating density and clustering.
Kernel Density Analysis
Kernel Density Estimation (KDE) is a density-based analysis method that transforms a point pattern into a continuous field layer, smoothing and distributing each event's contribution across the study region. Some experts consider KDE one of the most useful transformations in GIS because of its ability to reveal local density hot spots and to link point events to other geographic data (ibid).
In the first exercise, we used the R statistical software package to generate and plot point patterns and density maps from the provided St. Louis crime data. I experimented with altering the kernel bandwidth parameter and found that it has a considerable effect on the results of the analysis. I began with small bandwidths of 0.1 (see Figure 1) and 0.25 (see Figure 2). These bandwidths were applied to the first set of crime data, the gun homicides, and a density map with contours was generated for each.
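Roughly, the R workflow looks something like the sketch below, assuming the spatstat package and hypothetical coordinate vectors x and y holding the gun homicide locations (the actual code supplied in the lesson may differ):

    library(spatstat)

    # Build a planar point pattern; here the window is simply the bounding
    # box of the events, not necessarily the study-area boundary used in
    # the lesson.
    gun_pp <- ppp(x, y, window = owin(range(x), range(y)))

    # Kernel density surfaces with two small bandwidths (sigma = 0.1 and
    # 0.25), each plotted with contour lines overlaid.
    plot(density(gun_pp, sigma = 0.1), main = "KDE, bandwidth 0.1")
    contour(density(gun_pp, sigma = 0.1), add = TRUE)

    plot(density(gun_pp, sigma = 0.25), main = "KDE, bandwidth 0.25")
    contour(density(gun_pp, sigma = 0.25), add = TRUE)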

Figure 1: Density map of 1982 gun homicides in St. Louis using a kernel density bandwidth of 0.1.

Figure 2: Density map of 1982 gun homicides in St. Louis using a kernel density bandwidth of 0.25.

Clearly, if the bandwidth is too small, the density estimates are too localized and provide little more information than the individual events themselves, as shown in the figures above.
I next experimented with large bandwidths of 0.75 (see Figure 3) and 1.5 (see Figure 4). These bandwidths were applied to the same gun homicide data, and a density map with contours was generated for each.

Figure 3: Density map of 1982 gun homicides in St. Louis using a kernel density bandwidth of 0.75.

Figure 4: Density map of 1982 gun homicides in St. Louis using a kernel density bandwidth of 1.5.

Clearly, if the bandwidth is too large, the pattern becomes over-smoothed and generalized across the study area, as shown in the figures above.
What we seek is the Goldilocks bandwidth that is neither too hot nor too cold but just right. The R software program has a command that provides a suggested optimal bandwidth for the kernel density estimate. In this instance, the program suggested sigma 0.5334841 (see Figure 5). This bandwidth was then applied to the gun homicide data and a density map with contours was generated (see Figure 6 below).
Figure 5: This output from the R software program displays the suggested optimal bandwidth for the gun homicide data in our first point pattern analysis.

Figure 6: Density map of 1982 gun homicides in St. Louis using the suggested bandwidth of 0.5334841.

This does appear to be a very good choice of bandwidth, as the results are fairly easy to interpret visually. The map is not overly homogeneous, clusters can be spotted, and yet no single event is exaggerated into a large dense cluster. Although there is no hard-and-fast rule for selecting the best bandwidth, it is clear that this is a variable that should be carefully considered and ultimately chosen based on the goals of the analysis at hand. Much like choosing the proper type of data breaks in ArcMap, choosing a good density bandwidth will help reveal rather than obscure relevant patterns in the data.
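As a rough illustration of how such a suggestion can be obtained, spatstat offers cross-validation bandwidth selectors such as bw.diggle; whether this is the exact command used in the lesson is an assumption:

    # Sketch of automatic bandwidth selection, reusing the hypothetical
    # gun_pp pattern from the earlier sketch.
    sigma_hat <- bw.diggle(gun_pp)   # cross-validated bandwidth suggestion
    print(sigma_hat)

    # Density map with contours at the suggested bandwidth.
    plot(density(gun_pp, sigma = sigma_hat), main = "KDE, suggested bandwidth")
    contour(density(gun_pp, sigma = sigma_hat), add = TRUE)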

Distance-Based Analysis with Monte Carlo Assessment


A basic plot of the gun homicide data points looks like this:
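In R, a dot map like this can come from a single plot call on the point pattern object (a sketch, reusing the hypothetical gun_pp object from the earlier KDE sketch):

    # Simple dot map of the raw gun homicide events.
    plot(gun_pp, pch = 20, main = "Gun homicides, St. Louis 1982")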

The question is: although it looks like there may be clusters, we know that humans tend to see patterns even when none really exist. So, is this a random pattern, or is it true clustering?
This is where point pattern analysis comes to the rescue.

Figure 7: Point pattern of gun homicides in St. Louis, 1982.
After running a Monte Carlo assessment with 99 simulations on the gun homicide data (using the envelope function in the R software program), we end up with a graph that looks like this:

The black line is the actual pattern of gun homicides, and we can clearly see that it lies outside the gray zone for the majority of the range of distances represented on the graph. This tells us that the pattern is indeed clustered at those distances and it's not just our imagination.
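For reference, the simulation envelope behind this kind of graph can be generated along the following lines (a sketch, assuming spatstat and the hypothetical gun_pp pattern defined earlier):

    # Monte Carlo assessment: 99 simulated CSR patterns compared against
    # the observed pattern via the G function (nearest-neighbour distances).
    gun_env <- envelope(gun_pp, fun = Gest, nsim = 99)
    plot(gun_env, main = "G-function envelope, gun homicides")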

However, one cannot conclude that, just because the gun homicides are clustered, all of the other crime data sets are also clustered. Each data set should be examined individually.
With that in mind, I ran 99 simulations on the robbery data and ended up with this graph:

In this instance, the black line (the actual pattern of robberies) lies within the gray zone for the majority of the distances represented on the graph. In other words, the actual pattern stays close to the reference line of a pattern generated by IRP/CSR (the red line). This is nearly the opposite of the gun homicide graph shown above, so what does it mean? It means that we cannot reject the null hypothesis in this case, at least for those distances that fall within the gray envelope.
A basic plot of the street robbery data points looks like this:

One could easily think that there is obvious clustering in this pattern, so it is easy to see how point pattern analysis can be useful in helping identify true patterns rather than relying on visual examination alone. It's also a good thing we didn't make any assumptions based on the gun homicide data.

Figure 8: Point pattern of street robbery events in St. Louis in 1982.

So let's have a quick peek at the last set of crime data, the hit-and-runs. This is another instance where the actual pattern of crime lies within the gray zone, although, interestingly, it is below the reference line of a pattern generated by IRP/CSR. Again, we cannot reject the null hypothesis in this case for hit-and-runs.

Out of curiosity, let's peek at the plotted point pattern for the hit-and-run data:

It does appear to be as random as one would expect based on what the previous function graph showed us. Whew!
Granted, all three data sets were analyzed using only the G function, which concerns each event's nearest-neighbor distance. There are other analyses we could run to get a fuller picture (one example is sketched after Figure 9). The main point, however, is that we cannot trust our eyeballs to make reliable decisions; we should use statistical analyses instead.

Figure 9: Point pattern of hit-and-run events in St. Louis in 1982.
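As an example of those other analyses, the same envelope approach could be repeated with Ripley's K function, which considers all inter-event distances rather than just nearest neighbors (a sketch, assuming spatstat and a hypothetical hitrun_pp point pattern object for the hit-and-run events):

    # Envelope test using the K function instead of G.
    hitrun_env <- envelope(hitrun_pp, fun = Kest, nsim = 99)
    plot(hitrun_env, main = "K-function envelope, hit-and-runs")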

Conclusion
This project was a good exercise in analyzing point patterns with spatial analysis techniques, specifically distance- and density-based measurement methods, as applied to crime event point data from St. Louis in 1982. The purpose of these analyses is to identify patterns related to first- and second-order effects, which allows us to reject (or fail to reject) the null hypothesis that the data points are the result of a purely random process. Although we cannot determine exactly what process or processes are causing the 1982 St. Louis crime patterns to cluster, we can conclude whether events are occurring by chance or are indeed clustering in certain locations within the study area.

References

O'Sullivan, David. (2014). GEOG 586: Geographic Information Analysis, Lesson 4: Point Pattern Analysis. The Pennsylvania State University World Campus. Accessed November 2013 at https://www.e-education.psu.edu/geog586/l5.html

O'Sullivan, D., & Unwin, D. J. (2010). Geographic Information Analysis (2nd ed.). Hoboken, New Jersey: John Wiley & Sons, Inc.
