On August 31, 1854 – after several
outbreaks had already occurred
elsewhere in the city – a major
outbreak of cholera struck Soho.
Over the next ten days, more than 500
people on or near Broad Street died.
Dr. John Snow wanted to prove his
hypothesis that the cause of the
disease was contaminated water
sources. He created a map, plotting
related deaths and water pumps to
illustrate how cases of cholera were
centered around the Broad Street
water pump.
Measuring Geographic
Distributions
•Mean Center
•Directional Distribution
Analyzing Patterns
•Average Nearest Neighbor
•Spatial Autocorrelation
•High/Low Clustering tool
Mapping Clusters
•Cluster and Outlier Analysis
•Hot Spot Analysis
The first step in our analysis is to
determine the center of our
cholera deaths. This will be a clue
as to the location of the
contaminated water source.
We will weight the features so
the mean center is more a measure
of concentration than a measure of
purely geographic distribution. In
this case, we want to use the
number of deaths at each point as
the weight.
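The weighting step can be sketched in a few lines of Python. This is only an illustration of the arithmetic behind a weighted mean center, not the tool's own code; the function name, the coordinates, and the death counts are made up for the example.

```python
def weighted_mean_center(points, weights):
    """Return the (x, y) mean center of points, weighted by weights."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Each coordinate is averaged with the weight as its multiplier.
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return x, y


# Hypothetical locations and death counts, for illustration only.
locations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
deaths = [1, 1, 8]
center = weighted_mean_center(locations, deaths)
print(center)  # (1.0, 8.0)
```

With equal weights the center would sit at roughly (3.3, 3.3); weighting by deaths pulls it toward the location with eight deaths.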
We will also create a standard
distance circle: a circle centered on
the mean center, with a radius equal
to one standard deviation of the
feature distances from that center.
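As a rough sketch of what the circle's radius measures, the weighted standard distance can be computed as below. The function name and sample points are assumptions for illustration, and the formula shown is the common weighted form (weighted squared deviations from the weighted mean center); a GIS package may apply additional options.

```python
import math


def weighted_standard_distance(points, weights):
    """Return ((cx, cy), radius) for a weighted standard distance circle."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Weighted mean center, used as the center of the circle.
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    # Weighted mean of squared distances from the center.
    variance = sum(
        w * ((x - cx) ** 2 + (y - cy) ** 2)
        for (x, y), w in zip(points, weights)
    ) / total
    return (cx, cy), math.sqrt(variance)


# Hypothetical points on the corners of a square, equal weights.
corners = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
circle_center, radius = weighted_standard_distance(corners, [1, 1, 1, 1])
```

A small radius relative to the study area suggests the deaths are concentrated near the mean center; a large one suggests they are spread out.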
What the mean center doesn’t
tell us is whether the data is
concentrated or dispersed or
whether it has a directional trend.
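We use the Average Nearest Neighbor tool to
calculate the average distance between
each feature and its nearest neighbor, and to compare it
with the distance expected for a random pattern in the same area.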
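A minimal sketch of the nearest neighbor calculation follows, assuming the usual formulation: the observed mean nearest-neighbor distance is compared with the mean distance expected for a random pattern of the same number of points in the same area. The function name, the sample points, and the study area value are illustrative assumptions.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (observed, expected, ratio, z) for a point pattern."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    # Distance from each point to its closest other point.
    nearest = [
        min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        for i, (xi, yi) in enumerate(points)
    ]
    observed = sum(nearest) / n
    # Expected mean distance for a random pattern with density n / area.
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    ratio = observed / expected
    z = (observed - expected) / std_error
    return observed, expected, ratio, z


# Hypothetical tight cluster inside a study area of 100 square units.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
observed, expected, ratio, z = average_nearest_neighbor(cluster, 100.0)
```

A ratio below 1 indicates clustering and a ratio above 1 indicates dispersion. Because the expected distance depends on the area, the result is sensitive to how the study area is defined.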
Looking for ‘hot’ and ‘cool’ spots in the data will help determine where there is a high
concentration of cholera-related deaths.
In other words, we want to look for clusters of features with high values and clusters of features
with low values.
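One common way to score hot and cool spots is the Getis-Ord Gi* statistic, which is the statistic behind ArcGIS hot spot analysis. The sketch below assumes a simple fixed distance band with binary weights (each feature counts itself and every feature within the band); the function name, the band, and the sample values are illustrative assumptions rather than the settings used for the maps here.

```python
import math


def getis_ord_gi_star(points, values, band):
    """Return a Gi* z-score per point using binary fixed-band weights."""
    n = len(points)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for xi, yi in points:
        # Weight 1 for every feature within the band, including itself.
        w = [
            1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
            for xj, yj in points
        ]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # A zero denominator means no variation to test; report 0.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores


# Hypothetical features along a street; high counts at the left end.
street = [(float(i), 0.0) for i in range(10)]
counts = [9, 8, 9, 1, 1, 1, 1, 1, 1, 1]
scores = getis_ord_gi_star(street, counts, band=1.5)
```

Large positive scores mark features surrounded by high values (hot spots), and large negative scores mark features surrounded by low values (cool spots).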
Legend: < -2.0 | -2.0 to -1.0 | -1.0 to 1.0 | 1.0 to 2.0 | > 2.0
Outlier
High Value Clusters
Legend: < -2.0 | -2.0 to -1.0 | -1.0 to 1.0 | 1.0 to 2.0 | > 2.0
Statistically speaking, a value above 2.0 or below -2.0 means a confidence of greater than 95% that the cluster of high (or low)
values is not a random occurrence.
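The link between the legend breaks and the confidence level can be checked with a short calculation. This sketch assumes the mapped values are z-scores from a standard normal distribution and uses a two-sided test; the function name is made up for the example.

```python
import math


def two_sided_confidence(z):
    """Return the two-sided confidence level for a standard normal z-score."""
    # Probability that a standard normal value falls within -|z| and +|z|.
    return math.erf(abs(z) / math.sqrt(2))


confidence_at_2 = two_sided_confidence(2.0)
confidence_at_1 = two_sided_confidence(1.0)
```

Under that assumption a score of 2.0 corresponds to a confidence a little above 95%, while a score of 1.0 corresponds to only about 68%, which is why the middle legend classes are not treated as significant.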
Chart: Deaths Per Water Pump