
GEOSTATISTICS

REPORT
St. Catharines Fairview Neighbourhood Tree Canopy
Spread




Prepared By: Brian Lee & Olawale Babalola
Prepared For: Ian D. Smith
Date: March 21, 2014
GISC9308D4

380 Vine Street St. Catharines Ontario L2M 4T6
905-808-7710 its.brianlee@gmail.com


March 21, 2014
GISC9308D4b


Ian D. Smith, M.Sc, OLS, OLIP
Professor, GIS GM
Niagara College Canada
135 Taylor Road, Niagara-on-the-Lake, ON
L0S 1J0


Dear Mr. Smith,


Re: GISC9308 Deliverable 4b Geostatistics Report
Please accept this letter as the formal submission of Deliverable 4b: Geostatistics Report for GISC9308 Spatial
Statistics by Brian Lee and Olawale Babalola.

The goal of this deliverable was to collect geospatial data and report on the findings. The team chose to
examine the tree canopy spread of 100 trees in the Fairview neighbourhood of St. Catharines. The data was
collected in Google Earth, which allowed for accurate measurement of tree canopy spread. Values at locations not
directly measured or observed were then predicted using the inverse distance weighted (IDW) and kriging
interpolation techniques. The data was explored comprehensively in order to gain a better understanding of it
both spatially and statistically. This exploration revealed that some measurements had been recorded
inaccurately, and these were removed from the dataset so that the results would better reflect reality. The two
interpolation techniques produced similar results. The predictions are not entirely accurate, as it is difficult
to fit a natural phenomenon such as tree canopy spread to a specific model, and to specify model parameters that
represent the data truthfully.
Should you require further information or have any questions regarding the enclosed documents, please do not
hesitate to contact me at its.brianlee@gmail.com or (905) 808-7710 at your convenience. I look forward to your
opinion and recommendations regarding my submission.

Regards,


Brian Lee PG (GIS)
Project Manager
BL/

Enclosures: [1.) Geostatistics Report]

cc: O. Babalola, B.Sc., PG (GIS)
Fairview Neighbourhood Tree Canopy Cover, St. Catharines
Geostatistics Report Geospatial Coverage Prediction

21 March, 2014
Abstract

This study is a geostatistical analysis of tree canopy spread in the Fairview neighbourhood of St.
Catharines. The initial statistical analysis consisted of data collection and an illustrative
representation of the derived data values. The tree canopy cover was subsequently ground truthed in the
study area to verify the measurements observed and recorded during the study. To support a thorough and
accurate geostatistical analysis, all of the data was mapped and systematically explored. This allowed
the Lee Inc. team to examine and understand the dataset as a whole, more methodically than in the
preliminary geostatistical analysis. Mapping the data showed that the Northwest quadrant of the study
area was more heavily sampled than the rest of the study area, which influenced the results of all the
interpolation techniques undertaken in this deliverable. Further investigation of the spatial data
revealed outliers and anomalous tree canopy spread values. These values were found to have been measured
incorrectly and were removed from the dataset. Exploring the spatial data therefore provided the
opportunity to make corrections before any interpolation was undertaken.
The work showed how important accurate and well-distributed data collection is to the prediction of
values in areas not directly observed. Although the tree canopy spread was measured accurately with the
Google Earth measuring tool, the sample locations were not well distributed, and it was therefore
difficult to obtain predictions that accurately represent reality.
Both methods used to predict tree canopy spread in the Fairview neighbourhood study area produced very
similar results. The IDW interpolation technique was a relatively simple process requiring few
decisions, yet its values are similar to those of the kriging method, which required a more
comprehensive understanding of the parameters involved. Because the phenomenon being studied is natural,
the data cannot necessarily be fit to a specified model. It would therefore be difficult to predict tree
canopy spread values accurately in future studies without first understanding the data before initiating
collection and implementing an enhanced planning process.




Contents

Abstract
1. Introduction
1.1. Background
1.2. Study Area
1.3. Project Overview
1.4. Project Goal
2. Methodology
2.1. Initial Data Collection Methodology
3. Analysis and Examination of Spatial Data
3.1. Mapping the Data
3.2. Exploratory Data Spatial Analysis
3.2.1. Histogram Analysis
3.2.2. QQ Plot Analysis
3.2.3. Spatial Autocorrelation Analysis
3.2.4. Covariance Cloud Analysis
3.2.5. Trend Analysis
4. Interpolation Technique Analysis
4.1. Inverse Distance Weighted Examination
4.1.1. IDW Procedure
4.1.2. Power
4.1.3. Search Neighbourhood
4.1.4. IDW - Table of Parameters
4.1.5. Cross Validation of Inverse Distance Weighted Technique
4.1.6. Inverse Distance Weighted Results
4.2. Kriging Examination
5. Evaluation of Interpolation Methods
6. Recommendations and Conclusion
References


List of Figures

Figure 1: Study Area
Figure 2: Fairview Neighbourhood Location
Figure 3: Data Collection Process
Figure 4: Ground Truthing Collected Data
Figure 5: Tree Canopy Spread - Fairview Neighbourhood
Figure 6: Study Area
Figure 7: Tree Canopy Spread Points Full Extent
Figure 8: Tree Canopy Spread Size
Figure 9: Histogram Analysis
Figure 10: Histogram Transformation Analysis
Figure 11: Anomalous Tree Canopy Spread
Figure 12: Tree Canopy Spread - QQ Plot Analysis
Figure 13: Validation of Anomalous Tree Canopy Spread
Figure 14: Semivariogram Analysis
Figure 15: Semivariogram Validation of Uncorrelated Values
Figure 16: Covariance Cloud Analysis
Figure 17: Tree Canopy Spread Trend Analysis
Figure 18: IDW Cross Validation
Figure 19: Cross Validation Prediction Errors
Figure 20: Tree Canopy Predicted Values - IDW
Figure 21: Semivariogram Kriging Method
Figure 22: Kriging Search Neighbourhood
Figure 23: Kriging Cross-Validation
Figure 24: Resultant Kriging Map
Figure 25: Kriging Interpolation Technique
Figure 26: IDW Interpolation Technique

List of Tables

Table 1: Advantages and Disadvantages of Google Earth for Data Collection
Table 2: IDW Parameters

List of Appendices

Appendix 1 Table of locations and measurements for the tree canopy spread observed and collected

1. Introduction

1.1. Background

This report is a follow-up to the tree canopy cover study previously conducted by Lee Inc. in the
Fairview neighbourhood of St. Catharines. It is the second part of the geostatistical analysis of the
tree canopy spread observed and recorded on January 29, 2014, and verified by ground truthing the
collected data in the field on February 2nd, 2014.
As stated in the Geospatial Coverage Predictions report, the geospatial analysis of tree canopy was
considered important because the study could identify areas that would benefit from increased tree
canopy, and could provide an understanding of the relationships between tree canopy cover and the
economic and social benefits attributed to it. The phenomenon of interest to Lee Inc., the environment
and specifically the presence of trees in the natural world, was studied in order to examine the
ecological and economic benefits obtained from greater tree canopy cover, which include, but are not
limited to: enhanced property values, provision of wildlife habitats, reduced air pollution, and
decreased cooling and heating costs. The initial analysis also provided the opportunity to introduce
hypotheses related to tree canopy cover, such as: the increase of property values due to abundant
vegetation, energy savings due to the cover provided by extensive tree canopies, and positive results
from reintroducing species into urban forests where available forest has been lost to urban sprawl.
On reflection, the goal of the project was re-examined and re-established so that it relates directly
to the City of St. Catharines. The predicted values of tree canopy spread in the study area can serve as
a proof of concept that could potentially be extended to cover the entire City of St. Catharines. This
would allow the city to examine tree canopy spread as a whole and to plan for the costs associated with
implementing a tree maintenance inventory. Such an inventory would allow the city to plan, organize, and
undertake work in areas where trees may pose a risk of physical or property damage, and to plan and
implement programs that deliver the cost benefits attributed to a tree-based inventory. A
maintenance-based tree inventory would also reduce the potential liability from hazardous and
nonhazardous trees. Given the ice storm that affected Southern Ontario in December 2013 and the damage
it inflicted, it is timely to explore the concept of a tree inventory program.


Previous undertakings carried out for the tree canopy cover analysis included:
o The collection of the spread size (in metres) of 100 trees in the Fairview neighbourhood of St.
Catharines, in a sample study area of a city block bounded by Vine Street to the East, Geneva Street
to the West, Scott Street to the North, and Carlton Street to the South,
o A statistical analysis of the collected data and illustrative representation of the data values
derived, and
o Ground truthing of tree canopy cover in the study area to verify the measurements observed and
recorded.
The aim of the following geostatistics report undertaken by Lee Inc. is to:
o Spatially explore the data to gain a better understanding of it, and apply the knowledge gained to
the use of interpolation techniques,
o Predict tree canopy values between the samples observed and recorded in the study area, using both
the Kriging and Inverse Distance Weighted techniques,
o Present the surface maps created, displaying predicted values of tree canopy cover in parts of the
study area not directly observed and measured because of the constrained means, limited resources, and
expense associated with a complete survey, and
o Evaluate and report on the resulting data values, which will allow for recommendations of better
data collection coverage in future studies.












1.2. Study Area

As established in the initial report, measurements for the Tree Canopy Spread Geostatistical
Analysis were taken in the Fairview neighbourhood in St. Catharines. The sample study area is
bounded by Scott and Carlton Streets to the North and South, and Vine and Geneva Streets to the
East and West, and covers approximately 0.880 sq. km. Figure 1 below details the study area:

Figure 1: Study Area

St. Catharines was originally chosen as the study area for two reasons. The first is its abundance
of vegetation: also known as The Garden City, St. Catharines is renowned for its lush parks,
gardens, and trails (About our city, n.d.). The second is that the Lee Inc. team could ground truth
the data collected in Google Earth by travelling to the study area and verifying the observed
results in the field.
The Fairview neighbourhood in St. Catharines was identified as an ideal location for the initial
data collection, as the area offers a mix of residential and commercial spaces and is close to the
Lee Inc. team. The proximity of the study site to Lee Inc.'s offices made it possible to confirm
the existence of the tree canopy cover in the study area, and to compare the results of the
interpolation techniques with reality. Figure 2 below displays the Fairview neighbourhood, located
in north central St. Catharines:

Figure 2: Fairview Neighbourhood Location
1.3. Project Overview

This project will deliver geospatial coverage predictions of tree canopy spread in the study area of the
Fairview neighbourhood in St. Catharines, bounded by Scott and Carlton Streets to the North and South,
and Vine and Geneva Streets to the East and West. Areas not directly observed and measured will be
predicted by creating continuous surfaces with both the IDW and kriging interpolation techniques, based
on the sample tree canopy spread points initially collected within the study area.
Surface models are utilized for a variety of purposes including interpolating between actual data
measurements, identifying data anomalies, and establishing confidence around predictions (Babish,
2006). As stated earlier, the purpose of creating surface models for this study of tree canopy cover is
to examine how reliable the interpolation techniques are, which will establish whether the study can be
extended to a larger coverage, particularly the City of St. Catharines as a whole. If the predictions
are found to represent tree canopy spread accurately, they provide the opportunity to establish a
cost-effective plan for implementing a tree-based inventory.
The geospatial prediction of tree canopy spread rests on a set of essential assumptions:
o The measurements taken are precise and reproducible
o The sample measurements are accurate and represent the true value at that location
o The samples are collected from a physically continuous, homogeneous population
o The values at unsampled locations are related to values at sampled locations (Babish, 2006)
Therefore, the data will be thoroughly explored spatially in order to develop a deeper understanding of
it, so that the geostatistical analysis rests on a strong understanding of the initially collected data
and of the methods used to transform the data toward a normal distribution.
Two interpolation techniques will be used to predict unknown tree canopy values in the Fairview
neighbourhood study area. The Inverse Distance Weighted (IDW) interpolation technique is used to perform
a straightforward and immediate geospatial coverage prediction of the tree canopy cover. Its relative
simplicity and speed allow the Lee Inc. team to analyze the resultant values, determine whether
abnormalities in the phenomenon exist, and explore the validity of the result.
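
As a rough illustration of the idea behind IDW, the sketch below predicts a value at an unsampled
location as a distance-weighted average of the sampled values. It is not the software procedure used in
this study: the function name, the power of 2, and the sample coordinates and spreads are assumptions
made for illustration only.

```python
import numpy as np

def idw_predict(xy_known, values, xy_query, power=2.0, eps=1e-12):
    """Inverse distance weighted prediction at one or more query points."""
    xy_known = np.asarray(xy_known, dtype=float)
    values = np.asarray(values, dtype=float)
    xy_query = np.atleast_2d(np.asarray(xy_query, dtype=float))
    # Pairwise distances: rows are query points, columns are samples.
    d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    # Weights fall off with distance**power; eps avoids division by zero
    # when a query point coincides with a sample.
    w = 1.0 / np.maximum(d, eps) ** power
    w /= w.sum(axis=1, keepdims=True)
    return w @ values

# Hypothetical Easting/Northing offsets (m) and canopy spreads (m).
samples = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
spread = np.array([4.0, 8.0, 6.0, 10.0])

# The centre is equidistant from all four samples, so the prediction
# is their plain average.
centre = idw_predict(samples, spread, [[5.0, 5.0]])
```

A larger power gives nearby samples more influence, which is why heavily sampled areas can dominate an
IDW surface.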
In addition to the IDW technique, the kriging interpolation technique will be used and subjected to an
equally thorough and detailed analysis. Kriging allows for more parameter manipulation and user input in
fitting the data to a specified model. It is anticipated that kriging will predict values that compare
closely to reality and can be used to explore the potential extent and scope of implementing a
tree-based inventory, as kriging "involves an interactive investigation of the spatial behavior of the
phenomenon represented by the z-values before you select the best estimation method for generating
the output surface" (Esri, 2011). Therefore, it is believed that kriging will represent tree canopy
spread more accurately than other interpolation techniques, because it requires a more thorough
exploration of the data.
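
To make the contrast with IDW concrete, the following is a minimal sketch of ordinary kriging at a
single location, in which the weights come from a semivariogram model rather than from distance alone.
The spherical model, the nugget, sill, and range values, and the sample data are all assumptions for
illustration; they are not the parameters fitted in this study.

```python
import numpy as np

def spherical(h, nugget, sill, rng):
    """Spherical semivariogram with total sill `sill`; gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    g = nugget + (sill - nugget) * (1.5 * h / rng - 0.5 * (h / rng) ** 3)
    g = np.where(h >= rng, sill, g)
    return np.where(h == 0.0, 0.0, g)

def ordinary_kriging(xy, z, x0, nugget, sill, rng):
    """Return (prediction, kriging variance) at location x0."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    # Kriging system: semivariances between samples, bordered by the
    # unbiasedness constraint (weights sum to 1) via a Lagrange multiplier.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical(d, nugget, sill, rng)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = spherical(np.linalg.norm(xy - x0, axis=1), nugget, sill, rng)
    sol = np.linalg.solve(A, b)
    weights, mu = sol[:n], sol[n]
    return weights @ z, weights @ b[:n] + mu

# Hypothetical Easting/Northing offsets (m) and canopy spreads (m).
samples = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
spread = np.array([4.0, 8.0, 6.0, 10.0])

# Assumed model parameters: nugget 1, sill 5, range 30 m.
pred, var = ordinary_kriging(samples, spread, [5.0, 5.0], 1.0, 5.0, 30.0)
```

Unlike IDW, the result includes a kriging variance, which is one reason the method needs more
exploration of the data before the model parameters can be chosen.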
1.4. Project Goal

The goal of this project is to predict geospatial coverage in the defined study area of the
Fairview neighbourhood in St. Catharines, report on the findings, evaluate the interpolation
techniques used, and conclude the geostatistical analysis with recommendations that would provide
better data collection coverage if the study area were resampled. Additionally, the effectiveness
and relative accuracy of the results will be assessed to establish whether predicting tree canopy
cover is a viable way to determine the scope, extent, and value of a tree inventory system, and
whether such a system would be worth initiating, based on the findings of the predictive surface
analysis.
2. Methodology

2.1. Initial Data Collection Methodology

The tree canopy spread in the Fairview neighbourhood of St. Catharines was initially measured in
Google Earth, which was used to identify the study area and to measure tree canopy spread with the
tools provided in the software. Specifically, the ruler tool was used, as it measures the distance
between two points on the ground. Google Earth also allowed the Easting and Northing of each
individual tree to be identified. The cursor was hovered above the approximate centre of the tree
canopy, and the displayed coordinates were recorded along with the measured length of the tree
canopy spread. The tree canopy spread was measured as the longest distance from visible tree branch
to tree branch. The advantages and disadvantages of using Google Earth are outlined in Table 1
below:




Table 1: Advantages and Disadvantages of Google Earth for Data Collection
Utilization of Google Earth for Tree Canopy Spread Data Collection
Advantages:
o Free
o Easy to use
o Ground truthing and verifying results was easily manageable
o Data collection set up was prompt
o More points can be added quickly if needed
Disadvantages:
o Clusters of tree canopy made it difficult to identify individual tree canopy spreads in some areas
o Shadows made it difficult to pinpoint the edge of the canopy spread accurately in some areas

By ground truthing the data, Lee Inc. sought to minimize any errors that could have occurred through
the use of an external source (Google Earth) for data collection. By entering the coordinates of a
subset of the collected points into a Garmin eTrex GPS and locating them in the field, the team was
able to verify that the data collected in Google Earth had been recorded correctly. Visiting and
examining the study area allowed the team to confirm the extent of the tree canopy spread and the
presence of the large number of trees observed in Google Earth. Figure 3 displays how Google Earth
was utilized to collect the data for tree canopy spread in the study area:

Figure 3: Data Collection Process
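
The ruler measurement and the ground truthing check described in this section both reduce to a distance
between two points with Easting and Northing coordinates. The sketch below shows that calculation. The
coordinates and the 10 m tolerance are hypothetical values chosen for illustration, not figures from the
study.

```python
import math

def ground_distance(p, q):
    """Planar distance in metres between two (easting, northing) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# Hypothetical canopy-edge coordinates (m): the spread is the distance
# between the two outermost visible branches.
edge_a = (643200.0, 4782100.0)
edge_b = (643206.0, 4782108.0)
spread_m = ground_distance(edge_a, edge_b)

# Hypothetical ground truthing check: compare the recorded tree location
# with the position read from a handheld GPS in the field.
recorded = (643203.0, 4782104.0)
gps_fix = (643206.0, 4782100.0)
offset_m = ground_distance(recorded, gps_fix)

TOLERANCE_M = 10.0  # assumed acceptance threshold, not from the study
within_tolerance = offset_m <= TOLERANCE_M
```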



The Lee Inc. team observed and recorded the tree canopy spread on January 29, 2014 and
subsequently verified the Easting and Northing locations in the field with a Garmin eTrex 20 GPS on
February 2nd, 2014. Figure 4 and Figure 5 below show the verification of coordinates, along with
images of tree canopy spread in the Fairview neighbourhood of St. Catharines:

Figure 4: Ground Truthing Collected Data
Figure 5: Tree Canopy Spread - Fairview Neighbourhood
3. Analysis and Examination of Spatial Data

3.1. Mapping the Data

Although the study area was initially examined in the Geospatial Coverage Predictions Technical
Memorandum, the data was not fully explored or analyzed at that stage, as only the sample point
locations were mapped, as presented below in Figure 6:

Figure 6: Study Area

Consequently, in order to undertake a thorough and accurate geostatistical analysis, all of the data must
be mapped and explored. By exploring the data, the Lee Inc. team is able to examine and understand
the dataset as a whole, and to undertake a more thorough and extensive analysis than was completed in
the preliminary geostatistical analysis.
Investigating and assessing the data provides the opportunity to look at the spatial components of the
dataset and give indications of outliers and erroneous data values, global trends, and the dominant
directions of spatial autocorrelation, among other factors (Esri, 2013). The Lee Inc. team is
able to develop an understanding of how the phenomenon is exhibited and can begin to determine
which parameters may need to be applied when undertaking both the Inverse Distance Weighted and
Kriging interpolation techniques, which will be used to predict geospatial coverage of tree canopy
cover in the study area.
As a result, the data as a whole was mapped and examined before any interpolation was undertaken. This
made it possible to analyze the tree canopy cover in the study area, review the spatial extent of the
coverage, and establish whether the data collection process would be an accurate indicator of tree
canopy spread in areas not directly measured.
Figure 7 displays the mapped data, the first step taken in understanding the tree canopy cover in the
Fairview neighbourhood in the City of St. Catharines:













Figure 7: Tree Canopy Spread Points Full Extent
The initial spatial view of the data shows that the upper Northwest quadrant of the study area was more
heavily sampled than the rest of the study area. This preferential sampling will require a method that
will, in effect, change the composition and structure of the tree canopy spread values, so that the
predicted values are not weighted too heavily toward the locations where heavier sampling took place
during the data collection process. Therefore, merely mapping the tree canopy values gives an idea of
which transformation parameters to employ in the interpolation techniques, and of what may produce a
better prediction of tree canopy spread values in the study area.
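
One common way to reduce the influence of a heavily sampled area is cell declustering, in which samples
sharing a grid cell also share that cell's weight. The sketch below illustrates the idea only; the report
does not state that this particular method was applied, and the function name, cell size, coordinates,
and spreads are hypothetical.

```python
import numpy as np

def cell_declustering_weights(xy, cell_size):
    """Weights that down-weight samples falling in crowded grid cells."""
    xy = np.asarray(xy, dtype=float)
    # Assign each sample to a square grid cell (column, row).
    cells = np.floor((xy - xy.min(axis=0)) / cell_size).astype(np.int64)
    # Collapse (column, row) into one integer key per cell.
    keys = cells[:, 0] * (cells[:, 1].max() + 1) + cells[:, 1]
    _, inverse, counts = np.unique(keys, return_inverse=True, return_counts=True)
    # Each occupied cell gets equal total weight, shared among its samples.
    weights = 1.0 / counts[inverse]
    # Rescale so the weights sum to the number of samples.
    return weights * len(xy) / weights.sum()

# Hypothetical data: three clustered trees in the northwest, two isolated trees.
xy = np.array([[5.0, 95.0], [10.0, 90.0], [15.0, 85.0], [80.0, 20.0], [20.0, 15.0]])
spread = np.array([12.0, 11.0, 13.0, 5.0, 6.0])

weights = cell_declustering_weights(xy, cell_size=50.0)
naive_mean = spread.mean()
declustered_mean = np.average(spread, weights=weights)
```

In this toy case the clustered large canopies pull the naive mean upward, and the declustered mean is
lower because the cluster counts as a single cell.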
The second step, after mapping the data, was to examine whether the tree canopy spread was affected by
any distinct characteristics, and in particular whether any spatial relationship could be attributed to
the tree canopy spread size in certain parts of the study area.
The following map, Figure 8, presents the tree canopy spread classified into 5 classes of tree
canopy spread width:










Figure 8: Tree Canopy Spread Size

From this map, one can see that larger tree canopy spreads appear in the Northwest quadrant, while
trees in the Southeast quadrant of the study area have smaller tree canopy spreads. This can be
attributed to several factors, including, but not limited to:
o The Northwest quadrant of the study area was preferentially sampled, which may explain the larger
tree canopy spreads recorded there, as larger and more varied values are more likely to be captured
in heavily sampled areas than in areas under-sampled during the data collection process,
o That part of the study area could have been constructed earlier; consequently, trees planted by the
city there would be older, and therefore larger, than younger trees planted later elsewhere in the
study area,
o When the data was initially collected, there was no planning or prior understanding of the
geostatistical analysis process, and therefore tree canopy spreads were measured at random.

3.2. Exploratory Data Spatial Analysis

Mapping and interpreting the data provided a better understanding of it. Although straightforward, this
step was essential in understanding the tree canopy spread and any factors which could influence the
results obtained from the prediction of geospatial coverage through the interpolation techniques which
are the foundation of the study. In order to further investigate the tree canopy spread phenomenon in
the Fairview Neighbourhood study area of St. Catharines, the data was assessed using the Exploratory
Spatial Data Analysis tools available in the ArcMap environment. This allows a better understanding of
the tree canopy spread, and the identification of factors which could influence the results of the
interpolation. The data is therefore investigated in order to make more knowledgeable decisions about
how to interpolate it so that the results are accurate and comparable with reality.

3.2.1. Histogram Analysis

The exploratory data analysis begins with the examination of the histogram, which allows for the study
of the distribution of tree canopy spread.
Figure 9 displays the initially observed histogram for the tree canopy points in the study area:

Figure 9: Histogram Analysis
By observing the histogram, it can be established that the data is approximately normally distributed,
which is further evident from the Mean being close to the Median (11.904 versus 10.92). The skewness,
which is close to zero, also indicates values that are close to normally distributed. The only
statistic indicating that the data is not exactly normally distributed is the kurtosis, which should be
near 3 for a normal distribution, but is approximately 4.6. This can be addressed by transforming the
data during the interpolation process, in order to achieve data that is more normally distributed.
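The statistics read from the histogram can be reproduced outside of ArcMap. The sketch below is only
illustrative: it assumes population (divide by n) moment formulas and reports kurtosis on the scale
where a normal distribution scores 3, which may differ slightly from the exact formulas used by the
histogram tool. The function name is hypothetical.

```python
import statistics


def shape_statistics(values):
    """Return mean, median, skewness and (non-excess) kurtosis of a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    # Central moments of order 2, 3 and 4 (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": median,
        "skewness": m3 / m2 ** 1.5,   # 0 for a symmetric distribution
        "kurtosis": m4 / m2 ** 2,     # 3 for a normal distribution
    }


# Small symmetric example: mean equals median and skewness is zero.
print(shape_statistics([1, 2, 3, 4, 5]))
```

A mean well above the median together with a positive skewness would point toward a positively skewed
distribution, which is the situation the log transformation is meant to address.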
Evidence that a log transformation produces tree canopy spread values which are better distributed is
presented in Figure 10:

Figure 10: Histogram Transformation Analysis
This transformation technique can be applied when undertaking particular interpolation processes, in
order to bring the data closer to a normal distribution. The tree canopy spread values are positively
skewed and, although close to normally distributed, are distributed better after the log
transformation, as "The log transformation is often used where the data has a positively skewed
distribution and there are a few very large values. If these large values are located in your study
area, the log transformation will help make the variances more constant and normalize your data"
(Esri, 2013). Consequently, after performing the transformation, the Mean and Median values are closer
together (2.4131 versus 2.3906), the skewness is closer to 0 (approximately 0.02), and the kurtosis is
3.453. This transformation technique has, in effect, normally distributed the data and can be applied
when undertaking the interpolation techniques used in the study of tree canopy spread in the Fairview
Neighbourhood of St. Catharines.
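The effect of the log transformation on a positively skewed sample can be illustrated with a short
sketch. The values below are hypothetical canopy spreads (not the measurements collected for this
report), the natural logarithm is assumed, and the skewness uses the population moment formula.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical spreads with one very large value in the right tail.
spreads = [4.0, 6.5, 8.0, 9.5, 10.0, 11.0, 12.5, 14.0, 18.0, 35.0]
logged = [math.log(v) for v in spreads]  # natural log transformation

print("skewness before:", round(skewness(spreads), 3))
print("skewness after: ", round(skewness(logged), 3))
```

In this example the large value dominates the raw skewness, and the transformation pulls the right
tail in so that the skewness moves toward zero.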
In analyzing the data through the histogram, an indication of outliers is present, as there are tree canopy
spread values that are anomalous, and this presents the opportunity to further investigate why the
values are inconsistent with what is normal.
Figure 11 on the following page displays what were believed to be outliers in the data (highlighted in
blue):
Figure 11: Anomalous Tree Canopy Spread

By analyzing these tree canopy values, it was discovered that these measurements of tree canopy spread
were inaccurately taken from Google Earth: the combined canopy width of two trees, rather than one,
was measured. The exploration of the data through the histogram led to this discovery, and provides
the opportunity to remove the inconsistent and inaccurately measured data, which would otherwise
affect the prediction surface that is the main objective of the study.
3.2.2. QQ Plot Analysis

The QQ plot allows further analysis in order to compare the tree canopy spread measurements to a
standard normal distribution, and to observe and investigate points that are departures from the normal
distribution.
Figure 12 displays the QQ plot derived from the tree canopy points collected:

Figure 12: Tree Canopy Spread - QQ Plot Analysis

From this graph, it can be established that the majority of the tree canopy spread points are close to
being normally distributed and that the main departures from the line are the same tree canopy values
that were initially investigated and discovered to be incorrectly measured as evidenced in Figure 13
below:



Figure 13: Validation of Anomalous Tree Canopy Spread

Again, these points will need to be removed so that the interpolation techniques produce accurate
results.
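The points of a normal QQ plot can also be computed directly, which makes the departures from the
reference line easy to list. The sketch below is an assumption-laden illustration: it uses the
(i - 0.5) / n plotting position, which is one common convention and not necessarily the one used by
the ArcMap tool, and the function name is hypothetical. It requires at least two distinct values.

```python
from statistics import NormalDist, fmean, pstdev


def normal_qq_points(values):
    """Return (theoretical quantile, standardized data value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    mean = fmean(ordered)
    sd = pstdev(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        probability = (i - 0.5) / n  # assumed plotting position
        theoretical = standard_normal.inv_cdf(probability)
        standardized = (value - mean) / sd
        points.append((theoretical, standardized))
    return points


# Points far from the line y = x are departures from normality.
for theoretical, observed in normal_qq_points([1, 2, 3, 4, 5]):
    print(round(theoretical, 3), round(observed, 3))
```

Values whose standardized quantile lies far from the theoretical quantile are the candidates for
closer inspection, in the same way the anomalous points were identified on the plot.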
3.2.3. Spatial Autocorrelation Analysis

The tree canopy points can be analyzed through the semivariogram and covariance functions by utilizing
the Geostatistical Analyst. Through this function one can establish that the tree canopy points are
strongly correlated overall, as the majority of the points fall within the cluster. This provides
evidence that points close together will lead to strong predictions, given that things nearby tend to
be more similar than things further apart.
An example of the correlation established by the semivariogram is shown in Figure 14:






Figure 14: Semivariogram Analysis
Figure 15: Semivariogram Validation of Uncorrelated Values












The points which deviate from the cluster are strongly uncorrelated and are further apart, which
supports the assumption drawn from the semivariogram that tree canopy spreads further apart are less
likely to be correlated than points closer together, as displayed in Figure 15:


Figure 16: Covariance Cloud Analysis
In Figure 15, the two highlighted points differ starkly in measured tree canopy spread and are
relatively far apart. This indicates that, when utilizing the interpolation techniques, methods must
be selected that strengthen the influence of tree canopy spread measurements closer to one another,
which are more likely to be similar, over points further apart, which tend to be dissimilar.
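The semivariogram cloud examined above plots, for every pair of sample points, the separation distance
against half the squared difference of the two measured values. A minimal sketch of that calculation
follows, assuming samples are given as (x, y, z) tuples in planar coordinates; the function name and
the example values are hypothetical.

```python
import math
from itertools import combinations


def semivariogram_cloud(samples):
    """Return (distance, semivariance) for every pair of samples.

    samples: list of (x, y, z) tuples, where z is the measured value.
    """
    cloud = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        distance = math.hypot(x2 - x1, y2 - y1)
        semivariance = 0.5 * (z1 - z2) ** 2
        cloud.append((distance, semivariance))
    return cloud


# Hypothetical points: two close trees with equal spread, one distant tree.
demo = [(0.0, 0.0, 10.0), (3.0, 4.0, 12.0), (0.0, 1.0, 10.0)]
for distance, semivariance in semivariogram_cloud(demo):
    print(round(distance, 3), semivariance)
```

Pairs that are close together with a small semivariance sit in the lower left of the cloud, while
distant, dissimilar pairs sit toward the upper right.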
3.2.4. Covariance Cloud Analysis

Analysis of the covariance cloud establishes that tree canopy spread values at points closer to each
other fall within a similar range, as expected. This matches the tree canopy spread measured, as many
of the measurements collected close together shared similar values, given that trees close together
were likely to have been planted in a closely related time period, while trees further apart show less
similarity, as they are more than likely to have been planted in different time periods. Figure 16
displays the covariance cloud and confirms the results and observations established in the data
collection process:
3.2.5. Trend Analysis

A brief analysis of the tree canopy spread was undertaken to determine whether there are any trend
influences which may affect the output derived from the interpolation techniques, and to understand
whether removing any observed trend would be justified.
Figure 17 provides a three-dimensional perspective of the tree canopy spread in the Fairview
Neighbourhood of St. Catharines study area:

Figure 17: Tree Canopy Spread Trend Analysis


From the graph it can be derived that no particular trend is occurring. Although there are larger tree
canopy spread values in the Northwest quadrant of the study area, this was determined to be more than
likely due to the heavier sampling that part of the study area received. As tree canopy spread is a
natural phenomenon, there is no valid reason to remove any trends that are in the data: the tree
canopy spread measured is an accurate record of reality, and these values must accordingly be utilized
to predict unknown values, which will be done more precisely and truthfully through data that is not
manipulated. Trees may have been planted at different times, different species of trees are located in
the study area, and the spread of tree growth can be limited by surrounding trees. These factors must
not be manipulated out of the data, as the trees are a product of their environment, and better
prediction will result from data that is not manipulated.
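One simple way to put a number on a possible trend is to fit a first-order surface z = a + b*x + c*y
to the sample points by least squares and inspect the slopes b and c. This is a simplified stand-in
for the Trend Analysis tool, which projects the points onto side planes and fits polynomials there;
the function names and sample values below are hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(samples):
    """Least-squares fit of z = a + b*x + c*y; returns (a, b, c)."""
    rows = [(1.0, x, y) for x, y, _ in samples]
    zs = [z for _, _, z in samples]
    # Normal equations: (A^T A) beta = A^T z.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    determinant = det3(ata)
    if abs(determinant) < 1e-12:
        raise ValueError("points are collinear; a plane cannot be fitted")
    coefficients = []
    for k in range(3):
        # Cramer's rule: replace column k with the right-hand side.
        replaced = [row[:] for row in ata]
        for i in range(3):
            replaced[i][k] = atz[i]
        coefficients.append(det3(replaced) / determinant)
    return tuple(coefficients)


# Hypothetical points lying exactly on z = 2 + 0.5*x - 0.25*y.
demo = [(0, 0, 2.0), (4, 0, 4.0), (0, 4, 1.0), (4, 4, 3.0), (2, 2, 2.5)]
print(fit_plane(demo))
```

Slopes close to zero relative to the spread of the data would be consistent with the conclusion that
no particular trend is present.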
4. Interpolation Technique Analysis

4.1. Inverse Distance Weighted Examination

The Inverse Distance Weighting (IDW) technique is utilized to perform a straightforward and immediate
geospatial coverage prediction of the tree canopy cover and to examine the results. The IDW technique
has many advantages as a basis for predicting values in a study area, including: "Simplicity, speed of
calculation, programming ease, and reasonable results for certain types of data" (Babish, 2006). The
nearly immediate prediction of tree canopy spread provided by an initial IDW analysis is valuable, as
the characteristics of the results can be spatially investigated and direct exploratory analysis can
occur. Because the results are immediate, IDW is a significant tool in understanding the data,
compared to the kriging interpolation method, which demands more planning, a better understanding of
the data, and a more demanding manipulation process to provide an accurate or near accurate predictive
surface.
As the goal of the project is to determine whether the prediction processes are accurate enough to be
adopted for a larger study area, the IDW interpolation technique provides the ability to undertake
immediate analysis and to observe results which can be used to plan for a possible tree inventory.
4.1.1. IDW Procedure

In order to understand the IDW technique, and the resultant map showing the predicted values of tree
canopy spread in areas not directly measured, a range of procedures had to be undertaken. Even though
the IDW technique is relatively simple, there are parameters that must be considered before performing
the procedure. These parameters, and the decisions made based on the understanding of the method, are
summarized and explained below:
4.1.2. Power

The power parameter is an important setting that must be thoroughly investigated prior to performing
the IDW interpolation, as IDW relies mainly on the inverse of the distance raised to a mathematical
power (Esri, 2012). For the tree canopy spread, a higher power places more importance on the nearer
points. One can assume that points nearer to one another are a more accurate indication of tree canopy
spread, given that the planting dates of the trees would be relatively close and, if the trees are of
the same type, their tree canopy spread would be alike. The Lee Inc. team wants the prediction to
emphasize nearness, as ground truthing of the data collected showed suburban areas with tree lined
streets constructed around the same time period and an urban canopy that reflected this. The optimal
value for the power was found by investigating the spatial data and observing that the minimum mean
prediction error was approximately -0.10; the power was therefore set to 1, the value at which this
minimum occurred. Figure 18 presents the tree canopy cover cross validation of prediction errors, and
the resultant minimum mean prediction error:
Figure 18: IDW Cross Validation

4.1.3. Search Neighbourhood

The search neighbourhood was set to a maximum of 4 nearest neighbours and a minimum of 2. As
previously stated in regards to the power, it was determined that the tree canopy spread observed at
each tree was similar to that of the trees closest to it. The search neighbourhood was kept relatively
small as well, "Because things that are close to one another are more alike than those far away, as
the locations get farther away, the measured values will have little relationship with the value of
the prediction location" (Esri, 2003).
For the tree canopy spread in the study area, this is determined to be valid, and therefore the IDW
interpolation technique will be undertaken accordingly.
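To make the roles of the power and the search neighbourhood concrete, the following is a minimal
sketch of an IDW prediction at one location. It is a simplification and not the Geostatistical Analyst
implementation: it uses only the nearest max_neighbours points and ignores the minimum neighbour count
and the sector search. The function name and sample values are hypothetical.

```python
import math


def idw_predict(samples, x0, y0, power=1.0, max_neighbours=4):
    """Inverse distance weighted prediction at (x0, y0).

    samples: list of (x, y, z) tuples, where z is the measured value.
    """
    # Distance from the prediction location to every sample, nearest first.
    by_distance = sorted(
        (math.hypot(x - x0, y - y0), z) for x, y, z in samples
    )
    nearest = by_distance[:max_neighbours]
    if nearest[0][0] == 0.0:
        # The location coincides with a sample: return its measured value.
        return nearest[0][1]
    # Weight is the inverse of the distance raised to the power.
    weights = [1.0 / distance ** power for distance, _ in nearest]
    weighted_sum = sum(w * z for w, (_, z) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Hypothetical samples: two equidistant trees and one distant tree.
demo = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (50.0, 50.0, 99.0)]
print(idw_predict(demo, 1.0, 0.0, power=1.0, max_neighbours=2))
```

Raising the power makes the weights fall off faster with distance, and lowering max_neighbours
restricts the prediction to the closest measurements, which are the two choices discussed above.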


4.1.4. IDW - Table of Parameters

Table 2 displays all of the relevant details necessary for the comprehension of accurate data input to
derive an IDW predicted surface, along with the important parameters discussed above:
Table 2: IDW Parameters
Inverse Distance Weighted Parameters
Input Feature            Tree Canopy Points
Z Value                  Tree Canopy Size
Power                    1
Search Neighbourhood     Standard
Maximum Neighbours       4
Minimum Neighbours       2
Sector Type              4 Sectors

4.1.5. Cross Validation of Inverse Distance Weighted Technique

In order to examine the predicted values produced by the Inverse Distance Weighted interpolation
technique, cross-validation was performed to assess how good the model is and how well the technique
predicts the values of tree canopy spread. As the IDW technique does not allow for manipulation of the
data, the resultant values are fixed. The only parameters which allow any influence on the results are
the power and the search neighbourhood. However, as previously stated, the search neighbourhood was
kept small, as things that are closer together are more likely to be alike. This is specifically true
of tree canopy cover, as trees closer to one another are more likely to have been planted around the
same time, and the prediction process needs to reflect this. Therefore, a larger search neighbourhood
would not be ideal in predicting values. The power was optimized in order to place more influence on
points closer together, rather than farther apart, which again is more favourable in predicting tree
canopy spread values.
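Cross-validation of this kind can be sketched as a leave-one-out loop: each measured point is removed
in turn, predicted from the remaining points, and the prediction is compared with the measured value.
The sketch below assumes the root-mean-square error as the criterion for comparing candidate powers,
which is an assumption and may differ from the statistic reported in this study. All names and values
are hypothetical, and the IDW helper ignores the minimum neighbour count and sector search.

```python
import math


def idw(samples, x0, y0, power, k=4):
    """IDW prediction at (x0, y0) from the k nearest samples."""
    nearest = sorted(
        (math.hypot(x - x0, y - y0), z) for x, y, z in samples
    )[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [d ** -power for d, _ in nearest]
    return sum(w * z for w, (_, z) in zip(weights, nearest)) / sum(weights)


def loo_rmse(samples, power, k=4):
    """Leave-one-out root-mean-square prediction error for one power."""
    squared_errors = []
    for i, (x, y, z) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]  # drop the point being tested
        squared_errors.append((idw(others, x, y, power, k) - z) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def best_power(samples, candidates=(0.5, 1.0, 2.0, 3.0), k=4):
    """Return the candidate power with the smallest leave-one-out RMSE."""
    return min(candidates, key=lambda p: loo_rmse(samples, p, k))


# Hypothetical canopy spreads at six locations.
demo = [(0, 0, 9.0), (1, 0, 10.0), (0, 1, 11.0),
        (5, 5, 15.0), (6, 5, 14.0), (5, 6, 16.0)]
print(best_power(demo), round(loo_rmse(demo, 1.0), 3))
```

The same loop can be repeated for different neighbour counts to see how sensitive the errors are to
the search neighbourhood.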
Figure 19 presents the prediction errors associated with the Inverse Distance Weighting interpolation
technique:

Figure 19: Cross Validation Prediction Errors
The data itself cannot necessarily be manipulated: the tree canopy spread was measured accurately
utilizing the Google Earth measuring tool, and therefore the tree canopy spread values collected are
an accurate representation of the data in reality. Consequently, although it is ideal to have
standardized mean prediction errors near 0, small root-mean-squared prediction errors, an average
standard error near the root-mean-squared prediction error, and standardized root-mean-squared
prediction errors near 1 (Esri, 2003), this is not always possible. The values collected are correct
and accurate measurements, and a possible explanation for why the prediction is not ideal is the
preferential sampling in the Northwest quadrant of the study area, which produces a cluster of points
rather than an even distribution.
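The summary statistics listed above can be computed from cross-validation output as sketched below.
The formulas are stated as assumptions based on their usual definitions, and the standardized
statistics require a prediction standard error for each point, which a kriging model supplies and a
deterministic method such as IDW does not. The function name and the numbers are hypothetical.

```python
import math


def cross_validation_summary(measured, predicted, standard_errors):
    """Summarize cross-validation errors (assumed standard definitions)."""
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    standardized = [e / s for e, s in zip(errors, standard_errors)]
    return {
        # Ideally near 0.
        "mean_error": sum(errors) / n,
        # Ideally small.
        "root_mean_square": math.sqrt(sum(e * e for e in errors) / n),
        # Ideally near 0.
        "mean_standardized": sum(standardized) / n,
        # Ideally near 1.
        "root_mean_square_standardized": math.sqrt(
            sum(s * s for s in standardized) / n
        ),
        # Ideally near the root-mean-square error.
        "average_standard_error": math.sqrt(
            sum(s * s for s in standard_errors) / n
        ),
    }


# Hypothetical measured values, predictions and standard errors.
summary = cross_validation_summary(
    measured=[10.0, 12.0, 14.0],
    predicted=[11.0, 11.0, 15.0],
    standard_errors=[1.0, 1.0, 1.0],
)
print(summary)
```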
4.1.6. Inverse Distance Weighted Results

The following map, Figure 20, presents the results achieved from the procedure undertaken to perform
the Inverse Distance Weighted interpolation technique:

Figure 20: Tree Canopy Predicted Values - IDW
4.2. Kriging Examination

In addition to the IDW interpolation technique, kriging was completed to predict the geospatial
coverage of tree canopy spread in the Fairview Neighbourhood of St. Catharines study area. Kriging is
a more demanding and rigorous process for producing prediction results, and it allows for a better
understanding of the data and the opportunity to influence, and essentially affect, the resulting
outcome. The kriging interpolation technique allows the structure of the model to be fitted
specifically to the data, whereas IDW only allowed for some manipulation of the parameters: "Kriging
is divided into two distinct tasks: quantifying the spatial structure of the data and producing a
prediction. Quantifying the structure, known as variography, is where you fit a spatial-dependence
model to your data. To make a prediction for an unknown value for a specific location, kriging will
use the fitted model from variography, the spatial data configuration, and the values of the measured
sample points around the prediction location" (Esri, 2003).
In order to demonstrate how the interpolation model was fitted to the data, the procedures undertaken
are provided. The parameters and decisions made based on the understanding of the method are
summarized and explained below:
The kriging method utilized was ordinary kriging. This method was applied as no underlying trend was
discovered through the exploratory spatial analysis. Ordinary kriging is done when there is no
underlying spatial trend (drift), the mean of the variable is unknown and the sum of the kriging
weights is equal to one. This method assumes that the data set has a stationary variance but also a
non-stationary mean within the search radius. Ordinary Kriging is highly reliable and is recommended
for most data sets (Babish, 2006).
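The constraint that the kriging weights sum to one can be seen in a small sketch of ordinary kriging at
a single location. Several things here are assumptions made purely for illustration: the spherical
semivariogram model and its nugget, sill and range values, the use of all points rather than a search
neighbourhood, and the omission of the log transformation. The names and sample values are
hypothetical.

```python
import math


def spherical(h, nugget, sill, rng):
    """Spherical semivariogram model (an assumed choice of model)."""
    if h == 0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [value] for row, value in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(samples, x0, y0, model):
    """Return (prediction, weights) at (x0, y0) using all samples."""
    n = len(samples)
    # Semivariances between samples, bordered by the sum-to-one constraint.
    a = [[model(math.hypot(xi - xj, yi - yj)) for xj, yj, _ in samples]
         + [1.0] for xi, yi, _ in samples]
    a.append([1.0] * n + [0.0])
    b = [model(math.hypot(xi - x0, yi - y0)) for xi, yi, _ in samples]
    b.append(1.0)
    weights = solve(a, b)[:n]  # last unknown is the Lagrange multiplier
    prediction = sum(w * z for w, (_, _, z) in zip(weights, samples))
    return prediction, weights


# Hypothetical samples on the corners of a square.
demo = [(0, 0, 10.0), (10, 0, 14.0), (0, 10, 12.0), (10, 10, 20.0)]
model = lambda h: spherical(h, nugget=0.0, sill=10.0, rng=30.0)
prediction, weights = ordinary_kriging(demo, 5, 5, model)
print(round(prediction, 3), [round(w, 3) for w in weights])
```

Unlike IDW, the weights here depend on the fitted semivariogram model as well as on distance, which is
where the additional modelling decisions in kriging enter.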
The transformation type utilized was log. Although the data was clustered in the Northwest quadrant,
the log transformation was previously found to make the variances in the data more constant and the
data more normally distributed, which makes it an effective means of achieving accurate prediction
results.
Figure 21 details the parameters utilized for the kriging interpolation technique:


Figure 21: Semivariogram Kriging Method
The semivariogram was applied as the variable method, as it was previously determined through
exploratory analysis that this function can establish how strongly tree canopy values that are close
together are related. As the points are strongly correlated overall where the tree canopy widths
measured are close together, the predicted values will be determined by values that are closer
together, rather than by those further apart.
Figure 22 provides details on the search neighbourhood parameters applied:

Figure 22: Kriging Search Neighbourhood
The search parameters were unchanged from those applied in the IDW interpolation technique, as the
spatial exploratory analysis showed that the tree canopy spread observed at each tree was similar to
that of the trees closest to it. Consequently, the search neighbourhood was kept relatively small,
"Because things that are close to one another are more alike than those far away, as the locations
get farther away, the measured values will have little relationship with the value of the prediction
location" (Esri, 2003). Through the exploratory analysis, it was further verified that the values
which shared similar tree canopy spread were close together, and that the correlation weakened overall
as the distance between values increased.
In actuality, there is little significance in choosing between the semivariogram and the covariance
method, as "There is a relationship between the semivariogram and the covariance function" and
"Because of this equivalence, you can perform prediction in Geostatistical Analyst using either
function" (Esri, 2003).
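For reference, the relationship between the two functions can be written as follows, assuming a
second-order stationary process, where h is the separation distance, \gamma(h) is the semivariogram,
C(h) is the covariance function and C(0) is the sill:

```latex
\gamma(h) = C(0) - C(h), \qquad C(0) = \text{sill}
```

As the covariance falls toward zero with increasing distance, the semivariogram rises toward the sill,
so either function describes the same spatial structure.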



Figure 23 displays the cross validation of the kriging interpolation method:

Figure 23: Kriging Cross-Validation

The prediction error, although not ideal, is an accurate reflection of the tree canopy cover, and the
data cannot necessarily be manipulated to fit any particular model. As tree canopy spread is a natural
phenomenon, even with the manipulation process and the understanding of the data gained through
spatial analysis, it is still difficult to fit a model whose parameters meet the standards the model
demands to produce accurate results.
The resulting map created from the kriging process undertaken is shown in Figure 24 below:





Figure 24: Resultant Kriging Map
Figure 25: Kriging Interpolation Technique
Figure 26: IDW Interpolation Technique
The kriging above predicts values closely related to those of the IDW interpolation technique. When
the cross validation was inspected for both, the interpolation methods produced similar results. The
expectation that the kriging method would produce strongly different results did not materialize, and
even though additional parameters allowed for added manipulation, the prediction of the natural
phenomenon did not necessarily become better with the kriging interpolation technique.
5. Evaluation of Interpolation Methods

Both methods utilized for the prediction of tree canopy spread in the Fairview Neighbourhood of St.
Catharines study area predicted very similar results. Although the IDW interpolation technique was a
relatively simple process that did not require much thought to produce results, its values are similar
to those of the kriging method, which allowed for more manipulation. It is believed that, because the
data itself cannot necessarily be manipulated as the phenomenon being studied is natural, the kriging
method does not provide any additional benefit in predicting tree canopy values. Figure 25 and Figure
26 show the similarities between the results of both interpolation techniques:
Both interpolation techniques produced predictions which do not take into consideration commercial
areas, or areas which do not have any tree canopy cover. The predicted values must therefore be
treated with caution, and have to be validated and verified independently. This can be accomplished
through analysis of the study area's characteristics, in order to determine tree canopy cover and the
extent of the natural phenomenon. Because the data collection was not done to acceptable standards,
many areas which can be verified to have large tree canopy cover are not shown on the resultant maps.
The parameters specified throughout the study were chosen to limit the prediction to the nearest
values, given that closer things are more likely to be alike. This was, in effect, detrimental, as
large parts of the study area did not have enough data collected for the accurate prediction of values
and for determining tree canopy cover similar to reality.
6. Recommendations and Conclusion

After undertaking both IDW and kriging interpolation techniques, the results derived were not entirely
ideal, due to the natural phenomenon of tree canopy values, and the two processes returned similar
results. It was difficult to accurately fit a natural phenomenon, such as tree canopy spread, to a
specific model and for the model parameters specified to manipulate and interpret the data truthfully.
The sample coverage was not ideal, due not only to preferential sampling but also to an incomplete
understanding of what data would be ideal for fitting an interpolation model, and as a result the
predictions are not entirely accurate with reality.
The prediction process can be an accurate indicator of tree canopy spread in areas not directly
measured, provided that the sample points are collected and spread evenly across the area being
examined, and that more planning is put into the data collection process in order to derive data which
is accurate of reality. Since there was no prior knowledge or understanding of how interpolation
techniques predict data, little consideration was given to the collection of the tree canopy spread,
other than attempting to spread the data collection out evenly, which regrettably was not done
successfully. When the full extent of the data was explored, it was instantly evident that the
collection process undertaken by Lee Inc. had gathered more values in certain locations than in
others.
In order to derive better data collection in the future it is recommended that:
o Tree selection should be based on trees of almost the same height and canopy spread, for
consistency.
o Data should be collected based on knowledge of the species within the area of interest, which can
provide more understanding.
o Tree selection should also aim to identify trees which are not contained in clusters. This will
allow for more accurate measurements, the opportunity to space the collection out evenly, and
predictions which can be determined accurately.
o More planning should be put into understanding how the data will be utilized and what will be
derived from the data collection process. Had a better understanding of interpolation techniques
been acquired, the data collection process could have reflected reality more accurately, and
collection would have been more closely aligned with the desired results.






















References

About our city. (n.d.). Retrieved from City of St. Catharines:
http://www.stcatharines.ca/en/experiencein/AboutOurCity.asp?_mid_=26335
Babish, G. (2006). Geostatistics Without Tears. Regina: Environment Canada.
Esri. (2003). ArcGIS 9 - Using ArcGIS Geostatistical Analyst. Retrieved from Dusk 2 Geography:
http://dusk2.geo.orst.edu/gis/geostat_analyst.pdf
Esri. (2011, June 09). How Kriging Works. Retrieved from ArcGIS Resource Centre:
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//009z00000076000000.htm
Esri. (2012). ArcGIS Help 10.1. Retrieved from ArcGIS Resources:
http://resources.arcgis.com/en/help/main/10.1/index.html#//009z00000075000000
Esri. (2013, June 26). Box-Cox, arcsine, and log transformations. Retrieved from ArcGIS Resource Center:
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//00310000000s000000
Esri. (2013, June 24). Map the Data. Retrieved from ArcGIS Resource Center:
http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html#//003100000096000000
Smith, I. (2014, January). Geostatistical Analysis of Student Collected Spatial Data. GISC9308 - Spatial
Statistics. Niagara-on-the-Lake, Ontario, Canada: Niagara College Canada.











Appendix 1
