
OBJECT-BASED NATIVE VEGETATION MAPPING OF THE MURRAY CATCHMENT
Adam Roff, Dominic Sivertsen, and Bob Denholm
NSW Department of Environment, Climate Change and Water
Scientific Services Division
Level 18, 59-61 Goulburn Street, Sydney, NSW, 2000
Phone number: 02 9995 5604, Fax number: 02 9995 5924
adam.roff@environment.nsw.gov.au

Abstract
The NSW Department of Environment, Climate Change and Water (DECCW)
recently completed a seamless vegetation map for the Murray Catchment
Management Authority Area (MCMAA). Feature recognition software was used to delineate
native vegetation and a hybrid classification method (spatial modelling and manual
on-screen attribution) was used to combine the features and create a vegetation
map.
Stand scale feature recognition was possible for the entire catchment (35,000 sq
km) due to the availability of a high performance grid computing network. SPOT 5
satellite imagery was used in the creation of image objects but could not be used in
the classification of vegetation type, as the spectral response of individual SPOT 5
scenes varied too widely within a mosaic. Over 340 new full floristic surveys were
commissioned and the results were combined with 900 existing survey records to
create training areas for spatial modelling. Spatial layers used in the classification
included a DEM, Landsat imagery, radiometric data and soil and climate layers, all of
which are available for the entire state. The relationship between survey sites and
spatial layers was explored using machine learning software and vegetation type
was classified using an object-based nearest neighbour approach. A manual quality
assurance program added vegetation community types that were not able to be
modelled (e.g. chenopods).
Accuracy was assessed with a combination of cross-validation and independent plot
data and results ranged between 58 and 78% for the catchment. The mapping has
been made available as an ESRI geodatabase which allows for easy exploration of
the mapping, plot data and field photos. A system for user generated updates has
been implemented so that the product can evolve as more field data is collected.

Background
Native vegetation plays an important role in maintaining water quality and
biodiversity and preventing land degradation. Information about the extent and type
of native vegetation is critical for effective conservation planning. Vegetation types
rarely occur as single-species stands but rather as assemblages of species along a
continuum of abundance, cover and height (McKenzie et al.,
2008). Ecological theory suggests that similar environmental conditions should
produce clumping of species into recognisable and predictable plant assemblages
(McKenzie et al., 2008). Classification of vegetation is essentially a compromise
between the desire to preserve these natural groupings as continuously varying
entities and the need to subdivide them for more utilitarian purposes (Beadle and
Costin, 1952).

The aim of this project was to deliver a seamless native vegetation map over a
whole catchment management area. Seamless vegetation mapping provides a
regional context that supports property planning and investment, compliance,
conservation planning and vegetation management. The size of the study area
dictated that the process be relatively automated and rely largely on remotely
sensed data. The approach taken evolved from earlier work (Roff, 2009) where
patterns in native vegetation were delineated automatically using feature recognition.
The methods and results presented here are documented more thoroughly in Roff et
al. (2010).
Automated feature recognition aims to mimic the cognitive process that human
observers use to delineate boundaries in images. Human interpreters use a combination
of colour, tone, texture and expert knowledge to delineate patterns in native
vegetation and assign a community type (Sivertsen, 2009). Feature recognition uses
segmentation algorithms to create image-objects that are also based on colour,
tone, and texture. Image-objects can be classified and merged based on remotely
sensed data, ecological rules, or environmental layers.

Methods
The Murray Catchment Management Authority Area (MCMAA) is in New South
Wales, Australia, and spans an area of 35,170 square kilometres. It is bounded by
the Murray River to the south, the Murrumbidgee Catchment divide to the north, and
the Australian Alps to the east. Plant communities in the MCMAA broadly
reflect the altitudinal, available moisture and landform gradients, ranging from
alpine and montane in the east to some of the lowest relief stagnant alluvial plains in
the state in the north-west of the area. This is matched by a declining rainfall/rising
temperature gradient from east to west. At the subregional and local levels plant
communities appear to reflect the underlying substrates; this is particularly
evident in the east where a diversity of geologies is exposed or immediately
sub-surface (Roff et al., 2010).

NSW SLATS Landsat scenes were used to divide the Murray Catchment Management
Authority Area (MCMAA) into four. They are referred to in this paper as BALR, ECHU,
WANG, and CORR.
To map native vegetation type, image objects were created for the entire MCMAA
using the multi-resolution segmentation algorithm from Definiens eCognition. The
underlying Fractal Net Evolution Approach is described in detail by Baatz & Schäpe (2000) and
Benz et al. (2004). The process is analogous to merging the nearby objects that
contribute least to heterogeneity. The algorithm was applied to pre-processed SPOT
5 (10m) and Landsat (25m) imagery. The approach taken in the MCMAA mapping
was to over-segment the imagery (at a manageable level) and then to merge these
polygons based on their classification, whether that classification is assigned
manually or from a computer model. The results were smoothed to emulate
manually digitised vegetation polygons. The minimum mappable unit was set at two
hectares and narrow stands of trees less than two crowns wide were eliminated.
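
The merge step described above can be illustrated with a minimal sketch. It is not the eCognition implementation: the object ids, adjacency pairs and class labels are hypothetical, and the only values taken from the text are the idea of dissolving over-segmented objects by class and the two hectare minimum mappable unit. Adjacent objects that share a class are grouped with a union-find structure, and merged polygons smaller than the minimum mappable unit are dropped.

```python
def merge_by_class(labels, areas_ha, adjacency, min_ha=2.0):
    """Dissolve adjacent image objects that share a class label.

    labels:    {object_id: class_label}            (hypothetical inputs)
    areas_ha:  {object_id: area in hectares}
    adjacency: iterable of (object_id, object_id) pairs sharing a boundary
    Returns (class_label, sorted member ids, total area) for every merged
    polygon that meets the minimum mappable unit.
    """
    parent = {obj: obj for obj in labels}

    def find(obj):
        # Follow parent links to the group representative (path halving).
        while parent[obj] != obj:
            parent[obj] = parent[parent[obj]]
            obj = parent[obj]
        return obj

    # Only neighbours with the same class are merged.
    for a, b in adjacency:
        if labels[a] == labels[b]:
            parent[find(a)] = find(b)

    groups = {}
    for obj in labels:
        groups.setdefault(find(obj), []).append(obj)

    merged = []
    for members in groups.values():
        total = sum(areas_ha[m] for m in members)
        if total >= min_ha:
            merged.append((labels[members[0]], sorted(members), total))
    return sorted(merged, key=lambda group: group[1])
```

Because the merge is driven only by the class attribute, the same routine applies whether that attribute was assigned manually or by a model.
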
The heavily modified landscapes encountered in the MCMAA are dominated by
grazing, cropping and other agricultural endeavours. It was necessary to create a
mask to differentiate extant native vegetation from other landuses. Automating the
process using multi-temporal Landsat scenes was problematic, so we reverted to
manual on-screen identification of candidate native vegetation. Over 200,000
polygons were manually attributed based on evidence of recent modification.
The 2008/09 coverage of SPOT 5 data for NSW was acquired opportunistically
over a period of up to 18 months, so seasonal effects (such as high
rainfall) dominate global spectral variation. A mosaic based on these data is
subject to seasonal artefacts, with artificial lines and dramatic classification changes
between adjacent scenes. The SPOT 5 imagery was therefore used to create
fine scale local boundaries and the 2008/09 NSW Landsat TM/ETM coverage was
used as an input for classification at a catchment scale. The Landsat data have
been atmospherically and radiometrically corrected as part of the NSW Statewide
Landcover and Tree Study (SLATS) program (Collett et al., 1998), which minimises
scene-to-scene differences and mitigates edge effects.
Spatial layers used in the classification were selected based on their availability over
the entire state. They included a DEM, radiometric data, magnetic data,
multitemporal Landsat imagery, crown scale segmentation, object-based and
neighbourhood values, soil and climate layers.
A total of 341 full floristic surveys were commissioned for the project and conducted
according to the DECCW Native Vegetation Interim Type Standard (Sivertsen, 2009).
Site locations were allocated using stratified random sampling. Unique sampling
units were created based on available ancillary layers and random locations were
selected within these. The vegetation survey locations were targeted towards the
significant spatial gaps in existing data.
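
The site allocation can be sketched as follows. This is a minimal illustration of stratified random sampling, assuming each unique sampling unit (stratum) already has a list of candidate locations; the stratum names, coordinates and the fixed number of sites per stratum are hypothetical and are not the values used in the project.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=0):
    """Pick up to n_per_stratum random locations from every stratum.

    strata: {stratum_id: [candidate locations]}  (hypothetical structure)
    A seeded generator keeps the allocation reproducible.
    """
    rng = random.Random(seed)
    selected = {}
    for stratum in sorted(strata):
        candidates = strata[stratum]
        # Small strata contribute every candidate they have.
        k = min(n_per_stratum, len(candidates))
        selected[stratum] = rng.sample(candidates, k)
    return selected


# Hypothetical strata built from ancillary layers (e.g. landform x rainfall).
example_strata = {
    "plain_dry": [(0, 0), (1, 0), (2, 0)],
    "slope_wet": [(5, 5)],
}
example_sites = stratified_sample(example_strata, n_per_stratum=2, seed=1)
```

Targeting gaps in existing data could be approximated by removing candidate locations that lie close to existing survey sites before sampling.
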
The NSW Vegetation Classification and Assessment (NSWVCA) class observed at each survey location was placed in a database with
all of the environmental layers and satellite imagery values for that location. Machine
learning algorithms were used to explore the intuitive and hidden relationships
between the environmental layers and the vegetation types. Linear discriminant
analysis (LDA) was used to filter the data and remove any layers that did not
significantly contribute to separating classes. The random forests classifier was then
applied to the dataset (see Breiman, 2001). It handles a very large number of input
variables and was used to determine which vegetation types could be modelled and
which had to be manually attributed.
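
The decision about which vegetation types can be modelled can be sketched from per-class accuracy figures. The sketch below does not run random forests; it assumes that observed and cross-validated predicted classes are already available as two lists, and the thresholds (at least 5 surveys, at least 50% accuracy) are illustrative assumptions rather than rules stated for the project.

```python
from collections import defaultdict


def per_class_accuracy(observed, predicted):
    """Return {class: (accuracy, number of surveys)} from paired labels."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for obs, pred in zip(observed, predicted):
        total[obs] += 1
        if obs == pred:
            correct[obs] += 1
    return {cls: (correct[cls] / total[cls], total[cls]) for cls in total}


def split_modelled(stats, min_surveys=5, min_accuracy=0.5):
    """Separate classes that can be modelled from those left for manual
    attribution. Both thresholds are hypothetical defaults."""
    modelled = sorted(
        cls for cls, (acc, n) in stats.items()
        if n >= min_surveys and acc >= min_accuracy
    )
    manual = sorted(cls for cls in stats if cls not in modelled)
    return modelled, manual
```

A class described in 7 surveys with 6 of them predicted correctly scores 6/7, or 85.7%, and would be kept for modelling under these assumed thresholds, while a class with only 3 surveys would be passed to manual attribution regardless of its score.
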
With that information all selected vegetation types were classified using a nearest
neighbour supervised approach implemented in Definiens eCognition. When
compared to pixel-based training, an object-based approach requires fewer training
samples: one sample image object already covers many typical pixel samples and
their variations (Definiens, 2009). This was useful as some communities were poorly
sampled.
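
The idea of an object-based nearest neighbour classification can be shown with a minimal sketch. This is a plain single nearest neighbour rule with Euclidean distance, not the Definiens eCognition implementation; each training sample is assumed to be one image object summarised by a feature vector (for example mean band values and elevation) that has already been scaled to comparable ranges. The feature values and class names are hypothetical.

```python
import math


def nearest_neighbour(training, features):
    """Return the class of the training object closest in feature space.

    training: list of (feature_vector, class_label) pairs, one per sample object
    features: feature vector of the object to classify
    """
    best_label = None
    best_distance = math.inf
    for vector, label in training:
        distance = math.dist(vector, features)
        if distance < best_distance:
            best_label = label
            best_distance = distance
    return best_label


# Hypothetical training objects with two scaled features each.
training_objects = [
    ((0.1, 0.2), "woody"),
    ((0.9, 0.8), "non-woody"),
]
label = nearest_neighbour(training_objects, (0.2, 0.1))
```

Because each training sample is an object mean rather than a single pixel, one sample already summarises many pixels, which is why poorly sampled communities can still contribute to the classification.
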
Several vegetation community types were not able to be modelled (e.g. specific
grasslands, chenopods, boree woodland) as they could not be differentiated by
Landsat or other coarse spatial layers, or too few plots of the community were
available to make a significant contribution to a model. These types were attributed
using manual on-screen digitisation. A subset of the NSWVCA vegetation
communities sampled in the field could not be differentiated from each other.
Despite the combination of spatial layers and manual interpretation of SPOT 5 data
some of the communities appear identical. These were merged into complexes that
contain a mixture of VCA communities.

Segmentation of SPOT 5 imagery, shown at 1:10,000 scale over pan-sharpened SPOT 5 imagery. The
image objects created in segmentation (left) were masked by landcover manually (centre)
and then merged based on their classification (right).
Cross-validation was used in the machine learning classification and data
exploration. For the supervised Nearest Neighbour classification in Definiens the
majority of survey sites were set aside (not used in the model) and therefore were
available for calculating the classification accuracy. Validation of the manually
corrected vegetation map is problematic as a bias has been introduced with manual
changes. All available information was used to improve the classification accuracy of
the final layer so collation of a new set of independent sample points was required.
The availability of high density independent sample points was limited to two
surveys. The first is a survey by EcoLogical Australia (Ecological, 2009). It was
limited to the Urana and Lockhart 1:100,000 map sheets and described the three
most likely VCA types at each point as well as the dominant species (n = 138). The
second is a collection of rapid surveys conducted by MCMA from across the
catchment. The surveys describe the Biometric type and dominant species in three
strata (n = 84).
The location of scattered paddock trees and narrow roadside remnants plays an
important role in biodiversity conservation and wildlife corridor assessment. During
the scoping of this project MCMA staff expressed a desire to incorporate open
woodlands with crown cover of less than 5%. In order to meet this requirement the
entire catchment was subject to crown scale segmentation. Segmentation at the
crown scale was implemented in ENVI's Feature Extraction Module (ITTVIS, 2009).
The algorithm applies a simple edge-based segmentation algorithm followed by a
region merging step. The region merging routine employs the Full
Lambda-Schedule algorithm created by Robinson et al. (2002). Tree crowns were merged to
create a seamless, binary polygon layer of scattered woody vegetation for the entire
CMA.

Results
The original aim was to use the field survey data to develop coherent floristic groups
using quantitative analysis and then match these to a priori NSWVCA vegetation
classes published in the NSW VCA (Benson, 2006; Benson et al., 2006; Benson,
2008). Despite the quantitative analysis undertaken in the study area, the number of
floristic plots was not sufficient to determine community composition. The alternative
was to manually assign NSWVCA types based on species dominance and
abundance, location, photo pattern and landscape context. The result of this process
was a series of points assigned NSWVCA equivalents for the entire CMA. Of
the 2036 floristic surveys available, 1066 sites were able to be assigned to one of 80
NSWVCA classes identified in the MCMAA.
The majority of the candidate native grassland and chenopod shrubland was located
in the north west of the Riverina Bioregion. Native woodlands and forests were
generally restricted to State Forests, waterways, National Parks and hill slopes. The
presence of candidate native vegetation was manually attributed for all 200,000
polygons based on visual interpretation of pan-sharpened SPOT 5 imagery. The
result was a candidate native layer that covered 38% of the catchment. This layer
was then further classified based on spectral signature and other environmental
layers. The image objects could then be classified hierarchically as ‘candidate
native’, followed by ‘woody’ and ‘non-woody’, and finally as NSWVCA types without
false positives in agricultural areas.

[Figure legend: Landcover. Agriculture, Urban and Water Bodies; Candidate Native Vegetation]

A manually attributed candidate native vegetation layer produced the most reliable
landcover mask given the data available.
Linear discriminant analysis (LDA) was used to filter the environmental and spatial
data and remove any layers that did not significantly contribute to separating
vegetation classes. More detail about which of the environmental layers best explains the
NSWVCA type found at each survey location is available in Roff et al. (2010). The
random forests classification (Breiman, 2001) gave accuracy figures for each
vegetation class. Undersampled communities (fewer than 5 surveys) were lost in the
modelling and would not appear on a vegetation map created using this classifier
(e.g. Black Roly Poly low open shrubland, VCA 216). However, some particularly
distinctive or highly separable classes could be predicted accurately even with a low
number of surveys. For example, Sandplain mallee (VCA 173) was only described
in 7 surveys but was able to be predicted with 85.7% accuracy.
Those communities with a greater number of samples were generally predicted with
greater success. River Red Gum - Lignum (VCA 11) was readily confused with Black
Box - Lignum communities (VCA 13); the two share the same ecological
requirements and many of the same species. Similarly Yellow Box - White Cypress
Pine grassy woodland (VCA 75) was readily confused with Inland Grey Box - White
Cypress Pine tall woodland (VCA 80). These communities grade into each other,
share the same distribution, and are difficult to tell apart visually with remotely
sensed data as their form is very similar.
Combining VCA types by dominant species improved the classification accuracy in
the random forests classification and provided a much more consistent result. Of the
groups that were not undersampled (more than 5 samples) only White Cypress
Pine, Yellow Box and Nitre Goosefoot were predicted with less than 50% accuracy.
White Cypress Pine and Yellow Box were commonly confused with Inland Grey Box
which unfortunately dominated the surveys. It became clear that some communities
would not be able to be predicted statistically (e.g. Nitre Goosefoot) and would have
to be added or corrected through manual interpretation of remotely sensed
data.
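
The effect of combining VCA types by dominant species can be illustrated with a short sketch. The class codes, the grouping table and the predictions are hypothetical; the point is only that confusions between types in the same dominant-species group stop counting as errors once the labels are mapped to groups.

```python
def accuracy(pairs):
    """Fraction of (observed, predicted) pairs that agree."""
    return sum(obs == pred for obs, pred in pairs) / len(pairs)


def grouped_accuracy(pairs, group_of):
    """Accuracy after mapping every class to its dominant-species group."""
    grouped = [(group_of[obs], group_of[pred]) for obs, pred in pairs]
    return accuracy(grouped)


# Hypothetical types A1 and A2 share a dominant species; B1 does not.
example_pairs = [("A1", "A2"), ("A1", "A1"), ("A2", "A1"), ("B1", "B1")]
example_groups = {"A1": "A", "A2": "A", "B1": "B"}
type_level = accuracy(example_pairs)
group_level = grouped_accuracy(example_pairs, example_groups)
```

In this toy case the type-level accuracy is 0.5 and the group-level accuracy is 1.0, because both errors are confusions within group A.
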
Individual VCA types were modelled using a supervised nearest neighbour
classification. The majority of survey sites were set aside (not used in the model)
and were therefore available for calculating the classification accuracy. Of the 514
survey sites that were available over the ECHU Landsat scene, only 154 were used
as training samples. That left 344 sites to be used as independent samples in the
validation. Approximately 58% of the independent surveys were in category 1 or 2a
(considered 'correct' according to the DECCW Native Vegetation Interim Type
Standard).
Of the 190 survey sites that were available over the BALR Landsat scene, only 27
were used as training samples. That left 158 sites to be used as independent
samples in the validation. Approximately 68% of the independent surveys were in
category 1 or 2a. Of the 322 survey sites that were available over the WANG Landsat
scene, only 104 were used for training. That left 218 sites to be used as independent
samples in the validation. Approximately 61% of the independent surveys were in
category 1 or 2a.
The approach to classification in the MCMAA vegetation mapping is not based
exclusively on modelling. A considerable proportion of resources used in the project
consisted of manual interpretation of remotely sensed data. Due to the heavy
influence of manual corrections of vegetation type based on photo pattern it is
desirable to assess the accuracy of the results after these corrections have been
made.
The results presented by the nearest neighbour classification were heavily modified
but in doing so all of the available survey points were used as a guide to the
interpreters, including rapid surveys, any available information from external bodies,
previous mapping and other layers. The outcome is that the nearest neighbour
classification accuracy results cannot be used as a guide to the success of the
project. Nor can the existing survey data help assess the accuracy of the product.
Of the 138 independent survey sites that were collated over the Urana and Lockhart
map sheets, 108 surveys were in category 1 or 2a (considered 'correct' according to
the DECCW Native Vegetation Interim Type Standard). This represents 78% of the
total number of surveys. Of the 84 independent survey sites that were available over
the whole catchment, 61 surveys were in category 1 or 2a. This represents 73% of
the total number of surveys.
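
The percentages above follow directly from the category counts. As a small sketch, assuming each independent survey has been assigned an agreement category and that only categories 1 and 2a count as correct, the figure is the share of surveys in those two categories. The label "other" and the split between categories 1 and 2a in the example are hypothetical; only the totals come from the text.

```python
CORRECT_CATEGORIES = {"1", "2a"}


def percent_correct(categories):
    """Percentage of surveys whose agreement category counts as correct."""
    hits = sum(category in CORRECT_CATEGORIES for category in categories)
    return 100.0 * hits / len(categories)


# Urana and Lockhart map sheets: 108 of 138 surveys in category 1 or 2a.
urana_lockhart = ["1"] * 60 + ["2a"] * 48 + ["other"] * 30
# Whole-catchment rapid surveys: 61 of 84 surveys in category 1 or 2a.
catchment = ["1"] * 40 + ["2a"] * 21 + ["other"] * 23

print(round(percent_correct(urana_lockhart)))  # 78
print(round(percent_correct(catchment)))       # 73
```
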
The eastern quarter of the study area did not have enough survey information to
warrant applying a classification (see Roff et al., 2010 for a detailed explanation).
Image-objects were created for CORR but were attributed with existing mapping
(Gellie, 2005).
The mapping is stored in an ESRI file geodatabase format. It can be viewed using
ESRI ArcMap or through a browser on the NSW DECCW Intranet. The ESRI
ArcMap version is designed to be stored on a central server. At the time of writing
we are in the process of making the data editable at the client level. Any changes
based on survey data, local knowledge or new information are stored as suggested
changes. Once these are accepted by a moderator the corporate database is
updated to reflect the changes. The browser-based version of the MCMAA
Vegetation Geodatabase is read-only but fully functional for search, summary
statistics, viewing site photographs and layer queries.
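
The moderated update workflow can be sketched as a small data structure. This is an illustration of the suggest-then-accept pattern only and does not reflect the ESRI geodatabase schema; the class name, fields and polygon ids are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VegetationDatabase:
    """Corporate layer plus a queue of suggested changes (hypothetical model)."""
    polygons: dict                      # polygon_id -> vegetation type
    pending: list = field(default_factory=list)

    def suggest(self, polygon_id, new_type, reason):
        # Client edits never touch the corporate layer directly.
        self.pending.append((polygon_id, new_type, reason))

    def moderate(self, accept):
        """Apply every pending change that the moderator accepts.

        accept: callable(polygon_id, new_type, reason) -> bool
        Returns the number of changes applied; rejected changes are discarded.
        """
        applied = 0
        for polygon_id, new_type, reason in self.pending:
            if accept(polygon_id, new_type, reason):
                self.polygons[polygon_id] = new_type
                applied += 1
        self.pending = []
        return applied
```

Keeping suggestions separate from the corporate layer means the published map only changes after review, while the queue preserves the reason given for each proposed change until it is moderated.
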

Vegetation map from the 2010 MCMAA Vegetation Geodatabase Draft 5.0 (see Roff et al.,
2010 for detail)
The crown scale segmentation was implemented with a focus on scattered tree
crowns. It was not designed to distinguish individual crowns in closed forest so the
layer was dissolved to merge adjacent crowns. The method excluded shadows and
was highly effective at detecting scattered woody crowns, particularly in dry
grassland and shrubland. The layer is made up of 2 million polygons but these do
not always represent a single crown. Where trees occur in stands they appear as a
single large polygon. The product of crown scale segmentation underwent a manual
quality assurance program to remove any agricultural activities or urban areas that
were erroneously categorised as woody.

[Figure panels: Image Objects; Tree Crown Layer; Candidate Native]

Crown scale image objects were created based on SPOT 5 imagery. The method works
particularly well on scattered crowns in dry grasslands.

[Figure legend: Crown Layer. Non-Woody Vegetation; Woody Vegetation at a Crown Scale]

Woody vegetation based on a crown scale segmentation layer for the MCMAA.
The crown layer provides a novel way to quantify woody vegetation at a crown scale
across an entire catchment. The crown layer established that 20.5% of the
catchment is woody vegetation. The results cannot be compared directly to the area
estimates of candidate native vegetation, as these did not account for scattered
trees. However, the results were compared to the NSW SLATS estimates used for
detection of change in woody vegetation. Foliage Projective Cover (FPC) rasters
derived from 2008 Landsat images were compared to the area of woody vegetation
captured during segmentation. As expected, there was a linear correlation between the
two, with r² ranging between 0.49 and 0.69 depending on the Landsat scene.
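
The r² statistic used in this comparison is the squared Pearson correlation between two paired series. A minimal sketch follows, assuming the FPC-derived woody area and the segmentation-derived woody area have been summarised over the same set of units (for example grid cells); the choice of unit and any sample values are assumptions, as the text does not specify them.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)
```

For a simple linear fit this value equals the proportion of variance in one series explained by the other, so values of 0.49 to 0.69 indicate a moderate linear relationship between the two woody vegetation estimates.
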

Discussion and Conclusion


The use of feature recognition allowed for a large area to be mapped rapidly and
cost effectively. It produced a polygon layer that is flexible and allows for
re-attribution of individual stands of trees and the iterative publication of updated maps.
The project was constrained by software and hardware limitations in its early phase
but was later aided by the grid computing facilities of Definiens Server. Future
projects can now be based on pan-sharpened SPOT 5 multispectral data and other
high resolution remote sensing data, including lidar, at a catchment wide scale.
Further research is required to optimise the approach when using optical data in
areas with high rainfall and in mountainous terrain.
The manual attribution of a candidate native class dealt well with
chenopod shrublands and native grasses which are difficult to identify using other
approaches. Manually selecting all candidate native vegetation was a time
consuming process but it was made easier by the ability to select or 'paint' large
swaths of polygons at a time with an attribute. An assessment of the accuracy of this
product is desirable and could be achieved with randomised rapid field surveys or
with higher resolution imagery but this was outside the scope of the project. The
process could be improved with the use of a woody mask based on crown scale
segmentation.
Crown scale segmentation performed poorly over fertilized or irrigated pasture as it
is spectrally similar to woody vegetation. It was optimised for tree crowns and was
not effective for chenopod shrublands and small native pine trees with narrow
crowns. However, the area of these polygons combined is likely to be the most
accurate inventory of woody vegetation in the MCMAA and is based on higher
resolution imagery than previous efforts at quantifying woody vegetation (e.g.
SLATS). The results have the advantage of being able to be interrogated as a
polygon coverage. The results have applications in carbon accounting and the
monitoring of woody vegetation change.
Rare and undersampled communities were difficult to map and existing survey
information was scarce for the size of the catchment. Full floristic surveys used a
majority of the project budget and served an important role in the mapping but did
not adequately sample the diversity of vegetation expected. Stratification was
hampered in the Riverina bioregion as there was little existing vegetation data
outside State Forests and reserves and little variation in substrate. The non-woody
native communities in particular were poorly sampled. A combination of full floristic
survey and targeted rapid survey is recommended for future mapping.
In isolation, SPOT 5 and Landsat bands are not sufficient to differentiate all the
vegetation communities encountered. The object-based supervised classification
approach applied here combined environmental layers and satellite imagery. It was
a practical compromise for this project but needs to be developed further. Manual
attribution based on photo pattern interpretation was time consuming but is still an
important part of vegetation mapping.
The MCMAA Vegetation Geodatabase Draft 5.0 is available as a single
geodatabase file and can be viewed on the DECCW intranet through a web browser
and in ESRI ArcMap. The opportunity exists to create a distributed geodatabase that
would allow for online amendments and links to other vegetation databases that are
regularly updated (e.g. YETI). More research on this potential is recommended so
that static paper maps can be replaced with user-generated products that can evolve
as more information is collected.

References
Baatz, M., and Schäpe, A., 2000, Multiresolution Segmentation: an optimization
approach for high quality multi-scale image segmentation. Angewandte
Geographische Informationsverarbeitung, 12, 12-23.
Benson, J.S. (2006) New South Wales Vegetation Classification and Assessment:
Introduction - the classification, database, assessment of protected areas and
threat status of plant communities. Cunninghamia 9(3): 331-382.
Benson, J.S., Allen, C., Togher, C. & Lemmon, J. (2006) New South Wales
Vegetation Classification and Assessment: Part 1 Plant communities of the
NSW Western Plains. Cunninghamia 9(3): 383-451.
Benson, J.S. (2008) New South Wales Vegetation Classification and Assessment:
Part 2 Plant communities in the NSW South-western Slopes Bioregion and
update of NSW Western Plains plant communities. Version 2 of the
NSWVCA database. Cunninghamia 10(4): 599-673.
Benz, U. C., Hofmann, P., Willhauck, G., Lingenfelder, I., and Heynen, M., 2004,
Multi-resolution, object-oriented fuzzy analysis of remote sensing data for
GIS-ready information. ISPRS Journal of Photogrammetry and Remote
Sensing, 58, 239-258.
Beadle, N., and Costin, A., 1952, Ecological classification and nomenclature. In
Linn. Soc. New South Wales, 77, 62-82.
Breiman, L., 2001, Random forests. Machine Learning, 45, 5-32.
Collett, L., Goulevitch, B., and Danaher, T., 1998, SLATS Radiometric Correction: A
Semi-Automated, Multi-Stage Process for the Standardisation of Temporal
and Spatial Radiometric Differences.
Definiens, 2009, Definiens Professional 7 Reference Book. München, Germany.
Department of the Environment, Water, Heritage and the Arts, 2005, Interim
Biogeographic Regionalisation for Australia (IBRA) Version 6.1, Department
of the Environment, Water, Heritage and the Arts.
Department of Environment and Resource Management (2009). Land cover change
in Queensland 2007–08: a Statewide Landcover and Trees Study (SLATS)
Report, Oct, 2009. Department of Environment and Resource Management,
Brisbane.
Gellie, N. J. H., 2005, Native Vegetation of the Southern Forests: South-east
Highlands, Australian Alps, South-west Slopes, and SE Corner bioregions.
Cunninghamia, 9, 219-253.
Hay, G. J., Blaschke, T., Marceau, D. J., and Bouchard, A., 2003, A comparison of
three image-object methods for the multiscale analysis of landscape
structure. ISPRS Journal of Photogrammetry and Remote Sensing, 57,
327-345.
ITT Visual Information Solutions, 2009, ENVI Reference Guide: ENVI Version 4.7,
ITT Visual Information Solutions, Ohio.
McKenzie, N., Grundy, M., Webster, R., and Ringrose-Voase, A., 2008, Guidelines
for Surveying Soils and Land Resources. CSIRO Publishing, Melbourne, 557.
Rakotomalala, R., 2005, TANAGRA Data Mining v1.4.36: Free Software for
Research and Academic Purposes: Proceedings of EGC, RNTI-E-3, p.
697-702.
Robinson, D., Redding, N., and Crisp, D., 2002, Implementation of a fast algorithm
for segmenting SAR imagery, Defence Science and Technology
Organisation.
Roff, A., 2009, Floristics, Fuel Load and Condition: Object-Based, Hyperspectral
Remote Sensing of Native Vegetation, PhD thesis, University of NSW,
Sydney.
Roff, A., Sivertsen, D., and Denholm, B., 2010, The Native Vegetation of the Murray
Catchment Management Authority Area. NSW Department of Environment,
Climate Change and Water, Sydney.
Sivertsen, D.P., 2009, Native Vegetation Type Standard Version 0.9. NSW
Department of Environment, Climate Change and Water, Sydney.
