Machine Learning Based Autism Detection Using Brain Imaging PDF

Rochester Institute of Technology
RIT Scholar Works

Theses Thesis/Dissertation Collections
3-27-2017
Machine Learning Based Autism Detection Using

Brain Imaging
Gajendra Jung Katuwal
gjk3423@rit.edu
Follow this and additional works at: http://scholarworks.rit.edu/theses
Recommended Citation
Katuwal, Gajendra Jung, "Machine Learning Based Autism Detection Using Brain Imaging" (2017). Thesis. Rochester Institute of
Technology. Accessed from
This Dissertation is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for
inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact ritscholarworks@rit.edu.
Machine Learning
Based Autism Detection
Using Brain Imaging
A dissertation
by
Gajendra Jung Katuwal

Submitted in partial fulfilment
of
Doctor of Philosophy in Imaging Science
Chester F. Carlson Center for Imaging Science
Rochester, New York, USA
March 27, 2017

Chester F. Carlson Center for Imaging Science
Rochester, New York, USA
CERTIFICATE OF APPROVAL
PhD Degree Dissertation
The PhD Degree Dissertation of Gajendra Jung Katuwal has been examined and approved
by the following dissertation committee as satisfactory for a PhD degree in Imaging Science.
Dr. Andrew Michael, Dissertation Advisor Date
Dr. Stefi Baum, Dissertation Advisor Date
Dr. Manuela Campanelli, Committee Chair Date
Dr. Nathan Cahill, Committee Member Date
Dr. John Kerekes, Committee Member Date

Dedicated to my parents.
ACKNOWLEDGMENTS
First of all, I would like to express my sincere gratitude to my advisors, Dr. Andrew Michael and
Dr. Stefi Baum for their guidance and training during this fulfilling journey of four years and some
months. Andrew believed in me despite of my zero experience in brain imaging and guided me to
this point. Stefi, despite being physically on the other side of the 49th parallel, always inspired me
to ask the right questions, which was pivotal for the evolution of my thought process and was
immensely helpful during the course of my research.
I would also like to thank the rest of the dissertation committee: Dr. Nathan Cahill, Dr. John
Kerekes, and Dr. Manuela Campanelli, for many insightful comments and questions. My sincere
gratitude also goes to my cool lab mates Chao, Chase, and Viraj. Without them, this journey would
have been difficult. My sincere gratitude goes to all the individuals behind the data, ideas,
publications, software, etc. used in my research and all helpful folks from stackexchange. Without
their contributions, I would not have made this far.
I would like to thank Ronnie and Rob for their help in grammatical debugging. Finally, I am
grateful to several baristas from the beautiful town of Lewisburg to the hip places in Cambridge
and Somerville for helping me to overclock my brain—a necessity for finishing this dissertation.
ABSTRACT
Autism Spectrum Disorder (ASD) is a group of heterogeneous developmental disabilities
that manifest in early childhood. Currently, ASD is primarily diagnosed by assessing the behavioral
and intellectual abilities of a child. This behavioral diagnosis can be subjective, time consuming,
inconclusive, does not provide insight on the underlying etiology, and is not suitable for early
detection. Diagnosis based on brain magnetic resonance imaging (MRI)—a widely used non-
invasive tool—can be objective, can help understand the brain alterations in ASD, and can be
suitable for early diagnosis. However, the brain morphological findings in ASD from MRI studies
have been inconsistent. Moreover, there has been limited success in machine learning based ASD
detection using MRI derived brain features. In this thesis, we begin by demonstrating that the low
success in ASD detection and the inconsistent findings are likely attributable to the heterogeneity
of brain alterations in ASD. We then show that ASD detection can be significantly improved by
mitigating the heterogeneity with the help of behavioral and demographics information. Here we
demonstrate that finding brain markers in well-defined sub-groups of ASD is easier and more
insightful than identifying markers across the whole spectrum. Finally, our study focused on brain
MRI of a pediatric cohort (3 to 4 years) and achieved a high classification success (AUC of 95%).
Results of this study indicate three main alterations in early ASD brains: 1) abnormally large
ventricles, 2) highly folded cortices, and 3) low image intensity in white matter regions suggesting
myelination deficits indicative of decreased structural connectivity. Results of this thesis
demonstrate that the meaningful brain markers of ASD can be extracted by applying machine
learning techniques on brain MRI data. This data-driven technique can be a powerful tool for
early detection and understanding brain anatomical underpinnings of ASD.
i
ABBREVIATIONS
ABIDE: Autism Brain Imaging Data Exchange
ADHD: Attention Deficit Hyperactivity Disorder
AS: Autism Severity
ASD: Autism Spectrum Disorder
AUC: Area Under the Receiver Operating Curve
CARS: Childhood Autism Rating Scale
CDC: Center for Disease Control and Prevention
CTR: Controls
CSF: Cerebrospinal Fluid
Curvind: Curvature Index
DB: Demographics and Behavioral Measures
DC: Diencephalon
FAST: FMRIB's Automated Segmentation Tool
Foldind: Folding Index
FS: FreeSurfer
FSL: FMRIB Software Library
Gauscurv: Gaussian Curvature
GM: Gray Matter
Meancurv: Mean Curvature
MRI: Magnetic Resonance Imaging
SPM: Statistical Parametric Mapping
ii
Std.: Standard Deviation
TDC: Typically Developing Controls
TIV: Total Intracranial Volume
VIQ: Verbal Intelligent Quotient
WM: White Matter
iii
CONTENTS
ABSTRACT ........................................................................................................................................ i
ABBREVIATIONS ........................................................................................................................... ii
FIGURES.. ......................................................................................................................................... x
TABLES… ........................................................................................................................................ xii
CHAPTER 1: INTRODUCTION ................................................................................................... 1
1.1 Autism Spectrum Disorder ..................................................................................................................1
1.2 Importance of early detection of ASD .................................................................................................2
1.3 Problem 1: Behavior based diagnosis of ASD is late and does not help to understand
underpinnings of ASD ..................................................................................................................................3
1.4 Problem 2: Inconsistent brain anatomical findings in ASD ................................................................4
1.5 Problem 3: Low Classification success in large multi-site data ............................................................4
1.6 Thesis Contributions ............................................................................................................................4
1.6.1 Demonstrating MRI as a potential tool for detection ................................................................5
1.6.2 Relating inconsistent findings and low classification success to ASD heterogeneity and
methodological differences ....................................................................................................................5
1.6.3 Providing the evidence of brain image processing tool dependent findings and suggesting
multi-variate techniques as a partial solution ........................................................................................6
1.6.4 Divide and Conquer: we propose identifying brain biomarkers in sub-groups of ASD is easier
and more meaningful than across the whole spectrum .........................................................................7
1.6.5 Brain biomarkers discovery for early detection of ASD .............................................................7
1.7 Thesis Organization .............................................................................................................................8
1.7.1 Chapter 1: Introduction .............................................................................................................8
iv
1.7.2 Chapter 2: Background ..............................................................................................................9
1.7.3 Chapter 3: Method Dependent Findings ...................................................................................9
1.7.4 Chapter 4: Brain Morphology Based ASD Detection and Tackling ASD Heterogeneity ......10
1.7.5 Chapter 5: Early Detection of ASD .........................................................................................10
1.7.6 Chapter 6: Conclusion .............................................................................................................10
1.7.7 Appendices ...............................................................................................................................10
2 CHAPTER 2: BACKGROUND ................................................................................................ 12
2.1 Brain Imaging .....................................................................................................................................12
2.2 MRI………………………………………………………………………………………………………………………………….14
2.2.1 MRI acquisition .......................................................................................................................14
2.3 Brain Feature extraction from MRI ...................................................................................................16
2.3.1 SPM……….. ............................................................................................................................16
2.3.2 FSL…….. .................................................................................................................................18
2.3.3 FS………… .............................................................................................................................19
2.4 Analysis of neuroimaging data ............................................................................................................23
2.4.1 Voxel-based ..............................................................................................................................23
2.4.2 Vertex-based or Surface-based.................................................................................................24
2.4.3 Feature-based ...........................................................................................................................24
2.5 Machine Learning ..............................................................................................................................25
2.5.1 Random Forest (RF) .................................................................................................................27
2.5.2 Gradient Boosting Machine (GBM) .........................................................................................29
3 CHAPTER 3: DEPENDENCY OF BRAIN FINDINGS ON IMAGE PROCESSING
TOOLS AND POTENTIAL SOLUTIONS ................................................................................. 31
3.1 Introduction ........................................................................................................................................31
3.2 Methods…… ........................................................................................................................................35
v
3.2.1 Data……. .................................................................................................................................35
3.2.2 Tissue Segmentation and Brain Volumes Estimation ..............................................................36
3.2.3 Statistical Analysis ....................................................................................................................38
3.3 Results……...........................................................................................................................................40
3.3.1 Estimated brain volumes and inter-method differences ...........................................................40
3.3.2 ASD vs. TDC inter-group differences is dependent on the method used ................................42
3.3.3 Differential bias of methods to the diagnostic group ................................................................45
3.4 Discussion....................................................................................................................................... 46
3.4.1 ASD vs. TDC inter-group differences dependent on the method used ...................................47
3.4.2 Comparison of ASD vs. TDC brain volume differences with previous studies .......................48
3.4.3 Methods have different biases for diagnostic group, and many of them are larger than inter-
group differences .................................................................................................................................50
3.4.4 Locations of the inter-method segmentation discrepancies .....................................................50
3.4.5 Sources of the inter-method discrepancies in tissue segmentation ...........................................55
3.4.6 Implications to future neuroimaging studies ............................................................................56
3.5 Conclusion ..........................................................................................................................................60
4 CHAPTER 4: UTILIZING NON-BRAIN INFORMATION TO IMPROVE ASD
DETECTION BASED ON BRAIN MORPHOLOGY ............................................................... 62
4.1 Introduction ........................................................................................................................................62
4.2 Methods……….. ...................................................................................................................................65
4.2.1 MRI Data .................................................................................................................................65
4.2.2 Reasons for using male subjects only .......................................................................................66
4.2.3 MRI Data processing ...............................................................................................................67
4.2.4 Classification Algorithms ..........................................................................................................67
4.2.5 Metrics Used ............................................................................................................................68
vi
4.2.6 Adding AS, VIQ, and age information to the brain morphometric features ..........................69
4.3 Results……...........................................................................................................................................70
4.3.1 Classification using only brain morphometric properties ........................................................70
4.3.2 Age and VIQ used as training features in conjunction with brain morphometric features .....70
4.3.3 Sub-grouping subjects by AS, VIQ, and age ...........................................................................71
4.3.4 Multivariate analysis: Important features for classification and their variability across sub-
groups…………………………………………………………………………………………….….75
4.4 Discussion …………………………………………………………………………………………………………………………81
4.4.1 Classification becomes difficult with increase in AS.................................................................82
4.4.2 Folding index and curvature features may be important markers for early detection of ASD 84
4.4.3 ASD heterogeneity in brain morphometry ..............................................................................86
4.4.4 Low classification success in large multi-site data due to ASD heterogeneity ..........................87
4.4.5 Future direction for neuroimaging studies ..................................................................................88
4.5 Conclusion ..........................................................................................................................................93
5 CHAPTER 5: EARLY DETECTION OF AUTISM USING BRAIN MORPHOLOGY .... 94
5.1 Introduction ........................................................................................................................................95
5.2 Material and Methods ........................................................................................................................96
5.2.1 Subjects…… .............................................................................................................................96
5.2.2 Brain Features Extraction ........................................................................................................98
5.2.3 Univariate Analysis...................................................................................................................99
5.2.4 Multivariate Analysis: ASD vs. CTR Classification .................................................................99
5.3 Results…….........................................................................................................................................100
5.3.1 ASD vs. CTR Brain Differences ............................................................................................100
5.3.2 ASD vs. CTR Classification ...................................................................................................104
5.3.3 Important Features for ASD vs. TDC Classification .............................................................105
vii
5.4 Discussion ……………………………………………………………………………………………………………………….110
5.4.1 Amygdala in ASD ..................................................................................................................111
5.4.2 Early brain overgrowth in ASD .............................................................................................111
5.4.3 Overgrowth is related to increase in cortical surface area but not thickness .........................111
5.4.4 Higher cortical folding in ASD brains and its relation to early brain overgrowth and cortical
expansion……...................................................................................................................................112
5.4.5 Larger Ventricles and extra-axial volume in ASD may cause additional cortical folding .....114
5.4.6 WM in frontal and temporal regions less myelinated in ASD ...............................................115
5.4.7 Brain overgrowth in ASD is followed by arrested growth and even degeneration ................116
5.5 Conclusion ........................................................................................................................................118
CHAPTER 6: CONCLUSION AND FUTURE WORK ........................................................... 120
5.6 Contributions ....................................................................................................................................121
5.6.1 We demonstrated MRI as a potential tool for ASD detection ...............................................121
5.6.2 We showed that the methodological differences are a cause of inconsistent brain imaging
findings……………. .........................................................................................................................121
5.6.3 We showed that identifying brain biomarkers in sub-groups of ASD is easier and more
meaningful than across the whole spectrum .....................................................................................122
5.6.4 We identified potential brain markers for early detection of ASD.........................................123
5.7 Future Work ....................................................................................................................................123
5.7.1 Critical need for large longitudinal studies .............................................................................123
5.7.2 Data-driven brain features .....................................................................................................124
BIBLIOGRAPHY ......................................................................................................................... 126
Appendix A..................................................................................................................................... 151
Appendix B ..................................................................................................................................... 154
viii
Appendix C ..................................................................................................................................... 161
ix
FIGURES
Figure 1.1 Thesis Organization ...................................................................................................... 9
Figure 2.1: Sagittal view of a T1 weighted brain MRI. ................................................................ 13
Figure 2.2: MRI acquisition process. ............................................................................................ 15
Figure 2.3: SPM Tissue Segmentation. ........................................................................................ 18
Figure 2.4: Freesurfer (FS) recon-all processing workflow .............................................................. 20
Figure 2.5: Sub-cortical segmentation & cortical parcellation in Freesurfer (FS). ........................ 22
Figure 2.6: Random Forest is an ensemble of decision trees. ....................................................... 28
Figure 3.1: Distribution of estimated brain volumes and inter-method differences. .................... 42
Figure 3.2: ASD – TDC brain volume differences are method dependent .................................. 44
Figure 3.3: Inter-method segmentation comparison .................................................................... 52
Figure 3.4: Multi-variate pattern is more robust across methods ................................................. 59
Figure 4.1: Improvement in classification by sub-grouping .......................................................... 72
Figure 4.2: Important features for classification are different across sub-groups.......................... 77
Figure 4.3: Variability of the important features. ......................................................................... 78
Figure 4.4: Folding index and curvature features are important for classification in young subjects
............................................................................................................................................... 85
Figure 4.5: ASD vs. TDC group difference in ventricular volumes decreases with verbal IQ (VIQ)
............................................................................................................................................... 91
Figure 5.1: Subject Selection Flowchart ....................................................................................... 97
Figure 5.2: Classification Flowchart............................................................................................ 100
Figure 5.3: Statistically Significant Brain Alterations. ................................................................ 101
x
Figure 5.4: Global Volume Features ........................................................................................... 104
Figure 5.5: ASD vs. CTR Classification Scores. ......................................................................... 105
Figure 5.6: Important brain feature for ASD vs. CTR classification.......................................... 107
Figure 5.7: Important morphology features for ASD vs. CTR classification. ............................ 108
Figure 5.8: Important intensity features for ASD vs. CTR classification. .................................. 109
Figure 5.9: Cortical expansion theory on cortical gyrification ................................................... 113
Figure 5.10: Higher cortical folding in ASD brain. .................................................................... 115
Figure 5.11: Inefficient communication in ASD brain. .............................................................. 116
Figure 5.12: Arrested brain growth in ASD ............................................................................... 118
Figure 8.1: Gradient Boosting Machine: Improvement in classification by sub-grouping based on
autism severity (AS), age and Verbal IQ (VIQ). ................................................................. 154
Figure 8.2. Gradient Boosting Machine (GBM): Important features for classification are variable
across sub-groups. ............................................................................................................... 155
Figure 8.3: Inter and intra sub-groups classification AUC scores .............................................. 156
Figure 8.4: Random Forest: Important features for classification in sub-groups (10% threshold).
............................................................................................................................................. 157
Figure 8.5: Random Forest: Important features for classification by in sub-groups (25% threshold).
............................................................................................................................................. 158
Figure 8.6: Classification performance degrades with Autism Severity (AS). ............................. 159
Figure 9.1: Histogram of the effect sizes of the statistically significant (at 0.05) differences in brain
features ................................................................................................................................ 161
xi
TABLES
Table 3.1: Subject demographics and behavioral (DB) measures ................................................ 35
Table 3.2: Estimated brain volumes and inter-method differences .............................................. 41
Table 3.3: ASD vs. TDC brain volume differences ...................................................................... 43
Table 3.4: Differential bias for diagnostic group (ASD) ............................................................... 46
Table 4.1: Previous ASD vs. TDC Classification studies ............................................................. 64
Table 4.2: Subject demographics and behavioral (DB) measures ................................................ 66
Table 4.3: Sub-groups definition .................................................................................................. 70
Table 4.4: Classification AUC in sub-groups created by AS, VIQ and age................................. 73
Table 7.1: Estimated brain volumes and inter-method differences in NYU .............................. 151
Table 7.2: ASD vs. TDC brain volume differences in NYU ...................................................... 152
Table 7.3: Differential bias for diagnostic group (ASD) in NYU ............................................... 153
Table 8.1: Correlation between the feature importance scores from RF and GBM .................. 160
xii
1 CHAPTER 1: INTRODUCTION
1.1 Autism Spectrum Disorder
Autism Spectrum Disorder (ASD) is a group of lifelong developmental disabilities that
manifest in early childhood. According to DSM 5 (American Psychiatric Association, 2013), ASD
is characterized by impairment in social-communication and behavior domains, including
repetitive behaviors and restrictive interests. According to a recent Center for Disease Control and
Prevention (CDC) survey (CDC, 2014), 1 in 68 children (1 in 42 boys and 1 in 189 girls) have ASD.
A more recent CDC survey of parents indicated that this number can be as high as 1 in 45
(Zablotsky et al., 2015). In addition to the substantial difficulties faced by ASD individuals and their
parents, the economic burden of ASD is high. A recent study by Leigh and Du (2015) has estimated
ASD’s economic cost for 2015 to be $268 billion in the United States alone. The study projects
annual costs rising to $461 billion in 2025 if ASD’s prevalence remains constant and more than $1
trillion by 2025 if ASD’s prevalence continues the steep rise seen over the last decade.
ASD is highly heterogeneous in its etiology, comorbidity, pathogenesis, genetics, and
severity (Betancur, 2011; Happé et al., 2006; Jeste & Geschwind, 2014; Ronald et al., 2006). It has
a strong genetic basis and is highly heritable (Ronald et al., 2006). It has been reported that 10%–
20% of individuals with ASD have an identified genetic etiology. The genetic architecture of ASD
is highly heterogeneous (Geschwind & Levitt, 2007; Jeste & Geschwind, 2014); ASD has been
linked to more than 100 different genes affecting different aspects of neurodevelopment and
function (Betancur, 2011). In addition, ASD shows high comorbidity with other psychiatric
1
disorders (Dougherty et al., 2016b; Matson et al., 2013). For example, it has been estimated that
14 to 78% of children with ASD also meet the criteria for Attention Deficit and Hyperactivity
Disorder (ADHD) (Gargaro et al., 2011), up to 42% meet the criteria for anxiety disorders (Matson
et al., 2013), and 25 to 70% have some level of intellectual disability (Fombonne, 2009). Similarly,
brain alterations in ASD have been found to be highly heterogeneous which is discussed in Section
1.6.2
1.2 Importance of early detection of ASD
Early detection of ASD is important because it allows for the application of early
intervention methods. It has been shown that early intervention is effective in decreasing
impairments (Dawson et al., 2010) and may result in more positive long-term outcomes for the
child (Pickles et al., 2016; Rogers & Vismara, 2010). The annual economic cost of ASD has been
significant—and the largest factors contributing to the cost are lost productivity and adult care
(Ganz, 2007). With successful application of early intervention methods, in addition to improving
the quality of life of ASD individuals, the economic cost related to ASD can also be significantly
reduced. And, for successful application of early intervention methods, early detection of ASD is
required.
Early detection of ASD may help to disentangle the effects of genetic and environmental
risk factors of ASD and may improve our knowledge of its underlying etiology. A large study by
Sandin et al. (2014) including 2 million children (~14500 ASD) have reported that the risk of ASD
is influenced equally by genetic and environmental factors. Environmental factors include a wide
range of influences such as parental age (Sandin et al., 2015), obstetric conditions (Kolevzon et al.,
2007), medication used (Boukhris et al., 2015), maternal nutrition (Lyall et al., 2013; Schmidt et
al., 2014), exposure to chemicals during prenatal stage, prenatal stress (Kinney et al., 2009), etc.
2
The environmental factors related to the risk of ASD can interact with each other making the
etiology of ASD more complex. If detection can be achieved at the neonatal stage, it can help in
separating out the effects of post-natal environmental risk factors of ASD—thus, improving our
knowledge of ASD etiology.
In addition, it may be the case that ASD is closer to normal development in its presentation
at a younger age, before environmental factors have influenced its postnatal development. That
is, when a child is tested for ASD as early as possible, fewer environmental factors will have
contributed to the development of ASD after birth—which means that the underlying etiology and
perhaps its manifestation may be more homogenous—which in turn, would make it easier to detect
and understand the underlying etiology of ASD in children through the application of biomarkers
at very young age.
Below we briefly discuss the major challenges faced by current techniques for ASD
detection and studies investigating brain morphology of ASD.
1.3 Problem 1: Behavior based diagnosis of ASD is late and does not help
to understand underpinnings of ASD
Currently ASD diagnosis is based on clinical assessments based on observations of the
individual's behavior and intellectual abilities. This diagnosis procedure can be subjective, time
consuming, and inconclusive due to factors such as comorbidity (Close et al., 2012). An objective
diagnostic tool is highly valuable and its importance further increases from the fact that currently
existing behavioral diagnosis has been found to be highly subjective especially at a young age where
there is so little to observe. In addition, since the present diagnosis procedure is based only on
behavioral symptoms, it does not provide insight on the brain anatomical underpinnings and
underlying etiology of ASD. Furthermore, it is difficult to utilize it for early diagnosis and
3
intervention. According to a recent report by CDC (CDC, 2014), the median age of ASD diagnosis
is 53 months. Similarly, it has been reported that more than half of school-aged kids were age 5 or
older when they were first diagnosed with ASD and less than 20% were diagnosed by age 2 years
(Pringle et al., 2012).
1.4 Problem 2: Inconsistent brain anatomical findings in ASD
A number of studies utilizing structural Magnetic Resonance Imaging (MRI) data to
investigate the brain morphology of ASD have reported alterations in brain regions involved in
language and social behavior, particularly fronto-temporal regions (Bigler et al., 2007; Ha et al.,
2015), and the amygdala-hippocampus complex (Groen et al., 2010; Nordahl et al., 2012). Early
brain overgrowth (Campbell et al., 2014), alterations in corpus callosum (Wolff et al., 2015),
cerebellum (D’Mello et al., 2015), and fusiform (Dougherty et al., 2016a; van Kooten et al., 2008)
have also been reported by studies. However, these findings have been somewhat inconsistent
(Amaral et al., 2008; Chen et al., 2011; Katuwal et al., 2015a).
1.5 Problem 3: Low Classification success in large multi-site data
In recent years, several studies have applied machine learning on MRI derived brain
features, targeted at ASD detection. These studies have been successful with high classification
accuracies (80%) for well-matched small datasets (n < 200). However, the studies using the large
multi-site ABIDE dataset (n > 700) (Haar et al., 2014; Katuwal et al., 2015b; Sabuncu &
Konukoglu, 2014) have reported low classification accuracies (~60%).
1.6 Thesis Contributions
4
1.6.1 Demonstrating MRI as a potential tool for detection
As a solution to the Problem 1 i.e. “Behavior based diagnosis of ASD is late and does not help to
understand the underpinnings of ASD”, we demonstrate that ASD can be successfully detected using
machine learning on MRI data. Using brain images of both children and adult subjects, data-
driven potential brain biomarkers that are consistent with the current biological understanding of
ASD were found. In addition, we could successfully classify ASD subjects from non-ASD subjects
in many cases. From the results of this thesis, we can safely argue that the application of machine
learning on MRI data can be a useful technique for early detection of ASD and for understanding
its brain anatomical underpinnings. This analysis and results are presented in Chapter 4 and
Chapter 5.
1.6.2 Relating inconsistent findings and low classification success to ASD
heterogeneity and methodological differences
In this thesis, we have attempted to identify the causes of Problem 2: “Inconsistent brain
anatomical findings in ASD” and Problem 3: “Low Classification success in large multi-site data”. We
attribute methodological differences and heterogeneity of ASD, or a combination of both, as the
likely underlying causes of these discrepancies.
The methodological differences can arise from the differences in image preprocessing tools,
analytical models, covariates, etc. In this thesis, we focus on methodological differences due to the
choice of image preprocessing tool. This analysis and results are presented in Chapter 3.
The heterogeneity of ASD brain morphology can be mainly attributed to the significant
dependence of brain alterations on demographics and behavioral (DB) measures such as age, sex,
IQ, handedness, etc. Previous studies have reported variability in brain alterations with factors
such as age (Lin et al., 2015), gender (Lai et al., 2013), handedness (Floris et al., 2013) etc. In this
5
thesis, we focus on the variability ASD brain alterations with respect to three DB measures: age,
verbal IQ (VIQ), and autism severity (AS). This analysis and results are presented in Chapter 4.
1.6.3 Providing the evidence of brain image processing tool dependent findings and
suggesting multi-variate techniques as a partial solution
We investigate differences in image preprocessing tools as a possible cause behind the
inconsistent anatomical findings in ASD and in neuroimaging in general. In particular, we focus
on the estimation of the following brain volumes: gray matter (GM), white matter (WM),
cerebrospinal fluid (CSF), and total intracranial volume (TIV) from T1-weighted MRIs using three
popular preprocessing methods: SPM, FSL, and FreeSurfer (FS). We found that the estimated
brain volumes did not agree across methods and the methods showed differential biases for ASD,
and many of the biases were larger than ASD versus TDC differences. In summary, we
demonstrated that the differences in the choice of image preprocessing tool is a reason behind the
inconsistent neuroimaging findings in ASD.
Improving the current brain image preprocessing tools to make them more accurate and
standardizing them is the best possible solution to remove the effects of methodological differences
in the end findings. However, when the existing popular tools have to be used, we suggest
investigating multi-variate relationships (in addition to univariate relationships) in brain alterations
because multi-variate relationships are more robust across methods and scanners. In this thesis, we
provide few examples to corroborate this view. This analysis and results are presented in Chapter
3.
6
1.6.4 Divide and Conquer: we propose identifying brain biomarkers in sub-groups of
ASD is easier and more meaningful than across the whole spectrum
We first propose and then demonstrate that the heterogeneity in ASD brain morphometry
can be mitigated by augmenting the information from demographics and behavioral measures
(DB) of the subjects. We demonstrate that identifying ASD brain alterations in relatively
homogenous sub-groups is easier and more insightful than across the whole heterogeneous ASD
spectrum. We use automatically extracted multiple brain morphological features and multiple
classification techniques for this investigation. We explore if DB measures of the subjects can be
utilized to mitigate the ASD heterogeneity and hence facilitating a better understanding of the
brain alterations associated with ASD and aiding in its detection. First, we investigate the
incremental predictive power that can be gained by adding DB measures such as age, VIQ, and
AS to the brain morphological features derived from MRI. Second, we probe if sub-grouping the
subjects by the above-mentioned DB measures helps to improve ASD vs. TDC classification and
provides better insight into the brain morphometric alterations in ASD. This analysis and results
are presented in Chapter 4.
1.6.5 Brain biomarkers discovery for early detection of ASD
We apply machine learning on automatically extracted multiple brain morphometric and
intensity features to classify young ASD subjects (3 to 4 years) from the controls of the same age
group. We achieved very high success rate (> 90% AUC) for ASD vs. control classification. We
also identified three potential brain markers for early detection of ASD: larger ventricles, larger
TIV, higher amount of cortical folding, and myelination deficits particularly in frontal and
temporal regions. We also noticed that larger TIV of ASD brain is related to larger cortical surface
area but relatively independent to cortical thickness. Further, we could show that the higher
7
amount of cortical folding in ASD brains may be an aftereffect of the early brain overgrowth in
ASD. We also hypothesize that the higher amount of cortical folding in ASD is due to the greater
compressive stress in the cortex induced by both hyper-expansion of the cortex and abnormally
large ventricles. To our knowledge this is the first study to leverage clinical imaging archives to
investigate early brain markers in ASD. The high degree of success in classification and the
biologically relevant potential brain markers indicate that application of advanced analytic
methods on brain features holds promise for aiding early identification of ASD.
1.7 Thesis Organization
This thesis is divided into six chapters and three appendices; see Figure 1.1.
1.7.1 Chapter 1: Introduction
This Chapter contains a brief introduction to ASD and studies investigating brain
alterations in ASD using MRI. It starts with the shortcomings of behavioral based ASD diagnosis
and then provides motivation for brain morphology based ASD detection using machine learning.
This chapter makes an effort to educate the reader about the importance of early detection of ASD
and then presents MRI based ASD detection as a potential technique for it. Finally, this chapter
briefly discusses the challenges to characterize the ASD brain morphology using MRI and then
provides the possible solutions as the aims and contributions of this thesis.
8
Figure 1.1 Thesis Organization
1.7.2 Chapter 2: Background
This Chapter contains the basic elements required to get acquainted with brain imaging in
general and brain MRI data processing and analysis in particular. It includes a brief introduction
to MRI, extraction of brain features from MRI, and univariate and multi-variate analysis
techniques used to identify meaningful patterns from these brain features.
1.7.3 Chapter 3: Method Dependent Findings
This Chapter is based on the Frontiers in Neuroscience research article “Inter-Method
Discrepancies in Brain Volume Estimation May Drive Inconsistent Findings in Autism” (Katuwal
et al., 2016b). Using the case study of brain volumes estimation from three popular brain image
processing tools, this Chapter provides the evidence that the differences in the choice of processing
tool is a cause behind the inconsistent brain anatomical findings in ASD.
9
1.7.4 Chapter 4: Brain Morphology Based ASD Detection and Tackling ASD
Heterogeneity
This Chapter is based on the PLOS ONE journal research article “Divide and Conquer:
Sub-Grouping of ASD Improves ASD Detection Based on Brain Morphometry” (Katuwal et al.,
2016c). Using the brain images of adult subjects (6 to 40 years) from the ABIDE dataset, this
Chapter shows how information from DB measures such as age, VIQ, and AS can help to mitigate
the heterogeneity in ASD brain morphology. This Chapter demonstrates that searching for the
brain biomarkers in meaningful sub-groups of ASD is easier and more meaningful than searching
them across the whole ASD spectrum.
1.7.5 Chapter 5: Early Detection of ASD
Utilizing the brain images of subjects 3 to 4 years of age from a clinical imaging archive at
Geisinger Health System, this Chapter demonstrates that applying machine learning on MRI-
derived brain morphology features, early brain markers of ASD can be identified in a data-driven
fashion and be used for early detection of ASD.
1.7.6 Chapter 6: Conclusion
This Chapter summarizes the methodology and the results of this thesis and provides a
discussion on the implications of the findings of this thesis. Finally, it concludes by discussing the
future work.
1.7.7 Appendices
Appendix A, B, and C contain the supplementary materials form Chapter 3, 4, and 5
respectively.
10
11
2 CHAPTER 2: BACKGROUND
2.1 Brain Imaging
Brain imaging includes the use of diverse techniques to image the structure, function, and
biochemical processes of the brain. Imaging technologies of various modalities now provide the
visualization of the structure and function of the brain with high resolution. Brain imaging is
becoming an important technique in both research and clinical care. It is being successfully used
to facilitate the understanding of the structure and the functions of the brain and also has been a
vital diagnostic tool for many neurological disorders (O’Brien, 2007). Brain imaging can be broadly
categorized into three categories: structural, functional, and molecular imaging (Asbury, 2011).
Structural imaging techniques can capture the anatomical structure of the brain and
include X-ray, Computer Assisted Tomography (CT), MRI, Diffusion Tensor Imaging (DTI) etc.
MRI is based on the principle of nuclear magnetic resonance and uses radiofrequency waves to
probe tissue structure. DTI is a special form of MRI technique that measures the diffusion of water
as a function of spatial location and is used to characterize the microstructural changes in the tissues
(Le Bihan et al., 2001). It is widely used to estimate the WM connectivity patterns in neurological
disorders such as ASD and ADHD (Assaf & Pasternak, 2008).
Functional imaging techniques capture the function or physiology of the brain or measure
the brain metabolism (Raichle, 1998). Most of the techniques such as functional Magnetic
Resonance Imaging (fMRI) and Positron Emission Tomography (PET), measure the amount of
cerebral blood flow as a proxy to the brain metabolism rate assuming that there is more blood flow
in functionally active brain regions.
12
Molecular and cellular imaging techniques probe the biochemical activities of cells and
their molecules (Massoud & Gambhir, 2003). They generally use different kinds of light
microscopes on light-emitting probes. The light-emitting probes are the molecules emitting radio
frequencies of various wavelengths to contrast the target cells.
In addition to the unimodal studies based on brain images of a single modality, there are
several other studies utilizing the joint information from the brain images of multiple modalities
(Michael, 2009; Michael et al., 2010, 2011; Pichler et al., 2010; Townsend & Cherry, 2001).
This thesis makes an effort to identify brain markers for ASD detection using the brain
morphology captured by MRI. So, we will limit our scope to MRI from here on. To be more
particular, MRI refers to structural MRI in this thesis. All the brain MRI images used in this thesis
were T1-weighted. In T1-weighted images, tissues with high fat content (e.g. WM) appear bright
whereas fluids (e.g. CSF) and air appear dark (Hornak, 1996). Figure 2.1 shows the sagittal view
of a typical T1-weighted brain MRI.
Figure 2.1: Sagittal view of a T1 weighted brain MRI.
13
2.2 MRI
MRI is a non-invasive tool for examining the anatomy of the human body. It is based on
the principle of magnetic resonance imaging and it utilizes the magnetic properties of the proton
of the hydrogen atom which is abundant in our body.
2.2.1 MRI acquisition
Here we briefly explain the MRI acquisition process. For detailed information, please refer
to https://radiopaedia.org/articles/mri-introduction, Pooley (2005), Hornak (1996) and Section
2.1 of Michael (2009). A typical MRI machine consists of three major components: a very strong
main magnet, gradient magnets, and a radio frequency emitter. A typical MRI process includes
four major components: slice selection, phase encoding, frequency encoding, and signal
reconstruction (see Figure 2.2).
Slice selection: The protons spin around their axes effectively acting as small magnetic
dipoles. Initially, the magnetic dipoles (protons) are randomly aligned. When a strong external
magnetic field of the MRI machine is applied, in addition to the spinning motion, the dipoles start
to precess along the direction of the external field at the Larmor frequency, which is proportional
to the external magnetic field. After that, a combination of gradient magnets is turned on, by effect
of which, the net magnetic field varies linearly along the axis perpendicular to the 2D slice of
interest—i.e. the net magnetic field is uniform in each slice but is different across the slices—which
means the protons at each slice are precessing with unique frequency and hence can only absorb
electromagnetic waves of particular frequencies to resonate, echo or excite. In other words, varying
net magnetic field allows us to selectively excite the hydrogen atoms only in the slice of interest.
14
Figure 2.2: MRI acquisition process.
Phase encoding. A suitable radio frequency pulse is emitted for a brief time so that the
hydrogen atoms only in the slice of interest are excited. Then the radio frequency is turned off and
the excited hydrogen atoms are allowed to return to the low energy state through a gradual decay.
During relaxation, electromagnetic waves are emitted at the precessing frequency of the Hydrogen
atoms. Then another set of gradient magnets is turned on for a short interval, by effect of which,
the net magnetic field along one axis of the slice of interest varies. Since the precession frequency
is dependent on the external magnetic field, protons at different locations precess with different
frequencies creating location dependent phase lag relative to their initial state. This location
dependent phase lag effectively encodes the spatial information in the slice of interest. In practice,
several phase encoding steps are repeated so that the captured signal has enough information to
reconstruct the anatomical image.
Frequency encoding. After phase encoding, another gradient is applied along the axis
perpendicular to the phase encoding axis so that the hydrogen atoms at different columns precess
with different frequencies proportional to the net magnetic field.
15
Signal reconstruction. The excited hydrogen atoms return to the low energy state
releasing electromagnetic waves. These waves have frequency and phase dependent on their spatial
location and are picked up by receiver coils and the collected 2D signal is called the k-space. A
Fourier Transform is applied on the k-space image to reconstruct the 2D anatomical image of the
slice. Multiple 2D slices are collected and combined to form the 3D MRI volume.
2.3 Brain Feature extraction from MRI
MRI data have to be processed to extract brain features before performing further analyses
on them. There are a number of automatic software tools available from different labs and research
universities for MRI data processing. Among them, the most popular are SPM (Ashburner &
Friston, 2005), FS (Dale et al., 1999a; Fischl et al., 1999b), FSL (Jenkinson & Smith, 2001), AFNI
(Cox, 1996), and LONI (Dinov et al., 2010). MRI data processing can be roughly divided into two
steps. The first step includes preprocessing tasks such as motion correction, non-uniform intensity
normalization, skull stripping, etc. These tasks ensure that the final image is suitable for further
processing. In other words, these preprocessing tasks are image processing operations performed
on raw brain images so that they have desirable properties that are consistent with the assumptions
made by the subsequent registration and segmentation steps. The second step includes tasks such
as registration of a brain image to a standard brain template and segmentation of cortical and sub-
cortical brain regions.
Below we briefly discuss the brain feature extraction steps of three popular neuroimaging
preprocessing tools that were used in this thesis.
2.3.1 SPM
SPM (http://www.fil.ion.ucl.ac.uk/spm/) is a brain imaging software tool mainly designed
for the analysis of the brain image sequences. It was developed and is maintained by the members
16
& collaborators of the Wellcome Trust Centre for Neuroimaging at University College London.
In Chapter 3 of this thesis, SPM was utilized for the segmentation of following brain tissues: GM,
WM, and CSF.
Tissue segmentation in SPM is performed using the New Segment tool (Ashburner et al.,
2013). New Segment utilizes a Gaussian Mixture Model (GMM) with components for GM, WM,
CSF, bone, soft tissue, and air/background where each component is modeled as one or more
Gaussians. The distribution or histogram of image intensities in a T1 weighted brain image (Figure
2.1A) are modeled by a GMM (Figure 2.3c) where the different peaks correspond to the different
image intensities of the tissue classes. The mixture model is updated by combining the spatial
information from a standard tissue probability map (TPM) and the intensity information of the
input MRI (Ashburner & Friston, 2005). The standard TPM (Figure 2.3d) of a tissue contains
information about the spatial location of the tissue in a probabilistic sense. The value at a certain
voxel of a TPM represents the probability of a certain tissue (GM, WM or CSF) belonging to that
voxel. In this example, a voxel value of the TPM in Figure 2.3d represents the probability that the
voxel contains GM. New Segment uses ICBM-452 T1 brain atlases (Mazziotta et al., 2001a) as
standard TPMs.
At first the T1 image is registered in the same space as the standard TPM. After the
registration, the registered T1 image is segmented roughly based on intensity thresholds to get an
initial segmentation map (Figure 2.3b) to create an initial TPM. Then, the segmented tissue map
is refined using Bayes’ theorem with the standard TPM as a priori. The maximum likelihood
estimate of the parameters of the mixture model are estimated using the Expectation Maximization
(EM) algorithm (Do & Batzoglou, 2008). At each iteration of EM, model parameters and the TPM
are updated. The iteration continues until the final parameters of the mixture model are estimated.
A voxel value in the final TPM (Figure 2.3e) represents the probability that the voxel contains the
17
particular tissue. Further details on SPM tissue segmentation can be found at (Ashburner et al.,
2013).
Figure 2.3: SPM Tissue Segmentation.

The distribution (histogram) of image intensities of the T1 image (upper left) are modeled by a Gaussian
Mixture Model (bottom left) where the different peaks correspond to the different image intensities of the
tissue classes. First, the T1 image is registered in same space as standard Tissue Probability Map (TPM)
which contains the anatomical information of a particular tissue in probabilistic sense. Second, the T1 image
is segmented roughly based on intensity thresholds. Third, the segmented tissue map is refined using Bayes
theorem with the standard TPM as a priori. This process is repeated until there is no significant change in
the segmented tissue map. A voxel in the final tissue map represents the probability of that voxel containing
a particular tissue type. Note: A part of this figure was borrowed from Mietchen and Gaser (2009).
2.3.2 FSL
FSL (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki) is a brain imaging software tool designed for the
analysis of both structural and functional brain images. It was developed and is maintained by the
Analysis Group at the Oxford Centre for Functional MRI of the Brain, at the Oxford University.
In Chapter 3 of this thesis, FSL was utilized to for the segmentation of following brain tissues: GM,
WM, and CSF.
Tissue classification in FSL is performed by FMRIB's Automated Segmentation Tool
(FAST) (Zhang et al. 2001). FAST utilizes a hidden Markov random field (HMRF) (Rabiner &
18
Rabiner, 1989) which is a generalized version of the finite mixture model. In the HMRF model
utilized by FAST, each tissue class in the mixture model is represented by a Gaussian. In addition
to a basic mixture model, HMRF model incorporates a Markov random field (MRF) to utilize the
neighborhood information of the voxels. The hidden variables specifying the identity of the
mixture component (parametric distribution or Gaussian in this case) of each observation (voxel
intensity) are related by a Markov process in the HMRF model unlike a finite mixture model where
they are independent of each other. The HMRF model in FAST does not use a standard TPM as
a priori; instead it uses K-means segmentation (Kanungo et al., 2002) to estimate the initial
parameters of the tissue classes. It uses the EM algorithm to find the maximum likelihood estimate
of the model parameters. Before tissue segmentation using FAST, the brain region is extracted
using the brain extraction tool (Smith, 2002).
2.3.3 FS
FS (https://surfer.nmr.mgh.harvard.edu/) is a brain imaging software tool designed for the
analysis of both structural and functional brain images, and was originally developed for the
construction of cortical surface models. It was developed and is maintained by the Laboratory for
Computational Neuroimaging at the Athinoula A. Martinos Center for Biomedical Imaging. In
Chapter 4 and 5 of this thesis, FS was utilized to compute various morphometric properties of a
number of cortical and sub-cortical structures of the brain. In this thesis, the recon-all workflow of
FS v. 5.3.0 (Dale et al., 1999a; Fischl et al., 1999a, 2002; Ségonne et al., 2004) was used to extract
brain morphometric features.
Recon-all is a fully automated workflow (see Figure 2.4) that performs all the FS cortical
reconstruction and sub-cortical segmentation steps in a unified pipeline. It includes several
processing stages such as motion correction, non-uniform intensity normalization, Talairach
19
(Talairach & Tournoux, 1988) transform computation, intensity normalization, skull stripping,
sub-cortical segmentation, and cortical parcellation steps. Detailed steps of recon-all workflow can
be found at https://surfer.nmr.mgh.harvard.edu/fswiki/recon-all. Cortical parcellation and sub-
cortical segmentation steps are briefly explained below.
Figure 2.4: Freesurfer (FS) recon-all processing workflow

Recon-all is a fully automated workflow for cortical reconstruction and sub-cortical segmentation steps.
Detailed steps of recon-all workflow can be found at https://surfer.nmr.mgh.harvard.edu/fswiki/recon-all
2.3.3.1 Sub-cortical Segmentation
FS utilizes the Bayesian approach for the segmentation of sub-cortical structures (Fischl et
al., 2002). At first, a brain image is affine registered to a standard probabilistic atlas called Aseg atlas
(Fischl et al., 2002) which contains the information regarding statistical properties of 37 anatomical
structures. At each voxel of the standard atlas, one of the 37 anatomical labels including left and
right caudate, putamen, pallidum, thalamus, lateral ventricles, hippocampus, and amygdala (see
Figure 2.5a), can belong to a voxel. The intensity distribution of each label is modeled as a
Gaussian. Local spatial relationships between the labeled brain structures are encoded by an
anisotropic non-stationary MRF. After registration to the atlas, the maximum posteriori estimate
of the segmentation or the probability of an anatomical label belonging to each voxel is calculated.
In this Bayesian approach, two forms of prior information are utilized. The first is the global spatial
20
information or the probability that an anatomical label can be present at a particular voxel. This
information is derived from the atlas and the affine transformation of the input image to the atlas.
The second is the local spatial information or the local spatial relationship between anatomical
labels such as “posterior amygdala is frequently superior to anterior hippocampus, but never
inferior to it” (Fischl et al., 2002). This spatial relationship is encoded by an anisotropic non-
stationary MRF.
2.3.3.2 Cortical parcellation
FS subdivides the cerebral cortex on a brain MRI into a number of gyral based regions of
interest or parcellations (Fischl et al., 2004; Desikan et al., 2006). It utilizes the prior information
from a standard atlas such as the Desikan-Killiany atlas (Desikan et al., 2006) which was created
using a dataset of 40 MRI scans, where in each scan, 35 cortical regions were manually identified
in each of the individual hemispheres (see Figure 2.5b). The local surface geometry of a location in
the cortex and the prior information derived from the atlas are combined in a Bayesian framework
to calculate the maximum posteriori estimate of a brain parcellation belonging to that location.
Similar to sub-cortical segmentation, the prior information has two sources. The first source is the
global spatial information; it is the probability that a given parcellation occurs at a particular
location in the atlas, independent of the local surface geometry. This information is provided by
the atlas and the registration of the input image to the atlas. The second source is the local spatial
information or the local spatial relationship between parcellation labels such as “precentral gyrus
can be anterior, superior or inferior to central sulcus but never posterior to it” (Fischl et al., 2004).
This spatial relationship is encoded by an anisotropic non-stationary MRF.
21
Figure 2.5: Sub-cortical segmentation & cortical parcellation in Freesurfer (FS).
Figure a) was borrowed from http://slideplayer.com/slide/5222876 and figure b) was borrowed from
Klein and Tourville (2012).
2.3.3.3 FS Features
After the segmentation of cortical and sub-cortical brain regions, different morphometric
and intensity properties or features of these regions can be calculated. Volume, intensity mean,
intensity standard deviation (std.) of 40 sub-cortical brain structures are automatically estimated by
the recon-all workflow. Similarly, recon-all automatically estimates the volume, surface area,
Gaussian curvature, mean curvature, curvature index, folding index, thickness mean, thickness
standard deviation, intensity mean, and intensity std. for the 34 cortical brain regions of each
hemisphere.
22
!"#$ %!"&'
Mean curvature is the arithmetic mean of principal curvatures where K *+, and
(
K *-. are maximum and minimum principal curvatures respectively. Mean curvature measures the
extrinsic curvature. Extrinsic curvature refers to the amount of folding created with no distortion.
Gaussian curvature is the square of the geometric mean of principal curvatures K *-. K *+, .
Gaussian curvature measures the intrinsic curvature and its sign can be used to characterize the
surface. Intrinsic curvature refers to the folding created with distortion or shearing i.e. it is
excess/deficit in the surface area compared to a plane at the point (Schaer et al., 2008).
/
Folding index is defined as K *+, K *+, − K *-. 𝑑𝐴 (VanEssen & Drury, 1997).
01
Folding index is an overall measure of the folding of a surface. For any sulci having the shape of a
half-cylinder, its folding index is proportional to its length.
2.4 Analysis of neuroimaging data
Neuroimaging data analysis can be mainly categorized in three groups: voxel-based, vertex-
based, and feature-based.
2.4.1 Voxel-based
In voxel-based studies, the voxel intensities are compared between the images. The analysis
is usually preceded by the registration of the brain images to a standard space to ensure voxel-to-
voxel correspondence across images by adjusting the 3D coordinates of a voxel to best match the
MRI intensities across subjects at that particular voxel (Greve, 2011). Image registration is
generally followed by image smoothing and the amount of smoothing or the size of the low pass
filter used for smoothing is proportional to the size of the region of interest. The intensity of a voxel
in a final smoothed image may represent intensity, volume, or concentration of a tissue class at that
location. An independent univariate parametric statistical model is fitted for each voxel.
23
Representative voxel-based neuroimaging studies include: Nishida et al. (2011) reporting larger
anterior insular volume in patients with obsessive–compulsive disorder, May (2011) demonstrating
experience-dependent structural plasticity in the adult human brain, Matsuda et al. (2012)
detecting Alzheimer’s, Walther et al. (2009) relating brain structural differences to body mass index
in older females, Demirakca et al. (2011) reporting diminished GM in the hippocampus of cannabis
users, Kühn, Schubert, and Gallinat (2010) reporting reduced thickness of medial orbitofrontal
cortex in smokers, etc.
2.4.2 Vertex-based or Surface-based
In vertex-based or surface-based studies, the surface area of a brain structure is explicitly
represented by a mesh of triangles. The point where the vertices of the neighboring triangles meet
is called a vertex. A vertex can be localized by two spherical coordinates (longitude and latitude)
on a surface. After a brain surface such as the pial surface is represented by a mesh of triangles,
several geometric features such as area, curvature, thickness, etc. can be computed. For group
analysis, the surface based registration is performed where the 2D spherical coordinate of a vertex
is adjusted to match the curvature across subjects so that the folding patterns are aligned as
described by Greve (2011). Representative vertex-based neuroimaging studies include: Tondelli et
al. (2012) reporting atrophy in the right medial temporal lobe and the right hippocampus in
Alzheimer’s patients 10 years before clinical diagnosis, Im et al. (2006) reporting cortical thickening
in women in localized anatomical regions, Zarei et al. (2010) reporting significant bilateral regional
atrophy in the dorsal-medial part of the thalamus in Alzheimer’s patients.
2.4.3 Feature-based
In feature-based studies, the different features or properties of the brain regions are
analyzed. The features can be morphometric such as volume and area of a brain region or intensity
24
based such as the mean intensity of a brain region. The feature-based analysis is usually preceded
by the segmentation of various cortical and sub-cortical brain regions and the calculation of
different features of these regions. The feature extraction may involve many vertex and voxel based
techniques. In this thesis, all analyses are feature-based and the scope is limited to feature-based
analysis from here on.
Usually in feature-based studies, the group differences in a feature or a group of features
are investigated. When a study involves only one feature or treats each feature in a group to be
independent of each other, it is called a univariate study. A typical univariate study utilizes a
statistical model such as General Linear Model (Nelder et al., 2006) to investigate the group
differences in brain features. In contrast, when a study treats the individual features to be
dependent of each other and tries to identify a multi-variate pattern between two groups, it is called
a multi-variate study. A typical multi-variate study utilizes machine learning techniques to identify
multi-variate differences in a data-driven fashion.
2.5 Machine Learning
Machine learning is the process of learning the underlying structure of a dataset and using
that knowledge to make predictions on unseen data (Alpaydin, 2010). An algorithm or a collection
of algorithms are applied on samples of a population to learn underlying structure or study the
probability distribution of the population. Machine learning techniques capture the multi-variate
relationships in data and hence are well-suited to detect subtle and distributed differences in the
data. So, compared to univariate techniques, machine learning techniques can perform better in
capturing the brain morphology of heterogeneous conditions like ASD. Thus, they hold promise
for improving our knowledge of ASD brain morphology and identifying brain biomarkers helpful
for ASD diagnosis.
25
Machine learning techniques cover three major areas: supervised learning, unsupervised
learning, and reinforcement learning.
In supervised learning a training dataset with class labels are given and the algorithm learns
the input to output mapping function or a pattern in the training data. The learned mapping
function is then used to predict the labels of an unseen test data. The inherent assumption here is
that the training and testing data are similar. In other words, the same probability function
generates both training and test data. Supervised learning can be a classification problem when
the target label is categorical, or a regression problem when the target label is continuous. Some
applications of supervised learning are stock prediction (Rather et al., 2017), mortality prediction
(Katuwal & Chen, 2016), spam detection (Chakraborty et al., 2016), cancer detection (Bazazeh &
Shubair, 2016), ASD detection (Kim et al., 2016) etc. Some examples of supervised learning
algorithms are regression models, naïve Bayes, decision trees, Random Forest (RF), Gradient
Boosting Machine (GBM), neural networks, etc.
In unsupervised learning, class label information is absent and the algorithm has to learn
the underlying structure of the data by itself. Unsupervised learning can be a goal in itself, such as
finding sub-groups or clusters in the data or can be an intermediate step such as dimensionality
reduction for subsequent machine learning steps. Some applications of unsupervised learning are
human action categorization (Niebles et al., 2008), extracting hierarchical features for object
recognition (Ranzato et al., 2007), discovery of human neural-behavioral maps (Vogelstein et al.,
2014), identifying sub-groups of ASD (Gupta, 2015), etc. Some examples of unsupervised learning
algorithms are K-means, mixture models, principal component analysis, autoencoder, and
generative adversarial networks.
In reinforcement learning, an agent or a group of agents is trained in an environment to
maximize a cumulative reward. The environment is generally represented by a Markov decision
26
process (Puterman & L., 1994) with its inputs as the actions of the agent and outputs as the
observations and rewards sent to the agent. Reinforcement learning is particularly well-suited to
problems where trade-off between long-term and short-term reward is important. Some
applications of reinforcement learning are robot control (Kober et al., 2013), playing Go (Silver et
al., 2016) , and playing poker (Brown & Sandholm, 2017).
In this thesis, we applied supervised learning techniques namely RF (Breiman, 2001) and
GBM (Friedman, 2000). These two algorithms are briefly explained below.
2.5.1 Random Forest (RF)
RF (Breiman, 2001) is an ensemble of decision trees and its output class is the mode value
of the output classes of the individual decision trees (see Figure 2.6). It is an ensemble technique
that relies on the reduction of the variance of the general error term. For the squared error loss,
the expected generalization error of a model φ at a given point 𝐱 can be decomposed into three
components (Gelman & Hill, 2006) as in Equation 2.1.
E Error φ(x) = noise 𝐱 + bias ( 𝐱 + var(𝐱) (2.1)
In Equation 2.1, the first term noise 𝐱 is the irreducible error or Bayes error. It is the
theoretical lower bound on the generalization error and is independent of both learning algorithm
and data. The second term bias ( 𝐱 is the difference between the average prediction of the model
and the prediction of the Bayes model. The third term var(𝐱) is the variability of the predictions
at point 𝐱 over the models learned from all possible subsets of population.
The main idea of RF is to decrease the variance term by keeping the bias constant, thereby
decreasing the overall error of the ensemble. It achieves this variance reduction by averaging the
high variance classifiers or decision tree classifiers. The more diverse or uncorrelated the decision
trees, the more error reduction is achieved by averaging. To make the decision trees different from
27
each other, RF introduces randomness while constructing the trees, hence the name ‘random
forest’. The randomization is introduced at first during data sampling and then while constructing
the decision trees. Each tree learns from a bootstrap replica of the data obtained by random
sampling with replacement in the original data. This introduces a degree of randomness in the
decision trees because they are trained with different bootstrap replicas. While growing decision
trees, the quality of a node split is based only on a random subsample of the variables instead of all
of them. Most of the randomization comes from this step.
Figure 2.6: Random Forest is an ensemble of decision trees.

a) Generalized error broken down into three terms: noise, bias, and variance. Random Forest (RF) relies
on the reduction of variance without increasing bias, thus reducing the overall error. The directions of the
green arrows correspond to the direction of change of the error terms due to ensembling. b) General block
diagram of RF as an ensemble of diverse decision trees.
A decision tree is a simple classifier which iteratively partitions the data space into more
homogenous partitions using simple decision rules. The quality of the data partitions or the child
nodes after a node splitting in a decision tree is measured by their homogeneity. The homogeneity
of the data samples in a partition is generally quantified using the metrics such as Gini impurity
28
(Breiman, 1996), entropy (Gray & M., 1990), etc. and the quality of the node split is quantified by
change in these metrics.
In a RF, the importance of a variable is the sum of the weighted impurity decreases in all
node splits by the variable, averaged over all trees in the forest (Breiman, 2001; Louppe, 2015). It
is calculated as in Equation 2.2.

T
1
Imp XK = 1(jO = j) p(t)Δi(sO , t) (2. 2)
M
*U/ S"
Where,
Imp XK = the importance of variable XK
jO = the identifier of the variable used for splitting node t
sO = a split at node t due to variable XK
XY
p t = the proportion of samples at node t where NO is the number of samples at node t
X
and N is the total number of samples
Δi sO , t = decrease in node impurity by split sO
φ* = mth decision tree
M = the number of decision trees in the forest
In this thesis, RF was used as the first choice classifier since it is inherently suitable for
parallel processing, has very few hyper parameters to tune, does not require data scaling, is
theoretically resistant to overfitting, provides variable importance, and has been found to be very
good for a variety of datasets (Breiman, 2001; Caruana & Niculescu-Mizil, 2006).
2.5.2 Gradient Boosting Machine (GBM)
Boosting is an ensemble technique that relies on bias reduction to reduce the generalized
error of an ensemble (Bauer & Kohavi, 1999). A general boosting technique iteratively combines
29
several weak or base learners with high bias and low variance such as decision tree stumps into one
strong learner. The base learners are combined so that the ensemble bias decreases while variance
remains the same, thereby reducing the net ensemble error. At each iteration or boosting step,
GBM constructs a new base learner to be the most parallel to the negative gradient of a loss
function along the observed data so that the new base learner focuses on the weakness of the model
(Natekin & Knoll, 2013). In other words, it performs functional approximation of a model by
consecutively improving along the negative direction of a loss function. In this thesis, a binomial
loss function was used and decision trees were used as base learners.
30
3 CHAPTER 3: DEPENDENCY OF BRAIN
FINDINGS ON IMAGE PROCESSING TOOLS
AND POTENTIAL SOLUTIONS
Some material from (Katuwal et al., 2016b) has been reused in this chapter under Creative Commons Attribution
(CC BY) license.
Previous studies applying automatic brain image processing methods on MRI report
inconsistent neuroanatomical alterations in ASD. In this study, we investigate methodological
differences as a possible cause behind these inconsistent findings. In particular, we focus on the
estimation of the following brain volumes: GM, WM, CSF, and TIV. T1-weighted MRIs of 417
ASD subjects and 459 TDC from the ABIDE dataset were estimated using three popular
preprocessing methods: SPM, FSL, and FS. The estimated brain volumes were correlated but had
significant inter-method biases. ASD vs. TDC differences in all brain volume estimates were highly
dependent on the method used. When methods were compared with each other, they showed
differential biases for ASD, and several biases were larger than ASD vs. TDC differences of the
respective methods. After manual inspection, we found inter-method segmentation mismatches in
the cerebellum, sub-cortical structures, and inter-sulcal CSF. In summary, method dependent ASD
vs. TDC differences indicate that the inter-method discrepancy can contribute to inconsistent
neuroimaging findings in ASD. We suggest cross-validation across methods and emphasize the
need to develop better methods to increase the robustness of neuroimaging findings.
3.1 Introduction
31
MRI is a powerful tool used to investigate the human brain in vivo and to find associations
between brain morphometry and brain disorders. Although a large number of MRI studies have
been conducted, consistent MRI markers for brain disorders are yet to be found (Chen et al. 2011).
This can mainly be attributed to low statistical power of neuroimaging and neuroscience studies
(Button et al., 2013). Inconsistent results may be due to differences in the demographics of data
(Stanfield et al., 2008), image acquisition settings (Auzias et al., 2014a; Styner et al., 2002),
assumptions made on data and algorithms used (Eggert et al., 2012; Fellhauer et al., 2015;
Nordenskjöld et al., 2013), and even machines used to process the data (Gronenschild et al., 2012).
With the increasing use of automated preprocessing methods in neuroimaging studies, the
effect of inter-method variations in neuroimaging findings merit an investigation. Although
automatic methods are more objective than manual methods, they possess method-specific bias
and variance (Eggert et al., 2012; Nordenskjöld et al., 2013). The bias and variance across methods
arise mainly due to method specific assumptions made on data, varying definition of brain
structures, different image processing algorithms, varying sensitivity to imaging artifacts such as
motion, and use of inconsistent a priori information such as brain templates. A number of previous
studies have reported significant inter-method inconsistencies (Eggert et al., 2012; Fellhauer et al.,
2015; Hansen et al., 2015; Ren et al., 2005; Tsang et al., 2008). Eggert et al. (2012) reported
pronounced differences (11%) in mean segmented GM volumes from four standard segmentation
algorithms: SPM8 New Segment, SPM8 VBM, FSL v. 4.1.6 and FS v. 4.5. According to Hansen et
al. (2015), compared to manual segmentation, FS 4.5 underestimated TIV by 7 %. Similarly,
differences in segmentation accuracies between FSL and SPM5 were reported by Tsang et al.
(2008).
One important concern is that inconsistent results driven by inter-method variations can
change the end results of a study and hence change the subsequent biological interpretation.
32
Several previous studies have shown that the magnitude and even the direction of the effect size
can be dependent on the method used. Boekel et al. (2015) performed a replication study on 17
brain structure-behavior correlations from five neuroimaging studies and were not able to replicate
any of the correlations. A response paper by Muhlert and Ridgway (2015) pointed out that one of
the reason the correlations could not be replicated is the methodological differences between SPM
and FSL. Nordenskjöld et al. (2013), compared SPM8 and FS v. 5.1.0 TIV estimates to reference
TIV obtained from manual segmentation of proton density weighted images. They report that
both SPM8 and FS overestimated TIV. In addition, SPM showed systematic bias associated with
gender (systematic overestimation of TIV in females) and aging atrophy while FS showed bias for
reference TIV (systematic overestimation of TIV for larger skull size). Notably, hippocampal
volume showed different associations with education depending on which TIV measure (SPM or
FS) was used for hippocampal volume normalization. When normalized with SPM TIV, there was
no association between hippocampal volume and education, whereas when normalized with FS
TIV, the association was significant. Similarly, (Callaert et al., 2014) measured the effect of age on
GM volume using four different methods: SPM8 Unified Segmentation, SPM8 New Segment, FSL
v. 4.1.5, and a method combining intensity based segmentation and atlas-to-image non-rigid
registration. They found that the age specific effect changed with the different methods. Age related
differences according to the Unified Segmentation and New Segment were significantly larger and
smaller respectively than other methods. Similarly, according to Rajagopalan et al. (2014), in ALS
patients with frontotemporal dementia, FSL v. 4.1.5 showed that the GM volume in motor region
is significantly reduced, whereas SPM8 did not show any significant changes in GM. The above
results suggest that inter-method discrepancies are a source of inconsistent findings in
neuroimaging and that the choice of preprocessing method can affect end results. Thus, the effect
of inter-method variations in neuroimaging results deserves a detailed investigation.
33
In this study, we investigate inter-method discrepancies as a source of inconsistent
neuroimaging findings in ASD. Brain anatomical findings in ASD compared to that of TDC have
been highly inconsistent across studies (Amaral et al., 2008; Chen et al., 2011; Jumah et al., 2016;
Katuwal et al., 2016c). Recently a number of studies (Haar et al., 2014; Kucharsky Hiess et al.,
2015; Valk et al., 2015; Riddle et al., 2016) have used the large multi-site (~1000 subject, age 6-65
years) Brain Imaging Data Exchange (ABIDE) (Di Martino et al., 2014) to investigate brain
anatomical differences in ASD. Haar et al. (2014) using ABIDE did not replicate many of the
previously reported anatomical alterations in ASD except significantly larger ventricular volumes,
smaller corpus callosum volume (central segment only), and several cortical areas with increased
thickness in the ASD group. One of the most replicated findings in ASD is that toddlers with ASD
(age 2–4 years) on average have a larger head size than TDC (Carper et al., 2002; Courchesne et
al., 2011b, 2011a; Campbell et al., 2014). However, several recent large studies (Raznahan et al.,
2013; Zwaigenbaum et al., 2014) have shown that there is no overall difference in head
circumference between ASD and TDC over the first 3 years and the results of previous studies
reporting large head sizes in ASD may be due to the bias in population norms. Campbell et al.
(2014) have reported that abnormally rapid rate of brain growth during the first years of life seem
to occur in a very small subgroup of ASD children.
Here we investigate if inter-method differences are a source of inconsistent
neuroanatomical findings in ASD. We address this question by using global brain volume
measures. We estimate GM, WM, CSF volumes, and TIV of ASD and TDC subjects using the
large ABIDE dataset applying three widely used preprocessing methods: SPM, FSL, and FS. We
answer the following three questions in this study:
1. How large are inter-method differences of brain volume estimates and how do they
influence ASD vs. TDC group differences?
34
2. Do inter-method differences show differential bias towards diagnosis group (in this case
ASD) and how does it compare with ASD vs. TDC differences?
3. What are potential reasons behind inter-method differences in brain volume
estimation?
3.2 Methods
3.2.1 Data
A total of 1,112 MRI scans were downloaded from the ABIDE consortium representing
imaging data from 17 different sites. Each MRI was visually inspected to detect significant motion
and other artifacts. In total, 172 images with poor image quality and motion artifacts were
discarded. An additional 64 subjects were discarded due to failure during segmentation by FS or
FSL; none of the images failed during SPM segmentation. T1 weighted brain MRIs from a total
of 876 subjects from 15 sites were retained for volumetric analyses. Of the 876 subjects, 417 (367
males, 50 females) were ASD and 459 (382 males, 77 females) were TDCs. Subject demographics
and behavioral measures are presented in Table 3.1.
Table 3.1: Subject demographics and behavioral (DB) measures

ASD TDC ASD vs. TDC
t-test P-value
N 417 459
M=367, F=50 M=382, F=77
Age(years) 17.8 ± 8.9 17.7 ± 8.0 0.88
(7 to 64) (6.47 to 56.2)
VIQ 104.6 ± 17.8 112.4 ± 12.9 3.1E-10*
PIQ 105.0 ± 16.7 108.1 ± 12.9 5.2E-3*
FIQ 105.4 ± 16.5 111.5 ± 12.1 4.3E-9*
ADOS 11.9 ± 3.7 NA NA
* statistical significance at 0.05
M: Male; F: Female; VIQ: Verbal IQ; PIQ: Performance IQ; FIQ: Full IQ, ADOS: Diagnostic
Observation Schedule
35
3.2.2 Tissue Segmentation and Brain Volumes Estimation
All MRIs were processed with SPM8 (Ashburner & Friston, 2005), FSL 5.0.4 (Jenkinson &
Smith, 2001), and FS 5.3.0 (Dale et al., 1999b; Fischl et al., 1999b). In order to minimize manual
intervention and to make the study more objective and replicable, default parameters set by the
respective toolboxes were used. Final results of the automatic segmentations were manually
inspected and subjects with segmentation failures (64 in total) were discarded from the study.
SPM: Using New Segment tool (Ashburner et al., 2013), each T1 brain image was segmented
into probabilistic maps of GM, WM, and CSF tissues. Spm_get_volumes script was used to calculate
the tissue volumes using c1, c2, and c3 probabilistic maps corresponding to native space tissue
maps of GM, WM, and CSF respectively. Native space volumes were selected to minimize volume
changes due to spatial transformations. TIV was calculated as the sum of the GM, WM, and CSF
volumes in the native space of the MRI. This method of TIV calculation was performed according
to SPM’s recommendation (https://en.wikibooks.org/wiki/SPM/VBM) and has been utilized in
several previous studies (Nordenskjöld et al., 2013; Ridgway et al., 2011).
FSL: Using FAST tool (Zhang et al. 2001), each T1 brain image was segmented into
probabilistic maps of GM, WM, and CSF tissues. Brain regions were extracted using the brain
extraction tool (BET) (Smith, 2002) before tissue segmentation using FAST. Fractional intensity
threshold (f option in BET) was kept at the default value of f = 0.5. Brain slices (from the slicedir
directory) produced by FSL script fslvbm_1_bet –b were used to visually verify that brain regions
were accurately extracted. Images for which brain extraction was not successful with the default
value of f = 0.5 were excluded from the study. Fslstats script was used to calculate the tissue volumes
using partial volume maps (pve_0, pve_1, pve_2 images) in the native space produced by FAST as
36
recommended by FSL
(http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FAST#Tissue_Volume_Quantification). TIV was
calculated using the SIENAX function of FSL as recommended by FSL (Smith et al., 2002) and
the ENIGMA protocol (http://enigma.ini.usc.edu). SIENAX first strips out non-brain tissues using
BET to extract the regions corresponding to the brain and the skull. After brain extraction, the
skull image is affine registered to the MNI52 template (Mazziotta et al., 2001b) with a scaling factor
between the subject's image and the standard space as the output. The scaling factor is computed
as the determinant of the affine transformation matrix that registers the subject’s image to the
MNI152 template. Finally, the TIV of the subject’s image was calculated by dividing the TIV of
MNI152 template brain (1.847712 L) by the scaling factor (Mazziotta et al., 1995, 2001a, 2001b).
FS: Tissue segmentation in FS was performed by the recon-all preprocessing workflow
(https://surfer.nmr.mgh.harvard.edu/fswiki/recon-all). Recon-all is a fully automated workflow
that performs all the FS cortical reconstruction processes. As recommended by FS, GM and WM
volumes were extracted from aseg.stats file which is an output of the recon-all workflow (see
https://surfer.nmr.mgh.harvard.edu/fswiki/MorphometryStats). FS does not output total CSF
volume. In FS TIV was calculated by a technique similar to FSL (see
http://surfer.nmr.mgh.harvard.edu/fswiki/eTIV). However, FSL uses both brain and skull
whereas FS uses only the brain to guide the registration of a subject’s image to a template.
Estimated TIV (eTIV) is calculated by dividing the atlas mask volume from MNI305 template by
the determinant of the affine transformation matrix (T) that maps the native space image into MNI
space (Buckner et al., 2004). Talairach registration, the third step of recon-all, computes the affine
transform T that transforms the original image to the MNI305 template (Evans et al., 1992).
37
3.2.3 Statistical Analysis
Statistical software package R v3.2.0. (http://www.R-project.org/) was used for all
statistical analyses. Brain volume estimates from the three preprocessing methods are compared
with each other to test inter-method differences and their biases for diagnostic group (ASD). All p-
values reported in this study were corrected for multiple comparisons using false discovery rate
(Benjamini & Hochberg, 1995) unless otherwise explicitly mentioned.
3.2.3.1 Inter-method differences in brain volumes estimation
Inter-method differences across all 876 subjects are summarized by the following statistics:
mean volume difference, percentage difference, correlation coefficient, Cohen’s d, and paired t-
test p-value. Separate comparisons were performed for different volume types. Cohen’s d (Cohen,
1988) was used as a measure of effect size. Cohen’s d is the standardized difference between two
means and is defined as (mean1−mean2)/SDpooled where SDpooled is the weighted average of the standard
deviations of two groups. Paired t-tests were performed to test the statistical significance of the
inter-method differences in brain volume estimates.
3.2.3.2 ASD vs. TDC inter-group differences in brain volumes
ASD vs. TDC brain volume differences were tested using independent two sample t-tests
for each preprocessing method. In addition, ASD vs. TDC brain volume differences were tested
after adjusting for the effects of age, sex, and site by fitting a linear mixed-model using lmer4
package in R (Bates et al., 2014). As fixed effects, we entered diagnostic group (ASD/TDC), sex,
age, and age2. As random effects, we included random intercepts and slopes for the effect of
diagnostic group at each level of site. Fixed effect of diagnostic group is reported as the ASD vs.
TDC group difference. The p-values were calculated using the Satterthwaite’s approximated
degrees of freedom (Satterthwaite, 1946) implemented in lmertest (Kuznetsova et al., 2013). In
38
another separate experiment, in addition to age, sex, and site, the effect of full scale IQ (FIQ) was
adjusted, where subjects with missing FIQ (N = 63) were excluded. For each method, separate
models were built for different brain volume types.
3.2.3.3 Method bias for diagnostic group
The inter-method difference in brain volumes (∆y = y( − y/ ) were modeled by a linear
mixed-model. Here, y/ and y( are brain volume estimates from two different methods where y( is
considered as a reference value. The fixed and random effects of the model were same as in the
model described in previous section. The fixed effect of diagnostic group on ∆y (β_à ) can be
interpreted as the amount of brain volume by which method y/ systematically over/under
estimates ASD subjects compared to TDC. Here our null hypothesis is that different methods do
not have systematic differential bias to the diagnostic group. Our null hypothesis will be rejected
when the fixed effect of diagnostic group β_à is statistically significant. If β_à is statistically
significant, we will conclude that with reference to y/ , method y( has systematic biases for ASD or
bcde
TDC subjects. Percentage bias of y( for ASD (with reference to y/ ) was calculated as ×100,
fg
where y/ is the mean volume across all subjects according to y/ . For each pair of methods, separate
models were fitted for the different brain volumes. Multiple comparisons correction across different
tissues was performed separately for each pair.
3.2.3.4 Experiments repeated with NYU Data
Data used for the above experiments were acquired from multiple scanning sites and the
effects of scanning site were adjusted by fitting a linear mixed-model with site as a random effect.
This model does not capture the non-linear site effects. In order to completely eliminate the effects
of site and to focus only on the differences due to methods, we repeated the above experiments
39
using only the NYU site. NYU was selected because it had the largest number of subjects (71 ASD
and 58 TDC) and all its subjects had FIQ information. NYU results are presented in Appendix A.
3.3 Results
3.3.1 Estimated brain volumes and inter-method differences
Brain volumes estimated by SPM, FSL, and FS are presented in Table 3.2. Statistics of the
inter-method differences (mean difference, correlation coefficient, Cohen’s d, and paired t-test p-
value) for all 876 subjects are presented in the shaded cells. The distributions of the estimated brain
volumes are visualized as boxplots in Figure 3.1A. To visualize the inter-method distribution, the
estimates from SPM, FSL, and FS are plotted against the estimates from SPM in Figure 3.2B.
The brain volume estimates from different methods moderately agreed except for CSF (see
Figure 3.1B). SPM vs. FSL correlation coefficients were 0.85, 0.64, 0.82, and 0.50 for TIV, GM,
WM, and CSF volumes respectively (see Table 3.2). Similarly, SPM vs. FS correlation coefficients
were 0.83, 0.77, and 0.93 for TIV, GM, and WM volumes respectively. FSL vs. FS correlation
coefficients were 0.83, 0.75, and 0.81 for TIV, GM, and WM volumes respectively. In summary,
the inter-method correlation coefficients were high for TIV and WM, followed by GM and were
the lowest for CSF.
Except WM, the average estimates of all the brain volumes by SPM were higher than that
of FSL and FS; see purple boxplot in Figure 3.1A. FSL estimates of TIV were the lowest while FSL
estimates of WM were the highest. Similarly, FS estimates of WM were the lowest. In summary,
the following inter-method differences were observed: TIVSPM (1.57 L) > TIVFS (1.56 L) > TIVFSL
(1.38 L), GMSPM (0.75 L) > GMFS (0.71 L) > GMFSL (0.64 L), WMFSL (0.53 L) > WMSPM (0.52 L)
> WMFS (0.49 L) and CSFSPM (0.30 L) > CSFFSL (0.22 L). Except TIVSPM vs. TIVFS, all inter-
40
method differences were statistically significant (p < 0.001; Cohen’s d > 0.5 in 7 and d > 1 in 4 out
of 10 comparisons).
Table 3.2: Estimated brain volumes and inter-method differences

TIV GM WM CSF
SPM (L) 1.566 ± 0.15 0.748 ± 0.07 0.519 ± 0.06 0.299 ± 0.04
SPM – FSL mean diff. (ml) 183.4 106.0 -14.4 77.9
SPM
Correlation Coefficient 0.851 0.639 0.821 0.494
vs.
Cohen’s d 1.20 1.41 -0.21 1.62
FSL
Paired t-test p-value <1E-100* <1E-100* 7E-18* <1E-100*
FSL (L) 1.383 ± 0.15 0.642 ± 0.08 0.533 ± 0.08 0.221 ± 0.05
FSL – FS mean diff. (ml) -177.4 -70.0 40.3 204.9
FSL
Correlation Coefficient 0.829 0.745 0.808 NA
vs.
Cohen’s d -1.05 -0.82 0.53 NA
FS
Paired t-test p-value <1E-100* <1E-100* <1E-100* NA
FS (L) 1.560 ± 0.18 0.708 ± 0.08 0.493 ± 0.07 NA
SPM – FS mean diff. (ml) 6.1 40.1 25.9
SPM
vs.
Cohen’s d 0.04 0.54 0.42 NA
FS
Paired t-test p-value 0.07 <1E-90* <1E-100* NA
Mean and standard deviation of the brain volumes estimated by SPM, FSL, and FS are presented. Cells
corresponding to CSFFS are filled as ‘NA’ since FS does not output total CSF volume. Inter-method
differences and corresponding statistics are presented in shaded cells. Correlation coefficient is used to
measure the association between the brain volumes estimated by two different methods. Cohen’s d is used
to measure the effect size of the inter-method difference and paired t-test was used to check the statistical
significance. Statistically significant differences are denoted by * for p < 0.05. Correlation coefficients show
that methods agree on volume estimates but the Cohen’s d and paired t-test p-values indicate significant
inter-method biases.
The inter-method correlation coefficients within NYU subjects were slightly higher than
that from using all subjects (see Table 7.1 in Appendix A). Inter-method differences were mostly
similar to that of the experiment using all subjects.
41
Figure 3.1: Distribution of estimated brain volumes and inter-method differences.
A) The distribution of brain volumes estimated by SPM (purple), FSL (blue) and FS (orange) are
presented by boxplots and indicate significant inter-method differences. CSFFS is not presented since
FS does not output total CSF volume. B) Volumes estimated by FSL and FS are plotted against the
volumes estimated by SPM. The brain volume estimates from different methods moderately agreed
except for CSF.
3.3.2 ASD vs. TDC inter-group differences is dependent on the method used
The mean ASD vs. TDC (ASD – TDC) group difference for TIV, GM, WM, and CSF
according to SPM, FSL, and FS are presented in Table 3.3. The ASD vs. TDC distribution of the
brain volume estimates are presented as boxplots in Figure 3.2A. The mean inter-group differences
in percentage are presented as bar plots in Figure 3.2B.
42
TIV GM WM CSF
diff diff p-val diff diff p-val diff diff p-val diff diff p-val
(ml) % (ml) % (ml) % (ml) %
Raw Volumes
SPM 24.0 1.53 0.019* 11.1 1.49 0.016* 3.7 0.71 0.33 9.2 3.08 0.001*
FSL 11.8 0.86 0.26 -1.6 -0.24 0.77 -1.4 -0.26 0.80 3.8 1.70 0.30
FS 5.6 0.36 0.65 0.3 0.04 0.96 -4.5 -0.92 0.33 NA NA NA
Adjusted for age, sex, and site
SPM 20.8 1.34 0.04* 13.7 1.84 0.02* 3.7 0.71 0.34 5.9 1.99 0.011*
FSL 12.7 0.92 0.32 1.37 0.21 0.85 -0.1 -0.02 0.99 -1.6 -0.73 0.47
FS 8.3 0.53 0.60 3.11 0.44 0.64 -2.9 -0.59 0.57 NA NA NA
Adjusted for age, sex, site, and FIQ§
SPM 28.1 1.81 0.001* 17.5 2.35 0.006* 6.2 1.22 0.13 6.5 2.22 0.006*
FSL 19.1 1.44 0.16 3.5 0.55 0.67 2.2 0.42 0.70 -1.4 0.62 0.55
FS 16.1 1.03 0.33 7.9 1.11 0.26 0.3 0.07 0.95 NA NA NA
Table 3.3: ASD vs. TDC brain volume differences
§63 subjects with missing FIQ were excluded from the particular analysis.
Mean (ASD – TDC) difference (diff) in brain volume estimates according to SPM, FSL, and FS. Percentage
ASD vs. TDC group difference was calculated as 100*(𝐴𝑆𝐷 − 𝑇𝐷𝐶)/𝑇𝐷𝐶, where 𝑇𝐷𝐶 is the group mean
of the TDC subjects. Cells corresponding to CSFFS are filled as ‘NA’ since FS does not output total CSF
volume. Statistically significant differences are denoted by * for p < 0.05. ASD vs. TDC differences are
dependent upon the method used and only in SPM TIV, GM and CSF volumes in ASD were significantly
larger than TDC.
43
Figure 3.2: ASD – TDC brain volume differences are method dependent
A) The distribution of raw brain volumes estimated by SPM, FSL, and FS for ASD and TDC. B) ASD vs.
TDC difference in the estimated brain volumes as a percentage of mean TDC is presented as a bar plot for
each method. ASD vs. TDC brain volume difference varied with methods which suggests that subsequent
interpretations are highly dependent on the method of choice.
Results show that ASD vs. TDC differences are highly dependent upon the method used.
According to SPM, ASD had 1.53% (p = 0.019) more TIV than TDC; see Table 3.3. and purple
bar in Figure 3.2B. Whereas according to FS and FSL, ASD had only 0.36% (p = 0.65) and 0.86%
(p = 0.26) more TIV than TDC respectively (see yellow bar in Figure 3.2B.). Similarly, according
to SPM, ASD had 1.49% (p = 0.016) more GM than TDC. In contrast, FSL estimates show that
ASD has 0.24% (p = 0.77) less GM than TDC. Whereas, according to FS, there was small
difference in GM of ASD and TDC. Similar method dependent ASD vs. TDC differences in WM
44
and CSF volume estimates were noted. Method dependence of ASD vs. TDC differences persisted
even after the effects of age, sex, and site were removed (see Table 3.3). Similar results were
observed even after removing the effects of FIQ where 63 subjects with missing FIQ were excluded
from the analysis. We verified our results by removing the effects of site by using subjects from only
one scanning site (NYU). Here again, ASD vs. TDC volume differences were dependent on the
method used (see Table 7.2 in Appendix A).
3.3.3 Differential bias of methods to the diagnostic group
Biases were pairwise calculated between methods using estimates from one method as the
reference and are presented in Table 4. Compared to FSL, SPM showed statistically significant
bias for ASD subjects in TIV (0.8%, p = 0.039), GM (1.9%, p = 0.007), WM (1%, p = 0.038), and
CSF (3%, p = 0.006) volumes. Similarly, compared to FS, SPM showed statistically significant bias
for ASD subjects in GM (1.4%, p = 0.004) and WM (1.5%, p = 0.004) volume estimates. We also
noted that several method biases were larger than ASD vs. TDC differences according to the same
methods. For example, with reference to FSL, SPM bias for ASD subjects in WM estimation was
7 ml, whereas, the inter-group difference in WM volumes according to SPM was 3.7ml (see Table
3). In summary, with reference to FSL and FS, SPM showed positive bias for ASD subjects in
multiple brain volumes. In other words, with reference to SPM, FSL and FS showed negative bias
in brain volume estimation of ASD subjects. Many of the biases were larger than ASD vs. TDC
inter-group differences according to the respective methods. When we repeated these experiments
using subjects from only one scanning site (NYU) we got similar results; see Table 7.3 in Appendix
A for details.
45
Table 3.4: Differential bias for diagnostic group (ASD)
TIV GM WM CSF
bias bias beta bias bias beta bias bias beta bias bias beta
(ml) % p-val (ml) % p-val (ml) % p-val (ml) % p-val
SPM
vs. 11 0.8 0.039* 12 1.9 0.003* 5 1.0 0.038* 7 3.0 0.006*
FSL#
FSL
vs. 5 0.3 0.75 -1 -0.2 0.75 2 0.37 0.75 NA NA NA
FS#
SPM
vs. 13 0.8 0.180 10 1.4 0.004* 8 1.5 0.004* NA NA NA
FS#
# reference method
bias (ml): brain volume (in ml) by which a method systematically overestimates in ASD subjects than in
TDCs.
% bias: the percentage of brain volume by which a method systematically overestimates in ASD subjects
than in TDCs.
3.4 Discussion
In this work, we investigated discrepancies in brain volumes (TIV, GM, WM, and CSF)
estimated by three different preprocessing methods (SPM, FSL, and FS). Brain volume estimates
between the methods had significant correlation, but the absolute values of brain volume estimates
were significantly different between methods. In other words, there was significant method specific
biases while estimating brain volumes. These biases had an influence on ASD vs. TDC brain
volume differences and our results indicate that ASD vs. TDC brain volume differences were
dependent on the method used to estimate the brain volumes. When the methods were compared
pair-wise, significant differential biases for the diagnostic group (ASD) were revealed, and most
biases were larger than ASD vs. TDC differences according to the respective methods. Below we
compare our results with previous findings and discuss potential reasons behind brain tissue volume
discrepancies by investigating segmentation disagreements at anatomical locations. We further
46
provide discussions on inter-method discrepancies at a conceptual level. Finally, we discuss
methods for minimizing inter-method discrepancies.
3.4.1 ASD vs. TDC inter-group differences dependent on the method used
In this study, we found that ASD vs. TDC inter-group differences in brain volumes are
dependent upon the method used. This dependency was evident in all brain volume types. For
example, SPM showed 1.53% more TIV in ASD compared to TDC with statistical significance.
Whereas FS showed that ASD had only 0.36% more TIV compared to TDC and the difference
was not statistically significant. In other words, a research study using SPM to estimate TIV will
report statistically significant TIVASD > TIVTDC but a different study using FS on the same data
will not report statistically significant difference. Similarly, ASD had 1.49% more GM than TDC
according to SPM, and the difference was statistically significant. Whereas according to FSL, ASD
had 0.24 % less GM than TDC. In other words, a study using SPM would report larger GM
volumes in ASD whereas a study using FSL would report smaller GM volumes in ASD. These
results show that the magnitude and even the direction of the effect under investigation is
dependent on the method used, and previous studies have reported similar findings. For example,
Callaert et al. (2014) reported that the age effect on GM volume was significantly dependent upon
the method used to estimate the GM volume. Similarly, Nordenskjöld et al. (2013) reported that
hippocampal volume showed different associations with education depending on which TIV
measure (SPM or FS) was used for hippocampal volume normalization. Rajagopalan et al. (2014)
found significant GM reduction in the motor region of the brain of ALS patients using SPM but
could not replicate the finding using FSL. These results demonstrate that the choice of a
preprocessing method used to estimate brain volumes can have a significant effect on the end
results of a study.
47
3.4.2 Comparison of ASD vs. TDC brain volume differences with previous studies
Due to various heterogeneous neuroimaging findings in ASD our comparisons here are
limited only to meta analytic studies (Via et al., 2011; Radua et al., 2011; Stanfield et al., 2008;
Redcay & Courchesne, 2005) and studies that used the ABIDE dataset (Haar et al., 2014;
Kucharsky Hiess et al., 2015; Riddle et al., 2016). It should be noted that the exclusion criteria and
the number of subjects included in the studies that used ABIDE data were not consistent across
studies. In our study, TIV in ASD was larger than TDC according to all methods, however, the
difference was statistically significant only according to SPM. Riddle et al. (2016) using SPM8
VBM also report greater TIV in ASD (1.58%) with statistical significance. Kucharsky Hiess et al.
(2015) used a different toolbox (ART Brainwash, www.nitrc.org/projects/art) and reported higher
TIV in ASD (1.73%) with statistical significance. Haar et al. (2014) using FS found TIV to be
higher only in 2 of the 18 sites they used and when these 2 sites were removed (26 subjects) overall
ASD vs. TDC difference was not statistically significant and this result is similar to our FS results.
Further, Redcay and Courchesne, (2005) and Stanfield et al. (2008) have also reported similar
results. Interestingly, 1.53% more TIV in ASD reported by SPM in our study is very close to an
estimate (1.534%) predicted by a model proposed by Redcay and Courchesne ( 2005). This model
prediction was for ASD and TDC subjects at age 17.75 years which is the mean age of the subjects
used in our study. Kucharsky Hiess et al. (2015) have also reported similar prediction.
In our study, GM volume in ASD was greater than TDC (1.5%) with statistical significance
according to SPM. An ABIDE study by Riddle et al. (2016) using SPM8 VBM also report a similar
result (1.58% more in ASD). However, in our study GM volume in ASD was slightly smaller
according to FSL (0.2%) without statistical significance and there was little difference in GM
volume according to FS. Haar et al. (2014) report slightly larger GM volume in ASD but without
48
statistical significance, using both FS (d = 0.01 in cortical GM; d = 0.13 in cerebellar GM) and
FSL (d = 0.18). A large meta-analytic study by Via et al. (2011) report no GM volume differences
(d = 0.006) between ASD and TDC. In summary, studies using SPM tend to report slightly larger
GM volume in ASD but results of studies using FSL and FS are inconsistent.
In our study, WM volume in ASD was slightly greater than in TDC (0.7%) without
statistical significance according to SPM. Riddle et al. (2016) also using SPM report similar results
(0.67% more in ASD). Similarly, Haar et al. (2014) report slightly larger WM volume in ASD but
without statistical significance, using both FSL (d = 0.03) and FS (d = 0.13 in cortical WM; d =
0.04 in cerebellar WM). However, in our study, WM volume in ASD was slightly smaller according
to both FSL (0.3%) and FS (0.9%) without statistical significance. A large meta-analytic study by
Radua et al. (2011) report slightly smaller WM volume (d = 0.006) in ASD. In summary, studies
using SPM tend to report slightly larger WM volume in ASD but results are inconsistent in studies
that used FSL or FS.
In our study, CSF volume was larger in ASD according to SPM (3.1%) with statistical
significance. Similarly Lin et al. (2015) using SPM8 New Segment also found greater CSF volume in
ASD (4.75%) with statistical significance. Riddle et al. (2016) using SPM8 VBM also report 1.54%
more CSF in ASD but with no statistical significance. Haar et al. (2014) using FSL also report
slightly larger CSF volume in ASD (d = 0.15) without statistical significance and this is similar to
our FSL finding: 1.7% greater in ASD without statistical significance. In summary, our result
agrees with previous finding of greater CSF volume in ASD and that the magnitude of the
difference is method dependent.
49
3.4.3 Methods have different biases for diagnostic group, and many of them are larger
than inter-group differences
We found that methods have systematic differential biases for diagnostic group (ASD) and
several biases were larger than ASD vs. TDC differences according to the respective methods. In
other words, inherent systematic bias of a method to a variable of interest (ASD) is larger than the
actual effect of the variable (brain volume difference due to ASD). The differential biases shown
by the methods for explains the method dependent ASD vs. TDC group difference in brain
volumes presented in section 3.2.2. With reference to FSL and FS, SPM showed positive bias for
ASD subjects in multiple brain volumes. From different perspective or considering SPM as the
reference method, it can be said that FSL and FS showed negative bias for ASD subjects. In other
words, with reference to SPM, FSL and FS systematically underestimated brain volumes in ASD
subjects compared to TDC. This might be one reason why SPM shows greater brain volumes in
ASD compared to TDC while FSL and FS do not. To conclude which method captures the true
ASD vs. TDC difference further investigation using ground truth data is necessary.
Similar results have been reported in previous studies (Nordenskjöld et al., 2013), where it
was reported that SPM showed bias associated with gender and atrophy while FS showed bias
dependent on skull size. In summary, the above results indicate that systematic differential biases
of a preprocessing method can be assigned as brain volume difference due to thus leading to
incorrect findings.
3.4.4 Locations of the inter-method segmentation discrepancies
To identify the locations of inter-method segmentation discrepancies, twenty subjects were
randomly chosen and their tissue probability maps (TPM) were individually inspected using
MRIcron (http://www.mccauslandcenter.sc.edu/mricro/mricron/index.html). For each tissue
50
type, a subject with the most common segmentation discrepancy was chosen and these
discrepancies are presented in Figure 3.3 where TPMs of different methods are overlaid using
MRIcron.
3.4.4.1 Overestimation of GM by SPM compared to FSL and FS
In Figure 3.3A(i), red represents the regions where voxel probabilities in SPM TPM for
GM are higher than that of FSL TPM; green, vice-versa and yellowish-green represents regions
where voxel probabilities from both methods are similar. Figure 3.3A(ii) compares the histograms
of the voxel probabilities in GM TPMs produced by SPM and FSL.
Our results showed that SPM overestimated GM volume compared to FSL in the following
four main brain regions.
1) Cerebellum: FSL under segments GM (less GM volume or less GM voxel probability)
in the cerebellum – see red regions in box 1 of Figure 3.3A(i).
2) Subcortical Structures: The proportion of GM (compared to WM) according to
SPM is greater in subcortical structures – see red regions in box 2 in Figure 3.3A(i).
Probability values in subcortical voxels are generally greater than 0.8 in GM TPM for
SPM. This accounts for the higher red curve for voxel probabilities greater than 0.8 in
the histogram of voxel probability presented in Figure 3A(ii). In FSL, however,
subcortical voxel probability values are in the 0.3–0.6 range, which accounts for the
higher green curve in the 0.3–0.6 range.
51
Figure 3.3: Inter-method segmentation comparison
Tissue Probability Maps (TPMs) from different methods are overlaid on one another. Red/green represents
the voxels where only one TPM has non-zero probability value. Yellowish green or orange represents
overlapping regions. (A) SPM vs. FSL GM segmentation, (B) SPM vs. FSL CSF segmentation, (C) SPM
vs. FS WM segmentation and (D) SPM vs. FSL full brain map (GM+WM+CSF). A(ii) & B(ii) are histograms
of voxel probability values in GM & CSF TPMs respectively. Although TPMs of different methods predominantly
overlap, there are mismatching regions/voxel values of segmentation that contribute to inter-method differences in brain volumes
estimates.
3) Boundaries of GM and other Structures: SPM assigns regions close to the GM
boundaries of different brain structures as GM indicated by red lines in box 3 and box
5 of Figure 3.3A(i).
52
4) Inter-sulcal CSF: SPM segments some inter-sulcal CSF as GM – indicated by red
regions in box 4. The overestimation of GM in these brain regions by SPM explains the
higher GM estimates by SPM.
3.4.4.2 Low correlation in CSF volume estimates by SPM and FSL
SPM and FSL produced similar ventricular CSF segmentations; overlap or agreement is
presented in orange, box 2 in Figure 3.3B(i). However, the probability values in ventricular voxels
are in the 0.95–0.99 range in CSF TPM of SPM, while probability values are exactly 1 in CSF
TPM of FSL. This accounts for the shift of the FSL (green) curve to the right in the 0.9–1 range in
Figure 3.3B(ii). Discrepancies in non-ventricular CSF estimates were mainly from following three
brain regions. 1) Brain Boundary: The estimation of CSF surrounding the brain, indicated by the
red regions in box 3 and in other brain slices of Figure 3.3B(i) was higher for SPM. This accounts
for the higher SPM (red) curve in the 0.7–0.99 range in the histogram of CSF. This may be due to
the fact that the a priori CSF TPM used by SPM (in New Segment) during segmentation has a thick
layer of CSF at the boundary of the brain. 2) Inter-sulcal CSF: FSL segments greater CSF
compared to SPM in inter-sulcal regions (see green regions in Box1 of Figure 3.3B(i)), where CSF
TPMs of SPM and FSL have probability values in the ranges of 0–0.3 and 0.3–0.7, respectively.
This introduces higher SPM (red) curve in the 0–0.3 range in Figure 3.3B(ii).
Our results indicate that the segmentation discrepancy in non-ventricular CSF
segmentation is the primary cause of the low inter-method correlation in CSF volumes.
Misclassification of bone/air as CSF or vice-versa can be another major source of discrepancy. T1-
weighted images provide a reasonable amount of contrast between GM (dark gray), WM (lighter
gray) and CSF (black). However, dense bone and air also appear dark like CSF. This makes the
segmentation of CSF challenging, especially at the sulcal regions since it is difficult to distinguish
53
between the inner skull and sulcal CSF. Accuracy in CSF segmentation can be improved by
augmenting information from the T2-weighted image as it provides additional contrast between
CSF (bright) and brain tissue (dark).
3.4.4.3 Underestimation of WM by FS compared to SPM
The WM volumes estimated by FS were the lowest in general. The final results of surface
reconstruction and parcellation produced by recon-all were used to report WM segmentation by FS.
Our study indicates that FS produces a considerable number of areas where WM is misclassified
as non-WM (red dots in box 1 of Figure 3.3C). The misclassified areas were primarily due to WM
hypo-intensities misclassified as GM or partial voluming in which WM+GM voxels look like non-
WM and are segmented as non-WM. WM hypo-intensities have values much lower than the
average WM intensity. Although recon-all automatically adds the volume of WM hypo-intensities
to the total WM volume, our results indicate that it cannot still identify all the hypo intensities and
the WM segmentations of FS require significant manual editing.
3.4.4.4 Brain Mask (SPM vs. FSL)
Brain masks in SPM and FSL presented in Figure 3.3D(i) were created by the summation
of GM, WM, and CSF TPMs. The brain mask of SPM (red) is larger than that of FSL (green).
This is mainly due to the overestimation of CSF surrounding the brain by SPM compared to FSL.
In the FSL brain mask, voxel probabilities have only two values, 0 or 1, while the voxel probabilities
in SPM brain masks are continuous in the 0–1 range (see Figure 3.3D(ii)). In FAST of FSL, the
HMRF model used for classification has only three components (GM, WM and CSF); hence, the
summation of these TPMs add up to one in the brain regions. In SPM, however, the Gaussian
mixture model uses a mixture of six Gaussian components for GM, WM, CSF, bone, soft tissue
and air/background; but the brain mask was created by summing only three mixture components:
54
GM, WM and CSF. Therefore, the voxel probabilities in the brain mask of SPM are in the 0–1
range.
3.4.5 Sources of the inter-method discrepancies in tissue segmentation
Inter-method segmentation discrepancies in different locations of the brain were shown in
the previous section. This section discusses the possible reasons behind the inter-method
segmentation discrepancies at a conceptual level. Inter-method differences in tissue segmentation
can be mainly attributed to differences in two factors: method dependent differences in the brain
template and method dependent differences in the spatial normalization process.
3.4.5.1 Brain templates
A brain template or atlas is an anatomical representation of a brain. It is a pre-segmented
standard brain image generated from a single subject or a cohort of subjects. A brain template is
generally used as an a priori to guide tissue classification. The different preprocessing methods used
for tissue segmentation in this study utilize different standard brain templates for prior spatial
information of the brain structures. For example, New Segment of SPM uses ICBM-452 T1 brain
atlas (Mazziotta et al., 2001a) and FS uses MNI 305 (Collins et al., 1994) brain atlas. Whereas
FAST of FSL does not use brain atlas but utilizes HMRF model to encode spatial information
through contextual constraints of neighboring pixels in an image (Zhang et al., 2001). The
differences among brain templates can arise primarily from two sources. First, the definition of
brain structures can vary among the brain templates. Second, the brain templates are created from
different cohorts of subjects scanned under different scanners and hence have different biases for
subject demographics, image quality, and scanner settings. The inter-method discrepancies in
brain structures segmentation can be minimized with the use population specific brain templates
(Mandal et al., 2012; Tanga et al., 2010).
55
3.4.5.2 Spatial normalization
Spatial normalization is the process whereby individual MRIs are registered to the common
anatomical space defined by the standard brain template. Spatial normalization conceptually
consists of two elements: image representation and transformation. The differences due to spatial
normalization starts from the choice in the mathematical representation of an image, i.e. how an
image is mathematically represented to be used in subsequent mathematical operations. For image
representation, several assumptions are made about the properties of the images, and these
assumptions vary with the preprocessing methods. The effect of the differences in these
assumptions propagate further and are finally evident with the discrepant segmentation results.
Similarly, differences arise also from the choice of the transformation applied to the individual
images to register it to a space defined by a standard template. In addition, different spatial
normalization techniques behave differently with different image acquisition parameters, motion
artifacts, and imaging artifacts such as bias field and intensity inhomogeneity. This adds further
discrepancy to the segmentations.
3.4.6 Implications to future neuroimaging studies
Our findings have important implications for the ongoing search for neuroimaging
biomarkers in ASD and other brain disorders. Inconsistencies across previous studies and lack of
evidence for brain biomarkers in ASD may in part be a result of failure to account for the issues
we have raised in this study. To reduce the impact of inter-method differences, we suggest the
following directions that need further investigation.
3.4.6.1 Cross-validation of findings
Results of our study and of many previous studies (Callaert et al., 2014; Eggert et al., 2012;
Nordenskjöld et al., 2013; Rajagopalan et al., 2014) have demonstrated the method dependence
56
of neuroimaging results. Each method has its own strengths and weakness, and there is no general
agreement on which method is optimal. Therefore, we suggest using multiple methods to segment
brain images to cross validate results across methods. In addition, a multi-variate classifier can be
trained on the outputs of several methods to improve the overall segmentation results. A simple
classifier would be a majority voting system where the final decision is made based on the majority
votes. For example, when SPM, FSL, and FS are used for binary tissue segmentation and if SPM
and FSL labels a voxel as CSF whereas FS labels it as GM, then the multivariate system would
label the voxel as CSF.
3.4.6.2 Development of better methods
The present preprocessing schemes apply spatial transformations that may introduce errors
in tissue segmentations and are one of the major causes of inter-method differences. Machine
learning based methods can be a way to perform segmentation with minimal preprocessing, and
among them, deep-learning methods have proven to be more promising. Recently, a few studies
have successfully performed the segmentation of brain structures from MRI using deep learning.
De Brébisson and Montana (2015) have reported competitive accuracy for segmentation of cortical
and sub-cortical structures from MRIs without performing any non-linear registration. Similarly,
(Kim et al., 2013; Lai, 2015) have reported successful segmentation of hippocampus using deep
learning.
3.4.6.3 Focus on improving data itself in addition to methods
Accurate segmentations may not be possible using a single type of MRI since it may not
have sufficient contrast to discriminate the boundaries of brain structures. For example, T1-
weighted MRI do not have enough information to distinguish between brain structures. It is
difficult to segment CSF surrounding the brain region using T1-weighted MRIs since dense bone
57
as well as air appear dark in the CSF images. A very low inter-method correlation of 0.5 in CSF
volume estimates obtained in this study also demonstrates the difficulty. Whereas CSF appears
bright and air appears dark in T2-weighted MRI and this is very helpful for the CSF segmentation.
Segmentation accuracy of CSF as well as other brain structures can be improved if information
from T1 and T2 images are used together if T2 images are available.
3.4.6.4 Multi-variate patterns are more robust across methods and scanners
In addition to currently popular mass univariate group difference characterization
techniques, a pattern within extracted biomarkers can be explored using multivariate classification
techniques so that the pattern is robust across methods and sites. In this study, we have shown an
example of how a multi-variate pattern is more robust compared to a univariate. In Figure 3.4, the
relation between GM volume and age according to SPM, FSL, and FS are shown. According to
FSL and FS there is linear decrease in GM volume from 6 to 40 years. This is consistent with the
biological understanding of GM in human brain. However, according to SPM the GM volume
first increases until 25 years old age and then decreases afterwards. So, there is a clear mismatch
between the information provided by the three methods. But when the GM volume is divided by
TIV, and the % GM volume is plotted against age, all three methods show the same trend in GM
decrease with age. This example supports our claim that the multi-variate patterns are more robust
across methods.
58
Figure 3.4: Multi-variate pattern is more robust across methods
Gray matter (GM) age dependence is method dependent. But the %GM (100*GM/TIV) is a very basic
multivariate pattern and is robust across methods.
Some recent anatomical studies have utilized multivariate classification techniques to
identify anatomical patterns differing across ASD and TDC group instead of focusing on just one
measure at a time. These studies have reported remarkable accuracies in decoding the group
59
identity in subject-level using anatomical measures such as cortical thickness, geometry curvature,
surface area, etc. (Ecker et al., 2010a; Haar et al., 2014; Jiao, 2011; Uddin et al., 2011). In addition,
it is very likely that brain alterations in heterogeneous condition such as ASD are multi-variate in
nature.
Moreover, novel techniques that are robust to inter-scanner variability can be
implemented. For example, Vardhan et al. (2014) has proposed GM/WM ratio as an indirect way
to model longitudinal growth curves in early childhood since there is change in the GM/WM
contrast as the brain undergoes myelination. According to the above study, the contrast ratio is
relatively invariant to location, scanner type and scanning conditions, which is a highly desirable
feature for multi-site studies
3.5 Conclusion
We demonstrate that ASD vs. TDC group differences in brain volumes are method
dependent. According to SPM, ASD brain volumes were higher than TDC with statistical
significance but according to FSL and FS the difference was not significant. Inter-method brain
volume differences can be attributed to varying definitions of brain structures, use of different
templates, differences in image processing algorithms, and the varying effects of imaging artifacts
and acquisition settings. We suggest that research studies should cross-validate findings across
multiple methods before providing biological interpretations. To our knowledge current studies do
not account for the method dependency of results. Accounting for methodological differences will
be an important step in increasing the reliability and consistency of future neuroimaging findings
of ASD and other brain disorders, leading to a greater likelihood of establishing valid and reliable
neuroimaging biomarkers. We also emphasize that future work is needed to investigate the reasons
behind inter-method discrepancies and the need to develop better methods. Moreover, we also
60
suggest on using multi-variate techniques to characterize the brain alterations in ASD because the
multi-variate patterns are more robust across methods and it is very likely that brain alterations in
heterogeneous condition such as ASD are multi-variate in nature.
61
4 CHAPTER 4: UTILIZING NON-BRAIN
INFORMATION TO IMPROVE ASD DETECTION
BASED ON BRAIN MORPHOLOGY
Some material from (Katuwal et al., 2016c) has been reused in this chapter under Creative Commons Attribution
(CC BY) license.
Low success (<60%) in ASD classification based on brain morphology in the large multi-
site datasets and inconsistent findings on brain morphometric alterations in ASD can be attributed
to the ASD heterogeneity. Morphometric features from MRIs of 734 males (ASD: 361, controls:
373) of ABIDE were derived using FS. Applying the RF classifier, an AUC of 0.61 was achieved.
By augmenting the information from VIQ and age to brain morphometric features we were able
to increase the classification performance. The important features were mainly from the frontal,
temporal, ventricular, right hippocampal and left amygdala regions. However, the important
features highly varied with AS, VIQ, and age. The curvature and folding index features from
frontal, temporal, lingual, and insular regions were dominant in younger subjects suggesting their
importance for early detection. Our findings suggest that identifying brain biomarkers in sub-
groups of ASD can yield more robust and insightful results than searching across the whole
spectrum. Further, it may allow identification of sub-group specific brain biomarkers that are
optimized for early detection and monitoring, increasing the utility of MRI as an important tool
for early detection of ASD.
4.1 Introduction
62
ASD diagnosis based on MRI is highly desirable because it can be objective and can be
utilized for early detection (Glenn, 2010). In addition, it can improve our understanding of brain
anatomical underpinnings of ASD. However, characterizing the brain morphology of ASD has
been challenging due to its heterogeneity (Lenroot & Yeung, 2013). A number of studies have
investigated brain anatomical alterations in ASD compared to that of TDC subjects. The reported
anatomical alterations are mostly contained in the fronto-temporal regions (Bigler et al., 2007; Ha
et al., 2015), the amygdala-hippocampus complex (Groen et al., 2010; Nordahl et al., 2012), corpus
callosum (Wolff et al., 2015), and cerebellum (D’Mello et al., 2015). However, previously reported
regions have not been consistent across studies (Amaral et al., 2008; Chen et al., 2011; Katuwal et
al., 2015a). The variability in differences can be due to the heterogeneity of ASD, differences in
methodological approaches, or a combination of both.
More recently, several studies have applied machine learning on MRI derived brain
features, in an effort for ASD detection. Previous studies on ASD vs. TDC classification can be
mainly categorized into two groups based on the number of subjects used in the study: (1) small
dataset (n < 200) matched for DB measures such as age, sex, and IQs (Ecker et al., 2010b; Jiao et
al., 2010; Uddin et al., 2011; Wee et al., 2014) and (2) large heterogeneous datasets such as the
ABIDE dataset (n > 700) (Haar et al., 2014; Katuwal et al., 2015b; Sabuncu & Konukoglu, 2014).
Table 4.1, below, shows classification accuracies from previous studies and the disparity of results
is clear. For instance, Group 1 using small datasets reports high classification accuracies while
Group 2 using the large heterogeneous ABIDE dataset reports classification accuracies less than
60%.
63
Table 4.1: Previous ASD vs. TDC Classification studies
Study Sample MRI Feature Classification Classification
Accuracy (%)
#ASD/#TDC Technique
Group 1 (small dataset; n < 200)
Ecker et al. 2010 22/22 (Males) Gray matter SVM 81
Ecker, 20/20 (right- Cortical thickness in SVM 90
Marquand, et handed) left hemisphere
al. 2010
Jiao et al. 2010 22/16 Regional cortical LMT 87
thickness
Uddin et al. 24/24 Gray matter in SVM 90
2011 default mode
network regions
Wee et al. 2014 58/59 Regional and inter- SVM 96
regional cortical and
subcortical features
Group 2 (large dataset; n > 700)
Haar et al. 2014 539/573 Regional volume, LDA, QDA <60
surface area and
cortical thickness
Sabuncu & 325/325 Regional volume, SVM, <60
Konukoglu 2014 surface area and
NAF, RVM
cortical thickness
Katuwal et al. 373/361 Volume, surface RF, GBM, SVM 60
2015 area, cortical
thickness, thickness
std., mean curvature,
Gaussian curvature,
folding index
Katuwal et al., 373/361 Zernike moments of RF <60
2016a sub-cortical
structures
SVM: Support Vector Machine; LMT: Logistic Model Trees; LDA: Linear Discriminant Classifier; QDA: Quadratic
Discriminant Classifier; NAF: Neighborhood Approximation Forest; RVM: Bayesian Relevance Vector Machine; RF:
Random Forest; GBM: Gradient Boosting Machine
In this study, we investigate the heterogeneity in ASD as a major reason behind the
inconsistent neuroanatomical findings and the disparity in classification accuracies. We investigate
if DB measures of the subjects can be utilized to mitigate the ASD heterogeneity and facilitate our
effort to understand ASD brain alterations, thereby helping to predict ASD using brain
64
morphology. We use multiple automatically extracted brain morphometric features and multiple
classification techniques for this investigation. The aims of this chapter are as follows:
1. We investigate the incremental predictive power that can be gained by adding DB
measures such as age, VIQ, and AS to the brain morphometric features derived from
MRI.
2. We investigate if sub-grouping the subjects by the above-mentioned DB measures
helps to improve ASD vs. TDC classification.
3. We explore the important features for classification in the sub-groups and how they
change with DB measures.
4. We explain the discrepancies in the reported neuroimaging findings on brain
alterations in ASD and classification accuracies in relation to the heterogeneity in
ASD brain morphology.
4.2 Methods
4.2.1 MRI Data
The ABIDE (Di Martino et al., 2014) dataset with 1,112 MRIs from 17 different sites was used
in this study. Each MRI was inspected visually; those with significant motion or other artifacts were
excluded from analysis. Seventy-six subjects from two ABIDE sites were excluded due to poor
image quality (motion/artifacts). An additional 96 subjects were excluded due to poor image
quality, and another 64 subjects were discarded due to FS (Fischl, 2012) segmentation failure.
Since ASD is highly prevalent in males (Fombonne, 2005) and also to avoid the gender effects, 142
female subjects were excluded. Finally, 734 male subjects (ASD: 361, TDC: 373) were used from
65
the remaining 876 subjects. Summary of DB measures of the used sample is presented in Table
4.2 . Scanner information of the individual sites can be obtained from Table 1 in (Haar et al., 2014)
Table 4.2: Subject demographics and behavioral (DB) measures

ASD TDC ASD vs. TDC
t-test
(p-value)
N 361 373
Age(years) 17.9 ± 8.7 18.1 ± 8.2 0.7
(7 to 64) (6.47 to 56.2)

VIQ 104.5 ± 17.8 112.4 ± 12.9 6.2E-10*
PIQ 105.2 ± 16.8 108.7 ± 13.2 5.7E-3*
FIQ 105.2 ± 16.6 111.8 ± 12.3 4.8E-9*
ADOS 11.9 ± 3.7 NA NA
AS 7.1 ± 2.1 NA NA
4.2.2 Reasons for using male subjects only
We decided to use only male subjects in this study for several reasons. First, we wanted to
remove the gender effects on the ASD brain heterogeneity and focus only on the brain
morphometry of ASD males. This decision was motivated by the previous findings that there are
significant differences between the brain anatomy of ASD males and females. Brain alterations in
female ASD subjects reported by studies using only female subjects (Calderoni et al., 2012; Craig
et al., 2007) have a very small overlap with the alterations reported by the studies performing meta
analyses of predominantly male subjects (Radua et al., 2011; Via et al., 2011). A recent study by
Lai et al. (2013) focusing on the brain anatomical differences of ASD males and females has
reported that the neuroanatomy of adult ASD males and females differed and there was minimal
spatial overlap in both grey and white matter. Second, compared to females, ASD is more
66
prevalent in males (Fombonne, 2005). In addition, among the 876 high quality MRIs available
from ABIDE, 84 % (734) of them were of males.
4.2.3 MRI Data processing
The recon-all preprocessing workflow of FS v. 5.3.0 (Dale et al., 1999a; Fischl et al., 1999a,
2002; Ségonne et al., 2004) was used to extract brain morphometric features. Volume of 40 sub-
cortical structures from Aseg atlas (Fischl et al., 2002) and volume, surface area, Gaussian curvature,
mean curvature, folding index, thickness mean and thickness standard deviation of 34 cortical
structures from Desikan-Killiany atlas (Desikan et al., 2006) were derived from each MRI. In total,
538 brain morphometric features were derived for each subject. The volume features of each
subject were normalized by TIV since percentage or relative brain volumes have been found to be
more robust across scanner types and scanner drifts (Takao et al., 2011) and thus this should reduce
sensitivity to the fact that the data were collected at multiple sites.
4.2.4 Classification Algorithms
RF (Breiman, 2001) classification models were trained using brain morphometric features
for ASD vs. TDC classification. RF was used since it is inherently suitable for parallel processing,
has very few hyper parameters to tune, does not require scaling the data, is theoretically resistant
to overfitting, provides variable importance and has been found to be very good for a variety of
datasets (Breiman, 2001; Caruana & Niculescu-Mizil, 2006). To avoid the results from being
affected by model selection bias, we repeated the experiments using GBM (Friedman, 2000)
classifier models and the results are presented in Appendix B. For both RF and GBM, scikit-learn
0.16.1 (Pedregosa & Varoquaux, 2011) was used. For all classifiers, hyper parameter tuning was
performed with cross-validation and classification performance was estimated by 10-fold cross-
67
validation. RF and GBM classification models are briefly explained below and are explained in
detail in the cited publications.
The optimal number of predictors used to split a node of a decision tree was automatically
estimated by performing grid search within (Öm -Öm /2, Öm + Öm/2) where m is the number of
features. Gini impurity (Breiman, 1996) was minimized while growing decision trees. Gini impurity
is the probability that a randomly chosen sample would be incorrectly labeled if the samples were
labeled according to the distribution of class labels. In general, the higher the number of decision
trees in a RF, the more reliable is the prediction and the interpretability of the variable importance
(Strobl et al., 2009). So, a large number (5000) of decision trees were used in the RF classification
models used in this study.
A grid search on the depth of decision tree (1 to 12) and the subsample ratio (0.5 and 0.7)
was performed to automatically estimate their optimum values. To increase the stability of the
model and increase the reliability of the variable importance, a large number (5000) of decision
trees were used in the GBM classification models used in this study. To avoid overfitting, a very
low learning rate (0.001) was used.
4.2.5 Metrics Used
Classification accuracy and area under the ROC curve (AUC) were used to measure the
success of classification. Accuracy is a threshold-based metric and AUC is a ranking based metric.
AUC is the probability that a classifier will rank a randomly chosen positive example higher than
a randomly chosen negative example with the assumption that the positive example ranks higher
than the negative example (Bradley, 1997). The practical difference or effect size of ASD vs. TDC
group difference was quantified using Cohen’s d (Cohen, 1988).
68
4.2.6 Adding AS, VIQ, and age information to the brain morphometric features
We reduced the heterogeneity of subjects by adding DB measures (AS, VIQ, and age) to
brain morphometric features. The use of AS information for ASD classification may appear to be
a self-fulfilling prophecy. In this study, AS information was used in only to answer the following
questions: 1) Is classification difficulty dependent on severity? and 2) Are important features for
classification consistent with severity?
AS, VIQ, and age information were used under the following two schemes. First, VIQ, and
age were used as training features in conjunction with the morphometric features.
Sub-grouping: In the second scheme, the subjects were sub-grouped by AS, VIQ, and
age into three sub-groups by each as defined in Table 4.3. TDC subjects were not used while sub-
grouping by AS. AS = 5 was used as the threshold between low and mid AS sub-groups according
to Table 2 in (Gotham et al., 2009). The AS = 8 was used as the threshold between mid and high
AS sub-groups since the mean and median for AS ³ 6 are 7.9 and 8 respectively. Only 167 ASD
subjects who had AS information were used in the sub-groups by AS. In sub-grouping by VIQ,
the threshold of VIQ = 90 instead of natural choice 85 (one standard deviation below the median
100) was used to divide the low and normal VIQ sub-groups because there were very few subjects
with VIQ £ 85. Only 586 subjects (296 ASD, 290 TDC) who had VIQ information were used in
the sub-groups by VIQ. In sub-grouping by age, the thresholds 13 and 18 years were chosen
because they approximately reflect pre-puberty, adolescence and early adulthood, and also yielded
in sub-groups with similar sample sizes. Subjects greater than 40 years were not used as they were
very few in number (17).
In each sub-group, ASD vs. TDC classification models were trained using morphometric
features. The sub-grouping resulted in an unbalanced classification problem in sub-groups i.e.
69
classes with uneven sizes (Chawla, 2005). This problem was addressed in two ways: up-sampling
the smaller class in each training fold and down-sampling the larger class. AUC has been used to
evaluate the performance of classification models in sub-groups since AUC is insensitive to the
unbalanced classes (Fawcett, 2006).
Table 4.3: Sub-groups definition

Sub-groups Definition #ASD/#TDC
Up-sampling Down-sampling
AS mild 4 ≤ AS ≤ 5 20/373 20/20
moderate 6 ≤ AS ≤ 7 60/373 60/60
high 8 ≤ AS ≤ 10 76/373 76/76
VIQ low 75 ≤ VIQ ≤ 90 57/15 15/15
normal 90 < VIQ < 115 146/142 142/142
high 115 ≤ VIQ ≤ 150 93/133 93/93
Age young 6 ≤ Age < 13 116/120 NA
(years) mid 13 ≤ Age < 18 114/108 NA
old 18 ≤ Age ≤ 40 121/138 NA
NA: Not Applicable. Down-sampling was not performed in the sub-groups by age since the number of
subjects were comparable and very few subjects had AS and VIQ to use for matching the subjects.
4.3 Results
4.3.1 Classification using only brain morphometric properties
Applying RF on brain morphometric features, classification accuracy of 60% and AUC of
0.61 was achieved. Similar classification performance has been reported by previous studies using
the ABIDE dataset: Katuwal et al. 2015 (60%), Haar et al. 2014 (<60%) and Sabuncu &
Konukoglu 2014 (<60%).
4.3.2 Age and VIQ used as training features in conjunction with brain morphometric
features
We first sought to determine whether simply adding age or VIQ as training features would
improve brain morphometric classification. When age was added to the brain morphometric
70
features for training the classifier, AUC improved to 0.62. When VIQ was added, AUC improved
to 0.66. When both age and VIQ were added, AUC improved to 0.68. In addition, site information
was explicitly added to the morphometric features for training the classifier. One-hot coding
method was used to represent the site information, i.e. for each scanning site, a binary feature was
added with values of one for subjects from the scanning site and values of zero for other subjects.
In total, 17 binary features for 17 sites representing the site information were added to the 538
morphometric features. After adding site information to the morphometric features for training,
AUC did not improve and was same as that from using only the morphometric features.
4.3.3 Sub-grouping subjects by AS, VIQ, and age
4.3.3.1 Up-sampling smaller class in each training fold
In each sub-group, the smaller class was randomly up-sampled in each training fold to
match the number of ASD and TDC subjects. The AUC scores achieved in the sub-groups are
presented in Figure 4.1A and Table 4.4. In Figure 4.1A, a point represents the mean and an error
bar represents the one standard deviation of the AUC scores from 10 test folds. AUC scores and
number of ASD and TDC subjects are presented below the error bar.
71
A) ASD/TDC subjects numbers matched by upsampling the smaller group in each training fold
AS VIQ age
0.9
0.8
0.7 AUC=0.78 0.8

60/373
#ASD/TDC=20/373 0.72
0.75
AUC
0.6 76/373
57/15 0.63 0.62 0.66 0.65
146/142 116/120
0.5 93/133 121/138
0.4 0.5
114/108
0.3
4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10 75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150 6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40
B) ASD/TDC subjects numbers matched by downsampling the larger group

AS VIQ
1.0
0.9 age, VIQ matched age matched
0.8
AUC=0.92
#ASD/TDC=20/20
0.7 0.81
AUC
60/60
0.6 0.69 0.8
76/76 16/16
0.5
0.62 0.59
142/142
93/93
0.4
0.3
4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10 75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150
Figure 4.1: Improvement in classification by sub-grouping

The AUC scores of the classification in the sub-group by grouping based on autism severity (AS), age, and
Verbal IQ (VIQ) are presented. A point represents the mean and an error bar represents the one standard
deviation of the AUC scores from 10 test folds. A) Smaller classes were up-sampled in each training fold to
balance the number of ASD & TDC subjects. Sub-grouping improved the classification with the most and
least improvements from sub-grouping by AS and age respectively. B) Larger classes were down-sampled
matching the demographics of the smaller classes. This scheme further improved the classification
performance.
In sub-groups by AS, AUC was 0.78, 0.8 and 0.72 for low, moderate, and high sub-groups
respectively. The sample sizes in the sub-groups by AS were unequal. To determine if the results
may have been due to unequal sample sizes, separate classification models were built for the
subjects with AS = 4-5 (#ASD/#TDC = 20/373), 6 (33/373), 7 (27/373), 8 (25/373), 9 (29/373)
and 10 (22/373). The results of this experiment are presented in Figure 8.6 in Appendix B. In this
experiment, where the sample sizes in the sub-groups were comparable, AUC decreased with the
AS according to both RF and GBM. There was strong negative correlation (RF: r = -0.72, p =
0.1, GBM: r = -0.86, p = 0.028*) between mean AUC and mean AS of sub-groups.
72
Table 4.4: Classification AUC in sub-groups created by AS, VIQ and age
Up-sampling Down-sampling
Sub- AUC AUC
groups RF GBM RF GBM
AS mild 0.78 ± 0.09 0.74 ± 0.16 0.92 ± 0.12 0.92 ± 0.11
moderate 0.80± 0.10 0.76 ± 0.09 0.81 ± 0.12 0.80 ± 0.10
high 0.72 ± 0.07 0.69 ± 0.07 0.69 ± 0.08 0.68 ± 0.11
VIQ low 0.75 ± 0.20 0.71 ± 0.31 0.80 ± 0.30 0.80 ± 0.31
normal 0.63 ± 0.05 0.65 ± 0.05 0.62 ± 0.11 0.63 ± 0.11
high 0.62 ± 0.08 0.62 ± 0.07 0.59 ± 0.12 0.54 ± 0.13
Age young 0.66 ± 0.11 0.67 ± 0.10 NA NA
(years) mid 0.50 ± 0.13 0.51 ± 0.14 NA NA
old 0.65 ± 0.13 0.66 ± 0.14 NA NA
NA: Not Applicable. Down-sampling was not performed in the sub-groups by age since the number of
subjects were comparable and very few subjects had AS and VIQ to use for matching the subjects. Mean
and standard deviation of the AUC across 10 test folds are presented.
In sub-groups by VIQ, AUC decreased with VIQ, with AUC of 0.75, 0.63 and 0.62 for
low, normal and high VIQ sub-groups respectively. In sub-groups by age, AUC was modest in
young and old sub-groups with AUC of 0.66 and 0.65 respectively. AUC was low (0.5) in mid-age
sub-group.
In summary, sub-grouping the subjects by AS, VIQ, and age improved the classification
rate with the most and least improvements from sub-grouping by AS and age respectively. The
results from GBM were similar and are presented in Table 4.4 and Figure 8.1 in Appendix B.
4.3.3.2 Down-sampling the bigger class to match the demographics of the smaller class
In the above section, although the subjects were more homogenous after sub-grouping, the
distribution of other DB measures of ASD and TDC subjects in the sub-groups might be different.
This raises a concern that the results from the up-sampling scheme could have been influenced by
the difference in DB measures distribution. To check if the results are not due to the different
demographics, ASD and TDC subjects in each sub-group were matched on demographics. In each
73
sub-group, the bigger class was down-sampled to match the ASD and TDC subjects on age and/or
VIQ; see Table 4.3 for the number of subjects. Subjects were matched by age and VIQ in the sub-
groups by AS and by age in the sub-groups by VIQ. For sub-groups by age, down-sampling was
not performed as the number of subjects in each group were comparable and very few subjects
had AS and VIQ.
Classification performance further improved after matching the subject demographics. A
high AUC of 0.92 was achieved in the low AS sub-group; see Table 4.4 and in Figure 4.1B.
Similarly, high AUCs of 0.81 and 0.80 were achieved for moderate AS and low VIQ sub-groups
respectively. The AUC trends from this experiment were the same as that from the up-sampling
scheme presented above, i.e. AUC decreases with AS and VIQ. The results from GBM were
similar and are presented in Table 4.4 and Figure 8.1 of Appendix B. When separate classification
models were built for the ASD subjects with each level of AS, AUC sharply decreased with AS
according to both RF and GBM (RF: r = -0.86, p = 0.029*, GBM: r = -0.87, p = 0.026*); see
Figure 8.6 of Appendix B.
To confirm that the increase in classification performance after sub-grouping is not due to
optimization issues and is actually due to the reduction in the heterogeneity in the sub-groups, we
tested the RF classification model trained in one sub-group on other sub-groups. To obtain the
distribution of test scores, the classification model trained in one sub-group was tested on 200
bootstrap replications from another sub-group. Classification results are presented in Figure 8.3 of
Appendix B, where each sub-plot represents a sub-group and three data points correspond to the
performance scores (when tested on the sub-group) of the models trained in three sub-groups. The
AUC scores in intra-subgroup classification were much larger than the AUC scores in inter-
subgroup classification in 16 out of 18 comparisons. AUC score of intra-subgroup classification
was lower than inter-subgroup classification in 2 comparisons. This disparity occurred in the mid-
74
age sub-group where the intra-subgroup classification was close to chance (50% success) and the
inter-subgroup rates were also close to chance (53% and 52% success). Moreover, the AUC scores
decreased when the difference between the training and testing sub-groups increased along the
variable by which sub-groups were defined. For example, when the classification models were
tested on the low-VIQ sub-group, the AUC scores decreased from 0.75, 0.64, and 0.35 respectively
as the subjects from the low, mid, and high VIQ sub-groups were used for training the models.
4.3.4 Multivariate analysis: Important features for classification and their variability
across sub-groups
The top 10 important features for classification in each sub-group with matched subjects
(i.e. from section 4.3.3.2) are presented in Figure 4.2. The top features for classification across all
subjects are in Figure 4.2D. Each feature is represented by a bar whose length is proportional to
its importance for the classification. The feature importance was calculated as an average of the
importance scores from 10 test folds. Before each feature, ASD vs. TDC Cohen’s d and two sample
t-test significance (P<0.005** and P<0.05*) are presented. The different morphometric features
are color coded and have been grouped together. Findings for the volume features reported in this
study are after they were normalized by TIV.
The important features for classification varied across the sub-groups. However, the
important features were mainly from the frontal, temporal, insular, ventricular, right hippocampal,
and left amygdala regions. Most of the important features from RF and GBM were common; see
Figure 4.2D for RF results and Figure 8.2 of Appendix B for GBM results.
75
A) AS
4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10
1.65**lh_lateralorbitofrontal_area 0.78**X3rd.Ventricle_volume 0.63**Right.choroid.plexus_volume
1.31**rh_medialorbitofrontal_area 0.73**lh_parahippocampal_volume 0.21 SubCortGrayVol_volume
1.75**lh_fusiform_thicknessstd 0.77**rhSurfaceHoles_volume 0.48**X3rd.Ventricle_volume
1.38**lh_bankssts_thicknessstd 0.92**rh_inferiorparietal_meancurv 0.63**rh_medialorbitofrontal_thicknessstd
1.17**lh_inferiortemporal_thickness 0.83**lh_superiortemporal_meancurv 0.65**rh_middletemporal_thicknessstd
1.31**rh_parsopercularis_thicknessstd 0.8** lh_middletemporal_meancurv 0.61**rh_parsopercularis_thicknessstd
1.22**lh_inferiorparietal_meancurv 0.5* rh_precentral_meancurv 0.42* lh_superiorparietal_thicknessstd
1.25**lh_lateraloccipital_meancurv 0.89**rh_inferiortemporal_meancurv 0.67**lh_middletemporal_meancurv
1.13**rh_lateraloccipital_gauscurv 0.09 rh_caudalmiddlefrontal_gauscurv 0.57**lh_frontalpole_meancurv
0.58 lh_supramarginal_gauscurv 0.3 lh_supramarginal_gauscurv 0.05 lh_frontalpole_gauscurv
B) VIQ
75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150
1.66**rh_parsopercularis_area 0.14 rh_parahippocampal_volume 0.25 lhCorticalWhiteMatterVol_volume
1.48**rh_WhiteSurfArea_area 0.26* lh_insula_volume 0.24 CorticalWhiteMatterVol_volume
1.33**lh_WhiteSurfArea_area 0.01 rh_parsorbitalis_volume 0.32* Right.Hippocampus_volume
1.44**rh_medialorbitofrontal_area 0.42**Left.Lateral.Ventricle_volume 0.3* Left.Hippocampus_volume
0.92* rh_parahippocampal_area 0.37**Right.Lateral.Ventricle_volume 0.36* Right.vessel_volume
1.41**rh_middletemporal_area 0.35**rh_parahippocampal_thickness 0.22 rhCorticalWhiteMatterVol_volume
1.03* lh_rostralanteriorcingulate_area 0.09 lh_parsopercularis_foldind 0.5** lh_frontalpole_thicknessstd
1.45**lh_precentral_area 0.13 lh_supramarginal_foldind 0.46**lh_inferiortemporal_thicknessstd
1.14**rh_precuneus_area 0.07 rh_precuneus_foldind 0.32* lh_frontalpole_gauscurv
0.13 lh_rostralanteriorcingulate_foldind 0.22 lh_frontalpole_gauscurv 0.02 rh_lateralorbitofrontal_gauscurv
C) Age (years)
6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40
0.23 rh_insula_foldind 0.48**CC_Central_volume 0.5** lh_transversetemporal_volume
0.2 lh_lateralorbitofrontal_foldind 0.35* CC_Mid_Anterior_volume 0.49**rh_bankssts_volume
0.29* lh_fusiform_foldind 0.39**X3rd.Ventricle_volume 0.45**lh_inferiortemporal_volume
0.19 lh_superiortemporal_foldind 0.36* Left.Lateral.Ventricle_volume 0.42**rh_superiortemporal_volume
0.18 lh_lateralorbitofrontal_gauscurv 0.03 lh_caudalmiddlefrontal_area 0.34* lhCortexVol_volume
0.28* lh_frontalpole_gauscurv 0.42**lh_lingual_thickness 0.27* Left.Putamen_volume
0.01 rh_medialorbitofrontal_gauscurv 0.34* rh_lingual_thicknessstd 0.32* Right.Hippocampus_volume
0.02 rh_lingual_gauscurv 0.29* lh_pericalcarine_thickness 0.41**Left.Amygdala_volume
0.35* lh_fusiform_gauscurv 0.26 lh_caudalanteriorcingulate_thickness 0.17 lh_medialorbitofrontal_thicknessstd
0.25 rh_parsopercularis_gauscurv 0.17 lh_entorhinal_gauscurv 0.38**lh_pericalcarine_meancurv
0 25 50 75 100 0 25 50 75 100 0 25 50 75 100

Feature Importance(%)
D) ALL Subjects
ALL
0.25**Left.Lateral.Ventricle_volume
Morphometric Attributes
0.02 rh_parahippocampal_volume
volume
0.24**CSF_volume area
0.23**Left.Inf.Lat.Vent_volume thickness
0.14 Left.Amygdala_volume foldind
0.17* CorticalWhiteMatterVol_volume meancurv
gauscurv
0.17* lhCorticalWhiteMatterVol_volume
0.07 rh_bankssts_volume Group Difference
0.04 rh_parstriangularis_area a TDC > ASD
a ASD > TDC
0.13 lh_frontalpole_gauscurv
0 25 50 75 100
76
Figure 4.2: Important features for classification are different across sub-groups.
Top 10 important features for ASD spectrum disorder (ASD) vs. typically developing controls (TDC)
classification in each sub-group (by AS, VIQ, age) are presented. Each feature is represented by a colored
bar; the length of the bar represents the relative % importance for classification with respect to the top
feature. The features have been grouped and color-coded by volume, area, thickness mean, thickness
standard deviation, folding index, mean curvature and Gaussian curvature. Before each feature, Cohen’s d
and two sample t-test significance (P<0.005** and P<0.05*) of ASD vs. TDC group difference are
presented. The important features for classification varied across the sub-groups demonstrating the
heterogeneity in ASD brain morphometry.
To remove the concern that the arbitrary cutoff of top 10 might have influenced our results,
the important features were also selected by another technique based on cumulative distribution
of the feature importance scores. After sorting the features in descending order of their importance
scores, the scores were cumulatively added starting from the most important feature. The features
required to reach 10% of the total sum of the scores were considered important and the
corresponding feature importance plot for RF is presented in Figure 8.4 of Appendix B. In
addition, we relaxed our criteria for important features and used 25% threshold; see Figure 8.5 of
Appendix B. Even after using this different technique to select the important features with multiple
thresholds, the top features for classification were highly dissimilar across the sub-groups.
The important features according to two classifiers were similar suggesting that the results
are not influenced by model choice. To statistically verify the similarity, we performed the
Pearson’s correlation test between the importance scores of all features from the two classifiers. We
performed the test separately in nine sub-groups and the correlation coefficients are reported in
Table 8.1 of Appendix B. All correlation coefficients were high (r > 0.75 in 9 and r > 0.85 in 7
sub-groups) and statistically significant (p < E-16). The high similarity between the feature
importance scores from two different classifiers supports that the important features reported in
this study are not affected by model choice and hence are robust.
77
To demonstrate the extent of the heterogeneity in brain alterations, variability of the 13
important features with AS, VIQ and age are presented in Figure 4.3. The 13 features include the
top feature from each sub-group (9 in total) and 4 important features from the classification using
all the subjects.
Statistical Significance (p<0.05) FALSE TRUE
lh_fusiform_thicknessstd lh_inferiortemporal_thicknessstd Left.Lateral.Ventricle_volume

rh_inferiorparietal_meancurv rh_insula_foldind CorticalWhiteMatterVol_volume
Right.choroid.plexus_volume CC_Mid_Anterior_volume Left.Amygdala_volume
lh_rostralanteriorcingulate_foldind lh_pericalcarine_meancurv
rh_parahippocampal_volume lh_frontalpole_gauscurv
AS VIQ age
1.5
1.0
ASD - TDC effect size (Cohen's d)
0.8
0.5
0.3
0.2
0.1
0.0
-0.1
-0.2
-0.3
-0.5
-0.8
Sub-groups
Figure 4.3: Variability of the important features.

A total of 13 important features for classification are presented; the top feature for each sub-group (9 in
total) and the 4 important features for all subjects. The magnitude and direction of the ASD vs. TDC group
differences of the top features varied with ASD severity (AS), verbal IQ (VIQ), and age demonstrating the
heterogeneity in brain morphometry.
78
Curvature and thickness based features were predominant in the sub-groups by AS; see
Figure 4.2. Interestingly, there were no important volume features in the low-AS sub-group.
Volume features were present in moderate and high-AS sub-groups and many of them were from
ventricles. Thickness standard deviation of the left fusiform gyrus (red line in Figure 4.3) was the
most important feature in the low AS sub-group and had very large ASD vs. TDC group difference
(d = 1.75, p = 4E-16*). The group difference decreased with AS but was still high (d = 0.57, p =
0.0006) in the high AS sub-group. Interestingly, the group difference even changed its direction
with VIQ. The difference was negative (ASD < TDC) with small effect size (d = -0.11, p = 0.7) in
the low-VIQ sub-group but was positive and statistically significant with medium effect size (d =
0.36, p = 0.01*) in the high-VIQ sub-group. Similarly, mean curvature of the inferior parietal
gyrus (blue line in Figure 4.3) was the most important feature in the moderate-AS sub-group. It
was significantly larger in ASD (d = 0.91, 2E-6*). The group difference decreased with AS but was
still high (d = 0.51, p = 0.0006*) in the high AS sub-group. This feature also showed the reversal
in the direction of the group difference- ASD > TDC with medium effect size (d = 0.41, p = 0.02*)
in the young-age sub-group and ASD < TDC with medium effect size (d = -0.3, p = 0.03*) in the
old-age sub-group. Right choroid plexus volume (green line in Figure 4.3) was the most important
feature in the high-AS sub-group and had positive (ASD>TDC) group difference with large effect
size (d = 0.55, p = 9E-4*). Across all subjects, it was larger in ASD with small effect size (d = 0.18,
p = 0.02*). There was large positive group difference (d = 0.71, p = 0.05) in the low-VIQ sub-
group, however, it decreased with VIQ and was negative in the high-VIQ sub-group (d = -0.15, p
= 0.3).
Folding index of left rostral anterior cingulate gyrus (orange line in Figure 4.3) was the most
important feature in the low-VIQ sub-group with small negative group difference (d = -0.13, p =
0.7). It is an interesting observation that it is the most important feature for classification even when
79
the ASD vs. TDC group difference is very small and statistically insignificant. One thing to
remember is that it is the most important in the multi-variate setting where the importance of a
feature is dependent on its relationship with other features. For example, the group difference of
the ratio of folding index of left and right rostral anterior cingulate gyrus was large (d = 0.7, p =
0.05). This demonstrates the superiority of MVPTs over univariate techniques by its ability to
automatically find inter-variable relationships important for inter-group distinction. Similarly,
volume of right parahippocampal gyrus (black line in Figure 4.3) was the most important feature
in the mid-VIQ sub-group with small negative group difference (d = -0.15, p = 0.2). It was also an
important feature in classification using all subjects; see Figure 4.2D. Thickness standard deviation
of left inferior temporal gyrus (brown line in Figure 4.3) was the most important feature in the high-
VIQ sub-group where it was larger in ASD with medium effect size (d = 0.46, p = 0.002*).
However, the group difference was nearly zero in the normal-VIQ sub-group and even flipped its
direction in the low-VIQ sub-group (d = -0.45, p = 0.2).
The important features across the sub-groups by age were distinct. Folding index and
Gaussian curvature features from the frontal and temporal regions were predominant and there
were very few volume, thickness, and area based important features in the young-age sub-group.
The volume features became more dominant with increase in age and most of the important
features in the old-age sub-group were volume-based. Folding index of right insula gyrus (purple
line in Figure 4.3) was the most important feature in the low-age sub-group with small positive
ASD vs. TDC group difference (d = 0.23, p = 0.08). The group difference decreased with age and
was nearly zero for the old-age sub-group. Volume of the mid anterior corpus callosum (dotted red
line in Figure 4.3) was the most important feature in the mid-age sub-group where it was smaller
in ASD (d = 0.35, p = 0.03*). However, in the young-age sub-group, it was larger in ASD (d =
0.15, p = 0.4). Mean curvature of the left pericalcarine gyrus (dotted blue line in Figure 4.3) was
80
the most important feature in the old-age sub-group and was larger in ASD (d = 0.3, p = 0.02*)
but was smaller in ASD (d = -0.38, p = 0.002*) in the old-age sub-group.
Across all subjects, Gaussian curvature of frontal pole was the most important feature. The
volume features were predominant and were mainly from the left amygdala, right
parahippocampal, ventricular and temporal regions. As other important curvature based features,
the group difference in Gaussian curvature of frontal pole (dotted green line in Figure 4.3) was the
largest in younger subjects (d = 0.28, p = 0.03*) and was the smallest for older subjects (d=0.02, p
= 0.9). Among the ventricular volumes, left lateral ventricle volume (dotted orange line in Figure
4.3) was the most important across all subjects (d=0.24, p = 0.002*). It was larger in ASD and the
group difference decreased with VIQ and increased with age. Across all subjects, all the ventricles
were larger in ASD compared to TDC and the group differences were statistically significant
(before multiple comparisons). In general, except 3rd and 4th ventricles, the group difference in
ventricles decreased with VIQ; see Figure 4.3. Left amygdala (dotted brown line in Figure 4.3) was
also an important feature for classification across all subjects. It was larger in ASD in the old-age
sub-group with medium effect size (d = 0.41, p = 0.001*) but was smaller in the young-age sub-
group (d = -0.15, p = 0.3). Likewise, most of the important features showed high variability with
AS, VIQ, and age and even changed the ASD vs. TDC group difference direction.
4.4 Discussion
Modest classification success is achieved using only brain morphometric properties. We
demonstrated that the low success rate was due to the heterogeneity in ASD alterations by showing
the variability of important features across the sub-groups created by demographics and behavioral
measures AS, VIQ and age. To mitigate the challenges imposed by the ASD heterogeneity, we
then utilized AS, VIQ and age information in conjunction with the brain morphometric features.
81
We could significantly improve the classification success after utilizing extra information from AS,
VIQ and age and demonstrated this using two different classification techniques. When the
classification models trained in one sub-group were tested on other sub-groups, the inter-subgroup
classification scores were much lower than the intra-subgroup classification scores. Moreover, the
classification scores decreased when the difference between the training and testing sub-groups
increased along the variable by which sub-groups were defined. These results support our
hypothesis that the sub-grouping of subjects results in the heterogeneity reduction and hence the
improvement in classification performance.
The analysis of the important features for classification in conjunction with the univariate
tests provided valuable insight on structural alterations of autistic brains. The alterations were
mainly from ventricular, frontal, temporal, left amygdala and right hippocampal regions of the
brain. The important features from two different classification techniques were similar,
demonstrating the robustness of our results. Below we discuss some interesting observations on
heterogeneity of ASD brain morphometry in relation to the previous inconsistent neuroanatomical
findings and discrepancies in the classification accuracies. In addition, challenges and future
directions for neuroimaging studies on ASD prediction using brain morphometry are discussed.
4.4.1 Classification becomes difficult with increase in AS
Classification AUC was the lowest in the high-AS sub-group for both up-sampling and
down-sampling schemes; see Figure 4.1A. The results from GBM were similar and are presented
in Figure 8.1. Moreover, AUC decreased with AS when sub-groups were created for each AS
values according to both RF and GBM classification models; see Figure 8.6. This result is opposite
to that of Katuwal et al. (2015b) where it was reported that the classification accuracy increases
with Autism Diagnostic Observation Schedule (ADOS) score. AS is the standardized version of
82
ADOS (Gotham et al., 2009). This discrepancy is likely due to the difference in experiment design-
the classification model was trained across all subjects in the experiment of Katuwal et al. (2015b)
and separate classification models were trained in each sub-group in this study. In addition, only
167 ASD subjects had AS scores available and were present in AS sub-groups of this study, while
in Katuwal et al. (2015b), 361 ASD subjects were used to train classification models in leave-one-
out cross validation framework.
The decrease in AUC suggests that the most severely autistic subjects are the most difficult
to classify, perhaps because the most severely autistic subjects are the most heterogeneous. In each
sub-group, to check the heterogeneity in ASD data, we calculated the net variance as mean of the
relative standard deviation (s/mean) of all features. The net variance in ASD brain morphometry
increased with AS – 0.42, 0.47 and 0.49 for low, mid and high AS sub-groups respectively. Even
in the down-sampling scheme with demographics matched subjects, the net variance of the brain
morphometry of ASD subjects increased with AS- 0.46, 0.55 and 0.57 for low, mid and high AS
sub-groups respectively. This suggests that the brain morphology of ASD subjects becomes
increasingly dissimilar in more severe cases, reinforcing the difficulties in the classification.
In this study, we are not suggesting that the predictive models constructed for each AS sub-
group can be directly used in clinical practice. ASD subjects were sub-grouped by AS only to
investigate how ASD vs. TDC classification success and the features important for classification
change with AS. By demonstrating that the important features are very dissimilar across the
severity scores, we are suggesting that stratifying ASD subjects by severity scores might be helpful
to better understand the brain alterations in ASD subjects. In future clinical practice, improved
knowledge of brain alterations in ASD subjects will provide evidences to support currently existing
clinical diagnosis that are based on behavior.
83
4.4.2 Folding index and curvature features may be important markers for early
detection of ASD
In the young age sub-group, where the classification was performed between young ASD
and young TDC subjects, the most common top important features for classification were folding
index and curvatures (mean and Gaussian) of frontal, temporal, lingual and insular regions. The
importance of these features decreased with age and eventually in the old-age sub-group, volume
features were predominant; see Figure 4.2. The folding index and curvature features that were
important in the young-age sub-group and/or whose ASD vs. TDC group differences across all
734 subjects were statistically significant (multiple comparisons uncorrected) are presented in
Figure 4.4. All features (except Gaussian curvature of right lingual gyrus) were larger in young ASD
subjects compared to young TDC subjects. The group difference decreased with age and the
direction for 12 out of 17 features even flipped the direction (smaller in ASD) in the old-age sub-
group. This result not only demonstrates the heterogeneity in ASD brain morphometric differences
but also gives an important clue for the early diagnosis of ASD using brain morphometry. This
result suggests that the most prominent brain morphometric alterations in young ASD subjects
may be the folding index and (mean and Gaussian) curvatures of the frontal, temporal, lingual and
insular regions. A study by Nordahl et al. (2007) has also reported that cortical shape alterations
(measured by sulcal depth) were pronounced in children of age 7.5 to 12.5 years. Similarly, a study
by Auzias et al. (2014a) has reported a statistically significant and consistent pattern of shape
alterations in central, intra-parietal and frontal medial sulci in the children (18-108 months). The
study also reported that the shape descriptors of several sulci from frontal and temporal regions
and age were statistically significant. Moreover, the study also reported significant correlations
84
between the different sulcus shape descriptors and Childhood Autism Rating Scale (CARS) and
ADOS scores.
rh_superiortemporal_meancurv lh_fusiform_gauscurv
rh_rostralmiddlefrontal_meancurv rh_parsopercularis_gauscurv
rh_middletemporal_meancurv rh_fusiform_foldind
rh_inferiortemporal_meancurv rh_insula_foldind
lh_frontalpole_meancurv lh_fusiform_foldind
lh_bankssts_meancurv lh_lateralorbitofrontal_foldind
lh_lateralorbitofrontal_gauscurv lh_superiortemporal_foldind
lh_frontalpole_gauscurv lh_pericalcarine_meancurv
rh_lingual_gauscurv
age
0.5
0.4
0.3
0.2
0.1
0.0
-0.1
-0.2
-0.3
6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40

Sub-groups
Figure 4.4: Folding index and curvature features are important for classification in young
subjects
The important folding index and curvature features from the young-age sub-group and/or whose ASD vs.
TDC group differences across all subjects were statistically significant (multiple comparisons uncorrected)
are presented. The features are mainly from frontal, temporal, lingual and insular region, and are larger in
85
ASD. However, the group differences decrease with age and even the direction of the group difference flips
for 12 out of 17 features.
Knowledge of brain morphometric differences in young ASD subjects is comparatively
more valuable than that in older ASD subjects. Successful identification of robust brain biomarkers
for ASD diagnosis in young patients would allow early intervention, likely increasing the success of
ASD treatment. Our results show that the shape alterations of frontal, temporal, lingual, and
insular brain regions are important to classify ASD from TDC in young children. For this reason,
the shape of these brain regions merit special attention. However, most of the previous studies on
ASD brain alterations are based on volume, area, and thickness features and very few studies are
based on shape features such as curvature, folding index, and sulcal depth. We therefore
emphasize the need to include less conventional features of brain morphometry in order to improve
existing classification procedures for ASD.
4.4.3 ASD heterogeneity in brain morphometry
We were able to increase the classification AUC up to 0.68 from 0.61 by adding AS, VIQ
and age with the brain morphometric features. This shows that the brain morphometry and DB
measures have some non-overlapping information. In addition, when the subjects were sub-
grouped by AS, VIQ, and age, and the classification models were trained in each sub-group, there
was significant improvement in classification performance. The important features for
classification and the strength and the directionality of the ASD vs. TDC group difference in the
important features highly varied across the sub-groups. This variability was presented in detail in
section 4.4. Previous studies have also reported variability in brain alterations with factors such as
age, gender, handedness etc. A study by M.-C. Lai et al. (2013) reported that the brain regions
affected in ASD males and ASD females have little overlap. Similarly, Floris et al. (2013) has
reported strong rightward lateralization in the posterior and anterior mid-body of corpus callosum
86
demonstrating that the pattern of the lateralization is strongly depended on the handedness of the
subjects with ASD. A recent study by Lin et al. (2015) reported that the regional brain volume
differences between ASD and TDC males are highly age-dependent. Through the age-stratified
analyses, they showed that the patterns in GM and WM volumetric alterations in ASD are distinct
among the subsamples of children, adolescents and adults. Therefore, the results from this study
and the previous studies support the hypothesis that brain alterations in ASD are highly
heterogeneous across the ASD population.
4.4.4 Low classification success in large multi-site data due to ASD heterogeneity
The heterogeneous nature of the brain morphometry in ASD partially explains the
discrepancy in the predictive performances reported by the two groups of previous studies: small
sample size studies reporting high classification accuracies and the studies using the large multi-site
ABIDE dataset reporting low classification accuracies. This is counterintuitive to the general idea
that the generalized classification performance increases with the training sample size (Sordo &
Zeng, 2005). Large number of subjects from the multi-site datasets such as ABIDE provide more
information about brain morphometry than that from small sample size. However, the amount of
variance added due the ASD heterogeneity can surpasses the extra information gained with the
increase in sample size. As the heterogeneity of the data increases, the training, validation, and
testing folds used for estimating generalized predictive performance of a model become
increasingly dissimilar to each other. As a result, the model trained using training and validation
folds performs poorly in the test fold, hence decreasing the generalized predictive performance
estimated by cross validation. This explains the low accuracies (<60%) achieved by the previous
studies using the ABIDE dataset. A much larger dataset is required so that the information gained
from the increase in sample size is greater than the increase in variance introduced by the
87
heterogeneity of ASD. Larger standardized datasets easily accessible to the research community
would, therefore, be highly valuable. On the other hand, subjects collected in a site and matched
for factors such as age, sex, IQs etc. are relatively homogenous. The training, validation, and
testing folds are more similar to each other, hence, the predictive performances of the models
estimated by cross validation are larger. This explains the high accuracies achieved by the previous
studies using small well-matched data. In addition, although all previous studies using small sample
size have reported the use of cross validation to estimate the predictive performance, many of them
have not explicitly mentioned that the feature selection and classification steps were done under
the same cross validation framework. When the two steps are under different cross validation
framework i.e. when the feature selection is performed using all data, there will be a data leak. The
data leak results in an over-fitted model whose predictive performance cannot be generalized with
confidence. A recent study by Katuwal et al. (2015b) has reported that the high classification
accuracies reported in some of the previous studies might be due to over-fitted models caused by
data leak. In addition, the study demonstrated that the amount of over-fitting and hence the
overestimated predictive performance increases with the decrease in sample size. Another reason
for high classification accuracies in small datasets may arise due to manual feature engineering
performed specific to the dataset. Caution should be taken to interpret the manually crafted
features used in these studies. The interpretation of the important features for classification would
be more relevant and meaningful with respect to the subjects’ DB measures such as age, sex, IQs,
handedness etc.
4.4.5 Future direction for neuroimaging studies
Developing a single successful classification model to predict ASD type and severity using
brain morphometry is likely the ultimate objective of the research on ASD diagnosis using brain
88
morphometry. However, there are a few limitations and road blocks to be overcome before this
objective can be achieved. First, it requires a much larger standardized data set than currently
available datasets such as ABIDE. Second, based on the results shown here, it might not provide
insight into the neuroanatomical basis of ASD as it would be difficult to perform exploratory
analysis across large heterogeneous subjects compared to that in distinct homogenous sub-groups.
4.4.5.1 Divide and conquer: focus on smaller distinct homogenous sub-groups
ASD as currently diagnosed is a collection of autisms. It is highly heterogeneous in its
etiology, comorbidity, pathogenesis, genetics, severity, and brain morphology (Betancur, 2011;
Happé et al., 2006; Jeste & Geschwind, 2014; Lenroot & Yeung, 2013). As shown in Figure 4.2
and Figure 4.3, brain alterations in ASD compared to TDC are highly variable across AS, VIQ,
and age. For highly heterogeneous conditions such as ASD, it is very hard to find a robust global
brain biomarker. A better alternative may be to focus on the relatively more homogenous smaller
sub-groups defined by several criteria such as age, sex, IQs, handedness, severity, etc. Several
previous studies have suggested the same (Lenroot & Yeung, 2013; Volkmar et al., 2009). Dividing
the ASD population into distinct sub-groups provides more exploratory power to the study and
provides deeper insights into the anatomical alterations in ASD. The brain biomarkers identified
by this technique albeit local with respect to some DB measure, are more robust and provide
valuable insight on their effects in relation to the respective DB measure. For example, when the
subjects were grouped by age and classification was done separately, folding index and curvature
features were predominant in younger subjects (see Figure 4.2C and Section 4.3.4). ASD detection
in young age is more desirable as it would provide more time for early intervention. So, for the
classification in young subjects, one approach might be to focus on the folding index and curvature
based features from insular, fusiform, frontal and temporal regions. Similarly, although the
89
ventricular volumes were larger in ASD and were important features for classification, the ASD
vs. TDC group difference sharply decreased with VIQ; in subjects with high VIQ, the
directionality of group difference even reversed for some ventricular volumes; see Figure 4.5. This
result suggests that the ventricle volumes may be robust biomarkers in the low and normal VIQ
population but certainly not for the high VIQ population. Thus, brain biomarkers identified in
distinct sub-groups will be more robust and insightful and hence more helpful to understand the
neuroanatomical basis of the ASD heterogeneity.
In addition to the sub grouping of the ASD population, we propose adding the DB
measures of the subjects with the MRI features to train multivariate machine learning models. This
data-driven automatic approach would help to identify the relationship of DB measures with the
multi-variate brain morphometry and hence provide better insights on brain alterations in ASD.
For example, identifying the robust relationship between age and multi-variate patterns in brain
morphometry would be highly valuable to the understanding of the pathogenesis of ASD.
Eventually, this could make MRI a powerful tool for early detection of ASD.
90
Left.Lateral.Ventricle_volume X3rd.Ventricle_volume
Left.Inf.Lat.Vent_volume X4th.Ventricle_volume
Left.choroid.plexus_volume X5th.Ventricle_volume
Right.Lateral.Ventricle_volume TotalVentricular_volume
Right.Inf.Lat.Vent_volume EstimatedTotalIntraCranialVol_volume
Right.choroid.plexus_volume
VIQ
1.5
1.0
0.8
0.5
0.3
0.2
0.1
0.0
-0.1
-0.2
75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150
Sub-groups
Figure 4.5: ASD vs. TDC group difference in ventricular volumes decreases with verbal IQ
(VIQ)
The ventricular volumes were normalized by total intracranial volume (TIV). Across all subjects, except 3rd
and 4th, all ventricles and TIV were larger in ASD. When the subjects were sub-grouped by VIQ, the group
differences were the largest in the low-VIQ sub-group but decreased with VIQ. For some ventricular, the
direction of group difference even flipped in the high-VIQ sub-group i.e. volumes were larger in TDC for
some ventricles.
91
4.4.5.2 Need for better features and better methods
The morphometric features estimated using the preprocessing tools may not accurately
reflect the underlying morphology for several reasons. First, T1-weighted MRIs do not have
enough information to distinguish between all the brain structures. For example, they do not have
enough contrast between inter-sulcal CSF and skull. Second, preprocessing tools make several
assumptions which do not always hold in all subjects, leading to erroneous estimates. For example,
Morey et al. (2009) showed that the correlation of the amygdala volume from manual tracing with
the volumes estimated from FS and FSL were only 0.56 and 0.24 respectively. This suggests that
the uncertainty or the noise added during the automatic extraction of brain features is one of the
major obstacles to automatic ASD detection using brain morphometry. So, the development of
better automatic preprocessing tools is needed. In addition, preprocessing tools can be optimized
to utilize data fusion techniques to decrease the uncertainty in the estimation of brain features. For
example, T1 and T2-weighted images contain complementary information which can be utilized
for more robust estimation of brain features.
In this study, we utilized regional-level features i.e. values describing the morphometric
features of the brain regions. There may be hindrances in achieving high classification performance
through the exploration of these features, such as the need for impractically large sample sizes,
suggesting directions for future work. Features from different spatial scales can be unified to
improve the predictive performance of the classification using MRI. In addition, shape features
can be incorporated in the classification model. Moreover, hierarchical architecture such as
convolution neural network (CNN) (LeCun et al., 2015) can be applied for automatic identification
and extraction of features. This may help to improve the predictive performance and may also
provide better insights into the neuroanatomical underpinnings of ASD due to the hierarchical
92
nature of its features. Additionally, it would avoid the use of preprocessing tools and hence the
uncertainty associated with them.
4.5 Conclusion
This study demonstrated that brain alterations in ASD are highly heterogeneous, and the
heterogeneity makes the understanding and diagnosis of ASD using brain morphometry a
challenging problem. We showed that the heterogeneity can be mitigated when the demographics
and behavioral (DB) measures such as autism severity, VIQ and age are utilized in conjunction
with brain morphometric features, hence making the problem of understanding and diagnosis of
ASD easier. Utilizing DB measures, the ASD vs. TDC classification success rate was significantly
improved. Focusing on relatively homogenous sub-groups of ASD by sub-grouping the subjects
according to autism severity, VIQ and age, interesting and valuable relationships between these
DB measures and brain morphometry were observed. The heterogeneity in ASD brain
morphometric differences demonstrated in this study explains the inconsistent neuroanatomical
findings in ASD and the low classification success in large multi-site data. The results of this study
suggest that identifying brain biomarkers in relatively homogenous sub-groups defined by different
measures such as age, IQs, severity, sex, handedness etc., compared to identifying markers across
the whole ASD population, is easier and provides better insights on neuroanatomical
underpinnings of ASD.
93
5 CHAPTER 5: EARLY DETECTION OF AUTISM
USING BRAIN MORPHOLOGY
A comprehensive investigation of early brain alterations in ASD is critical for
understanding the neuroanatomical basis of ASD and for establishing methods for early diagnosis.
Most previous brain imaging studies in ASD, however, are based on children older than 6 years—
well after the median age of ASD diagnosis (46 months). In this study, we use brain images that
were collected as part of routine clinical scans from patients who were later diagnosed with ASD.
Using 15 subjects with ASD and 18 control (CTR) subjects of age 3 to 4 years, we perform a
comprehensive comparison of different brain morphometric and image intensity features. We find
that, although TIV of ASD was 5.5% larger than CTR, brain volumes of many other brain areas
(as a percentage of TIV) were smaller in ASD and can be partly attributed to larger (>10%)
ventricles in ASD. The folding indices of 58 of 68 cortices were higher in ASD indicating increased
gyrification. The larger TIV in ASD was related to larger surface area and increased amount of
cortical folding but was relatively independent of cortical thickness. Further, predominately in the
frontal and temporal regions the white matter was less bright suggesting myelination deficit. We
achieved 95% AUC in ASD classification using all brain features. When the classification was
performed separately for each brain feature type, image intensity yielded the highest predictive
power (95% AUC), followed by folding index (69%), volume (69%), and surface area (68%). The
high degree of success in prediction indicates that application of advanced analytic methods on
brain features holds promise for aiding early identification of ASD. To our knowledge this is the
first study to leverage a clinical imaging archive to investigate early brain markers in ASD.
94
5.1 Introduction
Early detection of ASD is important for three main reasons. First, early detection allows for
the application of early intervention methods. It has been shown that early intervention is effective
in reducing the impact of impairments (Dawson et al., 2010) and may result in more positive long-
term outcomes for the child (Pickles et al., 2016; Rogers & Vismara, 2010). Second, the early
detection of ASD can improve our knowledge of ASD etiology by separating out the effects of post-
natal environmental risk factors of ASD. The risk for ASD is influenced by genetic and pre, peri,
and post-natal environmental factors (Sandin et al. 2014). The environmental factors related to the
risk for ASD can interact with genetic factors making the identification of etiology of ASD
extraordinarily complex. So, if ASD can be detected early very few post-natal risk factors would
come into play and hence easier to understand its etiology. Third, ASD detection may be relatively
easier at younger age because of the less complex ASD etiology and manifestations.
Currently ASD diagnosis is based on a clinical assessment of the individual's behavior and
intellectual abilities. This approach is limited, however, as early diagnosis is not straightforward.
According to a recent report by CDC (2014), the median age of ASD diagnosis is 46 months. In
addition, behaviorally based diagnosis procedures can be subjective, time consuming, and
inconclusive due to factors such as comorbidity (Close et al., 2012). In addition, such an approach
does not provide insight into the neural underpinnings and the underlying etiology of ASD since
it is based only on behavioral symptoms. MRI is a non-invasive tool widely used to capture brain
morphometry. ASD diagnosis based on MRI can be objective and can potentially be utilized even
at the prenatal and neonatal stage (Glenn, 2010) and hence can be a useful tool for brain biomarker
discovery for early detection of ASD. MRI has already been successfully utilized for early diagnosis
of brain disorders such as Alzheimer’s (Frisoni et al., 2010; Hampel et al., 2010).
95
Most MRI studies that investigate brain alterations in ASD have used subjects older than
6 years; very few studies’ participants are in early childhood (<4 years) (Auzias et al., 2014b; Hazlett
et al., 2017; Nordahl et al., 2012). Of the studies using subjects less than age 4 years, only a small
subset of morphometric features such as volume and shape have been investigated. As a result,
there is a significant knowledge gap in brain imaging research on early brain alterations of ASD.
A comprehensive investigation of brain alterations in early childhood population with ASD is
needed to better characterize this disorder and merits special attention due to the importance of
early detection of ASD.
In this study, we compare brain features of ASD subjects in early childhood (3 to 4 years)
to non-ASD (CTR) subjects of the same age group using a comprehensive set of morphometric
and intensity features of brain cortical and sub-cortical structures. Multi-variate brain anomalies
are identified in a purely data-driven fashion by machine learning models trained for ASD vs. CTR
classification using the morphometric and intensity features.
5.2 Material and Methods
5.2.1 Subjects
We used clinical imaging archive of Geisinger Health System in this study. Initially, we had
access to the brain images of 413 CTR and 167 ASD subjects who were less than 4 years of age.
In total, there were 777 coronal 3D T1 images (631 from 413 CTR and 146 from 167 ASD); note
that some subjects did not have coronal 3D T1 images. The flowchart of subject selection process
is presented in Figure 5.1. We performed a strict data quality check to remove images with imaging
artifacts, motion, lesions, and abnormally large ventricles. The remaining images (247 from CTR
and 122 from ASD) were processed using FS recon-all workflow with default settings. FS
segmentation failed for 12 images (5 from CTR and 7 from ASD) and were excluded from the
96
study. We then removed 120 images of CTRs mental disorders identified using ICD 9 codes. This
resulted in a total of 112 images from CTR and 115 from ASD subjects. Since ASD is highly
prevalent in males (Fombonne, 2005) and also to avoid gender confounds (Lai et al., 2013), 54
female subjects (20 CTR, 34 ASD) were excluded.
The lower threshold for age was decided based on the trade-off between the need to
investigate the brain morphological alterations in ASD as early as possible and the applicability of
FS on the brain images of very young subjects. FS which was initially designed for adult subjects
may not provide truthful results for subjects younger than 3 years. Images of subjects as young as
3 years have been successfully segmented by FS (Retico et al., 2016). So, we decided to use the
subjects as young as 3 years. The upper threshold was chosen as 4 years based on the trade-off
between the age dependent brain alterations in ASD (Lin et al., 2015) and sample size
requirements. Finally, we used 41 images (23 from CTR, 18 from ASD) from male subjects of age
3 to 4 years.
Figure 5.1: Subject Selection Flowchart

Initially we had access to 777 coronal 3D T1 images from Geisinger Health System clinical archive; note
that some subjects did not have coronal 3D T1 images. After excluding inferior quality images, images for
which the segmentation failed, controls with mental disorders, female subjects, subjects outside the age
group of 36 to 48 months, 33 images were retained at the end and were used for further analyses.
97
5.2.2 Brain Features Extraction
Brain features were extracted using the recon-all workflow of FS v. 5.3.0 (Dale et al., 1999a;
Fischl et al., 1999a, 2002; Ségonne et al., 2004). The volume of 40 sub-cortical structures from Aseg
atlas (Fischl et al., 2002) were extracted. Similarly, volume, surface area, Gaussian curvature, mean
curvature, folding index, curvature index, thickness mean, thickness standard deviation (std.),
intensity mean, and intensity std. of 34 cortical structures from Desikan-Killiany atlas (Desikan et al.,
2006) were extracted. In total, 687 brain morphometric and intensity features were derived for
each image. Intensity mean and intensity std. of a brain structure is the mean and standard
deviation respectively of the voxel intensities within a certain brain structure. The volume features
of each subject were normalized by TIV since relative volumes are easy to compare and are more
robust against scanner effects (Takao et al., 2011). To eliminate the effects of image intensity biases,
intensity and intensity standard deviation features of each image were normalized by their
respective sums across all the brain structures.
The cortical and sub-cortical segmentation results of FS were assessed using the ENIGMA
protocols (http://enigma.ini.usc.edu/protocols/imaging-protocols/). Five CTR images and one
ASD image with imperfect segmentations were excluded from the study. After that, two subjects
had two images each and the image with lower quality was excluded for each subject. The
remaining 18 CTR and 15 ASD images were used in the subsequent analyses. Finally, a group of
15 males with ASD (42.6 ± 3.5 months; age range = 37.7 to 47.3 months) and 18 CTR males (41.4
± 2.9 months; age range = 37.0 to 47.0 months) were used in this study. There was no difference
in the age distribution of ASD and CTR subjects (Kolgmogorov-Smirnov statistic = 0.23, p-value
= 0.76).
98
5.2.3 Univariate Analysis
The statistical significances of ASD vs. CTR brain feature differences were estimated using
two sample t-tests. For each feature type, multiple comparisons correction was performed using
false discovery rate (Benjamini & Hochberg, 1995). The effect size of the differences were
quantified by Cohen’s d (Baron-Cohen et al., 2000).
5.2.4 Multivariate Analysis: ASD vs. CTR Classification
ASD vs. CTR classification was performed by RF (Breiman, 2001) classification models
trained with brain morphometric and intensity features. The classification flow chart is presented
in Figure 5.2. At first, the models were built with all 637 brain features. In addition, separate models
for each feature type were trained. Area under the ROC curve (AUC) was used to measure the
success of classification and the average classification success was estimated by 5-fold cross-
validation with stratified folds. Scikit-learn 0.17.1 (Pedregosa & Varoquaux, 2011) was used to
perform all the multivariate analyses.
‘Information gain’ measure was used to measure the quality of a node split while growing
decision trees. The following hyper parameters were used for optimal model selection: max_depth,
the number of features used to judge the quality of the impurity of a node, min_samples_split, the
minimum number of samples required in a node for further splitting, and min_samples_leaf, the
minimum number of samples required to be at a leaf node. Hyper parameter optimization was
performed through random search (Bergstra & Bengio, 2012).
99
Figure 5.2: Classification Flowchart.
Classification flowchart. Independent experiments were performed for each feature type. For each
classification experiment, the model success was quantified by AUC metric and the average model
assessment was estimated by 5–fold cross-validation. The 5-folds were generated by stratified sampling to
ensure dissimilarity between the folds. Each classification was performed using RF classifier where the node
purity was quantified by ‘Information Gain’. The optimum hyper parameters (max_depth, min_samples_split,
min_samples_leaf) were estimated using random search. Feature importance scores across the 5 folds were
averaged.
5.3 Results
5.3.1 ASD vs. CTR Brain Differences
The brain features for which ASD vs. CTR differences were statistically significant (p <
0.05, uncorrected) are presented in Figure 5.3. Effect sizes of all the differences were moderate to
large (Cohen’s d > 0.5); see Figure 9.1 in Appendix C.
100
Figure 5.3: Statistically Significant Brain Alterations.
Brain features with statistically significant differences (at p =0.05, uncorrected for multiple comparisons).
The numeric text near the points denote p-values.
Area: Statistically significant area features were mainly located in frontal, temporal,
supramarginal, and posterior cingulate regions. Except for the area of the left temporal pole, the
other 8 statistically significant area features were larger in ASD. The thickness mean of entorhinal
gyrus was greater in ASD and the effect size was large (d > 1; p = 9E-4) whereas the thickness std.
101
of the right entorhinal gyrus was smaller in ASD (d = 0.75; p = 0.035). Similarly, thickness std. of
the right superior frontal, fusiform, and caudal middle frontal gyri were larger in ASD (d > 0.60).
The total white surface area (GM/WM surface) was 7.2% larger in ASD without statistical
significance (d = 0.53, p = 0.14).
Volume: Most of the volumes with statistically significant differences were smaller in ASD
except for the right postcentral, supramarginal, and 5th ventricle volumes. However, one should
note that these reported volume features have been normalized by TIV. Raw volumes of these
structures were in fact larger in ASD but the differences were not statistically significant. This
discrepancy is mainly because TIV in ASD was larger than in CTR by 5.5% (d = 0.4, p = 0.17);
see Figure 5.4. Larger TIV in ASD was mainly due to the larger ventricles. All ventricles were
larger (>10%) in ASD, in particular the 5th ventricle was 290% larger in ASD (p = 0.015). Total
ventricular CSF volume was 27.89% larger in ASD (d = 0.38, p = 0.28) and after normalizing by
TIV, it was 19.14% larger (d = 0.29, p = 0.42). All global raw volumes were larger in ASD, but
when normalized by TIV, most of them were smaller in ASD. After normalizing, cerebral GM
was slightly larger in ASD whereas cerebellum GM was 5% smaller in ASD (d = -0.45, p = 0.20)
and cerebral WM was 2.1% smaller in ASD (p = 0.57) whereas cerebellum WM was 3.5% larger
in ASD (p = 0.6).
Folding index and curvature index. The folding index and curvature index features
with statistically significant differences were from frontal, temporal, cingulate, postcentral, and
precuneus regions and all were larger in ASD. Folding indices of 58 out of 68 cortices were higher
in ASD. The average amount of cortical folding (average of the folding indices of the cortices) was
12.7% greater in ASD with statistical significance (ASD = 62.2, TDC = 55.2, d = 0.71, p = 0.05).
The differences in curvature index of the left posterior cingulate gyrus and the folding index of the
102
right precuneus gyrus were large (d >1). Similarly, the Gaussian curvature of the left posterior
cingulate gyrus and left pericalcarine gyri were larger in ASD.
Intensity. In image intensity mean features, there were two distinct type of differences—
in general, intensity mean in WM regions were greater in ASD while intensity mean in sub-cortical
regions were lower in ASD. Intensity mean of the WM structures near frontal, supramarginal,
precuneus, precentral, and pars opercularis gyri were lower in ASD. In contrast, intensity mean of
the right thalamus, left caudate, left putamen, and left accumbens were higher in ASD. In general,
intensity std. of WM near gyri was higher in ASD whereas, intensity std. in cerebellum WM and
corpus callosum (not significant) were smaller in ASD.
103
●
40 d=0.43
p=0.23
●
30 d=0.34
p=0.34 ●
d=0.38
p=0.28
volume type
● ●
d=0.32 ● raw
ASD−CTR %
20 d=0.26
p=0.37 ● ●
p=0.46
d=0.5
d=0.29 ● normalized
p=0.42
p=0.16
●
● d=0.24
● ASD > CTR
d=0.17 p=0.51 ●
10 ● p=0.63
d=0.31 ● CTR > ASD
d=0.43 p=0.38
p=0.22
● ●
d=0.49 d=0.51 ●
p=0.17 p=0.15 d=0.41 ●
● d=0.18 ●
d=0.21
p=0.25 d=0.19
0 ● ● p=0.59 p=0.6 p=0.54
d=0.09 d=0.02 ●
p=0.8 p=0.96 d=−0.14 ● ●
d=−0.17 ●
p=0.69 d=−0.2 d=−0.21
p=0.57 p=0.63
● p=0.56
d=−0.45
p=0.2 ●
d=−0.61
−10 p=0.09
e
e
m
um
um
um
um
um
m
lu
lu
lu
lu
lu
lu
lu
lu
ol
ol
ol
ol
ol
vo
vo
vo
vo
vo
_v
_v
_v
_v
_v
__
__
__
__
__
__
__
_
r_
e_
_
IV
la
le
le
le
SF
te
te
te
te
te
te
cl
da
r ic
ric
ric
eT
at
at
at
at
at
at
tr i
−C
nt
nt
nt
yg
M
−M
en
ar
Ve
Ve
Ve
y−
y−
y−
te
Am
−V
te
te
ul
hi
d−
h−
ra
ra
ra
hi
hi
ric
l
al
ft−
lW
ra
−G
−G
3r
4t
−W
er
nt
te
l−
Le
ra
l−
at
l
ve
La
ra
ta
eb
ta
−L
llu
To
l−
eb
lu
ft−
er
To
e
ta
ht
el
er
C
eb
Le
To
ig
eb
C
er
R
er
C
Figure 5.4: Global Volume Features

Global Volume Features. TIV was larger in ASD and was mainly due to abnormally large ventricles.
Normalized global volumes were smaller in ASD.
5.3.2 ASD vs. CTR Classification
ASD vs. CTR Classification AUC scores for each feature type are presented in Figure 5.5.
The point and error bar represent the mean and standard deviation of the AUC scores from 5 test
folds.
104
Figure 5.5: ASD vs. CTR Classification Scores.
Mean classification AUC for all brain features and each feature type across 5 folds of testing. The error bar
represents the standard deviation of the AUC scores of 5 folds.
An average AUC of 0.92 was achieved under 5-fold cross-validation when all brain features
were used. When classification was performed separately for each brain feature type, we found that
most of the predictive power came from intensity mean (AUC = 0.83), folding index (AUC=0.69),
volume (AUC=0.69), and area (AUC=0.69) features. When intensity mean and intensity std.
features were used together, AUC of 0.95 was achieved. Thickness features yielded near chance
classification success rates.
5.3.3 Important Features for ASD vs. TDC Classification
The important features in classification, only for the feature types which yielded high
AUCs—intensity mean, intensity std., folding index, volume, and area—are presented. See Figure
5.6 for important brain features when all features were used for classification (ALL). See Figure 5.7
for important brain morphology features, and see Figure 5.8 for important brain intensity features.
After sorting the importance scores of the features in descending order, the features whose
cumulative sum of scores was at least 50% of the total importance scores were deemed as important
for classification. The length of the bar corresponding to a feature is proportional to its contribution
105
for the classification, relative to the most important feature. The numbers at the left of the bars
represent ASD-CTR Cohen’s d and the stars represent statistical significance at 0.05 (*) and 0.01
(**).
In classification using all features, the most important feature was the intensity mean of the
WM neighboring the rostral middle frontal gyrus. Intensity means of WM connecting left temporal
pole and right caudal middle frontal gyrus, were also important for classification and were smaller
in ASD. Other important features were volumes of the left acumbens-area, the right ventral
diencephalon, the left and right putamen, the left inferior temporal gyrus, etc.
Volume. In classification using volume features (Figure 5.7a), volumes of the left putamen,
left inferior temporal gyrus, and left accumbens-area were important for classification and all were
smaller in ASD. One should note that these volume features have been normalized by the TIV of
respective subject.
Area. In classification using area features (Figure 5.7b), the surface areas of the left middle
temporal gyrus, left posterior cingulate gyrus, right bank of superior temporal sulcus (bankssts), and
right rostral middle frontal gyrus were important for classification. All important area features were
larger in ASD.
106
Figure 5.6: Important brain feature for ASD vs. CTR classification
The length of the bar corresponding to a feature is proportional to its contribution towards classification.
The numbers at the left of the bars represent ASD-CTR Cohen’s d (positive is represented by red) and the
stars represent statistical significance at 0.05 (*) and 0.01 (**).
Folding index. In classification using folding index features (Figure 5.7c), the left middle
temporal gyrus was the most important structure for classification. It was also the most important
structure while performing classification using area features. Other important structures were left
middle temporal, left rostral middle frontal, and left inferior parietal. All important folding index
features were larger in ASD.
107
Figure 5.7: Important morphology features for ASD vs. CTR classification.
108
Figure 5.8: Important intensity features for ASD vs. CTR classification.
Intensity mean: In classification using image intensity mean features (Figure 5.8a), the
most important feature was the intensity mean of the WM neighboring rostral middle frontal gyrus
109
and compared to CTR, it was lower in ASD (d = 0.77, p = 0.04). Note that the same region was
the most important feature in the classification using all features. The intensity means of the WM
neighboring the right caudal middle frontal, left temporal pole, and the right pars opercularis were
also important for classification and were smaller in ASD. Other important brain structures were
the left accumbens-area, WM hypo-intensities, and right vessel.
Intensity std. In classification using image intensity std. features (Figure 5.8b), the most
important feature was the intensity std. of the WM hypo-intensities. Most of the important brain
features were from WM regions.
5.4 Discussion
High ASD vs. CTR classification success using brain morphometric and intensity features
were achieved in this study. The important features for classification were mainly from frontal and
temporal regions and these regions have been consistently associated with ASD (Bigler et al., 2007;
Ha et al., 2015). Most of the discriminative or predictive power for classification came from the
intensity features followed by folding index, volume, and area. In summary, three main brain
alterations in ASD were noted: abnormally large ventricles, higher gyrification, and less intensity
mean in WM of frontal and temporal region. This is the first study to perform ASD vs. CTR
classification in a very young population (3 to 4 years) using comprehensive brain features and to
achieve high classification success rates.
Very few studies have performed similar investigations to identify brain alterations in ASD.
Xiao et al. (2016) achieved an AUC of 0.88 while classifying ASD vs. subjects with developmental
delay using cortical thickness features. However, in our study thickness features yielded near
chance classification success. This may be due to the fact that their comparison control group had
developmental delay.
110
5.4.1 Amygdala in ASD
The amygdala is one of the most studied brain structure in relation to ASD, mainly
motivated by the amygdala theory of ASD (Baron-Cohen et al., 2000). Our study found smaller
amygdala volume in ASD without statistical significance: left(1%), right (2.8%) for raw volumes
and left (7.5%), right (5.9%) for normalized volumes. Nordahl et al. (2012) has reported the
opposite, the amygdala being 6% and 9% larger in ASD during 2 to 4 years.
5.4.2 Early brain overgrowth in ASD
One of the most replicated findings in ASD is that toddlers with ASD (age 2–4 years) on
average have a larger head size than TDC (Carper et al., 2002; Courchesne et al., 2011b, 2011a;
Campbell et al., 2014; Hazlett et al., 2012). A recent study by Hazlett et al. (2017) also reported
5.5% larger TIV in high risk ASD compared to negative high risk ASD at the age of two years.
Our study also shows 5.5% larger TIV and 7 % in larger cortical surface area in ASD.
The larger TIV in ASD in our study was mainly due to the larger ventricular volumes in
ASD (27.9 % raw, 19.1% normalized) whereas other normalized global volumes were smaller in
ASD. Padilla et al. (2015) has also reported smaller total volumes of temporal, occipital, insular,
and limbic regions in ASD after adjusting for total brain volume.
5.4.3 Overgrowth is related to increase in cortical surface area but not thickness
In our study, cortical surface area was 7.2% larger in ASD compared that of controls. Eight
out of nine area features with statistically significant differences were larger in ASD. However,
there was only one statistically significant thickness feature. In addition, area features yielded an
AUC of 69% whereas thickness features yielded near chance classification success. A study by
Hazlett et al. (2011) has also reported no differences in cortical thickness but larger surface area in
ASD at both 2 years and 4.5 years. Moreover, a recent study by Hazlett et al. (2017) has also
111
reported 7% larger cortical surface area in ASD but no difference in cortical thickness.
Importantly, they also report significant contribution of area features but insignificant contribution
of thickness features in ASD classification. These results suggest that the early brain overgrowth in
ASD brain is related with the increase in cortical surface area but is relatively independent to the
cortical thickness.
5.4.4 Higher cortical folding in ASD brains and its relation to early brain overgrowth
and cortical expansion
Folding indices of most of the gyri (58 out of 68) were greater in ASD and folding index
features yielded AUC of 0.69. The cortex of ASD brains were 12.7% more folded in average. This
suggests that ASD brains generally exhibit more gyrification and the amount of cortical folding is
a potential biomarker for early detection of ASD. Similar to our study, Ecker et al. (2016) have
reported that ASD individuals had a significant increase in gyrification around the left pre and
post-central gyrus. A study by Katuwal et al. (2016c) reported that the folding index features from
frontal, temporal, lingual, and insular regions were important in ASD vs. CTR classification for
subjects aged 6-12 years old. Similarly, a study by Auzias et al. (2014) reported higher folding of
the right intraparietal, the left medial frontal, and the left central sulci in children with ASD.
The higher amount of cortical folding in ASD may be the aftereffect of early brain
overgrowth in ASD toddlers. We hypothesize that higher folding in ASD could be due to larger
compressive stress in the cortex produced by the tangential hyper-expansion of the cortical layer.
We also hypothesize that the abnormally larger ventricles in ASD induces additional compressive
stress in the cortex and hence more cortical folding. A popular hypothesis on brain cortical
gyrification is that the tangential expansion of the cortical layer relative to sublayers generates a
compressive stress, leading to the mechanical folding of the cortex (Tallinen et al., 2014, 2016); see
112
Figure 5.9. This hypothesis has been substantiated by both physical and numerical models of the
brain and has shown that gyrification increases with brain growth (Tallinen et al., 2016).
Figure 5.9: Cortical expansion theory on cortical gyrification

The tangential expansion of the cortical layer relative to sublayers (white matter) generates a compressive
stress in cortex (GM), leading to its mechanical folding (Tallinen et al., 2014, 2016)
A recent study by Hazlett et al. (2017) which reported very similar findings to our study in
TIV and total surface area of the cortex, also reports that the rate of cortical surface area expansion
significantly increased in individuals with ASD from 6 to 12 months. Most importantly, they also
report the association between the subsequent brain overgrowth and the emergence of social
deficits. These results support our understanding that the hyper-expansion of the cortex and
subsequent brain overgrowth is related to ASD. In this study, in addition to relating hyper-
expansion of the cortex to ASD, we also find that the hyper-expansion may be associated with the
increased cortical folding we observe in ASD. We found that there was a correlation of 0.68 (p =
0.005) for ASD and 0.78 (p = 0.0002) for TDC between the average amount of cortical folding
and total surface area of the cortex. Similarly, there was a correlation of 0.53 (p = 0.04) for ASD
and 0.65 (p = 0.03) for TDC between the average amount of cortical folding and TIV.
113
5.4.5 Larger Ventricles and extra-axial volume in ASD may cause additional cortical
folding
We suggest that the increase in cortical folding may not only be due to the cortical hyper-
expansion; see Figure 5.10. In addition to the hyper-expansion, the abnormally larger ventricles in
ASD produce additional mechanical stress in the cortex, thereby introducing additional folds
according to the model proposed by Tallinen et al. (2014). In this study, we note that the correlation
between average amount of cortical folding and total ventricular CSF was 0.56 (p = 0.02) for ASD,
but was -0.13 (p = 0.6) for TDC. The statistically significant positive correlation in ASD suggests
the possibility of some degree of cortical folding may be due to the larger ventricles. In contrast,
the non-significant correlation in TDC suggests that the cortical folding is not affected by the
ventricles because they are of normal sizes and hence do not extend additional compressive stress
to the cortex.
Furthermore, it is possible that the larger volume of extra-axial fluid (CSF in the sub-
arachnoid space) in ASD also contributes to the compressive stress in the cortex and hence more
folding. A study by Shen et al. (2013) has reported that extra-axial fluid in ASD was 25% more
than in low-risk typical infants at the age of 6 to 24 months. In our study, we could not perform
the analysis for the extra-axial fluid volume because FS does not output the measure since T1 MRI
does not contain enough information to discriminate between CSF and air as both appear dark.
114
Figure 5.10: Higher cortical folding in ASD brain.
Higher amount of cortical folding in ASD brain may be the aftereffect of the hyper expansion of cortex,
larger ventricles, and more extra-axial volume.
5.4.6 WM in frontal and temporal regions less myelinated in ASD
Image intensity mean of the WM neighboring frontal and temporal regions were smaller
in ASD and some of these features were the most important for classification. Less image intensity
in WM means it is less bright suggesting less myelination surrounding the axons. In addition, an
AUC of 0.83 was achieved with intensity mean features and an AUC of 0.95 was achieved
combining intensity mean and intensity std. features. Most importantly, most of the important
features in all of these cases were WM neighboring frontal and temporal regions. This suggests
myelination deficits in frontal and temporal regions may be a potential early brain marker of ASD.
Several previous studies have also reported WM myelination deficits in ASD (Croteau-
Chonka et al., 2016; Peters et al., 2012; Zinkstok et al., 2012). Zinkstok et al. (2012) reported that
individuals with ASD had significantly less myelin content in numerous brain regions and WM
tracts. In addition, they have reported that the observed myelination deficit increased with autism
115
severity. Similarly, Peters et al. (2012) have reported myelination deficits in WM in autistic brains
of age 0.5 to 25 years. A study by Lazar et al. (2014) has reported that in ASD males of 18 to 25
years old, the axonal water fraction (a measure of axonal caliber and density) and intra-axonal
diffusivity were significantly lower and were associated with reduced processing speed of the brain.
All of these results suggest that brain disconnectivity resulting from insufficient development of the
myelin sheath may be one of the underlying causes of ASD and myelination deficits in frontal and
temporal regions in particular, may be a potential marker for early detection of ASD; see Figure
5.11.
Figure 5.11: Inefficient communication in ASD brain.

Insufficient development of the myelin sheath in frontal and temporal regions of ASD brain may cause
inefficient communication.
5.4.7 Brain overgrowth in ASD is followed by arrested growth and even degeneration
Several studies have reported that the early brain overgrowth in ASD might be followed
by arrested brain growth and even degeneration (Courchesne et al., 2011b; Schumann et al., 2010).
Our results also show similar results; see Figure 5.12a. From 3 to 4 years, TIV of ASD slightly
decreases to stabilize at the normal adult brain volume of 1.5 L whereas TIV of TDC continues to
116
increase. The limited sample size of our data does not have enough statistical power to discriminate
if TIV in ASD goes through arrested growth or degeneration or a combination of both. However,
from our results, we can certainly see that a normal brain continues to grow whereas the growth
rate of an autistic brain significantly decreases.
We noticed a similar pattern in the total cortical surface area with age; see Figure 5.12b.
The surface area in ASD remains somewhat constant from the age of 3 to 4 years whereas the
surface area in TDC continues to increase. Interestingly, the total ventricular volume decreases
steeply in ASD from the age of 3 to 4 years whereas remains constant in TDC; see Figure 5.12c.
Similarly, the average amount of cortical folding decreases in ASD but remains constant in TDC;
see Figure 5.12d. One important observation here is the decrease of the cortical folding in ASD
even when both cortical surface area and TIV are somewhat constant. However, the total
ventricular volume decreases during the time course and this decrease in ventricular volume may
be a reason behind the decrease in cortical folding. This pattern corroborates our hypothesis that
the larger ventricles in ASD induce additional compression stress in the cortex forcing it to fold
more. Moreover, it can be noticed that the amount of cortical folding in TDC remains constant
even when both TIV and cortical surface area increase. This may be due to the fact that the cortical
folding of a normal brain, caused by the compressive stress of standard brain growth, is already
stabilized by the age of 3 years.
117
Figure 5.12: Arrested brain growth in ASD
a) TIV of ASD slightly decreases to stabilize at the normal adult brain volume of 1.5 L whereas TIV of
TDC continues to increase. b) The surface area in ASD remains somewhat constant from the age of 3 to 4
years whereas the surface area in TDC continues to increase. c) The total ventricular volume decreases
steeply in ASD from the age 3 to 4 years whereas remains constant in TDC. d) The average amount of
cortical folding decreases in ASD but remains constant in TDC. The cortical folding in ASD decreases
even when both cortical surface area and TIV are somewhat constant. This decrease in cortical folding is
most likely due to the decrease in the total ventricular volume.
5.5 Conclusion
118
Using brain features derived from MRI, we were able to classify ASD from CTR subjects
with high accuracies. We also identified three potential brain markers for early detection of ASD:
larger ventricles, higher amount of cortical folding, and myelination deficits particularly in frontal
and temporal regions. Further, we were able to show that higher cortical folding in ASD brains
may be an aftereffect of early brain overgrowth and the additional compression in the cortex due
the abnormally large ventricles. In addition, we showed that the growth of autistic brain
significantly decreases after the age of 3 years.
119
6 CHAPTER 6: CONCLUSION AND FUTURE
WORK
In this thesis, we sought imaging biomarkers for ASD detection by applying machine
learning on brain morphological featues captured by MRI data. We started by pointing out the
drawbacks of currently used behavioral based ASD diagnosis. We then discussed how MRI based
ASD detection can provide a better alternative for ASD detection because it is objective, can be
utilized for earlier diagnosis, and will facilitate better understanding of the neuroanatomical basis
of ASD. After that, we presented the current literature in ASD brain morphometry captured using
MRI. We pointed out two major problems in MRI based studies: inconsistent brain anatomical
findings in ASD and low ASD classification success in the large multisite ABIDE dataset. We
attributed the inconsistent findings and low classification success to methodological differences and
ASD heterogeneity; and then investigated each of these factors one by one. In Chapter 3, we
investigated the effect of methodological differences, particularly the choice of brain image
processing tools, on the final results. We found that the brain anatomical findings were affected by
the choice of tools to a great extent. We concluded by suggesting the improvement of the image
processing tools, cross-validating the findings across tools, and the use of machine learning to
capture the multi-variate patterns. In Chapter 4, we investigated the heterogeneity in ASD brain
morphology and its effects on ASD classification. We found that the ASD brain morphology is
highly heterogeneous and the heterogeneity can be mitigated and hence ASD classification
improved by utilizing the additional information from demographics and behavioral measure. We
concluded that investigating brain markers of ASD in more meaningful sub-groups is easier and
more insightful than across the whole spectrum. Finally, in Chapter 5, we were able to identify
120
biologically plausible brain markers for early detection of ASD and were able to detect ASD at the
age of 3 to 4 years with greater than 90% AUC.
6.1 Contributions
The main contributions of this thesis and their implications are discussed below.
6.1.1 We demonstrated MRI as a potential tool for ASD detection
Using brain images of both children and adult subjects, we demonstrated that MRI has a
potential to successfully detect ASD based on brain morphology. Applying machine learning on
MRI derived brain morphological features, data-driven potential brain markers were identified;
the identified markers are consistent with the current biological understanding of ASD. In addition,
we could successfully classify ASD subjects from non-ASD subjects in many cases. The results of
this thesis prove that applying machine learning on MRI data is a potential technique for early
detection of ASD and understanding its brain anatomical underpinnings.
6.1.2 We showed that the methodological differences are a cause of inconsistent brain
imaging findings
In Chapter 3, we showed that methodological differences, the differences in brain image
processing tools are a source behind the inconsistent brain anatomical findings in ASD and
neuroimaging in general. We found that the ASD vs. TDC group differences in brain tissue
volumes were highly dependent on the tools used to estimate the volume of these tissues. Form this
work which was published in Frontier in Neuroscience (Katuwal et al., 2016b), we made an effort to
caution the neuroimaging community about the effects of brain image processing tools and the
pitfalls of blindly using them. In this work, we concluded by suggesting the need for the
121
improvement of the image processing tools, cross-validating the findings across tools, and the use
of machine learning to capture the multi-variate patterns.
Improving the current brain image preprocessing tools to make them more accurate and
standardizing them is the best possible solution to remove the effects of methodological differences.
However, when the existing popular tools have to be used, we suggest investigating multi-variate
relationships in addition to univariate ones in brain alterations because multi-variate relationships
are more robust across methods and scanners. In this thesis, we have provided an example to
corroborate this view.
6.1.3 We showed that identifying brain biomarkers in sub-groups of ASD is easier
and more meaningful than across the whole spectrum
In Chapter 3 of this thesis, we demonstrated that the heterogeneity in ASD brain
morphometry can be mitigated by augmenting the information from demographics and behavioral
(DB) measures. We showed that the ASD classification can be improved significantly utilizing the
additional information from DB measures. We also demonstrated that identifying ASD brain
alterations in relatively homogenous sub-groups is easier and more insightful than across the whole
heterogeneous ASD spectrum. From this work, which was published in PLOS ONE (Katuwal et al.,
2016c), we were able to demonstrate how ASD includes a wide range of variations and each sub-
group might be distinct and might have to be treated separately. We showed that characterizing
the brain morphology of ASD across the whole spectrum is a very complex problem. This complex
problem can be solved by breaking it down to simpler sub-problems where each sub-problem
would be identifying the brain alterations in a sub type of ASD.
122
6.1.4 We identified potential brain markers for early detection of ASD
Using multiple features automatically extracted from brain images of young subjects (3 to
4 years), we achieved very high success rates (>90% AUC) in ASD vs. control classification. We
also identified three potential brain markers for early detection of ASD: larger ventricles, higher
amount of cortical folding, and myelination deficits particularly in frontal and temporal regions.
Further, we were able to show that higher gyrification in ASD brains may be an aftereffect of the
early brain overgrowth. Furthermore, we demonstrated that the higher amount of cortical folding
in ASD is due to the greater compressive stress in the cortex induced by both hyper-expansion of
the cortex and abnormally large ventricles.
To our knowledge this is the first study to leverage clinical imaging archives to investigate
early brain markers in ASD. The high degree of success in classification and the biological relevant
potential brain markers indicate that application of advanced analytic methods on brain features
holds promise for aiding early identification of ASD.
6.2 Future Work
6.2.1 Critical need for large longitudinal studies
Brain overgrowth that begins before two years of age is clearly one of the brain markers of
ASD. However, there have been very few longitudinal studies to investigate this overgrowth
(Courchesne et al., 2011a; Schumann et al., 2010; Hazlett et al., 2017). The sample sizes of these
studies are limited, especially considering the heterogeneity of ASD. The question of when
overgrowth starts, its underlying neurobiological causes, and its mechanical effects in the anatomy
of the brain is overarching. In addition, it is important to find the causal association between the
brain overgrowth and the autistic symptoms. We therefore see that a large longitudinal study,
123
preferably starting from the neonatal stage, is a future work with a very high importance. In
alignment with this critical need, Infant Brain Imaging Study (IBIS) (http://ibisnetwork.org/) is
collecting longitudinal brain scans of children at risk for ASD (i.e., younger siblings of older autistic
individuals) from 3 to 24 months of age.
6.2.2 Data-driven brain features
Most of the studies investigating the brain markers of ASD, including this thesis, have used
pre-defined sets of global brain features. It is likely that the set of these brain features might not
have enough information to discriminate sub-types of ASD and to classify ASD from the normal
population. First, our knowledge of ASD brain morphology is very limited and the findings on
ASD brain alteration have been highly inconsistent. Consequently, limited and/or misguided
information may inhibit us from identifying the pertinent brain features to characterize ASD.
Second, ASD is highly heterogeneous and the global brain features that are being used might be
too simplistic to effectively capture the brain alterations in ASD.
Data-driven brain features can contain much richer information and are objective. Deep
learning is one of the very promising techniques for the extraction of informative data-driven
features especially due to the fact that the extracted features are hierarchical in nature. In addition,
the use of deep learning directly on MRI scans also bypasses the problem of method dependent
brain features. Recently, there have been a few studies that have applied deep learning to MRI
data for survival prediction of ALS patients (van der Burgh et al. 2016), early detection of
Alzheimer’s (Liu et al. 2014), ASD (Hazlett et al. 2017), and other ailments.
Considering the same underlying methodology shared by the studies on different brain
disorders and the promise of data-driven techniques, I have initiated a personal project to develop
a unified framework to identify the brain markers of major brain disorders in data-driven fashion.
124
State-of-the-art machine learning techniques can be applied on brain scans of multiple modalities
in this proposed framework. This will allow us to detect and investigate the etiology and progression
of several brain disorders within a common framework. In addition, the framework will provide
an unbiased platform to investigate the comorbidities among the brain disorders. For detailed and
up to date information, please visit http://predictbrain.com.
125
BIBLIOGRAPHY
Alpaydin, E. (2010). Introduction to machine learning. MIT Press.
Amaral, D.G., Schumann, C.M. & Nordahl, C.W. (2008). Neuroanatomy of autism. Trends in
neurosciences. 31 (3). p.pp. 137–45.
American Psychiatric Association (2013). Diagnostic and Statistical Manual of Mental Disorders:
DSM-5. 5th Edn Arlington, VA: American Psychiatric Association.
Asbury, C. (2011). Brain imaging technologies and their applications in neuroscience. The Dana
Foundation. p.pp. 1–45.
Ashburner, J., Chen, C., Moran, R., Henson, R., Glauche, V. & Phillips, C. (2013). SPM8 Manual.
Ashburner, J. & Friston, K.J. (2005). Unified segmentation. NeuroImage. 26 (3). p.pp. 839–51.
Assaf, Y. & Pasternak, O. (2008). Diffusion tensor imaging (DTI)-based white matter mapping in
brain research: A review. Journal of Molecular Neuroscience. 34 (1). p.pp. 51–61.
Auzias, G., Breuil, C., Takerkart, S. & Deruelle, C. (2014a). Detectability of brain structure
abnormalities related to autism through MRI-derived measures from multiple scanners. In:
IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI). June 2014, Ieee,
pp. 314–317.
Auzias, G., Viellard, M., Takerkart, S., Villeneuve, N., Poinso, F., Fonséca, D. Da, Girard, N. &
Deruelle, C. (2014b). Atypical sulcal anatomy in young children with autism spectrum
disorder. NeuroImage: Clinical. 4. p.pp. 593–603.
Baron-Cohen, S., Ring, H. a., Bullmore, E.T., Wheelwright, S., Ashwin, C. & Williams, S.C.R.
(2000). The amygdala theory of autism. Neuroscience and Biobehavioral Reviews. 24 (3). p.pp. 355–
364.
126
Bates, D., Mächler, M., Bolker, B. & Walker, S. (2014). Fitting Linear Mixed-Effects Models using
lme4. arXiv. 67 (1). p.p. arXiv:1406.5823.
Bauer, E. & Kohavi, R. (1999). An Empirical Comparison of Voting Classification Algorithms:
Bagging, Boosting, and Variants. Machine Learning. 36 (1/2). p.pp. 105–139.
Bazazeh, D. & Shubair, R. (2016). Comparative study of machine learning algorithms for breast
cancer detection and diagnosis. In: 2016 5th International Conference on Electronic Devices, Systems
and Applications (ICEDSA). December 2016, IEEE, pp. 1–4.
Benjamini, Y. & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful
approach to multiple testing. Journal of the Royal Statistical Society. 57 (1) p.pp. 289–300.
Bergstra, J. & Bengio, Y. (2012). Random Search for Hyper-Parameter Optimization. Journal of
Machine Learning Research. 13. p.pp. 281–305.
Betancur, C. (2011). Etiological heterogeneity in autism spectrum disorders: More than 100 genetic
and genomic disorders and still counting. Brain Research. 1380. p.pp. 42–77.
Bigler, E.D., Mortensen, S., Neeley, E.S., Ozonoff, S., Krasny, L., Johnson, M., Lu, J., Provencal,
S.L., McMahon, W. & Lainhart, J.E. (2007). Superior temporal gyrus, language function, and
autism. Developmental neuropsychology. 31 (2). p.pp. 217–38.
Le Bihan, D., Mangin, J.-F., Poupon, C., Clark, C.A., Pappata, S., Molko, N. & Chabriat, H.
(2001). Diffusion tensor imaging: Concepts and applications. Journal of Magnetic Resonance
Imaging. 13 (4). p.pp. 534–546.
Boekel, W., Wagenmakers, E.J., Belay, L., Verhagen, J., Brown, S. & Forstmann, B.U. (2015). A
purely confirmatory replication study of structural brain-behavior correlations. Cortex. 66.
p.pp. 115–133.
Boukhris, T., Sheehy, O., Mottron, L. & Bérard, A. (2015). Antidepressant Use During Pregnancy
and the Risk of Autism Spectrum Disorder in Children. JAMA Pediatrics. p.p. 1.
127
Bradley, A.P. (1997). The use of the area under the ROC curve in the evaluation of machine
learning algorithms. Pattern Recognition. 30 (7). p.pp. 1145–1159.
De Brébisson, A. & Montana, G. (2015). Deep neural networks for anatomical brain segmentation.
IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. 2015–Octob.
p.pp. 20–28.
Breiman, L. (2001). Random Forrest. Machine Learning. p.pp. 1–33.
Breiman, L. (1996). Technical note: Some properties of splitting criteria. Machine Learning. 24. p.pp.
41–47.
Brown, N. & Sandholm, T. (2017). Safe and Nested Endgame Solving for Imperfect-Information Games Prior
Approaches to Endgame Solving in.
Buckner, R.L., Head, D., Parker, J., Fotenos, A.F., Marcus, D., Morris, J.C. & Snyder, A.Z. (2004).
A unified approach for morphometric and functional data analysis in young, old, and
demented adults using automated atlas-based head size normalization: reliability and
validation against manual measurement of total intracranial volume. NeuroImage. 23 (2). p.pp.
724–38.
Button, K.S., Ioannidis, J.P. a, Mokrysz, C., Nosek, B. a, Flint, J., Robinson, E.S.J. & Munafò,
M.R. (2013). Power failure: why small sample size undermines the reliability of neuroscience.
Nature reviews. Neuroscience. 14 (5). p.pp. 365–76.
Calderoni, S., Retico, A., Biagi, L., Tancredi, R., Muratori, F. & Tosetti, M. (2012). Female
children with autism spectrum disorder: An insight from mass-univariate and pattern
classification analyses. NeuroImage. 59 (2). p.pp. 1013–1022.
Callaert, D. V., Ribbens, A., Maes, F., Swinnen, S.P. & Wenderoth, N. (2014). Assessing age-
related gray matter decline with voxel-based morphometry depends significantly on
segmentation and normalization procedures. Frontiers in Aging Neuroscience. 6 (JUN). p.pp. 1–
128
14.
Campbell, D.J., Chang, J. & Chawarska, K. (2014). Early generalized overgrowth in autism
spectrum disorder: Prevalence rates, gender effects, and clinical outcomes. Journal of the
American Academy of Child and Adolescent Psychiatry. 53 (10). p.pp. 1063–1073.
Carper, R.A., Moses, P., Tigue, Z.D. & Courchesne, E. (2002). Cerebral lobes in autism: early
hyperplasia and abnormal age effects. NeuroImage. 16 (4). p.pp. 1038–51.
Caruana, R. & Niculescu-Mizil, A. (2006). An empirical comparison of supervised learning
algorithms. Proceedings of the 23rd international conference on Machine learning. C (1). p.pp. 161–168.
CDC (2014). Prevalence of autism spectrum disorder among children aged 8 years - autism and developmental
disabilities monitoring network, 11 sites, United States, 2010.
Chakraborty, M., Pal, S., Pramanik, R. & Ravindranath Chowdary, C. (2016). Recent
developments in social spam detection and combating techniques: A survey. Information
Processing & Management. 52 (6). p.pp. 1053–1073.
Chawla, N. V. (2005). Data Mining for Imbalanced Datasets: An Overview. Data Mining and
Knowledge Discovery Handbook. p.pp. 853–867.
Chen, R., Jiao, Y. & Herskovits, E.H. (2011). Structural MRI in autism spectrum disorder. Pediatric
research. 69 (5 Pt 2). p.p. 63R–8R.
Close, H.A., Lee, L.-C., Kaufmann, C.N. & Zimmerman, A.W. (2012). Co-occurring Conditions
and Change in Diagnosis in Autism Spectrum Disorders. Pediatrics. 129 (2). p.pp. e305–e316.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Statistical Power Analysis for
the Behavioral Sciences. 2nd p.p. 567.
Collins, D.L., Neelin, P., Peters, T.M. & Evans, A.C. (1994). Automatic 3D intersubject
registration of MR volumetric data in standardized Talairach space. Journal of computer assisted
tomography. 18 (2). p.pp. 192–205.
129
Courchesne, E., Campbell, K. & Solso, S. (2011a). Brain growth across the life span in autism:
Age-specific changes in anatomical pathology. Brain Research. 1380. p.pp. 138–145.
Courchesne, E., Karns, C.M., Davis, H.R., Ziccardi, R., Carper, R. a, Tigue, Z.D., Chisum, H.J.,
Moses, P., Pierce, K., Lord, C., Lincoln, A.J., Pizzo, S., Schreibman, L., Haas, R.H.,
Akshoomoff, N. a & Courchesne, R.Y. (2011b). Unusual brain growth patterns in early life in
patients with autistic disorder: an MRI study. Neurology. 57 (24). p.pp. 2111–2111.
Cox, R.W. (1996). AFNI: Software for Analysis and Visualization of Functional Magnetic
Resonance Neuroimages. Computers and Biomedical Research. 29 (3). p.pp. 162–173.
Craig, M.C., Zaman, S.H., Daly, E.M., Ter, W.J.C., Robertson, D.M.W., Hallahan, B., Toal, F.,
Reed, S., Ambikapathy, A., Ammer, M.B.R., Murphy, C.M. & Murphy, D.G.M. (2007).
Women with autistic-spectrum disorder:magnetic resonance imaging study of brain anatomy.
British journal of psychiatry. 191. p.pp. 224–228.
Croteau-Chonka, E.C., Dean, D.C., Remer, J., Dirks, H., O’Muircheartaigh, J. & Deoni, S.C.L.
(2016). Examining the relationships between cortical maturation and white matter
myelination throughout early childhood. NeuroImage. 125. p.pp. 413–421.
D’Mello, A.M., Crocetti, D., Mostofsky, S.H. & Stoodley, C.J. (2015). Cerebellar gray matter and
lobular volumes correlate with core autism symptoms. NeuroImage: Clinical. 7. p.pp. 631–639.
Dale, A.M., Fischl, B. & Sereno, M.I. (1999a). Cortical Surface-Based Analysis:I. Segmentation,
Surface Reconstruction. NeuroImage. 9 (2). p.pp. 179–194.
Dale, A.M., Fischl, B. & Sereno, M.I. (1999b). Cortical surface-based analysis. I. Segmentation
and surface reconstruction. NeuroImage. 9 (2). p.pp. 179–94.
Dawson, G., Rogers, S., Munson, J., Smith, M., Winter, J., Greenson, J., Donaldson, A. & Varley,
J. (2010). Randomized, Controlled Trial of an Intervention for Toddlers With Autism: The
Early Start Denver Model. Pediatrics. 125 (1). p.pp. e17–e23.
130
Demirakca, T., Sartorius, A., Ende, G., Meyer, N., Welzel, H., Skopp, G., Mann, K. & Hermann,
D. (2011). Diminished gray matter in the hippocampus of cannabis users: Possible protective effects of
cannabidiol.
Desikan, R.S., Ségonne, F., Fischl, B., Quinn, B.T., Dickerson, B.C., Blacker, D., Buckner, R.L.,
Dale, A.M., Maguire, R.P., Hyman, B.T., Albert, M.S. & Killiany, R.J. (2006). An automated
labeling system for subdividing the human cerebral cortex on MRI scans into gyral based
regions of interest. NeuroImage. 31 (3). p.pp. 968–80.
Dinov, I., Lozev, K., Petrosyan, P., Liu, Z., Eggert, P., Pierce, J., Zamanyan, A., Chakrapani, S.,
van Horn, J., Parker, D.S., Magsipoc, R., Leung, K., Gutman, B., Woods, R. & Toga, A.
(2010). Neuroimaging study designs, computational analyses and data provenance using the
LONI pipeline. PLoS ONE. 5 (9).
Do, C.B. & Batzoglou, S. (2008). What is the expectation maximization algorithm? Nature
Biotechnology. 26 (8). p.pp. 897–899.
Dougherty, C.C., Evans, D.W., Katuwal, G.J. & Michael, A.M. (2016a). Asymmetry of fusiform
structure in autism spectrum disorder: trajectory and association with symptom severity.
Molecular Autism. 7 (1). p.p. 28.
Dougherty, C.C., Evans, D.W., Myers, S.M., Moore, G.J. & Michael, A.M. (2016b). A
Comparison of Structural Brain Imaging Findings in Autism Spectrum Disorder and
Attention-Deficit Hyperactivity Disorder. Neuropsychology Review. 26 (1). p.pp. 25–43.
Ecker, C., Andrews, D., Dell’Acqua, F., Daly, E., Murphy, C., Catani, M., Thiebaut De Schotten,
M., Baron-Cohen, S., Lai, M.C., Lombardo, M. V., Bullmore, E.T., Suckling, J., Williams,
S., Jones, D.K., Chiocchetti, A. & Murphy, D.G.M. (2016). Relationship between cortical
gyrification, white matter connectivity, and autism spectrum disorder. Cerebral Cortex. 26 (7).
p.pp. 3297–3309.
131
Ecker, C., Marquand, A., Mourão-Miranda, J., Johnston, P., Daly, E.M., Brammer, M.J.,
Maltezos, S., Murphy, C.M., Robertson, D., Williams, S.C. & Murphy, D.G.M. (2010a).
Describing the brain in autism in five dimensions--magnetic resonance imaging-assisted
diagnosis of autism spectrum disorder using a multiparameter classification approach. The
Journal of neuroscience : the official journal of the Society for Neuroscience. 30 (32). p.pp. 10612–10623.
Ecker, C., Rocha-Rego, V., Johnston, P., Mourao-Miranda, J., Marquand, A., Daly, E.M.,
Brammer, M.J., Murphy, C. & Murphy, D.G. (2010b). Investigating the predictive value of
whole-brain structural MR scans in autism: A pattern classification approach. NeuroImage. 49
(1). p.pp. 44–56.
Eggert, L.D., Sommer, J., Jansen, A., Kircher, T. & Konrad, C. (2012). Accuracy and Reliability
of Automated Gray Matter Segmentation Pathways on Real and Simulated Structural
Magnetic Resonance Images of the Human Brain Y. Fan (ed.). PLoS ONE. 7 (9). p.p. e45081.
Evans, A.C., Marrett, S., Neelin, P., Collins, L., Worsley, K., Dai, W., Milot, S., Meyer, E. & Bub,
D. (1992). Anatomical mapping of functional activation in stereotactic coordinate space.
NeuroImage. 1 (1). p.pp. 43–53.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters. 27 (8). p.pp. 861–
874.
Fellhauer, I., Zöllner, F.G., Schröder, J., Degen, C., Kong, L., Essig, M., Thomann, P.A. & Schad,
L.R. (2015). Comparison of automated brain segmentation using a brain phantom and
patients with early Alzheimer’s dementia or mild cognitive impairment. Psychiatry Research:
Neuroimaging. 233 (3). p.pp. 299–305.
Fischl, B. (2012). FreeSurfer. NeuroImage. 62 (2). p.pp. 774–81.
Fischl, B., Van Der Kouwe, A., Destrieux, C., Halgren, E., Ségonne, F., Salat, D.H., Busa, E.,
Seidman, L.J., Goldstein, J., Kennedy, D., Caviness, V., Makris, N., Rosen, B. & Dale, A.M.
132
(2004). Automatically Parcellating the Human Cerebral Cortex. Cerebral Cortex. 14 (1). p.pp.
11–22.
Fischl, B., Salat, D.H., Busa, E., Albert, M., Dieterich, M., Haselgrove, C., van der Kouwe, A.,
Killiany, R., Kennedy, D., Klaveness, S., Montillo, A., Makris, N., Rosen, B. & Dale, A.M.
(2002). Whole brain segmentation: automated labeling of neuroanatomical structures in the
human brain. Neuron. 33 (3). p.pp. 341–55.
Fischl, B., Sereno, M.I. & Dale, A.M. (1999a). Cortical Surface-Based Analysis II: Inflation,
Flattening, and a Surface-Based Coordinate System. NeuroImage. 9 (2). p.pp. 195–207.
Fischl, B., Sereno, M.I., Tootell, R.B. & Dale, a M. (1999b). High-resolution intersubject
averaging and a coordinate system for the cortical surface. Human brain mapping. 8 (4). p.pp.
272–84.
Floris, D.L., Chura, L.R., Holt, R.J., Suckling, J., Bullmore, E.T., Baron-Cohen, S. & Spencer,
M.D. (2013). Psychological correlates of handedness and corpus callosum asymmetry in
autism: The left hemisphere dysfunction theory revisited. Journal of Autism and Developmental
Disorders. 43 (8). p.pp. 1758–1772.
Fombonne, E. (2009). Epidemiology of pervasive developmental disorders. Pediatric Research. 65 (6).
p.pp. 591–598.
Fombonne, E. (2005). The Changing Epidemiology of Autism. Journal of Applied Research in
Intellectual Disabilities. 18 (4). p.pp. 281–294.
Friedman, J.H. (2000). Greedy Function Approximation: A Gradient Boosting Machine. Annals of
Statistics. 29. p.pp. 1189--1232.
Frisoni, G.B., Fox, N.C., Jack, C.R., Scheltens, P. & Thompson, P.M. (2010). The clinical use of
structural MRI in Alzheimer disease. Nature Reviews. 6 (2). p.pp. 67–77.
Ganz, M.L. (2007). The lifetime distribution of the incremental societal costs of autism. Archives of
133
pediatrics & adolescent medicine. 161 (4). p.pp. 343–349.
Gargaro, B.A., Rinehart, N.J., Bradshaw, J.L., Tonge, B.J. & Sheppard, D.M. (2011). Autism and
ADHD: How far have we come in the comorbidity debate? Neuroscience and Biobehavioral
Reviews. 35 (5). p.pp. 1081–1088.
Gelman, A. & Hill, J. (2006). Data analysis using regression and multilevel/hierarchical models. Cambridge
University Press.
Geschwind, D.H. & Levitt, P. (2007). Autism spectrum disorders: developmental disconnection
syndromes. Current opinion in neurobiology. 17 (1). p.pp. 103–11.
Glenn, O.A. (2010). MR imaging of the fetal brain. Pediatric Radiology. 40 (1). p.pp. 68–81.
Gotham, K., Pickles, A. & Lord, C. (2009). Standardizing ADOS Scores for a Measure of Severity
in Autism Spectrum Disorders. J Autism Dev Disord. 39 (5). p.pp. 693–705.
Gray, R.M. & M., R. (1990). Entropy and information theory. Springer-Verlag.
Greve, D.N. (2011). An Absolute Beginner’s Guide to Surface- and Voxel-based Morphometric
Analysis. Proceedings of the International Society for Magnetic Resonance in Medicine. i. p.pp. 1–7.
Groen, W., Teluij, M., Buitelaar, J. & Tendolkar, I. (2010). Amygdala and Hippocampus
Enlargement During Adolescence in Autism. Journal of the American Academy of Child & Adolescent
Psychiatry. 49 (6). p.pp. 552–560.
Gronenschild, E.H.B.M., Habets, P., Jacobs, H.I.L., Mengelers, R., Rozendaal, N., van Os, J. &
Marcelis, M. (2012). The effects of FreeSurfer version, workstation type, and Macintosh
operating system version on anatomical volume and cortical thickness measurements. S.
Hayasaka (ed.). PloS one. 7 (6). p.p. e38234.
Gupta, A. (2015). Unsupervised Learning of Disease Subtypes from Continuous Time Hidden
Markov Models of Disease Progression. PhD Thesis. (December).
Ha, S., Sohn, I.-J., Kim, N., Sim, H.J. & Cheon, K.-A. (2015). Characteristics of Brains in Autism
134
Spectrum Disorder: Structure , Function and Connectivity across the Lifespan. Experimental
Neurobiology. 24 (4). p.pp. 273–284.
Haar, S., Berman, S., Behrmann, M. & Dinstein, I. (2014). Anatomical Abnormalities in Autism?
Cerebral cortex. p.pp. 1–13.
Hampel, H., Frank, R., Broich, K., Teipel, S.J., Katz, R.G., Hardy, J., Herholz, K., Bokde,
A.L.W., Jessen, F., Hoessler, Y.C., Sanhai, W.R., Zetterberg, H., Woodcock, J. & Blennow,
K. (2010). Biomarkers for Alzheimer’s disease: academic, industry and regulatory
perspectives. Nature reviews. Drug discovery. 9 (7). p.pp. 560–574.
Hansen, T.I., Brezova, V., Eikenes, L., Håberg, A. & Vangberg, X.T.R. (2015). How does the
accuracy of intracranial volume measurements affect normalized brain volumes? sample size
estimates based on 966 subjects from the HUNT MRI cohort. American Journal of Neuroradiology.
36 (8). p.pp. 1450–1456.
Happé, F., Ronald, A. & Plomin, R. (2006). Time to give up on a single explanation for autism.
Nature neuroscience. 9 (10). p.pp. 1218–1220.
Hazlett, H.C., Gu, H., Munsell, B.C., Kim, S.H., Styner, M., Wolff, J.J., Elison, J.T., Swanson,
M.R., Zhu, H., Botteron, K.N., Collins, D.L., Constantino, J.N., Dager, S.R., Estes, A.M.,
Evans, A.C., Fonov, V.S., Gerig, G., Kostopoulos, P., Mckinstry, R.C., Pandey, J., Paterson,
S., Pruett Jr, J.R., Schultz, R.T., Shaw, D.W., Zwaigenbaum, L. & Piven, J. (2017). Early
brain development in infants at high risk for autism spectrum disorder. Nature Publishing Group.
542 (7641). p.pp. 348–351.
Hazlett, H.C., Poe, M., Gerig, G., Styner, M., Chappell, C., Smith, R.G., Vachet, C. & Piven, J.
(2011). Early Brain Overgrowth in Autism Associated with an Increase in Cortical Surface
Area Before Age 2 years. Archives of General Psychiatry. 68 (5). p.pp. 467–476.
Hazlett, H.C., Poe, M.D., Lightbody, A.A., Styner, M., MacFall, J.R., Reiss, A.L. & Piven, J.
135
(2012). Trajectories of early brain volume development in fragile X syndrome and autism.
Journal of the American Academy of Child and Adolescent Psychiatry. 51 (9). p.pp. 921–933.
Hornak, J.P. (1996). The basics of MRI.
Im, K., Lee, J.-M., Lee, J., Shin, Y.-W., Kim, I.Y., Kwon, J.S. & Kim, S.I. (2006). Gender
difference analysis of cortical thickness in healthy young adults with surface-based methods.
NeuroImage. 31 (1). p.pp. 31–38.
Jenkinson, M. & Smith, S. (2001). A global optimisation method for robust affine registration of
brain images. Medical image analysis. 5 (2). p.pp. 143–56.
Jeste, S.S. & Geschwind, D.H. (2014). Disentangling the heterogeneity of autism spectrum disorder
through genetic findings. Nat Rev Neurol. 10 (2). p.pp. 74–81.
Jiao, Y. (2011). Predictive models of autism spectrum disorder based on brain regional cortical
thickness. Autism. 50 (2). p.pp. 589–599.
Jiao, Y., Chen, R., Ke, X., Chu, K., Lu, Z. & Herskovits, E.H. (2010). Predictive models of autism
spectrum disorder based on brain regional cortical thickness. NeuroImage. 50 (2). p.pp. 589–
599.
Jumah, F., Ghannam, M., Jaber, M., Adeeb, N. & Tubbs, R.S. (2016). Neuroanatomical variation
in autism spectrum disorder: A comprehensive review. Clinical Anatomy. 29 (4). p.pp. 454–465.
Kanungo, T., Mount, D.M., Netanyahu, N.S., Piatko, C.D., Silverman, R. & Wu, A.Y. (2002). An
efficient k-means clustering algorithm: analysis and implementation. IEEE Transactions on
Pattern Analysis and Machine Intelligence. 24 (7). p.pp. 881–892.
Katuwal, G., Baum, S. & Michael, A. (2016a). Zernike moments as shape descriptors of subcortical
structures for autism classification. In: Organization for Human Brain Mapping. 2016, Geneva.
Katuwal, G.J., Baum, S.A., Cahill, N.D., Dougherty, C.C., Evans, E., Evans, D.W., Moore, G.J.
& Michael, A.M. (2016b). Inter-method discrepancies in brain volume estimation may drive
136
inconsistent findings in autism. Frontiers in Neuroscience. 10 (SEP). p.p. 439.
Katuwal, G.J., Baum, S.A., Cahill, N.D. & Michael, A.M. (2016c). Divide and Conquer: Sub-
Grouping of ASD Improves ASD Detection Based on Brain Morphometry. PloS one. 11 (4).
p.p. e0153331.
Katuwal, G.J., Cahill, N.D., Baum, S.A., Dougherty, C.C., Evans, D.W., Moore, G.J. & Michael,
A.M. (2015a). Inter-method Inconsistencies and Inter-site Variability of Brain Volume in
Autism. In: Organization for Human Brain Mapping. 2015, p. 10.13140/RG.2.1.3449.2249.
Katuwal, G.J., Cahill, N.D., Baum, S.A. & Michael, A.M. (2015b). The Predictive Power of
Structural MRI in Autism Diagnosis. In: Engineering in Medicine and Biology Society (EMBC), 2015
37th Annual International Conference of the IEEE. 2015, pp. 4270–4273.
Katuwal, G.J. & Chen, R. (2016). Machine Learning Model Interpretability for Precision Medicine.
Kim, J., Kwon, J., Kim, M., Do, J., Lee, D. & Han, H. (2016). A small number of abnormal brain
connections predicts adult autism spectrum disorder. Nature Communications.
Kim, M., Wu, G. & Shen, D. (2013). Unsupervised deep learning for hippocampus segmentation in 7.0 Tesla
MR images.
Kinney, D.K., Munir, K.M., Crowleya, D.J. & Miller, A.M. (2009). Prenatal Stress and Risk for
Autism. Neuroscience Biobehavioral Reviews. 32 (8). p.pp. 1519–1532.
Klein, A. & Tourville, J. (2012). 101 Labeled Brain Images and a Consistent Human Cortical
Labeling Protocol. Frontiers in Neuroscience. 6 (DEC). p.pp. 1–12.
Kober, J., Bagnell, J.A. & Peters, J. (2013). Reinforcement learning in robotics: A survey. The
International Journal of Robotics Research. 32 (11). p.pp. 1238–1274.
Kolevzon, A., Gross, R. & Reichenberg, A. (2007). Prenatal and Perinatal Risk Factors for Autism.
Archives of Pediatrics & Adolescent Medicine. 161 (4). p.p. 326.
van Kooten, I.A.J., Palmen, S.J.M.C., von Cappeln, P., Steinbusch, H.W.M., Korr, H., Heinsen,
137
H., Hof, P.R., van Engeland, H. & Schmitz, C. (2008). Neurons in the fusiform gyrus are
fewer and smaller in autism. Brain. 131 (4). p.pp. 987–999.
Kucharsky Hiess, R., Alter, R., Sojoudi, S., Ardekani, B.A., Kuzniecky, R. & Pardoe, H.R. (2015).
Corpus Callosum Area and Brain Volume in Autism Spectrum Disorder: Quantitative
Analysis of Structural MRI from the ABIDE Database. Journal of Autism and Developmental
Disorders. 45 (10). p.pp. 3107–3114.
Kühn, S., Schubert, F. & Gallinat, J. (2010). Reduced Thickness of Medial Orbitofrontal Cortex
in Smokers. Biological Psychiatry. 68 (11). p.pp. 1061–1065.
Kuznetsova, A., Brockhoff, P.B. & Christensen, R.H.B. (2013). lmerTest: Tests for random and
fixed effects for linear mixed effect models (lmer objects of lme4 package). R package version. 2
(6).
Lai, M. (2015). Deep Learning for Medical Image Segmentation. arXiv preprint arXiv:1505.02000.
Lai, M.-C., Lombardo, M. V, Suckling, J., Ruigrok, A.N. V, Chakrabarti, B., Ecker, C., Deoni,
S.C.L., Craig, M.C., Murphy, D.G.M., Bullmore, E.T. & Baron-Cohen, S. (2013). Biological
sex affects the neurobiology of autism. Brain : a journal of neurology. 136 (Pt 9). p.pp. 2799–815.
Lazar, M., Miles, L.M., Babb, J.S. & Donaldson, J.B. (2014). Axonal deficits in young adults with
High Functioning Autism and their impact on processing speed. NeuroImage: Clinical. 4. p.pp.
417–425.
LeCun, Y., Bengio, Y. & Hinton, G. (2015). Deep learning. Nature. 521 (7553). p.pp. 436–444.
Leigh, J.P. & Du, J. (2015). Brief Report: Forecasting the Economic Burden of Autism in 2015 and
2025 in the United States. Journal of Autism and Developmental Disorders. 45 (12). p.pp. 4135–
4139.
Lenroot, R.K. & Yeung, P.K. (2013). Heterogeneity within Autism Spectrum Disorders: What
have We Learned from Neuroimaging Studies? Frontiers in human neuroscience. 7 (October). p.p.
138
733.
Lin, H.-Y., Ni, H.-C., Lai, M.-C., Tseng, W.-Y.I. & Gau, S.S.-F. (2015). Regional brain volume
differences between males with and without autism spectrum disorder are highly age-
dependent. Molecular Autism. 6 (1). p.p. 29.
Louppe, G. (2015). Understanding Random Forests from Theory to Practice. Univeristy of Liege.
Lyall, K., Munger, K.L., O’Reilly, É.J., Santangelo, S.L. & Ascherio, A. (2013). Maternal dietary
fat intake in association with autism spectrum disorders. American Journal of Epidemiology. 178
(2). p.pp. 209–220.
Mandal, P.K., Mahajan, R. & Dinov, I.D. (2012). Structural brain atlases: Design, rationale, and
applications in normal and pathological cohorts. Journal of Alzheimer’s Disease. 31 (Suppl 3).
p.pp. S169-88.
Di Martino, A., Yan, C.-G., Li, Q., Denio, E., Castellanos, F.X., Alaerts, K., Anderson, J.S., Assaf,
M., Bookheimer, S.Y., Dapretto, M., Deen, B., Delmonte, S., Dinstein, I., Ertl-Wagner, B.,
Fair, D.A., Gallagher, L., Kennedy, D.P., Keown, C.L., Keysers, C., Lainhart, J.E., Lord, C.,
Luna, B., Menon, V., Minshew, N.J., Monk, C.S., Mueller, S., Müller, R.-A., Nebel, M.B.,
Nigg, J.T., O’Hearn, K., Pelphrey, K.A., Peltier, S.J., Rudie, J.D., Sunaert, S., Thioux, M.,
Tyszka, J.M., Uddin, L.Q., Verhoeven, J.S., Wenderoth, N., Wiggins, J.L., Mostofsky, S.H.
& Milham, M.P. (2014). The autism brain imaging data exchange: towards a large-scale
evaluation of the intrinsic brain architecture in autism. Molecular psychiatry. 19 (6). p.pp. 659–
67.
Massoud, T.F. & Gambhir, S.S. (2003). Molecular imaging in living subjects: seeing fundamental
biological processes in a new light. Genes & development. 17 (5). p.pp. 545–80.
Matson, J.L., Rieske, R.D. & Williams, L.W. (2013). The relationship between autism spectrum
disorders and attention-deficit/hyperactivity disorder: An overview. Research in Developmental
139
Disabilities. 34 (9). p.pp. 2475–2484.
Matsuda, H., Mizumura, S., Nemoto, K., Yamashita, F., Imabayashi, E., Sato, N. & Asada, T.
(2012). Automatic voxel-based morphometry of structural MRI by SPM8 plus diffeomorphic
anatomic registration through exponentiated lie algebra improves the diagnosis of probable
Alzheimer Disease. AJNR. American journal of neuroradiology. 33 (6). p.pp. 1109–14.
May, A. (2011). Experience-dependent structural plasticity in the adult human brain. Trends in
Cognitive Sciences. 15 (10). p.pp. 475–482.
Mazziotta, J., Toga, A., Evans, A., Fox, P., Lancaster, J., Zilles, K., Woods, R., Paus, T., Simpson,
G., Pike, B., Holmes, C., Collins, L., Thompson, P., MacDonald, D., Iacoboni, M.,
Schormann, T., Amunts, K., Palomero-Gallagher, N., Geyer, S., Parsons, L., Narr, K.,
Kabani, N., Le Goualher, G., Feidler, J., Smith, K., Boomsma, D., Pol, H.H., Cannon, T.,
Kawashima, R. & Mazoyer, B. (2001a). A Four-Dimensional Probabilistic Atlas of the
Human Brain. Journal of the American Medical Informatics Association. 8 (5). p.pp. 401–430.
Mazziotta, J., Toga, a, Evans, a, Fox, P., Lancaster, J., Zilles, K., Woods, R., Paus, T., Simpson,
G., Pike, B., Holmes, C., Collins, L., Thompson, P., MacDonald, D., Iacoboni, M.,
Schormann, T., Amunts, K., Palomero-Gallagher, N., Geyer, S., Parsons, L., Narr, K.,
Kabani, N., Le Goualher, G., Boomsma, D., Cannon, T., Kawashima, R. & Mazoyer, B.
(2001b). A probabilistic atlas and reference system for the human brain: International
Consortium for Brain Mapping (ICBM). Philosophical transactions of the Royal Society of London.
Series B, Biological sciences. 356. p.pp. 1293–1322.
Mazziotta, J.C., Toga, a W., Evans, a, Fox, P. & Lancaster, J. (1995). A probabilistic atlas of the
human brain: theory and rationale for its development. The International Consortium for
Brain Mapping (ICBM). NeuroImage. 2 p.pp. 89–101.
Michael, A.M. (2009). Imaging schizophrenia: data fusion approaches to characterize and classify. Rochester
140
Institute of Technology.
Michael, A.M., Baum, S.A., White, T., Demirci, O., Andreasen, N.C., Segall, J.M., Jung, R.E.,
Pearlson, G., Clark, V.P., Gollub, R.L., Schulz, S.C., Roffman, J.L., Lim, K.O., Ho, B.C.,
Bockholt, H.J. & Calhoun, V.D. (2010). Does function follow form?: Methods to fuse
structural and functional brain images show decreased linkage in schizophrenia. NeuroImage.
49 (3). p.pp. 2626–2637.
Michael, A.M., King, M.D., Ehrlich, S., Pearlson, G., White, T., Holt, D.J., Andreasen, N.C.,
Sakoglu, U., Ho, B.-C., Schulz, S.C. & Calhoun, V.D. (2011). A Data-Driven Investigation
of Gray Matter-Function Correlations in Schizophrenia during a Working Memory Task.
Frontiers in human neuroscience. 5 (July). p.p. 71.
Mietchen, D. & Gaser, C. (2009). Computational morphometry for detecting changes in brain
structure due to development, aging, learning, disease and evolution. Frontiers in
neuroinformatics. 3 (August). p.p. 25.
Morey, R.A., Petty, C.M., Xu, Y., Hayes, J.P., Wagner, H.R., Lewis, D. V, LaBar, K.S., Styner,
M. & McCarthy, G. (2009). A comparison of automated segmentation and manual tracing
for quantifying hippocampal and amygdala volumes. NeuroImage. 45 (3). p.pp. 855–66.
Muhlert, N. & Ridgway, G.R. (2015). Failed replications, contributing factors and careful
interpretations: Commentary on “A purely confirmatory replication study of structural brain-
behaviour correlations” by Boekel etal., 2015. Cortex. 4. p.pp. 4–8.
Natekin, A. & Knoll, A. (2013). Gradient boosting machines, a tutorial. Frontiers in Neurorobotics. 7
(DEC).
Nelder, J.A., Baker, R.J., Nelder, J.A. & Baker, R.J. (2006). Generalized Linear Models. In:
Encyclopedia of Statistical Sciences. Hoboken, NJ, USA: John Wiley & Sons, Inc.
Niebles, J.C., Wang, H. & Fei-Fei, L. (2008). Unsupervised Learning of Human Action Categories
141
Using Spatial-Temporal Words. International Journal of Computer Vision. 79 (3). p.pp. 299–318.
Nishida, S., Narumoto, J., Sakai, Y., Matsuoka, T., Nakamae, T., Yamada, K., Nishimura, T. &
Fukui, K. (2011). Anterior insular volume is larger in patients with obsessive–compulsive
disorder. Progress in Neuro-Psychopharmacology and Biological Psychiatry. 35 (4). p.pp. 997–1001.
Nordahl, C.W., Dierker, D., Mostafavi, I., Schumann, C.M., Rivera, S.M., Amaral, D.G. & Essen,
D.C. Van (2007). Cortical Folding Abnormalities in Autism Revealed by Surface-Based
Morphometry. The Journal of Neuroscience. 27 (43). p.pp. 11725–11735.
Nordahl, C.W., Scholz, R., Yang, X., Buonocore, M.H., Simon, T., Rogers, S. & Amaral, D.G.
(2012). Increased Rate of Amygdala Growth in Children Aged 2 to 4 Years With Autism Spectrum Disorders.
69 (1). p.pp. 53–61.
Nordenskjöld, R., Malmberg, F., Larsson, E.-M., Simmons, A., Brooks, S.J., Lind, L., Ahlström,
H., Johansson, L. & Kullberg, J. (2013). Intracranial volume estimated with commonly used
methods could introduce bias in studies including brain volume measurements. NeuroImage.
83. p.pp. 355–60.
O’Brien, J.T. (2007). Role of imaging techniques in the diagnosis of dementia. The British Journal of
Radiology. 80 (special_issue_2). p.pp. S71–S77.
Padilla, N., Eklöf, E., Mårtensson, G.E., Bölte, S., Lagercrantz, H. & Ådén, U. (2015). Poor Brain
Growth in Extremely Preterm Neonates Long Before the Onset of Autism Spectrum Disorder
Symptoms. Cerebral Cortex. (February 2016). p.pp. 1–8.
Pedregosa, F. & Varoquaux, G. (2011). Scikit-learn: Machine Learning in Python. Journal of
Machine Learning Research. 12. p.pp. 2825–2830.
Peters, J.M., Sahin, M., Vogel-Farley, V.K., Jeste, S.S., Nelson, C.A., Gregas, M.C., Prabhu, S.P.,
Scherrer, B. & Warfield, S.K. (2012). Loss of White Matter Microstructural Integrity Is
Associated with Adverse Neurological Outcome in Tuberous Sclerosis Complex. Academic
142
Radiology. 19 (1). p.pp. 17–25.
Pichler, B.J., Kolb, A., Nägele, T. & Schlemmer, H.-P. (2010). PET/MRI: paving the way for the
next generation of clinical multimodality imaging applications. Journal of nuclear medicine : official
publication, Society of Nuclear Medicine. 51 (3). p.pp. 333–336.
Pickles, A., Le Couteur, A., Leadbitter, K., Salomone, E., Cole-Fletcher, R., Tobin, H., Gammer,
I., Lowry, J., Vamvakas, G., Byford, S., Aldred, C., Slonims, V., McConachie, H., Howlin,
P., Parr, J.R., Charman, T. & Green, J. (2016). Parent-mediated social communication
therapy for young children with autism (PACT): long-term follow-up of a randomised
controlled trial. The Lancet. 388 (10059). p.pp. 2501–2509.
Pooley, R.A. (2005). Fundamental Physics of MR Imaging. RadioGraphics. 25 (4). p.pp. 1087–1099.
Pringle, B., Colpe, L.J., Blumberg, S.J., Avila, R.M. & Kogan, M.D. (2012). Diagnostic history and
treatment of school-aged children with autism spectrum disorder and special health care
needs. NCHS Data Brief. (97). p.pp. 1–8.
Puterman, M.L. & L., M. (1994). Markov decision processes : discrete stochastic dynamic programming. Wiley.
Rabiner, L.R. & Rabiner, L.R. (1989). A tutorial on hidden Markov models and selected
applications in speech recognition. PROCEEDINGS OF THE IEEE. 77 (2). p.pp. 257--286.
Radua, J., Via, E., Catani, M. & Mataix-Cols, D. (2011). Voxel-based meta-analysis of regional
white-matter volume differences in autism spectrum disorder versus healthy controls.
Psychological medicine. 41 (7). p.pp. 1539–1550.
Raichle, M.E. (1998). Behind the scenes of functional brain imaging: A historical and physiological
perspective. Proceedings of the National Academy of Sciences. 95 (3). p.pp. 765–772.
Rajagopalan, V., Yue, G.H. & Pioro, E.P. (2014). Do preprocessing algorithms and statistical
models influence voxel-based morphometry (VBM) results in amyotrophic lateral sclerosis
patients? A systematic comparison of popular VBM analytical methods. Journal of Magnetic
143
Resonance Imaging. 40 (3). p.pp. 662–667.
Ranzato, M., Huang, F.J., Boureau, Y.-L. & LeCun, Y. (2007). Unsupervised Learning of
Invariant Feature Hierarchies with Applications to Object Recognition. In: 2007 IEEE
Conference on Computer Vision and Pattern Recognition. June 2007, IEEE, pp. 1–8.
Rather, A.M., Sastry, V.N. & Agarwal, A. (2017). Stock market prediction and Portfolio selection
models: a survey. OPSEARCH. p.pp. 1–22.
Raznahan, A., Wallace, G.L., Antezana, L., Greenstein, D., Lenroot, R., Thurm, A., Gozzi, M.,
Spence, S., Martin, A., Swedo, S.E. & Giedd, J.N. (2013). Compared to what? Early brain
overgrowth in autism and the perils of population norms. Biological Psychiatry. 74 (8). p.pp.
563–575.
Redcay, E. & Courchesne, E. (2005). When is the brain enlarged in autism? A meta-analysis of all
brain size reports. Biological Psychiatry. 58 (1). p.pp. 1–9.
Ren, J., Sneller, B., Rueckert, D., Hajnal, J., Heckemann, R., Smith, S., Vickers, J. & Hill, D.
(2005). A comparison of the tissue classification and the segmentation propagation techniques
in MRI brain image segmentation Jinsong J. M. Fitzpatrick & J. M. Reinhardt (eds.).
Proceedings of SPIE. 5747. p.pp. 1682–1691.
Retico, A., Gori, I., Giuliano, A., Muratori, F. & Calderoni, S. (2016). One-class support vector
machines identify the language and default mode regions as common patterns of structural
alterations in young children with autism spectrum disorders. Frontiers in Neuroscience. 10 (JUN).
Riddle, K., Cascio, C.J. & Woodward, N.D. (2016). Brain structure in autism: a voxel-based
morphometry analysis of the Autism Brain Imaging Database Exchange (ABIDE). Brain
Imaging and Behavior.
Ridgway, G., Barnes, J., Pepple, T. & Fox, N. (2011). Estimation of total intracranial volume; a
comparison of methods. Alzheimer’s & Dementia. 7 (4). p.pp. S62–S63.
144
Rogers, S.J. & Vismara, L. a (2010). Evidence-Based Comprehensive Treatments for Early Autism.
Journal of Clinical Child and Adolescent Psychology. 37 (1). p.pp. 8–38.
Ronald, A., Happe, F., Bolton, P., Butcher, L.M., Price, T.S., Wheelwright, S., Baron-Cohen, S.
& Plomin, R. (2006). Genetic heterogeneity between the three components of the autism
spectrum: a twin study. Journal of the American Academy of Child & Adolescent Psychiatry. 45 (6).
p.pp. 691–699.
Sabuncu, M.R. & Konukoglu, E. (2014). Clinical Prediction from Structural Brain MRI Scans: A
Large-Scale Empirical Study. Neuroinformatics. 13 (1). p.pp. 31–46.
Sandin, S., Lichtenstein, P., Larsson, H., Cm, H. & Reichenberg, A. (2014). The familial risk of
autism. 311 (17). p.p. 24794370.
Sandin, S., Schendel, D., Magnusson, P., Hultman, C., Surén, P., Susser, E., Grønborg, T.,
Gissler, M., Gunnes, N., Gross, R., Henning, M., Bresnahan, M., Sourander, a, Hornig, M.,
Carter, K., Francis, R., Parner, E., Leonard, H., Rosanoff, M., Stoltenberg, C. &
Reichenberg, a (2015). Autism risk associated with parental age and with increasing
difference in age between the parents. Molecular psychiatry. (April). p.pp. 1–8.
Satterthwaite, F.E. (1946). An approximate distribution of estimates of variance components.
Biometrics. 2 (6) p.pp. 110–114.
Schaer, M., Bach Cuadra, M., Tamarit, L., Lazeyras, F., Eliez, S. & Thiran, J.P. (2008). A Surface-
based approach to quantify local cortical gyrification. IEEE Transactions on Medical Imaging. 27
(2). p.pp. 161–170.
Schmidt, R.J., Tancredi, D.J., Krakowiak, P., Hansen, R.L. & Ozonoff, S. (2014). Maternal intake
of supplemental iron and risk of autism spectrum disorder. American Journal of Epidemiology. 180
(9). p.pp. 890–900.
Schumann, C.M., Bloss, C.S., Barnes, C.C., Wideman, G.M., Carper, R.A., Akshoomoff, N.,
145
Pierce, K., Hagler, D., Schork, N., Lord, C. & Courchesne, E. (2010). Longitudinal magnetic
resonance imaging study of cortical development through early childhood in autism. The
Journal of neuroscience : the official journal of the Society for Neuroscience. 30 (12). p.pp. 4419–27.
Ségonne, F., Dale, a M., Busa, E., Glessner, M., Salat, D., Hahn, H.K. & Fischl, B. (2004). A
hybrid approach to the skull stripping problem in MRI. NeuroImage. 22 (3). p.pp. 1060–75.
Shen, M.D., Nordahl, C.W., Young, G.S., Wootton-Gorges, S.L., Lee, A., Liston, S.E.,
Harrington, K.R., Ozonoff, S. & Amaral, D.G. (2013). Early brain enlargement and elevated
extra-axial fluid in infants who develop autism spectrum disorder. Brain. 136 (9). p.pp. 2825–
2835.
Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., Schrittwieser, J.,
Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J.,
Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T. &
Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search.
Nature. 529 (7587). p.pp. 484–489.
Smith, S.M. (2002). Fast robust automated brain extraction. Human brain mapping. 17 (3). p.pp. 143–
55.
Smith, S.M., Zhang, Y., Jenkinson, M., Chen, J., Matthews, P.M., Federico, A. & De Stefano, N.
(2002). Accurate, Robust, and Automated Longitudinal and Cross-Sectional Brain Change
Analysis. NeuroImage. 17 (1). p.pp. 479–489.
Sordo, M. & Zeng, Q. (2005). On Sample Size and Classification Accuracy: A Performance
Comparison. Lect Notes Comput Sc. 3745. p.pp. 193–201.
Stanfield, A.C., McIntosh, A.M., Spencer, M.D., Philip, R., Gaur, S. & Lawrie, S.M. (2008).
Towards a neuroanatomy of autism: a systematic review and meta-analysis of structural
magnetic resonance imaging studies. European psychiatry : the journal of the Association of European
146
Psychiatrists. 23 (4). p.pp. 289–99.
Strobl, C., Malley, J. & Gerhard, T. (2009). An Introduction to Recursive Partitioning: Rationale,
Application Psychol Methods. Psychological methods. 14 (4). p.pp. 323–348.
Styner, M.A., Charles, H.C., Park, J. & Gerig, G. (2002). Multi-site validation of image analysis
methods - Assessing intra and inter-site variability. In: SPIE. 2002, San Diego, pp. 1–9.
Takao, H., Hayashi, N. & Ohtomo, K. (2011). Effect of scanner in longitudinal studies of brain
volume changes. Journal of magnetic resonance imaging : JMRI. 34 (2). p.pp. 438–44.
Talairach, J. & Tournoux, P. (1988). Co-planar stereotaxic atlas of the human brain. 3-Dimensional
proportional system: an approach to cerebral imaging. New York: Thieme.
Tallinen, T., Chung, J.Y., Biggins, J.S. & Mahadevan, L. (2014). Gyrification from constrained
cortical expansion. Proceedings of the National Academy of Sciences of the United States of America. 111
(35). p.pp. 12667–72.
Tallinen, T., Chung, J.Y., Rousseau, F., Girard, N., Lefèvre, J. & Mahadevan, L. (2016). On the
growth and form of cortical convolutions. Nature Physics. 476 (7358). p.pp. 57–62.
Tanga, Y., Hojatkashanib, C., Dinovb, I.D., Suna, B., Fana, L., Lina, X., Qic, H., Huab, X., Liua,
S. & Toga, A.W. (2010). A morphometric comparison study between Chinese and Caucasian
cohorts. NeuroImage. 51 (1). p.pp. 33–41.
Tondelli, M., Wilcock, G.K., Nichelli, P., de Jager, C.A., Jenkinson, M. & Zamboni, G. (2012).
Structural MRI changes detectable up to ten years before clinical Alzheimer’s disease.
Neurobiology of Aging. 33 (4). p.p. 825.e25-825.e36.
Townsend, D.W. & Cherry, S.R. (2001). Combining anatomy and function: The path to true
image fusion. European Radiology. 11 (10). p.pp. 1968–1974.
Tsang, O., Gholipour, A., Kehtarnavaz, N., Member, S., Gopinath, K., Briggs, R. & Panahi, I.
(2008). Comparison of tissue segmentation algorithms in neuroimage analysis software tools.
147
In: IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI). 2008, pp.
3924–3928.
Uddin, L.Q., Menon, V., Young, C.B., Ryali, S., Chen, T., Khouzam, A., Minshew, N.J. &
Hardan, A.Y. (2011). Multivariate searchlight classification of structural magnetic resonance
imaging in children and adolescents with autism. Biological Psychiatry. 70. p.pp. 833–841.
Valk, S.L., Di Martino, A., Milham, M.P. & Bernhardt, B.C. (2015). Multicenter mapping of
structural network alterations in autism. Human Brain Mapping. 36 (6). p.pp. 2364–2373.
VanEssen, D.C. & Drury, H. a (1997). Structural and functional analyses of human cerebral cortex
using a surface-based atlas. Journal of Neuroscience. 17 (18). p.pp. 7079–7102.
Via, E., Radua, J., Cardoner, N., Happé, F. & Mataix-Cols, D. (2011). Meta-analysis of gray
matter abnormalities in autism spectrum disorder: should Asperger disorder be subsumed
under a broader umbrella of autistic spectrum disorder? Archives of general psychiatry. 68 (4). p.pp.
409–418.
Vogelstein, J.T., Park, Y., Ohyama, T., Kerr, R.A., Truman, J.W., Priebe, C.E. & Zlatic, M.
(2014). Discovery of Brainwide Neural-Behavioral Maps via Multiscale Unsupervised
Structure Learning. Science. 344 (6182). p.pp. 386--392.
Volkmar, F.R., State, M. & Klin, A. (2009). Autism and autism spectrum disorders: Diagnostic
issues for the coming decade. Journal of Child Psychology and Psychiatry and Allied Disciplines. 50 (1–
2). p.pp. 108–115.
Walther, K., Birdsill, A.C., Glisky, E.L. & Ryan, L. (2009). Structural brain differences and
cognitive functioning related to body mass index in older females. Human Brain Mapping. 31
(7). p.pp. 1052–1064.
Wee, C.-Y., Wang, L., Shi, F., Yap, P.-T. & Shen, D. (2014). Diagnosis of autism spectrum
disorders using regional and interregional morphological features. Human brain mapping. 35 (7).
148
p.pp. 3414–30.
Wolff, J.J., Gerig, G., Lewis, J.D., Soda, T., Styner, M.A., Vachet, C., Botteron, K.N., Elison, J.T.,
Dager, S.R., Estes, A.M., Hazlett, H.C., Schultz, R.T., Zwaigenbaum, L. & Piven, J. (2015).
Altered corpus callosum morphology associated with autism over the first 2 years of life. Brain.
138 (7). p.pp. 2046–2058.
Xiao, X., Fang, H., Wu, J., Xiao, C., Xiao, T., Qian, L., Liang, F., Xiao, Z., Chu, K.K. & Ke, X.
(2016). Diagnostic model generated by MRI-derived brain features in toddlers with autism
spectrum disorder. Autism Research. p.pp. 1–11.
Zablotsky, B., Black, L.I., Maenner, M.J., Schieve, L.A. & Blumberg, S.J. (2015). Estimated
Prevalence of Autism and Other Developmental Disabilities Following Questionnaire
Changes in the 2014 National Health Interview Survey. National Health Statistics Reports. (87).
p.pp. 1–21.
Zarei, M., Patenaude, B., Damoiseaux, J., Morgese, C., Smith, S., Matthews, P.M., Barkhof, F.,
Rombouts, S., Sanz-Arigita, E. & Jenkinson, M. (2010). Combining shape and connectivity
analysis: An MRI study of thalamic degeneration in Alzheimer’s disease. NeuroImage. 49 (1).
p.pp. 1–8.
Zhang, Y., Brady, M. & Smith, S. (2001). Segmentation of brain MR images through a hidden
Markov random field model and the expectation-maximization algorithm. IEEE transactions
on medical imaging. 20 (1). p.pp. 45–57.
Zinkstok, J., Kolind, S., D’Almeida, V., Shahidiani, A., Williams, S.C., Murphy, D.G. & Deoni,
S.C. (2012). Is Myelin Content Altered In Young Adults with Autism? In: INSAR. 2012,
Toronto.
Zwaigenbaum, L., Young, G.S., Stone, W.L., Dobkins, K., Ozonoff, S., Brian, J., Bryson, S.E.,
Carver, L.J., Hutman, T., Iverson, J.M., Landa, R.J. & Messinger, D. (2014). Early Head
149
Growth in Infants at Risk of Autism: A Baby Siblings Research Consortium Study. Journal of
American Academy of Child & Adolescent Psychiatry. 53 (10). p.pp. 1053–62.
150
7 Appendix A
Table 7.1: Estimated brain volumes and inter-method differences in NYU

TIV GM WM CSF
SPM (L) 1.517 ± 0.16 0.730 ± 0.07 0.507 ± 0.06 0.280 ± 0.04
SPM – FSL mean diff. (ml) 197.4 65.0 15.22 126.6
SPM
Correlation Coefficient 0.885 0.698 0.896 0.710
vs.
Cohen’s d 1.27 0.85 0.26 3.51
FSL
Paired t-test p-value 4E-74* 3E-29* 4E-11* <E-100*
FSL (L) 1.320 ± 0.15 0.665 ± 0.08 0.492 ± 0.06 0.154 ± 0.03
FSL – FS mean diff. (ml) -180.8 -41.6 11.8 NA
FSL
vs.
Cohen’s d -1.12 -0.53 0.18 NA
FS
Paired t-test p-value 7E-64* 2E-54* 5E-9* NA
FS (L) 1.501 ± 0.17 0.706 ± 0.08 0.480 ± 0.07 NA
SPM – FS mean diff. (ml) 16.6 23.4 27.0 NA
SPM
vs.
Cohen’s d 0.1 0.31 0.44 NA
FS
Paired t-test p-value 0.013* 9E-9* 1E-39* NA
Mean and standard deviation of the brain volumes estimated by SPM, FSL, and FS are presented. Cells
corresponding to CSFFS are filled as ‘NA’ since FS does not output total CSF volume. Inter-method
differences and corresponding statistics are presented in shaded cells. Correlation coefficient is used to
measure the association between the brain volumes estimated by two different methods. Cohen’s d is used
to measure the effect size of the inter-method difference and paired t-test was used to check the statistical
significance. Statistically significant differences are denoted by * for p < 0.05. Correlation coefficients show
that methods agree on volume estimates but the Cohen’s d and paired t-test p-values indicate significant
inter-method biases.
151
Table 7.2: ASD vs. TDC brain volume differences in NYU
TIV GM WM CSF
diff diff diff diff diff diff diff diff
p-val p-val p-val p-val
(ml) % (ml) % (ml) % (ml) %
Raw Brain Volumes
SPM 5.17 0.34 0.84 9.3 10.28 0.42 -4.4 -0.87 0.63 0.29 0.11 0.96
FSL -15.4 -1.16 0.52 1.7 0.25 0.90 -12.6 -2.54 0.20 -7.1 -4.52 0.14
FS -44.1 -2.9 0.10 5.4 0.77 0.66 -10.8 -2.22 0.31 NA NA NA
Adjusted for age and sex
SPM -0.81 -0.05 0.97 2.4 0.33 0.81 -4.89 -0.96 0.52 1.69 0.60 0.76
FSL -22.5 -1.70 0.28 -15.0 -2.26 0.16 -14.4 -2.89 0.10 -2.7 -1.73 0.36
FS -51.9 -3.42 0.03* -10.1 -1.44 0.34 -10.3 -2.12 0.24 NA NA NA
Adjusted for age, sex, and FIQ
SPM 10.9 0.72 0.61 8.1 1.12 0.41 -0.8 -0.15 0.92 3.6 1.28 0.52
FSL -8.9 -0.67 0.66 -9.5 -0.53 0.37 -9.3 -1.43 0.28 -1.3 -1.87 0.66
FS -37.0 -2.44 0.12 -3.6 -0.52 0.72 -5.2 -1.08 0.55 NA NA NA
Mean (ASD – TDC) difference (diff) in brain volume estimates according to SPM, FSL, and FS.
Percentage ASD vs. TDC group difference was calculated as 100*(𝐴𝑆𝐷 − 𝑇𝐷𝐶)/𝑇𝐷𝐶, where
𝑇𝐷𝐶 is the group mean of the TDC subjects. Cells corresponding to CSFFS are filled as ‘NA’ since
FS does not output total CSF volume. Statistically significant differences are denoted by * for
p<0.05. ASD vs. TDC differences are dependent upon the method used and only in SPM TIV,
GM and CSF volumes in ASD were significantly larger than TDC.
152
Table 7.3: Differential bias for diagnostic group (ASD) in NYU
TIV GM WM CSF
bias bias beta bias bias beta bias bias beta bias bias beta
(ml) % p-val (ml) % p-val (ml) % p-val (ml) % p-val
S
SPM
2 1.64 0.13 17 2.6 0.09 10 1.9 0.09 4.4 2.90 0.36
vs.
FSL#
F
FSL
21 1.96 0.09 -5 0.70 0.23 -4 0.85 0.30 NA NA NA
vs.
FS#
S
PM
51 3.4 0.001* 12.5 1.8 0.09 5.4 1.1 0.13 NA NA NA
vs.
FS#
# reference method
bias (ml): brain volume (in ml) by which a method overestimates in ASD subjects than in TDCs.
% bias: the percentage of brain volume by which a method overestimates in ASD subjects than in TDCs.
153
8 Appendix B
A) ASD/TDC subjects numbers matched by upsampling the smaller group in each training fold
AS VIQ age
0.9
0.8
0.7
0.76
AUC=0.74 60/373 0.69
AUC
0.6 #ASD/TDC=20/373
76/373 0.65 0.67
146/142 0.62 116/120 0.66
0.5
0.71 93/133 121/138
57/15
0.4 0.51
114/108
0.3
B) ASD/TDC subjects numbers matched by downsampling the larger group

AS VIQ
1.0
0.9 age, VIQ matched age matched
0.8
AUC=0.92
#ASD/TDC=20/20
0.7 0.8
AUC
60/60
0.6 0.68 0.8
16/16
76/76 0.63
0.5 142/142
0.54
0.4 93/93
0.3
4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10 75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150
Figure 8.1: Gradient Boosting Machine: Improvement in classification by sub-grouping

based on autism severity (AS), age and Verbal IQ (VIQ).
The results are similar to that of Random Forest which are presented in Figure 1. A point represents the
mean and an error bar represents the one standard deviation of the AUC scores from 10 test folds. A)
Smaller classes were up-sampled in each training fold to balance the number of ASD & TDC subjects. Sub-
grouping improved the classification with the most and least improvements from sub-grouping by AS and
age respectively. B) Larger classes were down-sampled matching the demographics of the smaller classes.
This scheme further improved the classification performance.
154
A) AS
4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10
1.13**Optic.Chiasm_volume 0.2 Right.Accumbens.area_volume 0.63**Right.choroid.plexus_volume
1.31**rh_medialorbitofrontal_area 0.73**lh_parahippocampal_volume 0.21 SubCortGrayVol_volume
1.65**lh_lateralorbitofrontal_area 0.78**CSF_volume 0.48**X3rd.Ventricle_volume
1.75**lh_fusiform_thicknessstd 0.78**X3rd.Ventricle_volume 0.63**rh_medialorbitofrontal_thicknessstd
1.31**rh_parsopercularis_thicknessstd 0.59**rh_posteriorcingulate_volume 0.47**rh_entorhinal_thickness
1.38**lh_bankssts_thicknessstd 0.45* rh_rostralanteriorcingulate_thickness 0.42* lh_parsorbitalis_thickness
1.2** rh_pericalcarine_thicknessstd 0.8** lh_fusiform_thicknessstd 0.32* rh_insula_thicknessstd
1.07**rh_parstriangularis_thicknessstd 0.62**rh_lateralorbitofrontal_thicknessstd 0.42* rh_medialorbitofrontal_thickness
1.17**lh_inferiortemporal_thickness 0.92**rh_inferiorparietal_meancurv 0.37* lh_posteriorcingulate_thicknessstd
1.13**rh_lateraloccipital_gauscurv 0.09 rh_caudalmiddlefrontal_gauscurv 0.57**lh_frontalpole_meancurv
B) VIQ
75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150
1.04* rh_temporalpole_volume 0.14 rh_parahippocampal_volume 0.36* Right.vessel_volume
1.66**rh_parsopercularis_area 0.26* lh_insula_volume 0.32* Right.Hippocampus_volume
0.92* rh_parahippocampal_area 0.01 rh_parsorbitalis_volume 0.09 lh_isthmuscingulate_volume
1.03* lh_rostralanteriorcingulate_area 0.3* CSF_volume 0.25 lhCorticalWhiteMatterVol_volume
1.44**rh_medialorbitofrontal_area 0.15 rh_caudalmiddlefrontal_volume 0.14 lh_transversetemporal_thicknessstd
1.43**rh_bankssts_area 0.37**Right.Lateral.Ventricle_volume 0.5** lh_frontalpole_thicknessstd
1.45**lh_precentral_area 0.26* rh_entorhinal_thicknessstd 0.36* lh_superiortemporal_thicknessstd
1.48**rh_WhiteSurfArea_area 0.01 rh_pericalcarine_foldind 0.32* lh_frontalpole_gauscurv
0.13 lh_rostralanteriorcingulate_foldind 0.13 lh_supramarginal_foldind 0.02 rh_lateralorbitofrontal_gauscurv
0.19 rh_parsopercularis_foldind 0.22 lh_frontalpole_gauscurv 0.11 lh_pericalcarine_gauscurv
C) Age (years)
6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40
0.1 Left.Cerebellum.White.Matter_volume0.35* CC_Mid_Anterior_volume 0.5** lh_transversetemporal_volume
0.15 Right.Accumbens.area_volume 0.48**CC_Central_volume 0.45**lh_inferiortemporal_volume
0.01 lh_superiortemporal_volume 0.39**X3rd.Ventricle_volume 0.49**rh_bankssts_volume
0.14 lh_transversetemporal_area 0.18 rh_bankssts_volume 0.29* Left.Inf.Lat.Vent_volume
0.4** rh_rostralanteriorcingulate_area 0.42**lh_lingual_thickness 0.32* Right.Hippocampus_volume
0.23 rh_insula_foldind 0.34* rh_lingual_thicknessstd 0.42**rh_superiortemporal_volume
0.2 lh_lateralorbitofrontal_foldind 0.29* lh_pericalcarine_thickness 0.21 Right.vessel_volume
0.29* lh_fusiform_foldind 0.26 lh_caudalanteriorcingulate_thickness 0.12 rh_parahippocampal_volume
0.28* lh_frontalpole_gauscurv 0.22 rh_precuneus_thicknessstd 0.17 lh_medialorbitofrontal_thicknessstd
0.18 lh_lateralorbitofrontal_gauscurv 0.22 lh_bankssts_meancurv 0.38**lh_pericalcarine_meancurv
0 25 50 75 100 0 25 50 75 100 0 25 50 75 100

D) ALL Subjects
ALL
0.24**CSF_volume
Morphometric Attributes
0.14 Left.Amygdala_volume
volume
0.02 rh_parahippocampal_volume area
0.25**Left.Lateral.Ventricle_volume thickness
0.07 rh_bankssts_volume foldind
0.23**Left.Inf.Lat.Vent_volume meancurv
gauscurv
0.04 rh_medialorbitofrontal_area
0.04 rh_parstriangularis_area Group Difference
0.18* rh_transversetemporal_thicknessstd a TDC > ASD
a ASD > TDC
0 25 50 75 100
Figure 8.2. Gradient Boosting Machine (GBM): Important features for classification are
variable across sub-groups.
Top 10 important features for autism spectrum disorder (ASD) vs. typically developing controls (TDC)
classification in each sub-group are presented. Each feature is represented by a colored bar; the length of
the bar represents the relative % importance for classification with respect to the top feature. The features
have been grouped and color-coded by volume, area, thickness mean, thickness standard deviation, folding
index, mean curvature and Gaussian curvature. Before each feature, Cohen’s d and two sample t-test
significance (P<0.005** and P<0.05*) of ASD vs. TDC group difference are presented. Important features
for classification were similar to that from random forest presented in Fig 2. The important features highly
varied across the sub-groups demonstrating the heterogeneity in ASD brain morphometry.
155
a inter-subgroup classification a intra-subgroup classification
Autism Severity (AS)

4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10
1.0
0.9 0.92
AUC scores
0.8 0.81
0.7 0.71 0.69
0.69 0.68
0.6
0.63
0.52
0.5 0.49
4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10 4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10 4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10
Training sub-groups
Verbal (VIQ)
75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150
0.8
0.75
0.7
AUC scores
0.64 0.63 0.62

0.6
0.54 0.54 0.54
0.5
0.44
0.4
0.35
0.3
75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150 75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150 75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150
Training sub-groups
Age (years)
6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40
0.7
0.66 0.65
AUC scores
0.6
0.62
0.57
0.53 0.52
0.5 0.5
0.47
0.43
0.4
6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40 6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40 6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40
Training sub-groups
Figure 8.3: Inter and intra sub-groups classification AUC scores

Mean and standard deviations of AUC scores when classification models trained in different sub-groups
were tested on each other. The title of each sub-plot represents the sub-group on which the testing was
performed. Red data points correspond to when training and testing were performed on the same sub-
group (intra-subgroup) and black data points correspond to when training and testing were performed on
different sub-groups (inter-subgroup).
Intra-subgroup classification: A random forest classification model was trained in each sub-group
under the 10-fold cross-validation framework. Inter-subgroup classification: A random forest
classification model trained in each sub-group was tested on 200 bootstrap replications of the test sub-group.
Intra sub-groups AUC scores were much larger than inter sub-groups AUC scores in 16 out of 18
comparisons. The AUC scores decreased with the increasing distance between training sub-group and test
sub-group.
156
A) AS B) VIQ C) Age (years) D) ALL Subjects
4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10 75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150 6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40 ALL
1.65**lh_lateralorbitofrontal_area 0.78**X3rd.Ventricle_volume 0.63**Right.choroid.plexus_volume 1.66**rh_parsopercularis_area 0.14 rh_parahippocampal_volume 0.25 lhCorticalWhiteMatterVol_volume 0.1 Left.Cerebellum.White.Matter_volume
0.48**CC_Central_volume 0.5** lh_transversetemporal_volume 0.25**Left.Lateral.Ventricle_volume
1.48**rh_WhiteSurfArea_area 0.26* lh_insula_volume 0.24 CorticalWhiteMatterVol_volume 0.02 rh_parahippocampal_volume
0.4** rh_rostralanteriorcingulate_area 0.35* CC_Mid_Anterior_volume 0.49**rh_bankssts_volume
1.31**rh_medialorbitofrontal_area 0.73**lh_parahippocampal_volume 0.21 SubCortGrayVol_volume 0.24**CSF_volume
1.33**lh_WhiteSurfArea_area 0.01 rh_parsorbitalis_volume 0.32* Right.Hippocampus_volume 0.23 rh_insula_foldind 0.39**X3rd.Ventricle_volume 0.45**lh_inferiortemporal_volume 0.23**Left.Inf.Lat.Vent_volume
1.75**lh_fusiform_thicknessstd 0.77**rhSurfaceHoles_volume 0.48**X3rd.Ventricle_volume 1.44**rh_medialorbitofrontal_area 0.42**Left.Lateral.Ventricle_volume 0.3* Left.Hippocampus_volume 0.2 lh_lateralorbitofrontal_foldind 0.36* Left.Lateral.Ventricle_volume 0.42**rh_superiortemporal_volume 0.14 Left.Amygdala_volume
1.38**lh_bankssts_thicknessstd 0.87**lh_middletemporal_thickness 0.29 Right.Putamen_volume 0.92* rh_parahippocampal_area 0.37**Right.Lateral.Ventricle_volume 0.36* Right.vessel_volume 0.29* lh_fusiform_foldind 0.23 lhCorticalWhiteMatterVol_volume 0.34* lhCortexVol_volume 0.17* CorticalWhiteMatterVol_volume
0.13 lh_rostralanteriorcingulate_foldind 0.3* CSF_volume 0.22 rhCorticalWhiteMatterVol_volume 0.17* lhCorticalWhiteMatterVol_volume
1.17**lh_inferiortemporal_thickness 0.92**rh_inferiorparietal_meancurv 0.63**rh_medialorbitofrontal_thicknessstd 0.19 lh_superiortemporal_foldind 0.3* Right.Lateral.Ventricle_volume 0.27* Left.Putamen_volume
0.15 rh_caudalmiddlefrontal_volume 0.2 rh_parahippocampal_volume 0.07 rh_bankssts_volume
0.4** lh_lateraloccipital_foldind 0.18 rh_bankssts_volume 0.32* Right.Hippocampus_volume 0.23**X3rd.Ventricle_volume
1.22**lh_inferiorparietal_meancurv 0.83**lh_superiortemporal_meancurv 0.65**rh_middletemporal_thicknessstd 0.07 Left.Amygdala_volume 0.05 CC_Mid_Anterior_volume
0.18 lh_lateralorbitofrontal_gauscurv 0.21 rhCorticalWhiteMatterVol_volume 0.41**Left.Amygdala_volume 0.22**Right.Lateral.Ventricle_volume Morphometric Attributes
1.13**rh_lateraloccipital_gauscurv 0.8** lh_middletemporal_meancurv 0.61**rh_parsopercularis_thicknessstd 0.25* lh_parsopercularis_volume 0.23 lh_parstriangularis_area 0.16* rhCorticalWhiteMatterVol_volume volume
0.28* lh_frontalpole_gauscurv 0.1 rh_parahippocampal_volume 0.33* CortexVol_volume
0.29* Right.Inf.Lat.Vent_volume 0.21 lh_temporalpole_area 0.18* Right.choroid.plexus_volume area
0.5* rh_precentral_meancurv 0.42* lh_superiorparietal_thicknessstd 0.01 rh_medialorbitofrontal_gauscurv 0.18 Left.Thalamus.Proper_volume 0.31* rhCortexVol_volume
0.08 rh_parahippocampal_area 0.09 rh_precentral_area 0.08 CC_Mid_Anterior_volume thickness
0.02 rh_lingual_gauscurv 0.23 CorticalWhiteMatterVol_volume 0.33* Left.Hippocampus_volume 0.19* Right.Inf.Lat.Vent_volume foldind
0.89**rh_inferiortemporal_meancurv 0.44* rh_parsorbitalis_thickness 0.35**rh_parahippocampal_thickness 0.5** lh_frontalpole_thicknessstd
0.35* lh_fusiform_gauscurv 0.03 lh_caudalmiddlefrontal_area 0.41**rh_transversetemporal_volume 0.17* rh_pericalcarine_volume meancurv
0.92**lh_lateraloccipital_meancurv 0.54**rh_parstriangularis_thicknessstd 0.26* rh_entorhinal_thicknessstd 0.46**lh_inferiortemporal_thicknessstd
0.25 rh_parsopercularis_gauscurv 0.42**lh_lingual_thickness 0.29* Left.Inf.Lat.Vent_volume 0.04 rh_parstriangularis_area gauscurv
0.26* rh_precentral_thickness 0.36* lh_superiortemporal_thicknessstd 0.04 rh_medialorbitofrontal_area
0.63**lh_inferiortemporal_meancurv 0.42* rh_medialorbitofrontal_thickness 0.48**rh_paracentral_gauscurv 0.34* rh_lingual_thicknessstd 0.12 rh_parahippocampal_volume Group Difference
0.09 lh_parsopercularis_foldind 0.45**rh_paracentral_thickness 0.04 rh_isthmuscingulate_area
0.09 rh_caudalmiddlefrontal_gauscurv 0.39* rh_lateralorbitofrontal_thickness 0.23 rh_insula_gauscurv 0.29* lh_pericalcarine_thickness 0.33* TotalGrayVol_volume 0.14 lh_transversetemporal_area a TDC > ASD
0.13 lh_supramarginal_foldind 0.14 lh_transversetemporal_thicknessstd
0.09 rh_frontalpole_gauscurv 0.26 lh_caudalanteriorcingulate_thickness0.17 lh_medialorbitofrontal_thicknessstd 0.11 rh_transversetemporal_area a ASD > TDC
0.3 lh_supramarginal_gauscurv 0.58**rh_parsopercularis_thickness 0.07 rh_precuneus_foldind 0.37* lh_medialorbitofrontal_thicknessstd
0.22 rh_precuneus_thicknessstd 0.43**lh_inferiortemporal_thickness 0.25**rh_lingual_thicknessstd
0.23 rh_middletemporal_foldind 0.35* rh_superiorparietal_thicknessstd 0.14 lh_inferiortemporal_thicknessstd
0.47**rh_entorhinal_thickness 0.29* rh_paracentral_thickness 0.4** lh_superiortemporal_thickness
0.01 rh_pericalcarine_foldind 0.42* rh_fusiform_thicknessstd 0.2* lh_isthmuscingulate_thicknessstd
0.67**lh_middletemporal_meancurv 0.22 lh_frontalpole_gauscurv 0.3* rh_inferiortemporal_foldind 0.35* lh_lingual_thicknessstd 0.38**lh_pericalcarine_meancurv 0.19* rh_posteriorcingulate_thicknessstd
0.32* lh_frontalpole_gauscurv 0.29* lh_posteriorcingulate_thickness 0.32* rh_pericalcarine_gauscurv 0.17* lh_medialorbitofrontal_thicknessstd
0.57**lh_frontalpole_meancurv
0.21 rh_fusiform_foldind 0.18* rh_transversetemporal_thicknessstd
0.02 rh_lateralorbitofrontal_gauscurv
0.63**lh_superiortemporal_meancurv 0.08 rh_insula_foldind
0.11 lh_pericalcarine_gauscurv 0.22 lh_bankssts_meancurv
0.22**lh_frontalpole_meancurv
0.05 lh_frontalpole_gauscurv 0.17 lh_rostralmiddlefrontal_gauscurv 0.17 lh_entorhinal_gauscurv 0.13 lh_frontalpole_gauscurv
0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100
Feature Importance(%) Feature Importance(%) Feature Importance(%) Feature Importance(%)
Figure 8.4: Random Forest: Important features for classification in sub-groups (10% threshold).
Important features for autism spectrum disorder (ASD) vs. typically developing controls (TDC) classification
in each sub-group are presented. The features required to have 10 % of the total feature importance scores
of all the features were considered as important. Each feature is represented by a colored bar; the length of
the bar represents the relative % importance for classification with respect to the top feature. The features
significance (P<0.005** and P<0.05*) of ASD vs. TDC group difference are presented. The important
features for classification varied across the sub-groups demonstrating the heterogeneity in ASD brain
morphometry. Important features are similar to that from random forest presented in Fig 2 where top 10
features are presented. The important features were dissimilar across the sub-groups.
157
A) AS B) VIQ C) Age (years) D) ALL Subjects
4 ≤ AS ≤ 5 6 ≤ AS ≤ 7 8 ≤ AS ≤ 10 75 ≤ VIQ ≤ 90 90 < VIQ < 115 115 ≤ VIQ ≤ 150 6 ≤ Age < 13 13 ≤ Age < 18 18 ≤ Age ≤ 40 ALL
1.13**Optic.Chiasm_volume 0.78**X3rd.Ventricle_volume 0.63**Right.choroid.plexus_volume 1.04* rh_temporalpole_volume 0.14 rh_parahippocampal_volume 0.25 lhCorticalWhiteMatterVol_volume 0.1 Left.Cerebellum.White.Matter_volume0.48**CC_Central_volume 0.5** lh_transversetemporal_volume 0.25**Left.Lateral.Ventricle_volume
1.66**rh_parsopercularis_area 0.26* lh_insula_volume 0.24 CorticalWhiteMatterVol_volume 0.15 Right.Accumbens.area_volume 0.35* CC_Mid_Anterior_volume 0.49**rh_bankssts_volume 0.02 rh_parahippocampal_volume
1.17**lhSurfaceHoles_volume 0.73**lh_parahippocampal_volume 0.21 SubCortGrayVol_volume 0.24**CSF_volume
1.48**rh_WhiteSurfArea_area 0.01 rh_parsorbitalis_volume 0.32* Right.Hippocampus_volume 0.01 lh_superiortemporal_volume 0.39**X3rd.Ventricle_volume 0.45**lh_inferiortemporal_volume
1.65**lh_lateralorbitofrontal_area 0.77**rhSurfaceHoles_volume 0.48**X3rd.Ventricle_volume 0.23**Left.Inf.Lat.Vent_volume
1.33**lh_WhiteSurfArea_area 0.42**Left.Lateral.Ventricle_volume 0.3* Left.Hippocampus_volume 0.21 lh_frontalpole_volume 0.36* Left.Lateral.Ventricle_volume 0.42**rh_superiortemporal_volume 0.14 Left.Amygdala_volume
1.31**rh_medialorbitofrontal_area 0.78**CSF_volume 0.29 Right.Putamen_volume 1.44**rh_medialorbitofrontal_area 0.37**Right.Lateral.Ventricle_volume 0.36* Right.vessel_volume 0.2 CC_Mid_Anterior_volume 0.23 lhCorticalWhiteMatterVol_volume 0.34* lhCortexVol_volume 0.17* CorticalWhiteMatterVol_volume
0.65 rh_parsopercularis_area 0.79**SurfaceHoles_volume 0.3 Left.Putamen_volume 0.92* rh_parahippocampal_area 0.3* CSF_volume 0.22 rhCorticalWhiteMatterVol_volume 0.24 rh_pericalcarine_volume 0.3* Right.Lateral.Ventricle_volume 0.27* Left.Putamen_volume 0.17* lhCorticalWhiteMatterVol_volume
1.41**rh_middletemporal_area 0.15 rh_caudalmiddlefrontal_volume 0.2 rh_parahippocampal_volume 0.08 Right.Caudate_volume 0.18 rh_bankssts_volume 0.32* Right.Hippocampus_volume 0.07 rh_bankssts_volume
0.94* rh_inferiortemporal_area 0.2 Right.Accumbens.area_volume 0.31 Left.Accumbens.area_volume
1.03* lh_rostralanteriorcingulate_area 0.07 Left.Amygdala_volume 0.05 CC_Mid_Anterior_volume 0.18 CC_Central_volume 0.21 rhCorticalWhiteMatterVol_volume 0.41**Left.Amygdala_volume 0.23**X3rd.Ventricle_volume
1.75**lh_fusiform_thicknessstd 0.79**lhSurfaceHoles_volume 0.5** Optic.Chiasm_volume 1.45**lh_precentral_area 0.25* lh_parsopercularis_volume 0.32* lh_insula_volume 0.22**Right.Lateral.Ventricle_volume
0.4** rh_rostralanteriorcingulate_area 0.1 rh_parahippocampal_volume 0.33* CortexVol_volume
1.38**lh_bankssts_thicknessstd 0.59**rh_posteriorcingulate_volume 0.28 Right.VentralDC_volume 0.16* rhCorticalWhiteMatterVol_volume
1.14**rh_precuneus_area 0.29* Right.Inf.Lat.Vent_volume 0.27 rh_frontalpole_volume 0.14 lh_transversetemporal_area 0.18 Left.Thalamus.Proper_volume 0.31* rhCortexVol_volume 0.18* Right.choroid.plexus_volume
1.17**lh_inferiortemporal_thickness 0.71**rh_medialorbitofrontal_area 0.36* X4th.Ventricle_volume 1.43**rh_bankssts_area 0.33* Left.Inf.Lat.Vent_volume 0.09 lh_isthmuscingulate_volume 0.33* lh_temporalpole_area 0.23 CorticalWhiteMatterVol_volume 0.33* Left.Hippocampus_volume 0.08 CC_Mid_Anterior_volume
0.97* lh_parsorbitalis_area 0.24* rh_cuneus_volume 0.22 Left.Pallidum_volume 0.29* rh_fusiform_area 0.3* CSF_volume 0.41**rh_transversetemporal_volume 0.19* Right.Inf.Lat.Vent_volume
1.31**rh_parsopercularis_thicknessstd 0.79**lh_lateralorbitofrontal_area 0.12 rh_precuneus_volume
1.36**rh_lateraloccipital_area 0.18 rh_bankssts_volume 0.07 lh_parahippocampal_volume 0.13 lh_lateraloccipital_area 0.17 lh_superiorparietal_volume 0.29* Left.Inf.Lat.Vent_volume 0.17* rh_pericalcarine_volume
1.07**rh_parstriangularis_thicknessstd 0.5* lh_caudalanteriorcingulate_area 0.12 Right.Accumbens.area_volume 1.38**rh_superiorfrontal_area 0.14 rh_superiortemporal_volume 0.13 Left.Inf.Lat.Vent_volume 0.16* lh_transversetemporal_volume
0.43**rh_entorhinal_thickness 0.19 Optic.Chiasm_volume 0.12 rh_parahippocampal_volume
1.2** rh_pericalcarine_thicknessstd 0.7** lh_bankssts_area 0.14 MaskVol_volume 1.35**rh_inferiorparietal_area 0.11 rh_parstriangularis_volume 0.16 Left.vessel_volume 0 rh_parsorbitalis_volume
0.25 rh_insula_thicknessstd 0.15 rh_posteriorcingulate_volume 0.33* TotalGrayVol_volume
1.2** rh_paracentral_area 0.18 lh_parsorbitalis_volume 0.14 lh_temporalpole_volume 0.09 rh_superiortemporal_volume
1.29**rh_parahippocampal_thickness 0.87**lh_middletemporal_thickness 0.38* lh_inferiortemporal_area 0.29* rh_parstriangularis_thicknessstd 0.12 lh_insula_volume 0.32* Left.Lateral.Ventricle_volume 0.09 Right.Hippocampus_volume
0.76* rh_postcentral_area 0.22 rh_insula_volume 0.04 Left.Thalamus.Proper_volume 0.38**lh_parahippocampal_thickness 0.11 lh_parahippocampal_volume 0.23 lh_middletemporal_volume
1.29**lh_insula_thickness 0.8** lh_fusiform_thicknessstd 0.63**rh_medialorbitofrontal_thicknessstd 0.12 Right.Accumbens.area_volume
0.13 lh_rostralanteriorcingulate_foldind 0.22 Right.choroid.plexus_volume 0.23 lh_parstriangularis_area 0.36* rh_parahippocampal_thickness 0.17 Left.Amygdala_volume 0.21 Right.vessel_volume 0.13 Left.Cerebellum.Cortex_volume
1.1** lh_cuneus_thicknessstd 0.75**lh_insula_thickness 0.65**rh_middletemporal_thicknessstd 0.19 rh_parsopercularis_foldind 0.18 Right.Caudate_volume 0.21 lh_temporalpole_area 0.11 lh_lateraloccipital_thickness 0.03 rh_parsopercularis_volume 0.25* rh_middletemporal_volume 0.15* rh_transversetemporal_volume
1.21**lh_parsorbitalis_thicknessstd 0.91**lh_inferiortemporal_thickness 0.61**rh_parsopercularis_thicknessstd 0.58 rh_middletemporal_foldind 0.29* X3rd.Ventricle_volume 0.09 rh_precentral_area 0.18 lh_posteriorcingulate_thicknessstd 0.23 Brain.Stem_volume 0.26* rh_supramarginal_volume 0.11 lh_parsopercularis_volume
0.38 lh_lateralorbitofrontal_foldind 0.14 lh_inferiortemporal_volume 0.29 lh_caudalmiddlefrontal_area 0.12 Left.Hippocampus_volume
0.87* lh_rostralanteriorcingulate_foldind 0.62**rh_lateralorbitofrontal_thicknessstd 0.42* lh_superiorparietal_thicknessstd 0.24 rh_parsorbitalis_thicknessstd 0.03 rh_parsorbitalis_volume 0.35* lh_superiortemporal_volume
0.53 rh_temporalpole_gauscurv 0.1 Left.VentralDC_volume 0.25 rh_fusiform_area 0.15* Optic.Chiasm_volume
1.22**lh_inferiorparietal_meancurv 0.78**rh_transversetemporal_thicknessstd0.44* rh_parsorbitalis_thickness 0.25 rh_posteriorcingulate_thicknessstd 0.08 rh_fusiform_volume 0.31* lh_superiorfrontal_volume 0.11 X4th.Ventricle_volume
0.22 rh_transversetemporal_volume 0.23 rh_rostralanteriorcingulate_area 0.23 rh_insula_foldind 0.14 rh_medialorbitofrontal_volume 0.15 Right.Putamen_volume
1.25**lh_lateraloccipital_meancurv 0.69**rh_inferiortemporal_thickness 0.54**rh_parstriangularis_thicknessstd 0 MaskVol_volume
0.08 rh_parahippocampal_area 0.13 rh_precuneus_area 0.2 lh_lateralorbitofrontal_foldind 0.1 Right.Amygdala_volume 0.28* Left.Pallidum_volume 0.1 lh_superiortemporal_volume
1** lh_inferiortemporal_meancurv 0.79**rh_parstriangularis_thicknessstd 0.42* rh_medialorbitofrontal_thickness 0.25* lh_parsopercularis_area 0.14 rh_lateralorbitofrontal_area 0.29* lh_fusiform_foldind 0.03 lh_caudalmiddlefrontal_area 0.11 rh_fusiform_volume 0.01 lh_inferiortemporal_volume
1.46**rh_lateraloccipital_meancurv 0.45* rh_rostralanteriorcingulate_thickness0.39* rh_lateralorbitofrontal_thickness 0.2 rh_caudalmiddlefrontal_area 0.08 rh_transversetemporal_area 0.19 lh_superiortemporal_foldind 0.09 rh_posteriorcingulate_area 0.05 rh_cuneus_volume 0.08 Right.vessel_volume
0 rh_lateraloccipital_area 0.2 rh_lateraloccipital_area 0.09 Left.Putamen_volume
1.13**rh_lateraloccipital_gauscurv 0.84**lh_fusiform_thickness 0.58**rh_parsopercularis_thickness 0.4** lh_lateraloccipital_foldind 0.22 rh_bankssts_area 0.1 CSF_volume
0.11 lh_entorhinal_area 0.21 lh_insula_area 0.04 rh_parstriangularis_area
0.28* rh_medialorbitofrontal_foldind 0.13 lh_parahippocampal_area 0.16 lh_rostralmiddlefrontal_volume 0.04 rh_medialorbitofrontal_area
0.58 lh_supramarginal_gauscurv 0.78**lh_parahippocampal_thickness 0.47**rh_entorhinal_thickness 0.17 lh_parahippocampal_area 0.19 lh_postcentral_area 0.12 lh_parsopercularis_foldind 0.1 rh_parahippocampal_area 0.31* lh_paracentral_volume 0.04 rh_isthmuscingulate_area
0.92* rh_superiorparietal_gauscurv 0.64**rh_parahippocampal_thickness 0.57**rh_superiortemporal_thicknessstd 0.35**rh_parahippocampal_thickness 0.22 lh_rostralmiddlefrontal_area
0.22 rh_superiorparietal_foldind 0.07 rh_precuneus_area 0.26* rh_superiorparietal_volume 0.14 lh_transversetemporal_area
0.87* rh_pericalcarine_gauscurv 0.92**rh_inferiorparietal_meancurv 0.42* lh_parsorbitalis_thickness 0.26* rh_entorhinal_thicknessstd 0.04 rh_parstriangularis_area 0.11 rh_transversetemporal_area
0.15 rh_caudalmiddlefrontal_foldind 0.12 rh_parstriangularis_area 0.19 rh_lateralorbitofrontal_volume
0.26* rh_precentral_thickness 0.16 rh_frontalpole_area 0.08 rh_rostralanteriorcingulate_area
0.34 lh_superiortemporal_gauscurv 0.83**lh_superiortemporal_meancurv 0.46* rh_parstriangularis_thickness 0.06 rh_superiorfrontal_foldind 0.42**lh_lingual_thickness 0.28* SubCortGrayVol_volume
0.25* rh_parahippocampal_thicknessstd 0.26 lh_posteriorcingulate_area 0.1 lh_parsopercularis_area
0.16 lh_caudalanteriorcingulate_foldind 0.34* rh_lingual_thicknessstd 0.23 lh_temporalpole_volume
0.74* lh_inferiorparietal_gauscurv 0.8** lh_middletemporal_meancurv 0.58**lh_fusiform_thicknessstd 0.15 lh_lingual_thickness 0.5** lh_frontalpole_thicknessstd 0.1 lh_paracentral_area
0.45**rh_parsopercularis_meancurv 0.29* lh_pericalcarine_thickness 0.26* lh_paracentral_area 0.06 lh_insula_area Morphometric Attributes
1.05**rh_lingual_gauscurv 0.5* rh_precentral_meancurv 0.3 rh_lingual_thickness 0.29* lh_parahippocampal_thickness 0.46**lh_inferiortemporal_thicknessstd
0.34* lh_frontalpole_meancurv 0.26 lh_caudalanteriorcingulate_thickness0.29* rh_transversetemporal_area 0.03 rh_insula_area volume
0.89**rh_inferiortemporal_meancurv 0.6** lh_bankssts_thicknessstd 0.1 rh_cuneus_thicknessstd 0.36* lh_superiortemporal_thicknessstd
0.5** rh_paracentral_meancurv 0.22 rh_precuneus_thicknessstd 0.19 rh_pericalcarine_area 0.23**rh_pericalcarine_area area
0.08 lh_isthmuscingulate_thickness 0.45**rh_paracentral_thickness 0.12 lh_caudalanteriorcingulate_area
0.92**lh_lateraloccipital_meancurv 0.35* rh_frontalpole_thickness 0.5** lh_fusiform_meancurv 0.29* rh_paracentral_thickness 0.29* rh_bankssts_area thickness
0.03 rh_insula_thickness 0.14 lh_transversetemporal_thicknessstd 0.2* lh_cuneus_area
0.46**rh_caudalmiddlefrontal_meancurv 0.35* lh_lingual_thicknessstd 0.27* lh_transversetemporal_area foldind
0.63**lh_inferiortemporal_meancurv 0.61**rh_precuneus_thicknessstd 0.16 rh_parsorbitalis_thickness 0.37* lh_medialorbitofrontal_thicknessstd 0.02 rh_precentral_area
0.51**lh_superiorfrontal_meancurv 0.29* lh_posteriorcingulate_thickness 0.14 rh_lateraloccipital_area meancurv
0.66**lh_frontalpole_meancurv 0.42* rh_isthmuscingulate_thicknessstd 0.22 lh_lingual_thicknessstd 0.35* rh_superiorparietal_thicknessstd 0.13 rh_paracentral_area
0.52**rh_superiorfrontal_meancurv 0.07 lh_paracentral_thickness 0.04 rh_isthmuscingulate_area 0.05 rh_caudalmiddlefrontal_area gauscurv
0.03 rh_transversetemporal_thickness 0.42* rh_fusiform_thicknessstd
0.63**lh_temporalpole_meancurv 0.33* lh_lingual_thickness 0.32* lh_inferiortemporal_meancurv 0.19 lh_frontalpole_thickness 0.04 rh_insula_area 0.02 rh_precuneus_area
0.07 rh_insula_thicknessstd 0.37* lh_caudalmiddlefrontal_thicknessstd Group Difference
0.88**lh_inferiorparietal_meancurv 0.46* lh_precuneus_thicknessstd 0.46**lh_superiortemporal_meancurv 0.31* lh_pericalcarine_thicknessstd 0.14 lh_rostralanteriorcingulate_area 0.14 rh_frontalpole_area
0.19 rh_entorhinal_thickness 0.31* lh_postcentral_thickness a TDC > ASD
0.36* rh_parsorbitalis_meancurv 0.18 rh_posteriorcingulate_thicknessstd 0.19 rh_paracentral_area 0.09 rh_bankssts_area
0.63**rh_parsopercularis_meancurv 0.37* lh_posteriorcingulate_thicknessstd 0.18 rh_posteriorcingulate_thicknessstd 0.4* rh_precuneus_thicknessstd 0.25**rh_lingual_thicknessstd
0.39**lh_parsopercularis_meancurv 0.26 lh_transversetemporal_thicknessstd 0.17 lh_isthmuscingulate_area a ASD > TDC
0.75**lh_fusiform_meancurv 0.34* lh_isthmuscingulate_thicknessstd 0.2 lh_precentral_thickness 0.16 rh_parahippocampal_thickness 0.14 lh_inferiortemporal_thicknessstd
0.46**lh_lateralorbitofrontal_meancurv 0.26 rh_medialorbitofrontal_thicknessstd 0.17 lh_medialorbitofrontal_thicknessstd
0.11 lh_transversetemporal_thickness 0.38* lh_lateralorbitofrontal_thicknessstd 0.2* lh_isthmuscingulate_thicknessstd
0.09 rh_caudalmiddlefrontal_gauscurv 0.61**lh_temporalpole_thickness 0.18 lh_lateralorbitofrontal_gauscurv 0.19 rh_bankssts_thickness 0.43**lh_inferiortemporal_thickness
0.16 rh_isthmuscingulate_thickness 0.27 lh_middletemporal_thicknessstd 0.19* rh_posteriorcingulate_thicknessstd
0.3 lh_supramarginal_gauscurv 0.33* rh_superiorparietal_thicknessstd 0.28* lh_frontalpole_gauscurv 0.07 rh_postcentral_thicknessstd 0.4** lh_superiortemporal_thickness 0.17* lh_medialorbitofrontal_thicknessstd
0.09 lh_parsopercularis_foldind 0.29* lh_temporalpole_thicknessstd
0.47* lh_superiortemporal_gauscurv 0.41* rh_parahippocampal_thickness 0.01 rh_medialorbitofrontal_gauscurv 0.27* lh_isthmuscingulate_thickness 0.4** lh_lingual_thicknessstd 0.18* rh_transversetemporal_thicknessstd
0.13 lh_supramarginal_foldind 0.3* rh_precentral_thicknessstd
0.02 rh_lingual_gauscurv 0.02 rh_cuneus_thicknessstd 0.26* rh_lingual_thicknessstd 0.24**lh_lingual_thicknessstd
0.24 rh_parsopercularis_gauscurv 0.32* rh_insula_thicknessstd 0.07 rh_precuneus_foldind 0.33* rh_medialorbitofrontal_thicknessstd 0.14 lh_posteriorcingulate_thicknessstd
0.23 rh_middletemporal_foldind 0.33* rh_superiorfrontal_thicknessstd 0.35* lh_fusiform_gauscurv 0.02 rh_superiorparietal_thicknessstd 0.35* lh_isthmuscingulate_thicknessstd
0.68**rh_lingual_gauscurv 0.46* lh_inferiorparietal_thicknessstd 0.05 rh_parsorbitalis_thickness
0.01 rh_pericalcarine_foldind 0.27 lh_inferiorparietal_thicknessstd 0.25 rh_parsopercularis_gauscurv 0.22 rh_lingual_thickness 0.19 rh_lateraloccipital_thickness
0.16* lh_pericalcarine_thickness
0.07 lh_precentral_gauscurv 0.5** rh_lingual_thicknessstd 0.24* rh_fusiform_foldind 0.15 rh_lingual_thicknessstd 0.48**rh_paracentral_gauscurv 0.19 lh_temporalpole_thicknessstd 0.28* rh_bankssts_thicknessstd 0.12 lh_frontalpole_thickness
0.65**rh_pericalcarine_gauscurv 0.45* lh_precentral_thicknessstd 0.17 lh_lateralorbitofrontal_foldind 0.29 lh_precentral_thicknessstd 0.23 rh_insula_gauscurv 0.23 lh_medialorbitofrontal_thickness 0.26* lh_lingual_thickness 0.09 rh_transversetemporal_thickness
0.18 lh_frontalpole_foldind 0.31* lh_superiorfrontal_thicknessstd 0.09 rh_frontalpole_gauscurv 0.2 rh_superiortemporal_thicknessstd 0.17 lh_inferiortemporal_thicknessstd 0.16* lh_lingual_thickness
0.25 rh_inferiorparietal_gauscurv 0.36* rh_inferiortemporal_thickness 0.17* rh_middletemporal_thicknessstd
0.21 rh_rostralmiddlefrontal_foldind 0.25 rh_pericalcarine_thickness 0.38**rh_parsorbitalis_gauscurv 0.23 lh_lateraloccipital_thickness 0.05 rh_parsorbitalis_thickness
0.26 rh_frontalpole_gauscurv 0.51**lh_superiortemporal_thicknessstd 0.46**lh_lateraloccipital_gauscurv 0.12 lh_parahippocampal_thicknessstd 0.24 rh_caudalanteriorcingulate_thicknessstd 0.16* rh_entorhinal_thickness
0.08 rh_paracentral_foldind 0.14 rh_posteriorcingulate_thicknessstd 0.21**lh_bankssts_thicknessstd
0.53**rh_superiorfrontal_thicknessstd 0.25* rh_superiorparietal_foldind 0.06 rh_caudalanteriorcingulate_thickness 0.22 lh_parsopercularis_gauscurv 0.23 rh_lateraloccipital_thickness 0.33* lh_transversetemporal_thickness
0.08 lh_lateraloccipital_thicknessstd
0.52**lh_middletemporal_thickness 0.14 lh_paracentral_foldind 0.36* rh_middletemporal_thicknessstd 0.25 rh_caudalmiddlefrontal_gauscurv 0.17 lh_paracentral_thicknessstd 0.18 rh_posteriorcingulate_thicknessstd 0.02 rh_insula_thicknessstd
0.02 rh_insula_foldind 0.22 rh_postcentral_thickness 0.25 lh_parahippocampal_gauscurv 0.19 rh_middletemporal_thicknessstd 0.27* rh_bankssts_thickness 0.11 rh_lateraloccipital_thickness
0.46* lh_pericalcarine_thickness
0.39**lh_lateralorbitofrontal_meancurv 0.3* rh_inferiortemporal_foldind 0.18 rh_superiorfrontal_gauscurv 0.21 rh_fusiform_foldind 0.12 lh_parsorbitalis_thicknessstd 0.11 lh_parahippocampal_thickness
0.4* lh_medialorbitofrontal_thicknessstd 0.4** lh_inferiorparietal_gauscurv 0.13 rh_insula_foldind 0.04 lh_cuneus_foldind 0.13 rh_isthmuscingulate_thicknessstd
0.38**lh_parsopercularis_meancurv 0.01 lh_parsorbitalis_foldind
0.67**lh_middletemporal_meancurv 0.35* lh_parstriangularis_gauscurv 0.21 lh_parsorbitalis_foldind 0.19 lh_rostralanteriorcingulate_foldind 0.16* lh_middletemporal_thicknessstd
0.18 rh_frontalpole_meancurv 0.12 rh_superiortemporal_foldind 0.02 rh_isthmuscingulate_thickness
0.57**lh_frontalpole_meancurv 0.3* rh_rostralanteriorcingulate_meancurv0.26 lh_posteriorcingulate_foldind 0.16 lh_posteriorcingulate_gauscurv 0.19 rh_supramarginal_foldind 0.15 rh_bankssts_foldind
0.11 rh_superiortemporal_thicknessstd
0.15 lh_bankssts_meancurv 0.35* lh_caudalanteriorcingulate_foldind 0.28* rh_postcentral_gauscurv 0.04 rh_middletemporal_foldind 0.14 rh_parsorbitalis_foldind 0.09 rh_entorhinal_thicknessstd
0.63**lh_superiortemporal_meancurv
0.22 lh_frontalpole_gauscurv 0.18 lh_lateralorbitofrontal_foldind 0.07 lh_supramarginal_gauscurv 0.2 rh_parstriangularis_foldind 0.26* rh_superiortemporal_foldind 0.13 rh_paracentral_thickness
0.43* lh_inferiortemporal_meancurv 0.19 lh_postcentral_foldind 0.06 rh_caudalmiddlefrontal_foldind 0.15* rh_caudalanteriorcingulate_thicknessstd
0.06 lh_fusiform_gauscurv 0.15 lh_supramarginal_foldind
0.45* lh_parstriangularis_meancurv 0.19 lh_lateralorbitofrontal_gauscurv 0.17 rh_superiortemporal_meancurv 0.1 lh_superiortemporal_foldind 0.38**lh_pericalcarine_meancurv 0.08 rh_insula_foldind
0.22 lh_bankssts_meancurv 0.38**rh_pericalcarine_meancurv 0.13 lh_supramarginal_foldind
0.44* lh_bankssts_meancurv 0.03 lh_posteriorcingulate_gauscurv 0.13 lh_pericalcarine_meancurv
0.21**rh_fusiform_foldind
0.04 rh_pericalcarine_gauscurv 0.25 lh_bankssts_meancurv 0.2 lh_lateralorbitofrontal_meancurv 0.35* rh_supramarginal_meancurv 0.1 lh_inferiortemporal_foldind
0.22 lh_rostralanteriorcingulate_gauscurv0.35* lh_frontalpole_meancurv 0.11 rh_parahippocampal_meancurv 0.32* rh_pericalcarine_gauscurv 0.09 lh_frontalpole_foldind
0.23 lh_superiortemporal_gauscurv 0.17 lh_entorhinal_gauscurv 0.31* lh_pericalcarine_gauscurv
0.24* lh_entorhinal_gauscurv 0.09 rh_isthmuscingulate_meancurv 0.06 rh_superiortemporal_foldind
0.01 lh_supramarginal_gauscurv 0.32* lh_frontalpole_gauscurv 0.01 lh_bankssts_gauscurv 0.29* lh_rostralmiddlefrontal_gauscurv 0.22**lh_frontalpole_meancurv
0.41* rh_pericalcarine_gauscurv 0.02 rh_lateralorbitofrontal_gauscurv 0.1 lh_lateralorbitofrontal_gauscurv 0.03 lh_precuneus_gauscurv 0.14 rh_frontalpole_meancurv
0.06 rh_caudalmiddlefrontal_gauscurv 0.12 rh_pericalcarine_meancurv
0.4* rh_superiortemporal_gauscurv 0.11 lh_pericalcarine_gauscurv
0.19* lh_bankssts_meancurv
0.17 lh_rostralmiddlefrontal_gauscurv 0.25 rh_insula_gauscurv 0.13 lh_frontalpole_gauscurv
0.37* rh_lingual_gauscurv 0.06 lh_frontalpole_gauscurv
0.15 lh_parstriangularis_gauscurv 0.04 rh_frontalpole_gauscurv
0.46* rh_middletemporal_gauscurv 0.41* rh_paracentral_gauscurv 0.03 rh_precentral_gauscurv 0.02 rh_middletemporal_gauscurv
0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100 0 25 50 75 100
Feature Importance(%) Feature Importance(%) Feature Importance(%) Feature Importance(%)
Figure 8.5: Random Forest: Important features for classification by in sub-groups (25% threshold).
Important features for autism spectrum disorder (ASD) vs. typically developing controls (TDC) classification
in each sub-group are presented. The features required to have 25 % of the total feature importance scores
across all the features were considered as important. Each feature is represented by a colored bar; the length
of the bar represents the relative % importance for classification with respect to the top feature. The features
significance (P<0.005** and P<0.05*) of ASD vs. TDC group difference are presented. The important
features for classification varied across the sub-groups demonstrating the heterogeneity in ASD brain
morphometry. Important features are similar to that from random forest presented in Fig 2 where top 10
features are presented. The important features were dissimilar across the sub-groups.
158
Figure 8.6: Classification performance degrades with Autism Severity (AS).
Separate classification models were trained for autism spectrum disorder (ASD) subjects with different AS
values. A point represents the mean and an error bar represents the one standard deviation of the AUC
scores from 10 test folds. AUC scores and number of ASD and TDC subjects are presented below the error
bar. Blue line represents the mean AUC vs. mean AS linear model and the shaded region represents the
95% confidence interval of the model. Classification performance decreased with AS according to both
random forest and gradient boosting machine classification techniques.
159
Table 8.1: Correlation between the feature importance scores from RF and GBM
Correlation
Sub-groups
Coefficient
mild 0.76
AS moderate 0.77
high 086
low 0.87
VIQ normal 0.90
high 0.88
young 0.86
Age
mid 0.94
old 0.88
All coefficients were statistically significant (p < E-16)
160
9 Appendix C
15
10
count
5
0
−1.0 −0.5 0.0 0.5 1.0
Effect Size (Cohen's d)
Figure 9.1: Histogram of the effect sizes of the statistically significant (at 0.05) differences in
brain features
161

Machine Learning Based Autism Detection Using Brain Imaging PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Machine Learning Based Autism Detection Using Brain Imaging PDF

Uploaded by

Copyright:

Available Formats

Rochester Institute of Technology

RIT Scholar Works

Machine Learning Based Autism Detection Using

Follow this and additional works at: http://scholarworks.rit.edu/theses

Gajendra Jung Katuwal

Doctor of Philosophy in Imaging Science

Chester F. Carlson Center for Imaging Science

Rochester Institute of Technology

Rochester, New York, USA

March 27, 2017

Dr. Andrew Michael, Dissertation Advisor Date

Dr. Stefi Baum, Dissertation Advisor Date

Dr. Manuela Campanelli, Committee Chair Date

Dr. Nathan Cahill, Committee Member Date

Dr. John Kerekes, Committee Member Date

immensely helpful during the course of my research.

their contributions, I would not have made this far.

Autism Spectrum Disorder (ASD) is a group of heterogeneous developmental disabilities

myelination deficits indicative of decreased structural connectivity. Results of this thesis

early detection and understanding brain anatomical underpinnings of ASD.

ABIDE: Autism Brain Imaging Data Exchange

ADHD: Attention Deficit Hyperactivity Disorder

AS: Autism Severity

ASD: Autism Spectrum Disorder

AUC: Area Under the Receiver Operating Curve

CARS: Childhood Autism Rating Scale

CDC: Center for Disease Control and Prevention

CSF: Cerebrospinal Fluid

Curvind: Curvature Index

DB: Demographics and Behavioral Measures

FAST: FMRIB's Automated Segmentation Tool

Foldind: Folding Index

FSL: FMRIB Software Library

Gauscurv: Gaussian Curvature

GM: Gray Matter

Meancurv: Mean Curvature

MRI: Magnetic Resonance Imaging

SPM: Statistical Parametric Mapping

TDC: Typically Developing Controls

TIV: Total Intracranial Volume

VIQ: Verbal Intelligent Quotient

WM: White Matter

TABLES… ........................................................................................................................................ xii

CHAPTER 1: INTRODUCTION ................................................................................................... 1

1.1 Autism Spectrum Disorder ..................................................................................................................1

1.2 Importance of early detection of ASD .................................................................................................2

underpinnings of ASD ..................................................................................................................................3

1.4 Problem 2: Inconsistent brain anatomical findings in ASD ................................................................4

1.5 Problem 3: Low Classification success in large multi-site data ............................................................4

1.6 Thesis Contributions ............................................................................................................................4

1.6.1 Demonstrating MRI as a potential tool for detection ................................................................5

methodological differences ....................................................................................................................5

multi-variate techniques as a partial solution ........................................................................................6

and more meaningful than across the whole spectrum .........................................................................7

1.6.5 Brain biomarkers discovery for early detection of ASD .............................................................7

1.7 Thesis Organization .............................................................................................................................8

1.7.1 Chapter 1: Introduction .............................................................................................................8

1.7.3 Chapter 3: Method Dependent Findings ...................................................................................9

1.7.5 Chapter 5: Early Detection of ASD .........................................................................................10

1.7.6 Chapter 6: Conclusion .............................................................................................................10

1.7.7 Appendices ...............................................................................................................................10

2 CHAPTER 2: BACKGROUND ................................................................................................ 12

2.1 Brain Imaging .....................................................................................................................................12