Received: April 19, 2015 / Revised: August 13, 2015 / Accepted: October 4, 2015
Korean Society of Crop Science and Springer 2015
Abstract
Recent advances in sequencing technology have brought several novel platforms for marker development and subsequent genotyping. High-throughput and cost-effective marker techniques have changed the entire scenario of marker applications. The huge volume of genotypic data obtained with next generation sequencing (NGS) also demands analytical tools, statistical advances, and comprehensive understanding to cope with breeding applications. In the present review, we discuss the different available marker techniques, their strengths, and their limitations. Emphasis is given to software tools, analytical pipelines, workbenches, and online resources available for marker development. A comparison of SNP genotyping approaches, including complexity reduction techniques such as GBS, RRL, and RAD as well as array-based platforms, is presented to describe their suitability for specific purposes. We found that genotyping by whole genome re-sequencing has great potential and could become a routine application in the near future as the cost of sequencing continues to decrease. Microsatellites, still a valuable option for breeders, have also advanced with NGS. A catalogue of tools for microsatellite evaluation in short sequence reads is provided. The most common applications of molecular markers, such as QTL mapping, genome-wide association mapping (GWAS), and genomic selection, are highlighted. The present review will be helpful for the effective utilization of available resources and for the planning of crop improvement programs employing molecular marker techniques.
Key words : Molecular marker, GBS, NGS, RNA-seq, RAD, QTL, SNP
Abbreviations
AFLP: amplified fragment length polymorphism; CRoPS: complexity reduction of polymorphic sequences; DArT: diversity array technology; EST: expressed sequence tag; GBS: genotyping by sequencing; GS: genomic selection; GWAS: genome-wide association studies; MAS: marker-assisted selection; NGS: next generation sequencing; PCR: polymerase chain reaction; QTL: quantitative trait loci; RAD: restriction site-associated DNA; RFLP: restriction fragment length polymorphism; RRLs: reduced-representation libraries; RRS: reduced representation sequencing; SCAR: sequence characterized amplified region; SGS: second generation sequencing; STS: sequence tagged site; TILLING: targeting induced local lesions in genomes; WGS: whole genome sequencing
Dr. Sajad Majeed Zargar
Tel: +91-7298410126 / Fax: +82-31-695-4095
Email: smzargar@gmail.com
Introduction
The main objective of plant breeding is to attain sustainability in agriculture. This can only be achieved by enhancing crop yield, maintaining yield stability, and improving quality by crossing elite cultivars of choice with lines that possess desired new traits. Conventional plant breeding involves crossing an elite cultivar with a donor, followed by selection of superior recombinants. The process involves several crosses and several generations and requires careful phenotypic selection. Moreover, the whole process is time-consuming and laborious. Additionally, there is a risk of transferring undesirable traits along with the traits of interest. These drawbacks are major hindrances to enhancing agricultural production (Collard et al. 2005). The availability of molecular marker technology provides solutions to the problems associated with conventional breeding. Using molecular markers to tag the desired genes or chromosome regions during breeding makes the process faster and more efficient (Collard et al. 2005). Using molecular tools to select favorable alleles and to select against undesirable background regions also helps breeders concentrate their work on improved populations.
Genetic markers have evolved from traditional morphological markers to biochemical markers and, currently, to molecular markers. Traditional morphological markers are limited in number and highly influenced by the environment, which in turn limits their usefulness. Biochemical markers, although more numerous, are also influenced by the environment and are at times difficult to measure, and hence lead to false positives and negatives in gene-mapping studies. Using a molecular marker means employing a method to investigate certain polymorphic positions in the DNA and to identify the genotype. Thus, a molecular marker allows a genomic region to be traced in genetic populations, giving information on which parent it originated from. Molecular markers detect polymorphisms at the DNA level, such as nucleotide changes: deletion, duplication, inversion, and/or insertion, transition, and transversion. The evolution of molecular markers dates back to the use of restriction fragment length polymorphism (RFLP). Owing to various limitations of this technique, and coinciding with the discovery of the polymerase chain reaction (PCR), PCR-based markers were developed, which include a large number of techniques. The discovery of SSR and SNP markers usually requires DNA sequence information and is therefore expensive. However, with the emergence of next-generation sequencing (NGS), the cost of sequencing has dropped so dramatically that it is now routinely used for molecular marker development and genotyping (Elshire et al. 2011; Sonah et al. 2013; Wetterstrand 2014).
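The polymorphism classes named above can be illustrated with a minimal Python sketch. It labels a variant, given as a reference and an alternative allele string, as a transition, transversion, insertion, or deletion. The function names and the allele-string representation are illustrative assumptions, not part of any tool cited in this review.

```python
# Minimal sketch (hypothetical helper names): classify a variant from its
# reference and alternative allele strings.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("alleles must be one of A, C, G, T")
    if ref == alt:
        return "none"
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def classify_variant(ref_allele, alt_allele):
    """Classify a variant as substitution type, insertion, or deletion."""
    if len(ref_allele) == 1 and len(alt_allele) == 1:
        return classify_substitution(ref_allele, alt_allele)
    if len(ref_allele) > len(alt_allele):
        return "deletion"
    if len(ref_allele) < len(alt_allele):
        return "insertion"
    return "complex"  # equal-length multi-base change


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "C"), ("AT", "A"), ("A", "ATT")]:
        print(ref, alt, classify_variant(ref, alt))
```

Larger rearrangements such as duplications and inversions cannot be recognized from a pair of allele strings alone and are outside the scope of this sketch.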
Reduced-representation sequencing includes reduced-representation libraries (RRLs), complexity reduction of polymorphic sequences (CRoPS), restriction site-associated DNA
(RAD)-seq, low coverage genotyping, including multiplexed
shotgun genotyping (MSG) and genotyping by sequencing
Marker | Throughput | Status | DNA quantity/reaction | Restriction enzyme needed | PCR needed | Sequence information | Multiplexing | Sample size | Cost
Morphological | Low-throughput | Past | Phenotypic based | No | No | No | No | Variable | Variable
Biochemical | Low-throughput | Past | Protein based | No | No | No | No | Variable | Less
RFLPs | Low-throughput | Past | 2-10 µg | Yes | No | No | Difficult | <50-100 | High
RAPDs | Medium-throughput | Past | 5-10 µg | No | Yes | No | Difficult | <100 | Less
AFLPs | Medium-throughput | Past | ~1 µg | Yes | Yes | No | Possible | <100 | High
SSRs | Medium-throughput | Present | 10-20 ng | No | Yes | Yes | Possible | 48-384 | High
SNPs | High-throughput | Present | 5 ng | No | Yes | Yes | Possible | Variable | Variable
SFPs | High-throughput | Present | 50-100 ng | Yes | No | No | No | 100-500 | Cheapest
DArT | High-throughput | Present | 50-100 ng | Yes | No | No | No | 100-500 | Cheapest
RRLs | Ultra high-throughput | Future | 25 ng | Yes | Yes | Yes | Possible | >1000 | Moderate
RAD-seq | Ultra high-throughput | Future | 300 ng | Yes | Yes | Yes | Possible | >1000 | Moderate
CRoPS | Ultra high-throughput | Future | 300 ng | Yes | Yes | Yes | Possible | >1000 | Moderate
GBS | Ultra high-throughput | Future | 100 ng | Yes | Yes | Yes | Possible | >1000 | Moderate
MSG | Ultra high-throughput | Future | 10 ng | Yes | Yes | Yes | Possible | >1000 | Moderate
S.No. | Crop | Marker class | NGS platform | No. of markers identified
1 | Sunflower | SNP | Solexa Illumina | 16,467
2 | Flax | SNP | Solexa Illumina | 55,465
3 | Lupinus angustifolius L. | SNP | Solexa Illumina | 8,207
4 | Linseed | SSR | 454 Roche | 1,842
5 | Soybean | SNP | Solexa Illumina | 25,047
6 | Soybean | SNP | Solexa Illumina | 39,022
7 | Peach | SNP | Solexa Illumina / 454 Roche | 6,654
8 | Brassica napus | SNP, INDELs | Solexa Illumina | 20,000; 125
9 | Aegilops tauschii | SNP | Solexa Illumina / 454 Roche | 195,631
10 | Durum wheat | SNP | Solexa Illumina | 2,659
11 | Amaranth | SNP | 454 Roche | 27,658
12 | Barley | SNP | Solexa Illumina | 530
13 | Maize | SNP | Solexa Illumina | 1,123
14 | Soybean | SNP | Solexa Illumina | 1,682
15 | Rice | SNP | Solexa Illumina | 2,618
16 | Globe artichoke | SNP, INDELs | Solexa Illumina | 34,000; 800
17 | Cotton | SNP | Solexa Illumina | 13,513
18 | Rice | SNP | Solexa Illumina | 67,051
19 | Rice | SNP, INDELs | Solexa Illumina | 132,462; 16,448
20 | Brassica napus | SNP | Solexa Illumina | 892,536
21 | Arabidopsis | SNP | Solexa Illumina | 1,409
22 | Lettuce | SNP | Solexa Illumina | 5,583
Microarray-based genotyping
The problem of expensive and laborious scoring of marker panels across target populations in gel-based marker systems (Gupta et al. 2013) was first overcome by the development of high-throughput array-based markers (e.g. DArT, SFPs). These microarray-based markers have been used for the construction of high-density maps, quantitative trait loci (QTL) mapping (including expression QTLs), and genetic diversity analysis at limited expense in terms of time and money (Colasuonno et al. 2013; King et al. 2013; Mace et al. 2008; Raman et al. 2012; Sansaloni et al. 2010). Diversity array technology (DArT) (diversityarrays.com) allows simultaneous typing of several hundred polymorphic loci spread over a genome without any previous sequence information about these loci. This technique has been shown to be reproducible and cost-effective (Gupta et al. 2013). Arrays are widely used, e.g. a 44 K chip in rice (Zhao et al. 2011), a 768-SNP Illumina chip in common bean (Blair et al. 2013), and a 5 K chip developed for common bean (Bean CAP, unpublished data), and further arrays are in the pipeline, such as for cowpea. These SNP arrays are limited to known SNPs, which are usually identified via NGS. They are also comparatively inflexible, as the SNP number cannot be modified, and they are not cost-effective compared with the latest developments in reduced-representation sequencing. Several platforms for high-throughput genotyping are provided by different universities and institutes at a very reasonable cost (Table 3). Therefore, for the breeder community, it is easier to outsource genotyping than to develop the capacity in their own facilities.
Table 3. Details of different SNP genotyping platforms along with cost and time required
Number of SNPs | Cost/sample | Time*
30-80K | $38 | 2-6 months
10-40K | $25 | 2-6 months
10-20K | $35 | 1 month
1M | $65 | 2 months
60K | $55 | 1 month
6K | $80 | 1 month
384 | $40 | 1 month
1536 | $80 | 1 month
96 | $20 | 1 month
Facility available:
http://www.igd.cornell.edu/index.cfm/page/GBS/GBSpricing.htm
http://www.ibis.ulaval.ca/?pg=sequencage_genotypageSequencage
http://gsl.irri.org/services/prices-and-sample-submission/prices
http://dnatech.genomecenter.ucdavis.edu/wp-content/uploads/2013/06/Golden_Gate_Genotyping_Prices_Dec2012.pdf
http://www.ibis.ulaval.ca/?pg=sequencage_genotypageSequencage
*As per service providers
RNA sequencing, and direct analysis of methylation. Third-generation DNA sequencing is mostly distinguished by direct inspection of single molecules with methods that do not require repetitive wash and scan steps during DNA synthesis or synchronization of multiple reactions, and that avoid the problems associated with PCR amplification or phasing (Thudi et al. 2012). It has been predicted that third-generation sequencing platforms will replace SGS by 47% in the next three years (Peterson et al. 2010). These technologies are also expected to increase the accuracy of SNP discovery and reduce the chances of wrong base calling.
A major issue when dealing with NGS data is the bioinformatics analysis of the huge amounts of data generated, which involves trimming, deconvolution, and filtering of reads, alignment to a reference sequence or de novo assembly, and SNP calling for polymorphism identification or genotyping. The analysis is much more resource-dependent and requires more bioinformatics support than previous technologies. A plethora of commercial and non-commercial software solutions is available; various reviews cover the available tools (Horner et al. 2009) or specific topics such as alignment (Li and Homer 2010), de novo assembly (Zhang et al. 2011), and SNP calling (Nielsen et al. 2011).
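The first steps of such an analysis (deconvolution of barcoded reads, trimming, and filtering) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure of any particular pipeline cited here: reads are given as (sequence, quality) pairs, barcodes are matched exactly at the start of the read, qualities use Phred+33 encoding, and the thresholds and function names are hypothetical.

```python
# Minimal sketch: assign barcoded reads to samples, trim low-quality 3' ends,
# and discard reads that become too short. All names and thresholds are
# illustrative assumptions.
from collections import defaultdict


def phred_scores(quality, offset=33):
    """Convert an ASCII quality string to a list of Phred scores."""
    return [ord(ch) - offset for ch in quality]


def trim_low_quality_tail(seq, qual, min_q=20):
    """Remove bases from the 3' end while their quality is below min_q."""
    scores = phred_scores(qual)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def demultiplex(reads, barcodes, min_len=20, min_q=20):
    """Group reads by sample using an exact barcode match at the read start.

    reads: iterable of (sequence, quality) tuples
    barcodes: dict mapping barcode sequence -> sample name
    """
    by_sample = defaultdict(list)
    for seq, qual in reads:
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                insert_seq = seq[len(barcode):]
                insert_qual = qual[len(barcode):]
                trimmed, _ = trim_low_quality_tail(insert_seq, insert_qual, min_q)
                if len(trimmed) >= min_len:  # filtering step
                    by_sample[sample].append(trimmed)
                break  # a read belongs to at most one sample
    return dict(by_sample)


if __name__ == "__main__":
    reads = [
        ("ACGT" + "A" * 25, "I" * 29),              # sample s1, high quality
        ("TTAG" + "C" * 10 + "G" * 5, "I" * 14 + "#" * 5),  # s2, poor 3' end
        ("GGGG" + "A" * 25, "I" * 29),              # unknown barcode, dropped
    ]
    print(demultiplex(reads, {"ACGT": "s1", "TTAG": "s2"}, min_len=8))
```

Alignment, assembly, and SNP calling are deliberately left out; these steps depend on dedicated tools such as those covered in the reviews cited above.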
Target enrichment followed by massively parallel sequencing is a less expensive alternative to whole genome sequencing for exploring variation in specific sub-regions of the genome, such as exomes or regions associated with QTLs (Kiialainen et al. 2011). This method requires a priori availability of sequence data to design DNA capture probes. The protocol involves the design of complementary biotinylated RNA baits for regions of interest; the baits are hybridized to the targets, followed by hybrid capture and NGS. Popular capture methods are SureSelect, Nimblegen, and Raindance (Davey et al. 2011). This in-solution genome complexity reduction technology is a simplified method for identifying genetic diversity between plant varieties. This method has been employed successfully in recent times on various crops (Uitdewilligen et al. 2013; Zhou and
Table 4. Crop species in which transcriptome sequencing has been used for marker discovery
S.No.
Crop
Marker class
NGS platform
1
2
Eucalyptus
Maize
3
4
5
Wheat
Watermelon
Pepper
454 Roche
454 Roche
454 Roche
__
454 Roche
454 Roche
6
7
8
9
Lentil
Rubber tree
Sweet potato
Blackcurrant
10
11
12
13
Brassica napus
Pigeonpea
Sesame
Cucurbita pepo
14
Carrot
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
Lentil
Olive
Blackberry
A. mongolicus
C. nankingense
Camelina sativa L.
Oil palm
Tea
Amorphophallus
Sesame
Chickpea
Field pea
Faba bean
Chickpea
Chickpea
Silene vulgaris
Silene vulgaris
Tomato
SNP
SNP
SNP
SNP
SSR
SNP
SSR
SSR
SSR
SSR
SNP
SSR
SNP
SSR
SSR
SNP
SSR
SNP
SSR
SSR
SNP
SSR
SSR
SSR
SSR
SNP
SNP
SSR
SSR
SSR
SSR
SSR
SSR
SNP
SSR
SNP
SNP
454 Roche
Solexa Illumina
Solexa Illumina
454 Roche
Solexa Illumina
454 Roche
Solexa Illumina
454 Roche
Solexa Illumina
454 Roche
Solexa Illumina
454 Roche
454 Roche
Solexa Illumina
Solexa Illumina
454 Roche
454 Roche
Solexa Illumina
Solexa Illumina
454 Roche
454 Roche
454 Roche
454 Roche
454 Roche
454 Roche
454 Roche
-
No. of markers
identified
23,742
36,000
7,000
5,471
5,000
11,849
853
192
39,257
4,114
7,000
3,000
41,593
3,771
7,702
9,043
1,882
20,058
114
192
2,987
15,886
1,827
1,788
19,379
823
3,767
19,596
7,702
4,000
2,397
802
4,072
36,446
1,320
13,432
8,784
Reference
(Novaes et al. 2008)
(Barbazuk and Schnable 2007)
(Barbazuk and Schnable 2011)
(Akhunov et al. 2010)
(Guo et al. 2011)
(Nicolai et al. 2013)
(Kaur et al. 2011)
(Li et al. 2012)
(Wang et al. 2010)
(Russell et al. 2011)
(Trick et al. 2009)
(Dutta et al. 2011)
(Wei et al. 2011)
(Blanca et al. 2011)
(Iorizzo et al. 2011)
(Kaur et al. 2011)
(Kaya et al. 2013)
(Rowland et al. 2012)
(Zhou et al. 2012)
(Wang et al. 2013)
(Mudalkar et al. 2014)
(Pootakham et al. 2013)
(Wu et al. 2013)
(Zheng et al. 2013)
(Wei et al. 2011)
(Garg et al. 2011)
(Kaur et al. 2012)
(Kaur et al. 2012)
(Jhanwar et al. 2012)
(Jhanwar et al. 2012)
(Sloan et al. 2012)
(Sloan et al. 2012)
(Sim et al. 2012)
Table 5. Details of some important platforms and servers available for next generation sequencing data analysis
NGS platform/server | Website | Description
Galaxy | https://usegalaxy.org/ | Online server providing a variety of tools for NGS data analysis, including SNP identification and RNA-seq
Goldenhelix | www.goldenhelix.com | Software package providing comprehensive tools for NGS data analysis and visualization
CASAVA (Illumina) | www.support.illumina.com | Read processing and alignment, SNP and InDel calling, expression analysis, splice junction detection, RNA-seq analysis
CLC bio | www.clcbio.com | All-inclusive and user-friendly software package for NGS analysis, SNP calling, read processing, RNA-seq analysis, and data visualization
DNASTAR | www.dnastar.com | Provides all the software needed for next-gen sequence assembly and analysis in a single integrated package. It supports all major NGS technologies, making it easy to work with data from any type of NGS project
GenomicTools | http://code.google.com/p/ibm-cbc-genomic-tools |
GenoMiner | http://astridbio.com/genominer-genome-analyzer/ | GenoMiner is a next generation sequencing data analysis computer for life science researchers with or without IT background. With GenoMiner's easy-to-use graphical interface you can analyze your sequencing data in your own lab with only 15 clicks
HiPipe | http://hipipe.ncgm.sinica.edu.tw/ | HiPipe provides high performance NGS data analysis pipelines with an intuitive user interface to the community so that researchers with minimum IT or bioinformatics knowledge can perform common analyses on NGS data
JMP Genomics | http://www.jmp.com/ | JMP Genomics provides comprehensive tools to analyze rare and common variants, detect differential expression patterns, understand NGS data, discover reliable biomarker profiles, and incorporate pathway information into analysis workflows
Omics Pipe | http://sulab.scripps.edu/omicspipe/ | Omics Pipe provides two RNA sequencing (RNA-seq) pipelines, variant calling from whole exome sequencing (WES) and whole genome sequencing (WGS), and two ChIP-seq pipelines
species have been sequenced and are available in public repositories (www.ncbi.nlm.nih.gov/sra). For marker development, RNA-seq data are being used for SSR and SNP mining in less-studied plant species. The use of transcriptome sequencing for marker discovery is rapidly expanding (Table 4). A significant technical advantage of using transcript data for marker discovery is that transcripts rarely contain large stretches of repetitive DNA, which interfere with SNP detection in the large and complex genomes of many crop plants. This allows for increased confidence in base calling and correct assignment to individual alleles. Disadvantages are the lower number of SNPs compared with other techniques and lower reproducibility due to changing transcript expression patterns.
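As an illustration of SSR mining in transcript sequences, the following Python sketch scans a sequence for perfect di-, tri-, and tetranucleotide repeats with a regular expression. The minimum repeat numbers (6, 5, and 4) and the function name are assumptions chosen for the example; dedicated microsatellite tools use their own, configurable thresholds and also handle imperfect and compound repeats.

```python
# Minimal sketch: find perfect SSRs (microsatellites) in a DNA sequence.
# Thresholds and names are illustrative assumptions.
import re


def find_ssrs(sequence, min_repeats=None):
    """Return a sorted list of (start, motif, repeat_count) tuples.

    min_repeats maps motif length -> minimum number of tandem copies.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (homopolymer) or "ATAT" (a dinucleotide repeat).
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            repeats = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


if __name__ == "__main__":
    transcript = "GGC" + "AT" * 7 + "CCGTC" + "GAA" * 5 + "TTC"
    for start, motif, repeats in find_ssrs(transcript):
        print(f"position {start}: ({motif}){repeats}")
```

Positions are zero-based. Flanking sequence for primer design would be extracted around each hit in a later step, which is not shown here.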
A particularly elegant example of the application of RNA-seq in genomic studies is bulk segregant analysis (BSA) by RNA-seq for mutant mapping and detection (Liu et al. 2012). Mutants and non-mutants from a segregating population were bulked, and the bulks were subjected to RNA-seq. SNPs were identified, and the mutated locus and associated markers were found based on marker distortions between the bulks. As all expressed genes are sequenced, it will also be possible to identify the underlying mutation. Applying this approach to breeding traits will allow mapping and identification of major QTL, and possibly the underlying gene, in one experiment, which will also produce specific markers for MAS. At the low cost of only two sequenced mRNA bulks, this approach is likely to impact marker development in the future.
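The idea of detecting marker distortion between two bulks can be sketched as follows. For each SNP, the allele frequency in the mutant bulk is compared with that in the non-mutant bulk, and SNPs are ranked by the absolute difference. This is a simplified illustration, not the statistical procedure of Liu et al. (2012); the input layout, the minimum read depth, and the function name are assumptions made for the example.

```python
# Minimal sketch: rank SNPs by the difference in alternative-allele frequency
# between a mutant bulk and a non-mutant bulk. Input layout, depth threshold,
# and names are illustrative assumptions.


def rank_snps_by_bulk_difference(snps, min_depth=10):
    """Return (snp_id, |delta frequency|) pairs, largest difference first.

    snps: list of dicts with keys
        "id"       - SNP identifier
        "mutant"   - (ref_read_count, alt_read_count) in the mutant bulk
        "wildtype" - (ref_read_count, alt_read_count) in the non-mutant bulk
    """
    ranked = []
    for snp in snps:
        m_ref, m_alt = snp["mutant"]
        w_ref, w_alt = snp["wildtype"]
        m_depth, w_depth = m_ref + m_alt, w_ref + w_alt
        # Low-coverage SNPs give unreliable frequency estimates; skip them.
        if m_depth < min_depth or w_depth < min_depth:
            continue
        delta = m_alt / m_depth - w_alt / w_depth
        ranked.append((snp["id"], abs(delta)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    example = [
        {"id": "snp1", "mutant": (0, 40), "wildtype": (30, 10)},   # linked
        {"id": "snp2", "mutant": (20, 20), "wildtype": (22, 18)},  # unlinked
        {"id": "snp3", "mutant": (2, 3), "wildtype": (20, 20)},    # low depth
    ]
    for snp_id, difference in rank_snps_by_bulk_difference(example):
        print(snp_id, round(difference, 3))
```

Because RNA-seq read counts also reflect expression level, a real analysis would need to account for allele-specific expression and sampling noise, which this sketch ignores.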
Conclusions
The whole genome sequence is available for only a few select crop species, and most non-model species have limited or poor genomic resources. Most crop species are thus orphans as far as genomic resources are concerned, which limits their improvement. Molecular markers have evolved continuously from low-throughput hybridization-based platforms (e.g. RFLP) to medium-throughput PCR-based platforms (e.g. SSR), high-throughput SNP-based platforms, and finally ultra-high-throughput NGS-based markers. With the decreasing cost and increasing throughput of NGS, these ultra-high-throughput NGS-based molecular markers will replace most marker systems in the near future. NGS-based molecular markers have already been developed in both model and non-model crop species, making large numbers of markers available in non-model crop species. Bringing orphan crop species into the genomic phase in this way will make genomics-assisted crop improvement possible for them. Modern genomic and breeding approaches such as GWAS and GS have not yet been fully exploited for crop improvement, but can be increasingly deployed in both model and non-model crop species with the availability of these NGS-based markers. Thus, reduced representation sequencing, amplicon sequencing, and transcriptome sequencing using NGS technologies have paved the way for marker discovery that is ultra-high-throughput, cost-effective, and applicable to both model and non-model crop species. Hence, these approaches will revolutionize genomics-assisted crop improvement.
Acknowledgements
SMZ acknowledges the financial support of SERB-DST,
New Delhi (Government of India) for performing genetic
diversity studies in the common bean using molecular markers.
References
Ahmad R, Parfitt DE, Fass J, Ogundiwin E, Dhingra A,
Gradziel TM, et al. 2011. Whole genome sequencing of
peach (Prunus persica L.) for SNP identification and
selection. BMC Genomics 12: 569
Akhunov ED, Akhunova AR, Anderson OD, Anderson JA,
Blake N, Clegg MT, Coleman-Derr D. 2010. Nucleotide
diversity maps reveal variation in diversity among wheat
genomes and chromosomes. BMC Genomics 11: 702
Altshuler D, Pollara VJ, Cowles CR, Van Etten WJ, Baldwin
J, Linton L, Lander ES. 2000. An SNP map of the human
genome generated by reduced representation shotgun
sequencing. Nature 407: 513-516
Arai-Kichise Y, Shiwa Y, Nagasaki H, Ebana K, Yoshikawa
H, Yano M, Wakasa K. 2011. Discovery of genome-wide
DNA polymorphism in a landrace cultivar of japonica rice
by whole genome sequencing. Plant Cell Physiol. 52: 274-282
Aranzana MJ, Kim S, Zhao K, Bakker E, Horton M, Jakob
K, Lister C, Molitor J, Shindo C, Tang C. 2005. Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance
genes. PLoS Genet. 1: e60
Barbazuk WB, Schnable PS. 2011. SNP discovery by transcriptome pyrosequencing. Methods Mol. Biol. 729: 225-246
Bastien M, Sonah H, Belzile F. 2014. Genome-wide association mapping of resistance in soybean with a genotyping-by-sequencing approach. Plant Genome 7: 1-13
Beyene Y, Semagn K, Mugo S, Tarekegne A, Babu R,
Meisel B, Sehabiague P, Makumbi D, Magorokosho C,
Oikeh S. 2015. Genetic gains in grain yield through
genomic selection in eight bi-parental maize populations
under drought stress. Crop Sci. 55: 154-163
Blair MW, Cortés AJ, Penmetsa RV. 2013. A high-throughput SNP marker system for parental polymorphism
screening, and diversity analysis in common bean
(Phaseolus vulgaris L.). Theor. Appl. Genet. 126: 535-548
Blair MW, Hurtado N, Chavarro CM, Muñoz Torres MC,
Giraldo MC, Pedraza F, Tomkins J, Wing RA. 2011.
Gene-based SSR markers for common bean (Phaseolus
Grattapaglia D, Kirst M. 2011. Stability of genomic selection prediction models across ages and environments. In:
BMC Proceed., vol. Suppl 7. BioMed Central Ltd, p O14
Resende MF, Munoz P, Acosta J, Peter G, Davis J,
Grattapaglia D, Resende M, Kirst M. 2012a. Accelerating
the domestication of trees using genomic selection: accuracy of prediction models across ages and environments.
New Phytol. 193: 617-624
Resende MF, Muñoz P, Resende MD, Garrick DJ, Fernando
RL, Davis JM, Jokela EJ, Martin TA, Peter GF, Kirst M.
2012b. Accuracy of genomic selection methods in a standard data set of loblolly pine (Pinus taeda L.). Genetics
190: 1503-1510
Riedelsheimer C, Technow F, Melchinger AE. 2012.
Comparison of whole-genome prediction models for traits
with contrasting genetic architecture in a diversity panel of
maize inbred lines. BMC Genomics 13: 452
Sansaloni CP, Petroli CD, Carling J, Hudson CJ, Steane DA,
Myburg AA, Grattapaglia D, Vaillancourt RE, Kilian A.
2010. A high-density Diversity Arrays Technology
(DArT) microarray for genome-wide genotyping in
Eucalyptus. Plant Methods 6: 16
Schneeberger K, Ossowski S, Lanz C, Juul T, Petersen AH,
Nielsen KL, Jørgensen J, Weigel D, Andersen SU. 2009.
SHOREmap: simultaneous mapping and mutation identification by deep sequencing. Nat. Meth. 6: 550-551
Schulz-Streeck T, Ogutu JO, Piepho HP. 2013. Comparisons
of single-stage and two-stage approaches to genomic
selection. Theor. Appl. Genet. 126: 69-82
Sexton TR, Shapter FM. 2012. Amplicon sequencing for
marker discovery in Molecular Markers in Plants, RJ
Henry, eds, Blackwell Publishing Ltd., Oxford, UK, pp
35-56
Shu Y, Yu D, Wang D, Bai X, Zhu Y, Guo C. 2012.
Genomic selection of seed weight based on low-density
SCAR markers in soybean. Genetics Mol. Res. GMR 12:
2178-2188
Singh H, Deshmukh RK, Singh A, Singh AK, Gaikwad K,
Sharma TR, et al. 2010. Highly variable SSR markers suitable for rice genotyping using agarose gels. Mol. Breed.
25: 359-364
Snowdon RJ, Iniguez Luy FL. 2012. Potential to improve oilseed
rape and canola breeding in the genomics era. Plant Breed.
131: 351-360
Sonah H, Bastien M, Iquira E, Tardivel A, Légaré G, Boyle
B, Normandeau É, Laroche J, Larose S, Jean M. 2013. An
improved genotyping by sequencing (GBS) approach
offering increased versatility and efficiency of SNP discovery and genotyping. PLoS ONE 8: e54603
Sonah H, Deshmukh RK, Sharma A, Singh VP, Gupta DK,
Gacche RN, et al. 2011. Genome-wide distribution and
organization of microsatellites in plants: an insight into
marker development in Brachypodium. PLoS ONE 6:
e21298
Sonah H, O'Donoughue L, Cober E, Rajcan I, Belzile F.
2014. Identification of loci governing eight agronomic
Table S1. Genomic selection efforts performed for trait improvement in different crops using different statistical models and marker genotyping platforms
Species | Traits | Population type | Population size | Total markers | Accuracy of GEBVs | Statistical model | Reference
 | | RILs | 415 | 69 SSR | 0.90-0.93 | BLUP | (Lorenzana and Bernardo 2009)
Barley | | DHLs | 150 | 223 RFLP | 0.64-0.83 | BLUP | (Lorenzana and Bernardo 2009)
Barley | | DHLs | 140 | | 0.66-0.85 | BLUP | (Lorenzana and Bernardo 2009)
Eucalyptus | | | 920 | 3564 DArTs | 0.54-0.62 | BLUP | (Grattapaglia and Resende 2011)
Eucalyptus | | | 783 | 3120 DArTs | 0.53-0.69 | BLUP | (Grattapaglia et al. 2012)
Loblolly pine | | 61 full-sib families | 790-840 | 3938 SNPs | 0.64-0.77 | BLUP | (Resende et al. 2011)
Loblolly pine | | Full-sib offspring | 149 | 3406 SNPs | 0.3-0.83 | Pedigree model |
Loblolly pine | 17 traits | 70 full-sib families | 951 | 4853 SNPs | 0.37-0.77 | RR-BLUP, Bayes A, Bayes C, Bayesian LASSO | (Resende et al. 2012b)
 | | 50 haploids | 769 | 4755 SNP | 0.33-0.94 | GBLUP, BayesA, and BayesCπ |
 | | 61 full-sib families | 800 | 4825 SNP | 0.63-0.75 | BLUP | (Resende et al. 2012a)
Maize | | CIMMYT lines | 300 | 1148 SNPs | 0.42-0.79 | M-BL |
Maize | 3 morphological traits, grain moisture | F2 | 349 | 160 SSR | 0.59-0.72 | BLUP | (Lorenzana and Bernardo 2009)
Maize | 3 morphological traits, grain moisture | Testcrosses of DHLs | 371 | 125 SNPs | 0.31-0.55 | BLUP | (Lorenzana and Bernardo 2009)
Maize | 8 morphological traits, 3 chemical components, grain moisture | RILs | 223 | | 0.48-0.73 | BLUP | (Lorenzana and Bernardo 2009)
Maize | 5 morphological traits, grain moisture | RILs | 119 | | 0.40-0.50 | BLUP | (Lorenzana and Bernardo 2009)
Maize | | Inbred lines | 289 | 56,110 SNPs | 0.45-0.82 | RR-BLUP, LASSO | (Riedelsheimer et al. 2012)
Maize | Grain yield | DHLs | 177 | 768 SNP | 0.476 to 0.710 | RR-BLUP | (Schulz-Streeck et al. 2013)
Maize | Drought | F2:3 | 300 | 286 SNPs | | |
Pear | 9 traits | Cultivars | 76 | 155 SSRs, 4 RAPD-STS | 0.2-0.75 | Bayesian regression |
Rapeseed | 6 traits | DHLs | 391 | 253 SNP | 0.41-0.84 | RR-BLUP | (Wurschum et al. 2013)
Soybean | HSW | Accessions (Guadeloupe) | 288 | 79 SCAR | 0.69-0.904 | RR-BLUP, BLR |
Soybean | Primary embryogenesis capacity | RILs | 126 | 80 SSRs | 0.12-0.78 | Empirical Bayesian method |
Sugar beet | 6 traits | | 924 | 677 SNP | 0.48-0.80 | RR-BLUP | (Wurschum et al. 2013)
Sugar beet | | | 310 | 384 SNP | 0.4-0.86 | RR-BLUP | (Hofheinz et al. 2012)
Sugarcane | 10 traits | Accessions (Reunion Island) | 167 | 1499 DArT | 0.29-0.62 | LASSO |
Sugarcane | 10 traits | Accessions (Guadeloupe) | 167 | 1499 DArT | 0.11-0.5 | LASSO |
Wheat | Grain yield | CIMMYT lines | 599 | 1279 DArTs | 0.48-0.61 | PM-RKHS |
Wheat | 8 grain quality | DHLs | 209 | | 0.32-0.84 | RR-BLUP |
Wheat | 8 grain quality | DHLs | 174 | 574 DArTs | 0.41-0.73 | RR-BLUP |
Wheat | Grain yield | Commercial lines | 139 | 2395 SNPs | 0.175-0.514 | G-BLUP |
Wheat | | Elite lines | 306 | 1717 DArT | 0.20-0.75 | Bayesian LASSO, BRR | (Perez-Rodriguez et al. 2012)
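The table reports the accuracy of genomic estimated breeding values (GEBVs) without restating how each study computed it. A common convention, assumed here and not confirmed for every cited study, is the Pearson correlation between predicted GEBVs and observed phenotypes in a validation set. The Python sketch below computes this correlation; the function name and the example numbers are hypothetical.

```python
# Minimal sketch: prediction accuracy as the Pearson correlation between
# GEBVs and observed phenotypes (an assumed convention, see lead-in).
from math import sqrt


def prediction_accuracy(gebvs, observed):
    """Return the Pearson correlation between predictions and observations."""
    n = len(gebvs)
    if n != len(observed) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_g = sum(gebvs) / n
    mean_o = sum(observed) / n
    cov = sum((g - mean_g) * (o - mean_o) for g, o in zip(gebvs, observed))
    var_g = sum((g - mean_g) ** 2 for g in gebvs)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_g == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_g * var_o)


if __name__ == "__main__":
    # Hypothetical validation set: predicted values and measured phenotypes.
    predicted = [1.2, 0.4, -0.3, 0.9, -1.1]
    measured = [5.8, 5.1, 4.6, 5.2, 4.0]
    print(round(prediction_accuracy(predicted, measured), 3))
```

Some studies additionally divide this correlation by the square root of the trait heritability, so values from different studies are not always directly comparable.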
Fig. S1. Sequencing cost reduction over the last fourteen years. Data
were obtained from the NHGRI Genome Sequencing Program
(www.genome.gov/sequencingcosts/).