Tel. 958 241000 - Ext. 20243. Fax 958 243678. jsantana@ugr.es www.ugr.es/local/jsantana
TYPES OF CORPORA 1
According to temporal variety:
  o Synchronic: one variety, normally contemporary (at compilation time).
  o Diachronic: e.g. Helsinki Corpus
According to type of speaker: native vs. learner corpora
According to annotation:
  o Plain: e.g. Project Gutenberg texts, produced by scanning; no information about the text (usually not even the edition): not really a corpus but a collection of texts.
  o Annotated:
    - marked up for formatting attributes, e.g. page breaks, paragraphs, font sizes, italics, etc.: Brown
    - annotated with identifying information, e.g. edition date, author, genre, register, etc.: BNC, ICE-GB
    - annotated for part of speech, syntactic structure, discourse information, etc.: LOBTAG, BNC, ICE-GB
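The difference between a plain text and one annotated for part of speech can be illustrated with a minimal sketch. The sample sentence, the word_TAG notation, the tag labels and the function names are all assumptions made for illustration; they are not taken from any of the corpora named above.

```python
# Hypothetical example: the sentence, the word_TAG format and the tag labels
# are illustrative assumptions, not the scheme of any particular corpus.
plain = "The cat sat on the mat"
tagged = "The_DET cat_NOUN sat_VERB on_PREP the_DET mat_NOUN"


def parse_tagged(text):
    """Split a word_TAG string into a list of (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # Split on the last underscore so words containing "_" stay whole.
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


def strip_annotation(text):
    """Recover the plain text by dropping the part-of-speech tags."""
    return " ".join(word for word, _ in parse_tagged(text))


def words_with_tag(text, wanted):
    """Return the words carrying a given tag, e.g. all nouns."""
    return [word for word, tag in parse_tagged(text) if tag == wanted]
```

With annotation, a query such as words_with_tag(tagged, "NOUN") becomes possible; on the plain version only searches over word forms are available.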
For a comprehensive list of corpora and links to them, visit:
  http://www.uow.edu.au/~dlee/CBLLinks.htm
  http://www.ugr.es/~pedrou/
TYPES OF CORPORA 2