November 2010

Modern Foreign Languages (Quarterly), Vol. 33, No. 4



24-25 September 2010



CLC number: H319  Document code: A  Article ID: 1003-6105(2010)04-0419-08

1 Herdan
language in mass, language in line



Brown 1967
John Carroll
Brown LOB Frown FLOB 500 AHI
BNC CLEC lognormal model
Herdan
Barber



Brown

Herdan Brown
1960
Type-Token Mathematics: A Textbook of Mathematical Linguistics

George Miller
WordNet



Brigham Young
Mark Davies




ICAME (International Computer Archive of Modern and Medieval English), LDC (The Linguistic Data Consortium)
TACT, LEXA, WordCruncher
2 WordSmith, AntConc



1 Penn Treebank, Prague Dependency Treebank, PropBank, Penn Discourse Treebank, RSTBank, TimeBank

Linguistic Data Consortium (LDC)
supervised machine learning
automatic syntactic parsing, automatic semantic analysis

parsing, information extraction, word sense disambiguation, question-answer system, automatic summarization


Chinese Linguistic Data Consortium (CLDC)
text data mining




CLDC text data mining mining
extraction

parallel corpus



2010
data
50 knowledge


strategy
transit
2

1







performance-based approach

2































1






September 2010
TaCL







3


2
pedagogic ESP
processing




1
1990


Chomsky
"Corpus linguistics does not exist" (Tognini-Bonelli 2001: 50)
Widdowson 2000

Linguistics applied Widdowson
Halliday (1993: 1)




Halliday
Appliable linguistics 3










Applied Corpus Linguistics

2

1



1










2
2


3 1




2
3



4

1
British National Corpus
















PatCount
2008 Colligator
2009
www.corpus4u.org





2009
Halliday, M. A. K. 1993. Quantitative studies and probabilities in grammar. In Michael Hoey (ed.). Data, Description, Discourse [C]. London: HarperCollins Publishers, 1-25.

Herdan, G. 1960. Type-Token Mathematics [M]. The Hague: Mouton.

Tognini-Bonelli, E. 2001. Corpus Linguistics at Work [M]. Amsterdam: John Benjamins.

Widdowson, H. G. 2000. On the limitations of linguistics applied [J]. Applied Linguistics 21 (1): 3-25.
2009
J 3 8-12
2008 PatCount
J

5 71-76
2009

J

3 18-23


2010-10-15

2010-10-22