
May 2009

Foreign Language Education in China (Quarterly), Vol. 2, No. 2

2009 (2): 65-73

added value (Leech 1997)



regular expressions






1.

BNC (British National Corpus), BoE (Bank of English), ANC (American National Corpus)
BNC, BoE


Teubert 2005
Leech 1997, added value
regular expressions, wildcards

1 06JA740007



MICASE (Michigan Corpus of Academic Spoken English)

2.





stand-alone: WordSmith Tools, AntConc
Jurafsky & Martin 2009
analyze / analyse: analyze, analyse, analysed, analyzed, analyzes, analyses, analyzing, analysing


analy- analy-
analy[sz]
ed|es|ing

analytical, analysis

analy[sz](e|ed|es|ing)
|
ICE (International Corpus of English)
analy, analy, analytical

[sz] s z
CLAWS
TreeTagger 95%
Part-of-Speech tagging
analytical, analysis
group: e, ed, es, ing
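The verb-form pattern discussed above can be tried out with a minimal Python sketch. It assumes the suffix alternatives are meant to be grouped, as the mention of "group" suggests. The `\b` word boundaries and the non-capturing `(?:...)` form are additions for this illustration and do not come from the article.

```python
import re

# Assumption: the suffixes e, ed, es, ing are wrapped in a group so that "|"
# applies only to them. \b and (?:...) are additions for this sketch.
ANALYSE_FORMS = re.compile(r"\banaly[sz](?:e|ed|es|ing)\b")

words = [
    "analyze", "analyse", "analysed", "analyzed", "analyzes",
    "analyses", "analyzing", "analysing", "analytical", "analysis",
]

# The eight verb forms are accepted; "analytical" and "analysis" are not.
matched = [w for w in words if ANALYSE_FORMS.fullmatch(w)]
rejected = [w for w in words if not ANALYSE_FORMS.fullmatch(w)]

# Without the group, "|" splits the whole expression into four alternatives
# (analy[sz]e, ed, es, ing), so a bare "ed" inside any word is a hit.
UNGROUPED = re.compile(r"analy[sz]e|ed|es|ing")
false_hit = UNGROUPED.search("walked")

print(matched)
print(rejected)
print(false_hit.group(0))
```

The grouped form keeps the stem `analy[sz]` obligatory for every alternative, which is what excludes unrelated words ending in -ed, -es, or -ing.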
SWECCL 1.0
2009
SWECCL 2.0 2008
CLAWS4


1. Even_RR where_CS concepts_NN2 and_CC practicalities_NN2 were_VBDR not_XX
handled_VVN adequately_RR ,_, it_PPH1 was_VBDZ obvious_JJ that_CST
schools_NN2 of_IO architecture_NN1 were_VBDR making_VVG energy_NN1
efficiency_NN1 a_AT1 significant_JJ aspect_NN1 of_IO their_APPGE
curriculum_NN1 ._.
2. Although_CS there_EX is_VBZ no_AT written_JJ evidence_NN1 to_TO
substantiate_VVI this_DD1 ,_, no_AT major_JJ repairs_NN2 or_CC
maintenance_NN1 were_VBDR carried_VVN out_RP on_II the_AT machinery_NN1
,_, and_CC it_PPH1 is_VBZ evident_JJ that_CST it_PPH1 performed_VVD
well_RR from_II the_AT start_NN1 ._.
3. It_PPH1 is_VBZ clear_JJ that_CST prospects_NN2 of_IO a_AT1 widespread_JJ
programme_NN1 of_IO adoption_NN1 are_VBR grim_JJ ._.

it + BE + adj + that
extraposed construction
it that BE
analyse, analyze
\S+ \w+ \w*
\s \S+
\w+
\w* 0
\s

\S+

_
BE: \S+_VB\w*
_VB (V, B): VB, BE
BE (VB): am_VBM, are_VBR, be_VB0, be_VBI, been_VBN, being_VBG, is_VBZ, m_VBM, re_VBR, s_VBZ, was_VBDZ, were_VBDR
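As a quick check, the following Python sketch tests whether the fragment `\S+_VB\w*` covers every tagged form of BE listed above. The non-BE tokens are taken from the tagged example sentences. The variable names are chosen for this illustration only.

```python
import re

# One tagged token whose tag starts with VB, i.e. any form of BE.
BE_TOKEN = re.compile(r"\S+_VB\w*")

be_forms = (
    "am_VBM are_VBR be_VB0 be_VBI been_VBN being_VBG "
    "is_VBZ m_VBM re_VBR s_VBZ was_VBDZ were_VBDR"
).split()

# Tokens with other tags, taken from the example sentences.
other_tokens = ["handled_VVN", "it_PPH1", "obvious_JJ", "that_CST"]

all_be_matched = all(BE_TOKEN.fullmatch(tok) for tok in be_forms)
wrongly_matched = [tok for tok in other_tokens if BE_TOKEN.fullmatch(tok)]

print(all_be_matched)
print(wrongly_matched)
```

Because only BE tags begin with `VB` in the forms listed here, the single fragment stands in for all twelve word-tag pairs.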
it + BE + adj + that

\S+_PPH1\s\S+_VB\w*\s\S+_J\w+\s\S+_CST\s
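The expression above can be run against the three tagged example sentences with a short Python sketch. The sentences are copied from the examples, with line breaks replaced by spaces. The function name `find_it_be_adj_that` is hypothetical.

```python
import re

# The it + BE + adj + that expression, exactly as given in the text.
IT_BE_ADJ_THAT = re.compile(r"\S+_PPH1\s\S+_VB\w*\s\S+_J\w+\s\S+_CST\s")

TAGGED_SENTENCES = [
    "Even_RR where_CS concepts_NN2 and_CC practicalities_NN2 were_VBDR not_XX "
    "handled_VVN adequately_RR ,_, it_PPH1 was_VBDZ obvious_JJ that_CST "
    "schools_NN2 of_IO architecture_NN1 were_VBDR making_VVG energy_NN1 "
    "efficiency_NN1 a_AT1 significant_JJ aspect_NN1 of_IO their_APPGE "
    "curriculum_NN1 ._.",
    "Although_CS there_EX is_VBZ no_AT written_JJ evidence_NN1 to_TO "
    "substantiate_VVI this_DD1 ,_, no_AT major_JJ repairs_NN2 or_CC "
    "maintenance_NN1 were_VBDR carried_VVN out_RP on_II the_AT machinery_NN1 "
    ",_, and_CC it_PPH1 is_VBZ evident_JJ that_CST it_PPH1 performed_VVD "
    "well_RR from_II the_AT start_NN1 ._.",
    "It_PPH1 is_VBZ clear_JJ that_CST prospects_NN2 of_IO a_AT1 widespread_JJ "
    "programme_NN1 of_IO adoption_NN1 are_VBR grim_JJ ._.",
]


def find_it_be_adj_that(sentences):
    """Return every matched span, with the trailing whitespace removed."""
    return [m.group(0).strip()
            for s in sentences
            for m in IT_BE_ADJ_THAT.finditer(s)]


hits = find_it_be_adj_that(TAGGED_SENTENCES)
for hit in hits:
    print(hit)
```

Each `\S+_TAG` piece consumes one word together with its tag, and each `\s` consumes the separator, so the expression matches exactly four adjacent tokens.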


1 ication_NN1 ,_, it_PPH1 is_VBZ important_JJ that_CST the_AT informat

2 And_CC it_PPH1 s_VBZ unlikely_JJ that_CST Chas_NP1 s_GE

3 xceeded_VVN ,_, it_PPH1 was_VBDZ worrying_JJ that_CST the_AT Ashby_NP

4 ccessful_JJ ,_, it_PPH1 was_VBDZ essential_JJ that_CST the_AT supply_N

5 ccupier_NN1 ,_, it_PPH1 is_VBZ likely_JJ that_CST subsequent_JJ p

6 e_VV0 in_RP ,_, it_PPH1 is_VBZ vital_JJ that_CST if_CS we_PPIS2

7 _NN1 ,_, and_CC it_PPH1 is_VBZ interesting_JJ that_CST although_CS ang

8 s_NN2 ,_, so_CS it_PPH1 is_VBZ important_JJ that_CST very_RG basic_J

9 As_CSA it_PPH1 was_VBDZ apparent_JJ that_CST Stalin_NP1 prop

10 s_DD1 point_NN1 it_PPH1 was_VBDZ obvious_JJ that_CST there_EX were_V

11 ntioned_VVN ,_, it_PPH1 is_VBZ noticeable_JJ that_CST the_AT basic_JJ

12 1 units_NN2 ,_, it_PPH1 is_VBZ possible_JJ that_CST an_AT1 oolitic-

13 er_NPM1 1968_MC it_PPH1 was_VBDZ clear_JJ that_CST the_AT radicals

14 emories_NN2 ,_, it_PPH1 is_VBZ clear_JJ that_CST the_AT relation

15 0_MC ,_, and_CC it_PPH1 is_VBZ clear_JJ that_CST Hope_NP1 s_GE




Lawler & Dry 1998
real linguistics

3. PatternBuilder





BE
BE: it might have been possible that...
PatternBuilder

CLAWS4 130
3.1 PatternBuilder
PatternBuilder
1 PatternBuilder
5






N (noun), V (verb)
be
PatternBuilder

Figure 1. PatternBuilder


PatternBuilder

Perl
PatternBuilder Windows
Perl Perl Win32::GUI PatternBuilder CLAWS4
PatternBuilder

CLAWS4






CLAWS4
PatternBuilder 1

PatternBuilder


PatternBuilder
CLAWS4 PatternBuilder

PatternBuilder


progressives
BE

1

PatternBuilder





PatternBuilder PatternBuilder


PatternBuilder



Test this pattern 1

PatternBuilder


3.2
it + BE + adj + that
PatternBuilder
it
it BE
that 4
it that
BE
BE
VB
J 2
PatternBuilder
P pronoun
Expand => 2

Get Pattern
PP: third person sing. neuter personal pronoun (it)
VB: any form of the verb be
J: any form of any adjective
CS: that (as conjunction)
\S+_PPH1\s\S+_VB\w*\s\S+_J\w+\s\S+_CST\s
check Hide POS tags
Test this pattern
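A pattern of this kind can be assembled slot by slot from tag-based fragments, as the following Python sketch illustrates. PatternBuilder's internals are not shown in the article, so the `FRAGMENTS` table and the `build_pattern` function are hypothetical. Only the resulting expression is taken from the text.

```python
import re

# Hypothetical slot-to-fragment table; only the final expression comes from
# the article. Each fragment matches one word together with its tag.
FRAGMENTS = {
    "it": r"\S+_PPH1",    # third person sing. neuter personal pronoun (it)
    "BE": r"\S+_VB\w*",   # any form of the verb be
    "adj": r"\S+_J\w+",   # any form of any adjective
    "that": r"\S+_CST",   # that (as conjunction)
}


def build_pattern(slots):
    """Join the fragments for the given slots, each followed by \\s."""
    return "".join(FRAGMENTS[slot] + r"\s" for slot in slots)


pattern = build_pattern(["it", "BE", "adj", "that"])
print(pattern)

# Try the assembled pattern on one of the concordance lines.
sample = "ication_NN1 ,_, it_PPH1 is_VBZ important_JJ that_CST the_AT informat"
found = re.search(pattern, sample)
print(found.group(0).strip())
```

Keeping the fragments in a table means a new construction only needs a different slot sequence, not a hand-written expression.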

2 it + BE + adj + that

PatternBuilder

it + BE + adj + that BNC


1. It_PPH1 is_VBZ perhaps_RR not_XX really_RR surprising_JJ that_CST
avian_JJ fossils_NN2 are_VBR so_RG uncommon_JJ ._.
2. It_PPH1 may_VM not_XX have_VHI been_VBN reasonably_RR foreseeable_JJ
that_CST Dan_NP1 would_VM have_VHI thrown_VVN Valerie_NP1
resulting_VVG in_II her_APPGE death_NN1 ._.
3. It_PPH1 should_VM now_RT be_VBI clear_JJ that_CST the_AT
pronunciation_NN1 described_VVN in_II this_DD1 course_NN1 is_VBZ
only_RR one_MC1 of_IO a_AT1 vast_JJ number_NN1 of_IO possible_JJ
varieties_NN2 ._.
4. It_PPH1 is_VBZ perhaps_RR also_RR not_XX surprising_JJ that_CST
her_APPGE idea_NN1 of_IO transforming_VVG the_AT former_DA site_NN1
of_IO the_AT Nuremberg_NP1 rallies_VVZ into_II a_AT1 peace_NN1
park_NN1 should_VM meet_VVI with_IW opposition_NN1 ._.

1 BE: perhaps_RR, really_RR, not_XX (it + BE + adj + that)
2 It BE: may_VM, not_XX, have_VHI
3 1 BE
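The four sentences above place modals, negation, or adverbs around BE, so the basic four-token expression does not match them. The Python sketch below shows one way to allow such optional material. The extended expression is an assumption made for illustration, since the article's own extended pattern is not recoverable from this text. The tag choices (VM, XX, VH, R) follow the intervening tokens seen in the examples.

```python
import re

# Basic expression from the text: it + BE + adj + that, strictly adjacent.
BASIC = re.compile(r"\S+_PPH1\s\S+_VB\w*\s\S+_J\w+\s\S+_CST\s")

# Assumed optional material (not from the article):
# before BE: modals (VM), not (XX), forms of have (VH...), adverbs (R...)
# after BE:  not (XX), adverbs (R...)
OPTIONAL_BEFORE_BE = r"(?:\S+_(?:VM|XX|VH\w*|R\w+)\s)*"
OPTIONAL_AFTER_BE = r"(?:\S+_(?:XX|R\w+)\s)*"

EXTENDED = re.compile(
    r"\S+_PPH1\s" + OPTIONAL_BEFORE_BE
    + r"\S+_VB\w*\s" + OPTIONAL_AFTER_BE
    + r"\S+_J\w+\s\S+_CST\s"
)

# Openings of the four examples, truncated shortly after that_CST.
EXAMPLES = [
    "It_PPH1 is_VBZ perhaps_RR not_XX really_RR surprising_JJ that_CST avian_JJ",
    "It_PPH1 may_VM not_XX have_VHI been_VBN reasonably_RR foreseeable_JJ that_CST Dan_NP1",
    "It_PPH1 should_VM now_RT be_VBI clear_JJ that_CST the_AT",
    "It_PPH1 is_VBZ perhaps_RR also_RR not_XX surprising_JJ that_CST her_APPGE",
]

basic_hits = [s for s in EXAMPLES if BASIC.search(s)]
extended_hits = [s for s in EXAMPLES if EXTENDED.search(s)]

print(len(basic_hits), len(extended_hits))
```

The `(?:...)*` groups make the intervening tokens optional and repeatable, so the extended expression still matches the plain four-token sequences.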



PatternBuilder
Get (Pattern)*
optional
3

3 it + BE + adj + that

it BE
have not BE

Jurafsky, D. & J. H. Martin. 2009. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd ed.) [M]. Upper Saddle River, NJ: Prentice Hall.
Lawler, J. & H. A. Dry. 1998. Using Computers in Linguistics: A Practical Guide [M]. London: Routledge.
Leech, G. 1997. Introducing corpus annotation [A]. In R. Garside, G. Leech & A. McEnery (eds.). Corpus Annotation: Linguistic Information from Computer Text Corpora [C]. London: Longman. 1-18.
Teubert, W. 2005. My version of corpus linguistics [J]. International Journal of Corpus Linguistics 10 (1): 1-13.
4.

2008, SWECCL 2.0 [M]
2009, SWECCL 1.0 [M]

PatternBuilder 1964

frankliang0086@yahoo.com.cn
