Scientific Journal of Information Engineering(SJIE)

Vol.2 No.1
Feb. 2012

Using Affinity Propagation Cluster for Text Auto Summarization


Suhuan Sun, Changwei Zhao
Electronic and Information School of Henan University of Science and Technology, Luoyang, Henan Province, China, 471003

Abstract: Automatic summarization helps readers obtain the information they need accurately and efficiently from massive amounts of
information, and it has attracted increasing attention. This paper presents a new method for Chinese text summarization based on the
Affinity Propagation Cluster (APC) algorithm. APC does not require the number of clusters or the initial representative exemplars to be
set in advance, so it avoids the local optima and unstable clustering results caused by randomly selecting initial representative
exemplars. The algorithm is also computationally efficient. Experimental results show that Chinese automatic text summarization
based on APC achieves higher accuracy than KNN clustering, which indicates that APC is a suitable method for automatic text summarization.

Key words: affinity propagation cluster; text auto summarization; Chinese text

[1]

IBM, Luhn [2]
Lin, clue words
[3] Julian K. [4]

Tadashi, diversity of concepts
[5] Tadashi
[6] [5] [6]

No. 2010A6300025 No. 2011A520011

www.sjie.org PP.26-30 2011 American V-King Scientific Publishing, LTD



Filatova (atomic events)


[7]

[5][6]

VSM [6], LSA

[8][9]

[9]

1
1.1
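Let the text be $T = \{S_1, S_2, \dots, S_n\}$. Summarization selects the sentences $\{S_{k1}, S_{k2}, \dots, S_{km}\}$ from $T$, which gives the summary

$$A = \{S_{k1}, S_{k2}, \dots, S_{km}\} \subset T$$

Following [8,9], let $S_i$ and $S_j$ be the $i$-th and $j$-th sentences of $T$ and let $w$ denote a word, so that $S_i = \{w_{i1}, w_{i2}, \dots, w_{ip}\}$ and $S_j = \{w_{j1}, w_{j2}, \dots, w_{jq}\}$. $|S_i|$ and $|S_j|$ are the numbers of words in $S_i$ and $S_j$, and $|W|$ is the number of words in $T$. The similarity of $S_i$ to $S_j$ is

$$sim(S_i, S_j) = -\sum_{k=1}^{|S_i|} c_k, \qquad c_k = \begin{cases} \log_2 |S_j| & \text{if } w_{ik} \in S_j \\ \log_2 |W| & \text{if } w_{ik} \notin S_j \end{cases}$$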

stop-of-word
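The sentence similarity of this subsection can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes each sentence is already segmented into a list of word tokens with stop words removed, it takes $|W|$ as the number of distinct words in the text, and it takes the similarity as the negative coding cost, following the sign convention of Frey and Dueck [9]. The names `sentence_similarity` and `similarity_matrix` are hypothetical.

```python
import math

def sentence_similarity(s_i, s_j, vocab_size):
    """Negative cost of coding the words of s_i using s_j and the vocabulary.

    s_i, s_j   : lists of word tokens (assumed segmented, stop words removed)
    vocab_size : |W|, assumed here to be the number of distinct words in the text
    """
    words_j = set(s_j)
    cost = 0.0
    for w in s_i:
        if w in words_j:
            cost += math.log2(len(s_j))   # word found in S_j: cost log2|S_j|
        else:
            cost += math.log2(vocab_size)  # word not in S_j: cost log2|W|
    return -cost

def similarity_matrix(sentences):
    """Pairwise similarities; entry [i][j] is sim(S_i, S_j), which is not symmetric."""
    vocab = {w for s in sentences for w in s}
    return [[sentence_similarity(a, b, len(vocab)) for b in sentences]
            for a in sentences]
```

Because a matched word costs $\log_2 |S_j|$ and an unmatched word costs the larger $\log_2 |W|$, sentences that share more words receive a similarity closer to zero.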

1.2

affinity propagation cluster [9], exemplar


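Two kinds of messages are passed between points: the responsibility $r$ and the availability $a$. The responsibility $r(i,k)$, sent from point $i$ to candidate exemplar $k$, reflects the accumulated evidence for how well suited $k$ is to serve as the exemplar of $i$. The availability $a(i,k)$, sent from $k$ to $i$, reflects how appropriate it would be for $i$ to choose $k$ as its exemplar [9].

$$r(i,k) \leftarrow s(i,k) - \max_{k' \,\mathrm{s.t.}\, k' \neq k} \{ a(i,k') + s(i,k') \} \tag{1}$$

$$r(k,k) \leftarrow s(k,k) - \max \{ s(i,k) \} \tag{2}$$

$$a(i,k) \leftarrow \min \Big\{ 0,\; r(k,k) + \sum_{i' \,\mathrm{s.t.}\, i' \notin \{i,k\}} \max\{0, r(i',k)\} \Big\} \tag{3}$$

$$a(k,k) \leftarrow \sum_{i' \,\mathrm{s.t.}\, i' \neq k} \max\{0, r(i',k)\} \tag{4}$$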

a availability0 r 2
r (i, k ) a(i, k )
preference P P
S (i, i )
P
S (i, i ) P

P
[9]
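The message-passing updates of this subsection can be sketched with NumPy as below. This is a hedged illustration rather than the authors' code: the function name `affinity_propagation`, the damping factor, and the fixed iteration count are assumptions that do not appear in the text, and the sketch applies the general responsibility update to the diagonal entries as well, as in Frey and Dueck [9]. The preference $P$ is written onto the diagonal $s(i,i)$ of the similarity matrix, and at least two points are assumed.

```python
import numpy as np

def affinity_propagation(S, preference, damping=0.5, max_iter=200):
    """Return (exemplar indices, exemplar index assigned to each point)."""
    S = np.array(S, dtype=float)        # copy, so the caller's matrix is untouched
    n = S.shape[0]
    np.fill_diagonal(S, preference)     # s(i, i) = P
    R = np.zeros((n, n))                # responsibilities r(i, k)
    A = np.zeros((n, n))                # availabilities a(i, k), initialised to 0
    rows = np.arange(n)
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new   # damping is an assumption

        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())        # keep r(k,k) unclipped
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new

    # Points with a(k,k) + r(k,k) > 0 are taken as exemplars.
    exemplars = np.flatnonzero((A + R).diagonal() > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = exemplars[S[:, exemplars].argmax(axis=1)]
    labels[exemplars] = exemplars       # each exemplar represents itself
    return exemplars, labels
```

For summarization, the rows and columns of `S` would be sentences, and the sentences returned as exemplars would form the extracted summary. A larger `preference` generally yields more exemplars.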

2
Jones: Intrinsic,
Extrinsic [10]

Lin
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) [11]
ROUGE, n-gram: ROUGE-N,
ROUGE-L, ROUGE-S, ROUGE-W; ROUGE-L,
ROUGE

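Let $n$ be the number of sentences in the reference summary, $s$ the number of sentences in the automatic summary, and $m$ the number of sentences common to both. Precision and recall are

$$P = \frac{m}{s}, \qquad R = \frac{m}{n}$$

and they are combined into the F-measure [12]:

$$F = \frac{2RP}{R + P}$$

where $P$ is the precision, $R$ is the recall, and $F$ is the F-measure.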
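The evaluation measures can be sketched as follows, assuming that precision is the fraction of extracted sentences that also appear in the reference summary and that recall is the fraction of reference sentences that were extracted. The function names and the representation of summaries as collections of sentence indices are illustrative assumptions.

```python
def f_measure(p, r):
    """F = 2RP / (R + P); defined as 0 when both P and R are 0."""
    return 2 * r * p / (r + p) if (r + p) > 0 else 0.0

def precision_recall_f(reference, system):
    """reference, system: collections of sentence indices in each summary."""
    reference, system = set(reference), set(system)
    m = len(reference & system)                   # sentences common to both
    p = m / len(system) if system else 0.0        # precision
    r = m / len(reference) if reference else 0.0  # recall
    return p, r, f_measure(p, r)
```

As a consistency check, the values 0.678 and 0.635 reported in the results combine to an F-measure of about 0.656.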

3
3.1

Goldstein
[13] [14]

1

30
10
130160

3.2

NEUCSP [15]
P
K-MeansAPC1
P APC 1P sim( Si , Si ) 30 K-Means1 APC
1 APC 2 P sim( Si , Si ) 80 K-Means2 APC 2

3.3
1K-Means[6][16]
[6][16]K-Means
1K-Means

1
2



Table 1

Method      P      R      F
APC1        0.678  0.635  0.656
K-Means1    0.644  0.603  0.623
APC2        0.495  0.754  0.597
K-Means2    0.464  0.706  0.559

[1] . [M], , 2006.
[2] Luhn H P. The Automatic Creation of Literature Abstracts[J]. IBM Journal of Research and Development, 1958, 2(2): 159-165.
[3] Lin C Y, Hovy E. Identifying Topics by Position[C]//Proc. of the 5th Conference on Applied Natural Language Processing. [S. l.]: IEEE Press, 1997: 283-290.
[4] Kupiec J, Pedersen J O, Chen F. A Trainable Document Summarizer[C]//Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Seattle, WA: [s. n.], 1995: 68-73.
[5] Nomoto T, Matsumoto Y. A New Approach to Unsupervised Text Summarization[C]//Proc. of Annual ACM Conference on Research and Development in Information Retrieval. [S. l.]: IEEE Press, 2001.
[6] , , . [J]. , 2008, (1).
[7] Filatova E, Hatzivassiloglou V. Event-based Extractive Summarization[C]//Proc. of ACL Workshop on Summarization. Barcelona, Spain: [s. n.], 2004.
[8] Cover T M, Thomas J A. Elements of Information Theory[M]. New York, NY: John Wiley & Sons, 1991.
[9] Frey B J, Dueck D. Clustering by Passing Messages Between Data Points[J]. Science, 2007, 315(5814): 972-976.
[10] Jones K S, Galliers J R. Evaluating Natural Language Processing Systems: An Analysis and Review[M]. Berlin: Springer, 1996.
[11] Lin C Y. ROUGE: A Package for Automatic Evaluation of Summaries[C]//Proceedings of the ACL 2004 Workshop on Text Summarization. Spain, 2004, 7: 428.
[12] Van Rijsbergen C J. Information Retrieval, 2nd edition[M]. Dept. of Computer Science, University of Glasgow, 1979.
[13] Goldstein J, et al. Creating and Evaluating Multi-document Sentence Extract Summaries[C]//Proceedings of the 9th International Conference on Information and Knowledge Management. Virginia, USA, 2000: 165-172.
[14] , . [J]. , 2005, (6).
[15] NEUCSP. http://www.nlplab.com/chinese/source.htm.
[16] , , . [J]. , 2007, (8).
