Using Affinity Propagation Cluster For Text Auto Summarization

Scientific Journal of Information Engineering(SJIE)
Vol.2 No.1
Feb. 2012

Suhuan Sun, Changwei Zhao
Electronic and Information School of Henan University of Science and Technology, Luoyang, Henan province, China 471003
Abstract: Automatic summarization helps readers obtain the information they need, accurately and efficiently, from massive
amounts of text, and it has attracted growing attention. This paper presents a new method for Chinese text summarization based on
the Affinity Propagation Cluster (APC) algorithm. APC does not require the number of clusters or the initial representative exemplars
to be set in advance, so it avoids the local optima and unstable clustering results caused by randomly selected initial exemplars.
The algorithm is also computationally efficient. Experimental results show that APC-based automatic summarization of Chinese text
is more accurate than summarization based on KNN clustering, which makes APC a suitable method for automatic text summarization.
Key words: affinity propagation cluster; text auto summarization; Chinese text
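Automatic text summarization [1] began with Luhn's work at IBM on the automatic creation of literature abstracts [2].
Lin et al. later exploited features such as clue words [3], and Julian Kupiec et al. proposed a trainable document summarizer [4].
Tadashi Nomoto et al. built an unsupervised summarizer around the diversity of concepts in a text [5]. Both [5] and [6] rely on
clustering.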
No. 2010A6300025 No. 2011A520011
www.sjie.org PP.26-30 2011 American V-King Scientific Publishing, LTD

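Filatova et al. based extractive summarization on atomic events [7]. For the clustering methods [5][6], sentence representations
such as VSM [6] and LSA have been used. This paper measures sentence similarity with an information-theoretic coding cost [8][9]
and clusters the sentences with affinity propagation [9].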
1
1.1
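Let the text be a set of sentences, $T = \{S_1, S_2, \ldots, S_n\}$. A summary of $T$ is a subset of its sentences,
$A = \{S_{k_1}, S_{k_2}, \ldots, S_{k_m}\} \subset T$.

The sentence similarity follows [8,9]. Let $S_i$ and $S_j$ be the $i$-th and $j$-th sentences of $T$ and let $w$ denote a word, so that
$S_i = \{w_{i1}, w_{i2}, \ldots, w_{ip}\}$ and $S_j = \{w_{j1}, w_{j2}, \ldots, w_{jq}\}$. $|S_i|$ and $|S_j|$ are the numbers of words in $S_i$ and $S_j$, and
$|W|$ is the number of distinct words in $T$. The similarity of $S_i$ to $S_j$ is

$$sim(S_i, S_j) = -\sum_{k=1}^{|S_i|} \theta_k, \qquad
\theta_k = \begin{cases} \log_2 |S_j| & \text{if } w_{ik} \in S_j \\ \log_2 |W| & \text{if } w_{ik} \notin S_j \end{cases}$$

Stop words are filtered out with a stop-word list.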
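The coding-cost similarity can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `sentence_similarity` and `similarity_matrix` are hypothetical, sentences are assumed to be lists of already segmented words with stop words removed, and the leading minus sign follows the definition in [9].

```python
import math


def sentence_similarity(s_i, s_j, vocab_size):
    """Negative cost of coding the words of s_i with the help of s_j.

    s_i, s_j   : lists of word tokens (assumed segmented, stop words removed)
    vocab_size : |W|, the number of distinct words in the whole text
    """
    words_j = set(s_j)
    cost = 0.0
    for word in s_i:
        if word in words_j:
            cost += math.log2(len(s_j))    # word found in S_j: code its index in S_j
        else:
            cost += math.log2(vocab_size)  # otherwise: code its index in W
    return -cost


def similarity_matrix(sentences):
    """Pairwise similarities; entry [i][j] is sim(S_i, S_j)."""
    vocab_size = len({word for sentence in sentences for word in sentence})
    return [[sentence_similarity(s_i, s_j, vocab_size) for s_j in sentences]
            for s_i in sentences]
```

The measure is not symmetric: `sentence_similarity(a, b, v)` and `sentence_similarity(b, a, v)` generally differ, because the cost depends on which sentence is being coded.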
1.2
Affinity propagation clustering (APC) [9] clusters data by identifying a representative point, called the exemplar, for each cluster.

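APC passes two kinds of messages between data points: the responsibility $r$ and the availability $a$. The responsibility $r(i,k)$,
sent from point $i$ to candidate exemplar $k$, reflects how well suited $k$ is to serve as the exemplar of $i$. The availability
$a(i,k)$, sent from $k$ to $i$, reflects the accumulated evidence for how appropriate it is for $i$ to choose $k$ as its exemplar.
The messages are updated as follows [9]:

$$r(i,k) \leftarrow s(i,k) - \max_{k' \,\mathrm{s.t.}\, k' \neq k} \{a(i,k') + s(i,k')\}$$

For $i = k$ the same rule gives the self-responsibility; in the first iteration, when all availabilities are zero, it is

$$r(k,k) \leftarrow s(k,k) - \max_{k' \neq k} s(k,k')$$

$$a(i,k) \leftarrow \min\Big\{0,\; r(k,k) + \sum_{i' \,\mathrm{s.t.}\, i' \notin \{i,k\}} \max\{0, r(i',k)\}\Big\}$$

$$a(k,k) \leftarrow \sum_{i' \,\mathrm{s.t.}\, i' \neq k} \max\{0, r(i',k)\}$$

The availabilities $a$ are initialized to 0, and $r$ and $a$ are then updated in turn. For point $i$, the $k$ that maximizes
$r(i,k) + a(i,k)$ is taken as its exemplar.

The self-similarity $s(i,i)$ is called the preference $P$. Points with a larger $P$ are more likely to be chosen as exemplars, so the
number of clusters grows with $P$. When there is no prior knowledge, all $s(i,i)$ are set to the same value $P$ [9].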
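The message-passing updates can be sketched in plain Python as below. This is an illustrative sketch, not the authors' code. The function name `affinity_propagation` and the toy data are hypothetical, and the `damping` factor is an assumption borrowed from common practice for the algorithm of [9] (each message is blended with its previous value to avoid oscillation); the surviving text does not mention it.

```python
def affinity_propagation(s, iterations=200, damping=0.5):
    """Cluster n >= 2 points from an n x n similarity matrix s (list of lists).

    s[k][k] holds the preference P of point k. Returns, for each point i,
    the index of the exemplar it chooses. `damping` is an assumed parameter.
    """
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # responsibilities
    a = [[0.0] * n for _ in range(n)]  # availabilities, initialized to 0
    for _ in range(iterations):
        # r(i,k) <- s(i,k) - max_{k' != k} {a(i,k') + s(i,k')}
        for i in range(n):
            for k in range(n):
                best_other = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best_other)
        # Availabilities, one candidate exemplar k at a time.
        for k in range(n):
            positive = [max(0.0, r[i][k]) for i in range(n)]
            support = sum(positive) - positive[k]  # sum over i' != k
            for i in range(n):
                if i == k:
                    update = support  # a(k,k)
                else:
                    # a(i,k): the sum additionally excludes i' = i
                    update = min(0.0, r[k][k] + support - positive[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * update
    # Each point picks the k that maximizes a(i,k) + r(i,k).
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]


# Toy data (hypothetical): six points on a line in two well-separated groups,
# with negative squared distance as similarity and one shared preference.
points = [0.0, 1.0, 2.1, 10.0, 11.0, 12.2]
preference = -20.0
similarity = [
    [preference if i == k else -(x - y) ** 2 for k, y in enumerate(points)]
    for i, x in enumerate(points)
]
labels = affinity_propagation(similarity)
exemplars = sorted(set(labels))
```

For summarization, the same function would be called on the sentence similarity matrix with the preference on the diagonal; treating the resulting exemplar sentences, in document order, as the summary is an assumption about how the clusters are used.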
2
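Jones and Galliers divide the evaluation of such systems into intrinsic and extrinsic methods [10]. Lin proposed ROUGE
(Recall-Oriented Understudy for Gisting Evaluation) [11] for the automatic evaluation of summaries. ROUGE-N is based on n-gram
co-occurrence; the package also includes ROUGE-L, ROUGE-S, and ROUGE-W.

The evaluation compares an automatic summary with a reference summary through three counts $n$, $s$, and $m$, where $m$ is the number of
units the two summaries share. They give the ratios $m/s$ and $m/n$, which are combined into the F-measure [12]:

$$F = \frac{2RP}{R + P}$$

where $P$ is precision, $R$ is recall, and $F$ is their harmonic mean.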
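A short sketch of the precision, recall, and F computation follows. The function names are hypothetical, and comparing summaries as sets of sentence indices is an assumption made for illustration; the surviving text does not say which unit is counted.

```python
def f_measure(precision, recall):
    """Harmonic mean F = 2RP / (R + P); defined as 0 when both inputs are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def precision_recall_f(selected, reference):
    """Compare an automatic summary with a reference summary.

    selected, reference : iterables of sentence indices (assumed unit of comparison)
    """
    selected, reference = set(selected), set(reference)
    matched = len(selected & reference)  # units shared by both summaries
    precision = matched / len(selected) if selected else 0.0
    recall = matched / len(reference) if reference else 0.0
    return precision, recall, f_measure(precision, recall)
```

As a check against the reported results, `f_measure(0.678, 0.635)` rounds to 0.656, which matches the third value reported for APC1.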
3
3.1
Goldstein
[13] [14]
1

30
10
130160
3.2
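The texts are segmented into words with NEUCSP [15]. Two settings of the preference P are compared. In APC1, P is set from
sim(S_i, S_i) and the constant 30, and K-Means1 is the K-Means run paired with APC1. In APC2, P is set from sim(S_i, S_i) and
the constant 80, and K-Means2 is the K-Means run paired with APC2.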
3.3
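Table 1 compares APC with K-Means, the clustering algorithm used for summarization in [6] and [16]. In both settings APC scores
higher than its paired K-Means run on all three measures.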


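Table 1

Method      (1)      (2)      F
APC1        0.678    0.635    0.656
K-Means1    0.644    0.603    0.623
APC2        0.495    0.754    0.597
K-Means2    0.464    0.706    0.559

Columns (1) and (2) are precision and recall; the third column is their F-measure.
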
References
[1] . [M], , 2006.
[2] Luhn H P. The Automatic Creation of Literature Abstracts[J]. IBM Journal of Research and Development, 1958, 2(2): 159-165.
[3] Lin C Y, Hovy E. Identifying Topics by Position[C]//Proc. of the 5th Conference on Applied Natural Language Processing. [S. l.]: IEEE Press, 1997: 283-290.
[4] Kupiec J, Pedersen J O, Chen F. A Trainable Document Summarizer[C]//Proceedings of the 18th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Seattle, WA: [s. n.], 1995: 68-73.
[5] Nomoto T, Matsumoto Y. A New Approach to Unsupervised Text Summarization[C]//Proc. of Annual ACM Conference on Research and Development in Information Retrieval. [S. l.]: IEEE Press, 2001.
[6] , , . [J]. , 2008, (1).
[7] Filatova E, Hatzivassiloglou V. Event-based Extractive Summarization[C]//Proc. of ACL Workshop on Summarization. Barcelona, Spain: [s. n.], 2004.
[8] Cover T M, Thomas J A. Elements of Information Theory[M]. New York, NY: John Wiley & Sons, 1991.
[9] Frey B J, Dueck D. Clustering by Passing Messages Between Data Points[J]. Science, 2007, 315(5814): 972-976.
[10] Jones K S, Galliers J R. Evaluating Natural Language Processing Systems: An Analysis and Review[M]. Berlin: Springer, 1996.
[11] Lin C Y. ROUGE: A Package for Automatic Evaluation of Summaries[C]//Proceedings of the ACL 2004 Workshop on Text Summarization. Spain, 2004, 7: 428.
[12] Van Rijsbergen C J. Information Retrieval[M]. 2nd ed. Dept. of Computer Science, University of Glasgow, 1979.
[13] Goldstein J, et al. Creating and Evaluating Multi-document Sentence Extract Summaries[C]//Proceedings of the 9th International Conference on Information and Knowledge Management. Virginia, USA, 2000: 165-172.
[14] , . [J]. , 2005, (6).
[15] NEUCSP. http://www.nlplab.com/chinese/source.htm.
[16] , , . [J]. , 2007, (8).
