APPENDIX A
A.1 A LIST OF STOP WORDS

a a's able about above according accordingly across actually after afterwards again against ain't all allow allows almost alone along already also although always am among amongst an and another any

anybody anyhow anyone anything anyway anyways anywhere apart appear appreciate appropriate are certain certainly changes clearly co com come comes concerning consequently consider considering contain containing contains corresponding could couldn't course

currently d definitely described despite did didn't different do does doesn't doing don't done down downwards during e each edu eg eight either else elsewhere enough entirely especially et etc even

ever every everybody everyone everything everywhere ex exactly example except f far few fifth first five followed following follows for former formerly forth four from further furthermore g get gets getting given gives go goes going

gone got gotten greetings h had hadn't happens hardly has hasn't have haven't having he he's hello help hence her here here's hereafter hereby herein hereupon hers herself hi him himself his hither hopefully how howbeit

however i i'd i'll i'm i've ie if ignored immediate in inasmuch inc indeed indicate indicated indicates inner insofar instead into inward is isn't it it'd it'll it's its itself j just k keep keeps kept

know knows known l last lately later latter latterly least less lest let let's like liked likely little look looking looks ltd m mainly many may maybe me mean meanwhile merely might more moreover most mostly

much must my myself n name namely nd near nearly necessary need needs neither never nevertheless new next nine no nobody non none noone nor normally not nothing novel now nowhere o obviously of off often

oh ok okay old on once one ones only onto or other others otherwise ought our ours ourselves out outside over overall own p particular particularly per perhaps placed please plus possible presumably probably provides q

que quite qv r rather rd re really reasonably regarding regardless regards relatively respectively right s said same saw say saying says second secondly see seeing seem seemed seeming seems seen self selves sensible sent serious

seriously seven several shall she should shouldn't since six so some somebody somehow someone something sometime sometimes somewhat somewhere soon sorry specified specify specifying still sub such sup sure t t's take taken tell tends th

than thank thanks thanx that that's thats the their theirs them themselves then thence there there's thereafter thereby therefore therein theres thereupon these they they'd they'll they're they've think third this thorough thoroughly those though three

through throughout thru thus to together too took toward towards tried tries truly try trying twice two u un under unfortunately unless unlikely until unto up upon us use used useful uses using usually uucp v

value various very via viz vs w want wants was wasn't way we we'd we'll we're we've welcome well went were weren't what what's whatever when whence whenever where where's whereafter whereas whereby wherein whereupon wherever

whether which while whither who who's whoever whole whom whose why will willing wish with within without won't wonder would wouldn't x yes yet you you'd you'll you're you've your yours yourself yourselves z
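As an illustration of how a list like the one above is typically applied, the following is a minimal sketch of stop word removal. The function name `remove_stop_words`, the regular-expression tokenizer, and the small excerpt in `STOP_WORDS` are assumptions made for this example; they are not taken from the system described in this document, which may tokenize and filter differently.

```python
import re

# Small excerpt of the stop words listed in A.1 (assumption: a real run
# would load the complete list instead of this subset).
STOP_WORDS = {"a", "about", "all", "and", "for", "in", "is", "of", "the", "to", "with"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # Keep apostrophes so contractions such as "isn't" stay in one token.
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in stop_words]


if __name__ == "__main__":
    print(remove_stop_words("A fast algorithm for searching in the text"))
```

Using a set for the stop words keeps each membership check constant time on average, which matters once the full list is loaded.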

A.2 PARTS OF SPEECH LABELS

CC     Coordinating conjunction
CD     Cardinal number
DT     Determiner
EX     Existential there
FW     Foreign word
IN     Preposition or subordinating conjunction
JJ     Adjective
JJR    Adjective, comparative
JJS    Adjective, superlative
LS     List item marker
MD     Modal
NN     Noun, singular or mass
NNP    Proper noun, singular
NNPS   Proper noun, plural
PDT    Pre-determiner
POS    Possessive ending
PRP    Personal pronoun
PRP$   Possessive pronoun
RB     Adverb
RBR    Adverb, comparative
RBS    Adverb, superlative
RP     Particle
SYM    Symbol
TO     to
UH     Interjection
VB     Verb, base form
VBD    Verb, past tense
VBG    Verb, gerund or present participle
VBN    Verb, past participle
VBP    Verb, non-3rd person singular present
VBZ    Verb, 3rd person singular present
WDT    Wh-determiner
WP     Wh-pronoun
WP$    Possessive wh-pronoun
WRB    Wh-adverb
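
The labels in A.2 can be held in a simple lookup table so that tagger output is readable. The sketch below is illustrative only: the name `describe_tags`, the excerpted `POS_LABELS` dictionary, and the hand-written tagged input are assumptions for this example and do not come from the tagger used in this document.

```python
# Excerpt of the A.2 label table (assumption: the remaining labels
# would be added to the dictionary in the same way).
POS_LABELS = {
    "DT": "Determiner",
    "JJ": "Adjective",
    "NN": "Noun, singular or mass",
    "VBZ": "Verb, 3rd person singular present",
}


def describe_tags(tagged_tokens, labels=POS_LABELS):
    """Pair each word with the description of its tag.

    Tags missing from the table are passed through unchanged.
    """
    return [(word, labels.get(tag, tag)) for word, tag in tagged_tokens]


if __name__ == "__main__":
    # Hypothetical tagger output as (word, tag) pairs.
    sample = [("the", "DT"), ("fast", "JJ"), ("search", "NN"), ("runs", "VBZ")]
    for word, description in describe_tags(sample):
        print(f"{word}: {description}")
```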