Sciweavers

180 search results - page 1 / 36
ITCC
2003
IEEE
13 years 10 months ago
A Method for Calculating Term Similarity on Large Document Collections
We present an efficient algorithm called the Quadtree Heuristic for identifying a list of similar terms for each unique term in a large document collection. Term similarity is de...
Wolfgang W. Bein, Jeffrey S. Coombs, Kazem Taghva
ACL
2008
13 years 6 months ago
Pairwise Document Similarity in Large Collections with MapReduce
This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections. MapReduce is an attractive framework because it allows us to de...
Tamer Elsayed, Jimmy J. Lin, Douglas W. Oard
BMCBI
2007
13 years 4 months ago
GOSim - an R-package for computation of information theoretic GO similarities between terms and gene products
Background: With the increased availability of high throughput data, such as DNA microarray data, researchers are capable of producing large amounts of biological data. During the...
Holger Fröhlich, Nora Speer, Annemarie Poustk...
SEKE
2010
Springer
13 years 3 months ago
Incremental Construction of Topic Hierarchies using Hierarchical Term Clustering
Topic hierarchies are very useful for managing, searching, and browsing large repositories of text documents. Hierarchical clustering methods are used to support the constructi...
Ricardo M. Marcacini, Solange O. Rezende
EWMF
2005
Springer
13 years 10 months ago
Discovering a Term Taxonomy from Term Similarities Using Principal Component Analysis
Abstract. We show that eigenvector decomposition can be used to extract a term taxonomy from a given collection of text documents. So far, methods based on eigenvector decompositio...
Holger Bast, Georges Dupret, Debapriyo Majumdar, B...