Sciweavers

2088 search results - page 182 / 418
» MABAC - Matrix Based Clustering Algorithm
EMNLP
2008
15 years 3 months ago
Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce
This paper explores the challenge of scaling up language processing algorithms to increasingly large datasets. While cluster computing has been available in commercial environment...
Jimmy J. Lin
CIKM
2008
Springer
15 years 3 months ago
Winnowing-based text clustering
We present an approach to document clustering based on winnowing fingerprints that achieved good effectiveness with considerable savings in memory space and computation tim...
Javier Parapar, Alvaro Barreiro
CCGRID
2006
IEEE
15 years 5 months ago
Density-Based Clustering for Similarity Search in a P2P Network
P2P systems account for a large portion of Internet traffic, which makes data discovery of great importance to users and the broader Internet community. Hence, the power of ...
Mouna Kacimi, Kokou Yétongnon
ECIR
2008
Springer
15 years 3 months ago
Clustering Template Based Web Documents
More and more documents on the World Wide Web are based on templates. On a technical level, this causes those documents to have quite similar source code and DOM tree structure. G...
Thomas Gottron
SDM
2007
SIAM
15 years 3 months ago
HP2PC: Scalable Hierarchically-Distributed Peer-to-Peer Clustering
In distributed data mining models, adopting a flat node distribution model can affect scalability. To address the problems of modularity, flexibility and scalability, we propose...
Khaled M. Hammouda, Mohamed S. Kamel