Search Sciweavers | Sciweavers

77 search results - page 1 / 16

ACL
2008

153views Computational Linguistics» more ACL 2008»

Pairwise Document Similarity in Large Collections with MapReduce

13 years 6 months ago

Download www.umiacs.umd.edu

This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections. MapReduce is an attractive framework because it allows us to de...

Tamer Elsayed, Jimmy J. Lin, Douglas W. Oard

SIGIR
2009
ACM

180views Information Technology» more SIGIR 2009»

Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce

13 years 11 months ago

Download www.umiacs.umd.edu

This paper explores the problem of computing pairwise similarity on document collections, focusing on the application of “more like this” queries in the life sciences domain. ...

Jimmy J. Lin

SIGIR
2011
ACM

257views Information Technology» more SIGIR 2011»

No free lunch: brute force vs. locality-sensitive hashing for cross-lingual pairwise similarity

12 years 7 months ago

Download www.umiacs.umd.edu

This work explores the problem of cross-lingual pairwise similarity, where the task is to extract similar pairs of documents across two different languages. Solutions to this pro...

Ferhan Ture, Tamer Elsayed, Jimmy J. Lin

EMNLP
2009

121views Natural Language Processing» more EMNLP 2009»

Web-Scale Distributional Similarity and Entity Set Expansion

13 years 2 months ago

Download www.aclweb.org

Computing the pairwise semantic similarity between all words on the Web is a computationally challenging task. Parallelization and optimizations are necessary. We propose a highly...

Patrick Pantel, Eric Crestan, Arkady Borkovsky, An...

ITCC
2003
IEEE

96views Information Technology» more ITCC 2003»

A Method for Calculating Term Similarity on Large Document Collections

13 years 10 months ago

Download www.isri.unlv.edu

We present an efficient algorithm called the Quadtree Heuristic for identifying a list of similar terms for each unique term in a large document collection. Term similarity is de...

Wolfgang W. Bein, Jeffrey S. Coombs, Kazem Taghva
