Sciweavers

587 search results - page 32 / 118
» New Algorithms for Text Fingerprinting
SDM
2007
SIAM
118 views, Data Mining
14 years 11 months ago
On Privacy-Preservation of Text and Sparse Binary Data with Sketches
In recent years, privacy preserving data mining has become very important because of the proliferation of large amounts of data on the internet. Many data sets are inherently high...
Charu C. Aggarwal, Philip S. Yu
COLING
2002
14 years 9 months ago
Concept Discovery from Text
Broad-coverage lexical resources such as WordNet are extremely useful. However, they often include many rare senses while missing domain-specific senses. We present a clustering a...
Dekang Lin, Patrick Pantel
SIGIR
2008
ACM
14 years 9 months ago
A new probabilistic retrieval model based on the Dirichlet compound multinomial distribution
The classical probabilistic models attempt to capture the ad hoc information retrieval problem within a rigorous probabilistic framework. It has long been recognized that the prim...
Zuobing Xu, Ram Akella
ACL
2003
14 years 11 months ago
Generalized Algorithms for Constructing Statistical Language Models
Recent text and speech processing applications such as speech mining raise new and more general problems related to the construction of language models. We present and describe in...
Cyril Allauzen, Mehryar Mohri, Brian Roark
SIGIR
2006
ACM
15 years 3 months ago
Near-duplicate detection by instance-level constrained clustering
For the task of near-duplicate document detection, both traditional fingerprinting techniques used in the database community and bag-of-words comparison approaches used in information...
Hui Yang, James P. Callan