Sciweavers

241 search results - page 26 / 49
» Detecting Co-Derivative Documents in Large Text Collections
IUI
2010
ACM
15 years 4 months ago
DocuBrowse: faceted searching, browsing, and recommendations in an enterprise context
Browsing and searching for documents in large, online enterprise document repositories are common activities. While internet search produces satisfying results for most user queri...
Andreas Girgensohn, Frank M. Shipman III, Francine...
ECIR
2004
Springer
14 years 11 months ago
Performance Analysis of Distributed Architectures to Index One Terabyte of Text
We simulate different architectures of a distributed Information Retrieval system on a very large Web collection, in order to work out the optimal setting for a particular set of r...
Fidel Cacheda, Vassilis Plachouras, Iadh Ounis
JCDL
2006
ACM
15 years 3 months ago
Exploring erotics in Emily Dickinson's correspondence with text mining and visual interfaces
This paper describes a system to support humanities scholars in their interpretation of literary work. It presents a user interface and web architecture that integrates text minin...
Catherine Plaisant, James Rose, Bei Yu, Loretta Au...
ICML
2004
IEEE
15 years 10 months ago
Text categorization with many redundant features: using aggressive feature selection to make SVMs competitive with C4.5
Text categorization algorithms usually represent documents as bags of words and consequently have to deal with huge numbers of features. Most previous studies found that the major...
Evgeniy Gabrilovich, Shaul Markovitch
INEX
2007
Springer
15 years 3 months ago
Phrase Detection in the Wikipedia
The Wikipedia XML collection turned out to be rich in marked-up phrases as we carried out our INEX 2007 experiments. Assuming that a phrase occurs at the inline level of the markup...
Miro Lehtonen, Antoine Doucet