Sciweavers

» Using the Structure of HTML Documents to Improve Retrieval
DKE
2007
QMatch - Using paths to match XML schemas
Integration of multiple heterogeneous data sources continues to be a critical problem for many application domains and a challenge for researchers world-wide. With the increasing ...
Naiyana Tansalarak, Kajal T. Claypool
SIGIR
2010
ACM
Adaptive near-duplicate detection via similarity learning
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
SIGMOD
2010
ACM
Expressive and flexible access to web-extracted data: a keyword-based structured query language
Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KB), with data and schema items numbering in the hundreds of thousands...
Jeffrey Pound, Ihab F. Ilyas, Grant E. Weddell
IS
2011
Similarity of business process models: Metrics and evaluation
It is common for large and complex organizations to maintain repositories of business process models in order to document and to continuously improve their operations. Given suc...
Remco M. Dijkman, Marlon Dumas, Boudewijn F. van D...
CLEF
2011
Springer
Simulation of Within-Session Query Variations Using a Text Segmentation Approach
We propose a generative model for automatic query reformulations from an initial query using the underlying subtopic structure of top ranked retrieved documents. We addre...
Debasis Ganguly, Johannes Leveling, Gareth J. F. J...