Sciweavers

» Integrating Data and Probabilistically Structured Text Docum...
INFOSCALE
2007
ACM
15 years 1 month ago
Query-driven indexing for scalable peer-to-peer text retrieval
We present a query-driven algorithm for the distributed indexing of large document collections within structured P2P networks. To cope with bandwidth consumption that has been ide...
Gleb Skobeltsyn, Toan Luu, Ivana Podnar Zarko, Mar...
SIGIR
2006
ACM
15 years 5 months ago
Contextual search and name disambiguation in email using graphs
Similarity measures for text have historically been an important tool for solving information retrieval problems. In many interesting settings, however, documents are often closel...
Einat Minkov, William W. Cohen, Andrew Y. Ng
VLDB
2007
ACM
15 years 6 months ago
Data Integration with Uncertainty
This paper reports our first set of results on managing uncertainty in data integration. We posit that data-integration systems need to handle uncertainty at three levels, and do...
Xin Luna Dong, Alon Y. Halevy, Cong Yu
GFKL
2005
Springer
15 years 5 months ago
A Hybrid Machine Learning Approach for Information Extraction from Free Text
We present a hybrid machine learning approach for information extraction from unstructured documents by integrating a learned classifier based on the Maximum Entropy Mod...
Günter Neumann
WWW
2007
ACM
16 years 17 days ago
Integrating web directories by learning their structures
Documents in the Web are often organized using category trees by information providers (e.g. CNN, BBC) or search engines (e.g. Google, Yahoo!). Such category trees are commonly kn...
Christopher C. Yang, Jianfeng Lin