Sciweavers

142 search results - page 27 / 29
SIGIR
2008
ACM
13 years 5 months ago
Topic-bridged PLSA for cross-domain text classification
In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain ...
Gui-Rong Xue, Wenyuan Dai, Qiang Yang, Yong Yu
WWW
2005
ACM
14 years 6 months ago
Sampling search-engine results
We consider the problem of efficiently sampling Web search engine query results. In turn, using a small random sample instead of the full set of results leads to efficient approxi...
Aris Anagnostopoulos, Andrei Z. Broder, David Carm...
AIPS
2003
13 years 7 months ago
A Multi-Agent System-driven AI Planning Approach to Biological Pathway Discovery
As genomic and proteomic data is collected from high-throughput methods on a daily basis, subcellular components are identified and their in vitro behavior is characterized. Howev...
Salim Khan, William Gillis, Carl Schmidt, Keith De...
CIKM
2008
Springer
13 years 7 months ago
Predicting web spam with HTTP session information
Web spam is a widely-recognized threat to the quality and security of the Web. Web spam pages pollute search engine indexes, burden Web crawlers and Web mining services, and expos...
Steve Webb, James Caverlee, Calton Pu
WWW
2003
ACM
14 years 6 months ago
Text joins in an RDBMS for web data integration
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identi...
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas...