Sciweavers

563 search results - page 79 / 113
» Crawling the web for structured documents
RIAO
1997
15 years 3 months ago
Coupling information retrieval and information extraction: A new text technology for gathering information from the web
The techniques of information retrieval and information extraction are complementary, but to date there has been little concrete work aimed at integrating the two. We describe how...
Robert J. Gaizauskas, Alexander M. Robertson
WWW
2001
ACM
16 years 2 months ago
XML and XSLT Modeling for Multimedia Bitstream Manipulation
New devices gaining access to the Internet need to obtain multimedia content adapted to their limited capacities. Scalable formats allow to retrieve different versions of a single...
Sylvain Devillers
NIPS
2007
15 years 3 months ago
Mining Internet-Scale Software Repositories
Large repositories of source code create new challenges and opportunities for statistical machine learning. Here we first develop Sourcerer, an infrastructure for the automated c...
Erik Linstead, Paul Rigor, Sushil Krishna Bajracha...
JCST
2008
15 years 1 months ago
Clustering Text Data Streams
Clustering text data streams is an important issue in data mining community and has a number of applications such as news group filtering, text crawling, document organiza...
Yubao Liu, Jiarong Cai, Jian Yin, Ada Wai-Chee Fu
TKDE
2008
15 years 1 months ago
Beyond Single-Page Web Search Results
Given a user keyword query, current Web search engines return a list of individual Web pages ranked by their "goodness" with respect to the query. Thus, the basic unit fo...
Ramakrishna Varadarajan, Vagelis Hristidis, Tao Li