Sciweavers

563 search results - page 17 / 113
» Crawling the web for structured documents
EDBT
2006
ACM
16 years 2 months ago
IQN Routing: Integrating Quality and Novelty in P2P Querying and Ranking
Abstract. We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query rou...
Sebastian Michel, Matthias Bender, Peter Triantafi...
AUSDM
2008
Springer
15 years 3 months ago
Structure-Based Document Model with Discrete Wavelet Transforms and Its Application to Document Classification
Term signal is an existing text representation that depicts a term as a vector of frequencies of occurrences in a number of user-defined partitions of a document. Although term si...
Supphachai Thaicharoen, Tom Altman, Krzysztof J. C...
WWW
2010
ACM
15 years 5 months ago
Time is of the essence: improving recency ranking using Twitter data
Realtime web search refers to the retrieval of very fresh content which is in high demand. An effective portal web search engine must support a variety of search needs, including ...
Anlei Dong, Ruiqiang Zhang, Pranam Kolari, Jing Ba...
DOCENG
2006
ACM
15 years 7 months ago
Templates, microformats and structured editing
Microformats and semantic XHTML add semantics to web pages while taking advantage of the existing (X)HTML infrastructure. This approach enables new applications that can be deploy...
Francesc Campoy Flores, Vincent Quint, Irèn...
IRI
2007
IEEE
15 years 8 months ago
Acronym-Expansion Recognition and Ranking on the Web
The paper presents a study on large-scale automatic extraction of acronyms and associated expansions from Web data and from the user interactions with this data through Web search...
Alpa Jain, Silviu Cucerzan, Saliha Azzam