Sciweavers

179 search results for "Finding Replicated Web Collections" (page 1 of 36)
SIGMOD
2000
ACM
13 years 9 months ago
Finding Replicated Web Collections
Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times....
Junghoo Cho, Narayanan Shivakumar, Hector Garcia-M...
JASIS
2000
13 years 4 months ago
Raising reliability of web search tool research through replication and chaos theory
Because the World Wide Web is a dynamic collection of information, the Web search tools (or "search engines") that index the Web are dynamic. Traditional information re...
Scott Nicholson
ICDE
2004
IEEE
14 years 6 months ago
Improved File Synchronization Techniques for Maintaining Large Replicated Collections over Slow Networks
We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of impo...
Torsten Suel, Patrick Noel, Dimitre Trendafilov
WWW
2005
ACM
13 years 10 months ago
Finding the boundaries of information resources on the web
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
LAWEB
2009
IEEE
13 years 11 months ago
An Architecture for Finding Entities on the Web
Recent progress in research fields such as Information Extraction and Information Retrieval enables the creation of systems providing better search experiences to web u...
Gianluca Demartini, Claudiu S. Firan, Mihai George...