Sciweavers

» Data Extraction from Web Data Sources
WSDM
2009
ACM
Finding text reuse on the web
With the overwhelming number of reports on similar events originating from different sources on the web, it is often hard, using existing web search paradigms, to find the origi...
Michael Bendersky, W. Bruce Croft
VLDB
2002
ACM
Effective Change Detection Using Sampling
For a large-scale data-intensive environment, such as the World-Wide Web or data warehousing, we often make local copies of remote data sources. Due to limited network and computa...
Junghoo Cho, Alexandros Ntoulas
PVLDB
2010
Global Detection of Complex Copying Relationships Between Sources
Web technologies have enabled data sharing between sources but also simplified copying (and often publishing without proper attribution). The copying relationships can be complex...
Xin Dong, Laure Berti-Equille, Yifan Hu, Divesh Sr...
WWW
2011
ACM
Web information extraction using Markov logic networks
In this paper, we consider the problem of extracting structured data from web pages taking into account both the content of individual attributes as well as the structure of pages...
Sandeepkumar Satpal, Sahely Bhadra, Sundararajan S...
DEXA
2010
Springer
DBOD-DS: Distance Based Outlier Detection for Data Streams
Data stream is a newly emerging data model for applications like environment monitoring, Web click stream, network traffic monitoring, etc. It consists of an infinite sequence of d...
Md. Shiblee Sadik, Le Gruenwald