Sciweavers

13 search results - page 1 / 3
» xCrawl: A High-Recall Crawling Method for Web Mining
ICDM
2008
IEEE
186 views, Data Mining
13 years 11 months ago
xCrawl: A High-Recall Crawling Method for Web Mining
Web Mining Systems exploit the redundancy of data published on the Web to automatically extract information from existing web documents. The first step in the Information Extract...
Kostyantyn M. Shchekotykhin, Dietmar Jannach, Gerh...
ADMA
2009
Springer
142 views, Data Mining
13 years 11 months ago
Crawling Deep Web Using a New Set Covering Algorithm
Abstract. Crawling the deep web often requires the selection of an appropriate set of queries so that they can cover most of the documents in the data source with low cost. This ca...
Yan Wang, Jianguo Lu, Jessica Chen
ECIR
2010
Springer
13 years 6 months ago
Mining Anchor Text Trends for Retrieval
Anchor text has been considered as a useful resource to complement the representation of target pages and is broadly used in web search. However, previous research only uses anchor...
Na Dai, Brian D. Davison
CORR
2011
Springer
326 views, Education
12 years 11 months ago
Mining User Comment Activity for Detecting Forum Spammers in YouTube
Research shows that comment spamming (comments which are unsolicited, unrelated, abusive, hateful, commercial advertisements, etc.) in online discussion forums has become a common p...
Ashish Sureka
EMNLP
2011
12 years 4 months ago
Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation
We propose a general method to watermark and probabilistically identify the structured outputs of machine learning algorithms. Our method is robust to local editing operations and...
Ashish Venugopal, Jakob Uszkoreit, David Talbot, F...