Sciweavers

62 search results - page 3 / 13
» Creating Permanent Test Collections of Web Pages for Informa...
IPM
2006
13 years 4 months ago
Dictionary-based text categorization of chemical web pages
A new dictionary-based text categorization approach is proposed to classify the chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-re...
Chunyan Liang, Li Guo, Zhaojie Xia, Feng-Guang Nie...
ITCC
2005
IEEE
13 years 10 months ago
Elimination of Redundant Information for Web Data Mining
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditi...
Shakirah Mohd Taib, Soon-ja Yeom, Byeong Ho Kang
LREC
2010
13 years 6 months ago
DutchParl. The Parliamentary Documents in Dutch
A corpus called DutchParl is created which aims to contain all digitally available parliamentary documents written in the Dutch language. The first version of DutchParl contains d...
Maarten Marx, Anne Schuth
CIKM
2005
Springer
13 years 10 months ago
Retrieving answers from frequently asked questions pages on the web
We address the task of answering natural language questions by using the large number of Frequently Asked Questions (FAQ) pages available on the web. The task involves three steps...
Valentin Jijkoun, Maarten de Rijke
CMS
2010
13 years 5 months ago
Throwing a MonkeyWrench into Web Attackers Plans
Client-based attacks on internet users with malicious web pages represent a serious and rising threat. Internet Browsers with enabled active content technologies such as ...
Armin Büscher, Michael Meier, Ralf Benzmü...