Sciweavers

19 search results - page 1 / 4
HICSS
2009
IEEE
15 years 5 months ago
An N-Gram Based Approach to Automatically Identifying Web Page Genre
The research reported in this paper is the first phase of a larger project on the automatic classification of web pages by their genres, using n-gram representations of the web pag...
Jane E. Mason, Michael A. Shepherd, Jack Duffy
ACL
2006
14 years 12 months ago
Implementing a Characterization of Genre for Automatic Genre Identification of Web Pages
In this paper, we propose an implementable characterization of genre suitable for automatic genre identification of web pages. This characterization is implemented as an inferenti...
Marina Santini, Richard Power, Roger Evans
COLING
2010
14 years 5 months ago
A Novel Method for Bilingual Web Page Acquisition from Search Engine Web Records
A new approach has been developed for acquiring bilingual web pages from the result pages of search engines, which is composed of two challenging tasks. The first task is to detec...
Yanhui Feng, Yu Hong, Zhenxiang Yan, Jian-Min Yao,...
CIKM
2010
Springer
14 years 8 months ago
Web page classification on child suitability
Children spend significant amounts of time on the Internet. Recent studies showed that during these periods they are often not under adult supervision. This work presents an auto...
Carsten Eickhoff, Pavel Serdyukov, Arjen P. de Vri...
WWW
2005
ACM
15 years 4 months ago
Finding the boundaries of information resources on the web
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov