Sciweavers

219 search results - page 4 / 44
» Web page language identification based on URLs
CIKM
2005
Springer
15 years 3 months ago
Web-centric language models
We investigate language models for informational and navigational web search. Retrieval on the web is a task that differs substantially from ordinary ad hoc retrieval. We perfor...
Jaap Kamps
COMPSAC
2002
IEEE
15 years 3 months ago
An Approach to Identify Duplicated Web Pages
A relevant consequence of the unceasing expansion of the Web and e-commerce is the growth of the demand of new Web sites and Web applications. The software industry is facing the ...
Giuseppe A. Di Lucca, Massimiliano Di Penta, Anna ...
CN
2000
14 years 10 months ago
Squeal: a structured query language for the Web
The Web contains an abundance of useful semi-structured information that can and should be mined. Types of structure include hyperlinks between pages, structure within hypertext p...
Ellen Spertus, Lynn Andrea Stein
DEXAW
2000
IEEE
15 years 2 months ago
Domain Name Based Visualization of Web Histories in a Zoomable User Interface
Users of hypertext systems like the World Wide Web (WWW) often find themselves following hypertext links deeper and deeper, only to find themselves “lost” and unable to fin...
R. Gandhi, Benjamin B. Bederson, G. Kumar, Ben Shn...
WWW
2009
ACM
15 years 10 months ago
Purely URL-based topic classification
Given only the URL of a web page, can we identify its topic? This is the question that we examine in this paper. Usually, web pages are classified using their content [7], but a U...
Eda Baykan, Monika Rauch Henzinger, Ludmila Marian...