Sciweavers

219 search results - page 12 / 44
» Web page language identification based on URLs
WISE
2002
Springer
15 years 5 months ago
Applying the Site Information to the Information Retrieval from the Web
In recent years, several information retrieval methods that use information about Web links, such as HITS and Trawling, have been developed. In order to analyze the Web-links dividing i...
Yasuhito Asano, Hiroshi Imai, Masashi Toyoda, Masa...
INTR
2002
15 years 1 days ago
Methodologies for crawler based Web surveys
There have been many attempts to study the content of the web, either through human or automatic agents. Five different previously used web survey methodologies are described and ...
Mike Thelwall
WWW
2006
ACM
16 years 1 months ago
Geographically focused collaborative crawling
A collaborative crawler is a group of crawling nodes, in which each crawling node is responsible for a specific portion of the web. We study the problem of collecting geographical...
Weizheng Gao, Hyun Chul Lee, Yingbo Miao
AAAI
2008
15 years 2 months ago
Mining Translations of Web Queries from Web Click-through Data
Query translation for Cross-Lingual Information Retrieval (CLIR) has gained increasing attention in the research area. Previous work mainly used machine translation systems, bilin...
Rong Hu, Weizhu Chen, Jian Hu, Yansheng Lu, Zheng ...
LREC
2008
15 years 1 months ago
A Lightweight and Efficient Tool for Cleaning Web Pages
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...
Stefan Evert