Sciweavers

» The indexable web is more than 11.5 billion pages
WWW
2006
ACM
15 years 10 months ago
WebKhoj: Indian language IR from multiple character encodings
Today web search engines provide the easiest way to reach information on the web. In this scenario, more than 95% of Indian language content on the web is not searchable due to mu...
Prasad Pingali, Jagadeesh Jagarlamudi, Vasudeva Va...
WSDM
2010
ACM
15 years 4 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
ACL
2010
14 years 7 months ago
Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
Is it possible to use sense inventories to improve Web search results diversity for one word queries? To answer this question, we focus on two broad-coverage lexical resources of ...
Celina Santamaría, Julio Gonzalo, Javier Ar...
WWW
2008
ACM
15 years 10 months ago
Using graphics processors for high-performance IR query processing
Web search engines are facing formidable performance challenges due to data sizes and query loads. The major engines have to process tens of thousands of queries per second over t...
Shuai Ding, Jinru He, Hao Yan, Torsten Suel
KES
2005
Springer
15 years 3 months ago
Support for Internet-Based Commonsense Processing - Causal Knowledge Discovery Using Japanese "If" Forms
This paper introduces our method for causal knowledge retrieval from the Internet resources, its results and evaluation of using it in utterance creation process. Our sys...
Yali Ge, Rafal Rzepka, Kenji Araki