Sciweavers

555 search results - page 74 / 111
» An Empirical Study on Web Mining of Parallel Data
WWW
2008
ACM
15 years 10 months ago
iRobot: an intelligent crawler for web forums
We study in this paper the Web forum crawling problem, which is a very fundamental step in many Web applications, such as search engine and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
ICDM
2005
IEEE
139 views, Data Mining
15 years 3 months ago
Stability of Feature Selection Algorithms
With the proliferation of extremely high-dimensional data, feature selection algorithms have become indispensable components of the learning process. Strangely, despite extensive ...
Alexandros Kalousis, Julien Prados, Melanie Hilari...
WSDM
2009
ACM
161 views, Data Mining
15 years 4 months ago
Predicting the readability of short web summaries
Readability is a crucial presentation attribute that web summarization algorithms consider while generating a query-biased web summary. Readability quality also forms an important ...
Tapas Kanungo, David Orr
GIR
2007
ACM
15 years 1 month ago
Geo-tagging for imprecise regions of different sizes
Extracting geographical information from various web sources is likely to be important for a variety of applications. One such use for this information is to enable the study of v...
Robert Pasley, Paul Clough, Mark Sanderson
KDD
2009
ACM
167 views, Data Mining
15 years 10 months ago
Seven pitfalls to avoid when running controlled experiments on the web
Controlled experiments, also called randomized experiments and A/B tests, have had a profound influence on multiple fields, including medicine, agriculture, manufacturing, and adv...
Thomas Crook, Brian Frasca, Ron Kohavi, Roger Long...