Sciweavers

» The Indifferent Naive Bayes Classifier
ICML
2003
IEEE
Text Bundling: Statistics Based Data-Reduction
As text corpora become larger, tradeoffs between speed and accuracy become critical: slow but accurate methods may not complete in a practical amount of time. In order to make the...
Lawrence Shih, Jason D. Rennie, Yu-Han Chang, Davi...
ICML
1999
IEEE
Using Reinforcement Learning to Spider the Web Efficiently
Consider the task of exploring the Web in order to find pages of a particular kind or on a particular topic. This task arises in the construction of search engines and Web knowled...
Jason Rennie, Andrew McCallum
WWW
2005
ACM
An experimental study on large-scale web categorization
Taxonomies of the Web typically have hundreds of thousands of categories and skewed category distribution over documents. It is not clear whether existing text classification tech...
Tie-Yan Liu, Yiming Yang, Hao Wan, Qian Zhou, Bin ...
WWW
2005
ACM
Extracting semantic structure of web documents using content and visual information
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
Rupesh R. Mehta, Pabitra Mitra, Harish Karnick
SMC
2007
IEEE
Text categorization based on the ratio of word frequency in each categories
In the present paper, we consider the automatic text categorization as a series of information processing and propose a new classification technique called the Frequency Ratio ...
Makoto Suzuki, Shigeichi Hirasawa