Sciweavers

» The Indifferent Naive Bayes Classifier
ICML
2003
IEEE
15 years 10 months ago
Text Bundling: Statistics Based Data-Reduction
As text corpora become larger, tradeoffs between speed and accuracy become critical: slow but accurate methods may not complete in a practical amount of time. In order to make the...
Lawrence Shih, Jason D. Rennie, Yu-Han Chang, Davi...
ICML
1999
IEEE
15 years 10 months ago
Using Reinforcement Learning to Spider the Web Efficiently
Consider the task of exploring the Web in order to find pages of a particular kind or on a particular topic. This task arises in the construction of search engines and Web knowled...
Jason Rennie, Andrew McCallum
WWW
2005
ACM
15 years 10 months ago
An experimental study on large-scale web categorization
Taxonomies of the Web typically have hundreds of thousands of categories and skewed category distribution over documents. It is not clear whether existing text classification tech...
Tie-Yan Liu, Yiming Yang, Hao Wan, Qian Zhou, Bin ...
WWW
2005
ACM
15 years 10 months ago
Extracting semantic structure of web documents using content and visual information
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
Rupesh R. Mehta, Pabitra Mitra, Harish Karnick
SMC
2007
IEEE
15 years 3 months ago
Text categorization based on the ratio of word frequency in each categories
In the present paper, we consider the automatic text categorization as a series of information processing and propose a new classification technique called the Frequency Ratio ...
Makoto Suzuki, Shigeichi Hirasawa