Abstract. Text Categorization algorithms have a large number of parameters that determine their behaviour, whose effect is not easily predicted objectively or intuitively and may v...
Taxonomies of the Web typically have hundreds of thousands of categories and skewed category distribution over documents. It is not clear whether existing text classification tech...
Tie-Yan Liu, Yiming Yang, Hao Wan, Qian Zhou, Bin ...
Extracting sentences that contain important information from a document is a form of text summarization. The technique is the key to the automatic generation of summaries similar ...
This paper presents Domain Relevance Estimation (DRE), a fully unsupervised text categorization technique based on the statistical estimation of the relevance of a text with respe...
tion Abstract ChengXiang Zhai (Advisor: John Lafferty) Language Technologies Institute School of Computer Science Carnegie Mellon University With the dramatic increase in online in...