Sciweavers

77 search results - page 11 / 16
KDD
2002
ACM
Data Mining
15 years 10 months ago
Enhanced word clustering for hierarchical text classification
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
PAKDD
2009
ACM
Data Mining
15 years 4 months ago
Tree-Based Method for Classifying Websites Using Extended Hidden Markov Models
One important problem proposed recently in the field of web mining is the website classification problem. The complexity together with the necessity to have accurate and fast algorit...
Majid Yazdani, Milad Eftekhar, Hassan Abolhassani
ICDM
2009
IEEE
Data Mining
15 years 4 months ago
Semi-Supervised Sequence Labeling with Self-Learned Features
Typical information extraction (IE) systems can be seen as tasks assigning labels to words in a natural language sequence. The performance is restricted by the availability of l...
Yanjun Qi, Pavel Kuksa, Ronan Collobert, Kunihiko ...
NAACL
2010
14 years 7 months ago
Efficient Parsing of Well-Nested Linear Context-Free Rewriting Systems
The use of well-nested linear context-free rewriting systems has been empirically motivated for modeling of the syntax of languages with discontinuous constituents or relatively f...
Carlos Gómez-Rodríguez, Marco Kuhlma...
EMNLP
2010
14 years 7 months ago
A New Approach to Lexical Disambiguation of Arabic Text
We describe a model for the lexical analysis of Arabic text, using the lists of alternatives supplied by a broad-coverage morphological analyzer, SAMA, which include stable lemma ...
Rushin Shah, Paramveer S. Dhillon, Mark Liberman, ...