Statistical machine learning methods are employed to train a Named Entity Recognizer from annotated data. Methods like Maximum Entropy and Conditional Random Fields make use of fe...
We study dimensionality reduction or feature selection in the text document categorization problem. We focus on the first step in building text categorization systems, that is, the cho...
Given a pair of images represented using bag-of-visual words and a label corresponding to whether the images are “related” (must-link constraint) or “unrelated” (must not li...
Increasingly large text datasets and the high dimensionality associated with natural language create a great challenge in text mining. In this research, a systematic study is cond...
M. Mahdi Shafiei, Singer Wang, Roger Zhang, Evange...
We use clustering to derive new relations which augment the database schema used in the automatic generation of predictive features in statistical relational learning. Clustering improves...