This paper proposes a framework for semi-supervised structured output learning (SOL), specifically for sequence labeling, based on a hybrid generative and discriminative approach...
In real-world machine learning problems, it is very common that part of the input feature vector is incomplete: either not available, missing, or corrupted. In this paper, we pres...
Background: When term ambiguity and variability are very high, dictionary-based Named Entity Recognition (NER) is not an ideal solution even though large-scale terminological reso...
Yutaka Sasaki, Yoshimasa Tsuruoka, John McNaught, ...
Problems stemming from domain adaptation continue to plague the statistical natural language processing community. There has been continuing work trying to find general purpose al...
This paper proposes a framework for training Conditional Random Fields (CRFs) to optimize multivariate evaluation measures, including non-linear measures such as F-score. Our prop...