Existing template-independent web data extraction approaches adopt highly ineffective decoupled strategies--attempting to do data record detection and attribute labeling in two se...
Conditional random field (CRF) is a popular graphical model for sequence labeling. The flexibility of CRF poses significant computational challenges for training. Using existing o...
Abstract-- Extending traditional models for discriminative labeling of structured data to include higher-order structure in the labels results in an undesirable exponential increas...
With the increasing use of research paper search engines, such as CiteSeer, for both literature search and hiring decisions, the accuracy of such systems is of paramount importanc...
The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate stat...