Abstract: As web sites grow more complicated, constructing web information extraction systems becomes more laborious and time-consuming. A common theme is the diffi...
Beyond conventional linear and kernel-based feature extraction, we present a more generalized formulation for feature extraction in this paper. Two representative algorithms using ...
Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-...
Providing a comprehensive set of relevant information at the point of care is crucial for making correct clinical decisions in a timely manner. Retrieval of scenario-specific inf...
Enriching a digital library’s author metadata can lead to valuable services and applications. This paper addresses the problem of extracting authors’ information from their hom...