Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...
Abstract We introduce OCELOT, a prototype system for automatically generating the “gist” of a web page by summarizing it. Although most text summarization research to date has ...
In the paper we combine a Bayesian Network model for encoding forensic evidence during a given time interval with a Hidden Markov Model (EBN-HMM) for tracking and predicting the de...
Olivier Y. de Vel, Nianjun Liu, Terry Caelli, Tib&...
Because of name variations, an author may have multiple names and multiple authors may share the same name. Such name ambiguity affects the performance of document retrieval, web ...
Knowing the reputations of your own and/or competitors' products is important for marketing and customer relationship management. It is, however, very costly to collect and a...