Abstract. Information Extraction, the process of eliciting data from natural language documents, usually relies on the ability to parse the document and then to detect the meaning ...
Sampling is a popular method of data collection when it is impossible or too costly to reach the entire population. For example, television show ratings in the United States are g...
Fusionplex is a system for integrating multiple heterogeneous and autonomous information sources that uses data fusion to resolve factual inconsistencies among the individual sour...
A programming system is the user interface between the programmer and the computer. Programming is a notoriously difficult activity, and some of this difficulty can be attribute...
Data is routinely created, disseminated, and processed in distributed systems that span multiple administrative domains. To maintain accountability while the data is transformed b...