With the rapid growth of XML-document traffic on the Internet, scalable content-based dissemination of XML documents to a large, dynamic group of consumers has become an important...
If we abstract a sensor network as a network graph consisting of vertices and edges, where vertices represent sensor nodes and edges represent distance measurements between neighbo...
Dirty data is a serious problem for businesses, leading to incorrect decision making, inefficient daily operations, and ultimately wasted time and money. Dirty data often ari...
Time series data is common in many settings, including scientific and financial applications, where the amount of data is often very large. We seek to support pred...
In this paper we extend the PAC learning algorithm due to Clark and Thollard for learning distributions generated by PDFA to automata whose transitions may take varying time length...