Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organ...
Entity Resolution (ER) is the process of identifying groups of records that refer to the same real-world entity. Various measures (e.g., pairwise F1, cluster F1) have been used fo...
David Menestrina, Steven Whang, Hector Garcia-Moli...
Current pattern-detection proposals for streaming data recognize the need to move beyond a simple regular-expression model over strictly ordered input. We continue in this directi...
Badrish Chandramouli, Jonathan Goldstein, David Ma...
In this chapter, we discuss a widely used fault-tolerant data replication model called virtual synchrony. The model responds to two kinds of needs. First, there is the practical qu...
In biological applications, tandem mass spectrometry is a widely used method for determining protein and peptide sequences from an "in vitro" sample. The sequences are not...