Detection of template and noise blocks in web pages is an important step in improving the performance of information retrieval and content extraction. Of the many approaches propos...
Publication records are often found in the authors' personal home pages. If such a record is partitioned into a list of semantic fields of authors, title, date, etc., the uns...
Wei Zhang, Clement T. Yu, Neil R. Smalheiser, Vetl...
Imagine if you would like to deploy your new simulation online and your systems can just take care of itself. Needed web interface could be generated, database schema objects coul...
A good distance metric is crucial for many data mining tasks. To learn a metric in the unsupervised setting, most metric learning algorithms project observed data to a lowdimensio...
We define a connection subgraph as a small subgraph of a large graph that best captures the relationship between two nodes. The primary motivation for this work is to provide a pa...
Christos Faloutsos, Kevin S. McCurley, Andrew Tomk...