Given a huge real graph, how can we derive a representative sample? There are many known algorithms to compute interesting measures (shortest paths, centrality, betweenness, etc.)...
Essentially all data mining algorithms assume that the datagenerating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
This paper describes a hidden Markov model (HMM) based approach to perform search interface segmentation. Automatic processing of an interface is a must to access the invisible co...
This paper investigates the discovery of conditional functional dependencies (CFDs). CFDs are a recent extension of functional dependencies (FDs) by supporting patterns of semantic...
Wenfei Fan, Floris Geerts, Laks V. S. Lakshmanan, ...
Subspace clustering has attracted great attention due to its capability of finding salient patterns in high dimensional data. Order preserving subspace clusters have been proven to...