A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
We present DL8, an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal...
The problem of assessing the significance of data mining results on high-dimensional 0?1 data sets has been studied extensively in the literature. For problems such as mining freq...
Aristides Gionis, Heikki Mannila, Panayiotis Tsapa...
The design of algorithms on complex networks, such as routing, ranking or recommendation algorithms, requires a detailed understanding of the growth characteristics of the network...
Christian Borgs, Jennifer T. Chayes, Constantinos ...
This paper describes a novel technique for the synthesis of imperative programs. Automated program synthesis has the potential to make programming and the design of systems easier...
Saurabh Srivastava, Sumit Gulwani, Jeffrey S. Fost...