We describe a Markov chain Monte Carlo (MCMC)-based algorithm for sampling solutions to mixed Boolean/integer constraint problems. The focus of this work differs in two points from...
The availability and the accuracy of the data dictate the success of a data mining application. Increasingly, there is a need to resort to on-line data collection to address the p...
We introduce a new approach to analyzing click logs by examining both the documents that are clicked and those that are bypassed--documents returned higher in the ordering of the ...
Atish Das Sarma, Sreenivas Gollapudi, Samuel Ieong
We address the task of learning rankings of documents from search engine logs of user behavior. Previous work on this problem has relied on passively collected clickthrough data. ...
Recent work has shown the feasibility and promise of templateindependent Web data extraction. However, existing approaches use decoupled strategies ? attempting to do data record ...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...