Spreadsheets applications allow data to be stored with low development overheads, but also with low data quality. Reporting on data from such sources is difficult using traditiona...
Accurate prediction of user behaviors is important for many social media applications, including social marketing, personalization and recommendation, etc. A major challenge lies ...
ErHeng Zhong, Wei Fan, Junwei Wang, Lei Xiao, Yong...
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table– a data structure that better lends itself to high-level...
A labeled sequence data set related to a certain biological property is often biased and, therefore, does not completely capture its diversity in nature. To reduce this sampling b...
This paper follows a word-document co-clustering model independently introduced in 2001 by several authors such as I.S. Dhillon, H. Zha and C. Ding. This model consists in creatin...