Recent work has shown the feasibility and promise of template-independent Web data extraction. However, existing approaches use decoupled strategies: attempting to do data record ...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...
In this paper a concern about the accuracy (as a function of parallelism) of a certain class of distributed learning algorithms is raised, and one proposed improvement is illustrat...
Lawrence O. Hall, Nitesh V. Chawla, Kevin W. Bowye...
Since pointing and clicking on web pages generates a continuous data stream that flows into web log data, old patterns may become stale and need to be updated. Algorithms for mining web seque...
Genetic-Based Machine Learning Systems (GBML) are comparable in accuracy with other learning methods. However, efficiency is a significant drawback. This paper presents a new rep...
High-dimensional directional data is becoming increasingly important in contemporary applications such as analysis of text and gene-expression data. A natural model for multivaria...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...