To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
Topic modeling has been a key problem for document analysis. One of the canonical approaches for topic modeling is Probabilistic Latent Semantic Indexing, which maximizes the join...
Deng Cai, Qiaozhu Mei, Jiawei Han, Chengxiang Zhai
Customisable embedded processors are becoming available on the market, thus making it possible for designers to speed up execution of applications by using Application-specific F...
The trend towards multicore processors and graphic processing units is increasing the need for software that can take advantage of parallelism. Writing correct parallel programs u...
One of the reasons why large-scale software development is difficult is the number of dependencies that software engineers need to face: e.g., dependencies among the software comp...
Erik Trainer, Stephen Quirk, Cleidson R. B. de Sou...