There is no integrated technology that increases the effective use of the vast, heterogeneous, multilingual and multimedia digital content. The need is being express...
Scopus is the world’s largest abstract and citation database of peer-reviewed literature and quality web sources (see http://www.info.sciverse.com/scopus). It contains 41 million r...
The AutoFeed system automatically extracts data from semistructured web sites. Researchers have previously developed two types of supervised learning approaches for extracting we...
In this paper, we propose a new similarity measure that computes the pairwise similarity of text-based documents using the suffix tree document model. By applying the new suffix tree ...
Massive amounts of raw data are currently being generated by biologists while sequencing organisms. Outside of the largest, high-profile projects such as the Human Genome Project, ...