Scientists typically need to take a large volume of information into account to deal with recurring tasks such as inspecting proceedings, finding related work, or revi...
Algirdas Avizienis, Gintare Grigonyte, Johann Hall...
This paper describes a new approach towards detecting plagiarism and scientific documents that have been read but not cited. In contrast to existing approaches, which analyze docu...
We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a se...
Michael Cameron, Yaniv Bernstein, Hugh E. Williams
We propose a method for discovering the dependency relationships between the topics of documents shared in social networks using the latent social interactions, attempting to answ...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
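The abstract above is cut off before it explains how the ratios are computed, so the following is only a rough sketch of one plausible reading: for each line of the HTML source, divide the number of non-tag characters by the number of tags on that line, treating a tag count of zero as one. The function name `tag_ratios`, the regular expression used to find tags, and the zero-tag convention are assumptions made for illustration, not details taken from the listing.

```python
import re

# Assumption: a "tag" is anything between '<' and '>' on a single line.
# Real HTML (comments, scripts, tags spanning lines) needs a proper parser.
TAG_RE = re.compile(r"<[^>]*>")


def tag_ratios(html: str) -> list[float]:
    """Return one text-to-tag ratio per line of the HTML source."""
    ratios = []
    for line in html.splitlines():
        tag_count = len(TAG_RE.findall(line))
        # Count the characters left once the tags are stripped out.
        text_len = len(TAG_RE.sub("", line).strip())
        # Assumption: lines without tags use a denominator of 1.
        ratios.append(text_len / max(tag_count, 1))
    return ratios


sample = (
    "<div><a href='x'>Home</a></div>\n"
    "<p>This is a long paragraph of article text.</p>\n"
    "plain"
)
ratios = tag_ratios(sample)
```

In this sketch, lines dominated by markup (navigation, menus) get low ratios and lines dominated by prose get high ones, which is the intuition behind using the ratios to separate content from boilerplate. How the ratios are then thresholded or clustered is not recoverable from the truncated abstract and is left out here.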