This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
In an e-commerce environment, business partners exchange product information in the form of e-catalogs. Since each business player uses his/her own classification and identification c...
Voluminous medical images are generated daily. They are critical assets for medical diagnosis, research, and teaching. To facilitate automatic indexing and retrieval of large medic...
Biometric identification has emerged as a reliable means of controlling access to both physical and virtual spaces. In spite of the rapid proliferation of large-scale dat...
We propose a new method to build persistent suffix trees for indexing genomic data. Our algorithm DiGeST (Disk-Based Genomic Suffix Tree) improves significantly over previous ...
Marina Barsky, Ulrike Stege, Alex Thomo, Chris Upt...