As the datasets used to fuel modern scientific discovery grow increasingly large, they become increasingly difficult to manage using conventional software. Parallel d...
Sarah Loebman, Dylan Nunley, YongChul Kwon, Bill H...
In this paper, we propose a unified framework, called Markov Model Mediator (MMM), to facilitate image database clustering and to improve the query performance. The structure of t...
Mei-Ling Shyu, Shu-Ching Chen, Min Chen, Chengcui ...
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use...
Gene and protein names follow few, if any, true naming conventions and are subject to great variation in different occurrences of the same name. This gives rise to two important p...
We propose a randomized data mining method that finds clusters of spatially overlapping images. The core of the method relies on the min-Hash algorithm for fast detection of p...