While there has been a lot of work on finding frequent itemsets in transaction data streams, none of it solves the problem of finding similar pairs according to standard similar...
Overall performance of the data mining process depends not just on the value of the induced knowledge but also on various costs of the process itself such as the cost of acquiring...
An acceptability envelope is a region of imperfect but acceptable software systems surrounding a given perfect system. Explicitly targeting the acceptability envelope during devel...
Background: Expressed sequence tag (EST) collections are composed of a high number of single-pass, redundant, partial sequences, which need to be processed, clustered, and annotat...
Mining discrete patterns in binary data is important for subsampling, compression, and clustering. We consider rank-one binary matrix approximations that identify the dominant patt...