Sciweavers

COMAD
2008

Discovering Interesting Subsets Using Statistical Analysis

Download www.cse.iitb.ac.in

In this paper we present algorithms for identifying interesting subsets of a given database of records. In many real-life applications, it is important to automatically discover subsets of records that are interesting with respect to a given measure. For example, in a customer support database, it is important to identify subsets of tickets whose service time is too large (or too small) compared with the service time of the rest of the tickets. We use Student's t-test to check whether the measure values for a subset and its complement differ significantly. We first discuss the brute-force approach and then present a heuristic-based state-space search algorithm to discover interesting subsets of the given database. To apply the proposed heuristic-based approach to large data sets, we then present a sampling-based algorithm that combines sampling with the proposed heuristics to efficiently identify interesting sets. We discuss an application of the...
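
The brute-force idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made for illustration: subsets are described by attribute=value conjunctions, the pooled-variance (equal-variance) form of Student's t-test is used, and the function names, the `max_attrs` limit, and the fixed `t_critical` cutoff are hypothetical. A real test would look up the critical value for the chosen significance level and `n_a + n_b - 2` degrees of freedom.

```python
from itertools import combinations
from math import copysign, inf, sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Two-sample Student's t statistic using the pooled sample variance."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    diff = mean(a) - mean(b)
    if pooled_var == 0:
        # Both samples are constant: no spread to compare against.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / sqrt(pooled_var * (1 / n_a + 1 / n_b))


def interesting_subsets(records, attributes, measure, t_critical=2.0, max_attrs=1):
    """Enumerate attribute=value subsets and keep those whose measure differs
    from the complement by more than t_critical (a hypothetical cutoff)."""
    results = []
    for k in range(1, max_attrs + 1):
        for attrs in combinations(attributes, k):
            # Every value combination that actually occurs in the data.
            seen = {tuple(r[a] for a in attrs) for r in records}
            for values in sorted(seen):
                inside, outside = [], []
                for r in records:
                    match = tuple(r[a] for a in attrs) == values
                    (inside if match else outside).append(r[measure])
                # The sample variance needs at least two points on each side.
                if len(inside) < 2 or len(outside) < 2:
                    continue
                t = pooled_t_statistic(inside, outside)
                if abs(t) > t_critical:
                    results.append((dict(zip(attrs, values)), t))
    return results
```

The number of candidate subsets grows quickly with `max_attrs`, which is the cost that the heuristic search and sampling mentioned in the abstract are meant to reduce.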

Maitreya Natu, Girish Palshikar

COMAD 2008 | Interesting Subsets | Knowledge Management | Large Data Sets | Service Times

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	COMAD
Authors	Maitreya Natu, Girish Palshikar
