Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Record label companies would like to identify potential artists as early as possible in their careers, before other companies approach the artists with competing contracts. The va...
We consider the problem of detecting anomalies in high arity categorical datasets. In most applications, anomalies are defined as data points that are 'abnormal'. Quite ...
Studies have shown that program comprehension takes up to 45% of software development costs. Such high costs are caused by the lack-of documented specification and further aggrava...
We describe a data mining system to detect frauds that are camouflaged to look like normal activities in domains with high number of known relationships. Examples include accounti...