High dimensionality remains a significant challenge for document clustering. Recent approaches used frequent itemsets and closed frequent itemsets to reduce dimensionality, and to...
Inference of latent variables from complicated data is an important problem in data mining. The high dimensionality and high complexity of real-world data often make accurate infe...
Background: The use of mass spectrometry as a proteomics tool is poised to revolutionize early disease diagnosis and biomarker identification. Unfortunately, before standard super...
Latent semantic analysis (LSA), as one of the most popular unsupervised dimension reduction tools, has a wide range of applications in text mining and information retrieval. The k...
Xi Chen, Yanjun Qi, Bing Bai, Qihang Lin, Jaime G....
Calculation of object similarity, for example through a distance function, is a common part of data mining and machine learning algorithms. This calculation is crucial for efficie...