
DEXAW
2009
IEEE

Clustering of Short Strings in Large Databases

A novel method, CLOSS, intended for textual databases is proposed. It successfully identifies clusters of misspelled strings, even when the cluster border is not prominent. The method uses a q-gram approach to represent the data and a string proximity graph to find the cluster. The contribution addresses short-string clustering in text mining for cases where the proximity graph has multiple horizontal lines or no horizontal line at all.
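
The abstract does not spell out how strings are turned into q-grams, so the following is only a minimal sketch of a common q-gram representation, not the CLOSS method itself. The function names `qgrams` and `overlap`, the choice q = 2, and the `#` padding character are all assumptions made for illustration. The sketch shows why q-grams suit misspelling detection: a misspelled variant shares most of its q-grams with the correct string, while an unrelated string shares few or none.

```python
from collections import Counter


def qgrams(s, q=2, pad="#"):
    """Return the multiset of q-grams of s.

    The string is padded with q-1 pad characters on each side
    (an assumed convention, not taken from the paper).
    """
    padded = pad * (q - 1) + s + pad * (q - 1)
    return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))


def overlap(a, b, q=2):
    """Count the q-grams shared by a and b (multiset intersection)."""
    return sum((qgrams(a, q) & qgrams(b, q)).values())


# Hypothetical example strings: a correct spelling, a misspelling,
# and an unrelated string.
near = overlap("vienna", "viena")
far = overlap("vienna", "berlin")
```

Under these assumptions, `near` is much larger than `far`, which is the kind of signal a proximity-based clustering step could build on.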
Michail Kazimianec, Arturas Mazeika
Added 20 May 2010
Updated 20 May 2010
Type Conference
Year 2009
Where DEXAW
Authors Michail Kazimianec, Arturas Mazeika