Sciweavers

KDD
2002
ACM
14 years 5 months ago
Frequent term-based text clustering
Text clustering methods can be used to structure large sets of text or hypertext documents. The well-known methods of text clustering, however, do not really address the special p...
Florian Beil, Martin Ester, Xiaowei Xu
CIARP
2006
Springer
13 years 9 months ago
Oscillating Feature Subset Search Algorithm for Text Categorization
A major characteristic of text document categorization problems is the extremely high dimensionality of text data. In this paper we explore the usability of the Oscillati...
Jana Novovicová, Petr Somol, Pavel Pudil
ECIR
2007
Springer
13 years 6 months ago
Similarity Measures for Short Segments of Text
Measuring the similarity between documents and queries has been extensively studied in information retrieval. However, there are a growing number of tasks that require computing th...
Donald Metzler, Susan T. Dumais, Christopher Meek
KDD
1999
ACM
13 years 9 months ago
Efficient Mining of Emerging Patterns: Discovering Trends and Differences
We introduce a new kind of patterns, called emerging patterns (EPs), for knowledge discovery from databases. EPs are defined as itemsets whose supports increase significantly from...
Guozhu Dong, Jinyan Li
ICDM
2009
IEEE
13 years 3 months ago
TagLearner: A P2P Classifier Learning System from Collaboratively Tagged Text Documents
The amount of text data on the Internet is growing at a very fast rate. Online text repositories for news agencies, digital libraries and other organizations currently store gigaan...
Haimonti Dutta, Xianshu Zhu, Tushar Mahule, Hillol...