Sciweavers

» Combining Statistics and Semantics for Word and Document Clu...
EMNLP
2009
13 years 4 months ago
Statistical Estimation of Word Acquisition with Application to Readability Prediction
Models of language learning play a central role in a wide range of applications: from psycholinguistic theories of how people acquire new word knowledge, to information systems th...
Paul Kidwell, Guy Lebanon, Kevyn Collins-Thompson
KDD
2009
ACM
14 years 6 months ago
Exploiting Wikipedia as external knowledge for document clustering
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
APWEB
2008
Springer
13 years 7 months ago
A Study on Multi-word Extraction from Chinese Documents
As a sequence of two or more consecutive individual words inherent with contextual semantics of individual words, multi-word attracts much attention from statistical linguistics an...
Wen Zhang, Taketoshi Yoshida, Xijin Tang
KBSE
1999
IEEE
13 years 10 months ago
Automatic Software Clustering via Latent Semantic Analysis
The paper describes the initial results of applying Latent Semantic Analysis (LSA) to program source code and associated documentation. Latent Semantic Analysis is a corpus-based ...
Jonathan I. Maletic, Naveen Valluri
DAS
2006
Springer
13 years 10 months ago
Efficient Word Retrieval by Means of SOM Clustering and PCA
We propose an approach for efficient word retrieval from printed documents belonging to Digital Libraries. The approach combines word image clustering (based on Self Orga...
Simone Marinai, Stefano Faini, Emanuele Marino, Gi...