Sciweavers

» Corpora and data preparation
LREC
2008
14 years 11 months ago
Annotating an Arabic Learner Corpus for Error
This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysi...
Ghazi Abuhakema, Reem Faraj, Anna Feldman, Eileen ...
NIPS
2008
14 years 11 months ago
Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation
Hierarchical probabilistic modeling of discrete data has emerged as a powerful tool for text analysis. Posterior inference in such models is intractable, and practitioners rely on...
Indraneel Mukherjee, David M. Blei
ACST
2006
14 years 11 months ago
Distributed hierarchical document clustering
This paper investigates the applicability of a distributed clustering technique called RACHET [1] to organizing large sets of distributed text data. Although the authors of RACHET c...
Debzani Deb, M. Muztaba Fuad, Rafal A. Angryk
ANLP
1994
14 years 11 months ago
Recycling Terms into a Partial Parser
Both full-text information retrieval and large scale parsing require text preprocessing to identify strong lexical associations in textual databases. In order to associate linguis...
Christian Jacquemin
CACM
2010
14 years 10 months ago
Faster dimension reduction
Data represented geometrically in high-dimensional vector spaces can be found in many applications. Images and videos are often represented by assigning a dimension for every pix...
Nir Ailon, Bernard Chazelle