This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysi...
Ghazi Abuhakema, Reem Faraj, Anna Feldman, Eileen ...
Hierarchical probabilistic modeling of discrete data has emerged as a powerful tool for text analysis. Posterior inference in such models is intractable, and practitioners rely on...
This paper investigates the applicability of distributed clustering technique, called RACHET [1], to organize large sets of distributed text data. Although the authors of RACHET c...
Both full-text information retrieval and large scale parsing require text preprocessing to identify strong lexical associations in textual databases. In order to associate linguis...
Data represented geometrically in high-dimensional vector spaces can be found in many applications. Images and videos, are often represented by assigning a dimension for every pix...