Sciweavers

29 search results - page 1 / 6
CICLING
2008
Springer
13 years 5 months ago
Evaluation of Internal Validity Measures in Short-Text Corpora
Short-text clustering is one of the most difficult tasks in natural language processing due to the low frequencies of the document terms. We are interested in analysing these kind...
Diego Ingaramo, David Pinto, Paolo Rosso, Marcelo ...
DEXAW
2008
IEEE
13 years 10 months ago
Proximity Estimation and Hardness of Short-Text Corpora
In this work, we investigate the relative hardness of short-text corpora in clustering problems and how this hardness relates to traditional similarity measures. Our appr...
Marcelo Luis Errecalde, Diego Ingaramo, Paolo Ross...
CIKM
2007
Springer
13 years 9 months ago
Spam filtering for short messages
We consider the problem of content-based spam filtering for short text messages that arise in three contexts: mobile (SMS) communication, blog comments, and email summary informa...
Gordon V. Cormack, José María Gó...
ACL
2008
13 years 5 months ago
Assessing Dialog System User Simulation Evaluation Measures Using Human Judges
Previous studies evaluate simulated dialog corpora using evaluation measures which can be automatically extracted from the dialog systems' logs. However, the validity of thes...
Hua Ai, Diane J. Litman
EPIA
2009
Springer
13 years 10 months ago
Phrase Translation Extraction from Aligned Parallel Corpora Using Suffix Arrays and Related Structures
In this paper, we will address term translation extraction from indexed aligned parallel corpora, by using a couple of association measures combined by a voting scheme, for scaling...
José Aires, Gabriel Pereira Lopes, Luis Gom...