Texts generated by automatic speech recognition (ASR) systems have some specificities, related to the idiosyncrasies of oral productions or the principles of ASR systems, that mak...
A web search with double checking model is proposed to explore the web as a live corpus. Five association measures including variants of Dice, Overlap Ratio, Jaccard, and Cosine, ...
The study presented here relies on the integrated use of different kinds of knowledge in order to improve first-guess accuracy in non-word context-sensitive correction for general...
Collocational analysis is the basis of many studies on lexical acquisition. Collocations are extracted from corpora using more or less shallow processing techniques, that span from...
Roberto Basili, Maria Teresa Pazienza, Paola Velar...
: The increasing number of digitized texts presently available notably on the Web has developed an acute need in text mining techniques. Clustering systems are used more and more o...
Abdelmalek Amine, Zakaria Elberrichi, Michel Simon...