Web content is notoriously difficult to capture on a printed page due to inconsistent and undesired results. Items that users may not want to print, such as media, navigation menu...
Search facilitated with agglomerative hierarchical clustering methods was studied in a collection of Finnish newspaper articles (N = 53,893). To allow quick experiments, clustering...
Tuomo Korenius, Jorma Laurikkala, Martti Juhola, K...
Language models used in current automatic speech recognition systems are trained on general-purpose corpora and are therefore not well suited to transcribing spoken documents dealing w...
The increasing number of digitized texts now available, notably on the Web, has created an acute need for text mining techniques. Clustering systems are used more and more o...
Abdelmalek Amine, Zakaria Elberrichi, Michel Simon...
In TREC-9, we participated in the English-Chinese Cross Language, 10GB Web data ad-hoc retrieval as well as the Question-Answering tracks, all using automatic procedures. All thes...
Kui-Lam Kwok, Laszlo Grunfeld, Norbert Dinstl, M. ...