The WEBSOM methodology for building very large text archives extracts meaningful unit labels very slowly. This is because the method computes the relative fre...
Arnulfo P. Azcarraga, Teddy N. Yap Jr., Tat-Seng C...
Wikipedia provides a wealth of knowledge, where the first sentence, infobox (and relevant sentences), and even the entire document of a wiki article could be considered as diverse...
In this paper, we investigate the combination of Dynamic Bayesian Network (DBN) classifiers, either independent or coupled, for the recognition of degraded characters. The independ...
The prediction of diagnosis codes is typically based on free-text entries in clinical documents. Previous attempts to tackle this problem range from strictly rule-based sy...
Many large-scale Web applications that require ranked top-k retrieval are implemented using inverted indices. An inverted index represents a sparse term-document matrix, where non...
George Beskales, Marcus Fontoura, Maxim Gurevich, ...