In this paper, we develop multilingual supervised latent Dirichlet allocation (MLSLDA), a probabilistic generative model that allows insights gleaned from one language's data...
In this case study I argue for the use of a machine-oriented controlled natural language as an interface language to knowledge systems. Instead of using formal languages that are di...
A well-established principle of language is that there is a preference for closely related words to be close together in the sentence. This can be expressed as a preference for de...
This paper describes the development of a new document ranking system based on layout similarity. The user has a need represented by a set of "wanted" documents, and the syste...
May Huang, Daniel DeMenthon, David S. Doermann, Ly...
Background: Gene/protein recognition and normalization are important preliminary steps for many biological text mining tasks, such as information retrieval, protein-protein intera...