The paper presents Bulgarian National Corpus project (BulNC) - a large-scale, representative, online available corpus of Bulgarian. The BulNC is also a monolingual general corpus,...
Information extraction (IE) systems are costly to build because they require development texts, parsing tools, and specialized dictionaries for each application domain and each na...
Background: Despite a remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures pre...
Huaiqiu Zhu, Gang-Qing Hu, Yi-Fan Yang, Jin Wang, ...
The work presents the first effort to automatically annotate the semantic meanings of temporal video patterns obtained through unsupervised discovery processes. This problem is in...
Lexing Xie, Lyndon S. Kennedy, Shih-Fu Chang, Ajay...
This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer. The stemming model is based on statistical machine translation and it uses an Eng...