This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retr...
This paper presents a hybrid concept hierarchy development technique for web returned documents retrieved by a meta-search engine. The aim of the technique is to separate the init...
Razvan Stefan Bot, Yi-fang Brook Wu, Xin Chen, Qua...
This paper describes a framework for defining domain specific Feature Functions in a user friendly form to be used in a Maximum Entropy Markov Model (MEMM) for the Named Entity Re...
In this paper we propose the model of a prototypical NLP architecture of an information access system to support a team of experts in a scientific design task, in a shared and hete...
Maria Teresa Pazienza, Marco Pennacchiotti, Michel...
This paper presents a framework for user-oriented text mining. It is then illustrated with an example of discovering knowledge from competitors’ websites. The knowledge to be di...