This paper reports on a study involving the automatic extraction of Chinese legal terms. We used a word-segmented corpus of Chinese court judgments to extract salient legal expres...
The growing size of electronically available text corpora like companies’ intranets or the WWW has made information access a hot topic within computational linguistics....
Hierarchy fundamentally shapes how we act at work. In this paper, we explore the relationship between the words people write in workplace email and the rank of the email’s recip...
We present a novel approach to integrating transliteration into Hindi-to-Urdu statistical machine translation. We propose two probabilistic models, based on conditional and joint pr...
Nadir Durrani, Hassan Sajjad, Alexander Fraser, He...
Word form normalization through lemmatization or stemming is a standard procedure in information retrieval because morphological variation needs to be accounted for and several la...