XML databases often contain documents comprising structured text. Therefore, it is important to integrate "information retrieval style" query evaluation, which is well-s...
This paper presents the first stochastic finite-state morphological parser for Turkish. The non-probabilistic parser is a standard finite-state transducer implementation of two-le...
An ad hoc data format is any non-standard, semi-structured data format for which robust data processing tools are not available. In this paper, we present ANNE, a new kind of mark...
In this paper, we describe and compare systems for text normalization based on statistical machine translation (SMT) methods which are constructed with the support of internet use...
Tim Schlippe, Chenfei Zhu, Jan Gebhardt, Tanja Sch...
Math search is a new area of research with many enabling technologies but also many challenges. Some of the enabling technologies include XML, XPath, XQuery, and MathML. Some of t...