Addressed in this paper is the issue of `email data cleaning' for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...
This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer. The stemming model is based on statistical machine translation and it uses an Eng...
Every user has a distinct background and a specific goal when searching for information on the Web. The goal of Web search personalization is to tailor search results to a particu...
This paper presents a new context-based method for automatic detection and extraction of similar and related words from texts. Finding similar words is a very important task for m...
Current web search engines essentially conduct document-level ranking and retrieval. However, structured information about real-world objects embedded in static webpages and online...