In this paper we address the problem of detecting topics in large-scale linked document collections. Recently, topic detection has become a very active area of research due to its...
Large collections of documents containing various types of multimedia are made available on the WWW. Unfortunately, due to the unstructured nature of Internet environments, it is ha...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
When faced with the need for documentation, examples, bug fixes, error descriptions, code snippets, workarounds, templates, patterns, or advice, software developers frequently tu...
Term translation probabilities proved an effective method of semantic smoothing in the language modelling approach to information retrieval. We use Generalized Latent Semantic Ana...