Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...
A number of content management tasks, including term categorization, term clustering, and automated thesaurus generation, view natural language terms (e.g. words, noun phrases) as...
Alberto Lavelli, Fabrizio Sebastiani, Roberto Zano...
Document-centric XML collections contain text-rich documents, marked up with XML tags that add lightweight semantics to the text. Querying such collections calls for a hybrid quer...
The performance of web search engines may often deteriorate due to the diversity and noisy information contained within web pages. User click-through data can be used to introduce...
One central issue in system structuring and quality prediction is the interdependencies of system modules. This paper proposes a novel technique for determining the operational co...