The TREC .GOV collection makes a valuable web testbed for distributed information retrieval methods because it is naturally partitioned and includes 725 web-oriented queries with ...
Problems resulting from the management of shared, distributed knowledge has led to ontologies being employed as a solution, in order to effectively integrate information across a...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Web users display their preferences implicitly by navigating through a sequence of pages or by providing numeric ratings to some items. Web usage mining techniques are used to ext...
Markov models have been widely used for modelling users' navigational behaviour in the Web graph, using the transitional probabilities between web pages, as recorded in the w...