The Web contains an abundance of useful semi-structured information that can and should be mined. Types of structure include hyperlinks between pages, structure within hypertext p...
Although skyline queries have already claimed their place in retrieval over central databases, their application in Web information systems has so far been impossible due to the distri...
Logs of users' Web searches on health topics can show signs of escalating medical concerns, where initial queries about common symptoms are followed by queries about se...
Versioned document collections contain multiple versions of each document. Important examples are Web archives, Wikipedia and other wikis, or source code and ...
Document-centric XML collections contain text-rich documents, marked up with XML tags. The tags add lightweight semantics to the text. Querying such collections calls for a hybrid...