Document clustering techniques mostly rely on single-term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...
Large amounts of (often valuable) information are stored in web-accessible text databases. "Metasearchers" provide unified interfaces to query multiple such databases at...
Panagiotis G. Ipeirotis, Alexandros Ntoulas, Jungh...
The proliferation of linked data on the Web paves the way to a new generation of applications that exploit heterogeneous data from different sources. However, because this Web of d...
Web spam can significantly deteriorate the quality of search engines. Early web spamming techniques mainly manipulate page content. Since linkage information is widely used in we...
Web Usage Mining (WUM), a natural application of data mining techniques to the data collected from user interactions with the web, has attracted considerable attention from both academia and industry ...