Anecdotal evidence suggests that Web document summaries provide the sighted reader with a basis for making decisions regarding the route to take within non-linear text; and additi...
We present three general approaches to detecting prototypical entities in a given taxonomy and apply them to a music information retrieval (MIR) problem. More precisely, we try...
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
The World Wide Web (WWW) has provided us with a plethora of information. However, given its unstructured format, this information is useful mainly to humans and cannot be effectiv...
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...