Summarization of text documents is increasingly important with the amount of data available on the Internet. The large majority of current approaches view documents as linear sequ...
Search engines are useful because they allow the user to find information of interest from the World-Wide Web. These engines use a crawler to gather information from Web sites. Howev...
Web text has been successfully used as training data for many NLP applications. While most previous work accesses web text through search engine hit counts, we created a Web Corpu...
Structural analysis of web pages has been proposed several times and for a number of reasons and purposes, such as the re-flowing of standard web pages to fit a smaller PDA screen....
Fabio Vitali, Angelo Di Iorio, Elisa Ventura Campo...
The World Wide Web is growing at such a pace that even the biggest centralized search engines are able to index only a small part of the available documents on the Internet. The d...