This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...
The Web is experiencing an exponential growth in the use of weblogs or blogs, websites containing dated journal-style entries. Blog entries are generally organised using informall...
Conor Hayes, Paolo Avesani, Sriharsha Veeramachane...
Abstract. This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble-based meta methods. We co...
The increasing use of multimedia in education makes text-production with computers important for students. What kind of role does the Internet play here as an external source of i...
Abstract. Growing amounts of information are currently being generated and stored on the World Wide Web (WWW); in particular, researchers in any field can find a lot of publicati...