This paper explores probabilistic lexico-syntactic pattern matching, also known as soft pattern matching. While previous methods in soft pattern matching are ad hoc in computing t...
The TREC .GOV collection makes a valuable web testbed for distributed information retrieval methods because it is naturally partitioned and includes 725 web-oriented queries with ...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
A major application area of information systems technology and multimedia content management is support systems for engineering processes. This includes the particularly im...
Sebastian Bossung, Hans-Werner Sehring, Michael Sk...
Can we use social networks to combat spam? This paper investigates the feasibility of MailRank, a new email ranking and classification scheme exploiting the social communication ...