Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...
As the amount of information on the Web grows, the ability to retrieve relevant information quickly and easily is necessary. The combination of ample news sources on the Web, litt...
Krysta Marie Svore, Lucy Vanderwende, Christopher ...
We describe the Paraflow system for connecting heterogeneous computing services together into a flexible and efficient data-mining metacomputer. There are three levels of parallel...
This study borrowed sequence analysis techniques from the genetic sciences and applied them to a similar problem in email filtering and web searching. Genre identification is the ...
Expressing web page content in a way that computers can understand is the key to a semantic web. Generating ontological information from the web automatically using machine learni...