Sciweavers

967 search results for Text Mining - page 130 / 194
ICDM
2007
IEEE
15 years 6 months ago
A Computational Approach to Style in American Poetry
We develop a quantitative method to assess the style of American poems and to visualize a collection of poems in relation to one another. Qualitative poetry criticism helped guide...
David M. Kaplan, David M. Blei
PAKDD
2000
ACM
15 years 3 months ago
A Comparative Study of Classification Based Personal E-mail Filtering
This paper addresses personal E-mail filtering by casting it in the framework of text classification. Modeled as semi-structured documents, Email messages consist of a set of field...
Yanlei Diao, Hongjun Lu, Dekai Wu
KDD
2008
ACM
16 years 9 days ago
Entity categorization over large document collections
Extracting entities (such as people, movies) from documents and identifying the categories (such as painter, writer) they belong to enable structured querying and data analysis ov...
Arnd Christian König, Rares Vernica, Venkates...
ESCIENCE
2006
IEEE
15 years 3 months ago
ODIN: A Model for Adapting and Enriching Legacy Infrastructure
The Online Database of Interlinear Text (ODIN) is a database of interlinear text "snippets", harvested mostly from scholarly documents posted to the Web. Although large...
William D. Lewis
KDD
2008
ACM
16 years 9 days ago
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar