Sciweavers

523 search results for "Metric Learning for Text Documents" (page 63 of 105)
EMNLP
2004
15 years 17 days ago
Trained Named Entity Recognition using Distributional Clusters
This work applies boosted wrapper induction (BWI), a machine learning algorithm for information extraction from semi-structured documents, to the problem of named entity recogniti...
Dayne Freitag
ICML
2010
IEEE
14 years 9 months ago
Mining Clustering Dimensions
Many real-world datasets can be clustered along multiple dimensions. For example, text documents can be clustered not only by topic, but also by the author's gender or sentim...
Sajib Dasgupta, Vincent Ng
EMNLP
2008
15 years 18 days ago
HTM: A Topic Model for Hypertexts
Previously, topic models such as PLSI (Probabilistic Latent Semantic Indexing) and LDA (Latent Dirichlet Allocation) were developed for modeling the contents of plain texts. Recent...
Congkai Sun, Bin Gao, Zhenfu Cao, Hang Li
KDD
2008
ACM
15 years 11 months ago
Text classification, business intelligence, and interactivity: automating C-Sat analysis for services industry
Text classification has matured as a research discipline over the last decade. Independently, business intelligence over structured databases has long been a source of insights fo...
Shantanu Godbole, Shourya Roy
DL
1999
Springer
15 years 3 months ago
KEA: Practical Automatic Keyphrase Extraction
Keyphrases provide semantic metadata that summarize and characterize documents. This paper describes Kea, an algorithm for automatically extracting keyphrases from text. Kea ident...
Ian H. Witten, Gordon W. Paynter, Eibe Frank, Carl...