Sciweavers

» Corpora and data preparation
KDD
2012
ACM
13 years 13 days ago
Selecting a characteristic set of reviews
Online reviews provide consumers with valuable information that guides their decisions on a variety of fronts: from entertainment and shopping to medical services. Although the pr...
Theodoros Lappas, Mark Crovella, Evimaria Terzi
KDD
2009
ACM
15 years 10 months ago
Improving classification accuracy using automatically extracted training data
Classification is a core task in knowledge discovery and data mining, and there has been substantial research effort in developing sophisticated classification models. In a parall...
Ariel Fuxman, Anitha Kannan, Andrew B. Goldberg, R...
ICDM
2008
IEEE
15 years 4 months ago
A Non-parametric Approach to Pair-Wise Dynamic Topic Correlation Detection
We introduce dynamic correlated topic models (DCTM) for analyzing discrete data over time. This model is inspired by the hierarchical Gaussian process latent variable models (GP-L...
Yang Song, Lu Zhang 0007, C. Lee Giles
SIGIR
2006
ACM
15 years 4 months ago
Spoken document retrieval from call-center conversations
We are interested in retrieving information from conversational speech corpora, such as call-center data. This data comprises spontaneous speech conversations with low recording q...
Jonathan Mamou, David Carmel, Ron Hoory
LREC
2008
14 years 11 months ago
Improving Statistical Machine Translation Efficiency by Triangulation
In current phrase-based Statistical Machine Translation systems, more training data is generally better than less. However, a larger data set eventually introduces a larger model ...
Yu Chen, Andreas Eisele, Martin Kay