ACL
2008
Blog Categorization Exploiting Domain Dictionary and Dynamically Estimated Domains of Unknown Words
This paper presents an approach to text categorization that i) uses no machine learning and ii) reacts on-the-fly to unknown words. These features are important for categorizing B...
Chikara Hashimoto, Sadao Kurohashi
ACL
2008
Query-based Sentence Fusion is Better Defined and Leads to More Preferred Results than Generic Sentence Fusion
We show that question-based sentence fusion is a better defined task than generic sentence fusion (Q-based fusions are shorter, display less variety in length, yield more identica...
Emiel Krahmer, Erwin Marsi, Paul Pelt
ACL
2008
Enriching Spoken Language Translation with Dialog Acts
Current statistical speech translation approaches predominantly rely on just text transcripts and do not adequately utilize the rich contextual information such as conveyed throug...
Vivek Kumar Rangarajan Sridhar, Srinivas Bangalore...
ACL
2008
The Good, the Bad, and the Unknown: Morphosyllabic Sentiment Tagging of Unseen Words
The omnipresence of unknown words is a problem that any NLP component needs to address in some form. While there exist many established techniques for dealing with unknown words i...
Karo Moilanen, Stephen G. Pulman
ACL
2008
Construct State Modification in the Arabic Treebank
Earlier work in parsing Arabic has speculated that attachment to construct state constructions decreases parsing performance. We make this speculation precise and define the probl...
Ryan Gabbard, Seth Kulick
ACL
2008
Smoothing a Tera-word Language Model
Frequency counts from very large corpora, such as the Web 1T dataset, have recently become available for language modeling. Omission of low frequency n-gram counts is a practical ...
Deniz Yuret
ACL
2008
Language Dynamics and Capitalization using Maximum Entropy
This paper studies the impact of written language variations and the way it affects the capitalization task over time. A discriminative approach, based on maximum entropy models, ...
Fernando Batista, Nuno J. Mamede, Isabel Trancoso
ACL
2008
Learning Semantic Links from a Corpus of Parallel Temporal and Causal Relations
Finding temporal and causal relations is crucial to understanding the semantic structure of a text. Since existing corpora provide no parallel temporal and causal annotations, we ...
Steven Bethard, James H. Martin
ACL
2008
Extractive Summaries for Educational Science Content
This paper describes an extractive summarizer for educational science content called COGENT. COGENT extends MEAD based on strategies elicited from an empirical study with domain a...
Sebastian de la Chica, Faisal Ahmad, James H. Mart...
ACL
2008
Using Automatically Transcribed Dialogs to Learn User Models in a Spoken Dialog System
We use an EM algorithm to learn user models in a spoken dialog system. Our method requires automatically transcribed (with ASR) dialog corpora, plus a model of transcription error...
Umar Syed, Jason Williams