Sciweavers

EMNLP
2009
The role of named entities in Web People Search
The ambiguity of person names in the Web has become a new area of interest for NLP researchers. This challenging problem has been formulated as the task of clustering Web search r...
Javier Artiles, Enrique Amigó, Julio Gonzal...
EMNLP
2009
What's in a name? In some languages, grammatical gender
This paper presents an investigation of the relation between words and their gender in two gendered languages: German and Romanian. Gender is an issue that has long preoccupied li...
Vivi Nastase, Marius Popescu
EMNLP
2009
The infinite HMM for unsupervised PoS tagging
We extend previous work on fully unsupervised part-of-speech tagging. Using a non-parametric version of the HMM, called the infinite HMM (iHMM), we address the problem of choosing...
Jurgen Van Gael, Andreas Vlachos, Zoubin Ghahraman...
EMNLP
2009
Synchronous Tree Adjoining Machine Translation
Tree Adjoining Grammars have well-known advantages, but are typically considered too difficult for practical systems. We demonstrate that, when done right, adjoining improves tran...
Steve DeNeefe, Kevin Knight
EMNLP
2009
Mining Search Engine Clickthrough Log for Matching N-gram Features
User clicks on a URL in response to a query are extremely useful predictors of the URL's relevance to that query. Exact match click features tend to suffer from severe data s...
Huihsin Tseng, Longbin Chen, Fan Li, Ziming Zhuang...
EMNLP
2009
Hypernym Discovery Based on Distributional Similarity and Hierarchical Structures
This paper presents a new method of developing a large-scale hyponymy relation database by combining Wikipedia and other Web documents. We attach new words to the hyponymy databas...
Ichiro Yamada, Kentaro Torisawa, Jun'ichi Kazama, ...
EMNLP
2009
Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora
A significant portion of the world's text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages h...
Daniel Ramage, David Hall, Ramesh Nallapati, Chris...
EMNLP
2009
Less is More: Significance-Based N-gram Selection for Smaller, Better Language Models
The recent availability of large corpora for training N-gram language models has shown the utility of models of higher order than just trigrams. In this paper, we investigate meth...
Robert C. Moore, Chris Quirk
EMNLP
2009
Review Sentiment Scoring via a Parse-and-Paraphrase Paradigm
This paper presents a parse-and-paraphrase paradigm to assess the degrees of sentiment for product reviews. Sentiment identification has been well studied; however, most previous ...
Jingjing Liu, Stephanie Seneff