Sciweavers

CICLING
2010
Springer
13 years 8 months ago
A General Bio-inspired Method to Improve the Short-Text Clustering Task
"Short-text clustering" is a very important research field due to the current tendency for people to use very short documents, e.g. blogs, text-messaging and others. In s...
Diego Ingaramo, Marcelo Errecalde, Paolo Rosso
CICLING
2010
Springer
13 years 8 months ago
Identification of Translationese: A Machine Learning Approach
This paper presents a machine learning approach to the study of translationese. The goal is to train a computer system to distinguish between translated and non-translated text, in...
Iustina Ilisei, Diana Inkpen, Gloria Corpas Pastor...
CICLING
2010
Springer
13 years 8 months ago
Emotions in Words: Developing a Multilingual WordNet-Affect
In this paper we describe the process of Russian and Romanian WordNet-Affect creation. WordNet-Affect is a lexical resource created on the basis of the Princeton WordNet which cont...
Victoria Bobicev, Victoria Maxim, Tatiana Prodan, ...
CICLING
2010
Springer
13 years 8 months ago
Adaptive Term Weighting through Stochastic Optimization
Term weighting strongly influences the performance of text mining and information retrieval approaches. Usually term weights are determined through statistical estimates based on s...
Michael Granitzer
CICLING
2010
Springer
13 years 8 months ago
Selecting the N-Top Retrieval Result Lists for an Effective Data Fusion
Although the application of data fusion in information retrieval has yielded good results in the majority of the cases, it has been noticed that its achievement is dependent on the...
Antonio Juárez-González, Manuel Mont...
CICLING
2010
Springer
13 years 8 months ago
Multi Word Term Queries for Focused Information Retrieval
In this paper, we address both standard and focused retrieval tasks based on comprehensible language models and interactive query expansion (IQE). Query topics are expanded using a...
Eric SanJuan, Fidelia Ibekwe-Sanjuan
CICLING
2010
Springer
13 years 8 months ago
Automatic Generation of Bilingual Dictionaries Using Intermediary Languages and Comparable Corpora
This paper outlines a strategy to build new bilingual dictionaries from existing resources. The method is based on two main tasks: first, a new set of bilingual correspo...
Pablo Gamallo Otero, José Ramon Pichel Camp...
CICLING
2010
Springer
13 years 9 months ago
Mining Parenthetical Translations for Polish-English Lexica
Documents written in languages other than English sometimes include parenthetical English translations, usually for technical and scientific terminology. Techniques had be...
Filip Gralinski
CICLING
2010
Springer
13 years 9 months ago
Word Length n-Grams for Text Re-use Detection
The automatic detection of shared content in written documents –which includes text reuse and its unacknowledged commitment, plagiarism– has become an important probl...
Alberto Barrón-Cedeño, Chiara Basile...
CICLING
2010
Springer
13 years 9 months ago
A Maximum Entropy Approach to Syntactic Translation Rule Filtering
In this paper we will present a maximum entropy filter for the translation rules of a statistical machine translation system based on tree transducers. This filter can be success...
Marcin Junczys-Dowmunt