10 years 5 months ago
Learning from Multiple Annotators with Gaussian Processes
Abstract. In many supervised learning tasks it can be costly or infeasible to obtain objective, reliable labels. We may, however, be able to obtain a large number of subjective, po...
Perry Groot, Adriana Birlutiu, Tom Heskes
10 years 5 months ago
Accurate transcription of broadcast news speech using multiple noisy transcribers and unsupervised reliability metrics
Professional manual transcription of speech is an expensive and time-consuming process. This paper focuses on the problem of combining noisy transcriptions from multiple non-exper...
Kartik Audhkhasi, Panayiotis G. Georgiou, Shrikant...
10 years 11 months ago
Creating a Gold Standard for Sentence Clustering in Multi-Document Summarization
Sentence Clustering is often used as a first step in Multi-Document Summarization (MDS) to find redundant information. All the same, there is no gold standard available. This paper...
Johanna Geiss
10 years 11 months ago
Bitext Dependency Parsing with Bilingual Subtree Constraints
This paper proposes a dependency parsing method that uses bilingual constraints to improve the accuracy of parsing bilingual texts (bitexts). In our method, a target-side tree frag...
Wenliang Chen, Jun'ichi Kazama, Kentaro Torisawa
CSDA 2006
11 years 1 months ago
Comparing two binary diagnostic tests in the presence of verification bias
The comparison of the accuracy of two binary diagnostic tests has traditionally required knowledge of the real state of the disease in all of the patients in the sample via the ap...
José Antonio Roldán Nofuentes, Juan ...
BMCBI 2010
11 years 1 months ago
Machine learning methods for metabolic pathway prediction
Background: A key challenge in systems biology is the reconstruction of an organism's metabolic network from its genome sequence. One strategy for addressing this problem is ...
Joseph M. Dale, Liviu Popescu, Peter D. Karp
11 years 2 months ago
Toward a Task-based Gold Standard for Evaluation of NP Chunks and Technical Terms
We propose a gold standard for evaluating two types of information extraction output -- noun phrase (NP) chunks (Abney 1991; Ramshaw and Marcus 1995) and technical terms (Justeson...
Nina Wacholder, Peng Song
11 years 3 months ago
Evaluating Information Content by Factoid Analysis: Human annotation and stability
We present a new approach to intrinsic summary evaluation, based on initial experiments in van Halteren and Teufel (2003), which combines two novel aspects: comparison of informat...
Simone Teufel, Hans van Halteren
LREC 2008
11 years 3 months ago
Measures for Term and Sentence Relevances: an Evaluation for German
Terms, term relevances, and sentence relevances are concepts that figure in many NLP applications, such as Text Summarization. These concepts are implemented in various ways, thou...
Heike Bieler, Stefanie Dipper
LREC 2008
11 years 3 months ago
Do we Still Need Gold Standards for Evaluation?
The availability of a huge mass of textual data in electronic format has increased the need for fast and accurate techniques for textual data processing. Machine learning and stat...
Thierry Poibeau, Cédric Messiant