Reducing Wrong Labels in Distant Supervision for Relation Extraction

In relation extraction, distant supervision seeks to extract relations between entities from text by using a knowledge base, such as Freebase, as a source of supervision. When a sentence and a knowledge base refer to the same entity pair, this approach heuristically labels the sentence with the corresponding relation in the knowledge base. However, this heuristic can fail with the result that some sentences are labeled wrongly. This noisy labeled data causes poor extraction performance. In this paper, we propose a method to reduce the number of wrong labels. We present a novel generative model that directly models the heuristic labeling process of distant supervision. The model predicts whether assigned labels are correct or wrong via its hidden variables. Our experimental results show that this model detected wrong labels with higher performance than baseline methods. In the experiment, we also found that our wrong label reduction boosted the performance of relation extraction.
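
The labeling heuristic described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base, the example sentences, and the names `Sentence`, `KB`, and `distant_label` are all hypothetical, and entity recognition is assumed to have been done already. The sketch only shows how matching on an entity pair can attach a relation to a sentence that does not actually express it.

```python
from typing import Dict, List, NamedTuple, Tuple


class Sentence(NamedTuple):
    text: str
    entities: Tuple[str, str]  # entity pair assumed to be pre-extracted


# Hypothetical toy knowledge base: entity pair -> relation
KB: Dict[Tuple[str, str], str] = {
    ("Barack Obama", "Honolulu"): "place_of_birth",
}


def distant_label(
    sentences: List[Sentence], kb: Dict[Tuple[str, str], str]
) -> List[Tuple[str, Tuple[str, str], str]]:
    """Label every sentence whose entity pair appears in the knowledge base."""
    labeled = []
    for s in sentences:
        relation = kb.get(s.entities)
        if relation is not None:
            # The heuristic never checks whether the sentence actually
            # expresses the relation; it only matches the entity pair.
            labeled.append((s.text, s.entities, relation))
    return labeled


sentences = [
    Sentence("Barack Obama was born in Honolulu.", ("Barack Obama", "Honolulu")),
    Sentence("Barack Obama visited Honolulu last week.", ("Barack Obama", "Honolulu")),
    Sentence("Ada Lovelace lived in London.", ("Ada Lovelace", "London")),
]

labeled = distant_label(sentences, KB)
for text, pair, relation in labeled:
    print(relation, "|", text)
```

In this toy run the second sentence receives the `place_of_birth` label even though it only describes a visit. This is the kind of wrong label the paper aims to detect and remove.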

Shingo Takamatsu, Issei Sato, Hiroshi Nakagawa

ACL 2012 | Computational Linguistics | Freebase | Hidden Variables | Source Of Supervision

» Ontological Smoothing for Relation Extraction with Minimal Supervision

» Collective CrossDocument Relation Extraction Without Labelled Data

» Canonical Correlation Analysis for Multiview Semisupervised Feature Extraction

Post Info

Added	29 Sep 2012
Updated	29 Sep 2012
Type	Journal
Year	2012
Where	ACL
Authors	Shingo Takamatsu, Issei Sato, Hiroshi Nakagawa
