Hitting the Right Paraphrases in Good Time

10 years 11 months ago
Hitting the Right Paraphrases in Good Time
We present a random-walk-based approach to learning paraphrases from bilingual parallel corpora. The corpora are represented as a graph in which a node corresponds to a phrase, and an edge exists between two nodes if their corresponding phrases are aligned in a phrase table. We sample random walks to compute the average number of steps it takes to reach a ranking of paraphrases with better ones being "closer" to a phrase of interest. This approach allows "feature" nodes that represent domain knowledge to be built into the graph, and incorporates truncation techniques to prevent the graph from growing too large for efficiency. Current approaches, by contrast, implicitly presuppose the graph to be bipartite, are limited to finding paraphrases that are of length two away from a phrase, and do not generally permit easy incorporation of domain knowledge. Manual evaluation of generated output shows that our approach outperforms the state-of-the-art system of Callison-Bur...
Stanley Kok, Chris Brockett
Added 14 Feb 2011
Updated 14 Feb 2011
Type Journal
Year 2010
Authors Stanley Kok, Chris Brockett
Comments (0)