In Defense of Word Embedding for Generic Text Representation

Abstract. Statistical methods have shown a remarkable ability to capture semantics. The word2vec method is a frequently cited approach for learning meaningful semantic relations between words from a large text corpus, with the advantage of requiring no tagging during training. The prevailing view, however, is that it cannot capture the semantics of word sequences and is of little use for most purposes unless combined with heavy machinery. This paper challenges that view by showing that augmenting the word2vec representation with one of a few pooling techniques yields results surpassing or comparable with the best algorithms in the literature. The improved performance is justified by theory and verified by extensive experiments on well-studied NLP benchmarks.
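For illustration, below is a minimal sketch of the simplest pooling scheme of the kind the abstract refers to: mean pooling of pretrained word vectors into a fixed-length representation of a word sequence. The toy vectors and the mean_pool helper are hypothetical illustrations, not the paper's code; in practice the vectors would come from a trained word2vec model.

    import numpy as np

    # Toy pretrained word vectors (hypothetical; in practice loaded from
    # a word2vec model, e.g. via gensim's KeyedVectors).
    EMB = {
        "the": np.array([0.1, 0.3, -0.2]),
        "cat": np.array([0.7, -0.1, 0.4]),
        "sat": np.array([0.2, 0.5, 0.1]),
    }

    def mean_pool(tokens, emb=EMB):
        """Represent a word sequence as the average of its word vectors.

        Out-of-vocabulary tokens are skipped; an all-zero vector is
        returned if no token is covered.
        """
        vecs = [emb[t] for t in tokens if t in emb]
        if not vecs:
            dim = len(next(iter(emb.values())))
            return np.zeros(dim)
        return np.mean(vecs, axis=0)

    print(mean_pool("the cat sat".split()))

Mean pooling keeps the dimensionality of the word vectors and requires no training, which is what makes the approach attractive as a generic text representation; the paper also considers richer pooling alternatives.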
Guy Lev, Benjamin Klein, Lior Wolf
Type Conference Paper
Year 2015
Where NLDB
Publisher Springer