Sciweavers

IMCSIT
2010

Parallel, Massive Processing in SuperMatrix - a General Tool for Distributional Semantic Analysis of Corpus

The paper presents an extended version of the SuperMatrix system, a general tool supporting automatic acquisition of lexical semantic relations from corpora. The extensions focus mainly on parallel processing of massive amounts of data. The construction of the system is discussed, and its three distributed parts are presented: distributed construction of co-incidence matrices from corpora, computation of the similarity matrix, and parallel solving of synonymy tests. An evaluation of the proposed approach to parallel processing is reported. Parallelization of the similarity matrix computation shows almost linear speedup. The smallest improvements were achieved for the construction of matrices, as this process is bound mostly by reading huge amounts of data. A few areas in which the functionality of SuperMatrix was improved are also described.
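The near-linear speedup reported for the similarity matrix is plausible because each row of that matrix can be computed independently of the others. The sketch below illustrates that row-wise partitioning only; it is not the SuperMatrix implementation. The cosine measure, the dict-based sparse vectors, the function names, and the use of a thread pool are all assumptions made for illustration (the abstract names neither the similarity measure nor the distribution mechanism, and a real deployment would spread the blocks over processes or machines).

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {feature: weight} dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_block(matrix, words, row_words):
    """Compute the similarity rows for one block of words (one worker's share)."""
    return {a: {b: cosine(matrix[a], matrix[b]) for b in words} for a in row_words}


def similarity_matrix(matrix, workers=2):
    """Split the rows into `workers` blocks and compute the blocks concurrently."""
    words = sorted(matrix)
    blocks = [words[i::workers] for i in range(workers)]
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda block: similarity_block(matrix, words, block), blocks)
        for part in parts:
            result.update(part)
    return result


# Hypothetical toy co-incidence matrix: word -> {context feature: count}.
toy = {
    "car": {"drive": 2.0, "road": 1.0},
    "truck": {"drive": 2.0, "road": 1.0},
    "apple": {"eat": 3.0},
}
sim = similarity_matrix(toy, workers=2)
```

Because the blocks share no state beyond the read-only input matrix, the only sequential step is merging the partial results, which is consistent with the abstract's observation that input reading, not computation, limits the gains elsewhere in the pipeline.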
Bartosz Broda, Damian Jaworski, Maciej Piasecki
Added 05 Mar 2011
Updated 05 Mar 2011
Type Journal
Year 2010
Where IMCSIT