Sciweavers

ICML
2009
IEEE

Grammatical inference as a principal component analysis problem

One of the main problems in probabilistic grammatical inference consists in inferring a stochastic language, i.e. a probability distribution, in some class of probabilistic models, from a sample of strings independently drawn according to a fixed unknown target distribution p. Here, we consider the class of rational stochastic languages composed of stochastic languages that can be computed by multiplicity automata, which can be viewed as a generalization of probabilistic automata. Rational stochastic languages p have a useful algebraic characterization: all the mappings $\dot{u}p : v \mapsto p(uv)$ lie in a finite dimensional vector subspace $V_p^*$ of the vector space $\mathbb{R}\langle\langle\Sigma\rangle\rangle$ composed of all real-valued functions defined over $\Sigma^*$. Hence, a first step in the grammatical inference process can consist in identifying the subspace $V_p^*$. In this paper, we study the possibility of using Principal Component Analysis to achieve this task. We provide an inference algorithm which computes an esti...
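The finite-dimensionality claim in the abstract can be checked on a toy case. The sketch below is not the paper's algorithm and does not perform PCA: it builds the exact matrix of values p(uv) for a hypothetical 2-state probabilistic automaton over {a, b} and computes its rank with exact fractions. The automaton weights and the names `prob`, `words`, `hankel` and `rank` are all assumptions made for illustration.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical 2-state probabilistic automaton over the alphabet {a, b}.
# init[i]: initial weight of state i; term[i]: stopping probability in state i;
# trans[x][i][j]: probability of emitting x while moving from state i to state j.
# For each state, stopping plus all outgoing probabilities sum to 1.
init = [F(1), F(0)]
term = [F(1, 4), F(1, 2)]
trans = {
    "a": [[F(1, 4), F(1, 4)], [F(0), F(1, 4)]],
    "b": [[F(1, 4), F(0)], [F(1, 8), F(1, 8)]],
}


def prob(word):
    """p(word) = init * M_{w1} * ... * M_{wn} * term, computed exactly."""
    vec = init[:]
    for x in word:
        m = trans[x]
        vec = [sum(vec[i] * m[i][j] for i in range(2)) for j in range(2)]
    return sum(v * t for v, t in zip(vec, term))


def words(max_len):
    """All words over {a, b} of length at most max_len, shortest first."""
    return ["".join(w) for n in range(max_len + 1) for w in product("ab", repeat=n)]


def hankel(prefixes, suffixes):
    """Matrix whose row u lists the values p(uv), i.e. the mapping v -> p(uv)."""
    return [[prob(u + v) for v in suffixes] for u in prefixes]


def rank(matrix):
    """Rank by Gaussian elimination over exact fractions."""
    rows = [r[:] for r in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


basis = words(2)            # "", a, b, aa, ab, ba, bb
H = hankel(basis, basis)    # 7 x 7 matrix of exact probabilities
print(rank(H))              # cannot exceed the number of states (2)
```

Each row of `H` is a finite slice of one mapping v ↦ p(uv), so the rank bound reflects the finite dimension of the subspace spanned by those mappings. With a finite sample, the entries would be replaced by empirical frequencies and the rank would no longer be exact, which is the setting in which a PCA-style step becomes relevant.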
Added 19 May 2010
Updated 19 May 2010
Type Conference
Year 2009
Where ICML
Authors Raphaël Bailly, François Denis, Liva Ralaivola