Aleksandar Dimitriev (2017) Modelling multivariate discrete data with latent Gaussian processes. MSc thesis.
Abstract
Multivariate count data are common in fields such as sports, neuroscience, and text mining. Such data call for models that can perform factor analysis accurately, especially when the data are structured, as in time-series count matrices. We present Poisson Factor Analysis using Latent Gaussian Processes, a novel method for analyzing multivariate count data. Our approach allows for non-i.i.d. observations, which are linked in the latent space by a Gaussian process. Because of an exponential non-linearity in the model, there is no closed-form solution, so we resort to an expectation-maximization approach with a Laplace approximation for tractable inference. We compare our method with other factor analysis methods on several data sets, both synthetic and real. Our method is both qualitatively and quantitatively superior for non-i.i.d. Poisson data, because the assumptions it makes are well suited to such data.
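To make the model class described in the abstract more concrete, the following is a minimal sketch of one plausible generative process: latent factors drawn from a Gaussian process over time, mapped through a linear loading matrix and an exponential link to Poisson rates. It is an illustration under stated assumptions, not the thesis's implementation. The squared-exponential kernel, the loading matrix `W`, the zero offset `b`, and the function names `rbf_kernel` and `sample_counts` are all hypothetical choices made here, and the sketch covers only sampling, not the expectation-maximization or Laplace-approximation inference.

```python
import numpy as np


def rbf_kernel(t, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between all pairs of time points.

    The kernel choice is an assumption; the abstract does not name one.
    """
    d = t[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def sample_counts(n_time=50, n_obs=8, n_factors=2, seed=0):
    """Draw a time-series count matrix from a GP-linked Poisson factor model."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 10.0, n_time)

    # Each latent factor is an independent GP draw over time, so the rows
    # (time points) are correlated rather than i.i.d.
    K = rbf_kernel(t) + 1e-6 * np.eye(n_time)  # jitter for numerical stability
    L = np.linalg.cholesky(K)
    X = L @ rng.standard_normal((n_time, n_factors))

    # Hypothetical loadings and offset mapping factors to observed dimensions.
    W = 0.5 * rng.standard_normal((n_factors, n_obs))
    b = np.zeros(n_obs)

    # Exponential non-linearity turns the linear predictor into Poisson rates.
    rate = np.exp(X @ W + b)
    Y = rng.poisson(rate)
    return t, X, W, Y


if __name__ == "__main__":
    t, X, W, Y = sample_counts()
    print(Y.shape)  # one row per time point, one column per observed count
```

The exponential link keeps every rate positive, and it is also what removes conjugacy between the Gaussian latent variables and the Poisson likelihood, which is consistent with the abstract's remark that no closed-form solution exists.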