Physics Maths Engineering

Poisson PCA for matrix count data

  Peer Reviewed

Abstract

We develop a dimension reduction framework for data consisting of matrices of counts. Our model is based on the assumption of existence of a small amount of independent normal latent variables that drive the dependency structure of the observed data, and can be seen as the exact discrete analogue of a contaminated low-rank matrix normal model. We derive estimators for the model parameters and establish their limiting normality. An extension of a recent proposal from the literature is used to estimate the latent dimension of the model. The method is shown to outperform both its vectorization-based competitors and matrix methods assuming the continuity of the data distribution in analysing simulated data and real world abundance data.