Semi-parametric exponential family PCA: Reducing dimensions via non-parametric latent distribution estimation

Sajama Sajama and Alon Orlitsky
CS2004-0790
June 2, 2004

Principal component analysis is a widely used technique for dimensionality reduction, but it is not based on a probability model. Many recently proposed dimension reduction methods are based on latent variable modelling with restrictive assumptions on the latent distribution. We present a technique for density modelling, dimensionality reduction and visualization that is based on a semi-parametric latent variable model. Unlike previous methods, ours estimates the latent distribution non-parametrically. Using this estimated prior to reduce dimensions ensures that multi-modality is better preserved in the projected space. In addition, we allow the components of latent variable models to be drawn from the exponential family, which makes the method suitable for special data types such as binary or count data. We discuss connections to other probabilistic and non-probabilistic dimension reduction schemes based on Gaussian and other exponential family distributions. Simulations on real-valued, binary and count data show that the method compares favorably with related schemes, both in separating different populations and in generalizing to unseen samples.
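
The idea of reducing dimensions with an estimated latent prior can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary data with a Bernoulli component distribution, and it assumes the latent prior has already been estimated as a discrete distribution with support points A and weights pi. The names A, V, b, pi and both functions are hypothetical and chosen only for illustration. Each sample is mapped to the posterior mean of its latent variable, so samples explained by different support points land in different regions of the low-dimensional space.

```python
import numpy as np

def bernoulli_log_lik(X, theta):
    """Log-likelihood of binary rows of X under each row of natural parameters theta.

    X: (n, d) array of 0/1 values; theta: (K, d) array of log-odds.
    Uses log p(x | theta) = sum_j [x_j * theta_j - log(1 + exp(theta_j))].
    Returns an (n, K) array.
    """
    log_partition = np.logaddexp(0.0, theta).sum(axis=1)  # shape (K,)
    return X @ theta.T - log_partition

def posterior_mean_embedding(X, A, V, b, pi):
    """Map each sample to the posterior mean of its latent variable.

    A: (K, L) support points of the assumed discrete latent prior.
    V: (L, d) basis of the low-dimensional subspace of natural parameters.
    b: (d,) offset; pi: (K,) prior weights, all positive and summing to 1.
    Returns the (n, L) embedding and the (n, K) posterior weights.
    """
    theta = A @ V + b                                   # natural parameters per support point
    log_joint = bernoulli_log_lik(X, theta) + np.log(pi)
    log_joint -= log_joint.max(axis=1, keepdims=True)   # stabilize before exponentiating
    resp = np.exp(log_joint)
    resp /= resp.sum(axis=1, keepdims=True)             # posterior over support points
    return resp @ A, resp

# Toy example: two support points on a one-dimensional latent space.
A = np.array([[-3.0], [3.0]])
V = np.ones((1, 4))
b = np.zeros(4)
pi = np.array([0.5, 0.5])
X = np.array([[0, 0, 0, 0], [1, 1, 1, 1]], dtype=float)
Z, resp = posterior_mean_embedding(X, A, V, b, pi)
print(Z)
```

In this toy setting the all-zeros sample is placed near the support point at -3 and the all-ones sample near the one at +3. Estimating A, V, b and pi from data is a separate step that this sketch does not cover.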




The authors of these documents have submitted their reports to this technical report series for the purpose of non-commercial dissemination of scientific work. The reports are copyrighted by the authors, and their existence in electronic format does not imply that the authors have relinquished any rights. You may copy a report for scholarly, non-commercial purposes, such as research or instruction, provided that you agree to respect the author's copyright. For information concerning the use of this document for other than research or instructional purposes, contact the authors. Other information concerning this technical report series can be obtained from the Computer Science and Engineering Department at the University of California at San Diego, techreports@cs.ucsd.edu.



