Document Type
Article
Publication Date
7-2012
Abstract
We present an extension to Jaynes’ maximum entropy principle that incorporates latent variables. The principle of latent maximum entropy we propose is different from both Jaynes’ maximum entropy principle and maximum likelihood estimation, but can yield better estimates in the presence of hidden variables and limited training data. We first show that solving for a latent maximum entropy model poses a hard nonlinear constrained optimization problem in general. However, we then show that feasible solutions to this problem can be obtained efficiently for the special case of log-linear models---which forms the basis for an efficient approximation to the latent maximum entropy principle. We derive an algorithm that combines expectation-maximization with iterative scaling to produce feasible log-linear solutions. This algorithm can be interpreted as an alternating minimization algorithm in the information divergence, and reveals an intimate connection between the latent maximum entropy and maximum likelihood principles. To select a final model, we generate a series of feasible candidates, calculate the entropy of each, and choose the model that attains the highest entropy. Our experimental results show that estimation based on the latent maximum entropy principle generally gives better results than maximum likelihood when estimating latent variable models on small observed data samples.
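The EM-plus-iterative-scaling procedure and the entropy-based model selection described above can be sketched on a toy problem. The domain (binary x observed, binary z latent), the three indicator features, and the use of generalized iterative scaling with C set to the maximum feature count (rather than adding a slack feature) are all illustrative simplifications, not the paper's actual setup:

```python
import math
import random

X = [0, 1]          # observed variable values
Z = [0, 1]          # latent variable values
# Toy binary features on (x, z); chosen for illustration only.
feats = [lambda x, z: float(x == 1),
         lambda x, z: float(z == 1),
         lambda x, z: float(x == z)]
C = 3.0             # GIS constant: max total feature count per configuration

def joint(lam):
    """Log-linear joint p(x, z) proportional to exp(sum_j lam_j * f_j(x, z))."""
    w = {(x, z): math.exp(sum(l * f(x, z) for l, f in zip(lam, feats)))
         for x in X for z in Z}
    s = sum(w.values())
    return {k: v / s for k, v in w.items()}

def fit_lme(counts, n_outer=50, n_gis=20, seed=0):
    """Sketch of the EM + iterative-scaling loop: the E-step fixes feature
    targets using the current posterior over z; the M-step runs GIS updates
    toward those targets. Repeated until (approximately) feasible."""
    rng = random.Random(seed)
    lam = [rng.uniform(-1, 1) for _ in feats]
    n = sum(counts.values())
    for _ in range(n_outer):
        p = joint(lam)
        # E-step: expected features under empirical p~(x) times p(z | x; lam)
        target = [0.0] * len(feats)
        for x, c in counts.items():
            px = sum(p[(x, z)] for z in Z)
            for z in Z:
                post = p[(x, z)] / px
                for j, f in enumerate(feats):
                    target[j] += (c / n) * post * f(x, z)
        # M-step: GIS updates (C = max feature sum, a common simplification)
        for _ in range(n_gis):
            p = joint(lam)
            for j, f in enumerate(feats):
                ep = sum(p[(x, z)] * f(x, z) for x in X for z in Z)
                lam[j] += math.log(max(target[j], 1e-12) / max(ep, 1e-12)) / C
    return lam

def entropy(p):
    return -sum(q * math.log(q) for q in p.values() if q > 0)

# Observed counts over x alone (z is never observed). Model selection as in
# the abstract: fit several feasible candidates, keep the highest-entropy one.
counts = {0: 7, 1: 3}
candidates = [fit_lme(counts, seed=s) for s in range(5)]
best = max(candidates, key=lambda lam: entropy(joint(lam)))
```

Because different random initializations can converge to different feasible solutions of the nonlinear constraints, the final `max` over candidate entropies mirrors the selection step described in the abstract.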
Repository Citation
Wang, S., Schuurmans, D., & Zhao, Y. (2012). The Latent Maximum Entropy Principle. ACM Transactions on Knowledge Discovery from Data, 6(2), 8.
https://corescholar.libraries.wright.edu/knoesis/1012
DOI
10.1145/2297456.2297460
Included in
Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, Science and Technology Studies Commons
Comments
Attached is the author's peer-reviewed, pre-publication version of this article. The final, published version is available from the ACM Digital Library at http://dx.doi.org/10.1145/2297456.2297460.