An introduction to variational methods for graphical models

Machine Learning, 37(2) (1999): 183-233


Abstract

This paper presents a tutorial introduction to the use of variational methods for inference and learning in graphical models (Bayesian networks and Markov random fields). We present a number of examples of graphical models, including the QMR-DT database, the sigmoid belief network, the Boltzmann machine, and several variants of hidden Markov models, in which it is infeasible to run exact inference algorithms. We then introduce variational methods, which exploit laws of large numbers to transform the original graphical model into a simplified graphical model in which inference is efficient. Inference in the simplified model provides bounds on probabilities of interest in the original model. We describe a general framework for generating variational transformations based on convex duality. Finally we return to the examples and demonstrate how variational algorithms can be formulated in each case.
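
To make the convex-duality idea concrete, the minimal sketch below numerically checks the variational bound the paper uses as its introductory example of a variational transformation: for every lambda > 0, ln(x) <= lambda*x - ln(lambda) - 1, with equality at lambda = 1/x, so minimizing over the variational parameter recovers ln(x) exactly. The particular grids of x and lambda values are arbitrary choices for illustration, not from the paper.

```python
import numpy as np

# Variational upper bound on the logarithm via convex duality:
#   ln(x) <= lam * x - ln(lam) - 1   for every lam > 0,
# with equality at lam = 1/x. Such bounds let an intractable node
# be replaced by an adjustable linear one.

x = np.linspace(0.1, 5.0, 50)
for lam in (0.25, 1.0, 4.0):
    bound = lam * x - np.log(lam) - 1
    # the bound holds everywhere, for any fixed lam
    assert np.all(bound >= np.log(x) - 1e-12)

# optimizing the variational parameter pointwise recovers ln(x) exactly
lam_star = 1.0 / x
assert np.allclose(lam_star * x - np.log(lam_star) - 1, np.log(x))
print("variational bound on ln(x) verified")
```
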

Introduction
  • The problem of probabilistic inference in graphical models is the problem of computing a conditional probability distribution over the values of some of the nodes (the "hidden" nodes H), given the values of other nodes (the "evidence" nodes E).
  • The authors often wish to calculate marginal probabilities in graphical models, in particular the probability of the observed evidence, P(E).
  • Inference algorithms generally do not compute the numerator and denominator of Eq. (1) and divide; rather, they produce the likelihood P(E) as a by-product of the calculation of P(H|E) (see the sketch after this list).
  • Algorithms that maximize the likelihood generally make use of the calculation of P(H|E) as a subroutine.
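
A minimal numeric sketch of these points, with made-up numbers: exact inference by enumeration in a hypothetical two-node belief network D -> S, where computing the posterior P(D | S = 1) yields the likelihood P(S = 1) as its normalizing constant rather than as a separate computation.

```python
# Model: P(D, S) = P(D) * P(S | D), with binary D (disease) and S (symptom).
# All probability tables are hypothetical.
p_d = {0: 0.99, 1: 0.01}                      # prior P(D)
p_s_given_d = {0: {0: 0.95, 1: 0.05},         # P(S | D=0)
               1: {0: 0.20, 1: 0.80}}         # P(S | D=1)

evidence_s = 1                                 # observed evidence E: S = 1

# Joint probabilities P(D=d, S=1) for each value of the hidden node D.
joint = {d: p_d[d] * p_s_given_d[d][evidence_s] for d in (0, 1)}

# The likelihood P(E) is the sum over hidden configurations ...
likelihood = sum(joint.values())               # P(S=1)

# ... and the posterior P(D | S=1) is obtained by normalizing the same
# terms, so P(E) is produced as a by-product.
posterior = {d: joint[d] / likelihood for d in (0, 1)}

print(f"P(E) = {likelihood:.4f}")              # 0.99*0.05 + 0.01*0.80
print(f"P(D=1 | S=1) = {posterior[1]:.4f}")
```
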
Highlights
  • The problem of probabilistic inference in graphical models is the problem of computing a conditional probability distribution over the values of some of the nodes, given the values of other nodes.
  • Viewed as a function of the parameters of the graphical model, for fixed E, P(E) is an important quantity known as the likelihood.
  • We provide a brief overview of the QMR-DT database here; for further details see Shwe et al. (1991).
  • As in the case of the Boltzmann machine, we find that the variational parameters are linked via their Markov blankets, and the consistency equation (Eq. (67)) can be interpreted as a local message-passing algorithm (a sketch of this reading follows the list).
  • We have described a variety of applications of variational methods to problems of inference and learning in graphical models.
  • It is important to emphasize that research on variational methods for graphical models is of quite recent origin, and that there are many open problems and unresolved issues.
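
To make the message-passing reading concrete, here is a minimal sketch (hypothetical random weights, not the paper's worked example) using the Boltzmann-machine form of the mean-field consistency equations, mu_i = sigma(sum_j theta_ij * mu_j + theta_i0): each variational mean mu_i is recomputed from the current means of its Markov-blanket neighbors alone, so iterating the equations to a fixed point is a local message-passing procedure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n = 8                                          # toy size, all weights hypothetical
theta = rng.normal(scale=0.5, size=(n, n))
theta = (theta + theta.T) / 2                  # symmetric pairwise weights
np.fill_diagonal(theta, 0.0)                   # no self-connections
theta0 = rng.normal(scale=0.5, size=n)         # biases

mu = np.full(n, 0.5)                           # initialize variational means
for sweep in range(200):
    old = mu.copy()
    for i in range(n):                         # node-local updates: mu_i depends
        mu[i] = sigmoid(theta[i] @ mu + theta0[i])  # only on its neighbors' mu_j
    if np.max(np.abs(mu - old)) < 1e-10:       # stop at a fixed point of the
        break                                  # consistency equations

print(f"stopped after {sweep + 1} sweeps; mu = {np.round(mu, 3)}")
```
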
Methods
  • The authors make a few remarks on the relationships between variational methods and stochastic methods, in particular the Gibbs sampler.
  • In Gibbs sampling, the message-passing is simple: each node learns the current instantiation of its Markov blanket (see the first sketch after this list).
  • With enough samples the node can estimate the distribution over its Markov blanket and determine its own statistics.
  • The authors can quite generally treat parameters as additional nodes in a graphical model (cf. this volume) and thereby treat Bayesian inference on the same footing as generic probabilistic inference in a graphical model.
  • This probabilistic inference problem is often intractable, and variational approximations can be useful.
  • The ensemble is fit by minimizing the appropriate KL divergence, KL(Q ‖ P) = ∫ Q(θ) ln [Q(θ) / P(θ|E)] dθ (see the second sketch after this list).
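
Two sketches follow. First, the Gibbs sampler described above, on the same kind of small, hypothetical Boltzmann machine as before: each node repeatedly resamples its binary state from the current instantiation of its Markov blanket, and per-node statistics are estimated by averaging over samples.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n = 8                                          # toy size, hypothetical weights
theta = rng.normal(scale=0.5, size=(n, n))
theta = (theta + theta.T) / 2
np.fill_diagonal(theta, 0.0)
theta0 = rng.normal(scale=0.5, size=n)

s = rng.integers(0, 2, size=n).astype(float)   # random initial state
burn_in, n_samples = 500, 5000
counts = np.zeros(n)
for t in range(burn_in + n_samples):
    for i in range(n):
        # P(S_i = 1 | Markov blanket) depends only on the neighbors' states
        # (theta[i, i] = 0, so s[i] itself contributes nothing).
        s[i] = float(rng.random() < sigmoid(theta[i] @ s + theta0[i]))
    if t >= burn_in:                           # discard burn-in, then average
        counts += s

print("estimated marginals:", np.round(counts / n_samples, 3))
```
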
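Second, a brute-force check of the KL-divergence fit on a model small enough to enumerate exactly: the mean-field fixed-point updates from the earlier sketch are exact coordinate minimizations of KL(Q ‖ P) for a fully factorized Q, so they drive the divergence down from its value at a uniform Q. The weights are again hypothetical, and enumeration over 2^n states is only feasible for toy n.

```python
import itertools
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n = 8                                          # toy size, hypothetical weights
theta = rng.normal(scale=0.5, size=(n, n))
theta = (theta + theta.T) / 2
np.fill_diagonal(theta, 0.0)
theta0 = rng.normal(scale=0.5, size=n)

# Exact P by enumerating all 2^n states of the Boltzmann machine:
#   E(s) = -0.5 * s^T theta s - theta0^T s,  P(s) proportional to exp(-E(s)).
states = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
energy = -0.5 * np.einsum('ki,ij,kj->k', states, theta, states) - states @ theta0
p = np.exp(-energy)
p /= p.sum()

def kl(mu):
    """KL(Q || P) for the factorized Q(s) = prod_i mu_i^s_i (1-mu_i)^(1-s_i)."""
    q = np.prod(np.where(states == 1, mu, 1 - mu), axis=1)
    return float(np.sum(q * np.log(q / p)))

mu = np.full(n, 0.5)
for _ in range(100):                           # each update is the exact
    for i in range(n):                         # coordinate minimizer of KL,
        mu[i] = sigmoid(theta[i] @ mu + theta0[i])  # so KL can only decrease

print(f"KL(Q||P) at uniform Q:  {kl(np.full(n, 0.5)):.4f}")
print(f"KL(Q||P) after updates: {kl(mu):.4f}")
```
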
Conclusion
  • The authors have described a variety of applications of variational methods to problems of inference and learning in graphical models.
  • The authors hope to have convinced the reader that variational methods can provide a powerful and elegant tool for graphical models, and that the algorithms that result are simple and intuitively appealing.
  • It is important to emphasize that research on variational methods for graphical models is of quite recent origin, and that there are many open problems and unresolved issues.
References
  • Bathe, K. J. (1996). Finite Element Procedures. Englewood Cliffs, NJ: Prentice-Hall.
  • Baum, L. E., Petrie, T., Soules, G., & Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics, 41, 164-171.
  • Cover, T., & Thomas, J. (1991). Elements of Information Theory. New York: John Wiley.
  • Cowell, R. (in press). Introduction to inference for Bayesian networks. In M. I. Jordan (Ed.), Learning in Graphical Models. Norwell, MA: Kluwer Academic Publishers.
  • Dagum, P., & Luby, M. (1993). Approximating probabilistic inference in Bayesian belief networks is NP-hard. Artificial Intelligence, 60, 141-153.
  • Dayan, P., Hinton, G. E., Neal, R., & Zemel, R. S. (1995). The Helmholtz machine. Neural Computation, 7, 889-904.
  • Dean, T., & Kanazawa, K. (1989). A model for reasoning about persistence and causation. Computational Intelligence, 5, 142-150.
  • Dechter, R. (in press). Bucket elimination: A unifying framework for probabilistic inference. In M. I. Jordan (Ed.), Learning in Graphical Models. Norwell, MA: Kluwer Academic Publishers.
  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1-38.
  • Draper, D. L., & Hanks, S. (1994). Localized partial evaluation of belief networks. Uncertainty and Artificial Intelligence: Proceedings of the Tenth Conference. San Mateo, CA: Morgan Kaufmann.
  • Frey, B., Hinton, G. E., & Dayan, P. (1996). Does the wake-sleep algorithm learn good density estimators? In D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press.
  • Fung, R., & Favero, B. D. (1994). Backward simulation in Bayesian networks. Uncertainty and Artificial Intelligence: Proceedings of the Tenth Conference. San Mateo, CA: Morgan Kaufmann.
  • Galland, C. (1993). The limitations of deterministic Boltzmann machine learning. Network, 4, 355-379.
  • Ghahramani, Z., & Hinton, G. E. (1996). Switching state-space models. University of Toronto Technical Report, Department of Computer Science.
  • Ghahramani, Z., & Jordan, M. I. (1997). Factorial hidden Markov models. Machine Learning, 29, 245-273.
  • Gilks, W., Thomas, A., & Spiegelhalter, D. (1994). A language and a program for complex Bayesian modelling. The Statistician, 43, 169-178.
  • Heckerman, D. (in press). A tutorial on learning with Bayesian networks. In M. I. Jordan (Ed.), Learning in Graphical Models. Norwell, MA: Kluwer Academic Publishers.
  • Henrion, M. (1991). Search-based methods to bound diagnostic probabilities in very large belief nets. Uncertainty and Artificial Intelligence: Proceedings of the Seventh Conference. San Mateo, CA: Morgan Kaufmann.
  • Hinton, G. E., & Sejnowski, T. (1986). Learning and relearning in Boltzmann machines. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel Distributed Processing, Volume 1. Cambridge, MA: MIT Press.
  • Hinton, G. E., & van Camp, D. (1993). Keeping neural networks simple by minimizing the description length of the weights. In Proceedings of the 6th Annual Workshop on Computational Learning Theory. New York: ACM Press.
  • Hinton, G. E., Dayan, P., Frey, B., & Neal, R. M. (1995). The wake-sleep algorithm for unsupervised neural networks. Science, 268, 1158-1161.
  • Hinton, G. E., Sallans, B., & Ghahramani, Z. (in press). A hierarchical community of experts. In M. I. Jordan (Ed.), Learning in Graphical Models. Norwell, MA: Kluwer Academic Publishers.
  • Horvitz, E. J., Suermondt, H. J., & Cooper, G. F. (1989). Bounded conditioning: Flexible inference for decisions under scarce resources. Uncertainty in Artificial Intelligence: Proceedings of the Fifth Conference. Mountain View, CA: Association for UAI.
  • Jaakkola, T. S., & Jordan, M. I. (1996). Computing upper and lower bounds on likelihoods in intractable networks. Uncertainty and Artificial Intelligence: Proceedings of the Twelfth Conference. San Mateo, CA: Morgan Kaufmann.
  • Jaakkola, T. S. (1997). Variational methods for inference and estimation in graphical models. Unpublished doctoral dissertation, Massachusetts Institute of Technology.
  • Jaakkola, T. S., & Jordan, M. I. (1997a). Recursive algorithms for approximating probabilities in graphical models. In M. C. Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9. Cambridge, MA: MIT Press.
  • Jaakkola, T. S., & Jordan, M. I. (1997b). Bayesian logistic regression: a variational approach. In D. Madigan & P. Smyth (Eds.), Proceedings of the 1997 Conference on Artificial Intelligence and Statistics.
  • Jaakkola, T. S., & Jordan, M. I. (1997c). Variational methods and the QMR-DT database. Submitted to: Journal of Artificial Intelligence Research.
  • Jaakkola, T. S., & Jordan, M. I. (in press). Improving the mean field approximation via the use of mixture distributions. In M. I. Jordan (Ed.), Learning in Graphical Models. Norwell, MA: Kluwer Academic Publishers.
  • Jensen, C. S., Kong, A., & Kjærulff, U. (1995). Blocking-Gibbs sampling in very large probabilistic expert systems. International Journal of Human-Computer Studies, 42, 647-666.
  • Jensen, F. V., & Jensen, F. (1994). Optimal junction trees. Uncertainty and Artificial Intelligence: Proceedings of the Tenth Conference. San Mateo, CA: Morgan Kaufmann.
  • Jensen, F. V. (1996). An Introduction to Bayesian Networks. London: UCL Press.
  • Jordan, M. I. (1994). A statistical approach to decision tree modeling. In M. Warmuth (Ed.), Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory. New York: ACM Press.
  • Jordan, M. I., Ghahramani, Z., & Saul, L. K. (1997). Hidden Markov decision trees. In M. C. Mozer, M. I. Jordan, & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9. Cambridge, MA: MIT Press.
  • Kanazawa, K., Koller, D., & Russell, S. (1995). Stochastic simulation algorithms for dynamic probabilistic networks. Uncertainty and Artificial Intelligence: Proceedings of the Eleventh Conference. San Mateo, CA: Morgan Kaufmann.
  • Kjærulff, U. (1990). Triangulation of graphs: algorithms giving small total state space.
  • Kjærulff, U. (1994). Reduction of computational complexity in Bayesian networks through removal of weak dependences. Uncertainty and Artificial Intelligence: Proceedings of the Tenth Conference. San Mateo, CA: Morgan Kaufmann.
  • MacKay, D. J. C. (1997a). Ensemble learning for hidden Markov models. Unpublished manuscript. Department of Physics, University of Cambridge.
  • MacKay, D. J. C. (1997b). Comparison of approximate methods for handling hyperparameters. Submitted to Neural Computation.
  • MacKay, D. J. C. (in press). Introduction to Monte Carlo methods. In M. I. Jordan (Ed.), Learning in Graphical Models. Norwell, MA: Kluwer Academic Publishers.
  • McEliece, R. J., MacKay, D. J. C., & Cheng, J.-F. (1996). Turbo decoding as an instance of Pearl's "belief propagation" algorithm. Submitted to: IEEE Journal on Selected Areas in Communications.
  • Merz, C. J., & Murphy, P. M. (1996). UCI repository of machine learning databases. [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, Department of Information and Computer Science.
  • Neal, R. (1992). Connectionist learning of belief networks. Artificial Intelligence, 56, 71-113.
  • Neal, R. (1993). Probabilistic inference using Markov chain Monte Carlo methods. University of Toronto Technical Report CRG-TR-93-1, Department of Computer Science.
  • Neal, R., & Hinton, G. E. (in press). A view of the EM algorithm that justifies incremental, sparse, and other variants. In M. I. Jordan (Ed.), Learning in Graphical Models. Norwell, MA: Kluwer Academic Publishers.
  • Parisi, G. (1988). Statistical Field Theory. Redwood City, CA: Addison-Wesley.
  • Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.
  • Peterson, C., & Anderson, J. R. (1987). A mean field theory learning algorithm for neural networks. Complex Systems, 1, 995-1019.
  • Rockafellar, R. (1972). Convex Analysis. Princeton, NJ: Princeton University Press.
  • Rustagi, J. (1976). Variational Methods in Statistics. New York: Academic Press.
  • Sakurai, J. (1985). Modern Quantum Mechanics. Redwood City, CA: Addison-Wesley.
  • Saul, L. K., & Jordan, M. I. (1994). Learning in Boltzmann trees. Neural Computation, 6, 1173-1183.
  • Saul, L. K., Jaakkola, T. S., & Jordan, M. I. (1996). Mean field theory for sigmoid belief networks. Journal of Artificial Intelligence Research, 4, 61-76.
  • Saul, L. K., & Jordan, M. I. (1996). Exploiting tractable substructures in intractable networks. In D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press.
  • Saul, L. K., & Jordan, M. I. (in press). A mean field learning algorithm for unsupervised neural networks. In M. I. Jordan (Ed.), Learning in Graphical Models. Norwell, MA: Kluwer Academic Publishers.
  • Seung, S. (1995). Annealed theories of learning. In J.-H. Oh, C. Kwon, & S. Cho (Eds.), Neural Networks: The Statistical Mechanics Perspectives. Singapore: World Scientific.
  • Shachter, R. D., Andersen, S. K., & Szolovits, P. (1994). Global conditioning for probabilistic inference in belief networks. Uncertainty and Artificial Intelligence: Proceedings of the Tenth Conference. San Mateo, CA: Morgan Kaufmann.
  • Shenoy, P. P. (1992). Valuation-based systems for Bayesian decision analysis. Operations Research, 40, 463-484.
  • Shwe, M. A., Middleton, B., Heckerman, D. E., Henrion, M., Horvitz, E. J., Lehmann, H. P., & Cooper, G. F. (1991). Probabilistic diagnosis using a reformulation of the INTERNIST-1/QMR knowledge base. Methods of Information in Medicine, 30, 241-255.
  • Smyth, P., Heckerman, D., & Jordan, M. I. (1997). Probabilistic independence networks for hidden Markov probability models. Neural Computation, 9, 227-270.
  • Waterhouse, S., MacKay, D. J. C., & Robinson, T. (1996). Bayesian methods for mixtures of experts. In D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), Advances in Neural Information Processing Systems 8. Cambridge, MA: MIT Press.
  • Williams, C. K. I., & Hinton, G. E. (1991). Mean field networks that learn to discriminate temporally distorted strings. In D. S. Touretzky, J. Elman, T. Sejnowski, & G. E. Hinton (Eds.), Proceedings of the 1990 Connectionist Models Summer School. San Mateo, CA: Morgan Kaufmann.