Learning The Parts Of Objects By Non-Negative Matrix Factorization
NATURE, no. 6755 (1999): 788-791
Is perception of the whole based on perception of its parts? There is psychological(1) and physiological(2,3) evidence for parts-based representations in the brain, and certain computational theories of object recognition rely on such representations(4,5). But little is known about how brains or computers might learn the parts of objects...
- The authors demonstrate an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text.
- The authors have applied non-negative matrix factorization (NMF), together with principal components analysis (PCA) and vector quantization (VQ), to a database of facial images.
- As shown in Fig. 1, all three methods learn to represent a face as a linear combination of basis images, but with qualitatively different results.
- Principal components analysis constrains the columns of W to be orthonormal and the rows of H to be orthogonal to each other. This relaxes the unary constraint of vector quantization, allowing a distributed representation in which each face is approximated by a linear combination of all the basis images, or eigenfaces.
- Unlike the unary constraint of vector quantization, the non-negativity constraints of NMF permit the combination of multiple basis images to represent a face.
- Application of vector quantization, principal components analysis or non-negative matrix factorization involves finding the approximate factorization of this matrix V ≈ WH into a feature set W and hidden variables H, in the same way as was done for faces.
- Non-negative matrix factorization was performed with the iterative algorithm described in Fig. 2, starting with random initial conditions for W and H.
- Vector quantization was done via the k-means algorithm, starting from random initial conditions for W and H.
- The NMF basis is radically different: its images are localized features that correspond better with intuitive notions of the parts of faces.
- An encoding consists of the coefficients by which a face is represented with a linear combination of basis images.
- This unary representation forces VQ to learn basis images that are prototypical faces.
- The non-negativity constraints are compatible with the intuitive notion of combining parts to form a whole, which is how NMF learns a parts-based representation.
- The exact form of the objective function is not as crucial as the non-negativity constraints for the success of NMF in learning parts.
- It is helpful to visualize the dependencies between image pixels and encoding variables in the form of the network shown in Fig. 3.
- Learning parts for these complex cases is likely to require fully hierarchical models with multiple levels of hidden variables, instead of the single level in NMF.
- Although non-negativity constraints may help such models to learn parts-based representations, the authors do not claim that they are sufficient in themselves.
- This results in a basis that is non-global; however, in this representation all the basis images are used in cancelling combinations to represent an individual face, so the encodings are not sparse.
- The NMF representation contains both a basis and encoding that are naturally sparse, in that many of the components are exactly equal to zero.
- The authors propose that the one-sided constraints on neural activity and synaptic strengths in the brain may be important for developing sparsely distributed, parts-based representations for perception.
- The authors acknowledge the support of Bell Laboratories and MIT.
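
The iterative factorization V ≈ WH described in the summary can be illustrated with a small sketch. The summary does not reproduce the update rules of Fig. 2, so the code below uses the standard multiplicative updates for the generalized Kullback-Leibler divergence; that choice is an assumption about the form of the algorithm, not a transcription of it. The function names `nmf` and `kl_divergence`, the parameters `r`, `n_iter`, `seed` and `eps`, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def kl_divergence(V, W, H, eps=1e-12):
    """Generalized KL divergence D(V || WH); zero only when WH equals V."""
    WH = W @ H + eps
    return float(np.sum(V * np.log((V + eps) / WH) - V + WH))

def nmf(V, r, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative (n, m) matrix V into W (n, r) and H (r, m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Random non-negative starting point for both factors.
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        # Multiplicative update for W: every factor is non-negative,
        # so W can never leave the non-negative orthant.
        W *= ((V / (W @ H + eps)) @ H.T) / (H.sum(axis=1) + eps)
        # Matching multiplicative update for H.
        H *= (W.T @ (V / (W @ H + eps))) / (W.sum(axis=0)[:, None] + eps)
    # Rescale so each basis column of W sums to one; H absorbs the scale,
    # so the product W @ H is unchanged.
    scale = W.sum(axis=0)
    return W / scale, H * scale[:, None]

# Toy usage (assumed data): a 20 x 30 non-negative matrix of rank 3.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))
W, H = nmf(V, r=3)
print(kl_divergence(V, W, H))
```

Because the updates only multiply by non-negative quantities, non-negativity is preserved without any explicit projection step, which is what allows the factors to combine additively rather than through cancelling terms.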
- 1. Palmer, S. E. Hierarchical structure in perceptual representation. Cogn. Psychol. 9, 441–474 (1977).
- 2. Wachsmuth, E., Oram, M. W. & Perrett, D. I. Recognition of objects and their component parts: responses of single units in the temporal cortex of the macaque. Cereb. Cortex 4, 509–522 (1994).
- 3. Logothetis, N. K. & Sheinberg, D. L. Visual object recognition. Annu. Rev. Neurosci. 19, 577–621 (1996).
- 4. Biederman, I. Recognition-by-components: a theory of human image understanding. Psychol. Rev. 94, 115–147 (1987).
- 5. Ullman, S. High-Level Vision: Object Recognition and Visual Cognition (MIT Press, Cambridge, MA, 1996).
- 6. Turk, M. & Pentland, A. Eigenfaces for recognition. J. Cogn. Neurosci. 3, 71–86 (1991).
- 7. Field, D. J. What is the goal of sensory coding? Neural Comput. 6, 559–601 (1994).
- 8. Foldiak, P. & Young, M. Sparse coding in the primate cortex. The Handbook of Brain Theory and Neural Networks 895–898 (MIT Press, Cambridge, MA, 1995).
- 9. Olshausen, B. A. & Field, D. J. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996).
- 10. Lee, D. D. & Seung, H. S. Unsupervised learning by convex and conic coding. Adv. Neural Info. Proc. Syst. 9, 515–521 (1997).
- 12. Nakayama, K. & Shimojo, S. Experiencing and perceiving visual surfaces. Science 257, 1357–1363 (1992).
- 13. Hinton, G. E., Dayan, P., Frey, B. J. & Neal, R. M. The "wake-sleep" algorithm for unsupervised neural networks. Science 268, 1158–1161 (1995).
- 14. Salton, G. & McGill, M. J. Introduction to Modern Information Retrieval (McGraw-Hill, New York, 1983).
- 15. Landauer, T. K. & Dumais, S. T. The latent semantic analysis theory of knowledge. Psychol. Rev. 104, 211–240 (1997).
- 16. Jutten, C. & Herault, J. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture. Signal Proc. 24, 1–10 (1991).
- 17. Bell, A. J. & Sejnowski, T. J. An information maximization approach to blind separation and blind deconvolution. Neural Comput. 7, 1129–1159 (1995).
- 18. Bartlett, M. S., Lades, H. M. & Sejnowski, T. J. Independent component representations for face recognition. Proc. SPIE 3299, 528–539 (1998).
- 19. Shepp, L. A. & Vardi, Y. Maximum likelihood reconstruction for emission tomography. IEEE Trans. Med. Imaging. 2, 113–122 (1982).
- 20. Richardson, W. H. Bayesian-based iterative method of image restoration. J. Opt. Soc. Am. 62, 55–59 (1972).
- 21. Lucy, L. B. An iterative technique for the rectification of observed distributions. Astron. J. 74, 745–754 (1974).
- 22. Dempster, A. P., Laird, N. M. & Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. J. Royal Stat. Soc. 39, 1–38 (1977).