Unsupervised Learning of Models for Recognition

ECCV, (2000): 18-32


Abstract

We present a method to learn object class models from unlabeled and unsegmented cluttered scenes for the purpose of visual object recognition. We focus on a particular type of model where objects are represented as flexible constellations of rigid parts (features). The variability within a class is represented by a joint probability density...

Introduction
  • Introduction and Related Work

    The authors are interested in the problem of recognizing members of object classes, where the authors define an object class as a collection of objects which share characteristic features or parts that are visually similar and occur in similar spatial configurations.
  • Part selection: Which object parts are distinctive and stable?
  • Features in training images might need to be hand-labeled.
  • Oftentimes training images showing objects in front of a uniform background are required.
  • Objects might need to be positioned in the same way throughout the training images so that a common reference frame can be established.
  • An important difference from Burl et al. is that the authors model the positions of the background parts through a uniform density, whereas Burl et al. used a Gaussian with large covariance.
  • The probability distribution of the number of background parts, which Burl et al. ignored, is modeled here as a Poisson distribution.
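The background model described in the last two points can be sketched in a few lines. This is a minimal illustration only, assuming a single part type, background positions that are independent and uniform over an image of area `area`, and a Poisson mean `mean_count`. The function name and its parameters are hypothetical and are not taken from the paper.

```python
import math


def background_log_prob(num_parts, mean_count, area):
    """Log-density of observing `num_parts` background detections.

    Assumptions (illustrative, not from the paper):
      - the number of background parts is Poisson with mean `mean_count`
      - each background position is uniform over the image, density 1 / area
    """
    if num_parts < 0:
        raise ValueError("num_parts must be non-negative")
    if mean_count <= 0 or area <= 0:
        raise ValueError("mean_count and area must be positive")

    # log Poisson(n; m) = n * log(m) - m - log(n!)
    log_count = (num_parts * math.log(mean_count)
                 - mean_count
                 - math.lgamma(num_parts + 1))

    # Uniform position density: (1 / area) ** n
    log_positions = -num_parts * math.log(area)

    return log_count + log_positions
```

With a uniform density every background position is equally likely, so only the number of background detections changes the value.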
Highlights
  • Introduction and Related Work

    We are interested in the problem of recognizing members of object classes, where we define an object class as a collection of objects which share characteristic features or parts that are visually similar and occur in similar spatial configurations
  • Segmentation or registration of training images: Which objects are to be recognized and where do they appear in the training images?
  • Part selection: Which object parts are distinctive and stable?
  • Estimation of model parameters: What are the parameters of the global geometry or shape and of the appearance of the individual parts that best describe the training data?
  • In order to reduce sensitivity to noise due to the limited number of training images and to average across all possible values for the decision threshold, we used the area under the receiver operating characteristics curve as a measure of the classification performance driving the optimization of the model configuration
  • We have presented ideas for learning object models in an unsupervised setting
  • We have demonstrated that our model learning algorithm works successfully on two different data sets: frontal views of faces and rear views of motor-cars
  • While training is computationally expensive, detection is efficient, requiring less than a second in our C-Matlab implementation. This suggests that training should be seen as an off-line process, while detection may be implemented in real-time
Methods
  • In order to validate the method, the authors tested the performance, under the classification task described in Sect. 3.3, on two data sets: images of rear views of cars and images of human faces.
  • As mentioned in Sec. 3, the experiments described below have been performed with a translation invariant extension of the learning method.
  • All parameters of the learning algorithm were set to the same values in both experiments.
  • [Figure: 1 − A_ROC in percent; panel "Cars", curve "Test"; series labels C, B, A]
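The summary does not spell out how the translation-invariant extension is built. One common construction, shown here purely as an assumption, is to express every part position relative to a reference part, so that shifting the whole object leaves the shape description unchanged. The function name and the choice of the first part as reference are hypothetical.

```python
def to_translation_invariant(positions):
    """Return part positions relative to the first (reference) part.

    `positions` is a list of (x, y) tuples, one per part. The reference part
    maps to the origin and is dropped, so the result has one fewer entry.
    This is an illustrative construction, not necessarily the paper's.
    """
    if not positions:
        raise ValueError("at least one part position is required")
    x0, y0 = positions[0]
    return [(x - x0, y - y0) for x, y in positions[1:]]


# Shifting every part by the same offset gives the same relative description.
original = [(10, 20), (13, 24), (7, 20)]
shifted = [(x + 100, y - 50) for x, y in original]
same_shape = to_translation_invariant(original) == to_translation_invariant(shifted)
```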
Results
  • Instead of classifying every image by applying a fixed decision threshold according to (3), the authors computed receiver operating characteristics (ROCs) based on the ratio of posterior probabilities.
  • In order to reduce sensitivity to noise due to the limited number of training images and to average across all possible values for the decision threshold, the authors used the area under the ROC curve as a measure of the classification performance driving the optimization of the model configuration.
  • Features along the hairline turned out to be very stable, while parts containing noses were almost never used in the models
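The area under the ROC curve used as the performance measure can be computed without choosing any threshold, because it equals the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The sketch below assumes each image already has a scalar score (for example, a ratio of posterior probabilities). The function name and the example scores are hypothetical.

```python
def roc_area(pos_scores, neg_scores):
    """Area under the ROC curve from per-image scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    image scores higher; ties count as half. This pairwise form is equivalent
    to integrating the ROC curve over all decision thresholds.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: images with the object vs. background-only images.
area = roc_area([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
error = 1.0 - area  # the "1 - A_ROC" style of error measure
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two sets.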
Conclusion
  • Discussion and Future Work

    The authors have presented ideas for learning object models in an unsupervised setting.
  • A set of unsegmented and unlabeled images containing examples of objects amongst clutter is supplied; the algorithm automatically selects distinctive parts of the object class, and learns the joint probability density function encoding the object’s appearance.
  • This allows the automatic construction of an efficient object detector which is robust to clutter and occlusion.
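  • While training is computationally expensive, detection is efficient, requiring less than a second in the authors' C-Matlab implementation. This suggests that training should be seen as an off-line process, while detection may be implemented in real time.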
Funding
  • This work was funded by the NSF Engineering Research Center for Neuromorphic Systems Engineering (CNSE) at Caltech (NSF9402726), and an NSF National Young Investigator Award to P.P. (NSF9457618).
  • M. Welling was supported by the Sloan Foundation. We are also very grateful to Rob Fergus for helping with collecting the databases and to Thomas Leung, Mike Burl, Jitendra Malik and David Forsyth for many helpful comments.
Reference
  • Y. Amit and D. Geman. A computational model for visual selection. Neural Computation, 11(7):1691–1715, 1999.
  • M.C. Burl, T.K. Leung, and P. Perona. Face localization via shape statistics. In Int. Workshop on Automatic Face and Gesture Recognition, 1995.
  • M.C. Burl, T.K. Leung, and P. Perona. Recognition of planar object classes. In Proc. IEEE Comput. Soc. Conf. Comput. Vision and Pattern Recogn., 1996.
  • M.C. Burl, M. Weber, and P. Perona. A probabilistic approach to object recognition using local photometry and global geometry. In Proc. ECCV’98, pages 628–641, 1998.
  • T.F. Cootes and C.J. Taylor. Locating objects of varying shape using statistical feature detectors. In European Conf. on Computer Vision, pages 465–474, 1996.
  • A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B, 39:1–38, 1976.
  • R.O. Duda and P.E. Hart. Pattern Classification and Scene Analysis. John Wiley and Sons, Inc., 1973.
  • G.J. Edwards, T.F. Cootes, and C.J. Taylor. Face recognition using active appearance models. In Proc. 5th Europ. Conf. Comput. Vision, H. Burkhardt and B. Neumann (Eds.), LNCS Series Vol. 1406–1407, Springer-Verlag, pages 581–595, 1998.
  • R.M. Haralick and L.G. Shapiro. Computer and Robot Vision II. Addison-Wesley, 1993.
  • M. Lades, J.C. Vorbruggen, J. Buhmann, J. Lange, C. v.d. Malsburg, R.P. Wurtz, and W. Konen. Distortion invariant object recognition in the dynamic link architecture. IEEE Trans. Comput., 42(3):300–311, Mar 1993.
  • T.K. Leung, M.C. Burl, and P. Perona. Finding faces in cluttered scenes using random labeled graph matching. In Proc. 5th Int. Conf. Computer Vision, pages 637–644, June 1995.
  • T.K. Leung, M.C. Burl, and P. Perona. Probabilistic affine invariants for recognition. In Proc. IEEE Comput. Soc. Conf. Comput. Vision and Pattern Recogn., pages 678–684, 1998.
  • T.K. Leung and J. Malik. Recognizing surfaces using three-dimensional textons. In Proc. 7th Int. Conf. Computer Vision, pages 1010–1017, 1999.
  • K.N. Walker, T.F. Cootes, and C.J. Taylor. Locating salient facial features. In Int. Conf. on Automatic Face and Gesture Recognition, Nara, Japan, 1998.
  • A.L. Yuille. Deformable templates for face recognition. J. of Cognitive Neurosci., 3(1):59–70, 1991.