SMILe: Shuffled Multiple-Instance Learning

AAAI, 2013.


Abstract:

Resampling techniques such as bagging are often used in supervised learning to produce more accurate classifiers. In this work, we show that multiple-instance learning admits a different form of resampling, which we call “shuffling.” In shuffling, we resample instances in such a way that the resulting bags are likely to be correctly labeled. …

Introduction
  • Consider a task such as content-based image retrieval (CBIR) (Maron and Ratan 1998), where the authors wish to automatically retrieve images from a database based on a small set of images labeled “interesting” or otherwise by a user.
  • The authors do not know precisely what the user was interested in, so one approach is to segment an image into its component objects and assign the label “interesting” or “uninteresting” to the entire set of segments, treating each image as a “bag” of instances.
  • From such data, a retrieval system needs to learn a classifier that can correctly retrieve new interesting images (a minimal sketch of this bag-of-instances representation follows this list).
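To make the setting concrete, here is a minimal sketch in Python/NumPy of how a CBIR task maps onto multiple-instance data; the feature dimension, bag sizes, and labels are made up for illustration.

```python
import numpy as np

# Hypothetical MI dataset for CBIR: each bag is an image, each instance is a
# feature vector for one segment (dimensions and labels are invented here).
rng = np.random.default_rng(0)
bags = [rng.normal(size=(n_segments, 5)) for n_segments in (3, 4, 2)]
bag_labels = np.array([1, 0, 1])  # 1 = "interesting", 0 = "uninteresting"

# The standard MI assumption: a bag is positive iff at least one of its
# instances is positive. Instance labels are never observed by the learner;
# only bag labels are available.
for X, y in zip(bags, bag_labels):
    print(f"bag with {X.shape[0]} segments, label {y}")
```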
Highlights
  • Consider a task such as content-based image retrieval (CBIR) (Maron and Ratan 1998), where we wish to automatically retrieve images from a database based on a small set of images labeled “interesting” or otherwise by a user.
  • We show that, in practice, Shuffled Multiple-Instance Learning (SMILe) significantly improves accuracy for MI classification and MI active learning.
  • Although SMILe is a resampling technique similar to bagging, we hypothesize that by resampling instances rather than bags, SMILe can provide additional information in the form of instance-label constraints.
  • We have presented a new resampling technique for MI learning, called SMILe.
  • We show that SMILe can significantly improve the performance of an MI classifier, and that it is well-suited for domains such as active learning, in which few initial labeled bags are available to a classifier.
Results
  • As discussed in the analysis above, SMILe is a resampling technique similar to bagging; the authors hypothesize that by resampling instances rather than bags, SMILe can provide additional information in the form of instance-label constraints.
  • To demonstrate this effect, the MI bagging algorithm described in prior work (Zhou and Zhang 2003) is implemented and used to produce the results in Figure 4 (a rough sketch of that baseline follows this list).
  • The authors show that SMILe can significantly improve the performance of an MI classifier, and that it is well-suited for domains such as active learning, in which few initial labeled bags are available to a classifier.
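For contrast with SMILe, the following is a minimal sketch of bag-level MI bagging in the spirit of Zhou and Zhang (2003): bootstrap replicas are drawn over whole bags, and an ensemble of classifiers votes. This is an illustrative reconstruction, not the implementation behind Figure 4; `train_mi_classifier` is a hypothetical stand-in for any MI learner.

```python
import numpy as np

def mi_bagging(bags, bag_labels, train_mi_classifier, n_replicas=10, seed=0):
    """Bag-level bagging sketch: resample whole bags with replacement.

    `train_mi_classifier(bags, labels)` is a hypothetical stand-in that
    returns a predictor mapping a bag to a {0, 1} label.
    """
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_replicas):
        idx = rng.integers(0, len(bags), size=len(bags))  # bootstrap over bags
        replica = [bags[i] for i in idx]
        models.append(train_mi_classifier(replica, bag_labels[idx]))

    def predict(bag):
        votes = sum(model(bag) for model in models)  # majority vote
        return int(2 * votes > len(models))

    return predict
```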
Conclusion
  • The authors have presented a new resampling technique for MI learning, called SMILe. The approach works by resampling instances within positive bags to generate new bags that are positive with high probability (a minimal sketch of this shuffling step follows this list).
  • In addition to the variance reduction that many resampling techniques afford, SMILe can introduce additional information, in the form of instance-label constraints, to an MI classifier.
  • The authors show that SMILe can significantly improve the performance of an MI classifier, and that it is well-suited for domains such as active learning, in which few initial labeled bags are available to a classifier.
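The following is a minimal sketch of that shuffling step under stated assumptions: new bags are drawn from the pool of instances in positive bags and labeled positive. The shuffled-bag size `s` and uniform sampling with replacement are illustrative choices here; the paper's analysis governs how such parameters should be set so that shuffled bags are correctly labeled with high probability.

```python
import numpy as np

def smile_shuffle(bags, bag_labels, n_new_bags, s, seed=0):
    """Sketch of SMILe-style shuffling: resample instances from positive bags.

    Pools all instances from positive bags, then draws `n_new_bags` new bags
    of `s` instances each. If positive instances make up a nonzero fraction
    of the pool, a large enough shuffled bag is positive with high
    probability, so each new bag is labeled positive.
    """
    rng = np.random.default_rng(seed)
    pool = np.vstack([bag for bag, y in zip(bags, bag_labels) if y == 1])
    new_bags = [pool[rng.integers(0, len(pool), size=s)]
                for _ in range(n_new_bags)]
    new_labels = np.ones(n_new_bags, dtype=int)
    return new_bags, new_labels
```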
Related work
  • In the standard supervised learning setting, resampling and ensemble approaches such as bagging (Breiman 1996) and boosting (Freund and Schapire 1996) can improve classifier performance, for example by reducing the variance of algorithms that are unstable under small changes in the training set. Prior work extends bagging to the MI setting by resampling entire bags to create bootstrap replicas of the training set B (Zhou and Zhang 2003). Similarly, boosting has been extended to the MI setting by iteratively training a weak classifier on a set of weighted bags (Auer and Ortner 2004). SMILe differs from previous approaches by resampling at the instance level rather than at the bag level. Furthermore, unlike previous MI resampling techniques, SMILe is not formulated as an ensemble method; it uses resampling to construct a single augmented dataset on which any MI classifier can be trained (the usage sketch below contrasts the two styles). Future work will explore combining instance-level resampling, as in SMILe, with an ensemble approach such as bagging.
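To make that distinction concrete, here is how the two styles differ in use, reusing the hypothetical helpers sketched above: SMILe augments the training data once and trains a single classifier, while MI bagging trains an ensemble.

```python
# SMILe: one augmented dataset, one classifier.
extra_bags, extra_labels = smile_shuffle(bags, bag_labels, n_new_bags=20, s=8)
model = train_mi_classifier(bags + extra_bags,
                            np.concatenate([bag_labels, extra_labels]))

# MI bagging: an ensemble of classifiers, each trained on a bag-level bootstrap.
ensemble = mi_bagging(bags, bag_labels, train_mi_classifier, n_replicas=10)
```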
Funding
  • Doran was supported by GAANN grant P200A090265 from the US Department of Education
  • Ray was partially supported by CWRU award OSA110264
References
  • Andrews, S.; Tsochantaridis, I.; and Hofmann, T. 2003. Support vector machines for multiple-instance learning. In Advances in Neural Information Processing Systems, 561–568.
  • Auer, P., and Ortner, R. 2004. A boosting approach to multiple instance learning. In Machine Learning: ECML 2004, volume 3201 of Lecture Notes in Computer Science. Springer. 63–74.
  • Blockeel, H.; Page, D.; and Srinivasan, A. 2005. Multi-instance tree learning. In Proceedings of the 22nd International Conference on Machine Learning, 57–64.
  • Breiman, L. 1996. Bagging predictors. Machine Learning 24(2):123–140.
  • Cohn, D. A.; Ghahramani, Z.; and Jordan, M. I. 1996. Active learning with statistical models. Journal of Artificial Intelligence Research 4:129–145.
  • Dietterich, T. G.; Lathrop, R. H.; and Lozano-Perez, T. 1997. Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence 89(1–2):31–71.
  • Freund, Y., and Schapire, R. E. 1996. Experiments with a new boosting algorithm. In Proceedings of the 13th International Conference on Machine Learning, 148–156.
  • Gartner, T.; Flach, P.; Kowalczyk, A.; and Smola, A. 2002. Multi-instance kernels. In Proceedings of the 19th International Conference on Machine Learning, 179–186.
  • Liu, D.; Hua, X.; Yang, L.; and Zhang, H. 2009. Multiple-instance active learning for image categorization. In Advances in Multimedia Modeling, 239–249.
  • Maron, O., and Ratan, A. L. 1998. Multiple-instance learning for natural scene classification. In Proceedings of the 15th International Conference on Machine Learning, 341–349.
  • Maron, O. 1998. Learning from Ambiguity. Ph.D. Dissertation, Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA.
  • Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12:2825–2830.
  • Ramon, J., and Raedt, L. D. 2000. Multi instance neural networks. In Proceedings of the ICML 2000 Workshop on Attribute-Value and Relational Learning.
  • Ray, S., and Craven, M. 2005. Supervised versus multiple instance learning: an empirical comparison. In Proceedings of the 22nd International Conference on Machine Learning, 697–704.
  • Settles, B.; Craven, M.; and Ray, S. 2008. Multiple-instance active learning. In Advances in Neural Information Processing Systems, 1289–1296.
  • Tong, S., and Koller, D. 2002. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research 2:45–66.
  • Xu, X., and Frank, E. 2004. Logistic regression and boosting for labeled bags of instances. In Proceedings of the 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 272–281.
  • Zhang, Q., and Goldman, S. 2001. EM-DD: An improved multiple-instance learning technique. In Advances in Neural Information Processing Systems, 1073–1080.
  • Zhou, Z.-H., and Zhang, M.-L. 2002. Neural networks for multi-instance learning. In Proceedings of the International Conference on Intelligent Information Technology.
  • Zhou, Z., and Zhang, M. 2003. Ensembles of multi-instance learners. In Machine Learning: ECML 2003, volume 2837 of Lecture Notes in Computer Science. Springer. 492–502.
  • Zhou, Z.; Sun, Y.; and Li, Y. 2009. Multi-instance learning by treating instances as non-IID samples. In Proceedings of the 26th International Conference on Machine Learning, 1249–1256.
Best Paper of AAAI, 2013