A Multi-Task Neural Approach for Emotion Attribution, Classification, and Summarization

IEEE Transactions on Multimedia, pp. 148-159, 2020.

DOI: https://doi.org/10.1109/TMM.2019.2922129

Abstract:

Emotional content is a crucial ingredient in user-generated videos. However, emotions are expressed only sparsely in user-generated videos, which makes automatic emotion analysis difficult. In this paper, we propose a new neural approach---the Bi-stream Emotion Attribution-Classification Network (BEAC-Net)---to solve three related emotion analysis tasks…

Introduction
  • The explosive growth of user-generated video has created great demand for computational understanding of visual data and attracted significant research attention in the multimedia community.
  • The ingredients that form emotions include the interaction among cognitive processes, the temporal succession of appraisals, and coping behaviors [21], [25].
  • This may have inspired computational work such as DeepSentiBank [26] and zero-shot emotion recognition [14], which broaden the emotion categories that can be recognized.
Highlights
  • The explosive growth of user-generated video has created great demand for computational understanding of visual data and attracted significant research attention in the multimedia community
  • Extending our earlier work [17], we propose a multi-task neural architecture, the Bi-stream Emotion Attribution-Classification Network (BEAC-Net), which tackles both emotion attribution and classification at the same time, thereby allowing the related tasks to reinforce each other.
  • We propose a novel two-stream neural architecture that employs the emotion segment selected by the attribution network in combination with the original video.
  • We suggest that the ability to locate emotional content is crucial for accurate emotion understanding.
  • We present a multi-task neural network with a novel bi-stream architecture, called the Bi-stream Emotion Attribution-Classification Network (BEAC-Net).
  • The attribution network locates the emotional content, which is processed in parallel with the original video within the bi-stream architecture; a minimal sketch of this fusion appears below.
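
  The bi-stream design pairs two inputs: the emotion segment chosen by the attribution network and the original video. The following sketch shows one way such a fusion classifier could look in PyTorch; the feature dimensions, layer sizes, and concatenation-based fusion are illustrative assumptions rather than the authors' exact configuration.

      import torch
      import torch.nn as nn

      class BiStreamClassifier(nn.Module):
          """Toy two-stream fusion: segment features + whole-video features."""
          def __init__(self, feat_dim=4096, hidden_dim=512, num_emotions=6):
              super().__init__()
              # Stream 1: features of the emotion segment picked by the attribution network.
              self.segment_stream = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
              # Stream 2: features of the original (full) video, keeping global context.
              self.context_stream = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
              # Concatenation fusion followed by the emotion classifier.
              self.classifier = nn.Linear(2 * hidden_dim, num_emotions)

          def forward(self, segment_feat, video_feat):
              s = self.segment_stream(segment_feat)              # (batch, hidden_dim)
              c = self.context_stream(video_feat)                # (batch, hidden_dim)
              return self.classifier(torch.cat([s, c], dim=1))   # emotion logits

      logits = BiStreamClassifier()(torch.randn(2, 4096), torch.randn(2, 4096))  # shape (2, 6)
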
Methods
  • The authors conduct experiments on two video emotion datasets based on Ekman’s six basic emotions.
  • The Emotion6 Video Dataset.
  • The Emotion6 dataset [64] contains 1980 images that are labeled with a distribution over 6 basic emotions and a neutral category.
  • The images do not contain facial expressions or text directly associated with emotions.
  • The authors consider the emotion category with the highest annotated probability as the dominant emotion, as illustrated below.
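
  A minimal illustration of this dominant-emotion rule, assuming a toy probability vector over Ekman's six basic emotions plus neutral (the label order here is an assumption, not the dataset's canonical one):

      import numpy as np

      # Assumed label order, for illustration only.
      EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise", "neutral"]
      distribution = np.array([0.05, 0.10, 0.05, 0.55, 0.10, 0.10, 0.05])  # toy annotation
      dominant = EMOTIONS[int(np.argmax(distribution))]
      print(dominant)  # joy
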
Results
  • Fewer than 1% of the videos contain fewer than 100 frames, so padding is rarely necessary; a padding sketch follows.
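
  A small sketch of padding frame-level features to a fixed length of 100 frames. Repeating the last frame is an assumed padding strategy used here for illustration; how frames are selected for longer videos is not shown.

      import numpy as np

      def pad_to_length(frames, target_len=100):
          """frames: array of shape (num_frames, feat_dim) holding per-frame features."""
          if len(frames) >= target_len:
              return frames  # long enough; frame selection is handled elsewhere
          pad = np.repeat(frames[-1:], target_len - len(frames), axis=0)
          return np.concatenate([frames, pad], axis=0)

      print(pad_to_length(np.random.rand(37, 4096)).shape)  # (100, 4096)
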
Conclusion
  • Computational understanding of emotions in user-generated video content is a challenging task due to the sparsity of emotional content, the presence of multiple emotions, and the variable quality of user-generated videos.
  • The authors suggest that the ability to locate emotional content is crucial for accurate emotion understanding.
  • Toward this end, the authors present a multi-task neural network with a novel bi-stream architecture, called Bi-stream Emotion Attribution-Classification Network (BEAC-Net).
  • An ablation study shows that the bi-stream architecture provides significant benefits for emotion recognition and that the proposed emotion attribution network outperforms traditional temporal attention.
Tables
  • Table 1: Emotion recognition results
  • Table 2: Transfer learning: out-of-domain BEAC-Net fine-tuned on 20%
  • Table 3: Classification accuracy with different proportions of frames
Funding
  • This work was supported in part by NSFC Projects (61572134, 61572138, U1611461), Shanghai Sailing Program (17YF1427500), Fudan University-CIOMP Joint Fund (FC2017-006), STCSM Project (16JC1420400), Shanghai Municipal Science and Technology Major Project (2017SHZDZX01, 2018SHZDZX01), and ZJLab.
Reference
  • A. R. Damasio, Descartes error: Emotion, reason and the human brain. New York: Avon Books, 1994.
  • G. L. Clore and J. E. Palmer, “Affective guidance of intelligent agents: How emotion controls cognition,” Cognitive Systems Research, no. 1, pp. 21–30, 2009.
  • R. E. Guadagno, D. M. Rempala, S. Murphy, and B. M. Okdie, “What makes a video go viral? an analysis of emotional contagion and internet memes,” Computers in Human Behavior, vol. 29, no. 6, pp. 2312–2319, 2013.
  • K. Yadati, H. Katti, and M. Kankanhalli, “CAVVA: Computational affective video-in-video advertising,” IEEE Transactions on Multimedia, vol. 16, no. 1, 2014.
  • N. Ikizler-Cinbis and S. Sclaroff, “Web-based classifiers for human action recognition,” IEEE Transactions on Multimedia, vol. 14, pp. 1031– 1045, Aug 2012.
  • W. Xu, Z. Miao, X. P. Zhang, and Y. Tian, “A hierarchical spatiotemporal model for human activity recognition,” IEEE Transactions on Multimedia, vol. 19, pp. 1494–1509, July 2017.
  • K. Somandepalli, N. Kumar, T. Guha, and S. S. Narayanan, “Unsupervised discovery of character dictionaries in animation movies,” IEEE Transactions on Multimedia, vol. PP, no. 99, pp. 1–1, 2017.
  • H. Joho, J. M. Jose, R. Valenti, and N. Sebe, “Exploiting facial expressions for affective video summarisation,” in Proc. ACM conference on Image and Video Retrieval, 2009.
  • S. Zhao, H. Yao, X. Sun, P. Xu, X. Liu, and R. Ji, “Video indexing and recommendation based on affective analysis of viewers,” in Proceedings of the 19th ACM international conference on Multimedia, 2011.
  • Q. Zhen, D. Huang, Y. Wang, and L. Chen, “Muscular movement model-based automatic 3D/4D facial expression recognition,” IEEE Transactions on Multimedia, vol. 18, pp. 1438–1450, July 2016.
  • M. Liu, S. Shan, R. Wang, and X. Chen, “Learning expressionlets on spatio-temporal manifold for dynamic facial expression recognition,” in Proceedings of the Conference on Computer Vision and Pattern Recognition, 2014.
  • X. Alameda-Pineda, E. Ricci, Y. Yan, and N. Sebe, “Recognizing emotions from abstract paintings using non-linear matrix completion,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5240–5248, 2016.
  • A. Yazdani, K. Kappeler, and T. Ebrahimi, “Affective content analysis of music video clips,” in Proc. 1st ACM workshop Music information retrieval with user-centered and multimodal strategies, 2011.
  • B. Xu, Y. Fu, Y.-G. Jiang, B. Li, and L. Sigal, “Heterogeneous knowledge transfer in video emotion recognition, attribution and summarization,” IEEE Transactions on Affective Computing, 2017.
  • Y. Jiang, B. Xu, and X. Xue, “Predicting emotions in user-generated videos,” in The AAAI Conference on Artificial Intelligence, 2014.
  • B. Xu, Y. Fu, Y.-G. Jiang, B. Li, and L. Sigal, “Video emotion recognition with transferred deep feature encodings,” in Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR), 2016.
  • J. Gao, Y. Fu, Y.-G. Jiang, and X. Xue, “Frame-transformer emotion classification network,” in Proceedings of the 2017 ACM International Conference on Multimedia Retrieval, 2017.
  • P. Ekman, “Universals and cultural differences in facial expressions of emotion,” Nebraska Symposium on Motivation, vol. 19, pp. 207–284, 1972.
  • P. Ekman, “Basic emotions,” in Handbook of Cognition and Emotion, 1999.
  • R. Plutchik and H. Kellerman, Emotion: Theory, research and experience. Vol. 1, Theories of emotion. Academic Press, 1980.
  • J. J. Gross, “Emotion regulation: Affective, cognitive, and social consequences,” Psychophysiology, vol. 39, no. 3, pp. 281–291, 2002.
  • L. F. Barrett, “Are emotions natural kinds?,” Perspectives on Psychological Science, vol. 1, no. 1, pp. 28–58, 2006.
  • K. A. Lindquist, E. H. Siegel, K. S. Quigley, and L. F. Barrett, “The hundred-year emotion war: Are emotions natural kinds or psychological constructions? Comment on Lench, Flores, and Bench (2011),” Psychological Bulletin, no. 1, pp. 255–263, 2013.
  • L. Nummenmaa, E. Glerean, R. Hari, and J. K. Hietanen, “Bodily maps of emotions,” Proceedings of the National Academy of Sciences of the United States of America, vol. 111, no. 2, pp. 646–651, 2013.
  • B. Li, “A dynamic and dual-process theory of humor,” in The 3rd Annual Conference on Advances in Cognitive Systems, pp. 57–74, 2015.
  • T. Chen, D. Borth, T. Darrell, and S.-F. Chang, “DeepSentiBank: Visual sentiment concept classification with deep convolutional neural networks,” CoRR, 2014.
  • J. A. Russell, “A circumplex model of affect,” Journal of Personality and Social Psychology, vol. 39, no. 6, pp. 1161–1178, 1980.
  • J. R. Fontaine, K. R. Scherer, E. B. Roesch, and P. C. Ellsworth, “The world of emotions is not two-dimensional,” Psychological Science, vol. 18, no. 12, 2007.
  • H. Lovheim, “A new three-dimensional model for emotions and monoamine neurotransmitters,” Medical Hypotheses, vol. 78, no. 2, pp. 341–348, 2012.
  • S. Chen, Q. Jin, J. Zhao, and S. Wang, “Multimodal multi-task learning for dimensional and continuous emotion recognition,” in Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, pp. 19– 26, 2017.
  • J. Huang, Y. Li, J. Tao, Z. Lian, Z. Wen, M. Yang, and J. Yi, “Continuous multimodal emotion prediction based on long short term memory recurrent neural network,” in AVEC’17 Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, pp. 11–18, 2017.
  • Y. Baveye, E. Dellandrea, C. Chamaret, and L. Chen, “LIRIS-ACCEDE: A video database for affective content analysis,” IEEE Transactions on Affective Computing, vol. 6, no. 1, pp. 43–55, 2015.
  • S. Benini, L. Canini, and R. Leonardi, “A connotative space for supporting movie affective recommendation,” IEEE Transactions on Multimedia, vol. 13, no. 6, pp. 1356–1370, 2011.
  • J. Machajdik and A. Hanbury, “Affective image classification using features inspired by psychology and art theory,” in Proceedings of the 18th ACM international conference on Multimedia, pp. 83–92, 2010.
  • X. Lu, P. Suryanarayan, R. B. Adams, J. Li, M. G. Newman, and J. Z. Wang, “On shape and the computability of emotions,” in Proceedings of the 20th ACM international conference on Multimedia, 2012.
  • B. Jou, S. Bhattacharya, and S.-F. Chang, “Predicting viewer perceived emotions in animated GIFs,” in Proceedings of the 22nd ACM international conference on Multimedia, 2014.
  • W. Hu, X. Ding, B. Li, J. Wang, Y. Gao, F. Wang, and S. Maybank, “Multi-perspective cost-sensitive context-aware multi-instance sparse coding and its application to sensitive video recognition,” IEEE Transactions on Multimedia, vol. 18, no. 1, 2016.
  • Y. Song, L.-P. Morency, and R. Davis, “Learning a sparse codebook of facial and body microexpressions for emotion recognition,” in Proceedings of the 15th ACM International conference on multimodal interaction, 2013.
  • B. Schuller, G. Rigoll, and M. Lang, “Hidden Markov model-based speech emotion recognition,” in Proceedings of the 2003 International Conference on Multimedia and Expo (ICME ’03), vol. 2, pp. 401–404, IEEE Computer Society, 2003.
  • Q. Mao, M. Dong, Z. Huang, and Y. Zhan, “Learning salient features for speech emotion recognition using convolutional neural networks,” IEEE Transactions on Multimedia, vol. 16, no. 8, pp. 2203–2213, 2014.
  • S. Zhang, S. Zhang, T. Huang, and W. Gao, “Speech emotion recognition using deep convolutional neural network and discriminant temporal pyramid matching,” IEEE Transactions on Multimedia, vol. 20, no. 6, pp. 1576–1590, 2018.
  • H.-L. Wang and L.-F. Cheong, “Affective understanding in film,” IEEE Transactions on Circuits and Systems for Video Technology, 2006.
  • Z. Zeng, J. Tu, M. Liu, T. S. Huang, B. Pianfetti, D. Roth, and S. Levinson, “Audio-visual affect recognition,” IEEE Transactions on multimedia, vol. 9, no. 2, pp. 424–428, 2007.
  • E. Acar, F. Hopfgartner, and S. Albayrak, “A comprehensive study on mid-level representation and ensemble learning for emotional analysis of video material,” Multimedia Tools and Applications, vol. 76, pp. 1–29, 2016.
  • L. Pang, S. Zhu, and C.-W. Ngo, “Deep multimodal learning for affective analysis and retrieval,” IEEE Transactions on Multimedia, vol. 17, no. 11, 2015.
  • S. E. Kahou, C. Pal, X. Bouthillier, P. Froumenty, C. Gulcehre, R. Memisevic, P. Vincent, A. Courville, Y. Bengio, R. C. Ferrari, et al., “Combining modality specific deep neural networks for emotion recognition in video,” in Proceedings of the 15th ACM International conference on multimodal interaction, pp. 543–550, ACM, 2013.
  • Q. You, J. Luo, H. Jin, and J. Yang, “Robust image sentiment analysis using progressively trained and domain transferred deep networks,” in AAAI, 2015.
  • D. Borth, R. Ji, T. Chen, T. M. Breuel, and S.-F. Chang., “Large-scale visual sentiment ontology and detectors using adjective noun pairs,” in Proceedings of the 21st ACM international conference on Multimedia, 2013.
  • S. Wang and Q. Ji, “Video affective content analysis: A survey of state-of-the-art methods,” IEEE Transactions on Affective Computing, 2015.
  • S. Arifin and P. Y. K. Cheung, “Affective level video segmentation by utilizing the pleasure-arousal-dominance information,” IEEE Transactions on Multimedia, vol. 10, no. 7, 2008.
  • B. T. Truong and S. Venkatesh, “Video abstraction: A systematic review and classification,” ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 3, no. 1, pp. 79–82, 2007.
  • Y.-F. Ma, L. Lu, H.-J. Zhang, and M. Li, “A user attention model for video summarization,” in Proceedings of the 10th ACM international conference on Multimedia, 2002.
  • J.-L. Lai and Y. Yi, “Key frame extraction based on visual attention model,” Journal of Visual Communication and Image Representation, vol. 23, no. 1, pp. 114–125, 2012.
  • M. Wang, R. Hong, G. Li, Z.-J. Zha, S. Yan, and T.-S. Chua, “Event driven web video summarization by tag localization and key-shot identification,” IEEE Transactions on Multimedia, vol. 14, no. 4, pp. 975–985, 2012.
  • F. Wang and C. W. Ngo, “Summarizing rushes videos by motion, object, and event understanding,” IEEE Transactions on Multimedia, vol. 14, pp. 76–87, Feb 2012.
  • X. Wang, Y. Jiang, Z. Chai, Z. Gu, X. Du, and D. Wang, “Real-time summarization of user-generated videos based on semantic recognition,” in Proceedings of the 22nd ACM international conference on Multimedia, 2014.
  • M. Jaderberg, K. Simonyan, A. Zisserman, and K. Kavukcuoglu, “Spatial transformer networks,” in Advances in Neural Information Processing Systems 28, pp. 2017–2025, 2015.
  • K. K. Singh and Y. J. Lee, “End-to-end localization and ranking for relative attributes,” in European Conference on Computer Vision, pp. 753–769, Springer, 2016.
  • K. Xu, J. L. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. S. Zemel, and Y. Bengio, “Show, attend and tell: Neural image caption generation with visual attention,” in International Conference on Machine Learning, vol. 37, pp. 2048–2057, 2015.
  • C.-H. Lin and S. Lucey, “Inverse compositional spatial transformer networks,” in Proceedings of the Conference on Computer Vision and Pattern Recognition, 2017.
  • K. Simonyan and A. Zisserman, “Two-stream convolutional networks for action recognition in videos,” in Advances in neural information processing systems, pp. 568–576, 2014.
  • J. Carreira and A. Zisserman, “Quo vadis, action recognition? a new model and the kinetics dataset,” in Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 4724–4733, 2017.
  • Z. Li, G. M. Schuster, and A. K. Katsaggelos, “Minmax optimal video summarization,” IEEE Transactions on Circuits and Systems for Video Technology, 2005.
  • K.-C. Peng, T. Chen, A. Sadovnik, and A. Gallagher, “A mixed bag of emotions: Model, predict, and transfer emotion distributions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 860–868, 2015.
  • D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in NIPS, 2012.
  • P. Anderson, X. He, C. Buehler, D. Teney, M. Johnson, S. Gould, and L. Zhang, “Bottom-up and top-down attention for image captioning and visual question answering,” in Proceedings of the Conference on Computer Vision and Pattern Recognition, 2018.
  • L. Yao, A. Torabi, K. Cho, N. Ballas, C. Pal, H. Larochelle, and A. Courville, “Describing videos by exploiting temporal structure,” in Proceedings of the IEEE International Conference on Computer Vision, pp. 4507–4515, 2015.
  • F. Caba Heilbron, V. Escorcia, B. Ghanem, and J. Carlos Niebles, “Activitynet: A large-scale video benchmark for human activity understanding,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 961–970, 2015.
  • G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition in unconstrained environments,” tech. rep., 2008.
  • R. Cardona-Rivera and B. Li, “PlotShot: Generating discourse-constrained stories around photos,” in Proceedings of the 12th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2016.
Author Biographies
  • Yanwei Fu received the Ph.D. degree from Queen Mary University of London in 2014, and the M.Eng. degree from the Department of Computer Science and Technology, Nanjing University, China, in 2011. He held a post-doctoral position at Disney Research, Pittsburgh, PA, USA, from 2015 to 2016. He is currently a tenure-track Professor with Fudan University. His research interests are image and video understanding, and life-long learning.
  • Boyang Li is a Senior Research Scientist at Baidu Research in Sunnyvale, California. Prior to Baidu, he directed the Narrative Intelligence research group at Disney Research Pittsburgh. His research interests lie broadly in machine learning and multimodal reasoning, particularly in the computational understanding and generation of content with complex semantic structures, such as narratives, human emotions, and the interaction between visual and textual information. He received his Ph.D. in Computer Science from Georgia Institute of Technology in 2014, and his B.Eng. from Nanyang Technological University, Singapore, in 2008. He has authored and co-authored more than 40 peer-reviewed papers in international journals and conferences.
  • Guoyun Tu received his Bachelor's degree in physics from Fudan University in 2018 and is now a graduate student in the EIT Digital Master Programme (Eindhoven University of Technology & KTH Royal Institute of Technology). His research interests include machine learning theory and its applications.
  • Yu-Gang Jiang is a Professor of Computer Science at Fudan University and Director of Fudan-Jilian Joint Research Center on Intelligent Video Technology, Shanghai, China. He is interested in all aspects of extracting high-level information from big video data, such as video event recognition, object/scene recognition and large-scale visual search. His work has led to many awards, including the inaugural ACM China Rising Star Award, the 2015 ACM SIGMM Rising Star Award, and the research award for outstanding young researchers from NSF China. He is currently an associate editor of ACM TOMM, Machine Vision and Applications (MVA) and Neurocomputing. He holds a PhD in Computer Science from City University of Hong Kong and spent three years working at Columbia University before joining Fudan in 2011.
  • Xiangyang Xue received the BS, MS, and PhD degrees in communication engineering from Xidian University, Xi’an, China, in 1989, 1992, and 1995, respectively. He is currently a professor of computer science with Fudan University, Shanghai, China. His research interests include computer vision, multimedia information processing and machine learning.