Detecting Emotional Valence Using Time-Domain Analysis Of Speech Signals

2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (2019)

Abstract
Mental health is a growing concern, and its problems range from an inability to cope with day-to-day stress to severe conditions like depression. The ability to detect these symptoms relies heavily on accurate measurement of emotion and its components, such as emotional valence, which comprises positive, negative and neutral affect. Speech is an attractive bio-signal for measuring valence because smartphones, which can easily record and process speech signals, are ubiquitous. Speech-based emotion detection uses a broad spectrum of features derived from audio samples, including pitch, energy, Mel Frequency Cepstral Coefficients (MFCCs), Linear Predictive Cepstral Coefficients, log frequency power coefficients and spectrograms. Despite this array of features and classifiers, detecting valence from speech alone remains a challenge. Further, the algorithms for extracting some of these features are compute-intensive, which becomes a problem particularly in smartphone applications where the algorithms have to be executed on the device itself. We propose a novel time-domain feature that not only improves valence detection accuracy but also saves 10% of the computational cost of extraction compared with MFCCs. A Random Forest Regressor operating on the proposed feature set detects speaker-independent valence on a non-acted database with 70% accuracy. The algorithm also achieves 100% accuracy when tested on the acted speech database Emo-DB.
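The abstract does not define the proposed time-domain feature, so the following is only a generic illustration of what "time-domain" extraction means, not the authors' method. It computes two textbook time-domain descriptors, short-time energy and zero-crossing rate, directly from the waveform samples with no Fourier transform. The function names, the 25 ms frame length and the 10 ms hop at an assumed 16 kHz sampling rate are all illustrative assumptions.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def time_domain_features(samples, frame_len=400, hop=160):
    """Per-frame (energy, zero-crossing rate) pairs.

    Defaults assume 16 kHz audio: 400 samples = 25 ms, 160 samples = 10 ms.
    These are generic features, NOT the feature proposed in the paper.
    """
    return [(short_time_energy(f), zero_crossing_rate(f))
            for f in frame_signal(samples, frame_len, hop)]

# Synthetic stand-in for speech: 100 ms of a 200 Hz tone sampled at 16 kHz.
sr = 16000
tone = [math.sin(2 * math.pi * 200 * n / sr) for n in range(sr // 10)]
features = time_domain_features(tone)
```

Per-frame values like these could be pooled over an utterance and passed to a regressor; both descriptors need only a single pass over the samples, which is the kind of cost profile the abstract contrasts with MFCC extraction.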
Keywords
Algorithms; Databases, Factual; Emotions; Humans; Speech