Pathological Voice Classification Using Multiresolution Time Series Classification Network

Denghuang Zhao, Xincheng Zhu, Jinyang Qian, Xiaojun Zhang, Yishen Xu, Zhi Tao

2022 International Conference on Sensing, Measurement & Data Analytics in the era of Artificial Intelligence (ICSMD) (2022)

Abstract

The detection of pathological voices has achieved good results in recent years. However, because pathological voices are complex, traditional feature-based methods are not effective at further classifying voice disorders by type. Deep learning methods, by contrast, have shown excellent performance in deep feature extraction and time series classification. In this paper, we propose a multiresolution time series classification network based on 1-D and 2-D dilated convolutional neural networks to perform multi-class classification of pathological voices. The network takes a multivariate input that combines the raw voice signal, the glottal wave signal, and the first-order difference of the glottal wave. Dilated convolutional layers with different dilation rates capture features of the voice signal at different scales. We trained the network on the MEEI, SVD and HUPA databases and tested it on voices collected with a voice recorder. An improvement of 17% was obtained in distinguishing healthy voices, neuromuscular disorders and structural disorders. The experimental results show that the proposed structure can significantly improve performance on the multi-class voice classification task.
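
The two ideas the abstract names, a multivariate input and dilated convolutions at several dilation rates, can be illustrated with a minimal sketch. This is not the authors' network: the function names, the toy signals, the fixed kernel and the dilation rates are all assumptions chosen for illustration, and a real glottal wave would have to be estimated from the recorded voice rather than written by hand.

```python
from typing import Dict, List, Sequence, Tuple


def first_difference(x: Sequence[float]) -> List[float]:
    """First-order difference, padded with a leading 0.0 so the length is kept."""
    return [0.0] + [x[i] - x[i - 1] for i in range(1, len(x))]


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = (len(kernel) - 1) * dilation  # distance between first and last kernel tap
    return [
        sum(k * x[t + j * dilation] for j, k in enumerate(kernel))
        for t in range(len(x) - span)
    ]


def multiresolution_features(
    channels: Dict[str, Sequence[float]],
    kernel: Sequence[float],
    dilations: Sequence[int],
) -> Dict[Tuple[str, int], List[float]]:
    """Apply one kernel at several dilation rates to every input channel."""
    return {
        (name, d): dilated_conv1d(sig, kernel, d)
        for name, sig in channels.items()
        for d in dilations
    }


# Toy stand-ins for real signals (hypothetical values, not data from the paper).
voice = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
glottal = [0.0, 0.5, 1.0, 0.5, 0.0, 0.5, 1.0, 0.5]

# Multivariate input: raw voice, glottal wave, first difference of the glottal wave.
channels = {
    "voice": voice,
    "glottal": glottal,
    "glottal_diff": first_difference(glottal),
}

# A fixed summing kernel stands in for weights that a network would learn.
features = multiresolution_features(channels, kernel=[1.0, 1.0, 1.0], dilations=[1, 2])
```

With a three-tap kernel, dilation 1 covers three consecutive samples while dilation 2 covers five, so the same number of weights responds to structure at a coarser time scale. That is the sense in which layers with different dilation rates see the signal at different resolutions.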

Keywords

pathological voice, machine learning, time series, multi-classification, glottal flow
