
A BiLSTM and CTC Based Multi-Sensor Information Fusion Frame for Continuous Sign Language Recognition

Yuyuan Chen, Jie Li, Shifeng Lin, Yuge Xu, Chenguang Yang

2024 10th International Conference on Electrical Engineering, Control and Robotics (EECR), 2024

Abstract
While sign language recognition has been widely applied in human-robot interaction, the applications of continuous sign language recognition (CSLR) remain limited. A major challenge in CSLR is the scarcity of publicly available continuous sign language datasets, which are mostly in video format. Additionally, visual information often suffers from issues such as hand blur, overlap, and disappearance. To tackle these challenges, we propose a multi-sensor information fusion CSLR framework based on a Bi-directional Long Short-Term Memory (BiLSTM) network and the Connectionist Temporal Classification (CTC) algorithm. Firstly, an RGB camera and a MYO armband are used to simultaneously collect a continuous sign language dataset comprising three modalities: RGB video, IMU signals, and sEMG signals. Then, keyframes of the RGB videos are extracted using the IMU signals to save computational cost and reduce the word error rate (WER) of CSLR. To fully exploit the information from the three modalities, a multimodal-fusion-based end-to-end CSLR model is constructed from the BiLSTM network and the CTC algorithm. Comparative experiments verify the effectiveness of the proposed method. Experimental results demonstrate that the combination of the three modalities achieves the best performance, with a WER as low as 10.3% in CSLR.
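The pipeline the abstract describes (per-frame fusion of RGB, IMU, and sEMG features, a BiLSTM encoder, and CTC training, with IMU-driven keyframe selection) can be sketched as below. This is a minimal illustration, not the authors' implementation: the feature dimensions, layer widths, gloss vocabulary size, and the energy-threshold keyframe rule are all assumptions introduced here for concreteness.

```python
# Hedged sketch of a BiLSTM + CTC multimodal fusion model for CSLR.
# All sizes (rgb_dim, imu_dim, semg_dim, hidden, num_glosses) and the
# keyframe rule are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn


def select_keyframes(imu, threshold=0.5):
    """Keep frames whose mean IMU signal energy exceeds a threshold.

    imu: (T, C) tensor of IMU channels; returns indices of kept frames.
    The paper extracts video keyframes using IMU signals; this simple
    energy rule is an assumed stand-in for that step.
    """
    energy = imu.pow(2).mean(dim=1)            # per-frame energy
    return (energy > threshold).nonzero(as_tuple=True)[0]


class FusionBiLSTMCTC(nn.Module):
    """Concatenate RGB, IMU, and sEMG features frame by frame, encode the
    sequence with a BiLSTM, and emit per-frame gloss logits for CTC."""

    def __init__(self, rgb_dim=512, imu_dim=36, semg_dim=8,
                 hidden=256, num_glosses=100):
        super().__init__()
        in_dim = rgb_dim + imu_dim + semg_dim
        self.encoder = nn.LSTM(in_dim, hidden, num_layers=2,
                               batch_first=True, bidirectional=True)
        # +1 output class for the CTC blank symbol (index 0 here)
        self.classifier = nn.Linear(2 * hidden, num_glosses + 1)

    def forward(self, rgb, imu, semg):
        x = torch.cat([rgb, imu, semg], dim=-1)   # (B, T, in_dim)
        h, _ = self.encoder(x)                    # (B, T, 2*hidden)
        return self.classifier(h)                 # (B, T, num_glosses+1)


# Toy usage: one 40-frame sequence with a 5-gloss target.
model = FusionBiLSTMCTC()
rgb = torch.randn(1, 40, 512)
imu = torch.randn(1, 40, 36)
semg = torch.randn(1, 40, 8)
logits = model(rgb, imu, semg)                        # (1, 40, 101)
log_probs = logits.log_softmax(-1).transpose(0, 1)    # (T, B, C) for CTCLoss
targets = torch.randint(1, 101, (1, 5))               # gloss ids, 0 = blank
loss = nn.CTCLoss(blank=0)(log_probs, targets,
                           torch.tensor([40]), torch.tensor([5]))
```

End-to-end CTC training of this kind lets the model align unsegmented gloss sequences to frames without per-frame labels, which is what makes it a natural fit for continuous (rather than isolated) sign language recognition.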
Key words
continuous sign language recognition, multi-modal fusion, bi-directional long short-term memory, connectionist temporal classification