mmASL: Environment-Independent ASL Gesture Recognition Using 60 GHz Millimeter-wave Signals

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pp. 1-30, 2020.


Abstract

Home assistant devices such as Amazon Echo and Google Home have become tremendously popular in the last couple of years. However, due to their voice-controlled functionality, these devices are not accessible to Deaf and Hard-of-Hearing (DHH) people. Given that over half a million people in the United States communicate using American Sign Language (ASL)…

Highlights
  • Home assistant devices such as Amazon Echo and Google Home have become tremendously popular in the last couple of years
  • We present mmASL, a home assistant system for Deaf and Hard-of-Hearing (DHH) users that can perform American Sign Language (ASL) recognition using 60 GHz millimeter-wave signals
  • We describe the challenges along with our solutions: (1) Reliable wake-word detection at any location: While initiating an interaction with mmASL, a DHH user can perform the wake-word at any location in a given indoor environment
  • mmASL is evaluated using a large amount of data covering a range of practical scenarios involving multiple users and different environments (Section 6.1)
  • We established that variation in Doppler spread can be captured in spectrograms to recognize ASL signs
  • We find that mmASL can detect the wake-word with an average accuracy of 94% across different users and environments
  • We propose a multi-task deep learning architecture that can learn ASL domain-specific features from the spectrograms
Summary
  • Introduction:

    Home assistant devices such as Amazon Echo and Google Home have become tremendously popular in the last couple of years.
  • 2.4/5 GHz WiFi CSI (Channel State Information) has been leveraged for gesture recognition [46, 74, 83, 89]
  • While it enables device-free and low-cost sensing, its lower tolerance to changes of environment, user position, and the presence of other moving users poses significant challenges in terms of accuracy and reproducibility.
  • With CSI-based sensing, the presence of other people can significantly affect sensing accuracy, but this challenge has so far been addressed largely through controlled experiments
  • Objectives:

    The objective of this work is to design a home assistant system for DHH users that can perform ASL recognition using 60 GHz millimeter-wave wireless signals. mmASL has two important components.
  • Results:

    mmASL is evaluated using a large amount of data covering a range of practical scenarios involving multiple users and different environments (Section 6.1, Data Collection and Implementation).
  • They include single-user (Scenario 1) and multi-user scenarios.
  • Note that in every scenario, there are always at least two additional persons in the room behind the 60 GHz system that is collecting the data, performing uncontrolled movements
  • Conclusion:

    The authors discuss various aspects of mmASL that can be improved through further investigation. 60 GHz blockage: as observed in the evaluation, when the intended user is blocked by an interfering user, wake-word recognition accuracy decreases.
  • Such occlusions are also a problem in vision-based systems.
  • The authors compared the performance of mmASL with Kinect and an RGB camera and found that mmASL achieves accurate sign recognition in a variety of practical scenarios, including the presence of an interfering user, changes of environment, and different user positions.
Tables
  • Table 1: mmASL compared with existing works on ASL recognition
  • Table 2: ASL sign recognition data set: the table summarizes data for 15 subjects, detailing the training and test data for different scenarios (shown in Fig. 14) and users. Superscripts on ASL signs identify phonological properties: "h" denotes motion horizontal (parallel) to the coronal plane, "v" vertical (perpendicular) to it, and "r" repetitive motion
Related Work
  • mmWave Sensing

    Given their higher frequency and larger bandwidth, mmWave wireless signals have been used for sensing in recent years. Authors in [58], [47] and [87] have built custom FMCW (Frequency Modulated Continuous Wave) radars for short-range (less than a meter) hand gesture recognition and fine-grained finger gesture recognition. Designed to provide hands-free interaction with smart devices, they operate by illuminating the hand and utilize Range-Doppler Maps for tracking the movement of ≤ 10 gestures. On the other hand, mmASL has to illuminate the entire upper body of the user because of the nature of ASL signs (which involve both hands, whose displacement can reach up to a couple of feet) and must be available over long ranges (a requirement for digital assistants). Operating at long range (4–6 meters) brings new challenges such as the presence of other people, changes of environment, etc. Recently, a 60 GHz radar was used in [62] to recognize 8 gestures using FMCW range information. However, due to its limited number of antenna elements, that system cannot perform the beam scanning or steering necessary for mmASL's DHH home assistant system. mmWave sensing has also been used for fine-grained object tracking [93] and for imaging using synthetic aperture radars [106]. mmVital [96] uses mmWave signals for locating a human being and monitoring her vital signs. In comparison, our approach uses beam scanning and spatial spectrograms, which are shown to be more robust than time-series metrics such as the energy or variance used in mmVital. Authors in [38] recently proposed an environment-independent activity…
Funding
  • This research is supported by NSF grant CNS-1730083 and a Google Faculty Research grant
Study Subjects and Analysis
samples: 8
The 10 Hz high-pass filter removes the impact of low-frequency human activities such as breathing and posture changes. The resulting filtered signal is then used to plot spectrograms using the Short Time Fourier Transform (STFT) with a window size of 800 samples (100 ms), sliding the window every 1 ms (8 samples). Lastly, we log-transform the amplitude values to normalize them (referred to as log normalization) and to emphasize low-intensity components (inspired by the speech recognition literature [52, 57, 59])
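For concreteness, the following is a minimal sketch of this front end in Python with SciPy. The 8 kHz sampling rate is implied by the numbers above (800 samples = 100 ms, 8 samples = 1 ms); the Butterworth filter order and the exact form of the log transform are our assumptions, since the text only specifies a 10 Hz cutoff and a log-transformation of amplitudes.

```python
import numpy as np
from scipy.signal import butter, filtfilt, stft

FS = 8_000    # sampling rate implied by 800 samples = 100 ms
WINDOW = 800  # STFT window: 100 ms
HOP = 8       # slide the window every 1 ms (8 samples)

def make_spectrogram(signal):
    """High-pass filter a beam's signal, then compute a log-amplitude STFT."""
    # 10 Hz high-pass removes low-frequency activity such as breathing and
    # posture changes (the 4th-order Butterworth design is an assumption).
    b, a = butter(4, 10 / (FS / 2), btype="highpass")
    filtered = filtfilt(b, a, signal)

    # 100 ms window sliding every 1 ms.
    _, _, Z = stft(filtered, fs=FS, nperseg=WINDOW, noverlap=WINDOW - HOP)

    # Log-transform amplitudes ("log normalization") to emphasize
    # low-intensity components.
    return np.log1p(np.abs(Z))

spectrogram = make_spectrogram(np.random.randn(3 * FS))  # e.g., one 3 s sample
```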

users: 7
The different hyperparameters we validate are the number of convolutional layers, the number of dense layers, and the dropout rate. For cross validation, we use data collected from 7 users for all 50 gestures, with 75% of the data for training and the remainder for testing. We compare the variation in accuracy when the network is trained for 400 epochs, evaluating at the 25th epoch and at every 50th epoch thereafter
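As an illustration, such a validation sweep might look like the sketch below in TensorFlow/Keras (the toolchain the paper references in [10, 17]). The grid values, layer widths, and the 128x128 spectrogram input shape are hypothetical placeholders, not the paper's actual settings.

```python
import itertools
from tensorflow.keras import layers, models

# Hypothetical search grid: the paper names these knobs but not the values.
GRID = {"conv_layers": [2, 3, 4], "dense_layers": [1, 2], "dropout": [0.25, 0.5]}

def build_model(conv_layers, dense_layers, dropout, n_classes=50):
    m = models.Sequential([layers.Input(shape=(128, 128, 1))])  # assumed size
    for _ in range(conv_layers):
        m.add(layers.Conv2D(32, 3, activation="relu"))
        m.add(layers.MaxPooling2D())
    m.add(layers.Flatten())
    for _ in range(dense_layers):
        m.add(layers.Dense(128, activation="relu"))
        m.add(layers.Dropout(dropout))
    m.add(layers.Dense(n_classes, activation="softmax"))
    m.compile("adam", "sparse_categorical_crossentropy", metrics=["accuracy"])
    return m

# Evaluate at epoch 25 and at every 50th epoch thereafter, up to 400.
CHECKPOINT_EPOCHS = [25] + list(range(75, 401, 50))
for config in itertools.product(*GRID.values()):
    model = build_model(*config)
    # model.fit(x_train, y_train, epochs=400, validation_data=(x_val, y_val))
    # with a callback recording validation accuracy at CHECKPOINT_EPOCHS.
```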

samples: 3700; users: 3
Note that in every scenario (even the single-user case), there are always at least two additional persons in the room behind the 60 GHz system that is collecting the data, performing uncontrolled movements. Wake-word dataset: this dataset includes a total of 3700 samples, with each sample being 3 seconds long. We collect data for three users (Users A, B, and C) in three different rooms: a conference room (Figure 15c), a lab (Figure 15b), and a classroom (Figure 15a), covering all five scenarios shown in Fig. 14

participants: 11
We performed data collection in two phases. In Phase-I, we recruited 11 participants (8 male and 3 female) and collected data simultaneously using mmASL, Kinect, and an RGB camera. The data collected using the Kinect system includes 25 body joint coordinates in 3D over time

additional participants: 4
More details of the accuracy comparison are provided in Section 6.3. In Phase-II, we collected data for the same set of 50 ASL signs from 4 additional participants using mmASL. This additional dataset is used to evaluate the performance of mmASL in terms of different training and testing splits, diversity in users' signing, and cross-subject accuracy

samples: 1590
For SVM and random forest, we perform a grid search with 10-fold cross validation to determine the optimal parameters. We use User A's data collected in R1 (1590 samples) as training data (for all scenarios) and test on other users and rooms. This verifies that wake-word detection is robust to untrained users and environments
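A sketch of this search with scikit-learn follows. The parameter grids are hypothetical, since the text reports a grid search with 10-fold cross validation but not the ranges searched, and X_train/X_test are placeholder arrays of spectrogram features.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical grids; the paper does not list the searched ranges.
svm_search = GridSearchCV(
    SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 1e-2, 1e-3]}, cv=10)
rf_search = GridSearchCV(
    RandomForestClassifier(),
    {"n_estimators": [100, 300], "max_depth": [None, 10, 20]}, cv=10)

# X_train, y_train: User A's 1590 samples from room R1 (placeholders);
# X_test, y_test: samples from other users and rooms.
# svm_search.fit(X_train, y_train)
# print(svm_search.best_params_, svm_search.score(X_test, y_test))
```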

users: 7
As shown in Table 2, we perform training and testing with the same subset of users and with a different subset of users (cross-subject), include additional test scenarios with an interfering user, and study the impact of adding more users (and data) on mmASL. We train two sign recognition models using data from 7 users (User A-User G). These models are referred to as mmASL-Deep (deep, without multi-task learning) and mmASL-MTL (deep, with two auxiliary tasks: repetitiveness and motion direction)
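The multi-task arrangement can be sketched as a shared convolutional trunk with one main softmax head for the 50 signs and two small auxiliary heads. This is an illustrative reading, assuming a binary repetitiveness label and a two-way motion-direction label (horizontal vs. vertical relative to the coronal plane, per the Table 2 superscripts); the layer sizes, input shape, and loss weights are hypothetical.

```python
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(128, 128, 1))           # assumed spectrogram size
x = layers.Conv2D(32, 3, activation="relu")(inputs)  # shared trunk
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
shared = layers.Dense(256, activation="relu")(layers.Flatten()(x))

# Main task: 50-way sign classification.
sign = layers.Dense(50, activation="softmax", name="sign")(shared)
# Auxiliary tasks: repetitiveness (binary) and motion direction
# (horizontal vs. vertical relative to the coronal plane).
repetitive = layers.Dense(1, activation="sigmoid", name="repetitive")(shared)
direction = layers.Dense(2, activation="softmax", name="direction")(shared)

model = Model(inputs, [sign, repetitive, direction])
model.compile(
    optimizer="adam",
    loss={"sign": "sparse_categorical_crossentropy",
          "repetitive": "binary_crossentropy",
          "direction": "sparse_categorical_crossentropy"},
    # Weighting the main task above the auxiliary ones is a common choice;
    # the paper's exact weights are not given here.
    loss_weights={"sign": 1.0, "repetitive": 0.3, "direction": 0.3})
```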

users: 7
This is expected, given that deep learning models, which can learn feature representations from the data, perform better than traditional machine learning models that require feature engineering. Testing with untrained users (cross-subject): we now take the models developed with 7 users and test them on untrained users

additional users: 4
We test them on the 4 additional users added in Phase-I. Fig. 20a shows the average sign recognition accuracy and Fig. 20b the individual user accuracy for the compared models (random forests, SVM, mmASL-Deep, mmASL-MTL, Kinect-LSTM, and OpenPose-RGB)

additional users: 4; total users: 11
A common way to address this issue is to add more (and diverse) data to the training set. To test this hypothesis, we use the data collected from the 4 additional users (User L-User O in Phase-II; see Table 2). Here, we add 75% of their data to the training set of the 7 users (User A-User G); the remainder is added to the test set. We train the four models (mmASL-MTL, mmASL-Deep, random forests, and SVM) with the new training data and test them on (i) data from the same 11 users and (ii) data from 4 untrained users (cross-subject). Figs. 21a and 21b show the performance of the trained models when tested on the same users (User A-User G and User L-User O) and on untrained users (User H-User K), respectively. When tested on the same 11 users, all four models provide results similar to those trained on 7 users, confirming that mmASL can scale to more users without a significant decline in performance. The high variance of the models is evident when they are tested on untrained users, where all four models gain a significant benefit from the additional data. Specifically, the average accuracy for random forests and SVM increases by 11% and 9%, respectively. Of the deep learning models, mmASL-Deep gains 16% and mmASL-MTL gains 9% in average accuracy, compared to the models trained with 7 users' data. Although the accuracy gap between mmASL-MTL and mmASL-Deep narrows with additional data, multi-task learning is still needed for mmASL to scale to a larger number of signs
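For illustration, the 75%/25% per-user split described above can be sketched as follows; the helper name and arrays are hypothetical, and the exact shuffling procedure is not specified in the text.

```python
import numpy as np

def per_user_split(users, frac=0.75, seed=0):
    """Return train/test indices with `frac` of each user's samples in
    training, mirroring the 75%/25% per-user protocol described above."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for u in np.unique(users):
        idx = np.flatnonzero(users == u)
        rng.shuffle(idx)
        cut = int(frac * len(idx))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return np.asarray(train_idx), np.asarray(test_idx)

# Example: split the four Phase-II users, then append their training portion
# to the existing 7-user training set (placeholder labels).
new_users = np.array(["L", "M", "N", "O"] * 10)
tr, te = per_user_split(new_users)
```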

References
  • [1] 2014. National Instruments mmWave Transceiver System. Retrieved January 1, 2020 from http://www.ni.com/sdr/mmwave/
  • [2] 2016. Nvidia Tesla 480 Specifications. Retrieved January 1, 2020 from https://en.wikipedia.org/wiki/Nvidia_Tesla
  • [3] 2017. How Amazon's Alexa is helping with child disability. Retrieved January 1, 2020 from https://themighty.com/2017/11/amazonalexa-helping-child-disability/
  • [4] 2018. 50 Best Alexa Commands. Retrieved January 1, 2020 from https://beebom.com/best-alexa-commands-for-amazon-echo/
  • [5] 2018. Home alerting devices for people who are deaf or hard of hearing. Retrieved January 1, 2020 from https://tap.gallaudet.edu/smarthome/
  • [6] 2018. IEEE 802.11ay: Enhanced Throughput for Operation in License-Exempt Bands above 45 GHz. Retrieved January 1, 2020 from http://www.ieee802.org/11/Reports/tgay_update.htm
  • [7] 2018. SiBeam Beam-steering Transceivers. Retrieved January 1, 2020 from http://www.sibeam.com/Products.aspx
  • [8] 2018. The Smart Devices Transforming the Lives of People with Disabilities. Retrieved January 1, 2020 from https://www.mytherapyapp.com/blog/smart-homes-for-living-with-disabilities
  • [9] 2018. Smart Speaker Users Growing 48% Annually, To Hit 90M In USA This Year. Retrieved January 1, 2020 from https://www.forbes.com/sites/johnkoetsier2018/05/29/smart-speaker-users-growing-48-annually-will-outnumber-wearable-tech-users-this-year/
  • [10] 2018. Tensorflow CNN example. Retrieved January 1, 2020 from https://www.tensorflow.org/tutorials/images/cnn
  • [11] 2019. ARGO: A research computing cluster. Retrieved January 1, 2020 from http://orc.gmu.edu/
  • [12] 2019. ASL Sign for email. Retrieved January 1, 2020 from https://www.handspeak.com/word/search/index.php?id=659
  • [13] 2019. ASL Sign for place. Retrieved January 1, 2020 from https://www.signingsavvy.com/sign/PLACE
  • [14] 2019. ASL Sign for shopping. Retrieved January 1, 2020 from https://www.handspeak.com/word/search/index.php?id=1948
  • [15] 2019. ASL Sign for snow. Retrieved January 1, 2020 from https://www.handspeak.com/word/search/index.php?id=2003
  • [16] 2019. mmASL dataset. Retrieved January 1, 2020 from https://cs.gmu.edu/~phpathak/datasets/mmASL.html
  • [17] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A System for Large-scale Machine Learning. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (Savannah, GA, USA) (OSDI’16). USENIX Association, Berkeley, CA, USA, 265–283. http://dl.acm.org/citation.cfm?id=3026877.3026899
  • [18] H. Abdelnasser, M. Youssef, and K. A. Harras. 2015. WiGest: A ubiquitous WiFi-based gesture recognition system. In 2015 IEEE Conference on Computer Communications (INFOCOM). 1472–1480. https://doi.org/10.1109/INFOCOM.2015.7218525
  • [19] Kamran Ali, Alex X. Liu, Wei Wang, and Muhammad Shahzad. 2015. Keystroke Recognition Using WiFi Signals. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking (Paris, France) (MobiCom ’15). ACM, New York, NY, USA, 90–102. https://doi.org/10.1145/2789168.2790109
  • [20] Marco Altini, Julien Penders, and Oliver Amft. 2012. Energy Expenditure Estimation Using Wearable Sensors: A New Methodology for Activity-specific Models. In Proceedings of the Conference on Wireless Health (San Diego, California) (WH ’12). ACM, New York, NY, USA, Article 1, 8 pages. https://doi.org/10.1145/2448096.2448097
  • [21] Oya Aran, Thomas Burger, Alice Caplier, and Lale Akarun. 2009. A belief-based sequential fusion approach for fusing manual signs and non-manual signals. Pattern Recognition 42, 5 (2009), 812–822. https://doi.org/10.1016/j.patcog.2008.09.010
  • [22] B B Blanchfield, J J Feldman, J L Dunbar, and E N Gardner. 2001. The severely to profoundly hearing-impaired population in the United States: prevalence estimates and demographics. Journal of the American Academy of Audiology 12, 4 (2001), 183–9. http://www.ncbi.nlm.nih.gov/pubmed/11332518
  • [23] H. Brashear, T. Starner, P. Lukowicz, and H. Junker. 2003. Using multiple sensors for mobile sign language recognition. In Seventh IEEE International Symposium on Wearable Computers, 2003. Proceedings. 45–52. https://doi.org/10.1109/ISWC.2003.1241392
  • [24] Diane Brentari. 1998. A prosodic model of sign language phonology. MIT Press.
  • [25] Zhe Cao, Gines Hidalgo, Tomas Simon, Shih-En Wei, and Yaser Sheikh. 2018. OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. CoRR abs/1812.08008 (2018). arXiv:1812.08008 http://arxiv.org/abs/1812.08008
  • [26] Naomi K. Caselli, Zed Sevcikova Sehyr, Ariel M. Cohen-Goldberg, and Karen Emmorey. 2016. ASL-LEX: A lexical database of American Sign Language. Behavior Research Methods (2016), 1–18. https://doi.org/10.3758/s13428-016-0742-0
  • [27] Yuanying Chen, Wei Dong, Yi Gao, Xue Liu, and Tao Gu. 2017. Rapid: A Multimodal and Device-free Approach Using Noise Estimation for Robust Person Identification. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 3, Article 41 (Sept. 2017), 27 pages. https://doi.org/10.1145/3130906
  • [28] C. Chuan, E. Regina, and C. Guardino. 2014. American Sign Language Recognition Using Leap Motion Sensor. In 2014 13th International Conference on Machine Learning and Applications. 541–544. https://doi.org/10.1109/ICMLA.2014.110
  • [29] Cleison Correia de Amorim, David Macêdo, and Cleber Zanchettin. 2019. Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition. CoRR abs/1901.11164 (2019). arXiv:1901.11164 http://arxiv.org/abs/1901.11164
  • [30] Cao Dong, M. C. Leu, and Z. Yin. 2015. American Sign Language alphabet recognition using Microsoft Kinect. In 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). 44–52. https://doi.org/10.1109/CVPRW.2015.7301347
  • [31] Yong Du, W. Wang, and L. Wang. 2015. Hierarchical recurrent neural network for skeleton based action recognition. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1110–1118. https://doi.org/10.1109/CVPR.2015.7298714
  • [32] Biyi Fang, Jillian Co, and Mi Zhang. 2017. DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation. In Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems (Delft, Netherlands) (SenSys ’17). ACM, New York, NY, USA, Article 5, 13 pages. https://doi.org/10.1145/3131672.3131693
  • [33] Xiaonan Guo, Bo Liu, Cong Shi, Hongbo Liu, Yingying Chen, and Mooi Choo Chuah. 2017. WiFi-Enabled Smart Human Dynamics Monitoring. In Proceedings of the 15th ACM Conference on Embedded Network Sensor Systems (Delft, Netherlands) (SenSys ’17). ACM, New York, NY, USA, Article 16, 13 pages. https://doi.org/10.1145/3131672.3131692
  • [34] Muhammad Kumail Haider and Edward W. Knightly. 2016. Mobility Resilience and Overhead Constrained Adaptation in Directional 60 GHz WLANs: Protocol Design and System Implementation. In Proceedings of the 17th ACM International Symposium on Mobile Ad Hoc Networking and Computing (Paderborn, Germany) (MobiHoc ’16). ACM, New York, NY, USA, 61–70. https://doi.org/10.1145/2942358.2942380
  • [35] Jiahui Hou, Xiang-Yang Li, Peide Zhu, Zefan Wang, Yu Wang, Jianwei Qian, and Panlong Yang. 2019. SignSpeaker: A Real-time, High-Precision SmartWatch-based Sign Language Translator. In MobiCom 2019 (Los Cabos, Mexico).
  • [36] Jie Huang, Wengang Zhou, Houqiang Li, and Weiping Li. 2015. Sign Language Recognition using 3D convolutional neural networks. In 2015 IEEE International Conference on Multimedia and Expo (ICME). 1–6. https://doi.org/10.1109/ICME.2015.7177428
  • [37] IEEE P802.11adTM/D4.0. 2012. Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications Amendment 3: Enhancements for Very High Throughput in the 60 GHz Band, IEEE Computer Society. IEEE Computer Society (July 2012).
  • [38] Wenjun Jiang, Chenglin Miao, Fenglong Ma, Shuochao Yao, Yaqing Wang, Ye Yuan, Hongfei Xue, Chen Song, Xin Ma, Dimitrios Koutsonikolas, Wenyao Xu, and Lu Su. 2018. Towards Environment Independent Device Free Human Activity Recognition. In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking (New Delhi, India) (MobiCom ’18). ACM, New York, NY, USA, 289–304. https://doi.org/10.1145/3241539.3241548
  • [39] Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
  • [40] Sang-Ki Ko, Jae Gi Son, and Hyedong Jung. 2018. Sign Language Recognition with Recurrent Neural Network Using Human Keypoint Detection. In Proceedings of the 2018 Conference on Research in Adaptive and Convergent Systems (Honolulu, Hawaii) (RACS '18). ACM, New York, NY, USA, 326–3… https://doi.org/10.1145/3264746.3264805
  • [41] V. E. Kosmidou and L. J. Hadjileontiadis*. 2009. Sign Language Recognition Using Intrinsic-Mode Sample Entropy on sEMG and Accelerometer Data. IEEE Transactions on Biomedical Engineering 56, 12 (Dec 2009), 2879–2890. https://doi.org/10.1109/TBME.2009.2013200
  • [42] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.). Curran Associates, Inc., 1097–1105. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
  • [43] A. Kuznetsova, L. Leal-Taixé, and B. Rosenhahn. 2013. Real-Time Sign Language Recognition Using a Consumer Depth Camera. In 2013 IEEE International Conference on Computer Vision Workshops. 83–90. https://doi.org/10.1109/ICCVW.2013.18
  • [44] S. Lawrence, C. L. Giles, Ah Chung Tsoi, and A. D. Back. 1997. Face recognition: a convolutional neural-network approach. IEEE Transactions on Neural Networks 8, 1 (Jan 1997), 98–113. https://doi.org/10.1109/72.554195
  • [45] Greg C. Lee, Fu-Hao Yeh, and Yi-Han Hsiao. 2016. Kinect-based Taiwanese sign-language recognition system. Multimedia Tools and Applications 75, 1 (01 Jan 2016), 261–279. https://doi.org/10.1007/s11042-014-2290-x
  • [46] Hong Li, Wei Yang, Jianxin Wang, Yang Xu, and Liusheng Huang. 2016. WiFinger: Talk to Your Smart Devices with Finger-grained Gesture. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (Heidelberg, Germany) (UbiComp ’16). ACM, New York, NY, USA, 250–261. https://doi.org/10.1145/2971648.2971738
  • [47] Jaime Lien, Nicholas Gillian, M. Emre Karagozler, Patrick Amihood, Carsten Schwesig, Erik Olson, Hakim Raja, and Ivan Poupyrev. 2016. Soli: Ubiquitous Gesture Sensing with Millimeter Wave Radar. ACM Trans. Graph. 35, 4, Article 142 (July 2016), 19 pages. https://doi.org/10.1145/2897824.2925953
  • [48] Jaime Lien, Nicholas Gillian, M. Emre Karagozler, Patrick Amihood, Carsten Schwesig, Erik Olson, Hakim Raja, and Ivan Poupyrev. 2016. Soli: Ubiquitous Gesture Sensing with Millimeter Wave Radar. ACM Trans. Graph. 35, 4, Article 142 (July 2016), 19 pages. https://doi.org/10.1145/2897824.2925953
  • [49] Frank R Lin, John K Niparko, and Luigi Ferrucci. 2011. Hearing loss prevalence in the United States. Archives of Internal Medicine 171, 20 (2011), 1851–1853.
  • [50] Jingjing Liu, Bo Liu, Shaoting Zhang, Fei Yang, Peng Yang, Dimitris N. Metaxas, and Carol Neidle. 2014. Non-manual grammatical marker recognition based on multi-scale, spatio-temporal analysis of head pose and facial expressions. Image and Vision Computing 32, 10 (2014), 671 – 681. https://doi.org/10.1016/j.imavis.2014.02.009 Best of Automatic Face and Gesture Recognition 2013.
  • [51] Yongsen Ma, Gang Zhou, Shuangquan Wang, Hongyang Zhao, and Woosub Jung. 2018. SignFi: Sign Language Recognition Using WiFi. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2, 1, Article 23 (March 2018), 21 pages. https://doi.org/10.1145/3191755
  • [52] J. Martinez, H. Perez, E. Escamilla, and M. M. Suzuki. 2012. Speaker recognition using Mel frequency Cepstral Coefficients (MFCC) and Vector quantization (VQ) techniques. In CONIELECOMP 2012, 22nd International Conference on Electrical Communications and Computers. 248–251. https://doi.org/10.1109/CONIELECOMP.2012.6189918
  • [53] Sven L. Mattys, Matthew H. Davis, Ann R. Bradlow, and Sophie K. Scott. 2012. Speech recognition in adverse conditions: A review. Language and Cognitive Processes 27, 7-8 (2012), 953–978. https://doi.org/10.1080/01690965.2012.705006
  • [54] Pedro Melgarejo, Xinyu Zhang, Parameswaran Ramanathan, and David Chu. 2014. Leveraging Directional Antenna Capabilities for Fine-grained Gesture Recognition. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing (Seattle, Washington) (UbiComp ’14). ACM, New York, NY, USA, 541–551. https://doi.org/10.1145/2632048.2632095
  • [55] Nicholas Michael, Dimitris Metaxas, and Carol Neidle. 2009. Spatial and Temporal Pyramids for Grammatical Expression Recognition of American Sign Language. In Proceedings of the 11th International ACM SIGACCESS Conference on Computers and Accessibility (Pittsburgh, Pennsylvania, USA) (Assets ’09). ACM, New York, NY, USA, 75–82. https://doi.org/10.1145/1639642.1639657
  • [56] Ross E Mitchell, Travas A Young, Bellamie Bachleda, and Michael A Karchmer. 2006. How many people use ASL in the United States? Why estimates need updating. Sign Language Studies 6, 3 (2006), 306–335.
  • [57] Bhadragiri Jagan Mohan and Ramesh Babu N. 2014. Speech recognition using MFCC and DTW. In 2014 International Conference on Advances in Electrical Engineering (ICAEE). 1–4. https://doi.org/10.1109/ICAEE.2014.6838564
  • [58] P. Molchanov, S. Gupta, K. Kim, and K. Pulli. 2015. Short-range FMCW monopulse radar for hand-gesture sensing. In 2015 IEEE Radar Conference (RadarCon). 1491–1496. https://doi.org/10.1109/RADAR.2015.7131232
  • [59] Lindasalwa Muda, Mumtaj Begam, and I. Elamvazuthi. 2010. Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques. CoRR abs/1003.4083 (2010). arXiv:1003.4083 http://arxiv.org/abs/1003.4083
  • [60] Tan Dat Nguyen and Surendra Ranganath. 2011. Recognizing Continuous Grammatical Marker Facial Gestures in Sign Language Video. Springer Berlin Heidelberg, Berlin, Heidelberg, 665–676. https://doi.org/10.1007/978-3-642-19282-1_53
  • [61] M. Oszust and M. Wysocki. 2013. Polish sign language words recognition with Kinect. In 2013 6th International Conference on Human System Interactions (HSI). 219–226. https://doi.org/10.1109/HSI.2013.6577826
  • [62] Avishek Patra, Philipp Geuer, Andrea Munari, and Petri Mähönen. 2018. mm-Wave Radar Based Gesture Recognition: Development and Evaluation of a Low-Power, Low-Complexity System. In Proceedings of the 2Nd ACM Workshop on Millimeter Wave Networks and Sensing Systems (New Delhi, India) (mmNets ’18). ACM, New York, NY, USA, 51–56. https://doi.org/10.1145/3264492.3264501
  • [63] C. Piciarelli, C. Micheloni, and G. L. Foresti. 2010. Occlusion-aware Multiple Camera Reconfiguration. In Proceedings of the Fourth ACM/IEEE International Conference on Distributed Smart Cameras (Atlanta, Georgia) (ICDSC ’10). ACM, New York, NY, USA, 88–94. https://doi.org/10.1145/1865987.1866002
  • [64] Lionel Pigou, Sander Dieleman, Pieter-Jan Kindermans, and Benjamin Schrauwen. 2015. Sign Language Recognition Using Convolutional Neural Networks. In Computer Vision - ECCV 2014 Workshops, Lourdes Agapito, Michael M. Bronstein, and Carsten Rother (Eds.). Springer International Publishing, Cham, 572–578.
  • [65] J. Pons, T. Lidy, and X. Serra. 2016. Experimenting with musically motivated convolutional neural networks. In 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI). 1–6. https://doi.org/10.1109/CBMI.2016.7500246
  • [66] N. Praveen, N. Karanth, and M. S. Megha. 2014. Sign language interpreter using a smart glove. In 2014 International Conference on Advances in Electronics Computers and Communications. 1–5. https://doi.org/10.1109/ICAECC.2014.7002401
  • [67] Qifan Pu, Sidhant Gupta, Shyamnath Gollakota, and Shwetak Patel. 2013. Whole-home Gesture Recognition Using Wireless Signals. In Proceedings of the 19th Annual International Conference on Mobile Computing & Networking (Miami, Florida, USA) (MobiCom ’13). ACM, New York, NY, USA, 27–38. https://doi.org/10.1145/2500423.2500436
  • [68] N. Pugeault and R. Bowden. 2011. Spelling it out: Real-time ASL fingerspelling recognition. In 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops). 1114–1119. https://doi.org/10.1109/ICCVW.2011.6130290
  • [69] Luis Quesada, Gustavo López, and Luis A. Guerrero. 2015. Sign Language Recognition Using Leap Motion. In Ubiquitous Computing and Ambient Intelligence. Sensing, Processing, and Using Environmental Information, Juan M. García-Chamizo, Giancarlo Fortino, and Sergio F. Ochoa (Eds.). Springer International Publishing, Cham, 277–288.
  • [70] R. Raman, P. K. Sa, and B. Majhi. 2012. Occlusion prediction algorithms for multi-camera network. In 2012 Sixth International Conference on Distributed Smart Cameras (ICDSC). 1–6.
  • [71] Marco Túlio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. CoRR abs/1602.04938 (2016). arXiv:1602.04938 http://arxiv.org/abs/1602.04938
  • [72] Sebastian Ruder. 2017. An Overview of Multi-Task Learning in Deep Neural Networks. CoRR abs/1706.05098 (2017). arXiv:1706.05098 http://arxiv.org/abs/1706.05098
  • [73] C. Savur and F. Sahin. 2015. Real-Time American Sign Language Recognition System Using Surface EMG Signal. In 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA). 497–502. https://doi.org/10.1109/ICMLA.2015.212
  • [74] J. Shang and J. Wu. 2017. A Robust Sign Language Recognition System with Sparsely Labeled Instances Using Wi-Fi Signals. In 2017 IEEE 14th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). 99–107. https://doi.org/10.1109/MASS.2017.41
  • [75] J. Shang and J. Wu. 2017. A Robust Sign Language Recognition System with Sparsely Labeled Instances Using Wi-Fi Signals. In 2017 IEEE 14th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). 99–107. https://doi.org/10.1109/MASS.2017.41
  • [76] S. Song, C. Lan, J. Xing, W. Zeng, and J. Liu. 2016. An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data. arXiv e-prints (Nov. 2016). arXiv:cs.CV/1611.06067
  • [77] Thad Starner and Alex Pentland. 1997. Real-time american sign language recognition from video using hidden markov models. In Motion-Based Recognition. Springer, 227–243.
  • [78] T. Starner, J. Weaver, and A. Pentland. 1997. A wearable computer based American sign language recognizer. In Digest of Papers. First International Symposium on Wearable Computers. 130–137. https://doi.org/10.1109/ISWC.1997.629929
  • [79] T. Starner, J. Weaver, and A. Pentland. 1998. Real-time American sign language recognition using desk and wearable computer based video. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 12 (Dec 1998), 1371–1375. https://doi.org/10.1109/34.735811
  • [80] T. Starner, J. Weaver, and A. Pentland. 1998. Real-time American sign language recognition using desk and wearable computer based video. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 12 (Dec 1998), 1371–1375. https://doi.org/10.1109/34.735811
  • [81] Chao Sun, Tianzhu Zhang, and Changsheng Xu. 2015. Latent Support Vector Machine Modeling for Sign Language Recognition with Kinect. ACM Trans. Intell. Syst. Technol. 6, 2, Article 20 (March 2015), 20 pages. https://doi.org/10.1145/2629481
  • [82] Sanjib Sur, Xinyu Zhang, Parmesh Ramanathan, and Ranveer Chandra. 2016. BeamSpy: Enabling Robust 60 GHz Links Under Blockage. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16). USENIX Association, Santa Clara, CA, 193–206. https://www.usenix.org/conference/nsdi16/technical-sessions/presentation/sur
  • [83] Sheng Tan and Jie Yang. 2016. WiFinger: Leveraging Commodity WiFi for Fine-grained Finger Gesture Recognition. In Proceedings of the 17th ACM International Symposium on Mobile Ad Hoc Networking and Computing (Paderborn, Germany) (MobiHoc ’16). ACM, New York, NY, USA, 201–210. https://doi.org/10.1145/2942358.2942393
  • [84] David Tse and Pramod Viswanath. 2005. Fundamentals of Wireless Communication. Cambridge University Press, New York, NY, USA.
  • [85] Aditya Virmani and Muhammad Shahzad. 2017. Position and Orientation Agnostic Gesture Recognition Using WiFi. In Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services (Niagara Falls, New York, USA) (MobiSys ’17). ACM, New York, NY, USA, 252–264. https://doi.org/10.1145/3081333.3081340
  • [86] Guanhua Wang, Yongpan Zou, Zimu Zhou, Kaishun Wu, and Lionel M. Ni. 2014. We Can Hear You with Wi-Fi!. In Proceedings of the 20th Annual International Conference on Mobile Computing and Networking (Maui, Hawaii, USA) (MobiCom ’14). ACM, New York, NY, USA, 593–604. https://doi.org/10.1145/2639108.2639112
  • [87] Saiwen Wang, Jie Song, Jaime Lien, Ivan Poupyrev, and Otmar Hilliges. 2016. Interacting with Soli: Exploring Fine-Grained Dynamic Gesture Recognition in the Radio-Frequency Spectrum. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (Tokyo, Japan) (UIST ’16). ACM, New York, NY, USA, 851–860. https://doi.org/10.1145/2984511.2984565
  • [88] Wei Wang, Alex X. Liu, and Muhammad Shahzad. 2016. Gait Recognition Using Wifi Signals. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (Heidelberg, Germany) (UbiComp ’16). ACM, New York, NY, USA, 363–373. https://doi.org/10.1145/2971648.2971670
  • [89] Wei Wang, Alex X. Liu, Muhammad Shahzad, Kang Ling, and Sanglu Lu. 2015. Understanding and Modeling of WiFi Signal Based Human Activity Recognition. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking (Paris, France) (MobiCom ’15). ACM, New York, NY, USA, 65–76. https://doi.org/10.1145/2789168.2790093
  • [90] Yan Wang, Jian Liu, Yingying Chen, Marco Gruteser, Jie Yang, and Hongbo Liu. 2014. E-eyes: Device-free Location-oriented Activity Identification Using Fine-grained WiFi Signatures. In Proceedings of the 20th Annual International Conference on Mobile Computing and Networking (Maui, Hawaii, USA) (MobiCom ’14). ACM, New York, NY, USA, 617–628. https://doi.org/10.1145/2639108.2639143
  • [91] Y. Wang, K. Wu, and L. M. Ni. 2017. WiFall: Device-Free Fall Detection by Wireless Networks. IEEE Transactions on Mobile Computing 16, 2 (Feb 2017), 581–594. https://doi.org/10.1109/TMC.2016.2557792
  • [92] Teng Wei and Xinyu Zhang. 2015. mtrack: High-precision passive tracking using millimeter wave radios. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking. ACM, 117–129.
  • [93] Teng Wei and Xinyu Zhang. 2015. mTrack: High-Precision Passive Tracking Using Millimeter Wave Radios. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking (Paris, France) (MobiCom ’15). ACM, New York, NY, USA, 117–129. https://doi.org/10.1145/2789168.2790113
  • [94] Teng Wei and Xinyu Zhang. 2017. Pose Information Assisted 60 GHz Networks: Towards Seamless Coverage and Mobility Support. In Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking (Snowbird, Utah, USA) (MobiCom ’17). ACM, New York, NY, USA, 42–55. https://doi.org/10.1145/3117811.3117832
  • [95] W. Xi, J. Zhao, X. Li, K. Zhao, S. Tang, X. Liu, and Z. Jiang. 2014. Electronic frog eye: Counting crowd using WiFi. In IEEE INFOCOM 2014 - IEEE Conference on Computer Communications. 361–369. https://doi.org/10.1109/INFOCOM.2014.6847958
  • [96] Zhicheng Yang, Parth H. Pathak, Yunze Zeng, Xixi Liran, and Prasant Mohapatra. 2016. Monitoring Vital Signs Using Millimeter Wave. In Proceedings of the 17th ACM International Symposium on Mobile Ad Hoc Networking and Computing (Paderborn, Germany) (MobiHoc ’16). ACM, New York, NY, USA, 211–220. https://doi.org/10.1145/2942358.2942381
  • [97] Y. Ye, Y. Tian, M. Huenerfauth, and J. Liu. 2018. Recognizing American Sign Language Gestures from Within Continuous Videos. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). 2145–214509. https://doi.org/10.1109/CVPRW.2018.00280
  • [98] T. Yuan, S. Sah, T. Ananthanarayana, C. Zhang, A. Bhat, S. Gandhi, and R. Ptucha. 2019. Large Scale Sign Language Interpretation. In 2019 14th IEEE International Conference on Automatic Face Gesture Recognition (FG 2019). 1–5. https://doi.org/10.1109/FG.2019.8756506
  • [99] Zahoor Zafrulla, Helene Brashear, Thad Starner, Harley Hamilton, and Peter Presti. 2011. American Sign Language Recognition with the Kinect. In Proceedings of the 13th International Conference on Multimodal Interfaces (Alicante, Spain) (ICMI ’11). ACM, New York, NY, USA, 279–286. https://doi.org/10.1145/2070481.2070532
  • [100] Yunze Zeng, Parth H. Pathak, and Prasant Mohapatra. 2016. WiWho: Wifi-based Person Identification in Smart Spaces. In Proceedings of the 15th International Conference on Information Processing in Sensor Networks (Vienna, Austria) (IPSN ’16). IEEE Press, Piscataway, NJ, USA, Article 4, 12 pages. http://dl.acm.org/citation.cfm?id=2959355.2959359
  • [101] Ding Zhang, Mihir Garude, and Parth Pathak. 2018. mmChoir: Exploiting Joint Transmissions for Reliable 60GHz mmWave WLANs. In Proceedings of the 19th ACM International Symposium on Mobile Ad Hoc Networking and Computing (Los Angeles, USA) (MobiHoc ’18). ACM, New York, NY, USA, 10.
  • [102] X. Zhang, X. Chen, Y. Li, V. Lantz, K. Wang, and J. Yang. 2011. A Framework for Hand Gesture Recognition Based on Accelerometer and EMG Sensors. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans 41, 6 (Nov 2011), 1064–1076. https://doi.org/10.1109/TSMCA.2011.2116004
  • [103] Yu Zhang and Qiang Yang. 2017. A Survey on Multi-Task Learning. CoRR abs/1707.08114 (2017). arXiv:1707.08114 http://arxiv.org/abs/1707.08114
  • [104] Zhanpeng Zhang, Ping Luo, Chen Change Loy, and Xiaoou Tang. 2014. Facial Landmark Detection by Deep Multi-task Learning. In Computer Vision – ECCV 2014, David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars (Eds.). Springer International Publishing, Cham, 94–108.
  • [105] Wentao Zhu, Cuiling Lan, Junliang Xing, Wenjun Zeng, Yanghao Li, Li Shen, and Xiaohui Xie. 2016. Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks. CoRR abs/1603.07772 (2016). arXiv:1603.07772 http://arxiv.org/abs/1603.07772
  • [106] Yanzi Zhu, Yibo Zhu, Ben Y. Zhao, and Haitao Zheng. 2015. Reusing 60GHz Radios for Mobile Radar Imaging. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking (Paris, France) (MobiCom ’15). ACM, New York, NY, USA, 103–116. https://doi.org/10.1145/2789168.2790112
  • [107] H. Zou, Y. Zhou, J. Yang, H. Jiang, L. Xie, and C. J. Spanos. 2018. DeepSense: Device-Free Human Activity Recognition via Autoencoder Long-Term Recurrent Convolutional Network. In 2018 IEEE International Conference on Communications (ICC). 1–6. https://doi.org/10.1109/ICC.2018.8422895