Basic Information
Bio
Professor Gales' research aims to make speech systems simple and intuitive to use while achieving high levels of accuracy and naturalness. His research interests include automatic speech recognition (converting an audio waveform into text) and speech synthesis (converting text into an audio waveform). In addition, he investigates downstream tasks that these technologies enable, such as spoken language learning and assessment.
Although speech recognition systems are increasingly widely deployed, the domains in which they operate remain quite limited, for example spoken search terms and broadcast news transcription. To broaden the range of applications, it is necessary to develop techniques that handle the diversity of spoken communication and the broad range of environments in which these systems must operate.
Speech synthesis systems have been deployed for many years. They can now deliver clear, understandable speech, but they lack the ability to convey the full range of expression found in human speech. To achieve human levels of information transfer by speech, Professor Gales is investigating expression-rich, controllable synthesis.
A fundamental aspect of both of these tasks is the need to add and exploit structure in the modeling of speech. For example, by explicitly factoring a synthesis model into speaker characteristics, sentence pronunciation, and sentence expression, it is possible to control exactly how a sentence is uttered: both its expression (for example, happy or angry) and the speaker's voice.
Papers (529 in total)
arXiv (2024)
arXiv (Cornell University) (2024)
Exploring AI in Applied Linguistics, pp. 96-117 (2024)
Guanfeng Wu, Abbas Haider, Xing Tian, Erfan Loweimi, Chi Ho Chan, Mengjie Qian, Awan Muhammad, Ivor Spence, Rob Cooper, Wing W. Y. Ng, Josef Kittler, Mark Gales, Hui Wang
IET Computer Vision, no. 7 (2024): 1017-1033
Interspeech 2024, pp. 3774-3778 (2024)
CoRR (2024)
Interspeech 2024, pp. 3375-3379 (2024)
arXiv (Cornell University) (2024)
CoRR (2024)
Author Statistics
#Papers: 530
#Citation: 23933
H-Index: 62
G-Index: 140
Sociability: 7
Diversity: 1
Activity: 4