Basic Information

Bio
Two broad directions interest me most, stemming from two distinct points of view I hold.
First, I am curious about the inner workings of (Large) Language Models. We can throw together a nice-looking loss function, a reasonable training loop, some compute, and lots of data, and voilà! A model starts generating near-fluent text. But what does it learn? Does it reverse-engineer the rules of grammar? In this context, I am interested in two counterposing approaches:
How can we best port human knowledge of natural language (e.g., linguistic structure, disambiguation of context) to a Language Model by modifying the model, the training process, and/or the data? More practically, can this lead us to better parameter and data efficiency?
Humans find it hard to learn languages without any visual cues or explanations, but it is easy (for a generous definition of easy) for LMs to do so. Do they know something we don't? Can we reverse-engineer more efficient ways to think about language from them? This is a more abstract question that nonetheless excites me as much as the previous one.
Second, as LMs become more commonplace, their potential for both benefit and harm is bound to increase. We want them to be helpful, factual and relevant, among other desiderata. I am interested in exploring how we can best steer the models towards the behavior we want, and away from undesirable and harmful behavior (e.g. hallucinations).
More generally, NLP research is fascinating in its own right. Many of the current challenges (think ChatGPT hallucinations, lack of logical reasoning, and so on) are daunting, but by the same token quite thrilling. I believe that going forward, principled approaches that generalize well are likely to be the ones that power through them.
Papers (9)
- CoRR (2024)
- arXiv (Cornell University) (2024)
- ACL (1), pp. 14351-14368 (2024)
- NeurIPS 2024
- SIGNAL PROCESSING (2024)
- EMNLP 2023, pp. 7053-7074
- arXiv.org (2022)
Author Statistics
#Papers: 9
#Citation: 74
H-Index: 3
G-Index: 5
Sociability: 3
Diversity: 1
Activity: 8