Biography
Research
Alignment. Alignment is the problem of ensuring that machine learning models conform to human values and intentions. My previous work focuses on collaborative alignment, where multiple individuals engage with the model and with each other to align the model to their preferences without interfering with other users. I envision a future where NLP models are developed collaboratively, much like open-source software or Wikipedia, benefiting from diverse user inputs for improved quality and fairness. For this scenario to materialize, we need to help users convey knowledge and verify the impact of their proposed changes to models, analogous to “diffs” or “regression tests” in software development. CoDev is a small step in this direction. In TACL2018, we showed how humans use pragmatics, planning, and inference in dialogue to convey information.
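As a rough illustration of the “regression test” idea, the sketch below checks whether a proposed model edit breaks test cases contributed by other users. All names here (`TestCase`, `run_suite`, `check_edit`) are hypothetical illustrations, not the CoDev API.

```python
# A minimal sketch, assuming users contribute test suites of expected behavior.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class TestCase:
    prompt: str
    expected: str          # behavior the contributing user wants preserved

Model = Callable[[str], str]   # any text-in, text-out model

def run_suite(model: Model, suite: List[TestCase]) -> List[TestCase]:
    """Return the test cases the model currently fails."""
    return [t for t in suite if model(t.prompt) != t.expected]

def check_edit(old: Model, new: Model,
               suites: Dict[str, List[TestCase]]) -> Dict[str, int]:
    """Like a regression test: count failures newly introduced per user."""
    regressions = {}
    for user, suite in suites.items():
        before = {t.prompt for t in run_suite(old, suite)}
        after = {t.prompt for t in run_suite(new, suite)}
        regressions[user] = len(after - before)  # broken only by the edit
    return regressions
```

An edit would be accepted only if it fixes the proposing user's cases while introducing zero regressions for everyone else, mirroring how a software diff must pass the existing test suite.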
Robustness. One main concern about deploying ML models in the real world is their poor performance in new domains, even when a new domain differs only slightly from the training domain. In my previous work, we showed how to leverage unlabeled data through pretraining and fine-tuning to achieve better performance in a new domain (ICLR2021), and how to use unlabeled data to make a model robust against spurious features (FAccT2021). In ACL2023, we showed how to use LLMs as a large pool of unlabeled data to augment groups on which the model performs poorly.
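A minimal sketch of the augmentation idea follows, assuming a `generate_examples` function that wraps an LLM call; the threshold-based selection rule is an illustration, not the exact ACL2023 procedure.

```python
# Sketch: find groups where validation accuracy is low, then ask an LLM
# (via the assumed `generate_examples` callable) for extra training data.
from collections import defaultdict

def group_accuracy(preds, labels, groups):
    """Per-group accuracy on a labeled validation set."""
    hits, totals = defaultdict(int), defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        totals[g] += 1
        hits[g] += int(p == y)
    return {g: hits[g] / totals[g] for g in totals}

def augment_weak_groups(preds, labels, groups, generate_examples,
                        threshold=0.8, n_new=100):
    """Generate extra training examples for groups below the threshold."""
    new_data = []
    for g, acc in group_accuracy(preds, labels, groups).items():
        if acc < threshold:
            new_data.extend(generate_examples(group=g, n=n_new))
    return new_data
```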
Reliability. How can we build a reliable model that knows when it does not know? I am interested in selective classification, where a model can abstain when it cannot make a correct prediction with high confidence. Previously, I proposed a model that predicts only when all models consistent with the training data unanimously agree, resulting in 100% precision (ACL2016). However, as models become more complex, checking unanimous agreement is no longer tractable. In NeurIPS2022, we showed how to find two disjoint models and predict only when those two models agree.
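The agreement rule itself is easy to sketch. The toy example below trains two logistic-regression models on disjoint halves of the feature space and abstains whenever they disagree; the actual NeurIPS2022 procedure for constructing the pair is more involved.

```python
# Agreement-based selective classification on synthetic data (scikit-learn).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Disjoint "views": each model sees a different half of the features.
m1 = LogisticRegression(max_iter=1000).fit(X_tr[:, :10], y_tr)
m2 = LogisticRegression(max_iter=1000).fit(X_tr[:, 10:], y_tr)

p1 = m1.predict(X_te[:, :10])
p2 = m2.predict(X_te[:, 10:])

agree = p1 == p2                     # abstain wherever the models disagree
coverage = agree.mean()
selective_acc = (p1[agree] == y_te[agree]).mean()
print(f"coverage={coverage:.2f}, selective accuracy={selective_acc:.2f}")
```

Abstaining on disagreement trades coverage for reliability: the predictions the pair does retain tend to be more accurate than either model alone.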
Fairness. ML models can lead to discrimination, and in light of their increasing prevalence, it is necessary to address this problem. In my research, I try to understand how we can investigate and mitigate discrimination by ML models, and how to study the feedback loops they create. Previously, I have identified unexpected mechanisms that cause ML models to exhibit discrimination: for example, adding the same amount of feature noise to all individuals (ICML2020) or the inductive bias of overparameterized regimes (FAccT2021) can both lead to discrimination. In addition, for scenarios where protected groups are not known a priori, or where there is an exponential number of such groups, I characterized which statistical measures can capture loss discrepancy among groups (SafeML, ICLR2019).
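One simple instance of such a measure, shown purely for illustration, is the gap between the worst group's average loss and the overall average loss; the sketch below computes it in the easy case where group labels are available. The SafeML/ICLR2019 analysis concerns which measures of this kind remain estimable when groups are unknown or exponentially many.

```python
# Worst-group loss discrepancy, assuming known group labels.
import numpy as np

def loss_discrepancy(losses: np.ndarray, groups: np.ndarray) -> float:
    """Worst-group mean loss minus overall mean loss."""
    overall = losses.mean()
    worst = max(losses[groups == g].mean() for g in np.unique(groups))
    return worst - overall

# Example: group 1 incurs systematically higher loss.
losses = np.array([0.1, 0.2, 0.1, 0.9, 0.8, 1.0])
groups = np.array([0, 0, 0, 1, 1, 1])
print(loss_discrepancy(losses, groups))  # 0.9 - 0.517 ≈ 0.383
```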