谷歌浏览器插件
订阅小程序
在清言上使用

Semi-Supervised Linear Regression

Journal of the American Statistical Association(2021)

引用 6|浏览44
暂无评分
摘要
We study a regression problem where for some part of the data we observe both the label variable (Y) and the predictors (X), while for other part of the data only the predictors are given. Such a problem arises, for example, when observations of the label variable are costly and may require a skilled human agent. When the conditional expectation E[Y vertical bar X] is not exactly linear, one can consider the best linear approximation to the conditional expectation, which can be estimated consistently by the least-square estimates (LSE). The latter depends only on the labeled data. We suggest improved alternative estimates to the LSE that use also the unlabeled data. Our estimation method can be easily implemented and has simply described asymptotic properties. The new estimates asymptotically dominate the usual standard procedures under certain non-linearity condition of E[Y vertical bar X]; otherwise, they are asymptotically equivalent. The performance of the new estimator for small sample size is investigated in an extensive simulation study. A real data example of inferring homeless population is used to illustrate the new methodology.
更多
查看译文
关键词
Linear regression,Misspecified models,Semi-supervised learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要