How to simulate outliers with the desired properties

Chemometrics and Intelligent Laboratory Systems (2021)

Abstract
Deviating multivariate observations are typically used to test the performance of outlier detection methods. Yet the generation of the outlying cases usually appears as a secondary methodological step in method comparisons. In the literature, outliers are defined using distribution parameters that differ from those of the clean or reference data. However, these parameters vary among authors, so there is no standard, measurable definition of the characteristics of simulated outliers. This makes comparisons between methods difficult and makes their results depend on the procedure used to simulate the data. To establish a standard procedure, a framework for simulating outliers is defined here. Because it is based on specifications for both the Squared Prediction Error (SPE) and Hotelling’s T² statistics from a Principal Component Analysis (PCA) model, tuning them becomes a simple and efficient task. This procedure has been implemented in a set of Matlab functions.
Keywords
PCA, Outliers, Squared prediction error, Hotelling’s T², Simulation, Matlab
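
For readers unfamiliar with the two statistics named in the abstract, the following is a minimal sketch of how SPE and Hotelling’s T² are commonly computed from a PCA model. It is not the authors' Matlab implementation and does not reproduce their simulation framework; the function names (`fit_pca`, `spe_and_t2`), the use of Python with NumPy, the synthetic reference data, and the choice of two components are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a PCA model on mean-centered data via SVD (illustrative helper)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                              # loadings: variables x components
    eigvals = s[:n_components] ** 2 / (X.shape[0] - 1)   # variance of each score
    return mean, P, eigvals

def spe_and_t2(X, mean, P, eigvals):
    """Return SPE and Hotelling's T2 for each row of X under the PCA model."""
    Xc = X - mean
    T = Xc @ P                              # scores in the model subspace
    E = Xc - T @ P.T                        # residuals outside the model subspace
    spe = np.sum(E ** 2, axis=1)            # squared distance to the model plane
    t2 = np.sum(T ** 2 / eigvals, axis=1)   # variance-scaled distance within the plane
    return spe, t2

# Synthetic reference ("clean") data, assumed here purely for demonstration.
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
mean, P, eigvals = fit_pca(X_ref, n_components=2)
spe_ref, t2_ref = spe_and_t2(X_ref, mean, P, eigvals)
```

In this sketch, an observation displaced along a direction orthogonal to the loadings raises SPE while leaving T² essentially unchanged, and a displacement within the model subspace does the opposite. This separation is what makes the two statistics convenient targets when specifying how an outlier should deviate.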