Automatic Identification of Causal Factors from Fall-Related Accident Investigation Reports Using Machine Learning and Ensemble Learning Approaches

Haonan Qi, Zhipeng Zhou, Javier Irizarry, Dong Lin, Haoyu Zhang, Nan Li, Jianqiang Cui

Journal of Management in Engineering (2024)

Abstract
To improve learning from past fall-related accidents, this study developed an innovative framework for automatically extracting each individual causal factor from accident investigation reports, based on a modified framework of the human factors analysis and classification system (HFACS). Multiple techniques, including the synthetic minority oversampling technique (SMOTE) for handling imbalanced data, soft voting with unequal weights for ensemble learning, and hyperparameter optimization, were adopted to improve the automatic identification of causal factors from unstructured text data. Experimental results showed that no single classifier achieved the best accuracy and F1 score across all 19 subcategories of causal factors. Therefore, one or more specific classifiers were preferred for predicting each specific causal factor with the best performance. Further comparative analyses of seven classifiers demonstrated that the ensemble learning model based on soft voting (ELSV) provided more stable predictions, with low variance across different causal factors, than the individual machine learning models. The ELSV should therefore be prioritized when all 19 causal factors are identified collectively. These findings support substantial learning from past fall-related accidents with high efficiency and reliability, and the resulting insights can be used to control the risk of falls from height at construction sites.

This study proposes an innovative framework based on multiple machine learning models (i.e., support vector machine, naive Bayes, decision tree, k-nearest neighbors, random forest, and multilayer perceptron) and one ensemble learning approach. Several techniques (i.e., SMOTE for handling imbalanced data, soft voting with unequal weights for ensemble learning, and hyperparameter optimization) were used to improve the automatic identification of causal factors.
It was found that no single classifier performed best for all 19 subcategories of causal factors. Comparative analysis of the seven classifiers demonstrated that the ensemble learning approach provided more stable predictions, with low variance across the causal factors, than the individual machine learning models. This innovative framework provides a feasible method for automatically identifying causal factors from fall-from-height postaccident investigation reports at construction workplaces. It reduces the time and subjectivity involved in a manual process, enhancing the efficiency and reliability of extracting causal factors. It also satisfies the requirement that an investigation should be carried out as quickly as possible after an accident. Safety managers on site can then adopt corrective and preventive measures to deal with causal factors immediately, in order to effectively reduce falling risks in the construction industry.
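
The two techniques the abstract names most concretely, soft voting with unequal weights and SMOTE-style oversampling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the example probabilities, and the weights are hypothetical, and the SMOTE helper shows only the interpolation step, assuming a minority-class neighbor has already been chosen.

```python
import random


def weighted_soft_vote(probas, weights):
    """Combine per-classifier class probabilities using unequal weights.

    probas:  one probability vector per classifier (same class order).
    weights: one non-negative weight per classifier (hypothetical values).
    Returns (predicted class index, weighted-average probabilities).
    """
    if len(probas) != len(weights):
        raise ValueError("one weight per classifier is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probas[0])
    avg = [
        sum(w * p[c] for p, w in zip(probas, weights)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=avg.__getitem__), avg


def smote_like_sample(x, neighbor, rng):
    """Create one synthetic minority sample on the segment from x to neighbor."""
    gap = rng.random()  # interpolation factor in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers for one report and one
    # causal factor: [P(factor absent), P(factor present)].
    probas = [[0.6, 0.4], [0.3, 0.7], [0.45, 0.55]]
    print(weighted_soft_vote(probas, weights=[1, 1, 1]))  # equal weights
    print(weighted_soft_vote(probas, weights=[3, 1, 1]))  # first model favored
    print(smote_like_sample([0.0, 1.0], [1.0, 3.0], random.Random(0)))
```

With these made-up inputs, favoring the first classifier changes the predicted class, which is the effect unequal weights are meant to have. How the weights were actually chosen in the study is not stated in the abstract.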
Keywords
Fall-related accident, Causal factor, Human factors analysis and classification system (HFACS) model, Machine learning, Ensemble learning