Machine learning in health informatics: making better use of domain experts (2012)
We present novel machine learning and data mining methods that make real-world learning systems more efficient. We focus on the domain of clinical informatics, an archetypical example of a field overwhelmed with information. Due to properties inherent to clinical informatics tasks – and indeed, to many tasks that require specialized domai...
- Background and Related Work
Active learning strategies exploit an expert ‘in-the-loop’ during classifier training.
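The expert-in-the-loop idea can be made concrete with pool-based uncertainty sampling, the most common active learning strategy: at each step, the expert is asked to label the instance the current classifier is least sure about. The sketch below is a minimal illustration, not the thesis's implementation; `predict_proba` stands in for any model that returns P(relevant).

```python
# Minimal sketch of pool-based uncertainty sampling (illustrative only).
# `predict_proba` is a hypothetical stand-in for any classifier that
# returns the probability that an instance is relevant.

def uncertainty(p):
    """Distance from maximal uncertainty (p = 0.5); smaller = less certain."""
    return abs(p - 0.5)

def pick_query(pool, predict_proba):
    """Return the index of the unlabeled instance the model is least sure about."""
    return min(range(len(pool)), key=lambda i: uncertainty(predict_proba(pool[i])))

# Toy usage: a fake scorer that treats the instance's value as P(relevant).
pool = [0.1, 0.45, 0.9]
print(pick_query(pool, lambda x: x))  # -> 1, the instance closest to p = 0.5
```

The chosen instance is then labeled by the expert, added to the training set, and the model is re-trained before the next query.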
- We have introduced the task of citation screening for systematic reviews, which will motivate the data mining problems tackled in the remainder of this thesis.
- Beyond the immediate problem of citation screening, these issues are inherent to many real-world tasks in which machine learning has the potential to reduce human workload, and solving them has broad implications.
- The latter involves estimating the probability that a given instance belongs to a specific class, as opposed to predicting a class label outright.
- We have shown that machine learning methodologies can reduce the burden of updating systematic reviews without sacrificing their comprehensiveness.
- We describe our work on abstrackr, an open-source, web-based annotation tool for citation screening that integrates our machine learning methods into a GUI for conducting systematic reviews.
- Our goal in developing abstrackr has been to create a practical means of deploying the machine learning technologies that we have developed to researchers undertaking systematic reviews, i.e., screening citations.
- Re-training a model on all of the labeled data for a given review, in order to re-calculate the active learning score for each instance in the unlabeled pool, can incur a substantial computational cost; re-prioritizing the unlabeled citations after every new label is therefore quite slow.
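One common way to amortize this cost (a sketch of the general idea, not necessarily abstrackr's exact mechanism) is to re-train and re-rank only every k labels, serving queries from the stale ranking in between. The `retrain` callable below is a hypothetical stand-in for fitting the model and returning a scoring function.

```python
# Sketch: amortize re-training cost by re-ranking only every k-th label.
# `retrain` is a stand-in: labeled data -> scoring function over citations.

class BatchedRanker:
    def __init__(self, retrain, k=25):
        self.retrain = retrain
        self.k = k                  # labels between full re-ranks
        self.since_rerank = 0
        self.score = None

    def observe(self, labeled):
        """Call after each new label; re-train only on the first call and
        every k-th call thereafter, keeping a stale ranking in between."""
        self.since_rerank += 1
        if self.score is None or self.since_rerank >= self.k:
            self.score = self.retrain(labeled)
            self.since_rerank = 0

    def rank(self, pool):
        """Order the unlabeled pool by the (possibly stale) current scores."""
        return sorted(pool, key=self.score, reverse=True)
```

With k = 25, a review of 5,000 citations triggers roughly 200 full re-trainings instead of 5,000, at the price of slightly stale rankings between updates.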
- Reviewers screened these citations interactively, in decreasing order of their likelihood of being relevant, as predicted by the machine learning model.
- In this chapter we have presented practical results regarding the application of the machine learning methodologies developed in this thesis to the task of citation screening for systematic reviews.
- We are presently curating a large set of systematic review datasets in order to perform a large-scale verification of the proposed technologies on new reviews; in this future evaluation we will exploit active learning and dual supervision for reviews for which labeled terms are available.
- In Chapter 4 we proposed a novel active learning method that makes the best use of a given group of experts with varying cost and expertise, i.e., at each step in AL, we pick who is to do the labeling in addition to which instance is to be labeled (167).
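The core decision can be sketched as jointly scoring (instance, expert) pairs by expected value per unit cost. This is an illustrative heuristic, not the method of Chapter 4 itself; the per-expert accuracy and cost figures are assumed inputs.

```python
# Hedged sketch: jointly pick which instance to label AND which expert
# should label it, trading off each expert's (assumed) accuracy against
# their cost. The numbers in the example are illustrative only.

def pick_instance_and_expert(uncertainties, experts):
    """experts: list of (name, accuracy, cost) tuples.
    Returns (instance_index, expert_name) maximizing a simple
    value-per-cost heuristic: uncertainty * accuracy / cost."""
    best = None
    for i, u in enumerate(uncertainties):
        for name, acc, cost in experts:
            value = u * acc / cost
            if best is None or value > best[0]:
                best = (value, i, name)
    return best[1], best[2]

experts = [("novice", 0.70, 1.0), ("expert", 0.95, 3.0)]
print(pick_instance_and_expert([0.1, 0.5, 0.3], experts))  # -> (1, 'novice')
```

Note how the cheap annotator wins here: the expert's accuracy advantage does not justify triple the cost on a moderately uncertain instance.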
- In Chapter 5 we proposed a novel, co-testing based approach for dually supervised active learning (165), and showed that exploiting dual supervision to guide the AL process can improve classifier performance in imbalanced scenarios.
- In Chapter 4, we proposed a novel means of evaluating the performance of
- But this ostensibly good calibration belies the unreliability of the probability estimates for the minority instances. One can see this by looking at the middle plot, which includes only minority instances. In this case, the estimates diverge strikingly from the observed labels; indeed, the model assigned a probability of belonging to the minority class of less than 20% to most of the minority instances.
- The results show that the improvement in calibration performance from using the undersampled/bagged strategy is greater, compared to standard Platt, when prevalence is low. This is a statistically significant finding (p < 0.001), and is what we would expect due to Equation 2.21.
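The undersampling-and-bagging recipe can be sketched as: fit a calibrator on several class-balanced subsamples of the scored data, then average the resulting probability estimates. This is a minimal sketch of the general structure, not the thesis's implementation; `fit_calibrator` is a stand-in for any Platt-style sigmoid-fitting routine.

```python
# Sketch of undersampled/bagged calibration. `fit_calibrator` is a
# hypothetical stand-in: (scores, labels) -> calibration function.
import random

def bagged_calibrator(scores, labels, fit_calibrator, n_bags=10, seed=0):
    rng = random.Random(seed)
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    calibrators = []
    for _ in range(n_bags):
        # undersample the majority class to match the minority class size
        sub_neg = rng.sample(neg, k=min(len(pos), len(neg)))
        xs = pos + sub_neg
        ys = [1] * len(pos) + [0] * len(sub_neg)
        calibrators.append(fit_calibrator(xs, ys))
    # final estimate: average over the bagged calibrators
    return lambda s: sum(c(s) for c in calibrators) / len(calibrators)
```

Balancing each bag prevents the calibrator from being dominated by the majority class, which is exactly the failure mode the middle plot illustrates; averaging over bags reduces the variance introduced by undersampling.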
- The difference is statistically significant for both reviewers (p < 0.0001).
- It is also encouraging that our predicted time strategy, which learns to predict how long it’s going to take to label citations online (i.e., during active learning), performs comparably to the true time strategy, which uses the ‘true’ model coefficients β, as learned over the entire labeled dataset. This is in contrast to previous work (146) in which the predictive model was not sufficiently accurate to achieve the same performance as when the true times were used.
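The time-sensitive scoring idea amounts to normalizing each candidate's informativeness by its predicted annotation time, so a slow-to-label citation must be proportionally more informative to be queried. The sketch below is illustrative; `predict_seconds` stands in for the learned labeling-time model.

```python
# Sketch of cost-sensitive query selection: informativeness per predicted
# second of annotation time. `predict_seconds` is a stand-in for the
# learned labeling-time model; the pool values below are illustrative.

def time_weighted_query(pool, informativeness, predict_seconds):
    """Pick the candidate with the best informativeness-per-second ratio."""
    return max(pool, key=lambda c: informativeness(c) / predict_seconds(c))

# Illustrative use: citations as (informativeness, predicted_seconds) pairs.
pool = [(0.9, 60.0), (0.6, 10.0), (0.4, 5.0)]
best = time_weighted_query(pool, lambda c: c[0], lambda c: c[1])
print(best)  # -> (0.4, 5.0): 0.08/s beats 0.06/s and 0.015/s
```

Note that the most informative citation loses here: an hour-long abstract must be far more informative than a five-second one to be worth the expert's time.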
- In Section 6.1, we presented results from a realistic prospective evaluation of applying the semi-automated approach to update existing systematic reviews. We demonstrated that this can reduce workload substantially – by up to 90%, in some cases – without missing relevant articles.