Competence Level Prediction and Resume & Job Description Matching Using Context Aware Transformer Models

Changmao Li
Elaine Fisher
Rebecca Thomas
Steve Pittard
Vicki Hertzberg

EMNLP 2020.


Abstract:

This paper presents a comprehensive study on resume classification to significantly reduce the time and labor needed to screen an overwhelming number of applications, while improving the selection of suitable candidates. A total of 6,492 resumes are extracted from 24,933 job applications for 252 positions designated into four levels of ex...

Introduction
  • An ongoing challenge for Human Resource (HR) is the process used to screen and match applicants to a target job description with a goal of minimizing recruiting time while maximizing proper matches.
  • NLP models allow for a comprehensive analysis of resumes and identification of latent concepts that may go unnoticed in a typical manual process.
  • A model’s ability to infer core skills and qualifications from resumes can be used to normalize necessary content into standard concepts for matching with stated position requirements (Chifu et al., 2017; Valdez-Almada et al., 2018).
  • The task of resume classification has been under-explored due to the lack of resources for individual research labs and the heterogeneous nature of job solicitations.
Highlights
  • An ongoing challenge for Human Resource (HR) is the process used to screen and match applicants to a target job description with a goal of minimizing recruiting time while maximizing proper matches
  • 70% of the data are annotated with the entry levels, CRC1 and CRC2, which is not surprising since 77.3% of the applications are submitted for those two levels
  • The ratio of CRC4 is notably lower than the application ratio submitted to that level, 6.8%, implying that applicants tend to apply to jobs for which they are not qualified. 13.9% of the applicants are Not Qualified (NQ); if our model detects even that portion robustly, it can remarkably reduce human labor
  • C shows a 1.8% greater improvement than P, implying that the additional context used in C is essential for this task
  • This paper proposes two novel tasks, competence-level classification (T1) and resume-description matching (T2), and provides a high-quality dataset as well as robust models using several transformer-based approaches
  • The accuracies achieved by our best models, 73.3 for T1 and 79.2 for T2, show good promise for these models to be deployed in real HR systems
Methods
  • Table 5 shows the data split used to develop models for the competence-level classification task (T1).
  • The annotated data in the row Cr of Table 2 are split into the training (TRN), development (DEV), and test (TST) sets with ratios of 75:10:15, while keeping similar label distributions across all sets.
  • Table 6 shows the data split used for the resumeto-job_description matching task (T2).
  • The same ratios of 75:10:15 are applied to generate the TRN: DEV:TST sets, respectively.
  • Algorithm 1 is designed to avoid any overlap of applicants across datasets while keeping similar label distributions (Appendix A.1)
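The split procedure above can be sketched as follows. This is an illustrative approximation of Algorithm 1, not the authors' code: the record format (an applicant id paired with a label) and the per-label stratification strategy are assumptions.

```python
import random
from collections import defaultdict

def split_by_applicant(records, ratios=(0.75, 0.10, 0.15), seed=42):
    """Split applicants into TRN/DEV/TST so no applicant appears in two
    partitions, stratifying by label to keep distributions similar.
    records: iterable of (applicant_id, label) pairs."""
    # assign each applicant one label (the first one seen)
    label_of = {}
    for applicant_id, label in records:
        label_of.setdefault(applicant_id, label)

    # bucket applicants by label, then split each bucket 75:10:15
    by_label = defaultdict(list)
    for applicant_id, label in label_of.items():
        by_label[label].append(applicant_id)

    rng = random.Random(seed)
    trn, dev, tst = [], [], []
    for applicants in by_label.values():
        rng.shuffle(applicants)
        n = len(applicants)
        a = round(n * ratios[0])
        b = a + round(n * ratios[1])
        trn += applicants[:a]
        dev += applicants[a:b]
        tst += applicants[b:]
    return trn, dev, tst
```

Splitting at the applicant level, rather than the resume level, is what prevents the same person's material from leaking between training and test sets.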
Results
  • Labeling accuracy is used as the evaluation metric for all the experiments. Each model is developed three times, and the average score as well as the standard deviation are reported.
  • Table 7 shows the results for T1 achieved by the models in Sec. 5.2.
  • All context-aware models without section encoding perform significantly better, 1.5% with section pruning (P) and 3.3% with chunk segmenting (C), than the baseline model (Wr).
  • C shows a 1.8% greater improvement than P, implying that the additional context used in C is essential for this task.
  • C⊕I shows a 4.2% improvement over Wr and gives the least variance, 0.16.
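The reporting convention above (three runs per model, mean and standard deviation) amounts to the following; the scores in the example are placeholders, not numbers from the paper.

```python
import statistics

def report(scores):
    """Format per-run accuracies as 'mean ± sample standard deviation'."""
    return f"{statistics.mean(scores):.1f} ± {statistics.stdev(scores):.2f}"

# three hypothetical runs of one model
print(report([73.1, 73.5, 73.3]))
```

Note that `statistics.stdev` computes the sample (n−1) standard deviation; with only three runs, the choice between sample and population deviation noticeably affects the reported spread.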
Conclusion
  • This paper proposes two novel tasks, competence-level classification (T1) and resume-description matching (T2), and provides a high-quality dataset as well as robust models using several transformer-based approaches.
  • The accuracies achieved by the best models, 73.3 for T1 and 79.2 for T2, show good promise for these models to be deployed in real HR systems.
  • To the best of our knowledge, this is the first time that these two tasks are thoroughly studied, especially with the latest transformer architectures.
  • The authors will continue to improve these models by integrating experts’ knowledge
Tables
  • Table1: Descriptions (and general responsibilities) of the four levels of CRC positions
  • Table2: The counts of applications (A), unique resumes for each level (B), unique resumes across all levels (C), and resumes from B and C selected for our research while preserving level proportions (Br and Cr)
  • Table3: The existence ratio of each section in the CRC levels. WoE: Work Experience, EDU: Education, PRO: Profile, ACT: Activities, SKI: Skills, OTH: Others
  • Table4: Fleiss Kappa scores measured for ITA during the five rounds of guideline development (R1-5). No annotation of CRC4 is found in the batch used for R4. The negative kappa scores are achieved for (CRC1, R3) and (CRC4, R5) that have too few samples (≤ 2)
  • Table5: Data statistics for the competence-level classification task (T1) in Section 4.1
  • Table6: Data statistics for the resume-to-job_ description matching task (T2) in Section 4.2. Y/N: applicants whose applied CRC levels match/do not match our annotated label, respectively
  • Table7: Accuracy (± standard deviation) on the development (DEV) and test (TST) sets for T1, achieved by the models in Section 5.2. δ: delta over Wr on TST
  • Table8: Accuracy (± standard deviation) on the development (DEV) and test (TST) sets for T2, achieved by the models in Section 5.2. δ: delta over Wr on TST
  • Table9: Hyperparameters. L: TE input length; GAS: gradient accumulation steps; BS: batch size; LR: learning rate; E: number of training epochs; T: approximate training time (h: hours); PS: approximate model parameter size
  • Table10: Section lengths before section pruning (Section 4.1.2). Average/Max: the average and max lengths of input sections. Ratio: the ratios of input sections that are under the max-input length restricted by the transformer encoder
  • Table11: Section lengths after section pruning (Section 4.1.2). Average/Max: the average and max lengths of input sections. Ratio: the ratios of input sections that are under the max-input length restricted by the transformer encoder
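Tables 10 and 11 refer to the max-input length restricted by the transformer encoder. A minimal sketch of the chunk-segmenting idea mentioned in the results (splitting a long token sequence into fixed-size pieces so every piece fits the encoder) might look like this; the 512-token limit is an assumed BERT-style value, not a figure stated here.

```python
def segment_chunks(tokens, max_len=512):
    """Split a token sequence into consecutive chunks of at most max_len
    tokens, so each chunk fits within the encoder's input limit."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]

# a resume of 1,000 tokens becomes two chunks of 512 and 488 tokens
chunks = segment_chunks(list(range(1000)))
```

Each chunk would then be encoded separately and the chunk representations aggregated downstream; overlapping strides are a common variant when context at chunk boundaries matters.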
Related work
  • Limited studies have been conducted on the task of resume classification. Zaroor et al. (2017) proposed a job-post and resume classification system that integrated a knowledge base to match 2K resumes with 10K job posts. Sayfullina et al. (2017) presented a convolutional neural network (CNN) model to classify 90K job descriptions, 523 resume summaries, and 98 children’s dream job descriptions into 27 job categories. Nasser et al. (2018) hierarchically segmented resumes into sub-domains, especially for technical positions, and developed a CNN model to classify 500 job descriptions and 2K resumes.

    Prior studies in this area have focused on classifying resumes or job descriptions into occupational categories (e.g., data scientist, healthcare provider). However, no work has yet been found to distinguish resumes by levels of competence. Furthermore, we believe that our work is the first to analyze resumes together with job descriptions to determine whether or not the applicants are suitable for particular jobs, which can significantly reduce the intensive labor performed daily by HR recruiters.
Funding
  • C with multi-head attention (C⊕I⊕J⊕A) shows a significant improvement of 4.6% over its counterpart, which is very encouraging
Reference
  • Emil St. Chifu, Viorica R. Chifu, Iulia Popa, and Ioan Salomie. 2017. A System for Detecting Professional Skills from Resumes Written in Natural Language. In Proceedings of the IEEE International Conference on Intelligent Computer Communication and Processing, ICCP’17, pages 189–196.
  • Yu Deng, Hang Lei, Xiaoyu Li, and Yiou Lin. 2018. An Improved Deep Neural Network Model for Job Matching. In Proceedings of the International Conference on Artificial Intelligence and Big Data, ICAIBD’18.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
  • Melanie A. Myers. 2019. Healthcare Data Scientist Qualifications, Skills, and Job Focus: A Content Analysis of Job Postings. Journal of the American Medical Informatics Association, 26(5):383–391.
  • S. Nasser, C. Sreejith, and M. Irshad. 2018. Convolutional Neural Network with Word Embedding Based Approach for Resume Classification. In 2018 International Conference on Emerging Trends and Innovations In Engineering And Technological Research (ICETIETR), pages 1–6.
  • Luiza Sayfullina, Eric Malmi, Yiping Liao, and Alexander Jung. 2017. Domain Adaptation for Resume Classification Using Convolutional Neural Networks. In International Conference on Analysis of Images, Social Networks and Texts, pages 82–93. Springer.
  • Darin Stewart. 2019. Understanding Your Customers by Using Text Analytics and Natural Language Processing. Gartner Research, G00373854.
  • Rogelio Valdez-Almada, Oscar M. Rodriguez-Elias, Cesar E. Rose-Gomez, Maria D. J. Velazquez-Mendoza, and Samuel Gonzalez-Lopez. 2018. Natural Language Processing and Text Mining to Identify Knowledge Profiles for Software Engineering Positions: Generating Knowledge Profiles from Resumes. In Proceedings of the International Conference in Software Engineering Research and Innovation, CONISOFT’18.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, pages 6000–6010, USA. Curran Associates Inc.
  • A. Zaroor, M. Maree, and M. Sabha. 2017. JRC: A Job Post and Resume Classification System for Online Recruitment. In 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), pages 780–787.