JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching
arXiv (Cornell University)(2024)
EPFL Equal contribution | IT University of Copenhagen | EPFL
Abstract
Recent approaches in skill matching, employing synthetic training data for classification or similarity model training, have shown promising results, reducing the need for time-consuming and expensive annotations. However, previous synthetic datasets have limitations, such as featuring only one skill per sentence and generally comprising short sentences. In this paper, we introduce JobSkape, a framework to generate synthetic data that tackles these limitations, specifically designed to enhance skill-to-taxonomy matching. Within this framework, we create SkillSkape, a comprehensive open-source synthetic dataset of job postings tailored for skill-matching tasks. We introduce several offline metrics that show that our dataset resembles real-world data. Additionally, we present a multi-step pipeline for skill extraction and matching tasks using large language models (LLMs), benchmarking against known supervised methodologies. We outline that the downstream evaluation results on real-world data can beat baselines, underscoring its efficacy and adaptability.