NNOSE: Nearest Neighbor Occupational Skill Extraction
CoRR (2024)
Abstract
The labor market is changing rapidly, prompting increased interest in the
automatic extraction of occupational skills from text. With the advent of
English benchmark job description datasets, there is a need for systems that
handle their diversity well. We tackle the complexity of occupational skill
dataset tasks: combining and leveraging multiple datasets for skill
extraction, identifying rarely observed skills within a dataset, and overcoming
the scarcity of skills across datasets. In particular, we investigate the
retrieval-augmentation of language models, employing an external datastore for
retrieving similar skills in a dataset-unifying manner. Our proposed method,
Nearest Neighbor Occupational Skill Extraction (NNOSE), effectively leverages
multiple datasets by
retrieving neighboring skills from other datasets in the datastore. This
improves skill extraction without additional fine-tuning. Crucially, we
observe a performance gain in predicting infrequent patterns, with substantial
gains of up to 30% span-F1 in cross-dataset settings.
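
The retrieval idea in the abstract can be illustrated with a generic nearest-neighbor interpolation, in the style of kNN-augmented language models. This is only a minimal sketch and not the authors' implementation. Everything specific in it is an assumption for illustration: the toy 2-D embeddings, the three-way B/I/O tag set, squared L2 distance, the softmax temperature, the interpolation weight `lam`, and the function names `knn_distribution` and `interpolate`.

```python
import numpy as np


def knn_distribution(query, keys, values, num_labels, k=2, temperature=1.0):
    """Turn the k nearest datastore entries into a distribution over labels.

    keys:   (N, d) array of stored token embeddings (hypothetical datastore).
    values: (N,) array of integer tag ids, one per stored embedding.
    """
    # Squared L2 distance from the query to every stored key.
    dists = np.sum((keys - query) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]
    # Softmax over negative distances: closer neighbors get more weight.
    logits = -dists[nearest] / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Sum the neighbor weights per label.
    probs = np.zeros(num_labels)
    np.add.at(probs, values[nearest], weights)
    return probs


def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix the tagger's distribution with the retrieved distribution."""
    return lam * knn_probs + (1.0 - lam) * model_probs


# Toy datastore pooled from several (hypothetical) datasets.
# Assumed tag ids: 0 = B-Skill, 1 = I-Skill, 2 = O.
keys = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0], [5.0, 5.0]])
values = np.array([0, 0, 1, 1, 2])

query = np.array([0.05, 0.0])            # embedding of the token being tagged
model_probs = np.array([0.4, 0.1, 0.5])  # tagger alone prefers tag 2 (O)

knn_probs = knn_distribution(query, keys, values, num_labels=3, k=2)
final_probs = interpolate(model_probs, knn_probs, lam=0.5)
print(final_probs.argmax())
```

In this toy case the tagger alone would label the token as O, while the two retrieved neighbors both carry the B-Skill tag, so the interpolated distribution favors B-Skill. No model weights are updated, which is consistent with the abstract's claim that retrieval helps without additional fine-tuning.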