De-identification in natural language processing

MIPRO (2014)

Abstract

Natural language processing (NLP) systems typically require large amounts of textual data, but the publication of such datasets is often hindered by privacy and data protection concerns. Here, we discuss de-identification issues in three NLP areas: clinical NLP, NLP for social media, and information extraction from resumes. We also illustrate how de-identification relates to named entity recognition, and we argue that de-identification tools can be built successfully on top of named entity recognizers.

Keywords

NLP systems, data privacy, NLP areas, social media, information extraction, textual data, data protection, natural language processing, privacy protection, informatics, media, information retrieval, databases
