A Text Mining Pipeline for Mining the Quantum Cascade Laser Properties.

ADBIS (Short Papers)(2023)

Abstract
The development of Terahertz laser technology in quantum cascade lasers (QCLs) has brought great potential for industrial applications. These lasers operate on Terahertz electromagnetic waves, in the frequency range from about 100 GHz to 10 THz. To optimize the design process, there is a need to understand the structure of the laser and its influence on performance. One way of collating this information is to build ontologies and knowledge bases that capture the various QCL designs and their performance characteristics. The majority of laser design data is contained in the scientific literature. The main drawback of such textual data sources is their unstructured nature. The complexity of laser designs and the varying language styles of authors make this information difficult to retrieve. Consequently, existing methods need improvement in order to retrieve the laser information with high precision (few incorrect records extracted) and high recall (few correct records missed). In this paper, we tackle this initial challenge by proposing a text mining pipeline for mining QCL properties, which extends the grammar rules of a conditional random field (CRF) based model using a rule-based approach. The properties of interest are the heterostructure (laser stacking properties), working temperature, lasing frequency, laser thickness and optical power. We evaluate the pipeline on sample open access journal papers from AIP, OPTICA and IOP Publishers.
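To make the rule-based idea concrete, the following is a minimal sketch, not the pipeline described in the paper. It assumes that numeric properties can be found with simple unit-anchored regular expressions, and it omits the CRF model and the heterostructure property entirely. The pattern table, the function names `extract_properties` and `precision_recall`, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical unit-anchored rules (an assumption, not the paper's grammar):
# each pattern captures a number followed by a unit typical of the property.
PATTERNS = {
    "working_temperature": re.compile(r"(\d+(?:\.\d+)?)\s*(K)\b"),
    "lasing_frequency": re.compile(r"(\d+(?:\.\d+)?)\s*(THz|GHz)\b"),
    "optical_power": re.compile(r"(\d+(?:\.\d+)?)\s*(mW|W)\b"),
    "laser_thickness": re.compile(r"(\d+(?:\.\d+)?)\s*(nm|µm|um)\b"),
}


def extract_properties(sentence):
    """Return (property, value, unit) records matched in one sentence."""
    records = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(sentence):
            records.append((name, float(match.group(1)), match.group(2)))
    return records


def precision_recall(extracted, gold):
    """Compare extracted records with hand-labelled gold records."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Illustrative sentence written for this sketch, not taken from any paper.
sentence = "The device lases at 3.2 THz up to 150 K with a peak power of 5 mW."
records = extract_properties(sentence)
```

Here precision falls when a rule extracts an incorrect record, and recall falls when a correct record is missed, which matches the two error types named in the abstract.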
Keywords
text mining pipeline, quantum