Combining natural language processing techniques and algorithms LSA, word2vec and WMD for technological forecasting and similarity analysis in patent documents

Joao Marcos de Rezende, Izabella Martins da Costa Rodrigues, Leandro Colombi Resendo, Karin Satie Komati

Technology Analysis & Strategic Management (2024)

Abstract
Keyword search is the most common tool offered by patent offices; however, their websites do not provide free software for more advanced research. This paper therefore aims to provide a data-mining framework for patent documents that links natural language processing techniques with data analysis algorithms. The system has two main goals: the analysis of technological prospection and the evaluation of similarity among patents based on their titles and abstracts. For the numerical experiments, we used the US Patent and Trademark Office database, with over a million documents. An S-curve analysis of patents on TFT-LCD, Flash Memory and PDA from 2010 to 2018 showed that the last two technologies are in decline. A word cloud made it possible to see the evolution of the phone from 2010 to 2015. To evaluate the degree of similarity among patents, we investigated Latent Semantic Analysis (LSA), Word2vec and Word Mover's Distance (WMD) in three different case studies. In addition, these methods were compared with the classical Jaccard index. Numerical results show that LSA and WMD gave similar patent indications, while the Jaccard index gave indications different from those of the other three methods.
Keywords
Patinformatics, USPTO, S-Curve, word cloud
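
The abstract uses the classical Jaccard index as the baseline for patent similarity. As a rough illustration only, the index can be sketched over the word sets of two titles or abstracts. The tokenization rule, the function names and the sample titles below are assumptions made for this sketch; the abstract does not describe the preprocessing the authors applied.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens.

    Assumption: a simple regex tokenizer, not the paper's preprocessing.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_index(text_a: str, text_b: str) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| over the two token sets."""
    a, b = tokenize(text_a), tokenize(text_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty texts
    return len(a & b) / len(a | b)


def rank_by_similarity(query: str, candidates: list[str]) -> list[tuple[float, str]]:
    """Score every candidate against the query, most similar first."""
    scored = [(jaccard_index(query, c), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical titles, invented for illustration only.
    query = "flash memory device"
    titles = ["flash memory controller", "liquid crystal display panel"]
    for score, title in rank_by_similarity(query, titles):
        print(f"{score:.2f}  {title}")
```

Because the index only counts shared surface tokens, two patents that describe the same idea in different vocabulary score low. This is one plausible reason a set-overlap measure could point to different patents than embedding-based or latent-semantic methods such as Word2vec, WMD and LSA.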