Data Quality Management In Institutional Research Output Data Center
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (2019)
Abstract
An institutional research output data center stores standardized, reliable data on scholars' research output. It supports dynamic presentation of research output, reveals institutional academic publications along multiple dimensions, advances open access, and provides data support for subject evaluation and discipline development. In this paper, we propose a data quality management framework for building an institutional research output data center, and we put forward technical solutions for several data governance problems: department name similarity estimation in data matching, author name disambiguation in data merging, and security in data exchange. We also introduce learning algorithms such as text distance and community detection with matrix factorization. Compared with alternative approaches, our methods achieve good performance in data quality management.
Keywords
Research information system, Author name disambiguation, Text distance, Community detection, Matrix factorization