Development of Replica Free Repositories using Particle Swarm Optimization Algorithm

semanticscholar(2019)

Abstract
The growing volume of information available in digital media poses a challenging problem for data administrators. Data repositories such as those used by digital libraries and e-commerce brokers are usually built from data gathered from different sources, so their records have disparate schemata and structures. The growth in volume also introduces redundant data into the database, so a system or method for controlling redundancy and duplication becomes essential. The proposed approach uses the PSO (Particle Swarm Optimization) algorithm to generate an optimal similarity measure for deciding whether a record is a duplicate. PSO first generates the optimal similarity measure from the training datasets. Once that measure has been obtained, the remaining datasets are deduplicated using it.
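The abstract does not say what form the similarity measure takes, so the following is only a minimal sketch of the train-then-deduplicate idea under stated assumptions. It assumes the measure is a weighted average of per-field string similarities compared against a threshold, that PSO searches over those weights and the threshold, and that fitness is plain accuracy on labelled training pairs. The function names, the PSO hyperparameters, and the sample records are all hypothetical and are not taken from the paper.

```python
import random
from difflib import SequenceMatcher

def field_sims(a, b):
    """Per-field string similarity in [0, 1] for two records (tuples of strings)."""
    return [SequenceMatcher(None, x.lower(), y.lower()).ratio() for x, y in zip(a, b)]

def score(weights, sims):
    """Weighted average of field similarities (assumed form of the measure)."""
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, sims)) / total if total > 0 else 0.0

def accuracy(position, pairs):
    """Fitness: fraction of labelled pairs classified correctly.
    position holds the field weights followed by the decision threshold."""
    *weights, threshold = position
    hits = sum((score(weights, sims) >= threshold) == label for sims, label in pairs)
    return hits / len(pairs)

def pso(fitness, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO that maximizes fitness over the box [0, 1]^dim."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp each coordinate so weights and threshold stay in [0, 1].
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Hypothetical labelled training pairs: (record_a, record_b, is_duplicate).
labelled = [
    (("Data Mining Concepts", "J. Han"), ("Data mining: concepts", "Jiawei Han"), True),
    (("Swarm Intelligence", "J. Kennedy"), ("Swarm intelligence", "James Kennedy"), True),
    (("Data Mining Concepts", "J. Han"), ("Operating Systems", "A. Tanenbaum"), False),
    (("Swarm Intelligence", "J. Kennedy"), ("Compilers", "A. Aho"), False),
]
train_pairs = [(field_sims(a, b), label) for a, b, label in labelled]

# Two field weights plus one threshold give a 3-dimensional search space.
best, fit = pso(lambda p: accuracy(p, train_pairs), dim=3)
*best_weights, best_threshold = best

# Apply the learned measure to a record pair that was not used in training.
new_sims = field_sims(("Particle Swarm Optimisation", "R. Eberhart"),
                      ("Particle swarm optimization", "Russell Eberhart"))
print("training accuracy:", fit)
print("duplicate?", score(best_weights, new_sims) >= best_threshold)
```

Accuracy is used as the fitness here only for brevity. On real repositories, where duplicates are rare, a measure such as F1 would likely be a more informative objective, and the per-field similarity function could be swapped for whichever evidence the data calls for.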