Similarity of XML Schema Fragments Based on XML Data Statistics

2007 INNOVATIONS IN INFORMATION TECHNOLOGIES, VOLS 1 AND 2 (2007)

Abstract

As XML has become a standard for data representation, it appears in a wide range of information technologies. One possible optimization of XML-based approaches is to exploit the similarity of XML data. In this paper we propose a technique for evaluating the similarity of XML schema fragments that focuses on two often omitted aspects: the structural level of similarity and the tuning of the parameters of the similarity measure. In the former case we exploit the results of a statistical analysis of real-world XML data. In the latter case we show that the tuning problem is a kind of constraint optimization problem and can be solved using corresponding approaches. We have analyzed the advantages and disadvantages of two of them, genetic algorithms and simulated annealing, and in further experiments we show that appropriate tuning produces a more precise similarity measure.
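To make the tuning idea concrete, the following is a minimal sketch of simulated annealing applied to the weights of a similarity measure. It is not the method of the paper, whose measure and parameters are not given in the abstract. It assumes, purely for illustration, that the overall similarity is a weighted sum of partial scores, that the weights must be non-negative and sum to 1 (the constraint), and that a small set of fragment pairs with expected similarity values is available. All names (`combined_similarity`, `tuning_error`, `anneal_weights`, `TRAINING_PAIRS`) and all numbers are hypothetical.

```python
import math
import random

# Hypothetical training data: (partial similarity scores, expected overall similarity).
# The three partial scores could stand for, e.g., name, structure and data-type similarity.
TRAINING_PAIRS = [
    ((0.9, 0.2, 0.5), 0.72),
    ((0.1, 0.8, 0.3), 0.26),
    ((0.5, 0.5, 0.5), 0.50),
    ((1.0, 0.0, 0.0), 0.70),
    ((0.0, 1.0, 1.0), 0.30),
]


def combined_similarity(weights, partial_scores):
    """Assumed form of the measure: a weighted sum of partial scores."""
    return sum(w * s for w, s in zip(weights, partial_scores))


def tuning_error(weights, training_pairs):
    """Mean squared difference between computed and expected similarity."""
    total = sum(
        (combined_similarity(weights, scores) - expected) ** 2
        for scores, expected in training_pairs
    )
    return total / len(training_pairs)


def neighbour(weights, step, rng):
    """Move a little weight from one parameter to another.

    The move keeps every weight non-negative and the sum unchanged,
    so the constraint holds for every candidate.
    """
    i, j = rng.sample(range(len(weights)), 2)
    delta = min(rng.uniform(0.0, step), weights[i])
    moved = list(weights)
    moved[i] -= delta
    moved[j] += delta
    return moved


def anneal_weights(training_pairs, n_weights, steps=5000,
                   t_start=1.0, t_end=1e-4, step=0.1, seed=0):
    """Search for weights with low tuning error using simulated annealing."""
    rng = random.Random(seed)
    current = [1.0 / n_weights] * n_weights  # start from uniform weights
    current_err = tuning_error(current, training_pairs)
    best, best_err = current, current_err
    for k in range(steps):
        # Geometric cooling schedule from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (k / steps)
        candidate = neighbour(current, step, rng)
        candidate_err = tuning_error(candidate, training_pairs)
        delta = candidate_err - current_err
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_err = candidate, candidate_err
            if current_err < best_err:
                best, best_err = current, current_err
    return best, best_err
```

Calling `anneal_weights(TRAINING_PAIRS, n_weights=3)` returns the best weight vector found and its error. A genetic algorithm would explore the same constrained search space with a population of weight vectors instead of a single current solution.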

Keywords

simulated annealing, XML, information technology, XML schema, genetic algorithms, genetic algorithm, data representation, statistical analysis
