N-way Diff: Set-based Comparison of Software Variants

2020 Working Conference on Software Visualization (VISSOFT), 2020

Abstract
Software is frequently developed in many similar copies, called forks or cloned software variants. During this development, pairwise comparison is routinely used for finding differences between the cloned copies, assessing their similarity, and merging the content. However, analyzing the similarity of a large group of variants using pairwise comparison is a relatively difficult task, as the number of compared pairs grows quadratically with the number of variants. Furthermore, the result of such a group of pairwise comparisons is difficult to visualize. In this paper, we discuss the problem of N-way comparison of cloned software variants. We represent the N-way comparison result as a model of N intersecting sets. By aggregating the sets along the system decomposition hierarchy, we construct the sets at every level of the system structure (files, folders, and whole systems). We define a generalized approach for set model construction, and instantiate it for an N-way diff on the textual code representation. We propose set-based visualizations for the N-way comparison, which scale for more than ten component variants and MLOC-sized components. We evaluate the approach by applying it to several groups of industrial software system variants and by performing a controlled experiment with a comparison of 5 software forks. In the experiment, the group using set-based comparison solved the tasks in 58% less time and with 92% fewer incorrect answers than the group using pairwise comparison. Finally, we propose a generalization of the approach beyond software, to enable set-based comparison and similarity visualization for hierarchically structured models and data, for example genomes.
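The set model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each variant of a file is treated as an unordered set of distinct lines (ignoring line order and duplicates, which a real textual diff would handle), and the names `nway_line_sets`, `aggregate`, and `pair_count` are hypothetical. Each line is assigned to the region formed by exactly those variants that contain it, and region sizes are summed over files to obtain folder-level or system-level figures.

```python
from collections import defaultdict
from itertools import combinations


def nway_line_sets(variants):
    """Group lines by the exact set of variants that contain them.

    `variants` maps a variant name to the text of one file in that variant.
    Returns a dict: frozenset of variant names -> set of lines found in
    exactly those variants (one region of the N intersecting sets).
    Assumption: lines are compared as an unordered set of distinct strings.
    """
    membership = defaultdict(set)
    for name, text in variants.items():
        for line in set(text.splitlines()):
            membership[line].add(name)

    regions = defaultdict(set)
    for line, names in membership.items():
        regions[frozenset(names)].add(line)
    return dict(regions)


def aggregate(per_file_regions):
    """Sum region sizes over several files (e.g. all files in a folder)."""
    totals = defaultdict(int)
    for regions in per_file_regions:
        for names, lines in regions.items():
            totals[names] += len(lines)
    return dict(totals)


def pair_count(n):
    """Number of pairwise comparisons needed for n variants: n * (n - 1) / 2."""
    return sum(1 for _ in combinations(range(n), 2))


if __name__ == "__main__":
    # Hypothetical three-variant example for a single file.
    example = {"A": "x\ny\nz", "B": "x\ny\nw", "C": "x\nq"}
    for names, lines in nway_line_sets(example).items():
        print(sorted(names), sorted(lines))
    print(pair_count(5))  # pairs needed to compare 5 forks pairwise
```

In this sketch a single pass over all variants produces every intersection region at once, whereas a pairwise approach would need `pair_count(n)` separate comparisons, a number that grows quadratically with the number of variants.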
Keywords
software comparison,software reuse,similarity,set model,set visualization,software variability,product lines