Measure the Diversity of Open Source Software Projects' Forks with Fork Entropy

arXiv (2022)

Abstract
Forks play a central role in modern pull-based OSS development. Although rich empirical results on the participants, challenges, and features of forks have been reported, there has been little discussion of how to quantitatively measure the population of forks around OSS projects. In this paper, we take a step toward enriching the set of fork metrics by proposing fork entropy, a measure of the diversity of the fork population around an OSS project. We operationalize fork entropy using Rao's quadratic entropy with a distance function defined on the forks' modifications to project files. After verifying the construct validity of fork entropy, we show its usefulness in understanding and predicting OSS development in terms of external productivity, the acceptance rate of external pull-requests, and code quality, using a dataset of fifty popular OSS projects hosted on GitHub. Through regression analyses, we find that fork entropy significantly and positively affects external productivity, the acceptance rate of external pull-requests, and code quality, although the effect is sometimes small. However, as expected, fork entropy at a high level sometimes plays a negative role in OSS development. We also observe that fork entropy can moderate other factors' effects on some project outcomes. We believe our new metric of fork entropy can help guide OSS development practices.
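
To make the operationalization above concrete: Rao's quadratic entropy for a population with weights p_i and pairwise distances d(i, j) is Q = sum over i, j of d(i, j) * p_i * p_j. The following is a minimal sketch only, not the authors' implementation. It assumes each fork is represented by the set of files it modifies, that forks are weighted equally, and that the distance is the Jaccard distance between file sets. The abstract does not specify the distance function or the weighting, so all of these choices, along with the function names, are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets of modified file paths.

    Assumption: this stands in for the paper's distance function,
    which the abstract does not define.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def rao_quadratic_entropy(fork_files, weights=None):
    """Rao's quadratic entropy: Q = sum_i sum_j d(i, j) * p_i * p_j.

    fork_files: list of sets, one set of modified file paths per fork.
    weights:    optional list of p_i summing to 1; defaults to equal
                weights (an assumption made for this sketch).
    """
    n = len(fork_files)
    if n == 0:
        return 0.0
    if weights is None:
        weights = [1.0 / n] * n
    total = 0.0
    # d(i, i) = 0 and d is symmetric, so summing over unordered pairs
    # and doubling equals the full double sum.
    for i, j in combinations(range(n), 2):
        d = jaccard_distance(fork_files[i], fork_files[j])
        total += 2.0 * d * weights[i] * weights[j]
    return total


# Hypothetical example: two forks touching the same file versus two
# forks touching disjoint files.
same = rao_quadratic_entropy([{"src/a.py"}, {"src/a.py"}])
disjoint = rao_quadratic_entropy([{"src/a.py"}, {"docs/b.md"}])
```

Under these assumptions, a population of forks that all modify the same files has entropy 0, and the value grows as forks modify increasingly different parts of the project.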