Big Data = Big Insights? Operationalising Brooks' Law in a Massive GitHub Data Set

2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE) (2022)

Abstract
Massive data from software repositories and collaboration tools are widely used to study social aspects in software development. One question that several recent works have addressed is how a software project's size and structure influence team productivity, a question famously considered in Brooks' law. Recent studies using massive repository data suggest that developers in larger teams tend to be less productive than those in smaller teams. Despite using similar methods and data, other studies argue for a positive linear or even super-linear relationship between team size and productivity, thus contesting the view of software economics that software projects are diseconomies of scale. In our work, we study challenges that can explain the disagreement between recent studies of developer productivity in massive repository data. We further provide, to the best of our knowledge, the largest, curated corpus of GitHub projects tailored to investigate the influence of team size and collaboration patterns on individual and collective productivity. Our work contributes to the ongoing discussion on the choice of productivity metrics in the operationalisation of hypotheses about determinants of successful software projects. It further highlights general pitfalls in big data analysis and shows that the use of bigger data sets does not automatically lead to more reliable insights.
Keywords
big data = big insights, Brooks' law, massive GitHub data set, massive data, software repositories, collaboration tools, software development, software project, massive repository data, larger teams, smaller teams, super-linear relationship, team size, software economics, developer productivity, GitHub projects, collaboration patterns, individual productivity, collective productivity, productivity metrics, successful software projects, big data analysis, bigger data sets