Categorizing Bugs with Social Networks
A Case Study on Four Open Source Software Communities
Establishing efficient bug triaging procedures is an important precondition for successful collaborative software engineering projects. Particularly in open source software (OSS) projects with a large base of comparatively inexperienced part-time contributors, triaging bugs can become a laborious task. In this paper, we study to what extent quantitative measures of the social embeddedness of bug reporters within the collaboration network can assist the triaging of bug reports. In particular, we propose an efficient and practical method for the automated early classification of valid bug reports, that is, reports which a) refer to an actual software bug, b) are not duplicates and c) contain enough information to be processed right away. We provide a case study on a comprehensive data set of more than 700,000 bug reports, collected over a period of more than ten years from the BUGZILLA installations of four major OSS communities. For those projects that exhibit the lowest fraction of valid bug reports, we find that the bug reporters’ position in the collaboration network is a strong indicator of the quality of bug reports.
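To illustrate what a quantitative measure of a reporter's position in a collaboration network can look like, the following minimal sketch computes two common node-level measures, degree and local clustering coefficient, on a toy network. The abstract does not name the measures used in the study, so these two measures, the function names, and the example users are assumptions chosen purely for illustration.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected network from (user_a, user_b) interaction pairs."""
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def degree(neighbors, user):
    """Number of distinct collaborators of a user."""
    return len(neighbors.get(user, ()))


def local_clustering(neighbors, user):
    """Fraction of pairs of a user's collaborators that are linked to each other."""
    nbrs = list(neighbors.get(user, ()))
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in neighbors[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


# Hypothetical interactions between four users (illustrative data only).
interactions = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("alice", "carol"),
    ("carol", "dave"),
]
network = build_network(interactions)
for user in sorted(network):
    print(user, degree(network, user), round(local_clustering(network, user), 3))
```

In this toy network, a peripheral user such as dave has a single collaborator, whereas carol is connected to every other user. Measures of this kind could then serve as features describing the reporter of a bug report.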
Based on this finding, we developed automated classification schemes that can easily be integrated into bug tracking platforms. We analyze their performance in four major OSS communities. Our results show that a support vector machine which identifies valid bug reports based on nine quantitative measures of the reporting user’s position in the collaboration network can yield a precision of up to 90.3% with an associated recall of 38.9%. With this, we significantly improve on the results of previous case studies that investigated methods for the automated early identification of bugs that are eventually fixed. Furthermore, our study highlights the potential of using quantitative measures of social organization in collaborative software engineering. It opens a number of broader perspectives regarding the integration of social network analysis in the design of support infrastructures.
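For readers less familiar with the reported metrics, the sketch below shows how precision and recall are computed for a binary classifier that flags bug reports as valid. The labels and predictions are made-up illustrative values, not data from the study, and the function name is hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 means 'valid report'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative data only: 5 valid and 5 invalid reports.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# The hypothetical classifier flags 4 reports as valid, 3 of them correctly.
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A high precision combined with a lower recall, as in the figures above, means that most reports flagged as valid are indeed valid, while a substantial share of valid reports is not flagged.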
This paper has been accepted for the Software Engineering in Practice track of the 2013 International Conference on Software Engineering (ICSE). A draft version is available below. If you are interested in more details, please contact firstname.lastname@example.org.
Zanetti, Marcelo Serrano; Scholtes, Ingo; Tessone, Claudio Juan; Schweitzer, Frank