Categorizing Bugs with Social Networks

A Case Study on Four Open Source Software Communities

Efficient bug triaging procedures are an important precondition for successful collaborative software engineering projects. Particularly in open source software (OSS) projects with a large base of comparatively inexperienced part-time contributors, triaging bugs can become a laborious task. In this paper, we study to what extent quantitative measures of the social embeddedness of bug reporters within the collaboration network can assist the triaging of bug reports. In particular, we propose an efficient and practical method for the automated early classification of valid bug reports, that is, reports which a) refer to an actual software bug, b) are not duplicates and c) contain enough information to be processed right away. We provide a case study on a comprehensive data set of more than 700,000 bug reports collected over a period of more than ten years from the BUGZILLA installations of four major OSS communities. For those projects that exhibit the lowest fraction of valid bug reports, we find that the bug reporters’ position in the collaboration network is a strong indicator of the quality of bug reports.
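To illustrate what a quantitative measure of a reporter's position in the collaboration network can look like, the following minimal sketch builds an undirected network from pairs of interacting users and computes three common measures. The choice of measures (degree, local clustering coefficient, membership in the largest connected component), the function names and the toy edge list are all assumptions made for illustration; they are not necessarily the measures or the data used in the study.

```python
from collections import defaultdict, deque


def build_network(edges):
    """Build an undirected graph (adjacency sets) from (user_a, user_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def degree(adj, user):
    """Number of distinct collaborators; 0 for users outside the network."""
    return len(adj.get(user, ()))


def clustering(adj, user):
    """Fraction of pairs of the user's neighbours that are themselves linked."""
    nbrs = sorted(adj.get(user, ()))
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def component_of(adj, start):
    """Set of nodes reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen


def in_largest_component(adj, user):
    """True if the user's component is as large as the largest component."""
    if user not in adj:
        return False
    largest, visited = 0, set()
    for node in adj:
        if node not in visited:
            comp = component_of(adj, node)
            visited |= comp
            largest = max(largest, len(comp))
    return len(component_of(adj, user)) == largest


# Hypothetical interactions within one month (illustrative data only).
edges = [("alice", "bob"), ("bob", "carol"), ("alice", "carol"),
         ("carol", "dave"), ("erin", "frank")]
net = build_network(edges)
for user in ("carol", "erin", "zoe"):
    print(user, degree(net, user), clustering(net, user),
          in_largest_component(net, user))
```

A user who has never interacted with anyone (here "zoe") receives the lowest value on every measure, which is the kind of signal a classifier could use to flag reports from peripheral reporters for closer inspection.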

Monthly collaboration networks of four OSS communities

Based on this finding, we developed automated classification schemes that can easily be integrated into bug tracking platforms. We analyze their performance in four major OSS communities. Our results show that a support vector machine, which identifies valid bug reports based on nine quantitative measures of the reporting user’s position in the collaboration network, can yield a precision of up to 90.3% with an associated recall of 38.9%. With this, we significantly improve on the results obtained in previous case studies which have investigated methods for the automated early identification of bugs that are eventually fixed. Furthermore, our study highlights the potential of using quantitative measures of social organization in collaborative software engineering. It opens a number of broader perspectives regarding the integration of social network analysis into the design of support infrastructures.
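For readers less familiar with the two figures quoted above: precision is the share of reports flagged as valid that really are valid, and recall is the share of all valid reports that the classifier flags. The sketch below computes both from a list of predictions and a list of true labels. The function name and the toy labels are assumptions for illustration only and do not reproduce the study's data or results.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for boolean predictions against true labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)      # true positives
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)  # false positives
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example: True means "valid bug report" (illustrative labels only).
predicted = [True, True, False, False, True, False]
actual = [True, True, True, True, False, False]
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

A classifier with high precision and moderate recall, as reported above, flags only a subset of the valid reports, but the reports it does flag are valid in the large majority of cases.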

This paper has been accepted for the Software Engineering in Practice track of the 2013 International Conference on Software Engineering (ICSE). A draft version is available below. If you are interested in more details, please contact ischoltes@ethz.ch.

Selected Publications

Categorizing bugs with social networks: A case study on four open source software communities, 2013

Zanetti, Marcelo Serrano; Scholtes, Ingo; Tessone, Claudio Juan; Schweitzer, Frank