Sunday, July 24, 2016

Minor contributors correlate to bugginess. But not when they're code reviewers.

By: Patanamon (Pick) Thongtanunam, Nara Institute of Science and Technology, Japan (@pamon)
Associate Editor: Bogdan Vasilescu, University of California, Davis. USA (@b_vasilescu)

Weak code ownership correlates to poor software quality. Code ownership is a common practice in large, distributed software development teams. It is used to establish a chain of responsibility (who to blame if there is a problem) and to simplify management (to whom a task or bug fix should be assigned). A simple intuition for estimating code ownership is that the developer who has written the majority of the code in a module should be an owner of that module. Moreover, prior research found that a module with weak code ownership (that is, one written by many minor authors) is more likely to have bugs in the future [1].

Nowadays, development practices are more than just writing code. Tool-based code review is now tightly integrated with the software development cycle. Recent research has found that code review is more than a defect-hunting exercise: reviewers also help an author improve the code changes [2,3]. Moreover, code writing and reviewing activities are orthogonal: a team can have a developer who reviews a lot but writes little, and vice versa.

Does code review activity change what we know about ownership and software quality? This led us to investigate the importance of code review activities for code ownership and software quality [4]. Through an empirical study of Qt and OpenStack systems, we (1) investigated the code authoring and reviewing activities of developers, (2) refined code ownership using code reviewing activities, and (3) studied the relationship between our refined ownership and software quality.

Code reviewers are the majority of contributors in a module

We found that developers who only reviewed code changes, without previously writing any, make up the largest proportion of the developers who contributed to a module (67%-86% of contributors in a module at the median). Moreover, 18%-50% of these review-only developers are documented core developers of the Qt and OpenStack projects. These findings suggest that a code ownership estimation that considers only code authoring activities misses many developers who contributed to a module through reviewing.



Figure 1: Refined code ownership

Many minor authors are actually major reviewers

We measured the amount of code authoring and reviewing contributions that developers made to a module and classified their expertise in each dimension into two levels, major and minor (Figure 1).

Traditional code ownership (TCO) is derived solely from code authoring activities, while review-specific ownership (RSO) is derived solely from code reviewing activities. The interesting part of Figure 1 is the minor authors, since traditional code ownership classifies these developers as low-expertise developers.

However, we found that 13%-58% of minor authors are major reviewers who actually reviewed many code changes to a module. This finding suggests that many developers who make large contributions to modules by reviewing code changes were misclassified as low-expertise developers because of their low code authoring activity.
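The two-dimensional classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the function names, the input format (one developer name per authored or reviewed change), and the 5% cutoff between minor and major are all hypothetical, since this post does not state a threshold.

```python
from collections import Counter

# Hypothetical cutoff between "minor" and "major"; the post does not give one.
MAJOR_THRESHOLD = 0.05


def ownership_shares(events):
    """Map each developer to their share of the events recorded for one module."""
    counts = Counter(events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dev: n / total for dev, n in counts.items()}


def level(share, threshold):
    """Turn a share into 'none', 'minor', or 'major'."""
    if share == 0.0:
        return "none"
    return "major" if share >= threshold else "minor"


def classify_module(authored, reviewed, threshold=MAJOR_THRESHOLD):
    """Return {developer: (author_level, reviewer_level)} for one module.

    authored / reviewed: lists with one developer name per code change
    that the developer wrote / reviewed in the module.
    """
    tco = ownership_shares(authored)  # authoring-only view of ownership
    rso = ownership_shares(reviewed)  # reviewing-only view of ownership
    return {
        dev: (level(tco.get(dev, 0.0), threshold),
              level(rso.get(dev, 0.0), threshold))
        for dev in set(tco) | set(rso)
    }
```

In this sketch, a developer labelled `("minor", "major")` is exactly the kind of contributor that an authoring-only estimate would treat as low-expertise, and `("none", "major")` corresponds to a review-only developer.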

Reviewing expertise reverses the relationship between authoring expertise and software quality

We further investigated whether reviewing expertise has an impact on software quality. To do so, we compared the rates of developers with each level of expertise between defective and clean modules. We found that the rates of developers who are both minor authors and minor reviewers are higher in defective modules than in clean modules (the left bean plot in Figure 2). On the other hand, the rates of developers who are minor authors but major reviewers are lower in defective modules than in clean modules (the right bean plot in Figure 2). When we control for several confounding factors using statistical models, the rates of developers who are both minor authors and minor reviewers still share a strong increasing relationship with defect-proneness. These results indicate that reviewing expertise shares a relationship with software quality and that it can reverse the direction of the association between minor authorship and defect-proneness.
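The descriptive comparison above (not the statistical models that control for confounding factors) could look roughly like the following Python sketch. The data layout and function names are assumptions made for illustration: each module is represented as a hypothetical pair of a `{developer: (author_level, reviewer_level)}` dictionary and a flag saying whether the module was defective.

```python
from statistics import median


def category_rate(levels, category):
    """Fraction of a module's developers whose levels equal `category`.

    levels: {developer: (author_level, reviewer_level)} for one module.
    """
    if not levels:
        return 0.0
    matching = sum(1 for lv in levels.values() if lv == category)
    return matching / len(levels)


def median_rates(modules, category):
    """Median rate of `category` developers in defective vs. clean modules.

    modules: iterable of (levels, is_defective) pairs.
    Returns (median_in_defective, median_in_clean). Raises
    statistics.StatisticsError if either group has no modules.
    """
    defective, clean = [], []
    for levels, is_defective in modules:
        rate = category_rate(levels, category)
        (defective if is_defective else clean).append(rate)
    return median(defective), median(clean)
```

Calling `median_rates(modules, ("minor", "minor"))` and `median_rates(modules, ("minor", "major"))` on the same set of modules gives the two contrasts discussed above. A real analysis would look at the full distributions, as the bean plots do, rather than only the medians.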


Figure 2: The relationship between minor authors and defect-proneness in Qt version 5.0.0

Practical suggestions

Our findings lead us to believe that code reviewing activity captures an important aspect of code ownership. Therefore, future estimations of code ownership should take code review activity into consideration in order to accurately model the contributions that developers have made to evolve software systems. Such code ownership estimations can also be used to chart quality improvement plans. For example, teams should apply additional scrutiny to module contributions from developers who have neither authored nor reviewed many code changes to that module in the past. Conversely, a module with many developers who have not authored many code changes should not be considered risky if those developers have reviewed many of the code changes to that module.

References

[1] C. Bird, N. Nagappan, B. Murphy, H. Gall, and P. Devanbu, “Don’t Touch My Code! Examining the Effects of Ownership on Software Quality,” in Proceedings of the 8th joint meeting of the European Software Engineering Conference and the International Symposium on the Foundations of Software Engineering (ESEC/FSE), 2011, pp. 4–14.
[2] A. Bacchelli and C. Bird, “Expectations, Outcomes, and Challenges of Modern Code Review,” in Proceedings of the 35th International Conference on Software Engineering (ICSE), 2013, pp. 712–721.
[3] P. C. Rigby and C. Bird, “Convergent Contemporary Software Peer Review Practices,” in Proceedings of the 9th joint meeting of the European Software Engineering Conference and the International Symposium on the Foundations of Software Engineering (ESEC/FSE), 2013, pp. 202–212.
[4] P. Thongtanunam, S. McIntosh, A. E. Hassan, and H. Iida, “Revisiting Code Ownership and its Relationship with Software Quality in the Scope of Modern Code Review,” in Proceedings of the 38th International Conference on Software Engineering (ICSE), 2016, pp. 1039–1050.