Sunday, November 13, 2016

Why we refactor? Here are 44 different reasons, according to GitHub contributors

by Danilo Silva, Universidade Federal de Minas Gerais (danilofs@dcc.ufmg.br); Nikolaos Tsantalis, Concordia University (@NikosTsantalis); Marco Tulio Valente, Universidade Federal de Minas Gerais (@mtov)
Associate Editor: Christoph Treude (@ctreude)

The adoption of refactoring practices was fostered by the availability of refactoring catalogues, such as the one proposed by Martin Fowler [1]. These catalogues name each refactoring, describe its mechanics, and demonstrate its application through code examples. They also provide a motivation for the refactoring, which is usually associated with the resolution of code smells. For example, Extract Method is recommended to decompose a large and complex method or to eliminate code duplication. However, only a limited number of studies investigate the real motivations driving the refactoring practice. To fill this gap in the literature, we conducted an in-depth investigation of why developers refactor code.

Over 61 days, we monitored the refactoring activity of 748 GitHub Java repositories, using an automated infrastructure we built. Every time we identified a refactoring, we asked the developer who performed it to explain the reasons behind the decision to refactor the code. Next, we categorized the responses into themes of motivations. The following table presents the results of this process as a catalogue of 44 distinct motivations for refactoring, grouped by 12 well-known refactoring types.

Table 1: Motivations for each refactoring type.
| Ref. Type | Motivation | Occurrences |
|---|---|---|
| Extract Method | Extract a piece of reusable code from a single place and call the extracted method in multiple places. | 43 |
| Extract Method | Introduce an alternative signature for an existing method (e.g., with additional or different parameters) and make the original method delegate to the extracted one. | 25 |
| Extract Method | Extract a piece of code having a distinct functionality into a separate method to make the original method easier to understand. | 21 |
| Extract Method | Extract a piece of code in a new method to facilitate the implementation of a feature or bug fix, by adding extra code either in the extracted method, or in the original method. | 15 |
| Extract Method | Extract a piece of duplicated code from multiple places, and replace the duplicated code instances with calls to the extracted method. | 14 |
| Extract Method | Introduce a new method that replaces an existing one to improve its name or remove unused parameters. The original method is preserved for backward compatibility, it is marked as deprecated, and delegates to the extracted one. | 6 |
| Extract Method | Extract a piece of code in a separate method to enable its unit testing in isolation from the rest of the original method. | 6 |
| Extract Method | Extract a piece of code in a separate method to enable subclasses override the extracted behavior with more specialized behavior. | 4 |
| Extract Method | Extract a piece of code to make it a recursive method. | 2 |
| Extract Method | Extract a constructor call (class instance creation) into a separate method. | 1 |
| Extract Method | Extract a piece of code in a separate method to make it execute in a thread. | 1 |
| Move Class | Move a class to a package that is more functionally or conceptually relevant. | 13 |
| Move Class | Move a group of related classes to a new subpackage. | 7 |
| Move Class | Convert an inner class to a top-level class to broaden its scope. | 4 |
| Move Class | Move an inner class out of a class that is marked deprecated or is being removed. | 3 |
| Move Class | Move a class from a package that contains external API to an internal package, avoiding its unnecessary public exposure. | 2 |
| Move Class | Convert a top-level class to an inner class to narrow its scope. | 2 |
| Move Class | Move a class to another package to eliminate undesired dependencies between modules. | 1 |
| Move Class | Eliminate a redundant nesting level in the package structure. | 1 |
| Move Class | Move a class back to its original package to maintain backward compatibility. | 1 |
| Move Attribute | Move an attribute to a class that is more functionally or conceptually relevant. | 15 |
| Move Attribute | Move similar attributes to another class where a single copy of them can be shared, eliminating the duplication. | 4 |
| Rename Package | Rename a package to better represent its purpose. | 8 |
| Rename Package | Rename a package to conform to project's naming conventions. | 3 |
| Rename Package | Move a package to a parent package that is more functionally or conceptually relevant. | 2 |
| Move Method | Move a method to a class that is more functionally or conceptually relevant. | 8 |
| Move Method | Move a method to a class that permits its reuse by other classes. | 3 |
| Move Method | Move a method to eliminate dependencies between classes. | 3 |
| Move Method | Move similar methods to another class where a single copy of them can be shared, eliminating duplication. | 1 |
| Move Method | Move a method to permit subclasses to override it. | 1 |
| Inline Method | Inline and eliminate a method that is unnecessary or has become too trivial after code changes. | 13 |
| Inline Method | Inline and eliminate a method because its caller method has become too trivial after code changes, so that it can absorb the logic of the inlined method without compromising readability. | 2 |
| Inline Method | Inline a method because it is easier to understand the code without the method invocation. | 1 |
| Extract Superclass | Introduce a new superclass that contains common state or behavior from its subclasses. | 7 |
| Extract Superclass | Introduce a new superclass that is decoupled from specific dependencies of a subclass. | 1 |
| Extract Superclass | Extract a superclass from a class that holds many responsibilities. | 1 |
| Pull Up Method | Move common methods to superclass. | 8 |
| Pull Up Attribute | Move common attributes to superclass. | 7 |
| Extract Interface | Introduce an interface to enable different behavior. | 1 |
| Extract Interface | Introduce an interface to facilitate the use of a dependency injection framework. | 1 |
| Extract Interface | Introduce an interface to avoid depending on an existing class/interface. | 1 |
| Push Down Attribute | Push down an attribute to allow specialization by subclasses. | 2 |
| Push Down Attribute | Push down attribute to subclass so that the superclass does not depend on a specific type. | 1 |
| Push Down Method | Push down a method to allow specialization by subclasses. | 1 |

Our findings confirm that Extract Method is the "Swiss army knife of refactorings". It is the refactoring with the most motivations (11 in total), and the majority of them express an intention to facilitate or even enable the completion of the maintenance task the developer is working on. In contrast, only two motivations for Extract Method (decomposing a method to improve readability and removing duplication) target code smells.
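To make two of the Extract Method motivations from Table 1 concrete, here is a minimal Java sketch. It shows an alternative signature whose original method delegates to the extracted one, and an old method kept as deprecated for backward compatibility. The names (`ReportFormatter`, `format`, `formatTitle`, `render`) are hypothetical, chosen only for illustration; they are not taken from the repositories we studied.

```java
import java.util.Locale;

/** Hypothetical example; names are illustrative, not from the study's dataset. */
public class ReportFormatter {

    // Original entry point. Its body was extracted into the overload below,
    // so existing callers keep working while new callers can pass a Locale.
    public static String format(String title, double total) {
        return format(title, total, Locale.ROOT);
    }

    // Extracted method with an alternative signature (one additional parameter).
    public static String format(String title, double total, Locale locale) {
        return formatTitle(title) + ": " + String.format(locale, "%.2f", total);
    }

    // Extracted piece of code with a distinct functionality, reusable elsewhere.
    static String formatTitle(String title) {
        return title.trim().toUpperCase(Locale.ROOT);
    }

    // Old name preserved for backward compatibility: marked deprecated
    // and delegating to the better-named replacement.
    @Deprecated
    public static String render(String title, double total) {
        return format(title, total);
    }

    public static void main(String[] args) {
        System.out.println(format(" sales ", 12.5));
        System.out.println(render("costs", 3.0));
    }
}
```

In both cases the extraction serves the task at hand (adding a parameter, renaming an API) rather than removing a code smell, which is the pattern most Extract Method responses described.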

The other refactorings are performed to improve the system design. For example, the most common motivation for Move Class, Move Attribute, and Move Method is to reorganize code elements, so that they have a stronger functional or conceptual relevance, or to eliminate dependencies between code elements.
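As an illustration of the most common Move Method motivation, the following Java sketch shows the code after a method has been moved to the class whose state it depends on. `Customer`, `Order`, and `discountRate` are hypothetical names, and the discount rule itself is an assumption made up for this example.

```java
/** Hypothetical example of Move Method; names and rule are illustrative only. */
public class MoveMethodExample {

    static class Customer {
        private final int yearsActive;

        Customer(int yearsActive) {
            this.yearsActive = yearsActive;
        }

        // Moved here from Order: the rule depends only on Customer state,
        // so Customer is the conceptually relevant home for it.
        double discountRate() {
            return yearsActive >= 5 ? 0.10 : 0.0;
        }
    }

    static class Order {
        private final Customer customer;
        private final double amount;

        Order(Customer customer, double amount) {
            this.customer = customer;
            this.amount = amount;
        }

        // Order no longer inspects Customer internals; it asks Customer instead.
        double total() {
            return amount * (1.0 - customer.discountRate());
        }
    }

    public static void main(String[] args) {
        Order order = new Order(new Customer(6), 200.0);
        System.out.println(order.total());
    }
}
```

After the move, `Order` depends only on the `discountRate()` method rather than on how a customer's tenure is stored, which also reduces the dependency between the two classes.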

Comparing our catalogue with the code symptoms that initiate refactoring, as reported in the study by Kim et al. [2], we found that readability, reuse, testability, duplication, and dependency concerns are common to both.

Automated Refactoring Support

We also asked developers whether they used the automated refactoring support of an IDE to perform refactorings. This allowed us to compare our findings with previous studies in this area, leading to the following conclusions.

  • Manual refactoring is still prevalent (55% of the developers refactored the code manually). Tool support for inheritance-related refactorings seems to be the most under-used (only 10% done automatically), while Move Class and Rename Package are the most trusted refactorings (over 50% done automatically). The prevalence of manually applied refactoring confirms the findings of Murphy-Hill et al. [3] and Negara et al. [4]. However, it seems that developers apply more automated refactorings nowadays.
  • The IDE plays an important role in the adoption of refactoring tool support. IntelliJ IDEA users perform more automated refactorings (71% done automatically) than Eclipse users (44%) and NetBeans users (50%).

Twenty-nine developers also explained why they did not use a refactoring tool, as summarized in the following table.

Table 2: Reasons for not using refactoring tools.
| Reason | Occurrences |
|---|---|
| The developer does not trust automated support for complex refactorings. | 10 |
| Automated refactoring is unnecessary, because the refactoring is trivial and can be manually applied. | 8 |
| The required modification is not supported by the IDE. | 6 |
| The developer is not familiar with the refactoring capabilities of his/her IDE. | 3 |
| The developer did not realize at the moment of the refactoring that he/she could have used refactoring tools. | 2 |

If you are interested in our study, please refer to our paper accepted at FSE 2016:

Danilo Silva, Nikolaos Tsantalis, Marco Tulio Valente. Why We Refactor? Confessions of GitHub Contributors. [pdf]

References

[1] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, Boston, MA, USA, 1999.
[2] M. Kim, T. Zimmermann, and N. Nagappan. An empirical study of refactoring challenges and benefits at Microsoft. IEEE Trans. Softw. Eng., 40(7), July 2014.
[3] E. R. Murphy-Hill, C. Parnin, and A. P. Black. How we refactor, and how we know it. IEEE Trans. Softw. Eng., 38(1):5-18, 2012.
[4] S. Negara, N. Chen, M. Vakilian, R. E. Johnson, and D. Dig. A comparative study of manual and automated refactorings. In Proceedings of the 27th European Conference on Object-Oriented Programming (ECOOP), pages 552-576, 2013.
