Sunday, June 18, 2017

When and Which Version to Adopt a Library: A Case Study on Apache Software Foundation Projects

By: Akinori Ihara, Daiki Fujibayashi, Hirohiko Suwa, Raula Gaikovina Kula, and Kenichi Matsumoto (Nara Institute of Science and Technology, Japan)
Associate editor: Stefano Zacchiroli (@zacchiro)

Are you currently using a third-party library in your system? How did you decide on the version, and on when to adopt the library? Did you adopt the latest version or an older (reliable) one? Do you plan to update, and would you trust the latest version? These are all tough questions with no easy answers.

A software library is a collection of reusable programs, used by both industrial and open source client projects to help achieve shorter development cycles and higher quality software [1]. Most active libraries regularly release new and improved versions to fix bugs, keep up with the latest trends, and showcase new enhancements. Ideally, any client user of a library would adopt the latest version of that library immediately. It is therefore commonly recommended that a client project upgrade its outdated versions as soon as a new release becomes available.

Developers do not always choose the latest version over previous versions

As any practitioner is probably well aware, adopting the latest version is not as trivial as it sounds. It may require additional time and effort (e.g., adapting code to accommodate the new API, and testing) to ensure successful integration into the existing client system. Developers of client projects are especially wary of library projects that follow a rapid-release style of development, since such library projects are known to delay bug fixes [2]. In a preliminary analysis, we identified two obstacles that potentially demotivate client users from updating:
  1. similar client users tend not to adopt a new version shortly after it is released, and
  2. there is a delay between the library release and its adoption by similar clients.
These insights may indicate that client users are likely to 'postpone' updating until a new release is deemed 'stable'. In this empirical study, we investigate how library versions are selected in relation to their release cycles.

Design: We analyze when and which library versions are being adopted by client users. From 4,815 libraries, our study focuses on the 23 most frequent Apache Software Foundation (ASF) libraries used by 415 software client projects [3].

Figure 1: Distribution of the periods between releases in each library

When to adopt a library?: We find that not all 23 libraries were released on a yearly basis (see Figure 1). Some library projects (e.g., jetty-server, jackson-mapper-asl, mockito-all) often release new versions within a year (defined as quick-release libraries), while others (e.g., commons-cli, servlet.api, commons-logging) take over a year to come out with a release (defined as late-release libraries). We found that the more traditional and well-established projects (i.e., older than 10 years) were the late-release libraries, while newer, less established projects belonged to the quick-release libraries.
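To make the quick-release versus late-release distinction concrete, here is a minimal Python sketch. It is not the tooling used in the study: the use of the median interval, the 365-day threshold, the function names, and the release dates are all assumptions made for illustration.

```python
from datetime import date
from statistics import median


def release_intervals(release_dates):
    """Return the number of days between consecutive releases."""
    ordered = sorted(release_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def classify_library(release_dates, threshold_days=365):
    """Label a library by its median release interval.

    Assumption: "within a year" is read as a median interval of at most
    365 days; the study itself may define the boundary differently.
    """
    intervals = release_intervals(release_dates)
    if not intervals:
        return "unknown"  # fewer than two releases, so no interval to measure
    if median(intervals) <= threshold_days:
        return "quick-release"
    return "late-release"


# Hypothetical release histories, not taken from the study's dataset.
fast_lib = [date(2015, 1, 10), date(2015, 5, 2), date(2015, 11, 20), date(2016, 3, 15)]
slow_lib = [date(2010, 6, 1), date(2012, 9, 15), date(2015, 2, 1)]

print(classify_library(fast_lib))
print(classify_library(slow_lib))
```

The median is used here rather than the mean so that a single unusually long gap between releases does not change a library's label.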


Figure 2: Percentage of client users that selected the latest version (gray) and the previous version (black)

Which version to adopt?: Software projects do not always adopt new library versions in their projects (see Figure 2). Interestingly, we found that some client users of a late-release library would first select the latest version as soon as it was released, only to downgrade to a previous version later on (in Figure 2, the red box and the blue box show the percentage of client users that performed a downgrade after adopting the latest version or the previous version, respectively).
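As an illustration of what a downgrade looks like in a client's dependency history, the following Python sketch flags every step where a client moved to an older version. It assumes purely numeric dotted version strings and a chronologically ordered history. The function names and the sample history are hypothetical and do not come from the study.

```python
def parse_version(version):
    """Turn a dotted numeric version such as '1.2.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def find_downgrades(adoption_history):
    """Return (from_version, to_version) pairs where the client moved backwards.

    Assumption: adoption_history lists the versions a client depended on,
    in the order in which they were adopted.
    """
    downgrades = []
    for previous, current in zip(adoption_history, adoption_history[1:]):
        if parse_version(current) < parse_version(previous):
            downgrades.append((previous, current))
    return downgrades


# Hypothetical client history: adopts 1.2, falls back to 1.1.1, then returns to 1.2.
history = ["1.1", "1.2", "1.1.1", "1.2"]
print(find_downgrades(history))
```

Real version strings often carry qualifiers (for example a release-candidate suffix), which this sketch does not handle.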

Lessons Learnt: From our study, we find that client users may postpone updates until a library is deemed stable and reliable. Although the quality of most open source software often improves through minor and micro releases, the study finds that client projects may wait, especially in the case of a late-release library. Our study supports the notion that updating a library is not trivial. We find that practitioners are indeed careful when it comes to adopting the latest version, as it may introduce dependency problems and bugs that testing has not yet uncovered.

We presented this study at the International Conference on Open Source Systems (OSS'17). For more details, please see the preprint and the presentation on our website: http://akinori-ihara.jpn.org/oss2017/

[1] Frank McCarey, Mel Ó Cinnéide, and Nicholas Kushmerick, "Knowledge reuse for software reuse," Journal of Web Intelligence and Agent Systems, pp.59-81, Vol.6, Issue.1, 2008.
[2] Daniel Alencar da Costa, Surafel Lemma Abebe, Shane McIntosh, Uira Kulesza, and Ahmed E. Hassan, "An Empirical Study of Delays in the Integration of Addressed Issues," In Proc. of the 30th IEEE International Conference on Software Maintenance and Evolution (ICSME'14), pp.281-290, 2014.
[3] Akinori Ihara, Daiki Fujibayashi, Hirohiko Suwa, Raula Gaikovina Kula, and Kenichi Matsumoto, "Understanding When to Adopt a Library: A Case Study on ASF Projects," In Proc. of the International Conference on Open Source Systems (OSS'17), pp.128-138, 2017.
