Monday, September 18, 2017

Understanding the Impressions, Motivations, and Barriers of One Time Code Contributors to FLOSS Projects: A Survey

By: Amanda S. Lee, University of Alabama. USA (@amandaslee15)
Jeffrey C. Carver, University of Alabama. USA (@JeffCarver32)
Amiangshu Bosu, Southern Illinois University. USA (@abosu)
Associate Editor: Bogdan Vasilescu, Carnegie Mellon University. USA (@b_vasilescu)

FLOSS, or Free/Libre Open Source Software, is becoming an increasingly important, some may even say dominant, factor in the modern software economy [2]. As opposed to the traditional methods of software development, FLOSS projects function and receive high-quality code submissions often despite the lack of financial compensation and the lack of any formalized management or governance structure [1, 2, 5]. While FLOSS projects are typically guided or managed by a small number of core developers, they survive and, indeed, thrive by attracting contributions from new, talented software developers who join the project. There are many types of contributors, ranging from those who have contributed a number of patches and are on their way to becoming part of the core, to those that lack adequate coding skills and contribute primarily by reporting bugs and editing documentation [3, 4, 6]. To be successful, FLOSS projects must constantly recruit new code contributors to replace those that leave (many do so within a year of joining) [7].


In our work [8], we define a special type of FLOSS contributor called “One-Time code Contributors” (OTCs): contributors who have had one, and only one, code patch merged into a given FLOSS project. This successful contribution indicates that the OTC (1) has an appropriate level of coding skills and knowledge to make a valuable contribution to the project and (2) has the determination necessary to write the code, submit the patch, and participate in the review process to get the patch accepted. Because FLOSS projects could greatly benefit by attracting these technically competent participants to submit more code, two questions arise: (1) why do OTCs not contribute additional patches? and (2) is there any way the FLOSS projects can attract and retain these valuable contributors?
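
The definition above can be sketched in a few lines of code. This is only an illustration, assuming a hypothetical input in which each merged patch of one project is recorded by its author's identifier; the OtcFinder class and its method are invented for this example and are not the tooling used in the study.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper: identifies One-Time code Contributors (OTCs) in one project.
final class OtcFinder {
    private OtcFinder() {}

    // mergedPatchAuthors holds one entry per merged patch (the author's identifier).
    // Returns the authors who appear exactly once, in sorted order.
    static Set<String> findOtcs(List<String> mergedPatchAuthors) {
        // Count merged patches per author.
        Map<String, Long> mergedCount = mergedPatchAuthors.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        // Keep only the authors with exactly one merged patch.
        return mergedCount.entrySet().stream()
                .filter(e -> e.getValue() == 1L)
                .map(Map.Entry::getKey)
                .collect(Collectors.toCollection(TreeSet::new));
    }
}
```

For example, given the authors alice, bob, alice, and carol, the sketch would report bob and carol as OTCs, because alice has two merged patches.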


To answer these questions, and understand how OTCs could better contribute to FLOSS projects in the future, we conducted a survey of 184 OTCs from 23 popular FLOSS projects. The remainder of this post summarizes some of our key findings.


Initial Impression of Project Members
When asked about their initial impression of other project members, a large percentage of the respondents indicated they had a positive or very positive impression of their fellow project members (Figure 1). This result is surprising because previous research indicated that peripheral developers may be neglected, so we expected them to have more negative impressions [6]. Some of the most common positive responses included: project members are skilled, helpful, or responsive. These responses indicate that OTCs appreciate the assistance that other project members can provide. However, some OTCs did report negative impressions of fellow project members. These negative impressions were just as strong as the positive impressions. The most common negative impressions included: project members are busy, unresponsive, or otherwise unhelpful.




Figure 1 - Overall Impressions


Tradeoff Between Skill and Busyness  
There was an interesting interaction between the most common positive impression, skilled, and the most common negative impression, busy. OTCs expected those project members who were more skilled to be less approachable and less obtainable, in other words, busier. Conversely, they expected less skilled project members to be less busy and more approachable. OTCs did not seem displeased with this trade-off—in fact, they seemed to expect it. Even so, the OTCs whose experience was that skilled members were not too busy to pay attention to them reported more positive impressions, while others who experienced the busier skilled members expressed more neutral or negative impressions.


OTCs’ Motivations for Contributing
Previous research found that peripheral contributors (of which OTCs are a subgroup) tend to be motivated more extrinsically (that is, by external factors like fixing a bug) than intrinsically (that is, by internal factors such as enjoying coding as a hobby) [4, 7]. This observation would suggest that a large portion of the OTC contributions would be in the form of “drive-by commits,” that is, fixing a flaw without any real desire to join the project. When asked about their motivation to contribute their patch, the respondents listed a variety of motivations, as Figure 2 illustrates. While respondents indicated the desire to fix a bug as the most common motivation, they gave a more intrinsic motivation, share with the community, as the second most common motivation. Third, respondents indicated they were motivated by an employer’s need. These respondents were not hired to contribute to the FLOSS project, but rather added to or fixed the FLOSS project in support of other employer goals. Fourth, respondents gave another intrinsic motivation: they wanted to add a new feature and have it maintained by the project. Perhaps unsurprisingly, this motivation often overlapped with the intrinsic motivation, share with the community. Other, less common motivations include ‘scratching an itch,’ personal reputation, or curiosity about the project. This list of varied motivations suggests that, while some OTCs may not be interested in long-term project participation, others have deeper motivations and could ultimately be attracted to join the project and contribute more than a single patch.




Figure 2 - OTCs’ Motivations


Barriers Faced by OTCs
When asked whether there were any barriers that prevented them from continuing to contribute to the project, only half of the respondents indicated that they faced barriers, as seen in Figure 3. The most commonly reported barrier was time. Either it took the respondent too long to make the contribution, or the respondent was too busy with other work. The next most common barriers were patch submission difficulties and entry difficulties. While FLOSS projects and tools oriented to help project newcomers cannot address the time barrier, they can encourage OTCs to continue contributing by reducing the difficulties associated with project entry and with patch submission.




Figure 3 - OTC Barriers


OTCs Who Stopped Contributing Despite Not Facing Barriers
Of the respondents who did not face barriers, nothing else to contribute was the most common reason to stop contributing. This response is interesting because it suggests that if these OTCs did find something else to contribute, they might return to the project. In other words, these OTCs could be motivated to continue contributing to the project, because they exhibited no particularly strong desire to leave it. Other reasons for leaving a project included that their employer no longer used the project and that they never had any intention of becoming a project member. These results suggest that while not all OTCs can be motivated to make additional contributions, some very likely could be, under the right circumstances.


Conclusion
Though OTCs have only successfully contributed one code patch, FLOSS projects may be able to attract some of them to make further contributions. While some OTCs truly are “drive-by committers,” whose only interest in the project was to have their patch included, others are more community-minded and more invested in the project. This second group of OTCs may have contributed additional code patches had they not encountered barriers or had they identified another interesting patch to contribute. By lowering the entry barriers and making patch submission easier, FLOSS projects can likely retain more OTCs, thereby increasing the size and skill of the contributor base, which leads to more successful FLOSS projects.


References
  1. Lakhani KR, Wolf RG (2005). “Why Hackers Do What They Do: Understanding Motivation and Effort in Free/Open Source Software Projects.” In J Feller, B Fitzgerald, S Hissam, KR Lakhani (eds.), Perspectives on Free and Open Source Software, MIT Press, Cambridge.
  2. M. Milinkovich. Keynote, Topic: "Open Collaboration, the Eclipse Way." International Conference on Software Engineering, Buenos Aires, Argentina, May 24 2017.
  3. G. Pinto, I. Steinmacher and M. A. Gerosa, "More Common Than You Think: An In-depth Study of Casual Contributors," 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), Suita, 2016, pp. 112-123.
  4. Setia R. et al., “How Peripheral Developers Contribute to Open-Source Software Development,” Information Systems Research, vol. 23 issue 1 pp. 144-163, 2012.
  5. Shah, S. K., “Motivation, Governance, and the Viability of Hybrid Forms in Open Source Software Development,” Management Science, vol. 52 issue 7 pp.1000-1014, 2006.
  6. Steinmacher, I. et al., “Social Barriers Faced by Newcomers Placing their First Contribution in Open Source Software Projects.” 18th ACM Conference on Computer-Supported Cooperative Work and Social Computing, Vancouver, BC, Canada, 2015, pp. 1381-1392.
  7. Zhou, M. and Mockus, A., “What Make Long Term Contributors: Willingness and Opportunity in OSS Community.” 34th International Conference on Software Engineering (ICSE), Zurich, 2012, pp. 519-528.
  8. Lee, A., Carver, J. C. and Bosu, A., "Understanding the Impressions, Motivations, and Barriers of One Time Code Contributors to FLOSS Projects: A Survey." 39th International Conference on Software Engineering (ICSE), 2017, pp. 187-197.

Monday, September 11, 2017

IEEE July/August Issue, Blog, and SE Radio Summary

Associate Editor: Brittany Johnson (@drbrittjaydlf)

The July/August Issue of IEEE Software again provides articles that cover a range of software engineering-related topics. The themes in this issue include reliability and requirements engineering, technical debt, and agile development.

Special in this issue is a look at the history of IEEE Software in the article "Insights from the Past: The IEEE Software History Experiment" by Zeljko Obrenovic. By looking at the history website, this article argues for the practical value of using historical data. There is also discussion of the future of IEEE Software based on historical data.

The focus topic in this issue of IEEE Software was Reliability Engineering. This issue featured the following articles on reliability engineering:

One of the main concerns around reliability engineering is the growing ubiquity of software in safety critical systems. As pointed out by Diomidis Spinellis in "Software Reliability Redux", the more software is integrated into our daily lives, the more we need adequate reliability engineering. Going more in depth on the topic, in "Requirements Engineering for Safety-Critical Systems: Overview and Challenges", Martins and colleagues talk about some of the challenges that come with engineering safety critical systems. In particular, they focus on dealing with safety requirements, discussing topics ranging from reducing the gap between academia and industry to communicating requirements throughout the development process.

In "Safety Analysis of Safety-Critical Systems Using State-Space Models", Kumar and colleagues made one suggestion for dealing with safety analysis of safety critical systems: using state-space models for safety issue prognosis. They created an approach that takes UML state diagrams and maps them to state-space models, using Petri nets for dynamic behavioral analysis. They were able to validate their approach using a nuclear power plant's emergency core cooling system. Despite their findings, there are still strides that can be made to improve the state of the art with regard to the processes and tools in place for building and maintaining reliable safety critical systems.

IEEE Software Blog

The IEEE Software Blog posts in July and August focused on vulnerabilities. Mehdi Mirakhorli wrote a blog post on tactical vulnerabilities, or vulnerabilities that result from incorrect implementation or deterioration of security tactics during coding and maintenance. Cor-Paul Bezemer and Zhen Ming (Jack) Jiang explored if and how developers use performance testing to identify vulnerabilities. They studied Java-based open source projects on GitHub and found that there may be missing tool support for effective performance testing. Maleknaz Nayebi and Federica Sarro used crowdsourcing in a case study of the Fort McMurray wildfire, along with their method called MAPFEAT, to determine useful mobile app features.

SE Radio

SE Radio welcomed two new hosts in July and August: Kishore Bhatia of BlockApps and Bryan Reinero of MongoDB. For Kishore's first broadcast, he spoke with Kieren James-Lubin about security topics such as Blockchains, Cryptocurrency, Bitcoins, and Distributed Ledger. For Bryan's first broadcast, he and Jason Hand discussed handling outages and responding to program failures.
Other topics discussed on SE Radio in July and August include type driven development, rules engines, and P vs. NP.


Monday, September 4, 2017

IEEE May/June Issue, Blog, and SE Radio Summary

Associate Editor: Brittany Johnson (@drbrittjaydlf)

The May/June issue of IEEE Software provided a wide range of articles on software-related issues; from energy-aware systems to deep learning, there's something for just about everyone. Although the focus of this issue was automotive software, there are features on various other software topics.

The focus topic in this issue of IEEE Software was Automotive Software. This issue featured the following articles on Automotive Software.

As suggested by the Editor, Diomidis Spinellis, in his article "How Abundance Changes Software Engineering", the increase in software processing power has led to changes in how we use computing technologies. This includes the relatively recent innovation of automotive software. In "Future Automotive Architecture and the Impact of IT Trends", Traub and colleagues discuss the opportunities that advances in IT and consumer electronics afford the automotive industry. Much of the discussion focuses on the importance of seamless and scalable architecture for automotive software.

With the creation of things like Google's self-driving car, and the potential for this to be a commercially available product, the importance of security in the software behind our vehicles is increasing. The authors of "Secure Automotive Software: The Next Steps" discuss this issue in depth, starting with the challenges faced by developers that work on automotive software. Based on these challenges, the authors discuss some recommendations they have for improving the security of software used in the auto industry. Some of their recommendations include using static analysis for compile-time assurance and cryptography for runtime protection.


This issue also included articles on the feature topics of mobile app development, agile development, continuous deployment, and the business of software engineering  encompassed in the following articles:



IEEE Software Blog

The topics in the blog posts for May and June show a little more diversity than last month's...figuratively and literally. Anna Filippova of Carnegie Mellon contributed a blog post on how brainstorming can be used to support inclusiveness in diverse teams. Other topics discussed in these blog posts include cross-stack configuration errors, library adoption, and the effect of casual contributions on software quality.

SE Radio 

In this issue, SE Radio welcomed another new host, Matthew Farwell of Nexthink. In his debut, he spoke with Yakov Fain about Angular, including who should use it and why. Many of the episodes in this issue center around technologies and how developers are using them, including Elasticsearch, Docker, and LLVM. SE Radio also deployed a listener survey to get feedback from their listeners -- don't forget to provide your feedback!

Monday, July 24, 2017

Tactical Vulnerabilities in Chromium, PHP and Thunderbird

By:  Mehdi Mirakhorli (@MehdiMirakhorli), Associate Editor

Software engineers face a constantly growing pressure to build secure software by design, where systems have to be designed from the ground up to be secure and resistant to attacks. To achieve this goal, security architects work with various stakeholders to identify security requirements and adopt appropriate architectural solutions to address these requirements. These architectural solutions are often based on security tactics. Security tactics are reusable solutions to satisfy security quality attributes regarding resisting attacks (e.g., tactic “Authenticate Actors”), detecting attacks (e.g., tactic “Detect Intrusion”), reacting to attacks (e.g., tactic “Revoke Access”), and recovering from attacks (e.g., tactic “Audit”). Despite the significant efforts that go into designing secure systems, security can slowly erode because of ongoing maintenance activities. Incorrect implementation of security tactics or the deterioration of security tactics during coding and maintenance activities can result in vulnerabilities in the security architecture of the system, thus compromising key security requirements. We refer to these vulnerabilities as tactical vulnerabilities.

The code snippet in Listing 1 from a J2EE web application shows an example of such tactical vulnerabilities which is the incorrect implementation of the “Manage User Sessions” tactic. The correct implementation of this tactic in a web application would allow the system to keep track of users that are currently authenticated (including permissions they hold). However, in the given code snippet, the application authenticates users with LoginContext.login() without first calling HttpSession.invalidate() to invalidate any existing session. This enables attackers to fixate (i.e., find or set) another user’s session identifier (e.g., by inducing a user to initiate a session using the session identifier provided by the attacker). Once the user authenticates him/herself with this forged session identifier, the attacker would be able to hijack or steal his/her authenticated session. Although architects have used the “Manage User Sessions” tactic in the architecture design of the web application, the developers have failed to implement it correctly, resulting in a tactical vulnerability that can be exploited for session fixation attacks.
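
To make the difference concrete, here is a minimal sketch of session fixation and its remedy. It is not the original listing: it uses a hypothetical in-memory SessionRegistry class as a stand-in for a servlet container, so that it runs without the J2EE libraries. In a real web application, the role of loginWithFreshSession would be played by calling HttpSession.invalidate() before authenticating, as described above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory stand-in for a servlet container's session handling.
final class SessionRegistry {
    // Maps session identifier -> authenticated user name (null while anonymous).
    private final Map<String, String> sessions = new HashMap<>();

    // Registers an anonymous session under a client-supplied identifier,
    // which is what an attacker relies on when planting a known identifier.
    void openAnonymous(String sessionId) {
        sessions.put(sessionId, null);
    }

    // Vulnerable login: the pre-authentication identifier stays valid.
    String loginKeepingSession(String sessionId, String user) {
        sessions.put(sessionId, user);
        return sessionId;
    }

    // Safer login: drops the old session (the step HttpSession.invalidate() performs)
    // and binds the user to a fresh, unpredictable identifier.
    String loginWithFreshSession(String sessionId, String user) {
        sessions.remove(sessionId);
        String freshId = UUID.randomUUID().toString();
        sessions.put(freshId, user);
        return freshId;
    }

    // Returns the user bound to the identifier, or null if there is none.
    String userFor(String sessionId) {
        return sessions.get(sessionId);
    }
}
```

With loginKeepingSession, an identifier that the attacker planted before login ends up bound to the victim's authenticated session. With loginWithFreshSession, the planted identifier no longer maps to any user once the victim has logged in.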



Recent empirical studies of security vulnerabilities have neglected the architectural context, including design decisions such as tactics and patterns. They mostly focus on studying and understanding coding issues related to the management of data structures and variables (e.g., buffer overflow/over-read). 

Goal of This Study

Here, I’d like to report an in-depth case study of software vulnerabilities associated with architectural security tactics across three large-scale open-source systems (Chromium, PHP, and Thunderbird). In this blog post, I present only the results; the scientific process and systematic approach used to reach the conclusions can be found in our research article.


Common Tactical Vulnerabilities

Table I lists the root causes (i.e., vulnerability types) of tactical vulnerabilities in each of the three studied systems, the related architecture tactics, as well as the total number of CVEs caused by the given vulnerability type.



Key findings

  • While Chromium, PHP, and Thunderbird have adopted a wide range of architectural tactics to secure the systems by design, a remarkable number of vulnerabilities discovered in these systems are due to incorrect implementations of these tactics.
  • Improper Input Validation (CWE-20) and Improper Access Control (CWE-284) are the most frequently occurring root causes of security vulnerabilities in Chromium, PHP, and Thunderbird.

  • Vulnerabilities in the three studied systems are mostly related to tactics “Validate Inputs” and “Authorize Actors” for resisting attacks. 

  • Security of studied projects was compromised by reusing or importing vulnerable versions of third-party libraries. In the case of Chromium, such vulnerabilities occurred 106 times, while in Thunderbird and PHP, 7 and 8 times, respectively.
  • Tactical and non-tactical vulnerabilities have a similar distribution over time and releases, even though the absolute numbers of tactical and non-tactical vulnerabilities differ.
  • When fixing tactical vulnerabilities, there is no statistically higher or lower code churn compared to fixing non-tactical vulnerabilities.
  • When fixing tactical vulnerabilities, the number of affected files is not statistically significantly higher or lower compared to fixing non-tactical vulnerabilities.




Monday, July 17, 2017

Performance testing in Java-based open source projects

by Cor-Paul Bezemer (@corpaul), Queen's University, Canada
Associate Editor: Zhen Ming (Jack) Jiang, York University, Canada


From a functional perspective, the quality of open source software (OSS) is on par with comparable closed-source software [1]. However, in terms of nonfunctional attributes, such as reliability, scalability, or performance, the quality is less well-understood. For example, Heger et al. [2] stated that performance bugs in OSS go undiscovered for a longer time than functional bugs, and fixing them takes longer.

As many OSS libraries (such as apache/log4j) are used almost ubiquitously across a large span of other OSS or industrial applications, a performance bug in such a library can lead to widespread slowdowns. Hence, it is of utmost importance that the performance of OSS is well-tested.

We studied 111 Java-based open source projects from GitHub to explore to what extent and how OSS developers conduct performance tests. First, we searched for projects that included at least one of the keywords 'bench' or 'perf' in the 'src/test' directory. Second, we manually identified the performance and functional tests inside that project. Third, we identified performance-sensitive projects, which mentioned in the description of their GitHub repository that they are the 'fastest', 'most efficient', etc. For a more thorough description of our data collection process, see our ICPE 2017 paper [3]. In the remainder of this blog post, the most significant findings of our study are highlighted.
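
As a rough sketch of the first selection step, the filter below flags paths that lie under 'src/test' and contain 'bench' or 'perf'. Matching the keywords against path names, ignoring case, is an assumption made for this illustration. The PerfTestCandidateFilter class and the example paths are hypothetical and do not reproduce the scripts used in the study.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical filter approximating the keyword-based project selection step.
final class PerfTestCandidateFilter {
    private static final String TEST_DIR = "src/test/";

    private PerfTestCandidateFilter() {}

    // True if the path lies under src/test and the part below that directory
    // contains 'bench' or 'perf' (case-insensitive).
    static boolean isCandidate(String relativePath) {
        String normalized = relativePath.replace('\\', '/').toLowerCase(Locale.ROOT);
        int index = normalized.indexOf(TEST_DIR);
        if (index < 0) {
            return false;
        }
        String underTestDir = normalized.substring(index + TEST_DIR.length());
        return underTestDir.contains("bench") || underTestDir.contains("perf");
    }

    // Returns the subset of paths that match the keyword filter.
    static List<String> candidates(List<String> relativePaths) {
        return relativePaths.stream()
                .filter(PerfTestCandidateFilter::isCandidate)
                .collect(Collectors.toList());
    }
}
```

A filter like this only produces candidates. As described above, the performance tests themselves were then identified manually.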

Finding # 1 - Performance tests are maintained by a single developer or a small group of developers. 
In 50% of the projects, all performance test developers are one or two core developers of the project. In addition, only 44% of the test developers worked on the performance tests as well.

Finding # 2 - Compared to the functional tests, performance tests are small in most projects. 
The median SLOC (source lines of code) in performance tests in the studied projects was 246, while the median SLOC of functional tests was 3980. Interestingly, performance-sensitive projects do not seem to have more or larger performance tests than non-performance-sensitive projects.

Finding # 3 - There is no standard for the organization of performance tests. 
In 52% of the projects, the performance tests are scattered throughout the functional test suite. In 9% of the projects, code comments are used to communicate how a performance test should be executed. For example, the RangeCheckMicroBenchmark.java file from the nbronson/snaptree project contains the following comment:
/*
* This is not a regression test, but a micro-benchmark.
*
* I have run this as follows:
*
* repeat 5 for f in -client -server;
* do mergeBench dolphin . jr -dsa\
*       -da f RangeCheckMicroBenchmark.java;
* done
*/
public class RangeCheckMicroBenchmark {
...
}

In four projects, we even observed that code comments were used to communicate the results of a previous performance test run.

Finding # 4 - Most projects have performance smoke tests. 
We identified the following five types of performance tests in the studied projects:
  1. Performance smoke tests: These tests (50% of the projects) typically measure the end-to-end execution time of important functionality of the project.
  2. Microbenchmarks: 32% of the projects use microbenchmarks, which can be considered performance unit tests. Stefan et al. [4] studied microbenchmarks in depth in their ICPE 2017 paper.
  3. One-shot performance tests: These tests (15% of the projects) were meant to be executed once, e.g., to test the fix for a performance bug.
  4. Performance assertions: 5% of the projects try to integrate performance tests in the unit-testing framework, which results in performance assertions. For example, the TransformerTest.java file from the anthonyu/Kept-Collections project asserts that one bytecode serialization method is at least four times as fast as the alternative.
  5. Implicit performance tests: 5% of the projects do not have performance tests, but simply yield a performance metric (e.g., the execution time of the unit test suite). 
The different types of tests show that there is a need for performance tests at different levels, ranging from low-level microbenchmarks to higher-level smoke tests.
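
To illustrate the fourth type, a performance assertion can be sketched as a helper that times two alternatives and fails when the expected speed-up is not reached. The PerformanceAssertions class below is hypothetical and is not taken from any of the studied projects. It shows the idea only, without the warm-up and statistical rigor that reliable measurement needs.

```java
// Hypothetical helper for expressing a performance assertion inside a unit test.
final class PerformanceAssertions {
    private PerformanceAssertions() {}

    // Runs the task the given number of times and returns the elapsed wall-clock nanoseconds.
    static long timeNanos(Runnable task, int repetitions) {
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    // True if the candidate time is at least `factor` times smaller than the baseline time.
    static boolean isAtLeastTimesFaster(long candidateNanos, long baselineNanos, double factor) {
        return candidateNanos * factor <= baselineNanos;
    }

    // Times both alternatives and throws if the candidate misses the required speed-up.
    static void assertAtLeastTimesFaster(Runnable candidate, Runnable baseline,
                                         int repetitions, double factor) {
        long candidateNanos = timeNanos(candidate, repetitions);
        long baselineNanos = timeNanos(baseline, repetitions);
        if (!isAtLeastTimesFaster(candidateNanos, baselineNanos, factor)) {
            throw new AssertionError("expected candidate to be at least " + factor
                    + "x faster, but measured " + candidateNanos + " ns vs "
                    + baselineNanos + " ns");
        }
    }
}
```

A test in the style described above might call assertAtLeastTimesFaster with a factor of 4.0. Wall-clock assertions of this kind are sensitive to JIT warm-up and machine load, so they can fail intermittently.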

Finding # 5 - Dedicated performance test frameworks are rarely used. 
Only 16% of the studied projects used a dedicated performance test framework, such as JMH or Google Caliper. Most projects use a unit test framework to conduct their performance tests. One possible explanation is that developers are trying hard to integrate their performance tests into the continuous integration processes. 

The main takeaway of our study

Our observations imply that developers are currently missing a “killer app” for performance testing, which would likely standardize how performance tests are conducted, in the same way as JUnit standardized unit testing for Java. A ubiquitous performance testing tool will need to support performance tests on different levels of abstraction (smoke tests versus detailed microbenchmarking), provide strong integration into existing build and CI tools, and support both extensive testing with rigorous methods and quick-and-dirty tests that pair reasonable expressiveness with being fast to write and maintain, even by developers who are not experts in software performance engineering.

References

[1] M. Aberdour. Achieving quality in open-source software. IEEE Software. 2007.
[2] C. Heger, J. Happe, and R. Farahbod. Automated Root Cause Isolation of Performance Regressions During Software Development. In Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering (ICPE). 2013.
[3] P. Leitner and C.-P. Bezemer. An exploratory study of the state of practice of performance testing in Java-based open source projects. In Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering (ICPE). 2017. 
[4] P. Stefan, V. Horky, L. Bulej, and P. Tuma. Unit testing performance in Java projects: Are we there yet? In Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering (ICPE). 2017.

If you like this article, you might also enjoy reading:

[1] Jose Manuel Redondo, Francisco Ortin. A Comprehensive Evaluation of Common Python Implementations. IEEE Software. 2015.
[2] Yepang Liu, Chang Xu, Shing-Chi Cheung. Diagnosing Energy Efficiency and Performance for Mobile Internetware Applications. IEEE Software. 2015.
[3] Francisco Ortin, Patricia Conde, Daniel Fernández Lanvin, Raúl Izquierdo. The Runtime Performance of invokedynamic: An Evaluation with a Java Library. IEEE Software. 2014.