Sunday, March 27, 2016

Should software developers be replaced by the crowd?

by Thomas D. LaToza (@ThomasLaToza) and André van der Hoek (@awvanderhoek)
Associate Editor: Christoph Treude (@ctreude)

We value our professional software developers, hire them in teams (local or remote), and set off on our merry way producing projects. This has been the status quo for a long time. But does it need to be? Could the crowd supplant the need for explicitly hired teams? More strongly yet: should the crowd perhaps replace hired employees?

These are tantalizing questions, both in terms of excitement about possibilities and in terms of 'fear' regarding the significant disruption that may play out in the workplace. Here, we consider several aspects of the issue of crowds versus hired employees.

1. Isn’t this just open source?

According to Howe [1], crowdsourcing is “the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call”. In software, open source is one of the oldest and most successful crowdsourcing models. But not all crowdsourcing is open source. Platforms such as TopCoder employ a competition model, in which developers respond to a client request and clients select and reward a winner. Others, such as TryMyUI, offer the crowd short, well-defined tasks that, when taken together, compose into a larger whole – for instance, comprehensively testing the usability of an application. Labor markets such as UpWork offer the promise of on-demand, fluid labor forces with the skills necessary for the job at hand, with workers ready to take on specialized work for an hour, a day, or maybe a week.

2. What about the speed of development?

As the old adage goes, many hands make light work. Fundamental to many crowdsourcing models is the decomposition of a large task into many small tasks. In principle, decomposed tasks enable work to be completed dramatically faster, as they enable parallel distribution to and completion of tasks by the crowd. But decomposition itself brings challenges: communicating tasks to developers, coordinating contributions made by the crowd, and ensuring smooth handoffs between tasks. In practice, these competing forces have given rise to a wide diversity of competing systems and platforms, using different granularities of tasks to crowdsource a variety of software development activities.
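To make the decomposition idea concrete, here is a minimal Python sketch of handing out decomposed microtasks in parallel. The task names, the worker function, and the use of a thread pool are all illustrative; no real crowdsourcing platform exposes this API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical microtasks, each small enough for one crowd contributor
# (the task names and the crowd_worker function are illustrative,
# not the API of any real platform).
microtasks = [f"test screen {i}" for i in range(8)]

def crowd_worker(task):
    # A contributor picks up one self-contained task and completes it.
    return f"{task}: done"

# Because the tasks are independent, they can be handed out and
# completed in parallel rather than one after another.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(crowd_worker, microtasks))

print(len(results))  # 8
```

The speedup comes precisely from the independence of the tasks; the coordination and handoff challenges mentioned above arise when tasks are not independent.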

3. What about quality?

One common concern about opening contributions to anyone is the quality of the work that will result. As in traditional development, there are many approaches to managing quality. Work is almost always reviewed, either by the client, an agent of the client, or the crowd itself. Many platforms explicitly track reputation, allowing clients to identify those who have demonstrated quality work on similar tasks in the past and motivating developers to do high-quality work to maintain their reputation. Competition platforms use the crowd to generate competing alternatives, allowing the client to select the highest-quality solution. Through these mechanisms, it may, in fact, be possible to achieve higher quality through crowdsourcing, as some of our work has demonstrated [2]. On the other hand, it has been observed that, without proper management, quality may well suffer; care, thus, must be taken to focus on quality from the start [3].
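As a toy illustration of combining two of these mechanisms, the following Python sketch picks a competition winner by weighting review scores with contributor reputation. The scores, reputation values, and the multiplicative weighting policy are all hypothetical, not any platform's actual ranking scheme.

```python
# Hypothetical competition entries: each submission carries a review
# score and the author's reputation (both invented for illustration).
submissions = [
    {"author": "dev_a", "review_score": 7.5, "reputation": 0.6},
    {"author": "dev_b", "review_score": 8.0, "reputation": 0.4},
    {"author": "dev_c", "review_score": 7.8, "reputation": 0.8},
]

def quality(sub):
    # One plausible policy: weight the review score by past reputation,
    # favoring consistent contributors over one-off high scores.
    return sub["review_score"] * sub["reputation"]

winner = max(submissions, key=quality)
print(winner["author"])  # dev_c: slightly lower raw score, higher reputation
```

Note how the highest raw score (dev_b) loses to a contributor with a stronger track record, which is the incentive structure reputation systems aim for.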

4. What about crowdsourcing environments?

A significant innovation many crowdsourcing approaches bring is their dedicated support for performing crowdsourced work. Crowdsourcing platforms often provide, directly in the environment, support for contributors to browse and identify tasks matching their interests or that have the greatest chance of a reward. Clients may browse reputation information about potential workers, browse submitted contributions, and use the platform to make and manage payments. Systems that offer fine-grained tasks may go further still, offering workers self-contained tasks that can be quickly completed within the environment itself. For example, CrowdCode [4] offers an online environment for completing programming microtasks.

5. What about more complex development tasks?

Another common question is how a complex task such as architecture or design, which requires knowledge of the project as a whole or has dependencies that make decomposition difficult, could ever be crowdsourced. Of course, one approach is simply not to decompose such tasks. For example, some TopCoder tasks are architectural in nature, asking competing workers to create design documents. An experienced crowd worker called a “co-pilot” is then responsible for creating, coordinating, and managing the tasks to implement the architecture. However, this risks imposing a waterfall development process, in which an architecture is built up-front, independent of future programming work [3]. Alternatively, it may be possible for the crowd itself to be more involved, decomposing architectures into networks of connected decisions [5].

6. Can I help?

Crowdsourcing is already penetrating software development practice today [6][7]. TopCoder has hosted more than 427,000 software design, development, and data science competitions. More than 100,000 testers freelance on uTest. But we believe that this may be just the beginning [8]. Just as open source changed how many organizations do software development and the nature of organizations themselves, crowdsourcing opens the door to new types of contributions and new ways for stakeholders to interact, which may lead to new models of software development. To help understand where these models may fail and where they may succeed, we have begun to study crowdsourcing approaches through a variety of experiments, from debugging to programming to designing. If you’d like to see how such approaches may work and help discover new ways of developing software, sign up here.


[1] Jeff Howe. (2006). The Rise of Crowdsourcing. Wired, 14(6).
[2] Thomas D. LaToza, Micky Chen, Luxi Jiang, Mengyao Zhao, and André van der Hoek. (2015). Borrowing from the crowd: a study of recombination in software design competitions. International Conference on Software Engineering, 551-562.
[3] Klaas-Jan Stol and Brian Fitzgerald. (2014). Two's company, three's a crowd: a case study of crowdsourcing software development. International Conference on Software Engineering (ICSE), 187-198.
[4] Thomas D. LaToza, W. Ben Towne, Christian M. Adriano, and André van der Hoek. (2014). Microtask programming: building software with a crowd. Symposium on User Interface Software and Technology (UIST), 43-54.
[5] Thomas D. LaToza, Arturo Di Lecce, Fabio Ricci, W. Ben Towne, and André van der Hoek. (2015). Ask the crowd: scaffolding coordination and knowledge sharing in microtask programming. Symposium on Visual Languages and Human-Centric Computing (VL/HCC), 23-27.
[6] Thomas D. LaToza and André van der Hoek. (2016). Crowdsourcing in Software Engineering: Models, Motivations, and Challenges. IEEE Software, 33, 1 (January 2016), 74-80.
[7] Ke Mao, Licia Capra, Mark Harman, and Yue Jia. (2015). A survey of the use of crowdsourcing in software engineering. Technical Report RN/15/01, Department of Computer Science, University College London.
[8] Thomas D. LaToza and André van der Hoek. (2015). A vision of crowd development. International Conference on Software Engineering (ICSE), 563-566.

Sunday, March 13, 2016

Improving Bug Reporting for Mobile Applications

by Kevin Moran (@P10LeGacY)
Associate Editor: Federica Sarro (@f_sarro)

We currently live in the era of mobile computing. It is conceivable, as smartphones and tablets gain worldwide popularity, that many individuals’ primary computing device will soon be a mobile “smart” device. A large contributor to this phenomenon is the continued rise of an “App Store” economy where users turn to increasingly complex applications to accomplish mainstream computing tasks that were previously accomplished primarily in a desktop computing environment.
This has created a fiercely competitive software marketplace where, if an application is not performing as expected due to bugs, crashes, or a lack of customer-desired features, nearly half of users will simply abandon it for a similar application [1]. Therefore, for an application to be successful in a modern mobile ecosystem, development teams must be extremely efficient and effective during the software maintenance cycle, pushing regular updates to consumers. Unfortunately, several mobile-specific confounding factors, such as hardware and platform fragmentation, API instability and fault-proneness [2], and difficult-to-test features (e.g., sensors), make for a particularly challenging maintenance process.

Perhaps the most important software maintenance activity is bug report resolution. However, the manner in which bugs are currently reported to developers is far from ideal. For apps released in mobile marketplaces, most developers and teams must sort through potentially thousands of user reviews in which users may or may not describe bugs in a manner sufficient to allow bug reproduction and resolution. Additionally, a recent study has shown that only a small subset of user reviews can be considered useful or informative to developers [3]. Open source applications, or commercial applications undergoing internal or beta testing, will often make use of existing issue tracking systems (e.g., Bugzilla, Mantis, JIRA, GitHub Issue Tracker) to report information related to mobile bugs and crashes. However, these bug reporting systems rely mostly on unstructured natural language bug descriptions and lack key features that would better support and facilitate accurate bug reporting for mobile applications.

A previous study has shown that there are typically three key types of information that developers consider most helpful when comprehending and fixing issue reports [4]: 1) detailed reproduction steps, 2) stack traces, and 3) re-playable test cases or scenarios. Unfortunately, this information is also typically the most difficult for reporters to provide and, in the context of mobile applications, is further complicated by the highly GUI- and event-driven nature of mobile apps. This highlights the lexical gap that exists between the reporters of bugs and the developers attempting to read and fix the reports. Reporters, particularly end users and beta testers, typically have only a working functional knowledge of an application, whereas developers have intimate code-level knowledge; thus, the primary task that a bug reporting system must accomplish is bridging this gap by providing the types of information listed above. Together, these issues highlight the need for an improved reporting mechanism for mobile applications.

The SEMERU research team has taken the first step towards addressing these problems by developing a novel bug reporting mechanism called Fusion [5] that operates under the following key insight: automated program analysis techniques can be used to bridge the lexical knowledge gap between reporters and developers. Currently, issue tracking systems operate with essentially no prior knowledge of the application that they support, and given the information that can be gleaned from automated program analysis techniques, we saw a natural opportunity to create a “smart” bug reporting mechanism for mobile apps. Fusion uses fully automated static and dynamic analysis techniques to gather screenshots, GUI-component information, and app event-flow information, linking dynamically extracted information back to program source code to assist users in reporting issues and aid developers in resolving bugs. During the reporting process, users leverage a smart web form to fill out the reproduction steps for a bug using an <action, component> format, where the action is the touch event (e.g., tap, long-tap) and the component is the GUI component on the screen (e.g., button, checkbox). The form tracks the location of the user in the high-level event flow of the application, making auto-completion suggestions to guide the reporting (e.g., if the user reports that they tapped a button that leads to a new app screen, the system will only suggest components from the new screen). In the end, the developer is presented with a report that includes detailed reproduction steps, screenshots, and traceability links back to the source code.
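The auto-completion idea can be sketched in a few lines of Python. The screen names, components, and transitions below are invented for illustration; they are not Fusion's actual data model, just a minimal event-flow graph that constrains suggestions the same way.

```python
# A toy event-flow model in the spirit of Fusion's auto-completion.
# Screens, components, and transitions are hypothetical.
event_flow = {
    "main":  {"components": ["login button", "help link"],
              "transitions": {("tap", "login button"): "login"}},
    "login": {"components": ["username field", "password field", "submit button"],
              "transitions": {}},
}

def suggest(current_screen, last_step=None):
    # If the reporter's last <action, component> step triggers a screen
    # transition, follow it, then suggest only the components that
    # actually exist on the resulting screen.
    transitions = event_flow[current_screen]["transitions"]
    if last_step in transitions:
        current_screen = transitions[last_step]
    return current_screen, event_flow[current_screen]["components"]

screen, options = suggest("main", ("tap", "login button"))
print(screen, options)  # login ['username field', 'password field', 'submit button']
```

Constraining each suggestion to the components reachable from the reporter's current position is what keeps the resulting steps reproducible, even for reporters with no code-level knowledge.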

Figure 1: An Example Fusion Bug Report

In a user study with over 20 participants conducted by our lab, we found that users of the Fusion system generated more reproducible bug reports for 14 real-world open source Android applications than users of a comparable system, the Google Code Issue Tracker. Particularly encouraging is that a subset of users in our study had no background in coding or computer science, but were still able to construct detailed, actionable reports. We believe that this is the first step towards creating more effective bug reporting systems for mobile and GUI-based applications. However, these reports could enable much more. The rich information contained within them could support other typically expensive or difficult maintenance tasks, such as developer triaging or duplicate bug report detection.

You can access the Fusion tool, documentation, and demonstration videos at

[1] Mobile apps: What consumers really need and want
[2] Linares-Vásquez, M., Bavota, G., Bernal-Cárdenas, C., Di Penta, M., Oliveto, R., and Poshyvanyk, D., "API Change and Fault Proneness: A Threat to the Success of Android Apps", in Proceedings of 9th Joint Meeting of the European Software Engineering Conference and the 21st ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE’13)
[3] N. Chen, J. Lin, S. Hoi, X. Xiao, and B. Zhang. AR-Miner: Mining informative reviews for developers from mobile app marketplace. In Proceedings of the 36th International Conference on Software Engineering (ICSE’14)

[4] N. Bettenburg, S. Just, A. Schröter, C. Weiss, R. Premraj, and T. Zimmermann. What makes a good bug report? In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering (SIGSOFT ’08/FSE-16)
[5] K. Moran, M. Linares-Vásquez, C. Bernal-Cárdenas, and D. Poshyvanyk. Auto-completing bug reports for Android applications. In Proceedings of the 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE’15)

Sunday, March 6, 2016

Release Management of Mobile Apps

Associate Editor: Federica Sarro, @f_sarro

Among the many stories reported by news agencies and social networks about Syrian refugees, the impact of mobile apps on their journey, in particular WhatsApp and Telegram, was echoed widely. During their tough journey, refugees connected with each other, shared their experiences, reported border traffic, and updated their families about their health using VoIP apps: “Without a cell phone [and internet] you are lost on the road” [2].

Ranging from Sukey, built to help British student protesters in 2011, to Angry Birds, which brings entertainment, mobile apps have a major impact on business, economy, health, politics, education … in short, on our whole lives. Individual software engineers and their products have never been as close to the day-to-day lives of regular people as they are today. This impact is also expected to change the decision process for software development and evolution [3].

In a recently published study [1], we set out to better understand the release decision process for mobile apps and its impact on users. Unfortunately, despite a recent surge of research interest in release engineering [6], release engineering decisions and activities for mobile apps are still not well understood. To answer questions such as "How are release decisions for mobile apps made?" and "How do users perceive new releases?", we reached out to 36 developers and 674 users and found that release management of mobile apps differs from that of proprietary software.

First of all, the majority of mobile app developers, especially those with extensive experience, have a clear rationale for deciding when to release a new version of their mobile app. We identified six release strategies: a time-based strategy (release schedule) to release at specific points in time; marketing strategies to gain more visibility; quality-driven (test-driven) strategies to test with parts of the customer base or achieve a specific test coverage; a feature-based strategy to release hot features as soon as possible; a size-based strategy to keep the size of each update fixed; and/or an occasional strategy to release an app concurrently with the launch of well-known mobile devices.
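The time-based strategy is the simplest of the six to make concrete. The sketch below is illustrative, not taken from the study; the three-week interval merely echoes the survey's threshold for "frequent updates" mentioned later in this post.

```python
from datetime import date, timedelta

# A minimal sketch of a time-based release strategy: ship on a fixed
# cadence regardless of how much functionality has accumulated.
RELEASE_INTERVAL = timedelta(days=21)  # hypothetical three-week cadence

def due_for_release(last_release, today):
    # Release whenever the scheduled interval has elapsed.
    return today - last_release >= RELEASE_INTERVAL

print(due_for_release(date(2016, 2, 1), date(2016, 3, 1)))   # True (29 days)
print(due_for_release(date(2016, 2, 20), date(2016, 3, 1)))  # False (10 days)
```

The other strategies would replace the elapsed-time condition with a different trigger: a marketing event, a coverage target, a hot feature being ready, or an update reaching a fixed size.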

Second, as shown in the diagram above, we made a range of observations about how app developers perceive new releases. Developers believe that apps with frequent updates do not necessarily require more development effort or introduce substantial changes to an app. Indeed, the majority (61.1%) of developers believe that apps with frequent updates (more than once every three weeks) deliver less functionality and fewer changes in each version. However, there was no consensus on whether frequent releases affect customer feedback, i.e., whether they increase or decrease an app’s rating or download volume.

Third, our study of 674 users showed that mobile app users are not only aware of app updates but also notice regularity in the release cycles of their apps. Users expressed mixed feelings toward frequent app releases: they like apps with frequent updates, but at the same time, frequent updates can be discouraging and may even prompt them to uninstall an app. This might be why only half of the users turn on automatic updates. A summary of our findings from studying app users is presented in the diagram below.

Also, as part of the study, we identified a range of problems that users faced after updating their apps, as shown in the following picture. Finding solutions to these real problems is definitely of value.

The findings of this study pose several challenges that could be addressed by the research community:
Challenge 1. The impact of release frequency on mobile app users needs further study and analysis.
Challenge 2. The relation between release frequency and app quality is unclear.
Challenge 3. While developers believe that release strategy and frequency do not affect development practices and total effort, empirical analysis needs to go beyond app stores to compare with traditional software (desktop, server, web, etc.).

To sum up, app release practices differ from those of the non-app software that researchers are used to studying. App developers have different strategies and considerations compared to non-app developers, and in some cases those strategies were not known before (such as size-based strategies). Moreover, only half of users allow automatic updates of apps; hence, each app release needs to earn people’s attention. Given that the release strategy is a success factor for mobile apps, the effect of release practices such as release frequency needs further elaboration.

You can find more details of this study in our paper [1]. You may also consider attending the paper presentation at the SANER 2016 conference.


[1] M. Nayebi, B. Adams, and G. Ruhe. “Release Practices for Mobile Apps – What do Users and Developers Think?”, Proceedings of the 23rd IEEE International Conference on Software Analysis, Evolution, and Re-engineering, SANER 2016.

[2] WhatsApp offers lifeline for Syrian refugees on journey across Europe, last accessed February 2016.

[3] W. Maalej, M. Nayebi, T. Johann, and G. Ruhe. "Towards Data-Driven Requirements Engineering", IEEE Software, Jan 2016.

[4] S. McIlroy, N. Ali, and A. E. Hassan. "Fresh apps: An empirical study of frequently-updated mobile apps in the Google play store", Empirical Software Engineering, pp. 1-25, 2015.

[5] W. Martin, F. Sarro, and M. Harman. "Causal Impact Analysis Applied to App Releases in Google Play and Windows Phone Store", Technical Report RN/15/07, Department of Computer Science, University College London, 2015.

[6] B. Adams, and S. McIntosh. "Modern Release Engineering in a Nutshell - Why Researchers Should Care" in Leaders of Tomorrow: Future of Software Engineering, Proceedings of the 23rd IEEE International Conference on Software Analysis, Evolution, and Re-engineering, SANER 2016.