Monday, January 16, 2017

IEEE November/December Issue, Blog, SE Radio Summary

November/December Issue

IEEE Software Magazine

The November/December issue of IEEE Software offers a variety of relevant and interesting topics in the software world. From hot topics like crowdsourcing and agile to thought-provoking discussions on how research translates to practice, this issue spans a wide range of subjects. Tying together all the articles in this issue is an article on telling the story of computing and the role computers play in the art of storytelling. We as software engineers are artists, specializing in the art of technology and "using our software and our hardware as our brush and our canvas".

Featured in this issue are two articles on the changes in the software world that affect the developer and end user:
  • "A Paradigm Shift for the CAPTCHA Race: Adding Uncertainty to the Process" by Shinil Kwon and Sungdeok Cha, where the authors propose ways to improve CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) challenges for increased human ability and decreased bot ability to solve these challenges; and 
  • "Examining the Rating System Used in Mobile-App Stores" by Israel J. Mojica Ruiz, Meiyappan Nagappan, Bram Adams, Thorsten Berger, Steffen Dienst, and Ahmed E. Hassan, in which the authors explore how accurately user ratings in app stores map to actual user satisfaction levels with mobile apps.

A large portion of the papers in this issue discuss the artistry of the software architect and how the role of software architect has been changing, and will continue to change, with the changes in technology:

On one side, as technology changes, the importance of the role of the software architect increases. In "The Changing Role of the Software Architect," Editor in Chief Diomidis Spinellis discusses this phenomenon in some detail. As software evolves to play a more ubiquitous role in our lives and store more critical and personal information, the design of our software and systems becomes even more vital to achieving quality, secure transactions. For example, software architecture plays a direct role in the ability of attackers to find and manipulate attack surfaces, the places where enemies can target their attacks on a given system. This is such an important topic that research has been devoted to approximating and minimizing attack surfaces [1, 2, 3]. Although approximating an attack surface isn't necessarily an architecture problem, minimizing one is. The power to determine the design of a system, especially a critical system, is one that should not be taken lightly.

But as Benjamin Parker warned Spider-Man, "with great power comes great responsibility". If software architects are becoming more important to the software development and maintenance process, it naturally follows that their responsibilities can, and probably should, change. But how? Articles in this issue make some suggestions. For example, Rainer Weinreich and Iris Groher propose one change to the responsibilities of the software architect in their article "The Architect's Role in Practice: From Decision Maker to Knowledge Manager?". The authors interviewed practitioners to learn how the role of the architect has transformed. Architects are typically tasked primarily, if not solely, with making decisions regarding the design of the target system. However, the authors discovered that there are additional responsibilities that come with being a software architect, such as advisor and knowledge manager. All the practitioners they interviewed agreed that when it comes to knowledge management it is particularly important to document project-specific decisions. With the changes to the software architect role, there is a growing need for tools and guidelines to support architects' daily activities. Are we up for the challenge?

IEEE Software Blog

In the past couple of months, the IEEE Software Blog covered some interesting and practically relevant topics. New to the blog are postmortems, where we give companies an opportunity to discuss what is working and what challenges remain for software developers. December features the company Deducely. Along the same lines, there are blog posts regarding various aspects of the software development process, including using creativity in requirements engineering and how to identify and avoid code smells. Also featured in the November/December blog entries is a post on the panel titled "The State of Software Engineering Research", which was held last year at FSE 2016.

SE Radio

Featured for this issue of IEEE Software on SE Radio are topics ranging from soft skills, such as salary negotiation, to hard skills, like site reliability engineering and software estimation. Invited guests include Steve McConnell, Sam Aaron, Josh Doody, Björn Rabenstein, Gil Tene, and Peter Hilton. Also, SE Radio welcomed two new members to the SE Radio team: Marcus Blankenship and Felienne Hermans.

[1] Theisen, C., Herzig, K., Morrison, P., Murphy, B., & Williams, L. (2015). Approximating attack surfaces with stack traces. In Proceedings of the 37th International Conference on Software Engineering, Volume 2 (pp. 199-208). IEEE Press.
[2] Bartel, A., Klein, J., Le Traon, Y., & Monperrus, M. (2012). Automatically securing permission-based software by reducing the attack surface: An application to Android. In Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering (pp. 274-277). ACM.
[3] Manadhata, P. K., & Wing, J. M. (2011). An attack surface metric. IEEE Transactions on Software Engineering, 37(3), 371-386.

Sunday, January 8, 2017

BarraCUDA in the Cloud

by W.B. Langdon, UCL and Bob Davidson, Microsoft
Associate Editor: Federica Sarro (@f_sarro), University College London, UK 

What is this, flying fishes? Well, no. BarraCUDA is the name of a bioinformatics program, and the cloud in question is Microsoft's Azure, which is in the process of being upgraded with copious Nvidia Tesla K80 GPUs that support CUDA in virtual machine instances. BarraCUDA has been around for a few years [1]. It is a port of BWA [2] which takes advantage of the massive parallelism available on graphics hardware (GPUs) to greatly speed up approximate matching of millions of short DNA strings against a reference genome, such as the human reference genome [3]. Approximate matching is necessary because of noise, but primarily because the medical purpose of many DNA scans is to reveal differences between them and "normal" (i.e. reference) DNA. A typical difference is the substitution of one character for another, but tools like BarraCUDA also find matches where a character is inserted and where one is deleted. Although there are many sources of DNA data, BarraCUDA and similar programs are targeted at strings generated by "Next Generation Sequencing" (NGS) machines. These are amazing devices. A top-end NGS machine is now capable of generating more than a billion DNA strings, sequences of A, C, G or T letters. Part of the trade-off for this speed is that the strings are short (typically a hundred letters long) and noisy. The first step is to find where the short fragments of DNA came from by aligning the strings against a reference genome. To account for the various sources of noise, NGS is usually run with threefold redundancy, and sometimes a particularly important part of a person's genome may be scanned ten or more times. Given multiple alignments to the same part of the reference genome, it becomes possible to look for consistent variations.
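As a toy illustration of the matching problem (not how BarraCUDA itself searches; real aligners use indexed data structures rather than this quadratic dynamic program), the three edit operations named above correspond to the classic Levenshtein distance:

```javascript
// Toy edit distance between a short read and a reference fragment.
// Substitutions, insertions, and deletions each cost 1; real aligners
// like BWA/BarraCUDA use BWT/FM-index search instead of this O(n*m)
// dynamic program, but the match model is the same.
function editDistance(read, ref) {
  // dp[i][j] = distance between read[0..i) and ref[0..j)
  const dp = Array.from({ length: read.length + 1 },
                        (_, i) => [i, ...new Array(ref.length).fill(0)]);
  for (let j = 0; j <= ref.length; j++) dp[0][j] = j;
  for (let i = 1; i <= read.length; i++) {
    for (let j = 1; j <= ref.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                       // deletion
        dp[i][j - 1] + 1,                                       // insertion
        dp[i - 1][j - 1] + (read[i - 1] === ref[j - 1] ? 0 : 1) // (mis)match
      );
    }
  }
  return dp[read.length][ref.length];
}

console.log(editDistance('ACGT', 'AGGT')); // one substitution -> 1
```

An aligner then reports the reference position whose fragment has the lowest such distance to the read.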

BWA, BarraCUDA and Bowtie are members of a family of bioinformatics tools which have proved successful because they are able to compress the human reference genome into less than 4 gigabytes of RAM, making it possible to run an important part of the DNA analysis tool chain on widely available computers. Indeed, in the case of BarraCUDA, GPUs with 4GB are also widely available. Recently BarraCUDA was optimised using genetic improvement [4, 5] (see blog posting February 3, 2016). This updating prompted the question of whether it was possible to use BarraCUDA with epigenetics data.
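The compression these tools rely on is built on the Burrows-Wheeler transform, the "BW" in BWA. A naive sketch of the transform itself (real aligners build an FM-index over the transformed text and never sort full rotations like this):

```javascript
// Naive Burrows-Wheeler transform: append a terminator '$' (which sorts
// before A/C/G/T), sort all rotations of the text, and take the last
// column. The result groups similar characters together, which is what
// makes the reference genome highly compressible and searchable.
function bwt(text) {
  const s = text + '$';
  const rotations = [];
  for (let i = 0; i < s.length; i++) {
    rotations.push(s.slice(i) + s.slice(0, i));
  }
  rotations.sort();
  return rotations.map(r => r[r.length - 1]).join('');
}

console.log(bwt('GATTACA')); // -> "ACTGA$TA"
```

This quadratic-memory construction is only for illustration; production tools use suffix-array-based algorithms to build the transform of a 3-gigabase genome.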

To grossly oversimplify: whilst (to a good approximation) all the cells in your body contain the same DNA, what makes your cells different from each other is how that DNA is used. It is thought that, to a large extent, how DNA is enabled and disabled is controlled by epigenetic markers on that DNA itself. These epigenetic markers differ between cells. Indeed, the markers change not only between cells but also with the person's age and factors outside the cell. Since this is not fully understood, the study of epigenetics, particularly how it relates to disease, is a very active topic. Much of the Next Generation Sequencing technology can be reused by epigenetics. However, when matching epigenetic sequences against a reference, the reference is twice the size of the DNA reference. Fortunately this need has coincided with the launch of GPUs with larger memory (e.g. the Tesla K40 has 12GB), which in turn has coincided with the introduction of Azure cloud nodes with multiple K40s or K80s. Recently we have been benchmarking [6] BarraCUDA on epigenetics data supplied by Cambridge Epigenetics on Azure nodes.

Data from Nvidia

As of 30 Nov 2016 there were 1519 GPU articles in the USA National Library of Medicine (PubMed), 1221 (80%) of them published since the end of 2009.

[1] P. Klus et al. BarraCUDA. BMC Research Notes, 5(27), 2012.
[2] Heng Li, R. Durbin. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25(14):1754-1760, 2009.
[3]  Initial sequencing and analysis of the human genome. Nature 409, 6822, (15 Feb 2001), 860–921.
[4]  W.B. Langdon. Genetically improved software. In Amir H. Gandomi et al., editors, Handbook of Genetic Programming Applications, chapter 8, pages 181–220. Springer, 2015.
[5]  W.B. Langdon, Brian Yee Hong Lam, M. Modat, J. Petke, M. Harman. Genetic Improvement of GPU Software. Genetic Programming and Evolvable Machines. Online first.
[6] W.B. Langdon, A. Vilella, Brian Yee Hong Lam, J. Petke and M. Harman, Benchmarking Genetically Improved BarraCUDA on Epigenetic Methylation NGS datasets and nVidia GPUs. In Genetic Improvement 2016, (GECCO 2016) workshop, pages 1131-1132, 20-24 July, Denver.

You might also enjoy reading
Genetic Improvement, IEEE Software blog, February 3, 2016
GPGPUs for bioinformatics, Oxford Protein Informatics Group, April 17, 2013
Genome Sequencing in a Nutshell, Deborah Siegel and Denny Lee, May 24, 2016

Sunday, December 18, 2016

Postmortem: Deducely

by Aswin Vayiravan, Deducely (@Deducely)
Editor: Mei Nagappan (@MeiNagappan), University of Waterloo, Canada

Editor's Note: In a new series of posts, we are looking into what worked great and what challenges remain for software developers. We hope to curate many such postmortems in the future! Do email me directly if you have ideas on improving this series of posts. In the first post of this series, I reached out to a small startup company called Deducely in India.

About Us
Deducely is an AI-powered sales lead generation platform. Usually, software companies have a lead generation team that does a lot of manual work like researching, filtering, and qualifying prospects before the sales pitch happens. We have a tool that removes the burden of repeated manual work from these teams. We use TensorFlow to learn and track patterns and categorize leads. Also, we use NLTK to learn and extract the required information from unstructured data. We are a small bootstrapped startup headquartered in California but working out of a nondescript village in South India called Thiruparankundram.

What worked great for us?

  1. The SDLC model: Back in university we had meticulously studied the pros and cons of various software development lifecycle models: the waterfall model, the spiral model, agile models, etc. However, when I started my career at Freshdesk (a startup back then), it was surprising to find that I couldn't fit the software development happening there into any one of these theoretical models! It was a hybrid of everything! Similarly, at our startup Deducely we do not strictly follow any specific model, but the closest model we could relate to would be the spiral model. We plan what has to be built, brainstorm the features with our customers, and finally start building one small module at a time, test it, release it, and iterate.

  2. Development Platform: We are Linux lovers, and to get Linux onto our development devices we use Vagrant. Although this technology might be a tad bit old, we are huge fans of it! It helps us isolate various development environments. Even if we ruin the configuration of a particular Vagrant environment, we can always spin up a new one from a previous snapshot. This gives us the freedom to go and sudo without caring much about the consequences! We do not use any IDE; Atom is our text editor of choice.

  3. Maintaining the code: Git has become the de facto version control system for code these days. Its decentralized approach takes some time to master, but once you start making Git work for you, the benefits it offers are unparalleled. Whilst many bigger companies use GitHub Enterprise or Atlassian Bitbucket, we went ahead with GitLab, an open-source, self-hosted system with a plethora of features like revision control, continuous integration, a container registry, and an issue tracker.

  4. Tracking the tasks: It all happens with a very simple to-do board on Trello. We are just two people (myself and my co-founder Arun Kumar) working full time, with two more remote part-timers. We prioritize tasks and assign each to one person. Before we actually write and integrate different functions, we have a small chat about the function prototype, and once the function is coded we run ad-hoc unit tests on it. Apart from this, the bugs in our code are tracked using GitLab's inbuilt bug tracker.

A screenshot of our task board

A screenshot of the issues in one of our repositories

  5. Storing the data: Initially, there were a lot of arguments for using a conventional database like MySQL or PostgreSQL, but we settled on MongoDB because of its loosely typed schema. All the queries in MongoDB are simple JS function calls, and this is a huge advantage for a full-stack JS company like ours. Also, say we had to edit the schema of a MySQL table containing a billion records: it would have been a plain disaster, but with MongoDB we have better control over the schema and data types. Also, backing up and restoring the DB is fairly painless. Plus, replication, fault tolerance, and disaster recovery are made simple through MongoDB's replica sets. Though the mongo shell is the best way to connect to MongoDB, we prefer Robomongo.
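To illustrate why a loosely typed schema suits a JS shop, here is a toy, in-memory stand-in for MongoDB's find() (plain JavaScript, no database; the document fields and the minimal query operator support are made up for illustration):

```javascript
// MongoDB documents are loosely typed: records in one collection need
// not share a schema, and a query is just a JS object, as in
//   db.leads.find({ score: { $gt: 5 } })
// This miniature find() supports exact matches and $gt only.
const leads = [
  { name: 'Acme',    score: 9, tags: ['saas'] },
  { name: 'Initech', score: 3 },               // no 'tags' field: fine
  { name: 'Globex',  score: 7, region: 'EU' }  // extra field: also fine
];

function find(docs, query) {
  return docs.filter(doc =>
    Object.entries(query).every(([field, cond]) =>
      cond !== null && typeof cond === 'object' && '$gt' in cond
        ? doc[field] > cond.$gt
        : doc[field] === cond));
}

console.log(find(leads, { score: { $gt: 5 } }).map(d => d.name)); // -> ['Acme', 'Globex']
```

Adding or dropping a field is just a change to the objects you insert; no migration of a billion-row table is needed.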

Our biggest challenges

  1. Callback Hell: We are JavaScript aficionados, especially NodeJS. It gives us the ability to do multiple tasks in parallel instead of waiting for IO. We are a two-person company, and the NPM registry for NodeJS has a lot of trusted open-source third-party modules that we generously incorporate during development. Also, the community support for NodeJS is very mature: if we face a problem, it can be solved in a matter of a few Google searches. However, writing code free of callback hell is a huge challenge, not only with Node but with any JS flavor. Callbacks inside callbacks make the code unmaintainable and cluttered. We get around callback hell with a library called Bluebird. It makes code more maintainable, and certain code flows that are easily achievable in other languages but not easily possible in plain JS can be achieved with Bluebird.

  2. Sequential execution: We had to scrape data out of millions of web pages. JavaScript gave us a lot of power as it interacted natively with the DOM. However, it had one nasty side effect: in our case, we had to sequentially make millions of HTTP requests. In our initial days, I wrote a snippet to read the list of websites from the database and make those HTTP requests. Since NodeJS is non-blocking, controlling the code flow was a huge challenge: millions of HTTP requests went out in one shot, all at once, and we were never able to get the responses back. Controlling this behaviour of NodeJS was the biggest challenge. JS isn't very kind to sequential tasks, and we work around this restriction via messaging queues, specifically RabbitMQ. It has a powerful API and an easy-to-use GUI to monitor the status of the queue.
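A minimal sketch of the sequential pattern (fakeFetch is a made-up stand-in for a real HTTP call; in production, a queue worker such as a RabbitMQ consumer taking one message at a time gives the same serialization across processes):

```javascript
// The naive version fires every request at once:
//   urls.map(url => http.get(url, ...))   // millions in flight, replies lost
// Awaiting each response before issuing the next serializes the work.
function fakeFetch(url) {
  return new Promise(resolve => setImmediate(() => resolve(`body of ${url}`)));
}

async function crawlSequentially(urls) {
  const results = [];
  for (const url of urls) {
    // The next request starts only after this reply arrives.
    results.push(await fakeFetch(url));
  }
  return results;
}

crawlSequentially(['a.com', 'b.com']).then(r => console.log(r));
```

A middle ground is a bounded-concurrency pool (N requests in flight), which keeps throughput up without flooding the network.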


The RabbitMQ dashboard

Data Box
Developer: Deducely
Platform: Linux
Number of Developers: 2 full time and 2 part time
Length of development: 1 year
Lines of Code: 10K - 100K

Sunday, December 11, 2016

FSE 2016 Panel: The State of Software Engineering Research

by Matthieu Foucault, Carlene Lebeuf (@CarlyLebeuf), and Margaret-Anne Storey (@margaretstorey), University of Victoria
Cross Posted from Margaret-Anne Storey's Blog.

The 2016 International Symposium on the Foundations of Software Engineering hosted a panel of prominent software engineering researchers moderated by Margaret-Anne Storey. The slides presented during the panel can be found here.
Our panelists:

Tao Xie
University of Illinois at Urbana-Champaign
Tao is an ACM distinguished researcher. His research focuses on automated software testing, mobile security, and software analytics.

Laurie Williams
North Carolina State University
Laurie is a founder of the Extreme Programming / Agile Conference. Her research focuses on software security, testing, and agile programming.

Peri Tarr
IBM Research
Peri is a principal research staff member at IBM TJ Watson Lab and a technical lead for Cognitive Tools and Methods at IBM. Her research focuses on software composition and aspect oriented software development.

Prem Devanbu
University of California at Davis
Prem started his career as an industrial software developer, then worked at Bell Labs and AT&T Research before beginning to teach at University of California at Davis. His research focuses on empirical software engineering, naturalness of software, and social analytics.

Lionel Briand
University of Luxembourg
Lionel currently leads the Software Verification and Validation Lab at the University of Luxembourg. He strongly advocates research that is practical to industry.
Our panelists were asked to reflect on three questions related to research in software engineering:
  • Do you believe our community as a whole is achieving the right balance of science, engineering, and design in our combined research efforts?
  • What new or existing areas of research do you think our community should pay more attention to?
  • Do you have novel suggestions for how we could improve our research methods to increase the impact of software engineering research in the near and distant future?
Each panelist was asked to briefly present their thoughts on these questions. Then we opened the floor to questions, and the rest of the panel was dedicated to a discussion between panelists and members of the audience. Our summary of the panel discussion focuses on the panelist responses to the three questions posed as well as the themes that emerged from their responses.

Balancing Science and Engineering

A common theme that quickly emerged was the importance of the role of industry in research. To kickstart the group discussion, panelists were asked to reflect on a statement made by Jan Bosch of Chalmers University at a research conference a few weeks earlier:
“Research does not start in universities anymore, it starts in industry.”

A quick show of hands at the conference demonstrated that the majority of people in attendance seemed to agree with Jan Bosch’s claim. Williams also agreed with this statement and expanded it further by stating that “research starts in industry, because that’s the context”. Briand felt that because “we are in a discipline where most of the phenomenon we are studying cannot be reproduced in a lab environment”, as software engineers, “our lab is the industry”.
All panelists agreed that collaborating with practitioners (not only industry, but also open-source communities, governments, etc.) is essential to solve real problems. Williams drew connections between research in software engineering and biology:
“If we try to come up with problems that we think are interesting, that would be similar to a biologist never going outside. We have to go out there and see the problems that they have and then help with it.”
However, even if practitioners are aware of these problems, they may not be able to solve them. Briand mentioned that a lack of expertise and a lack of freedom to look at novel solutions might be to blame. Devanbu observed that researchers have this advantage:
“As a researcher, you can have a broader perspective that spans over several languages, and not only try to generalize observations, but also find effects that are only observable at an ecosystem level. It’s not only a question of freedom, but also of perspective that industrials don’t have because they are not considering different projects at the same time.”
Xie suggested we engage practitioners in research that is currently outside of their scope:
“If we show [practitioners] things outside of their scope (that in the longer term may be important), they may be more open, and may engage in collaborations with academic researchers, […] along with providing data, problems, or discussions.”
Members of the audience, namely Daniel Jackson (Massachusetts Institute of Technology) and Tom Ball (Microsoft Research), emphasized that a balance is needed, and that looking at basic science should not be left out. Notable examples, such as UNIX, Simula, ALGOL 60, and distributed systems, were not the product of massive empirical studies, but of academic researchers sitting in a room and brainstorming.
The discussion above illustrates the importance of making a conscious effort to reach out to practitioners. However, this is not an easy task and it requires real commitment and patience from researchers, as Peri Tarr mentioned:
“One of the problems that we face all the time as industrial researchers is gaining the trust of the people whose problem we’re going to help them solve […]. It can take months or years to get on the same page with the people who have a problem, to establish that yes, you’re looking for a way to solve their problem that will actually work for them, within, as Lionel points out, their real-world constraints.”
The audience (at FSE and listening to the broadcast) asked questions via Twitter about our role as researchers and how we collaborate with industry.

More discussion on these questions is needed! We invite you to participate in the blog discussion below.

Paying Attention to Other Areas of Research

In their opening statements, all panelists mentioned other areas of research that our community should look at.
Devanbu mentioned DevOps and IoT as other areas that the SE community tends to neglect. Xie mentioned SE research results that had a broad impact outside of the SE community, such as symbolic execution, delta debugging, or Representational State Transfer (REST). Xie further suggested that we consider more of the societal impact of SE research, advocating for a “bigger social responsibility” for researchers. He referred to the previous day’s keynote from Margaret Burnett about gender inclusiveness of software, and cited David Notkin’s 2013 quote:
“Anybody who thinks that we are just here because we are smart forgets that we’re also privileged, and we have to extend that further. So we have got to educate and help every generation”
Williams addressed the problem of cybersecurity as one of the main challenges for our community, stating that "we haven't yet provided software engineers the means to write secure code without impacting their own workflow." She said that software engineering researchers need to "situate [their] work in this world where there is someone working against [them], whether it's an attacker or someone doing something they aren't supposed to." The second research area Williams highlighted was "agile software development on steroids": the world of continuous integration, continuous deployment, DevOps, continuous experimentation, testing in production, etc. We need to explore ways of adopting these practices as well as understand their benefits and the risks they introduce.
Tarr insisted on focusing our research efforts at the intersection of Software Engineering and other “high impact, societally important, value creation areas”, such as health care, environment, cognitive sciences, security and privacy, and education. She said, “In every one of these areas, these people are trying to get new generations of software done, but they don’t know how to do it […]. We desperately need software engineers at the intersection of these areas.” She noted that the traditional areas of software engineering research are now being driven by practitioners and that, as researchers, we are privileged to have the opportunity to take bigger risks that lead to bigger rewards. We “shouldn’t be working in areas where we aren’t afraid to fail”.
Briand considers that, although all topics covered by our community are relevant to practitioners, our "research is largely disconnected from practical engineering needs and priorities" and we "fail to recognize the variations across domains and contexts". The needs and constraints of people developing software across these varying domains are completely different, and "there is no such thing as a universal solution to any software engineering problem". In the domain of software engineering, our working assumptions and contextual factors make a huge difference. Because there is a disconnect from particular needs and priorities, there is a gap in the research literature ("the gap between what I needed and what I could find was significant") that is too large to deal with.
What are your thoughts on the panelists’ suggestions for future software engineering research directions? Do you agree or disagree? Or do you suggest other areas we should pay attention to, e.g., are there other disciplines we should apply our results to, as the tweet below suggests? Let us know in the discussion below!

Widening Our Vision of What is Research

While discussing whether we need to pay more attention to different areas of research, the panelists were asked to comment on Jane Cleland-Huang’s (University of Notre Dame) tweet regarding fostering more diverse areas of research:

Cleland-Huang commented on her tweet by adding:
“It is easy for us as a community to lock into the same area. For example a lot of people do research that benefits from open-source systems, but other areas [are left out], such as immersive studies in industry, or areas where I do research in, such as requirements and traceability, where datasets are not so available. If we want to make a difference in those areas, what do we, as a community, need to do in terms of the review process and encouraging that kind of research?”
As discussed in the previous sections, if we want to do impactful research, we need to reach out to practitioners and look at the intersection of software engineering and other fields. However, this cannot happen if, as Xie mentions, we keep a "narrow-minded definition of what is a research contribution": a large number of papers that may have a high impact on industry will not make it into our venues. Too often our community rejects contributions that look at real-world problems because "it's not research, it's engineering". A notable example of this is Williams's comment regarding her research on agile software development: "My research and the research my lab did was initially rejected by the community because they considered that practitioners shouldn't do that (using agile methods). But they were doing it, so we have to accept what practitioners are doing on a widespread basis."
Xie mentioned that “we don’t have enough expertise or experience in the program committees to really judge whether there is a real problem or not.” Tarr furthered this with, “As a community, one of the most important things that we can do is to […] start establishing norms and bars for people who are conducting high risk, important research in important places and are going out into the world to get this information.”
In response to a question posed by Storey regarding how we know when our methods have crossed the line from research to pure engineering, Williams gave the advice that when we are shifting more to the engineering side of things, we should take a step back and reframe the problem in a more scientific way. For example, we can ask “what are the independent variables?” that will allow us to switch to a scientific way of thinking about our research.
This discussion is related to one tweet we received ahead of the panel:

It would seem that our community may need to accept contributions that differ according to their engineering, scientific and design content, but if that is so, do we need to establish different criteria when assessing papers? Jonathan Bell suggested in a tweet that we consider not just evaluation approaches, but also our datasets and tools:

Finally, some discussion that occurred on Twitter suggests we rethink how our community considers negative results and that we look to how other research areas embrace not just positive results:

In summary, we wish to thank the conference organizers for suggesting this panel, and we thank the panelists and the FSE community for participating in this discussion! And we hope to continue the discussion in the comments below!

Sunday, November 27, 2016

Performance and the Pipeline

- How can performance analysis keep up with ever faster and more frequent release cycles in the DevOps world?

by Felix Willnecker (@Floix), fortiss GmbH, Germany, Johannes Kroß, fortiss GmbH, Germany, and André van Hoorn (@andrevanhoorn), University of Stuttgart, Germany
Associate Editor: Zhen Ming (Jack) Jiang, York University, Canada

Back in the “good old days”, a release occurred every month, quarter, or year—leaving enough time for a thorough quality analysis and extensive performance/load tests. However, these times are coming to an end or are almost over. Deploying every day, minute, or every couple of seconds becomes the new normal [1]. Agile development, test automation, consequent automation in the delivery pipeline and the DevOps movement drive this trend that conquers the IT world [2]. In this world, performance analysis is left behind. Tasks like load tests take too long and have a lot of requirements on the test and delivery environment. Therefore, performance analysis tasks are nowadays skipped and performance bugs are only detected and fixed in production. However, this is not a willful decision but an act from necessity [3]. The rest of this blog post is organized as follows: First, we outlines the three strategies on including performance analysis in your automatic delivery pipeline without slowing down your release cycles. Then we introduce the accompanied survey to find out how performance concerns are currently addressed in industrial DevOps practice. Finally, we conclude this blog post. 

Strategy # 1: Rolling back and forward

The usual response that we get when talking about performance analysis in a continuous delivery pipeline is: "Well, we just roll back if something goes wrong". This is a great plan, in theory. In practice, it often fails in emergency situations. First of all, this strategy requires not only a continuous delivery pipeline but also an automatic rollback mechanism. This is pretty easy at the level of an application server (just install release n-1), but it gets harder with databases (e.g., legacy table views for every change), and almost impossible if multiple applications and service dependencies are involved. Instead of rolling back, rolling forward is applied, which means we deploy as many fixes as it takes until the issue is resolved. Such emergency fixes are often developed in a hurry or in war-room sessions. When your company introduced its continuous delivery pipeline, it was often promised that these war-room sessions would come to an end, just by releasing smaller incremental artifacts. The truth is, in case of emergency Murphy's Law applies, your rollback mechanism fails, and you spend the rest of the day/night resolving the issue.

Strategy #2: Functional tests applied to performance

Another common strategy is to use functional tests and derive metrics from them that act as indicators for performance bugs. Measuring the number of exceptions or SQL statements during a functional test run and comparing these numbers with a former release or baseline is common practice. Tool support, such as PerfSig, which utilizes Dynatrace AM, exists to automate such analyses on the Jenkins build server [4]. This approach acts proactively, so issues can be detected before release, and it requires no additional tests, just some tooling and analysis software in your delivery pipeline. However, the conclusions it supports about the performance of your application are vague. Resource-utilization or response-time measurements conducted during short functional tests usually deliver no meaningful values, especially if the delivery pipeline runs in a virtualized environment. Exceptions and SQL statements act as indicators and may reduce the number of performance issues in production, but they won’t identify a poorly designed algorithm.
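The baseline comparison described above can be sketched as follows. The metric names, counts, and the 10% tolerance are illustrative assumptions, not values or APIs from PerfSig:

```python
# Sketch: compare exception/SQL-statement counts from a functional test
# run against a baseline release; all numbers here are made up.

def regressions(baseline, current, tolerance=0.1):
    """Return metrics whose count grew by more than `tolerance` over baseline."""
    flagged = {}
    for metric, base in baseline.items():
        cur = current.get(metric, 0)
        if base and (cur - base) / base > tolerance:
            flagged[metric] = (base, cur)
    return flagged

baseline = {"exceptions": 12, "sql_statements": 340}
current  = {"exceptions": 30, "sql_statements": 345}
flagged = regressions(baseline, current)
print(flagged)  # → {'exceptions': (12, 30)}: exception count grew by >10%
```

A jump in such counts is only a hint; as noted above, it will catch a query issued in a loop but not a slow algorithm that produces the same counts.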

Strategy #3: Model-based performance analysis

Performance models have their origin in academia and are still only rarely adopted by practitioners. However, such models can help identify performance bugs in your software without adding new tests. Performance-model generators now exist that derive the performance characteristics of an application directly from a build system [5]. These approaches rely on measurements at the operation and component level and require good test coverage: a complete functional test run should execute each operation multiple times so that the generators can derive resource demands per operation. Changes in resource demands indicate a performance change, either for the better (decreased resource demand) or for the worse (increased resource demand). The main advantage over simple functional-test analysis is that a complete set of tests is analyzed and multiple runs of the same test set are supported. However, major changes in the test set may require a new baseline for the model-based analysis.
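Assuming per-operation timing samples collected over repeated test executions, the resource-demand comparison such generators perform might look roughly like this. The operation names, timings, and the 20% change threshold are made up for illustration and are not taken from any specific generator:

```python
# Sketch: derive mean resource demand per operation from repeated test
# executions and flag operations whose demand changed between releases.
from statistics import mean

def resource_demands(samples):
    """Average resource demand (e.g., CPU time in ms) per operation."""
    return {op: mean(times) for op, times in samples.items()}

def demand_changes(old, new, threshold=0.2):
    """Operations whose mean demand changed by more than `threshold`."""
    changes = {}
    for op, d_old in old.items():
        d_new = new.get(op, d_old)
        rel = (d_new - d_old) / d_old
        if abs(rel) > threshold:
            changes[op] = rel
    return changes

release_n  = resource_demands({"login": [4.1, 3.9, 4.0], "search": [10.0, 10.4]})
release_n1 = resource_demands({"login": [4.0, 4.2, 4.1], "search": [14.8, 15.2]})
changes = demand_changes(release_n, release_n1)
print(changes)  # 'search' demand grew by roughly 47%; 'login' is unchanged
```

Because the comparison is per operation, it localizes the regression (here, the hypothetical `search` operation) rather than only reporting that the whole test run got slower.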


To identify and capture the current state of the art in performance practices, as well as present problems and issues, we have launched a survey that we would like to promote and encourage you or your organization to participate in. We would like to find out how performance concerns are currently addressed in industrial DevOps practice and plan to integrate the impressions and results into a blueprint for performance-aware DevOps. Furthermore, we would like to know whether classical paradigms still dominate in your organization, at what stages performance evaluations are conducted, which metrics are relevant to you, and what actions are taken after a performance evaluation.

Our long-term aim is not only to conduct this survey once, but to benchmark the state of the art continuously, compare the results over a longer period, and regularly incorporate the outcomes into our blueprint. The results of this survey will feed into our larger project of building a reference infrastructure for performance-aware DevOps and will help us understand DevOps in industry today.


Classical performance and load-test phases may vanish and never come back. However, current strategies for reducing the risk of performance issues in production have a number of disadvantages: rollback mechanisms might fail, functional tests deliver only indicators, and model-based evaluations lack industrial tool support. Most of the time, performance does not receive enough attention, or even any at all. In our opinion, this is primarily because present performance-management practices are not integrated into and adapted to typical DevOps processes, especially in terms of automation and holistic tool support.



  1. J. Seiden. Amazon Deploys to Production Every 11.6 Seconds.
  2. A. Brunnert, A. van Hoorn, F. Willnecker, A. Danciu, W. Hasselbring, C. Heger, N. Herbst, P. Jamshidi, R. Jung, J. von Kistowski, A. Koziolek. Performance-Oriented DevOps: A Research Agenda. arXiv preprint arXiv:1508.04752. 2015.
  4. T-Systems MMS. PerfSig-jenkins.
  5. M. Dlugi, A. Brunnert, H. Krcmar. Model-Based Performance Evaluations in Continuous Delivery Pipelines. In Proceedings of the 1st International Workshop on Quality-Aware DevOps. 2015.

If you like this article, you might also enjoy reading:

    1. L. Zhu, L. Bass, G. Champlin-Scharff. DevOps and Its Practices. IEEE Software 33(3): 32-34. 2016.
    2. M. Callanan, A. Spillane. DevOps: Making It Easy to Do the Right Thing. IEEE Software 33(3): 53-59. 2016.
    3. D. Spinellis. Being a DevOps Developer. IEEE Software 33(3): 4-5. 2016.