Sunday, November 27, 2016

Performance and the Pipeline

- How can performance analysis keep up with ever faster and more frequent release cycles in the DevOps world?

by Felix Willnecker (@Floix), fortiss GmbH, Germany, Johannes Kroß, fortiss GmbH, Germany, and André van Hoorn (@andrevanhoorn), University of Stuttgart, Germany
Associate Editor: Zhen Ming (Jack) Jiang, York University, Canada

Back in the “good old days”, a release occurred every month, quarter, or year—leaving enough time for a thorough quality analysis and extensive performance/load tests. However, these times are coming to an end or are almost over. Deploying every day, minute, or every couple of seconds becomes the new normal [1]. Agile development, test automation, consequent automation in the delivery pipeline and the DevOps movement drive this trend that conquers the IT world [2]. In this world, performance analysis is left behind. Tasks like load tests take too long and have a lot of requirements on the test and delivery environment. Therefore, performance analysis tasks are nowadays skipped and performance bugs are only detected and fixed in production. However, this is not a willful decision but an act from necessity [3]. The rest of this blog post is organized as follows: First, we outlines the three strategies on including performance analysis in your automatic delivery pipeline without slowing down your release cycles. Then we introduce the accompanied survey to find out how performance concerns are currently addressed in industrial DevOps practice. Finally, we conclude this blog post. 

Strategy # 1: Rolling back and forward

The usual response that we get when talking about performance analysis in a continuous delivery pipeline is: “Well, we just roll back if something goes wrong”. This is a great plan: in theory. In practice, this often fails in emergency situations. First of all, this strategy requires not only a continuous delivery pipeline but also an automatic rollback mechanism. This is pretty easy on the level of an application server (just install release n-1), but is getting harder with databases (e.g., legacy table views for every change), and almost impossible if multiple applications and service dependencies are involved. Instead of rolling back, rolling forward is applied. Which means, we deploy as many fixes, until the issue is resolved. Such emergency fixes are often developed in a hurry or in war room sessions. When your company introduced continuous delivery pipeline they often promised that these war room sessions come to an end, just by releasing smaller incremental artifacts. Truth is, in case of emergency Murphy’s Law applies, your rollback mechanism fails and you spend the rest of the day/night resolving this issue. 

Strategy # 2: Functional tests applied on performance

Another common strategy is using functional tests and derive some metrics that act as indicator for performance bugs. Measuring the number of exceptions or SQL statements during a functional test and comparing these numbers with a former release or baseline are common practice. Some tool support like the PerfSig utilizing Dynatrace AM exist to automate such analysis using the Jenkins build server [4]. This approach acts pro-actively, so issues can be detected before release and requires no additional tests, just some tooling and analysis software in your delivery pipeline. However, the impact on the performance of your application are vague. Resource utilization or response time measurements conducted during short functional tests usually delivery no meaningful values, especially if the delivery pipeline runs in a virtualized environment. Exceptions and SQL statements act as an indicator and may reduce the number of performance issues in production but won’t identify a poorly developed algorithm.

Strategy # 3: Model-based performance analysis 

Performance models have their origin in academia and today are only rarely adopted by practitioners. However, such models can help to identify performance bugs in your software, without adding new tests. Nowadays, performance model generators exist that derive the performance characteristics of an application directly from a build system [5]. These approaches rely on measurements on operation and component level and require a good test coverage. A complete functional test run should execute each operation multiple times so that these generators can derive resource demands per operation. Changes in the resource demands indicate a performance change either for good (decreased resource demand) or for worse (increased resource demand). The main advantage compared to simple functional test analysis, is that a complete set of tests is analyzed and multiple runs of the same test set are supported. However, major changes in the test set, may require a new baseline for a model-based analysis.


To identify and capture the current state-of-the-art of performance practices, but also present problems and issues, we have launched a survey that we would like to promote and encourage you or your organization to participate in. We would like to find out how performance concerns are currently addressed in industrial DevOps practice and plan to integrate the impressions and results in a blueprint for performance-aware DevOps. Furthermore, we would like to know whether classical paradigms still dominate in your organization, at what stages performance evaluations are conducted, what metrics are relevant for you, and what actions are applied after a performance evaluation.

Our long-term aim is to not only conduct this survey once, but to benchmark the state-of-the-art continuously, compare the results over a longer period, and to regularly incorporate outcomes to our blueprint. The results of this survey will be incorporated into our bigger project of building the reference infrastructure for performance-aware DevOps and helps to understand DevOps in industry today. 


Classical performance and load test phases may vanish and never come back. However, current strategies on reducing the risks of performance issues in production have a number of disadvantages. Rollback mechanisms might fail, functional tests only deliver indicators, and model-based evaluations lack of industrial tool support. Most of the time, performance does not receive enough or even any attention. In our opinion, this is primarily due to the fact that present performance management practices are not integrated and adapted to typical DevOps processes, especially in terms of automation and holistic tool support.



  1. J. Seiden, Amazon Deploys to Production Every 11.6 Seconds,
  2. A. Brunnert, A. van Hoorn, F. Willnecker, A. Danciu, W. Hasselbring, C. Heger, N. Herbst, P. Jamshidi, R. Jung, J. von Kistowski, A. Koziolek. Performance-oriented devops: A research agenda. arXiv preprint arXiv:1508.04752. 2015.
  3. T-Systems MMS, PerfSig-jenkins.,
  4. M. Dlugi, A. Brunnert, H. Krcmar. Model-based performance evaluations in continuous delivery pipelines. In Proceedings of the 1st International Workshop on Quality-Aware DevOps. 2015.

If you like this article, you might also enjoy reading:

    1. L. Zhu, L. Bass, G. Champlin-Scharff. DevOps and Its Practices. IEEE Software 33(3): 32-34. 2016.
    2. M. Callanan, A. Spillane. DevOps: Making It Easy to Do the Right Thing. IEEE Software 33(3): 53-59. 2016.
    3. Diomidis Spinellis. Being a DevOps Developer. IEEE Software 33(3): 4-5. 2016.

    No comments:

    Post a Comment