Wednesday, November 25, 2015

Understanding Runtime Value

- A Cost/Benefit Approach to Performance Analysis 


by David Maplesden, The University of Auckland, Auckland, New Zealand (@dmap_nz)
Associate Editor: Zhen Ming (Jack) Jiang, York University, Toronto, Canada


Many large-scale modern applications suffer from performance problems [1], and engineers go to great lengths searching for optimisation opportunities. Most performance engineering approaches focus on understanding an application's cost (i.e., its use of runtime resources). However, understanding cost alone does not necessarily help find optimisation opportunities. One piece of code may take longer than another simply because it is performing more necessary work. For example, it would be no surprise if a routine that sorted a list of elements took longer than one that merely returned the number of elements in the list. The fact that the costs of the two routines differ does not tell us whether either represents an optimisation opportunity. However, if we had two different routines that produced the same results (e.g., two different sorting algorithms), then determining which is the more efficient solution would become a simple cost comparison.
The key is to understand the value provided by the code. It is then possible to find the superfluous activity that characterises poor performance.
Traditionally it has been left to the engineer to determine the value provided by a piece of code through experience, intuition or guesswork. However, intuitively divining runtime value is difficult in large-scale applications, which have thousands of methods interacting to produce millions of code paths. Establishing the value provided by each method via manual inspection is not practical at that scale and complexity.
To tackle this challenge we are developing an approach to empirically measure runtime value. We can combine this measure with traditional runtime cost information to quantify the efficiency of each method in an application. This allows us to find the most inefficient methods in an application and analyse them for optimisation opportunities.
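Concretely, this suggests treating efficiency as value produced per unit of cost. The snippet below is an illustrative formulation only, not the precise metric defined in [2]:
  // Illustrative only: a method that consumes a lot of runtime cost
  // while making few visible (escaping) field updates scores low and
  // is flagged as a potential optimisation opportunity.
  static double efficiency(long escapingFieldUpdates, long runtimeCost) {
    return runtimeCost == 0 ? 0.0 : (double) escapingFieldUpdates / runtimeCost;
  }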
Our approach to quantifying value is to measure the amount of data created by a method that becomes visible to the rest of the application, i.e., the data that escapes the context of the method. Our rationale is that the value a method provides can only be imparted through the visible results it creates. Intermediate calculations used to create the data but then discarded do not contribute to this final value. Intuitively, two method calls that produce identical results (given the same arguments) are providing the same amount of value, regardless of their internal implementations.
Specifically, we track the number of object field updates that escape their enclosing method. An object field update is any assignment to an object field or array element (e.g., foo.value = 1 or bar[0] = 2). A field update escapes a method if the object it is applied to escapes the method, i.e., if that object is a global (static), a method parameter, or is returned from the method.
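As a minimal sketch of these three escape categories (EscapeExamples and its methods are illustrative names of our own, not code from our analysis; java.awt.Point is used simply because it has public fields):
  import java.awt.Point;

  public class EscapeExamples {
    static int[] cache = new int[16];

    // Global escaping update: cache is a static field, so this
    // array element write is visible to the whole application.
    static void recordGlobally(int v) {
      cache[0] = v;
    }

    // Parameter escaping update: target was supplied by the caller,
    // so the field write escapes this method.
    static void recordInParameter(Point target, int v) {
      target.x = v;
    }

    // Returned escaping update: the Point is created locally, but it
    // is returned, so its field write escapes along with it.
    static Point recordInResult(int v) {
      Point p = new Point();
      p.x = v;
      return p;
    }
  }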
For a fuller example, consider the time-formatting Java code below:
  public static String formatElapsedTime(long timeInMillis) {
    // Split the elapsed time into hour, minute and second components.
    long seconds = (timeInMillis / 1000) % 60;
    long minutes = (timeInMillis / (1000 * 60)) % 60;
    long hours = timeInMillis / (1000 * 60 * 60);

    final StringBuilder sb = new StringBuilder();
    formatTimePart(sb, hours, "hours");
    formatTimePart(sb, minutes, "minutes");
    formatTimePart(sb, seconds, "seconds");
    return sb.toString();
  }

  public static void formatTimePart(StringBuilder sb, long value, String description) {
    // Append "<value> <description>" for any non-zero component,
    // separating it from previously appended parts with a space.
    if (value > 0) {
      if (sb.length() > 0) {
        sb.append(' ');
      }
      sb.append(value);
      sb.append(' ');
      sb.append(description);
    }
  }
The formatTimePart method updates the StringBuilder parameter (via the calls to sb.append()) and so it has parameter-escaping field updates. The formatElapsedTime method has no parameter-escaping updates, but it does return a new String value (constructed via sb.toString()) and so has returned field updates. Note that the StringBuilder object sb does not escape formatElapsedTime, so the updates applied to it are actually captured by the method; only the subsequently constructed String escapes. We have found captured writes such as these to be a strong indicator of inefficient method implementations.
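To see why, compare a hypothetical variant of the same method built with repeated string concatenation (formatElapsedTimeConcat is our illustrative name, not code from any benchmark):
  public static String formatElapsedTimeConcat(long timeInMillis) {
    long seconds = (timeInMillis / 1000) % 60;
    long minutes = (timeInMillis / (1000 * 60)) % 60;
    long hours = timeInMillis / (1000 * 60 * 60);

    // Each concatenation below allocates a hidden StringBuilder and an
    // intermediate String that is immediately discarded; all the field
    // updates spent building those temporaries are captured by this
    // method. Only the final returned String escapes.
    String result = "";
    if (hours > 0) {
      result = result + hours + " hours";
    }
    if (minutes > 0) {
      result = result + (result.isEmpty() ? "" : " ") + minutes + " minutes";
    }
    if (seconds > 0) {
      result = result + (result.isEmpty() ? "" : " ") + seconds + " seconds";
    }
    return result;
  }
Both versions escape exactly the same data (the returned String) and so provide the same value, but the concatenating version performs many more captured writes for that value, marking it as the less efficient implementation.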
We have evaluated our approach [2] using the DaCapo benchmark suite, demonstrating that our analysis can quantify the efficiency of the code in each benchmark and find real optimisation opportunities, yielding improvements of up to 36% in our case studies. For example, we found that over 10% of the runtime activity in the h2 benchmark was incurred by code paths, such as JdbcConnection.checkClosed(), that were checking assertions rather than contributing directly to the benchmark result. Many of these checks were repeated unnecessarily, and we were able to refactor and remove them.
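The flavour of that refactoring, as a hypothetical sketch (ConnectionLike and the surrounding methods are illustrative stand-ins, not the actual h2 source):
  import java.util.List;

  class ConnectionLike {
    private boolean closed;

    void checkClosed() {
      if (closed) {
        throw new IllegalStateException("connection closed");
      }
    }
  }

  class HotPath {
    // Before: every iteration re-validates the connection, even though
    // the answer cannot change within this loop.
    static long sumBefore(ConnectionLike conn, List<Integer> rows) {
      long total = 0;
      for (int row : rows) {
        conn.checkClosed();
        total += row;
      }
      return total;
    }

    // After: the assertion check is hoisted so it runs once per batch.
    static long sumAfter(ConnectionLike conn, List<Integer> rows) {
      conn.checkClosed();
      long total = 0;
      for (int row : rows) {
        total += row;
      }
      return total;
    }
  }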
Our proposed approach allows the discovery of new optimisation opportunities that are not readily apparent from the original profile data. The results of our experiments and the performance improvements we made in our case studies demonstrate that efficiency analysis is an effective technique that can be used to complement existing performance engineering approaches.

References

  1. G. Xu, N. Mitchell, M. Arnold, A. Rountev, and G. Sevitsky. Software Bloat Analysis: Finding, Removing, and Preventing Performance Problems in Modern Large-Scale Object-Oriented Applications. Proceedings of the FSE/SDP Workshop on the Future of Software Engineering Research, pages 421-425, 2010.
  2. D. Maplesden, E. Tempero, J. Hosking, and J. Grundy. A Cost/Benefit Approach to Performance Analysis. Proceedings of the 7th ACM/SPEC International Conference on Performance Engineering (ICPE), 2016, to appear.

