Sunday, May 22, 2016

Has empiricism in Software Engineering amounted to anything?

Associate Editor: Bogdan Vasilescu, University of California, Davis, USA (@b_vasilescu)

A 2014 talk by Jim Larus, currently (after a long tenure at Microsoft Research) Dean of the School of Computer and Communication Sciences at EPFL, was circulated recently over email among some colleagues. In the talk, Larus reflects on success and failure stories of software engineering (SE) tools used at Microsoft over the years: interestingly, he lists the empirical work on data mining and failure prediction, from Microsoft Research, as one of the biggest successes, together with Windows error reporting; in contrast, Larus mentions both PREfast (static analysis) and SDV (formal verification) as examples of tools that didn't matter as much.

Needless to say, this sparked quite the controversy among participants in my email thread, about the value (or lack thereof) of empiricism in SE. I found the discussion thoughtful and useful, and I’d like to share it. This blog post is an attempt to summarize two main arguments, drawing mostly verbatim from the email thread participants. Active participants included, at various times, Alex Wolf, David Rosenblum, Audris Mockus, Jim Herbsleb, Vladimir Filkov, and Prem Devanbu.

Is empiricism just blind correlation?

The main criticism is that empirical SE work does not move past mere correlations to deeper understanding, yet it is understanding that leads to knowledge, not correlation. The analysis of data should enable understanding, explanation, and prediction, not be an end in and of itself. The accessibility of correlation among statistical methods, coupled with the flood of data from open-source development and social programming websites, has created a recipe for endless paper milling. As a result, many empirical SE papers don’t ask critically important research questions that matter to both researchers and practitioners; instead, they present obvious, expected, and unactionable insights that neither researchers can exploit to develop new or improved approaches nor practitioners can use to produce guaranteed improvements to their development efforts.

In response to this criticism, three counterexamples were offered:
C Parnin, A Orso. ISSTA ’11 PDF
A Hindle, ET Barr, Z Su, M Gabel, P Devanbu. ICSE ’12, CACM Research Highlights ’16 PDF
R Just, D Jalali, L Inozemtseva, MD Ernst, R Holmes, G Fraser. FSE ’14 PDF

Arguably, the successes at Microsoft and other places with saving money on quality control using process metrics have led to a sort of Gold Rush, and probably some excesses. However, it is clear that empirical SE has advanced by leaps and bounds since the early correlation days. Admittedly, earlier work was trying to understand the usefulness of the data and statistics, and resulted in some fluff; perhaps, as with a new gadget, people were more interested in the technology than in how useful it is in daily life. However, thanks in part to all that playing around, we now have lots of solid work, and an openness of the community to scientific methods.

Moreover, these days, to pass muster at top conferences, one needs a good theoretical grounding, not only in past work in SE & PL, but (often) also in sociology, psychology, management science, and behavioral science -- creating and maintaining software is inherently human-centric. Theory helps to predict and explain results; frame future research questions; provide continuity across studies and research groups; accumulate results into something resembling knowledge; and point toward important open questions. For example, there is a lot of theory in psychology about multitasking and context switching that really helps to explain the quantitative results relating productivity to patterns of work across projects. Such theory can both drive further theory development and work toward deeper explanations of the phenomena we study. What exactly happens when humans context switch, why is context switching costly, and why do costs vary across conditions?

This is also why correlations have largely fallen out of favour. Careful consideration of the relevant phenomena (in the context of existing work) will lead one to use, e.g., more sophisticated mixed-effects / multivariate models and bootstrapped distributions, and to carefully argue why the chosen experimental methods are well-suited to validate the theory. The use of these models is driven precisely by more sophisticated theories (not statistical theories, but rather SE / behavioral-science ones).
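As a small illustration of the non-parametric techniques mentioned above, here is a minimal sketch of a percentile-bootstrap confidence interval. The data are made up (hypothetical code-review latencies); the point is only the shape of the analysis, not any real result:

```python
import random
import statistics

random.seed(42)

# Hypothetical data: review latency (hours) for 30 pull requests.
latencies = [random.lognormvariate(2.0, 0.8) for _ in range(30)]

def bootstrap_ci(data, stat=statistics.median, n_resamples=2000, alpha=0.05):
    """Percentile bootstrap: resample with replacement, recompute the
    statistic, and read the CI off the empirical distribution."""
    estimates = sorted(
        stat(random.choices(data, k=len(data))) for _ in range(n_resamples)
    )
    lo = estimates[int(alpha / 2 * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

lo, hi = bootstrap_ci(latencies)
print(f"median = {statistics.median(latencies):.1f}h, 95% CI = [{lo:.1f}, {hi:.1f}]")
```

Reporting an interval like this, rather than a bare correlation coefficient, is one concrete way the field's statistical practice has matured.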

Can empiricism be valuable without fully understanding the “why”?

Although the demand for understanding why something works is prima facie not unreasonable, if one insisted on it, then many engineering disciplines would come to a halt: e.g., airfoil design, medicine, natural language processing (no one really understands why n-gram models work better than just about anything else). Knowing that something works or doesn’t can be quite enough to help a great many people. We have many extremely useful things in the world that work by correlation and “hidden” models. For example, theoretically sound approaches such as topic models (e.g., LDA) don’t work well in practice, but highly engineered algorithms (e.g., word2vec) work fantastically. By insisting on why, one would stop using SVMs, deep learning, and many drugs; not fly on airplanes; and not use Siri/Echo, because in many cases the why of these things is not well understood. Even such a trivial thing as GPS involves many engineering adjustments that were not derived from first principles but just work.

However, without explicit understanding of why things work, it’s not clear how far one can go with engineering. For example, many ancient civilizations were able to predict the phases of the moon and positions of the stars without understanding the why of gravity and orbital mechanics, simply based on repeated measurement. But would that have gotten us satellite communication, planetary exploration, and the Higgs boson? Similarly, would you feel comfortable in a plane that lands using a machine-learned autoland system?

Closing remarks

In summary, it seems there is a central place for empirical work in SE; not as an end in itself, but rather as the beginning of an inquiry leading to understanding and useful insights that we can use to truly advance our discipline. See also Peggy Storey's Lies, Damned Lies and Software Analytics talk along these lines.

Sunday, May 15, 2016

Does your Android App Collect More than it Promises to?

by Rocky Slavin, University of Texas at San Antonio, USA (@RockySlavin)
Associate Editor: Sarah Nadi, Technische Universität Darmstadt, Germany (@sarahnadi)

How do we know that the apps on our mobile devices actually access and collect the private information we are told they do? This question is particularly important for mobile devices because their many sensors can produce private information. Typically, end users can read an app’s privacy policy, provided by the publisher, to get details on what private information is being collected by the app. But even so, it is difficult to verify that the app’s code does indeed adhere to the promises made in the policy. This is an important problem not only for end users who care about their right to privacy, but also for developers, who have moral and legal obligations to be honest about their code.
In order to aid developers and end users in answering these questions, we have created an approach that connects the natural language used in privacy policies with the code used to access sensitive information on Android devices [4]. This connection, or mapping, allows for a fully-automated violation detection process that can check for consistency between a compiled Android application and its corresponding natural language privacy policy.

Privacy Policies
If you look for an app on the Google Play store, you’ll commonly find a link to a legal document disclosing the private information that is accessed or collected through the app. Perhaps the biggest hindrance in understanding and analyzing these privacy policies is their lack of a canonical format. Privacy policies exist in all lengths and levels of detail, yet under United States law, they must all provide the end user with enough information to be able to make an informed decision on the app’s access to their private information [3].

Sensors and Code
As mentioned above, mobile devices often provide access to various sensors, including GPS, Bluetooth, cameras, networking devices, and many others. In order for an app’s code to access data from these sensors, it must invoke methods from an application programming interface (API). For the Android operating system, accessing this API is as simple as invoking the appropriate methods, such as android.location.LocationManager.getLastKnownLocation(), directly in the app’s code. It is these invocations that need to align with the app’s privacy policy for consistency to hold.

Bridging the Gap
For our approach, we created associations between the API methods used for accessing private data and the natural language used in privacy policies to describe that data.

First, we used the popular crowd-sourcing tool, Amazon Mechanical Turk, to identify commonly-used phrases in privacy policies that describe information that can be produced by Android’s API. The tasks involved users reading through short excerpts from a set of 50 random privacy policies and annotating the phrases used to describe information that was collected. For example, words like “IP address”, “location”, and “device identifier” were some of the most frequently found phrases. The resulting privacy policy lexicon represented the general language used in privacy policies when referencing sensitive data.

Next, we used a similar approach to identify words descriptive of the data produced from all of the publicly-accessible API methods that are sources [2] of private information. Tasks for this portion consisted of individual methods with their descriptions from the API documentation. Users annotated phrases in the description that described the information being produced by the method. This created a natural language representation of the methods’ data to which we could associate phrases from the privacy policy lexicon. The result was a many-to-many mapping of 154 methods to 76 phrases.

Detecting Violations
The resulting mapping between API methods and the language used in privacy policies made violation detection possible. To do so, we used the information flow analysis tool FlowDroid [1] to detect API invocations that produce sensitive information and then relay it to the network. We considered such invocations probable instances of data collection. If such a method invocation did not have a corresponding phrase in the app’s privacy policy, it was flagged as a potential privacy policy violation.
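The core of this consistency check can be sketched in a few lines of Python. The method names, phrases, and policy text below are hypothetical stand-ins for the crowd-sourced mapping and the taint flows that FlowDroid reports; they are not the actual artifacts from the study:

```python
# Hypothetical many-to-many mapping from sensitive Android API methods
# to the privacy-policy phrases that describe the data they produce.
API_TO_PHRASES = {
    "android.location.LocationManager.getLastKnownLocation": {"location"},
    "android.telephony.TelephonyManager.getDeviceId": {"device identifier"},
    "java.net.NetworkInterface.getHardwareAddress": {"mac address", "device identifier"},
}

def find_violations(flows_to_network, policy_text):
    """Flag API calls whose data reaches the network but is never
    described (via any mapped phrase) in the privacy policy."""
    policy = policy_text.lower()
    return [
        method
        for method in flows_to_network
        if not any(phrase in policy for phrase in API_TO_PHRASES.get(method, ()))
    ]

# A toy policy and a toy set of network-bound flows from the analysis.
policy = "We collect your location to provide map features."
flows = ["android.location.LocationManager.getLastKnownLocation",
         "android.telephony.TelephonyManager.getDeviceId"]
print(find_violations(flows, policy))
# The device-identifier flow is flagged: the policy never mentions it.
```

In the real system the policy side uses the crowd-sourced lexicon rather than raw substring matching, but the decision rule is the same: a network-bound sensitive flow with no matching policy phrase is a potential violation.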

Using the above technique, we were able to discover 341 violations in the top 477 Android applications. We believe this implies the lack of a policy verification system for developers and end users alike.

Implications for Developers
Based on our results, we believe that this information and framework can be used to aid developers in ensuring consistency for their own privacy policies. To this end, we are extending our work with an IDE plugin to aid developers in consistency verification as well as a web-based tool for checking compiled apps against their policies. We believe that such tools could be invaluable especially to smaller development teams that may not have the legal resources available to more established development firms. Ultimately, access to such tools could lead to not only a better development experience, but a better product for the end user.

[1] S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Le Traon, D. Octeau, and P. McDaniel. FlowDroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps. In 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014.
[2] S. Rasthofer, S. Arzt, and E. Bodden. A machine-learning approach for classifying and categorizing Android sources and sinks. In Network and Distributed System Security Symposium, 2014.
[3] J.R. Reidenberg, T. D. Breaux, L. F. Cranor, B. French, A. Grannis, J. T. Graves, F. Liu, A. M. McDonald, T. B. Norton, R. Ramanath, et al. Disagreeable privacy policies: Mismatches between meaning and users’ understanding. Berkeley Tech. LJ 30 (2014): 39.
[4] R. Slavin, X. Wang, M. Hosseini, W. Hester, R. Krishnan, J. Bhatia, T. D. Breaux, and J. Niu. Toward a framework for detecting privacy policy violations in Android application code. In 38th ACM/IEEE International Conference on Software Engineering, 2016, Austin, Texas.

Sunday, May 1, 2016

Why not use open source code examples? A Case Study of Prejudice in a Community of Practice

by Ohad Barzilay, Tel Aviv University
Associate Editor: Christoph Treude (@ctreude)

With so much open source code available online, why do some software developers avoid using it? That was the research question guiding a qualitative grounded-theory study recently published [1]. We analyzed the perceptions of professional software developers as manifested in the LinkedIn online community, and used the theoretical lens of prejudice theory to interpret their answers in a broader context.

We focused on developers’ perception of (re)using code examples - existing code snippets that are used in a new context. Our definition of a code ‘example’ is broad; some of the code examples which appear on the Internet were not written in order to be reused. Code examples may accompany answers on Q&A sites [2], illustrate an idea in an online tutorial, or even be extracted from an open source project [7].

We suggest that developers’ approach to using code examples is dominated by their personality, and affected by concerns such as community identity, ownership, and trust. We find that developers’ perception of such reuse goes beyond activities and practices, and that some developers associate the use of code examples with negative character traits. Some of these developers stereotype habitual example users as inferior and unprofessional.

It should be noted that not only human aspects are associated with example usage – there are other issues involved in this activity, such as engineering aspects (e.g. search techniques and tools) and legal issues (e.g. copyright and licensing). These issues are outside the scope of our discussion; however, we believe that these challenges can be mitigated with proper tools [9], training and organizational support (e.g. leveraging social media cues [8], and teaching developers which code they can use, and under what circumstances).

Code Writers vs. Copy-and-Paste Monkeys

Some software developers perceive themselves as code writers, and feel strongly about it. Their identity and sometimes even self-esteem are derived from perceiving themselves that way. As suggested by Brewer [5], this may result in the creation of ingroup bias, which can in turn be used as a platform for hate of the outgroup – in this domain, example users. For the (virtual) group of code writers, new code is the unit of progress, a sign of productivity (however misleading it may sometimes be). Copying, on the other hand, is perceived as a devalued shortcut – an imitation rather than a creation. In most university courses, students are not allowed to share their work with fellow students, but are expected to write their own code.

Ingroup bias often limits the boundaries of trust and cooperation [6], which may explain why some developers avoid copy and paste at all costs. They do not trust other programmers enough to take responsibility for and ownership of their code. These programmers find it difficult to understand existing code; they feel that they can neither identify fallacies in someone else's code nor test it thoroughly. They prefer to write code themselves and take responsibility for it rather than trust others, and perhaps lose control over their code.

Furthermore, we find that example-usage opponents do not conform to organizational goals, and specifically the need for speed. They do not acknowledge the dexterity and practices required for effective example usage, and they aspire to be held in high regard (as opposed to being "merely plumbers" [4], a term suggesting that the essence of the software engineering job boils down to putting components together and making small, non-glorious fixes). After all, some of them might have chosen programming as their profession because of its status.


In a commercial context, revealing implicit prejudice and disarming it may allow developers to leverage further benefits of code reuse, and may improve the collaboration of individuals, teams, and organizations. Moreover, prejudice may interfere with achieving organizational goals or with conducting organizational change. Some of these concerns may be mitigated by providing a comprehensive ecosystem of tools, practices, training, and organizational support [3]. With the prejudice lens in mind, one may incorporate methods that have proven effective in addressing prejudice in other contexts (racism, sexism, nationalism) as part of the software engineering management toolbox.

Finally, this study may also be considered in the broader context of the changing software engineering landscape. The recent availability of information over the Web, and in our context – availability of source code, is challenging the way software is produced. Some of the main abstractions used in the software domain, namely development and construction, do not adequately describe the emerging practices involving pragmatic and opportunistic reuse. These practices favor composing over constructing and finding over developing. In this context, prejudice can be perceived as a reaction to change and an act resulting from fear of the new and unknown.


[1] O. Barzilay and C. Urquhart. Understanding reuse of software examples: A case study of prejudice in a community of practice. Information and Software Technology 56, pages 1613-1628, 2014.
[2] O. Barzilay, C. Treude, and A. Zagalsky. Facilitating crowd sourced software engineering via Stack Overflow. In S. E. Sim and R. E. Gallardo-Valencia, editors, Finding Source Code on the Web for Remix and Reuse, pages 289–308. Springer New York, 2013.
[3] O. Barzilay. Example embedding. In Proceedings of the 10th SIGPLAN symposium on New ideas, new paradigms, and reflections on programming and software, pages 137-144, 2011, ACM.
[4] O. Barzilay, A. Yehudai, and O. Hazzan. Developers attentiveness to example usage. In Human Aspects of Software Engineering, HAoSE ’10, pages 1–8, New York, NY, USA, 2010. ACM.
[5] M. B. Brewer. The psychology of prejudice: Ingroup love and outgroup hate? Journal of social issues, 55(3):429–444, 1999.
[6] S. L. Jarvenpaa and A. Majchrzak. Knowledge collaboration among professionals protecting national security: Role of transactive memories in ego-centered knowledge networks. ORGANIZATION SCIENCE, 19(2):260–276, 2008.
[7] S. E. Sim and R. E. Gallardo-Valencia, editors. Finding Source Code on the Web for Remix and Reuse. Springer, 2013.
[8] C. Treude and M. P. Robillard. Augmenting API documentation with insights from Stack Overflow. Forthcoming ICSE ’16: 38th Int’l. Conf. on Software Engineering, 2016.
[9] A. Zagalsky, O. Barzilay, and A. Yehudai. Example overflow: Using social media for code recommendation. In Proceedings of the Third International Workshop on Recommendation Systems for Software Engineering, pages 38-42, 2012, IEEE Press.

Monday, April 25, 2016

Common Architecture Weakness Enumeration (CAWE)

By Mehdi Mirakhorli (@MehdiMirakhorli), Associate Editor.

Software architecture design is the first and most fundamental step in addressing quality goals surrounding attributes such as security, privacy, safety, reliability, dependability, and performance. Design flaws in the architecture of a software system mean that successful attacks could have enormous consequences. To satisfy a security concern, an architect must consider alternative design solutions, evaluate their trade-offs, identify the risks, and select the best solution. Such design decisions are often based on well-known architectural patterns, defined as reusable techniques for achieving specific quality concerns.

Security patterns come in many different shapes and sizes and provide solutions for enforcing data integrity, privacy, accountability, availability, safety, and non-repudiation requirements, even when the system is under attack.

Previous estimates indicate that roughly 50% of security problems are the result of software design flaws, such as misunderstanding architecturally significant requirements, poor architectural implementation, violation of design principles in the source code, and degradation of the security architecture. Flaws in the architecture of a software system can have a great impact on various security concerns in the system and, as a result, give more room and flexibility to malicious users.

Fundamentally, design flaws (or simply "flaws") are different from bugs: the latter are code-level problems, while the former lie at a deeper level and are much more subtle than bugs such as buffer overflows. Although a software system will always have bugs, recent studies show that the security of many software applications is breached due to flaws in the architecture.

Architectural flaws are results of inappropriate design choices in early stages of software development, incorrect implementation of security patterns, or degradation of security architecture over time.
An example of such an architectural flaw is the Use of Client-Side Authentication, in which a client/server product performs authentication within client code but not in server code, allowing server-side authentication to be bypassed via a modified client that omits the authentication check.
This design decision about where to implement authentication creates a flaw in the security architecture, which can be successfully exploited by an intruder with reverse-engineering skills.
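A minimal sketch of the correct decision, in Python: whatever checks a client performs for usability, the server must repeat the authentication itself before serving anything protected. The credential store and request shape here are hypothetical, chosen only to make the point runnable:

```python
import hashlib
import hmac

# Hypothetical server-side credential store: username -> salted password hash.
USERS = {"alice": hashlib.sha256(b"salt|s3cret").hexdigest()}

def server_authenticates(request):
    """The server re-checks credentials itself; a client-supplied claim
    such as request["client_says_authenticated"] is never trusted."""
    digest = hashlib.sha256(b"salt|" + request.get("password", "").encode()).hexdigest()
    expected = USERS.get(request.get("user", ""))
    return expected is not None and hmac.compare_digest(digest, expected)

# A modified client that skips its own check still cannot bypass the server.
forged = {"user": "alice", "client_says_authenticated": True}
valid = {"user": "alice", "password": "s3cret"}
print(server_authenticates(forged), server_authenticates(valid))
```

The flawed architecture, by contrast, would return success whenever the client claims to be authenticated, which is exactly what a reverse-engineered client can fake.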

Even though there are many techniques and practices, such as threat modeling, static and dynamic code analysis, and penetration testing, that help in developing a secure software system, there have not been many research papers in the literature that approach security from the architecture perspective. A recent effort is the IEEE Center for Secure Design, launched by the IEEE Computer Society. However, as of today, there are few examples or catalogs of design flaws published that can help architects and developers learn about and avoid such flaws.

Therefore, our research team is working on establishing a catalog of Common Architecture Weakness Enumeration (CAWE), containing architectural weaknesses that may create security breaches within software.

This catalog is built on top of the existing Common Weakness Enumeration (CWE), which documents about 1,000 software weaknesses. These weaknesses, however, are not categorized based on their architectural impact and do not clearly distinguish between architectural weaknesses (security issues rooted in the software architecture) and programming issues. We categorize these weaknesses into architectural and non-architectural and will release the resulting catalog to the public. In addition, in a series of real case studies, we demonstrate instances of architectural weaknesses in four systems. These case studies indicate that the catalog of architectural weaknesses will help architects and designers adopt a proactive approach to architecture-based security.

Designing for Security

To ensure an application is secure, security principles need to be implemented from the ground up. During requirements analysis, malicious behavior is assumed, and requirements engineers identify all the use cases that would interest an attacker. During architecture design, architects carefully analyze these requirements and adopt appropriate security patterns to resist, detect, and recover from attacks.

Weaknesses in a Security Architecture

A software architecture can be flawed for many reasons, resulting in fundamental breaches in the system. Such flaws occur because of bad design decisions (flaws of commission), lack of design decisions (flaws of omission), or incorrect implementation of architectural patterns used to make the system secure (flaws of realization). These types of flaws are discussed in the following:
  • Flaws of Omission. Such design flaws result from ignoring a security requirement or potential threats; they identify decisions that were never made. A common design flaw is to store a password in a file without encryption. Here the architect assumes that attackers would never have access to the file, and therefore that a password stored in plaintext would not amount to a compromise of the system. However, such a design decision can open the system to attacks, because anyone who is granted read access to the file will be able to read all the stored passwords.
  • Flaws of Commission. Such design flaws refer to design decisions that were made and could lead to undesirable consequences. Examples of such flaws are “client-side authentication” or “using a weak encryption algorithm” to achieve better performance while maintaining data confidentiality.
  • Flaws of Realization. The design decision is correct, but its implementation suffers from a coding mistake. For instance, suppose the system was designed to use the Chroot Jail pattern. In this pattern, a controlled environment (“jail”) is created to limit access to system files, so that attackers are prevented from exploiting files and directories outside a specific directory. A common way to implement this pattern in Unix environments is to invoke the chroot() system function, which creates the jail but does not change the current working directory. Consequently, a developer may incorrectly implement the pattern by creating the chroot jail without changing the working directory, which allows relative paths to still point to files outside the jail. Thus, attackers would still be able to access files and directories outside the jail even after chroot() has been invoked.
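The chroot realization flaw can be made concrete in a short Python sketch (Python's `os.chroot` wraps the same system call; the injectable `chroot`/`chdir` parameters are only there so the function can be exercised without root privileges, and are not part of any real API):

```python
import os

def enter_jail(jail_dir, chroot=os.chroot, chdir=os.chdir):
    """Correct realization of the Chroot Jail pattern.

    The realization flaw described above is calling chroot() alone:
    the current working directory then remains outside the jail, so
    relative paths (e.g. "../../etc/passwd") can still escape it.
    """
    chroot(jail_dir)  # restrict the process's view of the filesystem root
    chdir("/")        # move inside the jail so relative paths cannot escape
```

Omitting the `chdir("/")` line is precisely a flaw of realization: the design decision (use a jail) was right, but the implementation leaves the escape hatch open.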

Examples of Weaknesses in Security Architecture

The Secure Session Management pattern is concerned with keeping track of sessions, which are sets of activities performed over a limited time period by a certain user. The main goal of this pattern is to keep track of who is using the system at a given time by managing a session object that contains all relevant data associated with the user and the session. In this pattern, every user is assigned an exclusive identifier (session ID), which is used both for identifying users and for retrieving user-related data. Since session IDs are sensitive information, this pattern may be affected by two main types of attacks: session hijacking (an attacker impersonates a legitimate user by stealing or predicting a valid session ID) and session fixation (an attacker holds a valid session ID and forces the victim to use it).

Session hijacking can be facilitated by the architectural flaw of not securing the storage of session identifiers. Such a flaw can be observed in the “session” module of the PHP language:

Per this description, we note that PHP was designed to store each session's data in plain text files in a temporary directory, without any security mechanism (such as encryption) for storing these session files. When closely inspecting the source code of PHP version 4.0, we observe that the module names every session file "sess_xyz” (where "xyz" is the session ID), as shown in the code snippet presented above (where buf is a variable later used when creating the session files).
Figure 1(a) shows a scenario in which the flaw could be exploited. First, a legitimate user successfully identifies him/herself to the application. This causes the Web application written in PHP to start a session for the user by invoking session_start() from PHP’s session module. The session module then assigns a session ID to the user and creates a new file named “sess_qEr1bqv1q4V2FGX9C7mvb0” to store the data about the user’s session. At this point, the security of the application is compromised if an attacker observes the session file name and realizes that the user’s session ID is “qEr1bqv1q4V2FGX9C7mvb0”. Subsequently, the attacker is able to impersonate the user by sending a cookie (PHPSESSIONID) in an HTTP request with this stolen session ID. The Web application, after calling functions from PHP’s session module, verifies that the session ID provided matches the user’s data, so the application considers the requests to be made by a legitimate user.
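A hedged sketch of how this storage flaw could be avoided: generate session IDs from a cryptographic source, and name the on-disk file after a one-way hash of the ID, so that observing the file name no longer yields a usable ID. The naming scheme below is illustrative, not PHP's actual design:

```python
import hashlib
import secrets

def new_session_id():
    """An unguessable session ID from a cryptographically secure source."""
    return secrets.token_urlsafe(24)

def session_filename(session_id):
    """Store the session under a hash of the ID: an observer who lists
    the session directory cannot recover the ID itself from the name."""
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return f"sess_{digest}"

sid = new_session_id()
print(session_filename(sid))
```

With this scheme, the scenario above breaks down at the crucial step: seeing "sess_<hash>" in the directory no longer tells the attacker which cookie value to forge.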

From this scenario we can observe that such an architectural weakness can lead to many consequences. First, if the user has an administrative role in the application, the attacker will be able to perform all administrative tasks. Second, the attacker may be able to read the contents of the session file, thereby accessing potentially sensitive data about the user that the attacker is not supposed to have. It is important to highlight that such a flaw affects not only Secure Session Management, but also other security patterns (e.g. Authentication and Authorization) that use Secure Session Management for performing authentication and access control of users.

An example from PHP of an architectural weakness that facilitates session fixation is shown below:

When examining the session implementation in the source code of PHP version 5, we note an incorrect implementation (i.e. a realization flaw) in PHP’s session module, which accepts uninitialized session IDs before using them for authentication/authorization purposes. In fact, at line 158 shown above, the function ps_files_valid_key() does not properly validate the session ID: it only checks whether the ID contains a valid charset and has a valid length, but does not verify whether the ID is actually associated with the client performing the HTTP request.
Figure 1(b) shows how this architectural vulnerability is exploited. The attack starts with the attacker establishing a valid session ID (steps 1 to 4). Next, the attacker induces the user to authenticate him/herself in the system using the attacker’s session ID (steps 5 and 6).
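A minimal sketch of the two standard defenses against this flaw: reject session IDs the server never issued (rather than only checking charset and length), and regenerate the ID when the user authenticates, so any ID the attacker planted becomes worthless. The store and function names are hypothetical:

```python
import secrets

ISSUED = {}  # session_id -> session data; server-side record of IDs we created

def start_session():
    sid = secrets.token_urlsafe(24)
    ISSUED[sid] = {"authenticated": False}
    return sid

def is_valid(sid):
    """Unlike a charset/length check, this rejects any ID the server
    never issued, closing the "uninitialized session ID" hole."""
    return sid in ISSUED

def on_login(old_sid):
    """Regenerate the ID at the privilege change: a fixated ID dies here."""
    data = ISSUED.pop(old_sid, {"authenticated": False})
    data["authenticated"] = True
    new_sid = secrets.token_urlsafe(24)
    ISSUED[new_sid] = data
    return new_sid

sid = start_session()
sid2 = on_login(sid)
print(is_valid(sid), is_valid(sid2))
```

Even if the attacker fixes `sid` in the victim's browser, step 6 of the attack fails: after login the server only honors `sid2`, which the attacker never sees.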

Application of CAWE Catalog

Given that the CAWE catalog provides detailed information about architectural weaknesses, it can guide architects and developers in making appropriate design and implementation decisions that preserve security concerns at the architectural level throughout the software development lifecycle.

For example, code reviews are usually focused on finding bugs through technical discussions and analysis of the source code and other related artifacts (such as a portion of the requirements document, the architecture, etc.). However, using the CAWE catalog, the reviewers who are responsible for inspecting the code can check for common security issues in their software. Past experience in industry led to the creation of security-driven software development processes, which emphasize security concerns early in the software development lifecycle, such as CLASP (Comprehensive, Lightweight Application Security Process) and Microsoft’s SDL (Security Development Lifecycle). A common aspect of these processes and practices is the recommendation to provide proper training of employees to promote a common background in software security. In this respect, our catalog could be used to aid such training and promote awareness of the potential architectural issues that a system may be exposed to. Moreover, those security-driven processes include two activities for modeling potential threats in the software: threat modeling and the design of misuse cases. These two activities are usually done through brainstorming sessions, which could be aided by the CAWE catalog for obtaining insights. In fact, practitioners in the security domain support the use of threat libraries, built from MITRE’s catalog, for aiding the threat modeling process. Architectural risk analysis, a systematic approach for evaluating design decisions against quality requirements, could also benefit from our catalog as guidance for the evaluation.

You may also like:

Iván Arce, Kathleen Clark-Fisher, Neil Daswani, Jim DelGrosso, Danny Dhillon, Christoph Kern, Tadayoshi Kohno, Carl Landwehr, Gary McGraw, Brook Schoenfield, Margo Seltzer, Diomidis Spinellis, Izar Tarandach, and Jacob West, Avoiding the Top 10 Software Security Design Flaws, IEEE Cybersecurity, 2015.

Anthony Hall and Roderick Chapman, "Correctness by Construction: Developing a Commercial Secure System," IEEE Software, vol. 19, no. 1, pp. 18-25, Jan./Feb. 2002.

R. C. Linger, "Cleanroom Process Model," IEEE Software, vol. 11, no. 2, pp. 50-58, March 1994.

This post includes joint work with Joanna Santos and Jairo Pavel Veloz Vidal, graduate students at RIT.

Sunday, April 10, 2016

Dissecting The Myth That Open Source Software Is Not Commercial

By: Karl Fogel (@kfogel)
Associate editor: Stefano Zacchiroli (@zacchiro)

Writing a myth-debunking piece for such an informed audience poses a certain risk. The readers of the IEEE Software Blog already know what open source software is, and many have probably written some. How can I be sure that anyone reading this even holds the belief about to be debunked?

Well, I can't be completely sure, but can at least say that this myth is one I still encounter frequently among software specialists, including people who themselves use free software on a daily basis. (By the way, I will use the terms "free" — as in "free software" — and "open source" interchangeably here, because they are synonyms in the sense that they refer to the same set of software and the same set of pro-sharing software licenses.) The continued prevalence of this myth in many organizations is an obstacle to the adoption and production of open source software.
First, to state it clearly:

Myth: Open source is not commercial, or is even anti-commercial, and is driven mostly by volunteerism.

That's really two myths, but they're closely related and it's best to address them together.

In mainstream journalism, open source is still almost always portrayed as a spare-time activity pursued by irrepressible programmers who band together for the love of coding and for the satisfaction they get from releasing free tools to the world. (See the second letter here for one example, but there are many, many more examples like that.) Surprisingly, this portrayal is widespread within the software industry too, and in tech journalism. There is, to be fair, a grain of truth to the legend of the volunteers: especially in the early days of open source — from the mid 1980s until the late 1990s (a period when it wasn't even called "open source" yet, just "free software") — a higher proportion of open source development could legitimately have been called volunteer than is the case today.

But still not as high a proportion as one might think. Much free software activity was institutionally funded even then, although the institutions in question weren't always aware of it. Programmers and sysadmins frequently launched shared collaborative projects simply to make their day jobs easier. Why should each person have to write the same network log analyzer by themselves, when a few people could just write it once, together, and then maintain it as a common codebase? That's cheaper for everyone, and a lot more enjoyable.

In any case, intentional investment in open source by for-profit outfits started quite early on, and such investment has only been growing since (indeed, to the point now where meta-investment is happening: for example, my company, Open Tech Strategies, flourishes commercially by doing exclusively open source development and by advising other organizations on how to run open source projects). For a long time now, a great deal of widely-used open source software has been written by salaried developers who are paid specifically for their work on that software, and usually paid by for-profit companies. There is not space here to discuss all their business models in depth, nor how erstwhile competitors manage to collaborate successfully on matters of shared concern (though note that no one ever seems to wonder how they manage this when it comes to political lobbying). Suffice it to say that there are many commercial organizations in whose interests it is to have this growing body of code be actively maintained, and who have no need to "own" or otherwise exercise monopolistic control over the results.

A key ingredient in this growth has been the fact that all open source licenses are commercial licenses. That is, they allow anyone to use the software for any purpose, including commercial purposes. This has always been part of the very definition of a "free software" or "open source" license, and that's why there is no such thing as software that is "open source for non-commercial use only", or "open source for academic use only", etc.

An important corollary of this is that open source software automatically meets the standard government and industry definition of "Commercial Off-The-Shelf" (COTS) software: software that is commercially available to the general public. COTS doesn't mean you must pay money — though you might choose to purchase a support contract, which is a fee for service and is very different from a licensing fee. COTS essentially just means something that is equally available to all in the marketplace, and open source software certainly fits that bill.

So: open source is inherently commercial, and the people who write it are often paid for their work via normal market dynamics.

Why, then, is there a persistent belief that open source is somehow non-commercial or anti-commercial, and that it's developed mainly by volunteers?

I think this myth is maintained by several mutually reinforcing factors:
  • Open source's roots are as an ideologically-driven movement (under the name "free software"), opposed to monopoly control over the distribution and modification of code. Although that movement has turned out to be successful in technical and business terms as well, it has not shed its philosophical roots. Indeed, I would argue, though will not here due to space limitations, that its philosophical basis is an inextricable part of its technical and business success. (It is worth considering deeply the fact that merely being anti-monopoly is enough to get a movement a reputation for being anti-commercial; perhaps it is the roots of modern capitalism as actually practiced that need closer examination, not the roots of open source.)
  • For a time, various large tech companies whose revenues depend mainly on selling proprietary software on a fee-for-copying basis made it a standard part of their marketing rhetoric to portray open source as being either anti-commercial or at any rate unconcerned with commercial viability. In other words: don't trust this stuff, because there's no one whose earnings depend on making sure your deployment is successful. This tactic has become less popular in recent years, as many of those companies start to have open-source offerings themselves. I hope to see it gradually fade away entirely, but its legacy lives on in the many corporate and government procurement managers who were led to believe that open source is the opposite of commercial.
  • Many companies now offer software-as-a-service based on open source packages with certain proprietary additions — those additions being their "value-add" (or, less formally, their "secret sauce"), the thing that distinguishes their SaaS offering from you just deploying the open source package on which it is based, and the thing that not coincidentally has the potential to lock you in to that provider.
    Unfortunately, companies with such offerings almost always refer to the open source base package as the "community edition", and their proprietized version as the "commercial edition" or sometimes "enterprise edition". A more accurate way to label the two editions would be "open source" and "proprietary", of course. But, from a marketing point of view, that has the disadvantage of making it plain what is going on.
  • Software developers have multiple motivations, and it's true that in open source, some of their motivation is intrinsic and not just driven by salary. It's actually quite common for open source developers to move from company to company, being paid to work on the same project the whole time; their résumé and work product are fully portable, and they take advantage of that. Open source means that one cannot be alienated from the fruits of one's labor, even when one changes employers. There is nothing anti-commercial about this — indeed, it could be viewed as the un-distortion of a market — but one can certainly see how observers with particular concerns about the mobility of labor might be inclined to fudge that distinction.
Finally, I think people also want to believe in a semi-secret worldwide army of happy paladins acting for the good of humanity. It would be so comforting to know they're out there. But what's actually going on with open source is much more complex and more interesting, and is firmly connected to commerce.


Sunday, April 3, 2016

The Descartes Modeling Language for Self-Aware Performance and Resource Management

Samuel Kounev, University of Würzburg, Würzburg, Germany
Associate Editor: Zhen Ming (Jack) Jiang, York University, Toronto, Canada 

Modern software systems have increasingly distributed architectures composed of loosely-coupled services that are typically deployed on virtualized infrastructures. Such system architectures provide increased flexibility by abstracting from the physical infrastructure, which can be leveraged to improve system efficiency. However, these benefits come at the cost of higher system complexity and dynamics. The inherent semantic gap between application-level metrics, on the one hand, and resource allocations at the physical and virtual layers, on the other hand, significantly increases the complexity of managing end-to-end application performance.

To address this challenge, techniques for online performance prediction are needed. Such techniques should make it possible to continuously predict, at runtime: a) changes in the application workloads [3], b) the effect of such changes on system performance, and c) the expected impact of system adaptation actions [1]. Online performance prediction can be leveraged to design systems that proactively adapt to changing operating conditions, thus enabling what we refer to as self-aware performance and resource management [4, 7]. Existing approaches to performance and resource management in the research community are mostly based on coarse-grained performance models that abstract systems and applications at a high level (e.g., [2, 5, 8]). Such models do not explicitly capture the software architecture and execution environment, and therefore cannot distinguish performance-relevant behavior at the virtualization level from behavior at the level of applications hosted inside the running VMs. Thus, their online prediction capabilities are limited: they do not support complex scenarios such as predicting how changes in application workloads propagate through the layers and tiers of the system architecture down to the physical resource layer, or predicting the effect on the response times of different services if a VM in a given application tier is replicated or migrated to another host, possibly of a different type.
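As a toy illustration of the kind of question online performance prediction answers (this is our simplification for the blog, not DML itself), even a textbook M/M/1 queueing model can estimate how a forecast change in arrival rate would propagate to service response time:

```python
# Toy online-prediction sketch: an M/M/1 queue predicts mean response time
# R = 1 / (mu - lambda) for a service that processes mu requests/s
# under an arrival rate of lam requests/s.
def predict_response_time(lam, mu):
    if lam >= mu:
        return float("inf")  # saturated: the queue grows without bound
    return 1.0 / (mu - lam)

mu = 100.0                                    # service capacity, requests/s
current = predict_response_time(50.0, mu)     # today's workload: 0.02 s
forecast = predict_response_time(90.0, mu)    # forecast workload: 0.10 s
```

The point of architecture-level models such as DML is precisely that real systems are not a single queue: predictions must account for how workload changes propagate through multiple tiers, VMs, and physical hosts, which this one-line formula cannot capture.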

To enable online performance prediction in scenarios such as the above, architecture-level modeling techniques are needed, specifically designed for use in online settings. The Descartes Modeling Language (DML) provides such a language for performance and resource management of modern dynamic IT systems and infrastructures. DML is designed to serve as a basis for self-aware systems management during operation, ensuring that system performance requirements are continuously satisfied while infrastructure resources are utilized as efficiently as possible. DML provides appropriate modeling abstractions to describe the resource landscape, the application architecture, the adaptation space, and the adaptation processes of a software system and its IT infrastructure [1, 4, 6]. An overview of the different constituent parts of DML and how they can be leveraged to enable online performance prediction and proactive model-based system adaptation can be found in [6]. A set of related tools and libraries are available from the DML website at:


[1]  F. Brosig, N. Huber, and S. Kounev. Architecture-Level Software Performance Abstractions for Online Performance Prediction. Elsevier Science of Computer Programming Journal (SciCo), Vol. 90, Part B:71–92, 2014.

[2] I. Cunha, J. Almeida, V. Almeida, and M. Santos. Self-Adaptive Capacity Management for Multi-Tier Virtualized Environments. In IFIP/IEEE Int. Symposium on Integrated Network Management, pages 129–138, 2007.

[3] N. Herbst, N. Huber, S. Kounev, and E. Amrehn. Self-Adaptive Workload Classification and Forecasting for Proactive Resource Provisioning. Concurrency and Computation - Practice and Experience, John Wiley and Sons, 26(12):2053–2078, 2014.

[4] N. Huber, A. van Hoorn, A. Koziolek, F. Brosig, and S. Kounev. Modeling Run-Time Adaptation at the System Architecture Level in Dynamic Service-Oriented Environments. Service Oriented Computing and Applications Journal, 8(1):73–89, 2014.

[5] G. Jung, M.A. Hiltunen, K.R. Joshi, R.D. Schlichting, and C. Pu. Mistral: Dynamically Managing Power, Performance, and Adaptation Cost in Cloud Infrastructures. In IEEE Int. Conf. on Distributed Computing Systems, pages 62 –73, 2010.

[6] S. Kounev, N. Huber, F. Brosig, and X. Zhu. Model-Based Approach to Designing Self-Aware IT Systems and Infrastructures. IEEE Computer, 2016. To appear.

[7] S. Kounev, X. Zhu, J. O. Kephart, and M. Kwiatkowska, editors. Model-driven Algorithms and Architectures for Self-Aware Computing Systems. Dagstuhl Reports. Dagstuhl, Germany, January 2015.

[8] Q. Zhang, L. Cherkasova, and E. Smirni. A Regression-Based Analytic Model for Dynamic Resource Provisioning of Multi-Tier Applications. In Proceedings of the 4th International Conference on Autonomic Computing, 2007.

If you like this article, you might also enjoy reading:
  • A. Avritzer, J. P. Ros and E. J. Weyuker, "Reliability testing of rule-based systems," IEEE Software, vol. 13, no. 5, pp. 76-82, Sep 1996.
  • E. Dimitrov, A. Schmietendorf, R. Dumke, "UML-Based Performance Engineering Possibilities and Techniques”, IEEE Software, vol. 19, no. 1, pp. 74-83, Jan-Feb, 2002.
  • J. Happe, H. Koziolek and R. Reussner, "Facilitating Performance Predictions Using Software Components," in IEEE Software, vol. 28, no. 3, pp. 27-33, May-June 2011.

Sunday, March 27, 2016

Should software developers be replaced by the crowd?

by Thomas D. LaToza (@ThomasLaToza) and André van der Hoek (@awvanderhoek)
Associate Editor: Christoph Treude (@ctreude)

We value our professional software developers, hire them in teams (local or remote), and set off on our merry way producing projects. This has been the status quo for a long time. But does it need to be? Could the crowd supplant the need for explicitly hired teams? More strongly yet: should the crowd perhaps replace hired employees?

These are tantalizing questions, both in terms of excitement about possibilities and in terms of 'fear' regarding the significant disruption that may play out in the workplace. Here, we consider several aspects of the issue of crowds versus hired employees.

1. Isn’t this just open source?

According to Howe [1], crowdsourcing is “the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call”. In software, open source is one of the oldest and most successful crowdsourcing models. But not all crowdsourcing is open source. Platforms such as TopCoder employ a competition model, in which developers respond to a client request and clients select and reward a winner. Others, such as trymyUI, offer the crowd short, well-defined tasks that, taken together, compose into a larger whole – for instance, comprehensively testing the usability of an application. Labor markets such as UpWork offer the promise of on-demand, fluid labor forces with the skills necessary for the job at hand, with workers ready to take on specialized work for an hour, a day, or maybe a week.

2. What about the speed of development?

As the old adage goes, many hands make light work. Fundamental to many crowdsourcing models is the decomposition of a large task into many small tasks. In principle, decomposed tasks enable work to be completed dramatically faster, as they enable parallel distribution to and completion of tasks by the crowd. But decomposition itself brings challenges: communicating tasks to developers, coordinating contributions made by the crowd, and ensuring smooth handoffs between tasks. In practice, these competing forces have given rise to a wide diversity of competing systems and platforms, using different granularities of tasks to crowdsource a variety of software development activities.
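As a back-of-the-envelope illustration of these competing forces (our sketch, not from the article), the speedup from decomposition depends on how much coordination overhead each microtask adds:

```python
import math

# Rough model (illustrative assumptions): n_tasks microtasks are completed in
# parallel "waves" by `workers` crowd workers, each task taking task_secs, and
# each task adding coord_secs of serial coordination/handoff overhead.
def completion_time(n_tasks, workers, task_secs, coord_secs):
    waves = math.ceil(n_tasks / workers)   # rounds of fully parallel work
    return waves * task_secs + n_tasks * coord_secs

solo = completion_time(100, 1, 60, 0)      # one developer, no handoffs
crowd = completion_time(100, 25, 60, 10)   # 25 workers, 10 s handoff per task
```

With 25 workers, 100 one-minute tasks finish in roughly 1240 seconds instead of 6000, but a large enough per-task coordination cost would erase the gain entirely, which is why task granularity and handoff design matter so much in practice.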

3. What about quality?

One common concern about opening contributions to anyone is the quality of the work that will result. As in traditional development, there are many approaches to managing quality. Work is almost always reviewed, either by the client, an agent of the client, or the crowd itself. Many platforms explicitly track reputation, allowing clients to identify those who have demonstrated quality work on similar tasks in the past and motivating developers to do high quality work to maintain their reputation. Competition platforms use the crowd to generate competing alternatives, allowing the client to select the highest quality solution. Through these mechanisms, it may, in fact, be possible to achieve higher quality through crowdsourcing, as some of our work has demonstrated [2]. On the other hand, it has been observed that, without proper management, quality may well suffer; care, thus, must be taken to focus on quality from the start [3].

4. What about crowdsourcing environments?

A significant innovation many crowdsourcing approaches bring is their dedicated support for performing crowdsourced work. Crowdsourcing platforms often provide, directly in the environment, support for contributors to browse and identify tasks matching their interests or that have the greatest chance of a reward. Clients may browse reputation information about potential workers, browse submitted contributions, and use the platform to make and manage payments. Systems that offer fine-grained tasks may go further still, offering workers self-contained tasks that can be quickly completed within the environment itself. For example, CrowdCode [4] offers an online environment for completing programming microtasks.

5. What about more complex development tasks?

Another common question is how a complex task such as architecture or design—requiring knowledge of the project as a whole or with dependencies making decomposition difficult—could ever be crowdsourced. Of course, one approach is simply to not decompose such tasks. For example, some TopCoder tasks are architectural in nature, asking competing workers to create design documents. An experienced crowd worker called a “co-pilot” is then responsible for creating, coordinating, and managing the tasks to implement the architecture. However, this risks imposing a waterfall development process, in which an architecture is built up front, independent of future programming work [3]. Alternatively, it may be possible for the crowd itself to be more involved, decomposing architectures into networks of connected decisions [5].

6. Can I help?

Crowdsourcing is already penetrating software development practice today [6][7]. TopCoder has hosted more than 427,000 software design, development, and data science competitions. More than 100,000 testers freelance on uTest. But we believe that this may be just the beginning [8]. Just as open source changed how many organizations do software development and the nature of organizations themselves, crowdsourcing opens the door to new types of contributions and new ways stakeholders may interact that may lead to new models of software development. To help understand where these models may fail and where they may succeed, we have begun to study crowdsourcing approaches through a variety of experiments, from debugging to programming to designing. If you’d like to see how such approaches may work and help discover new ways of developing software, sign up here.


[1] Jeff Howe. (2006). The Rise of Crowdsourcing. Wired, 14(6).
[2] Thomas D. LaToza, Micky Chen, Luxi Jiang, Mengyao Zhao, and André van der Hoek. (2015). Borrowing from the crowd: a study of recombination in software design competitions. International Conference on Software Engineering, 551-562.
[3] Klaas-Jan Stol and Brian Fitzgerald. (2014). Two's company, three's a crowd: a case study of crowdsourcing software development. International Conference on Software Engineering (ICSE), 187-198.
[4] Thomas D. LaToza, W. Ben Towne, Christian M. Adriano, and André van der Hoek. (2014). Microtask programming: building software with a crowd. Symposium on User Interface Software and Technology (UIST), 43-54.
[5] Thomas D. LaToza, Arturo Di Lecce, Fabio Ricci, W. Ben Towne, and André van der Hoek. (2015). Ask the crowd: scaffolding coordination and knowledge sharing in microtask programming. Symposium on Visual Languages and Human-Centric Computing (VL/HCC), 23-27.
[6] Thomas D. LaToza and André van der Hoek. (2016). Crowdsourcing in Software Engineering: Models, Motivations, and Challenges. IEEE Software, 33, 1 (January 2016), 74-80.
[7] Ke Mao, Licia Capra, Mark Harman, and Yue Jia. (2015). A survey of the use of crowdsourcing in software engineering. Technical Report RN/15/01, Department of Computer Science, University College London.
[8] Thomas D. LaToza and André van der Hoek. (2015). A vision of crowd development. International Conference on Software Engineering (ICSE), 563-566.