Tuesday, March 5, 2019

What should practitioners know about user involvement in software development?

By: Didar Zowghi and Muneera Bano (@DidarZowghi, @DrMuneeraBano)
Associate Editor: Muneera Bano (@DrMuneeraBano)


In the neo-humanist approach, software is designed to support and improve the working environment of its intended users. In practice, this means involving them in the development process rather than treating them as mere consumers of the software.

But is it really important to involve users in software development? To what extent, and at what stages of software development, should they participate? How much power should users be given in development decision making? These questions have been asked by researchers and practitioners for decades and, together with many related subjects, they have been the topic of a great deal of research.

Users' involvement (UI) during software development has long been claimed to be linked to users' satisfaction with the resulting system, which in turn leads to the assertion of system success. However, much of the empirical evidence to date shows that this connection between UI and system success is not ubiquitous. Although much of the research in this area has revealed that involving users in software development contributes positively to system success, it has also been observed that UI is a double-edged sword and can just as easily create problems as benefits.

In theory, "users' involvement in software development and system success" is a complex combination of four different concepts that have been studied and analyzed both separately and in combination. Over five decades of this investigation, an enormous amount of debate and disagreement has been published in both the software engineering and information systems research literature.

Like others, we have been curious about the intense interest in, and conflicting results on, the topic of users' involvement in software development. Over the last six years, we have conducted empirical longitudinal studies to explore: What has been reported in the research literature? What are the problems and challenges of UI? How does users' satisfaction with their involvement evolve? And what useful theories could be developed about the link between UI and system success? Furthermore, we have been wondering what impact Agile software development has had on addressing the problems and challenges of UI, so we also explored the alignment of stakeholder expectations about UI in Agile development. Finally, being convinced that UI is clearly a multi-faceted and communication-rich phenomenon, we conducted a case study of organizational power and politics in the context of UI in software development.

The overall findings from our longitudinal study have resulted in several key observations and the development of a theory. First, we observed that system success is achievable even when there are problems and challenges in involving users. User satisfaction contributes significantly to system success even when schedule and budget goals are not met. Second, there are additional factors that contribute to the evolution of user satisfaction throughout the project. Users' satisfaction with their involvement and with the resulting system are mutually constituted, and the level of user satisfaction evolves throughout the stages of the development process. Third, organizational and project politics are significant factors often used to exert power and influence in decision-making processes. Communication channels are frequently exploited for political purposes. These contribute to users' dissatisfaction with their involvement, thus impacting the project outcome. Fourth, varying degrees of expectation misalignment exist between the development team and users in Agile software development projects.

Our findings enabled us to provide useful hints for practitioners. Understanding the nature of the problems related to UI helps project managers develop appropriate strategies for increasing the effectiveness of user involvement. These management strategies, together with an appropriate level and extent of user representation, are essential to maintaining an acceptable level of user satisfaction throughout the software development process. When there are multiple teams of stakeholders with different levels of power in decision-making, politics is inevitable. Without careful attention, the political aspect of user involvement in software development can contribute to an unsuccessful project.


Tuesday, February 26, 2019

How Python Tutor Uses Debugger Hooks to Help Novices Learn Programming

By: Philip Guo (@pgbovine)
Associate Editor: Karim Ali (@karimhamdanali)


Since 2010 I've been working on Python Tutor, an educational tool that helps novices overcome a fundamental barrier to learning programming: understanding what happens as the computer runs each line of code. Python Tutor allows anyone to write code in their web browser, see it visualized step by step, and get live real-time help from volunteers. Despite its now-outdated name, this tool actually supports seven languages: Python, JavaScript, TypeScript, Ruby, Java, C, and C++. So far, over 3.5 million people in over 180 countries have used it to visualize over 50 million pieces of code. You can find research papers related to this project on my publications webpage. But in this blog post, I want to dive into some implementation details that I haven't gotten to highlight in my papers.


Let's start with a simple Python example (run it live here):
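The embedded code does not reproduce in this text, but a minimal snippet matching the description below might look like the following (only the global variable x appears in the original; the other names are my guesses):

```python
# Hypothetical reconstruction of the visualized example; only the
# variable name x is given in the original text, the rest are guesses.
inner = [7, 8]          # the "bottom" list
x = [1, inner, 3]       # ordered collection (list); x[1] points to inner
d = {"a": 1, "b": 2}    # key-value mapping (dict)
s = {4, 5, 6}           # unordered set

# Aliasing: x[1] and inner name the same heap object, which the
# visualization draws as an arrow rather than a copied value.
print(x[1] is inner)    # → True
```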


This code creates instances of three basic data structures: an ordered collection (called a list in Python), a key-value mapping (called a dict or dictionary in Python), and an unordered set. Note how elements within these data structures can point to other data structures; for instance, the second element of the top list (accessible via the global variable x) points to the bottom list. Using Python Tutor, novices can easily see pointer and aliasing relationships by following the arrows in these diagrams. Without this tool, they would need to print out serialized string values to the terminal, which obscures these critical details.

How is Python Tutor implemented? By hooking into Python's built-in debugger protocol (the bdb module in its standard library). The tool runs the user's code, single-steps through execution one line at a time, and traverses the object graph starting from global and stack-local variables. It records a full trace of the stack and heap state at every execution step and then sends the trace to the web frontend to render as interactive diagrams.
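A stripped-down sketch of that hook (my own illustration, not Python Tutor's actual code) can be built on bdb.Bdb: subclass it, override the user_line callback, and snapshot the visible variables before each line event:

```python
import bdb

class TraceRecorder(bdb.Bdb):
    """Record (line number, visible variables) before each executed line."""

    def __init__(self):
        super().__init__()
        self.steps = []

    def user_line(self, frame):
        # Only trace the user's code, which Bdb.run compiles as "<string>",
        # skipping any library frames that happen to be stepped through.
        if frame.f_code.co_filename == "<string>":
            snapshot = {name: value for name, value in frame.f_locals.items()
                        if not name.startswith("__")}
            self.steps.append((frame.f_lineno, snapshot))

recorder = TraceRecorder()
recorder.run("x = [1, 2]\ny = x\nz = len(x)\n", globals={})

for lineno, variables in recorder.steps:
    print(lineno, variables)
```

A real visualizer would deep-copy each snapshot (values mutate as execution continues) and serialize the heap with object identities preserved, but the control flow is essentially this.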

The main limitation of this "trace-everything" approach is scalability: it's clearly not suitable for code which runs for millions of steps or creates millions of objects. But code written by instructors and students in educational settings is usually small -- running for dozens of steps and creating around a dozen data structures -- so this simple approach works well in practice.


Now here's the same code example ported to JavaScript (run it live here):


This heap object diagram looks exactly the same as the Python one, albeit with different labels: in JavaScript, an ordered collection is called an array, and a key-value mapping is called an object (note that there's also a Map type). The JavaScript implementation works in the same way as the Python one: by hooking into the debugger protocol of the Node.js JavaScript runtime.

Here's what this example looks like in Ruby, once again implemented by hooking into the interpreter's built-in debugger protocol (run it live here):


These three identical-looking examples show how the diagrams generated by Python Tutor are designed to be fairly language-independent. Novice programmers need to learn about concepts such as stack frames, scope, control flow, primitive data types, collections, and pointers. To facilitate this learning, Python Tutor implements a graphical abstraction layer that takes the details of each language's low-level trace data and turns them into higher-level diagrams that capture the essence of the associated programming concepts. This abstraction makes it straightforward to expand the tool to work on additional languages as demand arises. It also makes it possible to scaffold learning of one language when someone already knows another one, such as teaching Python programmers how to quickly get up to speed on JavaScript.
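To make the idea concrete, here is a hypothetical, simplified shape for one execution-step record (this is illustrative only, not the tool's actual schema): variables reference heap objects by id, and heap objects carry a generic kind label rather than a language-specific type name, so pointers and aliasing survive translation from any source language.

```python
# Hypothetical, simplified step record (not Python Tutor's real schema).
step = {
    "line": 2,
    "event": "step_line",
    "frames": [
        {"name": "<module>", "variables": {"x": {"ref": 1}}},
    ],
    "heap": {
        1: {"kind": "ordered-collection",   # list / array / Array
            "elements": [7, {"ref": 2}]},   # second element points elsewhere
        2: {"kind": "ordered-collection",
            "elements": [8, 9]},
    },
}

# A renderer would draw an arrow for every {"ref": ...} it encounters.
refs = [e["ref"] for e in step["heap"][1]["elements"] if isinstance(e, dict)]
print(refs)  # → [2]
```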

This tool visualizes Java code in a similar way, but I'll skip that illustration to save space. Let's now turn to a more challenging pair of languages: C and C++. Unlike code in the above languages, C and C++ programs are not necessarily type- or memory-safe. This means that hooking into a debugger such as gdb isn't enough, since it's not clear which chunks of memory correspond to valid data. Here's a C example to show what I mean (run it live here):


This code allocates a 6-element integer array on the stack (accessible via localArray) and a 10-element integer array allocated on the heap via malloc (accessible via b). It then populates the elements of both arrays at indices 1, 3, and 5. The resulting visualization shows those initialized values and ? symbols next to the remaining uninitialized values. In addition, it knows that the heap array has exactly 10 elements and does not try to read unallocated elements beyond that bound, which risks crashing the program. Readers familiar with C and C++ will recognize that such memory allocation and initialization data is not available to debuggers such as gdb. Python Tutor hooks into Valgrind Memcheck to get this vital data. Without something like Memcheck, it would be impossible to build a safe and accurate visualizer for C and C++.

Finally, let's end with a C++ example (run it live here):


This visualization shows object-oriented programming concepts such as calling an instance method Date::set(), its this pointer referring to a Date object on the stack (accessible via stackDay), and another Date object on the heap (allocated with the new keyword and accessible via heapDay). Just like it does for C programs, Valgrind Memcheck ensures that Python Tutor reads only memory that has been both allocated (here recognizing that there is only one Date object on the heap) and initialized (so that it doesn't show stale junk values).


That was my quick tour of how Python Tutor works for a variety of languages that are frequently used to teach programming. The underlying principle that drives my implementation decisions is that authenticity is key: experts can work around bugs or other quirks in debugging tools, but novices will get confused and demoralized if a tool renders inaccurate diagrams. Thus, this tool needs to be able to take whatever code users put into it and do something reasonable, or at least fail in a transparent way (e.g., stopping after 1,000 execution steps and suggesting that the user shorten their code). I've tried out alternative language interpreters and experimental runtimes to get more detailed tracing (e.g., finer-grained stepping into subexpressions rather than line-level stepping), but time and time again I've gone back to built-in debugger hooks and widely-deployed tools such as Valgrind, since they are far more robust than experimental alternatives. Try it out today at http://pythontutor.com/ and let me know what you think!

Tuesday, February 19, 2019

Software Improvement by Data Improvement

By: William B. Langdon, CREST, Department of Computer Science, University College London

Associate Editor: Federica Sarro, University College London (@f_sarro)


In previous blogs [1, 2] we summarised genetic improvement [3] and described the result of applying it to BarraCUDA [4]. Rather than giving a comprehensive update (see [5]), I describe a new twist: evolving software via the constants buried within it to give better results. The next section describes updating 50,000 free energy parameters used by dynamic programming to find the lowest energy state of RNA molecules, and hence predict their secondary structure (i.e. how they fold), by fitting data in silico to known true structures. The last section describes converting a GNU C library square root function into a cube root function by data changes alone.

Better RNA structure prediction via data changes only

RNAfold is approximately 7,000 lines of code within the open source ViennaRNA package. Almost all the constants within the C source code are provided via 21 multi-dimensional (1–6 dimensions) int arrays [6, Tab. 2]. We used a population of 2,000 variable-length lists of operators to mutate these integers. The problem-dependent operators can invert values, replace them, or update them with nearby values. They can be applied to individual values or, using wild cards (*), to sub-slices or even whole arrays. From these, a population of mutated RNAfold programs is created. Each member of the population is tested on 681 small RNA molecules, and the mutant's predictions are compared with their known structures [6, Tab. 1]. At the end of each generation the members of the population are sorted by their average fitness on the 681 training examples, and the top 1,000 are selected to be parents of the next generation. Half the children are created by mutating one parent and the other 1,000 by randomly combining two parents. After one hundred generations, the best mutant in the last generation is tidied (i.e. ineffective bloated parts of it are discarded) and used to give a new set of 50,000 integer parameters (29% of them are changed).
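The generational loop just described can be sketched in a few lines of Python. This is a toy stand-in (the target vector, fitness, and mutation here are invented); the real system mutates RNAfold's 50,000 energy parameters and scores structure predictions:

```python
import random

random.seed(1)

TARGET = [3, 1, 4, 1, 5]   # toy stand-in for the "true" parameter values
POP, GENS = 20, 100        # the blog post used 2,000 and 100

def fitness(params):
    # Toy objective: negative distance to TARGET (higher is better).
    return -sum(abs(p - t) for p, t in zip(params, TARGET))

def mutate(params):
    # Nudge one value, like the "nearby value" operator in the post.
    child = list(params)
    i = random.randrange(len(child))
    child[i] += random.choice((-1, 1))
    return child

def crossover(a, b):
    # Randomly combine two parents, gene by gene.
    return [random.choice(pair) for pair in zip(a, b)]

population = [[random.randint(0, 9) for _ in range(5)] for _ in range(POP)]

for _ in range(GENS):
    population.sort(key=fitness, reverse=True)
    parents = population[: POP // 2]              # truncation selection
    children = [mutate(random.choice(parents)) for _ in range(POP // 2)]
    children += [crossover(random.choice(parents), random.choice(parents))
                 for _ in range(POP // 2)]
    population = children

best = max(population, key=fitness)
print(best, fitness(best))
```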
On average, on both big and small molecules of known structure (not used in training), the new version of RNAfold does better than the original. (In many cases it gives the same prediction, in some it is worse but in more it is better.)
Figure 1 shows RNAfold’s original prediction of the secondary structure of an example RNA molecule and then the new prediction using the updated free energy parameters.

Figure 1: Secondary structure (i.e. folding patterns) for RNA molecule PDB 01001. 1) The prediction made by the original RNAfold does not match the true structure well; for example, the highlighted hairpin loop (red) is not in the true structure. 2) The prediction made with the GI parameters is almost identical to the true structure. 3) The true structure. (Figure 2 shows the three-dimensional structure of two PDB 01001 RNA molecules in a larger complex.)

A new Cube Root Function

The GNU C library contains more than a million constants. Most of these are related to internationalisation and non-ASCII character sets [8]. However, one implementation of the double precision square root function uses a table of 512 pairs of real numbers. (Most implementations of sqrt(x) simply call low-level machine-specific routines.) The table-driven implementation is written in C and essentially uses three iterations of Newton-Raphson's method. To guarantee convergence on the correct sqrt(x) to double precision accuracy, Newton-Raphson is given a very good start point for both the target value x^(1/2) and the derivative 0.5x^(−1/2), and these are held as pairs in the table.
Unlike the much larger RNAfold (previous section), with cbrt(x) some code changes were made by hand. These were to deal with: x being negative, normalising x to lie in the range 1.0 to 2.0, reversing the normalisation so that the answer has the right exponent, and replacing the Newton-Raphson constant 1/2 by 1/3 [8, Sec. 2.1]. Given a suitable objective function (how close cbrt(x)×cbrt(x)×cbrt(x) is to x), and starting with each of the pairs of real numbers for sqrt(x), CMA-ES [9] could evolve all 512 pairs of values for the cube root function in less than five minutes.
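The underlying arithmetic is easy to show. For sqrt, Newton-Raphson iterates y ← (y + x/y)/2; swapping the constant 1/2 for 1/3 (and squaring the divisor) gives the cube-root update y ← (2y + x/y²)/3. A plain-Python sketch, using a crude start point and a plain convergence loop instead of the library's 512-entry table and fixed three iterations:

```python
def cbrt(x, tol=1e-15):
    """Cube root via Newton-Raphson: y <- (2*y + x/y**2) / 3."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0.0 else 1.0
    a = abs(x)
    y = a if a < 1.0 else a ** 0.5   # crude start; glibc uses a lookup table
    for _ in range(200):
        y_next = (2.0 * y + a / (y * y)) / 3.0
        if abs(y_next - y) <= tol * y_next:   # converged to double precision
            return sign * y_next
        y = y_next
    return sign * y

print(cbrt(27.0))   # ≈ 3.0
print(cbrt(-8.0))   # ≈ -2.0
```

The good start point is what lets the real implementation get away with exactly three iterations; with the crude guess above, the loop simply runs until successive iterates agree.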
The GNU C library contains many math functions which follow similar implementations. For fun, we used the same template to generate the log2(x) function [10].

Figure 2: Three dimensional structure of two PDB 01001 RNA molecules (blue, orange) in a Yeast protein complex (green, yellow) [7, Fig 2. A].

References

  1. W. B. Langdon and Justyna Petke. Genetic improvement. IEEE Software Blog, February 3 2016.
  2. W. B. Langdon and Bob Davidson. BarraCUDA in the cloud. IEEE Software Blog, 8 January 2017.
  3. W. B. Langdon and Mark Harman. Optimising existing software with genetic programming. IEEE Transactions on Evolutionary Computation, 19(1):118–135, 2015.
  4. W. B. Langdon and Brian Yee Hong Lam. Genetically improved BarraCUDA. BioData Mining, 20(28), 2 August 2017.
  5. Justyna Petke et al. Genetic improvement of software: a comprehensive survey. IEEE Transactions on Evolutionary Computation, 22(3):415–432, 2018.
  6. W. B. Langdon, Justyna Petke, and Ronny Lorenz. Evolving better RNAfold structure prediction. In Mauro Castelli et al., editors, EuroGP 2018, pages 220–236, Parma, Italy, 2018. Springer Verlag.
  7. Masaru Tsunoda et al. Structural basis for recognition of cognate tRNA by tyrosyl-tRNA synthetase from three kingdoms. Nucleic Acids Research, 35(13):4289–4300, 2007.
  8. W. B. Langdon and Justyna Petke. Evolving better software parameters. In Thelma Elita Colanzi and Phil McMinn, editors, SSBSE 2018, pages 363–369, Montpellier, France, 2018. Springer.
  9. Nikolaus Hansen and Andreas Ostermeier. Completely derandomized self-adaptation in evolution strategies. Evolutionary Computation, 9(2):159–195, 2001.
  10. W. B. Langdon. Evolving square root into binary logarithm. Technical Report RN/18/05, University College London, UK, 2018.

You might also enjoy reading

  • James Temperton. Code ’transplant’ could revolutionise programming. Wired.co.uk, 30 July 2015. Online.
  • John R. Woodward, Justyna Petke, and William Langdon. How computers are learning to make human software work more efficiently. The Conversation, 10.08am BST, June 25 2015.
  • Justyna Petke. Revolutionising the process of software development. DAASE project blog, August 7 2015.

Monday, February 11, 2019

Learning from Failure: Modeling User Goals in the App Market



By: Grant Williams (@g_will74), Nishant Jha (@nishant_vodoo), and Anas Mahmoud (@nash_mahmoud), Louisiana State University

Associate Editor: Federica Sarro, University College London (@f_sarro)



App success has been broadly studied in the literature. This line of research is mainly driven by the significant business interest in the performance of mobile apps. In general, app success can be quantified based on market performance; apps that get more downloads, appear frequently on the top charts, have steady user acquisition and retention rates, or generate more revenue are more successful. Failure, on the other hand, is a less well-understood phenomenon. Naturally, success is more appealing as an object of analysis than failure. However, market research has shown that failure should be studied in its own right rather than folded into total prior experience. According to Madsen and Desai [4], organizations learn more effectively from failures than from successes. The ability to analyze individual cases of failure represents a unique opportunity for gaining knowledge about what went wrong, the key learning points, and how to prevent such failures in the future.

To shed more light on app failure, in this article we describe a case study dedicated to analyzing the main reasons that led to the failure of Yik Yak, one of the most popular social networking apps of 2015. Our objective is to model users' main goals in the domain of Yik Yak along with their relations to the core features of the domain. Models can help translate tacit domain knowledge into an explicit form. Once explicit domain knowledge is created, it can be preserved, communicated, and passed on to others.



A Case Study on App Failure

Fig. 1: The number of monthly downloads after anonymity was removed and then restored.

Launched in 2013, Yik Yak was a location-based social networking app that allowed users to post and vote on short messages (known as yaks). Yik Yak distinguished itself from its competitors by two main features: anonymous communication and geographical locality; users could anonymously post and engage in discussions within a 1.5 mile radius around their location. The combination of anonymity and locality proved successful, and by the end of 2014, Yik Yak had become one of the most popular social networking platforms on college campuses, with a market valuation close to $400 million. 
Despite its popularity among users, the anonymity provided by Yik Yak had a downside. Due to the lack of personal identifiability, Yik Yak posts were an easy vector for cyber-bullies to anonymously harass other users at their schools or universities. This problem became significant when the app was used to make threats credible enough for institutions to request police assistance. To curb cyberbullying, Yik Yak rolled out mandatory handles, forcing users to choose a screen name. The community's response to this feature update was overwhelmingly negative. Consequently, the number of downloads and active users dropped sharply. Yik Yak's attempt to reverse course by reintroducing anonymous posting never recaptured the app's previous popularity, eventually forcing the company to shut down the app and suspend its operations in May 2017.


The story of Yik Yak represents a unique opportunity for analyzing one of the most recent cases of major failure in the app market. To get a systematic, in-depth look at this case, we use feature-softgoal interdependency graphs (F-SIGs) [2]. F-SIGs enable comprehensive qualitative reasoning about the complex interplay between the functional features of an application domain and its users' goals. To generate our model, we initially identified the main rival apps of Yik Yak in the domain of anonymous social networking apps. This list included Kik, Jodel, Firechat, Whisper, Swiflie, and Spout. To identify end-users' goals in the domain, we analyzed user feedback available in app store reviews and the Twitter feeds of these apps [3]. Our analysis focused on identifying user sentiment toward the core features of the domain [1]. The generated F-SIG model is shown in Figure 2.
Fig. 2: A model of the common user goals (concerns) and their relationships to the core features of the domain of anonymous social networking apps.
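As a toy illustration of the review-mining step (not the authors' actual pipeline; the feature keywords, sentiment word lists, and reviews below are all invented), one can bucket review text by the domain feature it mentions and tally sentiment per feature:

```python
# Toy sketch: tally user sentiment per domain feature from review text.
# Feature keywords, lexicon, and reviews are illustrative only.
FEATURES = {
    "anonymity": {"anonymous", "anonymity", "handle"},
    "locality": {"nearby", "local", "radius"},
}
POSITIVE = {"love", "great", "fun"}
NEGATIVE = {"hate", "bullying", "awful", "forced"}

reviews = [
    "i love being anonymous here",
    "i hate the forced handle update",
    "great seeing local posts nearby",
]

scores = {feature: 0 for feature in FEATURES}
for review in reviews:
    words = set(review.lower().split())
    polarity = len(words & POSITIVE) - len(words & NEGATIVE)
    for feature, keywords in FEATURES.items():
        if words & keywords:          # review mentions this feature
            scores[feature] += polarity

print(scores)  # → {'anonymity': -1, 'locality': 1}
```

A real pipeline would use a trained sentiment classifier and mined feature terms rather than hand-written word lists, but the aggregation from raw feedback to per-feature user goals has this shape.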

Significance and Future Work

The F-SIG diagram in Figure 2 shows that several in-depth insights can be gleaned from analyzing user goals and their relations to the core features of the domain. Such information can serve as input for the Next Release Problem (NRP), which is mainly concerned with maximizing customer value by optimizing the subset of requirements to be included in the coming release. Furthermore, through the model's dependency relations, developers can get a sense of the synergies and trade-offs between features and user goals, and thus can adjust their release strategies to focus on features that enhance the most desirable goals. This information can be particularly useful for smaller businesses and startups trying to break into the app market. Specifically, the proposed model can serve as a core asset that helps startup companies get a quick and comprehensive understanding of the history of the domain, providing information about how specific user goals and features of the domain have emerged and evolved to their current state.
Our future work in this direction will be focused on automating the model generation process, utilizing automatic app store and social media mining methods as well as automated domain modeling techniques. Our goal is to devise automated methods for data collection, domain feature analysis, and model generation. These methods will be evaluated over large datasets of app data to ensure their practicality and accuracy.

References

[1] G. Williams and A. Mahmoud, “Modeling User Concerns in the App Store: A Case Study on the Rise and Fall of Yik Yak,” in IEEE International Requirements Engineering Conference, 2018.

[2] S. Jarzabek, B. Yang, and S. Yoeun. Addressing quality attributes in domain analysis for product lines. IEE Proceedings - Software, 153(2):61-73, 2006.

[3] W. Martin, F. Sarro, Y. Jia, Y. Zhang, and M. Harman, A survey of app store analysis for software engineering, in IEEE Transactions on Software Engineering, vol. 43, no. 9, pp. 817-847, 1 Sept. 2017.

[4] P. Madsen and V. Desai, “Failing to learn? The effects of failure and success on organizational learning in the global orbital launch vehicle industry,” Academy of Management Journal, vol. 53, no. 3, pp. 451–476, 2010.

Thursday, January 31, 2019

Architectural Security Weaknesses in Industrial Control Systems (ICS)

By: Mehdi Mirakhorli (@MehdiMirakhorli), Associate Editor, and Danielle Gonzalez (@dngonza)
Center for Cybersecurity at RIT

Industrial control systems (ICS) are computers that control automation systems used in industrial environments, including critical infrastructures. They allow operators to monitor and control industrial processes and support the day-to-day operations of manufacturing, oil and gas production, chemical processing, electrical power grids, transportation, pharmaceuticals, and many other critical infrastructures. Ever since the Stuxnet attack on modern supervisory control and data acquisition (SCADA) and PLC systems, many vulnerabilities have been discovered and reported for these systems. Recent studies have shown that the documented cases of attacks on ICS infrastructures have increased exponentially, from a few incidents to hundreds per year. In this blog post, we discuss common architectural security weaknesses in ICS.


We conducted a study, spanning several months, in which we collected and analyzed 988 security advisories that disclosed ICS vulnerabilities. These advisories were obtained from the Industrial Control Systems Cyber Emergency Response Team (ICS-CERT). We performed a detailed analysis of the vulnerability reports to measure which components of ICS have been affected the most by known vulnerabilities, which security tactics were affected most often in ICS, and what the common architectural security weaknesses in these systems are.
Figure 1 shows a typical architecture of ICS. Programmable Logic Controllers (PLCs) and Remote Terminal Units (RTUs) connect to sensors and actuators through a Fieldbus to gather real-time information and provide operational control of the equipment. These controllers also communicate to a data acquisition server (SCADA) using various industrial communications protocols.


ICS relies on Human-Machine Interfaces (HMI) to configure the industrial plant, troubleshoot the operation and equipment, run, stop, install and update programs, and recover from failures. The day to day operational data is logged by the Data Historian component for further analysis and diagnosis. Each of these components has its own interfaces and can be the target of attacks that impact industrial equipment and operations. 


Figure 1. Overview of an ICS Architecture


Our analysis detected 17 ICS components in 544 ICS advisory reports (55.06% of the entire corpus). Figure 2 is a heat map showing the most common vulnerable components in these ICS advisory reports. The size of each block in the heatmap represents the frequency of a component. 


Human-Machine Interfaces (HMIs), Supervisory Control And Data Acquisition (SCADA) Software, and Programmable Logic Controllers (PLCs) are the most vulnerable ICS components.
The remaining components were significantly less vulnerable. For instance, the fourth most-affected component is OPC (reported 57 times), a software protocol based on Microsoft’s Distributed Component Object Model (DCOM) that is used by manufacturers for real-time interoperability of industrial processes.

We examined how many of the vulnerabilities had an architectural root cause by identifying the Common Architectural Weakness Enumeration (CAWE) assigned to each vulnerability and noting how many were associated with architectural tactics:
62.86% of ICS vulnerability disclosures had an architectural root cause, while 37.14% of the vulnerabilities were due to coding defects.

The most common architectural root causes for ICS vulnerabilities:

Improper Input Validation affects ICS the most, followed by Cross-site Scripting, Cross-Site Request Forgery, SQL Injection, and Improper Authentication.

Table 2 shows the top 20 architectural CAWEs affecting ICS.

Table 2. Top CAWEs in Industrial Control Systems

1) Improper Input Validation: This weakness occurs when an ICS component either does not implement, or incorrectly implements, the Validate Inputs tactic. Our analysis indicates that this weakness can be present in every ICS component, and the consequence is often a denial-of-service attack. For instance, ICSA-14-079-01 reports two vulnerabilities impacting Siemens SIMATIC S7-1200 Programmable Logic Controllers (PLCs). Crafted packets sent to two specific ports could initiate a special “defect mode” on the PLC device. The subsystem receiving the packets fails to validate them, creating a vulnerability that can be exploited remotely to cause a denial of service. These vulnerabilities were viable because the devices were accessible through the internet; they were mitigated by Siemens in new versions of these PLCs.
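The corresponding tactic is mechanical: check every field of an incoming packet against the protocol's constraints before acting on it, and reject anything malformed. A hedged sketch (the 4-byte header layout and command codes here are invented for illustration, not any vendor's protocol):

```python
import struct

ALLOWED_COMMANDS = {0x0001, 0x0002}   # hypothetical command codes

def parse_packet(data: bytes):
    """Validate a (hypothetical) header of command + payload length."""
    if len(data) < 4:
        raise ValueError("packet too short")
    command, length = struct.unpack(">HH", data[:4])
    if command not in ALLOWED_COMMANDS:
        raise ValueError("unknown command %#06x" % command)
    if length != len(data) - 4:
        raise ValueError("length field does not match payload size")
    return command, data[4:]

print(parse_packet(b"\x00\x01\x00\x02hi"))  # → (1, b'hi')
```

The point is that every untrusted field is checked before any state change; a packet that fails any check is rejected rather than allowed to drive the device into an unintended mode.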

2) Improper Neutralization of Input During Web Page Generation (’Cross-site Scripting’): This weakness occurs when an ICS component with a web-accessible UI does not “neutralize”, or incorrectly “neutralizes”, input provided by one user (commonly a web request) that is subsequently used to generate web pages served to other users. With this flaw, attackers can inject client-side scripts into web pages viewed by other users. This weakness is related to the Validate Inputs tactic; however, it is limited to ICS components with web-accessible UIs. In contrast, the previous weakness (CWE-20) covered cases in which attackers could mount a denial-of-service attack using a crafted input that did not necessarily involve web applications. ICSA-15-342-01 describes a cross-site scripting vulnerability affecting XZERES 442SR Wind Turbines. The web-based operating system of the turbine generator did not properly neutralize incoming web request content. This vulnerability could be exploited remotely and could cause a loss of power for the entire system.
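The neutralization itself is a one-liner in most web stacks: escape user-controlled text before interpolating it into HTML. In Python, for instance:

```python
import html

user_input = '<script>alert("owned")</script>'   # hostile request content
page = "<p>Latest status: %s</p>" % html.escape(user_input)
print(page)
# The markup characters arrive as inert entities (&lt;script&gt;...),
# so the browser displays the text instead of executing it.
```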

3) Improper Authentication: A common design weakness discovered in ICS is that a component fails to verify, or incorrectly verifies, the identity claims of an actor interacting with it. For instance, ICSA-17-264-04 reports that iniNet Solutions GmbH’s SCADA web server assumed the SCADA system would be used in a protected network and did not implement the Authenticate Actors tactic at all; however, the system was connected to the Internet. In another case, ICSA-16-308-01 describes vulnerabilities in multiple versions and models of Moxa’s OnCell Security software, which runs on cellular IP gateways that connect devices to cell networks. Access to a specific URL is not restricted, and any user may download log files when the URL is accessed. This can be exploited as an authentication bypass, where the malicious actor accesses the URL without prior authentication.

4) Improper Access Control: This is a weakness in the Authorize Actors tactic that occurs when an ICS component fails to restrict, or incorrectly restricts, an unauthorized actor from accessing resources or equipment. This weakness can result in privilege escalation, the disclosure of confidential or sensitive data, or execution of malicious code. ICSA-13-100-01 discusses a case in which MiCOM S1 Studio Software, which is used to configure parameters of electronic protective relays, fails to limit access to executables, meaning users without administrative privileges can replace the executables with malicious code or perform unauthorized modification of the relay parameters. A malicious actor could also exploit this to allow other users to escalate their own privileges. Exploiting this vulnerability requires physical access to the device; it cannot be exploited remotely. The company provides mitigation strategies for users, but the software has not been patched or fixed by adding the appropriate security tactics.

5) Cross-Site Request Forgery (CSRF): ICS components may not verify that an expected and/or valid request to a web server was sent intentionally by the client (a “forged” request). A malicious entity could take advantage of this weakness to force or trick a client into sending an unintentional request to an ICS component or equipment, which appears valid to the SCADA server or process controllers. ICSA-15-239-02 describes a cross-site request forgery (CSRF) vulnerability in the integrated web server of Siemens SIMATIC S7-1200 CPUs, which are used in Programmable Logic Controllers (PLCs). A user with an active session on these web servers could be tricked into sending a malicious request, which would be accepted by the CPU server without verifying intent. This can be exploited remotely and would allow the malicious entity to perform actions on the PLC using the tricked user’s permissions. 
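The standard mitigation is a per-session anti-CSRF token: the server embeds a secret token in each legitimate form and rejects state-changing requests that do not echo it back. A minimal sketch, with the session represented as a plain dict (a real web framework would manage sessions and token issuance itself):

```python
import hmac
import secrets

def issue_csrf_token(session: dict) -> str:
    """Generate a random token, remember it in the session, and embed it
    in the legitimate form served to the client."""
    token = secrets.token_hex(16)
    session["csrf_token"] = token
    return token

def handle_state_change(session: dict, submitted_token: str) -> str:
    """Reject requests whose token does not match the one issued for this
    session -- a forged request from another site cannot know the token."""
    expected = session.get("csrf_token", "")
    if not hmac.compare_digest(expected, submitted_token):
        raise PermissionError("possible cross-site request forgery")
    return "action performed"
```

This verifies *intent*, which is exactly what the S7-1200 web server skipped: an active session alone is not proof that the user meant to send the request.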

6) Use of Hard-coded Credentials: This weakness occurs when ICS components store any type of credential used for authentication, external communication, or encryption of data in a readable format (e.g., plain text) in a discoverable location. This weakness compromises the Authenticate Actors tactic in ICS, and enables malicious actors to implement authentication-bypass exploits and make ICS components or equipment perform actions that require authentication or elevated privileges. This weakness was prevalent across all ICS components. For instance, ICSA-15-309-01 discloses a hard-coded SSH key in Advantech’s EKI-122X Modbus gateways, which integrate Modbus/RTU and Modbus/ASCII devices with TCP/IP networked devices. The firmware of these gateways contains unmodifiable, hard-coded SSH keys. Since these keys cannot be changed but are discoverable, a malicious external actor could exploit them remotely to intercept communication. 
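The general remedy is to keep credentials out of the firmware or source entirely and load them from configuration that can be rotated per device. A minimal sketch, assuming a hypothetical `GATEWAY_API_KEY` environment variable (a production system would more likely use a secrets manager or per-device provisioning):

```python
import os

def load_api_key() -> str:
    """Read the credential from the environment instead of embedding it
    in firmware/source, so it can be rotated without shipping a new
    binary and is not discoverable by inspecting the code."""
    key = os.environ.get("GATEWAY_API_KEY")
    if key is None:
        raise RuntimeError("GATEWAY_API_KEY is not configured")
    return key
```

The Advantech case is the worst variant of this weakness: the SSH keys were both discoverable and unmodifiable, so affected devices could not be re-secured by rotation at all.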

7) Improper Neutralization of Special Elements used in an SQL Command (’SQL Injection’): This is another weakness in ICS that affects implementations of the Validate Inputs tactic. It describes an ICS component’s failure to correctly “neutralize” user-provided inputs that are inserted as parameters into SQL queries, which can be exploited in SQL injection attacks. In such cases the unhandled input is executed as part of the query itself rather than as a parameter. This can lead to the database unintentionally returning sensitive data, or to modification of existing data. This weakness was mostly associated with components such as Human-Machine Interfaces (HMI), supervisory control software (SCADA), and variants of SCADA (e.g., Process Control Systems (PCS)). ICSA-14-135-01 discusses a SQL injection vulnerability in several versions of the web-based CSWorks framework, which is used to build process control software. The framework fails to sanitize or validate (“neutralize”) user-provided inputs that are intended to be used as read and write paths. 
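The standard defense is parameterized queries: the driver binds user input strictly as data, so it can never alter the query structure. A minimal sketch using Python’s built-in `sqlite3` with a hypothetical `paths` table (CSWorks itself is not SQLite-based; the table and values are illustrative):

```python
import sqlite3

# Hypothetical in-memory table standing in for a process-control database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE paths (name TEXT, value TEXT)")
conn.execute("INSERT INTO paths VALUES ('setpoint', '42')")

def read_path(name: str) -> list:
    """Look up a path by name using a placeholder. The driver treats
    `name` strictly as data, so input like "' OR '1'='1" cannot change
    the query's structure."""
    cur = conn.execute("SELECT value FROM paths WHERE name = ?", (name,))
    return cur.fetchall()
```

Contrast this with string concatenation (`"... WHERE name = '" + name + "'"`), where the same crafted input would rewrite the query and dump every row.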

8) Use of Hard-coded Password: Hard-coded passwords are passwords stored in plain text or easily decryptable formats, located somewhere discoverable by malicious actors, who can exploit them to bypass authentication or authorization checks. This weakness affects implementations of the Authenticate Actors tactic in ICS products. For instance, ICSA-12-243-01 describes a privilege-escalation vulnerability in multiple versions of two varieties of GarrettCom Magnum MNS-6K Management Software, which is used for device management on managed Ethernet switches. An undocumented but discoverable hard-coded password could allow a malicious actor with access to a preexisting account on a device to elevate that account’s privileges to admin level. With these elevated privileges, a denial-of-service attack could be mounted or sensitive equipment settings could be changed. 
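Beyond not embedding passwords at all, any password a component does store should be kept as a salted, slow hash so that a discovered credential store does not yield working passwords. A minimal sketch using the standard library’s PBKDF2 (iteration count and salt size are illustrative defaults, not from the advisory):

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt=None):
    """Store only a salted hash, never the password itself. PBKDF2 with
    many iterations makes brute-forcing a leaked digest expensive."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-derive the digest from the candidate password and compare in
    constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)
```

A hard-coded password defeats even this scheme, of course, because the secret is the same on every device and cannot be changed, which is why it is tracked as its own CWE.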

9) Insufficiently Protected Credentials: This weakness occurs when ICS components use an insecure method for transmission (e.g., an insecure network) or storage (e.g., plain text) of credentials, which can be exploited for malicious interception or discovery. This weakness affects implementations of the Encrypt Data tactic. ICSA-16-336-05B discusses unprotected credentials that can be exploited in multiple versions of GE Proficy Human-Machine Interface (HMI), Supervisory Control and Data Acquisition (SCADA), and Data Historian products. In these products, a malicious actor with access to an authenticated session may be able to retrieve user passwords not belonging to the account they are using. However, this weakness cannot be exploited remotely. 
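For the transmission side of the Encrypt Data tactic, the baseline is to refuse to send credentials over a plain-text socket. A minimal sketch using Python’s standard `ssl` module to build a client context that enforces TLS 1.2+ with certificate and hostname verification (the function name is illustrative):

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Encrypt Data in transit: require verified TLS when sending
    credentials to an HMI/SCADA server, instead of a plain socket."""
    ctx = ssl.create_default_context()      # enables cert + hostname checks
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # reject legacy protocols
    return ctx
```

A socket wrapped with this context (`ctx.wrap_socket(sock, server_hostname=...)`) protects credentials from the interception this weakness enables; storage-side protection additionally requires hashing or encrypting credentials at rest.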

10) Improper Control of Generation of Code (’Code Injection’): This weakness occurs when ICS components fail to “neutralize”, or incorrectly neutralize, user-provided input to remove code syntax. The injected code may then be executed when the input is used to generate a code segment in the software. This weakness affects the Validate Inputs tactic. ICSA-14-198-01 discusses a code injection vulnerability affecting some versions of Cogent Real-Time Systems, Inc.’s DataHub, a middleware used to interface with various control systems. A malicious actor who has access to the local file system and creates a ‘Gamma’ script there can craft a specially formatted user name and password to perform a code injection attack via an ASP page and execute that script file.
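When input may flow into a code-generating component, the safest form of the Validate Inputs tactic is an allow-list: accept only characters that cannot carry code syntax, rather than trying to strip known-bad sequences. A minimal sketch for a user-name field (the character set and length limit are illustrative assumptions, not DataHub’s actual rules):

```python
import re

# Allow-list: letters, digits, and underscore only, 1-32 characters.
# Quotes, parentheses, slashes, and semicolons -- the raw material of
# injected code -- are simply not accepted.
_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,32}")

def validate_username(name: str) -> str:
    """Reject any name containing code syntax before it can reach a
    code-generating component such as a script interpreter."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError("invalid user name")
    return name
```

Allow-listing is preferred over deny-listing here because it fails closed: a new injection trick using a character you did not anticipate is still rejected.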

Finally, we looked at which tactics are most often compromised in ICS.


The top 3 most compromised architectural tactics in ICS are Validate Inputs, Authenticate Actors, and Authorize Actors.

Furthermore, our findings indicate that while the OWASP Top 10 covers the most critical web application security risks, it does not adequately represent the risks in ICS. Here you can learn about the Top 20 security risks in ICS.

Summary:

  • Many modern HMIs are now web-based, and there are cloud-based SCADA systems; therefore, common web vulnerabilities affect these components of ICS.
  • The most common architectural weaknesses in ICS are different from those in web applications (OWASP Top 10).
  • ICS are most vulnerable to CWE-20 Improper Input Validation, which can result in denial-of-service attacks that crash ICS processes and equipment.
  • ICS components are not secure by design. Many vendors have indicated that their products are designed to be used in a protected environment and therefore lack the mitigation techniques (e.g., authentication and input validation tactics) required to work in an untrusted environment.
  • Industrial control systems rely on many vendors for PLCs, RTUs, IEDs, and other controllers and equipment. This increases their attack surface and makes enforcing the Validate Inputs tactic difficult.

In a paper to appear at ICSA 2019 [1], we report these findings in greater detail and present advice for practitioners, such as isolating control system devices and/or systems from untrusted networks.



You may also like:

Joanna C. S. Santos, Anthony Peruma, Mehdi Mirakhorli, Matthias Galster, Jairo Veloz Vidal, Adriana Sejfia: Understanding Software Vulnerabilities Related to Architectural Security Tactics: An Empirical Investigation of Chromium, PHP and Thunderbird. ICSA 2017

Joanna C. S. Santos, Katy Tarrit, Mehdi Mirakhorli: A Catalog of Security Architecture Weaknesses. ICSA Workshops 2017: 220-223

M. Krotofil and D. Gollmann, Industrial control systems security: What is happening? In 2013 11th IEEE International Conference on Industrial Informatics (INDIN), Bochum, 2013.


S. McLaughlin et al., The Cybersecurity Landscape in Industrial Control Systems, in Proceedings of the IEEE, vol. 104, no. 5, pp. 1039-1057, May 2016.







[1] Danielle Gonzalez, Fawaz Alhenaki and Mehdi Mirakhorli, “Architectural Security Weaknesses in Industrial Control Systems (ICS): An Empirical Study based on Disclosed Software Vulnerabilities”, International Conference on Software Architecture, Hamburg, Germany, March 2019.