Sunday, July 24, 2016

Minor contributors correlate to bugginess. But not when they're code reviewers.

By: Patanamon (Pick) Thongtanunam, Nara Institute of Science and Technology, Japan (@pamon)
Associate Editor: Bogdan Vasilescu, University of California, Davis. USA (@b_vasilescu)

Weak code ownership correlates to poor software quality. Code ownership is a common practice in large, distributed software development teams. It is used to establish a chain of responsibility (who to blame if there is a problem) and simplify management (to whom a task or bug-fix should be assigned). A simple intuition for estimating code ownership is that the developer who has written the majority of the code in a module should be an owner of that module. Moreover, prior research found that a module with weak code ownership (that is, one written by many minor authors) is more likely to have bugs in the future [1].

Nowadays, development practices are more than just writing code. Tool-based code review is now tightly integrated with the software development cycle. Recent research has found that, beyond being a defect-hunting exercise, code review is also a way for reviewers to help an author improve the code changes [2,3]. Moreover, code writing and reviewing activities are orthogonal: teams can have a developer who reviews a lot but writes little, and vice versa.

Does code review activity change what we know about ownership and software quality? This led us to investigate the importance of code review activities for code ownership and software quality [4]. Through an empirical study of Qt and OpenStack systems, we (1) investigated the code authoring and reviewing activities of developers, (2) refined code ownership using code reviewing activities, and (3) studied the relationship between our refined ownership and software quality.

Code reviewers are the majority of contributors in a module

We found that developers who did not previously write any code changes to a module but only reviewed code changes make up the largest proportion of the developers who contributed to that module (67%-86% of contributors in a module at the median). Moreover, 18%-50% of these review-only developers are documented core developers of the Qt and OpenStack projects. These findings suggest that if a code ownership estimation considers only code authoring activities, it misses many developers who also provided reviewing contributions to a module.



Figure 1: Refined code ownership

Many minor authors are actually major reviewers

We observe the amount of code authoring and reviewing contributions that developers made to a module and classify them into two levels of expertise, i.e., major and minor, in each dimension (Fig 1).

Traditional code ownership (TCO) is derived solely from code authoring activities, while review-specific ownership (RSO) is derived solely from code reviewing activities. The interesting part of Figure 1 is the minor authors, since these developers were classified as low-expertise developers.

However, we found that 13%-58% of minor authors are major reviewers who actually reviewed many code changes to a module. This finding suggests that many developers who make large contributions to modules by reviewing code changes were misclassified as low-expertise developers because of their low code authoring activity.
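
To make the two ownership dimensions concrete, here is a minimal sketch of how contributors to one module could be labelled along both of them. The 5% cutoff, the function names and the sample data are illustrative assumptions; the post does not state the thresholds or the exact formulas used in the study.

```python
from collections import Counter

# Hypothetical cutoff: a developer holding at least this share of a module's
# activity counts as "major". The post does not state the value used.
MAJOR_THRESHOLD = 0.05


def shares(events):
    """Map each developer to their fraction of the given events."""
    counts = Counter(events)
    total = sum(counts.values())
    return {dev: n / total for dev, n in counts.items()}


def level(share, threshold):
    """Turn a share (or None for no activity) into an expertise label."""
    if share is None:
        return "none"
    return "major" if share >= threshold else "minor"


def classify(authored, reviewed, threshold=MAJOR_THRESHOLD):
    """Label every contributor of one module as (author level, reviewer level).

    authored: one developer name per code change written to the module
    reviewed: one developer name per code change reviewed in the module
    """
    tco = shares(authored)  # authoring-based (traditional) ownership
    rso = shares(reviewed)  # review-specific ownership
    return {
        dev: (level(tco.get(dev), threshold), level(rso.get(dev), threshold))
        for dev in sorted(set(tco) | set(rso))
    }


if __name__ == "__main__":
    # Made-up activity for a single module.
    authored = ["ann"] * 48 + ["bob"] + ["cat"]
    reviewed = ["bob"] * 30 + ["dan"] * 19 + ["cat"]
    for dev, labels in classify(authored, reviewed).items():
        print(dev, labels)
```

In this made-up module, "bob" wrote only one of fifty changes but reviewed most of them, so an authoring-only view would file him next to "cat", who is minor in both dimensions.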

Reviewing expertise reverses the relationship between authoring expertise and software quality

We further investigated whether reviewing expertise has an impact on software quality. To do so, we compared the rates of developers at each level of expertise between defective and clean modules. We found that the rates of developers with minor author and minor reviewer expertise in defective modules are higher than those in clean modules (the left bean plot in Figure 2). On the other hand, the rates of developers with minor author but major reviewer expertise in defective modules are lower than those in clean modules (the right bean plot in Figure 2). When we control for several confounding factors using statistical models, the rates of developers with minor author and minor reviewer expertise still share a strong increasing relationship with defect-proneness. These results indicate that reviewing expertise shares a relationship with software quality and that it can reverse the direction of the association between minor authorship and defect-proneness.


Figure 2: The relationship between minor authors and defect-proneness in Qt version 5.0.0

Practical suggestions

Our findings lead us to believe that code reviewing activity captures an important aspect of code ownership. Therefore, future estimations of code ownership should take code review activity into consideration in order to accurately model the contributions that developers have made to evolve software systems. Such code ownership estimations also can be used to chart quality improvement plans. For example, teams should apply additional scrutiny to module contributions from developers who have neither authored nor reviewed many code changes to that module in the past, while a module with many developers who have not authored many code changes should not be considered risky if those developers have reviewed many of the code changes to that module.

References

[1] C. Bird, N. Nagappan, B. Murphy, H. Gall, and P. Devanbu, “Don’t Touch My Code! Examining the Effects of Ownership on Software Quality,” in Proceedings of the 8th joint meeting of the European Software Engineering Conference and the International Symposium on the Foundations of Software Engineering (ESEC/FSE), 2011, pp. 4–14.
[2] A. Bacchelli and C. Bird, “Expectations, Outcomes, and Challenges Of Modern Code Review,” in Proceedings of the 35th International Conference on Software Engineering (ICSE), 2013, pp. 712–721.
[3] P. C. Rigby and C. Bird, “Convergent Contemporary Software Peer Review Practices,” in Proceedings of the 9th joint meeting of the European Software Engineering Conference and the International Symposium on the Foundations of Software Engineering (ESEC/FSE), 2013, pp. 202–212.
[4] P. Thongtanunam, S. Mcintosh, A. E. Hassan, and H. Iida, “Revisiting Code Ownership and its Relationship with Software Quality in the Scope of Modern Code Review,” in Proceedings of the 38th International Conference on Software Engineering (ICSE), 2016, pp. 1039–1050.

Sunday, July 17, 2016

Your Local Coffee Shop Performs Resource Scaling

Marios-Eleftherios Fokaefs, York University, Toronto, Canada
Associate Editor: Zhen Ming (Jack) Jiang, York University, Toronto, Canada 

Ever since I moved to Canada, about 8 years ago, I have been an avid Starbucks customer, primarily because it was one of the few places where I could find a decent iced coffee. As a Greek, I was bound by destiny and tradition to keep drinking iced coffee (it was funny to watch baristas rendered speechless when I asked for a cold beverage at -35℃ in Alberta). When I moved to York University for my postdoc and found the closest Starbucks to start my everyday routine, Starbucks happened to launch the “Mobile Order & Pay” feature. At the same time, with the new group at the CERAS lab, I got better acquainted with Cloud Computing and with such concepts as Self-Adaptive Systems and Cloud Elasticity. With both of these in mind, one morning I was waiting in a rather long line at Starbucks when I noticed one of the employees coming out with a cart full of empty cups and taking orders from the people in the line. I also noticed that baristas are highly efficient and multi-tasking when crafting the beverages, but customers are rather slow in comparison when ordering or paying. “Here is resource scaling and process adaptation in practice!”, I thought. The employees noticed the delay in ordering and decided to speed up the process: they parallelized ordering with paying and took advantage of the much faster crafting process.

Stepping back a few years, when I started my graduate studies at the University of Alberta, my then supervisor, Dr. Eleni Stroulia, recommended that we read Gregor Hohpe’s paper “Your Coffee Shop Doesn’t Use Two-Phase Commit” in IEEE Software Design [1] as an introduction to web services and processes. In that paper, the author draws an analogy between how Starbucks executes orders (the choice of coffee shop is completely coincidental, I swear!) and software processes, fault tolerance and rollback. In this post, I will make a similar attempt, again using Starbucks as my example, to explain how resource scaling in the cloud works. The post is split into four parts, where I lay out the details of resources and processes (part 1) as employed by the Starbucks system and the equivalent web software system on the cloud, monitoring and analysis of performance metrics (part 2), planning and execution of scaling and adaptive actions (part 3) and the economics of scaling (part 4) in both systems.


1. Processes, Resources and Topologies


Starbucks is primarily a service, which means that people and human tasks are at the center of its processes. The people who participate in the service are the customers, who issue the orders and pay, and the employees, who can be divided into tellers and baristas (i.e., the ones who prepare the beverages or other orders). From a system’s perspective, the customers provide the input to the service, and the baristas, along with special equipment and raw material (coffee, milk etc.), are the resources with which the requests are executed and served.



Figure 1. The Starbucks system and the flow of orders.


Figure 1 shows the overall flow of the Starbucks system. As the clients start their interaction with the system, they enter a queue. The first interface of the system is the cash register. At this point, a client may issue an order. The order can be any combination of hot or cold beverages, food and dessert items or packaged goods. Different orders may require different processes, different equipment and obviously different preparation time. This is an advantage for the baristas since they can parallelize several orders and speed up the whole order process. For example, some hot beverages need either or both the espresso machine and the milk steamer, while cold beverages may need the blender. For many of these drinks the sets of required equipment may be completely independent, which allows the baristas to prepare them simultaneously. The same holds for drinks and food items, as the latter may need to be heated. Some orders may be so simple that they can be executed on the spot by the cashier, including brewed coffee, tea or some packaged items. For such orders, the wait time is almost insignificant. The orders are received and executed on a “first-in-first-out” basis. However, due to the variation in execution time, a customer may receive an order later than another customer who placed theirs afterwards. An interesting characteristic of the Starbucks system concerning its human resources is that there is relatively little training involved and as a result every employee can assume any role at any time with little or no impact on the process. This additional flexibility allows the system to reassign its resources to address its needs as they appear, e.g. assign more baristas or assign more tellers to take orders, given the equipment restrictions.
Once the order has been placed, payment must be received. As with the orders, payment also comes in many forms, which can affect the time it takes to process. For example, cash or Starbucks cards are quickly processed, while credit cards and debit cards take more time, even without the odd failure. In general, once the order has been placed, its execution starts immediately and the payment has usually been processed before the drink is ready. This means that the client may need to wait longer before he or she receives the end product of the order, depending also on the backlog of orders that had accumulated by the time the order was placed.
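
The way orders are started first-in-first-out yet picked up out of order can be sketched with a tiny simulation. This is an illustration under simplifying assumptions (identical baristas, no shared equipment, all orders placed at time zero); the drink names and preparation times are made up.

```python
import heapq


def completion_order(orders, baristas=2):
    """Return order names sorted by the time they are ready for pickup.

    orders: list of (name, prep_minutes) in the sequence they were placed.
    Orders are started strictly first-in-first-out by whichever barista
    becomes free next.
    """
    free_at = [0.0] * baristas  # min-heap of times at which each barista is free
    heapq.heapify(free_at)
    finished = []
    for position, (name, prep_minutes) in enumerate(orders):
        start = heapq.heappop(free_at)  # the next free barista takes the next order
        done = start + prep_minutes
        heapq.heappush(free_at, done)
        # position breaks ties so that equal finish times keep placement order
        finished.append((done, position, name))
    return [name for _, _, name in sorted(finished)]


if __name__ == "__main__":
    placed = [("frappuccino", 4), ("espresso", 1), ("latte", 3)]
    print(completion_order(placed))
```

With two baristas, the espresso placed second is ready before the frappuccino placed first, which matches the situation described above.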


On the Cloud…


Using the Starbucks system as a basis, I will explain in similar terms how software services operate using cloud resources. One basic difference is that such systems do not rely as heavily on humans, as most operations are automatic (or at the very least interactive) and executed by software. Nevertheless, the input can come either from other software systems or from humans (like the Starbucks clients). The difference is that software clients have a much higher capacity than humans for issuing requests, as they have little think time; once a response is received, it can be quickly processed and a new request may be issued immediately. Other than that, the systems are basically similar: we have requests coming in (like orders), the requests may be of a different nature, thus requiring different resources and taking variable time to be processed, and the clients remain in a queue while waiting for their requests to finish.



Figure 2. The topology of a web software system on the cloud and the flow of requests.


Figure 2 shows the equivalent system of a web application deployed on a cloud, along with the flow of request processing. The system we are considering is a simple three-tier architecture, where the clients issue requests through an interface, the requests are dispersed by a load balancer to copies of the application in a number of application servers, where they are processed, and, if there is a need, a database is accessed to fetch or store data. The load balancer plays the role of the queue in the Starbucks system. At design time, we may set the load balancer to distribute requests in a generic manner (e.g. round robin, or by busyness) or be more sophisticated and distribute the requests to specific server clusters according to their individual demands for resources (e.g. CPU, memory, disk etc.). The latter case, which is closer to the Starbucks scenario with multiple types of orders, has interesting and important economic extensions, which we will discuss in Part 4. Exactly like the Starbucks system, requests affect each other as they take up resources, which may lead to delays and longer queues.
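
As a rough sketch of the two generic strategies mentioned above, round robin and distribution by busyness could look like the following. The server names and the in-flight request counts are hypothetical, and a real load balancer would track these counts itself.

```python
from itertools import cycle


def make_round_robin(servers):
    """Return a picker that hands out servers in a fixed rotation."""
    rotation = cycle(servers)

    def pick():
        return next(rotation)

    return pick


def pick_least_busy(active):
    """Pick the server with the fewest in-flight requests.

    active: dict mapping server name -> number of requests being processed.
    Ties go to the server listed first.
    """
    return min(active, key=active.get)


if __name__ == "__main__":
    pick = make_round_robin(["app-1", "app-2", "app-3"])
    print([pick() for _ in range(4)])
    print(pick_least_busy({"app-1": 5, "app-2": 2, "app-3": 7}))
```

Round robin ignores how long each request takes, so a few heavy requests can pile up on one server; picking by busyness reacts to that at the cost of tracking server state.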


Unlike the Starbucks system, it is not as easy or seamless to repurpose resources and assign them a different task at runtime; an application server cannot become a database server with the snap of a finger. However, to compensate for this challenge, cloud environments offer additional flexibility in commissioning and decommissioning resources. Since we deal with virtual and not physical resources on the cloud layer, we can easily boot up a new server and have it working in a matter of minutes or even seconds. When we no longer need it, we can stop it without affecting the functionality or the performance of the overall system. This strategy is not equally possible in the Starbucks system, because we cannot hire or fire people on the spot for a very short period, nor can we call in an employee in an instant during rush hour. We will further discuss this special ability of cloud computing in Part 3.

2. Monitoring and analysis

When considering the quality of service, we need to pinpoint the metrics that must be monitored with respect to the system’s health in order to identify any potential performance problems. Performance is crucial for interactive systems, as it will be perceived as quality by the end clients. In the Starbucks system, quality is determined by customers based on the several wait times they have to endure, as shown in Figure 3 by the red-yellow stars. Customers will have to wait to order (the original queue), to pay (based on payment processing times), and finally to receive their order (after all the preparations and crafting have finished).



Figure 3. The Starbucks system along with the monitored metrics.


Interestingly enough, order and pay wait times do not depend entirely on the system’s response capacity, given the primarily interactive nature of the system. While order wait time is partly waiting in line, a big chunk of it is waiting for the customer to decide what they want to order and then actually order it. For those familiar with the Starbucks menu, you can imagine that this is not always a trivial task. After the order has been placed, the cashier has to work out all the details of the particular order in an interactive manner (“Is 2% milk fine?”, “Would you like that sweetened?”). Remarkably, perceived quality is determined not so much by the actual order wait time as by the wait time in the queue. This is because in the queue the customer is inactive, while during the order there is interaction, which is understandable and acceptable. Pay wait time depends on the particular payment type. Usually, cash payment may take longer than card payments, whose duration in turn depends on the responsiveness of the credit card service, not considering any potential failures and retries. Payment with the Starbucks rewards and cash cards is most often the fastest one.


The processing and wait times concerning the back-end processes of the system, i.e., the preparation of the orders, are usually regulated and within constraints imposed by the resources. More often than not, baristas are quite efficient in crafting beverages, even more so than the customers are in ordering. The performance of the equipment is standardized. Therefore, the overall performance of the process can be improved only by adding more resources, baristas or equipment. However, adding too many resources can actually create problems, as suggested in Fred Brooks’ book “The Mythical Man-Month”.

On the Cloud…


To a large degree, the quality of any system is perceived based on its responsiveness and effectiveness in a timely fashion. The same holds for most software systems on the cloud. As shown in Figure 4 (also as red-yellow stars), the software system’s final response time depends on the performance of the individual resources participating in a request, including computation, storage and network resources. Similarly to the Starbucks system, the client is also responsible for producing some wait time while preparing to issue the request. However, this time is not perceived by the system, since the response time is measured from the moment that the request is received, and there is no or insignificant waiting in the queue, as requests are usually processed in parallel by multithreaded applications. Nevertheless, this time, known as think time, is important when considered along with the response time; if the response time is lower (or even significantly lower) than the think time, it is not perceived as strongly by the customer. In the opposite case, if the think time is low (especially when we talk about software clients), even a slight increase in the response time will be noticed by the client. This property is helpful, when considering and setting performance goals.




Figure 4. A software system on the cloud along with the monitored metrics.


Concerning the performance of the back-end resources, we have to take into account the demand of the various requests for CPU, memory, disk, network and so on. These demands can be roughly estimated for each class of requests by profiling the application during its implementation. Given the performance specification of the cloud resources and the demands of requests, we can also estimate the overall need of the application for resources and predict its performance under certain workloads. With this knowledge, we can also address any fluctuations in the workload by dynamically allocating or deallocating resources. Unlike the Starbucks system, resources in the cloud can change in number and in size in a more flexible and volatile manner, since we are talking about virtual resources, and there are fewer significant constraints with respect to their number since they do not affect each other to a large degree. The only constraints are imposed by the underlying hardware, which may or may not lie within the application owner’s control, and eventually it is a matter of cost how many resources will be allocated.
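
One simple way to turn profiled demands into a resource estimate is the utilization law from operational analysis (utilization = throughput × service demand). The sketch below applies it to a single resource; the function name, the 70% target utilization and the sample numbers are assumptions for illustration, not values from a real deployment.

```python
import math


def servers_needed(arrival_rate, demand_seconds, target_utilization=0.7):
    """Estimate how many identical servers a workload needs.

    arrival_rate: requests per second arriving at the cluster
    demand_seconds: busy time one request costs on one server (from profiling)
    target_utilization: how full we are willing to run each server (0 to 1]
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Utilization law: rate x demand = average number of fully busy servers.
    busy_servers = arrival_rate * demand_seconds
    return max(1, math.ceil(busy_servers / target_utilization))


if __name__ == "__main__":
    # 200 requests/s at 20 ms of CPU each keep about 4 servers fully busy.
    print(servers_needed(200, 0.02))
```

The same calculation can be repeated per resource (CPU, memory, disk) and per request class, taking the largest result as the estimate.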

3. Scaling Planning and Execution


Having a good understanding of the Starbucks system and its resources, and having set up the monitors for the performance metrics, we can now better understand and identify the motivation behind some of the changes that Starbucks made to its process. Figure 5 shows the changes introduced in the Starbucks system, marked as numbered circles.




Figure 5. The adapted Starbucks system with the changes in numbered circles.


The carefully placed monitors (aka employees looking at very long lines and disgruntled patrons muttering under their breath about the same problem) revealed that the bottleneck, or at least one of the bottlenecks, of the Starbucks system is the order queue. Therefore, the first adaptive action that the employees took was the “cup cart” (Figure 5, change 1), i.e. a small wheeled cart with empty cups of all sizes and types. An employee would walk along the line with the cart, mark a cup with each customer’s order(s) and then pass the cups to the baristas. The cart does not accept orders for food, packaged items and brewed coffee or tea, as these can be served straight at the cashier. With this change, Starbucks effectively separated the order queue from the pay queue and managed to parallelize the two processes. Having placed the order, the customers feel more relaxed as they know that their order is already being processed while they wait in line to pay. In practice, by the time the customers pay, their order may be ready for pick up, which increases their perception of quality.


The second adaptive change took advantage of novel technology: Starbucks introduced the “Mobile Order and Pay” service through the Starbucks mobile application (Figure 5, change 2). The concept is that a customer can place an order with a specific Starbucks store, pay through their Starbucks account and then go to pick up their order from the store. In this way, they can completely skip the queue and pick up their order as soon as they step into the store. The order and pay queues disappear entirely and the only thing that remains is the preparation time. This also becomes manageable, as the application returns an approximate time by which the order will be ready, taking into account average preparation times and customer arrival patterns.

After both these changes, the only wait time that remains is the preparation time. As already mentioned, this time can be reduced if necessary by commissioning more resources, human or equipment. However, several restrictions apply to these scaling actions. For example, we cannot dynamically increase the size of our equipment for a few hours and then release whatever we no longer need. The acquisition of equipment is planned based on average customer arrival and as a result there are moments when equipment is underutilized and others when it is not enough and wait time is temporarily increased for customers. In addition, we cannot increase the number of baristas to more than 2 or 3 at a given time, depending on the size of the store. What Starbucks would do with respect to employees is identify specific times during the day or during the year (e.g. Christmas or other holidays) with increased traffic and assign more baristas or cashiers. This strategy has become a norm for almost all services and has been transferred to software services as well.


On the Cloud…


Thanks to the flexibility of cloud computing, there are a number of adaptive changes we can make to address potential performance issues of the deployed system. Figure 6 shows some of these changes noted in numbered circles. The first change that comes to mind when considering a high response time for a software system on the cloud is to add more resources, mainly virtual machines (Figure 6, change 1). Unlike the Starbucks system, space restrictions are less prominent in cloud systems. Although these restrictions can be imposed by hardware (a physical server cannot accommodate an infinite number of virtual machines), it doesn’t always come to that. And when it does, the software can be moved to a public cloud, which usually has much larger capacity than private infrastructure, so that space is not an issue. On the other hand, the problem then becomes one of cost; reserving more and more virtual machines translates into more money that needs to be paid to the public cloud provider. This is a consequence that we will discuss in the next part concerning the economics of scaling. Cost is actually the motive for the second change (Figure 6, change 2): removing a virtual machine from a cluster when it is not fully utilized. If the workload can be shared by the remaining resources, the spare virtual machine can be removed to save costs.
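
Changes 1 and 2 can be sketched as a simple threshold rule over the measured utilization of a cluster. The thresholds and the function name below are hypothetical; real autoscalers typically add cool-down periods and smoothing, which are left out here.

```python
def scaling_decision(utilizations, scale_out_above=0.8, scale_in_below=0.3,
                     min_vms=1):
    """Decide whether a cluster should grow, shrink or stay as it is.

    utilizations: current utilization of each VM in the cluster, in [0, 1]
    Returns +1 (add a VM), -1 (remove a VM) or 0 (no change).
    """
    count = len(utilizations)
    if count == 0:
        raise ValueError("the cluster has no VMs to measure")
    total_load = sum(utilizations)
    average = total_load / count
    if average > scale_out_above:
        return 1  # change 1: the cluster is saturated, add a VM
    if count > min_vms and average < scale_in_below:
        # change 2: remove a VM only if the remaining ones can absorb its
        # share of the load without becoming saturated themselves
        if total_load / (count - 1) <= scale_out_above:
            return -1
    return 0


if __name__ == "__main__":
    print(scaling_decision([0.90, 0.85]))
    print(scaling_decision([0.20, 0.10, 0.25]))
    print(scaling_decision([0.50, 0.60]))
```

Keeping a gap between the two thresholds avoids adding and removing the same VM over and over when the load hovers around a single cutoff.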




Figure 6. The adapted software system with the changes as numbered circles.


Another possible change concerns the redistribution of requests to specialized clusters (Figure 6, change 3). We assume that the load balancer of the software system already distributes the incoming requests to specialized clusters according to their demands for specific resources. However, this is done in a very straightforward manner and the balancer sends the CPU intensive requests to the CPU cluster, the memory intensive requests to the memory cluster and so on. If a particular cluster is saturated, the first thought would be to add resources, as in change 1. Alternatively, since the virtual machines in the clusters actually possess all resources (CPU, memory, disk) in different configurations, we can avoid adding unnecessary resources and actually redirect requests from the saturated cluster to one that is not as utilized. However, one needs to be careful and not send too many requests to other clusters to avoid saturating their otherwise limited resources. Therefore, this action requires changes to the balancer software to become more sophisticated with management responsibilities as well.
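
A minimal sketch of change 3 is shown below: a request goes to its preferred cluster unless that cluster is saturated, in which case it spills to the least utilized cluster that still has headroom. The cluster names, the two thresholds and the convention of returning None when nothing has headroom are assumptions made for this illustration.

```python
def route(preferred, utilization, saturation=0.9, spill_limit=0.6):
    """Choose the cluster that should serve a request.

    preferred: name of the cluster matching the request's main demand
    utilization: dict mapping cluster name -> current utilization in [0, 1]
    Returns a cluster name, or None if the preferred cluster is saturated
    and no other cluster has room (a hint that a VM should be added instead).
    """
    if utilization[preferred] < saturation:
        return preferred
    # Only spill to clusters with clear headroom, so that the overflow does
    # not saturate their otherwise limited resources.
    candidates = {
        name: load
        for name, load in utilization.items()
        if name != preferred and load < spill_limit
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    load = {"cpu": 0.95, "memory": 0.40, "disk": 0.55}
    print(route("cpu", load))     # the CPU cluster is saturated, so spill
    print(route("memory", load))  # the preferred cluster still has room
```

The spill limit is deliberately lower than the saturation threshold, reflecting the caution above about sending too many requests to clusters that were not sized for them.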


Finally, if the bottleneck is in the database requests, we can scale the data layer in a similar manner (Figure 6, change 4). Thanks to recent advancements with Big Data and NoSQL technologies in databases, it is possible to partition data and distribute it in multiple stores making both reading and writing to the database faster.


4. On the Economics of Scaling


Out of the potential bottlenecks we identified in Part 2, we have seen that Starbucks has paid particular attention to the order and pay queues and applied actions to address them. Performance improvement aside, looking at the economic aspect of scaling, this focus makes sense for another reason. Waiting in long lines to order may prompt the customers to abandon the endeavour and leave the store altogether. This automatically means loss of revenue for Starbucks, but it may also imply a steep fall in the customers’ long-term perceived quality, which may prevent them from visiting this particular store in the future or Starbucks altogether. On the other hand, once the customers have placed their order and wait to pay, they are less likely to leave the queue and even less so after they have paid. Formally, the probability that a client will leave the system prematurely decreases as he or she progresses further within the system. Long pickup times may affect the long-term quality, but customers will rarely abandon something they have already paid for. By eliminating the order and/or the pay queue, either with the wheel cart or with the mobile app, Starbucks minimizes the risk of losing clients who have already entered the system, and increases its total expected revenue.
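
The effect on expected revenue can be illustrated with a small calculation. The price and the abandonment probabilities below are invented for the example; only the structure (a customer pays only if he or she survives every queue before payment) follows the argument above.

```python
def expected_revenue(price, leave_probabilities):
    """Expected revenue from one customer who enters the store.

    price: value of the order if it is eventually paid for
    leave_probabilities: chance of walking away at each stage before
        payment, in the order the stages are encountered
    """
    stays = 1.0
    for probability in leave_probabilities:
        stays *= 1.0 - probability  # customer must survive every stage
    return price * stays


if __name__ == "__main__":
    # Hypothetical numbers: long order queue, short pay queue.
    print(expected_revenue(5.0, [0.20, 0.05]))
    # Cup cart: the order is taken in line, so fewer customers give up.
    print(expected_revenue(5.0, [0.05, 0.05]))
    # Mobile app: no queue at all before payment.
    print(expected_revenue(5.0, []))
```

Because abandonment is most likely at the earliest stage, shortening or removing the order queue moves the expected revenue more than the same improvement further down the line.
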
The adaptation costs of the changes described above are also of particular interest. The wheel cart solution is virtually free, since the cart and the cups already exist, as does the human resource (cashier or barista) at the moment of the change. The redistribution of human resources may affect the rest of the system, but since the order queue has been identified as the current bottleneck, reassigning one employee to this extra task will be of more benefit than cost. The mobile app has obvious additional costs (infrastructure, developers, maintenance and so on), but it is a global solution, which can be applied to all (or potentially all) stores. Furthermore, it eliminates two potential bottlenecks, the order and pay queues, and it is a parallel and alternative process, which does not affect the other orders to a large degree. Finally, adding and removing resources dynamically and just-in-time has obvious costs, but more importantly carries high economic risk; reserving equipment for a period of high traffic that turns out to be shorter than expected may result in higher costs. Overall, the cost to benefit ratio of adding physical resources may be high enough that the system would prefer a few short periods with higher delays to adding resources indiscriminately.


On the Cloud…


Unlike the Starbucks system, adding and removing cloud resources is a convenient and inexpensive change, considering the low cost of hardware and the fact that cloud computing is an economy of scale. The last statement means that since a physical host can host a large number of virtual machines, the more VMs are commissioned by clients the more the cost spreads across these machines. However, the concept of economies of scale also applies to the software in a negative manner; the fewer requests a VM serves the more expensive they are. Therefore, it is desirable that our clusters operate close to full capacity so that their costs spread out over more requests. This is especially prominent in small systems, where the clusters are small and an extra VM will increase the average cost per request too much. As a result, when a small number of requests would trigger a scaling action to preserve performance, we may be reluctant to add new resources and prefer to wait until more requests arrive, with the risk of increasing the system’s response time. However, a considerably increased response time may result in dropped requests, similar to Starbucks customers leaving the queue. Both these phenomena can result in a significant decrease in the long-term perceived quality of the system. In fact, unlike services like Starbucks, software services usually have written agreements, known as Service Level Agreements (SLAs), with their clients where they guarantee a maximum response time and a minimum availability rate (i.e., percentage of served requests out of total received). Violation of these agreements may even result in financial penalties.
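
The point about small clusters can be checked with a back-of-the-envelope calculation. The hourly VM price and the request volumes below are made up; the sketch only shows how the average cost per request reacts when one VM is added for a small bump in traffic.

```python
def cost_per_request(vm_count, vm_hourly_cost, requests_per_hour):
    """Average cost of serving one request with the given cluster."""
    return vm_count * vm_hourly_cost / requests_per_hour


def extra_vm_cost_increase(vm_count, vm_hourly_cost, requests_per_hour,
                           extra_requests):
    """Relative rise in cost per request after adding one VM.

    extra_requests: additional hourly requests that triggered the scale-out
    Returns a fraction, e.g. 0.25 means requests became 25% more expensive.
    """
    before = cost_per_request(vm_count, vm_hourly_cost, requests_per_hour)
    after = cost_per_request(vm_count + 1, vm_hourly_cost,
                             requests_per_hour + extra_requests)
    return after / before - 1


if __name__ == "__main__":
    # Hypothetical price of 0.10 per VM-hour; 500 extra requests per hour.
    small = extra_vm_cost_increase(2, 0.10, 10_000, 500)
    large = extra_vm_cost_increase(20, 0.10, 100_000, 500)
    print(f"small cluster: +{small:.1%} per request")
    print(f"large cluster: +{large:.1%} per request")
```

With these invented numbers the same 500 extra requests make each request far more expensive in the 2-VM cluster than in the 20-VM one, which is why a small system may prefer to wait before scaling.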

Concerning the concept of heterogeneous clusters using VMs optimized for specific resources, as in the software system described here, the motivation comes from Amazon’s pricing policies for virtual resources. Within virtual machines of the same type (general purpose, CPU optimized, memory optimized and so on), doubling the resources of a VM also doubles its cost. However, if we want to increase only one resource per VM, we can commission a specialized VM for a lower cost than a general purpose VM, which would unnecessarily increase the other resources as well. Although heterogeneous clusters are an optimal solution from a cost and performance perspective, they require additional logic on the load balancer to make sure that the requests are distributed according to their needs.

5. Conclusions 


The purpose of this post was to show that resource scaling and dynamic adaptation are a reality not only in software and computer systems, but also in everyday services and processes as simple as ordering a cup of coffee. Crucial components of the adaptation process are monitoring, where we study the performance of our system and identify potential problems, correct planning and execution of the adaptive actions given the available resources, and eventually the economic considerations of the whole process.

The inclusion of novel technologies, including cloud and mobile computing, does not diminish the role of humans in the process. On the contrary, scaling and smart solutions can lead not only to better services for customers, whether these are software or human services, but more importantly they can lead to economic benefits for the companies, the employees and the clients. Cost savings can allow companies to redistribute the budget towards further improving the service, or the quality of work for their employees (increased salaries, better training etc.). In addition, lower costs can lead the market through competition to lower the service prices to the benefit of the clients.  These facts show that during dynamic adaptation, a service should be perceived both as a system and as a product with economic considerations.


References

[1] Gregor Hohpe. Your Coffee Shop Doesn't Use Two-Phase Commit. IEEE Software. 22(2): 64-66 (2005)


If you like this article, you might also enjoy reading:
  • Panos Louridas. Up in the Air: Moving Your Applications to the Cloud. IEEE Software. 27(4): 6-11 (2010).
  • Leah Riungu-Kalliosaari, Ossi Taipale, Kari Smolander. Testing in the Cloud: Exploring the Practice. IEEE Software. 29(2): 46-51 (2012).
  • Diomidis Spinellis. Developing in the cloud. IEEE Software. 31(2): 41-43 (2014).

Sunday, July 10, 2016

IEEE Software July/August 2016 Issue

Associate Editor: Brittany Johnson (@brittjaydlf)

The July/August 2016 issue of IEEE Software is packed with interesting papers, with a focus on software quality and the human elements' role in improving software quality. Despite so many great papers and interesting topics, I wound up reading two papers from top to bottom:

  • "The Weakest Link" by Gerard J. Holzmann, and
  • "Test Better by Exploring. Harnessing Human Skills and Knowledge" by Itkonen and colleagues

Numerous papers caught my attention just from the title... "Obstanovka: Exploring Nearby Space,"  "Exploiting Big Data's Benefits," and "Examining the Relationship between FindBugs Warnings and App Ratings" to name a few. However, it wasn't until I spoke with a colleague of mine that I was able to decide on a couple of papers to focus in on. After narrowing my selections down to a handful, I asked her, as someone with an academic and research background, which two she would be more interested in reading or hearing about. 

These two papers make similar yet contradictory points regarding how we can improve software development and quality. While Test Better by Exploring proposes introducing more (specialized) humans into the software development and quality assessment mix, The Weakest Link posits that humans may be introducing more problems than we can solve (in the context of trusting a computer with decision making rather than a human).

The former reminds me of a study my advisor conducted that compared software and game development, called "Cowboys, ankle sprains, and keepers of quality: how is video game development different from software development?" [1]. Though it is not explicitly stated in this article, I think that, as with Cowboys, Ankle Sprains, there is an implicit argument that we as software developers can learn and improve from other groups or types of development, especially since an important part of game development is alpha and beta testing, which sounds quite similar to the approach argued for in Test Better by Exploring. Being a human factors researcher, I recognize the benefits of incorporating this type of evaluation into all software development. The lab I work in does software tools research, and we have found that one reason users may have so many problems with their software is that the software doesn't do what they expect. To know what users expect, or more importantly to deal with what they wouldn't expect, it seems necessary to include them in the process of creating and evaluating the software. Similar to heuristic evaluations, it seems like you can get the most out of this practice if you have a mix of different types of users, including non-user experts, to help shine a light on issues developers themselves may not encounter.

The topic of the latter has long been a subject of discussion, as mentioned in the article, and is pervasive in pop culture (Will Smith's character's unwillingness to trust robots in I, Robot; the first artificially intelligent child being trusted by humans and robots in A.I. Artificial Intelligence). Most often the takeaway is: trust computers when human insights can be added or provided. The Weakest Link challenges this notion, suggesting that in some situations computers could be effective at solving problems and making decisions without human interference. This topic becomes an even bigger debate when we think about how much of what we do on a day-to-day basis is being automated: self-driving cars, robot vacuum cleaners, airplane autopilot, etc. Should we be putting more trust in our computers, slowly eliminating the human factor? Or is it possible that there is a plateau for the growth and dissemination of artificial intelligence as an everyday part of our lives?

Overall, whether I agreed with all the points made or not, I thought both papers were a good read. Check out the July/August 2016 issue to see for yourself! 

[1] Murphy-Hill, Emerson, Thomas Zimmermann, and Nachiappan Nagappan. "Cowboys, ankle sprains, and keepers of quality: how is video game development different from software development?" Proceedings of the 36th International Conference on Software Engineering. ACM, 2014.

Sunday, July 3, 2016

Diversity in CS shines at NSBC 2016

Associate Editor: Brittany Johnson (@brittjaydlf)

Earlier last month, I spent my weekend in the beautiful A-T-L, Georgia where I had the pleasure of attending the first annual National Society of Blacks in Computing (NSBC) Conference. This is separate from NSBE or Richard Tapia, though related (and almost just as new to me as NSBC). When I started studying CS, though I was aware that there were few females, and even fewer females that look like me, I was fortunate enough to have a solid support system from my mentor (Dr. Jim Bowring) and participation in SC LS-AMP under the direction of Christine Moore. However, the community that I found at NSBC goes above and beyond any of my prior experiences for various reasons, which I will attempt to summarize below.

NSBC was started with the hopes of building a community and awareness of the accomplishments of blacks in computing; though if you attend NSBE, for example, you'll see hundreds if not thousands of us, that is not the case in CS. We have far fewer African Americans in CS than in any other engineering discipline, which I don't find surprising, as I am typically one of few African Americans at the conferences I regularly attend. It was also apparent in the significantly smaller number of attendees at NSBC in comparison to engineering-targeted groups like NSBE. But it was refreshing to be around others like me on a personal AND professional level: people with similar research interests and areas that I would have NEVER met had it not been for NSBC.

However, when I say we are here...we are here. The conference had an attendance of 90, higher than they (and I) expected, especially for the first year of the convention. NSBC was truly an informational and inspirational experience that I recommend to any African American studying or with a career in computing.

Now I would like to note that it seems a large part of the goal of this convention is to recruit more African Americans studying CS into PhD studies and then academia. So the target audience is undergrads and grad students (both early and late in their studies) who haven't made a decision regarding what they are going to do, or who need some inspiration and information to make a decision. There were also some industry folk, there both as mentors and to learn more about potentially transitioning from industry into an academic career.

I would also like to note this was not a blacks rule, everyone else drools kind of conference. It was empowering, but there were lots of advice and anecdotes that attempted to put us in our places as well regarding how we perceive our place in the community, the non-uniqueness of our struggle to our race, and how we can use our resilience to build relationships and be successful through adversity. The example that stood out to me was Ben Shneiderman, who is basically the father of the field of Human Computer Interaction. Before he was renowned and successful, however, he was a leper in the CS community, constantly ridiculed, belittled, and disrespected. He even had trouble getting tenure promotions, something we wouldn't expect a white male to have any issues with. However, ignorance is real and transcends race, ethnicity, and culture. Sometimes we're just scared of what we don't know or understand...so just like Ben, help them understand. Be a part of your community, despite the adversity, and continue to stay true to who you are and what you believe. Once they understand, they'll come around. And if they don't, you didn't need them to begin with.

Though some sessions were for all, most sessions were divided into three tracks: undergraduate, graduate, and future faculty. Since I'll be on the market in the Fall, I attended the future faculty track, where I was able to gather extremely useful information, mostly in the form of anecdotes delivered by computer scientists with various educational backgrounds and career paths. However, I was told all tracks were valuable and provided opportunities for learning, networking, and personal growth. For more information on the program from this year's conference, look here: https://goo.gl/oZA6ez

I want to narrow in on the networking point...we really don't hear enough about all the GREAT things we are doing in computing and computing education. One thing in particular that stood out to me at this conference was the incredible intelligence, resilience, and success our people have come to achieve and utilize. For example, I had the absolute pleasure of meeting Dr. Elva Jones. Dr. Jones got her PhD from NC State (something I was almost embarrassed that I didn't know); afterwards, she got an offer to come back and teach at NCSU and turned it down to go back to her HBCU alma mater to give back (Winston-Salem State). And give back she did...Dr. Jones is the founder of (and legacy behind) the CS department at Winston-Salem State! Probably the coolest thing I've ever heard...and it hits so close to home. Now I was really embarrassed that I didn't know who she was...though super glad I got to meet her. Not to go on too much of a tangent, but this is why we need more specialized efforts in recognizing and archiving the achievements of African Americans in CS...unfortunately they tend to fall by the wayside, leaving people like myself to find inspiration on their own. One initiative I've been involved with plans to change that - maybe I can give more details on that later when it's closer to being finalized ;).

To better assist with networking and the forming of relationships (i.e. professional, mentoring), the organizers used personality assessments to provide personality profiles to each attendee. They used stickers on our badges to show what personality profile we are, and for many this was a great way of meeting like-minded people who could potentially contribute to each other's careers. I will say I didn't use the stars to decide who to meet, but for those who know me I just love meeting new people...like these wonderful women pictured below. I already knew one of them, but the rest I met at the conference. I can honestly say about each and every one of them that we have formed some sort of foundation for a future relationship in our careers, both mentoring and professional.

At the end of it all, there was a dinner with a keynote and surprise guests. The keynote was GREAT (I'll talk about it in follow-up day by day details), but the surprise guests were the founders of the aaphdcs listserv! It was such a humbling (and tear jerking) experience seeing them take in all the beautiful, intelligent African American Computer Scientists around them. I had a similar moment with one of the other faculty attendees where we both took a step back to take in the idea that we are building relationships that will help bring in a new generation of diversity in tech.

I would also like to mention that aside from the fact that they had us in the NICEST hotel I've ever stayed in for a conference, the food was PHENOMENAL! There were options for everyone, and plenty to go around. The only downside, cash bar :(. But, due to the hospitality of the organizers, we were able to have some fun after hours watching the NBA Finals and enjoying each other's company (with free alcohol, which never hurts XD).

Now that I've given you an overview of my experience and what I thought was great about NSBC, I hope it'll inspire you to join the community or spread the word so others know they're not alone! We're here baby!


Editor's Note: As Editor-in-Chief of the IEEE Software Blog, it gives me great pleasure to welcome Brittany Johnson as a new associate editor for the blog. She will blog about diversity in computing as well as about the content in the new issues of IEEE Software Magazine. - Mei Nagappan

Sunday, June 26, 2016

CodeTube: Making the Best of Software Development Video Tutorials

 Associate Editor: Sonia Haiduc, Florida State University, USA (@soniahaiduc)

Software developers need to continuously learn new skills to keep up with their tasks, such as using a new library, learning a new programming language, or in general adopting a new technology never used before. In such a learning process, along with more formal documentation, online (and informal) resources can be very useful. Video tutorials are one of the emerging ways in which this kind of knowledge is available to developers. Videos are, however, a noisy data source, and finding the right piece of information within a long video tutorial can be frustrating and inefficient.

Meet CodeTube, a novel search engine that analyzes and fragments the contents of videos, offering developers the ability to find only the information they need within otherwise long tutorials. CodeTube extracts and indexes the audio transcript, as well as the text appearing on screen in the video tutorials, including source code, something no other video search engine currently offers to developers. Furthermore, the indexed text is used to retrieve related posts from Stack Overflow, displaying them below the video fragment and thus integrating in one place different sources of information.

Actually, CodeTube does much more than that. Here are some of its main features:
  • It mines video tutorials found on the web, enabling developers to query their contents;
  • It splits video tutorials into cohesive and self-contained video fragments;
  • It returns only relevant video fragments in response to a developer’s query, ignoring the irrelevant parts that may occur in lengthy videos;
  • It extracts and indexes the source code and English text which appear on the screen, as well as the audio transcripts, using a combination of text analysis and image processing;
  • It recommends relevant Stack Overflow discussions to the video fragments selected by developers; and
  • It recommends related video fragments to the one selected.
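To give a feel for what querying indexed video fragments can look like, here is a deliberately small Python sketch. It is not CodeTube's implementation: the `Fragment` type, the tokenizer and the ranking by number of matching query terms are all assumptions made for illustration. The only idea borrowed from the description above is that each fragment is indexed by both its audio transcript and the text (including code) shown on screen.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical record for one self-contained piece of a video tutorial."""
    video_id: str
    start_s: int
    end_s: int
    transcript: str      # what the speaker says
    on_screen_text: str  # text and source code visible in the frames


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def build_index(fragments):
    """Map each term to the positions of the fragments that contain it."""
    index = defaultdict(set)
    for pos, frag in enumerate(fragments):
        for term in tokenize(frag.transcript + " " + frag.on_screen_text):
            index[term].add(pos)
    return index


def search(query, fragments, index):
    """Return matching fragments, best first, ranked by matching query terms."""
    scores = defaultdict(int)
    for term in tokenize(query):
        for pos in index.get(term, ()):
            scores[pos] += 1
    ranked = sorted(scores, key=lambda pos: (-scores[pos], pos))
    return [fragments[pos] for pos in ranked]
```

With an index like this, a query only returns the fragments that mention its terms, so a viewer could jump to `start_s` of the best match instead of scrubbing through the whole video. A production search engine would use a proper information retrieval library and a better relevance model than a raw term count.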

CodeTube has been evaluated in two studies involving developers. In the first study, 34 developers evaluated (i) the coherence and conciseness of the video fragments produced by CodeTube, as well as their relevance to a query, as compared to the results returned by YouTube, and (ii) the relevance and complementarity of Stack Overflow discussions returned by CodeTube for specific video fragments. In the second study, CodeTube was introduced to leading developers involved in the development of Android apps. They were asked questions about the usefulness of CodeTube, focusing on the value of extracting fragments from video tutorials, and of providing recommendations by combining different sources of information.  The results of both studies indicate that developers consider CodeTube a useful tool with a great potential to help them during their daily tasks.

CodeTube is available online, and you can try it for yourself at: http://codetube.inf.usi.ch/.

The current dataset focuses on Android tutorials, but the researchers aim to include other topics in future work, so stay tuned!


References:

[1] L. Ponzanelli, G. Bavota, A. Mocci, M. Di Penta, R. Oliveto, M. Hasan, B. Russo, S. Haiduc, and M. Lanza, “Too Long; Didn’t Watch! Extracting Relevant Fragments from Software Development Video Tutorials,” in International Conference on Software Engineering, Austin, Texas, USA, 2016. Preprint available at:
http://www.inf.usi.ch/phd/ponzanelli/profile/publications/2016a/Ponz2016a.pdf

[2] L. Ponzanelli, G. Bavota, A. Mocci, M. Di Penta, R. Oliveto, B. Russo, S. Haiduc, and M. Lanza, “CodeTube: Extracting Relevant Fragments from Software Development Video Tutorials,” in Proceedings of ICSE 2016 (38th ACM/IEEE International Conference on Software Engineering), 2016. Preprint available at:
http://www.inf.usi.ch/phd/ponzanelli/profile/publications/2016b/Ponz2016b.pdf