Sunday, June 26, 2016

CodeTube: Making the Best of Software Development Video Tutorials

Associate Editor: Sonia Haiduc, Florida State University, USA (@soniahaiduc)

Software developers must continuously learn new skills to keep up with their tasks: using a new library, learning a new programming language, or, more generally, adopting a technology they have never used before. In this learning process, informal online resources can be very useful alongside more formal documentation. Video tutorials are an emerging way of making this kind of knowledge available to developers. Videos are, however, a noisy data source, and finding the right piece of information within a long video tutorial can be frustrating and inefficient.

Meet CodeTube, a novel search engine that analyzes and fragments the contents of videos, offering developers the ability to find only the information they need within otherwise long tutorials. CodeTube extracts and indexes the audio transcript, as well as the text appearing on screen in the video tutorials, including source code, something no other video search engine currently offers to developers. Furthermore, the indexed text is used to retrieve related posts from Stack Overflow, displaying them below the video fragment and thus integrating in one place different sources of information.
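To make the idea of indexing both text sources concrete, here is a minimal sketch in Python. It is not CodeTube's implementation: the `Fragment` and `FragmentIndex` names, the tokenizer, and the TF-IDF with cosine similarity ranking are all assumptions chosen for illustration. It only shows how a fragment's audio transcript and on-screen text could be combined into one searchable document.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical video fragment: a time span plus its two text sources."""
    video_id: str
    start_s: int
    end_s: int
    transcript: str      # what the speaker says
    on_screen_text: str  # text and code recovered from the frames

    def text(self) -> str:
        # Both sources are indexed together as one document.
        return f"{self.transcript} {self.on_screen_text}"


def tokenize(text: str) -> list[str]:
    # Lowercase, then keep runs of letters, digits, and underscores.
    return re.findall(r"[a-z0-9_]+", text.lower())


class FragmentIndex:
    """Toy TF-IDF index over fragments (an assumed ranking scheme)."""

    def __init__(self, fragments):
        self.fragments = list(fragments)
        self.term_counts = [Counter(tokenize(f.text())) for f in self.fragments]
        n = len(self.fragments)
        doc_freq = Counter(t for counts in self.term_counts for t in counts)
        # Smoothed inverse document frequency.
        self.idf = {t: math.log((1 + n) / (1 + df)) + 1.0
                    for t, df in doc_freq.items()}

    def _vector(self, counts):
        # Terms unknown to the index are ignored.
        return {t: c * self.idf[t] for t, c in counts.items() if t in self.idf}

    @staticmethod
    def _cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = (math.sqrt(sum(w * w for w in a.values()))
                * math.sqrt(sum(w * w for w in b.values())))
        return dot / norm if norm else 0.0

    def search(self, query, top_k=3):
        """Return up to top_k (score, fragment) pairs with a non-zero score."""
        q = self._vector(Counter(tokenize(query)))
        scored = [(self._cosine(q, self._vector(c)), f)
                  for c, f in zip(self.term_counts, self.fragments)]
        scored = [(s, f) for s, f in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]
```

With such an index, a query like `index.search("startActivity intent")` would return only the fragments whose spoken or on-screen text mentions those terms, each with its start and end time. The same combined text could also serve as a query against a separate collection of Stack Overflow posts.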

Actually, CodeTube does much more than that. Here are some of its main features:
  • It mines video tutorials found on the web, enabling developers to query their contents;
  • It splits video tutorials into cohesive and self-contained video fragments;
  • It returns only relevant video fragments in response to a developer’s query, ignoring the irrelevant parts that may occur in lengthy videos;
  • It extracts and indexes the source code and English text that appear on the screen, as well as the audio transcripts, using a combination of text analysis and image processing;
  • It recommends relevant Stack Overflow discussions for the video fragments selected by developers; and
  • It recommends video fragments related to the one selected.
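The notion of a "cohesive" fragment can be illustrated with a small Python sketch. This post does not describe how CodeTube actually segments videos, so everything here is an assumption: the `group_into_fragments` function, the `(start_s, end_s, text)` segment format, the Jaccard similarity measure, and the 0.2 threshold are hypothetical. The sketch uses text only, whereas CodeTube also relies on image processing.

```python
import re


def words(text: str) -> set[str]:
    # Vocabulary of a piece of text: lowercase alphanumeric tokens.
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    # Size of the shared vocabulary relative to the combined vocabulary.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_into_fragments(segments, threshold=0.2):
    """Merge consecutive (start_s, end_s, text) segments into fragments.

    A new fragment starts whenever the next segment shares too little
    vocabulary with the fragment built so far (assumed heuristic).
    """
    fragments = []
    for start, end, text in segments:
        vocab = words(text)
        if fragments and jaccard(fragments[-1]["vocab"], vocab) >= threshold:
            current = fragments[-1]
            current["end_s"] = end
            current["text"] += " " + text
            current["vocab"] |= vocab
        else:
            fragments.append(
                {"start_s": start, "end_s": end, "text": text, "vocab": vocab}
            )
    return fragments
```

Given caption-sized segments, consecutive segments about the same topic are merged and a topic change opens a new fragment. A more realistic version would at least remove stop words, since common words such as "the" inflate the overlap between unrelated segments.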

CodeTube has been evaluated in two studies involving developers. In the first study, 34 developers evaluated (i) the coherence and conciseness of the video fragments produced by CodeTube, as well as their relevance to a query, compared to the results returned by YouTube, and (ii) the relevance and complementarity of the Stack Overflow discussions returned by CodeTube for specific video fragments. In the second study, CodeTube was introduced to leading developers involved in the development of Android apps. They were asked about the usefulness of CodeTube, focusing on the value of extracting fragments from video tutorials and of providing recommendations that combine different sources of information. The results of both studies indicate that developers consider CodeTube a useful tool with great potential to help them in their daily tasks.

CodeTube is available online, and you can try it for yourself at:

The current dataset focuses on Android tutorials, but the researchers aim to include other topics in future work, so stay tuned!


[1] L. Ponzanelli, G. Bavota, A. Mocci, M. Di Penta, R. Oliveto, M. Hasan, B. Russo, S. Haiduc, and M. Lanza, “Too Long; Didn’t Watch! Extracting Relevant Fragments from Software Development Video Tutorials,” in International Conference on Software Engineering, Austin, Texas, USA, 2016. Preprint available at:

[2] L. Ponzanelli, G. Bavota, A. Mocci, M. Di Penta, R. Oliveto, B. Russo, S. Haiduc, and M. Lanza, “CodeTube: Extracting Relevant Fragments from Software Development Video Tutorials,” in Proceedings of ICSE 2016 (38th ACM/IEEE International Conference on Software Engineering), 2016. Preprint available at: