If you look for an app on the Google Play store, you’ll commonly find a link to a legal document disclosing the private information that is accessed or collected through the app. Perhaps the biggest hindrance to understanding and analyzing these privacy policies is their lack of a canonical format. Privacy policies come in all lengths and levels of detail, yet under United States law, they must all give the end user enough information to make an informed decision about the app’s access to their private information.
Sensors and Code
As mentioned above, mobile devices often provide access to various sensors including GPS, Bluetooth, cameras, networking devices, and many others. In order for an app’s code to access data from these sensors, it must invoke methods from an application programming interface (API). For the Android operating system, accessing this API is as simple as invoking the appropriate methods, such as android.location.LocationManager.getLastKnownLocation(), directly in the app’s code. It is these invocations that need to align with the app’s privacy policy for the two to be consistent.
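To make the idea of "invocations in the app's code" concrete, here is a minimal sketch that looks for calls to a few sensor-related API methods. It assumes the app has already been disassembled into smali text, and it uses plain pattern matching rather than the static analysis a real tool would need. The watch list, the function name find_sensitive_invocations, and the sample input are all illustrative assumptions, not part of the work described here.

```python
import re

# Illustrative watch list (an assumption, not the authors' list): real Android
# API methods that return sensor-derived or device-identifying data.
SENSITIVE_METHODS = {
    "android.location.LocationManager.getLastKnownLocation",
    "android.bluetooth.BluetoothAdapter.getAddress",
    "android.net.wifi.WifiInfo.getMacAddress",
}

# Matches the target of a smali invoke instruction, for example:
#   invoke-virtual {v0, v1}, Landroid/location/LocationManager;->getLastKnownLocation(...)
INVOKE_RE = re.compile(r"invoke-\S+\s+\{[^}]*\},\s+L([\w/$]+);->([\w$<>]+)\(")


def find_sensitive_invocations(smali_text):
    """Return the sorted watch-list methods invoked in the given smali text."""
    found = set()
    for match in INVOKE_RE.finditer(smali_text):
        # Convert the slash-separated class descriptor to a dotted class name.
        class_name = match.group(1).replace("/", ".")
        method = f"{class_name}.{match.group(2)}"
        if method in SENSITIVE_METHODS:
            found.add(method)
    return sorted(found)


# Hypothetical two-line excerpt of disassembled app code.
SAMPLE = """
invoke-virtual {v0, v1}, Landroid/location/LocationManager;->getLastKnownLocation(Ljava/lang/String;)Landroid/location/Location;
invoke-virtual {v2}, Ljava/lang/StringBuilder;->toString()Ljava/lang/String;
"""

print(find_sensitive_invocations(SAMPLE))
```

A text scan like this only shows that a call appears somewhere in the code. It cannot tell whether the call is reachable or where the returned data ends up, which is why taint analysis tools exist for that purpose.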
Bridging the Gap
For our approach, we created associations between the API methods used for accessing private data and the natural language used in privacy policies to describe that data.
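The association idea can be sketched in a few lines. The example below assumes a small hand-written table that links API methods to phrases a policy might use for the data they return, and it flags any invoked method whose phrases never appear in the policy text. The table entries, the function name find_undisclosed_methods, and the sample policy are hypothetical and far simpler than the associations described above.

```python
# Hypothetical mapping from API methods to phrases a policy might use for the
# data they return. The entries are illustrative, not the authors' mapping.
API_TO_PHRASES = {
    "android.location.LocationManager.getLastKnownLocation": [
        "location", "gps", "geographic",
    ],
    "android.bluetooth.BluetoothAdapter.getAddress": [
        "bluetooth", "device identifier",
    ],
}


def find_undisclosed_methods(invoked_methods, policy_text):
    """Return invoked methods whose associated phrases never appear in the policy."""
    policy = policy_text.lower()
    undisclosed = []
    for method in invoked_methods:
        phrases = API_TO_PHRASES.get(method)
        if phrases is None:
            continue  # no association known, so there is nothing to check
        if not any(phrase in policy for phrase in phrases):
            undisclosed.append(method)
    return undisclosed


# Hypothetical policy text and list of methods the app invokes.
policy = "We collect your approximate location to show nearby stores."
invoked = [
    "android.location.LocationManager.getLastKnownLocation",
    "android.bluetooth.BluetoothAdapter.getAddress",
]
print(find_undisclosed_methods(invoked, policy))
```

Substring matching is deliberately naive here: it misses synonyms and paraphrases and can match unrelated words, so a practical checker would need a richer vocabulary of policy terms than a flat phrase list.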
Using the above technique, we discovered 341 violations among the top 477 Android applications. We believe this points to the lack of a policy verification system for developers and end users alike.
Implications for Developers
Based on our results, we believe that this information and framework can help developers ensure that their apps are consistent with their own privacy policies. To this end, we are extending our work with an IDE plugin to aid developers in consistency verification as well as a web-based tool for checking compiled apps against their policies. We believe that such tools could be especially valuable to smaller development teams that may not have the legal resources available to more established development firms. Ultimately, access to such tools could lead not only to a better development experience but also to a better product for the end user.
S. Arzt, S. Rasthofer, C. Fritz, E. Bodden, A. Bartel, J. Klein, Y. Le Traon, D. Octeau, and P. McDaniel. FlowDroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps. In 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, 2014.
S. Rasthofer, S. Arzt, and E. Bodden. A machine-learning approach for classifying and categorizing Android sources and sinks. In Network and Distributed System Security Symposium, 2014.
J. R. Reidenberg, T. D. Breaux, L. F. Cranor, B. French, A. Grannis, J. T. Graves, F. Liu, A. M. McDonald, T. B. Norton, R. Ramanath, et al. Disagreeable privacy policies: Mismatches between meaning and users’ understanding. Berkeley Tech. LJ 30 (2014): 39.