With AndroidTimeMachine we present a graph-based dataset of the commit history of real-world Android apps.
The project provides a dataset of 8,431 real-world open-source Android apps. It combines the source code and commit history available on GitHub with metadata from the Google Play store. The graph representation used to structure the data makes it easier to analyze the relationships between source code and metadata. The dataset is distributed as Docker containers to improve its accessibility and extensibility.
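To illustrate why a graph structure helps here, the following is a minimal in-memory sketch in Python. It is not the dataset's actual schema or access API: the node identifiers, edge labels (`HAS_COMMIT`, `PUBLISHED_AS`), property names, and sample values are all hypothetical and chosen only to show how commit history and store metadata can be reached from the same app node.

```python
from collections import defaultdict

# Hypothetical node identifiers and edge labels, for illustration only;
# they are NOT the dataset's actual schema.
edges = [
    ("app:example", "HAS_COMMIT", "commit:a1"),
    ("app:example", "HAS_COMMIT", "commit:b2"),
    ("app:example", "PUBLISHED_AS", "play:com.example.app"),
]

# Made-up node properties standing in for commit and store metadata.
properties = {
    "commit:a1": {"message": "Fix crash on startup"},
    "commit:b2": {"message": "Add dark theme"},
    "play:com.example.app": {"category": "Tools"},
}

# Adjacency index: (source node, edge label) -> list of target nodes.
adjacency = defaultdict(list)
for source, label, target in edges:
    adjacency[(source, label)].append(target)


def neighbors(node, label):
    """Return the nodes reachable from `node` over edges with `label`."""
    return list(adjacency.get((node, label), []))


def commits_with_store_category(app):
    """Pair each commit message of `app` with its store category."""
    categories = [properties[p]["category"] for p in neighbors(app, "PUBLISHED_AS")]
    messages = [properties[c]["message"] for c in neighbors(app, "HAS_COMMIT")]
    return [(message, category) for category in categories for message in messages]


pairs = commits_with_store_category("app:example")
print(pairs)
```

Because both kinds of information hang off the same app node, relating a commit to store metadata is a two-hop traversal rather than a join across separately keyed tables.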
You can find detailed information on our data collection process and follow the guide to accessing the AndroidTimeMachine data.
AndroidTimeMachine was presented in the Data Showcase track of MSR 2018. Based on this dataset, we also built a classifier for self-reported activities of Android developers.