App Usage Behavior Modeling and Prediction


Research Directions and Publications

(1) Usage Pattern Recognition and Understanding

    The goal is to understand how users use Apps. We have observed that the time interval between consecutive App usage records and the total number of records per user both follow a power law (Huang et al., MASS 2017). We have detected 3963 frequent App-sets with a support of 35%. Among these, 500 App-sets are used by more than 3 percent of all mobile users (Huang et al., MASS 2017). We further find that 88% of users who have used more than 10 Apps can be uniquely re-identified by 4 random Apps (Tu et al., Ubicomp 2018), which indicates that App usage is highly unique. For city-wide App usage understanding, we have analyzed the temporal App traffic patterns of base stations clustered by PoIs (Zeng et al., IEEE Wireless Communications 2018).
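
    The uniqueness measurement above can be illustrated with a minimal sketch. It is not the procedure from Tu et al.; it assumes each user is represented by the set of Apps they used, draws k Apps at random for each eligible user, and counts the user as unique when no other user's App set contains all k. The function name, the parameters, and the choice of all users as the candidate pool are assumptions made for illustration.

```python
import random


def uniqueness(user_apps, k=4, min_apps=10, seed=0):
    """Fraction of eligible users pinned down by k of their Apps chosen at random.

    user_apps: dict mapping a user ID to the set of App IDs that user has used.
    A user is eligible when they used more than `min_apps` Apps (and at least k).
    Hypothetical helper for illustration; not the method of the cited paper.
    """
    rng = random.Random(seed)  # seeded so that repeated runs give the same result
    eligible = [u for u, apps in user_apps.items()
                if len(apps) > min_apps and len(apps) >= k]
    if not eligible:
        return 0.0

    unique = 0
    for user in eligible:
        # Sort first: random.sample needs a sequence, and sorting keeps runs reproducible.
        sample = set(rng.sample(sorted(user_apps[user]), k))
        # Count every user (including this one) whose App set contains the sample.
        matches = sum(1 for apps in user_apps.values() if sample <= apps)
        if matches == 1:
            unique += 1
    return unique / len(eligible)
```

    On real data the result depends on the random draw, so averaging over several seeds gives a more stable estimate.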

(2) App Usage Modeling and Prediction

    Having examined the App usage of a large user population, we further investigate models for App designers and service providers. Since App usage is highly related to the attributes of locations, we utilized easy-to-collect PoIs to predict the difficult-to-collect regional App usage (Yu et al., Ubicomp 2018). For personal App usage prediction, we present CAP, which takes both contextual information (location & time) and attribution (App with type information) into consideration by building a heterogeneous graph embedding algorithm that maps App, location and time into a common, comparable latent space (Chen et al., Ubicomp 2019). We also proposed a novel hierarchical Dirichlet process mixture model for large-scale users' App usage prediction (Wang et al., Ubicomp 2019).

(3) App Dataset Applied in Other Domains (Urban Computing, App Recomm., etc.)

    The App dataset is useful not only for App usage modeling but also in other, related domains. For human health, we investigated mobile fitness App usage and found that time and location are key factors affecting workout activities (Chen et al., IEEE Communications Magazine 2018). For location recommendation, we proposed a novel generative model that transfers user interests from online App usage behavior to offline location preferences (Tu et al., Ubicomp 2019). For urban computing, we combined the online behavior of App usage with the offline mobility behavior of different regions to detect dynamic functions (Xia et al., Ubicomp 2019), and we also revealed the network traffic consumption patterns of Apps in regions with different functions (Wang et al., ACM IMC 2015; Zhang et al., IEEE Transactions on Big Data 2017).


Open Dataset and Description

(1) Dataset information

    In this App usage dataset, each entry contains an anonymized user identifier, the timestamps of the HTTP request or response, the packet length, the domain visited, and the user-agent field. We identify Apps from the networking metadata by adopting SAMPLES, and crawl their categories from Android Market and Google Play. We also provide the distribution of PoIs (Points of Interest) under each base station.

    Dataset statistics
    Duration:                   One week
    City:                       One of the biggest cities in China
    Number of identified Apps:  2000
    Number of users:            1000
    Number of base stations:    9800
    Number of App categories:   20
    Number of PoI categories:   17

(2) Files and Description

    Dataset Files          Description
    App_Usage_Trace.txt    User ID||Timestamp (Second)||Location (base station ID)||Used App ID||Traffic (Byte)
    App2Category.txt       App ID||Category ID
    Categories.txt         Category ID||English Name
    base_poi.txt           Base station ID||Number of PoIs for each category
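
    As a rough illustration of the trace layout listed above, the sketch below parses records with the five fields of App_Usage_Trace.txt. The `||` separator is taken from the field description and may not match the delimiter used in the released file, so it is left as a parameter. The timestamp is kept as a string because its encoding is not specified here, and reading the traffic as an integer number of bytes is also an assumption. `UsageRecord` and `parse_trace` are hypothetical names.

```python
from typing import Iterable, Iterator, NamedTuple


class UsageRecord(NamedTuple):
    user_id: str
    timestamp: str        # kept as text; the exact timestamp encoding is an assumption
    base_station_id: str
    app_id: str
    traffic_bytes: int    # assumes the traffic field is an integer byte count


def parse_trace(lines: Iterable[str], sep: str = "||") -> Iterator[UsageRecord]:
    """Yield one UsageRecord per non-empty line.

    `sep` defaults to the "||" shown in the field description; pass another
    separator (for example "\t") if the released file uses a different one.
    """
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [field.strip() for field in line.split(sep)]
        if len(fields) != 5:
            raise ValueError(f"line {number}: expected 5 fields, got {len(fields)}")
        user, timestamp, station, app, traffic = fields
        yield UsageRecord(user, timestamp, station, app, int(traffic))


def traffic_per_app(records: Iterable[UsageRecord]) -> dict:
    """Sum the traffic (in bytes) generated by each App ID."""
    totals: dict = {}
    for record in records:
        totals[record.app_id] = totals.get(record.app_id, 0) + record.traffic_bytes
    return totals
```

    To read the actual file, pass an open file handle, e.g. `with open("App_Usage_Trace.txt") as f: totals = traffic_per_app(parse_trace(f))`.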

    This dataset is released to the public for research purposes only. Commercial use is strictly prohibited. It is forbidden to attempt discovery of the identities of users. Works that use the dataset must mention the name of the dataset "Tsinghua App Usage Dataset" and cite the following original article:
    Yu, D., Li, Y., Xu, F., Zhang, P., & Kostakos, V. Smartphone app usage prediction using points of interest. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (ACM IMWUT/UbiComp), 1(4), 174, 2018.
    @article{yu2018smartphone,
        title={Smartphone app usage prediction using points of interest},
        author={Yu, Donghan and Li, Yong and Xu, Fengli and Zhang, Pengyu and Kostakos, Vassilis},
        journal={Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies},
        volume={1},
        number={4},
        pages={174},
        year={2018},
        publisher={ACM}}

(3) Another Dataset of Long-term App Usage: Carat

(4) Contact Information

    Email: liyong07 AT Tsinghua.edu.cn


Related Publications

  • Tu, Z., Li, R., Li, Y., Wang, G., Wu, D., Hui, P., ... & Jin, D. Your apps give you away: distinguishing mobile users by their app usage fingerprints, ACM IMWUT/UbiComp, 2018, 2(3), 138.
  • Zeng, M., Lin, T. H., Chen, M., Yan, H., Huang, J., Wu, J., & Li, Y. Temporal-spatial mobile application usage understanding and popularity prediction for edge caching, IEEE Wireless Communications, 2018, 25(3), 36-42.
  • Yu, D., Li, Y., Xu, F., Zhang, P., & Kostakos, V. Smartphone app usage prediction using points of interest, ACM IMWUT/UbiComp, 2018, 1(4), 174.
  • Chen, X., Wang, Y., He, J., Pan, S., Li, Y., & Zhang, P. CAP: Context-aware App Usage Prediction with Heterogeneous Graph Embedding, ACM IMWUT/UbiComp, 2019, 3(1), 4.
  • Wang, H., Li, Y., Zeng, S., Wang, G., Zhang, P., Hui, P., & Jin, D. Modeling Spatio-Temporal App Usage for a Large User Population, ACM IMWUT/UbiComp, 2019, 3(1), 27.
  • Chen, X., Zhu, Z., Chen, M., & Li, Y. Large-scale mobile fitness app usage analysis for smart health, IEEE Communications Magazine, 2018, 56(4), 46-52.
  • Tu, Z., Fan, Y., Li, Y., Chen, X., Su, L., & Jin, D. From Fingerprint to Footprint: Cold-start Location Recommendation by Learning User Interest from App Data, ACM IMWUT/UbiComp, 2019, 3(1), 26.
  • Xia, T., Li, Y. Revealing Urban Dynamics by Learning Online and Offline Behaviours Together, ACM IMWUT/UbiComp, 2019, 3(1), 30.
  • Wang, H., Xu, F., Li, Y., Zhang, P., & Jin, D. Understanding mobile traffic patterns of large scale cellular towers in urban environment, ACM IMC, 2015, pp. 225-238.
  • Zhang, M., Fu, H., Li, Y., & Chen, S. Understanding urban dynamics from massive mobile traffic data, IEEE Transactions on Big Data, 2017.
  • Huang, J., Xu, F., Lin, Y., & Li, Y. On the understanding of interdependency of mobile app usage, IEEE MASS, 2017, pp. 471-475.