
WorldMove provides mobility datasets for over 1600 cities spanning 179 countries across 6 continents

WorldMove is an open-access, worldwide human mobility dataset. We follow a generative AI-based approach to create large-scale mobility data for cities worldwide. Our method leverages publicly available multi-source data, including population distribution, points of interest (POIs), and synthetic commuting origin-destination flow datasets, to generate realistic city-scale mobility trajectories.

Multi-source data collection

We integrate several globally accessible datasets that provide key attributes for every grid cell. Population data come from WorldPop, which provides high-resolution (100 m) population distribution estimates. POI data are sourced from OpenStreetMap (OSM) and categorized by type within the boundaries of each urban region. Location popularity is quantified as the visitation frequency rank, derived from a global high-resolution origin-destination commuting flow dataset. In addition to these features, which relate to trajectory semantics, we assign each location a position in a local coordinate system to help our model learn the spatial relationships between the locations visited in a trajectory.
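As a rough illustration of the last two features, the sketch below derives a popularity rank from per-cell visit counts and projects cell centers into a local coordinate system. The function names, the equirectangular projection, the centroid origin, and the tie-breaking rule are all assumptions made for this example; the text above does not specify how WorldMove defines them.

```python
import math

EARTH_RADIUS_KM = 6371.0


def local_coordinates(cells, origin=None):
    """Project (lat, lon) cell centers to (x, y) offsets in km from an origin.

    Assumption: an equirectangular approximation centered on the mean
    latitude/longitude of the cells, which is reasonable at city scale.
    """
    if origin is None:
        origin = (
            sum(lat for lat, _ in cells) / len(cells),
            sum(lon for _, lon in cells) / len(cells),
        )
    lat0, lon0 = origin
    cos_lat0 = math.cos(math.radians(lat0))
    return [
        (
            EARTH_RADIUS_KM * math.radians(lon - lon0) * cos_lat0,  # east-west
            EARTH_RADIUS_KM * math.radians(lat - lat0),             # north-south
        )
        for lat, lon in cells
    ]


def popularity_rank(visits):
    """Rank cells by visit count: 1 = most visited.

    Assumption: ties are broken by cell index (lower index ranks first).
    """
    order = sorted(range(len(visits)), key=lambda i: (-visits[i], i))
    ranks = [0] * len(visits)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks
```

For example, `popularity_rank([5, 20, 5, 1])` gives `[2, 1, 3, 4]`: the second cell receives the most visits and therefore rank 1.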

Mobility data generation

The multi-source location feature data are processed by a location feature encoder, which compresses the regional characteristics and projects them into a unified embedding space. Building upon the location embeddings, we leverage real-world human mobility data to form a comprehensive mobility dataset that encompasses diverse urban mobility patterns. Our diffusion model is then trained on this unified dataset. During generation, the diffusion model first produces a transition sequence within the embedding space. This embedding sequence is then matched to the target city's location embeddings using a minimum-distance mapping, which yields the final mobility trajectory.
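The minimum-distance mapping in the final step can be sketched as a nearest-neighbor lookup. This is a minimal illustration, not the WorldMove implementation: the function name `map_to_locations`, the dictionary layout, and the choice of Euclidean distance are assumptions, since the distance metric is not stated here.

```python
import math


def map_to_locations(generated_seq, location_embeddings):
    """Map each generated embedding to the id of the nearest location.

    generated_seq: list of embedding vectors produced by the generator.
    location_embeddings: dict of location id -> embedding vector for the
        target city (hypothetical layout for this sketch).
    Assumption: Euclidean distance is used as the matching metric.
    """
    trajectory = []
    for vec in generated_seq:
        nearest_id = min(
            location_embeddings,
            key=lambda loc_id: math.dist(vec, location_embeddings[loc_id]),
        )
        trajectory.append(nearest_id)
    return trajectory


# Toy usage with 2-dimensional embeddings.
city = {"home": (0.0, 0.0), "work": (1.0, 1.0), "park": (5.0, 5.0)}
generated = [(0.1, 0.2), (0.9, 1.2), (4.0, 4.5)]
trajectory = map_to_locations(generated, city)
```

A linear scan like this costs one distance computation per location for every step of the sequence. For cities with many grid cells, a spatial index or an approximate nearest-neighbor search would be a natural replacement.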

Data fidelity

We evaluate the generated mobility patterns for 6 cities across 3 countries to check whether the generated data both resemble real-world data and adhere to fundamental mobility laws. These include the power-law distributions of jump length, radius of gyration, and waiting time; Zipf's law, which characterizes the frequency distribution of visited locations; and flow data, which capture the collective movement patterns of human mobility.
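Several of these statistics can be computed directly from a trajectory. The sketch below assumes a trajectory is a list of planar (x, y) positions in a local coordinate system, plus a parallel list of location ids; the function names and input layout are illustrative rather than part of WorldMove. The radius of gyration follows the usual definition: the root-mean-square distance of the visited points from their center of mass.

```python
import math
from collections import Counter


def jump_lengths(points):
    """Distances between consecutive visited positions."""
    return [math.dist(a, b) for a, b in zip(points, points[1:])]


def radius_of_gyration(points):
    """Root-mean-square distance of the points from their center of mass."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


def rank_frequency(location_ids):
    """Visit frequencies sorted from most to least visited location.

    Under Zipf's law, frequency decays roughly as a power of the rank.
    """
    counts = Counter(location_ids)
    total = sum(counts.values())
    return [count / total for _, count in counts.most_common()]
```

Comparing the distributions of these quantities between generated and real trajectories, for example with a histogram distance, is one way to quantify fidelity. The specific comparison metric is left open here.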