Realistic Trajectory Generation using Simple Probabilistic Language Models
Hayat Sultan, Mario Nascimento
Trajectory data, sourced from GPS-enabled devices such as smart vehicles and smartphones, offers valuable insights into human movement patterns across various modes of transportation. However, large datasets of this kind are rarely available for testing and benchmarking tools and solutions. Drawing on similarities between trajectories in mobility data and sentences in natural language, we explore the use of probabilistic language models to generate arbitrarily large realistic trajectories, treating sequences of GPS points as sequences of tokens. Our experiments show that, from a small sample of real taxi trajectories, the proposed approach can generate a diverse set of synthetic trajectories that closely follows the distribution of the original sample.
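To make the token analogy concrete, the following is a minimal sketch of one way such a model could look. It is not the authors' implementation: the abstract does not state how GPS points are tokenized or which model order is used, so the uniform grid discretization (`to_token`, `cell_deg`), the bigram order, and all function names and sample coordinates here are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter, defaultdict

# Sentence boundary markers, as in a word-level bigram model.
START, END = "<s>", "</s>"


def to_token(lat, lon, cell_deg=0.01):
    """Map a GPS point to a grid-cell token (assumed discretization)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def train_bigram(trajectories):
    """Count token-to-token transitions over all tokenized trajectories."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        tokens = [START] + list(traj) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_trajectory(counts, rng, max_len=200):
    """Generate one token sequence by sampling successors from the counts."""
    out, current = [], START
    while len(out) < max_len:
        successors = counts[current]
        # Sample the next token in proportion to its observed frequency.
        current = rng.choices(list(successors), weights=list(successors.values()), k=1)[0]
        if current == END:
            break
        out.append(current)
    return out


# Made-up coordinates standing in for a small sample of real trajectories.
raw = [
    [(41.151, -8.611), (41.152, -8.622), (41.163, -8.623)],
    [(41.151, -8.611), (41.162, -8.612), (41.163, -8.623)],
    [(41.152, -8.622), (41.163, -8.623), (41.174, -8.634)],
]
corpus = [[to_token(lat, lon) for lat, lon in traj] for traj in raw]
model = train_bigram(corpus)

rng = random.Random(0)  # fixed seed so runs are repeatable
synthetic = [sample_trajectory(model, rng) for _ in range(5)]
for traj in synthetic:
    print(traj)
```

Because every generated step is drawn from transitions observed in the sample, each synthetic sequence is built only from cell-to-cell moves that occur in the real data, while the sequences themselves can differ from any single input trajectory. Mapping cell tokens back to coordinates (for example, to cell centers) is left out of this sketch.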