TRIDENT: Cross-Domain Trajectory Spatio-Temporal Representation via Distance-Preserving Triplet Learning
Abstract
We present the TRIplet-based Distance-preserving Embedding Network for Trajectories (TRIDENT), a spatio-temporal representation framework for compressing and retrieving trajectories across scales, from badminton courts to large-scale urban environments. Existing methods often assume smooth, continuous motion, whereas real trajectories exhibit event-driven annotation, abrupt direction changes, GPS errors, irregular sampling, and domain shifts; under these conditions, prior models are inefficient, generalize poorly, and fail to robustly integrate temporal order with spatial sequence structure. TRIDENT addresses these challenges by combining GCN-based spatial embeddings with temporal features in a Dual-Attention Encoder (DAEncoder), together with a Nonlinear Tanh-Projection Attention Pooling (NTAP) module that preserves local order and remains robust under noise. For metric learning, we introduce a Distance-preserving Multi-kernel Triplet loss (DMT) that preserves both the pairwise spatio-temporal distances of the native feature space and their rank order within the embedding, thereby reducing geometric distortion and improving cross-domain generalization. Experiments on urban-mobility and badminton datasets show that TRIDENT outperforms strong baselines in retrieval accuracy, efficiency, and cross-domain generalization. Moreover, the learned embeddings capture spatio-temporal sequence patterns, enabling tactical analysis of badminton rallies via silhouette-guided spectral clustering that yields more actionable insights than direct trajectory classification. An anonymized repository with code and data is provided in the supplement.