Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting

Yuqi Li, Chuanguang Yang, Hansheng Zeng, Zeyu Dong, Zhulin An, Yongjun Xu, Yingli Tian, Hao Wu

arXiv:2507.02939·cs.LG·Published 2025-06-27·Updated 2025-07-20

Spatiotemporal forecasting tasks, such as traffic flow, combustion dynamics, and weather forecasting, often require complex models that suffer from low training efficiency and high memory consumption. This paper proposes a lightweight framework, Spectral Decoupled Knowledge Distillation (termed SDKD), which transfers the multi-scale spatiotemporal representations from a complex teacher model to a more efficient lightweight student network. The teacher model follows an encoder-latent evolution-decoder architecture, where its latent evolution module decouples high-frequency details and low-frequency trends using convolution and Transformer (global low-frequency modeler). However, the multi-layer convolution and deconvolution structures result in slow training and high memory usage. To address these issues, we propose a frequency-aligned knowledge distillation strategy, which extracts multi-scale spectral features from the teacher's latent space, including both high and low frequency components, to guide the lightweight student model in capturing both local fine-grained variations and global evolution patterns. Experimental results show that SDKD significantly improves performance, achieving reductions of up to 81.3% in MSE and in MAE 52.3% on the Navier-Stokes equation dataset. The framework effectively captures both high-frequency variations and long-term trends while reducing computational complexity. Our codes are available at https://github.com/itsnotacie/SDKD

TopicsFluid Dynamics & Plasma Physics

Tagsweather-forecasting

arXiv categoriescs.LG, cs.AI, cs.CV

arXiv abstract pagePDF