Background and Aims
Soil salinization is a major cause of land degradation and ecological damage. Traditional soil salinity monitoring techniques are limited in coverage and scalability, whereas remote sensing offers broader applicability and greater efficiency. This study examines spatiotemporal variations in soil salt content (SSC) inversion across crop types in Tongliao City, Inner Mongolia, China, integrating multi-temporal data with crop cover types to improve remote sensing monitoring accuracy.

Methods
Field sampling data and Sentinel-2 images from June to September of 2021 and 2022 were used. A deep learning U-net model classified the key crops, including sunflower (33%), beet (12%), and maize (55%), and the effects of crop coverage on SSC were analyzed across multiple time series. Six spectral variables were selected using the SVR-RFE model (R2 = 0.994, MAE = 0.016). SSC prediction models were developed using three machine learning methods (DBO-RF, PSO-SVM, BO-BP) and one deep learning method (Transformer).

Results
Accounting for variations in crop coverage increased the sensitivity of the spectral variables to SSC, improving predictive accuracy and model stability. Analysis by crop type showed that the salinity indices (SIs) correlated more strongly with SSC than the vegetation indices (VIs), with SI6 having the highest correlation coefficient (0.50). The Transformer model, using multi-time-series data, outperformed the other algorithms, achieving an average R2 of 0.71. The SSC inversion map produced by the Transformer model closely matched the trends observed in the field survey.

Conclusion
This research provides a novel approach to soil salinity prediction using satellite remote sensing, offering a scalable solution for monitoring salinization and valuable insights for environmental management and agricultural planning.
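
For readers less familiar with the two evaluation metrics quoted above (R2 and MAE), the following is a minimal sketch of how they are conventionally computed. It uses the standard textbook definitions and only the Python standard library; the function names and the sample values are hypothetical illustrations and are not taken from the study's data or code.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    # Residual sum of squares: squared errors of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: squared deviations from the observed mean.
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical SSC values for illustration only (not from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(round(r_squared(observed, predicted), 3))            # 0.9
print(round(mean_absolute_error(observed, predicted), 3))  # 0.25
```

An R2 close to 1 indicates that the predictions explain most of the variance in the observations, while MAE is expressed in the same units as SSC, which is why the two are usually reported together.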