Gas station sites pose risks of soil and groundwater contamination, which threaten public health and property and may also damage the assets and reputation of businesses and government entities. Given the complex nature of soil and groundwater contamination at gas station sites, this study uses field data covering basic site and environmental information, tank and pipeline monitoring and maintenance records, and environmental monitoring results to develop machine learning models that predict potential contamination risks and identify high-impact risk factors. Three machine learning models are employed: XGBoost, LightGBM, and Random Forest (RF). To compare their performance in predicting soil and groundwater contamination, several evaluation tools were used, including Receiver Operating Characteristic (ROC) curves, precision-recall curves, and the confusion matrix (CM). The metrics derived from the confusion matrix were: accuracy of 85.1-87.4%, precision of 86.6-88.3%, recall of 83.0-87.2%, and F1 score of 84.8-87.8%. The performance ranking was consistent across all metrics: XGBoost > LightGBM > RF. The areas under the ROC and precision-recall curves were 0.95 (XGBoost), 0.94 (LightGBM), and 0.93 (RF). While all three models demonstrated satisfactory predictive capability, XGBoost performed best on every evaluation metric. This research demonstrates that properly trained machine learning models can serve as effective tools for environmental risk assessment and management. These findings have significant implications for decision-makers in environmental protection, enabling more accurate prediction and control of contamination risks and thereby better protection of ecological systems, public health, and property.
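As a worked illustration of how accuracy, precision, recall, and F1 score follow from a binary confusion matrix, a minimal Python sketch is given here. The counts are hypothetical and are not taken from the study; the names `ConfusionMatrix` and `summarize` are likewise introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """Counts for a binary classifier (contaminated = positive class)."""
    tp: int  # contaminated sites predicted as contaminated
    fp: int  # clean sites predicted as contaminated
    fn: int  # contaminated sites predicted as clean
    tn: int  # clean sites predicted as clean


def summarize(cm: ConfusionMatrix) -> dict:
    """Derive the four standard metrics from the confusion-matrix counts."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    precision = cm.tp / (cm.tp + cm.fp) if (cm.tp + cm.fp) else 0.0
    recall = cm.tp / (cm.tp + cm.fn) if (cm.tp + cm.fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for 200 sites, chosen only for illustration.
example = ConfusionMatrix(tp=80, fp=10, fn=20, tn=90)
metrics = summarize(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because precision penalizes false alarms and recall penalizes missed contaminated sites, reporting both alongside F1 gives a fuller picture than accuracy alone when the two error types carry different costs.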
This study investigates the application of machine learning (ML) algorithms to seismic damage classification of bridges supported by helical pile foundations in cohesive soils. While ML techniques have shown strong potential in seismic risk modeling, most prior research has focused on regression tasks or on damage classification of overall bridge systems. The seismic behavior of foundation elements, particularly helical piles, remains unexplored. In this study, numerical data derived from finite element simulations are used to classify damage states for three key metrics: pier drift, pile ductility factor, and pile settlement ratio. Several ML algorithms, including CatBoost, LightGBM, Random Forest, and traditional classifiers, are evaluated on the original, oversampled, and undersampled datasets. Results show that CatBoost and LightGBM outperform the other methods in accuracy and robustness, particularly under imbalanced data conditions. Oversampling improves classification for some targets but introduces a risk of overfitting for others, while undersampling generally degrades model performance. This work addresses a significant gap in bridge risk assessment by combining advanced ML methods with a specialized foundation type, contributing to improved post-earthquake damage evaluation and infrastructure resilience.
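The difference between the oversampled and undersampled datasets can be sketched in a few lines of Python. This example assumes simple random resampling, which may not be the technique used in the study (the abstract does not name one), and the damage-state labels and function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def group_by_label(samples, labels):
    """Collect samples into one list per class label."""
    groups = defaultdict(list)
    for sample, label in zip(samples, labels):
        groups[label].append(sample)
    return groups


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples (with replacement) up to the majority size."""
    rng = random.Random(seed)
    groups = group_by_label(samples, labels)
    target = max(len(items) for items in groups.values())
    out_x, out_y = [], []
    for label, items in groups.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_x.extend(items + extra)
        out_y.extend([label] * target)
    return out_x, out_y


def random_undersample(samples, labels, seed=0):
    """Discard majority-class samples (without replacement) down to the minority size."""
    rng = random.Random(seed)
    groups = group_by_label(samples, labels)
    target = min(len(items) for items in groups.values())
    out_x, out_y = [], []
    for label, items in groups.items():
        out_x.extend(rng.sample(items, target))
        out_y.extend([label] * target)
    return out_x, out_y


# Hypothetical imbalanced damage-state labels for 12 simulated cases.
labels = ["none"] * 8 + ["moderate"] * 3 + ["severe"] * 1
samples = list(range(len(labels)))  # stand-ins for feature vectors

over_x, over_y = random_oversample(samples, labels)
under_x, under_y = random_undersample(samples, labels)
print("original:    ", Counter(labels))
print("oversampled: ", Counter(over_y))
print("undersampled:", Counter(under_y))
```

The sketch makes the trade-off visible: oversampling repeats the few minority cases, which can encourage a model to memorize them, while undersampling throws away most of the majority data. In practice, resampling is normally applied to the training split only, so that the test data keep their original class distribution.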