Machine Learning-Driven QSAR Modeling for Predicting Short-Term Exposure Limits of Hydrocarbons and Their Derivatives

Processes 2025
Wei Zhao, Jingjie Shi, Cheng Wang, Linli Ni, Xiongjun Yuan

Summary

Researchers developed machine learning-based QSAR models to predict short-term exposure limits (STELs) for hydrocarbons and their derivatives, addressing a gap in occupational health data for many chemicals. The best-performing model, XGBoost, achieved R², Q²_LOO, and Q²_ext values above 0.9, offering a faster alternative to experimentally determining STELs for new compounds.

Body Systems

The scarcity of reliably determined STELs for numerous chemicals severely impedes occupational health risk assessment. To address this gap, this study establishes and validates a suite of quantitative structure–activity relationship (QSAR) models to efficiently predict STELs for hydrocarbons and their derivatives. A dataset of 60 compounds was partitioned using Affinity Propagation clustering, and the validity of this division was verified using Tanimoto similarity analysis and Uniform Manifold Approximation and Projection (UMAP). Four optimal molecular descriptors, indicative of molecular size and spatial configuration, were identified using a genetic algorithm. These descriptors served as inputs for one linear model, multiple linear regression (MLR), and three nonlinear models: support vector machine (SVM), back-propagation artificial neural network (BP-ANN), and extreme gradient boosting (XGBoost). All models were validated according to OECD principles. The XGBoost model achieved the best performance, with key metrics (R², Q²_LOO, Q²_ext) all exceeding 0.9. Interpretability analysis using SHAP (SHapley Additive exPlanations) revealed that molecular size and symmetry descriptors (E3u, G2m) positively correlate with STEL, while the degree of unsaturation (n = CHR) shows a significant negative influence, providing mechanistic insight into the structure–toxicity relationship. Notably, 96% of the predictions fell within the defined applicability domain, supporting the model’s reliability. This study therefore provides a rapid, accurate, interpretable, and reliable computational tool, with the potential to inform and enhance occupational health and safety decision-making, especially for novel or data-poor chemicals.
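
The abstract reports three validation metrics (R², Q²_LOO, Q²_ext) without giving their formulas. The following is a minimal sketch of how such metrics are commonly computed in QSAR work, not the authors' implementation. It makes several assumptions: a single hypothetical descriptor and a simple least-squares line stand in for the paper's four descriptors and four model types, the toy numbers are invented for illustration, and Q²_ext is taken to be the variant that uses the training-set mean in the denominator (often written Q²_F1), since the abstract does not say which variant was used.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (one descriptor, illustrative only)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def predict(model, xs):
    a, b = model
    return [a + b * x for x in xs]


def r2(xs, ys):
    """Goodness of fit on the training set: 1 - SS_res / SS_tot."""
    preds = predict(fit_line(xs, ys), xs)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean(ys)) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q2_loo(xs, ys):
    """Leave-one-out cross-validation: refit without compound i, then predict it."""
    press = 0.0
    for i in range(len(xs)):
        model = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - predict(model, [xs[i]])[0]) ** 2
    ss_tot = sum((y - mean(ys)) ** 2 for y in ys)
    return 1 - press / ss_tot


def q2_ext(x_train, y_train, x_test, y_test):
    """External validation (assumed Q2_F1 form: training mean in the denominator)."""
    preds = predict(fit_line(x_train, y_train), x_test)
    y_bar_train = mean(y_train)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_test, preds))
    ss_tot = sum((y - y_bar_train) ** 2 for y in y_test)
    return 1 - ss_res / ss_tot


# Hypothetical toy data: one descriptor value per compound and a response value.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_train = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
x_test = [2.5, 4.5]
y_test = [2.4, 4.6]

r2_train = r2(x_train, y_train)
q2_loo_val = q2_loo(x_train, y_train)
q2_ext_val = q2_ext(x_train, y_train, x_test, y_test)
print(f"R2={r2_train:.3f}  Q2_LOO={q2_loo_val:.3f}  Q2_ext={q2_ext_val:.3f}")
```

For a least-squares fit, Q²_LOO can never exceed the training R², because each left-out compound is predicted by a model that has not seen it. A large gap between the two is a common sign of overfitting, which is why the abstract reports all three metrics together.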
