Identification of Polymers with a Small Data Set of Mid-infrared Spectra: A Comparison between Machine Learning and Deep Learning Models

Xin Tian; Frederic Béen; Yiqun Sun; P. van Thienen; Patrick S. Bäuerlein

doi:10.1021/acs.estlett.2c00949

Environmental Science & Technology Letters, 2023. 19 citations (OpenAlex count).

Summary

Researchers compared multiple machine learning and deep learning models for identifying polymer types from mid-infrared spectra across different data set sizes, including small ones. An ensemble ML model achieved the best performance (99.5% classification accuracy) with the least training time compared with neural network models. The study also proposes a data augmentation technique for cases where only a few samples are available, supporting automated microplastic identification.

Identifying environmental polymers and microplastics is crucial for the scientific world, environmental agencies, and water authorities to estimate their environmental impact and increase efforts to decrease emissions. On the basis of different spectroscopy techniques, e.g., laser-directed infrared imaging and Raman spectroscopy, polymers can be observed and represented as spectroscopic signals. These signals can be further analyzed and classified with data science methods, in particular machine learning (ML). Past studies applied a variety of ML models to identify polymers from small or large data sets. However, a comprehensive comparison of multiple models across different data set sizes is still needed, and this study presents one. Furthermore, we provide a practical data augmentation technique to generate synthetic samples when only a limited number of samples are available. Our results show that the ensemble ML model, compared to neural network models, takes the least training time to achieve the best performance, i.e., a classification accuracy of 99.5%. This study provides a generic framework for selecting ML models and boosting model performance to accurately identify polymers.
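
The abstract does not describe how the synthetic samples are generated, so the following is only a minimal sketch of one common way to augment a small set of spectra, not the authors' method. It assumes each spectrum is a plain list of absorbance values on a shared wavenumber grid, and it combines three generic perturbations: a small shift along the wavenumber axis, a global intensity scaling, and additive Gaussian noise. The function names, parameter values, and the toy labels "PE" and "PS" are hypothetical and chosen purely for illustration.

```python
import random


def augment_spectrum(spectrum, rng, noise_sd=0.01, max_shift=2,
                     scale_range=(0.9, 1.1)):
    """Return one synthetic variant of a spectrum (list of absorbance values).

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    n = len(spectrum)
    shift = rng.randint(-max_shift, max_shift)  # shift in grid points
    scale = rng.uniform(*scale_range)           # global intensity scaling
    variant = []
    for i in range(n):
        # Clamp the source index so edge values are repeated at the borders.
        j = min(max(i - shift, 0), n - 1)
        variant.append(scale * spectrum[j] + rng.gauss(0.0, noise_sd))
    return variant


def augment_dataset(spectra, labels, copies, seed=0):
    """Keep the original samples and append `copies` variants of each one."""
    rng = random.Random(seed)  # seeded for reproducibility
    aug_x, aug_y = list(spectra), list(labels)
    for x, y in zip(spectra, labels):
        for _ in range(copies):
            aug_x.append(augment_spectrum(x, rng))
            aug_y.append(y)  # a variant keeps the label of its source spectrum
    return aug_x, aug_y


# Toy data: two made-up six-point "spectra" with hypothetical polymer labels.
spectra = [[0.0, 0.2, 0.9, 0.3, 0.0, 0.1],
           [0.1, 0.0, 0.1, 0.8, 0.4, 0.0]]
labels = ["PE", "PS"]

aug_x, aug_y = augment_dataset(spectra, labels, copies=5)
print(len(aug_x), aug_y.count("PE"), aug_y.count("PS"))
```

The augmented lists can then be passed to whichever classifier is being compared. In practice the perturbation magnitudes would need to be tuned to the instrument and spectral resolution, and augmentation should be applied only to the training split so that synthetic copies of a sample do not leak into the test set.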
