
Classification of recycled plastics using sparse and imbalanced spectral data and data augmentation by the generative adversarial network

Environmental Research and Planetary Health 2026
Xuan Liu, Xuerui Song, Yusuf Sulub, David Zoller, Zhenyu Kong, Blake N. Johnson

Summary

Using GAN-generated synthetic FTIR spectra to balance training datasets, researchers achieved a balanced accuracy of 96.2% in classifying six recycled polymer types, substantially outperforming models trained on imbalanced data alone. Rapid, accurate identification of plastic types is essential for improving recycling rates and for reducing the quantity of misidentified plastics that end up fragmenting into environmental microplastics.

Accurate identification of post-consumer plastics is essential to establishing high-performance recycling processes and to enabling a circular economy and a sustainable environment through effective recycling and remanufacturing. However, Fourier transform infrared (FTIR) spectra of recycled materials often exhibit noise, baseline shifts, and overlapping signatures from additives or contaminants, resulting in datasets that are both sparse and severely imbalanced. This data complexity, sparsity, and class imbalance can degrade conventional machine-learning classifiers, leading to higher misclassification rates. To address these challenges, we investigated whether data augmentation using generative adversarial networks (GANs) could enhance polymer classification performance. We implemented a GAN framework that integrates adversarial training with a classifier-guided feedback loop to synthesize realistic, class-discriminative FTIR spectra for six commonly recycled polymers: polyethylene (PE), polypropylene (PP), polystyrene (PS), polycarbonate (PC), polyethylene terephthalate (PET), and acrylonitrile butadiene styrene (ABS). We then trained multilayer perceptron classifiers on datasets with varying ratios of synthetic data. The optimal balanced accuracy of 96.2% was achieved when synthetic spectra accounted for 50% of the training set, whereas including more than 90% synthetic data degraded generalization. Synthetic data augmentation using a GAN with the optimal augmentation ratio improved ABS classification accuracy, precision, and recall by 43%, 50%, and 33%, respectively, compared with no augmentation and replicate experimental measurements. These results demonstrate that GAN-based data augmentation can effectively mitigate data sparsity and class imbalance in spectral classification of common plastics, providing a practical foundation for creating robust online polymer classification systems.
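
Two quantities in the abstract are easy to misread: the share of synthetic spectra in the training set, and balanced accuracy as opposed to plain accuracy. The Python sketch below illustrates both under stated assumptions. It is not the authors' code; the function names `synthetic_count` and `balanced_accuracy` and the toy labels are hypothetical, and balanced accuracy is taken in its usual sense as the mean of per-class recall.

```python
from collections import Counter


def synthetic_count(n_real: int, synthetic_fraction: float) -> int:
    """Number of synthetic spectra to add so that they make up
    `synthetic_fraction` of the combined (real + synthetic) training set."""
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    # Solve n_syn / (n_real + n_syn) = f for n_syn.
    return round(n_real * synthetic_fraction / (1.0 - synthetic_fraction))


def balanced_accuracy(y_true, y_pred) -> float:
    """Mean of per-class recall, so each class counts equally
    regardless of how many samples it has."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy example (hypothetical labels): a classifier that always predicts
# the majority class looks fine on plain accuracy but not on balanced accuracy.
y_true = ["PE", "PE", "PE", "ABS"]
y_pred = ["PE", "PE", "PE", "PE"]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 0.75
balanced = balanced_accuracy(y_true, y_pred)                       # 0.5

# A 50% synthetic share means one synthetic spectrum per real spectrum.
n_to_generate = synthetic_count(n_real=100, synthetic_fraction=0.5)  # 100
```

Under this reading, a 90% synthetic share would require nine synthetic spectra for every real one, which gives a sense of how little experimental data anchors the classifier in the regime where the abstract reports degraded generalization.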
