
Raman spectra for plastics identification (RaSPI) and Raman maps for plastics identification (RaMPI) datasets

International Journal of Integrated Research and Practice 2026
Úna. E. Hogan, H. B. Voss, Benjamin Lei, Avery E. Bec, Xinyi Feng, Rodney D. L. Smith

Summary

Two complementary Raman spectroscopy datasets, comprising 402 high-quality spectra and 34 two-dimensional spectroscopic maps and spanning 14 plastic types, were published to support machine learning development for environmental plastic identification. The datasets are standardized, contain both pristine and environmental pollution samples, and fill a critical resource gap for training and validating next-generation automated microplastics detection algorithms.

Efforts to expedite accurate identification of environmental plastics pollution have a strong focus on machine learning (ML) techniques. We published two Raman spectroscopy datasets to support the development of next-generation ML methods. The Raman spectra for plastics identification (RaSPI) dataset presents 402 high-quality Raman spectra with <1 cm⁻¹ resolution between 100 and 4000 cm⁻¹. RaSPI spans 14 plastic types and has variability in (unknown) additives. The Raman maps for plastics identification (RaMPI) dataset contains 34 two-dimensional spectroscopic maps comprising 33,119 spectra in total. RaMPI spectra offer <1 cm⁻¹ resolution across the fingerprint region, with significant variability in signal-to-noise ratio that is useful for methodology testing and validation. Both datasets contain data from pristine samples and from environmental pollution. Every spectrum in both datasets has been manually assigned to one of 14 plastic classifications. The consistency and quality of these datasets make them high-value resources for researchers working on diverse topics, including training ML models for microplastics research, developing spectroscopic processing algorithms, and benchmarking new methodologies.
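Because the signal-to-noise ratio varies widely across the map spectra, a simple way to rank spectra by quality can be useful when testing a method. The following is a minimal sketch under stated assumptions, not the procedure used to build the datasets: the function `estimate_snr`, the window positions, and the synthetic spectra are all hypothetical, and the definition of signal-to-noise used here (peak height above the mean baseline, divided by the baseline standard deviation) is one common convention among several.

```python
import math
import random
import statistics


def estimate_snr(shifts, intensities, peak_window, baseline_window):
    """Estimate SNR as (peak height above mean baseline) / (baseline std dev).

    peak_window and baseline_window are (low, high) Raman shift bounds in cm^-1.
    Both windows are assumptions chosen by the user, not dataset metadata.
    """
    baseline = [y for x, y in zip(shifts, intensities)
                if baseline_window[0] <= x <= baseline_window[1]]
    peak = [y for x, y in zip(shifts, intensities)
            if peak_window[0] <= x <= peak_window[1]]
    noise = statistics.stdev(baseline)
    height = max(peak) - statistics.mean(baseline)
    return height / noise


def synthetic_spectrum(shifts, noise_sd, rng):
    """Hypothetical spectrum: one Gaussian band at 1000 cm^-1 plus Gaussian noise."""
    return [100.0 * math.exp(-0.5 * ((x - 1000.0) / 5.0) ** 2)
            + rng.gauss(0.0, noise_sd)
            for x in shifts]


rng = random.Random(0)  # fixed seed so the example is reproducible
shifts = [float(x) for x in range(100, 4001)]  # 100 to 4000 cm^-1 in 1 cm^-1 steps

clean = synthetic_spectrum(shifts, 1.0, rng)
noisy = synthetic_spectrum(shifts, 10.0, rng)

peak_window = (950.0, 1050.0)       # region containing the synthetic band
baseline_window = (2000.0, 2500.0)  # band-free region used to measure noise

snr_clean = estimate_snr(shifts, clean, peak_window, baseline_window)
snr_noisy = estimate_snr(shifts, noisy, peak_window, baseline_window)
print(f"clean: {snr_clean:.1f}, noisy: {snr_noisy:.1f}")
```

With real spectra, the baseline window would need to be chosen per plastic type so that it contains no Raman bands, and any fluorescence background would have to be removed first, since a sloping baseline inflates the noise estimate.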
