A large dataset of synthetic SEM images of powder materials and their ground truth 3D structures

This data article presents a data set of 2048 synthetic scanning electron microscope (SEM) images of powder materials, together with descriptions of the 3D structures that they represent. The images were created using open source rendering software, and the generating scripts are included with the data set. Eight particle size distributions (PSDs) are represented, with 256 independent images from each. Because the PSDs are relatively similar to each other, the data set offers a useful benchmark for assessing the fidelity of image analysis techniques. The characteristics of the PSDs and the resulting images are described and analyzed in more detail in the research article “Characterizing powder materials using keypoint-based computer vision methods” (B.L. DeCost, E.A. Holm, 2016) [1]. These data are freely available in a Mendeley Data archive, “A large dataset of synthetic SEM images of powder materials and their ground truth 3D structures” (B.L. DeCost, E.A. Holm, 2016), located at http://dx.doi.org/10.17632/tj4syyj9mr.1 [2], for any academic, educational, or research purposes.



Value of the data
Microstructural image analysis is a core discipline and an active research area in materials science; however, data science approaches to microstructural image analysis are hindered by a lack of large, well-understood image data sets. The images in this data set help fill that gap by providing a statistically significant number of powder material images with known ground truth characteristics.
Because their generating particle size distributions are closely related, the resulting structures are challenging to differentiate; they therefore present a useful benchmark for assessing the fidelity of image analysis techniques.
When combined with their ground truth structures and classifications, these images can be used to benchmark image analysis approaches including segmentation, quantitative characterization, machine learning, and others.
Images of powder materials are especially important for understanding powder bed based Additive Manufacturing (AM) processes.

Data
This data set consists of 2048 synthetic scanning electron microscope (SEM) images of powder materials and descriptions of the corresponding 3D structures that they represent [2]. There are 256 images/structures from each of eight closely related particle size distributions (PSDs), as described in Table 1. Fig. 1 shows the PSDs sampled to create the images, and Fig. 2 shows example images from each PSD. Fig. 3 shows a text snippet from a 3D structure file, as well as a rendering of an example 3D powder structure that would be synthetically imaged to generate a 2D micrograph.

Experimental design, materials and methods
The dataset of 3D structures and their corresponding 2D images was created using Blender [3], an open source computer graphics suite used for 3D modeling, rendering, animation, and scientific visualization. In this dataset, powders consist of spherical particles with sizes drawn at random from the appropriate PSD. We consider eight PSDs, as shown in Fig. 1, and construct 256 independent structure/image pairs for each PSD, resulting in 2048 synthetic powder micrographs (examples are shown in Fig. 2).
To synthesize each image, we use an 11 × 11 × 2 (arbitrary Blender units) render volume and insert 800 particles placed at random. Particle radii are selected at random from one of the eight generating PSDs, and they are permitted to intersect and/or occlude each other. Particles are rendered using a spherical mesh, with a surface texture achieved by wrapping the particle with an image of zinc grains, included in the dataset. The particles are imaged on the z = 0 plane, which intersects the centroid of the render volume, as shown in Fig. 3(b). The camera is located in the center of the volume at height z = 10, and the resulting image resolution is 512 × 512 pixels. Python scripts used to perform these operations are included in the dataset files. It is worth noting that the PSDs in Fig. 1 consist of four pairs of size distributions that are relatively similar to each other. The characteristics of the PSDs are given in Table 1 and described in more detail in [1].
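The particle placement step described above can be illustrated with a minimal Python sketch. It is not the Blender script distributed with the dataset. It only samples positions and radii, and it omits meshing, texturing, and rendering. The function name `sample_structure`, the log-normal form of the radius distribution, the parameter values `mu` and `sigma`, and the centering of the render volume on the origin in x and y are all assumptions made for illustration; the actual PSD parameters are those given in Table 1 and in [1].

```python
import random

# Render volume from the text: 11 x 11 x 2 Blender units, with the z = 0
# plane through its centroid. Centering it on the origin in x and y is an
# assumption of this sketch.
VOLUME = (11.0, 11.0, 2.0)
N_PARTICLES = 800


def sample_structure(mu, sigma, n=N_PARTICLES, volume=VOLUME, seed=None):
    """Return a list of (x, y, z, radius) tuples for one synthetic powder.

    mu and sigma parameterize a log-normal radius distribution. Both the
    distribution family and the values passed in are hypothetical.
    """
    rng = random.Random(seed)
    half = [side / 2.0 for side in volume]
    particles = []
    for _ in range(n):
        # Uniform random placement. Particles may intersect or occlude
        # each other, so there is no overlap rejection step.
        x = rng.uniform(-half[0], half[0])
        y = rng.uniform(-half[1], half[1])
        z = rng.uniform(-half[2], half[2])
        radius = rng.lognormvariate(mu, sigma)
        particles.append((x, y, z, radius))
    return particles


# Hypothetical parameters, chosen only so that radii are small compared
# with the render volume.
structure = sample_structure(mu=-1.5, sigma=0.3, seed=0)
```

Calling `sample_structure` 256 times with different seeds for each of eight parameter sets would mirror the 8 × 256 = 2048 structure/image pairs described in the text.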