Lensless imaging of pollen grains at three wavelengths using deep learning

Image reconstruction of pollen grains was performed using neural networks, from light scattering patterns recorded with simultaneous irradiation at three laser wavelengths. The pollen grain shapes in the optical images reconstructed by one network had an average pixel accuracy of 98.9%. Two other neural networks were shown to be able to convert scattering patterns into predictions of z-stack maximum intensity projection microscope images and scanning electron microscopy images. The capability of producing magnified images in a variety of formats directly from scattering patterns will be applicable to particle sensing in a range of fields, including health and safety, environmental protection, and ocean and space science.


Introduction
Sensing and monitoring of airborne pollution particles, such as pollen grains, wood combustion ash, microplastics and diesel soot, is important for understanding and mitigating the consequences of human exposure, and hence for reducing what is rapidly becoming a significant global health problem. Particulate matter pollution is associated with respiratory and cardiovascular diseases, different types of cancer, and dementia [1][2][3][4]. Since different particles are more strongly associated with certain adverse health outcomes, and some people may be more susceptible to specific types of pollution [5], it is important to determine the individual species that contribute to particulate matter pollution. For example, the susceptibility of hay fever sufferers varies with the type of pollen, be it weed, grass or tree pollen [6,7]. It is therefore advantageous for an individual to know which species of pollen trigger their hay fever symptoms, and whether those species are present in the air around them at any particular time. Although imaging a particle directly could enable identification of some micron-sized particles, a lensless setup, in which only the scattered light from an object is captured, potentially offers simplicity, which is desirable for sensing applications.
Since absorption coefficients vary from chemical to chemical [8], and since the shape and size of an object, along with the wavelength of illuminating light, can affect the scattering [9,10], the scattered light encodes information regarding these properties of the particle [11,12]. However, since standard cameras record only the intensity of the scattered light, the phase information is lost. Therefore, an exact description of the object is not directly possible, as both the amplitude and phase of the scattered light are required to fully characterise shape, refractive index and size. The challenge of producing the inverse function that maps a scattering pattern to the object has led to lensless imaging approaches such as phase retrieval [13][14][15][16] and non-interferometric imaging [17,18]. Phase retrieval offers a solution by requiring that the object is oversampled, which is equivalent to ensuring that the object has zero padding (i.e. no intensity contribution) outside a well-defined region. Therefore, this approach is generally only applicable to objects that are smaller than the size of the illuminating light source. Ptychography enables imaging over a continuous object, but requires the collection of scattering patterns that correspond to overlapping regions of the object, where, in general, the degree of overlap must also be measured, along with the illumination function.
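The loss of phase information noted above can be illustrated with a toy numerical example. The sketch below is not the experimental geometry of this work: it assumes a far-field (Fraunhofer) model in which the scattered field is the 2D Fourier transform of the object, and the function name far_field_intensity and the rectangular test object are hypothetical choices made purely for illustration. It shows that two differently positioned objects can produce different complex fields but identical recorded intensities.

```python
import numpy as np

def far_field_intensity(obj):
    """Intensity an ideal camera would record under the assumed far-field model."""
    field = np.fft.fft2(obj)        # complex field: carries amplitude and phase
    return np.abs(field) ** 2       # the detector keeps only |field|^2

# Toy object (assumption): a bright rectangle on a zero-padded background.
obj = np.zeros((64, 64))
obj[10:20, 12:30] = 1.0

# The same object translated (circularly) within the field of view.
shifted = np.roll(obj, shift=(7, -5), axis=(0, 1))

# The translation changes only the phase of the field, so the recorded
# intensities coincide even though the complex fields differ.
same_intensity = np.allclose(far_field_intensity(obj), far_field_intensity(shifted))
same_field = np.allclose(np.fft.fft2(obj), np.fft.fft2(shifted))
print("identical intensity:", same_intensity)
print("identical complex field:", same_field)
```

Because the intensity alone cannot separate such cases, additional constraints (as in phase retrieval) or a learned mapping are needed to recover an image.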
Deep learning convolutional neural networks [19][20][21] have been shown to be able to classify objects [22,23], and have been used in areas such as text classification [24], video classification [25], speech recognition [26] and bird song classification [27,28], as well as facial recognition in humans and non-human primates [29][30][31]. Of greater relevance to this work, deep learning has been used in the counting and classification of particulate matter pollution, such as pollen and plastic microbeads, via imaging of their scattered light [32], as well as in bioaerosol sensing [33]. Recent work on deep learning in the field of phase retrieval has been discussed in [34][35][36][37], with deep learning having demonstrated the capability of retrieving phase information from images created using a spatial light modulator [38], via ptychography [39] and holography [35], and via multiple scattering patterns [40].
Here, we extend our earlier work on classification of pollution particles in water and in air [41,42] to show how neural networks can be used to generate images of pollen grains directly from acquired three-wavelength scattering patterns and even predict their appearance under a scanning electron microscope (SEM).

Experimental methods
Sample fabrication and data collection
Two types of experiment were carried out. The first was an in situ experiment, in which optical images of, and laser scattering from, the pollen grains were recorded in parallel in the same setup. In the second, optical images (for reference only) and laser scattering data were recorded using the same experimental apparatus as in the in situ experiment, and the pollen was then imaged using different apparatus, such as a visible light microscope and an SEM. For ease of distinction between the two, we refer to this second experiment as 'ex situ', since the imaging was done elsewhere relative to our initial setup.
For the in situ experiment, Iva xanthiifolia and Populus deltoides pollen grains were procured from Sigma Aldrich, and Narcissus and Mahonia aquifolium pollen grains were collected from the University of Southampton grounds. Iva xanthiifolia, Populus deltoides and Narcissus pollen grains were deposited onto a substrate (a 25 mm by 75 mm, 1 mm thick soda-lime glass slide), while Mahonia aquifolium pollen grains (used solely for testing) were deposited onto a separate slide. A total of 120 pollen grains (∼33% for each pollen type: Iva xanthiifolia, Populus deltoides and Narcissus) were individually located, and the scattering pattern and optical image for each were recorded and used for training and testing the neural network. Subsequently, images and scattering patterns for Mahonia aquifolium pollen (a species not used in training) were recorded to test the capability of the neural network to generate images not just of unseen pollen grains but of previously unseen species.
For the ex situ experiment, imaging of pollen grains was carried out using a separate glass slide that included additional pollen grains obtained from purchased flowers. Pollen from the species Bellis perennis, Populus deltoides, Narcissus, Iva xanthiifolia, Populus tremuloides, Hyacinthus orientalis, Chrysanthemum, Antirrhinum majus, Chamelaucium and Rosa was included. Testing was carried out using pollen that was present on the same slide as the pollen used for training. In this case, the scattering patterns and optical images for a total of 100 different pollen grains (10 for each pollen type) were recorded using the in situ experimental setup. Additional ex situ imaging data (z-stack maximum intensity projection and SEM) were then obtained for the same set of pollen grains. The pollen grains ranged from ∼10 to ∼50 microns in size.
In-situ imaging setup
As shown in figure 1, light from 3 laser diodes (Thorlabs Inc.) operating at 405 nm, 532 nm and 650 nm was focussed on the surface of a pollen-coated glass slide, producing a spot with a diameter of approximately 50 μm. The light from each laser was attenuated to below 1 mW using neutral density filters (Thorlabs Inc.), prior to focussing onto the sample. The forward scattered light from the pollen grains was collected by a CMOS colour camera (Thorlabs, DCC3260C, 1936×1216 pixels, 5 ms integration time), placed 3 mm away from the pollen grains. The camera was connected to a computer to allow recording of the scattering patterns. The glass slides were mounted on a 3-axis stage (25 mm travel, 10 μm resolution) for positional control. The pollen grains were also illuminated using a white light source (a halogen lamp, I.+W. MUSTER Gdb, 150 W) so that the pollen grains could be imaged, via a beam splitter, using an Olympus SLMPLN 50× objective (NA=0.35, WD=18 mm) and CMOS camera (Thorlabs Inc. DCC1645C, 1280×1024 pixels). Both cameras used ThorCam software [43] by Thorlabs Inc. to record the scattering patterns and images.
Ex-situ imaging setup
Two methods of ex situ imaging were applied.
(1) Z-stack maximum intensity projection. For each pollen grain, z-stack optical microscopy was performed using a Nikon microscope with an image magnification of 50× (Nikon, LE Plan, NA=0.4, WD=3.5 mm) and CMOS camera (Thorlabs Inc. DCC1645C, 1280×1024 pixels). The z-axis of the microscope stage was translated in micron increments, with an image being recorded at every step. A z-stack of images was thus obtained from the top part of each pollen grain to its base on the surface of the glass slide. The stack of images was then combined via maximum intensity projection (a 2D image containing, for each pixel, the maximum intensity value across all layers) to obtain ex situ images of the pollen grains. The images were saved with an image pixel count of 1280×1024 pixels, and cropped to 512×512 pixels.
(2) Scanning electron microscopy. Imaging was carried out using a Zeiss Evo SEM with a probe current of 134 pA, operating at high vacuum. The images were produced using a cycle time of 48.7 s, at a resolution of 1024×768 pixels, and were also cropped to 512×512 pixels. The samples were coated with 20 nm of Au/Pd (50:50) prior to placement inside the SEM in order to reduce charge build-up on the sample during imaging.
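The maximum intensity projection and cropping steps described for the z-stack images can be sketched as follows. This is a minimal illustration rather than the processing code used in this work: the function names are hypothetical, the stack is synthetic random data standing in for recorded frames, and cropping about the image centre is an assumption, since the text does not state where the 512×512 crop was placed.

```python
import numpy as np

def max_intensity_projection(stack):
    """Collapse a z-stack of shape (n_slices, height, width) along z.

    Each output pixel holds the maximum value found at that pixel
    position across all slices.
    """
    return np.asarray(stack).max(axis=0)

def center_crop(image, size=512):
    """Crop a square of side `size` about the image centre (assumed placement)."""
    height, width = image.shape[:2]
    top = (height - size) // 2
    left = (width - size) // 2
    return image[top:top + size, left:left + size]

# Synthetic stand-in for a stack of five 1280x1024 camera frames.
rng = np.random.default_rng(0)
z_stack = rng.integers(0, 256, size=(5, 1024, 1280), dtype=np.uint8)

mip = max_intensity_projection(z_stack)   # shape (1024, 1280)
cropped = center_crop(mip)                # shape (512, 512)
print(mip.shape, cropped.shape)
```

For colour frames of shape (n_slices, height, width, 3), the same projection along axis 0 takes the maximum independently in each colour channel.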

Neural network
Three separate neural networks were used in this work: one for predicting the in situ optical images from the experimentally measured scattering pattern and, similarly, one each for predicting the image output of the two ex situ imaging methods. All networks were trained using a conditional generative adversarial network (cGAN) architecture on an NVIDIA RTX 2080 graphics processing unit (GPU). The cGAN framework used here was based on the network presented in [44], which in turn was based on that described in [45]. The generator network had a 9-layer architecture in order to enable an image resolution of 512×512 pixels, and was trained with a learning rate of 0.0002 and a dropout of 0.5. At the start of training, the neuron weightings for the generator and discriminator were randomly initialised, so that they encoded no information about the training data. Each neural network was trained until the training errors reached a minimum (approximately 500 epochs for the in situ imaging neural network and 300 epochs for the ex situ imaging neural networks, where one epoch is defined as the processing of all training data exactly once). For each neural network, 90% of the training data was used for training and the remaining 10% for validation. The processing time for the generation of test images was approximately 100 milliseconds in all cases.
Figure 2 shows an overview of the procedure for training a neural network to transform experimental scattering patterns into predictions of an image of the particle. Scattering patterns were used as the input to the neural network, and the neural network prediction of the particle image was compared with the experimental image. This process was repeated until the prediction error was minimised. The procedure was followed for training neural networks to convert the collected scattering patterns into predicted images in the in situ data format (optical image) and in each of the ex situ data formats (i.e. z-stack maximum intensity projection and SEM).

Results and discussion
Figure 3 shows results obtained during testing of the in situ neural network: the scattering patterns that were fed into the neural network (column 1), the images generated by the neural network (column 2) and the experimental images (column 3). Each row shows data for a different pollen species: (a) Narcissus, (b) Populus deltoides, (c) Iva xanthiifolia and (d) Mahonia aquifolium. Column 4 features a comparison metric for the in situ image generation, in which the experimental images are subtracted from the neural network generated images (thresholding is applied to both image types to produce binary image masks). The masks are then combined so that black represents true negative, white true positive, green false positive and blue false negative. The figure shows that the neural network was able to predict the shape and the orientation of pollen grains it had not seen before. Although there is sap present in the top left hand corner of the experimental image in figure 3(b), it is not present in the generated image, because the sap was not illuminated by the laser light and thus the relevant information is not present in the scattering pattern. The sulcus (a fissure on the pollen) is also present in the generated image shown in figure 3(c), as indicated by an arrow. In addition to image prediction for unseen pollen grains of species that were used during training, figure 3(d) demonstrates image prediction for a pollen grain of the species Mahonia aquifolium, which was not used during training.
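The thresholded mask comparison used for column 4 of figure 3, together with the per-class pixel percentages, can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the analysis code used in this work: the function name compare_masks, the threshold value of 128 and the convention that grain pixels are those at or above the threshold are all hypothetical (for a grain that appears darker than its background the comparison would need to be inverted).

```python
import numpy as np

# Colour coding described in the text: black = true negative,
# white = true positive, green = false positive, blue = false negative.
COLOURS = {
    "tn": (0, 0, 0),
    "tp": (255, 255, 255),
    "fp": (0, 255, 0),
    "fn": (0, 0, 255),
}

def compare_masks(generated, experimental, threshold=128):
    """Threshold two greyscale images and classify every pixel.

    Returns an RGB overlay image and the percentage of pixels in each
    class, plus the 'true total' (true positive + true negative).
    The threshold and its direction are assumptions for illustration.
    """
    gen = np.asarray(generated) >= threshold
    exp = np.asarray(experimental) >= threshold
    classes = {
        "tp": gen & exp,
        "tn": ~gen & ~exp,
        "fp": gen & ~exp,
        "fn": ~gen & exp,
    }
    overlay = np.zeros(gen.shape + (3,), dtype=np.uint8)
    for name, mask in classes.items():
        overlay[mask] = COLOURS[name]
    percentages = {name: 100.0 * mask.mean() for name, mask in classes.items()}
    percentages["true_total"] = percentages["tp"] + percentages["tn"]
    return overlay, percentages

# Tiny synthetic example: the generated grain is one column too wide.
experimental = np.zeros((4, 4), dtype=np.uint8)
experimental[1:3, 1:3] = 255
generated = np.zeros((4, 4), dtype=np.uint8)
generated[1:3, 1:4] = 255

overlay, pct = compare_masks(generated, experimental)
print(pct)
```

In this toy case the extra column of the generated mask is counted as false positive, and the true total is the fraction of pixels on which the two masks agree.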
The length and width of the neural network generated Narcissus grain (figure 3(a)) were 100.3% and 89.0%, respectively, of the grain dimensions in the experimental image. Similarly, the lengths and widths of the generated Populus deltoides (figure 3(b)), Iva xanthiifolia (figure 3(c)) and Mahonia aquifolium (figure 3(d)) grains were 102.4% and 100.0%, 101.9% and 98.6%, and 101.2% and 100.0%, respectively. Table 1 shows the percentage of pixels in the generated images that are true negative (black), false negative (blue), false positive (green) and true positive (white) when the overall shape of the pollen grain, obtained via image thresholding, is compared with that in the experimental image. The true total (true negative plus true positive percentage of pixels) is also shown in the table. The tabulated results show that, on average, 98.9% of the pixels were true.
Figure 4 demonstrates the performance achieved by the neural networks with image data types that were recorded ex situ of the scattering apparatus. It includes the scattering patterns (column 1), the neural network generated images (column 2) and the experimental images (column 3), for (a) Hyacinthus orientalis and (b) Chamelaucium. Columns 2 and 3 include images of both ex situ measurement types: z-stack images are labelled (i) and (iii), and SEM images are labelled (ii) and (iv). Sap, which was found in both the experimental and the generated images, is also labelled. The data presented in figure 4 were not included in the training data, and so the image predictions are made for previously unseen pollen grains. It is evident from the figure that recovery of the overall shape and orientation of the grains has been achieved and that, in addition, features such as the deformation at the top right of the Hyacinthus orientalis pollen grain in (a) are present in the generated images. For the Chamelaucium pollen grain in (b), the neural network was able to generate images with the correct orientation, as well as the spherical lobes at each of the corners of its triangular structure. In addition, in both generated images, sap droplets of similar size to those in the experimental images are present on the slide. Inexactness in the generated images is attributed to the limited number of scattering pattern and pollen grain image pairs used for training.
The merit of using three lasers of different wavelengths to scatter from the pollen grains is demonstrated in figure 5. The figure shows reconstruction of a Chamelaucium pollen grain using neural networks that were trained using (a) red light, (b) green light, (c) blue light, and (d) all three wavelengths. For reference, the experimental z-stack image of the pollen grain is shown in (e). The insets in each sub-figure show the corresponding scattering pattern that was fed into the neural network to generate the images. Here, a separate neural network (using the same architecture and training regime as the multi-wavelength neural network) was trained for each of the wavelengths used (one network each for red light only, green light only and blue light only), such that each neural network created a transfer function between the scattering pattern and the experimental z-stack image. This was achieved by separating out the RGB channels of the scattering patterns recorded on the camera. Figure 5 shows that the images generated from the single-wavelength scattering patterns are less precise than the image generated from the three-wavelength scattering pattern.
We attribute the higher accuracy achieved with multiple wavelengths to effectively obtaining three sets of scattering information. The information is unique to each wavelength, owing to the wavelength dependence of the refractive index of the material and the fact that scattering from a structure depends on parameters such as its size, its material and the wavelength. Multiple wavelengths can therefore yield additional information about the structure of the object from which the light is scattered.
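The separation of a recorded colour scattering pattern into single-wavelength inputs, as used for the single-wavelength networks, can be sketched as follows. This is a minimal illustration and not the code used in this work: the function name split_wavelength_channels is hypothetical, the array is assumed to be stored in RGB channel order, and each laser (650 nm, 532 nm, 405 nm) is assumed to be captured entirely by the red, green and blue camera channels respectively, which ignores any crosstalk between the colour filters of the sensor.

```python
import numpy as np

def split_wavelength_channels(pattern_rgb):
    """Split an (height, width, 3) RGB scattering pattern into three
    single-channel images, one per laser wavelength.

    Assumes RGB channel order and no crosstalk between channels.
    """
    pattern_rgb = np.asarray(pattern_rgb)
    if pattern_rgb.ndim != 3 or pattern_rgb.shape[-1] != 3:
        raise ValueError("expected an image of shape (height, width, 3)")
    return {
        "red": pattern_rgb[..., 0],     # assumed to carry the 650 nm signal
        "green": pattern_rgb[..., 1],   # assumed to carry the 532 nm signal
        "blue": pattern_rgb[..., 2],    # assumed to carry the 405 nm signal
    }

# Synthetic pattern with a distinct constant value in each channel.
pattern = np.zeros((4, 6, 3), dtype=np.uint8)
pattern[..., 0] = 10
pattern[..., 1] = 20
pattern[..., 2] = 30

channels = split_wavelength_channels(pattern)
for name, image in channels.items():
    print(name, image.shape, int(image.max()))
```

Each single-channel image could then serve as the input to its own network, while the unsplit three-channel pattern serves as the input to the multi-wavelength network.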

Conclusion
To conclude, we have shown how neural networks can be used to generate images of pollen grains from their experimentally measured scattering patterns, with the resultant images showing a high degree of similarity to the experimentally measured images. More specifically, one neural network was shown to be able to generate an optical image of an unseen species of pollen grain from its scattering pattern, and achieved an average pixel accuracy of 98.9%. In addition, the capability of a neural network to generate images where the training data (such as SEM images) were collected ex situ of the scattering apparatus was shown. We presented evidence indicating that scattering patterns containing signals at multiple wavelengths were more effective than those containing only a single wavelength in producing accurate image reconstructions. Further improvements are anticipated through the collection of larger data sets and the use of additional wavelengths of light.