Reconstruction of the mass and geometry of snowfall particles from multi-angle snowflake camera (MASC) images

This paper presents a method named 3D-GAN, based on a generative adversarial network (GAN), to retrieve the total mass, 3D structure and internal mass distribution of snowflakes. The method uses as input a triplet of binary silhouettes of particles, corresponding to the triplet of stereoscopic images of snowflakes in free fall captured by a Multi-Angle Snowflake Camera (MASC). 3D-GAN is trained on simulated snowflakes of known characteristics whose silhouettes are statistically similar to real MASC observations, and it is evaluated by means of snowflake replicas printed in 3D at 1:1 scale. The estimation of mass obtained by 3D-GAN has a normalized RMSE (NRMSE) of 40% and a mean normalized bias (MNB) of 8%, and largely outperforms standard relationships based on maximum size and compactness. The volume of the convex hull of the particles is retrieved with an NRMSE of 35% and an MNB of +19%. To illustrate the potential of 3D-GAN to study snowfall microphysics and highlight its complementarity with existing retrieval algorithms, some application examples and ideas are provided, using as showcases the large available datasets of MASC images collected worldwide during various field campaigns. The combination of mass estimates (from 3D-GAN) and hydrometeor classification or riming degree estimation (from independent methods) makes it possible, for example, to obtain mass-to-size power law parameters stratified by hydrometeor type or riming degree. The parameters obtained in this way are consistent with previous findings, with exponents overall around 2 and increasing with the degree of riming.

the methods, data and the novel mass and shape estimation algorithm. Section 4 is devoted to the evaluation of the retrieval, while Sec. 5 provides examples of applications and potential future studies. Section 6 draws the main conclusions of this work.

The multi-angle snowflake camera (MASC)
The method presented here is built and designed for the data collected by the multi-angle snowflake camera (MASC). We briefly recall here the most important technical characteristics of the instrument and its known limitations, and we refer the interested reader to more detailed literature on the subject at the end of this section.
A MASC is composed of three high-resolution co-planar cameras pointed at a common focal point. Each camera is separated by 36° from the next one (rotation around the vertical axis), such that pictures of a given snowflake can be obtained simultaneously at angles of 0° and ±36°. Two infrared (IR) emitter-detector pairs trigger the cameras and three associated spotlights. The IR arrays are separated vertically by 32 mm, which also provides an estimate of particle fall velocity. For the data shown here, the MASC system is composed of three 2448 × 2048 pixel cameras and the maximum acquisition rate is about 2 Hz (as in Praz et al., 2017).
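The fall velocity estimate mentioned above follows from the time a particle takes to cross the 32 mm gap between the two IR arrays. A minimal sketch, assuming the trigger times of the upper and lower arrays are available in seconds (the function and argument names are hypothetical, not part of any MASC software):

```python
IR_ARRAY_SEPARATION_M = 0.032  # vertical separation of the two IR arrays (32 mm, from the text)


def fall_speed(t_upper_s: float, t_lower_s: float) -> float:
    """Fall speed in m/s from the trigger times (s) of the upper and lower IR arrays."""
    dt = t_lower_s - t_upper_s
    if dt <= 0:
        raise ValueError("the lower array must trigger after the upper array")
    return IR_ARRAY_SEPARATION_M / dt


# A particle that crosses the two arrays 32 ms apart falls at 1 m/s.
print(fall_speed(0.000, 0.032))  # 1.0
```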
The data processing steps employed in this study are the same as in Praz et al. (2017), although only a minor part of the information generated is used as actual input of the method described in the following section, while another part can be used to interpret and complement the output (as illustrated in Sec. 5). The preprocessing involves snowflake identification (and matching) in the three images, calculation of geometrical and textural descriptors, image quality evaluation, hydrometeor classification and riming degree estimation.
Although the MASC is a relatively recent instrument, the interested reader can find a fair amount of relevant literature about it. Its measurement principle is detailed in Garrett et al. (2012). Several works exploited MASC data to investigate geometry and fall speed characteristics of hydrometeors (Garrett and Yuter, 2014; Garrett et al., 2015; Jiang et al., 2019), and others were devoted to hydrometeor classification techniques, such as Praz et al. (2017), Hicks and Notaros (2019) and Leinonen and Berne (2020).
that the discriminator considers to be real. This results in the two training processes competing against each other, which is referred to as adversarial training. Since the discriminator is a powerful image recognition network, the generator must learn to produce highly realistic outputs in order to successfully "fool" the discriminator. The generator is able to produce diverse outputs because it is fed random noise as an input, and the generator learns to map the distribution of the noise to the distribution of the input data. In a conditional GAN, both the discriminator and the generator additionally receive a condition as input data, and therefore the generator learns the conditional probability distribution of the input data.
In the original GAN formulation of Goodfellow et al. (2014), the discriminator is a binary classifier, but Arjovsky et al. (2017) found that some of the instability problems of GANs are remedied by reformulating the objective using a dual of the Wasserstein distance between probability distributions. Gulrajani et al. (2017) then combined this approach with a penalty on the norm of the gradient of the discriminator output with respect to its input; this combination is referred to as a Wasserstein GAN with gradient penalty (WGAN-GP). Given its superior stability with respect to the original GAN, a WGAN-GP is employed in the present study.
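To make the gradient penalty concrete, the sketch below evaluates the WGAN-GP discriminator loss for a toy critic whose input gradient is known analytically. It is an illustration only: the critic `tanh(w . x)`, the batch shapes and the weight `lam = 10` are assumptions chosen for the example and are not the networks or settings of 3D-GAN.

```python
import numpy as np


def critic(x, w):
    """Toy critic D(x) = tanh(w . x) for a batch x of shape (n, d)."""
    return np.tanh(x @ w)


def critic_grad(x, w):
    """Analytic gradient of the toy critic with respect to its input x."""
    return (1.0 - np.tanh(x @ w) ** 2)[:, None] * w[None, :]


def gradient_penalty(real, fake, w, rng, lam=10.0):
    """Penalty term lam * mean((||grad_x D(x_hat)||_2 - 1)^2)."""
    eps = rng.uniform(size=(real.shape[0], 1))   # one mixing weight per sample
    x_hat = eps * real + (1.0 - eps) * fake      # random points between real and fake samples
    grad_norm = np.linalg.norm(critic_grad(x_hat, w), axis=1)
    return lam * np.mean((grad_norm - 1.0) ** 2)


rng = np.random.default_rng(0)
real = rng.normal(size=(8, 4))   # stand-in for real samples
fake = rng.normal(size=(8, 4))   # stand-in for generator outputs
w = rng.normal(size=4)

wasserstein_term = np.mean(critic(fake, w)) - np.mean(critic(real, w))
critic_loss = wasserstein_term + gradient_penalty(real, fake, w, rng)
print(critic_loss)
```

In a real implementation the gradient would be obtained by automatic differentiation of the discriminator network rather than from a closed-form expression.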

3D reconstruction GAN
Our 3D reconstruction GAN, named 3D-GAN hereafter, is formulated as a conditional WGAN-GP, where the desired data is the 3D structure of the snowflake and the condition is a set of three binary images (silhouettes) captured from the angles at which the MASC sees the snowflake. The objective for the generator is thus to generate a 3D structure that the discriminator considers appropriate for the image triplet.
The generator network is shown in Fig. 1a. The inputs are three snowflake silhouettes of 128 × 128 pixels. The first part of the processing passes the inputs through a series of residual downsampling blocks followed by a fully connected layer, resulting in a set of descriptors for each image. Following the "Siamese network" approach (Chicco, 2021), this step is implemented using the same weights for each image. The descriptors are then concatenated and processed through several fully connected layers, resulting in a set of descriptors for the image triplet. At this stage, the noise is also included in the model by multiplying the input of the second fully connected layer with the noise vector. These descriptors are then passed through one more fully connected layer to produce 2048 variables, which are then reshaped to 32 3D feature maps of size 4 × 4 × 4. In the final stage of processing, the 3D feature maps are passed through upsampling blocks, eventually producing a 3D grid of 32 × 32 × 32 volume elements (voxels). The size of the produced grid was selected as a compromise between resolution and computational requirements.
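The shape bookkeeping of the last generator stages can be checked with a few lines of NumPy. This is only a sketch of the tensor shapes: the three nearest-neighbour 2× upsampling steps and the channel average used in place of a final convolution are assumptions, since the internal layout of the upsampling blocks is not specified in the text.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (channels, D, H, W) array."""
    for axis in (1, 2, 3):
        x = np.repeat(x, 2, axis=axis)
    return x


rng = np.random.default_rng(0)
features = rng.normal(size=2048)       # output of the last fully connected layer
maps = features.reshape(32, 4, 4, 4)   # 32 feature maps of 4 x 4 x 4 (32 * 4**3 == 2048)

x = maps
for _ in range(3):                     # 4 -> 8 -> 16 -> 32 (assumed: three 2x steps)
    x = upsample2x(x)

voxels = x.mean(axis=0)                # stand-in for a final convolution to a single channel
print(maps.shape, voxels.shape)        # (32, 4, 4, 4) (32, 32, 32)
```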
The inputs to the discriminator (Fig. 1b) are a 3D grid (either from the training dataset or from the generator) and a triplet of images. The images are processed to descriptors using a Siamese network in a manner identical to the generator. Meanwhile, the grid is passed through a set of downsampling 3D convolution blocks, the result of which is flattened into descriptors. The descriptors for both the 3D grid and the images are processed through multiple fully connected blocks. The descriptors for the grid and the images are then combined by multiplying them with each other. The result of this is passed through more fully connected blocks, eventually producing a single scalar as the discriminator output.
While the silhouettes are binary images, the value of each voxel in the 3D grid is proportional to the average density of the ice-air mixture within that voxel, scaled such that the mean density of the nonzero voxels is approximately 1. It is therefore possible to compute the snowflake mass from the outputs of the GAN. However, we found that we can achieve better mass estimation with a separate neural network trained specifically to predict the mass. For this, we used a network architecture similar to that of the discriminator (Fig. 1b), except without the 3D grid input and processing branch. This network gives us the total mass m; to estimate the mass m_i in each voxel i in the 3D grid output of the generator, we scale the voxel value as
$$m_i = m \frac{y_i}{\sum_j y_j},$$
where y_i is the generator output at voxel i.
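The per-voxel scaling can be sketched as follows, assuming that the voxel masses are obtained by normalizing the generator outputs so that they sum to the total mass m predicted by the separate network (the function name is hypothetical):

```python
import numpy as np


def voxel_masses(y, total_mass):
    """Distribute total_mass over the voxels in proportion to the generator output y."""
    y = np.asarray(y, dtype=float)
    total_output = y.sum()
    if total_output <= 0:
        raise ValueError("generator output must contain some positive values")
    return total_mass * y / total_output


rng = np.random.default_rng(0)
y = rng.random((32, 32, 32))          # stand-in for a generator output grid
m_i = voxel_masses(y, total_mass=2.5e-6)
print(m_i.sum())                      # equals the prescribed total mass up to rounding
```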

Training
Training the 3D reconstruction GAN requires large training datasets of 3D structures and MASC imagery. As it is extremely difficult to accurately map the 3D structure of a snowflake, such datasets are currently not available from measurements of real snowflakes. Thus, we train the GAN using synthetic observations from modeled snowflakes created with the snowflake generation model described in Leinonen et al. (2013), Leinonen and Moisseev (2015) and Leinonen and Szyrmer (2015). This model creates volumetric 3D models of snowflakes, and is capable of modeling single crystals, aggregation and riming. The degree of riming is indirectly prescribed by the liquid water path (LWP, in kg m⁻²) parameter (Leinonen and Szyrmer, 2015).
The generated snowflake models are defined by a set of volume elements of 40 µm size, each either entirely filled with solid ice of density ρ_ice = 917 kg m⁻³ or empty. To create a training set, snowflakes are generated by randomly selecting a few input parameters. The most important are the following: LWP varies from 0.0 to 2.0 kg m⁻²; the number of monomers varies between 1 and 50; the monomer type varies among dendrites, needles, rosettes, plates and columns; and the riming process is chosen as either occurring at the same time as aggregation (simultaneous) or only once aggregation is completed (subsequent).
For each snowflake generated with the model, we calculated the silhouettes that would be seen by the MASC from the three different camera angles; the silhouettes were artificially blurred by a randomized amount to simulate conditions where the snowflakes are out of focus.
In order to fully utilize the 3D grid and the projection image in the training process and at the same time operate with data of fixed dimensions, the voxels and the projection pixels can correspond to different physical sizes for different snowflakes.
Thus, for example, a snowflake of 5 mm maximum dimension would have a grid element size of approximately 5 mm/32 = 0.156 mm and a silhouette pixel size of approximately 5 mm/128 = 0.0391 mm.
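The arithmetic of this example can be reproduced directly. The sketch assumes, as in the example above, that the maximum dimension of the snowflake spans the full width of both the 32-voxel grid and the 128-pixel silhouette (the function name is hypothetical):

```python
def grid_sizes(d_max_mm, n_voxels=32, n_pixels=128):
    """Voxel and pixel sizes (mm) for a snowflake spanning the full grid and image."""
    return d_max_mm / n_voxels, d_max_mm / n_pixels


voxel_mm, pixel_mm = grid_sizes(5.0)
print(round(voxel_mm, 3), round(pixel_mm, 4))  # 0.156 0.0391
```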

The training samples of the GAN are loaded from data files that contain, for each snowflake: the 3D grid, the grid voxel size, 12 simulated projection silhouettes, and the projection pixel size. The 12 silhouettes comprise four image triplets; the images in each triplet are 36° apart, corresponding to the MASC camera separation, while the four triplets are spaced 90° from each other. When training the GAN, we increase sample diversity by selecting one of the four triplets at random for each training sample and training step and then rotating the grid correspondingly. We also randomly apply mirroring for further data augmentation.
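The augmentation described above can be sketched as below. The array layout is an assumption made for the example: the grid is taken to have its vertical axis last, the triplets are stored as an array of shape (4, 3, H, W) with the cameras ordered from one side to the other, and mirroring is done across the vertical plane that contains the viewing direction of the central camera. With that choice, mirroring swaps the two outer cameras and flips every image left to right.

```python
import numpy as np


def augment(grid, triplets, rng):
    """Pick one of four triplets, rotate the grid to match, and randomly mirror.

    grid:     (N, N, N) array, axis 2 assumed vertical.
    triplets: (4, 3, H, W) array, triplet k assumed to view the grid rotated by k * 90 degrees.
    """
    k = int(rng.integers(4))                     # which of the four triplets to use
    images = triplets[k]
    grid = np.rot90(grid, k=k, axes=(0, 1))      # rotate by k * 90 degrees about the vertical axis
    if rng.random() < 0.5:
        grid = grid[:, ::-1, :]                  # mirror the grid (assumed: axis 1 is left-right)
        images = images[::-1, :, ::-1]           # swap outer cameras, flip each image left-right
    return np.ascontiguousarray(grid), np.ascontiguousarray(images)


rng = np.random.default_rng(0)
grid = rng.random((32, 32, 32))
triplets = rng.random((4, 3, 128, 128)) > 0.5
aug_grid, aug_images = augment(grid, triplets, rng)
print(aug_grid.shape, aug_images.shape)          # (32, 32, 32) (3, 128, 128)
```

The direction of rotation and the mirrored axis depend on the coordinate convention of the data files, so both would need to be checked against the actual camera geometry.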
As mentioned above, we adopt the approach of using a model instead of real observations in the training process out of necessity, while acknowledging that it has a number of potential drawbacks and uncertainties:
1. The model algorithms may not be representative of the physics of real snowflake formation.
2. Although the model can accept any input parameters, the distribution of the model parameters may not match that of real conditions in nature.
3. The simulation of the image formation is not necessarily accurate.
4. By using the silhouettes instead of the gray-scale images captured by the real MASC, we lose the texture information contained in the real MASC images.
For point 1, regarding the realism of the physics of the model, we note that although the model does not implement a fully physical simulation of snowflake formation, it has been found to produce realistic mass-dimensional relations of both unrimed (Leinonen and Moisseev, 2015) and rimed (Leinonen and Szyrmer, 2015) snowflakes, and has been used successfully for modeling snowflake microphysics (Seifert et al., 2019) and remote sensing signals from snowflakes (e.g. Leinonen et al., 2018; Tridon et al., 2019).
To mitigate issue 2, we forced the distribution of parameters closer to that found in nature using the following strategy. First, we identified a selection of morphological image features that Praz et al. (2017) found important for identifying snowflakes, and which did not use texture information and therefore could also be calculated for the silhouettes. We then extracted these features from the dataset of Praz et al. (2017), collected in Davos, Switzerland during 2016-2017. We excluded the particles classified as small particles by Praz et al. (2017), as their size does not allow for any shape recognition or significant variability in the descriptors, and computed a principal component analysis (PCA) of the feature distribution on the rest of the population. This excludes particles with maximum dimension lower than roughly half a millimeter. We kept the three most important PCA components, sorted in order of explained variance. Then, while generating snowflakes, we applied a realistic range of parameters such as snow crystal type, number of crystals per aggregate, and amount of riming, and calculated the same features and PCA components. After generating a large number of crystals, we accepted the generated snowflakes to the final dataset only when they made the distribution of the PCA components for the generated snowflakes closer to that of the real snowflakes, rejecting the others. Thus, we obtained a distribution of snowflake samples that is close to real ones at least in terms of visual descriptors. After this filtering step, the final training set included 20472 samples.
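One possible reading of this filtering strategy is sketched below. The PCA is fitted on the real features and applied to the simulated ones, and a candidate is accepted only if it reduces the distance between the histogram of accepted samples and the histogram of real samples. The greedy rule, the quantile binning, the L1 distance and the random stand-in data are all assumptions made for illustration; the acceptance criterion actually used is not detailed in the text.

```python
import numpy as np


def fit_pca(features, n_components=3):
    """PCA on standardized features; components are sorted by explained variance."""
    mean, std = features.mean(axis=0), features.std(axis=0)
    _, _, vt = np.linalg.svd((features - mean) / std, full_matrices=False)
    return mean, std, vt[:n_components]


def project(features, mean, std, components):
    """Project features onto the PCA components fitted on the real data."""
    return ((features - mean) / std) @ components.T


def l1_distance(counts, target):
    """L1 distance between the normalized histogram of accepted samples and the target."""
    return np.abs(counts / counts.sum() - target).sum()


def select_matching(real_pc, gen_pc, n_bins=4):
    """Greedily accept candidates that move the accepted histogram closer to the real one."""
    n_dim = real_pc.shape[1]
    # Interior quantile edges of the real distribution, one set per component.
    edges = [np.quantile(real_pc[:, j], np.linspace(0, 1, n_bins + 1))[1:-1]
             for j in range(n_dim)]

    def bin_index(pc):
        idx = tuple(np.searchsorted(edges[j], pc[:, j]) for j in range(n_dim))
        return np.ravel_multi_index(idx, (n_bins,) * n_dim)

    target = np.bincount(bin_index(real_pc), minlength=n_bins ** n_dim) / len(real_pc)
    counts = np.zeros_like(target)
    accepted = []
    for i, b in enumerate(bin_index(gen_pc)):
        before = l1_distance(counts, target) if counts.sum() > 0 else np.inf
        counts[b] += 1
        if l1_distance(counts, target) < before:
            accepted.append(i)   # the candidate brought the histogram closer to the target
        else:
            counts[b] -= 1       # reject the candidate and undo the update
    return np.array(accepted, dtype=int)


rng = np.random.default_rng(0)
real_features = rng.normal(size=(500, 6))                       # stand-in for real MASC features
gen_features = rng.normal(loc=0.5, scale=1.5, size=(2000, 6))   # stand-in for simulated features
mean, std, comps = fit_pca(real_features)
accepted = select_matching(project(real_features, mean, std, comps),
                           project(gen_features, mean, std, comps))
print(len(accepted), "of", len(gen_features), "candidates accepted")
```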
For issue 3, we attempted to simulate the main features of image formation, such as randomly blurring the images to simulate situations where they are out of focus. The primary manner in which we attempted to determine if our simulation of the MASC silhouettes is adequate was to use MASC observations of artificial snowflakes 3D-printed from our models, thus using the real
As for issue 4, we accept the lack of texture identification as a current shortcoming of the model. This is unfortunate because the availability of high-resolution texture is one of the greatest advantages of the MASC; however, the radiative transfer of light inside snowflakes is highly complicated and, to our knowledge, no simulation tools exist that could be used to accurately model it and thereby generate proper simulated 2D images from our 3D models, which additionally do not provide surface properties of ice such as roughness. On the other hand, using only the silhouette images may make our approach easier to adapt to silhouette-only instruments such as the 2DVD.

Experiment with snowflake replicas
In order to evaluate the performance of 3D-GAN with real MASC data, we used a set of snowflakes printed in 3D. The snowflake shape models were computer-generated with the technique described in Section 3.3.
The printer used to generate the particles is a Nanoscribe Photonic Professional GT+ (PPGT+) and the material used is a polymer (IP-Q) supplied by Nanoscribe (Bagheri and Jin, 2019). Once polymerized, the material is similar to polymethyl methacrylate (PMMA). The resolution used to generate the flakes is the 3D laser spot size of 1.5 µm diameter (horizontal plane) and 8 µm height (vertical axis).
A few noteworthy limitations set the boundaries of what we could achieve with this approach:
1. The maximum dimension of the printed snowflakes is in the range of 3 to 5 mm. Smaller snowflakes could not be practically manipulated and larger ones could not be printed.
2. We could not successfully generate completely unrimed particles (LWP = 0 kg m⁻²), as they resulted in structures too fragile to be manipulated without breaking.
3. Lightly rimed particles sometimes suffered damage while being handled in the MASC measurement area and could thus be used only a limited number of times.
Fourteen printed snowflakes were used in the evaluation; an overview of their characteristics is shown in Table 1. We dropped each particle several times through the MASC measurement area and, after discarding physically damaged particles, a total of 198 MASC triplets (and, accordingly, 198 GAN reconstructions) were obtained. Although a larger population of printed snowflakes would be desirable, we believe that, given the above-mentioned limitations and technical difficulties, this evaluation sample is a good starting point, including various snowflake habits as well as different riming degrees. Because the reconstruction is based on the silhouettes of MASC images, it follows that, for particles of irregular shape and size, the reconstructed output will vary to a certain extent with the orientation of the falling replicas. This is illustrated in Fig. 2, where one can observe how the performance varies: from some angles the particle may be easier to reconstruct than from others, and thus we performed multiple experiments with the same particles. An additional source of uncertainty may come from the fact that printed snowflakes are not made of ice: their color and optical properties may differ from those of actual snowflakes. We assume this aspect to be of negligible importance in our case because only silhouettes are used as input.

1D descriptors
A first evaluation of the ability of 3D-GAN to reconstruct realistic snowflakes can be obtained by looking at one-dimensional descriptors. We selected for this purpose the total snowflake mass m, the gyration radius r_g, the maximum size D_max and the volume of the convex hull V_CH. D_max and V_CH are geometric quantities that define exactly the spatial extent of a snowflake. However, the mass distribution of the GAN output is continuously varying and therefore it is not straightforward to define the exact boundaries of a hydrometeor. The way we tackled this limitation and obtained exact estimates of size and volume is detailed in Appendix A. The evaluation of the descriptors discussed in this section is also summarized in Table 2.

Mass estimation
Mass estimation is a major added value of the proposed method or at least, in response to the current needs of the scientific community, a readily usable product. Figure 3 shows that mass is overall well reconstructed. As a reference, prior knowledge about hydrometeor type and can be readily calculated from the 2-D views of the MASC from silhouette-type images without exploiting textural information. M07 is an adaptive mass-size relation where the exponent and prefactor take different values as the particle dimension (D_max) increases, and it is in principle valid for unrimed snowflakes.
BL06 includes more advanced geometrical considerations and it uses the maximum dimensions in two orthogonal directions, projected area and perimeter.

3D-GAN largely outperforms both of these estimation approaches, as summarized in Table 2. The Normalized Root Mean Square Error (NRMSE) is roughly 40% for 3D-GAN, 70% for BL06 and 103% for M07, while the Mean Normalized Bias (MNB) is close to 10%, -40% and -72%, respectively. BL06 is able to provide better estimates than M07, although both are affected by significant negative biases that become most evident for the heaviest snowflakes. In our evaluation data set, the snowflakes having the largest mass are also the ones with the highest degree of riming (see Table 1). In this sense, 3D-GAN shows its ability to indirectly infer the riming degree and the related increase of mass by exploiting the information embedded in the silhouettes. At the same time, heavily rimed particles have more regular shapes and thus represent a less complex geometrical challenge for 3D-GAN. With this in mind, it is also not surprising that BL06, which includes more information on particle geometry and compactness, outperforms a simple mass-size relation such as M07.
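The two scores quoted above can be computed as in the following sketch. The normalization is an assumption: each error is divided by the corresponding reference value before averaging, which is a common convention for normalized bias and error but is not spelled out in this section.

```python
import numpy as np


def mnb(estimate, reference):
    """Mean normalized bias: mean of (estimate - reference) / reference."""
    estimate, reference = np.asarray(estimate, float), np.asarray(reference, float)
    return np.mean((estimate - reference) / reference)


def nrmse(estimate, reference):
    """Normalized RMSE: root mean square of (estimate - reference) / reference."""
    estimate, reference = np.asarray(estimate, float), np.asarray(reference, float)
    return np.sqrt(np.mean(((estimate - reference) / reference) ** 2))


reference_mass = np.array([1.0, 2.0, 4.0])   # hypothetical reference masses (mg)
estimated_mass = np.array([1.1, 1.8, 4.4])   # hypothetical retrievals (mg)
print(f"MNB = {100 * mnb(estimated_mass, reference_mass):.1f}%")      # MNB = 3.3%
print(f"NRMSE = {100 * nrmse(estimated_mass, reference_mass):.1f}%")  # NRMSE = 10.0%
```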

Geometry estimation
We evaluate here two geometrical quantities: D_max and V_CH (Fig. 3, bottom panels). Both quantities are reconstructed in a satisfactory manner, with NRMSE of 12% and 35%, respectively, and MNB of 7% and 19%. The estimation of D_max is compared with what can be achieved using individual 2D images, selecting the maximum of the three estimates of D_max, one for each camera view, as for example in Praz et al. (2017). D_max is slightly better reconstructed using the 2D images directly because the 3D-GAN mass distribution output varies smoothly and the exact boundaries can only be approximated with the approach detailed in Appendix A. The retrieval of D_max from 2D images is practically unbiased, a result that is in itself interesting for MASC users. Riming has no major impact on the quality of D_max, while it affects the retrieval of V_CH: particles with LWP greater than 1 kg m⁻² are overall better reconstructed (improvements of 15% in terms of NRMSE and no significant differences in terms of bias, not shown). It is not surprising that heavily rimed particles are better reconstructed in terms of geometry, because their geometry is significantly less complex. In Kleinkort et al. (2017), the performance of the VH reconstruction algorithm for volume reconstruction (using a standard 3-camera MASC) is quantified to be 27% in terms of absolute error for a simple spherical test object. The mean absolute error of 3D-GAN for all the printed replicas, thus for significantly more complex shapes, is 30%. If only heavily rimed particles, thus less complex shapes, are considered (LWP > 1 kg m⁻²), the error is further reduced to 26%. The 3D structure of even heavily rimed particles is certainly more complex than a sphere, and thus it is reasonable and conservative to assume that 3D-GAN performs at least as well as VH for volume reconstruction, with the significant added value of providing at the same time an estimate of mass m.
The effect of the smooth variation of mass of the 3D-GAN output, without sharp edges, is evident when looking at the gyration radius r_g (Fig. 3), defined in this case as
$$r_g = \sqrt{\frac{\sum_{i=1}^{N} m_i \left(d_i^{\mathrm{CM}}\right)^2}{\sum_{i=1}^{N} m_i}},$$
where N is the number of voxels, d_i^CM is the distance of each voxel from the center of mass of the snowflake and m_i is the voxel mass content. r_g is overestimated by 3D-GAN (overall by 13%), indicating that the mass contents of the reconstructed snowflakes have a larger spread around the respective centers of mass in comparison to the structure of the reference snowflakes.
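A mass-weighted gyration radius can be computed from a voxel grid as sketched below. The sketch assumes the mass-weighted root-mean-square distance from the center of mass, which matches the quantities named above (N voxels, distances d_i^CM and voxel masses m_i); the function name and grid layout are hypothetical.

```python
import numpy as np


def gyration_radius(mass_grid, voxel_size):
    """Mass-weighted RMS distance of the voxels from the center of mass."""
    mass = np.asarray(mass_grid, dtype=float)
    # Voxel coordinates as an (N, 3) array in physical units.
    coords = np.indices(mass.shape).reshape(mass.ndim, -1).T * voxel_size
    weights = mass.ravel()
    center_of_mass = (coords * weights[:, None]).sum(axis=0) / weights.sum()
    dist_sq = ((coords - center_of_mass) ** 2).sum(axis=1)
    return np.sqrt((weights * dist_sq).sum() / weights.sum())


# Two equal masses two voxels apart: each lies one voxel size from the center of mass.
grid = np.zeros((3, 1, 1))
grid[0, 0, 0] = grid[2, 0, 0] = 1.0
print(gyration_radius(grid, voxel_size=0.1))  # approximately 0.1
```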

3D mass distribution evaluation
With the evaluation setup described above, the 3D distribution of mass is available. In principle this allows one to compare the reconstructed and reference snowflakes with a voxel-by-voxel approach. Although this 1:1 comparison is undoubtedly ambitious and not straightforward, it is worth showing some results in this direction here. There are two main preliminary issues to be considered:
1. The orientations of the reconstructed snowflakes depend on the orientation of the printed replicas themselves as they were falling in the MASC measurement area. The orientation of the reference model is instead fixed.
2. The grid resolution of reference snowflakes is fixed at 40 µm, while the grid resolution of the GAN output varies from flake to flake, as mentioned in Sec. 3.3, and is generally lower (100 µm or more).
In order to address point 1, a preliminary alignment of each snowflake pair (reconstructed vs. modeled reference) is performed. The snowflakes are considered as point clouds and their best alignment is found with the (rigid-body) point cloud alignment technique implemented in the Open3D package of Zhou et al. (2018). Issue 2, grid resolution, is tackled by computing voxel-by-voxel performance indicators of mass distribution at various grid resolutions, by first down-scaling the data of both snowflakes onto a common grid.
Several performance descriptors can be used to evaluate the reconstruction in terms of overlap or quantitative error. We define here the following two descriptors. Given a pair of three-dimensional snowflakes, one being the 3D-GAN reconstruction and one the reference, let the Matched Mass Ratio (MMR) be
$$\mathrm{MMR}(\Delta V) = \frac{\sum_{i \in N} \left(m_i^{\text{3D-GAN}} + m_i^{\mathrm{REF}}\right)}{m^{\text{3D-GAN}} + m^{\mathrm{REF}}},$$
where ∆V is a given voxel size (resolution of the regular grid), m_i^3D-GAN (m_i^REF) is the mass content of the i-th voxel of the GAN-reconstructed snowflake (reference true snowflake), m^3D-GAN (m^REF) is the total mass, invariant across scales, of the 3D-GAN (reference) snowflake, and N defines the set of voxels where the mass content is nonzero for both the GAN and the reference. MMR varies between 0 (worst) and 1 (best) and it evaluates how well the mass of the reconstructed snowflake and the mass of the reference overlap, independently of whether the total mass itself is correctly estimated. An MMR close to 1 indicates that, as a whole, the combined mass of the two snowflakes occupies the same voxels. A second, significantly more severe and quantitative, indicator is the normalized sum of errors (NSE):
$$\mathrm{NSE}(\Delta V) = \frac{\sum_{i \in M} \left| m_i^{\text{3D-GAN}} - m_i^{\mathrm{REF}} \right|}{m^{\mathrm{REF}}},$$
where M is the entire set of voxels where the mass content of 3D-GAN or the reference is nonzero. NSE does not allow for error compensation and it can in principle be as low as 0% only if the estimate of total mass of the GAN is perfect. Figure 4 illustrates the behavior of MMR and NSE across scales. The distribution of mass is overall well matched (mean MMR above 0.8), while the sum of individual errors accumulates, in terms of mean NSE, from 100% to about 50% over the range of grid scales. It must be underlined, however, that (i) the best achievable results of NSE are limited to a minimum mean NSE of about 40%, due to the error in the mass estimation itself, and (ii) the alignment of the two objects is assumed to be optimal.
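The scale dependence of the two indicators can be illustrated with a small sketch. The formulas used here are assumptions consistent with the verbal definitions above: MMR is taken as the mass of both snowflakes located in voxels occupied by both, divided by their combined total mass, and NSE as the sum of absolute voxel-wise differences divided by the reference total mass. Down-scaling is done by summing mass over cubic blocks, and the two grids are assumed to be already aligned on a common grid.

```python
import numpy as np


def coarsen(grid, factor):
    """Sum the mass in blocks of factor**3 voxels (each dimension must be divisible by factor)."""
    n0, n1, n2 = (s // factor for s in grid.shape)
    return grid.reshape(n0, factor, n1, factor, n2, factor).sum(axis=(1, 3, 5))


def mmr(gan, ref):
    """Matched mass ratio: share of the combined mass lying in voxels occupied by both grids."""
    both = (gan > 0) & (ref > 0)
    return (gan[both].sum() + ref[both].sum()) / (gan.sum() + ref.sum())


def nse(gan, ref):
    """Normalized sum of errors: voxel-wise absolute differences over the reference mass."""
    return np.abs(gan - ref).sum() / ref.sum()


# Two unit masses in neighbouring voxels: no overlap at full resolution,
# perfect overlap once the grid is coarsened by a factor of 2.
gan = np.zeros((4, 4, 4)); gan[0, 0, 0] = 1.0
ref = np.zeros((4, 4, 4)); ref[1, 0, 0] = 1.0
for factor in (1, 2):
    g, r = coarsen(gan, factor), coarsen(ref, factor)
    print(factor, mmr(g, r), nse(g, r))
# 1 0.0 2.0
# 2 1.0 0.0
```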
No competing method exists to compute the 3D mass distribution of snowflakes from MASC images, so we decided to show a comparison with an idealized reference, as illustrated by the red curves of Fig. 4. This reference corresponds to an ellipsoidal approximation of the reference snowflake with two major competitive advantages with respect to 3D-GAN:
- The orientation and overlap are thus optimal, and no complications and uncertainties due to the realization of an actual measurement can play a role here.
- The density of the ellipsoid is adapted in order to perfectly match the total mass of the reference snowflake.
The Ellipsoid method is an idealization of the best possible approximation of the snowflakes by means of an ellipsoid of constant density, and is thus in principle largely superior to any ellipsoidal approximation that can be obtained using actual MASC measurements. The performance of 3D-GAN is close to that of this idealized retrieval both in terms of MMR and of NSE across all the scales and, most importantly, it is superior at the small scales: up to about 1.25 mm for MMR and up to 0.75 mm for NSE. Regarding NSE, as the scale of the comparison approaches the dimension of the snowflake, the ellipsoid approximation obviously exploits the advantage of "knowing" the exact total mass. Given the idealized nature of the benchmark and the complexity of the retrieval itself, we consider the performance of 3D-GAN satisfactory.

Examples of application
The information provided by 3D-GAN is an important complement to what can be calculated or retrieved from MASC data (for example size, shape, complexity, orientation, hydrometeor type or riming degree, as in Praz et al., 2017). We would like to provide the reader with examples and suggestions about possible applications and future research directions that could benefit from the output of 3D-GAN. We consider the retrieval of mass an immediate added value of 3D-GAN and we apply this retrieval here to datasets collected in the past years at various geographical locations. We focus here exclusively on snowfall data; blowing snow images have been removed using the classification scheme of Schaer et al. (2020).
The availability of both mass and size estimates can be used to construct m(D_max) relationships using the measurements of a single instrument, the MASC. These relations can then be stratified according to the identified hydrometeor type or as a function of the apparent riming degree, taking advantage of previous work in this direction (Praz et al., 2017). An example is shown in the scatter plots of Fig. 5, for data collected in Switzerland in 2016 and 2017. The same dataset is color-coded according to the apparent riming degree R_c (0 being unrimed particles and 1 fully developed graupel) and according to the classified hydrometeor type. Keeping in mind that 3D-GAN does not have access to textural information other than binary particle silhouettes, it is reassuring to observe in these plots several features that make physical sense. For example, for a given particle maximum size, the mass content increases with the riming degree; graupel has the largest mass content (at a given size) while columns have the lowest, except for the largest observed sizes, which can only be reached by aggregate snowflakes. Tables 4 and 5 provide the parameters of the m(D_max) power laws calculated for various field campaigns conducted in the Alps and in Antarctica over several years. While an in-depth microphysical interpretation of these results and their differences linked to season and geographical location is beyond the scope of this study, it is worth briefly discussing these results and hypothesizing how they will be useful to support future research in this direction. Considering the entire datasets of individual field measurements (e.g. von Lerber et al., 2017, and references therein) and on simulations (Leinonen and Szyrmer, 2015; Karrer et al., 2020). In particular, the work of von Lerber et al. (2017) provides b_m values also lower than 1.7 and as low as 1.5, as occasionally estimated also by us.
Other studies report b_m always larger than approximately 1.7 (Mason et al., 2018), 1.9 (Karrer et al., 2020) or 2 (Leinonen and Szyrmer, 2015).

The estimated prefactors a_m reproduce well the range of values documented in the literature. In cgs units, the values listed in Tables 4 and 5 span roughly between 0.001 and 0.04 g cm^(-b_m). This range of variation is similar to that of von Lerber et al. (2017). Mason et al. (2018) also report values in this range, but occasionally higher: up to 0.08 for lump graupel, and larger than 0.1 only for hail or solid ice spheres. Leinonen and Szyrmer (2015) obtain a maximum a_m value of approximately 0.09 g cm^(-b_m), but only for a model aiming to reproduce the growth by riming of frozen droplets rather than ice crystals (called rime growth).
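A power law m = a_m D_max^(b_m) of the kind discussed above can be fitted by linear regression in log-log space, as in the sketch below. The least-squares fit in logarithmic space is an assumption made for illustration (the fitting procedure used for the tables is not described in this section), and the data are synthetic.

```python
import numpy as np


def fit_power_law(d_max, mass):
    """Fit mass = a_m * d_max**b_m by least squares in log-log space; returns (a_m, b_m)."""
    slope, intercept = np.polyfit(np.log(d_max), np.log(mass), 1)
    return np.exp(intercept), slope


# Synthetic particles in cgs units that follow m = 0.01 * D_max^2 exactly.
d_max_cm = np.array([0.1, 0.2, 0.4, 0.8])
mass_g = 0.01 * d_max_cm ** 2.0
a_m, b_m = fit_power_law(d_max_cm, mass_g)
print(f"a_m = {a_m:.3f} g cm^-b_m, b_m = {b_m:.2f}")  # a_m = 0.010 g cm^-b_m, b_m = 2.00
```

Stratified relations are obtained by applying the same fit separately to the particles of each hydrometeor class or riming degree interval.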

- Exploit the availability of mass estimates to find and explain the observed relations with size, shape, fall velocity and vertical structure of precipitation.
- Exploit the 3D mass distribution for scattering simulations and remote-sensing applications.

Summary, Conclusions, Outlook
The MASC instrument is a state-of-the-art device to investigate and describe the habits and microphysical properties of solid-phase precipitation particles. Large datasets of triplets of hydrometeor images have been gathered worldwide, with more to be collected during present and future field campaigns. MASC data have already provided noteworthy contributions to studies of snowfall microphysics, and recent algorithms exist to estimate the hydrometeor type, riming degree and volume properties of the particles pictured by the MASC. With one exception (Kleinkort et al., 2017), limited effort has been made so far to exploit the multi-dimensionality of MASC images to retrieve three-dimensional properties of the hydrometeors. We presented here a method, based on machine learning and trained on synthetic data (with verified realistic properties), to retrieve the three-dimensional distribution of mass of individual snowflakes using a triplet of silhouettes as input, corresponding to the MASC images. Unlike the pioneering work of Kleinkort et al. (2017), mass estimation is provided as a key output and not merely shape and volume.
We have conducted a validation of 3D-GAN by means of 3D-printed replicas of realistic snowflakes of known characteristics. Due to technical limitations and the difficulty of handling small and fragile particles, the evaluation is limited to a range of sizes and masses that does not fully overlap that of naturally occurring snowflakes. The mass content is estimated with low bias (10% mean overestimation) and with a normalized RMSE of 40%. Concerning geometrical features, $D_{max}$ is reconstructed with a mean overestimation of 7% (NRMSE 12%), the volume of the convex hull $V_{CH}$ is overestimated by 19% (NRMSE 35%) and the gyration radius $r_g$ by 13% (NRMSE 16%). The evaluation of 3D-GAN reconstructions was also conducted on a voxel-by-voxel basis, after alignment of the reconstructed snowflakes and the original model by means of point cloud alignment (technically called 'registration'). We have additionally shown that, in order to provide results comparable to 3D-GAN with an ellipsoidal approximation, one would need both to achieve the best possible 3D fit and to retrieve the mass of the snowflake exactly: two requirements that are extremely unlikely to be fulfilled using MASC data as input.
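For readers wishing to reproduce this type of score on their own data, the sketch below computes a normalized RMSE and a mean normalized bias. It assumes that the RMSE is normalized by the mean of the reference values and that the bias is the mean of the relative errors; these conventions are an assumption, as the exact definitions are not restated in this section, and the sample values are invented for illustration.

```python
import math

def nrmse(estimated, reference):
    """RMSE divided by the mean of the reference values (assumed convention)."""
    n = len(reference)
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in zip(estimated, reference)) / n)
    return rmse / (sum(reference) / n)

def mean_normalized_bias(estimated, reference):
    """Mean of the relative errors (estimated - reference) / reference."""
    n = len(reference)
    return sum((e - r) / r for e, r in zip(estimated, reference)) / n

# Invented example: every estimate is 10% above its reference value.
reference_mass = [1.0, 2.0, 4.0]
estimated_mass = [1.1, 2.2, 4.4]

score_nrmse = nrmse(estimated_mass, reference_mass)
score_mnb = mean_normalized_bias(estimated_mass, reference_mass)
```

In this example the bias is exactly 10%, while the NRMSE is slightly larger because the absolute errors grow with the reference value.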
The 3D-GAN method still has room for improvement. For example, regarding the input (black-and-white silhouettes), future studies may employ image simulation techniques in order to add the missing textural information (including lights and shadows) to the training set and thus to the input. Although we are aware that this is not a straightforward step, it would allow one to fully exploit the MASC data. Our evaluation highlighted a positive bias for $r_g$, $V_{CH}$ and $D_{max}$, suggesting that 3D-GAN could be improved in terms of particle compactness. When it becomes feasible to 3D-print, at lower cost, a large number of snowflakes at a fine resolution (at least the 40 µm voxels used by the model presented here), it will be of interest to extend the validation to a larger and more varied sample.
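To make the link between gyration radius and compactness concrete, the sketch below computes a mass-weighted radius of gyration from a list of voxels. It assumes the standard definition $r_g = \sqrt{\sum_i m_i |\mathbf{r}_i - \mathbf{r}_{cm}|^2 / \sum_i m_i}$; the function name and the two toy particles are hypothetical and serve only as illustration.

```python
import math

def gyration_radius(voxels):
    """Mass-weighted radius of gyration of (x, y, z, mass) voxels."""
    voxels = list(voxels)
    total_mass = sum(m for _, _, _, m in voxels)
    # Center of mass of the particle.
    cx = sum(x * m for x, _, _, m in voxels) / total_mass
    cy = sum(y * m for _, y, _, m in voxels) / total_mass
    cz = sum(z * m for _, _, z, m in voxels) / total_mass
    # Mass-weighted mean squared distance from the center of mass.
    second_moment = sum(
        m * ((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2)
        for x, y, z, m in voxels
    )
    return math.sqrt(second_moment / total_mass)

# Two toy particles with the same total mass but different spatial spread.
compact = [(-1.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 1.0)]
spread = [(-2.0, 0.0, 0.0, 1.0), (2.0, 0.0, 0.0, 1.0)]

rg_compact = gyration_radius(compact)
rg_spread = gyration_radius(spread)
```

For equal mass, the more spread-out particle has the larger gyration radius, so a positive bias in $r_g$ is consistent with reconstructions that are less compact than the reference.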
We have shown some examples of application of the novel method, by combining the retrieved mass with dimensional information as well as hydrometeor type and riming degree and then fitting the coefficients of mass-to-size power laws.
The 3D-GAN code is available at: https://github.com/jleinonen/masc3dgan. The code and data supporting the evaluation of the performance of 3D-GAN, including the models and shapefiles of the replica snowflakes, are published and available at 10.5281/zenodo.4790962 (Grazioli et al., 2021). Raw or processed MASC data for any campaign mentioned in the paper, as well as MATLAB code to pre-process the data according to the method of Praz et al. (2017), are available upon request from the authors.
Appendix A: Approximation of geometrical features
As discussed in the manuscript, the GAN output consists of a three-dimensional distribution of mass. These values vary smoothly and do not generate a clear cutoff at the edges of the reconstructed snowflakes, artificially expanding their apparent size. For practical purposes, quantities such as $D_{max}$ or the volume of the convex hull may be of interest and thus we propose a simple but conceptually sound method to define the geometrical extent of each snowflake by means of an adaptive minimum density threshold.

The goal is to obtain, for each individual snowflake, an optimal density threshold $\rho_{th}^{opt}$ [kg m$^{-3}$] such that only voxels having $\rho \geq \rho_{th}^{opt}$ are used to define the spatial extent of the particle. Let $P$ be the three-dimensional matrix defining the density of each voxel of a given snowflake. As the maximum density can vary from one snowflake to another, $P$ is normalized between 0 and 1. Let $P_{th}$ be the same matrix, censored by zeroing the voxels having a density lower than an arbitrary threshold $\rho_{th}$. Two scalar quantities can be defined, given $P$ and $\rho_{th}$:
1. $\bar{\rho}_{th}$, the mean density of the non-zero voxels.
2. $m_{th}/m_0$, the residual total mass of the censored matrix with respect to the uncensored total mass.
The first scalar quantity increases with increasing threshold levels (as voxels of low density are removed), while the second one decreases. Let us then multiply the two scalars and define a simple weight $W^*$ as a trade-off between increase of mean density (and compactness), which we want to reward, and the associated loss of mass, which should be penalized.
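A minimal sketch of this weight, scanned over a set of candidate thresholds and evaluated at its maximum, is given below. It assumes voxels of equal volume (so that mass is proportional to density), a flat list of densities and a regular grid of 100 candidate thresholds; the function name and these choices are illustrative assumptions and do not reproduce the published implementation.

```python
def optimal_threshold(densities, n_levels=100):
    """Return (threshold in input units, maximum weight) for one particle."""
    peak = max(densities)
    norm = [d / peak for d in densities]      # normalize densities to [0, 1]
    m0 = sum(norm)                            # uncensored total mass (equal voxel volume)
    best_thr, best_w = 0.0, -1.0
    for k in range(n_levels):
        thr = k / n_levels
        # Censor: keep only non-zero voxels at or above the threshold.
        kept = [d for d in norm if d >= thr and d > 0.0]
        if not kept:
            break
        mean_density = sum(kept) / len(kept)  # mean density of non-zero voxels
        residual_mass = sum(kept) / m0        # m_th / m_0
        w = mean_density * residual_mass      # weight: product of the two scalars
        if w > best_w:
            best_thr, best_w = thr, w
    return best_thr * peak, best_w

# Toy particle: a dense core of 10 voxels surrounded by a faint halo of 40 voxels.
toy_densities = [1.0] * 10 + [0.05] * 40
thr_opt, w_max = optimal_threshold(toy_densities)
```

In this toy case the weight is maximized just above the halo density: removing the halo raises the mean density from 0.24 to 1 while sacrificing only one sixth of the mass.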
The evolution of $W^*$ as a function of the threshold level behaves as illustrated in Fig. A1. We choose an optimal threshold corresponding to the location of the maximum of $W^*$ in order to balance the two errors. The final threshold is then applied to define the spatial extent assigned to the snowflake. In this work we used this approach to be able to evaluate the GAN output against the 3D-printed replicas, exclusively for quantities such as $D_{max}$ or the convex hull volume. The total mass as well as the gyration [...]

[...] overlap the 25-75% percentile range. An artificial horizontal displacement is added to the data series to enhance readability. The reference method (Ellipsoid) corresponds to an optimal ellipsoidal fit of the reference snowflake with perfect mass match, as described in the text.

[...] Tables 4 and 5. Left: data color-coded according to the riming index $R_c$ of Praz et al. (2017). Right: data color-coded according to the hydrometeor classes, also of Praz et al. (2017). Different curves of the same color correspond to different field campaigns.

Figure A1. Example, for a reconstructed snowflake, of the distribution of the weight $W^*$ as a function of the threshold on the voxel density content $\rho$. The weight is displayed here as normalized between 0 and 1. The threshold maximizing $W^*$ is considered as optimal and it is used to censor the data to calculate geometric descriptors. A maximum could always be obtained for all the reconstructed snowflakes.

Table 4. Values of the parameters of the relation $m = a_m D_{max}^{b_m}$ estimated on the datasets of different field campaigns for various degrees of riming. $m$ is estimated with 3D-GAN, while $R_c$ is the normalized riming index as in Praz et al. (2017), averaged over the three camera views. $D_{max}$ is the maximum dimension obtained from the triplet of images of the MASC.

Table 5. Values of the parameters of the relation $m = a_m D_{max}^{b_m}$ estimated on the datasets of different field campaigns for various hydrometeor types. $m$ is estimated with 3D-GAN, while the hydrometeor type is obtained with the method of Praz et al. (2017). $D_{max}$ is the maximum dimension obtained from the triplet of images of the MASC.