Monte Carlo Source Convergence Acceleration by Hybrid Multigroup and Continuous Energy Neutron Transport

Abstract A novel source convergence acceleration method for Monte Carlo eigenvalue calculations is proposed in this paper. The method consists of simulating the bulk of the inactive cycles with online-generated multigroup cross sections. Then the active cycles are simulated with continuous-energy cross sections to preserve full fidelity. The method was implemented in the Monte Carlo code SCONE and tested on several three-dimensional full-length assembly models. In some cases, the same multigroup cross sections were used for several spatially separated materials in order to limit statistical uncertainties. The method was shown to accelerate calculations by a factor of 2.5 to 5 at the cost of a slightly increased standard deviation in the flux distribution estimated across several independent simulations. The memory usage due to storing multigroup cross sections does not seem to be prohibitive for practical applications.


I. INTRODUCTION
Monte Carlo is one of the most popular neutron transport methods. Its most prominent features are being high fidelity but relatively computationally expensive. Historically, its main use was to perform reference calculations for code-to-code benchmarks, although with more available computing power it is seeing increasingly routine use across industry and academia for reactor analysis and design. In order to extend the range of problems to which it can conveniently be applied, several methods to speed up Monte Carlo transport simulations are being investigated.
A subclass of acceleration strategies is source convergence acceleration. Monte Carlo criticality calculations use a convergence method analogous to the power iteration method used by deterministic methods. 1 Starting from a guess neutron fission source and fundamental eigenvalue, a transport calculation is performed in order to obtain the next step's fission source and eigenvalue. The process is repeated until convergence is reached by some chosen measure. Since the first steps, called inactive cycles in Monte Carlo, produce biased estimates due to the unconverged fission source, they are normally excluded from the final results. The cycles that actually contribute to the collection of statistics to calculate the final results are called active.
In many criticality safety models, e.g., metallic spheres, the dominance ratio is low and convergence to the fundamental fission source is achieved very rapidly. As such, most of the computational time is spent on accumulating statistics during the active cycles. However, in certain cases the inactive cycles can take the largest fraction of the total computational time. In problems with a high dominance ratio, such as full-length three-dimensional (3-D) reactor models, it might take several hundreds of cycles before the source converges as measured by the Shannon entropy. 2,3 As an example, a fullcore pressurized water reactor (PWR) can have a dominance ratio larger than 0.99; the 3-D BEAVRS reactor's dominance ratio is estimated to be 0.996 (Ref. 3). At the same time, the same models might be subject to issues such as clustering and correlation effects, which are partially controlled by increasing the neutron population. 4 If the number of neutrons per cycle is large enough, the number of active cycles needed to achieve acceptable statistical uncertainties can be relatively low. However, the number of iterations needed for the problem to converge is independent of the neutron population. Therefore, it might be the case that the number of inactive cycles needed largely exceeds the number of active cycles, thus becoming extremely computationally expensive when large populations are necessary.
To obviate this problem, several methods to accelerate source convergence have been devised. Some of these achieve convergence acceleration by reducing the number of full-fidelity Monte Carlo inactive cycles needed. This is possible if a good fission source guess, produced by a cheaper transport or diffusion solver, is provided. Typically, the cheaper solver is a deterministic method. This is the case, for example, of response matrix acceleration 5 developed within the Monte Carlo code Serpent. 6 In this method, a slightly modified Monte Carlo inactive cycle is used to generate coefficients for a deterministic response matrix solver that can provide a good fission source guess. The method was tested on a 3-D full-reactor model and accelerated source convergence considerably. The main limitation is that, in order for the deterministic solver to converge avoiding instabilities, good coarse and fine spatial meshes must be provided by the user. However, choosing the optimal mesh is nontrivial and might require a trial and error approach.
Another method is coarse mesh finite difference (CMFD) acceleration. CMFD is commonly used to accelerate deterministic methods, while it was first applied to Monte Carlo transport by Ref. 7. Given a coarse mesh, the coefficients for a multigroup (MG) diffusion calculation can be obtained during one or several Monte Carlo inactive cycles. By solving a diffusion equation that is forced to be consistent with the transport solution via a correction factor, the weights of the Monte Carlo source neutrons for the next cycle can be modified to match the diffusion solution. This is repeated until convergence is reached. CMFD has been shown to accelerate convergence substantially, reducing the inactive cycles required by a factor of at least 3 in large reactor problems. 3 One of the difficulties of CMFD is that it might be affected by numerical instabilities. 8 This can be avoided by choosing the coarse mesh parameters wisely. A similar, but more uniquely Monte Carlo approach, is fission matrix acceleration. 9 This method also modifies particle weights between inactive cycles, but it is informed by estimating the discretized Green's function for fission neutrons born in different regions of the problem at hand.
A different category of convergence acceleration strategies aims at making the inactive cycles less expensive, rather than reducing the number of inactive cycles needed. An example of such methods is particle ramping, 10 where the neutron population is initially low and it is gradually increased as the inactive cycles proceed. While this method can surely accelerate the simulation, it may be prone to negative correlation effects due to the whole neutron population originating from a small number of families.
This work proposes a novel source convergence acceleration method for reactor calculations that falls under the latter category. Its goal is to speed up the inactive cycles by using MG data instead of continuousenergy (CE) data. Due to reduced memory access, the use of MG data is known to substantially accelerate Monte Carlo neutron transport. 11 This method was introduced in Ref. 12, and it is expanded here. The rest of this paper is structured as follows. The methodology is described in Sec. II, the numerical investigations carried out and the results found are presented in Sec. III, and the conclusions are drawn in Sec. IV.

II. METHODOLOGY
One of the computational bottlenecks of Monte Carlo particle transport codes is appreciated to be the retrieval of CE cross sections. 13 CE cross sections may lie on energy grids, including up to 1 million points, which makes the grid search process expensive. Additionally, this large amount of data must be stored in a deep layer of the computer memory. Accessing these data in the memory is an expensive operation that must be performed several times per each neutron track during a transport simulation. However, this bottleneck can be alleviated by using MG nuclear data. MG nuclear data normally lies on much coarser energy grids that can be kept closer in memory, offering a large speedup at the cost of lower fidelity. MG cross sections are materialwise rather than nuclidewise and treat scattering via scattering matrices rather than scattering laws. Data are more compact and faster to process: acceleration factors of 5/6 can be expected when transitioning from CE to MG.

MONTE CARLO SOURCE CONVERGENCE ACCELERATION · RAFFUZZI et al. 365
The methodology applied in this paper is an extension of Ref. 12. The approach aims at accelerating fission source convergence by simulating most of the inactive cycles with MG cross sections. The same group structure was used per each set of cross sections. This algorithm is implemented in the Monte Carlo code SCONE (Ref. 14). SCONE has the unique feature of allowing the use of different cross-section representations simultaneously or in the same calculation. In Ref. 12, the MG cross sections were pregenerated via a Serpent run and fed into SCONE for simulating the bulk of the inactive cycles. Only a small number of final inactive cycles was simulated with CE cross sections in order to ensure convergence to the correct fission source. The transition from one cross-section type to the other was possible thanks to the implementation of a routine that can switch from MG to CE nuclear data from one cycle to the other. This operation requires resampling the energy of the particles contained in the neutron bank for the following cycle from the correct CE fission spectrum. This was done by sampling one of the fissile nuclides in the material each particle is in based on their fission cross section at 1 eV; then a neutron energy from the thermal fission spectrum of the nuclide selected is sampled. The energy used to sample the fissioning nuclide is arbitrary and could be changed. However, in this case, given that the method was tested on light water reactors (LWRs), 1 eV seemed an appropriate value. Although the choice of a fixed energy introduces an approximation, the error added is expected to be negligible and to disappear after a single or very few CE cycles. Afterward, the active cycles are simulated with CE cross sections to obtain full-fidelity results.
In this work, the main difference with respect to Ref. 12 is that MG cross sections are generated online during the first few inactive cycles. This is possible because the fission source does not need to be converged for efficient cross-section generation. This feature allows for preparing and performing a MG calculation in only one step, removing the need for the preliminary Serpent calculation used in the preceding work. A similar approach is applied, for example, when using CMFD acceleration. 3 To summarize, the new calculation routine consists of the following steps: 1. Running some CE inactive cycles for MG crosssection generation.
2. Initializing the MG materials with the calculated cross sections.
3. Resampling the energy of the fission source neutrons from a MG spectrum. 4. Running a certain number of MG cycles. 5. Resampling the energy of the fission source neutrons from a CE spectrum. 6. Running the remainder of the inactive cycles and all of the active cycles with CE data.
To allow this, new features had to be developed in SCONE. The capability to generate MG cross sections was introduced, as well as a subroutine to transition from CE to MG nuclear data at the end of a cycle. Other than the usual population parameters, such as number of active and inactive cycles and neutron population per cycle, other parameters can now be varied by the user. These are the number of cycles dedicated to cross-section generation and the number of final CE inactive cycles. The next section describes the MG cross-section generation procedure implemented in SCONE.

II.A. Cross-Section Generation
The MG cross sections need to account for energy and spatial self-shielding effects. One of the advantages of generating them with Monte Carlo is that selfshielding is taken into account automatically and approximate treatments like equivalence theory are not necessary. In SCONE, the cross-section generation methodology was inspired by Refs. 15 and 16. A new tally type was implemented to introduce the cross-section generation capability. Given a user-defined energy group structure, the MG cross section in group g for a generic reaction x, Σ x;g , is calculated as described by Eq. (1) per each material or group of materials: where Σ x and ϕ are the CE cross section and scalar neutron flux, respectively. The average number of neutrons per fission ν g is calculated as shown in Eq. (2): The scattering matrices are normally calculated from the scattering double-differential cross sections Σ s ðμ; E; E 0 Þ, as shown in Eqs. (3) and (4) where P l is the l'th Legendre polynomial, and Σ s;l;g;g 0 is the corresponding scattering matrix. However, similarly to MCNP and Serpent, SCONE reads nuclear data from ACE files, which do not include such cross sections. An alternative method, proposed in Ref. 15, consists of calculating the probability of scattering from group g to g 0 , P g;g 0 , with an analog estimator. This can then be used to estimate the scattering matrices as in Eqs. (5) and (6), which refer to Legendre polynomial orders 0 and 1, respectively: and Σ s;1;g;g 0 ¼ P g;g 0 Σ s;g μ ; ð6Þ where the average scattering angle cosine μ is also estimated with an analogue estimator. For this work, only the P 0 and P 1 scattering matrices were generated since SCONE cannot read higher anisotropy moments. However, the P 1 scattering representation was assumed to be sufficient for the models simulated in this work, given they are LWR problems and as such do not present strong anisotropy. Alternatively, transportcorrected P 0 scattering would also be suitable to treat higher anisotropy, provided the transport-corrected cross sections were appropriately generated. This approach would be preferable, since it would also reduce memory consumption.
An analogue estimator was used also for the MG fission neutron spectrum χ g : The cross sections generated by SCONE were compared to those produced by Serpent for the problems investigated, and they agree well within uncertainties.

III. NUMERICAL INVESTIGATIONS
The goal of this work is to demonstrate whether the proposed method can accelerate simulations while maintaining full fidelity. There are a few parameters that can be varied to achieve this: the number of crosssection generation cycles, the number of energy groups used, the number of unique MG material regions, and the number of final CE inactive cycles. It is desirable to minimize the number of CE inactive cycles used to obtain the maximum speedup possible. At the same time, however, one must ensure that full convergence is achieved to get unbiased scores during the active cycles. For this, using a finer group structure might be useful. Conveniently, the time taken by MG transport is basically unaffected by the group structure; this was seen in the current work varying the number of groups from 7 to 69. However, the finer the group structure, the larger the number of particles needed to produce reliable cross sections given the statistical fluctuations. The use of more groups would thus result in more initial CE cycles, but it might allow for reducing the number of final CE cycles. On the contrary, using a smaller group structure might allow for reducing the number of initial CE cycles. However, in some cases this can come at the cost of a disproportionately larger amount of final CE cycles. This is the case because few-group structures might not capture all the relevant spectral effects and can converge to a different fission source. Therefore, final CE inactive cycles are needed to correct the source distribution. However, correcting the fission source after wrong convergence might take a considerable number of CE cycles. This is especially the case in slow converging problems, i.e., problems with a high dominance ratio. Similar concerns apply for the number of unique material regions for which cross sections may be generated; one can more accurately capture spatial spectral variations by generating MG cross sections in many small regions, but this would also entail increased variance.
The parameters chosen clearly depend on the number of neutrons simulated per cycle. For instance, a larger neutron population would reduce the number of crosssection generation cycles needed, but increase the weight of the inactive cycles on the total computational time. On the other hand, the lower bound for the neutron population is determined by correlation effects, often present in high dominance ratio problems, such as large 3-D reactors. Following these considerations, the method proposed was tested on several models and a sensitivity study on all parameters was performed.

MONTE CARLO SOURCE CONVERGENCE ACCELERATION · RAFFUZZI et al. 367
In order to test the proposed acceleration method, three 3-D light water assembly/minireactor models were simulated. These are a fresh-fuel PWR assembly, a CE version of the C5G7 benchmark model, and a PWR assembly with burnt fuel composition. Since SCONE does not support parallel calculations yet, a 3-D fullcore model was deemed too computationally expensive to simulate for this paper. However, these models were chosen because they have different challenging features. The fresh PWR assembly contains heavy absorbers in the form of gadolinium-bearing pins; the C5G7 introduces a mixed-oxide (MOX) and uranium-oxide reflector interface with large spectral differences; and the burnt PWR assembly allowed for testing the method on a burnt problem, where each fuel pin has a different material composition.
For each model, different population parameters were used. Generally, a large number of inactive cycles was chosen to guarantee full convergence. For the BEAVRS benchmark, which has a relatively high dominance ratio, 200 inactive cycles were sufficient for convergence without any source convergence acceleration. 3 In order to be conservative, at least 300 inactive cycles were used for the models simulated in this paper. At the same time, a relatively large neutron population was necessary to limit high statistical fluctuations, as well as correlation effects. 17 For the same reasons, all results were averaged over eight independent runs. On particle numbers, it seems common in literature to use on the order of hundreds of millions of active neutron histories to simulate a full 3-D PWR core. 18,19 As a consequence, 160 million total histories seem appropriate for a single or few 3-D PWR assemblies. The number of active cycles was chosen to be low, since it was previously demonstrated that lowering the number of neutrons per cycle (keeping the number of total neutron histories constant) is effective in reducing variance while using simulation time economically. This is due to intergenerational correlation; increasing the number of active cycles does not reduce the true standard deviation of estimates according to the ideal 1 ffi ffi ffi N p trend. 3,20 The state of convergence was monitored through Shannon entropy, 2 as well as through the final axial and radial flux profiles. All simulations were run on a single Intel® Xeon® Gold 6130 2.10 GHz CPU core, without parallelization.

III.A. Full-Length Fresh Assembly
The first model tested is a 3.6-m-long PWR assembly with gadolinium pins and water-filled guide tubes. The fuel is 4.2% enriched UO 2 ; gadolinium-bearing pins are made of 2.7% enriched UO 2 , with 6.6 mole % of Gd 2 O 3 , where gadolinium has a natural isotopic composition. The model radial configuration is shown in Fig. 1. Axially, a realistic water density profile was introduced by dividing the moderator material into 10 axial nodes. The density profile is reported in Table I, where the density of node 10 corresponds to 0.75425 g/cm 3 . In order to capture axial spectral variations, the fissile materials also were split into 10 axial nodes for cross-section generation. Radially, in a given axial node, all UO 2 pins have the same MG cross sections, as have the gadoliniumbearing pins. The boundary conditions applied are radially reflective and axially vacuum. For this model, 1 million particles per cycle, 400 inactive cycles, and 20 active cycles were simulated. While a large number of inactive cycles were utilized to ensure convergence, few active cycles were enough to get good statistical accuracy given the relatively high neutron population.
An identical model was simulated to test the method in Ref. 12, which is identical to the method presented in this paper with the exception that cross sections were precalculated by a Serpent run. In that work, the Shannon entropy was calculated axially over 20, 50, and 100 axial bins, as well as radially with a pin-by-pin discretization. The result was that, even though in some cases the final axial flux profile differed up to 4%, the Shannon entropies calculated with the different axial

368
RAFFUZZI et al. · MONTE CARLO SOURCE CONVERGENCE ACCELERATION discretizations agreed perfectly. This implies that in these cases the MG calculation converged to a slightly incorrect fission source, biasing the final flux distribution, and this was not captured in the Shannon entropy measurement. When different population parameters were applied, the axial flux profile improved considerably, while the Shannon entropy did not change. Given these observations, in this work Shannon entropy is treated as an approximate convergence measure. Whether the MGaccelerated calculation converged properly, thus ensuring full-fidelity results, will be determined instead by comparing the final flux profiles. Additionally, Ref. 12 showed that the pin-by-pin radial entropy converges within five inactive cycles for this model, and the pin-by-pin radial flux error was below or equal to 0.1% for all cases considered. This was the case also when the axial flux profile was not satisfactory. Following these considerations, radial entropy and flux measurements were not scored for this model.
Some results for the PWR assembly are shown next. The results from Ref. 12 were used as a starting point to choose the simulation parameters. In that work, for the same model, seven groups with 25 final CE inactive cycles produced satisfactory results. When only 10 final CE cycles were used, the flux deviation reached up to 4%. Here, several group structures were used; these are the CASMO7 −12 and −23 energy grids optimized for LWRs. For this group structure comparison, all cases included 20 cross-section generation cycles and 20 final CE cycles. The Shannon entropy, calculated over 20 axial bins, is shown in Fig. 2. The entropy looks fully converged for all cases, and using a finer group structure seems to slightly improve the entropy value when compared with the CE reference. Figure 3 shows the Shannon entropy as a function of run time rather than cycle number. The solid and dashed gray lines in the plot represent, respectively, the mean and standard deviation of the converged CE entropy calculated over the final 100 cycles. The 23-group case entropy is within one standard deviation of the reference one, while this is not the case for the 7-and 12-group cases. Interestingly, the 7-group entropy seems to be substantially offset from the reference value until approximately cycle 380, where the calculation is switched back to CE and the entropy changes trend. Using a finer group structure does not impact the computational time noticeably. When simulating 400 inactive cycles, the MG inactive cycles took approximately 70% less time to conclude than the CE ones. However, all  Table II shows the cross-section uncertainties for the three different group structures when 20 CE generation cycles are used. An additional case with 23 groups and 10 CE generation cycles was also added. For all cases, a large difference between minimum and maximum uncertainty, reflected in large standard deviations, is present. A closer analysis showed that uncertainties are maximum on the materials at the top and bottom of the assemblies and minimum at the center. The large difference between minimum and maximum is thus justified by the presence of statistically insignificant regions. The average for capture and fission cross sections does not change heavily with the group structure. The expected 1 ffi ffi ffi N p trend on the average uncertainty, in particular, was not observed. This is probably due to correlation effects; it was previously observed by Ref. 3 that tallies do not follow the well-known 1 ffi ffi ffi N p behavior in problems with a high dominance ratio because of the high coupling between successive generations of neutrons. Uncertainties are significant for the P 0 matrix, whose dimensions are N � N. The average uncertainty in P 0 increases considerably when increasing the group number from 12 to 23. Using 10 cycles instead of 20, however, does not produce a considerable difference. The results for the 23-group case with 10 cross-section generation cycles are shown later in this section. Figures 4 and 5 show the relative difference between the entropies of the CE and MG cases when changing the number of generation cycles and number of energy groups, respectively. This is displayed for the last 100 inactive cycles. The error bars represent the standard deviation of the entropy between the eight independent runs. Figure 4 shows results for the 7-group structure. An attempt to maximize speedup was made by reducing the number of cross-section generation cycles. Even if the average value is not affected, decreasing the number of cross-section generation cycles increases the entropy variance. Although the large variance of the 5-cycle case suggests that the results are within the uncertainty of the reference, the reduced uncertainty of the 20-cycle case shows that using seven groups leads to a small systematic bias. In order to approach a more accurate source distribution, a finer group structure should be used, as shown in Fig. 5. Surprisingly, the standard deviations for all cases are similar. Evidently, the higher uncertainties in the scattering matrix of the 23-group case do not significantly affect the axial source profile. Figures 4 and 5 are intended to show the effects of changing some simulation parameters on the mean and the variance of the Shannon entropy. Even if some trends can be seen by varying certain parameters, all results are within 0.2% of the reference Shannon entropy. Therefore, whether a substantial systematic bias was introduced or not by the MG cycles will be determined by examining the final flux distributions. Figure 6 shows the axial flux profile calculated over 20 axial bins. The relative difference between different cases can be observed in Figs. 7 and 8, where the error bars represent the uncertainty resulting from the standard deviation between independent runs. The blue area in the plots represents the standard deviation between independent CE runs as a term of comparison. In Fig. 7, the flux difference is compared for the three energy grids used. For all cases, both the mean errors and standard deviations increase at the extremities of the assembly, where the flux magnitude is lower and MG cross sections have extremely high uncertainties. Even if all fluxes agree within uncertainties, the average difference gets smaller when the number of groups is increased. In the 23-group case, the average difference is below 1.5%, and it is within the CE standard deviation band. At the same time, the uncertainties decrease when increasing the number of groups. Given that using 23 groups rather than 7 or 12 does not produce a significant time overhead, the CASMO23 group structure seems an appropriate choice for this model.
Reducing the total number of CE inactive cycles used in a simulation should improve the speedup. This can be achieved by reducing either the number of cross-section generation cycles or the number of final CE inactive cycles. In Fig. 8, several combinations were tried. The combinations 20 + 10 and 10 + 20 perform equally well; however, reducing the number of final CE inactive cycles increases the standard deviation between independent runs more than reducing the cross-section generation

III.B. C5G7 Benchmark
The second model tested is a 3-D, 2 � 2 assembly configuration based on the C5G7 benchmark. 21 The model is made of two UO 2 and two MOX assemblies surrounded by a thick water reflector. The MOX assemblies are made of pins with three different transuranic enrichments: 4.3%, 7.0%, and 8.7%. The fuel materials were homogenized with Zircaloy and aluminum claddings. The resulting pin radius is 0.54 cm. The fuel material densities are shown in Table III. The radial configuration of the model can be seen in Fig. 9. Water guide tubes are present between the fuel pins. Axially, the assemblies are 385.56 cm long, and a 21.42-cm water reflector is added both at the top and at the bottom. The following boundary conditions are applied: radially, the boundaries adjacent to the assemblies were reflective while the others were vacuum. Axially, the top and bottom boundaries were vacuum. Additionally, a water axial profile, not present in the original benchmark, was introduced by splitting the water into 10 axial materials. The profile is identical to the one presented in Table I, where the MOX materials are named after their respective transuranic enrichment. In this case, the density of node 10 is 1.0 g/cm 3 . All water materials also include 500 ppm of boron. In order to capture spectral variations, also all the fuel materials are split into 10 axial nodes for MG cross-section generation. All the materials in the benchmark use cross-section libraries at a temperature of 300 K. It is worth noting that in this case water is considerably denser than in the PWR assembly. The high water density decreases the neutron mean free path, making the reactor strongly decoupled. Since neutrons take a long time to travel from one end of the assembly to the other, convergence is extremely slow. In this case, 800 inactive cycles were run to ensure full convergence. The neutron population used was 1 million neutrons per cycle, with 20 active cycles.
Shannon entropy was measured separately axially and radially, using 20 axial bins and a pin-by-pin subdivision, respectively. These results can be seen in Figs. 10 and 11. In this case, the best group structure found in the previous case was used. This consists of the CASMO23 energy grid with 20 cross-section generation cycles; in order to ensure convergence in this extremely high dominance ratio problem, 30 final CE cycles were added. The radial entropy converges after less than 50 cycles. The fast radial convergence is due to the relatively small radial dimensions. On the other hand, the axial entropy takes approximately 500 cycles to converge. Applying the same reasoning as in the previous case, it can be concluded that the MG case converges approximately three times faster than the reference one.
The cross-section uncertainties when 20 CE generation cycles are used are shown in Table IV. Like in the previous model, even if the average uncertainties observed are unacceptably large, the error is much smaller in statistically relevant regions. This is reflected in the results shown, which seem to imply that no bias is added as a result of the MG cycles. Figure 12 shows the axial flux profile, while Fig. 13 shows the relative error between the reference and the MG cases. The blue area in the plot represents the standard deviation between independent CE runs. In Fig. 13, an attempt to reduce the number of final CE cycles was made. All average errors are below 2% and within the standard deviation of the reference runs. However, even in the case with the largest number of CE cycles, the MG standard deviation is larger than the CE one. Decreasing the number of final CE inactive cycles increases the standard deviation even further. Potentially, reducing the number of cross section generation cycles could be attempted as well. However, this has not been included in this study.
The reference radial flux is shown in Fig. 14, and the relative error between CE and MG is shown in Fig. 15. The MG case chosen was the one with 23 groups and the lowest number of CE cycles, i.e., 20 + 10. Once again, the radial features of the model can be reproduced more accurately than the axial ones. All pin errors are below 1%, and the average error is 0.15%.

RAFFUZZI et al. · MONTE CARLO SOURCE CONVERGENCE ACCELERATION
Initially, the same simulation strategy as for the other models was chosen. This includes generating a different set of cross sections per each material in the geometry. In this case, given the substantial computational cost of running a model with many nuclides, 400 000 neutrons per cycle were also used, along with 300 inactive cycles and 30 active cycles. The results produced are shown in Fig. 17. The blue and red bands in the plots represent the standard deviation between reference CE runs with, respectively, 400 000 and 1 million neutrons per cycle. Initially, the CASMO23 group structure, successful for the previous models, was used. However, this produced an error around 4% with a standard deviation also around 4%. Increasing the number of groups and adopting the CASMO40 group structure improved the results, while using the WIMS69 group structure produced an intermediate error. Increasing the number of final CE inactive cycles improved the results only slightly, thus they are not shown here.
Given the large number of materials and nuclides present in this model, memory consumption might also become a concern. This is especially the case when relatively fine group structures are used, like the WIMS69 grid. Since this is the most memoryintensive model simulated in this paper, memory utilization was measured for this model only. The memory consumption was found to be 217 MB in the reference CE case, 254 MB when 23 groups were used, and 431 MB when 69 groups were used. The memory usage almost doubled when 69 group cross sections were generated due to the large scattering matrices, indicating that this method could easily become memory intensive in certain cases. However, the memory increase was more modest when only 23 groups were used. In order to try to reduce the correlation effects that originated from the lower neutron population used and were possibly responsible for the large errors, a population of 1 million neutrons per cycle was tried as well. The results are shown in Fig. 18. Increasing the neutron population improved Fig. 16. Portion of burnt PWR assembly adopted for 1/8 symmetry.   the flux difference and reduced the standard deviation considerably. Even if the flux error is now consistently below 2%, it is still outside the CE standard deviation bands. The high error observed was found to be caused by the extremely high cross-section uncertainties presented in Table V. Also in this case, 20 CE generation cycles were used. Contrary to the other cases, capture and fission cross-section uncertainties now differ more significantly. In particular, it is interesting to observe the minimum uncertainty values found. While this is around 1% for capture, it is larger than 10% for fission. While capture includes both water and fuel materials, fission only refers to the fuel. It is clear that water cross sections have reasonably low uncertainties, but this is not the case for the fuel. While in previous models there were 20 to 40 fuel materials, this case includes 390 fuel materials. Therefore, to compensate, at least 10 times more particles or cross-section generation cycles should be employed. Using 10 times more generation cycles, however, would basically take the entirety of the inactive cycles. Additionally, it was previously mentioned that tallies do not follow a 1 ffi ffi ffi N p behavior in high correlation models. Therefore, more than 10 times more cycles would probably be needed. While, for the same reason, there is no clear trend between the uncertainties relative to 23, 40, and 69 groups, the 7-group structure has obviously lower uncertainties. However, the 7-group structure was found to be unable to provide proper convergence for a PWR assembly model with gadolinium pins. Due to these obvious limitations, another approach was attempted.

III.C.1. Material Grouping
From the previous results, it was found that a burnt problem could not be simulated in the same manner as done so far. In order to assure sufficient statistics to generate reliable cross sections, almost the entirety of the inactive cycles would have to be simulated with CE, defeating the goal of the method. As a consequence, the same cross sections were used for similar materials during the MG cycles. A single set of cross sections was generated for all the gadolinium-free fuel materials in each axial slice and for the gadolinium pins. The other materials were treated identically as before. This approach aims to reduce the uncertainties in crosssection generation, and it fulfills the purpose of reducing memory consumption as well. When using 23 groups, in this case the memory usage amounts to 220 MB rather than the previous 254 MB, while the reference CE case occupies 217 MB.
In the following calculation, parameters including 23 groups with 20 cross-section generation cycles and 20 final CE cycles, successful in modeling the fresh PWR assembly, were used. The new axial and radial Shannon entropies are presented in Figs. 19 and 20. The axial entropy was calculated over 20 axial bins, and the radial one with a pin-by-pin discretization. Different neutron populations per cycle were used. The axial Shannon entropy converges in 250 cycles, while the radial entropy takes approximately 50 cycles. When only 100 000 particles are used, both entropies are extremely noisy and the radial entropy in the concluding CE cycles appears to converge to an incorrect value. However, this does not happen when larger neutron populations are used.  Figure 21 shows the axial reference entropy and the one calculated with 400 000 particles per cycle as a function of computational time. Contrary to the previous models, the inactive cycle speedup reaches a factor of approximately 4 or 5. Each MG cycle is nine times faster than each CE cycle. This is to be expected due to the large number of different materials and nuclides present in the geometry that largely slow down the CE calculation due to increased cross-section lookups. The net advantage of MG cross sections is therefore greater in this case.
The axial flux profiles for the cases considered are shown in Fig. 22, while the relative error can be seen in Fig. 23. The average errors of all cases are generally within the standard deviation of the CE runs. The only exceptions are the 100 000-and 1 million-particle cases, which reach an average error of up to 2% at the bottom of the assembly. On the other hand, the error for the 400 000-particle case is below 1% in the same region. This seems to imply that the increased error when increasing the population is purely due to noise. While the standard deviation reaches up to 8% when only 100 000 particles per cycle are used, in the other cases it is strongly reduced. However, it is consistently larger than the CE standard deviation.   Figure 24 shows the reference axially integrated radial flux profile, while Fig. 25 shows the error produced by the 400 000-particle case. Once again, the radial flux has a much lower error compared to the axial one. The MG flux is almost consistently within 0.1% of the reference one.

IV. SUMMARY AND CONCLUSIONS
In this paper, a novel source convergence acceleration method for Monte Carlo transport criticality calculations is proposed. The method consists of simulating most of the inactive cycles with MG cross sections. The MG cross sections can be generated at the beginning of the calculation during a few CE inactive cycles. In most cases, the MG inactive cycles converge to a slightly different fission source compared to the CE full-fidelity one. Therefore, in order to ensure proper convergence, a certain number of final CE inactive cycles can also be added. Then all the active cycles are simulated with CE as is conventional practice. This method was devised to address the problem of slow convergence in high dominance ratio reactor models.
This method was tested on three 3-D light water assembly/minireactor models. These were a PWR assembly with fresh fuel, gadolinium-bearing pins, and an axial water density profile; a CE version of the deterministic benchmark C5G7, and a PWR assembly identical to the previous one but with fuel burnt up to 10 MWd/kg. These models were chosen for their challenging features-the presence of gadolinium, MOX fuel, and burnt fuel, as well as an axial water density profile-that are meant to introduce challenging spectral effects present in realistic scenarios. For these models, Shannon entropy was used as an approximate convergence indicator, while the ultimate measure for successful convergence was the comparison of the final flux profiles of the MG case with the reference case. Other convergence estimators were not considered in this paper.
For all cases, different studies on the population parameters to be used were carried out. In general, it was found that the CASMO23 group structure could successfully be used for all models. For the models implemented, 20 generation cycles could produce cross

RAFFUZZI et al. · MONTE CARLO SOURCE CONVERGENCE ACCELERATION
sections with an average uncertainty around 3% for capture and fission cross sections, and 13% for the scattering matrices. The large cross-section uncertainties were found to be the most limiting aspect of the method. This is exacerbated by the fact that uncertainties do not converge following a 1 ffi ffi ffi N p trend in high dominance ratio problems. This implies that a prohibitively large number of cross-section generation cycles should be used to reduce uncertainties slightly. At the same time, it was found that cross sections have their maximum uncertainty at the top and at the bottom of the model, i.e., where the neutron population is the lowest. This implies that the errors propagate largely toward regions that are not particularly statistically significant. However, errors around 2% on the final flux profiles and uncertainties around 4% could easily be seen at the axial extremities. A possible solution to this problem, to be investigated in the future, is to apply the uniform fission site method. 22 The application of this technique has previously been shown to resoundingly reduce uncertainties in estimators located near core extremities and seems a promising and cheap means of reducing MG cross-section uncertainties (and resulting flux errors) if used alongside the present method. Despite these issues, with a careful selection of the population parameters, all models produced satisfying results. It must be kept in mind that large dominance ratio models suffer from detrimental correlation effects that do not depend on the nuclear data type used. As a consequence, when running independent instances of the CE reference cases, standard deviations of up to 4% could be seen. The average error of the MG cases were generally within that standard deviation band. However, the standard deviation generated from independent MG runs tends to be larger than the CE one.
An additional issue arose for the burnt PWR assembly problem, when generating a separate set of cross sections per each material did not produce satisfactory results. This was a consequence of extreme uncertainties in the cross sections everywhere in the geometry; while the other cases had around 30 materials, the burnt assembly included around 400. In order to reduce statistical errors, cross sections were averaged over chosen sets of spatially separated materials. During the MG portion of the calculation, each individual material within a set used the same cross sections. This greatly reduced the cross-section uncertainties and the final error.
Memory utilization was measured only for the burnt fuel assembly model, i.e., the one with the largest number of materials. When using 23 groups and a set of cross sections per each material, memory usage increased by 17%. When grouping fuel materials together, however, memory utilization only increased by 1%.
The speedup achieved by this method ranges from a factor of around 2.5 to a factor of around 5. In particular, the highest computational gain comes in computationally expensive problems, such as the burnt fuel one. This happens because the CE calculation is heavily slowed down by the presence of more than 200 nuclides and 400 materials. However, it is clear that informed choices of parameters must be made in order to ensure appropriate results. For each calculation, the user must input the group structure to be used, the number of cross-section generation cycles, and the number of final CE inactive cycles. All of these parameters are also influenced by the neutron population per cycle. Additionally, it must be decided whether a cross-section set should be generated per each material or per some group of materials. These choices must be well informed in order not to deteriorate the calculation results. Even if all the LWR models tested produced satisfactory results with 23 groups, it is also expected that different reactor types would require different energy structures and different parameters.
While the use of arbitrary parameters is a disadvantage of the method, it must be emphasized that all source convergence acceleration methods require some arbitrary choices. To state a few examples, for convergence acceleration using CMFD, these consist of choosing an appropriate spatial mesh, energy group scheme, and the number of cycles over which to generate group constants. For the response matrix, these include the spatial mesh, number of inner and outer iterations, and the number of correcting cycles. Finally, for particle ramping, these include the ramping rate and initial number of particles. Therefore, some degree of arbitrariness is not surprising in acceleration methods. However, future work will address the problem of automating the choice of some simulation parameters, such as the switch between CE and MG inactive cycles and between inactive and active cycles.