Geant4-based electromagnetic background model for the CRESST dark matter experiment

The CRESST (Cryogenic Rare Event Search with Superconducting Thermometers) dark matter search experiment aims for the detection of dark matter particles via elastic scattering off nuclei in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathrm {CaWO_4}$$\end{document}CaWO4 crystals. To understand the CRESST electromagnetic background due to the bulk contamination in the employed materials, a model based on Monte Carlo simulations was developed using the Geant4 simulation toolkit. The results of the simulation are applied to the TUM40 detector module of CRESST-II phase 2. We are able to explain up to \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(68 \pm 16)\,\mathrm {\%}$$\end{document}(68±16)% of the electromagnetic background in the energy range between 1 and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$40\,\mathrm {keV}$$\end{document}40keV.


Introduction
Understanding the nature of dark matter (DM) is among the most pressing problems of modern physics. The possia e-mail: holger.kluck@oeaw.ac.at b e-mail: valentyna.mokina@oeaw.ac.at c e-mail: c.turkoglu@sussex.ac.uk ble parameter space in terms of mass and interaction cross section of ordinary matter with DM is large, requiring different experimental approaches to exploring it in a comprehensive way. The cryogenic experiment CRESST is one of these experiments, being very well suited for the search for DM particles in the sub-GeV/c 2 mass region, down to a few hundred MeV/c 2 [1]. A major milestone for the CRESST-II experiment in increasing the sensitivity to sub-GeV/c 2 dark matter was the data taking campaign of phase 2: with a detection threshold of (307.3 ± 3.6) eV, the single detector module Lise achieved sensitivity down to 500 MeV/c 2 [2]. This demonstrates the importance of a sub-keV detection threshold for nuclear recoils. The first time, it was achieved earlier in the same run with the detector module TUM40, for which a threshold of (603 ± 2) eV was obtained [3].
The average background rate previously determined in the crystal used for the TUM40 detector module in the region 1-40 keV of about 3.51 kg −1 keV −1 day −1 [4] is significantly lower compared to crystals of the same material with different origin. The sensitivity for low DM masses was limited by the residual background and detection threshold, hence, further improvements are only possible by identifying and reducing components of this background or lowering the detection threshold.
In this paper we present an analysis of the electromagnetic background sources and their composition [5] for the data taken during CRESST-II phase 2 with the TUM40 detector module [3,4]. We use the Geant4 toolkit for Monte Carlo simulations [6][7][8] as the main tool to calculate the differential energy distribution from the decay of various radioactive isotopes. The normalisation of the energy distribution is obtained from the activity of α-decays in the corresponding natural decay chain in secular equilibrium or from the fit to γ -lines.
In the following, we present the CRESST experiment in Sect. 2. Section 3 focuses on the simulation method and Sect. 4 discusses the determination of the different background contributions. The results are presented in Sect. 5. Section 6 gives a summary and provides a short outlook.

The working principle
The CRESST experiment is located in Hall A of the Laboratori Nazionali del Gran Sasso (LNGS) below the Gran Sasso mountains in central Italy. CRESST-II used cryogenic detectors with scintillating CaWO 4 crystals as the target material. Each interaction in the target crystal produces phonons and scintillation photons, which are absorbed by a nearby siliconon-sapphire (SOS) disk. Around 90 % of the total energy deposited in the target crystal are phonons and nearly independent of the interacting particle. On the other hand, the scintillation process is strongly particle-dependent. Its efficiency reaches at most 7 % for β/γ -events [9]; for nuclear recoils the scintillation efficiency is reduced by roughly one order of magnitude called quenching. Both signals are read out by separate thermometers which are realised as transition edge sensors (TESs) operated at a temperature of O(15 mK) [10]. We refer to the CaWO 4 target crystal as "phonon detector" and the SOS disk as "light detector" later on. The ratio between deposited energies in these two channels, defined as the light yield (LY ), allows one to determine the type of interaction. The 122 keV γ -line coming from a 57 Co calibration source is used to normalise the LY of β/γ -band to 1 at 122 keV. Figure 1 shows the expected LY distribution as a function of the deposited energy. As can be seen, a number of bands emerge, separated by their respective LYs. Different bands indicate the different particle interactions inside the target crystal. Starting from high LY values to lower ones, the resulting event categories are: β/γ -events caused by scattering off electrons, α-events and finally events of recoiling O-, Ca-and W-nuclei. The latter three collectively constitute the nuclear recoil bands. A recoil event may be caused by background, e.g. neutrons, or by a potential DM signal. The energy range of interest (ROI) for a DM search in CRESST-II expands from the detection threshold to 40 keV. Above this energy, no significant DM signal is expected for CaWO 4 [3].
Most of the radioactive background falls into the β/γband which is well separated from the nuclear recoil bands at high energies. However, at lower energies, events from the β/γ -band are leaking into the region of interest. This complicates the DM analysis: since the expected rate of DM particle interactions is low compared to the background rate, a potential signal may be covered by the leakage, or an unknown background can be mistaken for a potential DM signal. Especially at low energies where the ROI for the CRESST experiment is, a detailed understanding of the background is crucial.

The detector setup
Most of the materials close to the detectors are selected for radiopurity to keep backgrounds low. The cold finger design of CRESST helps to avoid any line of sight between the dilution refrigerator and the detectors [11] (see Fig. 2). The design of the CRESST setup features layers of polyethylene (PE), lead and copper shielding against ambient background from natural radioactivity of the laboratory, as well as background caused by the shielding and cryostat itself (see Sect.

for details).
The active muon veto tags when a charged particle penetrates through it. A gas-tight container, the so-called radon box, which is constantly flushed with N 2 gas, prevents the accumulation of gaseous radon close to the detectors. The radon box contains a lead shielding, inside this a copper shielding is the direct enclosing of the cold box. The detector modules are placed in the so-called carousel at the core of the coldbox.
A photograph and the schematic view of the TUM40 detector module [12] are shown in Fig. 3a, b, respectively. Below the cryostat, the carousel hosts the detector modules. The shielding consists of PE, lead, copper, an active muon veto and the air-tight radon box used to prevent radon contamination from air The target crystal is held by sticks made of the same material-CaWO 4 [12] and features a dedicated carrier crystal onto which the TES is evaporated. The phonon and light detectors are enclosed by a reflective scintillating polymeric foil named VM2002 1 , to increase the light collection efficiency and actively reject surface backgrounds. The holder of the detector module is made from radiopure copper. A more detailed description of the TUM40 geometry is given in Sect. 3.1, where the implementation into the simulation code is discussed.

Potential background sources
In this Section, the main background sources and the methods to reduce and identify them will be discussed. For the TUM40 detector module, previous studies [3,4] found no indication for a significant neutron component to the background. Based on CRESST-II phase 1 we expect < 10 −3 kg −1 keV −1 day −1 of neutron background [13]. Hence, we focus on the electromagnetic components in this work.

Muons
High energetic atmospheric muons can easily penetrate through meters-thick layers of shielding. For this reason, the CRESST experiment is located in the LNGS underground laboratory, which has an overburden of at least 1400 m of rock in each direction. This provides an average shielding of 3600 m w. e. [14,15]. The remaining muon rate is [14], i.e. it is reduced by about six orders of magnitude compared to sea level.
The remaining muons can induce background events in detectors, either by direct interaction with the target crystal, or by producing secondary particles in their interactions with rock or the material of the experimental setup, e.g. bremsstrahlung or highly energetic neutrons by muoninduced spallation. To be able to reject muon-related events, CRESST is equipped with a muon veto (see Fig. 2). The veto consists of 20 plastic scintillator panels, read out by photomultiplier tubes and covering 98.7 % of the solid angle around the detectors. The reason why the coverage is not 100 % is that an opening is needed on the top to leave space for the cryostat. If the muon veto triggered, all events recorded by cryogenic detectors, within a time window of ± 2 ms, are rejected in the offline analysis.

Cosmogenic activation
Long-lived radioactive impurities in the detector material might be activated due to its exposure to cosmic rays before being brought underground [16]. For CaWO 4 , lines caused by the decays of cosmogenically activated nuclides below 80 keV are reported in [1,2,4,17].
Four lines can be detected due to cosmogenic activation of 182 W via proton capture. With a half-life of T 1/2 = 1.82 year [18] the resulting 179 Ta then decays in the target material by electron capture (EC), leading to the occurrence of the X-ray line as follows: Since the decay happens in the target crystal the entire atomic de-excitation is measured. The signature of the EC therefore is a line at the binding energy of the respective captured shell electron of 179 Hf. Peaks originating from this decay can be seen at energies of 2.60 keV (M 1 shell), 10.74 keV (L 2 shell 2 ), 11.27 keV (L 1 shell) and 65.35 keV (K shell) [4].
Another nuclide, 183 W, is cosmogenically activated through 183 W(p, 3 Due to the slow detector response, the γ -ray and the atomic de-excitation cannot be resolved. The most prominent peak for this decay is thus observed at 73.6 keV which is the sum of the binding energy of the K shell electron of tantalum with an energy 67.4 keV and 6.2 keV from the subsequent nuclear de-excitation [17]. A nuclide which can be cosmogenically produced in a wide range of target materials is 3 H [20]. It is hard to be observed experimentally for CaWO 4 in CRESST, since the β-spectrum of 3 H is buried below the other low-energy background components. However, Eq. (2) indicates that the observed 181 W should be accompanied by a 3 H component. Other reaction paths to create 3 H in CaWO 4 may also be possible.

Ambient γ -radiation
Another type of background are ambient γ -rays, originating from the rock and concrete of the LNGS. The sources of this radiation are γ -decays from the natural radioactive decay chains of 232 Th, 235 U and 238 U, but also individual radioactive nuclides like 40 K. The integrated γ -ray flux in hall A of LNGS for energies below 3 MeV was measured to be (0.28 ± 0.02) cm −2 s −1 [21].
To shield the experiment against this radiation, CRESST detectors are surrounded by an external lead shielding with a mass of 24 tonnes and a thickness of 20 cm (see Fig. 2). With its high density and atomic number, lead is ideally suited to absorb γ -rays. However, it is not a very radiopure material, in particular its naturally occurring radioactive isotope 210 Pb contributes to the background observed in the experiment. Therefore, CRESST lead was produced by the Swedish company Boliden 3 with a low 210 Pb contamination of 35 Bq kg −1 [22]. In addition, an internal lead shielding is placed inside the cryostat directly below the mixing chamber and consisting of 330 kg of lead produced by the company Plombum 4 in Poland with an even lower 210 Pb contamination of only 3.6 Bq kg −1 [22]. To reduce any remaining γ -background coming from the lead, low-background copper with a mass of 10 tonnes and a thickness of 14 cm is placed between the cold box, which contains the detectors, and the lead shielding (see Fig. 2). All the copper used in the experiment is so-called NOSV copper, produced by the Norddeutsche Affinerie AG. 5 Copper is still a good gamma attenuator and in contrast to lead it can be produced with a very low level (< 1 mBq kg −1 ) of internal radioactivity as it has no naturally occurring radioactive isotopes. This makes copper a more favourable shielding material closer to the detectors.
In addition, copper is also used for structures inside the cold box, such as holders for the detectors of the CRESST experiment due to its good thermal conductivity. The ambient γ -flux is reduced to negligible amounts by the described shielding structure. However, even if copper can be produced with low contamination levels, it remains a non-negligible background source due to its proximity to the detectors and large overall mass [4].

Internal contamination of CaWO 4
All CaWO 4 crystals in CRESST-II phase 1 and most in phase 2 were commercial crystals produced by the Scientific Research Company "Carat" 6 and the Prokhorov General Physics Institute of the Russian Academy of Sciences. 7 The contamination of these crystals by α-emitters from the natural decay chains in the energy range between 1.5 and ∼ 7 MeV was measured to be between (3.05 ± 0.02) mBq kg −1 and (107.13 ± 0.14) mBq kg −1 [23]. Even though the energies of α-decays are above the ROI for the DM searches, the subsequent decay chains contain also β/γ -decays that deposit energy in the ROI.
To reduce the intrinsic background, it was decided to produce crystals within the collaboration [23] using a Czochralski furnace dedicated to growing only CaWO 4 crystals [24]. This approach allows one to control all production stages to prevent contamination during crystal growth and aftergrowth treatments. One of these self-grown crystals was used for the detector module TUM40 (see Fig. 3). In comparison with commercial crystals, a factor of 2-10 decrease in the β/γ -background in the energy range of 1 − 40 keV for TUM40 was achieved. The βand γ -emissions, e.g. caused by the decay of 210 Pb, are also reduced since they originate from the same natural decay chains as the measured α-decays.
Even though the amount of background reduction with TUM40 is significant, there still remains a non-negligible amount of β/γ -background at low energies. Hence, it is of great importance to understand the contribution of each background source using the experimental data at hand. The knowledge of background components also helps to guide the efforts of background reduction.

The reference data sets
As experimental reference for the simulation model described in Sect. 3, we are using the complete data 8 recorded with the TUM40 module during CRESST-II phase 2 with a gross exposure of 129.9 kg day. This data set contains an average gross rate in the region 1-40 keV of about 3.1 kg −1 keV −1 day −1 . If the deposited energy is high enough, the detector gets into normal conducting state and the recorded pulses are saturated. Hence, the standard energy calibration, which is based on the determination of the pulse amplitude, is less precise for high energies [25]. Therefore,  we are using three separated data sets: low, medium and high.
The main properties of these reference data sets are listed in Table 1.
To get the energy spectra for each data set different types of cuts are applied: a rate and stability cut, a coincidence cut, a carrier cut, quality cuts and RMS cuts (see [5] for details). A detailed description of the cut types and their purpose are given in [25]. Successively applying the cuts used in the analysis gives the energy-dependent signal survival probability ε. The latter is evaluated from simulated events as described in [25] and shown in Fig. 4. For the medium-and high-energy data sets the survival probabilities are energy independent and listed in Table 1.
The lower energy limit of the low-energy data set is the threshold of the TUM40 detector module of 0.6 keV [3], whereas the upper limit marks the point at which saturation effects cannot be compensated reliably anymore by the applied energy reconstruction. With respect to the DM analysis [25] of this data set the cuts were adjusted to expand the usable energy range to ∼ 500 keV. However, the lower limit we increase as below 1 keV events occur of unknown origin. As their modelling is beyond the scope of this work, we disregard the events in the 0.6-1 keV range. Hence, hereafter ROI refers to the energy range 1-40 keV. Within this ROI we count 11145 background events in TUM40, resulting in the total net rate R TUM40 = 2.2 kg −1 keV −1 day −1 using the signal survival probability ε (see Fig. 4).
The energy scale of the reference data set agrees well with the literature values of identified peaks for low energies, i.e. in energy range relevant for DM search that is 0.6-40 keV for TUM40 [3]. For energies above ∼ 250 keV, the energy scale starts to deviate by more than 1 keV as expected due to the non-linearities [5]. To correct this shift in the low-and medium-energy data sets, a rescaling is performed via a cubic polynomial. The values of the free parameters are determined by minimising the deviation between the measured energy and the literature value of identified peaks (see [5] for details).
The high-energy reference data set includes sharp peaks which originate from α-decays within the 232 Th, 235 U and 238 U decay chains (see Fig. 5c) mainly as bulk contamination. Besides the full absorption lines, also an escape peak around 4.5 MeV due to 230 Th (Q = 4769.8 keV, E γ = 253.7 keV) and 235 U (Q = 4678 keV, E γ = 185.2 keV) decays is visible, labelled escape line. In addition to bulk contamination, also a peak due to near-surface decay of 210 Po is visible, labelled external. Due to the applied cuts, pile-up events caused by the correlated decays 212 Bi → 212 Po, 214 Bi → 214 Po, and 219 Rn → 215 Po, which were observed in [4,23], are not included in the reference data. Because of the non-linearity of the TES, the calibration was corrected by assigning Q-values from literature [18] to well identified peaks and interpolate linearly between them. The spike in the middle of the 211 Bi peak (see Fig. 5c) is an artefact of the energy calibration 9 of the data.
As the electromagnetic background components clearly dominate the total background [3,4,13], we do not take into account e.g. neutron induced nuclear recoils as background. Therefore, no differentiation between the β/γ -band and the nuclear recoil bands is needed. Hence, we neglect the light yield information in the reference data set and will compare our background model to energy spectra of the total background. The final spectra for the three reference data sets are shown in Fig. 5. The medium-and high-energy data sets are used to normalise the background model which will be described in Sect. 4. The low-energy data set is used to cross check the accuracy of the model and to determine the background contribution in the ROI which will be discussed in Sect. 5.

Simulation of the background spectra
For an accurate simulation of the background spectra, hereafter called spectral templates of individual decays, we use a two-stage approach: for the microscopic simulation of the Reference data after rescaling for a the low-energy (0.6-495 keV), b the medium-energy (511-2800 keV) and c the high-energy (4000-7000 keV) range. For all three histograms the bin size is 1 keV. Triangles indicate peaks used for the normalisation of the simulation in this work, for details see text energy deposition in the detector parts we use a Geant4based simulation code. To apply the detector response we process the simulated data with a ROOT-based tool [26]. This approach has the advantage that changes in the detector parameters do not require one to rerun the often timeconsuming Geant4 simulations.
Furthermore, as we exclude pile-up events from the experimental data (see Sect. 2.4), we are allowed to split decay chains X 1 → X 2 → X 3 → . . . into piece-wise decays X 1 → X 2 , X 2 → X 3 , …This has the advantage that the activities A 1 , A 2 , …associated with nuclides X 1 , X 2 , …stay free and can be normalised to calibration or assay mea-surements without the need to rerun the simulations (see Sect. 3.3).

Implementation of detector geometry and physics
For this work we greatly extended the simulation tool ImpCRESST [27] and adapted it for Geant4 version 10.2 patch 1 to simulate the decay of the primary contaminants, the production of the resulting secondary particles, the tracking of all particles through a detector geometry, and the energy deposition in the instrumented detector parts. To obtain reliable simulation results we adjusted the physics list to sub-keV energies, implemented a detailed geometry of our detector, and verified the decay properties of the contaminants as primary input to the simulation.
We use the physics list Shielding as provided by Geant4 with the following three modifications: (i) Instead of the standard physics constructor G4EmStan-dardPhysics, we chose G4EmStandardPhysics_op tion4 because it provides the most accurate modelling of electromagnetic interactions at low energies [28].
(ii) As production cut for secondaries, we set a value of 250 eV throughout all volumes. 10 If the kinetic energy of a potential secondary drops below this value, no actual secondary is simulated but energy conservation is obeyed by locally depositing the energy. By choosing this low value for the production cut, we consider the leakage of radiation out of the finite detector volume via secondary radiation to a high accuracy. Regardless of this production cut, secondary particles caused by atomic de-excitation, e.g. fluorescence photons and Auger electrons, are produced in any case.
(iii) Radioactive decays are handled by the G4Radio-activeDecayPhysics module. As described in [29], we patched the code to correctly treat 3 H as unstable. We note that for a proper simulation of the 234 Th-decay, the most prominent internal background source for the target crystal, at least Geant4 version 10.2 is needed.
The modelled geometry is shown in Fig. 6 and reflects closely the shape and size of the actual detector module TUM40 (see Fig. 3a). In particular, we considered the blockshaped target crystal made of CaWO 4 which is held by sticks of the same material and copper parts. The target is faced by the light detector which is approximated as pure sapphire (Al 2 O 3 ), contrary to the SOS used in reality. We omit the silicon layer of 1 µm-thickness on the back side of the light detector in the simulation because of its negligible mass contribution. With the same argument we omit both TESs, the 10 This is the lowest value for which all relevant electromagnetic physics processes are approved. We note that for specialised use cases the applicability limit can be lower, e.g. for elastic electron scattering in silicon as low as 5 eV. By default the cut value is 990 eV [28]. Both the target crystal and the light detector are surrounded by the scintillating foil, which we approximate with Mylar (C 10 H 8 O 4 ) foil in the simulation. We consider the ring holding the foil, which is made out of plastic scintillator BC408, and the bronze clamps, which hold the light detector. The properties of the simulated materials are listed in Table 2. For all elements, we assume a natural isotopic abundance.
For this work, we only record the Geant4 hits in the target crystal for which we store the deposited energy E dep and the time information t.

Emulation of the detector response
To emulate the finite time and energy resolution of the detector, we apply an empirical detector response model to the outcome of the Geant4 simulation. To consider the time resolution of the detector, we sum up all Geant4 hits which happen within Δt with respect to the first hit of a simulated decay to create one experimental event with energy E dep . If hits happen after t + Δt, e.g. via a delayed decay of an isomeric state, the next experimental event will be immediately created. This procedure is applied until all detector hits are processed. We found that Δt = 2 ms describes the experimental data well. We note that this is a purely empirical approach and does not trivially correspond to length, rise )) centred at E dep . The energydependent variance was obtained from fitting cubic polynomials to the experimental detector resolution via reference data sets in the low-(σ l ), medium-(σ m ), and high-(σ h ) energy range (for details see [5]). The effect of the finite energy resolution is clearly shown on the example of the 223 Rn decay spectrum in Fig. 7.

Simulation of spectral templates
A spectral template T vi X (E obs ) is a probability density function (PDF) which gives the probability to observe an experimental event with energy E obs in the instrumented volume i, caused by a bulk contamination X in volume v. Hence, it has to be scaled to the contamination level to give the total number of background events. We obtain it in five steps: (i) we sample the volume v homogeneously N v,X -times, each time placing a nuclide of type X at rest; (ii) in ImpCRESST the decay X → Y of the nuclide is simulated; (iii) the detector response model is applied; (iv) we count how many of the decayed nuclides lead to an energy deposition E obs in volume i: n i (E obs ); (v) we obtain T vi X (E obs ) = n i (E obs )/N v,X . Assuming that the counting of n i (E obs ) is a Poisson process we define its statistical uncertainty as √ n i (E obs ) and propagate it for all the dependent quantities.
As an example, Fig. 8 shows the sum of the two spectral templates which describe the β-decay of 234 Th: either directly or indirectly, the decay reaches the isomeric state 234 Pa m [73.92 keV] (cf. the level scheme in Fig. 8). Due to the long half-life of this state (T 1/2 = 1.159 min [18]) compared to the time resolution of the detector O(ms), the decay cascade results in two detector events: one containing the interactions before reaching the isomeric state and one afterwards.
The isomeric state can be reached via four branches: once via a direct β-decay (orange filled histogram in Fig. 8) and three times via a β-decay to a state at energies above the isomeric state with a subsequent γ -decay to the isomeric state (blue and red filled histograms in Fig. 8). Due to the slow detector response, the γ -decay cannot be resolved and  Fig. 8). In the remaining 99.84 % of the cases, the isomeric state undergoes a β-decay to the ground state of 234 U [18] resulting in a β-spectrum with an endpoint of 2.27 MeV as the second detector event (violet filled histogram in Fig. 8).
As described in Sect. 2.3.2 one cosmogenic background source is the EC decay of 179 Ta. However, we found that Geant4 assigns the same capture probability regardless from which atomic shells the electron is captured contrary to e.g. [19]. To mitigate this contradiction, we disentangle the spectral templates for the individual atomic shells and determine their activity from a fit to the reference data (see Table 11 for the resulting activities), i.e. we do not rely on the contradicting literature values by leaving the capture probability unconstrained. We disentangle the cumulative decay spectrum, as simulated by Geant4, into individual components by selecting the events according to the total energy of all decay products including the recoiling 179 Hf nucleus. Figure 9 shows the resulting spectral templates, i.e. each histogram is normalised to integral one. We note that each atomic shell not only leads to a peak at the fully absorbed electron binding energy of 179 Hf, but may also causes escape peaks. This is clearly visible for the K shell (Fig. 9, grey histogram) with a peak caused by the fully absorbed binding energy of 65.35 keV [19] and escape peaks nearly coinciding with peaks of the fully absorbed binding energies of the L 2 , L 3 , and M 3 shells. 11 We note that prior to version 10.2, Geant4 did not implement this decay properly [30]: 234

Normalisation of background contributions
The next step is to scale these resulting templates according to the experimental reference data. After the scaling, these individual contributions are combined and compared to the reference data.

Determination of the internal radiogenic background components
Since it is complicated to disentangle each β-decay or Compton continuum as they overlap with each other, we correlate these spectra in the ROI to clearly defined α lines in the energy range between 4 and 7 MeV. The connection between the α lines in the MeV range and the β/γ -decays in the keV range comes from secular equilibrium [31]. We omit the α lines of 147 Sm and 180 W which were observed in an earlier analysis [4] as they are not connected to decays in the ROI.
In this work, we use two requirements for the existence of secular equilibrium: (i) the relative difference of the activities of the parent A and the daughter B has to be (A A − A B )/A A < 1 % after 460 days had passed, which is the time elapsed between the crystal growth and its arrival underground. (ii) We assume that the initial activity A B (time = 0) of the daughter nuclide is negligible if it decays to less than 10 % over time before measurement. If both of these requirements are fulfilled, we consider the parent and the daughter to be in secular equilibrium, so their activities are the same, otherwise, their activities are determined independently.
After identifying the α peaks in the high-energy reference data (see Sect. 2.4), we determine the number of events within a given peak X of amplitudeÃ and width σ by fitting a nonnormalised Gaussian. Hence, the total number of observed events N X,obs of the α line X isÃ·exp(−E 2 /(2σ 2 )). We consider the fit uncertainty ofÃ and σ as a systematic uncertainty with respect to this MC study and propagate it as absolute systematic uncertainty for all the dependent quantities. We ignore any potential correlations between the fitted parame-tersÃ and σ . For the total uncertainty we linearly combine this systematic uncertainty with the previous introduced statistical uncertainty of the simulated templates. From the total number of observed events, the activity of the α line X is calculated as: where T · M stands for the exposure, M is the mass of the CaWO 4 crystal (see Table 2), T is the live-time and ε h stands for the signal survival probability of the high-energy reference data set (see Sect. 2.4). The obtained activities are listed together with the fit parameters in Table 6 in Appendix A. To compare the simulation results with the reference data (see Fig. 5c), the templates of the individual α-decays were scaled with their respective activities and combined (see Fig. 10). The resulting ratio of the integral of counts in the simulated spectrum to the counts in the experimental data between 4 and 7 MeV is (97.8 ± 3.8) %. The uncertainty band shown in Fig. 10 includes both statistical and systematic uncertainties from the simulated templates and the fit of the reference data, respectively. The next step is to determine the activities A Y, pred (β) of the β/γ -decaying nuclides Y , which are in secular equilibrium with their α-decaying parents X : where B.R. X →Y is the cumulative branching ratio between the nuclides X and Y . For the comparison between the predicted count spectrum and the observed one in Sect. 5 we also calculate the number of observable predicted events N Y, pred (β) as: where ε l stands for the signal survival probability of the lowenergy reference data set. The breakdown of the decay chains with respect to the piece-wise secular equilibrium are given in Appendix A. in Tables 7, 8 and 9. The tables list also the activities as determined for the CaWO 4 crystal of TUM40.

Near external radiogenic background components
The main source of near external radiogenic contamination is expected from the copper holders of the detector modules due to its relatively large mass close to the crystal. This type of background manifests itself in the ROI as β/γdecays. The copper holders are separated from the crystal by a thin scintillating foil, so even external low-energy βs may be visible in the experimental data.  [33], we assume the same isotopic abundance as in natural uranium and derive the upper limit: A Cu, 235 U = A Cu, 238 U · 0.70 % ≤ 0.46 mBq kg −1 . Because only activities for the nuclides at the head of the corresponding decay chain are reported in [32] we assume that the chains are in secular equilibrium to obtain the activities for the daughter nuclides. This assumption is supported by measurements of the XENON experiment for another copper batch from the same producer [33]. The mass determines the activity whereas the placement determines what fraction of the activity is visible to the target crystal as background. So, instead of a time-consuming simulation of all contaminants in all copper parts, we simulate the contaminants only in the copper part which has the highest visibility but scale the obtained background spectra to the total mass of all near copper parts. Considering this information, the scaling of the individual decay templates is performed as follows: where A top stands for the activity of the head of the corresponding decay chain, B.R. top→X stands for the cumulative branching ratio between the head of the decay chain and the nuclide X , and m Cu stands for the total copper mass (see Table 2). As we use upper limits for the activities to scale the spectral templates, no systematic uncertainties are taken into account; only the statistical uncertainties associated with the simulation of the templates are used.

Additional external radiogenic background components
After removing the contributions of internal radiogenic and near external radiogenic contributions from the experimental reference data, we still observe γ -peaks. These identified lines belong to the decays of nine radioactive nuclides: 40 K, 208 Tl, 210 Pb, 212 Bi, 212 Pb, 214 Bi, 226 Ra, 228 Ac, and 234 Th. Hence, at least for these nine nuclides the total activity of the near copper parts is not enough to explain the observation. It is reasonable to assume that this additional background is caused by parts of the detector modules or shielding structure which have not yet been considered as a source of background in our simulations: massive but more distant copper parts (e.g. thermal shieldings of the cryostat) and/or closer but less massive components of the detector module (e.g. scintil-lating foil, SOS, TES, wires, bronze clamps). As simulating the complete decay chains in the full experimental setup was beyond the scope of this work, we did a first approximation of the effects of these additional background components. As copper is the most abundant material close to the detectors, we created as a first approximation a 1 mm-thick spherical copper shell surrounding the detector module and simulated only these nine clearly identified nuclides as its bulk contaminants. The shell is therefore a very simplified model of the distant copper parts. We note that in reality this background may originate partially also in the other mentioned components; it is up to a future study to identify them. A consequence of this simplified model is that we do not know the geometrical efficiency, i.e. the probability for a background particle originating in some of these additional components to reach the target crystal. Hence, we can only give the activity A that the crystal would be able to observe if survival probability of the particle was 100 %. The determination of the activity corresponding to the observed γ -lines is very similar to the procedure outlined in Sect. 4.1 for the determination of the activity, i.e. it is the background rate corrected for the signal survival probability. However, there exists a continuous spectrum in the low-and medium-energy ranges which we consider as a linear component of the fit function. In order to get the number of events corresponding only to the γ -peaks, we calculate the number of events N peak, obs below the Gaussian component of the fit. Consequently, the activity A corresponding to the γ -lines can be calculated using Eq. (3). In order to calculate the activity of the full decay, one has to scale the activity of the γ -peak with η, which is the fraction of events under the γ -peak compared to all the events of the particular decay as calculated from the simulation templates. The total activity of each isotope is then A total, obs = A peak, obs /η. The activity A total, obs of the γ -lines, the fraction η and the corresponding decays are listed in Table 10 in Appendix A. In order to predict the observable number of events N total, pred in the low-energy range, we use Eq. (5).

Determination of internal cosmogenic background components
Internal cosmogenic background components contribute to the overall background in the ROI in the form of β-decay (e.g. 3 H) or in the form of X-ray and Auger electron emissions that result from EC (e.g. 179 Ta and 181 W). After removing the internal and external (both near and additional) radiogenic background components, the activities of 179 Ta and 181 W nuclides are determined in the same way as the activities of the external γ -rays (see Sect. 4.3). 3 H may be produced via cosmic activation in several ways. However, the only reaction path which is empirically evident for TUM40 is the co-production together with 181 W (a) (b) (c) Fig. 11 Total simulated background spectrum (MC, orange histogram) with statistical (cyan band) and systematic (magenta band) uncertainties with respect to the experimental data with statistical uncertainties (black data points) in the medium-energy range (a), the low-energy range (b), and the ROI (1 − 40 keV) (c). The bin sizes are 1 keV, 100 eV, and 100 eV, respectively (see Eq. (2)). Hence, we use the activity of 181 W to estimate a lower limit on the 3 H contamination. A first calculation using ACTIVIA [34] indicated that the activity of 3 H is 62 % of the activity of 181 W. The activities calculated for each peak and the corresponding nuclide are given in Table 11 in Appendix A.

Discussion of the results
In this section, we first check the self-consistency of our normalisation procedure used in Sect. 3.3 by investigating the compatibility of the medium-energy reference data with the background model. Afterwards, we discuss the background prediction in the ROI and its agreement with the observation. The comparison of the combined model components with the experimental reference data sets are shown in Fig. 11.
5.1 Self-consistency of the simulation Figure 11a shows the comparison between medium-energy reference data and the combined templates. We obtain the predicted number of background events in the mediumenergy range N MC, m by adapting Eqs. (5) and (6) for the discussed background components. The continuous part of the simulated spectrum matches well with the experimental data within the uncertainties. Besides the γ -lines used to normalise the templates (see Fig. 5), also other prominent γ -lines belonging to nuclides 40 K, 208 Tl, 214 Bi and 228 Ac fit well with regards to amplitude and width. This shows the self-consistency of the simulated spectral templates.
To quantitatively compare the simulation with the reference data we define ζ = N MC /N exp , where N exp is the number of events in the respective reference data set and N MC is the number of predicted background events. In the following we assume that the near external radiogenic background component reaches the maximum activity that is allowed by the upper limit on the copper contamination (see Sect. 4.2). In the medium-energy range between 511 and 2800 keV the reproduction of the background reaches up to ζ = (109 ± 69) %, i.e. the model matches the observed background within the uncertainty. 12 The uncertainty is dominated by the limited statistics of the experimental reference data set at medium energies (see Table 1). This limitation can only be resolved by increasing the exposure of the reference data set and is hence outside the scope of this work.
The comparison between model and reference data for the low-energy range is shown in Fig. 11b. The combined peak due to the K α1 (8.048 keV) and K α2 (8.028 keV) fluorescence lines in copper [19] is clearly visible in the experimental data and appears in situ also in the simulated spectrum (see Fig. 11c). As this peak occurs due to the interaction of β/γ -background with the copper, one can consider it as an indicator of the correct estimation of the amount of external contamination. We determine the counts in this peak as described in Sect. 4.3 and give the results in Table 3. Despite the strong simplification of the model used for the distant copper parts, the simulation agrees with the reference data better than a factor two.
Even though the simulation and the experimental data agree within the uncertainties, it can be seen that the agreement decreases with decreasing energy. This means that we are either missing one or more β/γ -decaying nuclides and/or we are underestimating one or more nuclides considered in this work. If we assume that the match between the simulation and the experimental data in the medium-energy range is correct, then the Q value of the missing nuclide(s) in the low-energy range should be lower than 495 keV, which is the upper limit of this energy range.

Background reproduction at low energies
The coverage of the low-energy reference data by the simulation is up to ζ = (79 ± 33) % (see Fig. 11b). The comparison for the energy range between 1 keV and 40 keV (ROI) (see Fig. 11c), results in up to ζ = (68 ± 16) %. The reduced uncertainty, going down from 33 to 16 %, is caused by the increasing statistics in the experimental reference data sets at lower energies. Higher statistics results in smaller uncertainties in the fit of the model components. The decrease of (a) (b) (c) Fig. 12 Total simulated background spectrum (MC sum, red) along with the individual contributions from different background components (internal radiogenic (IR, blue histogram), additional external radiogenic (AER, green histogram), near external radiogenic (NER, grey histogram), internal cosmogenic (IC, orange histogram) below 74 keV) with respect to the experimental data (black histogram) in the medium-energy range (a), the low-energy range (b), and the ROI (1-40 keV) (c). The bin sizes are 1 keV, 100 eV, and 100 eV, respectively the fit uncertainty, as part of the systematic uncertainty, with decreasing energy is also clearly shown in Fig. 11b, c. Table 4 lists the activities A and background rates R of individual background contributions as result of our Geant4 model and Fig. 12 shows the related background spectra. The activities represent the background that reaches the crystal as it would be seen by an ideal detector over the full energy range, i.e. the combined energy ranges of the low, medium, and high reference data sets. Considering the real signal survival probability ε for the ROI result in the background rates R which are directly comparable to the experi-mentally observed R TUM40 (see Sect. 2.4). Normalising the simulated R to R TUM40 yield the relative background contributions ζ which are listed in Table 5 in comparison to a previous work [4].
The contribution of 6.3 % from the near external radiogenic background is based on the assumption that the copper contamination reaches its maximal allowed value (see Sect. 4.2). In the opposite case, i.e. that the Cu contamination is negligible, our model coverage drops from maximal 68 % in the ROI to roughly 62 %.
We note that the contribution of the "near external radiogenic contamination" and any "additional external contamination" changed between the previous study [4] and this work (see Table 5): Whereas in [4] the near radiogenic component 13 was unconstrained, this work fixed it to the specific activity of the copper (see Sect. 4.2). Consequently, the remaining observed background was shifted to the additional external component. As the contamination in the near copper parts is based on upper limits from literature, the partitioning between near and additional external radiogenic background components may change with the outcome of the ongoing copper assay or/and by including in the simulation more materials as sources of additional background.
It can be seen that internal background is the most prominent component in the ROI which is mostly dominated by radiogenic contaminants. These results indicate that a further purification of the CaWO 4 crystal, as planned for CRESST-III phase 2 [35] is an important step towards the reduction of the overall background in the experimental data.
With the first completely Geant4-based electromagnetic background model for the CRESST experiment, we are able to identify and reproduce up to (68±16) % of the background observed using the TUM40 detector module of CRESST-II. This is a methodological independent verification of the (69.0 ± 9.2) % obtained in the previous work [4] which is based on a combination of Geant4-based simulation and a semi-empirical fit to the background data itself. Furthermore, in this work we used all data recorded with TUM40, resulting in a four times higher statistics in the experimental reference data.
Beside this quantitative confirmation, this work features also qualitative improvements over [4]: the simulation of the complete natural decay chains for the intrinsic background component allows us to quantify the background contribution for each nuclide (see Tables 7,8,9). Using Geant4 version 10.2 patch 1 allows us to simulate the complete 234 Th-decay, whereas in [4] we had to rely on a simplifying workaround as we had to use Geant4 version 10.1. Additionally, we implemented the full geometry of the TUM40 detector module, whereas in [4], only the target crystal was implemented. Hence, in this work, features like the copper fluorescence peak and the external annihilation peak occur naturally in the simulated spectra which had to be added ad hoc or were missing in [4]. These aspects highlight the advantages of using a fully Monte Carlo-based model over a semi-empirical model.

Summary and outlook
Electromagnetic background poses a significant problem for the direct DM search with the CRESST experiment at low energies where the discrimination power between β/γevents and nuclear recoil-events decreases. To be able to observe a possible DM signal, it is imperative to understand and quantify the background. For this reason, Monte Carlo simulations using a Geant4-based toolkit were performed. Based on the origin of the contaminants, we divided the background into two categories: internal (inside the CaWO 4 target crystal) and external. Internal background includes αand β/γ -decays of nuclides from the natural decay chains of 232 Th, 235 U and 238 U as well as cosmogenically activated nuclides ( 179 Ta, 181 W and resulting 3 H). External background includes β/γ -decays from the same natural decay chains and 40 K, but originating from copper parts close to the target crystal, e.g. target holder, as well as distant additional parts, e.g. thermal copper shieldings of the cryostat. We scaled the simulated spectra of energy deposits to the activity values obtained from experimental reference data sets. This work, the first fully simulation-based background model for the CRESST experiment, shows that we can reproduce up to (68 ± 16) % of the observed background in the ROI.
The difference between the contributions from internal and external parts demonstrates that the leading contribution to the background comes from the CaWO 4 target crystal itself. Based on this result, further purification of the target crystals should remain a high priority for the future. The second most prominent background originates from additional background sources around the target crystal. Hence, more precise knowledge of contamination levels of the used materials will result in a more precise background model.
Consequently, we prepare a dedicated measurement campaign for copper, scintillating foil and silicon-on-sapphire as light detector used in the experiment. In addition we refine the study of cosmogenic activation of CaWO 4 to identify potential further cosmogenically produced contaminants. Further improvements are expected from the simulation of the complete set of contaminants in a detailed model of the cryostat and of the shielding, in addition to the highly accurate simulation of the detector module itself which was used in this work. Beyond these studies of electromagnetic background components, we shall also start an in-depth investigation of the neutron background.  Table 10 Gaussian fit values (mean E, amplitude A and variance σ ) of gamma lines observed in the experimental data of TUM40 and attributed to the additional external radiogenic background. The number of observed events (N γ,obs ) and the corresponding activities (A γ ) are calculated from the integral of the fit. The statistical uncertainties for N γ,obs and A γ are calculated using the fit values as described in the text. Based on simulations, η is the ratio between the activity of the photo peak A γ and the complete spectrum A total Nuclide E(keV) Amplitude σ (keV) N γ , obs (#) A γ ± δ stat ± δ sys (mBq kg −1 ) η A total ± δ stat ± δ sys (µBq kg −1 )