Galmoss: A package for GPU-accelerated Galaxy Profile Fitting

We introduce galmoss, a python-based, torch-powered tool for two-dimensional fitting of galaxy profiles. By seamlessly enabling GPU parallelization, galmoss meets the high computational demands of large-scale galaxy surveys, placing galaxy profile fitting in the LSST-era. It incorporates widely used profiles such as the S\'ersic, Exponential disk, Ferrer, King, Gaussian, and Moffat profiles, and allows for the easy integration of more complex models. Tested on 8,289 galaxies from the Sloan Digital Sky Survey (SDSS) g-band with a single NVIDIA A100 GPU, galmoss completed classical S\'ersic profile fitting in about 10 minutes. Benchmark tests show that galmoss achieves computational speeds that are 6 $\times$ faster than those of default implementations.


Introduction
Galaxies, the cosmic building blocks, comprise diverse stellar components such as bulges, disks, bars, spiral arms, and nuclear star clusters.One of the main drivers of extragalactic astronomy is the study of the structural and morphological properties of galaxies from their photometric images, which has been shown to correlate with the galaxies' formation and evolutionary paths (van der Wel, 2008;Conselice, 2014;Dimauro et al., 2022).
Several approaches have been developed for galaxy structural and morphological analysis.Classical eyeball morphology classifications, such as the Hubble sequence (Hubble, 1926), are generally descriptive and rely on visual inspection.Non-parametric morphological analysis (Ferrari et al., 2015;Rodriguez-Gomez et al., 2019) employs quantitative metrics to represent attributes such as concentration, asymmetry, and smoothness (Conselice, 2003).These metrics are considered robust because they do not depend on any underlying model.In contrast, the analysis of surface brightness profiles of galaxies involves fitting the light distribution with parametric models to interpret and quantify morphology using specific model parameters (Nantais et al., 2013;Zhuang and Ho, 2022).This profile fitting method has evolved from onedimensional (de Vaucouleurs, 1958;Sersic, 1968) to two-dimensional analyses, improving accuracy by incorporating factors such as the point spread function and non-axisymmetric components (Andredakis et al., 1995;Schade et al., 1995;Byun and Freeman, 1995).simulated galaxy images.These deep learning approaches establish an implicit link between galaxy images and their profile parameters, substantially accelerating the process of parameter estimation (see also gamornet, Ghosh, 2019).However, these methods cannot be easily adapted to new data domains due to the well-known 'black box' problem (Castelvecchi, 2016), which means that the implicit mappings generated by the models usually lack clear interpretations.
Another method to address speed challenges involves parallel gradient-based methodology, which relies exclusively on GPU-accelerated matrix parallel computation.This is supported by popular frameworks such as pytorch (Paszke et al., 2019), tensorflow (Abadi et al., 2016), and jax (Bradbury et al., 2018), enabling the incorporation of automatic differentiation.Applications of this approach range from galaxy kinematic modeling (Bekiaris, 2017) to cosmological N-body simulations (Modi et al., 2021).
To speed up galaxy profile fitting, bringing it to the CSST/LSST-era and contribute to the literature on galaxy formation and evolution, we introduce galmoss.We aim to harness the advantages of both traditional and contemporary approaches by providing a framework that enables seamless parallelization while preserving the interpretability of more conventional methods.galmoss processes batches of galaxy photometric images as multidimensional arrays, facilitating efficient fitting on GPUs.This tensor-based approach permits dynamic management of data sizes during computations through the adjustment of batch sizes and the optimization of GPU utilization.For the fitting process, galmoss makes use of gradient descent due to its relatively low computational and memory requirements.Finally, galmoss provides two methods for uncertainty estimation.In the realm of galaxy profile fitting, astrophot (Stone et al., 2023) leverages GPU acceleration and automatic differentiation in its serial optimization process while preserving physical interpretability.While our package focuses on processing numerous images of individual galaxies, astrophot is optimized for fitting multiple galaxies within a single, crowded image.
This paper is structured as follows: Section 2 provides an overview of the galmoss workflow.Section 3 describes the data that we use to showcase the software's performance.Section 4 details the generation of model images, using the Sérsic profile as an example, and introduces a set of built-in profiles along with advanced applications.Section 5 discusses the fitting process post-image generation.Finally, Section 6 showcases galmoss' performance through case studies, with concluding remarks presented in Section 7. generates a series of image models following chosen profiles.These models are then convolved with the PSF of the image and augmented by adding a mean sky value to all pixels.This process results in the image model  , , which is then compared to the image data  , and corresponding sigma images  , .The fitting process employs a  2 loss function along with a mask  , to determine which pixels are included in the optimization.Internally, galmoss employs a gradient descent method embedded in pytorch.

General workflow of galmoss
After completing the fitting process, the package saves the data products, including the image block and fitted parameter values, to the designated output path.The image block files, saved in the FITS format, comprise the original image, the best-fit model image, and the residuals.With each component stored in a given FITS Head Data Unit (HDU), in the case of fitting multiple components (e.g., bulge + disk), each component can also be accessed individually alongside other data products.

Data
For showcasing galmoss capabilities, we gathered galaxy images (128 × 128 pixels) and PSF images (41 × 41 pixels) from the SDSS catalog. 1 Our data selection was guided by the availability of independent galaxy parameters, as provided by the MaNGA PyMorph Photometric Value Added catalog (MPP-VAC-DR17, Domínguez Sánchez et al., 2022), encompassing 10,293 galaxies from the final MaNGA release (Bundy et al., 2014).The selected sample predominantly consists of galaxies with an average redshift z ∼ 0.03 and features a relatively uniform distribution of stellar mass (Collaboration, 2022).The catalog offers photometric parameters obtained through two-dimensional surface profile fitting.These parameters include results from single Sérsic and combined Sérsic + Exponential models, calculated using the pymorph pipeline (Vikram et al., 2010).pymorph integrates sextractor with galfit, the former package streamlines the estimation of initial parameters and the generation of stamps, and the latter aims to fit the galaxy profile.
In our selection process, we excluded galaxies located at the edges of plates where obtaining a complete 128 × 128 pixel image was not possible.We also left out galaxies that failed in the single Sérsic fitting process within the MPP-VAC-DR17 catalog.This resulted in a final sample of 8,289 galaxies.We used only g-band images to ensure a good signal-to-noise ratio throughout the sample.Following pymorph, we generated masked images and initial values using sextractor (Bertin and Arnouts, 1996).

Image generation and built-in profiles
For a given set of galaxy or profile parameters, galmoss generates model images that represent the intensity or surface brightness of each pixel, based on the selected profile.galmoss provides six built-in radial profiles: Sérsic, Exponential disk, Ferrer, King, Gaussian, and Moffat, as well as a Flat sky model (with the option to include more complex sky models).These radial profiles are transformed into two-dimensional images by assuming circular symmetry.This process effectively projects the one-dimensional profile over a circular area in two dimensions to create the final image.To illustrate this projection process more concretely, we use the Sérsic profile as an example, given its widespread application in modeling galaxy profiles.
Furthermore, galmoss supports the combination and integration of different profiles.Detailed instructions on the practical use of the code can be found in the Appendices B-D.

Sérsic
The Sérsic profile (Sersic, 1968) is commonly used to fit the surface brightness distribution of elliptical galaxies, as well as the disk and bulge components of other galaxy types.The intensity () as a function of the radius  is given by where  e denotes the surface brightness at the effective radius  e , which encompasses half of the total profile flux.The Sérsic index  dictates the profile's curvature, and the parameter   is computed numerically from the inverse cumulative distribution function of the gamma distribution.
The versatility of the Sérsic profile lies in its ability to model a wide range of galaxy morphologies by adjusting the Sérsic index .For example, setting  = 4 transforms it into the de Vaucouleurs profile, known as the  1∕4 profile.An exponential disk profile is obtained with  = 1, and  = 0.5 results in a Gaussian profile, enabling users to explore these diverse profiles by simply varying .
In a one-dimensional context, the radius  refers to the distance from the galaxy center.In two dimensions, given a galaxy profile center ( c ,  c ), the position angle of the ellipse profile , and the axis ratio , the circular radius  is defined by Robotham et al. (see e.g. 2017): where (4) The boxiness parameter  introduces flexibility in the shape of the ellipse (Athanassoula et al., 1990), allowing for more general elliptical profiles. = 0 corresponds to a standard ellipse, while positive values result in more box-like shapes and negative values in more disk-like shapes.This flexibility is particularly useful in galaxy modeling, where a range of elliptical shapes can represent the diverse morphologies observed in galaxies.
Rather than directly using the surface brightness  e in the Sérsic, magnitude  and its corresponding zero point magnitude  0 is used to specify the intensity level.This approach is consistent with methodologies employed by widely-used galaxy fitting tools such as galfit (Peng et al., 2010) and profit (Robotham et al., 2017), facilitating easier comparison with observational data.Both quantities are related as follow: where ) . (6) In this formulation,  box serves as a geometric correction factor in  e to account for deviations from a perfect ellipse, influenced by the level of diskiness or boxiness.Specifically, when  = 0, indicating a perfect ellipse,  box = 1, implying no geometric correction.The beta function (, ), is calculated using the relationship (, ) =  () ()  (+) , where  denotes the Gamma function.
Fig. 2 presents the fitting results using the Sérsic profile for galaxy J162123.19+322056.4,which is one of the galaxies in our catalog.The quality of this model is evidenced by the residuals, highlighting the precision of the fit.For those interested in the implementation details, the code used for this analysis is provided in Appendix A.

Other available profiles
In this section, we briefly discuss the built-in profiles other than Sérsic.All these profiles but the Flat sky profile have circular radius  that follows Eq. (2) to Eq. (4).

Exponential disk profile
In an exponential disk profile, the intensity is defined as follows: where  0 is the brightness of the profile's surface in the center (at radial distance  = 0), and  s is the disk scale-length.The relation among the magnitude (,  0 ),  s and  0 is given by where  is the axis ratio, and  box is as defined in Eq. ( 6).

Modified Ferrer profile
The modified Ferrer profile, which is characterized by a nearly flat core and a rapidly truncated shape on the periphery, was originally proposed by Ferrers (1877) and later modified by Laurikainen et al. (2005).The intensity is defined as follows: where  0 is also the central surface brightness parameter,  out is the outer truncation radius, and  and  are parameters governing the slopes of the truncation and core, respectively.
The relation between the magnitude (,  0 ) and the profile model parameters is given by where  is also the axis ratio and  is the beta function defined in Eq. ( 6).The modified Ferrer profile is particularly suitable to model the bar structure in galaxies (Blázquez-Calero et al., 2020;Dalla Bontà et al., 2018).

Empirical (modified) king profile
Since its initial illustration by King (1962), the empirical King profile has been extensively used to fit both galactic and extragalactic globular clusters (BAOlab2 presented by Larsen, 2014; Chies-Santos  , 2007;Bonatto and Bica, 2008;Tripathi et al., 2023) The intensity of this profile is defined as follows: Here,  0 is also the central surface brightness parameter.The core radius,  c , signifies the scale at which the density starts to deviate from uniformity, while  t , the truncation radius, marks the boundary of the cluster.The global power-law factor, , dictates the rate at which the density declines with distance from the center.The concentration of stars in globular clusters can be defined using the parameters  t and  c , denoted by  = log ( ) .

Gaussian profile
In a Gaussian profile, the intensity is defined as follows: where  0 is also the central surface brightness parameter, and  is the radial dispersion.In the galmoss implementation, the Full Width at Half Maximum (FWHM = 2.354) is used instead.The classical application of the Gaussian profile includes modeling a simple PSF and point sources.

Moffat profile
The Moffat profile (Moffat, 1969), commonly used for modeling a realistic telescope PSF, defines its intensity as follows: where and  0 also is the central density.The concentration index, , dictates whether the distribution is more Lorentzian-like ( =1) or Gaussian-like ( → ∞).
When considering the ellipse with axis ratio  and boxiness parameter ( box in Eq. ( 6)), the relationship between the observed magnitude  and profile model parameter is Similar to the Gaussian profile, the Moffat profile can also be used to model point sources.

Flat sky
Unlike other radial light profiles, in galmoss, the sky profile is controlled by the sky mean value  sky across all pixels without a radial matrix: In galmoss, the only parameter needed in sky Profile is  sky .If a sky profile varies with position (radius), it can be included as a user-defined profile.

Image fitting and evaluation
Following the generation of model images, galmoss implements an extended 2 likelihood and gradient descent optimization, leveraging python and pytorch functionalities.In addition, uncertainty estimation is available.

Maximum likelihood estimation
The galmoss package employs a  2 likelihood: where  , represents the model at pixel , , and  , represents the uncertainty of  , .In the  2 distribution, the degrees of freedom, , is the difference between the number of pixels with galaxy flux that are not included in the mask and the number of model fit parameters.
Internally galmoss adopts gradient descent for parallel fitting.
Though slower in convergence than other traditional methods (e.g.Levenberg-Marquardt) it has lower computational and memory demands, enabling efficient parallel fitting on GPUs.

Confidence intervals
Galmoss employs two strategies to estimate uncertainties in galaxy images, primarily using a covariance matrix based on Gaussian distribution assumptions (Hogg et al., 2010).The parameter uncertainties are calculated from the covariance matrix's diagonal, derived using the Jacobian matrix (e.g., Gavin, 2022) for computational efficiency.
This method approximates uncertainties as where  is the Jacobian matrix, and  is diagonal with  , = 1∕ 2 , , offering a 1- confidence level for the parameters.The second methodology for uncertainty estimation is bootstrapping, which involves resampling with replacement each galaxy images multiple times (typically 100 iterations) and refitting them.The uncertainty is given by , where m is the mean value of the estimated parameter.Fig. 4 illustrates a comparison of uncertainty estimation methods for a random subset of galaxies.We display the uncertainties determined through bootstrap analysis (red error bars) and those derived from the covariance matrix (blue error bars).Generally, bootstrap uncertainties encompass a broader range, suggesting a more conservative estimate.show that galmoss's speed surpasses galfit when the batch size slightly exceeds 10 and about 6 × faster when the batch size is 1,000.The benchmark was executed in a machine with the following specifications: CPU -2.2 GHz Intel Xeon Silver 4210; GPU -NVIDIA A100 (80 GB); OS -Ubuntu Linux 18.04 64 bits; RAM -3.9 TB.

Results
To evaluate the performance of galmoss against established methods, we performed a validation test by fitting single galaxy profiles from the same dataset and comparing the structural parameters, such as the Sérsic index and effective radius, with those obtained using galfit, as cataloged in the MPP-VAC-DR17.This comparison acts as a fundamental 'sanity check' to verify the reliability of galmoss's fitting results.Fig. 3 shows overall consistency across key galaxy parametersmagnitude (), position angle (), axis ratio (), effective radius ( e ), and Sérsic index ()-as quantified by the  2 coefficient of determination.
The panels for magnitude, position angle, and axis ratio show a strong linear correlation, indicative of a high degree of alignment with galfit results.Specifically, in the position angle panel, objects that deviate significantly from the identity line are those fitted with large axis ratios ( ∼ 1).This suggests that, in these cases, the light distributions are relatively insensitive to variations in the position angle ().
The parameters  e and , known to be challenging to fit accurately (Trujillo et al., 2001), demonstrate a linear relationship, albeit with greater dispersion compared to other parameters.Notably, in the Sérsic index panel, galmoss values are generally lower than those derived from galfit at higher values of .This discrepancy could be attributed to variations in image quality and the characteristics of the optimization algorithms used in pytorch.For an in-depth analysis of this observed bias, we direct the reader to Appendix E.
In addition to the aforementioned test, we conducted an independent speed comparison using galfit and galmoss using the same dataset and initial values.The fitting of 8,289 galaxies with galmoss completed in roughly 10 minutes-six times faster than with serial galfit.Fig. 5 demonstrates this efficiency gain across different batch sizes, showing galmoss' speed advantage becoming more pronounced as batch size increases, up to the limits of GPU capacity.

Conclusions
In this study, we introduce galmoss, an open-source software package specifically designed for fitting galaxy light profiles on large datasets, ideally suited for the LSST era.Built on the torch framework, galmoss provides efficient 2D surface brightness profile fitting for batches of galaxy images, with the added benefit of GPU acceleration.It features a user-friendly interface that allows for the easy definition of parameters and their ranges.Moreover, galmoss is capable of efficiently quantifying uncertainty through both covariance matrix and bootstrap methods.The package also supports the integration of new profiles, making it an adaptable and versatile tool for statistical analysis in galaxy structure studies.
Benchmark tests reveal that galmoss can achieve speedup gains of up to 6 times compared to galfit, depending on the sample size.One novel aspect of galmoss is its use of native parallelization to process multiple observational fields of single galaxies as elements of a single multidimensional tensor, thereby enhancing its speed through efficient PyTorch GPU-accelerated matrix calculations and memory usage (Paszke et al., 2019).This approach contrasts with astrophot, which focuses on larger fields with multiple galaxies or joint multi-band fitting.A current limitation of galmoss is the requirement to manually select profile models and initial parameters, along with its sub-optimal GPU utilization for small galaxy batches-areas for improvement in future versions.
galmoss is freely available at GitHub,3 Zenodo4 and listed in the Python Package Index.5The readthedocs6 contains example usages, along with an overview of the package.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
For a quick start, we let the disk and bulge profile share the initial value from the sextractor, with an initial Sérsic index of 1 for the bulge component and 4 for the disk component.Compared to the single profile case, we only need to change the code snippet of profile definition.We choose to use bootstrap to calculate the uncertainty here.The second sub-model is a side galaxy.

Appendix E. The fitted bias in high Sérsic index end
As shown in Fig. 3, we observe a general bias in the high Sérsic index () end in the comparison results.Galaxies with large  values (e.g.,  > 6) in the galfit results mostly exhibit lower values (e.g.,  < 6) in the galmoss results.
To further investigate the type of high  galaxies, we use the morphological classification afforded by catalog MDLM-VAC-DR17.We identify 8,289 galaxies into 2,849 ETGs and 5,440 LTGs, and further identify ETGs into 1,891 Elliptical galaxies, 793 S0 galaxies and 165 undefined galaxies.We find that ETGs account for 34.37% of the total galaxy samples and represent 88.78% in the biased region.To investigate further, we plot a histogram for the  value in E and S0 (Fig. E.8).
In the plot, the histogram almost coincides between galmoss fitted results and galfit fitted results in S0 galaxies.However, the distribution of the histogram is very different between the galmoss fitted results and the galfit fitted results in elliptical galaxies.This result reveals that Elliptical galaxies contribute the most to the inconsistency in fitted  values.
There are several reasons for the observed result.One of them is the limitation of observation.Most elliptical galaxies have a steep central core and a flat outer wing.Due to the poor signal-to-noise ratio in the flat wing, the  2 is insensitive to the change of , which acts as a small gradient during the optimization.Another possible reason is related to the properties of the optimizer.Most prevalent optimizers in pytorch use self-adaptive learning rates which reduce the learning rate under low gradient conditions to fasten convergence.As a result, the sensitivity of  is further reduced at the high-value end compared to galfit, which adds to the difference in results between the two software in this case.

Fig. 1 .
Fig. 1.The galmoss workflow: from parameter initialization and image processing to the fitting procedure and final data product generation, including the fitted model and residuals, all stored in FITS format.

Fig. 1
Fig. 1 shows the workflow of galmoss.It begins by reading a vector  = {  ,   } consisting of user-defined initial parameter values, along with the data image, sigma image (data uncertainty), mask image, and Point Spread Function (hereafter PSF) image.The first subset   describes the geometric properties, which define the central position

Fig. 2 .
Fig. 2. G-band galmoss fitting result of J162123.19+322056.4,showing a good fitting quality.The left four images show the model, model residuals (data-model), data, and the residual distribution.On the right, a one-dimensional projection is shown, where the upper panel represents the model distribution with a black line and the orange filled markers with error bars represent the galaxy image's flux distribution and its uncertainty.The lower panel represents the flux residual with blue filled markers with error bars.

Fig. 3 .
Fig. 3. Panels comparing the measured Sérsic profile parameters for the selected ∼ 8,000 SDSS galaxies, along with  2 estimation.The panels each display galaxy magnitude (), position angle (), axis ratio (), effective radius (  ) and Sérsic index ().The transparency of the dots illustrates their concentration.The blue line shows where galfit result equals galmoss result.The histogram on top and right illustrates the distribution of measured values for galmoss and galfit, respectively.The comparison shows proper consistency in all panels.

Fig. 4 .
Fig. 4.A illustrative comparison of uncertainty estimations for fitted galaxy parameters using two methods: covariance matrix (depicted with blue error bars) and bootstrapping (depicted with red error bars), for a random selection of galaxies within our sample.Typically, the uncertainties derived from bootstrapping are observed to be larger than those obtained from the covariance matrix.

Fig. 5 .
Fig. 5.A speed-up performance evaluation between galmoss and galfit.Each line represents the speedup time normalized by the running time of galfit.The results show that galmoss's speed surpasses galfit when the batch size slightly exceeds
Fig. B.6 show the decomposition result, with the residual demonstrating a high quality of fitting.Appendix C. Example code for definition of a set of galaxy profilesHere, we provide an example that shows how to define a combination of profiles and extract the model image from the image function for an initial review.Fig. C.7 shows a model image featuring a central galaxy with a disk, bulge, and bar, each defined by two Sérsic profiles and a Ferrer profile, respectively.Additionally, there is a side galaxy defined by a Sérsic profile, two point sources defined by Moffat profiles, and an open cluster defined by a King profile.All of these are convolved with a PSF image produced by a Gaussian profile. 1 import numpy as np 2 import Galmoss as gm 3 import torch_optimizer as optim Upon importing the package, the first sub-model represents the central galaxy, comprising a disk, bulge, and bar. 1 # The galaxy center 2 xcen = gm.p(65.) 3 ycen = gm.p(64.) 4 5 # Define parameters and profiles 6 Bisk = gm.lp.Sersic ( 7

1
Fig. C.7.The images of the model are provided for an initial review.The model consists of a central galaxy that has a disk, bulge, and bar, each defined by two Sérsic profiles and a Ferrer profile, respectively.Additionally, there is a side galaxy defined by a Sérsic profile, two point sources defined by Gaussian profile.