Mesoscale microscopy and image analysis tools for understanding the brain

Over the last ten years, developments in whole-brain microscopy now allow for high-resolution imaging of intact brains of small animals such as mice. These complex images contain a wealth of information, but many neuroscience laboratories do not have all of the computational knowledge and tools needed to process these data. We review recent open source tools for registration of images to atlases, and the segmentation, visualisation and analysis of brain regions and labelled structures such as neurons. Since the field lacks fully integrated analysis pipelines for all types of whole-brain microscopy analysis, we propose a pathway for tool developers to work together to meet this challenge.


Mesoscale whole brain imaging
Developing a deeper understanding of the brain requires knowledge of both its anatomical and functional organisation. For this we need tools and methods that allow us to image, for example, gene expression patterns and cell morphology over a broad range of spatial scales. Recent developments in sample preparation and microscopy have opened the door for high-resolution whole-brain imaging in small animals such as mice. This presents new challenges to process and understand an ever-increasing deluge of data. This review focusses on recent efforts to analyse and understand mesoscale whole-brain microscopy data with the view to establishing best-practice approaches across the imaging community.

Traditional methods
Microscopic imaging of tissue sections has been a key method in neuroscience since the days of Golgi and Cajal. Over the last century, mechanisms for enhancing contrast of brain regions and cell types have been improved, particularly by immunofluorescence (Coons et al., 1941) and the discovery of fluorescent proteins such as GFP (Chalfie et al., 1994). It is now possible, using fluorescence, to distinguish dozens of cell types, and many cellular components. Fluorescence microscopy has also developed quickly and many methods such as traditional wide-field, confocal (Minsky, 1961), multi-photon (Denk et al., 1990) and super-resolution (e.g. Betzig et al., 2006; Rust et al., 2006) techniques are now in routine use in neuroscience laboratories. The limitation of all of these methods for neuroanatomy is that they can only be used to image a relatively thin tissue section, up to around 100 μm with confocal microscopy, or up to 800 μm in optimal conditions for multi-photon microscopy (Guesmi et al., 2018). With, for example, single neurons projecting to many regions across the brain (Winnubst et al., 2019), studying small areas can prevent an understanding of the global organisation.
Imaging of large intact brains has been possible for many decades, for example using ultrasound (Donald et al., 1958), computed tomography (Hounsfield, 1973), or magnetic resonance imaging (MRI, Lauterbur, 1973). These traditional three-dimensional imaging methods benefit from being non-invasive, and many aspects of brain activity and its structure can be studied. However, they have two main limitations when it comes to studying detailed organisation. Firstly, they have relatively low spatial resolution which, although continuing to improve, is limited to measurements of gross brain anatomy. Secondly, these methods do not have the specificity required to study all aspects of neuroanatomy using cutting edge viral, genetic and immunofluorescence toolkits.

Whole-brain fluorescence microscopy
To reach a better understanding of neuroanatomy it is necessary to image the entire brain at a sufficient resolution to resolve key structures and with the specificity to distinguish cell types and subcellular features. It is possible to image large volumes of the brain by imaging serial sections, and computationally reconstructing a 3D image volume (e.g. Luzzati et al., 2011). These methods, however, are laborious, and the manual sectioning process can introduce many artefacts, particularly as the sections can be damaged during processing.
To study the brain's anatomy across spatial scales, high-resolution images acquired from intact brains are required, rather than post-hoc assembly of multiple sections into a single image. There are two broad classes of methods to generate these images (Osten and Margrie, 2013), the first of which is the combination of traditional block-face methods, in which an image of the surface of the tissue is taken, and in-built tissue sectioning. These methods vary but the common idea is to acquire an image from the intact brain and then remove a section of tissue to reveal the next part of the brain to be imaged. This process is repeated to build up a 3D image. The sectioning happens after imaging, preventing damage from forming part of the image, and by imaging the intact brain, the individual 2D images are aligned to form a 3D volume without errors introduced by computational reconstruction. Three of the most common methods are STPT (serial two-photon tomography, Ragan et al., 2012), fMOST (fluorescence micro-optical sectioning tomography, Gong et al., 2013) and FAST (block-face serial microscopy tomography, Seiriki et al., 2017). STPT uses a two-photon microscope to acquire a tiled image, just below the tissue surface before using a microtome to remove the surface of the tissue, and the process repeats to build up an image of the entire brain. FAST is conceptually similar to STPT, but uses spinning-disk confocal microscopy to increase the speed of data acquisition. In contrast, fMOST uses a diamond knife to remove an ultra-thin section from the surface of the brain while a line scan is acquired from the section as it is cut.
The second group of methods used to acquire whole-brain fluorescence microscopy images is the combination of optical tissue clearing and light-sheet fluorescence microscopy (LSFM). There are many approaches to render brain tissue transparent, for which an in-depth discussion is beyond the scope of this review (for more details, see Richardson and Lichtman, 2015). These include the use of organic solvents (Dodt et al., 2007; Ertürk et al., 2012; Renier et al., 2014), lipid removal (Chung et al., 2013; Hama et al., 2015; Susaki et al., 2014) or simple immersion in refractive index matching solutions (Ke et al., 2013; Kuwajima et al., 2013). Rendering the brain optically transparent, along with immunostaining (Chung et al., 2013; Kim et al., 2015a; Renier et al., 2014), provides a path towards rich high-quality 3D whole-brain datasets. Widefield or traditional point-scanning microscopy is not able to fully exploit the advances in tissue clearing, because of low speeds, and photobleaching due to repeated exposure of the same parts of the tissue. Although based on a very old method (Siedentopf and Zsigmondy, 1903), light-sheet microscopy has only relatively recently been applied to fluorescence microscopy (Voie et al., 1993). LSFM works by illuminating the sample with a thin sheet of light, exciting the fluorophores in a virtual section of the tissue. The resulting fluorescence is then detected by a camera positioned orthogonal to the light-sheet. This selective illumination combined with wide-field detection provides speed, and reduces photobleaching, allowing repeated rounds of imaging. LSFM has been extensively applied to image whole mouse brains (Dodt et al., 2007; Lerner et al., 2015; Renier et al., 2014, 2016; Susaki et al., 2014; Tomer et al., 2014).
The advances in sample preparation and microscopy over the last decade have now made the acquisition of high-quality whole-brain datasets possible. LSFM and STPT systems are available commercially and through mature open source initiatives (Tomer et al., 2014; Campbell, 2020; Economo et al., 2016; Voigt et al., 2019), ensuring an increasingly large user-base.

Image preprocessing
Once raw image data is acquired, it will often need to be preprocessed before image analysis can be performed. The exact steps will depend on the type of microscopy used. One method that is common to all imaging modalities is the stitching of different views of the data. For high-resolution imaging, the brain is often imaged as an array of tiles which must be stitched together to generate a single 3D brain image volume. It is possible to stitch images by simply aligning the edges of each tile, but this requires a well calibrated microscope. The motorized stage must be sufficiently accurate and well aligned to the imaging axis and the field of view (FOV) must be free of distortion. In most cases however, the tiles are aligned by using the information based on a slight overlap in adjacent FOVs. Tiles are roughly aligned using the stage information, and then computationally fused by maximising the similarity between the overlapping regions. In some cases, this is carried out as part of the acquisition software, but there are open-source software packages available (e.g. BigStitcher, Hörl et al., 2019).
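The overlap-based fusion step can be illustrated with phase correlation, one common way to estimate the translation between adjacent tiles. The sketch below is a minimal numpy illustration of the principle, not the algorithm any particular stitcher uses; the `estimate_offset` helper and the synthetic tiles are assumptions made for this example (real stitchers add subpixel fits and a global optimisation over all tile pairs).

```python
import numpy as np

def estimate_offset(moving, fixed):
    """Estimate the (dy, dx) translation of `moving` relative to `fixed`
    via phase correlation: normalise the cross-power spectrum, inverse
    transform, and locate the correlation peak."""
    cross_power = np.fft.fft2(moving) * np.conj(np.fft.fft2(fixed))
    cross_power /= np.abs(cross_power) + 1e-12
    corr = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # wrap peaks beyond the midpoint around to negative shifts
    return tuple(int(p) - s if p > s // 2 else int(p)
                 for p, s in zip(peak, corr.shape))

# synthetic example: tile_b is tile_a cyclically shifted by (3, -2)
rng = np.random.default_rng(0)
tile_a = rng.random((64, 64))
tile_b = np.roll(tile_a, shift=(3, -2), axis=(0, 1))
offset = estimate_offset(tile_b, tile_a)  # recovers (3, -2)
```

In a real pipeline this estimate would only refine the rough placement provided by the stage coordinates, and the per-pair offsets would then be reconciled globally across the whole tile grid.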
Depending on the imaging modality, and the stitching approach, there may also be image artefacts that need to be corrected. Firstly, the image intensity may vary across each tile, which can often be corrected by dividing the image by a reference image. This reference image can be created by imaging a uniformly fluorescent reference slide, or calculated by averaging many images. Secondly, if image tiles overlap, there may be reduced signal intensity in the overlapping regions due to photobleaching (those regions will be illuminated for twice as long as others) which can be corrected in a similar way.
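The division-by-reference correction described above can be sketched in a few lines. The `flat_field_correct` helper and the simulated vignetting profile are illustrative assumptions; the principle (normalise the reference to unit mean, then divide) is the standard flat-field approach.

```python
import numpy as np

def flat_field_correct(tile, flat, eps=1e-6):
    """Divide a tile by a unit-mean gain map derived from the flat
    (reference) image. `eps` guards against division by zero."""
    gain = flat / flat.mean()
    return tile / (gain + eps)

# simulate vignetting: a uniform sample appears dimmer towards the edges
yy, xx = np.mgrid[:32, :32]
vignette = 1.0 - 0.25 * (((yy - 16) / 16) ** 2 + ((xx - 16) / 16) ** 2)
observed = 100.0 * vignette          # uniform signal x spatial gain
corrected = flat_field_correct(observed, vignette)
# `corrected` is now uniform (up to eps), unlike `observed`
```

The same division can correct photobleaching in tile overlaps, with the reference image replaced by a map of cumulative illumination.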
Recently, there have been many methods developed for image de-noising (Weigert et al., 2018; Krull et al., 2019; Batson and Royer, 2019; Fang et al., 2021). These can be used not just to correct for image artefacts, but also to improve resolution and signal-to-noise ratio (SNR). Computational image enhancement should be used with caution, but denoising can be used in whole-brain microscopy (e.g. Chen et al., 2020) to speed up image acquisition (e.g. by reducing required dwell time) or improve analysis (e.g. increased SNR can make feature detection more robust).
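As a toy numerical illustration of the SNR argument: for additive, independent noise, averaging N repeated acquisitions improves SNR by roughly sqrt(N); learning-based denoisers aim for comparable gains from a single noisy image. The synthetic 1D "image" below is purely an assumption for demonstration.

```python
import numpy as np

rng = np.random.default_rng(4)
signal = np.sin(np.linspace(0, 4 * np.pi, 1000))   # toy noise-free "image"

def snr(estimate):
    """SNR of an estimate relative to the known ground-truth signal."""
    return signal.std() / (estimate - signal).std()

single = signal + rng.normal(0, 0.5, signal.shape)
averaged = np.mean([signal + rng.normal(0, 0.5, signal.shape)
                    for _ in range(16)], axis=0)
gain = snr(averaged) / snr(single)   # close to sqrt(16) = 4
```

This is exactly the trade-off mentioned above: the same SNR can be bought either with longer dwell times (more photons per acquisition) or computationally after a faster scan.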

Image analysis
Sample preparation and imaging has been the focus of the field for the last few years, but data analysis is now becoming a bigger challenge. Many neuroscience laboratories do not have extensive image analysis experience, and whole-brain images bring their own challenges. There are many existing open-source tools that can be used for many types of bioimage analysis such as FIJI (Schindelin et al., 2012), Icy (De Chaumont et al., 2012), Vaa3D (Peng et al., 2010), and CellProfiler (McQuin et al., 2018). These packages provide many image analysis algorithms, can be extended via external plugins and can be used for batch processing (e.g. by writing macros). However, these packages do not have all of the necessary functionality for integrated analyses of all types of whole-brain microscopy data, due to three specific challenges of whole-brain microscopy data.
The first challenge is that the data is often very large. A single whole-brain microscopy image can be on the order of a terabyte (TB) in size (e.g. for a mouse brain, Seiriki et al., 2017). Many existing analysis approaches require the data to be stored in memory throughout, and so cannot be used on single images of this scale. The second challenge is that these images require very specific analyses, such as atlas registration and analysis in a common reference space. These methods are often used in other types of imaging, such as MRI (Ashburner and Friston, 1997; Jenkinson et al., 2012), but the algorithms are often not implemented within microscopy analysis software, or are not implemented in such a way to permit the analysis of very large images. The third challenge is the integration of different types of analysis. Packages such as FIJI include, for example, cell detection (Weigert et al., 2020) and image registration (Bogovic et al., 2016) algorithms, but it is not necessarily straightforward to combine them together and assign cells to brain regions. To make the most of advances in microscopy and sample preparation, integrated methods that allow for full end-to-end analysis of whole-brain microscopy images are required.
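The memory problem can be illustrated with a minimal out-of-core sketch: keep the volume on disk as a memory-mapped array and visit it one plane at a time, so peak memory stays at a single 2D plane rather than the full volume. Real pipelines typically use chunked formats such as HDF5 or Zarr (often via dask), but the principle is the same; the file path and toy data here are assumptions for the example.

```python
import numpy as np
import os, tempfile

# Write a toy (50, 256, 256) volume to disk, then compute its mean while
# holding only one 2D plane in memory at a time.
path = os.path.join(tempfile.mkdtemp(), "volume.dat")
shape = (50, 256, 256)                               # (z, y, x)
vol = np.memmap(path, dtype=np.uint16, mode="w+", shape=shape)
vol[:] = np.arange(shape[0], dtype=np.uint16)[:, None, None]  # fake data
vol.flush()
del vol

vol = np.memmap(path, dtype=np.uint16, mode="r", shape=shape)
total = 0.0
for z in range(shape[0]):          # one plane (~128 kB) in RAM at a time
    total += vol[z].astype(np.float64).sum()
mean = total / (shape[0] * shape[1] * shape[2])      # mean plane index
```

The same plane-by-plane pattern scales to terabyte volumes, since the cost of the loop is independent of how much of the volume fits in RAM.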
Over the past few years many methods have been described in the literature, but very few are designed in a flexible manner and released to the community as user-friendly open-source tools. This review will focus on published tools that are freely available to the community, and which could be easily adopted by a typical laboratory without any high-performance computing resources.
When it comes to analysing whole-brain microscopy datasets, one of the first challenges is segmenting the features of interest. Segmentation refers to the assignment of image voxels to a meaningful label, such as a brain region, a cell, a blood vessel, or any other object. Many of these object segmentation problems have been solved for traditional slice histology, but whole-brain images present new challenges. In particular, these images are very large, and pixel intensities vary greatly across the brain due to both biological heterogeneity and non-biological fluorescence artefacts.

Neuronal somata
Although many laboratories focus on the vital function of glia in the maintenance of brain homeostasis, researchers interested in the detection and mapping of neuronal somata have been the main drivers for establishing high-throughput imaging pipelines for brain segmentation, cell identification and counting. In addition to mapping the location of neuronal cell types (Mano et al., 2020), such methods are also used for mapping brain activity and understanding cell-to-cell connectivity (Vélez-Fort et al., 2014). Until recently, neuronal cell detection has been performed manually in whole-brain images (Ogawa et al., 2014; Vélez-Fort et al., 2014; Watabe-Uchida et al., 2012), but this does not scale for routine use, when many thousands of cells can be labelled in each brain. Additionally, manual analyses of this scale are difficult to reproduce, and can introduce an additional source of non-biological variability.
In the last few years, many excellent cell soma detection algorithms for fluorescence microscopy have been published (Chen et al., 2018; Hollandi et al., 2020; Weigert et al., 2020; Stringer et al., 2021). However, these have generally not been applied to whole-brain microscopy data, either because they do not perform well in heterogeneous data, do not scale to very large images, or simply because they are not yet integrated with other steps in the analysis pipeline (e.g. registration, see section 4). Although conceptually simple, detection of cell bodies in whole-brain images is a complex problem, in part because the structure to be detected can vary greatly between experiments and cells. For example, label type (nuclear or cytoplasmic), cell size and shape, and the image quality and signal intensity can differ between samples and experiments.
There have been two classes of approaches to detect cells in whole-brain datasets. The first uses traditional computer vision techniques such as spatial filters and intensity thresholding. These have been applied in 2D in the WholeBrain (Fürth et al., 2018) and AMaSiNe (Song et al., 2020) packages and in 3D in ClearMap, MIRACL (Goubran et al., 2019) and MagellanMapper (Young et al., 2020), but these methods do not always work well with densely labelled cells or in noisy data. The second class comprises machine learning approaches. Many studies have used random forest classifiers implemented using Ilastik (Berg et al., 2019), for example in CUBIC-Cloud (Mano et al., 2020) and in ClearMap. More recently, deep learning (Lecun et al., 2015), and in particular convolutional neural networks (CNNs), has been applied for high-performance cell detection (Iqbal et al., 2019b). These machine-learning approaches however can be slow, and require time-consuming annotation of training data into cell and non-cell voxels. A recently released method (cellfinder) has combined traditional computer vision approaches for speed with a deep-learning network to curate the results (Tyson et al., 2021a). In many cases, detecting the position of the cell is all that is required (rather than defining the cell boundaries), and this can be used to generate training data over much shorter timescales (Frasconi et al., 2014).
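The classical filter-and-threshold approach can be sketched as a difference-of-Gaussians band-pass, a global threshold, and centres of mass of the surviving connected components. This is a deliberately minimal stand-in for the published tools (which add per-plane thresholds, shape filters and, in cellfinder's case, a classification network); the filter sizes and the synthetic blobs are assumptions made for this example.

```python
import numpy as np
from scipy import ndimage as ndi

def detect_somata(volume, sigma_small=1.0, sigma_large=4.0, k=5.0, min_size=10):
    """Difference-of-Gaussians band-pass, global threshold, then centres
    of mass of connected components larger than `min_size` voxels."""
    v = volume.astype(float)
    dog = ndi.gaussian_filter(v, sigma_small) - ndi.gaussian_filter(v, sigma_large)
    labels, n = ndi.label(dog > dog.mean() + k * dog.std())
    sizes = ndi.sum(np.ones_like(dog), labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_size]
    return ndi.center_of_mass(dog, labels, keep)

# synthetic volume: two Gaussian "somata" on a noisy background
rng = np.random.default_rng(2)
vol = rng.normal(0.0, 0.1, size=(40, 40, 40))
zz, yy, xx = np.ogrid[:40, :40, :40]
for cz, cy, cx in [(10, 12, 14), (28, 30, 8)]:
    vol += 5.0 * np.exp(-((zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2) / 8.0)
centres = detect_somata(vol)   # two centres, near the planted blobs
```

The size filter mirrors what real pipelines do to reject noise and debris; the weaknesses noted above (densely packed cells merging into one component, a single global threshold failing on heterogeneous backgrounds) are also visible in this sketch.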
Cell detection (along with registration, see section 4) is an area of whole-brain image analysis with many promising developments, though as yet no single method has been shown to work well across all image modalities and label types, and so researchers must trial multiple methods. There are also no methods that allow for identification of cell types (this must be inferred from the input data). In the future, cell detection methods which involve classification (e.g. the machine learning-based methods) could be extended to classify multiple cell types based on morphology, location, and signal intensity.

Neuronal morphology
Sparse labelling of neurons allows for the segmentation and analysis of the morphology of entire cells, including axons and dendrites. Whole-brain datasets should also allow multiple cells to be segmented in their entirety. There are currently no fully automated methods for neuronal reconstruction in whole-brain microscopy images, both due to the scale of the problem (data volume) and the complexity (tracing single neurons across large images). Unlike cell somata detection, analysis of cell morphology cannot be carried out in a local manner, i.e. large volumes of data must be analysed for a single cell. Neurons can be traced either manually (Han et al., 2018) or semi-automatically by selecting points along a neurite (Arshadi et al., 2020) or manual connection of algorithmically segmented neurite components (Winnubst et al., 2019). There exist more automated methods (Hang et al., 2018; Zhou et al., 2021) but these still require human supervision. All of these methods can be very time consuming, and relative to the microscopy, represent a processing bottleneck. This is partially due to the manual nature of the analysis, but also because visualising large amounts of data to trace a neuron is computationally very intensive. For this reason, specific methods have been developed to store data in custom formats allowing for rapid access for manual analysis (e.g. Pietzsch et al., 2015; Bria et al., 2016; Li et al., 2017, see section 5.3).

Connectivity mapping
While many studies investigating connectivity use cell soma detection (e.g. Vélez-Fort et al., 2014; Menegas et al., 2015) or single neuron reconstruction (Winnubst et al., 2019), sometimes the analysis of dense axonal projections is required. The majority of axonal segmentation algorithms only analyse the brain in 2D sections (e.g. The Allen Mouse Brain Connectivity Atlas: Kuan et al., 2015; Oh et al., 2014). To our knowledge, there are two software packages capable of 3D analysis of whole-brain axonal projections. The first is part of the MIRACL toolbox (Goubran et al., 2019), which uses structure tensor analysis to generate streamlines, estimating the diameter of axon bundles. These streamlines can be traced to determine whether they pass through, or terminate within, a brain region, and are then used to map connectivity. The second method is TRAILMAP (Friedmann et al., 2020), which uses a 3D CNN (a modified U-Net: Ronneberger et al., 2015) to segment individual axons from the background, allowing axonal density to be quantified across the brain. This method however does not allow tracing of connectivity from one region to another. There is not yet a method allowing individual axons to be traced in these dense datasets; this is currently only possible with sparse labelling.

Vasculature
In addition to segmentation of cell bodies and projections, analysis of the vasculature is important, particularly in preclinical studies such as the study of Alzheimer's disease (Bennett et al., 2018). There have been two methods released recently for the segmentation and analysis of whole-brain vasculature networks. The first is TubeMap (Kirst et al., 2020) which binarises labelled vessels, uses a CNN to fill the resulting image and then performs skeletonisation to produce a map of vessels throughout the brain. The vessels are classified as arteries or veins based on antibody staining, and a computational graph can be constructed to investigate vessel properties such as branching. The other method is VesSAP (Todorov et al., 2020) which uses a fully CNN-based method to segment the vasculature. Both methods appear to perform well, but as yet no studies have compared their differing approaches.
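The skeleton-to-graph step common to these vessel pipelines can be sketched by classifying skeleton voxels by their neighbour counts: endpoints have one skeleton neighbour, segment voxels two, branch points three or more. The 2D T-shaped toy skeleton and the use of 4-connectivity are simplifying assumptions for this example; real 3D pipelines skeletonise the binarised vessels first and use 26-connectivity.

```python
import numpy as np
from scipy import ndimage as ndi

def classify_skeleton(skel):
    """Classify skeleton pixels of a 2D binary skeleton by counting
    4-connected skeleton neighbours: 1 -> endpoint, 2 -> segment,
    >= 3 -> branch point."""
    kernel = np.array([[0, 1, 0],
                       [1, 0, 1],
                       [0, 1, 0]])
    neighbours = ndi.convolve(skel.astype(int), kernel, mode="constant")
    neighbours[~skel] = 0                 # only count skeleton pixels
    endpoints = np.argwhere(neighbours == 1)
    branch_points = np.argwhere(neighbours >= 3)
    return endpoints, branch_points

# a small T-shaped "vessel" skeleton: one branch point, three endpoints
skel = np.zeros((9, 9), dtype=bool)
skel[4, 1:8] = True      # horizontal segment (4,1)..(4,7)
skel[1:4, 4] = True      # vertical segment (1,4)..(3,4)
endpoints, branch_points = classify_skeleton(skel)
```

From the branch points and the segments between them one can build the computational graph used to quantify properties such as branching and segment length.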

Brain registration and segmentation
Detecting and locating large numbers of objects (such as all labelled somata) in a whole brain produces a huge amount of data. The obvious way to distil this information and quantify data from multiple animals is to assign the objects to brain regions. The majority of whole-brain microscopy studies now carry out some kind of image segmentation to identify brain structures, and there have been many approaches to solving this problem. The common feature of most of these methods is that they base the segmentation on an existing reference atlas. An atlas typically consists of a reference image (of a single brain, or preferably an average of many), and an associated annotations image, with a mapping from each voxel to a brain region. The standard microscopy reference atlases are traditionally 2D (Dong, 2008;Franklin and Paxinos, 2008), and based on a single animal. While invaluable for many applications, 3D atlases (i.e. a single, aligned 3D volume) are required for computational analysis of whole-brain images.

Image registration
Registration is usually a key part of a whole-brain microscopy image analysis workflow, and refers to the spatial mapping of an atlas reference image onto the sample data. This can be used for atlas-based segmentation (see section 4.1.1), but the sample can also be mapped onto the atlas image. Transforming the sample onto the atlas allows data from multiple animals to be analysed and visualised in the same coordinate space (Fig. 1), which allows a more direct comparison than visualising data side by side, due to inherent variations in brain structure across animals.
There have been many published pipelines for registration of sample data to an atlas, but the majority have only been used for 2D data. However, there are now software packages released that are suitable for registration of 3D data. Registration packages typically fall into two categories, whether they register the entire image volume to the atlas, or whether they register 2D sections separately. One of the most conceptually simple is the 2D registration method implemented within WholeBrain (Fürth et al., 2018). The WholeBrain software detects reference points on a 2D image (at the surface of the brain) and maps these to the surface of the atlas brain. Although this approach works well for 2D data, for 3D whole brain microscopy data this can be time consuming because the user must manually identify the part of the atlas that best matches each image section. To overcome this problem, an extension of WholeBrain has been developed (SMART, Jin et al., 2019) that helps to automate some parts of this manual step. Users can specify the atlas position of the first and last 2D slices in their 3D image and the software can select the atlas planes for the images in between. Although SMART is faster than WholeBrain, the authors estimate that registration of an entire LSFM mouse brain image could still take 3–4 days.
The majority of registration is now carried out using 3D registration tools that are wrappers around existing image registration packages such as NiftyReg (Modat et al., 2010), ANTs (Avants et al., 2011) or Elastix (Klein et al., 2010). These tools typically use a combination of linear (affine) and non-linear (e.g. B-spline) deformations to best match the intensity distributions within the sample and atlas reference images following preprocessing. One of the first methods (ClearMap, Renier et al., 2016) provides a Python interface to Elastix to register LSFM mouse brain images to an atlas (Y. Kim et al., 2015b) at a resolution of 25 μm. This method has now been updated (Kirst et al., 2020) to use the new Allen Mouse Brain Common Coordinate Framework version 3 (Allen CCFv3, Wang et al., 2020).
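The intensity-based objective these tools optimise can be illustrated with a toy example: an exhaustive search over integer translations for the shift that maximises normalised cross-correlation between sample and reference. This is only a sketch of the principle; NiftyReg, ANTs and Elastix optimise full affine and B-spline transforms with gradient-based, multi-resolution schemes and more robust similarity metrics (e.g. mutual information). The synthetic images and search range below are assumptions for the example.

```python
import numpy as np
from scipy import ndimage as ndi

def ncc(a, b):
    """Normalised cross-correlation between two images."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return (a * b).mean()

def best_translation(sample, reference, search=5):
    """Exhaustive search over integer (dy, dx) shifts for the one that
    maximises NCC between the shifted sample and the reference."""
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            shifted = ndi.shift(sample, (dy, dx), order=1, mode="nearest")
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

rng = np.random.default_rng(3)
reference = ndi.gaussian_filter(rng.random((64, 64)), 1.0)  # stand-in "atlas"
sample = ndi.shift(reference, (-3, 2), order=1, mode="nearest")  # misaligned
recovered = best_translation(sample, reference)   # (3, -2) undoes the shift
```

Replacing the brute-force loop with a continuous optimiser over affine parameters, adding a non-linear deformation stage, and running the whole scheme coarse-to-fine gives the basic anatomy of the real registration pipelines.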
Another tool released around the same time is aMAP (Niedworok et al., 2016) which provides a Java interface and FIJI plugin to the NiftyReg library to register STPT data to a 12.5 μm version of the same atlas (Y. Kim et al., 2015b). Unlike ClearMap, aMAP was validated against expert manual segmentation. This tool has now been updated, providing a Python interface and a command line tool, along with support for additional atlases (brainreg, Tyson et al., 2021b).
Many more tools have since been developed, such as MIRACL (Goubran et al., 2019) which provides a graphical user interface for ANTs, and has been shown to work well for both LSFM and STPT data (along with other modalities such as MRI). Additionally, MagellanMapper (Young et al., 2020) provides a graphical interface for registration with Elastix (via SimpleElastix, Marstal et al., 2016). There is now work to develop more accurate registration algorithms that do not rely on simple intensity-based approaches, particularly for situations in which the samples are damaged, or the data is contaminated in some other way (e.g. additional signals). These approaches show promise, but as yet their complexity prevents widespread adoption. Tward et al. (2020) developed a pipeline that can infer missing data to best register multi-modal mouse brain image data. There has also been work developing deep learning-based approaches for image registration. Ni et al. (2020) use a CNN to register a sample image to the atlas by combining the mappings of small blocks of the sample image to blocks of the atlas.

Segmentation
Segmentation of whole-brain microscopy images is typically carried out by registering an atlas reference image onto the sample image, and then applying the same atlas-to-sample transform to the atlas annotations (Fig. 2). The atlas annotations can be overlaid upon the raw image, and used to assign brain regions. An alternative strategy, first developed for human MRI images, is to use CNNs to directly segment the image, without registration to an atlas (Guha Roy et al., 2019; Mehta et al., 2017). There has been one study applying this to microscopy data in mice (albeit traditional 2D data, Iqbal et al., 2019a). This method was used for relatively coarse segmentation of 2D data, but can be used without registration and applied to multiple developmental time points. It could be applied to 3D microscopy data, and could potentially overcome issues with damaged tissue, or be used for experiments in which reference atlases do not exist.
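Once the annotations are in sample space (or, equivalently, cell coordinates have been transformed into atlas space), assigning detected cells to regions reduces to a voxel lookup in the annotation volume. The region IDs and names below are invented for illustration; real atlases such as the Allen CCFv3 ship their own ID-to-structure mapping.

```python
import numpy as np
from collections import Counter

# Toy annotation volume: region IDs 0/101/202 and their names are
# made up for this example.
region_names = {0: "outside brain", 101: "cortex", 202: "thalamus"}
annotations = np.zeros((10, 10, 10), dtype=int)
annotations[2:5] = 101        # "cortex" occupies planes z = 2..4
annotations[5:8] = 202        # "thalamus" occupies planes z = 5..7

# detected cell positions in (z, y, x) voxel coordinates
cells = np.array([[3, 4, 4], [6, 1, 2], [6, 9, 9], [0, 0, 0]])
ids = annotations[tuple(cells.T)]           # one region ID per cell
counts = Counter(region_names[int(i)] for i in ids)
```

The resulting per-region counts are the quantity most whole-brain cell-mapping studies report and compare across animals.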

Reference atlases
3D reference atlases exist for many species, but many of them are not available as a digital 3D image set or are not at a resolution sufficient to take advantage of whole-brain microscopy. Many of them are based on MRI images with relatively low resolution, and some are based on traditional histology with rather modest resolution in the z dimension. For this reason, only high-resolution digital 3D atlases based on microscopy data, or other atlases that have been used for processing of whole-brain microscopy data will be discussed in detail.
The majority of 3D whole-brain microscopy atlases are in mice, and by far the most commonly used is the Allen CCFv3 (Fig. 3A).
This atlas consists of a reference image (with 10 μm isotropic voxels), generated from 1675 STPT images, and an annotations image delineating 658 different brain regions (including isocortical areas, subcortical structures, fibre tracts and ventricles), defined by transgenic reporter mice and axonal projection data along with in situ hybridisation, antibody staining and traditional cytoarchitectural stains such as Nissl.
The Allen CCFv3 is comprehensive, but is still missing many brain region subdivisions. For this reason, an additional atlas was developed (The Enhanced and Unified Mouse Brain Atlas (UAA), Chon et al., 2019) in the same coordinate space, but with additional annotations (Fig. 3B). Primarily, additional regions were added from the Franklin-Paxinos atlas (Franklin and Paxinos, 2008) along with additional experimental data (e.g. additional transgenic lines) and the striatum was further subdivided based on connectivity data in the literature (Hintiryan et al., 2016;Hunnicutt et al., 2016;Oh et al., 2014). Unlike the Allen atlas, this atlas is not defined at high resolution isotropically, but the additional annotations will be valuable for many studies (e.g. of the striatum).
A different way to define an atlas is by directly using gene expression, rather than a mix of gene expression, protein expression, cytoarchitecture and connectivity. A recent atlas (Ortiz et al., 2020) uses spatial transcriptomics (Ståhl et al., 2016), followed by clustering methods for an unsupervised, data-driven approach to subdividing the brain into meaningful regions (Fig. 3C). This atlas is also in the same coordinate space as the Allen CCFv3, but the annotations differ considerably. It remains to be seen whether these delineations fit better with other data (e.g. electrophysiological cell properties) than the traditional methods for atlas generation.
Although the majority of rodent whole-brain imaging is carried out in mice, other model species are beginning to be used, such as the rat (Branch et al., 2019; Stefaniuk et al., 2016). There are not yet rat atlases of the same quality as the mouse atlases, but there are high-quality atlases based on MRI (e.g. the Waxholm Space atlas, Papp et al., 2014). To our knowledge, there has been one study describing a method to register rat LSFM hemisphere images to this atlas (Branch et al., 2019); however, this atlas is of relatively low resolution, and only consists of 76 subdivided regions.
Another group of model species being used with whole-brain microscopy are monkeys, such as marmosets (Skibbe et al., 2019; Susaki et al., 2014, 2020). There are efforts to create high-resolution marmoset atlases by combining data from MRI and traditional microscopy techniques (Majka et al., 2021; Woodward et al.). As tissue clearing and imaging methods evolve, more and more "novel" species will be imaged. Without atlas development, the insights that can be gained from these species will be limited compared to mice, for which many resources have been developed.

[Fig. 2. Atlas-based segmentation. Atlas reference image (top left) and raw data (top right) are filtered, and the reference image is mapped onto the raw data. Other images, such as the atlas annotations and the brain hemispheres, can then be warped similarly onto the raw data.]
Lastly, there is not just a limited range of atlases for different animal species, but also across the lifespan of an animal. The structure of the brain changes considerably with age, and so an appropriate atlas is required for accurate analysis of both brain development and ageing. Developmental atlases do exist (e.g. Thompson et al., 2014), but like many other atlases, these are not 3D, and so are not suitable for automated registration of whole-brain microscopy data. Generating these atlases is also highly labour intensive, as a separate atlas is required for each time point (e.g. through embryonic and postnatal development). One approach to generating these developmental atlases is to computationally construct a 3D atlas from existing 2D data (Young et al., 2021). This promising method may also allow 2D histological atlases (e.g. in other species) to be developed without the laborious processes of data acquisition and annotation.

Atlas reference images
All atlases come with a template (reference) image upon which the annotations are based. In whole-brain microscopy, this reference image becomes critical, because it is used for registration of sample data into the atlas coordinate space. The reference image of the Allen CCFv3 is an STPT image, and the atlases that are either based on it, or warped to it (Chon et al., 2019; Ortiz et al., 2020), use the same image. Registration of other STPT images to this template therefore works well (Kim et al., 2015b; Niedworok et al., 2016), but registration of other imaging modalities (e.g. LSFM) may not work as well. Data preprocessing may help improve registration performance, but LSFM images of cleared tissue are considerably different to STPT images. One study has addressed this (Perens et al., 2020) by developing a warped version of the Allen CCFv3 with an LSFM template. The template was generated from 139 mouse brains cleared with iDISCO+ (Renier et al., 2014). To overcome non-uniform morphological changes following clearing, the authors individually registered six brain regions from the Allen CCFv3 template to the LSFM template. The Allen CCFv3 annotations were warped similarly, and so the LSFM atlas can be used directly with LSFM data, without transforming the data into the original Allen CCFv3 coordinate space. As other atlases also use the Allen CCFv3 template image, their annotations (e.g. Chon et al., 2019; Ortiz et al., 2020) could also be warped into the space of the LSFM atlas. Unlike STPT, LSFM data can vary considerably, partially due to the different microscopes available, but mostly due to the tissue clearing method. Different clearing methods rely on different mechanisms to render the tissue transparent (affecting contrast) and can differentially affect brain size (Wan et al., 2018), potentially causing morphological changes around the ventricles.
This atlas, built from iDISCO+-cleared brains, is well suited to iDISCO+ samples, but similar atlases will likely need to be developed for the other families of clearing methods.

Alternative atlas formats
The majority of brain atlases rely on reference images with a corresponding annotations image, defined by a raster image (Perens et al., 2020), polygons, or smooth curves (Furth et al., 2018). As the resolution of whole-brain microscopy data increases, so will the file sizes of the atlases required, along with the computational requirements for data processing. There are alternative strategies, such as defining an atlas based on the coordinates of every cell in the brain (Murakami et al., 2018), rather than an image of the brain. The authors of this atlas used expansion microscopy (Chen et al., 2015) to generate a very high-resolution image of the mouse brain, and then segmented every cell in the image. The 14 TB image can be represented by point clouds taking up less than 3 GB; however, data must be prepared in a particular way to use atlases of this form (an image must exist with all cells labelled). It remains to be seen whether this approach will become as widespread as image-based atlases, but tools are being developed to take advantage of it (Mano et al., 2020).
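The storage saving of a point-based representation is easy to see: a dense annotation volume stores a label for every voxel, whereas a point cloud stores only the coordinates (and, optionally, a region label) of each segmented cell, so storage scales with cell count rather than volume size. A minimal sketch in Python; the toy array shapes and the region query are illustrative, not the actual format used by Murakami et al.:

```python
import numpy as np

# Dense representation: one label per voxel of a (small, toy) volume.
dense = np.zeros((100, 100, 100), dtype=np.uint16)  # 2 MB even when empty

# Point-cloud representation: one row per cell, (z, y, x) plus a region ID.
rng = np.random.default_rng(0)
n_cells = 10_000
coords = rng.uniform(0, 100, size=(n_cells, 3)).astype(np.float32)
region_ids = rng.integers(1, 50, size=n_cells, dtype=np.uint16)

# Storage scales with the number of cells, not the imaged volume.
points_bytes = coords.nbytes + region_ids.nbytes
print(dense.nbytes, points_bytes)  # the dense volume is far larger

# Region-based queries still work: count cells assigned to region 7.
print(int(np.sum(region_ids == 7)))
```

Note the trade-off mentioned above: this representation only exists if every cell has already been segmented from the raw image.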

Choosing an atlas
Traditionally, atlases suitable for whole-brain microscopy were rare, and there was little choice available. This is gradually changing, and so users will need to choose the most appropriate atlas for their work. In some cases, there will only be a single atlas available for the model species being imaged, but in others, a choice must be made. Sometimes the best atlas may be one developed for a specific imaging modality (Perens et al., 2020); in other cases the choice may depend on the brain region annotations.
As more atlases are developed, researchers will need to choose which set of atlas annotations to use for a specific project. Fig. 3 shows how brain region annotations can vary between atlases, even between two atlases defined in similar ways (e.g. Allen CCFv3 vs UAA). Anatomical boundaries defined using different data, or by different anatomists, can produce vastly different results. As an example, primary motor cortex only overlaps by ~50% between the Allen CCFv3 and the UAA (Chon et al., 2019), and so downstream analyses based on this atlas may vary greatly. Different annotations have not yet been rigorously tested, and so the responsibility falls to the experimenter to choose the most appropriate atlas. However, it is very difficult to compare atlases within analysis pipelines (rather than simply by inspection). Existing atlases have been developed in relative isolation, and as such they are organised in different ways, use different file types, and in most cases are not interoperable. This makes it difficult for researchers to choose the most appropriate atlas, as their analysis pipelines must be rewritten to make use of a new atlas. The MRI community has been using atlases in this way for much longer, and so there is ongoing work to standardise atlases and make them available (Bakker et al., 2015; Myers et al., 2019). More recently, efforts towards standardising analysis and atlas usage have developed for the whole-brain microscopy field, including the natverse (Bates et al., 2020) and BrainGlobe (Claudi et al., 2020) projects. The BrainGlobe project offers a Python application programming interface giving access to a number of atlases in a standard format, allowing users to switch between them. Work to further standardise the generation and release of atlases will be required to simplify their use, and allow the correct atlas to be chosen.

Fig. 3. Comparison of mouse atlases. Single hemisphere section at bregma. A) Allen Mouse Brain Common Coordinate Framework version 3. B) Enhanced and Unified Mouse Brain Atlas (Chon et al., 2019). C) Molecular atlas of the adult mouse brain (Ortiz et al., 2020).

A.L. Tyson and T.W. Margrie, Progress in Biophysics and Molecular Biology 168 (2022) 81–93

Raw data
Whole-brain microscopy presents new challenges for data visualisation. The first challenge is visualisation of the raw data. Compared to traditional microscopy methods, this is far more difficult, as the majority of images do not fit in the memory of most computers. Fortunately, this is a challenge faced by many other imaging fields, and so there are existing strategies to handle these data. The simplest is so-called "lazy loading" of 2D image data. Software such as FIJI and napari (Sofroniew et al., 2021) allow users to scroll through large 3D images plane by plane, loading into memory only the 2D section being viewed. This approach is useful for assessing data quality, but it can be slow and does not provide any 3D information. An alternative strategy is to use file formats that store "chunks" of 3D data at different resolutions. This allows a low-resolution 3D overview to be viewed, with only the data in the field of view loaded into memory when the user zooms in. This approach is implemented in many commercial software packages, along with the open-source BigDataViewer (Pietzsch et al., 2015) plugin for FIJI and the TeraFly (Bria et al., 2016) extension for Vaa3D.
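The "lazy loading" idea can be illustrated with a memory-mapped array: the full stack lives on disk, and only the plane being viewed is read into memory. This sketch uses NumPy's memmap on a raw binary file for simplicity; real pipelines use chunked, multi-resolution formats (e.g. those of BigDataViewer or TeraFly), but the principle is the same:

```python
import numpy as np
import tempfile, os

# Write a toy 3D stack (50 planes of 256 x 256 uint16, ~6.5 MB) to disk.
shape = (50, 256, 256)
path = os.path.join(tempfile.mkdtemp(), "stack.raw")
stack = np.memmap(path, dtype=np.uint16, mode="w+", shape=shape)
stack[30] = 42  # mark one plane so we can recognise it
stack.flush()

# "Lazy" reader: opening the memmap does not load the whole stack;
# indexing a single plane reads only that plane from disk.
view = np.memmap(path, dtype=np.uint16, mode="r", shape=shape)
plane = np.array(view[30])             # ~130 kB in memory
print(plane.shape, int(plane[0, 0]))   # (256, 256) 42
```

A viewer scrolling through planes simply repeats the last step for each requested index, so memory use stays constant regardless of stack size.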

Segmented data
One of the biggest challenges is specific to whole-brain microscopy: 3D visualisation of segmented data in a common coordinate space. These datasets are very complex, potentially containing segmented cells, neurites, brain regions and implanted devices (e.g. Neuropixels probes, Jun et al., 2017). Registration to a common atlas space allows for data from multiple samples to be viewed together (see Fig. 1) further complicating the data to be visualised. Often these data cannot be easily understood in 2D, and so 3D tools which allow visualisation of arbitrary shapes within an atlas coordinate system are required.
Many of the existing packages for whole-brain microscopy analysis include some tools for visualising segmented data along with an atlas (Furth et al., 2018; Kirst et al., 2020), but these are often limited to the data analysed within the software itself and require some programming knowledge. There are also packages released for visualising data from specific atlases, such as the Allen CCFv3 (https://connectivity.brain-map.org/3d-viewer, https://github.com/AllenInstitute/cocoframer, https://github.com/Yaoyao-Hao/BrainMesh), but these cannot be used with other atlases, and are limited in what additional data (other than brain structures) can be visualised.
More recently, tools have been developed that allow integration of both publicly available datasets (such as the MouseLight project, Winnubst et al., 2019) and user-generated data. The natverse (Bates et al., 2020) provides functionality for analysis and visualisation of neuronal morphology, although many of the functions are specific to Drosophila. The SNT FIJI toolbox (Arshadi et al., 2020) allows analysis of neuronal morphology and visualisation of atlas structures and reconstructed neurons from multiple projects in Drosophila, zebrafish and mouse. Lastly, brainrender (Claudi et al., 2021) provides functionality to visualise publicly available data, atlas data and user data, using the same code to visualise data across species. Brainrender is part of the BrainGlobe project (Claudi et al., 2020) to support multiple atlases, and integrates with other software such as brainreg (Tyson et al., 2021b) and cellfinder (Tyson et al., 2021a).

Data annotation
Data visualisation is linked to the problem of annotation. Some whole-brain microscopy analysis tasks cannot yet be fully automated (e.g. reconstructing neuronal morphology, see section 3.2). In these cases, it is valuable not just to visualise raw and segmented data, but also to interact with the data, to manually annotate brain structures or "proofread" the results of automated steps. The segmentation of large structures (e.g. implanted devices) can be carried out on downsampled data that can be loaded into memory (Liu et al., 2020; Tyson et al., 2021b). However, fine structures (such as axons and dendrites) require the user to be able to visualise, and crucially annotate, full-resolution 3D image stacks. Methods have been developed for electron microscopy data, such as CATMAID (Saalfeld et al., 2009) and KNOSSOS (Helmstaedter et al., 2011). These methods can also be used for light microscopy, along with more dedicated methods such as TeraFly (Bria et al., 2016).
Even if data can be loaded and annotated, some structures are difficult to visualise in 3D for accurate annotation (e.g. complex dendritic arbors). This has led to the development of novel visualisation tools, such as virtual reality (Pidhorskyi et al., 2018; Gunther et al., 2019; Wang et al., 2019).
Many methods for rapid visualisation of very large images require data to be saved in specific formats (e.g. Pietzsch et al., 2015; Bria et al., 2016; Li et al., 2017), which most image acquisition software do not support. Converting data to these formats is not trivial, as it can take a considerable amount of processing time, and requires double the space on disk (if the original data are retained). It is important to choose the correct format, as a specific format will often be needed for use with particular visualisation software. The size of the data is also an important consideration: as larger brains (such as those of primates) are imaged, data formats designed for very large images (e.g. Li et al., 2017) will be required.

Additional analyses
Whole-brain microscopy is becoming more common and is being applied more broadly, but user-friendly tools are not available for all types of analysis. While there are many tools available for registration and for segmentation of common structures such as neuronal somata, equivalent tools do not exist for other structures or classes of cells. Structures of a similar size to neuronal somata, such as amyloid plaques, may be detected with existing cell-detection algorithms (Liebmann et al., 2016), but other types of structure cannot. More complex structures such as glial cells are difficult to segment, and there is a need for dedicated tools so that imaging advances can be used to study glia in the same way as neuronal cells. Existing segmentation algorithms are also designed to detect a single type of structure from a single image channel, and mostly cannot distinguish different structures within a single image, although some tools can detect both cell somata and dendrites (e.g. Furth et al., 2018).
Segmentation of large structures (such as lesions, injection sites and implanted devices) is computationally straightforward, but existing software packages do not include such methods, and so users must create their own pipelines. The introduction of more complex implanted devices such as Neuropixels probes (Jun et al., 2017), with hundreds of closely packed recording sites, necessitates the precise mapping of such objects within a common coordinate space from whole-brain microscopy data (Liu et al., 2020). Manual interrogation is possible within the BrainGlobe suite (Tyson et al., 2021b), but general-purpose, automated mapping of these devices onto the segmented brain is not yet available.
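Because probe tracks, lesions and injection sites are large relative to the voxel size, a workable custom pipeline can often be as simple as thresholding a downsampled, registered image and keeping the largest connected component. A hedged sketch of this idea; the threshold value and the synthetic data are illustrative only:

```python
import numpy as np
from scipy import ndimage

def segment_largest_structure(volume, threshold):
    """Binarise a (downsampled) volume and return a boolean mask
    of the largest connected component above the threshold."""
    binary = volume > threshold
    labels, n = ndimage.label(binary)
    if n == 0:
        return np.zeros_like(binary)
    # Component sizes (index 0 is background, so zero it out).
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0
    return labels == np.argmax(sizes)

# Synthetic example: a bright "probe track" plus an isolated bright voxel.
vol = np.zeros((50, 50, 50), dtype=np.float32)
vol[5:45, 20:23, 20:23] = 1.0   # track: 40 x 3 x 3 voxels
vol[2, 2, 2] = 1.0              # isolated noise voxel

mask = segment_largest_structure(vol, threshold=0.5)
print(int(mask.sum()))  # 360 voxels (the track); the noise voxel is excluded
```

Real probe tracks are curved and unevenly labelled, so in practice this mask would be a starting point for fitting a track model, not a finished result.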
Most of the existing analysis pipelines for whole-brain microscopy are conceptually simple; their advance is in coping with the scale and heterogeneity of the data. However, much more sophisticated analyses are plausible. Rather than simple cell detection, some algorithms could be adapted to classify cell types based on morphology. This would allow much richer information to be extracted from these datasets without antibody staining.

Tool comparison
Many laboratories are now faced with a large amount of data, and a confusing landscape of analysis tools to choose from. While in some cases (e.g. vessel segmentation) the number of tools available is relatively limited, in other areas (e.g. registration) there are many tools, with no obvious answer as to which method is best suited to a particular application. Comparisons between tools exist in the literature, but these may be biased, as they are carried out by the developers of a single tool. It is difficult for a single researcher or team to produce an objective comparison of different analysis tools, so it is good practice to invite tool developers to "compete" to produce the best results on a set of benchmark data (e.g. Sage et al., 2019; Ulman et al., 2017). If the original developers of the software carry out the analysis, they are incentivised to produce the best results, and the interested user can see the theoretical best performance of each tool on standardised data. This has not yet been carried out for any aspect of whole-brain microscopy analysis, but will likely be necessary as many more tools are developed. In some cases, generating a metric of accuracy for validation purposes is relatively simple (e.g. cell counting), but in others (e.g. brain region segmentation) it can be much more difficult (Niedworok et al., 2016).
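For cell counting, one straightforward validation metric is to match each detected cell to an unclaimed ground-truth cell within some distance tolerance, and report precision, recall and F1. A minimal sketch; the greedy matching and the 5-voxel tolerance are simplifying assumptions (benchmark efforts typically specify matching more carefully, e.g. via optimal assignment):

```python
import numpy as np
from scipy.spatial import cKDTree

def detection_scores(detected, ground_truth, max_dist=5.0):
    """Greedily match detected points to nearest ground-truth points
    within max_dist; return (precision, recall, F1)."""
    tree = cKDTree(ground_truth)
    matched = set()
    tp = 0
    for point in detected:
        dist, idx = tree.query(point)
        if dist <= max_dist and idx not in matched:
            matched.add(idx)   # each ground-truth cell matches at most once
            tp += 1
    precision = tp / len(detected)
    recall = tp / len(ground_truth)
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

gt = np.array([[10, 10, 10], [50, 50, 50], [90, 90, 90]], dtype=float)
det = np.array([[11, 10, 10], [52, 49, 50], [20, 20, 20]], dtype=float)
print(detection_scores(det, gt))  # two true positives, one miss each way
```

Running several tools' outputs through the same scoring function on shared benchmark data is exactly the kind of standardised comparison the field currently lacks.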

Neuronal somata detection
Neuronal somata detection is one of the most common whole-brain microscopy image analysis tasks, but each tool was originally developed for a different type of data, such as nuclear c-Fos activity mapping or whole-cell labelling in viral tracing experiments (Tyson et al., 2021a). Unless the user has very similar data to that described in the software's publication or documentation, it is not clear even which packages should be tested. This will gradually become clearer as more studies are published using these tools, but until then it remains difficult to compare the performance of multiple algorithms.

Registration and segmentation
Many of the existing methods quantify registration performance (Goubran et al., 2019; Iqbal et al., 2019a; Ni et al., 2020; Niedworok et al., 2016) or compare to other tools (Goubran et al., 2019; Iqbal et al., 2019a; Ni et al., 2020), but these comparisons are of limited utility to a user choosing which software to use. It is common in MRI registration to compare algorithms across many datasets (Klein et al., 2009), but this is much more complex for whole-brain microscopy. The main reason is that the community has not yet decided on measures to assess registration and segmentation accuracy. Many measures have been used, such as comparison to expert region segmentation and landmark registration error (Niedworok et al., 2016; Goubran et al., 2019). Until a standardised set of measures is defined which captures all aspects of registration accuracy, users must rely on trial and error to find the most appropriate tool.
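One widely used candidate measure for region segmentation accuracy is the Dice coefficient between a warped atlas region and an expert manual segmentation: 2|A ∩ B| / (|A| + |B|), where 1.0 means perfect overlap. A sketch with synthetic 2D masks (the masks are toy data for illustration):

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice coefficient between two boolean masks:
    2 * |A intersect B| / (|A| + |B|)."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    total = mask_a.sum() + mask_b.sum()
    return 2.0 * intersection / total if total else 1.0

# Expert segmentation vs a registered atlas region shifted by 2 pixels.
expert = np.zeros((40, 40), dtype=bool)
expert[10:30, 10:30] = True                # 20 x 20 region
registered = np.zeros((40, 40), dtype=bool)
registered[12:32, 10:30] = True            # same region, shifted in y

print(round(dice(expert, registered), 2))  # 0.9
```

Dice alone does not capture everything (it is insensitive to where on the boundary errors occur, and favours large regions), which is one reason the community has not yet settled on a single measure.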

Workflow integration
There are many well-established software tools that include many of the algorithms required for the analysis of whole-brain microscopy data (e.g. FIJI and Vaa3D). There are also dedicated tools specifically designed to provide an integrated platform for the analysis of these datasets (Table 1). Some packages can be used for multiple types of analysis (e.g. Goubran et al., 2019), but none provides an integrated workflow for all types of analyses. Unlike many other types of microscopy, whole-brain microscopy images may contain different features across spatial scales that need to be segmented and analysed. A single image could contain injection sites and labelled cells along with lesions and implanted devices. To fully analyse the data, all of these features must be segmented and analysed in a common coordinate space. In contrast to traditional image analysis packages in which all necessary analyses can often be carried out (McQuin et al., 2018; Schindelin et al., 2012), whole-brain image analysis must be carried out with multiple packages and the results combined by the user. This process is time-consuming and technically difficult, because it relies on custom pipelines developed by each laboratory. Such pipelines are rarely re-used by the community.
The majority of software in this field is developed by academics, for whom publishing a paper is often the most important end result. There is rarely funding for continued software development and refinement, and so the software is often more difficult to use than necessary, does not interface with other software packages, and does not always use the most up to date technologies. To overcome these issues, without requiring an onerous amount of work, we propose three well-established techniques for increasing interoperability, and reducing duplication of effort. These are common file formats, software packages and plugin systems.
Increasing interoperability of software packages will have two main advantages for the community. The first is that users can combine different types of analysis within a single workflow (e.g. cell detection and vessel segmentation). The second is that it will allow direct comparison of different approaches to the same problem. In the case of cell detection, there are many different methods, each of which was developed for different types of data and cellular markers. It is likely that one of these will be the most successful for an individual dataset, but it is time consuming to directly compare methods on a single dataset.
An increase in interoperability will make it easier for users to compare algorithms (e.g. by visualising results in the same software), and create an integrated pipeline by selecting the most appropriate parts of existing software packages.

Common file formats
One of the easiest ways to increase interoperability of different software packages is the use of common file formats. Although many packages carry out the same types of analysis, the data are stored in different ways, and as such can be challenging to visualise. In some cases the actual file type differs (e.g. NIfTI vs TIFF for storing registration results), but in other cases the format of the underlying data also changes (e.g. the image origin for cell somata coordinates). Converting formats often requires programming knowledge, and requires the user to spend time understanding the underlying format. Deciding upon common formats (even as optional exports from the software) would immediately allow analysis using multiple packages, and visualisation and comparison in a single visualisation environment. For most aspects of whole-brain image analysis this would be relatively simple, as the majority of files saved are 3D images, points or surfaces, for which existing standards are available.
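For detected cell positions, even a plain tabular format with explicit metadata (atlas name, axis order, units) would go a long way towards interoperability. A sketch of what such a convention might look like; the field names and header layout are illustrative, not an established standard:

```python
import csv, json, io

# Hypothetical convention: a JSON comment line describing the coordinate
# space, followed by CSV rows of (z, y, x) positions in atlas coordinates.
metadata = {"atlas": "example_atlas_25um", "space": "zyx", "units": "um"}
cells = [(120.0, 88.5, 301.0), (64.0, 10.0, 45.5)]

buffer = io.StringIO()           # stands in for a file on disk
buffer.write("# " + json.dumps(metadata) + "\n")
writer = csv.writer(buffer)
writer.writerow(["z", "y", "x"])
writer.writerows(cells)

# Any package agreeing on this layout can read the points back,
# and knows unambiguously which space the coordinates live in.
buffer.seek(0)
header = json.loads(buffer.readline().lstrip("# "))
reader = csv.reader(buffer)
next(reader)                     # skip the column-name row
points = [tuple(map(float, row)) for row in reader]
print(header["atlas"], points[0])
```

The crucial part is not the file type but the agreement: once the axis order, origin and units are written down with the data, conversion between tools stops requiring guesswork.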

Common software packages
Although each new software package contains novel analysis algorithms, much of the code is repeated from one tool to another. Routines such as loading and saving data and assigning detected features to an atlas are common to nearly all software. If these routines were centralised and available for use by the community in isolation from specific analysis packages, developers could save a great deal of time rather than reinventing the wheel. A useful side effect is that by adopting these common solutions, new software would naturally become more interoperable, as it would be written to be compatible with the same common software packages. The use of common software packages is standard practice in all areas of computer science, including microscopy analysis, such as ImgLib2 for FIJI. The only existing package specifically for whole-brain microscopy is the BrainGlobe Atlas API (Claudi et al., 2020), providing a common Python-based interface for downloading, managing and interfacing with neuroanatomical atlases. Software using this package (Claudi et al., 2021; Tyson et al., 2021a, 2021b) can simply reuse code for using atlases and for defining neuroanatomical conventions. If packages such as this were widely adopted by the community, they would reduce the burden of developing new software, and increase interoperability.

Plugin systems
A logical extension of using separate software packages for common tasks is to develop plugins for existing software. This provides the benefits of a central, community-managed software package for all whole-brain microscopy analysis tasks, without the prohibitive amount of effort that would be involved in coordinating such an effort. The plugin ecosystem has been very successful for Vaa3D and FIJI, and some whole-brain analysis packages have been written as extensions for Vaa3D (Bria et al., 2016) or as FIJI plugins (Arshadi et al., 2020; Niedworok et al., 2016). However, most recent packages are written in Python (Kirst et al., 2020; Renier et al., 2016; Todorov et al., 2020; Tyson et al., 2021a; Young et al., 2020). Although it is possible to use Python code from FIJI or Vaa3D, reducing the amount of effort needed to develop compatible software will be key to increasing interoperability.
napari is a new Python-based image viewer, created with the visualisation and analysis of large microscopy images in mind. One of the aims of napari is to develop a plugin architecture to leverage the growing community of image analysis packages developed in Python, and to provide a user-friendly graphical user interface and interoperability between software. Adopting existing software like napari, in which many difficult problems have been solved (such as visualisation of large multichannel images), would also reduce the time taken to develop new packages and would increase the potential for interoperability between software. More importantly, a user-friendly interface would encourage users to adopt new methods and exploit the benefits of recent developments in sample preparation and imaging. Writing plugins for an existing software ecosystem would also simplify the use of general-purpose image analysis algorithms on whole-brain microscopy data. Although dedicated tools are necessary for some tasks (e.g. atlas-based segmentation), other tasks can be accomplished by existing tools. There are many excellent 3D fluorescence microscopy analysis methods implemented as plugins for FIJI or napari, such as those for image denoising (Weigert et al., 2018), general segmentation (Berg et al., 2019) or cell segmentation (Chen et al., 2018; Hollandi et al., 2020; Weigert et al., 2020; Stringer et al., 2021). In isolation, these methods may not be sufficient for the analysis of whole-brain microscopy data, but they could be used in combination with domain-specific tools.

Table 1. Comparison of selected whole-brain microscopy-specific analysis tools.

Communication and collaboration
A typical whole-brain microscopy study may generate many TB of data from multiple brains, and so sharing raw data can be very difficult. It is also desirable to share data in a standardised manner so that it is most easily re-used (rather than a patchwork of solutions from individual laboratories). Dedicated infrastructure such as the Brain Image Library (http://www.brainimagelibrary.org, Benninger et al., 2020) provides one such way for the community to collaborate by sharing raw data.
Sharing processed data can be much simpler. For a cell detection workflow, this can be distilled down to cell positions in atlas space, but this still presents an issue when communicating the results to collaborators or in a publication. Simple 2D figures are limited, and it is difficult to fully comprehend text summaries of cellular distributions. 3D, interactive, explorable and shareable data summaries are required to allow others to fully appreciate the data. Interactive web visualisations can be exported in 2D using WholeBrain or 3D using brainrender, but these are not fully customisable, and do not yet allow the user to explore arbitrary properties of the data.
The whole-brain microscopy field has shared protocols for tissue preparation and imaging, along with data analysis tools. However, the majority of these data analysis tools rely on complex machine learning algorithms which require training data. Many existing machine learning tools provide repositories for sharing training data and trained models (e.g. http://www.mousemotorlab.org/dlc-modelzoo, https://bioimage.io). A community effort to host and share such data for whole-brain microscopy segmentation methods would further reduce barriers to entry for those new to the field.

Conclusions
The field of whole-brain microscopy in small animals has exploded in the last ten years following advances in tissue clearing and microscopy, although analysis tools have lagged somewhat. Many tools exist for registration and segmentation, but only a few, such as WholeBrain (Furth et al., 2018) and MIRACL (Goubran et al., 2019), are user-friendly enough to be widely adopted by neuroscientists. These datasets may contain huge amounts of information (labelled cells, neurites, vasculature, implanted devices, etc.), but there is no single platform that allows a user to perform all of these analyses. Although some tools have shown promise for integrating multiple types of analysis (Goubran et al., 2019; Kirst et al., 2020; Tyson et al., 2021b), there is not yet a platform that allows them to be combined with other custom analyses. We propose that this problem can be solved through collaboration and the development of open standards and plugins for existing software.

Declaration of competing interest
The authors declare that they have no competing financial interests or personal relationships that could be perceived to have influenced the work reported in this paper.