Speeded-Up Robust Features-based image mosaic method for large-scale microscopic hyperspectral pathological imaging

Microscopic hyperspectral imaging technology has been widely used in pathological analysis as it can obtain both spatial and spectral information of samples. However, most hyperspectral imaging systems can only capture images in a single field of view. Therefore, an image mosaic is one of the most important steps in a large-scale microscopic hyperspectral imaging system. This paper proposes a microscopic hyperspectral image (HSI) mosaic method based on Speeded-Up Robust Features (SURF) and linear synthesis to achieve large-scale HSIs. In contrast to other SURF-based image mosaic methods, the proposed method leverages both image content and coordinate information to improve the accuracy and stability of the image mosaic. In addition, multiple bands of HSIs with different texture information and grayscale are applied in image matching to take full advantage of spatial redundancy. Simultaneously, a blank microscopic HSI screening method is introduced in this paper to pick out a clearer blank image for better preprocessing, i.e. removing interference in the optical path and the interference of dust on slides. Finally, the preprocessed images are synthesized by linear-based synthesis methods due to their simple synthesis structure and better universality. Additionally, a file format, i.e. hyperslide, is defined for large-scale HSIs and can be browsed with HyperViewer software. Experimental results show that the proposed microscopic HSI mosaic method can obtain high-quality large-scale microscopic HSIs of tissue sections.

(Some figures may appear in colour only in the online journal) Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Introduction
Pathological diagnosis is of great importance in tumor identification, and microscopic imaging techniques have been widely used for this purpose to provide the specific structures of biological cells, organelles and tissues [1][2][3]. With the development of artificial intelligence as well as novel cheaper and more powerful technologies, increasingly large and complex neural networks are being presented online and new requirements of turning single-field imaging into whole-slide imaging are emerging for pathological images [4]. Individual images, i.e. tiles, usually represent very small areas because they are scanned at very high microscope magnifications for clear texture information, which will directly affect the observation and analysis of sections [5]. In contrast, if the section is scanned at low magnification to obtain an overall image, it is easier to observe the distribution pattern. Nonetheless, the low resolution of these images seriously restricts feature extraction. Therefore, a natural solution to this problem is to leverage computational tools to assemble several individual tiles to form a high-resolution large-scale image [6]. Subsequently, several large-scale pathological databases have also been introduced for the training and testing of neural networks [7,8]. In this case, a whole-slide image (WSI) with high resolution (submicrometer) will assist in clinical diagnosis by giving access to both the detailed structure and overall preview of tissue slide images within a relatively short time [9,10]. Automated whole-slide imaging scanners have been generally used in education, medicine and institutes [11]. For example, Kawano et al proposed a microscopic whole-slide imaging system based on color darkfield internal reflection illumination for biological applications [12], and Guo et al reported a real-time autofocusing-strategy-based whole-slide imaging system with low cost and high throughput [13].
Autofocus and image mosaics are both important steps in whole-slide imaging systems or large-scale pathological imaging systems. To date, several researchers have proposed various image mosaic algorithms mainly based on Speeded-Up Robust Features (SURF) and scale-invariant feature transform (SIFT) to acquire large-scale pathological images [14]. Shan et al leveraged SURF in a mosaic for two adjacent images and then matched eigen points via the Bray-Curtis similarity [15]. Pang et al solved the image mosaic problem of blood cells using a SIFT-Gauss splicing fusion method [16]. However, current pathological diagnosis is mostly based on color images and their limited information obstructs improvements in diagnostic accuracy. Therefore, the microscopic hyperspectral image (HSI) technique is incorporated into pathological diagnosis such as red blood cell count automation, lymphoblastic leukemia cell identification, chemical selective analysis of malaria-infected red blood cells, etc [17][18][19], whereas studies on HSI-technology-based large-scale pathological imaging are still rare. Consequently, after the study of the autofocus algorithm, it is necessary to further study the corresponding mosaic algorithm to obtain high-quality microscopic hyperspectral pathological image data for pathological diagnosis.
The reconstruction of microscopic hyperspectral WSIs is of great importance but difficult because the mosaic algorithm has to simultaneously preserve both spatial information and spectral information. Therefore, some researchers have implemented specific precise devices to generate the WSIs, such as prism-based slit-array dispersion and digital micromirror device-based spatial light modulators [20,21]. Additionally, some microscopic HSI mosaic algorithms, usually consisting of image preprocessing, image registration and image fusion, have also been put forward to obtain microscopic hyperspectral WSIs. The preprocessing of microscopic HSIs is usually implemented by dividing by a blank image captured on the same slide with the same light source. This will remove interference from the dirty spots on the slide and the optical path. Image registration is a fundamental component in medical image processing and it can be formulated as an optimization problem: finding the optimal spatial transformation model between two or more images [22]. Image registration is generally composed of a coordinate transformation model, optimization model, multiresolution strategy, image interpolation method and cost function [23]. The methods of image registration are typically classified into two groups: nonfeature-based and feature-based [24]. Nonfeature-based methods are sensitive to image transformation and are computationally complex [24], whereas feature-based methods are faster and more robust in relation to noise and large deformations. The frequently utilized features include points (region corners, points with high curvature), lines (section boundaries, blood vessels, fibrous tissue) and significant regions (tumor regions, hyperplasia regions, cells) [25]. Several feature-based image registration methods have been proposed, such as SIFT and SURF [26].
SIFT leverages a high-dimensional vector to describe a feature point and each feature point is independent of image size and rotation with good tolerance to light and noise [27,28]. Al-khafaji proposed a spectral-spatial SIFT method to extract spectral-spatial scale-invariant features for HSI matching [29]. SURF is similar to SIFT with respect to feature point matching, but it can extract many robust local features more rapidly thanks to two-dimensional Haar wavelet responses, integral images and scale space technique [30][31][32].
Image fusion is also a fundamental image processing technique that stitches several images to obtain a large-scale image for better identification, detection and segmentation of natural and artificial objects [33,34]. Large-scale microscopic high-spatial-resolution HSIs (HRHSIs) are now mostly obtained by fusing low-spatial-resolution HSIs (LRHSIs) and high-spatial-resolution multispectral images (HRMSIs) [35][36][37]. These methods are generally classified into three groups: extensions of pan-sharpening, model-based and low-rank structure-based fusion methods [38]. The pan-sharpening methods typically utilize the spatial information of a panchromatic image to sharpen the multispectral images (MSIs) to generate HRHSIs [39]. However, some undesirable effects may occur because of spectral range differences between LRHSIs and HRMSIs [40]. The second method, model-based, is the most popular in HSI fusion problems. In this method, LRHSIs and HRMSIs are respectively regarded as the decomposition parts of HRHSIs in the spatial and spectral domains through degradation matrices [41]. Nevertheless, these large matrices in both the spectral and spatial domains will lead to higher space complexity. Regarding the third fusion method, the low-rank properties are preserved when analyzing the HSIs and then each pixel is processed independently without respect to the inherent spatial relationship [38,42]. Therefore, this method results in distortions of the spectral structure and spatial details as well as the spectral features. The above three fusion methods are widely applied in HSI and MSI fusion and they have obtained good results in generating HRHSIs. However, these methods require both HSIs and MSIs, which is too time-consuming and memory-expensive during image acquisition.
In this paper, we propose a microscopic HSI mosaic algorithm based on SURF and linear synthesis to obtain large-scale microscopic HSIs. These images were captured by a homemade acousto-optic tunable filter (AOTF) hyperspectral imaging system with an autofocus algorithm considering wavelength changes [43,44]. In this algorithm, the content-based mosaic method is repeatedly applied to the captured microscopic HSIs rather than requiring both HSIs and MSIs. Moreover, this method is supplemented with the position-based mosaic method so that the mosaic HSIs are generated with higher accuracy and stability. In addition, there are differences in texture and grayscale between different wavelengths, and these spatial-redundancy-based multi-wavelength characteristics are applied to the microscopic HSI mosaic. Experimental results show that the proposed algorithm can obtain high-quality large-scale microscopic HSIs, which is useful for building HSI pathological databases.

Methods
Microscopic image mosaic algorithms have been widely used in large-scale color image mosaics, and most of these algorithms can be divided into two categories: multi-image mosaic with overlapping regions and multi-image mosaic without overlapping regions. As shown in figure 1, the application of a multi-image mosaic with overlapping regions requires several choices to be made, such as multiple images acquired in a certain region, the overlapping area between each pair of images and a suitable size of overlapping area.
Given two horizontally adjacent images with overlapping regions, the ratio of the width of the overlapping regions to the width of the w × h image is assumed to be α. The similarity of eigenvectors is calculated to obtain the shift of each image from the combined image, i.e. the displacement vector (∆w, ∆h). In ideal circumstances, if the left image stays still, the displacement vector of the left image is (0, 0) and that of the right image is (w × (1 − α), 0). Similarly, for two vertically adjacent images, if the upper image remains still, the displacement vector of the upper image is (0, 0) and that of the lower image is (0, h × (1 − α)). As a result, two images are composited together in figure 2 based on the displacement vectors to generate a large image of size (w + ∆w_1 + ∆w_2, h + ∆h_1 + ∆h_2).
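As a quick arithmetic sketch (the function and variable names are ours, not from the imaging system), the combined size follows directly from the displacement vectors; taking the right tile's displacement as w(1 − α) makes the combined width equal to 2w minus the overlap width:

```python
def composite_size(w, h, dw1, dh1, dw2, dh2):
    """Size of the image composed from two w x h tiles whose displacement
    vectors from the combined image are (dw1, dh1) and (dw2, dh2)."""
    return (w + dw1 + dw2, h + dh1 + dh2)

# Two horizontally adjacent 1024 x 768 tiles with overlap ratio alpha = 0.25:
w, h, alpha = 1024, 768, 0.25
dw2 = int(w * (1 - alpha))  # left tile stays still, right tile is shifted
print(composite_size(w, h, 0, 0, dw2, 0))  # (1792, 768) = (2*1024 - 256, 768)
```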
For multiple images without overlapping regions, extra information is needed for the image mosaic. It is usually implemented by recording displacement information, e.g. the absolute position of the triaxial electric loading platform when acquiring the current image, or the relative shift from the previous position of the platform. The displacement of the triaxial electric loading platform is precisely controlled to ensure that the content of two adjacent images is continuous and there is no gap, or only a very small gap, between the two images. Then, all images are stitched together according to the position information to obtain the large-scale image. This method places high accuracy requirements on the electric platform of the acquisition system; otherwise there will be large gaps in the resultant image.
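Position-based stitching then reduces to pasting each tile at its recorded position. A minimal sketch, assuming gap-free recorded positions already converted to pixels (names are ours):

```python
import numpy as np

def stitch_by_position(tiles, positions_px, tile_h, tile_w):
    """Paste tiles onto a canvas at their recorded pixel positions.
    tiles: list of (tile_h, tile_w) uint8 arrays; positions_px: (y, x) top-left."""
    max_y = max(y for y, _ in positions_px) + tile_h
    max_x = max(x for _, x in positions_px) + tile_w
    canvas = np.zeros((max_y, max_x), dtype=np.uint8)
    for tile, (y, x) in zip(tiles, positions_px):
        canvas[y:y + tile_h, x:x + tile_w] = tile
    return canvas

tiles = [np.full((4, 4), v, np.uint8) for v in (50, 100, 150, 200)]
positions = [(0, 0), (0, 4), (4, 0), (4, 4)]  # a 2 x 2 grid, no overlap
print(stitch_by_position(tiles, positions, 4, 4).shape)  # (8, 8)
```

In the real system, stage inaccuracy makes the pasted positions only approximate, which is why the content-based method dominates and the positions serve as a consistency check.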
The above two methods are both adopted in this system, where the multi-image mosaic with overlapping regions dominates and the multi-image mosaic without overlapping regions assists in inspecting and correcting to increase the accuracy of the algorithm. In addition, microscopic HSI is abundant in information, especially in spatial redundancy. Therefore, the image-content-based mosaic method is repeatedly used supplemented by the image-position-based mosaic method to obtain higher accuracy and stability.

Image registration algorithm
An intersection must exist between the SURF eigenvectors of two adjacent images with overlapping regions. A coordinate mapping relation, i.e. the offset vector, is established for the eigen points with the same eigenvector in the two images. The images with overlapping regions in this paper are captured under the same light source settings, so some SURF eigen points have different coordinate information but similar eigenvectors. Simultaneously, this characteristic also holds between SURF eigen points of overlapping regions and nonoverlapping regions. As shown in figure 3(a), the k-nearest-neighbor (KNN) method [45,46] combined with random sample consensus (RANSAC) [47,48] is leveraged to generate and screen eigen points to obtain effective eigen point pairs. Compared with optimum matching methods, KNN calculates the first K candidate eigen points and then discards some of them depending on the situation. Optimum matching methods are much simpler, but a correct eigen point may be missed because there are many similar tissues in microscopic pathological images.
To generate correct eigen point pairs, the SURF eigen points of both image A and image B are calculated through the SURF algorithm. The N SURF eigen points in image A are recorded as {⃗v_A,1, ⃗v_A,2, ..., ⃗v_A,N} and the M SURF eigen points in image B as {⃗v_B,1, ⃗v_B,2, ..., ⃗v_B,M}. The similarity of all eigen points in image A and in image B, containing N × M eigen point pairs, is calculated as the distance between their eigenvectors:

s(i, j) = ∥⃗v_A,i − ⃗v_B,j∥, 1 ⩽ i ⩽ N, 1 ⩽ j ⩽ M. (1)

It is shown in figure 3(b) that multiple point matching may appear in KNN. Considering that the image shifts linearly, the correct connection is A and D, whereas B and C should be removed. Therefore, RANSAC is utilized to remove unreasonable eigen points by means of iteration, which helps generate the correct offset vector during the scan of eigen point pairs. The eigenvectors of matched SURF eigen points are similar to each other, so a set of eigen point pairs is generated in this system. A SURF eigen point ⃗v contains sub-pixel coordinate information, i.e. (x_⃗v, y_⃗v), which is important information in extracting SURF eigen points. For one pair of SURF eigen points (⃗v_A,i, ⃗v_B,Ni,j), the coordinate information of ⃗v_A,i is (x_⃗vA,i, y_⃗vA,i) and that of ⃗v_B,Ni,j is (x_⃗vB,Ni,j, y_⃗vB,Ni,j), as shown in figure 4.
One pair of SURF eigen points can be formulated as a two-dimensional vector ⃗p_i,j:

⃗p_i,j = (x_⃗vB,Ni,j − x_⃗vA,i, y_⃗vB,Ni,j − y_⃗vA,i). (2)

Most of the two-dimensional vectors point in the same direction and the few that diverge can be eliminated by RANSAC. It is assumed in this system that a linear translation model fits the data, yielding a linear translation vector, i.e. the offset vector of the two images ⃗p_final. Given a threshold δ ∈ (0, 1) and an iteration number k, the linear translation model is accepted only if at least δ × N × M of the N × M SURF eigen point pairs satisfy it. The specific process of RANSAC screening is as follows:

(1) Randomly select one eigen point pair and take its two-dimensional vector as the initial linear translation vector of the model.
(2) If the two-dimensional vector of another eigen point pair deviates from the initial linear translation vector by less than ε, this pair of eigen points is a reasonable value and vice versa, where ε is the internal threshold in the linear translation model, i.e. the degree to which the two-dimensional vector of each eigen point pair is shifted from the initial linear translation vector.
(3) If the number of all reasonable SURF eigen point pairs is larger than δ × N × M, the final linear translation vector ⃗p_final is calculated based on these eigen point pairs.
(4) If the number of all reasonable SURF eigen point pairs is smaller than or equal to δ × N × M, go back to step (1) and increase the iteration number by one. If the maximum iteration number is exceeded, ⃗p_final is generated from the model among the first k models that fits the data best.
In summary, RANSAC scans all the eigen point pairs and removes unusual SURF eigen point pairs. This method will preserve large numbers of reasonable eigen point pairs to obtain the relative displacement of two images ⃗ p final .
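A minimal numerical sketch of this screening for a pure-translation model (a simplified RANSAC; the thresholds, fallback and names are ours, not the system's):

```python
import numpy as np

def ransac_translation(pts_a, pts_b, eps=2.0, delta=0.6, max_iter=100, seed=0):
    """Estimate the offset vector between matched eigen point coordinates
    pts_a, pts_b (K x 2 arrays) by the iterative screening described above."""
    rng = np.random.default_rng(seed)
    vecs = pts_b - pts_a                      # two-dimensional vectors p_ij
    best, best_count = vecs.mean(axis=0), 0
    for _ in range(max_iter):
        cand = vecs[rng.integers(len(vecs))]  # step 1: random initial vector
        inliers = np.linalg.norm(vecs - cand, axis=1) < eps  # step 2
        if inliers.sum() >= delta * len(vecs):               # step 3: accept
            return vecs[inliers].mean(axis=0)
        if inliers.sum() > best_count:                       # track best model
            best, best_count = vecs[inliers].mean(axis=0), inliers.sum()
    return best                                              # step 4: fallback

# 20 matches shifted by (100, 5), plus two gross mismatches
rng = np.random.default_rng(1)
a = rng.uniform(0, 500, (20, 2))
b = a + np.array([100.0, 5.0])
b[:2] += np.array([[40.0, -30.0], [-25.0, 60.0]])  # bad eigen point pairs
print(np.round(ransac_translation(a, b)))  # approximately [100.   5.]
```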

Image fusion algorithm
Microscopic HSI includes more abundant spatial information compared with single-band gray imaging, which is taken into consideration to achieve microscopic HSI fusion with higher accuracy. Several images of the same field of view {I_1, I_2, ..., I_N} are acquired at different wavelengths {λ_1, λ_2, ..., λ_N} (N is the number of wavelengths in the microscopic HSI) and there are texture feature and grayscale differences due to the different wavelengths, as shown in figure 5.
Considering that the differences between these images lead to relatively large differences between SURF eigen points, a multi-wavelength image matching method based on the spatial redundancy feature is adopted in this system and then the images are fused based on the location information to obtain the offset vectors of two adjacent images.
The microscopic HSI data consists of N gray images {I 1 , I 2 , · · · , I N } at different wavelengths {λ 1 , λ 2 , · · · , λ N } and its physical position in the imaging system is represented by {x H , y H , z H }. The calculation steps are as follows: (a) Calculate the initial offset vector ⃗ p init .
(b) Calculate the offset correction vector ⃗ p pos .
The physical coordinate information of each microscopic HSI is recorded in the microscopic HSI system. The physical coordinates of two adjacent microscopic HSIs H_1, H_2 are recorded as {x_1, y_1, z_1} and {x_2, y_2, z_2}. The coordinate unit is millimeters with four decimal places reserved, giving a highest precision of 0.1 µm. First, calculate the number of pixels per unit size under different objectives based on the magnification factor per unit size α; there are five magnification factors {α_4, α_10, α_20, α_40, α_100}, corresponding to the 4×, 10×, 20×, 40× and 100× objectives. Then, calculate the offset in physical position (dx, dy, dz) of the two images as

(dx, dy, dz) = (x_2 − x_1, y_2 − y_1, z_2 − z_1). (5)

The position of the image on the Z axis is not required in the image mosaic, so this paper only calculates the pixel offset on the X-axis and Y-axis, i.e. the offset correction vector ⃗p_pos = (p_x, p_y):

⃗p_pos = (dx · α, dy · α). (6)

(c) Calculate the fusion offset vector ⃗p_fusion.
Any two grayscale images {I 1,n , I 2,n , 1 ⩽ n ⩽ N} of two adjacent HSIs at the same wavelength and another two grayscale images {I 1,m , I 2,m , 1 ⩽ m ⩽ N, m ̸ = n} at another wavelength have the same offset vector. However, in a practical imaging system, the images will have different grayscale distributions, contrast ratios and texture features because of the different incident wavelengths, resulting in deviations of offset vectors. Hence, multi-wavelength grayscale images are adopted to calculate the fusion offset vector through linear fusion on account of wavelength information.
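Before the wavelength bookkeeping below, note that the offset correction vector of step (b) is essentially a unit conversion from stage millimeters to pixels. A minimal sketch (the α value of 3000 px/mm is invented for illustration, as are the names):

```python
def offset_correction_vector(pos1_mm, pos2_mm, alpha_px_per_mm):
    """Offset correction vector p_pos = (p_x, p_y) from the recorded stage
    coordinates (mm) of two adjacent HSIs; the Z axis is ignored in the
    mosaic, so only X and Y offsets are converted to pixels."""
    dx = pos2_mm[0] - pos1_mm[0]
    dy = pos2_mm[1] - pos1_mm[1]
    return (dx * alpha_px_per_mm, dy * alpha_px_per_mm)

# a 0.2500 mm move in x under an objective with alpha = 3000 px/mm
print(offset_correction_vector((10.0000, 5.0000, 1.2),
                               (10.2500, 5.0000, 1.2), 3000))  # (750.0, 0.0)
```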
The spectral range of a microscopic HSI is fixed, i.e. (λ_start, λ_end); if the image contains N bands, the interval between wavelengths is

λ_step = (λ_end − λ_start)/(N − 1). (7)

When M bands of the image are extracted, the interval between wavelengths becomes

λ′_step = (λ_end − λ_start)/(M − 1). (8)

Ideally, the microscopic HSI of M bands is suitable only when λ_step and λ′_step satisfy

λ′_step mod λ_step = 0. (9)

If λ_step and λ′_step satisfy (9), the M suitable wavelengths {λ_l1, λ_l2, ..., λ_lM} will satisfy

λ_li = λ_start + (i − 1) · λ′_step, 1 ⩽ i ⩽ M. (10)

Otherwise, the proximate wavelengths are found among the N wavelengths: ∀λ_li ∈ {λ_l1, λ_l2, ..., λ_lM}, ∀λ_n ∈ {λ_1, λ_2, ..., λ_N}, if |λ_li − λ_n| reaches its minimum, λ_n is taken as the approximate value of λ_li:

λ_li ← argmin_{λ_n} |λ_li − λ_n|. (11)

The set of offset vectors {⃗p_l1, ⃗p_l2, ..., ⃗p_lM} of the M grayscale image pairs of the two adjacent microscopic HSIs at wavelengths {λ_l1, ..., λ_lM} is calculated on the basis of section 2.1. Outliers may exist in this set in a practical imaging system, so the position information is exploited to correct the vectors. If an offset vector ⃗p_li, 1 ⩽ i ⩽ M, and the offset correction vector ⃗p_pos satisfy

∥⃗p_li − ⃗p_pos∥ < ε_pos, (12)

where ε_pos is a distance threshold, ⃗p_li is considered a reasonable offset vector; otherwise the offset vector is removed. The original offset vectors {⃗p_l1, ⃗p_l2, ..., ⃗p_lM} turn into {⃗p_l1, ⃗p_l2, ..., ⃗p_lQ} after position correction, and these Q offset vectors are then linearly fused to obtain the fusion offset vector ⃗p_fusion. The corresponding linear fusion weights β_l1, β_l2, ..., β_lQ represent the contribution of the offset vectors of the different wavelength images to the fusion offset vector. In this paper, a linear weight calculation method based on wavelength differentiation is adopted: a certain wavelength is taken as the intermediate point, the difference between the wavelength of each offset vector and that of the intermediate point is taken as the raw weight, and the weights of the Q offset vectors are then obtained by normalization. If M = Q, i.e. the M offset vectors are all reasonable, all weights are evenly divided:

β_li = 1/Q, 1 ⩽ i ⩽ Q. (13)

Otherwise, some offset vectors are removed and the remaining weights are evenly divided as shown in figure 6. The wavelength interval ∆l_i between each offset vector wavelength λ_li and the intermediate wavelength λ_mid is defined as

∆l_i = |λ_li − λ_mid|, (14)

and the weight value is given by normalization:

β_li = ∆l_i / Σ_{j=1}^{Q} ∆l_j. (15)

Finally, the fusion offset vector ⃗p_fusion is calculated as

⃗p_fusion = Σ_{i=1}^{Q} β_li · ⃗p_li. (16)

(d) Calculate the offset vector of the microscopic HSI ⃗p_final.
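The band selection and the position-screened fusion of step (c) can be sketched as follows (an illustrative sketch, not the authors' implementation: the threshold value is invented, the survivors are averaged with equal weights as in the M = Q case, and all names are ours):

```python
import numpy as np

def select_wavelengths(lams, M):
    """Pick M bands spread evenly over the N available wavelengths,
    snapping each ideal wavelength to its nearest available one."""
    lams = np.asarray(lams, dtype=float)
    ideal = np.linspace(lams[0], lams[-1], M)
    return [int(np.abs(lams - lam).argmin()) for lam in ideal]

def fuse_offsets(offsets, p_pos, eps_pos=5.0):
    """Discard per-wavelength offset vectors far from the position-based
    correction vector p_pos, then average the survivors."""
    offsets = np.asarray(offsets, dtype=float)
    keep = np.linalg.norm(offsets - np.asarray(p_pos), axis=1) < eps_pos
    return offsets[keep].mean(axis=0)

idx = select_wavelengths(range(450, 701, 10), 4)  # N = 26 bands, M = 4
offs = [(101, 4), (99, 6), (100, 5), (60, 40)]    # last one is an outlier
print(idx, fuse_offsets(offs, p_pos=(100, 5)))
```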
If the offset vectors at different wavelengths ⃗p_l1, ⃗p_l2, ..., ⃗p_lQ all contain the same error, there will be a gap in the mosaic image. In this circumstance, the fusion offset vector ⃗p_fusion should be corrected again by the initial offset vector ⃗p_init. Before the correction, the error vector ⃗p_err between the two vectors is calculated as

⃗p_err = ⃗p_init − ⃗p_fusion, (17)

where the error vector ⃗p_err represents the offset of the initial offset vector ⃗p_init from the fusion offset vector ⃗p_fusion. In view of the higher stability of the multi-wavelength fused offset vector ⃗p_fusion, the two vectors are averaged when ⃗p_err is very small. The larger the error vector ⃗p_err, the smaller the difference between ⃗p_final and ⃗p_fusion. By that analogy, when ⃗p_err is too large, ⃗p_fusion is adopted directly as ⃗p_final and ⃗p_init is considered to have a large error.

Blank microscopic HSI screening algorithm
In the process of large-scale multidimensional data acquisition, a series of microscopic HSIs with large blank margins will influence the accuracy of the autofocus and image mosaic. Therefore, the proportion of empty area is calculated in this paper. First, two grayscale images I_t,n, I_b,n at the wavelength λ_n nearest to λ_ref are separately extracted from the microscopic HSI with histopathological tissue H_t and the blank microscopic HSI H_b. H_t and H_b share the same wavelengths {λ_1, λ_2, ..., λ_N} and λ_n is defined as

λ_n = argmin_{λ_i ∈ {λ_1, ..., λ_N}} |λ_i − λ_ref|. (19)

Secondly, the ratio image I_r is calculated as

I_r = I_t,n / I_b,n. (20)

The two grayscale images I_t,n, I_b,n have the same incident light and I_b,n has higher mean brightness because there are no histopathological tissues in the blank image. Therefore, the pixel values in the ratio image I_r are all within [0, 1] and individual outliers are also clipped to [0, 1].
Finally, the ratio image I_r is converted to a binary image using its average pixel value as the threshold, and the binary image is divided into six parts as shown in figure 7 to calculate the ratio of blank area in I_r.
The ratios of blank area in the six parts are respectively calculated as

r_i = N_white,i / N_total,i, 1 ⩽ i ⩽ 6, (21)

where N_white,i is the number of white pixels and N_total,i is the total number of pixels in the ith part. r is leveraged in the system to optimize autofocus and large-scale image acquisition, reducing the negative impact of large blank areas. Simultaneously, a blank image is of great importance in microscopic HSI processing because it assists in both spectral correction and the removal of intrinsic interference. In addition, if the blank image is disturbed by factors such as dust on the slide, the image quality will deteriorate. Therefore, two blank images H_1, H_2 are taken during image acquisition and one of them is screened by a simple image-brightness-based evaluation method for image preprocessing. First, two grayscale images I_1,n, I_2,n at wavelength λ_n, which is calculated in (19), are chosen from the two blank images. Then, the two images are alternately treated as the image with histopathological tissue and the blank image, and the ratios S_1, S_2 of blank area in I_1,n, I_2,n are calculated following formula (21):

S_1 = Calculate(I_1,n / I_2,n), S_2 = Calculate(I_2,n / I_1,n). (22)

If there is some dust in I_1,n and I_2,n is clearer, more of S_2 will be determined as blank area. In the end, the clearer blank HSI S_final is screened by comparing the total pixel values of S_1 and S_2.
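The ratio-image screening condenses to a few lines. A simplified sketch (it computes a single global blank ratio instead of the six-part split, and the names are ours):

```python
import numpy as np

def blank_ratio(tissue_img, blank_img):
    """Fraction of blank area: divide by the blank image, clip the ratio
    to [0, 1], binarise at the mean, and count the bright pixels."""
    ratio = tissue_img.astype(float) / np.maximum(blank_img.astype(float), 1e-6)
    ratio = np.clip(ratio, 0.0, 1.0)
    binary = ratio > ratio.mean()
    return binary.sum() / binary.size

blank = np.full((8, 8), 200.0)
tissue = blank.copy()
tissue[:, :4] = 80.0               # left half carries (darker) tissue
print(blank_ratio(tissue, blank))  # 0.5: the right half is blank
```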

Microscopic HSI synthesis strategy
Multi-image synthesis can be simplified into multiple two-image syntheses and the process is then iterated in a loop. There are usually two kinds of synthesis strategy: tree-based and linear-based. In a tree-based synthesis strategy, the process of multiple two-image synthesis can be demonstrated as a binary tree, i.e. a synthesis tree as shown in figure 8(a). The basic idea of the synthesis tree is to decompose a large synthesis task into multiple two-image synthesis tasks. In practical experiments, the number of images is arbitrary, so two methods are introduced when the number of microscopic HSIs is N. One method is to avoid grouping the images, always keep one synthesis tree and adjust it according to the number of nodes in each layer. If the number of nodes n in one layer is even as in figure 8(b), the nodes in this layer are combined in pairs and the microscopic HSIs paired by the two adjacent nodes are stitched, etc. Otherwise, the final two nodes in this layer are stitched first and the stitched image is set as one node in this layer, so that the number of nodes becomes even as shown in figure 8(c).
Another method is to group the images into groups of a power of two to construct several full binary trees, as shown in figure 9. If the number of input images in the synthesis tree is N and it can be expressed as the sum of powers of two, this synthesis tree can be transferred into multiple synthesis trees of full binary trees. At the end of each subtree synthesis, the microscopic HSIs represented by these root nodes are stitched again to obtain the final image.
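The power-of-two grouping reduces to the binary decomposition of N; a small sketch (the function name is ours):

```python
def power_of_two_groups(n):
    """Split n input images into groups whose sizes are powers of two,
    so each group can be stitched as a full binary synthesis tree."""
    groups, bit = [], 1
    while n:
        if n & 1:            # this bit of n contributes one group
            groups.append(bit)
        n >>= 1
        bit <<= 1
    return sorted(groups, reverse=True)

print(power_of_two_groups(13))  # [8, 4, 1]: three full binary subtrees
```

The root images of the resulting subtrees are then stitched once more to obtain the final image, as described above.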
In summary, a tree-based synthesis strategy has a good hierarchy, but there are large numbers of extra processing steps during the reconstruction and it does not have good universality because it relies heavily on the number of input images.
In a linear-based synthesis strategy, the two images being stitched at each step will generally not have similar size because it is a growing image synthesis strategy, as shown in figure 10. Each single block in this schematic represents a mosaic microscopic HSI and the block turns black once it is stitched, until all the blocks turn black, representing the end of the mosaic. Therefore, each microscopic HSI in this method has only two kinds of status, i.e. unstitched and stitched, resulting in a simple synthesis structure. On top of this, there is no requirement on the number of input images and no specific process in this method, which means it has generality.
As a result, a linear-based synthesis strategy is adopted in this paper for the following reasons. With respect to synthesis structure, a linear-based synthesis strategy has a more straightforward and simple structure, which is easier for engineering realization. As for the input images, the tree-based synthesis strategy needs to be adjusted according to the number of input images and synthesizes in a recursive way, whereas a linear-based synthesis strategy directly stitches the images without any additional processing operations. In terms of the output of the intermediate results, the two strategies will both output the microscopic HSIs at each node and store them in the storage system of the Linux server. The intermediate resultant image is the synthesis result of two images in a tree-based synthesis strategy, whereas the intermediate images in a linear-based synthesis strategy are the synthesis result of all the previous images, which can show the process of the image mosaic step by step. Regarding the output image, both synthesis strategies will output a large-scale microscopic HSI and its corresponding parameter files. To summarize, a linear-based synthesis strategy is leveraged in this paper thanks to its simple structure, friendly input and output, and good generality.
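The linear strategy itself is a simple loop; in the sketch below, side-by-side concatenation stands in for the real registration-based compositing (names are ours):

```python
import numpy as np

def linear_stitch(tiles):
    """Linear synthesis: start from the first tile and repeatedly append
    the next one, so each intermediate result already contains all the
    previous tiles."""
    mosaic = tiles[0]
    for tile in tiles[1:]:   # each tile is either stitched or still pending
        mosaic = np.hstack([mosaic, tile])
    return mosaic

tiles = [np.full((4, 4), v, np.uint8) for v in (10, 20, 30)]
print(linear_stitch(tiles).shape)  # (4, 12)
```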

Hyperslide large-scale image specification
Large-scale microscopic HSIs will take up a huge amount of memory and hence conventional image file formats are not suitable for storing and browsing large-scale HSIs. Therefore, this paper defines a specific file format named hyperslide based on the TIFF 6.0 standard for HSIs. This format can be opened with the specialized software HyperViewer, which adopts several development frameworks such as .Net and Spring as well as various programming languages such as C#, Java, Python, IDL and JavaScript. The number of pixels in the length and width of hyperslide files can be up to 2^32, which is enough to store images of large size. However, hyperslide itself does not include spectral information because it consists of a pyramid of false-color composite images derived from the hyperspectral cube. Nevertheless, if the hyperslide file and the raw file are stored in the same folder, spectral information can be viewed in real time.
The physical structure and logical structure of a hyperslide file respectively correspond to the existence of hyperslide in the operating system and the storage structure of image data in a single hyperslide file. As shown in figure 11(a) and table 1, a hyperslide file consists of a three-stage system: image file header (IFH), image file directory (IFD) and image data. A TIFF file can contain several images and each image has a corresponding IFD. As depicted in figure 11(a), the logical structure refers to the concrete properties of the IFH, IFD and directory entries (DEs), i.e. image size, data and compression mode, etc. A hyperslide file is logically composed of four groups: the triple pyramid image, thumbnail, index information and tag diagram of the sample. Among them, the index information is a DE of the first-stage image and each of the other parts corresponds to an IFD. The image data in hyperslide can be stored as strips or tiles; here, the triple pyramid images and thumbnails are stored in tile format, improving the efficiency of random access to large-size images.
As shown in figure 11(a), the resolution of each layer of the image in the pyramid decreases in equal proportion layer by layer and the image with the maximum resolution is the original image. The triple pyramid images in this paper contain the first-stage main image and the second- and third-stage secondary images. The resolution relationship of each layer is defined as

r_i = 2 · r_{i+1}, (23)

where r_i is the resolution of the ith-stage image.
Although the pyramid structure can increase the real-time performance of the multi-resolution display, it increases the disk storage space. As a data cube, a single HSI can reach the GB level, so a stitched HSI will put huge pressure on the storage space of ordinary computers. In addition, each single-band image of an HSI is a 256-level grayscale image. Human eyes can only distinguish around 30 gray levels, but can distinguish thousands of different shades and brightnesses [49]. Therefore, color images are more conducive to interpretation by human eyes than grayscale images. In order to reduce the storage burden of the system and make it easy to view images, a pseudo-color image is leveraged as the content of each stage image in this paper. The first-stage image is the pseudo-color image of the original stitched HSI array at 40× and the second-stage image is the down-sampled image of the first-stage image. Similarly, the third-stage image, which is displayed on the screen, is obtained after secondary down-sampling.
Hyperslide is a large-scale index image file that contains only the spatial information, not the spectral information, of the stitched HSIs. To obtain spectral information, the spatial coordinates in the large-scale HSI are transformed into the spatial coordinates of specific numbered cubes in the cube array through the coordinate mapping relationship demonstrated in figure 11(b). Tiles in the figure are numbered in ascending order from left to right and top to bottom, and cubes are numbered in collection order. Tiles of the same color are generated from the same cube, and the color also indicates exactly which part of the cube's two-dimensional space generates those tiles. JSON character strings, i.e. the index information, record the exact position of each cube in the first-stage image; they are written to the custom directory entry tag HyperMatch of the first-stage image.
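As an illustration, one plausible shape for the HyperMatch JSON index is sketched below. The field names start_x, start_y, zero_x and zero_y come from the text, but cube_id and the exact layout are our assumptions; the actual schema in figure 11(c) may differ:

```python
import json

# Hypothetical HyperMatch index: one entry per cube, recording where the
# cube's pseudo-colour tile starts in the first-stage image.
hyper_match = json.dumps([
    {"cube_id": 0, "start_x": 0,    "start_y": 0,  "zero_x": 0, "zero_y": 0},
    {"cube_id": 1, "start_x": 1350, "start_y": 12, "zero_x": 0, "zero_y": 0},
])

index = json.loads(hyper_match)  # parsed back when the file is opened
```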
An example of the index information is shown in figure 11(c), where (start_x, start_y) represents the coordinate of the actual starting position of the cube in the first-stage image. As shown in figure 11(d), thanks to the image pyramid, the image block with the most appropriate resolution in the pyramid is displayed to maximize clarity. Image roaming includes three kinds of operation: up roaming, down roaming and peer roaming. In peer roaming, internal block storage ensures that only partial block data exist in memory. Up roaming means switching from the current image resolution to a higher-resolution image; down roaming is the opposite. Peer roaming operates within the same stage of the pyramid. Assuming that the current roaming image is in the ith layer of the pyramid, the magnification coefficient j of the currently displayed image ranges from 0.25 to 1.75. If the user scales the image and j > 1.75, the system assumes that the image of the current layer is not clear enough, so it switches to the (i − 1)th layer, i.e. up roaming, and the magnification is set to 0.875. If j < 0.25, the system considers that a lower-resolution image is more suitable for the current display, so it switches to the (i + 1)th layer, i.e. down roaming, and the magnification is set to 0.5. Otherwise, peer roaming is maintained.
In conclusion, layer switching under this magnification scheme is smooth, infrequent and barely perceptible to the user.
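The switching rule can be sketched as follows. The thresholds 1.75/0.25 and the reset magnifications 0.875/0.5 follow the text; the function and parameter names are ours:

```python
def update_roaming(layer, j, j_min=0.25, j_max=1.75, n_layers=3):
    """Pyramid roaming rule: given the current stage `layer`
    (0 = full resolution) and magnification coefficient `j`,
    return the new (layer, j)."""
    if j > j_max and layer > 0:
        # Up roaming: switch to the higher-resolution layer; the new
        # magnification is reset to 0.875 (= 1.75 / 2, stages differ by 2x).
        return layer - 1, j_max / 2
    if j < j_min and layer < n_layers - 1:
        # Down roaming: switch to the lower-resolution layer; the new
        # magnification is reset to 0.5 (= 0.25 * 2).
        return layer + 1, j_min * 2
    return layer, j  # peer roaming: stay in the same stage
```

Note how the reset values land in the middle of the valid [0.25, 1.75] range, which is what keeps switching infrequent.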
The spectral curve of a hyperslide file can be obtained from the corresponding cube based on the mapping relationship. First, as shown in figure 11(e), a coordinate matrix is established according to (start_x, start_y) and (zero_x, zero_y) in HyperMatch. Each element of the matrix represents the upper-left origin point (0, 0) of the two-dimensional index of a cube in the first-stage image. Viewing a spectrum can be divided into two operations: viewing a single-point spectrum and viewing a regional spectrum. Viewing a single-point spectrum can be modeled as a two-dimensional plane search problem, and viewing a regional spectrum can subsequently be simplified to multiple single-point search problems.

Single-point spectral mapping algorithm
Single-point spectral mapping is triggered when a point (x, y) in the image is clicked. Assuming that the coordinates of the origin point of a cube (of two-dimensional size H × W) in the first-stage image are (x', y'), if (x, y) satisfies (25), the point is located within that cube and its absolute coordinate is (x − x', y − y'). The spectral information S(x − x', y − y') can then be read through the absolute coordinate. However, a brute-force search has time complexity O(mn), which is not efficient. Therefore, the order of the matrix and the variability of the search step length are exploited to reduce the search time and improve the real-time performance of the software.
It is obvious in figure 11(e) that the x-coordinates of the cube origins increase from left to right along each row and the y-coordinates increase from top to bottom along each column. Given this ordering, a binary search would speed up the algorithm to a time complexity of O(lg(mn)), but it cannot accommodate a variable search step length. Therefore, a heuristic algorithm is designed in this paper: find the position in the horizontal direction first and the position in the vertical direction second. The upper-right corner of the matrix is selected as the search starting point, and a direction marker DTrigger controls the search direction (1 means horizontal, −1 means vertical). The specific search steps are as follows. (a) If x' ≤ x ≤ x' + W, set DTrigger = −DTrigger and evaluate whether y is also within range; if it is, the cube to which the coordinate belongs has been found. Otherwise, continue searching horizontally and repeat step (a). (b) If y ≤ y', search upward until y' ≤ y ≤ y' + H; if y' + H ≤ y, search downward until y' ≤ y ≤ y' + H. The cube containing (x, y) is then found.
This method does not need to traverse the entire two-dimensional matrix to find the target. With a constant step size, the time complexity is O(m + n) and the length of the search path is the Manhattan distance (the shortest possible).
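The two-phase walk above can be sketched as follows. This is a minimal sketch with a constant unit step; the `origins` grid, function name and return convention are ours, not the paper's:

```python
def find_cube(origins, W, H, x, y):
    """Heuristic cube search: start from the upper-right corner of the
    origin matrix, fix the horizontal position first (d_trigger = 1),
    then the vertical one (d_trigger = -1).
    `origins[r][c]` is the (x', y') origin of the cube in row r, column c."""
    rows, cols = len(origins), len(origins[0])
    r, c = 0, cols - 1            # upper-right corner of the matrix
    d_trigger = 1                 # 1: horizontal search, -1: vertical
    while 0 <= r < rows and 0 <= c < cols:
        xo, yo = origins[r][c]
        in_x = xo <= x <= xo + W
        in_y = yo <= y <= yo + H
        if in_x and in_y:
            return r, c           # cube containing (x, y) found
        if d_trigger == 1:
            if in_x:
                d_trigger = -1    # horizontal position fixed, go vertical
            else:
                c += 1 if x > xo else -1   # step right or left
        else:
            r += 1 if y > yo else -1       # step down or up
    return None                   # point lies outside the stitched image
```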
Additionally, the variability of the search step length is also useful for improving search efficiency. The variables used in this method are shown in table 2.
Δx and Δy denote the remaining horizontal and vertical distances between the target point and the origin of the current cube, i.e. Δx = |x − x'| and Δy = |y − y'|. The step size for each move is then defined as Step_x = (Δx / Threshold_row) · Step_min and Step_y = (Δy / Threshold_col) · Step_min.
The default unit step size Step_min is set to 1, i.e. one grid at a time, and the empirical values of Threshold_row and Threshold_col are 1350 and 950, respectively (depending on the cube size). With the variable search step size, each move covers a larger distance, so the number of attempts is greatly reduced and the time complexity can approach O(1).
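One possible implementation of the variable step is given below; the floor rounding and the minimum-step clamp are our assumptions, as the paper does not specify them:

```python
import math

def step_size(delta, threshold, step_min=1):
    """Variable search step: the farther the target (delta = |x - x'| or
    |y - y'|), the larger the jump through the origin matrix, clamped
    to at least one grid per move."""
    return max(step_min, math.floor(delta / threshold) * step_min)
```

With the empirical thresholds from the text, a point 4200 pixels away horizontally yields a jump of three cubes per move, while a point 500 pixels away vertically falls back to the unit step.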

Regional spectral mapping algorithm
The regional spectral mapping algorithm is triggered by drawing a closed region or curve. The problem can then be simplified to multiple single-point spectral mappings by sampling M coordinate points along the drawn trace or within the closed region, i.e. P_1, P_2, …, P_M. The regional spectrum is obtained by averaging the spectra of all points in the set, each acquired through single-point spectral mapping. This approach is simple, but it ignores the spatial correlation of the points and always searches from the upper-right corner of the two-dimensional array, so the search time becomes too long if M and the array are large. Consequently, after the mapping of the initial sampling point P_1 is completed, the search starting point for each remaining point P_n is set to the cube containing P_{n−1} (P_{n−1} and P_n are adjacent, as shown in figure 11(f)). This effectively shortens the search time: if P_{n−1} and P_n do not belong to the same cube, the cube containing P_n is ideally found by moving at most one step.
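The warm-start strategy can be sketched as follows, assuming a `find_from(point, start)` helper that wraps the single-point search (all names here are ours):

```python
def regional_spectrum_cubes(points, find_from):
    """Map a sampled point set P1..PM to cube indices, starting each
    search at the cube of the previous point (warm start).
    `find_from(point, start)` wraps the single-point search;
    start=None means the default upper-right starting corner."""
    cubes, start = [], None
    for p in points:
        cube = find_from(p, start)
        cubes.append(cube)
        start = cube              # next search starts from this cube
    return cubes
```

The regional spectrum is then the mean of the single-point spectra read from the returned cubes.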

Experimental results and analysis
All data used in this paper were collected by a home-built AOTF microscopic hyperspectral imaging system. The HSI system is composed of a microscope (Nikon 80i, Nikon Corp.), an AOTF (VA310-.37-.80-L, Brimrose Corp.), an AOTF controller (VFI130-140SPFB2C2exSTS, Brimrose Corp.), a grayscale scientific complementary metal oxide semiconductor camera (sCMOS, Dhyana 400D, Tucsen Corp.), a color charge-coupled device detector (color CCD, DigiRetina 16, Tucsen Corp.) and a personal computer [7]. In addition, multidimensional imaging software was developed to control these devices, supplying a friendly interface to obtain color images and HSIs simultaneously. On top of this, an autofocus function is integrated into the system.

Measurement of magnification factor
In this system, the relative position of two adjacent microscopic HSIs is used to correct the offset vectors. However, the relative position is based on the physical coordinate system of the platform, while the vector correction is based on the pixel offset. Therefore, a magnification factor is needed for the conversion between the two quantities, and this factor depends on the objective magnification of the microscope. The magnification factor is measured by capturing an objective micrometer, a calibration slide of size 75 mm × 25 mm. The objective micrometer used in this system is a CW-DY objective micrometer from Beijing Pediwei Instrument Co. Ltd. As shown in figure 12, the objective micrometer has three standard scales: 10 mm subdivided into 100 cells, 1 mm subdivided into 100 cells and 10 mm subdivided into 200 cells.
Images of the second standard scale captured under different objective magnifications are shown in figure 13, and the number of pixels per unit of the calibration scale is calculated from these images. As illustrated in table 3, several different calibrations are captured and averaged to account for image quality and manual measurement error. The resulting magnification factors are used in offset correction.

Image registration and fusion
As shown in figure 14, there is some lateral offset between two adjacent microscopic HSIs to be stitched. The two images are stitched through the SURF method (p_init), position vector calculation (p_pos) and the image fusion method (p_final). To verify the necessity of these three processing steps, three experiments were conducted. First, only the SURF method is used in linear-based synthesis; the result in figure 15(a) shows displacement in the stitched image. Second, only the triaxial coordinate information is used in linear-based synthesis, and the result, shown in figure 15(b), is similar to that of the first experiment. Third, image fusion is adopted in linear-based synthesis, generating the correct result in figure 15(c).
If a microscopic HSI is stitched using only its texture information, there will be offset in the stitched image. In addition, the coordinate-information-based matching method imports movement error, which is magnified into a large image offset in the microscopic imaging system. Although both types of information lead to some offset, both are indispensable in the image mosaic. Therefore, the microscopic HSI mosaic method based on the spatial redundancy characteristic and position correction integrates the basic image matching algorithm, the position information of the imaging system and the HSI information, effectively improving the accuracy of the image mosaic. Moreover, the multi-wavelength fusion method adopted in this paper removes the need for wavelength selection and enhances the efficiency of the algorithm. Compared with existing fusion methods, the multi-wavelength fusion method requires only HSIs rather than both HSIs and MSIs, avoiding excessive time spent on image collection.

Blank microscopic HSI screening
Multiple consecutive histopathological HSIs with large blank areas will seriously influence the autofocus and the large-scale microscopic HSI mosaic. Therefore, this experiment calculates the ratio of blank area in the histopathological HSIs. First, one microscopic HSI containing histopathological tissue and one blank image are captured by the system, and the ratio image of the two, calculated by (20), is shown in figure 16(a), i.e. the preprocessed image. To calculate the ratio of blank area, the image is divided into two classes using its average brightness as the threshold: the ratio image is turned into a binary image with this threshold, as illustrated in figure 16(c). Next, the ratio of blank area is calculated separately in six parts of the binary image in figure 16(d). If the ratio is higher than a certain threshold, the autofocus will ignore the HSI concerned in the case of blurred HSIs, ensuring the sharpness of the stitched image.
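The screening steps above might be sketched as follows. Equation (20) is approximated here by a simple per-pixel ratio and the six parts by vertical strips; both are assumptions, as are the function and parameter names:

```python
import numpy as np

def blank_ratio(band, blank_band, parts=6):
    """Blank-area screening: divide the tissue band by the blank
    reference, binarise at the mean brightness, and report the blank
    fraction in each of `parts` sub-regions (vertical strips here)."""
    ratio = band.astype(np.float32) / np.maximum(blank_band, 1)
    binary = ratio >= ratio.mean()          # bright (blank) pixels
    return [float(part.mean()) for part in np.array_split(binary, parts, axis=1)]
```

An HSI would be flagged (and skipped by autofocus) when enough of these per-part ratios exceed the chosen threshold.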
This method can also be utilized to compare two blank microscopic HSIs to screen a clearer blank image for image preprocessing. The screening will supply a better blank image as the reference image in autofocus and assist in image preprocessing with better spectral correction and removal of intrinsic interference in the optical path. Given the two blank microscopic HSIs in figure 17, it is obvious that there is less dust in

Large-scale microscopic HSI generation
The linear-based synthesis strategy is leveraged to generate a large-scale multidimensional image. This paper displays the stitched image of four images of a melanoma sample as well as the intermediate output images. In this experiment, four HSIs of a melanoma sample (40×) were collected with 30 bands from 400 nm to 800 nm. Figure 18 shows the 15th band, i.e. 606 nm, of the stitched image of the four microscopic HSIs. Figure 19(d) demonstrates that the multidimensional image mosaic method used in this paper realizes large-scale pathological HSI acquisition and solves the problem of the small field of view at high magnification. The resultant HSI contains both rich information and a large observation field, which can be browsed as shown in figure 20, providing more extensive and in-depth information for subsequent pathological analysis. As shown in figure 20, the initial display magnification is 2.5× and the main elements of HyperViewer are introduced in table 4. Among them, the spectrum curve rendering area displays the spectrum when the spectrum acquisition tool is clicked; its horizontal axis is the wavelength and its vertical axis is the transmittance or pixel value, both adaptive to the band and pixel value range of the underlying cube. Moreover, as shown in figures 19(a) and (b), the shapes of the spectral curves of the pre-stitched HSI and the stitched HSI are almost identical, which reveals that the mosaic method preserves the spectral characteristics of HSIs.
To evaluate the proposed method, table 5 compares SURF with SIFT in terms of feature detection time on ten microscopic HSIs of lymph node samples. The comparison experiments were run on a 2.3 GHz Intel Core i5 CPU with 16 GB of 2133 MHz LPDDR3 memory under macOS Mojave 10.14.4. SURF outperforms SIFT in all experiments, saving about half the detection time.
Moreover, the root-mean-square error (RMSE) and cosine similarity (CS) are leveraged to evaluate the accuracy of the matching results. RMSE is a standard statistical metric for measuring registration error [50]. It is given by RMSE = sqrt((1/N) Σ_{i=1}^{N} ||H_12 X_i − Y_i||²), where X_i and Y_i represent the interior (inlier) point coordinates of the two microscopic HSIs (I_1, I_2) to be stitched, N is the number of interior points and H_12 is the transformation matrix from I_1 to I_2. The smaller the RMSE, the more accurate the match. CS measures the spectral angle between two vectors to quantify the spectral similarity of two images [51]. It is given by CS = Σ_i x_i y_i / (sqrt(Σ_i x_i²) · sqrt(Σ_i y_i²)), where x and y represent the spectral curves of two matching feature points (X and Y). The smaller the angle between the two vectors, the more similar they are and the more accurate the algorithm is. Table 6 compares the matching accuracy of the first nine matches in the lymph node image mosaic. The RMSE values of the two matching algorithms are close and their CS values are both very high; however, SURF detects feature points much faster than SIFT.
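The two metrics can be computed as sketched below. This is a sketch under our reading of the text: in particular, applying H_12 to the points X_i of I_1 and the restriction to inlier points are assumptions:

```python
import numpy as np

def rmse(points1, points2, H12):
    """Reprojection RMSE: apply the 3x3 transform H12 to the inlier
    points of I1 and measure the distance to the matched points of I2.
    `points1`, `points2` are N x 2 arrays of (x, y) coordinates."""
    pts = np.hstack([points1, np.ones((len(points1), 1))])  # homogeneous
    proj = pts @ H12.T
    proj = proj[:, :2] / proj[:, 2:3]                       # back to 2D
    return float(np.sqrt(np.mean(np.sum((proj - points2) ** 2, axis=1))))

def cosine_similarity(x, y):
    """Cosine of the spectral angle between two spectral curves."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))
```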

Conclusions
In this work, we have presented a microscopic HSI mosaic method to generate large-scale microscopic HSIs. The problem is simplified to the mosaic of two adjacent images, so the images will be stitched well when the offset vectors of the two images are calculated precisely. This paper utilizes a SURF-based mosaic method to extract the features of adjacent HSIs, while position information is used to obtain more accurate offset vectors. Then, a spatial-redundancy-based multi-wavelength image matching method is applied, exploiting the different texture structures and gray levels at different wavelengths to obtain the final fusion vectors. Moreover, a blank microscopic HSI screening method is introduced to obtain a clearer blank image for better image preprocessing. Finally, adjacent microscopic HSIs are merged by the linear-based synthesis method owing to its simple synthesis structure and good generality. The method saves considerable image acquisition time because the fusion is based only on HSIs instead of both HSIs and MSIs; no special instruments are needed for image acquisition, nor particular sample types for the image mosaic. The experiments demonstrate the necessity of combining SURF and coordinate information, and there are no gaps in the resultant images of the microscopic HSI mosaic method, proving its validity and practicability. In addition, this paper defines a file format named hyperslide, which can be opened with the HyperViewer software for large-scale microscopic HSI browsing. However, each group of microscopic HSIs is preprocessed with the same blank image, ignoring possible deviations of the incident light. Moreover, the proposed method relies on the imaging technology and mosaic algorithms of two-dimensional images, and future research may bring breakthroughs beyond these limitations.