Bio-empirical mode decomposition: visible and infrared fusion using biologically inspired empirical mode decomposition

Abstract. Bio-EMD, a biologically inspired fusion of visible and infrared (IR) images based on empirical mode decomposition (EMD) and color opponent processing, is introduced. First, registered visible and IR captures of the same scene are decomposed into intrinsic mode functions (IMFs) through EMD. The fused image is then generated by intuitive opponent processing of the source IMFs. The resulting image is evaluated based on the amount of information transferred from the two input images, the clarity of details, the vividness of depictions, and the range of meaningful differences in lightness and chromaticity. We show that this opponent processing-based technique outperformed other algorithms based on pixel intensity and multiscale techniques. Additionally, Bio-EMD transferred twice the information to the fused image compared to other methods, providing a higher level of sharpness, more natural-looking colors, and similar contrast levels. These results were obtained prior to optimization of the color opponent processing filters. The Bio-EMD algorithm has potential applicability in multisensor fusion covering visible bands, forensics, medical imaging, remote sensing, natural resources management, etc.


Introduction
Image fusion is the process of combining two or more registered images of the same scene to obtain a more informative image. Visible and infrared (IR) color image fusion has become a process with multiple applications. From situational awareness to medical imaging, fusion has provided users with images that are more meaningful than the source images. Image fusion techniques can be broken down into two main approaches: multiscale and nonmultiscale. 1 Multiscale techniques include wavelet transforms and pyramid transforms. 2 Nonmultiscale techniques include linear, nonlinear, estimation theory, artificial neural network, and color composite approaches. 2 Despite the enormous research done on the subject, obtaining a fused image with very high information content and an informative depiction of the scene remains a domain of active research. [4][5][6] Most of these methods perform the integration of grayscale images. Since human eyes distinguish about a hundred grayscale levels and thousands of color variations, color-fused images provide more information content than grayscale fusion. Color fusion provides a chromatic representation of the fused image in false or near-true colors for situational awareness and medical applications. [8][9] These works point out three gaps: (1) there is a need to develop and implement efficient algorithms that perform color fusion of IR and electro-optical (EO) images and generate natural-looking colors; (2) the systems need to transfer as much information as possible from the input images to the fused one and generate a high-quality image, while scaling with the number and size of the images being integrated; and (3) in the information fusion domain, empirical mode decomposition (EMD) is a fully data-driven technique that decomposes images into finite sets of signals called intrinsic mode functions (IMFs), and the literature points out that better fusion can be achieved on IMFs.
In this work, we propose a new technique to fuse low-light visible and IR images and generate near-natural-looking colors. The method is based on EMD and center-surround opponent processing. 10,11 This paper is organized as follows. EMD, opponent processing, and image fusion quality metrics, which form the theoretical background, are introduced in Sec. 2. In Sec. 3, we present the structure and algorithm supporting this work. Section 4 evaluates the results and compares them with the outcomes of some existing techniques on several sets of images. Finally, conclusions are drawn in Sec. 5.

Background and Concepts
In this section, we present the necessary theoretical background for the development of an opponent processing information fusion technique. This includes the dynamic neural network equations and techniques developed to produce color-fused images.

Dynamic Neural Network Equations
Visible electromagnetic waves have wavelengths between 400 nm (violet) and 700 nm (red). Human eyes are more sensitive to colors in the middle of the visible spectrum (green to yellow) and less sensitive toward its extremes. 7 Like several nature-inspired solutions, the human vision system has inspired some image processing developments. Inside the retina, photoreceptors are responsible for image formation. Rods participate in achromatic image formation, which has poor detail and no color. Conversely, cones, horizontal cells, and bipolar cells produce contrast enhancement of color information, which corresponds to spatial opponent signal processing. There are three types of cones containing photopigments with distinct spectral sensitivities. 2 L cones are sensitive to long wavelengths around 560 nm, M cones are sensitive to medium wavelengths around 530 nm, and S cones are sensitive to short wavelengths around 420 nm. These cones share some sensitivity regions. Emulating retina processing has resulted in image fusion architectures with center-surround operations. 12,13 Center-surround operations are the result of cones transforming photons into signals through opponent mechanisms. The relations governing the activities in the retina when the excitation and the inhibition are performed by filtering (center-surround operations) 2 are summarized in Table 1, which illustrates the channels within the luminance and color that are coupled within the retina, resulting in one being excited and the other being inhibited.
The neuro-dynamic interactions representing the center-surround model at a pixel ($x_{ij}$) level are summarized in Table 2.
3, and β = 1.875 such that both Gaussian kernel filters cover the same area. 2,11 $C_{ij}$ is the ON-center interaction and $E_{ij}$ is the OFF-center interaction; both represent a discrete convolution of the input pattern $I_{ij}$ with a Gaussian kernel. At equilibrium, $x_{ij}$ has a constant value, so its derivative is equal to zero. The coefficient A affects the lightness/darkness of the filtered image.
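The equilibrium condition described above can be illustrated with a short sketch. Table 2 is not reproduced in this text, so the code assumes the commonly cited shunting form $dx_{ij}/dt = -A x_{ij} + (B - x_{ij}) C_{ij} - (x_{ij} + D) E_{ij}$, whose equilibrium is $x_{ij} = (B C_{ij} - D E_{ij}) / (A + C_{ij} + E_{ij})$. The function names and the values of `A`, `B`, `D`, `sigma_c`, and `sigma_s` are illustrative placeholders, not the parameters used in this work.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma (sums to 1)."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-t**2 / (2.0 * sigma**2))
    return kernel / kernel.sum()

def blur2d(image, sigma):
    """Separable Gaussian blur with edge replication; output has the input shape."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(np.asarray(image, dtype=float), radius, mode="edge")
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def on_center_equilibrium(image, A=1.0, B=1.0, D=1.0, sigma_c=1.0, sigma_s=3.0):
    """Equilibrium of the ASSUMED shunting equation (placeholder parameters).

    C is the narrow (center) Gaussian response, E the wide (surround) one.
    Assumes a nonnegative input image so the denominator stays positive.
    """
    C = blur2d(image, sigma_c)
    E = blur2d(image, sigma_s)
    return (B * C - D * E) / (A + C + E)
```

With `B` equal to `D`, a uniform region produces a zero response, while an isolated bright spot produces a positive response at its center; this is the contrast-enhancing behavior attributed to center-surround filtering. Increasing `A` compresses the response, consistent with the remark that A affects the lightness/darkness of the filtered image.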

Color Fusion of IR and Visible Images
Unlike grayscale image fusion, color fusion provides a chromatic representation of the fused image in false or real color. Figure 1 shows a hierarchical model of some color composite image fusion procedures, their authors, and institutions. In many cases, image fusion approaches are applied in combination with other algorithms.
Toet et al. developed a false color mapping technique where the "unique" and "common" components of two images are assigned to the RGB bands. 7,14 Their results showed enhancement of features unique to each modality. However, common features were diminished in the fused image, and the resulting colors differed from those of the original color image. Remapping different gray levels of a unique region in the images produced different colors, a process that creates unsatisfactory color visual effects.
Waxman et al. [15][16][17][18] developed a variety of low-light visible/IR fusion architectures that merge EO images with thermal IR imagery by emulating some principles of biological opponent-color vision. Their approach to frame fusion relied on biologically motivated neuro-computational models of visual contrast enhancement. Their architecture fused EO and thermal images successfully, but the integrity of the color information was not preserved, reducing the ability to recognize objects. This was also the case for the other architectures developed by this team during the same study.
Relying on Land's experiment on the color constancy of human vision, 19 Huang et al. proposed a new method to fuse visual and IR images and generate a false color image. Their proposed architecture is based on the assumption of an equal energy distribution of the colors reflected to the eyes. Testing results showed lower colorfulness than the methods of Toet et al. but allowed target detection. 19,20 These results also confirmed that the reddish features in the fused image are pulled from the IR source, while greenish objects come from the visible source.
Nunez et al. 21 developed a new approach to merge high-resolution panchromatic images with low-resolution multispectral images. Several techniques offer the conversion of multispectral images into intensity hue saturation (IHS). 22,23 This method has the advantage of adding the spectral quality of the color image to the high-resolution details from the panchromatic image. Similar frameworks have been applied using a pyramid-based fusion method. 24 An expansion of this work applying the spectral response of sensors is detailed in Ref. 22.

Evaluation of Color Fusion
Fusion evaluation metrics have been largely developed for still images. Fused frames are evaluated on the accuracy, robustness, and sensitivity of the generating algorithms. Image fusion evaluation can be done subjectively or objectively. In subjective image evaluation, an audience of qualified observers grades the results of integration based on the amount of useful information extracted from the original images. Conditions of observation must be identical for all observers, and the screen must be sufficiently large. 25,26 This work utilized concepts developed to assess the results of fusion processes objectively. 9,27-31

Empirical Mode Decomposition
EMD 10,[32][33][34] is a nonparametric and self-adaptive method that makes effective use of the image data to derive its decomposition into a finite set of IMFs. This is important when fusing real-world images. The advantages of EMD are multiple. EMD is self-adaptive and nonparametric, makes no assumptions about the data being decomposed, and accommodates the nonstationary and nonlinear behavior of imagery from different modalities. Also, this approach is computationally light and intuitive compared to other decomposition techniques. 10,35 Figure 2 shows how the IMFs are generated.
The image to be decomposed is first converted from a two-dimensional (2-D) array to a one-dimensional (1-D) array and treated as a signal x(t) (a function of t, where t goes from 1 to the number of pixels in the image). The colors of the pixels determine the amplitude of the signal at each index. The maxima and minima of x(t) are identified and generate the upper and lower splines (envelopes). The mean signal m(t) of these two envelopes is subtracted from the original signal x(t) to obtain a new signal h(t). If h(t) is symmetric about the zero-crossing axis and the difference between the number of maxima and minima is not greater than 1, h(t) is considered an IMF and x(t) is replaced by the residual x(t) − h(t); otherwise, x(t) is replaced by h(t). The process stops when the residual r(t) becomes monotonic. The EMD thus provides a decomposition of the images into IMFs and a residual. Each IMF sample carries pixel color information. For original images of size M × N, each IMF will be M × N samples long; the information is stored as row vectors. Depending on the image content and the sifting process utilized, the decomposition generates a certain number T of IMFs in total.
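The sifting procedure described above can be sketched as follows. This is a simplified illustration, not the implementation used in this work: it swaps in piecewise-linear envelopes (`np.interp`) for the spline envelopes, and it accepts a candidate IMF when the mean envelope becomes small relative to the candidate, rather than testing symmetry and extrema counts directly. The names `emd`, `max_imfs`, `max_sift`, and `tol` are hypothetical.

```python
import numpy as np

def local_extrema(x):
    """Indices of strict interior local maxima and minima of a 1-D signal."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima

def envelope_mean(x, maxima, minima):
    """Mean of upper and lower envelopes (linear interpolation, an assumption)."""
    t = np.arange(len(x))
    last = len(x) - 1
    up_idx = np.concatenate(([0], maxima, [last]))   # endpoints anchor the envelope
    lo_idx = np.concatenate(([0], minima, [last]))
    upper = np.interp(t, up_idx, x[up_idx])
    lower = np.interp(t, lo_idx, x[lo_idx])
    return 0.5 * (upper + lower)

def emd(x, max_imfs=8, max_sift=50, tol=0.05):
    """Decompose a 1-D signal into IMFs plus a residual (simplified sketch)."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) + len(minima) < 3:      # residual is close to monotonic
            break
        h = residual.copy()
        for _ in range(max_sift):
            maxima, minima = local_extrema(h)
            if len(maxima) == 0 or len(minima) == 0:
                break
            m = envelope_mean(h, maxima, minima)
            converged = np.sum(m**2) <= tol * np.sum(h**2)
            h = h - m                          # subtract the mean envelope
            if converged:
                break
        imfs.append(h)
        residual = residual - h                # continue on what is left
    return imfs, residual
```

For an image channel, the 2-D array would first be flattened, for example with `channel.reshape(-1)`, and the sum of the IMFs plus the residual reshaped back to the original M × N size. By construction, that sum reproduces the input signal.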

Bio-EMD Fusion of IR and Visible Images
Integration of IR and visible images should generate fused imagery with a high level of information transferred, present clarity of details, facilitate detection or identification, and render near-true colors. Our approach is to fuse the spatial and frequency components of the input images obtained by EMD through opponent processing. Figure 3 presents the design model.
The source images (IR and visible) are registered images. The IR image is grayscale but may also be a dual-band or RGB signal, depending on the sensor output. The visible image is a color image. Both images are preprocessed for noise removal and contrast enhancement, and are resized to the same size (numbers of rows and columns) if they differ. These source images are decomposed into their IMFs, generated through EMD 33,35,36 according to the architecture presented in Fig. 4, before the fused-image reconstruction process (sum of IMFs and conversion from 1-D to 2-D).
Figure 3 presents the developed IMF integration model. The IMFs are integrated following a biological model that emulates the human retinal system. The red, green, and blue signals of the color image are decomposed into their respective IMFs. The IR image is converted to luminance (Y). The ON-center IMFs are obtained from the ratio of $Num_k$ to $Den_k$, where $Num_k$ represents the convolution of $F_k$ with the filter $BC - DE$, and $Den_k$ represents the convolution of $F_k$ with the filter $C + E$, added to the constant A that is defined in Sec. 2.1.
The filters $C_{pq}$ and $E_{pq}$ are 1-D forms of the filters $C_{ij}$ and $E_{ij}$ described in Sec. 2.1. The filter parameters take the values suggested by Carpenter and Grossberg and can be tuned to optimize the image evaluation results. The OFF-center IMFs are computed similarly. Double opponent processing fusion is realized by combining pairs of single opponent IMFs obtained from the input images, as summarized in Eq. (4) and detailed in Fig. 4.
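As a rough illustration of how single-opponent chroma signals could be formed from 1-D channel signals, the sketch below excites one channel through the narrow center filter and inhibits another through the wide surround filter. Eq. (4) is not reproduced in this text, so the pairing shown here (blue excited against yellow, red excited against green, with yellow taken as the average of red and green) follows the verbal description in this paper, while the exact ratio form, the function names, and all parameter values are assumptions.

```python
import numpy as np

def smooth_1d(signal, sigma):
    """Gaussian smoothing of a 1-D signal with edge replication (same length)."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-t**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(np.asarray(signal, dtype=float), radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def single_opponent(excited, inhibited, A=1.0, B=1.0, D=1.0,
                    sigma_c=1.0, sigma_s=3.0):
    """ASSUMED ratio form: (B*C - D*E) / (A + C + E), placeholder parameters.

    C is the center response of the excited channel, E the surround response
    of the inhibited channel. Inputs are assumed nonnegative; real IMFs can be
    negative, so a practical version would need to guard the denominator.
    """
    C = smooth_1d(excited, sigma_c)
    E = smooth_1d(inhibited, sigma_s)
    return (B * C - D * E) / (A + C + E)

def chroma_signals(red, green, blue):
    """Blue-vs-yellow and red-vs-green opponent signals from 1-D channels."""
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    yellow = 0.5 * (red + green)           # fourth channel: mean of red and green
    cb = single_opponent(blue, yellow)     # blue excited, yellow inhibited
    cr = single_opponent(red, green)       # red excited, green inhibited
    return cb, cr
```

Under these assumptions, a signal with equal red and green content yields a zero red chroma response, and a signal with no blue-yellow difference yields a zero blue chroma response, which mirrors the observation that the method needs some color difference in the visible input.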
When merging images of different modalities, such as CCDs and IRs, the approach preserves features and edges due to its ability to separate spatial frequencies.

Experimental Results
The state-of-the-art techniques for color composite image fusion may be subdivided into opponent processing and improved IHS algorithms. In the field of opponent processing, major works have been done by the teams of Dr. Toet, Dr. Waxman, and Dr. Huang. 14,18,19 The recent works on color opponent fusion techniques 14 and multispectral image fusion 37 justify our choice of algorithms for performance comparison. In order to evaluate the Bio-EMD fusion, testing was conducted on all the pairs of registered images available in Ref. 38. The performance was consistent throughout the testing samples, and we present three of the datasets. For each dataset, our result was compared to

Image Fusion
Among the sets of IR and EO images fused, three sets representing different scenes are presented here.
Figure 5 presents the input images and their fusion results. The visible image displays a field view partially obstructed by smoke; the IR image captures thermal differences in the obstructed areas. Some reference features are the color of the roof in the EO image and the people standing in the IR image. These features are depicted in the fused image. Our model generated enhancement of the fused image.
Figure 6 presents the input images and their fusion results. The reference feature is a group of people in the woods under limited lighting, who are nevertheless depicted in the IR imagery. The fused image rendered the vegetation and the people. This dataset confirms the enhancement properties of our method. The all-dark visible image contains details that the naked eye could not identify but that Bio-EMD brought out. The dataset, however, shows the limitations on color enhancement. To obtain a colorful fused image reflecting reality, the visible image must capture some color difference so that opponent processing enhances the information carried by the different IMFs.
Figure 7 presents the input images and their fusion results. The reference is a crouched person, unseen in the EO image, Fig. 7(a), and difficult to discern in the IR image, Fig. 7(b). The visible depiction provides no information about the scene, and details are not perceptible; the IR image suggests that there is a crouched person in the scene, but little can be said about the scene background. Figure 7(g) shows the Bio-EMD fusion result, in which the reference feature and the background, i.e., vegetation, can be identified. Equation (4) presents the synthesis of the Cb channel from the excited blue signal and the inhibited yellow signal. To get a Cb channel with significant information, some blue and yellow signals need to be present. The same is true of the red and green signals for generating a Cr channel. The visible input image clearly lacks these two color pairs going into the fusion process. This explains the low colorfulness of the fused image compared with the other two dataset results and exposes the limitations of this algorithm. However, our method delivered a fused image showing sufficient detail to detect and recognize objects in the scene.
Objective evaluation relies only on the analysis of the original images in comparison with the fused image results. The evaluation process focuses on the preservation of useful information and on the fused image depiction.

Assessment of Color Fusion Image Quality
The objective evaluation of fused images depends on the amount of information retained from the input images, the edge raggedness, the distinction between bright and dark pixels, and the vividness of the object representation. 9,39

Mutual information
The first evaluation criterion is the well-known mutual information (MI). In this context, MI evaluates the quantity of information transferred from the input images to the fused image Z. Piella proposed the MI, I, between two input images X and Y fused to generate a composite image Z as the sum of the MI between the composite image and each of the inputs, divided by the sum of the entropies of the input images 40
$$I(X; Y; Z) = \frac{I(X; Z) + I(Y; Z)}{H(X) + H(Y)}.$$
I(X; Z) is the MI between an input image and the resulting fused image; H(X) is the entropy of the image X. The higher the MI between X, Y, and Z, the more information is transferred to Z. Thus, MI is a similarity measure. Table 3 contains the MI for the three datasets. The metric is computed for the fused image generated by each of the techniques tested.
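The normalized MI above can be estimated from image histograms. The sketch below is a minimal illustration that assumes 8-bit single-channel images and a plain histogram estimate of the probabilities; the function names and the `bins` parameter are hypothetical, and the estimator used to produce Table 3 may differ.

```python
import numpy as np

def entropy(x, bins=256):
    """Shannon entropy (bits) of an 8-bit image from its histogram."""
    hist, _ = np.histogram(x, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(x, z, bins=256):
    """MI (bits) between two equally sized 8-bit images via a joint histogram."""
    joint, _, _ = np.histogram2d(
        np.ravel(x), np.ravel(z), bins=bins, range=[[0, 256], [0, 256]])
    pxz = joint / joint.sum()
    px = pxz.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    pz = pxz.sum(axis=0, keepdims=True)    # marginal of z, shape (1, bins)
    nz = pxz > 0
    return np.sum(pxz[nz] * np.log2(pxz[nz] / (px @ pz)[nz]))

def fusion_mi(x, y, z, bins=256):
    """(I(X;Z) + I(Y;Z)) / (H(X) + H(Y)) for inputs x, y and fused image z."""
    numerator = mutual_information(x, z, bins) + mutual_information(y, z, bins)
    return numerator / (entropy(x, bins) + entropy(y, bins))
```

As a sanity check on the definition, if both inputs and the fused image are the same non-constant image, I(X; Z) equals H(X) and the metric evaluates to 1. For a color fused image, one possible choice (an assumption here) is to apply the metric to its luminance channel.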

Sharpness
The second metric utilized is the image sharpness metric (ISM) developed by Yuan and her colleagues. 10 The metric is defined over $|W|$, the total number of windows $w$ of size 3 × 3, with $G_x$ and $G_y$ representing the Sobel operator at a pixel (x, y). The color image quality attribute sharpness is related to the clarity of details and the definition of edges. The sharpness of an image includes details, line quality, adjacency, effective resolution, edge sharpness, and edge raggedness. 39,41 Sharpness can be measured by the edge information. With a color image, sharpness relates to its luminance and therefore to the gray intensity of the image. Table 4 presents the sharpness evaluation of our technique and some others.

Contrast
Contrast is the perceived magnitude of visually meaningful differences, global and local, in lightness and chromaticity within the image. 32 The contrast of an image is a perceptual attribute representing the ratio between the intensities of the brightest and darkest pixels. It is a dynamic range in which higher values indicate better image contrast, and lower values indicate lower contrast and lower quality. Many metrics have been developed for contrast evaluation in grayscale images. 29,41 Yuan and her colleagues proposed employing the $L^*$ channel of the Commission Internationale de l'Eclairage (CIE) 1976 $L^*a^*b^*$ color space to evaluate color contrast, since human perception is more sensitive to luminance in contrast evaluation. Equation (7) defines their proposed image contrast metric (ICM), 9 where $C_g$ and $C_c$ represent the gray contrast and color contrast metrics, and $w_1$ and $w_2$ their corresponding weights; $P(I_k)$ and $P(L^*_k)$ are the probability density functions of the gray intensity $I$ and the CIELAB $L^*$; and $N_I$ and $N_{L^*}$ are the total numbers of levels. $I$ ranges from 0 to 255, while $L^*$ ranges from 0 to 100. $\alpha_I$ and $\alpha_{L^*}$ represent the dynamic ranges for intensity and color, where $N$ indicates the number of pixel levels, $N_1$ the number of pixel levels with nonzero count, and $N_2 = N - N_1$. Table 5 presents the contrast evaluation results for fused images generated by our method and some other techniques.

Colorfulness
Color depiction and rendering is one of the major differences between current and past imaging systems. Colorfulness, also referred to as "chromaticness," is the attribute of a visual sensation according to which the perceived color of an area appears to be more or less chromatic. 30,31 Yuan and her colleagues proposed a different approach based on a color chroma metric $CCM_1$ and a color variety metric $CCM_2$, which are combined into the image colorfulness metric CCM. The chroma metric is defined by Eq. (11), and the variety metric is computed as presented in Eq. (12). 9 $C^*$ represents the component computed in Eq. (12), 27 $|W|$ is the total number of all windows ($w$), and the color difference gradient of pixel $f(x, y)$ is defined in Ref. 9.
The colorfulness metric proposed by Hasler and Susstrunk 31 generated results similar to the ones in Table 6.
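For reference, the Hasler and Susstrunk colorfulness measure mentioned above is commonly computed from two opponent components, rg = R − G and yb = (R + G)/2 − B, as the combined standard deviation plus 0.3 times the combined mean magnitude. The sketch below follows that commonly published form; the function name and the assumption of an H × W × 3 RGB array are illustrative, and this is not the CCM metric of Yuan and her colleagues.

```python
import numpy as np

def colorfulness_hs(rgb):
    """Hasler-Susstrunk style colorfulness of an H x W x 3 RGB image."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg = r - g                      # red-green opponent component
    yb = 0.5 * (r + g) - b          # yellow-blue opponent component
    sigma = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    mu = np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)
    return sigma + 0.3 * mu
```

A gray image, where the three channels are equal at every pixel, scores zero under this measure, which is consistent with the low colorfulness reported when the visible input carries little color difference.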
Bio-EMD transferred twice as much information as each of the others. It also generated twice the sharpness of the other techniques and displayed more natural-looking colors. Bio-EMD generated contrast values in the same range as the other approaches. Colorfulness was weak based on the metric, although the depiction is meaningful compared to the others.
In general, the Bio-EMD method showed the same performance on each of these three datasets. The Bio-EMD technique transferred information from the input images to the output image better than the other techniques, as measured by MI. Bio-EMD also provided better clarity of details and definition of edges (ISM). Although Bio-EMD resulted in more natural-looking colors in the output image, the image vividness metric (ICC) utilized, and others tested during this work, did not always convey that strength. The perceived magnitude of visually meaningful differences, global and local, in lightness and chromaticity in our results is in close range with that of the other methods (ICM). Better quantitative results may be obtained by tuning the filter parameters.

Conclusion
The analysis of existing fusion techniques applied to visible and IR images showed a need for an approach that performs color fusion of these two modalities and generates high-quality images with near-true color. This work presents the development and testing of a new image fusion method based on EMD and opponent processing. EMD represents input images as IMFs carrying their spatial and frequency components about each pixel. Following a human retinal model, IMFs from visible and IR sources are combined utilizing the proposed network of dynamic equations feeding the YCbCr channels of the output section. Testing was done on all the registered pairs of images available in Ref. 38, and the performance was consistent throughout all the samples.
Observation of the resulting images shows significant improvement compared to previously developed procedures. Quantitative assessment of the fused image attributes consisted of four metrics: MI, sharpness, colorfulness, and contrast. These metrics confirmed that the proposed approach generated twice the information transfer from the original images compared to existing techniques. The clarity of details was comparable to the major color fusion techniques. The contrast generated in the fused images was adequate; however, the vividness of the images was subpar, although the fused images obtained had more meaningful colors. This highlights the lack of a fused image metric. The Bio-EMD algorithm produced imagery with higher information content than either the low-light visible or the IR input image taken separately. Fusion generated a depiction of objects seen only in one modality or not seen in the original images. This breakthrough can be applied in multisensor fusion involving visible bands and has applications in situational awareness, forensics, medical imaging, remote sensing, natural resources management, etc. The breaking point of this method is that it performs well only when there is a minimum of color information in the visible image; a visible image with no or very low color information will produce a fused image of low quality. This is echoed by dataset 3 in Fig. 7.
To get a Cb channel with significant information in the fused image, some blue and yellow signals need to be present. The same is true of the red and green signals for generating a Cr channel. The visible input image clearly lacks these two color pairs going into the fusion process. This explains the low colorfulness of the fused image compared with the other two dataset results and exposes the limitations of this algorithm. In order to obtain the best result, this fusion algorithm requires some minimum information from the blue-yellow and red-green signal pairs. How much information do we need? What is the threshold of color level in the visible input image? The answers to these questions are the object of ongoing research, in which we also consider the different parameters of the enhancement/inhibition filters. This will define the conditions of its application.
The sign indicates the center-surround filtering: (+) is for ON and (−) for OFF. Equation (4) summarizes the relationship governing the activities in the retina where the excitation and the inhibition are performed by filtering. The IR image in this work is a white-hot IR image. The design extracts the luminance (grayscale) from the IR image. Its chrominance signals (blue and red) may be added to the chrominance obtained after opponent processing (within and across modalities) if the IR color information representing cold/warm objects is to be shown. The luminance Y of the IR image is opponent-processed, generating an ON center-surround signal (IR Y+) and an OFF center-surround signal (IR Y−). These signals carry information about details in the luminance and are averaged to form the fused-image luminance. The color image has its noise removed through a median filter and then generates a fourth channel (yellow) by averaging its red and green channels. Following neural activities in the retina, the blue channel IMF is excited (EO Blue+), whereas the yellow is inhibited (EO Yellow−), to generate the blue chroma IMF. The red channel IMF is excited (R+ Y), whereas the green is depressed (IR Y−), to produce the red chroma IMF. The achromatic information has a high spectral sensitivity. IMFs in each channel are summed, generating three 1-D signals (Y, Cb, and Cr) that are converted into 2-D arrays, forming the fused image. When merging images of different modalities, such as CCDs and IRs, the approach preserves features and edges due to its ability to separate spatial frequencies.

Table 1
Retina color excitation and inhibition.

Table 3
Mutual information results.

Table 5
Contrast results.