Hyperspectral imaging and robust statistics in non-melanoma skin cancer analysis.

Non-Melanoma skin cancer is one of the most frequent types of cancer. Early detection is encouraged so as to ensure the best treatment, Hyperspectral imaging is a promising technique for non-invasive inspection of skin lesions, however, the optimal wavelengths for these purposes are yet to be conclusively determined. A visible-near infrared hyperspectral camera with an ad-hoc built platform was used for image acquisition in the present study. Robust statistical techniques were used to conclude an optimal range between 573.45 and 779.88 nm to distinguish between healthy and non-healthy skin. Wavelengths between 429.16 and 520.17 nm were additionally found to be optimal for the differentiation between cancer types.

VNIR spectroscopy has also proven useful for the detection of melanoma [31], while frequencies between the wide window of 400 to 1700nm have proven useful in some studies for the detection of SCC [32]. Although infrared light seems to be the most common spectral range for detection of SCC and BCC, some studies have also achieved success detecting SCC tumours within the 450 to 900 nm range [33], or 500 to 900 nm for BCC [34].
Needless to say, however, hyperspectral imaging also has its disadvantages, seen in the high complexity and large size of the images obtained. From this perspective, the processing of images of this type can frequently be considered not only difficult, but also computationally expensive. A frequent task in the preprocessing of any dataset, prior to more advanced applications such as image classification or segmentation, consists in feature selection and extraction [35,36]. This process entails the calculation of areas that are most informative, allowing for the removal of redundant variables that may be hindering model performance. While many studies are able to extract valuable information from hyperspectral images of NMSC skin lesions [22][23][24][25][26], most perform these studies directly on entire datasets. In light of this, a preprocessing procedure such as that of feature selection could be considered a valuable step towards optimizing these approaches. While multiple approaches exist for the purpose of feature selection and extraction, some of the most important tools available in the field of data science are those used in advanced statistics. From this perspective, advanced statistical analyses of hyperspectral data can provide a valuable insight into the precise nature of the data at hand, as well as any underlying patterns, thus facilitating the selection of the most important wavelengths for subsequent classification tasks.
The present study develops this perspective for the analysis of NMSCs, using samples of hyperspectral images obtained from both BCC and SCC patients. The present study employs the use of a pushbroom hyperspectral linear camera registering wavelengths 398.08 to 995.20 nm over 270 bands. Using robust statistical approaches, these analyses show a window between 573. 45 and 779.88 nm to be particularly powerful for the detection of differences between healthy skin and both types of NMSC lesions. This type of analysis can be considered a fundamental basis upon which more complex computer vision and artificial intelligence-based studies can be built upon.

Hyperspectral image acquisition
A Headwall Nano-Hyperspec, Visible-Near InfraRed (VNIR) hyperspectral imaging sensor, was used for the purpose of the present study (Table 1). This particular sensor is a pushbroom linear camera, offering a vectorial array of pixels (1 × 640 px per image). While pushbroom sensors present the disadvantage of a longer measuring time than that of snapshot hyperspectral cameras, pushbrooms offer higher spatial resolution than many other types of hyperspectral sensors. Nevertheless, these sensors require that the Field-Of-View (FOV) be moved, or pushed, along the x-axis in order to obtain an entire image of more than one column of pixels. For this purpose, the present study built an ad-hoc platform.
The platform constructed for this study consists of a motorized structure, composed of an 80 cm long aluminium rail with a motorized base. Control of this base can be managed by external hardware that can control the speed and modality of base displacement (single or loop movements). On top of the base, a multifunctional structure was constructed to fit the hyperspectral sensor, alongside a system of illumination, as well as a frame that controls the distance between the object and the scanning window ( Fig. 1(a) & 1b). The system of illumination consisted of two 60 watt halogen light sources mounted on either side of the hyperspectral sensor, with a distance of 14 cm between themselves, and 19 cm between the lens and the object to be photographed ( Fig. 1(a)).
In order to control the platform, an external electronic module device was designed ( Fig. 1(c)), synchronising the movement of the platform with the illumination system and the sensor's shutter speed. This is also highly important so as to ensure a stable displacement speed and thus optimise the generation of a final complete image. With a simple switch, this controller could manage all three elements, as well as simultaneously managing the power source for the entire system (motor, platform control, illumination and the camera; Fig. 1(d)). For data acquisition, each image was calibrated using the same marker board and frame. For this purpose, the present study used a known reflectance pattern (Spectralon) in order to obtain reflectance values for all 270 bands of the camera (Fig. 2(a)), instead of digital values. Reflectance values were calculated and radiometrically corrected (Eq. (1)). This consists in taking raw uncorrected data (S) from images and calculating % reflectance values (X), taking into consideration the dark current (B) and a "white" reflectance standard (W) [37]. Dark pixel offset values were thus obtained by taking photographs with the lens cap covered [38]. Considering how charge-coupled devices are never capable of measuring an absolute zero value (black), even in cases where no light is available, the present study performed these calibrations so as to calculate the presence of residuals prior to photographing each patient and thus removing dark noise across all bands (Eq. (1), Fig. 2(b)). W values were obtained from a designated region on the Spectralon ( Fig. 2(b)). Upon the Spectralon frame, circular elements were additionally used to calibrate the movement and speed of the platform, whereby the known shape and size of these circles could be used to correct displacement and ensure removal of image distortion ( Fig. 2(b)).
In order to perform these calibrations, a software tool was designed to perform radiometric corrections for both eliminating dark current as well as obtaining reflectance images.
Once the entire image had been generated, each image's size was measured at 640 × 1785 px, consisting of up to ≈ 308.5 million digital values and occupying ≈ 1.2 Gb of memory. After cropping images to remove the calibration marker board, final images were of size 431 × 851 px. The first and last five hyperspectral bands were also removed as a safe-measure prior to further processing, as they were occasionally observed to present unusual anomalies. The final spectral window was thus calculated to fall between 409.18 and 984.10 nm. After cropping, final images could therefore be reduced to tensors of size 431 × 851 × 260 (rows, columns and bands) with ≈ 95.4 million numeric values, occupying ≈ 0.3 Gb of memory.

Sample
A total of 115 patients with observed skin lesions were selected and photographed using the hyperspectral sensor. After images had been acquired, the presence and type of cancer was confirmed after a histopathological analysis and final diagnosis. Of the 115 patients studied, 1 patient was diagnosed with melanoma, 1 patient with merkel-cell carcinoma, 5 with actinic keratoses, 67 with BCC, and 22 with SCC. 19 patients were not diagnosed with malignant skin cancer.
All patients agreed to participate in the study, however due to patient animosity, no further details have been disclosed. All patients were registered and treated at the Institute for Biomedical Research of Salamanca (IBSAL), University Hospital of Salamanca, Spain.

Hyperspectral signature analysis
Once images had been obtained, careful evaluation of the characteristics of each image was performed. After careful visual inspection, some images were found to be out of focus due to patient movement. Each image was therefore assess and discarded in cases where quality was found to be insufficient (blurry of incomplete). After careful inspection, 3 SCC patients were unfortunately discarded due to insufficient image quality, while 26 BCC patients were discarded. Final medical samples, therefore, consisted of 60 patients presenting 41 confirmed cases of BCC and 19 cases of SCC.
Once the best images had been obtained, Regions of Interest (ROI) were established to sample pixels of Healthy (H) and pathological skin (BCC & SCC) (Fig. 3) on each of these patients. SCC and BCC ROIs were established directly over the tumour, while H samples were taken from skin farthest away from the tumour so as to avoid possible contamination. After defining ROIs for each of the images, a Python algorithm randomly sampled pixels to extract hyperspectral signatures with as little intervention by a human analyst as possible. Randomness was employed so as to avoid subjective sampling. Sampling was additionally performed until a sufficiently large sample size of pixels had been obtained for H, SCC and BCC. The final selection obtained consisted of 504 hyperspectral signatures for BCC samples, 513 signatures for SCC samples, and 488 signatures for healthy (H) skin samples (total n = 1,505).
In order to determine the best statistical means of characterising this data, signatures were first subjected to normality testing. The concept of "normality" is a fundamental component in statistics, considering how many statistical tests are conditioned by precise underlying mathematical properties. From this perspective, and in order to select the most reliable statistical tests for comparing samples, the analyst must be aware of the nature of the distributions being studied.
For these purposes, Shapiro-Wilk tests were first passed over each band for each of the samples [40]. Shapiro-Wilk testing was additionally complemented with a visual inspection of each distribution using density plots and quantile-quantile plot calculations [41]. From a similar perspective, mean residual counts from a linear Gaussian model were calculated for each band to visualize areas of greatest deviations from the mean. Residual calculations were performed so as to assess areas where a simple linear model is less likely to capture the general trend of the distribution in question. Calculations for sample skewness and kurtosis were also performed, so as to better define the nature of each distribution. With regards to skewness, in cases where samples are normally distributed, sample skewness would be expected to be close or equal to 0, indicating a symmetric distribution. As for kurtosis calculations, distributions that are normally distributed typically present kurtosis values close to 0, indicating neither an excessive concentration of information (kurtosis > 0), nor a wide spread of values (kurtosis < 0).
Upon accepting or rejecting the null hypothesis, H 0 , of normally distributed data, different statistical approaches were used to define the hyperspectral "signature" of each sample.
√ BWMV values are calculated in accordance with the Median Absolute Deviation (MAD, Eq. (2)), which takes the absolute difference of each value (x) to the sample median (x-tilde). When non-symmetric measures of variance were required, robust quantile calculations were performed using 95% confidence intervals [41].
For hypothesis testing, tests were performed to analyse homoscedasticity and thus determine areas of important differences in variance. For parametric testing of homoscedasticity, the Bartlett's test was used [48], while non-parametric tests employed Levene's test [49]. Both the Bartlett and Levene test assume H 0 to infer samples have equal variance. For multivariate analyses, a Multivariate Analysis of Variance (MANOVA) was performed, using either the Hotelling-Lawley test statistic for parametric approaches [50], while in cases where distributions proved to be non-homogeneous, a pairwise Wilcoxon test was performed [51]. In each of these tests, H 0 assumes samples to be similar.
In addition to hypothesis testing, and as a means of comparing differences and similarities between probability distributions, the Jensen-Shannon Distance (JSD) was computed using the Kullback Leibler divergence [52,53]. This method was used to measure the similarity between different distributions across the spectrum, thus finding areas of greatest separations between samples in accordance with mutual information theory. Samples that are considered similar would thus produce distance calculations closer to 1, while values closer to 0 indicate greater differences between sample distributions.
All statistical applications were performed in the R programming language (v.4.0.4) [54]. The Python programming language (v.3.7.4) was also employed for hyperspectral image processing and signature extraction.

Evaluation of hypothesis test results
Recent years have seen a rise of criticism on the "blind" use of p-values for withdrawing scientific conclusions, especially with regards to the use of p < 0.05 for hypothesis testing [55,56]. In light of this, the present study has made a particular effort to avoid the misuse of p-values, so as to ensure the highest possible validity of the presented conclusions.
In accordance with the most recent recommendations set forth by the American Statistician [55,56], p-values were not evaluated using the more traditional p < 0.05 as a threshold for defining statistical significance. Likewise, the term "significant" has been avoided throughout the present study. Instead, frequentist p-values were evaluated in accordance with calibrated Bayesian statistical approaches, converting each p-value into Bayesian Factor Bound values, following the suggestions by Benjamin and Berger [57]. In each of these cases, the upper bound on the posterior probability of the alternative hypothesis, H a , were calculated (P U (H a |p); Eq. (6) & (7)); Similarly, where necessary, the Bayes Factor Bound (BFB) derived from Eq. (6) was used to determine posterior odds of H a to H 0 for each of the tests. For the majority of tests, each calibration was performed using prior probabilities indicative of complete randomness (prior = 0.5), as suggested by Colquhoun [58]. Nevertheless, for the construction of confidence intervals around these calculations, prior probabilities of 0.8 and 0.2 were also considered and reported.
To assess and account for possible Type I statistical errors among hypothesis tests [58], each of these p and P U (H a |p) values were also accompanied by calculations of the False Positive Risk (FPR; Eq. (8) & (9)); While a number of different formulae have been proposed as a means of defining the likelihood ratio of H 0 against H a , in other words L 10 in Eq. (8) [58,59], the present study employs the Sellke-Berger approach [57,60], which is the equivalent of using Eq. (6) for the calculation of FPR (Eq. (9)). Additionally, considering observations by Courtenay et al. [61], where necessary, a complementary calculation of FPR for deriving the probability of H 0 (P(H 0 )) was also performed (Eq. (10) & (11)); P(H 0 ) is used as a means of calibrating p-values above p = 0.3681, considering how observations by Courtenay et al. [61] found this value as a point of maximal curvature in p-value calibration curves using equations 7 & 9. From this perspective, the inverse of FPR (Eq. (10)) can be used to ensure each p-value between 0 and 1 have their own unique calibration values [61].
For the interpretation of these calibrated metrics, BFB values indicate the data-based odds of H a being true to H 0 , whereby high values of BFB support H a . So as to facilitate the interpretation of these odds, the P U (H a |p) calculation ensures this number is reported as a percentile falling between 0.5 and 1. The posterior odds of H a to H 0 are interpreted the same as BFB, where high values support H a . FPR, on the other hand, returns a decimal value between 0 and 0.5, with 0.5 indicating a high chance of the concluding hypothesis to be a Type I statistical error. The P(H 0 ) thus ensures that FPR values are returned between 0 and 1, with 1.00 indicating a 100% probability of incorrectly concluding H a to be true.
In light of these calibrations, p-values were evaluated in accordance not only with their corresponding P U (H a |p), FPR and P(H 0 ), but also using a more robust and conservative p values < 3σ from the mean (0.003) as a threshold for more conclusive results. For ease of comparison and calibration, Table 2 presents different p-values calibrated using metrics BFB, P U (H a |p), FPR and H a to H 0 ratios. Table 3 presents different p-values calibrated using P(H 0 ). As can be seen, the traditional p < 0.05 (2σ) presents a very low BFB value of 2.5 to 1, with a 28.9% probability of being a Type I statistical error using prior probabilities of 0.5 in support of the alternative hypothesis. This indicates that p < 0.05 is not a reliable threshold to define conclusive evidence. p < 0.003 (3σ), on the other hand, results in BFB values of 21 to 1, with only a 4.5% chance of being a Type I statistical error using the same prior probabilities.

Descriptive statistics
Hyperspectral curves for all three samples (H, BCC & SCC) presented highly inhomogeneous distributions across most of the spectrum (Fig. 4 & 5). This was especially evident for frequencies below 699.97 nm (Fig. 5). While frequencies above this threshold presented increasingly more  Calculations regarding residuals when fitting linear models onto the data additionally reveal a notable decrease in residuals towards the NIR regions (>702.19 nm) of the spectrum (Fig. 6). Combined with observations regarding sample skewness, it can be seen how regions of lower frequencies present great positive skewness (skewness > 0, Fig. 7), contributing to the lack of  Fig. 7). Nevertheless, as noted by the lack of normality in SCC samples, positive skew remains high throughout. In light of each of these observations, it can be seen how SCC is strongly characterised by a highly inhomogeneous and skewed distribution throughout the spectrum, while BCC and H hold a more Gaussian-like nature towards the end of the visual light spectrum. Upon plotting central tendency and variance curves, it can be seen how BCC samples reflect the least amount of light across all frequencies, while SCC reflects the most light, especially between the ranges of 606.74 and 862.02 nm. Healthy skin samples, on the other hand, appear to have signatures midway between the two samples, appearing more similar to SCC samples. Nevertheless, great overlapping is observed across most samples (Fig. 8), especially between H and SCC samples below 600 nm. This is especially evident when observing central tendency calculations. The greatest differences are observed when considering the variability of sample distributions ( √ BWMV), highlighting the importance of robust statistical approaches in this type of analysis.
In either case, the notable peaks in reflectance between ca. 600 nm and 850 nm indicate all three samples to be strongly characterised by greater reflectance of orange, red, and the lower frequencies of NIR light (Fig. 8), than any other part of the visible spectrum.

Univariate hypothesis testing
From a more analytical perspective, Levene tests reveal most samples to reject the terms of homoscedasticity, especially when comparing H and BCC, as well as between SCC and BCC (Fig. 9). Nevertheless, comparisons reveal high similarities in the variances between SCC and BCC (minimum F = 4.5e-09, p = 0.17, P U (H a |p) = 0.55), with posterior odds of at most 1 to 0.61 ∈ [0.24, 0.85] in favour of the alternative hypothesis.
In the case of comparing H with BCC, Levene's test reveals important divergences in the range 571.23 to 651.14 nm ( Fig. 9; central F = 11.4, p = 0.0007, P U (H a |p) = 0.987), with posterior odds of at most 1 to 36.7 ∈ [14.7, 58.7] in favour of the alternative hypothesis, and a 1.3 ∈ [0.4, 5.2]% chance of this observation being a false positive. When comparing both types of cancer, deviation also occurs beyond 571.23 nm (Fig. 9), however, in this case divergences are prolonged throughout the spectrum until 691.09 nm, with the exception of 7 bands within this 54-band range. Furthermore, while divergences between SCC and BCC are slightly less marked than  In addition to these windows, BCC and SCC were also found to differ in an additional window towards the violet, blue and cyan regions of the visible spectrum. From this perspective, Levene's test found notable divergences between 440.25 and 502.41 nm ( Fig. 9; central F = 11.2, p = 0.0009, P U (H a |p) = 0.984). In this case, only 2 bands proved an exception to this rule in the 28 band window. This window is additionally associated with posterior odds of at most 1 to 30.4 ∈ [12.1, 48.6] in favour of H a . FPR values were calculated at 1.6 ∈ [0.4, 6.2]%. Table 4 presents a summary of these results and the windows where the greatest differences have been found.

Jensen-Shannon distances
Similarity measures in accordance with JSD ranged from highly similar probability distributions (JSD = 0.62) to areas of notable divergences (JSD = 0.001). Nevertheless, the only window presenting convergence of differences across all samples was found to be located approximately between 582.32 to 748.81 nm (JSD median = 0.011, range = [0.004. 0.041]). When considering differences between cancer samples and healthy skin, the greatest divergence of BCC from H was found at 735.49 nm (JSD = 0.006), while SCC from H was found at this frequency as well (JSD = 0.004). In the case of separating between different types of cancer, the greatest differences for BCC and SCC were found at 673.33 nm (JSD = 0.006).
Interestingly SCC and H are clearly differentiable from most points beyond 580.10 nm, with only 3 peaks in similarity in the NIR spectrum at ca. 890.87, 943.04 and 948.59 nm (Fig. 10).

Multivariate testing
When comparing all the different results obtained throughout this study, a final window where most major differences seem to be located can be established between 573.45 and 779.88 nm, occupying a substantial proportion of the visible light spectrum, including the spectral colours yellow, orange and red, as well as the beginning of the NIR light spectrum (Fig. 11). Performing multivariate statistical analyses using the Wilcox test across this region reveals important differences between H and SCC (p < 2.0e-16, P U (H a |p) ≈ 1), as well as between H and BCC (p = 1.5e-12, P U (H a |p) ≈ 1). In the case of H vs SCC, this corresponds to a posterior probability of at most 1 to 2.5e+13 with prior odds of 1:2, and thus a worst case scenario of a posterior probability of at most 1 to 1.0e+13 when considering prior odds of 2:10. In the case of H vs BCC, a worst case scenario's posterior probability can thus be calculated at 1 to 1.8e+09. Needless to say, on both accounts, the probability that this observation is a Type I statistical error is 1.1e-08%.
Needless to say, when processing the entire spectrum, all 270 bands present important multivariate differences between H and both cancer samples. Nevertheless, p-values are slightly higher (p < 4.5e-12), while BCC and SCC are still indistinguishable (p = 0.3, P U (H a |p) = 0.5).
Finally, when these bands are used to visualise skin lesions, it can clearly be seen how certain frequencies have greater potential of isolating cancerous skin cells over others, helping in determining areas of subclinical invasion, which may be a valuable tool for future classification tasks as well as for applications in surgical removal of these types of lesions (Fig. 12).

Discussion
While cancer is often perceived to be a disease of modernity, studies have shown neoplasms to have great antiquity [62][63][64], having an effect on most living things at least 255 million years ago (Mya) in mammals [65], and 1.7 Mya in humans [66]. Nevertheless, the severity and increase in malignancy in these pathological phenomena is of increasing concern to modern-day society, a fact that is strongly conditioned by our way of life. NMSC is the most frequent type of cancer in humans [67], representing a major health problem in fair skinned elderly people, while being associated with elevated health costs [68]. Moreover, beyond this, cancer is known to have a significant emotional impact on not only the patient but their families too [69].
Timely detection of skin cancer is fundamental for suitable treatment and improving patient survival rates. The present study has revealed important statistical differences between the multiple samples analysed within the VNIR spectrum using a linear hyperspectral camera. Robust statistical techniques, as well as feature selection algorithms, have been able to highlight these differences particularly within the range of 573.45 and 779.88 nm; occupying a considerable portion of yellow, orange and red visible light, as well as a more reduced proportion of the NIR spectrum. Moreover, the probability that observations separating healthy skin from cancer are a false positive has been calculated at less than 1% based on the present data. While differences between both cancer types were limited within these frequencies, a secondary window was found to be important for these samples between 429.16 to 520.17 nm; occupying portions of violet, blue, cyan and green visible light.
Multiple studies have shown how substances such as melanosomes, collagen, blood and water, affect the spectral signatures of skin and other biological tissues [70]. Similarly, the content of these substances varies among different parts of the body, causing great natural variability within healthy skin hyperspectral signatures. Most studies concur that one of the greatest conditioning factors in the morphology of skin spectral signatures is product of melanin's red light absorption rate, explaining the large peaks and variances in the range of 600 to 800 nm [22,24,26,33,70]. While this is evidently a greater biomarker for the study of melanoma, the present study also highlights portions of this range for the detection of NMSCs as well. Likewise, substances such as haemoglobin have been noted to condition reflectance within the range 530 to 600 nm, while pagetoid growth is known to affect the lower end of the visible spectrum, between 400 and 500 nm.
Halicek et al. [33] describe these phenomena in a study of SCC patients, noting haemoglobin to reflect less light in SCC than healthy skin. While the signatures between 530 and 600 nm are relatively similar in the present study, SCC is indeed seen to reflect 0.65% less light than H samples, while BCC presents a difference of 0.16% less light reflected. While the present study has a smaller number of SCC patients, each of these differences have been reported here as minute, revealing no statistical data that supports using this region of the spectrum as a diagnostic biomarker. Moreover, a similar lack of statistical differences was noted by Gareau et al. [71] and Hosking et al. [26] in the case of melanoma.
Halicek et al. [33] also describe SCC samples to reflect more light from 600 to 900 nm than healthy skin, concurring with observations by Pardo et al. [24]'s study of melanoma patients. These observations were additionally attributed to melanin's greater absorption rates of light. Via robust statistical techniques, Pardo et al. [24] additionally prove this a valuable biomarker. The present study confirms both these observations, concluding SCC to have a slightly higher absorption rate than healthy skin (0.53% less reflected light), while BCC is strongly characterised by a greater absorption of this type of light (0.85%), much similar to melanoma.
Observing the differences found within the present study for SCC and BCC between 429.16 and 520.17 nm, in a study on melanoma, Hosking et al. [26] note blue light to be particularly susceptible to differences in the dermoepidermal junction. From this perspective, it can be argued that this region of the spectrum is useful for containing information regarding the specific variability of skin cancer types. Considering how SCC and BCCs affect different types of cells, particular atypia and possible variances in pagetoid spread may be contained within this region of the spectra, proving a possible starting point for future investigation in hyperspectral diagnostics.
Nevertheless, the present study still has some limitations. From one perspective, the characteristics of the platform designed, while proving useful, present room for improvement. Considering most patients were of elderly age, the patient's ability to stay still during the 15 s exposition time was greatly reduced. In cases where skin lesions were found on patient's hands, possible tremors meant many images presented a "wavy" pattern or appearance. Similarly, when lesions were found on the face or top of the head, many patients commented on a lack of comfort trying to hold their heads in position. As a consequence, a number of images were unfortunately discarded. In order to overcome this, future efforts should consider the use of a more ergonomic platform, designed with a resting pad where the patient can place their arm or head in a more comfortable position during image acquisition processes.
From a similar perspective, considering the natural variability most lesions presented, due to within-group typological variances among NMSCs, a larger sample of different types of SCC and BCC tumours should be obtained. Moreover, a large number of the patients sampled here were at advanced stages of both tumour growth and spread, especially in the case of SCC patients. From this standpoint, research into earlier stages of carcinoma development and metastasis would be a valuable step towards ensuring, not only efficient diagnoses, but also early detection. It is also strongly recommendable that research goes into premalignant lesions -such as actinic keratosis-, as well as non-malignant nevi, so as to provide a better diagnostic tool.
Similarly, although the present study concurs with data provided by other authors, here artificial skin pigmentation prior to hyperspectral image acquisition was avoided, making some comparisons with other studies that did use pigmentation difficult. Likewise, the natural variability of human skin pigmentation is likely to be an important factor when performing future studies. From this perspective, it can be predicted that healthy skin signatures for fairer-skinned patients will present greater differences than those observed here, with trends towards more reflected light in the 530 to 600 nm range, while less light would theoretically be reflected between 600 and 800 nm.
Needless to say, the application of robust statistical measurements was still able to reveal important features of SCC and BCC signatures, presenting promising possibilities for applications with larger patient sample sizes. Similarly, the differentiation between H, SCC and BCC samples, in specific regions of the spectrum, allows for efficient feature selection. The most recent advances in the integration of hyperspectral imagery to medical applications have shown these tools to greatly increase the accuracy when delimiting skin lesions as opposed to human analysts [24,34]. While the present study has not taken this step, a detailed statistical analysis and characterisation of these types of skin lesions is fundamental for more advanced applications. In sum, and based on the present data, it can clearly be seen how robust statistical analyses of this nature provide a basis upon which more complex computational learning applications can be built from, especially using artificial intelligence.

Conclusion
In this study, robust statistical tests were employed on hyperspectral data in the VNIR spectrum in order to identify the hyperspectral differences between carcinomas (SCC & BCC) and healthy skin (H). The optimal spectral ranges for discrimination between SCC, BCC, and H, as well as between BCC and SCC, have been defined using robust statistics. The results are especially promising for the discrimination between cancerous and non-cancerous areas, paving the way for future research in NMSC diagnostics using hyperspectral images.
A study with larger patient samples should be carried out in the future, allowing for a wider representation of lesion variability (including different stages of growth and stage). The same can be said considering the variability of healthy skin (including features such as non-cancerouslesions). Similarly, while the presented ad-hoc platform has been designed for its implementation in clinical practice, especially as a tool to support medical practitioners, further improvements in the image acquisition platform should also be made in order to make it more ergonomic and facilitate data acquisition. This can be considered of great importance for more practical scenarios, especially for elderly patients.