PRNU enhancement effects on biometric source sensor attribution

Identifying the source camera of a digital image using the photo response non-uniformity (PRNU) is known as camera identification. Since digital image sensors are widely used in biometrics, it is natural to perform this investigation with biometric sensors. In this study, the authors focus on a slightly different task, which consists in clustering images with the same source sensor in a data set possibly containing images from multiple unknown distinct biometric sensors. Previous work showed unclear results because of the low quality of the extracted PRNU. They adopt different PRNU enhancement techniques together with the generation of PRNU fingerprints from uncorrelated data in order to clarify the results. Thus they propose extensions of existing source sensor attribution techniques which make use of uncorrelated data from known sensors and apply them in conjunction with existing clustering techniques. All techniques are evaluated on simulated data sets containing images from multiple sensors. The effects of the different PRNU enhancement approaches on the clustering outcome are measured by considering the relation between cohesion and separation of the clusters. Finally, an assessment on whether the PRNU enhancement techniques have been able to improve the results is given.


Introduction
Investigations in the field of digital image forensics usually comprise forensic tasks, such as device identification, device linking, recovery of processing history and the detection of digital forgeries. The photo response non-uniformity (PRNU) of an imaging sensor has emerged as an important forensic tool for the realisation of these tasks. Slight variations among individual pixels during the conversion of photons to electrons in digital image sensors are considered as the source of the PRNU; thus, it is an intrinsic property which forms an inherent part of all digital imaging sensors and their output, respectively. All digital image sensors cast this weak, noise-like pattern into each and every image they capture.
This systemic and individual pattern, which enables the identification of the image sensor itself, is essentially an unintentional stochastic spread-spectrum watermark that survives processing, such as lossy compression or filtering. Essential criteria like dimensionality, universality, generality, stability and robustness [1] make it well suited for forensic tasks, as the ones mentioned before. The identification of a digital image sensor can be performed at different levels as described by Bartlow et al. [2]: technology, brand, model, unit. In this work, we focus on the unit level, which corresponds to a distinction of sensor instances of the same model and brand. For the purpose of sensor identification, a so called PRNU fingerprint can be calculated from multiple images of the same sensor, which is considered to be more robust for this task than a single image.
Besides the application of the PRNU for forensic tasks in general, it can also be useful in a biometric context. A biometric sensor's PRNU can also be used to improve a biometric system's security by ensuring the authenticity and integrity of images acquired with the biometric sensor deployed in the system. Previous work by Uhl and Höller [3] performed a feasibility study on the CASIA-Iris V4 database. They investigated the differentiability of the sensors in the CASIA-Iris V4 database by exploiting their PRNU and concluded that the equal error rates (EERs) and respective thresholds fluctuate considerably, depending on the sensor. Other work by Kalka et al. [4] regarding the differentiability of iris sensors showed varying results as well, while studies conducted on fingerprint (FP) sensors by Bartlow et al. [2] showed more satisfactory results.
This raises the question of why the differentiation results are poor for some sensors, which has to be investigated if PRNU FPs are to be applied as an authentication measure for biometric databases. On the one hand, it was assumed that this high variation could be caused by the correlated data that was used to generate the sensor's PRNU FP, since all images investigated in [3] have a very similar image content. On the other hand, Kalka et al. [4] concluded that the variations are caused by the absence of the PRNU in saturated pixels (pixel intensity = 255) or undersaturated pixels (pixel intensity = 0) for different images in the data sets. Furthermore, Uhl and Höller [3] suspected that multiple sensors may have been used for the acquisition of the CASIA-Iris V4 subsets. If a PRNU FP is generated using images of different sensors, it will match images acquired with all of these sensors and hence lead to a decreased differentiability. Other factors that have negative effects on the differentiability are non-unique artefacts (NUAs) [5] and other high frequency components of the images, such as textured image content or edges. Several techniques to attenuate PRNU contaminations have been proposed in the literature [6][7][8][9][10][11][12].
For the previously mentioned sensor identification task the PRNU FPs are usually pre-calculated using images from sensors available to the investigators. However, in a realistic scenario, this availability is not always given. The images under investigation could be part of an image set containing images from an unknown number of different cameras. Before an image source identification can be performed in this scenario, images acquired with the same camera need to be identified and grouped together first. This task is known as source camera attribution in an open set scenario [13] or source camera clustering. Several clustering techniques have already been suggested by other researchers, who performed hierarchical agglomerative clustering [14,15] or multi-class spectral clustering (MCSC) [13] for this scenario by formulating the classification task as a graph partitioning problem. Other related work by Bloy [16] relies on an iterative algorithm that progressively agglomerates images with similar PRNU using a pre-calculated threshold function to generate a PRNU FP for the sensor. Some of the source sensor attribution techniques used in [17] are used in this work together with the previously mentioned approach of Bloy [16] [14,15,18]. The size of the extracted PRNU for consumer cameras used for source sensor attribution found in the literature ranges from very small sizes of 128 × 128 [15], 256 × 512 [14] and 640 × 480 [19] to full-size images of several megapixels, where the most common size appears to be 1024 × 1024 [16,20]. The results reported for consumer cameras show that the size of the extracted PRNU plays a major role for the performance of the various techniques, where plausible results can be obtained with PRNU patches larger than 1024 × 1024 pixels in general and 256 × 512 pixels using additional PRNU enhancements.
In this work, we conduct a source sensor attribution on different biometric data sets from different biometrics modalities, which aims at determining whether the images in the data sets described in Section 4 have been acquired using multiple instances of the same sensor model. The investigation is conducted without taking any a priori knowledge about the sensors into consideration. To improve the quality of the extracted PRNU, we make use of various PRNU enhancement techniques which aim at attenuating undesired artefacts in the extracted PRNU as described in Section 2. Furthermore, additional uncorrelated data acquired with the same sensors as utilised to acquire the data sets is used for the generation of high-quality PRNU FPs. The performance of using the high-quality PRNU FPs is compared to the application of the various PRNU enhancement techniques. We propose novel extensions of the previously mentioned source sensor attribution techniques in Section 3 to be able to make use of the uncorrelated data. Section 5 explains the experimental set-up and describes the measure used for the evaluation of the clustering outcome and also contains the discussion of the experimental results. Finally, Section 6 concludes the paper.
This work is an extended version of a paper previously published in [21]. We extend our previous work by proposing additional source sensor attribution techniques that make use of uncorrelated data from known sensors and measure their performance on simulated data sets containing images from multiple sensors and different PRNU sizes as well as on existing biometric data sets mostly containing an unknown number of source sensors. Furthermore a quantitative assessment on the effects of using data from known sensors compared to various PRNU enhancement approaches and the combination of both of them is given based on a metric measuring the cohesion and separation of the clustering result for each technique.

PRNU extraction and enhancement
The extraction of the PRNU noise residuals is performed by applying Fridrich's approach [22]. For each image $I$ the noise residual $W_I$ is estimated as described in the following equation:

$$W_I = I - F(I)$$

where $F$ is a denoising function filtering out the sensor pattern noise. In this work, we made use of four different denoising algorithms: the two wavelet-based denoising filters proposed by Lukas et al. in Appendix A of [23] ($F_{Luk}$) and Mihcak et al. in [24] ($F_{Mih}$), the BM3D denoising filter proposed by Dabov et al. [6] ($F_{BM3D}$) and the FSTV algorithm proposed by Gisolf et al. [9] ($F_{FSTV}$).
After the PRNU extraction the noise residual $W_I$ may be contaminated with undesired artefacts. To attenuate their effects different PRNU enhancement techniques have been proposed in the literature. Zero-meaning of the noise residual's pixel rows and columns (ZM) removes NUAs with regular grid structures as described in [22]. Li [7] developed a technique for attenuating the influence of scene details or textured image content on the PRNU so as to improve the device identification rate of the identifier. This approach is referred to as Li. According to Lin and Li [12] some components of the extracted PRNU noise residual are severely contaminated by the errors introduced by denoising filters. They proposed a filtering distortion removal (FDR) algorithm that improves the quality of $W_I$ by abandoning those components. The extracted and enhanced PRNU noise residual for a sample image using the various denoising filters and PRNU enhancements can be seen in Fig. 1.
Finally, the PRNU noise residual $W_I$ is normalised with respect to the $L_2$-norm because its embedding strength varies between different sensors, as explained by Uhl and Höller [3].
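The extraction chain described above (denoise, subtract, zero-mean, normalise) can be sketched in a few lines of Python. This is only an illustrative sketch: the box filter `box_denoise` is a hypothetical stand-in for the denoising function F and is not one of the four filters used in this work, the function names are assumptions, and the Li and FDR enhancements are omitted.

```python
import numpy as np

def box_denoise(img, radius=1):
    """Hypothetical stand-in for F: mean filter over a (2r+1) x (2r+1) window."""
    padded = np.pad(img, radius, mode="reflect")
    out = np.zeros_like(img, dtype=np.float64)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

def zero_mean(residual):
    """ZM: subtract the mean of every column, then the mean of every row."""
    residual = residual - residual.mean(axis=0, keepdims=True)
    return residual - residual.mean(axis=1, keepdims=True)

def extract_residual(img, denoise=box_denoise):
    """Noise residual W = I - F(I), zero-meaned and scaled to unit L2-norm."""
    img = np.asarray(img, dtype=np.float64)
    w = zero_mean(img - denoise(img))
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w
```

Any of the denoising filters named above could be passed as the `denoise` argument, provided it maps a 2-D array to a 2-D array of the same shape.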
The PRNU FP $K$ of a sensor is then estimated using a maximum-likelihood estimator for images $I_i$ with $i = 1, …, N$.
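As a rough illustration of this estimation step, the following sketch assumes the form of the maximum-likelihood estimator commonly given in the PRNU literature, i.e. the pixel-wise sum of residual-times-image divided by the pixel-wise sum of squared images. The estimator's formula is not spelled out in the text here, so this form, the function name and the `eps` guard are assumptions.

```python
import numpy as np

def estimate_fingerprint(images, residuals, eps=1e-12):
    """Assumed ML estimator: K = sum_i(W_i * I_i) / sum_i(I_i ** 2), pixel-wise.

    images, residuals: arrays of shape (N, h, w) holding I_i and W_{I_i}.
    eps is a hypothetical guard against division by zero in all-black pixels.
    """
    images = np.asarray(images, dtype=np.float64)
    residuals = np.asarray(residuals, dtype=np.float64)
    numerator = (residuals * images).sum(axis=0)
    denominator = (images ** 2).sum(axis=0)
    return numerator / (denominator + eps)
```

If every residual were exactly the image multiplied by a fixed pattern, this estimator would return that pattern, which is the behaviour one expects from a fingerprint estimate.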
PRNU FPs can be contaminated with NUAs as well. To further enhance the quality of PRNU FPs a Wiener filtering (WF) applied in the discrete Fourier transform domain is proposed in [1] to suppress periodic artefacts. Lin and Li [11] proposed a novel scheme named spectrum equalisation algorithm (SEA), where the magnitude spectrum of the PRNU FP $K$ is equalised through detecting and suppressing the peaks according to the local characteristics, aiming at removing the interfering periodic artefacts.

A method to detect the presence of a specific PRNU FP in an image which has not been geometrically transformed is the normalised cross correlation (NCC), which is defined as

$$NCC(A, B) = \frac{\sum_{i=1}^{w} \sum_{j=1}^{h} (A_{i,j} - \bar{A})(B_{i,j} - \bar{B})}{\sqrt{\sum_{i=1}^{w} \sum_{j=1}^{h} (A_{i,j} - \bar{A})^2 \, \sum_{i=1}^{w} \sum_{j=1}^{h} (B_{i,j} - \bar{B})^2}}$$

$A$ and $B$ are two matrices of the same size $w × h$ and $\bar{A}$ and $\bar{B}$ are their respective means. The mean of a matrix $X$ with size $w × h$ is defined as

$$\bar{X} = \frac{1}{w h} \sum_{i=1}^{w} \sum_{j=1}^{h} X_{i,j}$$

The NCC is used to detect the presence of a PRNU FP $K$ in an image $I$ with

$$\rho_{[I,K]} = NCC(W_I, I K)$$

where $\rho$ indicates the correlation between the noise residual $W_I$ of the image $I$ and the PRNU FP $K$ weighted by the image content of $I$.
On the other hand, the NCC can also be used to measure the similarity of two PRNU noise residuals $\hat{W}_I$ and $\hat{W}_J$ from two sensors $S_i$ and $S_j$, as shown in the following equation:

$$\rho_{[I,J]} = NCC(\hat{W}_I, \hat{W}_J)$$

Fridrich [1] proposed an alternative technique for measuring the similarity of two PRNU noise residuals or a PRNU noise residual and a PRNU FP, the peak correlation energy (PCE), which has proven to yield more stable results in a scenario where the images have been subject to geometrical transformations, such as rotations or scaling. Since none of the images used in this work has undergone any of these transformations and Kang et al. showed in [25] that the PCE by definition may increase the false positive rate, we decided to use the NCC over the PCE.
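For illustration, the NCC and the fingerprint-presence score described above can be written as a short NumPy sketch. The function names are assumptions, and the zero-denominator guard is a convenience that the text does not discuss.

```python
import numpy as np

def ncc(a, b):
    """Normalised cross correlation of two equally sized matrices."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def fingerprint_presence(residual, image, fingerprint):
    """Correlate the residual W_I with the fingerprint K weighted by the image I."""
    weighted = np.asarray(image, dtype=np.float64) * fingerprint
    return ncc(residual, weighted)
```

Comparing two noise residuals directly is then simply `ncc(w_i, w_j)`.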

Source sensor attribution techniques
In this work, we consider various techniques for the source sensor attribution task, where we apply various existing source attribution techniques and propose a novel one. We furthermore propose novel extensions for these existing methods for the case that the sensor is available to the investigators and uncorrelated data is used to generate the PRNU FP. The uncorrelated data is generated by acquiring images with high saturation (but not over saturated) and smooth content, according to Fridrich [1]. All the mentioned clustering techniques generate a list of clusters, where the association of each image in the investigated data set to a cluster and thus a cluster label is obtained. The novel extensions of the existing methods together with a brief explanation of the original techniques are given in the following section.

Known sensor blind camera fingerprinting and image clustering ((KS)BCF)
In [16] Bloy proposed the blind camera fingerprinting and image clustering (BCF) technique, which performs an agglomerative clustering to construct PRNU FPs from a mixed set of images, enabling identification of each image's source camera without any prior knowledge of source. This technique solely depends on a precalculated threshold function. Using this threshold function t an automatic clustering algorithm performs the following steps:
1. Randomly select pairs of images until a pair is found whose noise correlation exceeds t(1); average the PRNU of this pair to form a FP.
2. Perform the first pass: for each remaining image, correlate the PRNU with the FP. When the correlation value exceeds t(# of images in FP cluster), average (cluster) it into the FP. When n = 50 images have been averaged into the FP or all images have been tried, stop and go to Step 3.
3. Perform the second pass: loop over all the unclustered images a second time, correlating with the current FP and adding those that exceed the threshold. (Do not average more than 50 images into the FP but allow more than 50 to be associated with the FP.)
4. Repeat Step 1. Stop when Step 1 has tried 1000 pairs without success.
To be able to use the uncorrelated data, the first step (Step 1) is modified so that during the first iteration a PRNU FP is calculated from the uncorrelated data and the selection of two random images is skipped. After this modified step each remaining image is compared to this FP as described in Steps 2 and 3. After comparing all images, Step 1 is repeated as in the original algorithm by selecting two random images. We call this extension Known Sensor Blind Camera Fingerprinting and Image Clustering (KSBCF), as noted in the original paper [21].
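The BCF steps and the KSBCF modification can be sketched as follows. This is a simplified reading of the procedure, not the authors' implementation: the threshold function is supplied by the caller, plain averaging of residuals stands in for fingerprint construction, and the parameter `known_count` (the number of images assumed to be behind the known fingerprint) is a hypothetical addition.

```python
import numpy as np

def ncc(a, b):
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def bcf_cluster(residuals, threshold, known_fp=None, known_count=1,
                max_avg=50, max_tries=1000, seed=0):
    """Return (clusters, leftover); each cluster is a list of image indices."""
    rng = np.random.default_rng(seed)
    unclustered = list(range(len(residuals)))
    clusters = []
    use_known = known_fp is not None
    while True:
        if use_known:
            # KSBCF: the first fingerprint comes from the uncorrelated data.
            fp, count, members = np.array(known_fp, dtype=float), known_count, []
            use_known = False
        else:
            # Step 1: draw random pairs until one exceeds t(1).
            pair = None
            for _ in range(max_tries if len(unclustered) >= 2 else 0):
                i, j = (int(x) for x in rng.choice(unclustered, 2, replace=False))
                if ncc(residuals[i], residuals[j]) > threshold(1):
                    pair = (i, j)
                    break
            if pair is None:
                break  # Step 4: no seed pair found, stop
            i, j = pair
            fp, count, members = (residuals[i] + residuals[j]) / 2.0, 2, [i, j]
            unclustered = [u for u in unclustered if u not in members]
        # Steps 2 and 3: two passes over the remaining images.
        for pass_no in (1, 2):
            for u in list(unclustered):
                if pass_no == 1 and count >= max_avg:
                    break  # first pass ends once max_avg images are averaged
                if ncc(residuals[u], fp) > threshold(count):
                    members.append(u)
                    unclustered.remove(u)
                    if count < max_avg:
                        fp = (fp * count + residuals[u]) / (count + 1)
                        count += 1
        clusters.append(members)
    return clusters, unclustered
```

Calling `bcf_cluster(residuals, t)` corresponds to BCF, while passing `known_fp` gives the KSBCF behaviour, where the first cluster is grown from the known sensor's fingerprint.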

Known sensor sliding window fingerprinting ((KS)SWx)
The Sliding Window Fingerprinting (SW) technique proposed in [26] consists of a so called 'sliding window' with an arbitrary but fixed size n that moves over a data set image by image. This forensic technique uses an iterative algorithm which performs the following steps:
1. Start at image with index i = 0.
2. Gather images inside the sliding window with size n, hence the images with index i, …, i + n.
3. Extract the PRNU noise residual for each image.
4. Compute a PRNU FP using the images inside the window.
5. Increment the index i by 1.
6. Repeat step 2 until all the images have been used to calculate a PRNU FP.
Moving the window over the whole data set yields a list of PRNU FPs, which have been computed using sequential overlapping windows. For a data set containing m images, m − n PRNU FPs are generated. After generating the FPs, the similarity of a PRNU FP $FP_i$ from the iteration i with all other FPs $FP_j$ where i ≠ j is computed by calculating the NCC score of each FP pair. This leads to a similarity matrix with size (m − n) × (m − n) containing all the pairwise NCC scores. The NCC scores of the PRNU FP comparisons where the FPs contain at least one common image are set to 0 because their correlation score would be much higher than average and introduce a bias to the clustering.
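The window generation and the masked similarity matrix can be sketched as below. The sketch follows the index range i, …, i + n stated in the steps, which yields m − n windows, and it uses a plain average of the residuals as a stand-in for the fingerprint estimation. The function names are assumptions.

```python
import numpy as np

def ncc(a, b):
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def sliding_window_fps(residuals, n):
    """One fingerprint per window start i, built from images i, ..., i + n."""
    m = len(residuals)
    fps, members = [], []
    for i in range(m - n):
        idx = list(range(i, i + n + 1))
        # Stand-in fingerprint: plain average of the residuals in the window.
        fps.append(np.mean([residuals[j] for j in idx], axis=0))
        members.append(set(idx))
    return fps, members

def similarity_matrix(fps, members):
    """Pairwise NCC; pairs of fingerprints sharing an image are left at 0."""
    k = len(fps)
    sim = np.zeros((k, k))
    for a in range(k):
        for b in range(a + 1, k):
            if members[a].isdisjoint(members[b]):
                sim[a, b] = sim[b, a] = ncc(fps[a], fps[b])
    return sim
```

The `members` list is kept alongside the fingerprints because both the overlap masking and the later per-image label assignment need to know which images went into which fingerprint.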
In [26], the number of clusters is determined in an explorative way by observing changes of the correlation scores. This leads to a rather vague estimation of the cluster structure in the data set. Hence, to assess the underlying cluster structure in a quantitative manner, we propose to apply different existing clustering techniques to cluster the obtained similarity matrix of pairwise PRNU FP comparisons. In this work, we applied the unsupervised clustering of digital images (UCDI) [14], the fast image clustering (FIC) [15] and finally the MCSC algorithm [18]. The lower case 'x' in the technique name indicates the applied clustering technique: U for UCDI, F for FIC and M for the MCSC technique.
These techniques yield a list of clusters and the PRNU FPs associated to each cluster. To obtain a cluster association for each image in the data set instead of each generated PRNU FP, we perform a majority voting based on the images used to generate each PRNU FP and the cluster association: Each image is used for the generation of multiple PRNU FPs because of the sliding window property, hence we count the cluster association frequency of the PRNU FPs, which contain the specific image, and select the highest cluster label occurrence as the final decision for the image. This gives a cluster label for each image in the data set.
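The majority voting can be illustrated with a short sketch. It assumes that each fingerprint's member images are available as a collection of image indices and that ties are broken arbitrarily (here: by first occurrence), which the text does not specify. The function name is hypothetical.

```python
from collections import Counter

def vote_image_labels(members, fp_labels, num_images):
    """Assign each image the most frequent cluster label among the FPs containing it.

    members:   one collection of image indices per fingerprint
    fp_labels: cluster label assigned to each fingerprint
    """
    votes = [Counter() for _ in range(num_images)]
    for idx, label in zip(members, fp_labels):
        for j in idx:
            votes[j][label] += 1
    # Images that appear in no fingerprint receive None.
    return [v.most_common(1)[0][0] if v else None for v in votes]

# Four overlapping windows over six images; the first two FPs fall into
# cluster 0, the last two into cluster 1.
labels = vote_image_labels([{0, 1, 2}, {1, 2, 3}, {2, 3, 4}, {3, 4, 5}],
                           [0, 0, 1, 1], 6)
print(labels)  # [0, 0, 0, 1, 1, 1]
```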
For the Known Sensor Sliding Window Fingerprinting (KSSWx) a PRNU FP is calculated with the uncorrelated data and is added to the list of PRNU FPs generated from the data set. This leads to a similarity matrix with size (m − n + 1) × (m − n + 1). This similarity matrix is again clustered using the previously mentioned UCDI, FIC and MCSC clustering techniques.

Known sensor K-means clustering ((KS)KM)
For this source sensor clustering technique Lloyd's K-means clustering algorithm [27] (KM) has been adopted, as previously proposed in [17]. K-means is a vector quantisation method for cluster analysis used in data mining that partitions n objects into k clusters. The centroid of each cluster is the point for which the sum of distances from all objects in that cluster is minimised, which leads to a set of clusters that are as compact and well-separated as possible. We define the PRNU noise residuals of the images in the investigated data set as the n objects to cluster, while k is the number of different sensors (clusters). Because the number of sensors is unknown for some data sets, we repeated the clustering for k = 1, …, 5 under the assumption that no more than five sensors have been used. This limitation is not mandatory and can be extended if necessary, but increases the computational effort significantly.
We propose an extension of this technique, the Known Sensor K-Means Clustering (KSKM), to be able to make use of the uncorrelated data. We first generate a PRNU FP from the uncorrelated data, which is then added to the set of PRNU noise residuals n which is clustered. In addition, we select this generated PRNU FP as starting point for the algorithm together with k − 1 random other samples from the data set. We repeat the K-means algorithm five times with the computed PRNU FP and k − 1 randomly chosen samples as starting points to avoid the possibility to get stuck in local minima and the clustering of the best run out of these five is selected as the final result.
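A minimal sketch of the KSKM seeding idea is given below. It assumes that the 'best run' is the one with the lowest within-cluster sum of squared distances, which the text does not state explicitly. The function names, the iteration limit and the flattening of the residuals into vectors are illustrative choices.

```python
import numpy as np

def sq_dists(points, centroids):
    """Squared Euclidean distances between every point and every centroid."""
    d = ((points ** 2).sum(axis=1)[:, None]
         - 2.0 * points @ centroids.T
         + (centroids ** 2).sum(axis=1)[None, :])
    return np.maximum(d, 0.0)

def lloyd(points, centroids, iters=100):
    """Plain Lloyd iterations from the given starting centroids."""
    for _ in range(iters):
        labels = sq_dists(points, centroids).argmin(axis=1)
        updated = centroids.copy()
        for c in range(len(centroids)):
            if np.any(labels == c):
                updated[c] = points[labels == c].mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    d = sq_dists(points, centroids)
    labels = d.argmin(axis=1)
    return labels, float(d[np.arange(len(points)), labels].sum())

def kskm(residuals, known_fp, k, runs=5, seed=0):
    """K-means seeded with the known FP plus k - 1 random residuals."""
    rng = np.random.default_rng(seed)
    data = np.array([np.ravel(r) for r in residuals], dtype=np.float64)
    fp = np.ravel(known_fp).astype(np.float64)
    points = np.vstack([data, fp])  # the FP is clustered along with the residuals
    best = None
    for _ in range(runs):
        picks = rng.choice(len(data), size=k - 1, replace=False)
        init = np.vstack([fp[None, :], data[picks]])
        labels, cost = lloyd(points, init)
        if best is None or cost < best[1]:  # assumed selection criterion
            best = (labels, cost)
    # Labels of the data set images, and the cluster the known FP ended up in.
    return best[0][:-1], int(best[0][-1])
```

The second return value indicates which cluster the known sensor's fingerprint was assigned to, so images sharing that label can be read as attributed to the known sensor.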

(Biometric) data sets
First of all, we generated simulated data sets to examine the performance of the source sensor attribution techniques presented in Section 3. These data sets all consist of images from three distinct sensors from a popular Sensor Forensics benchmark database, the Dresden Image Database [28]: Agfa DC-830i, Panasonic DMC-FZ50 and Nikon D200. The data sets all contain randomly selected images from each sensor, where we shuffled chunks of 50 images to obtain a random order. We then generated three different data set types based on the frequency of images from each of the three sensors:
• SIMeven: 150 images from each sensor.
• SIMuneven: 200 images from the first, 150 from the second and 100 from the third sensor.
• SIMdominant: 350 images from one sensor and 50 from each of the two others.
We repeated the data set generation ten times for each of the three simulated data set types, where the sensors' order for the image distribution is determined randomly each time, e.g. the sensor providing the most images in the SIMdominant data set was chosen randomly each time.
The existing biometric data sets under investigation in this work consist of images for two different biometric modalities, iris and FPs, which are listed in Table 1 together with the simulated ones. These biometric data sets have not been published; however, the iris data sets ending with '2013' and the FP data sets 'URU_1' and 'URU_2' have been acquired during a COST Short-Term Scientific Mission (STSM) as described in [29], while data sets ending with '2009' have been provided by the host institution during the mentioned COST STSM. The ground truth on the number of sensor instances used for the acquisition is only known for the H100_2013, IPH_2013, URU_1 and URU_2 data sets, each of which has been acquired with a single sensor instance. For all other data sets only the sensor model is known, but not how many instances of this model have been used.
All images in this work are 8 bit grey-level JPEG files. The iris data has been collected under near infrared illumination, while the FP sensors used red LEDs. The uncorrelated data used in this work to acquire the PRNU FPs for the known sensors has been acquired according to [29] for the following sensors: OKI Irispass-h, Irisguard H100 IRT, Digital Persona UrU4000 #1 and Digital Persona UrU4000 #2.
To obtain high-quality PRNU FPs as described by Fridrich [1], images with uncorrelated content and high saturation have been acquired. In some cases the sensor's quality assessment prevented the acquisition of such images, therefore the acquisition was performed in a best effort approach by varying the image content as much as possible to gain a 'cleaner' PRNU FP when averaging the images. Fig. 2 shows exemplary iris and FP images from the existing data sets described above and uncorrelated data acquired with the same sensor. It points out a successful acquisition for the Irisguard H100 IRT sensor, and a less successful one for the Digital Persona UrU4000 #2 sensor.

Experimental set-up and results
In the following section, we discuss the results of applying the various source sensor attribution techniques illustrated in Section 3 to the data sets in Section 4. First, we explain the general experimental set-up, which contains a description of the methodology and parameters valid for all experiments. After that we characterise the different experiments conducted in this work, which are divided into two different Sections 5.1 and 5.2.
All the data sets described in Section 4 are investigated independently. The PRNU noise residuals are extracted from a square patch located in the centre of each image. After the extraction the PRNU noise residuals are enhanced using one or more of the techniques mentioned in Section 2. For all clustering techniques where a PRNU FP is generated, PRNU FP enhancements are applied in addition. The configuration of both … For the (KS)BCF and (KS)SWx techniques, only clusters containing ten or more images are counted towards the final number of clusters. These techniques are prone to generate a few very small clusters for small PRNU sizes, which would have a strong impact on the results because of the overall rather small number of clusters. Furthermore, in the investigated biometric scenario, it is highly unlikely that such a small number of images in a data set has been acquired with a different sensor.
In order to quantitatively assess the clustering of the data sets and reveal differences caused by the various PRNU enhancement techniques, the mean silhouette value (MSV) by Rousseeuw [30] has been calculated for the clustering outcome of each source sensor attribution technique.
The silhouette value for each point is a measure of how similar that point is to points in its own cluster, when compared to points in other clusters, hence it is a measure relating intra- and inter-cluster distances. This technique does not rely on any ground truth information about the clustering of the investigated data set and is therefore well suited for our investigation because the ground truth is not known for all data sets used in this work, which can be seen in Table 1. The result for a single cluster, or k = 1, has been determined by calculating the pairwise NCC between all point combinations i and j, where i ≠ j, and then calculating the mean correlation over all points. For all k ≥ 2 the silhouette value of the ith point, $S_i$, and the MSV are defined as

$$S_i = \frac{b_i - a_i}{\max(a_i, b_i)}, \qquad MSV = \frac{1}{N} \sum_{i=1}^{N} S_i$$

where $N$ is the number of noise residuals, $a_i$ is the average distance from the ith point to the other points in the same cluster as i (cohesion), and $b_i$ is the minimum average distance from the ith point to points in a different cluster (separation), minimised over all clusters. The silhouette value ranges from −1 to +1. A high silhouette value indicates that a point i is well-matched to its own cluster, and poorly-matched to neighbouring clusters. If most points have a high silhouette value, then the clustering solution is considered to be an appropriate solution. On the other hand, if many points have a low or negative silhouette value, then the clustering solution may have either too many or too few clusters. This concludes the general experimental set-up and we will now continue with the discussion of the experimental results for the Simulated Data Sets.
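For k ≥ 2, the cohesion and separation terms of the silhouette value can be computed directly from pairwise distances, as in the following sketch. It uses Euclidean distances between flattened residuals and assigns a silhouette of 0 to points in singleton clusters, which is a common convention that the text does not confirm. The function name is an assumption.

```python
import numpy as np

def mean_silhouette(points, labels):
    """Mean silhouette value over all points; requires at least two clusters."""
    pts = np.array([np.ravel(p) for p in points], dtype=np.float64)
    labels = np.asarray(labels)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    clusters = np.unique(labels)
    s = np.zeros(len(pts))
    for i in range(len(pts)):
        same = labels == labels[i]
        if same.sum() == 1:
            continue  # singleton cluster: silhouette left at 0 (assumed convention)
        # Cohesion: average distance to the other points of the same cluster.
        a = dist[i, same].sum() / (same.sum() - 1)
        # Separation: smallest average distance to the points of another cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        if max(a, b) > 0:
            s[i] = (b - a) / max(a, b)
    return float(s.mean())
```

For the four one-dimensional points 0, 1, 10 and 11, grouping {0, 1} and {10, 11} gives a value close to 0.9, whereas grouping {0, 10} and {1, 11} gives a negative value, matching the interpretation given above.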

Simulated data sets
The performance evaluation of the source sensor attribution techniques is an important part of this work, since the effects of the advanced PRNU enhancement techniques evaluated later are assessed using the clustering outcome of the different techniques. Hence we applied the various clustering techniques on the simulated data sets SIMeven, SIMuneven and SIMdominant. The PRNU is extracted with the basic ZM + WF configuration, which uses the $F_{Luk}$ denoising filter, enhances the noise residuals with ZM and the PRNU FPs with ZM + WF according to [22].
We measure the performance of the proposed source sensor attribution techniques on the simulated data sets for varying PRNU patches (square size): 64, 128, 256, 512, 768, 1024, 1536 and 2048 pixels. In this case, the resulting scores and the number of clusters are averaged over the ten different randomly generated data sets of each data set type (SIMeven, SIMuneven and SIMdominant) separately.
For the simulated data sets, where the ground truth on the source sensor for each image is known, we compute the V-measure (VM) [31] score for the clustering outcome, which is defined as the harmonic mean of homogeneity (h) and completeness (c) as shown in the following equation:

$$VM = \frac{2 h c}{h + c}, \qquad h = 1 - \frac{H(C \mid K)}{H(C)}, \qquad c = 1 - \frac{H(K \mid C)}{H(K)}$$

The homogeneity h measures whether each cluster exclusively contains images from the same sensor, while the completeness c measures whether all images belonging to a sensor have been assigned to the same cluster. H(C | K) refers to the conditional entropy of the different classes for the given cluster associations and H(C) denotes the entropy of the classes; H(K | C) and H(K) are the analogous quantities with the roles of classes and clusters exchanged. Further details can be found in the corresponding paper [31].

First of all we have a look at how the size of the extracted PRNU affects the performance. Since the simulated data sets contain higher resolution images than the biometric data we are able to test various extracted PRNU sizes from 64 × 64 to 2048 × 2048 pixels. The results show that the VM scores increase with the PRNU size for some techniques, where BCF shows a steady increase in clustering performance with increasing PRNU size, while for KM the performance increases until a certain point and then stagnates. The stagnation of the VM scores after a certain PRNU size occurs due to the technique's inability to further exploit the additional data for the differentiation of the sensors in the data. Thus it reaches a point where additional data does not change the cluster association of the images.
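The V-measure introduced earlier in this section can be computed from two label lists with only the standard library, as the following sketch shows. It follows the entropy-based definitions of homogeneity and completeness from [31] with equal weighting. The function names and the handling of zero entropies (score set to 1) are assumptions in line with the usual convention.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy H(X) of a label list."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())

def conditional_entropy(x, given):
    """Conditional entropy H(X | Y) of labels x given labels y."""
    n = len(x)
    joint = Counter(zip(x, given))
    marginal = Counter(given)
    return -sum(c / n * log(c / marginal[y]) for (_, y), c in joint.items())

def v_measure(classes, clusters):
    """Harmonic mean of homogeneity h and completeness c."""
    h_c, h_k = entropy(classes), entropy(clusters)
    h = 1.0 if h_c == 0 else 1.0 - conditional_entropy(classes, clusters) / h_c
    c = 1.0 if h_k == 0 else 1.0 - conditional_entropy(clusters, classes) / h_k
    return 0.0 if h + c == 0 else 2.0 * h * c / (h + c)
```

As an example, splitting two sensors with two images each into four singleton clusters is perfectly homogeneous (h = 1) but only half complete (c = 0.5), so the V-measure is 2/3. This illustrates why a technique that produces too many clusters is penalised even when every cluster is pure.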
The MSV scores in general increase with larger PRNU size, except for the KM technique. The decreasing MSV scores for the KM technique with larger PRNU sizes can be explained by how the MSV scores are calculated. For the MSV scores we consider pairwise Euclidean distances between the PRNU noise residuals, which become more and more inaccurate with increasing dimensionality (i.e. PRNU size), as shown in [32]. Due to the cluster association staying the same for larger PRNU sizes, the MSV scores decrease because of this effect in higher dimensions. For the SWx techniques the MSV score increases with higher dimension because of their inability to cluster the data properly.
For the SWx techniques, the VM performance is consistently poor across all tested PRNU sizes. The reasons for this are the very low homogeneity scores for SWU and the very low completeness scores for SWF, while SWM shows the best VM score of the three but suffers from both mediocre homogeneity and mediocre completeness scores. The VM and MSV results for BCF, KM and SWF are illustrated in Fig. 3.
Since the biometric data limits the PRNU extraction to a 256 × 256 patch, we compared the performance of all techniques with this configuration, which can be seen in Fig. 4. It shows that the highest VM score is obtained by the KM technique, which shows a high score for the SIMeven and SIMuneven data sets, while it seems to struggle with the SIMdominant data set. In general, all techniques obtain much lower scores for the SIMdominant data set, with BCF being the only exception. Although SWU and SWM generate a number of clusters close to the expected result of 3, the quality of the clusters with respect to homogeneity and completeness is quite low. BCF on the other hand generates a few more clusters, but their quality is higher, which is indicated by the higher VM score.
In summary, the KM and BCF techniques are the most suitable techniques for clustering the data at the tested PRNU size. The KM technique obtains the highest scores for all three simulated data sets, but its performance varies strongly depending on the distribution of the images from different sensors within the data sets. The BCF technique on the other hand performs worse than KM because it is prone to produce more clusters, which is penalised by the VM. However, the produced clusters all have a high homogeneity, and since BCF yields the most consistent results across all the simulated data sets, we still consider this method well suited for the clustering. Due to the poor results of the SWx techniques, they cannot be recommended for this kind of scenario; thus only the BCF and KM techniques are taken into consideration for the remaining evaluation.

Iris and FP data sets
In this section, Iris and FP Data Sets, we discuss the effects of applying different PRNU enhancement techniques on the existing biometric data sets. For these iris and FP data sets we are only able to extract 256 × 256 pixel patches because of the varying image size to ensure the comparability of the results among all data sets.
The different configurations for the PRNU extraction process used for the experiments can be seen in Table 2. The parameters of all PRNU enhancement techniques have been chosen as recommended by the authors of the respective papers.
This section is further divided into the following three subsections:
• In Section 5.2.1, we briefly evaluate the results obtained with the basic ZM + WF configuration applied for the PRNU extraction for all clustering techniques.
• Section 5.2.2 discusses the effects of the different PRNU extraction configurations applied for all clustering techniques.
• In Section 5.2.3, we recapitulate the effects of the various PRNU extraction configurations and compare their performance across all data sets.
Before discussing the baseline results, an overview of all results for the biometric data sets is given in Table 3, from which we point out some interesting observations in the following.

Baseline: The resulting MSV values relevant for the baseline evaluation correspond to the ZM + WF rows of Table 3.
The resulting clusters for all source sensor attribution techniques can be seen in Fig. 5. First, we look at the iris and FP data set results separately. The first thing we notice for the iris data sets is that the BCF and KSBCF techniques produce a large number of clusters for the IPH_2009 and IPH_2013 data sets, and neither is able to cluster the data properly. This is also confirmed by the negative MSV scores. However, the use of uncorrelated data helps to improve the MSV scores slightly for KSBCF compared to BCF. KM and KSKM yield one cluster for all iris data sets, even for those with known ground truth that have been acquired with a single sensor. The use of uncorrelated data does not affect the MSV scores at all for the KSKM technique compared to KM.
For the FP data sets URU_1 and URU_2, all clustering techniques fail to cluster the data correctly and yield two clusters, even though the correct number would be 1. Yet all MSV scores are positive, which indicates that the separation of the data into two clusters could be reasonable. The effects of the uncorrelated data are the same as for the iris data sets: the MSV scores of KSBCF are slightly better than those of BCF, and the MSV scores of KSKM do not show any change in comparison with KM.
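To make the meaning of positive and negative MSV scores more concrete, the following minimal sketch computes a mean silhouette value, assuming that MSV denotes this measure. The Euclidean distance on toy 2-D points is a stand-in for the PRNU-based similarity and is an assumption, as are the function name and the data. The sketch requires at least two clusters; how a single-cluster outcome is scored is not covered by it.

```python
from math import dist


def mean_silhouette(points, labels):
    """Mean silhouette value over all points; needs at least two clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette is undefined for a single cluster")

    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # common convention: a singleton contributes 0
        # Cohesion: mean distance to the other members of the own cluster
        # (the distance of p to itself is 0, hence the division by len - 1).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # Separation: smallest mean distance to the members of another cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for k, other in clusters.items()
            if k != l
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Two hypothetical, well-separated groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(mean_silhouette(points, [0, 0, 1, 1]))  # grouping that follows the structure
print(mean_silhouette(points, [0, 1, 0, 1]))  # grouping that cuts across it
```

A partition that follows the structure of the data gives a value close to 1, whereas a partition that mixes the groups gives a negative value, which is the sense in which negative scores indicate an improper clustering.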

PRNU enhancements side by side:
In this subsection, we will have a look at the Li, BM3D, FSTV and FDR + SEA rows of Table 3, which contain the results of applying the PRNU extraction configurations described in Table 2. The evaluation of the results focuses on the BCF and KSBCF techniques first, followed by the KM and KSKM techniques.
The results for the BCF and KSBCF techniques are graphically depicted in Fig. 6. As we can see the BCF results for H100_2009   Fig. 7. The first thing that we notice here is that the use of uncorrelated data in the KSKM technique has absolutely no effect on the scores and the number of clusters. Therefore, all of the following statements relate to both KM and KSKM. In most cases, the PRNU enhancement configurations show an improvement of the MSV scores, while not changing the resulting number of clusters. The only exception is FSTV, which increases the number of clusters to 2 for the H100_2009 data set. The highest MSV scores for the iris data sets (H100_2009, H100_2013, IPH_2009 and IPH_2013) are

Summary biometric data:
The preceding results for the biometric data sets show that the adoption of the different PRNU enhancement configurations did indeed help to improve the outcome of the clustering techniques. Fig. 8 shows the PRNU enhancement configurations resulting in the highest MSV scores for each technique and data set. We can see that for the KM and KSKM techniques FSTV is the best choice for the iris data sets and BM3D for the FP data sets. For BCF and KSBCF, the choice of PRNU enhancement configuration depends on the data set, or rather on the sensor model: for the data sets using the Irisguard H100 IRT sensor BM3D is the configuration of choice, while for the OKI Irispass-h sensor it is the Li configuration. The additional use of uncorrelated data had a very large impact on the clustering outcome of the KSBCF technique applied to the IPH_2009 and IPH_2013 data sets compared to BCF. However, for the other data sets the impact was quite small, and for KSKM the uncorrelated data had no impact at all. This can be explained by how the KSKM technique makes use of the uncorrelated data: it is only used to create a starting point for the K-means algorithm, which then nevertheless converges to the same cluster centroids as without this additional data.
Concerning the data sets for which the ground truth is known, the correct number of clusters for all iris data sets could be determined at least by applying a combination of uncorrelated data and PRNU enhancements. In contrast, for the FP data sets the correct number could only be established in one case. In all other cases the clustering techniques failed to do so, regardless of the combination of uncorrelated data and PRNU enhancement.
To recapitulate, there is no single best PRNU enhancement configuration for this scenario; rather, the choice is highly situational.
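The explanation for the missing KSKM effect can be illustrated with a minimal sketch of K-means (Lloyd's algorithm) started from two different sets of initial centroids. The function, the toy 2-D points and both initialisations are hypothetical; the second initialisation merely plays the role of a starting point derived from uncorrelated data. In general K-means only reaches a local optimum, so identical results are not guaranteed for arbitrary data.

```python
from math import dist


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd iterations from the given initial centroids."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            groups[idx].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


# Two hypothetical, well-separated groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]

from_data = kmeans(points, [points[0], points[1]])            # start taken from the data
from_reference = kmeans(points, [(0.5, 0.5), (9.5, 9.5)])     # externally supplied start

print(from_data)
print(from_reference)
```

For this toy data both runs end at the same pair of centroids, so the externally supplied starting point changes neither the clusters nor any score computed from them.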

Conclusion
In this work, we proposed novel source sensor attribution techniques based on the sensor's PRNU and applied existing ones. We generated multiple simulated data sets containing images from multiple sensors taken from the Dresden Image Database and computed different clustering quality metrics to evaluate the proposed techniques. The results showed that the size of the extracted PRNU has a significant impact on the clustering result. Two of the techniques, BCF and KM, were able to cluster the data properly, showed consistent and promising results for 256 × 256 PRNU patches, and were therefore considered appropriate for the source sensor attribution of biometric sensors.
Subsequently, all techniques were applied to biometric data sets with low-resolution images of two different biometric modalities, iris and FPs, to cluster the images according to their source sensor. Different PRNU enhancement techniques were adopted in response to the special characteristics of biometric data, such as highly correlated data and contamination of the PRNU by the image content, in order to improve the clustering performance. In addition, we used uncorrelated data acquired with the sensors and proposed several extensions of existing sensor attribution techniques so that this uncorrelated data can be used in conjunction with the source attribution techniques.
The evaluation of the various PRNU enhancement and uncorrelated data effects was conducted by means of a quantitative measure for the clustering outcome that considers the cohesion and separation of the clusters without requiring any knowledge about the underlying cluster ground truth. Summarising the results, it can be stated that most PRNU enhancements did indeed help to improve the clustering results compared to the original work in [21] by increasing the differentiability of the PRNU noise residuals. However, we could not identify any single enhancement technique or combination that improved the clustering outcome for all data sets alike; the choice of the best performing technique is highly situational. Furthermore, in most cases the clustering techniques did not succeed in determining the correct number of clusters for the FP data sets, even with the support of the different PRNU enhancement techniques.
For the FP data sets, the absent PRNU enhancement effect and the poor results clearly need further and deeper investigation. The insufficient quality of the extracted PRNU might be an issue in this case, caused either by the image content or by other contaminations or factors, e.g. the amount of denoising applied during the biometric sensor's processing of the acquired image. Since biometric sensors are often closed systems tailored to acquire a specific type of image, the identification of these issues is challenging. In conclusion, further studies have to be conducted on this matter with regard to the special requirements posed by biometric sensors and the data they produce. A fusion of the source sensor attribution techniques' clustering outcomes will also be investigated in future work.

Acknowledgment
This work was partially funded by the Austrian Science Fund (FWF) under project no. P26630 and partially supported by a COST 1106 Short-Term Scientific Mission (STSM).