Comparison and validation of surface topography segmentation methods for feature-based characterisation of metal powder bed fusion surfaces

Feature-based characterisation, i.e. the characterisation of surface topography based on the isolation of relevant topographic formations (features) and their dimensional assessment, is a developing field of surface texture metrology. Feature-based approaches provide dimensional assessments of individual features (area, width, height, etc) as well as statistical properties of feature aggregations (e.g. mean, standard deviation, etc), which may be more intuitive or related to functionality. For powder bed fusion surfaces, a commonly investigated feature of interest is the particles or spatter present on the surface. In this work, we address segmentation, a necessary step of feature-based characterisation, where the measured surface topography is spatially partitioned into regions to isolate the targeted features from their surroundings. Three topography segmentation methods are investigated: morphological segmentation on edges, contour stability analysis and active contours. To perform the comparison, three powder bed fusion surfaces obtained at differing build orientations (0°, 30° and 90°) and measured using focus variation microscopy are subjected to the three segmentation approaches - optimised to isolate spatter and particles on the surface. The comparison of the segmentation methods focuses on performance in feature identification (i.e. the capability to correctly detect the presence of features) and performance in feature boundary determination (i.e. the capability to correctly trace the boundaries of each feature). Results show that no segmentation method is consistently superior for all test cases, but the comparison approach is useful to explore and optimise segmentation alternatives for feature-based characterisation scenarios.


Introduction
Feature-based characterisation, i.e. the characterisation of the salient topographic formations (features) of a topography, is a developing field of surface metrology [1]. Whilst conventional characterisation of surface topography is based on assessing the properties of the entire measured field by computing texture parameters (e.g. ISO 25178-2 [1,2]), feature-based approaches typically target individual features (e.g. pores, scratches, particles, discontinuities and other singularities) and the characterisation of their geometric properties (e.g. area, width, height, depth) [1].
Feature-based characterisation requires topography segmentation, i.e. partitioning of the surface into regions. Segmentation may be performed in multiple ways by means of a wide range of algorithms, but ultimately, it should lead to a partitioning that delimits the features being targeted, separating them from their surroundings. Accurate identification of segment boundaries is, therefore, essential to determine feature localisation, extents, shape and size properties. Most of the accuracy of feature-based characterisation is tied to accuracy of the segmentation step [1]. However, for complex topographies, the identification of an optimal segmentation approach is usually far from trivial. A challenge which is typically encountered is that specialised user input is often required to determine what defines the feature of interest and, in turn, what criteria should be used to identify the exact transition boundaries between a feature and its surroundings. Even when the definition of the targeted feature is sufficiently clear, the lack of a reference result of what should constitute an optimal partitioning outcome, makes it challenging to assess whether a segmentation method/algorithm performed well.
In this work, a specific test case involving feature-based characterisation is considered. The case revolves around metallic surfaces fabricated via powder bed fusion (PBF), an additive manufacturing process that produces rough surfaces which often need finishing operations before they can fulfil their functional role. Additive surfaces are conventionally characterised by computation of texture parameters [3,4]. Parameters such as the ISO 4287 arithmetic mean deviation (Ra) [5] for profile-based surface characterisation, and the ISO 25178-2 arithmetic mean height of the scale-limited surface (Sa) [2] for area-based surface characterisation, are amongst the most popular choices for PBF surfaces, and can provide an effective, overall indication of which surfaces are 'rougher' in a comparison. In contrast, the appeal of feature-based approaches is that they provide the opportunity to decompose a surface into its relevant constituent topographic formations (features), and thus describe the surface itself in terms of the geometric attributes of such features [6]. As opposed to texture parameters, feature-based characterisation may indicate why one surface is 'rougher' than another, i.e. which topographic formations (features) contribute the most to the overall roughness of the surface. For the metal PBF test case, a typical matter of interest involves the identification and characterisation of spatter formations and unmelted particles present on the as-built topography (that is, before any finishing process). Spatter formations result from molten particles ejected during surface processing and deposited on the surface in the form of solidified aggregates [7].
Similar particle clusters (though not technically spatter) may be created by excess input energy from the melt pool, which can sinter loose powder adjacent to the build geometry in laser PBF (LPBF) [8]. In the electron beam PBF (EBPBF) process, a larger 'cake' region around the build geometry of the layer is deliberately sintered prior to melting the layer [9]. Forming a clear picture of the location, distribution, size, shape and other geometric properties of spatter, particles and particle clusters accumulated over an as-built PBF surface, for example as a function of surface orientation during the build process, helps to achieve an enhanced understanding of the manufacturing process and to assess the surface finishing challenges [10-13].
In a feature-based characterisation scenario involving metal PBF surfaces, where the target is the isolation and characterisation of spatter and particles on as-built surfaces, the choice of an appropriate segmentation method is paramount. As surface topography data is commonly available as height maps, i.e. matrices of height values (scalars) distributed along the rows and columns of a regular grid [1], potentially applicable segmentation methods are commonly found in the domain of image processing (a height map is mathematically equivalent to a grayscale digital image [1]). In this work, three methods of feature-based segmentation were investigated: morphological segmentation on edges [2,11] and active contours [14], both derived from the domain of image processing and recently adapted to operate on topography data, and contour stability analysis [10,12], an original method developed directly for areal topography data. In-house developed implementations of the three methods were applied to a selected set of surfaces belonging to the test case, and their performance was quantitatively compared.

Sample surfaces
To perform the comparison, three PBF surfaces obtained at differing build orientations (LPBF top (0°) surface, EBPBF angled (30°) surface and EBPBF side (90°) surface) were measured using focus variation microscopy [15,16], optimised for measurement following the work in [17], and subjected to the segmentation approaches which were optimised to isolate particles and spatter on the surface. The three surfaces were chosen as representative of a large range of scenarios, with the LPBF top surface typically featuring the least number of features, the EBPBF side surface featuring the most, and the angled EBPBF surface featuring an intermediate number of spatter and particles. Example measured topographies from the three samples are shown in figure 1.

Segmentation using contour stability analysis
Contour stability analysis was originally presented in reference [6], where it was applied to identify particles and spatter features on LPBF surfaces. Contour stability analysis is essentially an edge detection method that privileges sharp transitions, so it preferentially works for features delimited by steep 'walls', which applies to most particles/spatter formations in the test case. In contour stability analysis, the measured topography is sectioned by a series of slicing planes at decreasing height starting from the top. Each slicing plane results in a series of cross-sectional contours. Each contour is tracked as its shape changes moving down through the sequence of slicing planes. Those contours that change minimally (i.e. within a small, predefined threshold), are defined as stable, and are representative of steep feature boundaries in the original, sectioned topography. In order to efficiently track multiple contours across many slicing planes, the method implements a spatial binning process for the contour maps and considers as stable those portions of contours that do not exit their original bins. Once the more stable contours are identified and cleaned via a sequence of morphological operations, those forming closed loops are extracted and used to isolate features [6].
In this work, contour stability analysis was implemented for particle and spatter detection with the following parameters.
(a) An S-filter of nesting index 8 μm and an L-filter of nesting index 250 μm were applied to remove noise and the underlying large-scale waviness which may confuse the contour stability analysis.
(b) Contour stability was run with the following settings: the threshold for maximum lateral movement of contour points across slicing planes was set to 2 μm over a vertical range of 5 μm (computed with a series of vertically stacked slicing planes set 0.25 μm apart); to connect the edges detected by the algorithm into closed regions that represent feature objects on the surface, morphological dilation and erosion with a three-pixel structuring element were applied to the binary segmentation mask.
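The stability criterion in (b) can be illustrated on a single profile (a 1D cross-section) rather than on full areal contours. The following is a minimal sketch under that simplification: the function names and the synthetic profile are hypothetical, and the spatial binning and morphological clean-up of the actual method are not reproduced. Only the 2 μm, 5 μm and 0.25 μm settings come from the text.

```python
def crossings(profile, dx, level):
    """x positions (in µm) where the profile crosses `level` (linear interpolation)."""
    xs = []
    for i in range(len(profile) - 1):
        z0, z1 = profile[i] - level, profile[i + 1] - level
        if (z0 < 0.0) != (z1 < 0.0):  # sign change between neighbouring samples
            xs.append((i + z0 / (z0 - z1)) * dx)
    return xs


def stable_crossings(profile, dx, top_level, max_shift=2.0, v_range=5.0, step=0.25):
    """Crossings at `top_level` that move laterally by at most `max_shift`
    while the slicing plane descends by `v_range` in increments of `step`."""
    n_planes = int(round(v_range / step))
    stable = []
    for x0 in crossings(profile, dx, top_level):
        x_prev, is_stable = x0, True
        for k in range(1, n_planes + 1):
            xs = crossings(profile, dx, top_level - k * step)
            if not xs:
                is_stable = False
                break
            # Follow the same edge: take the crossing closest to the previous one.
            x_prev = min(xs, key=lambda x: abs(x - x_prev))
            if abs(x_prev - x0) > max_shift:
                is_stable = False
                break
        if is_stable:
            stable.append(x0)
    return stable


# Synthetic profile, 1 µm sampling: a gentle bump followed by a steep-walled particle.
gentle_bump = [0.2 * i for i in range(50)] + [10.0 - 0.2 * i for i in range(51)]
particle = [5.0] + [10.0] * 11 + [5.0]
profile = gentle_bump + [0.0] * 20 + particle + [0.0] * 20

edges = stable_crossings(profile, dx=1.0, top_level=8.1)
print(edges)  # only the two steep particle walls are reported as stable
```

In this sketch the flanks of the gentle bump drift by more than 2 μm after two slicing planes and are rejected, whereas the particle walls move by about 1 μm over the full 5 μm range and are kept.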

Morphological segmentation on edges
Morphological segmentation consists of partitioning the topography into hills or dales, as described in ISO 25178-2 [2] and elsewhere [18]. Hills are areas from which maximum uphill paths lead to one specific peak and dales are areas from which maximum downhill paths lead to one specific pit. As a rough surface will typically result in a multitude of hills or dales, methods have been defined to simplify the partitioning by aggregating individually less relevant (i.e. smaller) hills or dales to larger ones [18]. The most widespread aggregation methods are area pruning and Wolf (i.e. height) pruning [18], respectively based on merging hills/dales with smaller footprint areas, or smaller local height/depth, to larger ones. Morphological segmentation into hills/dales can be performed using a variant, specifically designed to detect edges [17,18]. In this variant, an artificial topography is created, containing the absolute values of the local slope of the original topography. This topography is partitioned with dale-based segmentation [19,20]. The method is colloquially referred to as morphological segmentation on edges [16] because local concentrations of large slopes (visible as dale crests in the absolute slope map) are typically representative of edges in the original topography. Morphological segmentation on edges was recently applied to the identification of spatter and un-melted particles in PBF surfaces [11], using the following steps.
(a) An L-filter with nesting index 250 μm is applied to suppress large scale (waviness) components on the surface. As for contour stability analysis, this step is designed to remove topography components which may confuse the actual segmentation algorithm.
(b) Sobel operators are applied to produce a gradient magnitude map of the surface (particles and spatter would possess high gradients around the edge of the feature).
(c) The gradient magnitude map is taken as absolute value (that is, negative slopes are turned into positive, so that high-slope regions appear as crests surrounding low-slope regions, i.e. dales, in the gradient map). Finally, morphological segmentation into dales is applied, and the segmented regions are pruned by thresholding the heights above three standard deviations of the mean height to isolate the topmost regions of the segmentation map, which are most likely to correspond to protruding formations such as spatter and particles.
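The dale partitioning that underlies these steps (each point is assigned to the pit reached by its maximum downhill path) can be sketched as follows. This is a simplified illustration, not the implementation used in the cited works: the function name is hypothetical, 8-connectivity is an assumption, area or Wolf pruning is omitted, and flat plateaus are not treated specially (a pixel with no strictly lower neighbour is taken as a pit). In the method described above, the input would be the gradient magnitude map rather than the height map.

```python
def segment_into_dales(height_map):
    """Label each pixel with the pit reached by following the steepest downhill path."""
    rows, cols = len(height_map), len(height_map[0])

    def steepest_lower_neighbour(r, c):
        best, best_drop = None, 0.0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                    dist = (dr * dr + dc * dc) ** 0.5
                    drop = (height_map[r][c] - height_map[rr][cc]) / dist
                    if drop > best_drop:
                        best, best_drop = (rr, cc), drop
        return best  # None means (r, c) has no lower neighbour, i.e. it is a pit

    labels = [[None] * cols for _ in range(rows)]
    pit_ids = {}
    for r in range(rows):
        for c in range(cols):
            path, pos = [], (r, c)
            # Walk downhill until a pit or an already labelled pixel is reached.
            while labels[pos[0]][pos[1]] is None:
                path.append(pos)
                nxt = steepest_lower_neighbour(*pos)
                if nxt is None:
                    pit_ids[pos] = len(pit_ids) + 1
                    labels[pos[0]][pos[1]] = pit_ids[pos]
                    break
                pos = nxt
            label = labels[pos[0]][pos[1]]
            for pr, pc in path:
                labels[pr][pc] = label
    return labels


# Two pits separated by a crest (the column of 5s).
example_map = [
    [3, 2, 3, 5, 3, 2, 3],
    [2, 1, 2, 5, 2, 1, 2],
    [3, 2, 3, 5, 3, 2, 3],
]
dale_labels = segment_into_dales(example_map)
for row in dale_labels:
    print(row)
```

Crest pixels are assigned to whichever dale their steepest descent reaches first, so in this sketch segment boundaries run along the crests, which is where edges sit in a gradient magnitude map.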
In this work, morphological segmentation on edges was implemented for particle and spatter detection, following the method proposed in reference [4]. The following parameters were adopted.
(a) An S-filter with nesting index at 8 μm and an L-filter with nesting index 250 μm were applied to extract the roughness surface.
(b) Sobel operators were applied to produce the gradient magnitude map, later turned into absolute values.
(c) Morphological segmentation into dales was applied.
(d) Threshold-based isolation of the top-most regions, with varying thresholds depending on the surface condition.
For the LPBF top surface, the threshold was set at the mean height plus one standard deviation; for the EBPBF angled surface, at the mean height plus half a standard deviation; and for the EBPBF side surface, at the mean height. These values were chosen as optimised to isolate the top-most regions corresponding to the protruding formations.
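Steps (b) and (d) can be sketched in a few lines. The code below is an illustration only: the function names are hypothetical, border pixels are handled by replicating edge values (an assumption), the Sobel response is not normalised by pixel spacing, and the threshold is applied pixel-wise rather than to dale regions, since the exact way the threshold is combined with the segmented regions is not detailed here. The multipliers of the standard deviation (1, 0.5 and 0) are those quoted above.

```python
import math
from statistics import mean, pstdev

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_gradient_magnitude(height_map):
    """Gradient magnitude from 3x3 Sobel kernels (non-negative by construction)."""
    rows, cols = len(height_map), len(height_map[0])

    def at(r, c):  # replicate border values outside the map
        return height_map[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            gx = sum(SOBEL_X[i][j] * at(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * at(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            out[r][c] = math.hypot(gx, gy)
    return out


def top_region_mask(height_map, k):
    """True where the height exceeds mean + k * standard deviation."""
    values = [z for row in height_map for z in row]
    threshold = mean(values) + k * pstdev(values)
    return [[z > threshold for z in row] for row in height_map]


# Multipliers of the standard deviation reported for each test surface.
K_BY_SURFACE = {"LPBF top": 1.0, "EBPBF angled": 0.5, "EBPBF side": 0.0}

# Flat 7x7 map with a raised 3x3 block standing in for a particle.
demo_map = [[10.0 if 2 <= r <= 4 and 2 <= c <= 4 else 0.0 for c in range(7)]
            for r in range(7)]
gradient = sobel_gradient_magnitude(demo_map)
mask = top_region_mask(demo_map, K_BY_SURFACE["LPBF top"])
```

The gradient is large around the block edge and zero both on the flat background and on top of the block, which is why the edge ring appears as a crest enclosing a dale in the gradient map.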

Segmentation using active contours
Active contours is a method that, starting from an initial guess, iteratively refines the position of a closed contour that is meant to separate the region of interest from its surroundings [14,21-24]. Strictly speaking, active contours should be regarded as an edge refinement method, not an edge identification method. However, the method is always paired with an initial contour rough-guessing method, so that the combination of both steps represents an actual edge identification. Starting from the initial guess, active contours makes use of mathematical models that mimic energy minimisation to iteratively move the contour towards its most stable position (moving outwards or inwards, depending on the variant). The final stable position is assumed to be the boundary of the feature being isolated.
In this work, active contours was implemented for particle and spatter detection by subjecting the surface topography to an L-filter with nesting index 70 μm. The index was chosen based on the approximate size of individual particles, to remove larger topographical features. On the L-filtered surface, a thresholding operation was applied at 90% of the height range on the surface for the LPBF top surface and the EBPBF angled surface, and at 70% of the height range for the EBPBF side surface. The thresholding is designed to isolate the top-most regions of the filtered topography, which are most likely to correspond to protruding formations such as spatter and particles. On the resulting threshold map, topologically disconnected isles were isolated, and those whose size, horizontal aspect ratio or height (in the corresponding height map) was not consistent with the known geometric attributes of typical spatter and particles were filtered out. Boundaries extracted from the final isle map were individually used as initial contour rough guesses for running the active contours algorithm. Active contours was run using 100 iterations, and the geodesic 'edge' method [21] with negative contraction bias (leading to outwards growth of the contour). The output of each run was a segmentation mask that could be applied to the surface topography to isolate the spatter/particle features.
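The generation of the initial rough guesses can be sketched as below. This is an illustrative sketch, not the in-house implementation: the function names are hypothetical, "90% of the height range" is interpreted as the minimum height plus 0.9 times the range, isles are taken as 8-connected regions, and only the size filter is shown (the aspect ratio and height filters, and the contour evolution itself, are omitted).

```python
def threshold_at_range_fraction(height_map, fraction):
    """True where the height is at or above min + fraction * (max - min)."""
    values = [z for row in height_map for z in row]
    z_min, z_max = min(values), max(values)
    level = z_min + fraction * (z_max - z_min)
    return [[z >= level for z in row] for row in height_map]


def label_isles(mask):
    """Label 8-connected True regions; returns a map of IDs (0 = background)."""
    rows, cols = len(mask), len(mask[0])
    ids = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and ids[r][c] == 0:
                next_id += 1
                ids[r][c] = next_id
                stack = [(r, c)]
                while stack:  # flood fill from the seed pixel
                    cr, cc = stack.pop()
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, nc = cr + dr, cc + dc
                            if (0 <= rr < rows and 0 <= nc < cols
                                    and mask[rr][nc] and ids[rr][nc] == 0):
                                ids[rr][nc] = next_id
                                stack.append((rr, nc))
    return ids


def drop_small_isles(ids, min_pixels):
    """Set to background every isle with fewer than `min_pixels` pixels."""
    counts = {}
    for row in ids:
        for i in row:
            if i:
                counts[i] = counts.get(i, 0) + 1
    return [[i if i and counts[i] >= min_pixels else 0 for i in row] for row in ids]


filtered_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 10.0, 10.0, 0.0, 0.0, 0.0],
    [0.0, 10.0, 9.5, 0.0, 0.0, 9.2],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
isles = label_isles(threshold_at_range_fraction(filtered_map, 0.9))
seeds = drop_small_isles(isles, min_pixels=2)
```

Each surviving isle in `seeds` would then provide one initial contour for the active contours run.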

Comparison of segmentation methods
A performance comparison of the segmentation methods applied to the test cases was carried out by designing a series of quantitative performance indicators. The performance aspects targeted by the indicators were (a) the feature identification capability, i.e. the capability of producing segments containing the targeted features; and (b) the accuracy in feature boundary identification, i.e. the capability of segmentation to place segment boundaries corresponding to actual feature boundaries.
All the quantitative performance indicators assume the availability of a reference, ideal segmentation result, where each targeted feature is appropriately represented by a segment, and where the feature boundaries exactly correspond to the segment boundaries. The reference segmentation result can be compared with the result of each investigated method, to assess performance of the latter. However, due to lack of an optimal segmentation method whose performance is recognised as ideal for the selected test case, a segmentation result was hand-drawn for each test surface to act as a comparison reference. Clearly though, adopting a reference result created by a human operator is susceptible to bias and repeatability/reproducibility issues because of the presence of subjective assessment [25].
The performance indicators illustrated in the following assume that any segmentation result, whether generated by one of the compared methods or manually generated by the operator, is available in the form of a map of identifiers (IDs). A map of IDs is a grid of ID values, the same size as the original height map, so that each location (map point) in the height map is univocally associated to one and only one segment (the one represented by the ID value associated to that point). To maintain the useful parallel to digital images, map points will be referred to as 'pixels' from now onwards.
The following definitions for the quantitative performance indicators assume the pre-processing steps shown in figure 2.
For each subset of the height map containing an individual feature (figure 2(a)), the corresponding reference, ideal segmentation result is assumed available (figure 2(b)). Notice that in the ideal result, the feature has been identified (i.e. there is a segment covering the region occupied by the feature), and the feature boundary has been correctly localised (i.e. the segment boundaries coincide with the actual feature boundaries). In figure 2(c), the result of a segmentation algorithm to be evaluated is shown. Clearly, whilst a segment has been created approximately corresponding to the position of the actual feature (i.e. successful identification), the segment boundaries do not correspond exactly to the feature boundaries, leading to different statuses associated with the segment pixels, depending on where they fall with respect to the actual feature (figure 2(d)). Such statuses can be derived from the results of a binary classifier and are summarised in table 1.
The following performance indicators, originally devised for binary classifiers, can be adopted to describe the performance of segmentation with respect to an individual feature.
$$\text{Precision} = \frac{TP}{TP + FP} = \frac{\text{no. feature pixels}}{\text{no. feature pixels} + \text{no. excess pixels}} \qquad (1)$$
High precision implies a low number of excess pixels; 100% precision implies zero excess pixels.
$$\text{Recall (sensitivity, true positive rate, TPR)} = \frac{TP}{TP + FN} = \frac{\text{no. feature pixels}}{\text{no. feature pixels} + \text{no. missing pixels}} \qquad (2)$$
High recall implies a low number of missing pixels; 100% recall implies zero missing pixels.
$$\text{Specificity (selectivity, true negative rate, TNR)} = \frac{TN}{TN + FP} = \frac{\text{no. background pixels}}{\text{no. background pixels} + \text{no. excess pixels}} \qquad (3)$$
Similar to the concept of metrological precision, high specificity implies a low number of excess pixels (i.e. pixels wrongly recognised as belonging to the feature). However, unlike precision, the viewpoint is the identification of the background.
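As a worked illustration of the three per-feature indicators, the sketch below counts TP, FP, TN and FN pixels from two binary masks of equal size and evaluates precision, recall and specificity. The function names and the small example masks are hypothetical; returning NaN for an empty denominator is a choice made here, not something prescribed by the text.

```python
def confusion_counts(reference, segmented):
    """Count (TP, FP, TN, FN) pixels from two binary masks of equal size."""
    tp = fp = tn = fn = 0
    for ref_row, seg_row in zip(reference, segmented):
        for ref, seg in zip(ref_row, seg_row):
            if seg and ref:
                tp += 1  # feature pixel
            elif seg and not ref:
                fp += 1  # excess pixel
            elif ref:
                fn += 1  # missing pixel
            else:
                tn += 1  # background pixel
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    return numerator / denominator if denominator else float("nan")


def precision(tp, fp):
    return ratio(tp, tp + fp)


def recall(tp, fn):
    return ratio(tp, tp + fn)


def specificity(tn, fp):
    return ratio(tn, tn + fp)


# Reference feature (2x2) and a segment shifted one pixel to the right.
reference = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
segmented = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
tp, fp, tn, fn = confusion_counts(reference, segmented)
print(precision(tp, fp), recall(tp, fn), specificity(tn, fp))  # 0.5 0.5 0.875
```

With two feature, two excess, two missing and fourteen background pixels, half of the segment is correct and half of the feature is covered, yet specificity remains high because background pixels dominate the window.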
The above indicators can be computed for each individual feature and its associated portion of the segmentation map. Once repeated for all the individual features present on a test surface, they can be aggregated into performance statistics (e.g. mean and standard deviation of each indicator). The indicators provide information intuitively related to metrological accuracy in feature boundary identification. To quantify performance in feature identification, the number of features that have no corresponding pixels in the segmentation map (i.e. the number of totally ignored features) is counted and compared to the total number of features present in the analysed region (from the reference segmentation result). The following ratio is defined:
$$\text{Identification error ratio} = \frac{\text{no. ignored features}}{\text{no. total features}} \qquad (4)$$
The identification performance is defined as the complement of the identification error ratio:
$$\text{Identification performance} = \frac{\text{no. identified features}}{\text{no. total features}} = \frac{\text{no. total features} - \text{no. ignored features}}{\text{no. total features}} \qquad (5)$$
In addition, the indicators applied to individual features can be applied to all pixels within the image to offer a complementary assessment of the segmentation approaches with respect to the whole surface. This is done by comparing the binary maps. Alongside this, an indicator for the accuracy of the segmentation approach can be used to describe the performance with respect to the whole image:
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} = \frac{\text{no. feature pixels} + \text{no. background pixels}}{\text{total no. pixels}} \qquad (6)$$
Accuracy provides an overall view of classification performance. However, the result is skewed by the different numbers of feature and background pixels in the analysed region of the segmentation map. Therefore, the following balanced form can be adopted:
$$\text{Balanced accuracy} = \frac{TPR + TNR}{2} \qquad (7)$$
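The skew can be seen with a small worked example. The sketch below assumes the common definition of balanced accuracy as the mean of recall (TPR) and specificity (TNR); the function names and pixel counts are hypothetical, and the functions assume that both feature and background pixels are present in the reference.

```python
def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def balanced_accuracy(tp, fp, tn, fn):
    tpr = tp / (tp + fn)  # recall
    tnr = tn / (tn + fp)  # specificity
    return (tpr + tnr) / 2


# 1000 pixels, 50 of which belong to features; the segmentation detects nothing.
empty_result = dict(tp=0, fp=0, tn=950, fn=50)
print(accuracy(**empty_result))           # 0.95
print(balanced_accuracy(**empty_result))  # 0.5
```

A segmentation that misses every feature still reaches 95% accuracy on such a sparse surface, whereas the balanced form drops to 0.5.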

Results
Table 1. Classification of pixels in the segmentation map depending on whether they correspond to features or background pixels.
| Class | Description | Short name (feature-centric) |
| --- | --- | --- |
| TP (true positive) | 1-valued segmentation pixel overlaid on a feature pixel in the height map | feature pixel |
| FP (false positive) | 1-valued segmentation pixel overlaid on a background pixel in the height map | excess (feature) pixel |
| TN (true negative) | 0-valued segmentation pixel overlaid on a background pixel | background pixel |
| FN (false negative) | 0-valued segmentation pixel overlaid on a feature pixel | missing (feature) pixel |

In figures 3, 6 and 9, segmentation maps are shown for the different segmentation methods applied to the LPBF top, EBPBF angled and EBPBF side surfaces, respectively. Pixels are coloured based on the comparison with the ideal reference classification results. Note that the coloured maps have been obtained by aggregating the comparison results obtained for each individual feature (particle or spatter). Missing (feature) pixels are shown in blue; thus, particles/spatter features that are shown as entirely blue are ignored features that reduce the overall identification performance of the method. In contrast, identified (or partially identified) features are marked with yellow pixels (feature pixels). For those, excess pixels (orange) and missing pixels (blue) provide an indication of boundary detection performance. Figures 4, 7 and 10 show boxplots of all the performance metrics calculated on individual objects that are found in both the reference and the segmentation result for each surface. Each object is considered as a whole regardless of how small the overlap of matching pixels might be; this is to determine how effective the segmentation is for specific features on the surface. Figures 5, 8 and 11 show the values of the performance parameters calculated over the whole surface as determined by the binary classification tests shown in figures 3, 6 and 9. These results only give a general assessment of the segmentation and do not consider the effectiveness of boundary detection.
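The matching rule described above (a reference feature counts as identified as soon as any segmented pixel overlaps it) leads directly to the identification performance. The following is a minimal sketch under that rule; the function name and the example maps are hypothetical, and the reference is assumed to be a map of IDs with 0 for background.

```python
def identification_performance(reference_ids, segmentation_mask):
    """Fraction of reference features overlapped by at least one segmented
    pixel, together with the sorted IDs of the ignored features."""
    all_features, identified = set(), set()
    for ref_row, seg_row in zip(reference_ids, segmentation_mask):
        for feature_id, seg in zip(ref_row, seg_row):
            if feature_id:
                all_features.add(feature_id)
                if seg:
                    identified.add(feature_id)
    ignored = sorted(all_features - identified)
    return len(identified) / len(all_features), ignored


reference_ids = [
    [1, 1, 0, 0, 2],
    [1, 1, 0, 0, 2],
    [0, 0, 0, 3, 3],
]
segmentation_mask = [
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
performance, ignored = identification_performance(reference_ids, segmentation_mask)
print(performance, ignored)  # two of three features identified; feature 2 ignored
```

Feature 1 is matched through a single overlapping pixel, which is why identification performance says nothing about boundary quality and is reported separately from precision, recall and specificity.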

LPBF top surface
For the LPBF top surface (figure 3), where there is an expected low number of features (particles and spatter) on the surface, the morphological segmentation on edges (figure 3(a)) resulted in the lowest identification performance of 0.087 (about 9% identified features). Contour stability identified more features (figure 3(b)) with an identification performance of 0.621. Active contours resulted in the highest identification performance (figure 3(c)) with a score of 0.776. Figure 4 shows specificity, precision and recall calculated on the matched features for the LPBF top surface. For morphological segmentation on edges, the boxplots were calculated for nine matched features. The boxplot for contour stability was calculated on sixty-four matching features and active contours was calculated on eighty matching features.
Morphological segmentation on edges has the highest scores for precision and specificity, both with low dispersion. For precision, contour stability and active contours have higher dispersion whilst contour stability possesses a much higher median value very close to unity. Active contours has the highest scores and significantly low dispersion for recall, performing better than the others, suggesting that it is often the best approach to identify most of the spatter formations and particles present on the surface. All three approaches result in high scores and low dispersion for specificity, with all values greater than 0.99.
As shown in figure 5 for the whole surface, morphological segmentation on edges possesses the lowest scores for balanced accuracy and recall. However, morphological segmentation on edges results in the highest values for precision and specificity. Contour stability appears to be less precise than morphological segmentation on edges. Otherwise, the performance parameters for contour stability fall between the two other approaches with a comparably higher score for specificity. Active contours results in a high score for recall, suggesting that it identifies many of the features found in the reference, consistent with the highest score for accuracy. However, lower scores of precision and specificity suggest that active contours generally overestimates both the size and number of relevant features, a result that can be visualised in figure 3.

EBPBF angled surface
The EBPBF angled surface (figure 6) features an increased number of particles and spatter formations with respect to the LPBF top surface. Morphological segmentation on edges (figure 6(a)) resulted in an identification performance of 0.574. Contour stability resulted in the lowest score for this surface (figure 6(b)), with an identification performance of 0.465. Active contours resulted in the highest identification performance (figure 6(c)), with a score of 0.929. Figure 7 shows the boxplots for individual matching features between the reference and segmentation results for the EBPBF angled surface. For morphological segmentation on edges, the boxplots were calculated for seventy-three matched features. The boxplots for contour stability were calculated on fifty-nine matching features, whilst for active contours they were calculated on 118 matching features. As shown in figures 1 and 6, some particles appear adhered to layer edges and others form groups of particle clusters.
In figure 7, a relatively high dispersion of the scores for all segmentation methods is observed when compared to the top surface, with the exception of contour stability, likely due to the difficulty in defining contours on particle clusters that have low gradients. Morphological segmentation on edges results in high scores for precision and specificity but lower scores for recall. Contour stability possesses the highest scores and lowest dispersion for both precision and specificity. Active contours results in the highest score and lowest dispersion for recall, with the interquartile range (IQR) above 0.8, following a similar trend as previously observed for the LPBF surface (figure 4).
For the performance indicators calculated over the whole image, morphological segmentation on edges, as shown in figure 8, resulted in the highest value of balanced accuracy and very high values for precision and recall. Contour stability still has reasonably high values for precision and recall, but with a much lower specificity leading to a lower balanced accuracy. Active contours, whilst having a high recall and a good precision, has the lowest specificity suggesting that, whilst there was good agreement between the method and the reference, there was still some over-estimation of features.

EBPBF side surface
The EBPBF side surface (figure 9) featured the highest number of spatter formations and particles, with an increased occurrence of particle clusters. Morphological segmentation on edges (figure 9(a)) resulted in the lowest identification performance with a value of 0.313. Contour stability identified more features (figure 9(b)) with an identification performance of 0.417. Active contours resulted in the highest identification performance (figure 9(c)) with a score of 0.600.
The boxplots for the individual matching features for the EBPBF side surface are shown in figure 10. Due to the further increase in the number of features and increased presence of agglomerations, there appears to be an even greater dispersion for most of the performance metrics across the segmentation approaches. For morphological segmentation on edges, the boxplots were calculated for 36 matched features. The boxplots for contour stability were calculated on 48 matching features, whilst those for active contours were calculated on 69 matching features.
Figure 6. Binary classification test results between the manual reference and the segmentation approaches for the EBPBF angled surface. Yellow denotes matching pixels, orange represents excess pixels and blue denotes missing pixels.
In figure 10, morphological segmentation on edges results in the highest values and lowest dispersion for precision and specificity, with both contour stability and active contours possessing higher dispersion and lower median scores. For recall, there is a very large dispersion for active contours, and its median is significantly lower than that of contour stability. Whilst contour stability has the highest score for recall, it also has high dispersion. In addition, contour stability possesses the largest IQR for specificity, which is also larger than any specificity IQR found on the other surfaces considered.
The performance indicators calculated over the whole image can be seen in figure 11. Morphological segmentation on edges appears to have performed the best for the EBPBF side surface, with the highest value of balanced accuracy and very high values for precision and recall. Contour stability still has reasonably high values for precision and recall, but with a much lower specificity leading to a lower balanced accuracy (not shown). Active contours, whilst having a high recall and a good precision, has the lowest specificity, suggesting overestimation of feature size.

Summary of the comparison results
For identification performance, active contours scored the highest across all surface cases, with morphological segmentation on edges resulting in the lowest scores for the LPBF top surface and the EBPBF side surface. Contour stability performed reasonably for all surface cases. It appears that the EBPBF angled surface was the easiest surface to segment, as reflected by the highest identification scores for morphological segmentation on edges (0.574) and active contours (0.929), although contour stability scored higher on the LPBF top surface (0.621 against 0.465). Conversely, the lowest score for identification performance was observed for morphological segmentation on edges on the LPBF top surface.
The results for the individual matched objects (particles, spatter and particle cluster formations) show a trend for higher dispersion of the scores (lower agreement) with increasing complexity (from LPBF top surfaces to EBPBF side surfaces), which suggests that, as the individual features on the surface get more complex, all segmentation approaches find it more difficult to identify features that agree with the reference segmentation. Consistently, morphological segmentation on edges reported high scores and a lower dispersion for precision and specificity, whilst active contours generally had the higher scores for recall. Contour stability generally performed better than active contours for precision and specificity, and showed scores for recall improving with increasing complexity of the surface.
When comparing the result for the whole surface, the increasing complexity and number of features is reflected in the balanced accuracy, with the lowest values found on the EBPBF side surface and the highest scores found on the LPBF top surface. The lowest values for precision were found on the LPBF top surface where all methods were unable to identify many of the spatter and particle features.
[Figure: Binary classification test results between the manual reference and the segmentation approaches applied to the EBPBF side surface. In the figure, yellow denotes matching pixels, orange represents excess pixels and blue denotes missing pixels.]
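The matching, excess and missing pixel classes used in the binary classification figure can be derived directly from the reference and result masks. The sketch below is one possible way to build such a label map, assuming boolean masks of equal shape; the label codes and the function name are hypothetical, and the mapping of labels to colours is left to the plotting step.

```python
import numpy as np

# Hypothetical label codes; 0 is reserved for background agreed by both masks.
MATCHING, EXCESS, MISSING = 1, 2, 3


def classification_map(reference, result):
    """Label each pixel as background, matching, excess or missing."""
    reference = np.asarray(reference, dtype=bool)
    result = np.asarray(result, dtype=bool)
    labels = np.zeros(reference.shape, dtype=np.uint8)
    labels[reference & result] = MATCHING   # feature in both masks
    labels[~reference & result] = EXCESS    # feature only in the result
    labels[reference & ~result] = MISSING   # feature only in the reference
    return labels


# Toy example: the result is shifted one pixel to the right of the reference.
reference = np.array([[0, 1, 1, 0],
                      [0, 1, 1, 0]], dtype=bool)
result = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]], dtype=bool)
labels = classification_map(reference, result)
```

Counting the pixels in each class of such a map gives the true positive, false positive and false negative totals from which whole-image indicators can be computed.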
In general, there is a trade-off between recall and specificity across all three methods. Active contours is generally a good approach when recall is the priority, combining high recall with low precision and specificity. On the contrary, morphological segmentation on edges, whilst leading to lower recall, possesses higher scores for precision and specificity. Essentially, active contours will find all objects found in the reference, but at the cost of oversizing feature boundaries. Morphological segmentation on edges may struggle to identify all objects in the reference but will more closely track the object boundaries. Contour stability falls between the other two methods in terms of performance, but is particularly weak when confronted with agglomerated particles, resulting in lower scores for recall on surfaces where these types of features are present.

Limits of the segmentation validation method
A reference segmentation result is necessary to compute the performance indicators that have been proposed in this work. However, in the absence of an ideal segmentation method to use as a reference, the use of a manually obtained segmentation result has been suggested. Clearly, the reliance on a result obtained by a human operator is prone to subjective bias, and such bias is not only operator-dependent, but may also be application-dependent, as operators may find different challenges when processing different types of surfaces. Visual understanding of reconstructed, digital topographies is affected by many confounding factors, including the influence of measurement error and, for complex topographic objects, the lack of a definition of what an actual feature may look like or, even worse, of how exactly a feature boundary may be identified. In addition, performance degrades with increased feature count, as the operator is more likely to make errors when large numbers of features must be manually assessed. These issues should be considered when assessing the reliability and reproducibility of the results presented in this paper. Regardless, the goal of this research is to highlight the need for a quantitative evaluation of segmentation performance, and several relevant, quantitative indicators have been provided.

Additional computational costs of segmentation
The segmentation methods illustrated in this work have been compared solely in terms of their performance on a specific test case. Such performance is normally not obtained out of the box: each segmentation method requires a long tuning process to perform optimally on each class of surfaces and target features. When choosing a segmentation method, the number and complexity of actions and decisions involved in tuning the method for the test case should be considered as well.
For example, morphological segmentation on edges run with default parameters will always result in over-segmentation, even if the original topography is only moderately complex [19]. Subsequent post-processing is typically required to reduce the number of segments, and this in turn requires the careful tuning of several parameters (as described in section 2.3). Contour stability also requires careful tuning of several parameters. In addition, contour stability was designed to preferentially address steep edges, and performs relatively weakly when encountering locally smooth gradients, such as those observed with agglomerated particles in the test case (see section 3.2). Active contours is possibly the approach requiring the most involved set-up, in particular because of the need to perform a rough guess of the initial contours, which requires a whole new topography pre-processing step. Ultimately, the choice of a segmentation method may also be dictated by the complexity of its set-up and fine-tuning, which in turn may be affected by application-dependent circumstances.
Other challenges have been found to be consistently shared across applications. For example, for all the test cases and all the methods investigated, filtering was required to remove larger-scale topographic formations which can confuse the segmentation process. Though this aspect has not been covered in detail in this paper, the identification of optimal filtering parameters is often challenging and still subject to trial and error. Initial set-up is important for any segmentation approach, and the user should be experienced with both surface characterisation and the processes that produce the surface in order to meaningfully determine the features being assessed.

Measurement uncertainty for feature-based segmentation and characterisation
Measurement uncertainty for feature-based segmentation and characterisation should be provided, just as measurement uncertainty for areal topography datasets has been previously investigated [26][27][28]. Estimation of uncertainty in feature-based characterisation is an important challenge to the adoption of these methods, requiring an understanding of the influence factors associated with the measured topography data, as well as of how measurement error may propagate through the various stages of the segmentation and characterisation.

Conclusions
Feature-based characterisation is a way to assess surface topography that is complementary to texture parameters, and in some cases may provide richer information content, as features can be defined that more closely match the subject of interest in any specific surface investigation scenario.
Segmentation, the act of partitioning a surface topography into regions (segments), plays a fundamental role in feature identification. In particular, the accuracy of a segmentation method at identifying region boundaries directly influences the accuracy in the assessment of a feature's geometrical properties.
This paper presents a method to compare segmentation results and quantitatively assess their performance from different viewpoints, related to both the capability of identifying features and the capability to accurately delimit feature boundaries. The method is based on computing a series of quantitative performance indicators and requires a reference (ideal segmentation result) against which to compare. In the absence of an ideally performing, algorithmic segmentation method acting as a reference, the ideal result is currently produced manually by an expert operator. Manual generation may create issues of reproducibility, especially on complex surfaces with many features. However, if multiple segmentation methods are compared with each other using the same reference result, the method can provide a comparative, comprehensive assessment of segmentation performance.
Future work will use the proposed methodology to compare segmentation methods and settings in order to optimise them for specific features, such as particles and particle clusters, and will further develop segmentation approaches to target different features present on additively manufactured surfaces.