Performance evaluation of a magnetic field measurement NDE technique using a model assisted Probability of Detection framework

Receiver Operating Characteristics (ROC) are a powerful tool used to evaluate the performance of NDE methods; however, the need to manufacture and scan many test pieces with realistic defects means that they are expensive and time-consuming to produce. Advances in computational power now mean that it is possible to use numerical models to greatly increase the efficiency of producing ROC for practical applications. A Model Assisted Probability of Detection (MAPOD) framework has been developed to predict the performance of magnetic field measurement NDE techniques. The MAPOD method is used to predict the performance of a promising new technique relying on the deflection of a current injected into a pipe at remote locations, and measurement of the resulting magnetic field perturbations due to defects. A significant proportion of pipes cannot be inspected by pigging methods, and external inspection often requires complete coating removal; therefore, an NDE method that functions outside pipe coatings and cladding is attractive. In this method, changes in the radial and axial components of the field are measured and attributed to defects, but a strong azimuthal component means that misalignment can give significant apparent radial and axial signals due to the azimuthal field apparently having a component in these directions. This requires that the second-order gradient of the magnetic field be measured to maximise sensitivity. Fluctuations in the sensitivity and orientation of the gradiometer during the scan are expected to determine the maximum sensitivity of the technique in most practical applications; however, the flexibility of the framework allows performance to be rapidly predicted and quantified for many test scenarios. Results suggest good detection performance for defects greater than 15% of the wall thickness (T = 7.1 mm) in a 6-inch pipe with 2 A (200 A/m) current injected when measuring above typical insulation thickness (25–50 mm).


Introduction
Corrosion has been established as the major cause of failure in the petrochemical industry. The most common way of screening for corrosion in oil and gas pipelines is with the use of so-called "smart pigs", tools that are passed through the body of the pipe and consist of sensors that are able to provide a direct indication of the wall condition of the pipe. The choice of NDE technique deployed on a pig can vary, although Magnetic Flux Leakage (MFL) and Ultrasonic Thickness (UT) gauging are by far the most commonly used [1][2][3], the latter being often used for detecting cracks in addition to corrosion.
The infrastructure of many pipes does not support the use of a pig due to the potential of such a device to become stuck at sharp bends or diameter changes, or the inability to launch or retrieve the device. In such cases, the application of an NDE technique external to the pipe is required. Pipes are often coated and/or clad for corrosion resistance, insulation and protective purposes [4], and many applicable NDE techniques require the undesirable removal of this coating and cladding in order to achieve a satisfactory sensitivity. Guided wave monitoring can inspect a long section of pipe from a single location; however, the removal of insulation and cladding is often required in order to install the bracelet transducer [5].
The Magnetic Tomography Method (MTM) [6] is a passive NDT method claiming detectability of defects at a significant distance from the pipe by measuring magnetic field anomalies due to regions of high mechanical stress, although the sensitivity is low [7]. There are multiple reports that the technique has a high false call rate and an inability to size or characterise defects [7][8][9]. Bespoke flexible pigs have been proposed for the testing of pipework with sharp bends, although these devices are expensive and require access to the pipe interior for launch and retrieval [10,11].
The NoPig method is an NDT technique in which an alternating current (ac) is injected into a pipe at potentially very distant (∼1 km) locations [12]. A frequency-dependent difference in the induced magnetic field is caused by deflection of the current around defects such as corrosion, which can be measured a large distance (<2 m) away from the pipe. The technique is sensitive to only relatively large defects [12]. A method relying on similar principles has been proposed as a potential candidate for traditionally difficult-to-inspect regions of pipe [13]. In this method, the magnetic field perturbations from deflection of a low frequency ac current are measured using sensors positioned outside pipe coating and insulation.
The underlying physics of this "current deflection" NDT method is similar to the Alternating Current Field Measurement (ACFM) method [14,15] although direct injection of a current into the pipe at far away contact points would allow a much wider operating frequency range, from quasi-DC current to the high frequencies generally used in ACFM (typically 5-50 kHz) [16].
The electromagnetic skin depth, δ, is given by

$$\delta = \frac{1}{\sqrt{\pi \nu \sigma \mu}} \qquad (1)$$

where σ and μ are the conductivity and the magnetic permeability of the conductor and ν is the current frequency. When δ is greater than the wall thickness, T, the current distribution is limited only by the conductor geometry and can be perturbed by defects occurring at any location throughout the wall thickness. The current can be thought of as quasi-DC in this frequency range, which is generally less than 5 Hz for carbon steel pipes. The principles of low-frequency current deflection, and the resulting perturbations in magnetic flux density, B, from a corrosion-like defect are illustrated in Fig. 1. In (a), flowlines of the current density, J, are plotted, showing a general flow in the y direction, except near the defect where the current is deflected. The large $J_y$ induces a strong $B_x$ (b), with a local minimum centred over the defect where $J_y$ is smallest, and peaks above the defect edge perpendicular to the current flow. $B_y$ is plotted in (c). The quadrupolar profile results from deflection in the current in the x direction around the defect. The bipolar profile of $B_y$ is a result of the clockwise and counter-clockwise rotation of the current as it deflects around the defect. The local maxima and minima in each component of B approximately mark the boundary of the defect, although they spread apart due to diffusion of the field with lift-off. The current deflection principles are identical when the conductor is a pipe rather than a plate. In this case, the x, y and z components become r, θ and z. Using Anisotropic Magneto-Resistive (AMR) sensors, Jarvis et al. suggested that current deflection from defects in a pipe could be detectable at around 50 mm lift-off using a few amps of injected current [13]. However, before any NDE technique can reliably be implemented for general use in an industrial setting, its performance must first be rigorously studied.
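As a rough numerical check of the quasi-DC condition (δ > T), the skin depth expression can be evaluated for a few frequencies. This is only a sketch: the conductivity and relative permeability below are assumed, illustrative values for carbon steel and are not taken from the paper; the function name is likewise hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space, H/m


def skin_depth(freq_hz, sigma, mu_r):
    """Skin depth in metres: delta = 1 / sqrt(pi * nu * sigma * mu)."""
    return 1.0 / math.sqrt(math.pi * freq_hz * sigma * mu_r * MU_0)


# Assumed, illustrative carbon steel properties (not from the paper).
SIGMA = 5.0e6      # conductivity, S/m
MU_R = 100.0       # relative permeability
WALL_T = 7.11e-3   # wall thickness of the 6-inch schedule 40 pipe, m

for f in (1.0, 5.0, 50.0, 5000.0):
    d = skin_depth(f, SIGMA, MU_R)
    regime = "quasi-DC" if d > WALL_T else "skin-effect limited"
    print(f"{f:7.1f} Hz: delta = {d * 1e3:6.2f} mm ({regime})")
```

With these assumed properties the skin depth at 5 Hz is about 10 mm, which exceeds the 7.11 mm wall, while at typical ACFM frequencies it is a small fraction of a millimetre.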
The most common metrics of performance for NDE techniques are the expressions of the Probability of Detection (POD), which describes the likelihood of correctly identifying a defect, and the Probability of False Alarm (PFA), which describes the probability of labelling a good area as defective. The Receiver Operating Characteristics (ROC) method expresses both the POD and PFA as orthogonal axes of the same graph and is a convenient way of illustrating the performance of an NDE technique [17].
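To make the relationship between POD, PFA and the ROC concrete, the sketch below sweeps a detection threshold over two sets of scalar scan amplitudes, one from defect-free regions and one from defective regions. The Gaussian amplitude distributions and the function names are assumptions chosen purely for illustration; they are not the signals used in this paper.

```python
import numpy as np


def roc_points(noise, signal, n_thresholds=200):
    """Return (PFA, POD) arrays for a call made when amplitude > threshold."""
    noise = np.asarray(noise, dtype=float)
    signal = np.asarray(signal, dtype=float)
    lo = min(noise.min(), signal.min()) - 1.0
    hi = max(noise.max(), signal.max()) + 1.0
    # Descending thresholds so that PFA and POD both rise from 0 to 1.
    thresholds = np.linspace(hi, lo, n_thresholds)
    pfa = np.array([(noise > t).mean() for t in thresholds])
    pod = np.array([(signal > t).mean() for t in thresholds])
    return pfa, pod


def area_under_curve(pfa, pod):
    """Trapezoidal area under the ROC curve (PFA on x, POD on y)."""
    return float(np.sum(np.diff(pfa) * (pod[1:] + pod[:-1]) / 2.0))


rng = np.random.default_rng(0)
background = rng.normal(0.0, 1.0, 5000)   # assumed defect-free amplitudes
defective = rng.normal(2.0, 1.0, 5000)    # assumed amplitudes with a defect
pfa, pod = roc_points(background, defective)
print(f"AUC = {area_under_curve(pfa, pod):.3f}")
```

An AUC near 1 indicates well separated distributions, while two identical distributions give an AUC near 0.5, the random-guess line.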
Traditionally, ROC have been determined entirely empirically by manufacturing test pieces containing artificially produced defects of interest that increase in size or depth [18]. The number of test pieces must be large enough to account for statistical variability and the scatter in measurements that may be expected in a practical setting, e.g. due to changing Environmental Operating Conditions (EOC), must be accounted for. Blind tests are then performed which determine the POD and PFA accordingly. The manufacture and subsequent testing of a large number of test pieces with different defect morphologies over different EOC is expensive and time consuming. Due to the rapid advance of computing power in the last decade, Finite Element (FE) modelling now permits a very accurate and rapid prediction of the signal from a large number of defect geometries; therefore, there have been efforts to use Model-Assisted Probability of Detection (MAPOD) to generate POD and ROC curves more quickly and efficiently to allow multi-parameter POD studies to be completed, and even create a single "volume POD" that encompasses all possible test parameters [18][19][20][21].
A framework for efficiently generating the ROC of practical Structural Health Monitoring (SHM) applications has been presented by Liu et al. [21]. In their proposed methodology, experimental data is collected for an undamaged structure over multiple EOC onto which synthetic damage signals are added by superposition. Decoupling the effects of environmental variations and damage growth on the received signal allows the performance of a permanently installed SHM system to be evaluated in a practical setting. Although this framework was designed for the evaluation of SHM techniques, a similar principle can be applied to predict the performance of one-off NDT inspections. In this paper, the performance of current deflection NDT will be evaluated with this method.
As the signal from current deflection can easily be predicted using numerical modelling techniques, the key challenge is to establish a realistic estimate of the magnetic signal due to everything but the defect to capture the variation in the signal that may be expected in a practical scan. In Liu's framework, different EOC were the key drivers for signal variation so measurements of an undamaged structure were taken at a range of temperatures; however, for the current deflection technique, the following effects contribute to the measured signal:
1. sensor and electronics noise;
2. temperature effects;
3. variations in magnetic permeability, $\mu_r$, causing distortion of the induced field from the injected current;
4. electromagnetic interference, e.g. mains/line frequency at 50/60 Hz;
5. variations in current distribution due to slight geometry changes in the pipe from the wall thickness varying within manufacturing tolerances;
6. current distortion from non-critical general pipe corrosion;
7. misalignment of magnetic vector sensors during a scan.
The amount by which each of these factors contributes to the total scan signal can be determined from a mixture of experiment and FE analysis. FE can be used where there is confidence that it is representative of a true practical scenario, and for all other cases, experimental measurements of an undamaged structure must be taken. A variety of defect signals can then be superposed onto the undamaged structure signals (a combination of measurement and FE) to give different defect scenarios with varying severity, position etc. which include the measurement variations seen on an undamaged structure. The undamaged structures for these scans should be similar in condition, material and geometry to the structure on which the desired NDE technique is to be used. The flowchart in Fig. 2 (a) represents the initial data collection of the undamaged structure signals, the FE contributions to the undamaged structure signal, and the synthetic defect signals. Fig. 2 (b) outlines how each of these data are used to generate an estimate of the ROC.
This paper is structured as follows: Section 2 outlines the methodology for the scans of the undamaged structure and the validated FE model used to generate the data required for the MAPOD framework; section 3 shows ROC predictions for bowl and slot defects generated using the framework; the discussion and conclusions follow in sections 4 and 5.

Scans of undamaged structure
The measurement of the magnetic flux density surrounding an undamaged 6-inch schedule 40 pipe (wall thickness 7.11 mm, outer diameter 168.4 mm) was performed using a pipe scanning rig (Fig. 3). Details of the equipment used can be found in Ref. [13]. A current of 2 A at 5 Hz was injected at the centre of an aluminium end cap which made contact with the pipe via springs spaced 20° apart. This ensured that edge effects in the magnetic field that occur from the current spreading out over the circumference were limited to a region <300 mm from the pipe end. This was necessary because the maximum length of pipe tested was 1.5 m, and the edge effects limit the region of the test pipe that is representative of a practical test in which current would be injected remotely. A stainless steel rod was held in tension at the pipe axis (Fig. 3 (c)), allowing the current injection and retrieval points to be at the same end of the pipe and negating the need for long trailing wires to complete the current loop. The rod also suppresses the large azimuthal field component, which simplifies the study of sensor misalignment (point 7 in the list in section 1). This will be discussed further in the following section.

Sensor configuration
The signal measured outside the pipe consists of current deflection from defects in addition to spurious signals that arise due to changes in the current distribution from slight wall thickness changes and out-of-roundness tolerances within the manufacturing specification. Magnetic property variations could also leave signatures in the magnetic signal that are variable only in one direction, for example due to a longitudinal pipe weld. In comparison to magnetic variations, conductivity variations are usually less severe. As these effects vary over a much longer wavelength than the defect, they can be effectively suppressed by combining the measurements from multiple sensors to form a gradiometer. This principle has successfully been applied to detect magnetic anomalies in geophysics [22,23], and to locate ferromagnetic targets in the presence of the geomagnetic field [24,25]. Recently, the concept has been adapted for magnetic NDE of pipelines [26]. In the current deflection method, the static geomagnetic field is suppressed by using alternating current injection and sensing; however, the injected current in the pipe induces a large azimuthal component which could saturate the sensors and mask the defect signal if any misalignment were to occur during the scan. A second-order gradiometer is proposed to suppress the slowly varying magnetic signals, and to reduce the severity of errors due to sensor misalignment.
Magnetic vector sensors are positioned at each corner of a square as shown in Fig. 4, where the constituent sensors are oriented to detect the radial field. Note that a local co-ordinate system (x, y, z) with origin in the centre of the gradiometer is used in addition to the global co-ordinate system of the pipe (r, θ, z), where y is approximately aligned with r and x with θ. The measured quantity is the second-order gradient of the field, $\partial^2 B_\alpha / \partial x\,\partial z$, with respect to x and z. Using a finite difference approximation, the gradient at the centre of the gradiometer is given by Eq. (3), where α = x, y or z, s is the sensor separation and the subscripts A-D refer to the different sensors shown in Fig. 4. The gradiometer is more effective at suppressing misalignment errors at higher lift-off as the azimuthal field drops with $1/r$. In this study, the scans were completed with single sensors and the gradiometer was formed synthetically, which allowed the parameter s to be investigated without needing to manufacture multiple sensor arrays. In the field, however, completing a scan on a well aligned scanning rig would not be possible, so the gradiometer must be formed with hardware. Hardware subtraction is only effective if the sensitivity of each constituent sensor is matched. Sensitivity variation can be caused by thermal gradients, sensor ageing, nonlinearity etc.; for the chosen sensors, this can usually be confined to less than 1% [27].
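A synthetic second-order gradiometer of this kind can be sketched as a four-corner finite difference applied to a single-sensor scan on a regular grid. The corner weighting (+, −, −, +) used below is the standard finite difference for a mixed derivative; which physical sensor (A-D in Fig. 4) carries which sign is an assumption here, as are the function name, the grid spacing and the synthetic background. Circumferential wrap-around of the scan is ignored for brevity.

```python
import numpy as np


def second_order_gradient(scan, n, step):
    """Mixed second difference d2B/dxdz of a 2D scan indexed [z, x].

    n is the sensor separation in grid steps, so s = n * step.
    """
    s = n * step
    return (scan[n:, n:] - scan[n:, :-n] - scan[:-n, n:] + scan[:-n, :-n]) / s**2


# Hypothetical 5 mm grid covering 300 mm x 300 mm, values in nT.
step = 5.0
z, x = np.meshgrid(np.arange(0, 300, step), np.arange(0, 300, step), indexing="ij")

# A slowly varying (linear) background is removed exactly.
background = 100.0 + 0.5 * x + 0.2 * z
print(np.abs(second_order_gradient(background, 10, step)).max())

# A field with a genuine mixed gradient of 1 nT/mm^2 is recovered.
print(second_order_gradient(x * z, 10, step).mean())
```

Forming the gradiometer in post-processing like this is what allows the separation s to be varied without building a new sensor array, at the cost of assuming that the single-sensor scan was well aligned throughout.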
In the experimental set up, a return current rod suppressed the large $B_\theta$, and the sensors maintained constant alignment throughout the scan; therefore, the equivalent $B_\theta$ that would be measured without the rod was added to the azimuthal scan by superposition. This enabled a prediction of the amount by which potential misorientation of the sensors during a practical scan would affect the sensitivity due to pollution of the small $B_r$ and $B_z$ components with the large $B_\theta$. The magnetic gradient was calculated for possible misalignments of up to 2° in each of the three degrees of freedom of potential gradiometer rotation. A random sensitivity variation of 0-1% was then assigned to each sensor forming the gradiometer to generate a Gaussian noise profile that could be added by superposition to the measured scan of the undamaged structure to account for the fact that a practical scan, unlike the scan completed in the laboratory, would have significant noise due to misalignment. The selected scenario of a misalignment by up to 2° and sensitivity variation of 1% that changes between each position in the scan represents very undesirable practical test conditions. At a lift-off of 25 mm and sensor separation of 50 mm, this scenario results in misalignment noise with a standard deviation that is approximately equivalent to the amplitude of a 21 mm diameter corrosion patch removing 10% of the pipe wall. In most practical applications, misalignment will be the greatest contributor to the noise; therefore, by completing the scans of undamaged pipes with a precise positioning system and applying realistic misalignment noise by post-processing, we are able to separately study each of the contributions to the signal outlined in section 1.
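The size of the misalignment problem can be illustrated with simple trigonometry: a sensor tilted by a small angle picks up a fraction sin(angle) of the azimuthal field. The sketch below uses the approximate 2 μT azimuthal field quoted later in the text and makes two simplifying assumptions that are not from the paper: all four sensors tilt rigidly together and see the same azimuthal field, and the gradiometer uses a (+, −, −, +) weighting. Function names are hypothetical.

```python
import math
import random

B_THETA_NT = 2000.0  # azimuthal field of about 2 uT, expressed in nT


def leakage_nt(b_theta_nt, angle_deg):
    """Apparent radial/axial signal from a sensor tilted by angle_deg."""
    return b_theta_nt * math.sin(math.radians(angle_deg))


def gradiometer_residual_nt(leak_nt, gain_errors):
    """Four-sensor difference (before division by s^2) of a common leakage.

    gain_errors holds the fractional sensitivity errors of the four sensors,
    combined with the assumed +, -, -, + weighting.
    """
    ga, gb, gc, gd = gain_errors
    return leak_nt * (ga - gb - gc + gd)


for angle in (0.5, 1.0, 2.0):
    print(f"{angle:3.1f} deg tilt -> {leakage_nt(B_THETA_NT, angle):5.1f} nT leakage")

random.seed(1)
gains = [random.uniform(0.0, 0.01) for _ in range(4)]
leak = leakage_nt(B_THETA_NT, 2.0)
print(f"residual with up to 1% mismatch: {gradiometer_residual_nt(leak, gains):+.2f} nT")
```

Under these assumptions a 2° tilt leaks roughly 70 nT into a single sensor, whereas perfectly matched sensors cancel the common leakage and a 1% mismatch leaves a residual of at most 2% of it.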

Example scans of undamaged structure
In this study, scans of the seamed (longitudinally welded) 6-inch schedule 40 carbon steel pipe of Fig. 3 were completed at lift-off distances of 25 mm and 50 mm to represent typical pipe insulation coating thickness. Fig. 5 shows the induced magnetic flux density measured at Δ = 25 mm with 2 A of current at 5 Hz. The axial pipe weld is located at 90°. In scans (a) and (c) a perturbation in the magnetic field can be noticed at this azimuthal location, likely due to the weld material having different magnetic properties from the surrounding metal. In the radial component (b), this is shown by positive and negative regions where the magnetic field is being drawn in over the weld. This also causes the line of high intensity in the azimuthal scan. Each of the scans (a-c) exhibits a magnetic flux density that varies slowly in the axial direction. The peak-to-peak variation in the field is on the order of tens of nT in (a,b) and hundreds of nT in (c), implying it would be preferable to use the radial and axial components of B for defect detection. In practice, the azimuthal component of B would vary from a value of around 2 μT; however, it has been suppressed by the rod used for the return current path in these experiments.
The measurements presented here represent a much more realistic signal set for the scan of a similar structure than could be predicted using FE alone, as the noise from a real transducer and instrumentation and the variability across a real pipe are included. Time variation of the signal was determined to be minimal in this study, as the residual between repeated measurements on 4 consecutive days was less than 5 nT. At each sensor location, 35 averages were taken to ensure that random noise was effectively suppressed. At low frequencies (<5 Hz) this requires each measurement to last several seconds; however, for practical tests where inspection time must be minimised, an array of sensors could be manufactured to record data at multiple locations simultaneously. The passive environmental magnetic field of the laboratory was measured over 0-100 Hz prior to the test to ensure that the chosen operating frequency was free from electromagnetic interference that could reduce the Signal to Noise Ratio (SNR). It is also important to repeat this step for practical tests, and the lock-in bandwidth should be set to achieve a satisfactory SNR with maximum measurement speed (here, an applied bandwidth of 0.125 Hz was used).

Numerical modelling
The numerical model used to predict the response of the magnetic field to various defect geometries was created in COMSOL and is shown schematically in Fig. 6; additional detail of the model can be found in Ref. [13]. The model has been previously validated using current deflection measurements of flat-bottomed slots in an austenitic pipe [13]; however, validation is required for ferromagnetic pipes with corrosion-like defects. Circumferential scans around a pipe containing an outer wall 3T × 3T × T/2 semi-ellipsoidal defect ((c) in Fig. 3) were therefore used to validate the FE model. A 1D first order gradiometer was synthetically formed by subtracting the scan measured when the sensor was axially offset by a distance s (shown by dashed lines in Fig. 6) from the circumferential scan measured using a single sensor positioned axially in line with the defect:

$$\frac{\partial B_r}{\partial z}\, s \approx B_r(r, \theta, z) - B_r(r, \theta, z + s) \qquad (4)$$

As the sensors were well aligned with the pipe during the experimental scans, there was no need to form the 2D gradiometer shown in Fig. 4 for the purpose of FE validation. Fig. 7 shows the single sensor measurement (closed circles), the synthesised 1D gradiometer measurement (open circles) and the FE prediction (solid line). The differential sensing method is effective at suppressing both variations in magnetic flux density that vary slowly along the pipe axis and spurious magnetic signals originating far from the pipe, at the expense of slightly reduced sensitivity to the primary field originating from current deflection from the defect. This is clear from Fig. 7 (a) where there is good agreement between the gradiometric measurement and FE. The amplitudes of the peaks in the measurement match the FE to within 10%.
At the increased lift-off of 25 mm (b), the primary field due to current deflection diffuses and reduces in amplitude. The bottom axis has been extended to cover a full rotation of the pipe showing that, in the single sensor scan, the largest perturbation occurs around 100° from the defect location. When the gradiometer is simulated with s = 100 mm, the signal much more closely resembles the FE prediction. These data provide validation of the FE model, and indicate how the sensitivity to defects can be increased with differential sensor configurations. The agreement between the azimuthal spacing of the peaks of the measured and modelled signals shows that FE is efficient and accurate at predicting the defect signal from current deflection. So long as the current behaves as quasi-DC, uncertainties in the FE model are low, and arise from slight differences in current amplitude and the geometry of the real and modelled pipe. The discrepancies between FE and measured data away from the defect signal are due to magnetic field variations at the sensor that originate from within the pipe or from the surrounding environment, and highlight the importance of clean structure measurements to obtain an accurate evaluation of performance.

Signal contribution from shallow general corrosion
In many cases, there may be some additional contributions to the signal that cannot easily be captured with the scan of an undamaged structure. The undamaged pipes scanned were procured in a "new" state; however, shallow general corrosion will cause some perturbation in the current distribution in the test piece, and therefore in the induced magnetic field. In order not to overestimate the performance of the technique, this should be accounted for.
A model was created with a current-carrying plate of wall thickness 7.1 mm; approximating the pipe as a plate allowed simpler digitization for FE. It has been shown in the literature that general uniform corrosion (i.e. not including pitting) can be described statistically [28], and several experimentally measured surfaces have followed a Gaussian distribution of depths [28,29]. A surface profile was therefore generated using a Gaussian distribution of depths with root mean square depth of 0.1 mm and correlation length of 5 mm, based upon the levels used to model the general shallow corrosion from an undamaged section of pipe from Ref. [30], although ideally these would be matched to the roughness characteristics of the target test structure.
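One common way to generate such a surface is to filter white Gaussian noise in the spatial-frequency domain and rescale it to the target RMS depth. The sketch below follows that approach; it is not necessarily the procedure used by the authors. The Gaussian autocorrelation form exp(−r²/λ²), the grid size, the grid spacing and the function name are all assumptions.

```python
import numpy as np


def rough_surface(n, step, rms, corr_len, seed=0):
    """Gaussian random surface on an n x n grid (same length unit as step).

    White noise is filtered so that the autocorrelation is approximately
    exp(-r^2 / corr_len^2), then rescaled to the requested RMS depth.
    """
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((n, n))
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=step)
    kx, ky = np.meshgrid(k, k)
    # Square root of the Gaussian power spectrum acts as the filter.
    transfer = np.exp(-(kx**2 + ky**2) * corr_len**2 / 8.0)
    surface = np.fft.ifft2(np.fft.fft2(white) * transfer).real
    surface -= surface.mean()
    return surface * (rms / surface.std())


# RMS depth 0.1 mm and correlation length 5 mm, as quoted in the text,
# on an assumed 256 x 256 grid with 1 mm spacing.
h = rough_surface(n=256, step=1.0, rms=0.1, corr_len=5.0)
print(f"RMS depth = {h.std():.3f} mm, peak-to-peak = {h.max() - h.min():.2f} mm")
```

Because the filtering is done with FFTs, the generated surface is periodic across the grid edges, which may or may not be desirable depending on how it is imported into the FE model.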
The FE simulation could not be readily solved at this point due to the computational burden of the requirement of having an element size small enough to resolve the roughness. However, the roughness-induced disturbance to the field is not of interest at the surface of the conductor but at some lift-off distance Δ, so a spatial-frequency Low-Pass Filter (LPF) may be used to approximate the representative roughness profile for different lift-off distances and so to simplify the FE mesh. Assuming a linear filter, a magnetic dipole oriented parallel to the current direction was used as the impulse function to approximate the LPF transfer function:

$$\psi = \frac{\mathbf{m} \cdot \mathbf{r}}{4 \pi r^3}$$

where r is the distance from the dipole to the sensor plane ($r = |\mathbf{r}|$), ψ is the scalar potential and $\mathbf{m}$ is the magnetic dipole moment defined to be $\mathbf{m} = m_z \hat{\mathbf{e}}_z$. A typical rough surface is shown in Fig. 8 (a), with the filtered surfaces at 25 mm and 50 mm shown in (b) and (c) respectively. This methodology was used to generate 25 different profiles with the same roughness characteristics which were then filtered for 25 mm and 50 mm lift-off. The FE model was then solved for $B_r$, $B_\theta$ and $B_z$ above a current carrying plate whose surface is defined by these effective roughness profiles.
Fig. 9 shows the geometry of the various different defects modelled in this study. A concave, semi-ellipsoidal defect was chosen to represent a corrosion patch, and transverse and longitudinal slots were selected to represent the cases causing maximum and minimum current deflection. The defects were modelled on the outer surface of a 6-inch schedule-40 pipe. Low-frequency ACFM has the advantage of being sensitive to defects occurring throughout the pipe wall, but in this study only outer wall defects are considered due to the relative ease of manufacturing an equivalent defect in a pipe for validation purposes.

Defect prediction
The resulting magnetic flux density profiles for a variety of defect depths were saved and stored in a database. The resulting perturbation amplitudes followed the expected trends outlined in the literature [15]. The signal from the transverse slots was of the greatest amplitude due to the greatest amount of current deflection occurring. For the longitudinal slots, the signal in B was of a lower amplitude and was almost entirely due to flux leakage effects.

Defect roughness
The effect of defect roughness was investigated by multiplying a generated surface profile h(x, y) (as discussed in section 2.2.1) with a flat-topped tapered cosine profile with unity maximum value and diameter of 100 mm. The tail of the distribution of depths was clipped to limit the maximum depth to twice the standard deviation of the RMS depth, and any heights above the surface of the plate were removed to avoid any addition of material. Fig. 10 (a) shows the resulting surface profile of a defect with correlation length of 10 mm and RMS depth 1 mm. These values were based upon the parameters used to describe typical rough corrosion from Ref. [30]. Fig. 10 (b) and (c) show the induced axial magnetic flux density at distances of 1 mm and 25 mm above the plate. The greatest disturbances in $B_z$ occur outside the dotted line that represents the defect diameter, demonstrating that the most significant contribution to the signal comes from the overall footprint of the defect and not the roughness. At Δ = 1 mm, some roughness induced distortion of the field is visible within the perimeter of the defect; however, the diffusion of the field with lift-off means that the current deflection signal from a rough defect is generally indistinguishable from that of a smooth defect of equal size when the lift-off is greater than the length scale of the defect roughness. This result indicates that realistic defect morphologies can be well approximated by smooth, simple-to-model geometries; therefore, semi-ellipsoidal defects will be used to approximate corrosion defects for the remainder of the paper.
Fig. 11 visually illustrates the process of combining the measurement of an undamaged structure (a), with the synthetic defect signal (b), and additional contributions to the signal (c) to give a composite scan (d). Gradiometric sensing is then simulated by the application of the function (Eq. (3)). Noise from potential misalignment of the gradiometer during the scan is then simulated (f) and added to give the final result (g) that represents the signal that would be measured from scanning a defective structure. Decoupling each contribution to the signal in this way allows simulation of the wide variation of scanning conditions that are required to determine ROC for the technique.
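The superposition steps of Fig. 11 can be sketched as a short pipeline. Everything below is illustrative: the three input arrays are synthetic stand-ins for the measured undamaged scan, the FE defect signal and the FE roughness contribution; the misalignment noise is modelled as independent Gaussian noise with an assumed standard deviation; and the function names and the (+, −, −, +) corner weighting are assumptions.

```python
import numpy as np


def cross_difference(scan, n):
    """Four-corner difference over n grid steps (second-order gradiometer numerator)."""
    return scan[n:, n:] - scan[n:, :-n] - scan[:-n, n:] + scan[:-n, :-n]


def composite_scan(undamaged, defect, extra, n, noise_std, rng):
    total = undamaged + defect + extra                # (a) + (b) + (c) -> (d)
    grad = cross_difference(total, n)                 # gradiometric sensing
    noise = rng.normal(0.0, noise_std, grad.shape)    # misalignment noise (f)
    return grad + noise                               # final synthetic scan (g)


# Synthetic stand-ins for the three contributions, in nT on a 5 mm grid.
z, x = np.meshgrid(np.arange(0, 300, 5.0), np.arange(0, 300, 5.0), indexing="ij")
undamaged = 50.0 + 0.1 * x + 0.05 * z
defect = 20.0 * np.exp(-((x - 150) ** 2 + (z - 150) ** 2) / (2 * 30.0**2))
extra = np.zeros_like(x)

rng = np.random.default_rng(0)
final = composite_scan(undamaged, defect, extra, n=10, noise_std=1.0, rng=rng)
print(final.shape, float(np.abs(final).max()))
```

Because every step before the noise is linear, each contribution can be stored once and recombined in many ways, which is what makes it cheap to generate a large number of scan realisations.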

Defect detection methodology
A detection algorithm was chosen that exploits the fact that the defect profiles are always multipolar, with the local minima and maxima occurring near the defect edges. Firstly, the median of the scan is subtracted so that adjacent peaks have opposite polarity. The locations where a peak exceeds a threshold level, and the area of the peaks that exceed the threshold are calculated; the peaks that have an area less than A are discarded. The optimal value of A will increase with the lift-off as the defect signals diffuse. The distances between the remaining peaks and their closest neighbours of opposite polarity are then found, and if this is greater than κ it is deemed not to be a detection. κ should be set depending on the largest defect signal of interest (where the peaks will be separated by the largest amount), and the lift-off. The circumferential continuity of the scan was accounted for when applying this detection algorithm in the cases where the defect signal spanned both the top and bottom of the scan due to the choice of datum in the azimuthal axis. Fig. 12 shows a decision tree for the detection algorithm.
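The logic of the algorithm can be sketched in simplified form. The version below works on a 1D line scan rather than a 2D map, so the peak "area" becomes a run length along the scan, and it ignores the circumferential wrap-around described above. All names and parameter values are hypothetical and chosen only to illustrate the median subtraction, thresholding, size filter and opposite-polarity neighbour test.

```python
import numpy as np


def runs_over_threshold(x, thr):
    """Return (peak_index, polarity, length) for each run with |x| > thr."""
    sign = np.where(x > thr, 1, np.where(x < -thr, -1, 0))
    runs, start = [], 0
    for i in range(1, len(sign) + 1):
        if i == len(sign) or sign[i] != sign[start]:
            if sign[start] != 0:
                peak = start + int(np.argmax(np.abs(x[start:i])))
                runs.append((peak, int(sign[start]), i - start))
            start = i
    return runs


def detect(scan, step, thr, min_len, kappa):
    """Indices of peaks that have an opposite-polarity partner within kappa."""
    x = np.asarray(scan, dtype=float)
    x = x - np.median(x)                       # centre the scan on zero
    runs = [r for r in runs_over_threshold(x, thr) if r[2] * step >= min_len]
    hits = []
    for pos, pol, _ in runs:
        gaps = [abs(pos - p) * step for p, q, _ in runs if q == -pol]
        if gaps and min(gaps) <= kappa:        # partner close enough
            hits.append(pos)
    return hits


pos_mm = np.arange(0.0, 400.0, 1.0)
bipolar = 10 * np.exp(-((pos_mm - 180) / 10) ** 2) - 10 * np.exp(-((pos_mm - 220) / 10) ** 2)
spike = 10 * np.exp(-((pos_mm - 200) / 10) ** 2)
print(detect(bipolar, step=1.0, thr=5.0, min_len=5.0, kappa=50.0))
print(detect(spike, step=1.0, thr=5.0, min_len=5.0, kappa=50.0))
```

The bipolar signal is flagged at both of its lobes, while the isolated single-polarity spike is rejected because it has no partner of opposite sign within κ.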
It is important to note that any arbitrary damage detection methodology could equally be investigated in the presented MAPOD framework, and the one outlined here serves as an illustration. The algorithm could be improved to further distinguish true and false positives via additional filters. Machine vision or machine learning algorithms have proved very effective in automatically recognising defects in NDT scans [31][32][33][34] and could well serve this application, although the optimisation of such solutions is out of the scope of the current study. The outputs of the detection algorithm are the number of true and false positives at the selected threshold level. The framework allows a practical scan of any defect geometry, under any set of operating conditions, to be produced. In this way, a Monte Carlo simulation can be performed to give the POD and PFA required to calculate the ROC.
[Figure caption residue: … ; κ = 50 mm for Δ = 25 mm scans and A = 40 mm², κ = 75 mm for Δ = 50 mm scans.]

Results
Fig. 13 shows the ROC curves generated by the framework for 3T × 3T × d bowl defects of increasing depth on the outer surface of a 6″ schedule-40 carbon steel pipe. The ROC curve allows the relative performance of different test conditions to be compared. Perfect detection with no false calls corresponds to an Area Under the Curve (AUC) of unity, implying a POD of unity even at zero PFA. As the performance degrades, the ROC approaches the POD = PFA line that corresponds to a random guess [35]; in this case, AUC = 0.5. For the geometry modelled, AUC = 1 for the d = 1 mm defect at Δ = 25 mm. For shallower defects, the AUC falls as the defect signal becomes less distinguishable from the background signal. With Δ = 50 mm, the corresponding curves on the ROC plot drop away from the ideal and the AUC decreases, indicating poorer detection performance as the defect signals become broader and reduce in amplitude. The shallower defects approach AUC = 0.5, so they could not reliably be detected in practice. The ROC curve for each defect depth is calculated from the mean of a distribution of 1000 samples at each threshold level. The dashed lines below the solid ROC curves show the 95% confidence limits: we are 95% confident that the ROC curve in a given test will lie above this level [35].
Fig. 14 shows the equivalent of Fig. 13(a) with a sensitivity variation of up to 1% and a misalignment throughout the scan of 2°. Including the misalignment reduces the detection performance for defects of equivalent depth. For example, the ROC curve for the d = 0.7 mm defect in (a) is similar to that of the d = 0.9 mm defect in (b). The AUC values for each scenario are summarised in Table 1.
Fig. 15 shows the ROC curves for the detection of flat-bottomed slots ((c) in Fig. 9) at Δ = 25 mm when using a second-order gradiometer formed of sensors measuring the (a) radial and (b) axial components of B. The gradiometer maintains constant alignment throughout the scan. The detection performance falls as the slot depth is reduced, and measurement of the gradient of B_r results in better performance, likely because the magnetic signal from current deflection has a higher amplitude in B_r than in B_z.
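
The AUC and the lower confidence curve discussed above can be computed from sampled ROC points as in the sketch below. Two assumptions are made that the text does not confirm: the AUC is obtained by trapezoidal integration after adding the (0, 0) and (1, 1) end points, and the 95% limit is taken as the 5th percentile of the sampled POD values at each threshold. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def roc_auc(pfa, pod):
    """Trapezoidal area under an ROC curve, with (0, 0) and (1, 1) added."""
    pfa = np.asarray(pfa, dtype=float)
    pod = np.asarray(pod, dtype=float)
    order = np.lexsort((pod, pfa))  # sort by PFA, then by POD within ties
    x = np.concatenate(([0.0], pfa[order], [1.0]))
    y = np.concatenate(([0.0], pod[order], [1.0]))
    return float(np.sum(np.diff(x) * (y[:-1] + y[1:]) / 2.0))

def mean_and_lower_bound(pod_samples, confidence=0.95):
    """Mean POD and one-sided lower percentile bound at each threshold.

    pod_samples has shape (n_samples, n_thresholds).
    """
    pod_samples = np.asarray(pod_samples, dtype=float)
    mean_pod = pod_samples.mean(axis=0)
    lower_pod = np.quantile(pod_samples, 1.0 - confidence, axis=0)
    return mean_pod, lower_pod

# Synthetic data for illustration only: 1000 POD samples at three thresholds.
rng = np.random.default_rng(1)
pfa = np.array([0.5, 0.1, 0.01])
pod_samples = rng.binomial(50, [0.95, 0.8, 0.4], size=(1000, 3)) / 50.0
mean_pod, lower_pod = mean_and_lower_bound(pod_samples)
print(roc_auc(pfa, mean_pod), roc_auc(pfa, lower_pod))
```

With these definitions, a curve lying on the POD = PFA line integrates to 0.5 and a perfect detector to 1, matching the two limiting cases described above.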

Discussion
This paper has presented a framework for rapid and efficient ROC prediction for magnetic field measurement techniques achieved by the combination of numerical modelling and experimental data. The framework has been applied to the analysis of a low-frequency current deflection technique that has shown promise for the in-situ detection of corrosion-like defects on insulated and coated pipes with remotely injected current.
The methodology allowed each contribution to the signal and noise to be analysed in isolation. This revealed that misalignment of the second-order gradiometer, in combination with variations in the sensitivities of its constituent sensors, is the limiting factor on the sensitivity of the technique. This is due to the difficulty in detecting small perturbations of the field in the presence of the large B_θ. The experimental measurements were completed using a scanning frame that ensured constant alignment throughout the scan, and with a return current path via a rod that suppressed the large B_θ. This enabled simulation of varying degrees of misalignment by adding back the large B_θ and post-processing the scan data. Performing the experiments in a controlled environment in this way demonstrates the value of the modular nature of the framework: it allows each contribution to the final signal to be analysed separately, and it supports simple implementation of updates and improvements to the noise model.
Other contributions to the noise were identified by measuring the field surrounding an undamaged pipe. Slowly varying background signals due to slight wall thickness variations and magnetic property variations (e.g. in the pipe weld) were effectively suppressed by using the second-order gradiometer. For the most accurate performance evaluation, the undamaged structure should have similar surface characteristics to the structure to be tested. The undamaged pipes scanned in this study were procured new, so a validated FE model was used instead to demonstrate how the contribution to the scan from the perturbation of current in a generally corroded conductor can be identified. At higher lift-off, the signal due to roughness diffuses and significantly reduces in amplitude, suggesting that the sensitivity is most affected by very rough surfaces when measuring close to the pipe.
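
The coupling between misalignment and sensitivity variation noted above can be illustrated with a simplified model. This is not the post-processing used in the study. It assumes three nominally radial sensors combined as m1 - 2·m2 + m3, a common rotation α of the whole gradiometer about the axial direction (which mixes B_r and B_θ), and an independent relative gain for each sensor. The function name and the numerical values are hypothetical.

```python
import numpy as np

def gradiometer_output(b_r, b_theta, alpha_deg, sensitivities):
    """Second-order gradiometer reading from three nominally radial sensors.

    b_r, b_theta  : radial and azimuthal field at the three sensor positions.
    alpha_deg     : rotation of the whole gradiometer about the axial
                    direction, in degrees (assumed common to all sensors).
    sensitivities : relative gain of each sensor (1.0 is nominal).
    """
    b_r = np.asarray(b_r, dtype=float)
    b_theta = np.asarray(b_theta, dtype=float)
    gains = np.asarray(sensitivities, dtype=float)
    alpha = np.deg2rad(alpha_deg)
    # A rotated sensor picks up a projection of the azimuthal field.
    readings = gains * (b_r * np.cos(alpha) + b_theta * np.sin(alpha))
    return float(readings[0] - 2.0 * readings[1] + readings[2])

# Normalised, uniform azimuthal field and no defect signal (illustrative).
b_r = [0.0, 0.0, 0.0]
b_theta = [1.0, 1.0, 1.0]
matched = gradiometer_output(b_r, b_theta, 2.0, [1.0, 1.0, 1.0])
mismatched = gradiometer_output(b_r, b_theta, 2.0, [1.01, 1.0, 1.0])
print(matched, mismatched)
```

In this idealised case, a misaligned gradiometer with perfectly matched sensors still cancels a uniform B_θ, whereas a 1% gain mismatch on one sensor leaves a residual proportional to B_θ·sin α. This is consistent with the observation that the two effects limit sensitivity in combination.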
Receiver operating characteristics were generated under various scenarios using corrosion and slot-like defects. Without consideration of misalignment and sensor variation, an area under the ROC curve (AUC) above 0.91 was predicted for 3T × 3T × d bowl defects with d greater than 12% of the wall thickness at a lift-off of 25 mm, implying that at 92% POD and 95% confidence, the PFA is 0.5%. When considering a misalignment of up to 2° in each degree of freedom of the gradiometer, the AUC was reduced by 15%. In this case, a PFA of <1% at >99% POD and 95% confidence was predicted for corrosion defects deeper than 20% of the wall thickness, T. The sensitivity reduced when the lift-off was doubled to 50 mm, with the AUC decreasing by 17% on average. Important future work to develop the magnetic field measurement technique will be to validate these ROC predictions with experimental studies.

Conclusions
A MAPOD framework for predicting the performance of magnetic field measurement techniques has been developed. By combining measurements of defect-free structures with numerical modelling results, the framework evaluates each contribution to the signal and noise measured in a practical scan separately. The modular nature of the framework supports future development.
The framework was used to evaluate the sensitivity of a current deflection technique where a low-frequency current is injected into a pipe at remotely located points. Perturbations in the induced magnetic field due to defects are then measured with magnetic sensors. ROC were predicted for a range of scanning scenarios.
A false call rate of <1% at >99% POD and 95% confidence was predicted for corrosion defects 1.5 mm deep and three times the wall thickness in diameter. For 1 mm deep defects, a false call rate of 5% at 90% POD and 95% confidence was predicted. Results suggest that this level of sensitivity is available at 25–50 mm lift-off, so the technique is practical for scanning outside pipe insulation.