AMagPoseNet: Real-Time Six-DoF Magnet Pose Estimation by Dual-Domain Few-Shot Learning From Prior Model

Traditional magnetic tracking approaches based on mathematical models and optimization algorithms are computationally intensive, depend on initial guesses, and do not guarantee convergence to a global optimum. Although fully supervised data-driven deep learning can solve the above issues, the demand for a comprehensive dataset hampers its applicability in magnetic tracking. Thus, we propose an annular magnet pose estimation network (called AMagPoseNet) based on dual-domain few-shot learning from a prior mathematical model, which consists of two subnetworks: PoseNet and CaliNet. PoseNet learns to estimate the magnet pose from the prior mathematical model, and CaliNet is designed to narrow the gap between the mathematical model domain and the real-world domain. Experimental results reveal that the AMagPoseNet outperforms the optimization-based method regarding localization accuracy (1.87±1.14 mm, 1.89±0.81°), robustness (nondependence on initial guesses), and computational latency (2.08±0.02 ms). In addition, the six-degree-of-freedom pose of the magnet could be estimated when discriminative magnetic field features are provided. With the assistance of the mathematical model, the AMagPoseNet requires only a few real-world samples and has excellent performance, showing great potential for practical biomedical and industrial applications.


I. INTRODUCTION
PERMANENT magnetic tracking is an established technique for locating medical instruments inside the human body [1]. Because it has no line-of-sight limitation, is low in cost, and requires no power supply for the tracked target, magnetic tracking has also been employed for vehicle pose estimation in industrial applications [2], [3].
However, only three-degree-of-freedom (DoF) position and two-DoF orientation can be obtained when using the radical magnetized cylindrical magnet [4] or spherical magnet [5] due to their symmetric magnetic field distribution along the magnetic moment direction. Nonrotational symmetric magnets have been suggested to avoid such a problem. Yang et al. [6] and Cichon et al. [7] adopted a cuboid magnet, and Song et al. [8] utilized a radially magnetized annular magnet to obtain the six-DoF pose. Although various indirect methods have been presented, such as employing two magnets with opposite magnetic directions [9] and fusing with inertial sensing [10], the space occupation and hardware complexity have increased. In contrast to the cuboid magnet, the annular magnet with a hollow design offers better compatibility with commercial wireless capsule endoscopy (WCE) structural integration, such as enabling the deployment of required cylindrical electronic modules. In addition, annular magnets are also preferred for the functional design of magnetically driven capsule robots [11]. Hence, in this study, we address the challenge of the six-DoF magnetic localization of a single circular magnet with a radial magnetization field.
Several equivalent models for calculating the spatial magnetic field distribution of annular magnets include magnetic dipole, magnetizing current, and magnetic charge [12]. The magnetic dipole model with simple expression is generally adopted in magnetic tracking and magnetic actuation [5], [11], [13], [14], [15], [16]. However, only the five-DoF pose could be estimated because the magnetic dipole model loses the description of the rotation information around the magnetic moment [17]. In addition, the magnetic dipole model will introduce approximation errors for nonspherical magnets, especially in the near field [18]. Thus, the equivalent magnetizing current model was adopted in [6], [7], and [8] to accurately calculate the magnetic field distribution for annular and cuboid magnets. Nevertheless, the yielded model based on numerical integration is highly complex and nonconvex, hence unsuitable for immediate real-time magnetic tracking.
To solve the underlying inverse magnetostatic problem (pose estimation), the Levenberg-Marquardt (LM) algorithm is widely used for the magnetic dipole model owing to its outstanding execution speed and calculation accuracy [5], [13], [14], [15], [16]. The LM algorithm is a local optimization algorithm that relies on an initial guess and model gradients. For the numerical-integration-based magnetic model, swarm intelligence algorithms, such as particle swarm optimization (PSO), were typically used [6], [8]. However, depending on populations and iterations, the PSO algorithm is computationally intensive and might fall into a local optimum. For instance, it took up to 830 s to estimate the annular magnet's six-DoF pose even though the dipole model and the LM algorithm were utilized to restrict the search boundaries of PSO [8].
Recently, deep learning has emerged in a few studies [19], [20], [21] in magnetic tracking. Sebkhi et al. [19] developed a data collection setup and spent a week gathering a dataset composed of ∼1.7 million samples in a 10 × 10 × 10 cm³ working space with a positional resolution of 5 mm and an angular resolution of 10°. The trained five-layer fully connected (FC) network model only outputs a three-DoF position with a median error of 1.4 mm. As stated by the authors, if the positional resolution is increased to 1 mm, the duration of data collection will increase to six months, which is unacceptable for practical applications. To reduce the dependence on the dataset, a regression model trained by 20 691 samples [20] and a classification model trained by 66 000 samples [21] were proposed. However, owing to their low localization accuracy, the trained models were just used to provide a five-DoF initial pose for the LM algorithm at the initial start-up. Thus, how to train a high-precision localization model with a few real-world samples is the primary issue for learning-based magnetic tracking.
Fortunately, prior knowledge-based few-shot learning makes it possible to learn a new task that contains only a few samples [22]. There are various ways to integrate prior knowledge into deep learning. For example, we can use prior knowledge to augment the training dataset [23], narrow the search space of parameters [2], and optimize the search strategy [24]. Thus, we propose an annular magnet pose estimation network (named AMagPoseNet) based on dual-domain few-shot learning. First, leveraging the mathematical model of magnet sensing as prior knowledge, we train an end-to-end convolutional neural network (CNN) model that maps from sensor measurement to the magnet's pose. After that, instead of collecting a sufficient and comprehensive dataset, we narrow the gap between the mathematical model domain (MMD) and the real-world domain (RWD), requiring only a few real-world samples for few-shot learning.

Fig. 1. Modeling diagram of an annular magnet based on the equivalent magnetizing current and the Biot-Savart law. According to the superposition principle, the magnetic induction intensity produced by the annular magnet is identical to the subtraction of the magnetic induction intensity produced by two cylindrical magnets with different radii.
The main contributions of this study are summarized as follows.
1) This article introduces an approach that significantly reduces the dependence on real-world data with the assistance of the prior mathematical model. The fact that only a few real-world samples (780 = 1300 × 0.6) were used in this study demonstrates the practicality of the proposed method.
2) It improves the orientation representation to match well with the symmetrical property of the magnetic field distribution, where the magnetic field symmetry is a natural property for regular-shaped magnets.
3) It proposes a network structure consisting of CaliNet and PoseNet as well as a corresponding training strategy to bridge the gap between the MMD and the RWD. Benefiting from deep learning techniques, the AMagPoseNet can estimate the six-DoF pose of the annular magnet in real time and outperforms the LM-based approach in terms of localization accuracy, robustness, and computational latency.

A. Magnetic Induction Intensity of an Annular Magnet
As illustrated in Fig. 1, a magnet coordinate frame {m} is established at the magnet's center, and the magnetic moment direction is from $Z_m-$ to $Z_m+$. From the equivalent magnetizing current model, magnetic currents are only present around the permanent magnet's outer surface. With the Biot-Savart law, the magnetic induction intensity $d\mathbf{B}_l$ at an arbitrary point $p_i(p_x, p_y, p_z)$ generated by a surface current element can be expressed as
$$d\mathbf{B}_l = \frac{\mu_0 I}{4\pi}\,\frac{d\mathbf{l} \times \mathbf{P}_0}{|\mathbf{P}_0|^3} \tag{1}$$
where $\mu_0$ is the permeability of free space, $I$ denotes the current magnitude, $d\mathbf{l}$ indicates the current direction, and $\mathbf{P}_0$ is the vector from the current element to point $p_i$.
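Furthermore, the magnetic induction intensity $d\mathbf{B}_z$ at point $p_i$ can be calculated by a surface current $I_{abcda}$ with the current path $a \to b \to c \to d \to a$, i.e.,
$$d\mathbf{B}_z = \int_{ab} d\mathbf{B}_l + \int_{bc} d\mathbf{B}_l + \int_{cd} d\mathbf{B}_l + \int_{da} d\mathbf{B}_l \tag{2}$$
where $ab$, $bc$, $cd$, and $da$ are the segment paths of the current.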
Assuming that the radius of the cylinder magnet is $r$, the magnetic induction intensity at point $p_i$ can be estimated by a double integral, which can be calculated with the Simpson rule, a numerical integration method. The detailed numerical expression can be found in [8]. Thus, denoting by $\mathbf{B}_c(p_i, r)$ the magnetic induction intensity at point $p_i$ generated by the cylinder magnet with radius $r$, the magnetic induction intensity of the equivalent annular magnet at point $p_i$ follows from the superposition principle as the difference between the contributions of two cylindrical magnets with different radii
$$\mathbf{B}_a(p_i) = \mathbf{B}_c(p_i, r_1) - \mathbf{B}_c(p_i, r_2) \tag{5}$$
where $r_1$ and $r_2$ are the outer and inner radii of the annular magnet, respectively.
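The modeling steps above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a uniformly magnetized cylinder whose axis lies along $X_m$ with the magnetic moment along $Z_m$ (the axis direction is an assumption), so that each slice at height $z$ carries a rectangular current loop $a \to b \to c \to d \to a$. It uses a simple midpoint rule rather than the Simpson rule, and the function names `loop_field`, `cylinder_field`, and `annular_field` are hypothetical.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space (T*m/A)

def loop_field(corners, current, p, n_sub=20):
    """Biot-Savart sum over a closed polygonal current path (e.g., a->b->c->d->a)."""
    corners = np.asarray(corners, dtype=float)
    nxt = np.roll(corners, -1, axis=0)                 # end point of every side
    side = (nxt - corners)[:, None, :]                 # shape (4, 1, 3)
    t = (np.arange(n_sub) + 0.5) / n_sub               # element midpoints in (0, 1)
    mids = corners[:, None, :] + t[None, :, None] * side
    dl = np.repeat(side / n_sub, n_sub, axis=1)        # current elements dl
    p0 = np.asarray(p, dtype=float) - mids             # vector element -> field point
    dist = np.linalg.norm(p0, axis=-1, keepdims=True)
    db = MU0 * current / (4 * np.pi) * np.cross(dl, p0) / dist**3
    return db.sum(axis=(0, 1))

def cylinder_field(p, radius, length, magnetization, n_slices=200):
    """Field of a solid cylinder (axis along X_m, moment along Z_m), assumed geometry."""
    dz = 2 * radius / n_slices
    h = length / 2
    b = np.zeros(3)
    for k in range(n_slices):
        z = -radius + (k + 0.5) * dz                   # midpoint rule over the height
        w = np.sqrt(radius**2 - z**2)                  # half-width of the slice
        corners = [(-h, -w, z), (h, -w, z), (h, w, z), (-h, w, z)]
        b += loop_field(corners, magnetization * dz, p)  # slice current = M * dz
    return b

def annular_field(p, r_outer, r_inner, length, magnetization):
    """Superposition: outer cylinder minus inner cylinder."""
    return (cylinder_field(p, r_outer, length, magnetization)
            - cylinder_field(p, r_inner, length, magnetization))
```

Far from the magnet, the result should approach the magnetic dipole field with moment equal to the magnetization times the annular volume, which gives a convenient sanity check for such a sketch.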

B. Coordinate Transformation
A magnetometer array is usually adopted to perceive the magnetic fields generated by the annular magnet in space. As shown in Fig. 1, a system coordinate frame {s} is built at the center of the magnetometer array. The annular magnet's magnetic induction intensity in (5) is expressed in the frame {m} and must be transformed into the frame {s}. Assume that $v = [x, y, z, \alpha, \beta, \gamma]$ is the six-DoF pose of the annular magnet in the frame {s}, where $\mathbf{t}_{sm} = [x, y, z]^T$ is the translation vector, and $(\alpha, \beta, \gamma)$ are the Euler angles at which the magnet rotates sequentially about the $X_s$-, $Y_s$-, and $Z_s$-axes. Such a rotation sequence can be represented by the rotation matrix $\mathbf{R}_{sm}$.
Given the annular magnet's pose $v$, the magnetic induction intensity at the $i$th magnetometer in the frame {s} can be calculated by (8), which is the magnetic field model of the annular magnet. From this model, we can predict the magnetic induction intensity at any location in space.
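The coordinate transformation can be sketched as follows, under stated assumptions. The sensor position is first moved into the magnet frame {m}, the field is evaluated there, and the result is rotated back into {s}. The sketch takes the rotation matrix as an input rather than fixing a composition order for the three Euler rotations, and it uses a magnetic dipole along $Z_m$ only as a stand-in for the annular magnet model. The names `rot_x`, `dipole_field`, and `field_at_sensor` are hypothetical.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space (T*m/A)

def rot_x(angle):
    """Elementary rotation about the X axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def dipole_field(p_m, moment=1.0):
    """Stand-in field model in frame {m}: point dipole with moment along Z_m."""
    p_m = np.asarray(p_m, dtype=float)
    m = np.array([0.0, 0.0, moment])
    r = np.linalg.norm(p_m)
    return MU0 / (4 * np.pi) * (3 * p_m * (m @ p_m) / r**5 - m / r**3)

def field_at_sensor(r_sm, t_sm, sensor_pos_s, field_in_magnet_frame):
    """Predict the field at a sensor in {s} from a field model defined in {m}."""
    r_sm = np.asarray(r_sm, dtype=float)
    t_sm = np.asarray(t_sm, dtype=float)
    # Sensor position expressed in the magnet frame {m}.
    p_m = r_sm.T @ (np.asarray(sensor_pos_s, dtype=float) - t_sm)
    # Field evaluated in {m}, then rotated back into {s}.
    return r_sm @ field_in_magnet_frame(p_m)
```

For example, `field_at_sensor(rot_x(np.pi / 2), [0, 0, 0], [0, 0.1, 0], dipole_field)` evaluates the stand-in model for a magnet rotated by 90° about the $X_s$-axis; any other field model defined in {m} can be passed in the same way.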

A. Orientation Representation
The annular magnet's orientation can be described in various representations, including Euler angles, quaternion, and rotation matrix.
Generally, the original Euler angles are defined over fixed ranges, with γ covering the full cycle [−180°, 180°). However, the magnetic field distribution generated by the annular magnet is symmetrical about the $X_m$-$Z_m$ plane. Any $\gamma_i$ in the range (−180°, 0) will have a corresponding $\gamma_j$ in the range (0, 180°) for which the annular magnet produces the same magnetic field distribution. As illustrated in Fig. 2, $\gamma_i$ = −90° and $\gamma_j$ = 90° indicate two completely different orientations, while they cause the same sensor measurements. It should be noted that the ambiguity of orientation representation will lead to nonconvergence in pose regression.
Alternatively, the quaternion is a continuous and smooth representation of orientation, which is suitable for interpolation operations. However, the quaternion is ambiguous: a quaternion $q$ represents the same orientation as $-q$. The rotation matrix is a redundant parameter representation of orientation. It is difficult to impose constraints in the learning process to ensure the orthogonality of the output rotation matrix. In addition, as components in quaternions and rotation matrices are highly coupled, matching them with the symmetric properties of the magnetic field distribution is challenging.
The original Euler angles adopt three separate rotation angles to intuitively describe the orientation of a rigid body. We define γ ∈ [−90°, 90°) to match the symmetrical characteristics of the magnetic field distribution. However, the original Euler angles face a problem similar to that of the quaternion: periodicity leads to very different angular values for similar sensor data [25]. For example, α might jump from −180° to 180° even though the annular magnet only rotates smoothly. To avoid this problem, we map each Euler angle to a 2-D unit circle through a pair of sine and cosine functions, and we name it the Euler variant. The orientation of the annular magnet can be represented as
$$\mathbf{e}_\alpha = [\sin\alpha, \cos\alpha], \quad \mathbf{e}_\beta = [\sin\beta, \cos\beta], \quad \mathbf{e}_\gamma = [\sin 2\gamma, \cos 2\gamma].$$
Scaling γ to 2γ guarantees the numerical continuity of $\mathbf{e}_\gamma$ when the annular magnet rotates periodically. Consequently, the pose of the annular magnet can be expressed as
$$v = [x, y, z, \mathbf{e}_\alpha, \mathbf{e}_\beta, \mathbf{e}_\gamma].$$
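A minimal sketch of the Euler variant is given below, assuming angles in radians and a (sine, cosine) ordering within each pair; the function names are hypothetical. Decoding with `atan2` also works for unnormalized network outputs, and halving the decoded angle for γ returns a value within half a cycle.

```python
import math

def encode_euler_variant(alpha, beta, gamma):
    """Map each Euler angle onto the unit circle; gamma is scaled to 2*gamma."""
    return (math.sin(alpha), math.cos(alpha),
            math.sin(beta), math.cos(beta),
            math.sin(2 * gamma), math.cos(2 * gamma))

def decode_euler_variant(e):
    """Recover (alpha, beta, gamma); gamma lies within half a cycle."""
    sa, ca, sb, cb, sg, cg = e
    return (math.atan2(sa, ca), math.atan2(sb, cb), 0.5 * math.atan2(sg, cg))
```

With this encoding, γ and γ + 180° share the same representation, and α just below 180° and just above −180° map to neighboring points on the unit circle, so similar sensor data receive similar regression targets.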

B. Neural Network for Pose Estimation
A triaxis magnetometer measurement can be viewed as an image pixel with red, green, and blue values. Inspired by this, we employ the CNN as the fundamental network layer of the AMagPoseNet to capture the local spatial correlation between adjacent sensors. Fig. 3 shows the network architecture of the AMagPoseNet, which consists of two subnetworks: PoseNet and CaliNet. PoseNet learns to estimate the magnet pose from the prior mathematical model. PoseNet is similar to ResNet-18 [26] in that it adds residual connections to facilitate the backpropagation of the gradient. The primary changes from ResNet-18 are the reduction of the kernel size in the first convolutional layer, the reduction of the network depth, and the splitting into four FC output branches to accommodate the data dimensionality and the pose regression task. A smooth and injective regression loss is defined in Euclidean space to learn to predict the magnet's pose (position and orientation)
$$L(M) = L_p(M) + \beta L_o(M)$$
where $M$ means the input magnetometer data, $L_p(M)$ indicates the position regression loss, and $L_o(M)$ denotes the orientation regression loss. Since the position and orientation are expressed in different units, a scale factor $\beta$ is adopted to balance the two regression tasks, and the two losses are defined as follows:
$$L_p(M) = \|\hat{\mathbf{t}} - \mathbf{t}\|_\delta, \quad L_o(M) = \|\hat{\mathbf{e}} - \mathbf{e}\|_\delta$$
where $\mathbf{t}$ and $\mathbf{e}$ are the position and orientation components of the pose, $\|\cdot\|_\delta$ denotes the $\ell_\delta$-norm, and the symbols with a hat indicate PoseNet outputs.

There is a gap between the MMD and the RWD since the measurements from the magnetometer array are imperfect and mixed with sensor noise. A calibration procedure has been proposed for the optimization-based magnetic tracking method [13], [16]. However, the general calibration procedure is based on the inverse solution of the magnetic dipole model, which is unsuitable for this study using the numerical-integration-based magnetic field model.
Thus, we design a preceding network named CaliNet that is placed between the sensor array output and the PoseNet input to narrow the gap, and the $\ell_\delta$-norm is utilized to evaluate the calibration performance of CaliNet
$$L_c(\hat{M}) = \|C(\hat{M}) - M\|_\delta.$$
After training, CaliNet $C(\cdot)$ can correct the noisy input sensor data $\hat{M}$ to bring it closer to the desired output $M$.
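The two objectives can be sketched in a few lines. This is an illustrative NumPy version under assumptions, not the training code: the exact form of the losses and the value of the scale factor are not given here, so `beta=1.0` is a placeholder, and `pose_loss` and `calibration_loss` are hypothetical names.

```python
import numpy as np

def pose_loss(t_pred, t_true, e_pred, e_true, beta=1.0, delta=1):
    """Position loss plus beta-weighted orientation loss, both as delta-norms."""
    l_p = np.linalg.norm(np.asarray(t_pred, float) - np.asarray(t_true, float), ord=delta)
    l_o = np.linalg.norm(np.asarray(e_pred, float) - np.asarray(e_true, float), ord=delta)
    return l_p + beta * l_o

def calibration_loss(calibrated, m_model, delta=1):
    """Delta-norm between CaliNet output and the model-generated sensor data."""
    diff = np.asarray(calibrated, float) - np.asarray(m_model, float)
    return np.linalg.norm(diff.ravel(), ord=delta)
```

With `delta=1`, a single saturated reading contributes linearly rather than quadratically to the loss, which is the usual reason an $\ell_1$-type loss is less sensitive to outliers.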
To evaluate the efficiency of the AMagPoseNet, the number of parameters (NPs) and floating-point operations (FLOPs) are employed as metrics for the memory cost and the computational complexity, respectively. Neither the CNN layers nor the FC layers in the AMagPoseNet contain bias terms. Thus, NPs and FLOPs in a given layer can be calculated according to the following formula [27].
1) CNN layer: where $K$ is the kernel width (assumed to be symmetric), $H$ and $W$ are the height and width of the input feature map, respectively, and $C_{in}$ and $C_{out}$ represent the number of input and output channels, respectively.
2) BN layer: where $D_{in}$ is the input dimensionality and $D_{out}$ is the output dimensionality.

Table I shows the calculation results of different models, where ResNet-18, a common backbone network in computer vision, is used as a benchmark for comparison. Qasaimeh et al. [28] measured the performance of ResNet-18 on embedded platforms and showed that ResNet-18 could achieve 5.17 frames/s on an ARM Cortex A57 CPU and 145 frames/s on a Jetson TX2 GPU. Compared to ResNet-18, the AMagPoseNet has only 24% of its NPs and 0.98% of its computation (FLOPs). It should be noted that the calculation of FLOPs depends not only on the NPs but also on the input data size. In Table I, 3 × 224 × 224 is the standard input size for ResNet-18, while 3 × 5 × 5 is the data size measured by our magnetometer array. The comparison indicates that AMagPoseNet is a lightweight network with the potential to be deployed on embedded platforms.
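As an illustration of how such counts behave, the sketch below uses one common counting convention for bias-free layers, which may differ from the formulas in [27]: a multiply and an add are counted as two operations, and the convolution is assumed to preserve the $H \times W$ feature map size. The function names and the example layer sizes are hypothetical.

```python
def conv_params(k, c_in, c_out):
    """Weights of a bias-free K x K convolution."""
    return k * k * c_in * c_out

def conv_flops(k, c_in, c_out, h, w):
    """Multiply and add counted separately, output map assumed H x W."""
    return 2 * h * w * conv_params(k, c_in, c_out)

def fc_params(d_in, d_out):
    """Weights of a bias-free fully connected layer."""
    return d_in * d_out

def fc_flops(d_in, d_out):
    """Multiply and add counted separately."""
    return 2 * d_in * d_out

# Hypothetical first layer: 3 x 3 kernel, 3 -> 64 channels.
small = conv_flops(3, 3, 64, 5, 5)        # 5 x 5 magnetometer "image"
large = conv_flops(3, 3, 64, 224, 224)    # 224 x 224 vision input
```

The parameter count of the layer is the same in both cases, while the operation count scales with the feature map area, which is why FLOPs depend on the input data size as well as on the NPs.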

C. Network Training Process
To take advantage of prior knowledge and reduce the dependence on real-world samples, we train the AMagPoseNet in combination with the mathematical model detailed in Section II. Typically, both the $\ell_1$-norm and the $\ell_2$-norm can be employed in (11) and (12), but we employ the $\ell_1$-norm because it is more robust on datasets that contain saturated sensor data (outliers). The specific training process is summarized as follows.
1) Training PoseNet based on the prior mathematical model. The main task of PoseNet is to learn the inverse solution of the mathematical model in (8). A generated dataset $D_{model}(M_{model}, v_{model})$, sampled from the mathematical model over the working space, is provided to train PoseNet. Section IV-A presents an example illustrating how to obtain a generated dataset from the prior mathematical model.
2) Pretraining CaliNet with the prior mathematical model. A neural network trained on only a few samples might suffer from overfitting and poor generalization [22]. Hence, we first pretrain CaliNet with the generated dataset $D_{model}$. The generated sensor data $M_{model}$ is used as the desired output of CaliNet, and as the input data after adding Gaussian noise obeying $N(0, \sigma^2)$, where σ = 1 μT in this study.
3) Fine-tuning CaliNet by real-world sensor data and prior mathematical model. A real-world dataset $D_{real}(M_{real}, v_{real})$ is created by sampling from the actual magnetometer array. Here, $v_{real}$ is a subset of $v_{model}$. CaliNet is further fine-tuned based on $D_{real}$, where $M_{real}$ is used as the input data. A subset of $M_{model}$ corresponding to $v_{real}$ is utilized as the desired outputs, i.e., CaliNet is trained under a label-shared dual domain.
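Step 2 of the process above can be sketched as a small data-preparation helper. This is a minimal illustration assuming the sensor data are stored in microtesla as a NumPy array of shape (N, 3, 5, 5); the function name `make_pretrain_pairs` and the fixed seed are hypothetical.

```python
import numpy as np

def make_pretrain_pairs(m_model, sigma_ut=1.0, seed=0):
    """Build (input, target) pairs for CaliNet pretraining.

    The target is the clean model-generated data; the input is the same data
    with zero-mean Gaussian noise of standard deviation sigma_ut (in uT).
    """
    m_model = np.asarray(m_model, dtype=float)
    rng = np.random.default_rng(seed)
    noisy = m_model + rng.normal(0.0, sigma_ut, size=m_model.shape)
    return noisy, m_model
```

In step 3, the same target convention applies, except that the input is the measured $M_{real}$ and the target is the model-generated data for the same pose.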

IV. EXPERIMENTS
A. Experimental Setup
1) Hardware Platform: As shown in Fig. 4, a 5×5 magnetometer array (STMicroelectronics LIS3MDL) was adopted to perceive the magnetic induction intensities. A total of 25 magnetometers were soldered evenly on a printed circuit board at 40-mm intervals. The measurement range of the magnetometers is set to ±1200 μT. A microcontroller unit (STMicroelectronics STM32H743) collected the magnetometer data and then transmitted it to a personal computer running Ubuntu 20.04 with an AMD Ryzen 9 5950X CPU and an NVIDIA GeForce RTX 3090 GPU.
2) Datasets: A working volume of the system is defined as The generated dataset was obtained based on the prior mathematical model. The expected sampling points were evenly distributed in the working space, with a position interval of 10 mm and an angular interval of 20°. Each expected sampling point had a random offset added to it, where the position offset was in the range between (−10, 10) mm, and the orientation offset was in the range between (−20°, 20°). Finally, 293.76 (15 × 15 × 12 × 17 × 8 × 8) million samples were gathered into the generated dataset. At each height ($Z_s$-axis), we randomly selected 2% of the samples as the validation dataset, 2% as the test dataset, and the remaining 96% as the training dataset.
The real-world dataset was collected with the assistance of the calibration board, as shown in Fig. 4. The height above the magnetometer array varied from 33 to 153 mm with a 10-mm interval. At each height, we sampled 100 points, which were composed of 25 kinds of positions and four kinds of orientations, thus yielding a total of 1300 samples. To verify the generalizability of the AMagPoseNet in the case of a few samples, we extracted five from 25 kinds of positions and one from four kinds of orientations as the test dataset (40% of total samples), and the rest as the training dataset (60% of total samples). In this way, none of the positions in the test dataset are equal to the ones in the training dataset.
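The sample counts can be checked with a short enumeration. The sketch below follows one reading that is consistent with the reported 60/40 split: a sample belongs to the test dataset if its position is one of the five held-out positions or its orientation is the held-out orientation. The specific held-out indices are hypothetical placeholders.

```python
from itertools import product

heights_mm = list(range(33, 154, 10))   # 33, 43, ..., 153 mm
positions = range(25)                   # 25 kinds of positions per height
orientations = range(4)                 # four kinds of orientations

test_positions = {0, 1, 2, 3, 4}        # hypothetical held-out positions
test_orientations = {0}                 # hypothetical held-out orientation

samples = list(product(heights_mm, positions, orientations))
is_test = [p in test_positions or o in test_orientations for _, p, o in samples]

n_total = len(samples)
n_test = sum(is_test)
n_train = n_total - n_test
```

Under this reading, the training dataset contains 13 × 20 × 3 = 780 samples, which matches 1300 × 0.6.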
Here, an example is shown to illustrate how to yield the generated dataset from the prior mathematical model. The sensor position is based on the hardware configuration of the magnetometer array shown in Fig. 4. Since our magnetometer array is composed of 5×5 triaxial magnetometers, 75 magnetic induction intensity measurements will be acquired. To facilitate the training of the CNN, the 75 values are rearranged into three channels, each with a size of 5×5 pixel cells.
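The rearrangement into an image-like tensor can be sketched as follows. The ordering of the 75 values is an assumption: sensors are taken in row-major order over the 5×5 grid, with each sensor contributing its three axis readings consecutively. The function name `to_image` is hypothetical.

```python
import numpy as np

def to_image(readings):
    """Rearrange 75 readings into a (3, 5, 5) array: one channel per field axis."""
    arr = np.asarray(readings, dtype=float).reshape(5, 5, 3)  # (row, col, axis)
    return arr.transpose(2, 0, 1)                             # (axis, row, col)
```

The channel-first layout matches the 3 × 5 × 5 input size quoted for the network, so neighboring pixels in each channel correspond to neighboring sensors on the board.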
3) Training Details: The AMagPoseNet was implemented in PyTorch. The training process was divided into three steps, as detailed in Section III-C. We adopted the Adam optimizer to train our model on the two training datasets with a base learning rate of 1 × 10⁻⁴, reduced with the cosine annealing schedule over a total of 50 training epochs. The batch size was specified as 512 for the generated dataset and 8 for the real-world dataset. It took about 6 h in total to train the AMagPoseNet on the computer configured with an AMD 5950X CPU and an NVIDIA 3090 GPU. After pretraining on the generated training dataset, PoseNet achieves a localization accuracy of $E_p$ = 1.15 mm, $E_\alpha$ = 0.88°, $E_\beta$ = 0.51°, and $E_\gamma$ = 28.71° on the generated test dataset.

4) Evaluation Metrics: The position error $E_p$ and the orientation error $E_o$ are defined as where $(x_r, y_r, z_r, \alpha_r, \beta_r, \gamma_r)$ is the ground truth and $(x_e, y_e, z_e, \alpha_e, \beta_e, \gamma_e)$ is the estimated value. The annular magnet's three-DoF position and two-DoF orientation can also be predicted based on the magnetic dipole model and optimization algorithms. The orientation is represented by a unit direction vector $(m, n, p)$ of the magnetic moment. The relationship between $(m, n, p)$ and $(\alpha, \beta, \gamma)$ is as follows: Then, we have

B. Comparison of Orientation Representations
In this experiment, PoseNet was trained based on quaternion and Euler variant, respectively (for quaternion, the three FC layers with 256 inputs and two outputs in the last layer of PoseNet were modified to one FC layer with 256 inputs and four outputs). We generated 360 test samples when the magnet was placed at (0, 0, 50) mm and rotated around the $X_s$-axis from −180° to 180° with a 1° interval (excluding 180°). The ground truth and estimated values inferred from the PoseNet model are shown in Fig. 6. Fig. 6 shows that PoseNet based on the Euler variant could estimate values that fit well with the ground truth over the entire period, while the quaternion-based PoseNet performs poorly on periodic critical points. The reason is that similar sensor data on periodic critical points were assigned quite different quaternions in pose regression.

C. Dual-Domain Few-Shot Learning Behavior
Apart from the training scheme of fine-tuning CaliNet adopted in this study, another training scheme is to skip CaliNet and directly fine-tune PoseNet on the real-world dataset. To exhibit the advantage of dual-domain few-shot learning more clearly, in this comparison experiment, none of the positions in the test dataset is equal to the ones in the training dataset.
The loss curves of these two training schemes are shown in Fig. 7(a), which reveals that fine-tuning PoseNet results in significant overfitting over the training dataset since there is a large gap between the training loss and the test loss. In contrast, the gap is obviously smaller if we fine-tune CaliNet. The main reason is that PoseNet maps from high-dimensional measurements to a low-dimensional magnet pose, which is a complex task. In contrast, CaliNet maps from one measurement space to another equal-dimensional measurement space. The primary purpose of CaliNet is to eliminate measurement noises and measurement offsets caused by sensor position and orientation deviations. The experimental results indicate that CaliNet could be well trained on the real-world dataset containing only a few samples, while PoseNet is susceptible to overfitting.
To visualize the efficacy of the trained CaliNet, we performed dimensionality reduction analysis on data from three domains related to CaliNet: input data (real-world data), CaliNet output data, and desired output data (generated data from the mathematical model). The t-distributed stochastic neighbor embedding (t-SNE) of samples under different domains is shown in Fig. 7(b). There is a domain shift between the MMD and the RWD, while CaliNet could alleviate the problem, i.e., CaliNet migrates the real-world data distribution toward the generated data distribution.

D. Localization Accuracy at Different Heights
The AMagPoseNet is trained based on dual-domain few-shot learning. To show that the proposed magnetic tracking method is more effective, we compared our method with the LM-based method on the same experimental platform for the following reasons.
1) Five-DoF pose estimation based on the magnetic dipole model and the LM algorithm is a well-established method and has been adopted in most studies [5], [13], [14], [15], [16] due to its outstanding performance in accuracy and speed.
2) The performance of a magnetic tracking system is highly dependent on both hardware and algorithms. The performance of the same tracking algorithm is likely to be different on different hardware platforms. Influencing factors include the calibration procedure [21], permanent magnet shape [6], [8], number of magnetometers [15], sensor arrangement [16], sensor types [13], [15], and so on.

Depending on whether the real-world training dataset is used, our proposed method and the LM-based method are subdivided into the following four types:
1) PoseNet, which is only trained on the generated dataset;
2) AMagPoseNet, which includes CaliNet and is trained by dual-domain few-shot learning;
3) LM_ONLY, which regards the hardware design specifications as the position and orientation of the magnetometers, rather than performing a calibration procedure;
4) LM_CALI, which carries out a calibration procedure on the real-world training dataset for correcting the position and orientation of the magnetometers [13].

These four methods were tested on the real-world dataset. Since only α and β could be estimated by the LM-based method, we let $E_\gamma$ = 0 when calculating the orientation error using (13). Fig. 8 shows that the localization accuracy of the proposed method (1.87±1.14 mm, 1.89±0.81°) outperforms the LM-based method (2.07±1.19 mm, 2.70±1.75°) within the working space, with or without the participation of real-world datasets. The difference is more significant at lower heights because: 1) some magnetometers closer to the magnet are saturated at the heights of 33 and 43 mm, which is also the main reason for the deterioration of our proposed method's accuracy; 2) the LM-based method utilizes the magnetic dipole model to predict sensor measurements, which has a significant approximation error in the near field [18].
Comparing PoseNet and LM_ONLY shows that the PoseNet trained only with the assistance of the prior mathematical model could achieve comparable localization accuracy to the LM-based method. This result illustrates that PoseNet has the potential to be generalized to the real-world dataset after being sufficiently trained on the generated dataset.

E. Computational Latency and Robustness Testing
The computational latency of the algorithm is one of the key metrics in real-time applications. We compared AMagPoseNet against LM_CALI on ten test samples randomly chosen from the real-world test dataset, as shown in Table II. The ground truth of each sample, with an offset $v_{offset} = (\Delta x, \Delta y, \Delta z, \Delta m, \Delta n, \Delta p)$ added, was utilized as the initial guess of LM_CALI. Since LM_CALI is sensitive to the initial guess, three groups of offsets were set. AMagPoseNet and LM_CALI were implemented in Python and ran on the AMD 5950X CPU without any acceleration. The two algorithms were executed 100 times on each sample. Table II shows that the computational latency of AMagPoseNet is less than 32% of that of LM_CALI and hardly varies with different magnet poses. The LM algorithm is based on an initial guess and gradient descent for iterative optimization. It requires the computation of the second-order Taylor expansion term (Hessian matrix), and the convergence time varies when different initial guesses are assigned. When the initial guess is far from the ground truth (e.g., high-speed movement), it might diverge or fall into a local optimum. On the contrary, AMagPoseNet is a single feedforward neural network whose outcomes depend only on the input sensor data after the network has been trained, avoiding the risk of falling into local optima. Thus, the AMagPoseNet is superior to LM_CALI in terms of computational complexity and robustness (nondependence on initial guesses).

F. Evaluation of 6-D Pose Estimation
In the experiment, we found that the annular magnet rotating around the $Z_s$-axis produces much less variation in magnetic field distribution than rotating around the $X_s$-axis or $Y_s$-axis. This characteristic makes γ more susceptible to noise disturbances in pose regression, and the estimation error on γ is significantly larger than that on α and β, as a comparison of Figs. 8 and 9 shows. Therefore, we reduced the working volume along the $Z_s$-axis to the range between (30, 100) mm and investigated the accuracy of the AMagPoseNet for estimating the γ angle on the generated dataset and the real-world dataset.
As shown in Fig. 9, it is difficult for the AMagPoseNet to estimate γ accurately on the generated test dataset even without adding any noise. Moreover, the real-world test dataset contains various disturbances, such as external ambient magnetic interferences and internal measurement noises. We added different levels of Gaussian noise to the generated test dataset to mimic these disturbances. From the results, the estimation error is closer to that of the real-world test dataset after adding Gaussian noise with σ = 1 μT to the generated test dataset. Note again that the real-world dataset includes some outliers at 33- and 43-mm height due to sensor saturation, resulting in the deterioration of AMagPoseNet performance.

Fig. 9. Comparison of γ error between the generated test dataset and the real-world test dataset, where the length of the annular magnet is 20 mm. Different levels of Gaussian noise were added to the generated test dataset to evaluate its impact on the estimation performance.
Increasing the length of the annular magnet can provide more discriminative magnetic features for γ estimation. We produced different generated datasets for various lengths of the annular magnet, and the test results are shown in Fig. 10. Fig. 10 shows that increasing the length of the annular magnet could significantly improve the estimation accuracy on γ. When the length is increased to 40 mm or above, the AMagPoseNet achieves an estimation accuracy better than 4.66±3.06°, verifying that the AMagPoseNet can estimate the six-DoF pose when discriminative magnetic field features are provided.

G. In Vivo Testing
We designed a WCE with an outer layer of the annular magnet. A polytetrafluoroethylene tube was passed through the working cavity of the flexible endoscope (Smart GS-60DQ, HUACO, Beijing, China), where one end of the tube was tightly fixed to the WCE inside the pig's stomach, and a surgeon manipulated the other end. In the experiment, the surgeon used the tube to move the WCE by a displacement of 30 mm. The outputs of AMagPoseNet were recorded in real time during the entire movement. All the procedures were conducted in accordance with the Guiding Principles in the Care and Use of Animals and were approved by the Animal Ethics Committee of Qilu Hospital of Shandong University (Dwll-2021-021).
It is challenging to obtain the ground truth of the WCE's pose in vivo. The trajectory length L_t and the Euclidean distance L_d were used to evaluate the localization performance of the AMagPoseNet indirectly. Fig. 11 shows that L_t and L_d are around 30 mm, which verifies the feasibility of the AMagPoseNet for in vivo application to some extent.
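As a minimal sketch of how these two indirect metrics can be computed from a recorded position sequence, the following assumes that L_t is the sum of the distances between consecutive estimated positions and that L_d is the straight-line distance between the first and last estimated positions. The function name `trajectory_metrics` and the sample trajectory are hypothetical.

```python
import numpy as np

def trajectory_metrics(positions):
    """Compute (L_t, L_d) for a sequence of estimated 3-D positions in mm.

    L_t : accumulated path length over consecutive position estimates.
    L_d : Euclidean distance between the first and last estimates
          (an assumed interpretation of the metric).
    """
    p = np.asarray(positions, dtype=float)
    steps = np.diff(p, axis=0)                       # displacement per time step
    l_t = float(np.linalg.norm(steps, axis=1).sum())
    l_d = float(np.linalg.norm(p[-1] - p[0]))
    return l_t, l_d

# Hypothetical usage: an ideal straight 30-mm pull along the x axis.
x = np.linspace(0.0, 30.0, 31)
track = np.stack([x, np.zeros_like(x), np.zeros_like(x)], axis=1)
l_t, l_d = trajectory_metrics(track)
```

For a straight, noise-free pull, both values equal the commanded displacement. Estimation jitter inflates L_t more than L_d, so agreement of both with the known displacement is a stricter check than either alone.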

H. Discussion
With the help of the prior mathematical model, the AMagPoseNet outperforms traditional optimization-based methods, while only a few real-world samples need to be collected manually. Compared to the use of automated data acquisition equipment [19], the proposed approach could rapidly generate a comprehensive training dataset in the working space without potential electromagnetic interference, equipment freedom constraints, and hardware and development costs. In addition, the prior mathematical model of the annular magnet is derived from the basic Biot-Savart law. The modeling method is also suitable for other shapes of magnets, such as cuboid [7] and cylindrical [13] magnets. Thus, the AMagPoseNet is generalizable and could be applied to other shapes of magnets.
Magnetic field symmetry is a natural property of regular-shaped magnets, and our improved orientation representation (Euler variant) has the following benefits: 1) improving the prediction accuracy of AMagPoseNet because the symmetric magnetic field distribution and the periodicity of the sensor measurements are well described, as illustrated in Section IV-B; 2) decoupling the γ angle from the orientation. Comparing Figs. 8 and 9, α and β could be accurately estimated even though there is a lack of discriminative magnetic field features for γ.
Providing discriminative magnetic field features is a prerequisite for AMagPoseNet to estimate γ accurately. Fig. 12 presents the magnetic field distributions for various shapes of magnets simulated by the finite-element analysis software COMSOL Multiphysics. When the magnet is rotated about the magnetic moment direction (N-S axis), the magnetic field distribution generated around the cylindrical magnet and spherical magnet remains unchanged. Thus, the γ angle cannot be estimated for these two magnets. In contrast, the cuboid and annular magnets could produce discriminative magnetic field features. The more discriminative the features of the magnetic field are, the more accurate the estimated γ angle will be. To improve the discrimination of the magnetic induction intensity measured by the sensor array, in addition to increasing the length of the annular magnet, we can also: 1) optimize the magnetometer layout, e.g., adopting a 3-D spatial layout [15], [16] instead of a 2-D planar layout; 2) combine two magnets with different magnetic moment directions [9], [14].
One deficiency in the six-DoF pose estimation is that γ is limited to the half-cycle range [−90°, 90°) rather than the full-cycle range [−180°, 180°). The deficiency arises from the symmetric distribution of the magnet's magnetic field. Since the motion of the rigid body is continuous, we can extend the γ angle to the full-cycle range based on the historical pose information.
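One possible way to realize this extension is sketched below. It is an illustration under stated assumptions, not the procedure used in this work: the function name `extend_gamma` is hypothetical, the first sample is assumed to lie in the correct half cycle, and consecutive samples are assumed to differ by less than 90° so that the nearest 180°-shifted candidate is the continuous one.

```python
import numpy as np

def extend_gamma(gamma_half_deg):
    """Extend half-cycle gamma estimates in [-90, 90) to [-180, 180).

    Each new estimate is ambiguous up to multiples of 180 degrees. The
    candidate closest to the previous extended value is selected, which
    relies on the continuity of rigid-body motion.
    """
    g = np.asarray(gamma_half_deg, dtype=float)
    if g.size == 0:
        return g
    out = np.empty_like(g)
    out[0] = g[0]  # assumption: the initial half cycle is known to be correct
    for k in range(1, g.size):
        d = g[k] - out[k - 1]
        d -= 180.0 * np.round(d / 180.0)  # smallest step modulo 180 degrees
        out[k] = out[k - 1] + d
    # Wrap the accumulated angle back into the full-cycle range.
    return (out + 180.0) % 360.0 - 180.0

# Hypothetical usage: a slow continuous rotation observed only modulo 180 degrees.
true_gamma = np.arange(0.0, 360.0, 10.0)
observed = (true_gamma + 90.0) % 180.0 - 90.0
recovered = extend_gamma(observed)
```

If the first sample falls in the wrong half cycle, the whole recovered sequence is offset by 180°, so an independent initialization would still be needed in practice.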

V. CONCLUSION
Learning-based magnetic tracking strongly depends on the size and resolution of the training dataset, while collecting a comprehensive and sufficient dataset is labor intensive and time consuming. This article proposed an end-to-end neural network model (AMagPoseNet) to solve the underlying inverse magnetostatic problem by dual-domain few-shot learning from the prior mathematical model. Compared with the traditional optimization-based method, AMagPoseNet has the following advantages: 1) higher localization accuracy (1.87±1.14 mm, 1.89±0.81°), especially in the near field; 2) enhanced robustness, as the AMagPoseNet is just a single feedforward neural network that does not rely on initial guesses and avoids the risk of falling into local optima; 3) lower computational latency (2.08±0.02 ms) since the magnet pose is directly regressed from a single feedforward network rather than iterative optimization; 4) real-time estimation of six-DoF pose if discriminative magnetic field features are provided. In terms of practical applications, the suggested method can track WCEs, magnetic surgical instruments, and the tips of conventional flexible endoscopes. Thanks to it, physicians will be able to know the exact pose of medical instruments inside the human body.
Saturated sensor measurements deteriorate the tracking system's performance at lower heights. Future research will reconstruct normal data from saturated sensor measurements and thus enhance the localization accuracy of magnetic tracking in the near field.