SEAMLESS REAL-TIME LANE-LEVEL VEHICULAR NAVIGATION IN GNSS-CHALLENGING ENVIRONMENTS USING AN RTK GNSS/IMU/VINS INTEGRATION SCHEME

ABSTRACT: Outdoor positioning requires a reliable solution that can work in environments where satellite signals are often blocked or degraded. Global Navigation Satellite System (GNSS) is a common choice, but it may not provide accurate results for land vehicles. To address this challenge, this research proposes a multi-sensor integrated system for vehicle navigation that combines GNSS with other sensors. The system uses an Extended Kalman Filter (EKF) to fuse the data from different sources and improve navigation performance. The algorithm aims to provide seamless navigation in urban environments as well as various indoor environments with INS/GNSS/VIO-aided integrated solutions. The experimental vehicle is equipped with a tactical-grade inertial measurement unit (IMU) as the test system and a self-designed and assembled visual platform, which includes a camera with a time-synchronization protocol and a low-cost IMU. Both indoor experimental fields and highly challenging outdoor urban scenarios were tested to verify the developed algorithm. To evaluate the performance of the proposed real-time navigation system, we use a high-accuracy navigation-grade system as a reference, which provides a stable and reliable trajectory. The results indicate that the GNSS RTK solution with the VIO-aided integration scheme reduced the RMS errors in a long outage (450 s, 1812 m) by 87.4%.


INTRODUCTION
Global Navigation Satellite System (GNSS) is the user-preferred solution for outdoor positioning. However, in land vehicle application environments, the satellite signal is often obscured or denied, which reduces the accuracy of the satellite positioning solution. Therefore, it is imperative to develop automotive solutions that can operate in hostile satellite-signal environments. The Inertial Navigation System (INS)/GNSS integration solution is widely used in positioning and navigation applications for land vehicles. This research proposes a multi-sensor integrated system for vehicle navigation adopting an Extended Kalman Filter (EKF). The proposed algorithm aims to provide seamless navigation in urban environments as well as various indoor environments with INS/GNSS/VIO-aided integrated solutions. This research focuses on the integration of visual-inertial odometry (VIO) with an INS/GNSS system in Real-Time Kinematic (RTK) mode.
The major contributions of adopting VIO into INS/GNSS are: 1. VIO provides continuous measurements and uses image and IMU constraints to reduce the error accumulation of the main receiver IMU. 2. VIO brings higher performance to an IMU/GNSS system that uses a MEMS-grade IMU. 3. Compared to LiDAR, VIO can reduce the overall cost of future autonomous-driving payloads. 4. The integration scheme becomes more suitable for navigation in unknown or GNSS-denied environments. This research contributes to the performance enhancement of traditional positioning algorithms aided by cameras and inertial sensors and is dedicated to achieving accuracy at the level of "which lane".
* Corresponding author

RELATED WORKS
Visual Inertial Odometry (VIO), also known as Visual Inertial Navigation System (VINS), developed from Visual Odometry (VO) (Zhang & Ye, 2020). The problem of estimating displacement using visual information alone was first addressed in the 1980s, when the term VO was proposed (Nistér et al., n.d.). VIO makes VO more robust by adding an IMU that is complementary to the camera. Compared with GPS, VIO can still provide pose estimation indoors and does not suffer from the urban-canyon effect, in which nearby tall buildings degrade accuracy. Compared with LiDAR, the sensors required for VIO are relatively light and inexpensive, and can provide pose estimation at a higher frequency. A major limitation of the monocular visual odometry method is the inability to determine the scale factor. This problem is inherent to the monocular approach and cannot be solved without additional information. The image frames processed by monocular VO easily lose the depth information of the environment. Although the process recovers three-dimensional structure, it does not yield the actual physical scale of the recovered depth. Benchmarks for scale recovery must therefore be provided by other external sources, such as GNSS. The accuracy of triangulation based on visual features depends on the frame-to-frame displacement. If the camera motion is close to pure rotation, the baseline between frames becomes too small, the triangulation deteriorates, and feature-point tracking fails. Comparing the advantages and disadvantages of the monocular camera and the IMU shows that the fusion of camera and IMU has good complementary characteristics (Aqel et al., 2016). The steps for fusing the IMU and VO can be described briefly as follows. First, by aligning the pose sequence estimated by the IMU with the pose sequence estimated by the camera, we can estimate the actual scale of the camera/IMU platform trajectory.
Second, the IMU can predict the pose of the next image frame and the feature-point locations within it, which enhances the feature-tracking algorithm's speed and its robustness against fast rotation. Finally, using the current gravity vector from the IMU accelerometer, the estimated position can be transformed into the navigation coordinate system required for actual navigation. The VIO process can be decomposed into several parts, including interior and exterior orientation parameter calibration, feature extraction, data preprocessing, initialization and alignment, and nonlinear optimization, which are explained in the following sections. Due to slight errors in the manufacturing of the camera lens, as well as differences in focal length and lens shape, the captured image differs from the real world; this difference is called lens distortion, which includes radial distortion and tangential distortion. Using camera calibration technology, the corresponding interior orientation parameters can be accurately determined. The original, deformed image can then be corrected so that the relationship between an image point and the corresponding real-world object point satisfies the collinearity condition. In this research, the camera of the visual platform is calibrated with the Kalibr toolbox (Qin & Shen, 2018). The following equations represent the radial distortion and tangential distortion of a distorted point, respectively.
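The radial and tangential terms referred to above follow the standard Brown-Conrady distortion model; the sketch below illustrates it with the conventional coefficient names k1, k2 (radial) and p1, p2 (tangential), which are assumptions for illustration rather than values from this paper:

```python
def distort(xn, yn, k1, k2, p1, p2):
    """Apply radial (k1, k2) and tangential (p1, p2) lens distortion
    to a point (xn, yn) in normalized image coordinates."""
    r2 = xn * xn + yn * yn                     # squared radius from the optical axis
    radial = 1.0 + k1 * r2 + k2 * r2 * r2      # radial distortion factor
    xd = xn * radial + 2.0 * p1 * xn * yn + p2 * (r2 + 2.0 * xn * xn)
    yd = yn * radial + p1 * (r2 + 2.0 * yn * yn) + 2.0 * p2 * xn * yn
    return xd, yd
```

With all four coefficients at zero the mapping reduces to the identity, which is a quick sanity check on a calibration result.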
As with the camera, IMU calibration is also important. IMU calibration is performed using the Allan variance technique to characterize the random measurement and process noise of the accelerometer and gyroscope. This method requires collecting static data over a long duration to obtain reliable results. In this study, more than 10 hours of static data were collected, and the imu_utils toolbox was then used to estimate the random walk and bias instability. These coefficients are essential for modelling the IMU errors and improving the accuracy of the sensor fusion algorithm. After the camera and IMU are calibrated individually, the Kalibr toolbox is again used to calibrate the visual platform designed in this research. The calibrated camera interior orientation parameters and the random walk and bias instability of the IMU are taken into account, and the image and IMU data are input into the Kalibr toolbox to estimate the exterior orientation parameters (Qin & Shen, 2018), i.e., the rotation and translation relationships between the camera and the IMU. One of the key steps in visual odometry (VO) and visual-inertial odometry (VIO) is feature extraction, which builds the correspondence between images in a sequence. A common method for feature extraction is corner detection, which identifies salient points across different images. However, storing the features as gray-value matrices is not robust enough for complex environments. To address this, VINS-Mono combines optical-flow feature tracking, pre-integration of IMU measurements, and 4-DoF pose graph optimization. The overall VINS-Mono architecture can be divided into the following three parts: 1. Preprocessing, which uses the optical flow method to track the feature points of the image sequence while pre-integrating the IMU measurement data. 2. Parameter initialization, which compares the estimated pose of the camera with the pre-integrated pose of the IMU and completes the initialization of the velocity vector, the gravity vector, and the scale factor, as well as the calibration of the gyro bias. 3. Bundle-adjustment (BA) optimization, based on a sliding-window algorithm that marginalizes distant keyframes to ensure the real-time performance of the VIO estimation.
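The Allan-variance characterization used for the IMU calibration above can be sketched as a simple non-overlapping estimator (the function name and sampling parameters are illustrative, not taken from the paper):

```python
import numpy as np

def allan_deviation(samples, fs, taus):
    """Non-overlapping Allan deviation of a long static sensor record.
    samples : 1-D gyro or accelerometer readings, fs : sample rate (Hz),
    taus    : list of cluster (averaging) times in seconds."""
    out = []
    for tau in taus:
        m = int(tau * fs)                          # samples per cluster
        n = len(samples) // m                      # number of clusters
        means = samples[: n * m].reshape(n, m).mean(axis=1)
        avar = 0.5 * np.mean(np.diff(means) ** 2)  # Allan variance
        out.append(np.sqrt(avar))
    return np.array(out)
```

On the log-log Allan-deviation plot, the slope of -1/2 identifies the random walk and the flat region identifies the bias instability, which is how tools such as imu_utils report these coefficients.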
VI-DSO is an optimization-based method for estimating camera pose and sparse scene geometry by minimizing photometric and IMU measurement errors (von Stumberg et al., 2018). It can track any pixel with a sufficiently large intensity gradient, but it does not maintain a global map, which limits accuracy. Finally, ORB-SLAM3 is an optimization-based, tightly coupled VIO method that can navigate for a long term under weak visual information with high robustness (Campos et al., 2021). However, like ORB-SLAM2, it can be time- and resource-consuming to compute. In this study, the review of state-of-the-art keyframe-based and filter-based VO confirms that, without the intervention of an IMU, the scale issue prevents long-term, real-time stable operation. Further, the filtering-based and optimization-based VIO frameworks are reviewed. VINS-Mono performs excellently and is stable in real-time applications, while outperforming other VIO frameworks in accuracy and robustness. For these reasons, the VIO algorithm used in this research is designed based on VINS-Mono, as shown in Figure 1. This visual-inertial odometry starts with measurement preprocessing, where features are extracted for tracking and pre-integration of the IMU measurements between two consecutive frames is performed. The initialization process provides all necessary values, including attitude, velocity, gravity vector, gyroscope bias, and 3D feature positions, for guiding the subsequent nonlinear optimization-based VIO. The scale of the visual odometry can be recovered by aligning the IMU measurements with the vision-only structure from motion (SfM) (Schönberger & Frahm, n.d.). Finally, the pose graph optimization module accepts the geometrically validated sliding-window relocalization results and performs global optimization to eliminate drift and obtain the state estimation solution.

METHODOLOGY
In this study, we analysed the performance of traditional approaches, the proposed INS/GNSS/Camera integration, and vision navigation using the proposed visual platform sensors.

Coordinate Frames
We now define the notations and frame definitions used throughout this paper. Figure 2 depicts the coordinate frames used in this study. The navigation frame (N) is a locally defined frame; in this study, the north-east-down (NED) frame is chosen. Its origin is located at the center of the platform, and its x-axis and y-axis are aligned with the Earth's north and east directions, respectively. The z-axis is orthogonal to the reference ellipsoid and points downward toward the center of the Earth.
In the vehicle coordinate system (V), the x-axis, y-axis, and z-axis point in the forward, rightward, and downward directions of the vehicle, respectively. We assumed that the transformation matrix between the V frame and the IMU sensor frame (B) is an identity matrix and that the effect of the lever-arm between the IMU sensor and the vehicle center is negligible. The transformation between two frames is denoted by a quaternion. Finally, in the camera coordinate system (C), the x-axis, y-axis, and z-axis represent the left-to-right axis of the image, the top-to-bottom axis of the image, and the optical axis of the camera, respectively. To estimate the 3D coordinates of the features in each image, a transformation that contains both extrinsic and intrinsic parameters is required.
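The transformation chain mentioned above can be sketched as follows: a 3-D point is first mapped into the C frame by the extrinsic parameters and then projected to pixel coordinates by the intrinsic matrix (variable names are illustrative):

```python
import numpy as np

def project(point, R_cw, t_cw, K):
    """Project a 3-D point (world/navigation frame) to pixel coordinates.
    R_cw, t_cw : extrinsics (world to camera), K : 3x3 intrinsic matrix."""
    p_c = R_cw @ point + t_cw      # extrinsic transform into the camera frame
    uvw = K @ p_c                  # intrinsic projection
    return uvw[:2] / uvw[2]        # perspective division -> (u, v)
```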

INS/GNSS/Camera (VINS) Integration Scheme
In this section, the measurement content and fusion strategy provided by the VIO are discussed in detail, and a checking mechanism is designed to ensure that all measurements provided by the auxiliary sensors meet predetermined criteria before entering the fusion algorithm. In addition, the proposed model for the VINS velocity update is developed to address some important issues in traditional positioning. The concept of the proposed INS/GNSS/Camera integration scheme is described below.

Visual Inertial Odometry Integrated Algorithm
The VIO applied in this research is VINS-Mono, whose full state vector in the sliding window, following the VINS-Mono formulation, is:

X = [x_0, x_1, ..., x_n, x^b_c, λ_0, λ_1, ..., λ_m]
x_k = [p^w_bk, v^w_bk, q^w_bk, b_a, b_g], k in [0, n]
x^b_c = [p^b_c, q^b_c]

where
x_k = the IMU state (position, velocity, orientation, and accelerometer/gyroscope biases) at the time the kth image is captured;
x^b_c = the translation and rotation between the camera frame and the IMU;
λ_l = the inverse distance of the lth feature from its first observation;
m = the total number of features in the sliding window.
This equation also shows that, to obtain a good estimate of this state, the premise is a joint calibration (images and IMU) that yields good interior orientation parameters (IOPs) and exterior orientation parameters (EOPs), followed by a good initialization through specific movement. Because the features of most outdoor scenarios and indoor parking-lot scenarios are repeatable and abundant, the extraction and tracking procedure of the VINS algorithm is not difficult. This not only provides a robust position but also limits the drift caused by the IMU. In this algorithm, estimator initialization is carried out first: the scale of the visual odometry can be recovered by aligning the IMU measurements with the vision-only structure from motion. Mathematically, given the extrinsic parameters (p^b_c, q^b_c) between the camera and the IMU, the poses are translated from the camera frame to the IMU sensor frame by:

p^c0_bk = s * p^c0_ck - R^c0_bk * p^b_c

where
s = the scale obtained by visual-inertial alignment;
c0 = the first camera frame, used as the reference frame for SfM;
g^c0 = the derived refined gravity vector.
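The camera-to-IMU translation above can be applied to a whole SfM trajectory as in the sketch below (array shapes and the function name are illustrative; s, R^c0_bk, and p^b_c follow the alignment notation used in the text):

```python
import numpy as np

def camera_to_imu_positions(p_c0_ck, R_c0_bk, s, p_bc):
    """Translate up-to-scale SfM camera positions to IMU (body) positions:
    p^c0_bk = s * p^c0_ck - R^c0_bk * p^b_c.
    p_c0_ck : (N, 3) camera positions in the c0 frame,
    R_c0_bk : (N, 3, 3) body-to-c0 rotations, s : metric scale,
    p_bc    : camera lever-arm expressed in the body frame."""
    return s * p_c0_ck - np.einsum('nij,j->ni', R_c0_bk, p_bc)
```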
In this research, the heading angle is used to align the local-level frame with the n-frame. In the state estimation process of this algorithm, the IMU is used to accurately predict the pose of the image frame and the positions of the feature points in successive frames. It is worth mentioning that this method improves the matching speed of the feature-tracking algorithm and its robustness against fast camera rotation, supporting real-time applications. In this visual-inertial odometry, the roll and pitch angles are fully observable, and the cumulative drift affects only four degrees of freedom (x, y, z, and heading angle), so the pose graph is optimized over these four degrees of freedom only. Finally, the corresponding velocity is obtained and further utilized in the proposed EKF using Equation (6). Since in-vehicle navigation applications usually do not match the premise of loop-closure scenarios, the loop-closure thread is not used in this research.

z = H δx + η  (6)

where
z = the residual vector of velocity between the current solution and the VINS solution;
H = the design matrix derived from the linearization process, connecting the navigation states and the measurement values;
δx = the VINS position and velocity error in the n-frame;
η = the measurement noise.
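The loosely coupled EKF update built on this velocity residual can be sketched as a textbook Kalman update (not the authors' exact implementation; matrix names follow Equation (6)):

```python
import numpy as np

def ekf_velocity_update(dx, P, v_ins, v_vins, H, R):
    """EKF update using the velocity residual between the INS solution
    and the VINS-derived velocity. dx : error state, P : its covariance,
    H : design matrix, R : measurement noise covariance."""
    z = v_ins - v_vins                     # innovation (velocity residual)
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    dx = dx + K @ (z - H @ dx)             # corrected error state
    P = (np.eye(len(dx)) - K @ H) @ P      # updated covariance
    return dx, P
```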
On the other hand, for the VIO-aided application of this research, the measured coordinate vector and rotation matrix are transformed from the visual platform to the main receiver (PwrPak7D-E2); that is, the coordinate vector and rotation matrix from the c-frame to the b-frame (lever-arm and boresight). The calculated aiding data are then converted to the n-frame to integrate them into the navigation application of the system.

GNSS Measurement Condition Assessment
The VIO innovation sequence can indicate the quality of the GNSS signal. By setting a threshold on the VIO innovation sequence, the EKF can reject GNSS measurements affected by multipath or non-line-of-sight (NLOS) reception and update only with reliable measurements. On the other hand, if the velocity derived from the VIO is authorized, then the difference between the GNSS velocity and the VIO-derived velocity (Δv) must also be smaller than the velocity threshold, which means that the GNSS position is consistent with the VIO position and can be trusted. Conversely, a large difference indicates possible error sources such as multipath and NLOS reception, meaning that the GNSS position is affected and should be rejected. The condition assessment of the design is shown in Equation (8).
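The velocity-consistency check described above amounts to a simple gate, sketched below (the threshold value is application-specific and is not given in the text):

```python
import numpy as np

def accept_gnss(v_gnss, v_vio, vel_threshold):
    """Accept a GNSS measurement only when its velocity agrees with the
    VIO-derived velocity; a large discrepancy suggests multipath/NLOS."""
    dv = np.linalg.norm(np.asarray(v_gnss) - np.asarray(v_vio))
    return bool(dv < vel_threshold)
```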

VIO Scale Coefficient Estimation
As described before, the visual-inertial odometry (VIO) system combines the camera and the inertial measurement unit. This scale coefficient estimation method can be applied in real-time applications. The scale estimation mechanism designed in this research is shown in Figure 3, and its flowchart can be explained as follows: the INS/GNSS speed is compared with a speed threshold, and the duration above that threshold must exceed a minimum time. If these conditions are met, the scale coefficients calculated from the previous stable INS/GNSS solutions are used to update the scaling adjustment process.
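The gating logic of Figure 3 can be sketched as below; the speed threshold, the minimum duration, and the distance-ratio form of the scale are assumptions made for illustration, since the exact symbols are not legible in the source:

```python
def update_scale(speeds, dt, speed_threshold, min_duration,
                 dist_insgnss, dist_vio, current_scale):
    """Refresh the VIO scale only after the INS/GNSS speed has stayed
    above speed_threshold for at least min_duration seconds; the new
    scale is the ratio of INS/GNSS to VIO distance over that window."""
    above = 0.0
    for v in speeds:
        above = above + dt if v > speed_threshold else 0.0
    if above >= min_duration and dist_vio > 0.0:
        return dist_insgnss / dist_vio     # refreshed scale coefficient
    return current_scale                   # keep the previous scale
```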

VIO Velocity Update
The proposed fusion algorithm employs measurements from the VIO, such as velocity measurements. These measurements, whose drift behavior differs from that of the INS, reduce the drift issue of the INS-only case and enhance the detection and rejection of contaminated measurements. The velocity update is calculated from the difference between two frame positions and the time interval. The VIO-derived velocity can be written as:

v^n_k = (p^n_k - p^n_(k-1)) / Δt

where
v^n_k = the velocity in the navigation frame;
p^n_k = the VIO position at the kth frame.

Assuming that the velocity measurements are independent and uncorrelated, the covariance matrix can be defined as a diagonal matrix:

R_v = diag(σ²_vN, σ²_vE, σ²_vD)

where
R_v = the VIO covariance matrix;
σ²_vN, σ²_vE, σ²_vD = the measurement covariances of the velocity components.
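The two relations above amount to a finite difference of successive VIO positions and a diagonal noise model, e.g.:

```python
import numpy as np

def vio_velocity(p_k, p_km1, dt):
    """VIO-derived velocity from two successive frame positions."""
    return (np.asarray(p_k) - np.asarray(p_km1)) / dt

def velocity_covariance(sig_n, sig_e, sig_d):
    """Diagonal covariance under the independence assumption (NED axes)."""
    return np.diag([sig_n ** 2, sig_e ** 2, sig_d ** 2])
```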

The proposed INS/GNSS/Camera Integration Scheme
The proposed integration scheme is shown in Figure 4. Navigation states from the GNSS and INS are fused through an Extended Kalman Filter (EKF), while the high-rate, continuous INS output triggers the prediction equations at each epoch. The camera/IMU source measurements used to update the EKF are described separately. The inputs are the sensor measurements and motion constraints; the output is the navigation state, including the position, velocity, and attitude of the filtered solution. In this scheme, IMU measurements are processed using INS mechanization to provide a navigation solution (position, velocity, and attitude) in the navigation frame. GNSS provides the absolute position as the primary survey update. For the camera, the VINS-Mono method was chosen to process the data and provide measurement updates in this research. Its characteristics include optical-flow tracking of image features, IMU data pre-integration, vision-only SfM, sliding-window-based nonlinear optimization to achieve tight coupling, and 4-DoF pose graph optimization; it provides the velocity measurements as a minor update.

Figure 4 Flowchart of INS/GNSS/Camera fusion algorithm
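One epoch of the fusion scheme in Figure 4 can be sketched as the scheduler below; the `ekf` object and its method names are hypothetical placeholders for the prediction and update steps described above:

```python
def fuse_epoch(ekf, imu_batch, gnss_fix, vio_velocity):
    """INS mechanization drives the high-rate prediction; an accepted GNSS
    position is the primary update and the VIO velocity the minor update."""
    for imu in imu_batch:                  # prediction at the IMU rate
        ekf.predict(imu)
    if gnss_fix is not None:               # primary (absolute position) update
        ekf.update_position(gnss_fix)
    if vio_velocity is not None:           # minor (velocity) update
        ekf.update_velocity(vio_velocity)
    return ekf.state()
```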
In general, velocity is one of the important states for land-vehicle applications. Velocity measurements can provide information for estimating the IMU error states (such as biases) in the EKF, and the state of the vehicle is easier to identify from its velocity. For example, if the velocities in all three directions are below a certain threshold, the vehicle can be considered static, allowing the application of motion constraints such as the zero-velocity update (ZUPT) or zero integrated heading rate (ZIHR) in the EKF. This information also allows users to identify or check for outliers in other measurements. This velocity update assists the overall navigation solution, especially in GNSS-hostile environments.
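The static check that enables ZUPT/ZIHR can be sketched as follows (the 0.05 m/s default is illustrative only):

```python
import numpy as np

def is_static(velocity, threshold=0.05):
    """Flag the vehicle as static when all three velocity components are
    below the threshold, enabling ZUPT/ZIHR constraints in the EKF."""
    return bool(np.all(np.abs(np.asarray(velocity)) < threshold))
```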

EXPERIMENT
The experiment setups are shown in Figure 5. The test system data were obtained using a SPAN (Synchronized Position Attitude Navigation) system from NovAtel, model PwrPak7D-E2, which consists of a MEMS-based tactical-grade IMU, an Epson G370, and an OEM7 GNSS receiver. The visual platform is composed of a low-cost MEMS-based consumer-grade IMU and an industrial camera. The reference system is an iMAR iNAV-RQH-10018, which provides high-precision, reliable IMU measurements at data rates up to 300 Hz. The test system and the reference system are secured with aluminium extrusions and hardwood plywood boards to fix the lever arm. Finally, the reference trajectories generated by commercial software (NovAtel Inertial Explorer®), which uses double-differenced, smoothed, tightly coupled carrier-phase measurements for INS/GNSS integrated processing, are regarded as the true values. To ensure that the benchmarks for each test experiment are the same, an experimental platform was designed with FreeCAD software for the Basler acA1300-75gc camera and the OpenRTK330 IMU. The relative relationship between the mounting-hole dimensions and the two sensors was implemented in this software. Considering the tolerance and resolution of the 3D printer, the 3D-printed platform can be placed and fitted on the aluminium extrusion of the roof rack, as shown in Figure 6. In this way, the relative relationship (distance and angle) between the camera and the IMU changes very little during installation.

Table 3 OPENRTK330LI EVB specifications

The first experiment site was in Tainan City, Taiwan. The scenario contains a long outage in the Hai-An underground parking lot.
The performance of the proposed multi-sensor integration scheme in the Hai-An underground parking lot is shown in Figure 7, where the red line represents the reference, the green line represents the proposed multi-sensor integration scheme without VIO aiding, the blue line represents the proposed scheme with VIO aiding, the orange line represents the two systems using the same GNSS (RTK) results, and the purple box represents the height of the indoor section. The proposed multi-sensor integration algorithm is statistically analysed and evaluated with and without VIO aiding. Table 4 and Table 5 give the PVA analysis of the proposed multi-sensor integration without and with VIO aiding, respectively. For the position analysis, with VIO aiding, the maximum error in the horizontal direction was reduced from more than 178 m to less than 18 m, and the RMS error was reduced from 79 m to 10 m. The along-track and cross-track maximum errors were drastically reduced from 177 m to 17.8 m and from 125 m to 10 m, respectively. In particular, the maximum horizontal position error per distance travelled was reduced from 9.7% to 0.1%, and the maximum horizontal position error per time travelled was reduced from 0.394 m/s to 0.039 m/s. The along-track and cross-track errors, and the comparison between the proposed multi-sensor integration algorithm with and without the VIO aiding scheme in the Hai-An underground parking lot, are shown in Figure 8 and Figure 9. The ability of the VIO-aiding algorithm to reduce errors is demonstrated in the horizontal position and velocity as well as in the along-track and cross-track directions. VIO aiding of the multi-sensor integration scheme yields a larger performance improvement in indoor environments.

Table 5 PVA analysis with VIO aiding in underground parking lot.

The second experiment site was in Taipei City, Taiwan.
The performance of the proposed multi-sensor integration scheme in the Taipei urban area is shown in Figure 10, where the red line represents the reference, the blue line represents the proposed multi-sensor integration scheme with VIO aiding, and the orange line represents the two systems using the same GNSS (RTK) results. It is worth noting that, when navigating under urban canyons and elevated roads, over-trusting poor GNSS results causes error accumulation, as shown in Figure 10. In this second scenario, the proposed algorithm is tested in the outdoor Taipei urban area. Table 6 and Table 7 give the PVA analysis of the proposed multi-sensor integration without and with VIO aiding, respectively. For the position analysis, with VIO aiding, the maximum error in the horizontal direction was reduced from more than 38.4 m to less than 1.5 m. The along-track maximum error was reduced from 16.7 m to 1.4 m, and the cross-track maximum error was reduced from 37.3 m to 1.5 m. It is worth noting that the along-track improvement probability rose from 86.3% to 100%, and the cross-track improvement probability rose from 80.1% to 100% as well. Owing to the GNSS and VIO-aiding conditions, the maximum height error is greatly reduced from 29.7 m to 1.1 m; refer to the elevation position error analysis plot shown in Figure 12.

CONCLUSION
This research proposes a navigation approach based on a visual integration structure using a camera and a low-cost IMU, which integrates INS, GNSS, and VIO. For land-vehicle positioning systems, achieving the "which lane" level requires 1.5 m along/cross-track accuracy as a prerequisite. The characteristics of a VIO built on a low-cost IMU are used to further solve the drift and divergence problems of traditional VO over the travelled distance. The fundamental concept is to update the velocity measurements based on INS/GNSS when the VIO conditions are detected as non-stationary, and to build a condition evaluation of the GNSS measurements that verifies them before the EKF update process, protecting the navigation state from inaccurate measurements, whether from GNSS or VIO. This research discusses the integration of GNSS RTK measurements with INS using the same GNSS receiver, both with and without visual-inertial odometry aiding, to validate the proposed algorithm. GNSS-outage and GNSS-hostile scenarios are discussed and selected separately. The results show that the proposed multi-sensor integration strategy has great potential to achieve lane-level accuracy (1.5 m). It is worth mentioning that, compared with the traditional INS/GNSS integration, the maximum error is greatly reduced and the performance is stable and reliable, especially for GNSS solutions that suffer from multipath interference or NLOS contamination. The VIO-aiding scheme reduces the horizontal maximum error by 96.1% and the horizontal RMS error by 94.9% in the cross-track direction, as well as reducing the maximum error by 90.2% and 96.8% in the along-track and cross-track directions, respectively.