Orthorectification Technology for UAV Aerial Remote Sensing Images Based on the Programmable GPU

To meet the time requirements of acquiring and processing aerial remote sensing data in disaster emergencies, this paper introduces GPU parallel processing into the orthorectification algorithm. Our experiments verify the correctness and feasibility of the CUDA parallel algorithm, which effectively addresses the heavy computation and long run time of orthorectification and achieves fast GPU-based orthorectification of UAV airborne remote sensing images. The experimental results indicate that, at the same accuracy as the CPU implementation, the proposed method markedly reduces processing time, with a maximum speedup of more than 12 times. This greatly improves the response speed of emergency surveying and mapping and gives the method broad application prospects.


Introduction
In recent years, natural disasters have grown in frequency, scale, sphere of influence, and degree of harm. China is a disaster-prone country and faces heavy disaster relief pressure. Obtaining disaster-related information promptly and accurately is therefore essential to provide a scientific basis for decision making in emergency command and disaster relief. UAV remote sensing, which can acquire accurate imagery of the affected area, is an important channel for disaster information.
In a complete remote sensing image processing workflow, the most important step is orthophoto generation. During imaging, UAV aerial images are affected by perspective projection, tilt of the photographic axis, atmospheric refraction, the curvature of the Earth, terrain relief, and many other factors, so the image pixels exhibit varying degrees of geometric deformation and distortion. An orthorectified image not only removes the distortion introduced by these factors but also carries richer, more vivid, and more intuitive information than a digital line graphic (DLG). Orthophoto production has therefore become an indispensable foundation for a wide variety of remote sensing applications.
With the continuous development of computer technology, the programmable graphics processing unit (GPU), with its powerful computing and parallel processing capabilities, has become a research focus both in China and abroad. Because of its hardware structure, the GPU has inherent advantages for the matrix calculations common in image processing; matrix multiplication has been implemented with GPU multi-texture techniques for many years. Some domestic research institutes have also obtained results in GPU parallel computing. Since orthorectification is computationally intensive and time-consuming, parallel processing algorithms have become the current trend in emergency mapping as a way to meet real-time image processing requirements.

Basic principle and process of digital orthophoto generation
Orthorectification of a digital image is a purely digital process. Based on the exterior orientation elements of the image and a digital elevation model (DEM) of the terrain, the computer determines, for each point of the orthographic projection, the position of the corresponding image point in the central projection image; the gray value of that original image pixel is then assigned to the corresponding orthoimage pixel, one pixel at a time, until the whole image has been orthorectified [1]. To meet the emergency requirements of surveying and mapping and to obtain orthophotos of the affected area in time, this paper proposes using a position and orientation system (POS), which combines GPS with an inertial measurement unit (IMU), to quickly obtain high-precision position and attitude information for the flying platform. The exterior orientation elements of the sensor are derived from this information, and the aerial imagery is orthorectified using DEM data from the terrain database and the open source library GDAL. This greatly shortens the mapping period while the precision still meets the requirements of emergency mapping.
The orthorectification process for UAV images is shown below:
On the CPU side, the main time-consuming operations in the orthorectification algorithm are the large number of loop iterations that compute image point coordinates from the collinearity equations and perform gray-level interpolation.
The collinearity equations give the mapping between the pixel coordinates of each point on the source image and the ground coordinates of the corresponding point.
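For reference, the collinearity equations are commonly written in the form below. The notation is the conventional photogrammetric one and is an assumption here rather than the paper's own: (x, y) are image point coordinates, (x_0, y_0) the principal point, f the focal length, (X_S, Y_S, Z_S) the projection center, (X, Y, Z) the ground point, and a_i, b_i, c_i the elements of the rotation matrix formed from the three exterior orientation angles.

```latex
\begin{aligned}
x - x_0 &= -f\,\frac{a_1 (X - X_S) + b_1 (Y - Y_S) + c_1 (Z - Z_S)}
                     {a_3 (X - X_S) + b_3 (Y - Y_S) + c_3 (Z - Z_S)} \\
y - y_0 &= -f\,\frac{a_2 (X - X_S) + b_2 (Y - Y_S) + c_2 (Z - Z_S)}
                     {a_3 (X - X_S) + b_3 (Y - Y_S) + c_3 (Z - Z_S)}
\end{aligned}
```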
Here the elevation of each corresponding point is obtained by DEM interpolation. According to the above analysis, every pixel must be processed in the same way: the collinearity equations are solved to find its new location, and its pixel value is computed by interpolation. Because the collinearity mapping of each pixel is independent of the others, the computation can take advantage of GPU parallel processing and is well suited to the CUDA programming model.
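The per-pixel work described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the row-major DEM layout with rows increasing in Y, and the rotation matrix passed in as a 3 x 3 list are all hypothetical choices made for the example.

```python
import math


def dem_elevation(dem, col, row):
    """Bilinear interpolation of a DEM grid at a fractional (col, row) position.

    Assumes dem is a list of rows with at least 2 x 2 cells.
    Positions outside the grid are clamped to the border cells.
    """
    c0 = min(max(int(math.floor(col)), 0), len(dem[0]) - 2)
    r0 = min(max(int(math.floor(row)), 0), len(dem) - 2)
    fc = min(max(col - c0, 0.0), 1.0)
    fr = min(max(row - r0, 0.0), 1.0)
    top = dem[r0][c0] * (1 - fc) + dem[r0][c0 + 1] * fc
    bottom = dem[r0 + 1][c0] * (1 - fc) + dem[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


def ground_to_image(X, Y, Z, Xs, Ys, Zs, R, f, x0=0.0, y0=0.0):
    """Project a ground point into the image with the collinearity equations.

    R is assumed to be [[a1, a2, a3], [b1, b2, b3], [c1, c2, c3]].
    """
    dX, dY, dZ = X - Xs, Y - Ys, Z - Zs
    u = R[0][0] * dX + R[1][0] * dY + R[2][0] * dZ
    v = R[0][1] * dX + R[1][1] * dY + R[2][1] * dZ
    w = R[0][2] * dX + R[1][2] * dY + R[2][2] * dZ
    return x0 - f * u / w, y0 - f * v / w


def locate_source_point(X, Y, dem, dem_x0, dem_y0, spacing, camera, R, f):
    """Look up the elevation of ground cell (X, Y), then find its image point."""
    Z = dem_elevation(dem, (X - dem_x0) / spacing, (Y - dem_y0) / spacing)
    Xs, Ys, Zs = camera
    return ground_to_image(X, Y, Z, Xs, Ys, Zs, R, f)
```

Each call to `locate_source_point` depends only on its own ground cell, which is the independence that makes a one-thread-per-pixel mapping possible.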

CUDA parallel computing
Because the experimental platform of this paper is an NVIDIA GPU, we focus only on its programming model.
CUDA is a scalable parallel programming model built on three core abstractions: a hierarchy of thread groups, shared memory, and barrier synchronization. The keys to efficient GPU computation are to keep many computing units running in parallel so as to maximize their utilization, to use the memory hierarchy to maximize communication bandwidth, and to use high computing density to hide latency. The hardware structure of the GPU gives it poor branch control performance but enormous arithmetic capability, so in the final execution model the CPU handles program control while the GPU performs the high-density parallel computing tasks. In the CUDA programming model, the CPU acts as the host and the GPU as a coprocessor, or device [2].
Computation in the CUDA programming model is basically divided into three steps: 1. the data to be processed is copied from host memory to graphics memory; 2. the GPU processes the data in parallel; 3. the result is copied back to host memory. The GPU's multithreaded processing is driven by a __global__ function, called the kernel function. The kernel launch parameters specify the Grid, Block, and Thread levels of parallelism [3].
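The Grid, Block, and Thread levels can be illustrated with a short Python sketch of the index arithmetic a kernel typically performs. The 16 x 16 block size and the function names are assumptions made for this example; the paper does not state its launch configuration.

```python
def launch_config(width, height, block=(16, 16)):
    """Number of blocks per grid dimension needed to cover the image.

    Rounds up so that partially filled blocks still cover the border pixels.
    """
    bx, by = block
    return ((width + bx - 1) // bx, (height + by - 1) // by)


def thread_to_pixel(block_idx, thread_idx, block_dim, width, height):
    """Map a (block, thread) pair to the (row, col) pixel it would process.

    Mirrors the usual blockIdx * blockDim + threadIdx computation.
    Returns None for threads that fall outside the image.
    """
    col = block_idx[0] * block_dim[0] + thread_idx[0]
    row = block_idx[1] * block_dim[1] + thread_idx[1]
    if col >= width or row >= height:
        return None
    return row, col


# Assuming an image 5432 pixels wide and 7296 pixels high:
# launch_config(5432, 7296) returns (340, 456)
```

Because 5432 is not a multiple of 16, the last column of blocks is only half used, which is why the bounds check in `thread_to_pixel` is needed.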
The CUDA programming model does not suit every algorithm; a suitable algorithm must fit its unified programming model architecture. The requirements that the CUDA programming model places on an algorithm are as follows:
1) A large amount of data and high computational complexity. If the amount of data is too small to offset the time spent copying between host memory and graphics memory, or the computational complexity is low, the CPU may compute faster than the GPU.
2) The data can be divided into blocks, and the calculation steps for the blocks are very similar and mutually independent.
3) The algorithm should use as little branching logic as possible. The GPU has many computing units and very few logic units.
4) GPU programming must pay attention to the GPU memory access pattern: coalesced global memory access and appropriate use of constant memory, texture memory, and shared memory are required.
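The first requirement can be made concrete with a simple cost model: the GPU pays for two memory copies in addition to the kernel itself. The sketch below is only an illustration; the function name is hypothetical and any timings passed to it are placeholders, not measurements from the paper.

```python
def effective_speedup(cpu_s, kernel_s, copy_in_s, copy_out_s):
    """Speedup of the GPU path once host-device copies are included.

    All arguments are times in seconds. A result below 1.0 means the
    CPU finishes first despite the faster kernel.
    """
    return cpu_s / (copy_in_s + kernel_s + copy_out_s)


# Placeholder timings only: a large job amortizes the copies,
# while a small job is dominated by them.
large_job = effective_speedup(cpu_s=10.0, kernel_s=1.0, copy_in_s=0.5, copy_out_s=0.5)
small_job = effective_speedup(cpu_s=0.1, kernel_s=0.01, copy_in_s=0.5, copy_out_s=0.5)
```

With these placeholder values the large job comes out ahead on the GPU and the small one does not, which is the situation item 1) warns about.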

GPU-based orthorectification
As analyzed in the section on the basic principle, every pixel must be relocated by solving the collinearity equations and its value computed by interpolation. Because the collinearity mapping of each pixel is independent of the others, the computation can take advantage of GPU parallel processing and suits the CUDA programming model. Based on the analysis of the orthorectification algorithm and the CUDA programming model, the algorithm is implemented as follows:
1) The CPU first reads the image and DEM data and creates the corrected image; these data are stored in host memory.
2) Graphics memory is allocated, including space for the corrected image, and the data needed by the algorithm is copied from host memory to graphics memory. The original image and DEM data, whose access pattern is irregular, are placed in texture memory to reduce access time, and the common parameters required for the correction are placed in shared memory to speed up access.
3) On the GPU side, multithreaded parallel processing under the CUDA programming model solves the collinearity equations and performs gray-level interpolation within each thread; global memory accesses are coalesced to avoid access conflicts.
4) Finally, on the CPU side, the corrected image is copied back to host memory, the GPU memory resources are released, and the corrected data is written to the hard disk [4].
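The steps above can be sketched as a CPU-side emulation in Python, where one function call stands in for one GPU thread. This is an illustrative sketch, not the paper's CUDA code: `mapping` is a hypothetical callable that stands in for the collinearity and DEM lookup, the memory copies of steps 2) and 4) are omitted, and the nested loop replaces the parallel kernel launch.

```python
def bilinear_gray(img, x, y):
    """Bilinear gray-level interpolation at fractional position (x, y).

    Assumes img is a list of rows with at least 2 x 2 pixels.
    Returns 0.0 when the position lies outside the image.
    """
    h, w = len(img), len(img[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return 0.0
    x0 = min(int(x), w - 2)
    y0 = min(int(y), h - 2)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x0 + 1] * fx
    bottom = img[y0 + 1][x0] * (1 - fx) + img[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bottom * fy


def ortho_kernel(row, col, src, out, mapping):
    """Work of one thread: locate the source point, interpolate, store."""
    x, y = mapping(row, col)
    out[row][col] = round(bilinear_gray(src, x, y))


def orthorectify(src, out_h, out_w, mapping):
    """Emulate the kernel launch by visiting every output pixel in turn."""
    out = [[0] * out_w for _ in range(out_h)]
    for row in range(out_h):
        for col in range(out_w):
            ortho_kernel(row, col, src, out, mapping)
    return out
```

Since `ortho_kernel` writes only to its own output pixel and reads shared data without modifying it, the loop iterations can run in any order, which is what allows them to be distributed across GPU threads.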

Experimental environment and data
To demonstrate the efficiency of the GPU for fast remote sensing image processing, all programming experiments in this paper were run in a single computer configuration; apart from the added Quadro 4000 graphics card, the configuration is the same as a conventional programming environment.
Hardware platform: CPU: Intel(R) Xeon(R) W3550 @ 3.07 GHz, 4 cores, 8 threads; GPU: NVIDIA Quadro 4000. The experiment uses a single UAV aerial remote sensing image of 5432 × 7296 pixels with a ground resolution of 0.05 m, DEM data of 483 × 578 cells with a sampling interval of 30 m, and an orthorectified output image with a ground sampling distance of 0.05 m. The UAV image and DEM data are as follows:

CPU and GPU data accuracy analysis
Because the CPU and GPU are different processing units, their support for single-precision and double-precision floating-point types may differ, which can cause divisions to retain different numbers of decimal places and ultimately lead to differences in image gray level. In addition, an algorithm must be modified to fit the unified CUDA programming model, so the two implementations may differ and their data accuracy needs to be compared. To verify the correctness of the GPU-based digital image orthorectification, the precision of both experiments was analyzed. The images corrected by the CPU and the GPU are as follows:
2) Control points and their geographic coordinate values after orthorectification.
Checking the geographic coordinates of the points shows a difference of about 2-3 meters in the X and Y directions, attributable to the use of exterior orientation elements from the POS data; this meets the accuracy requirements of emergency mapping.

Subtraction test.
Subtracting the orthophoto produced by the GPU from the one produced by the CPU gives the difference between them, shown below:
When the GPU performs interpolation through CUDA texture memory, the method differs from the one used on the CPU, which shifts the gray level of a small portion of the pixels. However, this error is within the allowed range and still meets the application requirements.
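A subtraction test of this kind can be sketched in a few lines of Python. The function names and the two summary statistics (largest absolute gray-level difference and fraction of differing pixels) are choices made for this example; the paper does not say which statistics it used.

```python
def subtract_images(cpu_img, gpu_img):
    """Per-pixel gray-level difference between two images of equal size."""
    return [
        [a - b for a, b in zip(cpu_row, gpu_row)]
        for cpu_row, gpu_row in zip(cpu_img, gpu_img)
    ]


def difference_stats(diff):
    """Largest absolute difference and fraction of pixels that differ."""
    values = [v for row in diff for v in row]
    max_abs = max(abs(v) for v in values)
    changed = sum(1 for v in values if v != 0) / len(values)
    return max_abs, changed
```

A small maximum difference confined to a small fraction of pixels would be consistent with the interpolation explanation given above.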
In summary, the precision of GPU-based orthorectification is equal to that of CPU processing; both meet the requirements of fast UAV remote sensing image processing.

CPU and GPU computing rate analysis
To measure the computing rate of GPU-based orthorectification of UAV remote sensing images, a comparison experiment was run with the CPU and GPU implementations. Because the GPU only contributes to the per-pixel orthorectification, while reading and storing the image are done on the CPU, the comparison covers only the per-pixel orthorectification time. Table 1 shows the statistical results of running the CPU and GPU experiments ten times each. As Table 1 shows, given that the CPU and GPU algorithms perform the same calculation with the same amount of computation and complexity, GPU-based orthorectification is far faster than the CPU-based algorithm: the GPU computing rate is 8-11 times that of the CPU.

Conclusion
The GPU orthorectification experiment on UAV remote sensing images shows that processing is significantly accelerated while the data accuracy remains consistent with the CPU case. UAV remote sensing image processing operates mainly on pixels, and the data processing has the following characteristics: a large amount of data, high computational complexity, a fixed processing method, and inherent parallelism. These characteristics make remote sensing image processing well suited to the parallel architecture of the GPU and the CUDA programming model. The application of GPU technology therefore helps meet the rapid response requirements of emergency processing efficiently and effectively, and it should have broad application prospects.