Extracting animal migration patterns from weather radar observations based on deep convolutional neural networks

Weather radar operates in all weather conditions at all times and has a large coverage area. Besides monitoring the weather, it receives other echoes, including biological echoes. To exploit the biological monitoring capability of weather radar, recognising and classifying local insect and bird echoes is one of the biggest obstacles to analysing their migration, foraging, and reproduction activities. Here, a pixel-wise classification method based on the fully convolutional network (FCN) is proposed, trained on radar reflectivity and spectral width images. Moreover, to increase the biometric detection accuracy, the region growing method is combined with it to achieve region edge alignment. Finally, the proposed method is validated on real weather radar datasets from Yantai. The trained FCN achieves a high pixel accuracy of 92.96%, and the region growing method performs well in edge alignment.


Introduction
Weather radar plays an increasingly important role in national life [1]. Widespread weather radars enable weather forecasting and the monitoring of disastrous weather such as typhoons, rainstorms, strong convection, and sandstorms. In addition, weather radar receives non-meteorological echoes within its scanning range, of which biological echoes are an extremely important category. They can help us understand the laws and patterns of local creatures and explore the complicated factors that affect the migration system [2]. To do so, however, the weather echoes and the biological echoes must be discriminated accurately.
The problem has been studied since the second half of the 20th century. DS Zrnic et al. used dual-polarisation radar to conduct taxonomic studies on insects, birds, and weather phenomena [3], and proposed threshold ranges for differential reflectivity, differential phase, and correlation coefficient, an early example of threshold-based categorisation. Subsequent classification methods were generally based on weather radar parameters. At almost the same time, based on assumptions about creature migration, J Koistinen used reflectivity information to study the phenomenon of biological migration in the Gulf of Finland and obtained a series of reflectivity images [4]. As technology improved, new classification methods were gradually put forward. C Kessinger used a fuzzy logic algorithm to classify the different echo components, including weather and biology [5], and proposed the classifiers named APDA and PDA. In addition, DRL Dufton developed a fuzzy logic filtering algorithm to remove non-meteorological echoes from dual-polarisation radar data [6]. Recently, AR Chowdhury et al. used a convolutional neural network (CNN) to classify birds and precipitation [7]; among the common networks, 'very-deep-16' performed best, with an accuracy of 94.4%. Compared with threshold-based classification methods, this method performed well, but it only realised whole-image classification and could not classify the individual parts of each image.
In this paper, we develop a classification method based on the fully convolutional network (FCN). Compared with the above CNN, the FCN can further realise pixel-wise classification and specifically point out the meteorological parts and the biological parts in an image. The trained network achieves a pixel accuracy of 92.96%. The region growing method then aligns the region edges.

Method
To overcome the disadvantages of the CNN, we use the FCN to classify the weather echoes and the biological echoes. We train FCN-32s on our own datasets to obtain the final parameters of the network. The trained network yields rough boundaries between the two kinds of echoes, which we then correct by combining the region growing method.

Pixel-wise classification based on FCN-32s
2.1.1 Production of datasets: Compared to the photos provided for computer-vision recognition challenges such as ImageNet, there are no ready-made weather radar images that can be used for training, so we need to produce the datasets ourselves. In this study, we use data from the weather radar station in Yantai, Shandong province, because this station observes an obvious phenomenon of biological migration from late afternoon until the next sunrise in autumn. The station operates an S-band, SA-type weather radar (CINRAD/SA), whose parameters are shown in Table 1.
The rendered images use reflectivity and spectral width values sampled within a radius of 200 km around the radar station. When colouring the images, we adhere to the principle that there must be a large chromatic difference between different categories. Thus, the reflectivity at the first elevation angle (0.5°), the reflectivity at the second elevation angle (1.5°), and the spectral width at the first elevation angle correspond to the R, G, and B channels, respectively, and the background of all images is pure black. As the velocity measured by the weather radar is easily affected by the local wind, velocity is not adopted as a colour channel (Fig. 1) (Table 2). When labelling the images, we use Labelme, an open-source labelling tool from MIT. The labelling is based on the following characteristics of the meteorological and biological components: i. Intensity: weather echoes are strong and their reflectivity is high, whereas biological echoes are weak and their reflectivity is low [8]. ii. Height: meteorological components reach high altitudes and can be observed at both elevation angles, while biological components are low and are observed at both elevation angles only at close range [7]. iii. Velocity: the velocity difference of meteorological components is small, but that of biological components is large [8].
So we can conclude that meteorological components appear yellow or yellow-green in the composite images, and even white where the velocity difference is large. In contrast, biological components appear red and purple at far ranges in the composite images and dark yellow-green at near ranges.
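This channel mapping can be sketched in Python as follows. The sketch assumes the three radar products are already resampled onto a common grid; the value ranges used for scaling (dBZ in [-20, 70], spectral width in [0, 10] m/s) are illustrative assumptions, not the station's actual calibration.

```python
import numpy as np

def make_composite(refl_05, refl_15, spectral_width_05):
    """Stack the three radar products into one RGB composite image.

    Channel mapping as in the paper: R = reflectivity at 0.5 deg,
    G = reflectivity at 1.5 deg, B = spectral width at 0.5 deg.
    The scaling ranges below are illustrative assumptions.
    """
    def normalise(x, lo, hi):
        # Map values in [lo, hi] linearly onto [0, 1], clipping outliers.
        return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

    r = normalise(refl_05, -20.0, 70.0)
    g = normalise(refl_15, -20.0, 70.0)
    b = normalise(spectral_width_05, 0.0, 10.0)
    rgb = np.stack([r, g, b], axis=-1)
    # Pixels at the bottom of all three ranges render as pure black.
    return (rgb * 255).astype(np.uint8)
```

Because the mapping is linear per channel, the composite preserves the relative magnitudes of reflectivity and spectral width, which matters later when these quantities are recovered from the final images.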
After labelling, we use the skimage library in Python for colour filling. In this paper, we set up three categories: background, meteorological components, and biological components, with corresponding colours of pure black, red, and green, respectively. Fig. 2 shows the result of this step. Note that these colour images are not the greyscale images required by the FCN, so the labels finally need to be converted to greyscale index images.
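The conversion from coloured label images to the greyscale index images can be sketched as follows; the exact class numbering (0/1/2) is an assumption for illustration.

```python
import numpy as np

# Class colours of the label images as described in the paper:
# background = pure black, meteorological = red, biological = green.
# The index assigned to each class is an illustrative assumption.
PALETTE = {
    (0, 0, 0): 0,      # background
    (255, 0, 0): 1,    # meteorological components
    (0, 255, 0): 2,    # biological components
}

def label_to_index(label_rgb):
    """Convert an H x W x 3 RGB label image to a greyscale index image."""
    index = np.zeros(label_rgb.shape[:2], dtype=np.uint8)
    for colour, cls in PALETTE.items():
        # Mark every pixel whose RGB triple matches this class colour.
        mask = np.all(label_rgb == np.array(colour, dtype=np.uint8), axis=-1)
        index[mask] = cls
    return index
```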
We show the steps of dataset production in Fig. 3. As can be seen from the flowchart, only the Labelme step needs to be operated by a human; the other steps can be achieved through batch processing.

Structure of FCN-32s:
The FCN was first proposed by Jonathan Long in 2015, who improved the CNN VGG16 to obtain the FCN. The VGG16 network has 13 convolution layers and three fully connected layers; the 13 convolution layers are divided into five groups, and the convolution layers within each group share the same structure. Abstract information is then extracted through max pooling layers. To realise end-to-end pixel-wise classification, Long replaced the three fully connected layers with convolutions of the same size and performed deconvolution at the end of the network to upsample the images [9]. The structure of the VGG16 network is shown in Table 3. According to the up-sampling multiple, the FCN is divided into FCN-32s, FCN-16s, and FCN-8s, with multiples of 32, 16, and 8, respectively; the three networks differ in the refinement of the results they obtain. Before the FCN, the traditional segmentation method was based on the CNN: to classify a pixel, an image block around the pixel was used as the input of the CNN, and the category of the pixel was determined from the output [10, 11]. This approach has several drawbacks. First, it requires a large amount of storage [12]: if the required pixel block is 15 × 15, the storage space required is 225 times that of the original image. Second, adjacent pixel blocks are basically the same, so the computation is largely repetitive and inefficient [13]. In addition, the size of the pixel block limits the size of the receptive field, so only local information can be extracted, which limits classification performance [14]. Compared with the CNN, the FCN can accept input images of any size, without requiring all training and test images to have the same size; it is also more efficient and avoids the repeated computation over pixel blocks [9] (Fig. 4).
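The replacement of fully connected layers by convolutions can be illustrated with a small NumPy sketch (shapes here are illustrative, not VGG16's actual dimensions): sliding the FC weights over the feature map as a kernel reproduces the FC output on the trained input size, and on a larger input it yields a spatial grid of class scores instead of a single vector.

```python
import numpy as np

def fc_as_conv(feature_map, fc_weights):
    """Apply a fully connected layer as a valid convolution.

    fc_weights has shape (out_dim, C*k*k): weights learned on a
    C x k x k feature map. Sliding them over a C x H x W map gives
    a grid of scores, the key step in "convolutionalising" a CNN.
    """
    C, H, W = feature_map.shape
    out_dim, flat = fc_weights.shape
    k = int(np.sqrt(flat // C))          # kernel size the FC was trained on
    oh, ow = H - k + 1, W - k + 1        # valid-convolution output size
    out = np.zeros((out_dim, oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Flatten each k x k patch and apply the FC weights to it.
            patch = feature_map[:, i:i + k, j:j + k].reshape(-1)
            out[:, i, j] = fc_weights @ patch
    return out
```

When the input equals the trained size (H = W = k), the output is a 1 × 1 score map identical to the original FC output; larger inputs simply produce larger score maps, which the FCN then upsamples by deconvolution.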

Network training and evaluation:
In this paper, the deep learning framework we use is Caffe. Since we only need to point out the approximate locations of the meteorological and biological parts in the images, we use the FCN-32s network to classify them. With only 1,000 images, we cannot train the network from initialisation to convergence, so we obtain the final network parameters by fine tuning. We divide the 1,000 images into a training set and a test set at a ratio of about 2:1. Before training, the parameters of the VGG16 network are loaded as the initial parameters of FCN-32s, and the number of output classes of FCN-32s is changed to three. The network is trained for over 4,000 iterations and tested every 400 iterations. The base learning rate of the network is 1 × 10^-9, and for some layers, such as fc6 and fc7, the learning rate is 10 times the base learning rate. To keep the parameters stable in the later period of training and prevent parameter fluctuations, we set the learning rate policy to 'step', which attenuates the learning rate every 1,000 iterations with an attenuation coefficient of 0. To evaluate the results of network training, we define the test indexes [9] as the pixel accuracy (PA), the mean accuracy (MA), and the mean IU (mean region intersection over union):

PA = ∑_i p_ii / ∑_i C_i

MA = (1/N) ∑_i (p_ii / C_i)

mean IU = (1/N) ∑_i p_ii / (C_i + ∑_j p_ji − p_ii)

In the formulas, N represents the number of categories, p_ji denotes the number of pixels that belong to class j but are classified into class i, and C_i = ∑_j p_ij represents the number of all pixels belonging to class i.
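Assuming the test results are accumulated into an N × N confusion matrix, the three indexes can be computed as follows. This is a sketch of the metric definitions, not the Caffe evaluation code used in the paper.

```python
import numpy as np

def segmentation_metrics(conf):
    """Compute PA, MA, and mean IU from an N x N confusion matrix.

    conf[j, i] is the number of pixels of true class j predicted as
    class i (the paper's p_ji); C_i = sum_j p_ji is the number of
    pixels truly belonging to class i.
    """
    conf = conf.astype(float)
    tp = np.diag(conf)                 # p_ii: correctly classified pixels
    c = conf.sum(axis=1)               # C_i: pixels belonging to class i
    pred = conf.sum(axis=0)            # pixels predicted as class i
    pa = tp.sum() / conf.sum()         # pixel accuracy
    ma = np.mean(tp / c)               # mean per-class accuracy
    mean_iu = np.mean(tp / (c + pred - tp))  # mean intersection over union
    return pa, ma, mean_iu
```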

Region growing algorithm
To align the output of the neural network with the edges in the original image, we apply the region growing algorithm after the neural network [15]. The main idea is to choose a small block in the segmented part of the image, called a seed region, and then add the surrounding pixels to it according to certain rules, so as to separate the different regions. The advantage of region growing lies in its ability to separate regions with the same characteristics and, in particular, to recover boundary information well [16]. In addition, the idea of the region growing algorithm is relatively simple: only a certain number of seeds and their growing criteria need to be given in advance. In contrast, other image processing methods, such as morphological operations [17] and image filtering [18], cannot obtain clear boundary characteristics. Region growing does, however, involve a large amount of computation, and when the original image is noisy and the greyscale is not uniform, over-segmentation can occur [15]. The basic principle of the region growing algorithm is as follows. Let R_1, R_2, …, R_n be the segmentation parts of the region R; they satisfy:

(1) ∪_{i=1}^{n} R_i = R

(2) R_i ∩ R_j = ∅ for i ≠ j

(3) P(R_i) = TRUE for i = 1, 2, …, n

(4) P(R_i ∪ R_j) = FALSE for adjacent R_i and R_j, i ≠ j

and every part R_i is a connected domain. Here, P(R_i) represents the logical predicate over all elements in the set R_i. The four formulas, respectively, reflect the completeness, independence, similarity, and mutual exclusion of the segmentation. In this paper, the region growing algorithm is mainly used to align the edges of the meteorological parts. The algorithm is applied to the output images of the neural network; if a region in the output is larger than in the original image, an erosion operation can be applied first. We select random points as seeds and check the growing criterion for the eight pixels around them. The growing criterion is: if the corresponding position in the original image is non-zero, the region grows; otherwise, the growth stops.
The seeds of the whole image are stored in a stack as coordinates. During growth, the corresponding coordinates are pushed onto the stack; after each processing step, a seed is popped off the stack, and when the stack is empty the region growing algorithm ends. Table 4 shows the process of the region growing algorithm.
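The stack-based procedure of Table 4 can be sketched in Python as follows, operating on a single channel for simplicity and omitting the optional erosion step.

```python
import numpy as np

def region_grow(original, seeds):
    """Stack-based region growing with 8-connectivity.

    `original` is a 2-D array (one channel of the composite image, a
    simplifying assumption); `seeds` is a list of (row, col) tuples.
    Growing criterion as in the paper: a neighbour joins the region
    if the corresponding pixel of the original image is non-zero.
    """
    grown = np.zeros(original.shape, dtype=bool)
    # Push valid seed coordinates onto the stack (Table 4, steps 1-2).
    stack = [s for s in seeds if original[s] != 0]
    for s in stack:
        grown[s] = True
    h, w = original.shape
    # Offsets of the 8 pixels surrounding a seed.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    # Pop and grow until the stack is empty (Table 4, steps 3-4).
    while stack:
        r, c = stack.pop()
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w
                    and not grown[nr, nc] and original[nr, nc] != 0):
                grown[nr, nc] = True
                stack.append((nr, nc))
    return grown
```

Because growth stops at zero-valued pixels of the original image, the grown region inherits the original image's boundaries, which is exactly the edge-alignment effect sought here.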
We summarise the process of FCN-32s and the region growing method in Fig. 5.

Results of FCN-32s
After training, we select three types of test images: images with weather (red), images with biology (green), and images with a mixture of the two classes. The test results are shown in Figs. 6-8. Since the network is tested every 400 iterations, Fig. 9 shows the evaluation indexes of FCN-32s as a function of the number of iterations. As can be seen from the figure, the performance of the network increases rapidly in the first 1,500 iterations, and by 4,000 iterations the pixel accuracy, mean accuracy, and mean IU have all converged. Table 5 shows the evaluation indexes of the last test.

Table 4 Process of the region growing algorithm
Step | Process
1 | Erode the image (optional), and locate the seed position.
2 | Push the seed coordinates onto the stack.
3 | If the stack is empty, end the region growing algorithm; otherwise, pop a coordinate off the stack.
4 | Examine the 8 pixels around the seed, and push the matching pixels onto the stack.

Compared with the network at the beginning of training, the evaluation indexes have improved, but there is still considerable room for improvement. The training performance of the network can be improved by increasing the size of the datasets and refining the labels.

Results of region growing algorithm
To study the quantification of biology, we remove the meteorological components from the original image according to the output of the neural network. Because the boundaries in the two images are not completely consistent, the removal would leave some yellow-green areas, which would affect the accuracy of biological quantification. Therefore, we apply the region growing method before removing the meteorological components, so that the boundaries are aligned as closely as possible.
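The removal step can be sketched as a simple masking operation; the class numbering used here (1 for meteorology) is an assumption for illustration.

```python
import numpy as np

def remove_meteorological(composite, class_map, met_class=1):
    """Zero out meteorological pixels, keeping only biology and background.

    `composite` is the H x W x 3 RGB composite image and `class_map`
    the per-pixel class indices after region growing. The numbering
    of `met_class` is an illustrative assumption.
    """
    out = composite.copy()
    out[class_map == met_class] = 0    # meteorological pixels become black
    return out
```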
It can be seen from Figs. 10-12 that the region growing algorithm obtains accurate boundary information, and few residual boundaries remain after the meteorological components are removed. However, there are some errors: as shown in Figs. 10 and 12, some green meteorological components remain in the final results. This is because there are more biological components around them, limiting the growth of the meteorological components. In the bottom-left corner of Fig. 12, some biological components are also mistaken for meteorological components. This may be because this part of the biological components is linked to the meteorological components, so that the meteorological components overgrow.
In the production of the datasets, we map the reflectivity and spectral width linearly to the RGB channels. The region growing algorithm does not destroy this linear relationship, so the reflectivity and spectral width of the biology can be recovered from the final images, providing data support for biometric detection.

Conclusion
This paper puts forward a pixel-wise classification method for weather radar images based on the FCN, with a pixel accuracy of 92.96%, which avoids the uncertainty of threshold selection in traditional threshold-based classification methods. The method can clearly point out the meteorological and biological components in the RGB composite images generated from the reflectivity and spectral width, realising automatic, programmatic interpretation of the weather radar data. At the same time, for the quantitative detection of biology, the output of the network is modified by the region growing algorithm, which recovers the edge information of the meteorological and biological components so as to reflect the quantity of biology as faithfully as possible. The research in this paper also has some limitations. For the current network, because the number of categories is small and the features are relatively simple, the number of network layers could be reduced to save computation. For more complex classification situations, the number of images in the datasets and the number of categories can be increased appropriately to improve the generalisation performance of the network. In addition, the growing criteria of the region growing algorithm can be optimised to reduce the possibility of misjudgement and omission, making the quantitative detection of biology more accurate.