V-RSIR: A WEB-BASED TOOL AND BENCHMARK DATASET FOR REMOTE SENSING IMAGE RETRIEVAL

ABSTRACT: Benchmark datasets play an important role in evaluating remote sensing image retrieval (RSIR) methods. Current benchmark datasets are mostly collected through the Google Maps API or other desktop tools. However, the Google Maps API requires users to have programming skills, and other collection tools are not publicly available, which may hinder the development of new benchmark datasets. This paper develops an open-access web-based tool, V-RSIR, that helps users generate new benchmark datasets with volunteers for remote sensing image retrieval. Using this tool, a new benchmark dataset, also termed V-RSIR, containing 38 classes with at least 1,500 images per class was created by 32 volunteers. A handcrafted low-level feature method and a deep learning high-level feature method are used to test the dataset. The evaluation results are consistent with our perception, which shows that the tool can help users effectively create benchmark datasets for RSIR.


INTRODUCTION
The recent advances in satellite technology have led to a dramatic growth in remote sensing (RS) images (Chaudhuri et al., 2016; Tang et al., 2018). This has provided new opportunities for various RS applications; however, it also poses the significant challenge of retrieving RS images from a considerable volume of imagery (Ye et al., 2018; Shao et al., 2018). Therefore, developing remote sensing image retrieval (RSIR) approaches has become one of the active and emerging research topics (Zhou et al., 2018).
In general, developing such an RSIR approach requires extensive benchmark datasets to evaluate its performance (Zhou et al., 2018). The benchmark datasets need to include a large number of RS images labelled with categories (e.g., airplane, forest and freeway). At present, several RSIR benchmark datasets have been published on the web. For example, Zhou et al. (2018) present a novel large-scale RS dataset named 'PatternNet' including 38 classes with 800 images per class; Shao et al. (2018) present a multi-labelled dataset termed DLRSD including 17 classes with 2,100 images. Besides, some benchmark datasets for RS image classification and object detection have also been used to evaluate RSIR approaches in the past few years (Aptoula, 2014; Li et al., 2017; Zhou et al., 2017). For instance, the University of California, Merced dataset (UCMD), initially created for land use/land cover classification (Yang and Newsam, 2010), has become one of the most common benchmark datasets for RSIR. However, the maximum number of images per class and the maximum total number of images in the above datasets are 800 and 31,500, respectively. These datasets are still small compared to ImageNet, which contains more than 14 million images and covers 20,000 categories (Russakovsky et al., 2015). On the other hand, the above RS datasets are mostly collected through the Google Maps API or other desktop tools. However, the Google Maps API requires users to have programming skills, and other collection tools are not publicly available. This may hinder the development of new large-scale benchmark datasets, which motivates the design and development of an open-access web-based tool for generating large-scale RSIR benchmark datasets. This paper establishes such a tool, termed V-RSIR, and provides an overview of it to date; the tool helps users generate new benchmark datasets with volunteers for RSIR. Its functions include image single-label annotation, image cropping, image editing, image review, image statistics, spatial distribution of images and image sharing. Using this tool, a new benchmark dataset, also termed V-RSIR, was created by 32 volunteers. The new benchmark dataset contains 38 classes with at least 1,500 images per class.
A longer-term goal of the V-RSIR tool is to offer both single-label and multi-label annotation functions, and to build a truly large-scale benchmark dataset for RSIR comparable to ImageNet.
The remainder of this paper is organized as follows. Section 2 overviews the V-RSIR tool and Section 3 describes the new V-RSIR benchmark dataset, followed by conclusions.

V-RSIR Architecture
The V-RSIR tool is a web application with which worldwide volunteers annotate and crop RS images from online image maps to construct RSIR benchmark datasets. Currently, the tool is only available in English. It is hosted on an Aliyun server and is available at http://www.geoinfobar.com:5321.

Main functions of the V-RSIR tool
The V-RSIR tool is implemented based on the open-source library OpenLayers. This section outlines the main functions shown in Figure 1.

Image single-label and cropping:
When users successfully log in, the tool navigates to the main page for image single-label annotation and cropping (Figure 2). For example, if users want to label the baseball field category, they can select it in the left setting panel and enter the keywords "baseball Washington" into the search box; similar scenes will then be displayed in the drop-down box (Figure 3). When users identify the correct category in the located scenes, they can click the square icon in the top left corner of the page to draw a yellow bounding box around the correct scene (Figure 4). If the scene has already been labelled by other users, a hint panel with thumbnails pops up in the bottom right corner of the page (Figure 4), and the details of the labelled image appear if users click the thumbnail (Figure 5). If the scene has not been labelled, users can click the 'submit' button (Figure 4) to crop it and store it in the database.

Image editing:
The volunteers can manage labelled images by themselves in the page of image editing (Figure 6).
Once they find images that are labelled incorrectly, they can delete them individually or in batches through the 'delete' or 'bulk delete' buttons.
Figure 6. Interface of image editing

Image review:
Occasionally, volunteers may make mistakes due to negligence, and not all volunteers follow the instructions when labelling images. The solution to these issues is to have professionals review the images again. Therefore, the V-RSIR tool designates image inspectors to review the images. The image inspectors can delete incorrect images individually or in batches on the image review page (Figure 7). Correct images are then submitted to the benchmark dataset by clicking the 'Check' button (Figure 7).

Spatial distribution of images: Users can browse the spatial distribution of the labelled images on the map (Figure 9). The result is displayed in the form of clusters. After clicking one circle of the clusters (Figure 9), the map locates to the position of the corresponding image.
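The paper does not specify how the cluster view is computed; web-map viewers commonly use grid-based marker clustering. A minimal sketch of that idea follows, where the function name, data layout and cell size are all chosen for illustration only:

```python
from collections import defaultdict

def cluster_points(points, cell_deg=5.0):
    """Group (lat, lon) points into grid cells and return, for each cell,
    the point count and the cluster centroid -- a simple stand-in for the
    marker clustering a web map would draw as circles."""
    cells = defaultdict(list)
    for lat, lon in points:
        # Assign each point to a cell of `cell_deg` degrees on each side.
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append((lat, lon))
    clusters = []
    for pts in cells.values():
        n = len(pts)
        lat_c = sum(p[0] for p in pts) / n
        lon_c = sum(p[1] for p in pts) / n
        clusters.append((n, (lat_c, lon_c)))
    return clusters
```

Clicking a cluster circle would then pan the map to the stored centroid (or, for a singleton cluster, to the image's own position).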

Image sharing:
Image sharing provides sharing of the checked datasets via the web. At present, the labelled data can be acquired via a download link on Baidu cloud disk.

WALK-THROUGH EXAMPLE: V-RSIR BENCHMARK DATASET
In this section, a walk-through example of constructing a new benchmark dataset, termed V-RSIR, is presented to demonstrate the usefulness of the V-RSIR tool. A handcrafted low-level feature method based on HSV color (hue, saturation, value) and a deep learning high-level feature method based on DenseNet (dense convolutional network) are selected to verify the tool's ability to evaluate different RSIR methods. We randomly select 50 images from each class of the V-RSIR dataset as the test set, and the remaining images are used as the training set. In other words, there are 1,900 images in the test set and 57,504 images in the training set.
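The random per-class split described above can be sketched as follows; the function name and data layout are illustrative, not the authors' code:

```python
import random

def split_per_class(image_ids_by_class, test_per_class=50, seed=0):
    """Randomly hold out `test_per_class` images from each class as the
    test set; the remaining images form the training set."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, test = {}, {}
    for cls, ids in image_ids_by_class.items():
        ids = list(ids)
        rng.shuffle(ids)
        test[cls] = ids[:test_per_class]
        train[cls] = ids[test_per_class:]
    return train, test
```

With 38 classes this yields 38 × 50 = 1,900 test images, matching the counts above.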
Average normalized modified retrieval rank (ANMRR), mean average precision (mAP) and precision at k (P@k, where k is the number of retrieved images) are used to evaluate the retrieval performance. For ANMRR, lower values mean better performance; for the other two metrics, higher values mean better performance (Zhou et al., 2018). The averages over all queries for the three metrics are shown in Table 2. The performance of HSV is worse than that of DenseNet, which is consistent with our perception. This illustrates that a dataset labelled with the V-RSIR tool can effectively evaluate retrieval performance.
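For reference, the three metrics can be computed as below. This is an illustrative implementation, not the authors' evaluation code; in particular, the truncation K = 2·NG is one common choice for NMRR, and the exact constants used in the paper are not stated:

```python
def nmrr(ranks, ng, k=None):
    """Normalized modified retrieval rank for one query.
    ranks: 1-based ranks of the ground-truth images in the result list.
    ng:    number of ground-truth images for the query.
    k:     truncation rank; defaults to 2 * ng (a common choice)."""
    if k is None:
        k = 2 * ng
    penalty = 1.25 * k
    r = [rank if rank <= k else penalty for rank in ranks]
    r += [penalty] * (ng - len(r))      # ground truth never retrieved
    avr = sum(r) / ng                   # average rank
    mrr = avr - 0.5 - ng / 2            # modified retrieval rank
    return mrr / (penalty - 0.5 - ng / 2)

def anmrr(per_query):
    """Average NMRR over queries; per_query is a list of (ranks, ng)."""
    return sum(nmrr(r, ng) for r, ng in per_query) / len(per_query)

def average_precision(rel):
    """rel: binary relevance of the full ranked list, in retrieval order."""
    hits, total = 0, 0.0
    for i, r in enumerate(rel, start=1):
        if r:
            hits += 1
            total += hits / i
    return total / max(hits, 1)

def precision_at_k(rel, k):
    """Fraction of the top-k retrieved images that are relevant."""
    return sum(rel[:k]) / k
```

For example, a query whose two ground-truth images come back at ranks 1 and 2 scores NMRR = 0 (best), while a query where neither is retrieved scores NMRR = 1 (worst); mAP is the mean of `average_precision` over all queries.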

CONCLUSIONS
This paper presents an open-access web-based tool, V-RSIR, intended to help users generate new benchmark datasets for RSIR. The tool offers functions including image single-label annotation, cropping, editing, review, quantity statistics, spatial distribution and sharing. To evaluate the usability of the tool, 32 volunteers and 6 image inspectors were organized to label images using the tool. Ultimately, a new benchmark dataset, V-RSIR, covering 38 classes with at least 1,500 images per class was constructed. This demonstrates the effectiveness and applicability of the V-RSIR tool. Future work will concentrate on improving the image sharing and multi-label annotation functions.

Figure 1. Conceptual architecture of the V-RSIR tool

Figure 1 presents the conceptual architecture of the V-RSIR tool. From bottom to top, the tool is divided into three tiers. Tier 1 is the data layer, in which Google Maps, Bing Maps, ESRI (Environmental Systems Research Institute) maps and other online maps serve as RS image sources. These online maps are directly integrated into the V-RSIR tool through their service addresses. The top tier is the user group, including volunteers and image inspectors. The volunteers can label, crop and edit RS images; they can also browse the quantity statistics and spatial distribution of cropped images and download the inspected images. In addition to the above permissions, the image inspectors are mainly responsible for checking whether categories are labelled correctly by volunteers. The middle tier is the business logic of the V-RSIR tool, including user registration, login, image single-label annotation, cropping, editing, review, quantity statistics, spatial distribution, sharing and so on.
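Online map sources of this kind are typically consumed as XYZ ("slippy-map") tile services, so integrating them "through their service addresses" amounts to filling a URL template with tile indices derived from geographic coordinates. A sketch of that conversion follows; the URL template is a placeholder, not an actual service address used by the tool:

```python
import math

def latlon_to_tile(lat, lon, zoom):
    """Convert a WGS84 lat/lon to XYZ (slippy-map) tile indices."""
    n = 2 ** zoom                       # tiles per axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    # Web Mercator: y grows from the north pole southwards.
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

def tile_url(template, lat, lon, zoom):
    """Fill a generic XYZ URL template, e.g. 'https://example.com/{z}/{x}/{y}.png'."""
    x, y = latlon_to_tile(lat, lon, zoom)
    return template.format(z=zoom, x=x, y=y)
```

A map client such as OpenLayers performs the same computation internally for every tile it displays.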

Figure 2. Interface of image single-label and cropping

Users should first set the image size, image source and image category in the left setting panel. Then, users can enter category keywords and place names into the search box to retrieve and locate similar scenes. This search is implemented based on Nominatim (https://nominatim.openstreetmap.org).
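The Nominatim lookup can be illustrated as follows; the request parameters and the canned response are assumptions based on Nominatim's public search API, not the tool's actual implementation:

```python
import json
from urllib.parse import urlencode

NOMINATIM = "https://nominatim.openstreetmap.org/search"

def build_search_url(keywords, limit=10):
    """Build a Nominatim free-text search URL such as the search box
    might issue for keywords like 'baseball Washington'."""
    return NOMINATIM + "?" + urlencode({"q": keywords, "format": "json", "limit": limit})

def parse_results(payload):
    """Extract (display_name, lat, lon) tuples from a Nominatim JSON reply;
    Nominatim returns coordinates as strings."""
    return [(r["display_name"], float(r["lat"]), float(r["lon"]))
            for r in json.loads(payload)]

# Offline example with a canned response in Nominatim's format:
sample = '[{"display_name": "Baseball field, Washington", "lat": "38.9", "lon": "-77.0"}]'
print(parse_results(sample)[0])  # → ('Baseball field, Washington', 38.9, -77.0)
```

The parsed coordinates would then drive the drop-down box of located scenes and pan the map to the selected one.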

Figure 3. Interface of search example

Figure 7. Interface of image review

Image statistics: Users can browse the numbers of checked and unchecked images through the 'Statistics' button (Figure 8). The result can be displayed as a histogram or a line chart using the ECharts library (https://echarts.baidu.com/).

Figure 8. Interface of image statistics

Figure 11 represents the spatial distribution of the V-RSIR dataset. The images in this dataset are mainly distributed in North America, Europe and South America.

Figure 11. Spatial distribution of the V-RSIR dataset

Table 2. The results of the two methods on the V-RSIR dataset.