POTENTIAL AND USE OF THE GOOGLENET ANN FOR THE PURPOSES OF INLAND WATER SHIPS CLASSIFICATION

This article presents an analysis of the possibilities of using the pretrained GoogLeNet artificial neural network to classify inland vessels. Inland water authorities monitor vessel traffic intensity via CCTV, and such classification could support them in their statutory tasks. The automatic classification of inland vessels from video recordings is one of the main objectives of the Automatic Ship Recognition and Identification (SHREC) project. The image repository for training purposes consists of about 6,000 images of different categories of vessels. Some images were gathered from internet websites, and some were collected by the project's video cameras. The GoogLeNet network was trained and tested using 11 variants. These variants involved modifications of the image sets (e.g., change in the number of classes, change of class types, initial reconstruction of images, removal of images of insufficient quality). The final classification quality was 83.6%. The newly obtained neural network can be an extension and a component of a comprehensive geoinformatics system for vessel recognition.


INTRODUCTION
In terms of vessel traffic, there are several techniques and methods for monitoring water areas. The most popular are AIS (Automatic Identification System) and radar. These two systems are often supported by radio communication and video surveillance (CCTV). All these components make up the RIS (River Information Service) or VTS (Vessel Traffic System) system, where the operator can easily identify the vessel and acquire all information about its voyage, crew members, shipowner or cargo. The water traffic observation system is well organized for large sea vessels and inland cargo ships, mainly because these vessels are covered by legal equipment requirements. However, small inland boats, such as motor or rowing boats, small sailing yachts and pleasure craft, remain a major problem. In general, they are not equipped with AIS, radar or VHF systems. Many of them do not even require registration in the relevant registers. Given the growing interest in recreational water tourism, there is a need to support monitoring with a recognition and identification system dedicated to small boats and their approximate location.
The Automatic Ship Recognition and Identification (SHREC) project aims to develop a system for detecting, recognizing and identifying small boats based on video monitoring from CCTV cameras located in strategic points of the water area such as bridges and marinas [1]. It will cover the inland water areas in Szczecin-Swinoujscie harbor located in the northwest of Poland and will provide support for the RIS-ODRA system (River Information Service). While the detection of a vessel itself is simple, its recognition and identification are more complex tasks. For this purpose, the SHREC project assumes the use of artificial neural networks and searches for an available tool that can be implemented with the SHREC system. The project is funded by the Polish LIDER NCBiR program.
Recognition of vessel types based on images is not a new issue. Specialists in the field of image processing and analysis as well as computer vision have been dealing with this problem for many years. However, there is no solution that can be directly implemented in the SHREC system. First, solutions based on the analysis of satellite imagery can be indicated [2][3][4]. However, these methods are not suitable for the recognition of small ships and real-time data analysis. The key to the SHREC project is a solution that allows real-time vessel recognition based on images collected at a short distance using standard digital cameras (e.g., a camera located near a water reservoir or river).
Systems that monitor vessel movement (such as the River Information Service of Lower Oder) [5] are equipped with video cameras. Unfortunately, often the video stream is not analyzed using artificial intelligence methods but by manual analysis performed by the operator. Yet, such systems can be a huge source of data to train neural networks.
By analyzing the literature and available knowledge on the use of computer vision when analyzing a video stream of vessels from cameras, three issues can be identified. The first of these is the issue of vessel detection [6][7][8]. The next issue is the detection and recognition of text on the vessel (e.g., International Maritime Organization (IMO) number and registration) [9]. The last is the issue related to vessel recognition. These issues are the most important in the context of the research carried out in this study. Researchers dealing with this topic most often utilize convolutional neural networks (CNNs) for classifications. There are also analyses involving the use of simple classifiers for the purposes of the task [10]. In this case, the classification method used was k-nearest neighbors (kNN). For the purpose of selecting the features, an image analysis was performed with the use of the Hough transform. Thanks to the tested approach, it was possible to detect sailing yachts and separate them from other vessels. The described achievement is not sufficient for the purposes of the classification of all vessels that are the subject of the SHREC system analyses. An example of using more complex classifiers and neural networks is the publication [11]. In this case, convolutional neural networks VGG19 were used. Scientists classified five classes and obtained an F1-score of 70%. Another study [12] used deep convolutional neural networks and gnostic fields. The authors of this publication obtained classification quality results of more than 85% for images taken during the day due to combining images in the visible and infrared spectrum. A multi-task learning framework has also been proposed by scientists [13]. They used deep feature embedding, coarse-grained classification and fine-grained classification. 
The above-mentioned solutions do not meet the key conditions that should be met by the SHREC system component aimed at the classification of ships:
- obtaining high values of the classification quality based on the visual image;
- carrying out the classification for objects that are subject to classification in the SHREC project;
- minimizing the duration of the classification process (by default, the classification should take place in real time).
Therefore, it was decided to verify another available solution: one of the most popular neural networks, which is easily available, widely used and tested, and achieves high classification quality for objects based on images. For this reason, the pretrained GoogLeNet network was selected for testing in the project.
The aim of this study is to analyze the potential development and use of one of the most popular neural networks for image classification, GoogLeNet, for the classification of vessels on inland waterways. GoogLeNet is a pretrained convolutional neural network that is 22 layers deep. This network is commonly used, for example, in Chinese handwriting recognition [14], scene recognition [15], autonomous driving [16], feature tracking [17], artifact removal, classification [18], domain adaptation [19] and medicine [20].

DATA COLLECTION
For the initial stage of the task, an analysis was made of the categories of vessels that the SHREC system would deal with in the classification process. The following ship types were selected:
1. kayak, pedalo, rowing boat: small units powered by muscle power;
2. small boat, motorboat: small units with an outboard internal combustion engine, usually without a built-in cabin or without a cabin and a low superstructure;
3. motor yacht: mechanically propelled boat, with an outboard or permanent engine, higher superstructure and a cabin, up to approximately 10 m in length;
4. sailing yacht with a mast: yacht with a single sail, with or without a cabin;
5. sailing yacht with a mast down;
6. large motor yacht: power boat, with fixed engine, higher superstructure, a cabin, over 10 m in length, luxury;
7. sailing ship: sailing vessel with more than one mast;
8. barge: large inland waterway vessel with cargo holds for general cargo, bulk or liquid cargo;
9. inland pusher: inland waterway vessel used for "pushing" barges without a mechanical drive, the whole creating a pushed set;
10. pushed convoy: inland pusher with a set of inland barges;
11. water services: police, WOPR (Volunteer Water Rescue Service), border guards; usually these are smaller vessels, serving in the port and in the coastal zone of the territory of a given country; they are distinguished by appropriate colors and descriptions of the service they represent;
12. small ship: small conventional ship defined as a ship below 24 m length, ships with cargo holds for the transport of bulk and general cargo, ships for the transport of liquid materials/chemicals, ships for the transport of bulky cargo;
13. medium ship: conventional ships up to 120 m long, ships with holds for the transport of bulk and general cargo, ships for the transport of liquid materials/chemicals, ships for the transport of bulky goods;
14. large ship: conventional ships of over 120 m in length, ships with holds for the transport of bulk and general cargo, ships for the transport of liquid materials/chemicals, ships for the transport of bulky goods;
15. navy ship: military ships, properly marked and in gray colors;
16. special vessel: measuring vessel, dredger, icebreaker; vessels designed for maintaining the navigation path and works in port (measuring vessel: small vessels; dredger: vessel with visible dredging equipment/pipes; tug: small vessel with visible low stern side);
17. passenger ship: sea or inland ship for the transport of people;
18. special purpose service ships (e.g., hydrographic, security, fire, telecommunications, customs, sanitary, school, pilot, icebreakers, rescue);
19. fishing vessels: small vessels for fishing at sea, often with fishing gear/frame at the rear;
20. ships of historical value;
21. other ships.
The viewing angles from which each unit was to be represented were also defined: angles from 0° to 315° were set in 45° steps, clockwise. The next step was to collect photos, which were gathered from two sources:
• photos gathered from the internet;
• photos obtained from video recordings in the area of the Szczecin-Swinoujscie port complex.
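The 45° viewing-angle scheme can be illustrated with a short sketch that assigns an arbitrary viewing angle to the nearest of the eight reference angles. The function name and the nearest-angle rule are assumptions made for illustration; the text only specifies the reference angles themselves.

```python
# Reference viewing angles from the text: 0° to 315° in 45° steps.
REFERENCE_ANGLES = list(range(0, 360, 45))

def nearest_view_angle(angle_deg, step=45):
    """Return the reference angle closest to `angle_deg` (hypothetical helper)."""
    # Normalise to [0, 360) so negative or larger inputs are accepted.
    normalised = angle_deg % 360
    # Round to the nearest multiple of `step`; a result of 360 wraps back to 0.
    return (round(normalised / step) * step) % 360

print(nearest_view_angle(50))   # closest reference angle to 50°
print(nearest_view_angle(350))  # wraps around to the 0° view
```

Labelling each photo with such a reference angle would make it easy to check whether every class is covered from all eight directions.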
The first source consisted of 200 photos for each of the 21 categories of vessels, which gives a total of 4,200 photos. As far as possible, attempts were made to collect photos representing vessels from different angles. The second catalog, which was created from video recordings of the 2018 measurement campaign, showed that the previous categories of vessels were too detailed for the inland basin of Szczecin-Swinoujscie harbor. Given the depth of the water area and the shipping capabilities of the port of Szczecin itself, some ships do not appear at all because their draft is too great. Therefore, the number of vessel categories was reduced from 21 to 6:
• barge: combining inland barges, pushers and pushed sets;
• special purpose service ships;
• motor yacht, with motorboat;
• passenger ship;
• sailing yacht;
• other.
Finally, from the video recordings, about 6,000 images were obtained. Both catalogs were then filtered for the quality of the photo itself, the repeatability of the vessel and the size of the ship in the photo. The filtered catalogs were then used to train GoogLeNet.
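The reduction of categories described above amounts to a relabelling step, which might be sketched as follows. Only the merges stated in the text (barges, pushers and pushed sets into "barge"; motorboats into "motor yacht") are encoded explicitly. Sending every unlisted label to "other" is an assumption of this sketch, as are the names `MERGE_MAP` and `reduce_category`.

```python
# Merges stated in the text, plus the classes that keep their own name.
MERGE_MAP = {
    "barge": "barge",
    "inland pusher": "barge",
    "pushed convoy": "barge",
    "motor yacht": "motor yacht",
    "motorboat": "motor yacht",
    "special purpose service ships": "special purpose service ships",
    "passenger ship": "passenger ship",
    "sailing yacht": "sailing yacht",
}

def reduce_category(detailed_label):
    """Map a detailed category name to one of the six reduced classes."""
    # ASSUMPTION: labels not listed in MERGE_MAP fall into "other".
    return MERGE_MAP.get(detailed_label.strip().lower(), "other")

print(reduce_category("Inland pusher"))
print(reduce_category("kayak"))
```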

METHODS AND TOOLS
The key tool used to achieve the goals of this research was Matlab software, with the following toolboxes: Deep Learning Toolbox Model for GoogLeNet Network, Deep Learning Toolbox, Image Processing Toolbox, Computer Vision Toolbox. This numerical computing environment is widely used for analyses related to artificial intelligence and machine learning, in particular for the classification and recognition of objects based on photos [21,22]. Matlab enables the use of ready-made components (functions, scripts, etc.), but it is also possible to edit and create additional components from scratch. We focused on verifying the potential of a ready-made tool, the pretrained GoogLeNet deep convolutional neural network (GoogLeNet CNN), for vessel recognition.

PRETRAINED GOOGLENET DEEP CONVOLUTIONAL NEURAL NETWORK (GOOGLENET CNN)
GoogLeNet CNN is a pretrained network with 22 layers. This network facilitates the classification of images in 1,000 object categories (there is no category related to vessel classification) in the ImageNet dataset. The Matlab environment gives the opportunity to train GoogLeNet CNN based on new datasets, as was done in this study.

ARTIFICIAL NEURAL NETWORK TRAINING AND CLASSIFICATION OF IMAGES PROCESS
Transfer learning, a commonly used approach, was applied to retrain the convolutional neural network on the new set of images in this study. This process does not involve training the network from the very beginning but uses a pretrained network as the starting point. Some pretrained layers can be reused when training on a new set. This saves much of the time needed to learn features for new collections and allows the network to be trained with a smaller number of images. Figure 3 presents the process used for this study, which follows an algorithm described in detail on the software manufacturer's website [23]. The process, which was aimed at developing a scheme for the classification of vessels, focused on testing one developed algorithm for network retraining and classification. The selection of the appropriate dataset for network training involved the development of several different sets of images that represented different classes.
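The idea of reusing pretrained layers and training only a new classification layer can be shown with a deliberately small, pure-Python sketch. It is not the Matlab workflow used in the study: the function `frozen_features` is a hypothetical stand-in for the pretrained layers, and the six two-dimensional "images" are made-up toy data.

```python
import math

def frozen_features(image):
    # Stand-in for the pretrained layers: a fixed mapping that is never updated.
    # ASSUMPTION: a real network would return a long feature vector here.
    x0, x1 = image
    return [x0, x1, 1.0]  # the trailing 1.0 acts as a bias input

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_head(samples, labels, num_classes, epochs=200, lr=0.5):
    """Train only the new classification layer on top of the frozen features."""
    feats = [frozen_features(s) for s in samples]
    width = len(feats[0])
    weights = [[0.0] * width for _ in range(num_classes)]
    for _ in range(epochs):
        grads = [[0.0] * width for _ in range(num_classes)]
        for f, y in zip(feats, labels):
            probs = softmax([sum(w * v for w, v in zip(row, f)) for row in weights])
            for c in range(num_classes):
                err = probs[c] - (1.0 if c == y else 0.0)  # cross-entropy gradient
                for j, v in enumerate(f):
                    grads[c][j] += err * v / len(feats)
        for c in range(num_classes):
            for j in range(width):
                weights[c][j] -= lr * grads[c][j]
    return weights

def predict(weights, image):
    f = frozen_features(image)
    scores = [sum(w * v for w, v in zip(row, f)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)

# Made-up toy data: two well-separated clusters standing in for two classes.
samples = [(-1.0, -0.8), (-0.9, -1.1), (-1.2, -0.7), (1.0, 0.9), (0.8, 1.2), (1.1, 0.7)]
labels = [0, 0, 0, 1, 1, 1]
head = train_head(samples, labels, num_classes=2)
accuracy = sum(predict(head, s) == y for s, y in zip(samples, labels)) / len(samples)
print(accuracy)
```

Because only the small weight matrix of the new layer is updated, far fewer examples and iterations are needed than when every layer is trained from scratch, which is the practical benefit described above.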
The process of training and testing began with image preprocessing. The best image preprocessing approach was selected by assessing the classification quality based on the confusion matrix. When the classification quality for a given set decreased, another way of developing the set was sought. When the quality increased, another variant was tested to verify whether a better alternative still existed. All changes that were made to the dataset were named variants and are described in Section 3 (Results). Variant testing was performed until the desired classification quality value was reached (above 82%).
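As an illustration of how a classification quality value can be read from a confusion matrix, the sketch below computes overall accuracy and per-class recall and compares the accuracy with the 82% target. It assumes that rows hold true classes and columns hold predicted classes, and that "classification quality" means overall accuracy; the matrix is a made-up example, not a result from the study.

```python
def overall_accuracy(matrix):
    """Share of all samples lying on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def per_class_recall(matrix):
    """For each true class (row), the share of its samples classified correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]

def meets_target(matrix, target=0.82):
    # 0.82 is the quality threshold quoted in the text.
    return overall_accuracy(matrix) > target

# Made-up 3-class matrix: rows are true classes, columns are predicted classes.
toy_matrix = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
print(overall_accuracy(toy_matrix), per_class_recall(toy_matrix), meets_target(toy_matrix))
```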

RESULTS
For each variant of the dataset, the GoogLeNet network was trained from the beginning and the classification was carried out. The scheme of division of chosen variants is presented in Figure 4. Table 1 presents information on the selection of individual variants, a description of the set and a comment on the results obtained.
The work began with the classes barge, other, special purpose service ships, motor yacht, passenger ship and sailing yacht. These classes were selected because of the place where the recordings were made: these vessels sail on inland waters. Successively, changes to class variants were carried out to verify the recognition potential among more classes. Collections with different class variants were developed. Some collections contain more detailed classes of recreational craft such as a kayak, rowing boat or pedalo. Others contain only a general classification of floating objects or combine several classes of objects, such as pushed convoys and barges combined into one class. The collections are as follows:
• Z1: barge, other, special purpose service ships, motor yacht, passenger ship, sailing yacht;
• Z2: barge, other, special purpose service ships, kayak, motor yacht, passenger ship, sailing yacht;
• Z3: barge, other, special purpose service ships, kayak, passenger ship, sailing yacht, small motor yacht, large motor yacht, pushed convoy, catamaran, rowing boat, pedalo, scooter, motorboat;
• Z4: barge, special purpose service ships, motor yacht, passenger ship, sailing yacht;
• Z5: barge, other, special purpose service ships, kayak, motor yacht, passenger ship, sailing yacht, pushed convoy;
• Z6: barge (with pushed convoy), other, special purpose service ships, kayak, motor yacht, passenger ship, sailing yacht.
Fig. 3. Diagram of the operation process during retraining of an artificial neural network
Initial testing consisted of variants W1 to W8 (see Table 1 for variant definitions). This was intended to generally verify the classification options based on GoogLeNet.
In this case, network training was carried out on 70% of the images in the set for each class (a value assumed arbitrarily by the authors, following its repeated use in the Matlab scripts for training and testing artificial neural networks); this division was automatic. At this stage, the structure of the data in the collections must be emphasized. The images were clippings from film frames, and often one unit is included in many images [24,25]. It should be noted that the same vessels often sailed on the river during data collection. This was characteristic of the places that were chosen for data collection. Because of these two aspects of the data, the classification quality results should be treated with caution, as the same units may have appeared in both the training and the testing set.
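An automatic per-class 70/30 division of this kind can be sketched as below. The function name, the fixed random seed and the rounding rule are assumptions for illustration, and the image identifiers are made up. Note that this split works on individual images, so several images of the same vessel can land on both sides, which is exactly the caveat raised above.

```python
import random
from collections import defaultdict

def split_per_class(items, train_fraction=0.7, seed=0):
    """items: list of (image_id, class_label). Returns (train, test) lists."""
    by_class = defaultdict(list)
    for image_id, label in items:
        by_class[label].append(image_id)
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, test = [], []
    for label in sorted(by_class):
        ids = by_class[label][:]
        rng.shuffle(ids)
        cut = round(train_fraction * len(ids))  # images of this class used for training
        train += [(i, label) for i in ids[:cut]]
        test += [(i, label) for i in ids[cut:]]
    return train, test

# Made-up identifiers: 10 barge images and 20 kayak images.
items = [(f"barge_{k}", "barge") for k in range(10)] + [(f"kayak_{k}", "kayak") for k in range(20)]
train, test = split_per_class(items)
print(len(train), len(test))
```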
The proper testing consisted of variants W9-W11. Here, the sets were manually divided into two, where the training was done on separate images for separate ships, and testing was also carried out on separate ships. In this case, the possibility of the same unit being duplicated in two sets was eliminated.
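The study performed this division manually. An automated stand-in for the same principle, in which whole vessels rather than single images are assigned to one side of the split, could look like the sketch below. The `vessel_id` field, the function name and the 70% share of vessels are assumptions of the sketch.

```python
import random

def split_by_vessel(items, train_fraction=0.7, seed=0):
    """items: list of (image_id, vessel_id, class_label). Returns (train, test)."""
    vessels = sorted({vessel_id for _, vessel_id, _ in items})
    rng = random.Random(seed)
    rng.shuffle(vessels)
    cut = round(train_fraction * len(vessels))
    train_vessels = set(vessels[:cut])
    # Every image follows its vessel, so no vessel appears in both sets.
    train = [item for item in items if item[1] in train_vessels]
    test = [item for item in items if item[1] not in train_vessels]
    return train, test

# Made-up data: 10 vessels with 3 images (film-frame clippings) each.
items = [(f"img_{v}_{k}", f"vessel_{v}", "barge") for v in range(10) for k in range(3)]
train, test = split_by_vessel(items)
print(len(train), len(test))
```

This simple version does not balance the classes between the two sets; with several classes, the same procedure could be applied to the vessels of each class separately.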
The details of the development of variants are presented in Figure 4 and Table 1.

Table 1 (fragment; only the final rows are recoverable)
Variant preceding W11:
- comment: repeated incorrect classification of the sailing yacht; this occurs because of the data representing the class: the sailing yacht is often shown without a mast or sail (it is cut off in the image) and is confused with a motor yacht;
- comment: decreased classification quality, also associated with the incorrect assignment of kayak and motorboat objects to the class "other" (some blurred objects in the images may be confused with a rowing boat, which is part of the class "other");
- next steps: verify the effect of merging the pushed convoy into the barge class; work on changing the image database, refining the algorithms for recognizing sailing yachts and cutting them out of the film frame together with the mast, and obtain better quality images.
W11:
- description: compared to W10, the pushed convoy is included in the barge class;
- comment: the quality of the classification has not changed significantly;
- next steps: as above.

Work was completed on the W11 variant because of the conclusions drawn. The trained network from the W11 variant (set Z6) was considered the best option, due to the availability of pictures associated with the pushed convoy and the frequency with which this unit occurs on inland waters. Figure 5 shows the confusion matrix for the W11 variant. It provides numerical values regarding the quality of the classification, giving the numbers of correctly and incorrectly classified vessels [27]. Examples of correctly classified vessels are presented in Figure 6, while incorrectly classified vessels are shown in Figure 7. These examples confirm that there is a problem with the classification of some yachts, motorboats and kayaks due to their strong similarity with other vessels.
In addition to the similarity of the hull of the vessels and incorrect cropping of the image, the authors put forward some hypotheses about the reasons for the misclassification of vessels, which will be the subject of further research. The following can be indicated:
- low image quality (photo resolution less than 224 × 224) in the test and training set;
- representation of the class in the training set by an insufficient number of differentiated units (with nonstandard characteristics);
- too few photos in the training set;
- problem with a lowered or invisible mast in the sailing yacht class.
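The first hypothesis can be checked mechanically by flagging images whose dimensions fall below 224 × 224 pixels. The sketch below works on image sizes only; treating an image as low resolution when either side is below 224 pixels is an assumption, and the file names are made up.

```python
MIN_SIDE = 224  # resolution threshold quoted in the text

def is_low_resolution(width, height, min_side=MIN_SIDE):
    # ASSUMPTION: an image counts as low resolution if either side is too short.
    return width < min_side or height < min_side

def partition_by_resolution(sizes):
    """sizes: dict of image name -> (width, height). Returns (adequate, low) name lists."""
    adequate = sorted(name for name, (w, h) in sizes.items() if not is_low_resolution(w, h))
    low = sorted(name for name, (w, h) in sizes.items() if is_low_resolution(w, h))
    return adequate, low

# Made-up example sizes.
sizes = {"yacht_01.png": (640, 480), "kayak_07.png": (180, 120), "barge_03.png": (224, 224)}
adequate, low = partition_by_resolution(sizes)
print(adequate, low)
```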

DISCUSSION AND CONCLUSIONS
The developed classification algorithm will be one of the key components of the SHREC system. Ship classification will be partly based on the retrained GoogLeNet network. Images representing the inland ships will be the input values. However, the output will take into account the probability values of assigning a particular class. The entire classification developed in the SHREC system will be complex and based on logical principles. Nevertheless, the data on the probability of the recognized class will be important supporting information toward making the final decision in ship recognition.
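How class probabilities might be passed on as supporting information can be sketched as follows. The softmax step mirrors the final layer of a classification network; the raw scores, the helper names and the optional probability cut-off are made-up illustrations and not part of the SHREC design described here.

```python
import math

def class_probabilities(scores, class_names):
    """Turn raw class scores into probabilities that sum to 1 (softmax)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return {name: e / total for name, e in zip(class_names, exps)}

def ranked_candidates(probabilities, min_probability=0.0):
    """Classes sorted from most to least probable, optionally cut off at a minimum."""
    ranked = sorted(probabilities.items(), key=lambda pair: pair[1], reverse=True)
    return [(name, p) for name, p in ranked if p >= min_probability]

# Made-up scores for three of the classes named in the text.
names = ["barge", "motor yacht", "sailing yacht"]
probs = class_probabilities([2.0, 1.0, 0.1], names)
print(ranked_candidates(probs))
```

A downstream rule-based component could then weigh the full ranked list instead of relying on a single hard label.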
For the purposes of achieving the best results in the classification of vessels, it is suggested to develop guidelines for photos constituting the basis for teaching the network. It is obvious that there should be as many images as possible for each category. Pictures for individual categories should be taken in the highest resolution and quality (color images with a resolution of 224×224 pixels are optimal). It is also important to view the vessel in the largest possible size (in the whole frame) as it allows the valuable details that characterize a given class to be captured. Similarly, it is important to use only those pictures on which the background is minimized, i.e., it is best that the vessel or craft is presented in open water, where the background will be water and sky. It is also crucial to gather a database of vessels without duplicates.
Assuming that the SHREC system will operate on CCTV cameras directed either from the side of the fairway or exactly along its axis, it is necessary to have a database of vessels that takes into account the view of the vessel from the bow, stern and each side. Proper classification is influenced by the shape of the bow or stern as well as by the silhouette of the superstructure or the characteristic elements on the ship's sides. For this purpose, the 45° angle scheme is recommended. For the purposes of unit classification, research on machine learning [28] will be carried out in parallel with the research on the GoogLeNet network. Future work involves the use of both the GoogLeNet network and machine learning as two components of the classification module. Where a given method brings better results, it will also be used in another component of the SHREC system, the identification module. For machine learning purposes, six classes of objects will be used:
1. inland barge: combining inland barges, pushers and pushed sets;
2. motor yacht;
3. sailing yacht;
4. passenger ship;
5. port service ships;
6. other: kayak, small boat, pedalo and all other.
GoogLeNet is intended to be used as a second step in the classification process, wherever it is possible to divide a class into individual unit types. A good example is Class 6 (other), which includes kayaks, small boats, pedalos and catamarans.
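The two-step arrangement described above can be sketched as a small dispatcher: a coarse classifier assigns one of the six classes, and a second classifier refines only those classes that can be subdivided. Both classifiers are passed in as plain functions; the demo functions and the dictionary-based "image" are hypothetical stand-ins, not the SHREC components.

```python
def classify_two_stage(image, coarse_classifier, fine_classifier, refine_classes=("other",)):
    """Return (coarse_label, fine_label); fine_label is None when no refinement applies."""
    coarse_label = coarse_classifier(image)
    if coarse_label in refine_classes:
        # Second step: refine a broad class into an individual unit type.
        return coarse_label, fine_classifier(image)
    return coarse_label, None

# Hypothetical stand-ins that simply read precomputed answers from the input.
def demo_coarse(image):
    return image["coarse"]

def demo_fine(image):
    return image["fine"]

print(classify_two_stage({"coarse": "other", "fine": "kayak"}, demo_coarse, demo_fine))
print(classify_two_stage({"coarse": "passenger ship", "fine": "n/a"}, demo_coarse, demo_fine))
```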
The goal established at the beginning of the research was achieved. The potential of the GoogLeNet network for inland ship classification has been demonstrated.