DEVELOPMENT OF USV AUTONOMY: ARCHITECTURE, IMPLEMENTATION AND SEA TRIALS

Summary This paper presents the development of autonomy capability for an unmanned surface vehicle (USV). The development focuses on high-level autonomy in perception, path planning, guidance and control, with the aim of real sea applications of the USV. First, visual recognition and point cloud processing techniques are used to perceive objects in the sea environment in real time. Second, path planning strategies are described that plan easily reachable paths for the different missions, and a classic guidance law and heading controller are adopted to implement the path following algorithm. These autonomy algorithms run on the high-level computer and produce the actuator commands for the low-level embedded control system. Finally, sea trials of the USV were conducted at the 2020 Zhuhai Wanshan International Intelligent Vessel Competition (IIVC) at Dong Ao Island in the South China Sea. The USV accomplished three missions: 1) path following, 2) navigating around the obstacle, and 3) rescuing the drowning. The sea trial results verify the autonomy of the USV in terms of the achieved performance.


Introduction
In recent years, unmanned aerial vehicles and ground robots have made great achievements in searching, mapping, and rescue missions. However, the development of marine robots, such as unmanned surface vehicles (USVs) [1,2,3,4], unmanned underwater vehicles (UUVs) [5,6] and underwater vehicle-manipulator systems (UVMS) [7], is quite challenging due to the complex working environment [8,9,10]. Several marine robot competitions now promote autonomy and its applications by setting practical missions for students, researchers, and professionals in ocean engineering.
The competition sets the following missions, ordered from easy to difficult, to examine the autonomy of the USV: 1) path following, 2) navigating around the obstacle, and 3) rescuing the drowning. The competition was held in the open sea for four days, regardless of weather and waves. The USV platform, provided by Zhuhai Yunzhou Intelligent Technology Ltd., consists of a catamaran hull, propulsion systems, power systems, and remote control modules. To provide navigation and perception capabilities, sensors such as a global navigation satellite system/inertial measurement unit (GNSS/IMU), a camera, and a lidar are installed on the USV. An integrated low-level control system is mounted on the USV; through the shore-based radio remote control equipment, it can take over motion control at any time to ensure the safety of the USV. A high-level computer running the autonomy algorithms enables the USV to perform specific missions autonomously through Ethernet communication with the low-level controller.
In the competition, once the USV is launched from the starting zone, no interaction between human operators and the USV is allowed during the entire run, except for monitoring the status of the USV remotely via the local wireless network. Hence, autonomy plays an essential role in the missions. The USV is required to process all information gathered from its sensors to know its own status and the surrounding environment, plan a desired path based on the mission, and drive itself along the path autonomously. The high-level autonomy architecture is designed to accomplish the different missions, and its performance is tested by integrating the proposed perception, planning, guidance, and control algorithms.
This paper is organized as follows: Section 2 describes the hardware architecture and software structure of the USV system. The autonomous strategies of the USV for each mission, based on the high-level autonomy, are described in Section 3, and the sea trial results are presented and discussed in Section 4. Finally, Section 5 summarizes the experience and lessons of the competition.

USV system
The USV system (Fig. 2) consists of the hardware and part of the software provided by the competition organizer, together with the high-level autonomy software developed by the participants, which runs on an industrial computer. Another computer runs the ground control station (GCS) software to visualize the status, tune the parameters, and assign the missions of the USV. The hardware and software architectures are described in detail below.
The USV platform with its shore-based monitoring system is shown in Fig. 2. A shore LAN is connected to the onboard LAN to support remote debugging. The hardware architecture, shown in Fig. 3, consists of five parts: 1) a catamaran hull, 2) a propulsion power system, 3) a sensor system, 4) a computer system, and 5) a wireless communication and remote controller system.
Catamarans resist lateral interference better than monohulls [16]. Their flexible assembly, greater task capacity, and larger deck area make them widely used as unmanned engineering boats. The catamaran used here consists of two buoyancy tanks, an upper equipment tank made of carbon fiber, and an aluminum alloy support tube; its principal dimensions are 2.5×1.4×1.5 m. The buoyancy tanks are fitted with a thruster and two batteries. The upper cabin is divided into interconnected front and rear control compartments. A vertical carbon fiber rod and a horizontal one are used to mount the monocular camera and the lidar to improve their perception range.

Sensor system
To complete the assigned missions autonomously, the status of the USV and information about the surrounding environment must be extracted from multiple sensors. The GNSS/IMU system is used to estimate the position and orientation in real time. To perceive the environment, a camera and two radars are installed to identify the color of the balls and estimate their positions relative to the USV.
Accurate navigation is the key to accomplishing the mission. Hence, an inertial navigation system (INS) device consisting of a 3-axis MEMS gyroscope and a 3-axis MEMS accelerometer is used. A strapdown inertial navigation algorithm is adopted to provide high-frequency pose (i.e., position and attitude) and velocity information. However, the state estimation error of a strapdown inertial navigation system (SINS) grows rapidly with time, and an initial state is required as the origin of the dead reckoning (DR) calculation [17]. Two GNSS modules are added to the INS device to supply the initial position and orientation. More importantly, integrating GNSS and IMU yields high-precision navigation. The fused navigation data is sent through the RS232 interface using the standard NMEA-0183 protocol.
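As an illustration of how the NMEA-0183 output of the INS could be checked and decoded on the receiving side, the following minimal sketch validates the sentence checksum and extracts the position. It assumes that the device emits standard GGA sentences, which the text does not state; the function names are hypothetical.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def is_valid(sentence: str) -> bool:
    """Check the framing and the checksum of one NMEA-0183 sentence."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, received = sentence[1:].partition("*")
    return nmea_checksum(body) == received.upper()[:2]


def dm_to_degrees(value: str, hemisphere: str) -> float:
    """Convert NMEA (d)ddmm.mmmm to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - 100 * degrees
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence: str) -> dict:
    """Extract latitude, longitude and fix quality from a GGA sentence.

    ASSUMPTION: the INS emits GGA sentences; the actual sentence types
    used on the USV are not given in the text.
    """
    if not is_valid(sentence):
        raise ValueError("bad NMEA framing or checksum")
    fields = sentence.strip()[1:].split("*")[0].split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    return {
        "lat": dm_to_degrees(fields[2], fields[3]),
        "lon": dm_to_degrees(fields[4], fields[5]),
        "fix_quality": int(fields[6]),
    }
```

Rejecting sentences with a bad checksum before parsing keeps corrupted serial data out of the navigation state.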
A monocular camera with its network server is used to collect visual information about the environment ahead of the USV. The camera has a field of view (FOV) angle of 98.2°. The output video frame rate is set to 10 fps and the resolution to 720p, according to the computational requirements.
One of the onboard radars is a 20-channel mechanical lidar, which builds a 3D image by rotating mechanically through 360° with 20 laser diodes inclined between +8° and -25° inside the housing. The point cloud data is sent through UDP at a frequency of 10 Hz. Although the lidar has a long detection distance and high precision, it performs poorly in extreme weather such as rain, snow, and fog. Therefore, a millimeter-wave (MMW) radar is also installed to handle these harsh conditions.

Computer system
There are three computers in the USV system. The embedded ARM computer is connected to the sensors and actuators and is responsible for forwarding sensor information and for processing and distributing commands. It is also connected to a radio frequency (RF) module to receive remote controller commands from the shore station. The other onboard computer is an industrial computer (IPC) based on an Intel i7, which runs the autonomy control software developed by the ARMs team. The IPC has two network cards, connected to the onboard router and the lidar respectively. In addition, a dedicated computer is deployed at the shore station to run the remote monitoring software.

Communication and remote control
An onboard router connects the embedded computer, the industrial computer, and the webcam to the same local area network (LAN). A pair of wireless bridges (effective distance about 5 km) connects the LAN of each USV in the competition with the shore station LAN, so that the three computers mentioned above can communicate with each other. In the early stage of testing, the USV can navigate autonomously with the IPC placed on the shore. Once the IPC is installed on the USV, the onboard software can be debugged and modified through remote tools.
The other wireless communication link is a 2.4 GHz RF link used for remote control. As shown in Fig. 2, the remote controller has a switch to change the USV working mode, a knob to adjust the maximum throttle, and two channels to control the motion of the USV. If the high-level autonomy software fails or an emergency operation is required, the remote controller ensures the safe return of the USV.
To keep the USV under control at all times, even if the high-level software on the IPC fails, robust software developed by the IIVC organizer runs on the embedded computer. This built-in software not only ensures the safety of the USV but also processes the sensor information and sends commands to the actuators, which provides a sound low-level basis on which the competition participants can develop their high-level autonomy software.

Software architecture
Because there is no direct link between the INS or the MMW radar and the IPC, a serial server maps the COM ports to TCP ports, and the high-level software running on the IPC obtains the raw output of the two sensors through TCP. In addition, a Message Queuing Telemetry Transport (MQTT) broker is used to transfer messages: it distributes the messages published by MQTT clients to the clients that subscribe to the same topic [18].
Two control interfaces are provided, as shown in Fig. 5. The message with the topic /Ctrl carries a payload of one float in the range -1 (full reverse speed) to 1 (full forward speed) representing the desired speed, a second float in the range -1 (hard port) to 1 (hard starboard) representing the desired turn of the USV, and one byte representing the communication priority. The message with the topic /Diff has the same frame, but its two floats, in the range -1 (full forward speed) to 1 (full reverse speed), control the rotation rates of the left and right thrusters, respectively. After receiving either MQTT message, the main program converts it and sends commands to the thrusters. The former interface is more intuitive and easier to use, but the latter allows the USV to turn on the spot, which is used in mission 1 of the competition.
The main program also receives MQTT messages from the RF module, including the thruster commands mentioned above. Additional mode information is taken from another message package. In teleoperation mode, the USV moves according to the commands from the remote controller; otherwise, it moves according to the commands from the IPC. As long as the USV is within RF range, the operator can take over control. The high-level autonomy software developed by the competition participants runs on the onboard IPC.
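The frame shared by the two control topics (two floats followed by one priority byte) could be serialized as in the sketch below. The byte layout, here little-endian 32-bit floats and an unsigned byte, is an assumption, since the text only names the three fields; the helper names are hypothetical, and publishing the bytes through an MQTT client is not shown.

```python
import struct

# ASSUMPTION: little-endian float32, float32, uint8. The text only states
# "two floats and a byte"; the real wire format may differ.
COMMAND_FORMAT = "<ffB"


def clamp(value: float, low: float = -1.0, high: float = 1.0) -> float:
    """Limit a command to the documented range [-1, 1]."""
    return max(low, min(high, value))


def pack_command(first: float, second: float, priority: int) -> bytes:
    """Build a payload for /Ctrl (speed, turn) or /Diff (left, right)."""
    if not 0 <= priority <= 255:
        raise ValueError("priority must fit in one byte")
    return struct.pack(COMMAND_FORMAT, clamp(first), clamp(second), priority)


def unpack_command(payload: bytes) -> tuple:
    """Inverse of pack_command: returns (first, second, priority)."""
    return struct.unpack(COMMAND_FORMAT, payload)
```

Clamping on the sender side keeps out-of-range controller outputs from reaching the low-level program.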
The key to winning the competition is a simple and feasible software architecture together with a robust autonomous program that drives the USV according to the missions under the disturbance of light, wind, and waves.
The software is divided into five threads, which use shared memory to exchange information. After initialization, the following five threads are created:
Thread 1 creates a TCP client, periodically receives INS data, and parses the navigation information according to the NMEA protocol.
Thread 2 uses OpenCV to get real-time images from the webcam through the Real Time Streaming Protocol (RTSP) and writes the color and the relative position of each detected object into the shared memory.
Thread 3 processes the point cloud from the lidar to obtain a precise relative position of the objects.
Thread 4 creates a TCP server and waits for a connection from the GCS software. It then receives commands from the operator and uploads status information to the GCS.
Thread 5 implements task strategy, path planning and motion control. First, the object information from the webcam and the lidar is fused. Second, the path planning strategy corresponding to the mission is selected, so that the desired waypoint sequence is planned in a uniform way. Then, the path following controller calculates the actuator commands that drive the USV along the planned path. Finally, the thread publishes the command message to the MQTT broker and saves the log.
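The shared-memory exchange between the threads can be pictured with the following minimal sketch, in which sensor threads write their latest result into a lock-protected structure and the control thread reads a consistent snapshot. The class, field and function names are hypothetical, and the sensor inputs are replaced by fixed sample lists so that the example is self-contained.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class SharedState:
    """Latest outputs of the sensor threads, guarded by one lock."""
    pose: tuple = (0.0, 0.0, 0.0)                 # x, y, heading from the INS thread
    objects: list = field(default_factory=list)   # detections from perception
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def update(self, **values) -> None:
        """Write one or more fields atomically."""
        with self._lock:
            for name, value in values.items():
                setattr(self, name, value)

    def snapshot(self) -> dict:
        """Return a consistent copy for the control thread."""
        with self._lock:
            return {"pose": self.pose, "objects": list(self.objects)}


def ins_worker(state: SharedState, samples) -> None:
    """Stand-in for the INS thread: stores each new pose."""
    for pose in samples:
        state.update(pose=pose)


def perception_worker(state: SharedState, frames) -> None:
    """Stand-in for the vision and lidar threads: stores each detection list."""
    for objects in frames:
        state.update(objects=objects)


def run_once(pose_samples, object_frames) -> dict:
    """Start both workers, wait for them, and return the final snapshot."""
    state = SharedState()
    workers = [
        threading.Thread(target=ins_worker, args=(state, pose_samples)),
        threading.Thread(target=perception_worker, args=(state, object_frames)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return state.snapshot()
```

A single lock is enough at the update rates mentioned in the text (about 10 Hz for camera and lidar); finer-grained locking would only matter with much heavier traffic.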
This architecture is simple and extensible: a sub-thread can be added to handle additional equipment, or a new planning strategy can be added for other applications. The sequence diagram of a typical application is shown in Fig. 6. The start command is issued by the remote operator. The motion control thread then calculates the actuator control signals according to the scheduled mission and the perception information. Meanwhile, the GCS thread uploads the status of the USV whenever the communication link is available. Finally, when the mission is accomplished, the remote operator is informed.
The ground control station software shown in Fig. 7 is developed with Qt on Windows 10. As shown in Fig. 8, the communication protocol between the software running on the IPC and the GCS software is MAVLink V1 [19], an open-source protocol widely used in various robots. After the payload is customized, the open-source tool MAVLink Generator is used to generate the API. The GCS software calls the API to parse messages and displays the USV track on a map and the status on a dashboard and as numerical values. The software can also pack task information and system parameters and send them out. Using the GCS software to display the motion status of the USV and tune parameters made debugging more efficient.

Autonomy control
Based on the hardware and software architectures described in the previous section, the high-level autonomy is designed to accomplish the IIVC missions. This section presents the perception, planning, guidance and control developments in more detail. Fig. 9 shows the three tasks of the IIVC competition mission: path following, navigating around the obstacle, and rescuing the drowning.

IIVC missions
Task 1 Path following: The USV sets off from the departure area and follows a predetermined route precisely. The path consists of five continuous straight lines determined by five GPS waypoints. Path tracking accuracy is defined as the average deviation of the real-time trajectory from the targeted straight line.
Task 2 Navigating around the obstacle: The USV departs from the vicinity of GPS point A and sails to GPS point B. During the voyage, the USV is required to identify and cross the start gate and the end gate, which are represented by black balls. In addition, when the USV recognizes the red ball, it should pass to the left of the ball, circle around it, and pass to the right of it, in sequence.
Task 3 Rescuing the drowning: The USV sails to the disaster area, where yellow balls and black balls are randomly arranged, representing drowning persons and obstacles respectively. Each circumnavigation of a yellow ball means that a drowning person is successfully rescued.

Perception
The camera is used to directly detect the presence and the color of the balls. To match these detections with the balls detected by the lidar, a rough position of each ball is also required. Fig. 10 shows the flowchart of the visual processing. To obtain more accurate relative positions, the image distortion is corrected using the calibration method of Zhang [20]. In the vision thread, the captured image is first undistorted and then converted to the HSV color space to facilitate color contrast and extraction [21]. By thresholding each HSV channel, the color of interest is extracted as a binary image [22]. After a set of morphological transformations, the ball is extracted with little noise. The relative bearing of the ball is calculated from the horizontal position of the circle contour in the image and the FOV, and the relative distance is estimated using the lens imaging principle [23]. The reflection of the ball must be filtered out because it would corrupt the distance estimate. Even with these measures, the position estimation accuracy is still low. The ball positioning data from the monocular vision is shown in Fig. 11. The static positioning error is less than 3 m, from which it can be inferred that the dynamic positioning error is larger. It is therefore necessary to fuse the visual data with the lidar data.
[24]. After the point cloud is obtained, it is first downsampled using a voxel grid approach, which reduces the number of points while preserving the shape characteristics of the point cloud.
This operation reduces the computation time of the subsequent segmentation and clustering algorithms. Filtering out the far points reduces the amount of data further. Plane model segmentation based on random sample consensus (RANSAC) is used to find the water surface points and filter them out, which improves the clustering results [25]. After this step, the water surface points are marked in red and the others in white. A fast clustering algorithm that combines Euclidean clustering with a K-dimensional tree (KD-tree) is used to segment the point cloud data [26]. Finally, each cluster is enclosed by a cube, whose center is taken as the center of the ball.
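The point cloud steps described above (voxel downsampling, removal of far points, Euclidean clustering and bounding-box centers) can be sketched in plain Python as follows. This is a simplified illustration rather than the onboard implementation: the RANSAC water-surface removal is omitted, the neighbour search is brute force instead of KD-tree based, and all function names and parameters are hypothetical.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Replace all points that fall into the same voxel by their centroid."""
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        cells[key].append(p)
    return [
        tuple(sum(axis) / len(cell) for axis in zip(*cell))
        for cell in cells.values()
    ]


def remove_far_points(points, max_range):
    """Keep only points within max_range of the sensor in the horizontal plane."""
    return [p for p in points if math.hypot(p[0], p[1]) <= max_range]


def euclidean_clusters(points, tolerance, min_size=1):
    """Group points connected by chains of neighbours closer than `tolerance`.

    Brute-force O(n^2) neighbour search; a KD-tree would replace the
    list comprehension below in a real-time implementation.
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, cluster = [seed], [seed]
        while queue:
            current = queue.pop()
            near = [i for i in unvisited
                    if math.dist(points[current], points[i]) <= tolerance]
            for i in near:
                unvisited.remove(i)
            queue.extend(near)
            cluster.extend(near)
        if len(cluster) >= min_size:
            clusters.append([points[i] for i in cluster])
    return clusters


def box_center(cluster):
    """Center of the axis-aligned bounding box of one cluster."""
    return tuple((min(axis) + max(axis)) / 2 for axis in zip(*cluster))
```

In use, the raw scan would pass through `remove_far_points` and `voxel_downsample` before clustering, and `box_center` of each cluster would give the candidate ball position.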
Another important part of perception is the complementary fusion of the ball information obtained by vision and lidar. Assuming that the position sensed by the lidar is accurate, the position calculated by vision is used only for matching, with the matching distance threshold set according to the visual positioning error. In this way, the color and the position of each ball are obtained at the same time.
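A minimal sketch of this matching step is given below. It assumes a pinhole camera model, that the 98.2° FOV is the horizontal field of view, and a body frame with x forward and y to starboard; none of these conventions is stated in the text, and the function names are hypothetical. Each visual detection is matched to the nearest unused lidar center within the threshold and inherits the lidar position.

```python
import math


def pixel_to_bearing(u, image_width, hfov_deg):
    """Bearing in rad (positive to starboard) of image column u.

    ASSUMPTION: ideal pinhole camera after undistortion, with hfov_deg
    being the horizontal field of view.
    """
    focal_px = (image_width / 2) / math.tan(math.radians(hfov_deg) / 2)
    return math.atan((u - image_width / 2) / focal_px)


def polar_to_xy(distance, bearing):
    """Convert range and bearing to body-frame x (forward), y (starboard)."""
    return distance * math.cos(bearing), distance * math.sin(bearing)


def fuse(vision, lidar_centers, max_match_distance):
    """Attach the color from vision to the position from lidar.

    vision: list of (color, x, y) rough positions from the camera.
    lidar_centers: list of (x, y) cluster centers from the lidar.
    Returns a list of (color, x, y) using the lidar coordinates.
    """
    fused, used = [], set()
    for color, vx, vy in vision:
        best, best_distance = None, max_match_distance
        for i, (lx, ly) in enumerate(lidar_centers):
            distance = math.hypot(lx - vx, ly - vy)
            if i not in used and distance <= best_distance:
                best, best_distance = i, distance
        if best is not None:
            used.add(best)
            fused.append((color, *lidar_centers[best]))
    return fused
```

A visual detection without a lidar center inside the threshold is dropped, which is one way in which an inaccurate visual position can make a ball disappear from the fused output.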

Guidance and control
The whole path in a mission can be divided into straight-line paths and arc paths, both of which are accessible paths for USVs. The function of the guidance and control part is to calculate the commands that drive the USV along the desired path. It determines the path following accuracy, which matters most in the first task.
The path following algorithm for multiple lines is shown in Algorithm 1. Its key components are the line-of-sight (LOS) guidance algorithm and a proportional-integral-derivative (PID) based yaw autopilot [27,28]. First, the along-track error and the cross-track error are calculated. Then, to deal with the underactuated nature of the USV, LOS is adopted as the guidance algorithm for line tracking, as shown in Fig. 13 [29]; it converts the path following problem into a course tracking problem using Eq. (1). In Eq. (1), the desired course is obtained from the cross-track error, the path tangential angle of the desired straight path, which is determined by two GPS waypoints, and the forward (lookahead) distance ∆, a tuning parameter that is usually set to 3-5 times the USV length.
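A common lookahead-based form of LOS guidance for a straight segment is sketched below. It follows the widely used convention in which the desired course is the path tangential angle plus arctan(-e/∆), with e the cross-track error; whether the onboard implementation uses exactly this sign convention and frame is an assumption, as are the function names and the definition of the remaining along-track distance.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in rad to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def los_course(p_prev, p_next, position, lookahead):
    """Lookahead-based LOS guidance for one straight path segment.

    Coordinates are local (north, east) in metres; angles are in rad,
    measured clockwise from north. Returns (desired_course,
    remaining_along_track, cross_track).
    ASSUMPTION: the remaining distance to p_next is used as the
    along-track quantity; the text does not define it precisely.
    """
    d_north = p_next[0] - p_prev[0]
    d_east = p_next[1] - p_prev[1]
    alpha = math.atan2(d_east, d_north)          # path tangential angle
    r_north = position[0] - p_prev[0]
    r_east = position[1] - p_prev[1]
    along = r_north * math.cos(alpha) + r_east * math.sin(alpha)
    cross = -r_north * math.sin(alpha) + r_east * math.cos(alpha)
    remaining = math.hypot(d_north, d_east) - along
    desired = wrap_angle(alpha + math.atan2(-cross, lookahead))
    return desired, remaining, cross
```

For a USV 5 m to starboard of a northbound path with a lookahead of 10 m, the sketch returns a desired course of about -26.6°, steering the vehicle back towards the line; a larger lookahead gives a gentler approach.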
Then, the classic PID yaw autopilot of Eqs. (2) and (3) drives the actual course to converge to the desired course. In this application, the current heading is used in place of the course, since the sensor data obtained from the INS is relatively stable [30]. In addition, the controller output is saturated between -1 and 1. The victory radius is set to 5 m: if the along-track error is less than the victory radius, the USV is regarded as having completed the current path, and the next desired path is loaded.

Task 1: Path following
Path following accuracy directly reflects the robustness and performance of the USV controllers under wind and wave disturbances. In task 1, LOS and PID are applied directly to follow multiple lines. Although the turning path within 5 m of the waypoints is not included in the accuracy calculation, decelerating and steering early cannot guarantee that the turn stays within this circle, especially when the turning angle is acute. Considering the good reversing ability of the thrusters, a dedicated strategy for path following in IIVC Task 1 is proposed, as shown in Algorithm 2. When the switching condition of Algorithm 2 is met (threshold 3), the USV turns on the spot by driving one propeller forward and the other in reverse through the MQTT message of the /Diff topic. After turning to the orientation of the next path, the USV switches back to straight-line following using Algorithm 1.
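The saturated PID yaw autopilot used for straight-line following can be illustrated with the sketch below. The heading error is wrapped so that the controller always turns the short way, and the output is limited to [-1, 1] as required by the turn command. The gains, the class name and the simple anti-windup rule are assumptions added for illustration and are not taken from Eqs. (2) and (3).

```python
import math


class YawPID:
    """PID heading controller with output saturation.

    ASSUMPTION: gains and the conditional-integration anti-windup are
    illustrative; the tuned onboard values are not given in the text.
    """

    def __init__(self, kp, ki, kd, limit=1.0):
        self.kp, self.ki, self.kd, self.limit = kp, ki, kd, limit
        self.integral = 0.0
        self.prev_error = None

    def step(self, desired, heading, dt):
        """Return the saturated turn command for one control period."""
        diff = desired - heading
        error = math.atan2(math.sin(diff), math.cos(diff))  # shortest turn
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        candidate = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate + self.kd * derivative
        command = max(-self.limit, min(self.limit, raw))
        if command == raw:
            # Integrate only while the output is not saturated.
            self.integral = candidate
        return command
```

The returned value can be used directly as the turn field of the speed and turn interface, since both share the range -1 to 1.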
Task 2: Navigating around the obstacle
Task 2 requires precise perception of the start gate, the end gate, and the yellow balls between them, which is achieved by the perception modules. The USV then takes five corresponding actions according to the rules. As shown in Fig. 15(b), the major task is defined as following line L, which is determined by the predefined GPS points A and B. When an object is recognized, the corresponding action is carried out. For example, the USV first tracks line L at the beginning of the mission. When the positions of the two black balls are obtained via information fusion, waypoint 1 is placed at the midpoint between the two balls. After sailing to waypoint 1, the USV tracks line L again. Similarly, waypoints 2-9 are planned until the USV enters the end gate.

Task 3: Rescuing the drowning
The lawn-mower path in Fig. 16(a) is automatically planned in the incident rectangular area, with a spacing determined by the FOV and the effective distance of the camera and lidar. The USV follows the lawn-mower path to search for the drowning. Unlike task 2, the number and positions of the yellow balls are random, so the actions of the USV cannot all be specified in advance. Following [31], the whole task is divided into states, as shown in Fig. 16(b). The definitions of the states and the trigger rules are given in Table 1 and Table 2 respectively. After receiving the start command from the operator, the USV switches into state S2 and runs the program according to the defined actions and switching logic. At state S4, the USV judges whether the ball is on the left or right side of the current path and plans the sequence of waypoints to circle the ball anticlockwise or clockwise, respectively.
Table 1 Definitions of states
State | Name | Action
S1 | To Line | Following lawn-mower path
S2 | On Line | Following lawn-mower path
S3 | To Ball | Perceiving and approaching the yellow ball in real-time
S4 | Around Ball | Planning and tracking the waypoints around the ball

Table 2 Trigger rules
Trigger | Description
< 3 | Cross-track error is less than 3 m
Get Ball Location | Perception module locates a yellow ball
Lost Ball Location | In the process of approaching the ball, USV loses its location
< 10 | Relative distance of the ball is less than 10 m
Arrive Waypoint 5 | USV arrives at waypoint 5
Arrive H | USV reaches the last lawn-mower waypoint
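The state switching logic can be expressed as a small table-driven state machine, as in the sketch below. The state and trigger names follow the two tables, but the transition map itself is an assumption, because the state diagram of Fig. 16(b) is not reproduced in the text; the terminal state and the function names are hypothetical.

```python
from enum import Enum


class State(Enum):
    TO_LINE = "To Line"
    ON_LINE = "On Line"
    TO_BALL = "To Ball"
    AROUND_BALL = "Around Ball"
    DONE = "Done"  # hypothetical terminal state, not listed in the tables


# ASSUMED transitions; the authoritative map is the state diagram in the paper.
TRANSITIONS = {
    (State.TO_LINE, "cross_track_lt_3m"): State.ON_LINE,
    (State.ON_LINE, "get_ball_location"): State.TO_BALL,
    (State.TO_BALL, "lost_ball_location"): State.TO_LINE,
    (State.TO_BALL, "distance_lt_10m"): State.AROUND_BALL,
    (State.AROUND_BALL, "arrive_waypoint_5"): State.TO_LINE,
    (State.ON_LINE, "arrive_h"): State.DONE,
}


def next_state(state, trigger):
    """Return the successor state; unknown pairs leave the state unchanged."""
    return TRANSITIONS.get((state, trigger), state)


def run(triggers, start=State.TO_LINE):
    """Apply a sequence of triggers and return the visited states."""
    state, history = start, [start]
    for trigger in triggers:
        state = next_state(state, trigger)
        history.append(state)
    return history
```

Keeping the transitions in one dictionary makes it easy to add a state for a new mission without touching the control loop.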

Sea trials
In this section, sea trial results from the commissioning stage of the competition are presented, which verify the performance of the USV autonomy in the sea environment.
Task 1 evaluates the path tracking accuracy of the USV directly in a real sea environment. Both the common strategy and the improved strategy were tested during the commissioning stage, and the results are compared in Fig. 17 and Fig. 18. Both strategies track the pre-planned path with a tracking error within 0.2 m in the straight-line following stage. With the improved strategy, however, the offset at the corner waypoints is less than 5 m, which reduces the average tracking error over the whole path following mission to 0.47 m.
Fig. 17 Trial result of the common strategy for task 1
Fig. 18 Trial result of the improved strategy for task 1

Task 2: Navigating around the obstacle
According to the strategy designed for task 2, the USV perceives each target and takes the corresponding action. Fig. 19(a) shows the actual scene of task 2 and Fig. 19(b) records the whole trajectory of the USV. As designed, when the USV perceives the first yellow ball, it plans a waypoint to perform the left avoidance action. After reaching the waypoint, the USV follows the route again. Similarly, the USV orbits around the second yellow ball and then executes the right avoidance action.

Task 3: Rescuing the drowning
Fig. 20(a) shows the sweeping trajectory, in which the USV detects two yellow balls and navigates around them successfully. Because there are no requirements on radius or shape, 5 waypoints are planned to generate a circular path around the ball. Fig. 20(b) plots the state transition curve, which satisfies the predesigned state switching mechanism. At one moment, state S3 fails to switch to state S4, because the low accuracy of visual positioning causes the information fusion to fail and the ball position to be lost. This indicates that the information fusion of the ball detections has considerable room for improvement.
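The Task 1 accuracy figure, the average deviation of the logged trajectory from the target lines with the 5 m circles around the waypoints excluded, could be computed as in the following sketch. Assigning each track point to its nearest path segment is an assumption, as are the function names; the waypoints are assumed to be distinct and given in local metric coordinates.

```python
import math


def point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b (2D, metres)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def mean_tracking_error(track, waypoints, exclusion_radius=5.0):
    """Average deviation of the track from the waypoint path.

    Points closer than exclusion_radius to any waypoint are skipped,
    mirroring the rule that turns near waypoints are not scored.
    ASSUMPTION: each remaining point is scored against its nearest segment.
    """
    errors = []
    for p in track:
        if any(math.hypot(p[0] - w[0], p[1] - w[1]) < exclusion_radius
               for w in waypoints):
            continue
        errors.append(min(point_to_segment(p, a, b)
                          for a, b in zip(waypoints, waypoints[1:])))
    return sum(errors) / len(errors) if errors else 0.0
```

Applied to a logged trajectory and the five competition waypoints, this yields a single number comparable to the average tracking error reported above.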

Conclusion
This competition has put forward technical requirements for the high-level autonomy of the USV. The experience gained from the competition regarding USV autonomy is summarized as follows.
Open-source tools and libraries: For the development of an autonomous vehicle, perception, planning, control, communication, and other techniques are indispensable. It is difficult to master vision or point cloud processing methods in a short time. However, open-source tools such as OpenCV, PCL, and the MAVLink protocol provide a quick and convenient way to implement these technologies in practice. The documentation and examples of these open-source libraries help developers learn basic operations, explore solutions, and quickly integrate open-source modules into practical applications. However, simply calling and combining the APIs did not work well in practice, so proper optimization and improvement are necessary.
USV architecture: The architecture, including the onboard hardware and software and the shore-based system, is designed to be flexible and reliable. The separation of the high-level and low-level autonomy layers enables the IIVC participants to design their autonomous control algorithms according to the requirements and to quickly access the low-level control system to drive the USV. The stable low-level control mode can be switched to remote control mode at any time to ensure safe navigation in case of emergency. The wireless bridges place several USVs and the shore-based control stations on the same LAN. This existing communication architecture enables the development of multi-USV formation, cluster and cooperative operations.
Perception by vision and lidar: This competition places requirements on environmental perception. Traditional visual object detection methods are difficult to use in the outdoor environment, especially in the color extraction step. Direct light or backlight conditions and changes in the intensity of natural light may affect the threshold settings in the color extraction step. Target identification based on deep learning, such as YOLO, might perform better in the outdoor environment. In addition, the visual positioning method in this paper is affected by many factors that may result in low accuracy. Stereo vision is widely used in visual positioning and could be used for this competition in the future. Lidar has good detection accuracy at long distances. For applications that demand higher accuracy, the influence of the angular motion of the USV should be considered.
In summary, this paper presents the development of the USV autonomy architecture by the ARMs Team for the 2020 Zhuhai Wanshan International Intelligent Vessel Competition. The development focuses on the high-level autonomy of the USV to satisfy the requirements of the competition. The sea trial results in real sea environments show the feasibility and the performance of the autonomy architecture of the USV.