Model-Predictive Control for Omnidirectional Mobile Robots in Logistic Environments Based on Object Detection Using CNNs

Object detection is an essential component of autonomous mobile robotic systems, enabling robots to understand and interact with their environment. Object detection and recognition have made significant progress using convolutional neural networks (CNNs). Widely used in autonomous mobile robot applications, CNNs can quickly identify complicated image patterns, such as objects in a logistic environment. The integration of environment perception algorithms with motion control algorithms is a topic subject to significant research. On the one hand, this paper presents an object detector for a better understanding of the robot environment, together with the newly acquired dataset. The model was optimized to run on the mobile platform already on the robot. On the other hand, the paper introduces a model-based predictive controller that guides an omnidirectional robot to a particular position in a logistic environment based on an object map obtained from a custom-trained CNN detector and LiDAR data. Object detection contributes to a safe, optimal, and efficient path for the omnidirectional mobile robot. In a practical scenario, we deploy a custom-trained and optimized CNN model to detect specific objects in the warehouse environment. Then, we evaluate, through simulation, a predictive control approach based on the objects detected using CNNs. Results are reported for object detection with a custom-trained CNN on an in-house acquired dataset running on a mobile platform, and for the optimal control of the omnidirectional mobile robot.


Introduction
The mobile robots sector has seen a global rise over the past decade. Industrial mobile robots are becoming more advanced to achieve higher levels of autonomy and efficiency in various industries [1]. These robots are equipped with sophisticated sensors, such as Light Detection and Ranging (LiDAR), stereo cameras, Inertial Measurement Units (IMUs), and a global positioning system or an indoor positioning system, to gather information about the work environment and make well-informed decisions [2]. This is made possible by using complex algorithms for path planning, obstacle avoidance, and task execution. Furthermore, autonomous mobile robots, grouped in fleets, are often integrated with cloud-based technologies for remote monitoring and control, allowing for greater flexibility and scalability in their deployment.
Path planning is a crucial aspect of mobile robot navigation because of the need to perform a task by moving from one point to another while avoiding obstacles and satisfying multiple constraints, among which are time, the level of autonomy given by the available energy, and, significantly, maintaining safety margins with respect to human operators and transported cargo. Mobile robot navigation is still one of the most researched topics today, addressed through two main categories: classical and heuristic navigation. Among the classical approaches, some planners generate sub-optimal trajectory clusters based on unique topologies [27]. To overcome the mismatch problem between the optimization graph and the grid-based map, the authors of [28] suggested an egocentric map representation for a timed elastic band in an unknown environment. These path-planning methods are viable and pragmatic, and acquiring a desired path in various scenarios is generally possible. Yet, these approaches can have multiple drawbacks, such as local minima, a low convergence rate, a lack of robustness, and substantial computation. Additionally, in logistic environments where omnidirectional mobile robots (OMRs) are equipped with conveyor belts to transport cargo, it is essential to guarantee low translational and rotational accelerations for the safety of the transported cargo. Therefore, we propose a nonlinear predictive control strategy on a reduced model where we can include the maximum accelerations and velocities of the wheels within inequality constraints derived from the obstacle positions obtained from environment perception sensors (i.e., LiDAR and video camera). To tackle the problem of local minima, we propose a variable cost function based on the proximity of obstacles ahead to balance the global objectives.

Object Detection for Mobile Platforms
Deep neural networks specifically created to analyze organized arrays of data (i.e., images) are known as convolutional neural networks, often called CNNs or ConvNets. CNNs offer solutions to computer vision challenges that are difficult to handle using conventional methods, and they quickly advanced the state of the art in areas such as semantic segmentation, object detection, and image classification. They are widely used in computer vision because they can quickly identify image patterns (such as lines and gradients, or more complex structures such as eyes and faces). CNNs are feed-forward neural networks with convolutional layers; with these specific layers, CNNs attempt to mimic the structure of the human visual cortex.
Object detection implies the localization of object instances in images. Object recognition, in general, assigns a class to the identified objects from a previously learned class list. Object localization operates at the bounding-box level and has no notion of different classes. Although the two were initially distinct tasks, the phrase "object detection" now encompasses both activities. So, before continuing, let us be clear that object detection includes both object localization and object recognition.
Object detection and recognition is an essential field of study in the context of autonomous systems. The models can be broadly divided into one-stage and two-stage detectors. One-stage detectors are designed to detect objects in a single step, making them faster and more suitable for real-time applications, such as path planning based on object detection for a moving system. On the other hand, two-stage detectors use a two-step process, first proposing regions of interest and then looking for objects within those areas. This approach excludes irrelevant parts of the image, and the process is highly parallelizable. However, it comes at the cost of being slower than one-stage detectors.
To meet the constraints of the Nvidia Jetson mobile platforms considered for the OMR, lightweight neural networks were investigated for object detection and recognition. Among the models evaluated, YoloV5 [29], SSD-Mobilenet-v1 [30], SSD-Mobilenet-v2-lite [31] and SSD-VGG16 [32] were trained and tested. Earlier, the YoloV4 [33] model had already made significant improvements over the previous iteration by introducing a new backbone architecture and modifying the neck of the model, resulting in an improvement of mean average precision (mAP) by 10% and an increase in FPS by 12%. Additionally, the training process has been optimized for single GPU architectures, like the Nvidia Jetson family, commonly used in embedded and mobile systems.
A particular implementation is YoloV5 [29], which differs from other Yolo implementations using the PyTorch framework [34] rather than the original Yolo Darknet repository. This implementation offers a wide range of architectural complexity, with ten models available, starting from the YoloV5n (nano), which uses only 1.9M parameters, up to the YoloV5x6 (extra large), which uses 70 times as many parameters (140M). The lightest models are recommended for Nvidia Jetson platforms.
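As an illustration, a YoloV5 checkpoint can be pulled directly through PyTorch Hub; the sketch below is minimal, and the image path and confidence threshold are placeholders rather than values used in this work:

```python
# Minimal sketch: load a pretrained YoloV5 model via PyTorch Hub and run
# inference on one image. Path and threshold are illustrative placeholders.
import torch

# 'yolov5n' is the 1.9M-parameter nano variant discussed above.
model = torch.hub.load('ultralytics/yolov5', 'yolov5n', pretrained=True)
model.conf = 0.4  # confidence threshold (assumed value)

results = model('warehouse_scene.jpg')  # image path is a placeholder
# Each row of the output: x1, y1, x2, y2, confidence, class index.
print(results.xyxy[0])
```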
Recently, there has been increasing interest in developing object detection and recognition algorithms for mobile devices. A popular approach is the Single Shot Detector (SSD) [35] neural network, a one-step algorithm. To further improve the efficiency of the SSD algorithm on mobile devices, researchers have proposed various modifications to the SSD architecture, such as combining it with other neural network architectures. One such modification is the use of the SSD-MobileNet and SSD-Inception architectures, which combine the SSD300 [35] neural network with various backbone architectures, such as MobileNet [30] or Inception [36]. These architectures are recognized for their real-time object detection capabilities on mobile devices, such as the Nvidia Jetson development platforms.
These methods for object detection perform very well in general detection tasks. Yet, datasets and pretrained models are scarce for objects specific to the OMR environment, such as fixed or mobile conveyors, charging stations, and other OMRs. We have therefore acquired our own dataset and deployed domain-specific models for object detection in the OMR environment. We summarize the main contributions of this paper to the field of object detection and OMR control in logistic environments below:
• Acquisition of a dataset for object detection in the OMR environment;
• Investigation of domain-specific models for object detection and provision of a model to be used in an OMR environment;
• Deployment of an image acquisition and object detection module fit for the real-time task of OMR control;
• A joint perception and control strategy based on non-linear model-predictive control;
• Avoidance of local minima by using switched cost function weights to navigate around obstacles while still achieving the overall objective of decreasing travel distance;
• Guaranteed maximum wheel speed and acceleration through the constrained non-linear MPC, in order to ensure safe transportation of cargo.
The rest of the paper is organized as follows: Section 2 discusses object detection in the context of the OMR's logistic environment. First, equipment experiments are described for the image acquisition sensor and the processing unit. We also describe the creation of the object detection dataset and object mapping in 2D and 3D perspectives. Section 3 is dedicated to the modeling and control of the OMR. We introduce the mathematical model used for developing the control strategy, followed by the formulation of the optimization problem considering the environmental objects. In the last two sections, we discuss the object detection results and the simulation of the control algorithm, conclude, and emphasize future work goals.

Image Acquisition and Processing Unit
For the image acquisition unit, we analyzed four depth cameras. Depth information is needed to accurately place the detected objects on the 2D and 3D maps of the environment. The predictive control task relies on object maps. The most important features considered for the experiments were the correctness of the depth information and the integration of the camera with the Nvidia Jetson platforms, which are already in use on the Omnidirectional Robot.
All Zed cameras perform well in indoor environments but, as can be seen in Figure 1, the far-depth information provided by the Zed 2i is significantly better; it also has the largest FoV. For the Intel RealSense, depth information is completely missing beyond 10 m. Based on the image acquisition experiments performed in the OMR environment, the Zed 2i was chosen to be integrated into the robot. Nvidia Jetson system-on-chip platforms are already used on the OMR. Several experiments evaluated the computational capabilities, the detection precision, and the dependency between inference time and resolution. Localization is very important in our defined use cases for the OMR environment. The MS COCO dataset [37] was used for object detection evaluation across different lightweight neural networks, such as Mobilenet [30,31] and Yolo [29,33], which are suitable for mobile platforms.
The neural networks used for the first experiment are optimized with TensorRT to run on the Jetson mobile platforms. In Table 1, we can see the run-time measurements for the selected models from the SSD family. The same solution takes considerably more time to run on the Jetson Nano. A second experiment examines how the processing time evolves depending on the image resolution. Table 2 presents the results in terms of FPS on a test subset from the Cityscapes dataset [38,39]. The results emphasize that the inference time depends on the size of the images provided at the input: the higher the image resolution, the slower the model. The Jetson Xavier AGX is 4 to 6 times faster than the Jetson Nano, depending on the model and the input resolution. Following the analysis of the hardware and the experimental measurements, the Nvidia Jetson Xavier AGX was chosen as the architecture that meets the minimum requirements for object detection in the OMR environment.
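For reference, a resolution-versus-latency measurement of this kind can be sketched as below. A torchvision SSDLite detector stands in for the TensorRT-optimized engines used on the Jetson boards; the resolutions, repeat count, and model choice are illustrative assumptions, and since the model's internal transform also resizes inputs, this measures the end-to-end pipeline:

```python
# Minimal latency-vs-resolution benchmark sketch.
# The torchvision SSDLite model is a stand-in for the TensorRT engines.
import time
import torch
from torchvision.models.detection import ssdlite320_mobilenet_v3_large

model = ssdlite320_mobilenet_v3_large(weights='DEFAULT').eval()

for h, w in [(404, 720), (512, 1024), (1024, 2048)]:
    x = [torch.rand(3, h, w)]          # one random image at the test resolution
    with torch.no_grad():
        model(x)                        # warm-up run
        t0 = time.perf_counter()
        for _ in range(10):
            model(x)
        dt = (time.perf_counter() - t0) / 10
    print(f'{w}x{h}: {dt * 1000:.1f} ms/frame ({1 / dt:.1f} FPS)')
```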

The Omnidirectional Robot Object Detection Dataset (OROD)
Accurately detecting and recognizing objects specific to the OMR environment is crucial for the efficient operation of autonomous robots. Using the ZED 2i camera, we have acquired a new dataset for the object detection task that contains objects specific to the omnidirectional robot environment. The "Omnidirectional Robot Object Detection (OROD)" dataset includes charging stations, construction cones, mobile conveyors, and different types of fixed conveyors. The images in the dataset were captured using the camera mounted on an omnidirectional robot and were annotated with bounding boxes of objects. The dataset is intended to evaluate the performance of object detection algorithms in an omnidirectional robot environment.
The OROD dataset contains 1343 images, each labeled with the objects of interest in the scene. The images were collected in different environments, such as industrial warehouses and logistics centers, to reflect the various scenarios in which an omnidirectional robot operates. Additionally, the dataset includes images with varying lighting conditions, occlusions, and different orientations of the objects to represent real-world challenges in object detection. The training subset was augmented for better results by applying flip, rotation, zoom, hue, saturation, blur, noise, etc. The original and augmented datasets were split according to the figures from Table 3. Examples of the augmented images can be visualized in Figure 2. The dataset augmentation process did not change the initial class distribution; it only scaled the number of samples by a factor of three. The OROD dataset is the first to focus specifically on object detection in the context of an omnidirectional robot environment. It is intended to serve as a reference for evaluating the performance of object detection algorithms in this context and to promote research in this field.
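For illustration, an augmentation pipeline of the kind described above can be sketched with torchvision; the parameter values below are assumptions, and a detection pipeline must additionally transform the bounding-box annotations consistently (as YoloV5's built-in augmentation does):

```python
# Sketch of the described augmentations (flip, rotation, zoom, hue,
# saturation, blur, noise); all parameter values are illustrative.
import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.RandomResizedCrop(size=640, scale=(0.8, 1.0)),      # zoom
    transforms.ColorJitter(hue=0.05, saturation=0.3),
    transforms.GaussianBlur(kernel_size=5),
    transforms.Lambda(lambda x: (x + 0.01 * torch.randn_like(x)).clamp(0, 1)),  # noise
])

image = torch.rand(3, 720, 1280)   # stand-in for an OROD image tensor
augmented = augment(image)
```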

Detected Objects in 3D and Mapping
All objects detected by the custom-trained model, along with the distances to them, are visible on the left side of Figure 3. Their 3D position is also exemplified on the right side. The distance between the scene object and the camera is measured from the back of the left eye of the camera and is given in meters.
In the context of an OMR moving through its environment, an important capability is continuous awareness of its position and rotation relative to the starting point, in our case the charging station. Examples of the OMR position and orientation are listed at the bottom of the frames in Figure 3. As a benefit of the IMU integration with the Zed 2i, we can obtain the camera position, rotation, and quaternion orientation. In addition to the ZED 2i camera, the OMR is equipped with two LiDAR sensors for a 360-degree map. At this stage, the LiDAR data are empirically merged with the detected objects to obtain a bird's-eye view map of the entire environment. In Figure 4, we can see the obtained map of the environment with the detected objects shown in Figure 3.
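As an illustration of how a detection is placed on the bird's-eye map, the following sketch back-projects the bounding-box center using the depth measurement and the camera pose; the pinhole intrinsics and the sample pose are placeholder values, not calibration data:

```python
# Sketch: bounding-box center (pixel column u) + depth + camera pose -> (x, y).
# Intrinsics (fx, cx) and the example pose below are placeholders.
import numpy as np

def detection_to_map(u, depth, fx, cx, cam_xy, cam_yaw):
    x_cam = (u - cx) * depth / fx            # lateral offset in camera frame [m]
    z_cam = depth                            # forward distance [m]
    c, s = np.cos(cam_yaw), np.sin(cam_yaw)
    gx = cam_xy[0] + c * z_cam - s * x_cam   # rotate into the global frame
    gy = cam_xy[1] + s * z_cam + c * x_cam   # and translate by the camera position
    return gx, gy

# Object at pixel column 640, 4.2 m away; robot at (1.0, 2.0) facing +x.
print(detection_to_map(640, 4.2, 700.0, 640.0, (1.0, 2.0), 0.0))
```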

Model-Predictive Motion Control of OMR
We derive the motion control strategy of the OMR based on a non-linear optimization algorithm as the core of the motion controller. We define in Section 3.1 the mathematical model used in the prediction step of the controller. The continuous-time equations are discretized by the Euler method to realize the numerical implementation. Then, we define the physical constraints of the robot's actuators (i.e., omnidirectional wheel speed and acceleration) and the geometrical constraints of the objects (i.e., circumscribed circles of objects). We formulate the optimization problem considering the global objective of navigating on the shortest path, avoiding obstacles, and limiting the movement of the OMR within actuator limits.

Mathematical Model of the 3DOF Omnidirectional Robot
In this section, we define the discrete mathematical model used in the model-predictive controller to generate short-term paths and control the robot's movement along the predicted trajectory. Equation (1) depicts the inverse kinematics matrix representation:

$$\omega = J \begin{bmatrix} v_x & v_y & \Omega \end{bmatrix}^t, \tag{1}$$

where $v_x$ and $v_y$ are the longitudinal and lateral velocities of the OMR, respectively, $\Omega$ defines the angular speed along the normal axis, and $\omega_j$, $j = 1..4$, are the individual wheels' angular velocities, while $J$ is the inverse kinematic Jacobian matrix of the OMR defined in (2) [1]:

$$J = \frac{1}{R}\begin{bmatrix} 1 & -1 & -(l_x + l_y) \\ 1 & 1 & (l_x + l_y) \\ 1 & 1 & -(l_x + l_y) \\ 1 & -1 & (l_x + l_y) \end{bmatrix}. \tag{2}$$

The forward kinematics of the 3DOF system are obtained from the lateral, longitudinal, and rotation velocities:

$$\begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} v_x \\ v_y \\ \Omega \end{bmatrix} = J^{+}\omega, \tag{3}$$

where $x$, $y$, and $\theta$ are the plane coordinates and robot orientation, respectively. Moreover, $R$ is the wheel radius, $l_x$ defines the distance from the GC to the front axle, while $l_y$ defines the half distance between the left and right wheels.
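A minimal numerical sketch of these kinematic relations is given below. The geometric parameters are placeholders (the actual values are in Table 4), and the row signs of $J$ follow the standard mecanum convention, which may differ from the convention of [1]:

```python
# Numerical sketch of the mecanum-wheel kinematics in (1)-(3).
# R, lx, ly are placeholder values; see Table 4 for the real parameters.
import numpy as np

R, lx, ly = 0.05, 0.2, 0.2

# Inverse kinematic Jacobian: wheel speeds = J @ [vx, vy, Omega]
J = (1.0 / R) * np.array([
    [1.0, -1.0, -(lx + ly)],
    [1.0,  1.0,  (lx + ly)],
    [1.0,  1.0, -(lx + ly)],
    [1.0, -1.0,  (lx + ly)],
])

# Forward kinematics via the pseudo-inverse J+ = (J^T J)^-1 J^T.
J_plus = np.linalg.pinv(J)

omega = J @ np.array([0.5, 0.0, 0.1])   # wheel speeds for a sample twist
print(J_plus @ omega)                    # recovers [vx, vy, Omega]
```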
Pragmatically, it can be considered that deviations from the nominal kinematic model act on the system input. Therefore, we can design an input disturbance observer to compensate for unmodeled dynamics and disturbances. Let us define the disturbance acting on the system input as

$$\tilde{\omega} = \omega + F,$$

where the additive terms $F$ act on the system inputs. The observer is designed considering the inverse kinematics of the process. An additional pole is added for the realizability of the observer. We define $Q$ as a passive (i.e., unitary gain) first-order low-pass filter diagonal matrix. We define the estimated input disturbance as:

$$\hat{F} = Q\left(J\dot{X} - \omega\right).$$

The discretized-time model of (3) is obtained by backward rectangle area approximation (i.e., the Euler method). Therefore, the system Equation (3) can be re-written in the state-space framework $\dot{X} = AX + J^{+}\omega$, where the state transition matrix $A$ is null, the state vector is $X = [x\ y\ \theta]^t$, while the input matrix $J^{+}$ is defined as $J^{+} = (J^T J)^{-1} J^T$. Thus, we obtain the discretized-time model of the OMR in global coordinates:

$$X_{k+1} = I_3 X_k + T_s J^{+} \omega_k,$$

where $I_3 \in \mathbb{R}^{3\times3}$ is the unity matrix, $X_{k+1} = [x_{k+1}\ y_{k+1}\ \theta_{k+1}]^t$ is the state vector at iteration $k+1$, $T_s$ is the sampling time, and $\omega_k = [\omega_{1,k}\ \omega_{2,k}\ \omega_{3,k}\ \omega_{4,k}]^t$ is the input vector.
To improve the controller behavior with respect to deviations of the model and input perturbations, the extended discretized model can be used for state and output predictions within the MPC solver:

$$X_{k+1} = I_3 X_k + T_s J^{+}\left(\omega_k + \hat{F}_k\right).$$

Table 4 contains the parameters of the mobile robot and the sampling time considered for the time-discretization of the process.
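A one-step sketch of this discretized prediction model with the disturbance estimate follows; the sampling time, filter constant, and geometry are assumed values:

```python
# Sketch: one Euler prediction step with input-disturbance compensation.
# R, lx, ly, Ts, and the filter constant tau are placeholder values.
import numpy as np

R, lx, ly, Ts, tau = 0.05, 0.2, 0.2, 0.1, 0.5
J = (1.0 / R) * np.array([[1, -1, -(lx + ly)], [1, 1, lx + ly],
                          [1, 1, -(lx + ly)], [1, -1, lx + ly]], dtype=float)
J_plus = np.linalg.pinv(J)            # J+ = (J^T J)^-1 J^T

def predict_step(X, omega, F_hat, X_dot_meas):
    X_next = X + Ts * (J_plus @ (omega + F_hat))   # extended discretized model
    residual = J @ X_dot_meas - omega              # apparent input mismatch
    a = Ts / (Ts + tau)                            # discrete first-order low-pass gain
    F_next = (1 - a) * F_hat + a * residual        # updated disturbance estimate
    return X_next, F_next
```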
The optimization problem solved at each sampling time is formulated as:

$$\min_{\omega(\cdot|k)}\; J_k \quad \text{subject to} \quad C_k \le 0,$$

where $J_k$ is the cost function defined in (9); $x(\cdot|k)$, $y(\cdot|k)$, and $\theta(\cdot|k)$ are the solutions of the optimization problem; $\omega_{UB}$ and $a_{UB}$ are the upper bounds of the angular velocity and acceleration of the wheels, respectively; and $C_o$ is the geometric constraints vector defined in (22). The cost function $J_k$ is defined by:

$$J_k = \sum_{i=0}^{H-1}\Big[w_x(\alpha)\big(x(i|k)-x_{r,i}\big)^2 + w_y(\alpha)\big(y(i|k)-y_{r,i}\big)^2 + w_\theta\big(\theta(i|k)-\theta_r\big)^2\Big] + w_{Tx}\big(x(H|k)-x_{r,H}\big)^2 + w_{Ty}\big(y(H|k)-y_{r,H}\big)^2 + w_p(\alpha)\,d_p(X_0, X_f), \tag{9}$$

where $X_r \in \mathbb{R}^{H\times3}$ is the reference trajectory matrix of the OMR over the prediction horizon $H$, and $d_p(X_0, X_f)$ is the length of the projection of the OMR geometric center over the ideal straight path connecting the starting (i.e., initial) node with the final node, defined in (12):

$$d_p(X_0, X_f) = \frac{L_2^2 + L_3^2 - L_1^2}{2 L_3}, \tag{12}$$

with

$$L_1 = \|X - X_0\|_2, \qquad L_2 = \|X - X_f\|_2, \qquad L_3 = \|X_f - X_0\|_2, \tag{13}$$

where $L_1$, $L_2$, and $L_3$ define the L2-norms between the OMR position and the initial and final resting positions, and $X_0$ and $X_f$ are the initial and final resting positions. In the cost function, we aim to penalize by the weights $w_x(\alpha)$ and $w_y(\alpha)$ the deviation from the reference trajectory $(x_{r,i}, y_{r,i})$, $i = 0..H-1$; the reference orientation $\theta_r$ is such that the OMR remains with the frontal part facing the destination location. By $w_{Tx}$ and $w_{Ty}$, we penalize the terminal cost of $x_r$ and $y_r$ to reduce the steady-state error; therefore, $w_{Tx} > w_x$ and $w_{Ty} > w_y$. $w_p(\alpha)$ is a weight with two discrete states, and its value is a function of $\alpha$, which depends on the proximity (tolerance) of the closest object and is defined by (23).
The actuator constraints of the OMR are defined as:

$$|\omega_j(i|k)| \le \omega_{UB}, \qquad |\dot{\omega}_j(i|k)| \le a_{UB}, \qquad j = 1..4, \quad i = 0..H-1. \tag{16}$$

We define the physical-space constraints from the coordinates of the objects and their known sizes as:

$$\big(x(i|k) - x_{o,j}\big)^2 + \big(y(i|k) - y_{o,j}\big)^2 \ge r_{o,j}^2, \qquad j = 1..n_o, \tag{18}$$

with the object positions transformed from the local (sensor) frame to global coordinates by

$$\begin{bmatrix} x_o \\ y_o \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{bmatrix}\begin{bmatrix} x_l \\ y_l \end{bmatrix},$$

where $\varphi$ is the angle from the global system's abscissa to the local system's abscissa, $x_l$ and $y_l$ are the local coordinates of the detected objects, and $x$ and $y$ are the global coordinates of the local system's origin. From the inequality constraints (16) and (18), we obtain a concatenated vector of inequality constraints denoted by $C_k \in \mathbb{R}^{(n_a n_w H + n_o H)\times1}$, $C_k \le 0$, where $C_a \in \mathbb{R}^{n_a n_w H\times1}$ and $C_o \in \mathbb{R}^{n_o H\times1}$ are defined below:

$$C_a = \begin{bmatrix} |\omega_j(i|k)| - \omega_{UB} \\ |\dot{\omega}_j(i|k)| - a_{UB} \end{bmatrix}, \qquad C_o = \Big[\, r_{o,j}^2 - \big(x(i|k) - x_{o,j}\big)^2 - \big(y(i|k) - y_{o,j}\big)^2 \,\Big]. \tag{22}$$

In the previous equations, $n_a = 2$ defines the number of constraint types regarding the actuators; it is two because we included two types of actuator restrictions: angular speed and angular acceleration.
We approximate $\dot{\omega}_j$ numerically by $\dot{\omega}_j \approx (\omega_{j,k} - \omega_{j,k-1})/T_s$, where $T_s$ is the sampling time. In the cost function (9), we propose that $w_p(\alpha)$, $w_x(\alpha)$, and $w_y(\alpha)$ are switched between their two states based on the value of

$$\alpha = \max_{1 \le j \le n_o H} C_{o,j},$$

which, practically, determines the minimum proximity to an obstacle from the object list. Therefore,

$$\{w_x, w_y, w_p\}(\alpha) = \begin{cases} \{w_{xy_1},\, w_{xy_1},\, w_{p_1}\}, & \alpha \ge -\text{tol} \\ \{w_{xy_2},\, w_{xy_2},\, w_{p_2}\}, & \alpha < -\text{tol}. \end{cases} \tag{23}$$

In the previous equation, tol defines the avoidance tolerance.
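The proximity computation and the weight switch can be sketched as follows; the mapping of Table 5's two weight states to the near- and far-from-obstacle cases reflects our reading of the results discussion and should be treated as an assumption:

```python
# Sketch of alpha = max_j C_o_j and the two-state weight switch in (23).
import numpy as np

def proximity_alpha(positions, obstacles):
    # C_o entries r^2 - ||p - c||^2 are <= 0 outside the avoidance circles;
    # the maximum over all predicted positions measures the closest approach.
    return max(r ** 2 - float(np.sum((p - c) ** 2))
               for p in positions for (c, r) in obstacles)

def switched_weights(alpha, tol=0.3):      # tol: avoidance tolerance (assumed)
    if alpha >= -tol:                       # near an obstacle
        return {'w_xy': 0.6, 'w_p': 0.01}   # deviation amplified, projection relaxed
    return {'w_xy': 0.05, 'w_p': 2.0}       # nominal tracking state
```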
The set-point orientation over the control horizon $H$ is defined as:

$$\theta_r = \operatorname{atan2}\big(y_f - y,\; x_f - x\big),$$

and the reference trajectory is given by a first-order static function

$$y_r = \lambda x_r + \rho,$$

where the slope $\lambda$ and the bias $\rho$ are given by:

$$\lambda = \frac{y_f - y_0}{x_f - x_0}, \qquad \rho = y_0 - \lambda x_0.$$
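A direct transcription of this reference construction, assuming $x_f \ne x_0$ so the slope is well defined:

```python
# Straight-line reference y_r = lambda * x_r + rho through X0 and Xf,
# and a heading set-point that keeps the front facing the destination.
import math

def reference_line(X0, Xf):
    lam = (Xf[1] - X0[1]) / (Xf[0] - X0[0])   # slope (assumes Xf[0] != X0[0])
    rho = X0[1] - lam * X0[0]                  # bias
    return lam, rho

def heading_setpoint(X, Xf):
    return math.atan2(Xf[1] - X[1], Xf[0] - X[0])
```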

Control Algorithm: One-Step Optimization
The core of the control algorithm is the sequential quadratic optimizer with a constraint tolerance of 1.0 × 10−3 and an optimality tolerance of 1.0 × 10−4, deduced heuristically through multiple experiments. Under this parametrization, the behavior is fairly robust and predictable with respect to the initial robot position, the final resting position, obstacles of varying size, wheel speeds, and acceleration. The object list consists of a matrix of object positions obtained from the perception module. To determine the radius of the obstacles, we use the Moore-Neighbor tracing algorithm with Jacob's stopping criterion, which provides the contour of the objects from LiDAR data. Beyond the LiDAR data, the CNN can provide estimates of the object radius with higher precision, based on the object class. To reduce the computation time, the optimization problem is reformulated at each sampling time, and we consider in the optimization only those objects within a maximum radius (d_max) relative to the OMR's geometric center. The avoidance radius for each object is determined from the actual object radius with an additional tolerance according to the OMR's dimensions. The reference orientation θ_r and the reference trajectory (x_r, y_r) are determined at each sample time, since the OMR position evolves from one pose to another, constantly changing the heading to the final resting position. The first computed command over the predicted horizon is applied to the process inputs. We summarize the control algorithm steps for one sampling time T_s in Algorithm 1, preceded by a solver-level sketch of the optimization step below.
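As an illustration of the receding-horizon step, the following minimal sketch uses SciPy's SLSQP routine (a sequential quadratic programming method) in place of the solver used in this work; the cost and constraint callables, tolerances, and bounds handling are assumptions for illustration only:

```python
# Sketch of one receding-horizon step with SciPy's SLSQP (an SQP method),
# standing in for the paper's solver. cost_fn and ineq_fn are assumed to
# implement (9) and -C_k, since SLSQP expects 'ineq' functions >= 0.
import numpy as np
from scipy.optimize import minimize

def mpc_step(omega0, cost_fn, ineq_fn, omega_ub, H, n_w=4):
    bounds = [(-omega_ub, omega_ub)] * (n_w * H)     # wheel-speed bounds
    res = minimize(cost_fn, omega0, method='SLSQP', bounds=bounds,
                   constraints={'type': 'ineq', 'fun': ineq_fn},
                   options={'ftol': 1e-4, 'maxiter': 200})
    omega_plan = res.x.reshape(H, n_w)
    return omega_plan[0]    # apply only the first command of the horizon
```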

Algorithm 1: One-step optimization at sampling time T_s.

Inputs: desired setpoint X_f from the mission planner, X_f ← [x_f y_f θ_f]^t; initial position X_0 from the perception module, X_0 ← [x_0 y_0 θ_0]^t.
Outputs: actuator commands over horizon H, ω.

1. Acquire the object list data from the perception module: positions (x_o, y_o) and radii (r_o).
2. Detect object boundaries from the LiDAR data using the Moore-Neighbor tracing algorithm with Jacob's stopping criterion [40]: [B, L] = bwboundaries(LiDAR data) (Matlab-specific function); Object_boundary ← B{k} (B is a Matlab cell data type, therefore the brackets are '{}' for indexing).
3. Ignore objects composed of a very small or very large number of pixels (usually artifacts or room boundaries): if Boundary_Min ≤ numel(Object_boundary)/2 ≤ Boundary_Max then l ← l + 1.
4. If the number of objects exceeds the buffer size (MaxNoObjs), an error is thrown and the optimization is not started: if l > MaxNoObjs then l ← −1; break.
5. Compute the object center and radius: xy ← mean(Object_boundary) ∈ R^{2×1} (Matlab-specific column-wise mean of a matrix); x_o(l) ← xy[2]; y_o(l) ← xy[1]; r_o(l) ← max(|max(Object_boundary) − min(Object_boundary)|) (Matlab-specific min/max over matrix columns), or r_o provided by the CNN subsystem.
6. noObjs ← l (number of all objects detected in the map).
7. Determine the relevant objects (within the specified proximity d_max): for k ∈ {1 … noObjs} do calculate the distance from the OMR's geometric center to object k and keep it if the distance is at most d_max.

Figure 6 illustrates the main coordinates and notations used throughout the optimization problem. The projection d_p from the robot CG to the imaginary straight path connecting the initial X_0 and final X_f resting locations is noticeable. Moreover, the L2-norms used in calculating the cost function, L_1, L_2, and L_3, define the distances between the OMR, initial, and final resting positions.

Table 5 contains the parameters of the model-predictive controller, including the penalizing factors of the cost function, the proximity threshold (tol) for switching the cost function weights, the radius d_max with respect to the OMR's CG for selecting relevant objects, and the prediction horizon.

Parameter                                                    Value
Cost weight 1 of reference trajectory (w_xy1)                0.6
Cost weight 1 of projection length to ideal path (w_p1)      0.01
Cost weight 2 of reference trajectory (w_xy2)                0.05
Cost weight 2 of projection length to ideal path (w_p2)      2.0
Cost weight of orientation angle (w_θ)                       0
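As a complement to Algorithm 1, the following sketch transcribes the object-list preprocessing to Python; scikit-image's find_contours stands in for Matlab's bwboundaries, and the pixel thresholds and occupancy-grid format are assumptions:

```python
# Python transcription sketch of the object-list preprocessing in Algorithm 1.
import numpy as np
from skimage import measure

def build_object_list(occupancy, robot_xy, d_max, b_min=10, b_max=500):
    objects = []
    for boundary in measure.find_contours(occupancy, 0.5):
        if not (b_min <= len(boundary) / 2 <= b_max):
            continue                        # skip artifacts and room boundaries
        center = boundary.mean(axis=0)      # (row, col) centroid of the contour
        # Full bounding extent used as a conservative radius, as in Algorithm 1
        # (alternatively, the CNN subsystem provides a class-based radius).
        r_o = np.abs(boundary.max(axis=0) - boundary.min(axis=0)).max()
        xy = np.array([center[1], center[0]])   # col -> x, row -> y
        if np.linalg.norm(xy - robot_xy) <= d_max:
            objects.append((xy, r_o))       # keep only nearby, relevant objects
    return objects
```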

Object Detection Results
The performance of the selected object detection solutions (ssd-mobilenet-v1, ssd-mobilenet-v2-lite, ssd-vgg16, and YoloV5) was evaluated on a testing subset with image resolutions varying between 720 × 404 and 2048 × 1024 pixels. The neural networks were tested on the Nvidia Jetson AGX mobile platform with the same input.
All models are optimized for the Jetson Xavier AGX with Nvidia's TensorRT framework for CUDA devices. The run times of the three selected architectures from the SSD family and of the five main YoloV5 architectures [29] are presented in Table 6. Architectures with fewer parameters performed better in terms of frames per second. Being the lightest model, YoloV5 Nano is six times faster than the Extra Large model, the largest we considered for the Jetson platform. This highlights the importance of considering the specific hardware platform and the model's complexity when deploying object detection algorithms on mobile robots.
The two largest YoloV5 models did not bring any improvements for the overall precision and the precision per class compared to the Medium architecture; therefore, they were not considered for Table 6. A comparison between the precision of the models can be made based on the figures presented in Table 7. All architectures were trained for 150 epochs to evaluate the mean Average Precision. SSD Mobilenet v2 lite and SSD VGG16 reach a similar mAP@0.5 of 98-99%, while SSD Mobilenet v1 has a lower precision on the test subset, 86%.
Based on the results from Tables 6 and 7, we can conclude that the best model for our OMR object detection use cases is YoloV5 Medium, which has a mAP comparable to SSD-VGG16 with the benefit of being twice as fast. Detection examples with the neural network models tested in the OMR environment are shown in Figure 7.

Simulation Results
To evaluate the control performances, we considered scenarios where the initial and final positions varied throughout the room so that obstacles blocked the OMR path. We perform numerical simulations on real data acquired from the perception module. We evaluate the steady-state error, the possible constraint violations, the cost function, and the optimization run-time.
In the first test case, considered in Figure 8, the final resting position X_f is reached after avoiding the two obstacles on the circumference of virtual circles centered around the objects. The geometric inequality constraints C_o < 0 and the actuator constraints C_a < 0 are satisfied with an acceptable tolerance. Generally, the tolerance is within the expected margin of 1.0 × 10−3. The steady-state error of the controlled position (x, y) is less than 1%, as measured around the moment t = 10.2 s. The transient time is limited by the upper and lower bounds of the wheel speed, in this case ±10 rad/s. The orientation θ changes at each sample time as the vehicle travels towards X_f. Hence, the tracking is decent, with a peak error of 17 degrees, noticeable when rounding the objects, since the optimizer is more constrained there. The cost function decreases as the vehicle evolves across the map. In the proximity of an object, the cost function is purposely increased to avoid local minima by amplifying the deviation from the reference trajectory and decreasing the penalizing weight for the projection to the ideal path, allowing solutions on the circumference of the encircled object. The maximum number of iterations was 79 with a run-time of 0.8945 s, and the minimum number of iterations was 2 with a run-time of 0.0204 s (CPU Intel i7-7500U, dual-core, 7th generation). The mean number of iterations was 6.526, with an execution time of 0.0647 s. It must be mentioned that the run-time figure is less critical since, in MEX mode (Matlab executable), the run-time can be reduced considerably (in MEX mode, the average run-time was 0.0507 s, while in normal mode it was 0.0647 s). The execution time is platform-dependent.

In the second scenario, presented in Figure 9, the behavior is similar concerning the constraint tolerances. The violation of the object boundaries is within the expected limit, and the steady-state error of the controlled pose (x, y, θ) is less than 1%. In this case, the actuator constraints, ±10 rad/s, limit the transient time. The maximum number of iterations was 66 with a run-time of 0.9923 s, and the minimum number of iterations was 2 with a run-time of 0.022 s (same CPU as mentioned in test case I). The mean number of iterations was 8.5658, with a mean execution time of 0.0814 s. In MEX mode, the maximum run-time was 0.5681 s, the minimum 0.0039 s, and the average 0.0356 s. Generally, the behavior is as expected, and the run-time proves the applicability of the control structure.
Similar behavior is obtained in the third test case presented in Figure 10, but the maximum run-time is slightly higher at 1.6 s, the maximum number of iterations is 210, and the minimum is 2. The minimum run-time was 0.0191 s. However, the average run-time in MEX mode is 0.0358 s, with a maximum of 0.3176 s (instead of 1.6 s as in normal mode) and a minimum of 0.0043 s.


Conclusions and Future Work
The use of CNNs for object detection in mobile robot navigation provides benefits such as accuracy, robustness, and adaptability, which are desirable for the navigation of mobile robots in a logistic environment.
The paper demonstrates the use of an object detector for a better understanding of the OMR working environment. To address the lack of domain-specific training data, we also acquired a dataset for domain-specific object detection that was made public. It contains all objects of interest for the working environment, such as fixed or mobile conveyors, charging stations, other robots, and boundary cones. The results show a detection accuracy of 99% using the selected lightweight model, which was optimized to run at about 109 frames per second on the mobile platform already installed on the OMR. The detection results offer a better understanding of the LiDAR map by assigning a name to obstacles and objects within the working environment, allowing the control model constraints to be adjusted on the fly. This paper also demonstrates the model-predictive control of the OMR in logistic environments with actuator and geometric constraints. We avoid local minima by using variable cost function weights to navigate around obstacles while still achieving the overall objective of reducing travel distance. The execution runtime of the optimizer allows for practical implementation, while the control performance is within the expected margin.
Future work is also expected to involve the deployment of the OMR controller and testing, first in a controlled environment and then in an automated logistic warehouse. One of the short-term goals is to collect and annotate more instances of domain-specific objects so that the intra-class variety is better covered and the detector can generalize to new data.

Data Availability Statement:
The obtained data set for object detection is publicly available.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: