Computer Engineering and Applications, 2024, Vol. 60, Issue 10: 209-216. DOI: 10.3778/j.issn.1002-8331.2301-0163

• Graphics and Image Processing •

Robot Box Depalletizing Method Based on Image Instance Segmentation

ZOU Wencai, LIU Baolin   

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 610000, China
  2. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610000, China
  • Online: 2024-05-15  Published: 2024-05-15

Abstract: Traditional feature-extraction methods for depalletizing industrial boxes and parcels depend on the shape of the box and are therefore difficult to apply to stacks with variable box specifications or mixed box types. To address this problem, a robot depalletizing method based on image instance segmentation is proposed. To obtain an accurate picking center for each box, Mask R-CNN is first used to perform instance segmentation and obtain the box masks and classification information, and a spatial transformation network (STN) module is added after feature extraction to improve the recognition of rotated objects. Next, the minimum bounding rectangle of each mask is computed and post-processed to obtain the pixel center and horizontal rotation angle of each box to be picked, and a depalletizing strategy sorts the boxes row first and then column to determine the picking order. Finally, a calibration method is used to complete box positioning, and segmentation and robot depalletizing experiments are carried out on stacks with variable specifications and mixed box types. The results show that the average pixel distance for the pixel centers of the boxes to be picked is about 4 pixels and the average spatial positioning error is about 1 cm, so the positioning accuracy meets the practical requirements of industrial depalletizing.

Key words: instance segmentation, Mask R-CNN, classification, calibration, industrial depalletization
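
The post-processing step described in the abstract (minimum bounding rectangle of each mask, then row-first ordering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `convex_hull`, `min_area_rect`, and `picking_order`, the `row_tol` tolerance, the long-side angle convention, and the choice to start picking from the top-left box are all assumptions made for illustration. In practice a library routine such as OpenCV's `cv2.minAreaRect` would likely replace the hand-written rectangle search.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in consecutive order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (center, (long_side, short_side), angle_deg) for mask pixels.

    The angle is the direction of the long side in image coordinates,
    reported in [0, 180) degrees (an assumed convention for this sketch).
    """
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(theta), math.sin(theta)
        # Express the hull in a frame whose first axis runs along this edge.
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        w, h = max(us) - min(us), max(vs) - min(vs)
        if best is None or w * h < best[0]:
            uc, vc = (max(us) + min(us)) / 2, (max(vs) + min(vs)) / 2
            # Rotate the rectangle center back into image coordinates.
            center = (uc * c - vc * s, uc * s + vc * c)
            best = (w * h, center, (w, h), math.degrees(theta))
    _, center, (w, h), angle = best
    if w < h:  # make the first dimension the long side
        w, h, angle = h, w, angle + 90.0
    return center, (w, h), angle % 180.0


def picking_order(centers, row_tol=20.0):
    """Sort box centers row first (top to bottom), then left to right.

    row_tol is a hypothetical pixel tolerance for treating two centers
    as belonging to the same row.
    """
    rows = []
    for p in sorted(centers, key=lambda q: q[1]):
        if rows and abs(p[1] - rows[-1][0][1]) <= row_tol:
            rows[-1].append(p)
        else:
            rows.append([p])
    return [p for row in rows for p in sorted(row, key=lambda q: q[0])]


# Demo data: corners of a 40x20 box rotated by 30 degrees around (100, 50).
_c, _s = math.cos(math.radians(30)), math.sin(math.radians(30))
demo_mask = [(100 + dx * _c - dy * _s, 50 + dx * _s + dy * _c)
             for dx, dy in [(-20, -10), (20, -10), (20, 10), (-20, 10)]]
center, size, angle = min_area_rect(demo_mask)

demo_centers = [(300, 52), (100, 48), (200, 150), (100, 155), (200, 50)]
order = picking_order(demo_centers)
```

Grouping by a fixed row tolerance is a deliberately simple choice. A real stack with boxes of mixed heights may need a tolerance derived from the detected box sizes.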