
運用深度和色彩資訊之智慧物體辨識演算法設計

Algorithm Design on Intelligent Vision for Objects using RGB and Depth Data

Advisor: 陳良基 (Liang-Gee Chen)

Abstract


In recent years, new applications such as augmented reality, driverless cars, intelligent environmental surveillance, action analysis, and intelligent robots have gradually become possible thanks to the rapid progress of computer vision, in which object recognition plays a key role. Objects are the basic physical units of an environment: as long as the objects in a scene can be successfully tracked, recognized, detected, localized, analyzed, and segmented, the related computer-vision applications can achieve strong performance. However, traditional 2D computer-vision methods have many limitations and problems, so in recent years color information combined with depth information has become mainstream for solving problems that purely 2D image algorithms could not handle. Depth sensors are becoming cheaper and more practical, and their emergence brings new opportunities and challenges to the field of computer vision. This thesis investigates how to combine color and depth information efficiently to solve the essential problems of object recognition; the highly integrated system proposed here can fundamentally benefit a wide range of future computer-vision applications.

First, 3D structure analysis parses the RGB-D data into basic physical units such as planes and clusters. De-noising and accurate object segmentation are then performed, recovering the broken object boundaries in the depth data to produce precise image segmentation. 3D object tracking is also developed in this thesis: target objects can be tracked accurately in real time under drastic appearance changes, with the RGB-D data providing a variety of 3D features for this task. On-line learning of object appearance is carried out at the same time, so the object detection algorithm can be trained on-line in real time to detect and localize 3D objects. In summary, this thesis proposes a highly integrated object recognition system that covers on-line object learning, detection, tracking, segmentation, localization, and 3D structure analysis, in which RGB-D data is effectively analyzed and exploited to improve object recognition performance.
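The first step above parses an RGB-D frame into planes and object-candidate clusters. The thesis does not include code here, so the following is only a minimal Python sketch of that kind of structure analysis, assuming the open-source Open3D library and hypothetical input files color.png and depth.png; the thresholds and the RANSAC-plus-DBSCAN pipeline are illustrative choices, not the thesis's actual implementation.

```python
import numpy as np
import open3d as o3d

# Hypothetical file names for illustration only.
color = o3d.io.read_image("color.png")
depth = o3d.io.read_image("depth.png")
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
    color, depth, convert_rgb_to_intensity=False)
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault)
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)

# Surface normals support the plane/cluster interpretation of the scene.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

# Peel off dominant planes (floor, walls, table tops) with RANSAC.
planes, rest = [], pcd
for _ in range(3):
    model, inliers = rest.segment_plane(distance_threshold=0.02,
                                        ransac_n=3, num_iterations=1000)
    planes.append(rest.select_by_index(inliers))
    rest = rest.select_by_index(inliers, invert=True)

# Group the remaining points into physically meaningful clusters
# (object candidates) with density-based clustering.
labels = np.array(rest.cluster_dbscan(eps=0.05, min_points=50))
n_clusters = int(labels.max()) + 1 if labels.size else 0
clusters = [rest.select_by_index(np.where(labels == k)[0])
            for k in range(n_clusters)]
```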

Parallel Abstract


In recent years, new applications such as augmented reality, driverless cars, intelligent environmental surveillance, human action analysis, and in-home robotics are becoming possible due to advances in computer vision (CV), and object recognition plays an essential role in all of these tasks. Objects are the basic meaningful units of the surrounding environment, and good performance in those CV applications can be achieved if all related objects can be successfully tracked, detected, segmented, recognized, and analyzed. However, current 2D CV methods still have limitations. Color data combined with depth data is therefore used to improve performance and solve problems encountered by traditional algorithms. Depth sensors are now cheap and readily available, and they bring new opportunities and challenges to many CV areas. In this thesis, a thorough system is designed to solve essential problems of object recognition with RGB and depth data (RGB-D data). The system aims to support these emerging CV applications and help people live more conveniently and enjoyably. Essential algorithms of object recognition are developed and integrated to provide a complete solution to problems encountered by previous works.

First, 3D structure analysis is developed to segment the input RGB-D data into basic 3D structure elements. The indoor scene is parsed into planes and physically meaningful clusters using depth and surface normals, and further analysis and processing are then performed on those clusters as object candidates.

A de-noising and accurate object segmentation algorithm is then proposed. Depth data is useful for segmenting raw clusters, but it is often noisy and broken at object edges. Color images and depth data are therefore combined to segment objects accurately, exploiting the higher quality of passive color images. Detailed boundaries are preserved by color superpixels, and the 3D structure is used to build foreground and background color models. With superpixels and color models, accurate visual object segmentation is achieved automatically without any user input.

3D object tracking is also developed. The target object can be tracked in real time under large variations in size, translation, rotation, illumination, and appearance. With RGB-D data, high performance is achieved by processing features such as positions, global color and normal models, and local 3D descriptors. Compared with previous 2D tracking, no sliding window or particle filtering is needed because 3D structure elements are available, and pixel-wise accurate video segmentation can also be achieved with the proposed segmentation method.

Finally, a novel on-line learning method is proposed to train robust object detectors. Using the proposed object tracking, labeled training data can be generated automatically, and efficient on-line learning and detection methods are used to reach real-time performance. By combining object detectors with tracking, object recognition and tracking recovery can be achieved. In previous 2D CV learning tasks, training datasets often suffer from a lack of variability, cluttered backgrounds, and the inability to automatically locate and segment targets of interest; the proposed on-line learning addresses these problems.

In summary, a highly integrated algorithm for object recognition is designed, and RGB-D data is studied and exploited for object segmentation, tracking, on-line learning, and detection.
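The segmentation stage described above refines a noisy, depth-derived object mask with color superpixels and foreground/background color models. As a rough illustration of that idea only, the sketch below uses scikit-image's SLIC superpixels and simple quantized color histograms; the function name refine_segmentation and all parameters are hypothetical, and the thesis's actual color models may differ.

```python
import numpy as np
from skimage.segmentation import slic

def refine_segmentation(rgb, coarse_mask, n_segments=600, bins=8):
    """Refine a noisy depth-derived object mask with color superpixels.

    rgb         : HxWx3 uint8 color image
    coarse_mask : HxW bool mask from the 3D cluster (broken at object edges)
    """
    # Superpixels preserve detailed color boundaries.
    segments = slic(rgb, n_segments=n_segments, compactness=10, start_label=0)

    # Quantize colors and build foreground/background histograms
    # from the coarse 3D-structure mask.
    quant = (rgb // (256 // bins)).astype(np.int32)
    codes = quant[..., 0] * bins * bins + quant[..., 1] * bins + quant[..., 2]
    n_codes = bins ** 3
    fg_hist = np.bincount(codes[coarse_mask], minlength=n_codes) + 1.0
    bg_hist = np.bincount(codes[~coarse_mask], minlength=n_codes) + 1.0
    fg_hist /= fg_hist.sum()
    bg_hist /= bg_hist.sum()

    # Label each superpixel by the average foreground/background
    # log-likelihood ratio of its pixels.
    ratio = np.log(fg_hist) - np.log(bg_hist)
    refined = np.zeros_like(coarse_mask)
    for s in range(segments.max() + 1):
        idx = segments == s
        if ratio[codes[idx]].mean() > 0:
            refined[idx] = True
    return refined
```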

