
Learning Novel Objects for Extended Mobile Manipulation

  • Published in: Journal of Intelligent & Robotic Systems (2012)

Abstract

We propose a method for learning novel objects from audio-visual input. The proposed method is based on two techniques: out-of-vocabulary (OOV) word segmentation and foreground object detection in complex environments. A voice conversion technique is also incorporated so that the robot can pronounce the acquired OOV word intelligibly. Using the proposed method, we implemented a robotic system that carries out interactive mobile manipulation tasks, which we call “extended mobile manipulation”. To evaluate the robot as a whole, we conducted the “Supermarket” task, adopted from the RoboCup@Home league as a standard task for real-world applications. The results show that our integrated system works well in real-world applications.
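As a rough illustration of the foreground object detection step mentioned above, the following is a minimal sketch that segments an object out of a cluttered scene with OpenCV's GrabCut, a standard interactive foreground-extraction technique. This is not the paper's own detector, which may differ; the input file name and the initialization rectangle are placeholder assumptions.

import cv2
import numpy as np

# Placeholder input image and a rough bounding box around the object
img = cv2.imread("scene.png")
mask = np.zeros(img.shape[:2], dtype=np.uint8)
rect = (50, 50, 200, 200)  # (x, y, width, height); hypothetical values

# GrabCut requires two (1, 65) float64 scratch arrays for its internal models
bgd_model = np.zeros((1, 65), dtype=np.float64)
fgd_model = np.zeros((1, 65), dtype=np.float64)

# Run five GrabCut iterations, initialized from the rectangle
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels labeled definite or probable foreground form the object mask
fg_mask = np.where(
    (mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0
).astype(np.uint8)
foreground = img * fg_mask[:, :, None]
cv2.imwrite("object.png", foreground)

In an interactive setting like the one the abstract describes, the initial rectangle could come from a pointing gesture or a user-indicated image region rather than a hard-coded value.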



Author information


Corresponding author

Correspondence to Tomoaki Nakamura.



Cite this article

Nakamura, T., Sugiura, K., Nagai, T. et al. Learning Novel Objects for Extended Mobile Manipulation. J Intell Robot Syst 66, 187–204 (2012). https://doi.org/10.1007/s10846-011-9605-1
