Abstract
For human–robot interaction (HRI), perception is one of the most important capabilities. This paper reviews several widely used perception methods for HRI in social robots. Specifically, we investigate general perception tasks crucial for HRI, such as where objects are located in a room, what objects are in the scene, and how they interact with humans. We first enumerate representative social robots and summarize the three most important perception methods used by these robots: feature extraction, dimensionality reduction, and semantic understanding. For feature extraction, four widely used types of signals, including visual, audio, tactile, and range-sensor signals, are reviewed and compared in terms of their advantages and disadvantages. For dimensionality reduction, representative methods including principal component analysis (PCA), linear discriminant analysis (LDA), and locality preserving projections (LPP) are reviewed. For semantic understanding, conventional techniques for several typical applications, such as object recognition, object tracking, object segmentation, and speaker localization, are discussed, and their characteristics and limitations are analyzed. Moreover, several popular data sets used in social robotics and published semantic understanding results are analyzed and compared in light of our analysis of HRI perception methods. Lastly, we suggest important directions for future work on fundamental questions about perception methods in HRI.
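As a minimal illustration of the first dimensionality-reduction method named above, PCA projects high-dimensional features (e.g., flattened face images) onto the directions of largest variance. The sketch below, using NumPy on synthetic data (not drawn from any surveyed robot), shows the basic eigendecomposition-based computation:

```python
import numpy as np

def pca(X, n_components):
    """Project data onto the top principal components.

    X: (n_samples, n_features) array, e.g. flattened face images.
    Returns the reduced (n_samples, n_components) representation.
    """
    X_centered = X - X.mean(axis=0)
    # Eigendecomposition of the covariance matrix; eigh returns
    # eigenvalues in ascending order, so take the last columns.
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    top = eigvecs[:, -n_components:][:, ::-1]  # descending variance
    return X_centered @ top

# Example: reduce 100 samples of 64-dimensional features to 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))
Z = pca(X, 3)
print(Z.shape)  # (100, 3)
```

LDA and LPP follow the same project-onto-a-learned-subspace pattern but choose the projection to maximize class separability and to preserve local neighborhood structure, respectively.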
Notes
In this paper, we do not differentiate between feature representation and feature extraction, although there are slight differences. Generally speaking, feature representation methods focus on extracting low-level features, whereas feature extraction methods aim at extracting middle- and high-level features.
In some social robots, semantic understanding is included in the intermediate mechanism. In this paper, we consider it part of the perception system, since it is indirectly performed on the acquired signals and the resulting understanding represents the external environment that a robot really wants to know.
In some areas, emotion recognition is also called affective computing, and emotion is referred to as affective state. In this paper, we still use the term emotion recognition because it is more common in the literature.
In this survey, we mainly focus on the four types of signals widely used in existing social robots. Hence, other types of signals, such as GPS, temperature, and human pain, are not discussed.
Please note that the core algorithm refers to the approach used for semantic understanding tasks such as detection, classification, and prediction in HRI based on the extracted features.
Please note that semantic understanding as presented in this paper refers to the interaction between humans and robots; hence, other tasks, such as navigation of mobile robots and interaction between robots and the environment, are not covered in this paper.
Faces, humans, and emotions can be unified as different categories of objects.
Please note that some other results are not included here; in this table, only the representative social robots listed in Table 1 are selected.
References
Fong T, Nourbakhsh I, Dautenhahn K (2003) A survey of socially interactive robots. Robot Auton Syst 42(3–4):143–166
Breazeal C (2002) Designing sociable robots. MIT Press, Cambridge
Bartneck C, Forlizzi J (2004) A design-centred framework for social human–robot interaction. In: IEEE international workshop on robot and human interactive communication, pp 591–594
Hegel F, Muhl C, Wrede B, Martina H-F, Sagerer G (2009) Understanding social robots. In: International conference on advance in computer–human interactions, pp 169–174
Social robot, accessed 5 November, 2011 [Online]. Available from: http://en.wikipedia.org/wiki/Social_robot
Breazeal C (2003) Toward sociable robots. Robot Auton Syst 42(3–4):167–175
Hirose M, Ogawa K (2007) Honda humanoid robots development. Philos Trans R Soc, Math Phys Eng Sci 365:11–19
Fong T, Nourbakhsh I, Dautenhahn K (2003) A survey of socially interactive robots. Robot Auton Syst 42(3–4):143–166
Jensen B, Tomatis N, Mayor L, Drygajlo A, Siegwart R (2005) Robots meet humans-interaction in public spaces. IEEE Trans Ind Electron 52(6):1530–1546
Jones C, Deeming A (2008) Affective human–robotic interaction. In: Lecture Notes in Computer Science, vol 4868. Springer, Berlin, pp 175–185
The FG-NET Aging Database, accessed 25 February, 2008 [Online]. Available from: http://www.fgnet.rsunit.com/
Fitzpatrick PM, Metta G (2002) Towards manipulation-driven vision. In: IEEE international conference on intelligent robots and systems, vol 1, pp 43–48
Scassellati B (1998) Eye finding via face detection for a foveated, active vision system. In: National conference on artificial intelligence, pp 969–976
Tikhanoff V, Cangelosi A, Fitzpatrick P, Metta G, Natale L, Nori F (2008) An open-source simulator for cognitive robotics research: the prototype of the iCub humanoid robot simulator. In: Performance metrics for intelligent systems (PerMIS) workshop, pp 57–61
Sandini G, Metta G, Vernon D (2007) The iCub cognitive humanoid robot: an open-system research platform for enactive cognition. In: Lecture notes in computer science, vol 4850. Springer, Berlin, pp 358–369
Rolf M, Hanheide M, Rohlfing KJ (2009) Attention via synchrony: making use of multimodal cues in social learning. IEEE Trans Auton Mental Dev 1(1):55–67
Figueira D, Lopes M, Ventura R, Ruesch J (2009) Towards a spatial model for humanoid social robots. In: Lecture notes in computer science. Springer, Berlin, pp 287–298
Hornstein J, Lopes M, Santos-Victor J, Lacerda F (2006) Sound localization for humanoid robots—building audio-motor maps based on the HRTF. In: International conference on intelligent robots and systems, pp 1170–1176
Breazeal C (2003) Emotion and sociable humanoid robots. Int J Hum-Comput Stud 59(1–2):119–155
Breazeal C, Edsinger A, Fitzpatrick P, Scassellati B (2001) Active vision for sociable robots. IEEE Trans Syst Man Cybern, Part A, Syst Hum 31(5):443–453
Aryananda L (2002) Recognizing and remembering individuals: online and unsupervised face recognition for humanoid robot. In: IEEE international conference on intelligent robots and systems, vol 2, pp 1202–1207
Breazeal C, Aryananda L (2002) Recognition of affective communicative intent in robot-directed speech. Auton Robots 12(1):83–104
Ge S, Wang C, Hang C (2008) Facial expression imitation in human robot interaction. In: IEEE international symposium on robot and human interactive communication, pp 213–218
Barciela G, Paz E, López J, Sanz R, Perez D (2008) Building a robot head: design and control issues. In: IEEE international symposium on robot and human interactive communication, pp 213–218
Breazeal C, Kidd CD, Thomaz AL, Hoffman G, Berlin M (2005) Effects of nonverbal communication on efficiency and robustness in human–robot teamwork. In: International conference on intelligent robots and systems, pp 383–388
Feil-Seifer D, Matarić MJ (2005) Defining socially assistive robotics. In: International conference on rehabilitation robotics, pp 465–468
Simmons R, Goldberg D, Goode A, Montemerlo M, Roy N, Sellner B, Urmson C, Maxwell B (2003) GRACE: an autonomous robot for the AAAI robot challenge. AI Mag 24(2):51–72
Michalowski MP, Šabanović S, Disalvo C, Busquets D, Hiatt LM, Melchior NA, Simmons R (2007) Socially distributed perception: GRACE plays social tag at AAAI 2005. Auton Robots 22(4):385–397
Clodic A, Fleury S, Alami R, Herrb M, Chatila R (2005) Supervision and interaction. In: International conference on advanced robotics, pp 725–732
Jensen B, Philippsen R, Siegwart R (2003) Narrative situation assessment for human–robot interaction. In: IEEE international conference on robotics and automation, vol 1, pp 1503–1508
Jensen B, Froidevaux G, Greppin X, Lorotte A, Mayor L, Meisser M, Ramel G, Siegwart R (2003) Multi-robot human-interaction and visitor flow management. In: IEEE international conference on robotics and automation, pp 2388–2393
Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: IEEE computer society conference on computer vision and pattern recognition, vol 1, pp I511–I518
Germa T, Lerasle F, Danès P, Brèthes L (2007) Human/robot visual interaction for a tour-guide robot. In: IEEE international conference on intelligent robots and systems, pp 3448–3453
Hasanuzzaman Md, Zhang T, Ampornaramveth V, Gotoda H, Shirai Y, Ueno H (2007) Adaptive visual gesture recognition for human–robot interaction using a knowledge-based software platform. Robot Auton Syst 55(8):643–657
Kanda T, Glas DF, Shiomi M (2009) Abstracting people’s trajectories for social robots to proactively approach customers. IEEE Trans Robot 25(6):1382–1396
Movellan JR, Tanaka F, Fasel IR, Taylor C, Ruvolo P, Eckhardt M (2007) The RUBI project: a progress report. In: ACM/IEEE conference on human–robot interaction—robot as team member, pp 333–339
Ruvolo P, Fasel I, Movellan J (2008) Auditory mood detection for social and educational robots. In: IEEE international conference on robotics and automation, pp 3551–3556
Bartlett MS, Littlewort G, Frank M, Lainscsek C, Fasel I, Movellan J (2006) Fully automatic facial action recognition in spontaneous behavior. In: International conference on automatic face and gesture recognition, pp 223–230
Christensen HI (2003) Intelligent home appliances. In: Springer tracts in advanced robotics. Springer, Berlin, pp 319–330
Lohse M, Hegel F, Wrede B (2008) Domestic applications for social robots-an online survey on the influence of appearance and capabilities. J Phys Agents 2(2):21–32
Asfour T, Regenstein K, Azad P, Schroder O, Bierbaum A, Vahrenkamp N, Dillmann R (2006) ARMAR-III: an integrated humanoid platform for sensory-motor control. In: International conference on humanoid robots, pp 169–175
Ekenel HK, Stiefelhagen R (2005) A generic face representation approach for local appearance based face verification. In: IEEE computer society conference on computer vision and pattern recognition workshops, vol 03, p 155
Nickel K, Gehrig T, Stiefelhagen R, McDonough J (2005) A joint particle filter for audio-visual speaker tracking. In: International conference on multimodal interfaces, pp 61–68
Kraft F, Malkin R, Schaaf T, Waibel A (2005) Temporal ICA for classification of acoustic events in a kitchen environment. In: European conference on speech communication and technology, pp 2689–2692
Voit M, Nickel K, Stiefelhagen R (2007) Neural network-based head pose estimation and multi-view fusion. In: Lecture notes in computer science, vol 4122. Springer, Berlin, pp 291–298
Nickel K, Stiefelhagen R (2007) Visual recognition of pointing gestures for human–robot interaction. Image Vis Comput 25(12):1875–1884
Osada J, Ohnaka S, Sato M (2006) The scenario and design process of childcare robot. In: PaPeRo, international conference on advances in computer entertainment technology. Springer, Berlin
Sato A, Imaoka H, Suzuki T, Hosoi T (2005) Advances in face detection and recognition technologies. NEC J Adv Technol 2(1):28–34
Betkowska A, Shinoda K, Furui S (2007) Robust speech recognition using factorial HMMs for home environments. Eurasip J Adv Signal Process. doi:10.1155/2007/20593
Stiehl W, Breazeal C (2005) Affective touch for robotic companions. In: Lecture notes in computer science, vol 3784. Springer, Berlin, pp 747–754
Esau N, Kleinjohann L, Kleinjohann B (2006) Emotional communication with the robot head MEXI. In: International conference on control, automation, robotics and vision, pp 1–7
Stichling D, Kleinjohann B (2002) Low latency color segmentation on embedded real-time systems. In: IFIP world computer congress—TC10 stream on distributed and parallel embedded systems, vol 219, pp 247–256
Austermann A, Esa N, Kleinjohann L, Kleinjohann B (2005) Prosody based emotion recognition for MEXI. In: International conference on intelligent robots and systems, vol 3, pp 1138–1144
Esau N, Kleinjohann L, Kleinjohann B (2005) An adaptable fuzzy affective states model for affective states recognition. In: EUSFLAT—LFA, pp 73–78
Hirth J, Schmitz N, Berns K (2007) Emotional architecture for the humanoid robot head ROMAN. In: IEEE international conference on robotics and automation, pp 2150–2155
Schmitz N, Spranger C, Berns K (2009) 3D audio perception system for humanoid robots. In: International conferences on advances in computer–human interactions, pp 181–186
Strupp S, Schmitz N, Berns K (2008) Visual-based emotion detection for natural man–machine interaction. In: Lecture notes in computer science, vol 5243. Springer, Berlin, pp 356–363
Hackel M, Schwope S, Fritsch J, Wrede B, Sagerer G (2006) Designing a sociable humanoid robot for interdisciplinary research. Adv Robot 20(11):1219–1235
Vogt T, Andreé E (2005) Comparing feature sets for acted and spontaneous speech in view of automatic emotion recognition. In: IEEE international conference on multimedia and expo, pp 474–477
Spexard T, Haasch A, Fritsch J, Sagerer G (2006) Human-like person tracking with an anthropomorphic robot. In: IEEE international conference on robotics and automation, pp 1286–1292
Haasch A, Hohenner S, Hüwel S, Kleinehagenbrock M, Lang S, Toptsis I, Fink G, Fritsch J, Wrede B, Sagerer G (2004) BIRON-the bielefeld robot companion. In: International workshop on advances in service robotics, pp 27–32
Fritsch J, Kleinehagenbrock M, Lang S, Plötz T, Fink GA, Sagerer G (2003) Multi-modal anchoring for human–robot interaction. Robot Auton Syst 43(2–3):133–147
Lang S, Kleinehagenbrock M, Hohenner S, Fritsch J, Fink GA, Sagerer G (2003) Providing the basis for human–robot-interaction: a multi-modal attention system for a mobile robot. In: International conference on multimodal interfaces, pp 28–35
Bennewitz M, Faber F, Joho D, Behnke S (2007) Fritz—a humanoid communication robot. In: IEEE international conference on robot & human interactive communication, pp 1072–1077
Lisetti CL, Brown SM, Alvarez K, Marpaung AH (2004) A social informatics approach to human–robot interaction with a service social robot. IEEE Trans Syst Man Cybern, Part C, Appl Rev 34(2):195–209
Brown SM, Lisetti CL, Marpaung AH (2002) Cherry, the little red robot…with a mission…and a personality. In: AAAI fall symposium
Marpaung AH, Lisetti CL (2002) Multilevel emotion modeling for autonomous agents. In: AAAI fall symposium—technical report FS-04-05, pp 39–46
Kerstin S, Anders G, Helge H (2003) Social and collaborative aspects of interaction with a service robot. Robot Auton Syst 42:223–234
Chopra A, Obsniuk M, Jenkin MR (2006) The nomad 200 and the nomad SuperScout: reverse engineered and resurrected. In: Canadian conference on computer and robot vision
Kozima H, Michalowski M, Nakagawa C (2009) A playful robot for research, therapy, and entertainment. Int J Soc Robot 1:3–18
Wada K, Shibata T (2007) Living with seal robots—its sociopsychological and physiological influences on the elderly at a care house. IEEE Trans Robot 23(5):972–980
Goris K, Saldien J, Lefeber D (2008) Probo, a testbed for human robot interaction. In: ACM/IEEE international conference on human–robot interaction, pp 253–254
Saldien J, Goris K, Vanderborght B, Lefeber D (2008) On the design of an emotional interface for the huggable robot probo. In: The reign of catz and dogz, AISB2008
Goris K, Saldien J, Vanderborght B, Lefeber D (2008) The huggable robot probo: design of a robotic head. In: The reign of catz and dogz, AISB2008
Poel M, Heylen D, Nijholt A, Meulemans M, Breemen A (2009) Gaze behaviour, believability, likability and the iCat. AI Soc 24:61–73
Van Breemen AJN (2004) Animation engine for believable interactive user-interface robots. In: IEEE/RSJ international conference on intelligent robots and systems, vol 3, pp 2873–2878
Ronald C, Fujita M, Tsuyoshi T, Rika H (2003) An ethological and emotional basis for human–robot interaction. Robot Auton Syst 42:191–201
Oh JH, Hanson D, Kim WS, Han IY, Kim JY, Park IW (2006) Design of android type humanoid robot Albert HUBO. In: IEEE international conference on intelligent robots and systems, pp 1428–1433
Miwa H, Itoh K, Matsumoto M, Zecca M, Takanobu H, Roccella S, Carrozza MC, Takanishi A (2004) Effective affective statesal expressions with affective states expression humanoid robot WE-4RII—integration of humanoid robot hand RCH-1. In: International conference on intelligent robots and systems, vol 3, pp 2203–2208
Ogura Y, Aikawa H, Shimomura K, Kondo H, Morishima A, Lim HO, Takanishi A (2006) Development of a new humanoid robot WABIAN-2. In: IEEE international conference on robotics and automation, pp 76–81
Zecca M, Mizoguch Y, Endo K, Iida F, Kawabata Y, Endo N, Itoh K, Takanishi A (2009) Whole body emotion expressions for expressions for KOBIAN humanoid robot-preliminary experiments with different affective statesal patterns. In: IEEE international workshop on robot and human interactive communication, pp 381–386
Salichs MA, Barber R, Khamis AM, Malfaz M, Gorostiza JF, Pacheco R, Rivas R, García D (2006) Maggie: a robotic platform for human–robot social interaction. In: IEEE conference on robotics, automation and mechatronics
Gorostiza J, Barber R, Khamis A, Pacheco M, Rivas R, Corrales A, Delgado E, Salichs M (2006) Multimodal human–robot interaction framework for a personal robot. In: International symposium on robot and human interactive communication, pp 39–44
Kormushev P, Nenchev DN, Calinon S, Caldwell DG (2011) Upper-body kinesthetic teaching of a free-standing humanoid robot. In: International conference on robotics and automation, pp 3970–3975
Ishida T, Kuroki Y, Yamaguchi J (2003) Development of mechanical system for a small biped entertainment robot. In: International workshop on robot and human interactive communication, pp 297–302
Park IW, Kim JY, Lee J, Oh JH (2005) Mechanical design of humanoid robot platform KHR-3 (KAIST humanoid robot—3: HUBO). In: International conference on humanoid robots, pp 321–326
Okada K, Ogura T, Haneda A, Kousaka D, Nakai H, Inaba M, Inoue H (2004) Integrated system software for HRP2 humanoid. In: International conference on robotics and automation, vol 4, pp 3207–3212
Cousins S (2010) ROS on the PR2. IEEE Robot Autom Mag 17(3):23–25
Bischoff R, Huggenberger U, Prassler E (2011) KUKA youBot-a mobile manipulator for research and education. In: International conference on robotics and automation, pp 1–4
Goodrich MA, Schultz AC (2007) Human–robot interaction: a survey. Found Trends Hum-Comput Interact 1(3):203–275
Castleman KR (1996) Digital image processing. Prentice Hall, New York
Kinect, Accessed 9 December, 2011 [Online] Available from: http://en.wikipedia.org/wiki/Kinect
Bumblebee2, Accessed 2010 [Online] Available from: http://www.ptgrey.com/products/stereo.asp
Itti L, Koch C, Niebur E (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20(11):1254–1259
Darrell T, Gordon GM, Harville M, Woodfill J (2000) Integrated person tracking using stereo, color, and pattern detection. Int J Comput Vis 37(2):175–185
Wang X, Xu H, Wang H, Li H (2008) Robust real-time face detection with skin color detection and the modified census transform. In: IEEE international conference on information and automation, pp 590–595
Kakumanu P, Makrogiannis S, Bourbakis N (2007) A survey of skin-color modeling and detection methods. Pattern Recognit 40:1106–1122
Ford A, Roberts A (1998) Colour space conversions
Ruesch J, Lopes M, Bernardino A, Hörnstein J, Santos-Victor J, Pfeifer R (2008) Multimodal saliency-based bottom-up attention a framework for the humanoid robot iCub. In: IEEE international conference on robotics and automation, pp 962–967
Chen J, Tiddeman B (2007) Facial feature detection under various illuminations. In: Lecture notes in computer science, vol 4841. Springer, Berlin, pp 498–508
Zabih R, Woodfill J (1994) Non-parametric local transforms for computing visual correspondence. In: European conference on computer vision, pp 151–158
Song M, Tao D, Liu Z, Li X, Zhou M (2009) Image ratio features for facial expression recognition application. IEEE Trans Syst Man Cyber Part B Cyber. doi:10.1109/TSMCB.2009.2029076
Wang L, He D-C (1990) Texture classification using texture spectrum. Pattern Recognit 23(8):905–910
Ojala T, Pietikäinen M, Harwood D (1996) Texture classification using texture spectrum. Pattern Recognit 29(1):51–59
Ojala T, Pietikäine M, Mäenpää T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987
Jin H, Liu Q, Lu H, Tong X (2004) Face detection using improved LBP under Bayesian framework. In: International conference on image and graphics, pp 306–309
Ahonen T, Hadid A, Matti P (2006) Face description with local binary patterns: application to face recognition. IEEE Trans Pattern Anal Mach Intell 28(12):2037–2041
Shan C, Gong S, McOwan PW (2009) Facial expression recognition based on local binary patterns: a comprehensive study. Image Vis Comput 27(6):803–816
Solar J, Quinteros J (2008) Illumination compensation and normalization in eigenspace-based face recognition: a comparative study of different pre-processing approaches. Pattern Recognit Lett 29:1966–1979
Zhao G, Pietikainen M (2009) Boosted multi-resolution spatiotemporal descriptors for facial expression recognition. Pattern Recognit Lett 30(12):1117–1127
Zabih R, Woodfill J (1996) A non-parametric approach to visual correspondence. IEEE Trans Pattern Anal Mach Intell
Christian K, Ernst A (2006) Face detection and tracking in video sequences using the modified census transformation. Image Vis Comput 24:564–572
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
Lowe DG (1999) Object recognition from local scale-invariant features. In: International conference on computer vision, pp 1150–1157
Tian YL, Kanade T, Conn JF (2001) Recognizing action units for facial expression analysis. IEEE Trans Pattern Anal Mach Intell 23(2):97–115
Kass M, Witkin A, Terzopoulos D (1988) Snakes: active contour models. Int J Comput Vis 1(4):321–331
Illingworth J, Kittler J (1987) The adaptive hough transform. IEEE Trans Pattern Anal Mach Intell 9(5):690–698
Levi K, Weiss Y (2004) Learning object detection from a small number of examples: the importance of good features. In: IEEE computer society conference on computer vision and pattern recognition, vol 2, pp II53–II60
Raman M, Himanshu A (2009) Study and comparison of various image edge detection techniques. Int J Image Process 3(1):1–12
Horn BKP, Schunck BG (1981) Determining optical flow. Artif Intell 17(1–3):185–203
Brox T, Bruhn A, Papenberg N, Weickert J (2004) High accuracy optical flow using a theory for warping. In: Lecture notes in computer science, vol 3024. Springer, Berlin, pp 25–36
Bab-Hadiashar A, Suter D (1998) Robust optic flow computation. Int J Comput Vis 29(1):59–77
Iida F (2003) Biologically inspired visual odometer for navigation of a flying robot. Robot Auton Syst 44:201–208
Cédras C, Shah M (1995) Motion-based recognition: a survey. Image Vis Comput 13(2):129–155
Moeslund T, Granum E (2001) A survey of computer vision-based human motion capture. Comput Vis Image Underst 81:231–268
Wang J, Singh S (2003) Video analysis of human dynamics: a survey. Real-Time Imaging 9:321–346
Lu J, Zhang E (2007) Gait recognition for human identification based on ICA and fuzzy SVM through multiple views fusion. Pattern Recognit Lett 28(16):2401–2411
Lu J, Tan Y-P (2010) Uncorrelated discriminant nearest feature line analysis for face recognition. IEEE Signal Process Lett 17(2):185–188
Lu J, Tan Y-P (2010) Uncorrelated discriminant simplex analysis for view-invariant gait signal computing. Pattern Recognit Lett 31(5):382–393
Lu J, Tan Y-P (2010) Gait-based human age estimation. IEEE Trans Inf Forensics Secur 5(4):761–770
Lu J (2010) Enhanced locality sensitive discriminant analysis for image recognition. Electron Lett 46(3):217–218
Lu J, Tan Y-P (2010) A doubly weighted approach for appearance-based subspace learning methods. IEEE Trans Inf Forensics Secur 5(1):71–81
Lu J, Tan Y-P (2010) Regularized locality preserving projections and its extensions for face recognition. IEEE Trans Syst Man Cybern, Part B, Cybern 40(2):958–963
Lu J, Tan Y-P (2010) Cost-sensitive subspace learning for face recognition. In: IEEE international conference on computer vision and pattern recognition, pp 2661–2666
Lu J, Tan Y-P (2011) Nearest feature space analysis for classification. IEEE Signal Process Lett 18(1):55–58
Liu N, Lu J, Tan Y-P (2011) Joint subspace learning for view-invariant gait recognition. IEEE Signal Process Lett 18(7):431–434
Lu J, Zhou X, Tan Y-P, Shang Y, Zhou J (2012) Cost-sensitive semi-supervised discriminant analysis for face recognition. IEEE Trans Inf Forensics Secur 7(3):944–953
Lu J, Tan Y-P (2013) Cost-sensitive subspace analysis and extensions for face recognition. IEEE Trans Inf Forensics Secur 7(3):510–519
Lu J, Tan Y-P, Wang G (2013) Discriminative multimanifold analysis for face recognition from a single training sample per person. IEEE Trans Pattern Anal Mach Intell 35(1):39–51
Lu J, Zhang E, Kang X, Xue Y, Chen Y (2006) Palmprint recognition using wavelet decomposition and 2D principal component analysis. In: International conference on communications, circuits and systems proceedings, pp 2133–2136
Lu J, Zhao Y, Xue Y, Hu J (2008) Palmprint recognition via locality preserving projections and extreme learning machine neural network. In: International conference on signal processing, pp 2096–2099
Zhang E, Lu J, Duan G (2005) Gait recognition via independent component analysis based on support vector machine and neural network. In: International conference on natural computation, pp 640–649
Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3(1):71–86
Belhumenur PN, Hepanha JP, Kriegman DJ (1997) Eigenfaces vs. fisherface: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720
He X, Yan S, Hu Y, Niyogi P, Zhang HJ (2005) Face recognition using Laplacian faces. IEEE Trans Pattern Anal Mach Intell 27(3):328–340
Dabbaghchian SP, Ghaemmaghami M, Aghagolzadeh A (2010) Feature extraction using discrete cosine transform and discrimination power analysis with a face recognition technology. Pattern Recognit 43:1431–1440
Stiefelhagen R, Ekenel HK, Fügen C, Gieselmann P, Holzapfel H, Kraft F, Nickel K, Waibel A (2007) Enabling multimodal human–robot interaction for the Karlsruhe humanoid robot. IEEE Trans Robot 23(5):840–851
Liu C, Wechsler H (2002) Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition. IEEE Trans Image Process 11(4):467–476
Lu J, Zhao Y, Hu J (2009) Enhanced Gabor-based region covariance matrices for palmprint recognition. Electron Lett 45(17):880–881
Tong Y, Liao W, Ji Q (2007) Facial action unit recognition by exploiting their dynamic and semantic relationships. IEEE Trans Pattern Anal Mach Intell 29(10):1683–1699
Susskind JM, Littlewort G, Bartlett MS (2007) Human and computer recognition of facial expressions of emotion. Neuropsychologia 45(1):152–162
Pavani SK, Delgado D, Frangi AF (2010) Haar-like features with optimally weighted rectangles for rapid object detection. Pattern Recognit 43(1):160–172
Papageorgiou CP, Oren M, Poggio T (1998) A general framework for object detection. In: IEEE international conference on computer vision, pp 555–562
Yang P, Li Q, Metaxas DN (2009) Boosting encoded dynamic features for facial expression recognition. Pattern Recognit Lett 30(2):132–139
Lai K, Bo L, Ren X, Fox D (2011) A large-scale hierarchical multi-view RGB-D object dataset. In: IEEE international conference on robotics and automation, pp 1817–1824
Martin AF, Robert, CB (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381–395
Benavidez P, Jamshidi M (2011) Mobile robot navigation and target tracking system. In: International conference on system of systems engineering, pp 299–304
Johnson A, Hebert M (1999) Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Trans Pattern Anal Mach Intell 21(5):433–449
Rusu RB, Blodow N, Beetz M (2009) Fast point feature histograms (FPFH) for 3D registration. In: IEEE international conference on robotics and automation, pp 3212–3217
Bo L, Ren X, Fox D (2011) Depth kernel descriptors for object recognition. In: International conference on intelligent robots and systems, pp 821–826
Hartley R, Zisserman A (2000) Multiple view geometry in computer vision. Cambridge University Press, Cambridge, pp 1–12
Kim I, Kim D, Cha Y, Lee K, Kuc T (2007) An embodiment of stereo vision system for mobile robot for real-time measuring distance and object tracking. In: International conference on control, automation and systems, pp 1029–1033
Li Z, Jarvis R (2009) A multi-modal gesture recognition system in a human–robot interaction scenario. In: International workshop on robotic and sensors environments, pp 41–46
Thompson S, Kagami S (2005) Humanoid robot localisation using stereo vision. In: International conference on humanoid robots, pp 19–25
Prasad R, Saruwatari H, Shikano K (2004) Robots that can hear, understand and talk. Adv Robot 18(5):533–564
Sweeney L, Thompson P (1997) Speech perception using real-time phoneme detection: the BeBe system
Jaisal PK, Mishra PK (2012) A review of speech pattern recognition: survey. Int J Comput Sci Technol 3(1):709–713
Clavel C, Vasilescu I, Devillers L, Richard G, Ehrette T (2008) Fear-type emotion recognition for future audio-based surveillance systems. Speech Commun 50:487–503
Vogt T, André E, Johannes W (2008) Automatic recognition of emotion from speech: a review of the literature and recommendations for practical realisation. In: Lecture note in computer science, vol 4868. Springer, Berlin, pp 75–91
Hyun K, Kim E, Kwak Y (2007) Emotional feature extraction based on phoneme information for speech emotion recognition. In: IEEE international conference on robot & human interactive communication, pp 802–806
Devillers L, Vidrascu L, Lamel L (2005) Challenges in real-life emotion annotation and machine learning based detection. Neural Netw 18:407–422
Rong J, Gang L, Chen Y (2008) Acoustic feature selection for automatic emotion recognition from speech. Inf Process Manag 45(3):315–328
Hegel F, Spexard T, Wrede B, Horstmann G, Vogt T (2006) Playing a different imitation game: interaction with an empathic android robot. In: IEEE-RAS international conference on humanoid robots, pp 56–61
Morrison D, Wang R, Silva L (2007) Ensemble methods for spoken emotion recognition in call-centres. Speech Commun 49:98–112
Markel JD (1972) The SIFT algorithm for fundamental frequency estimation. IEEE Trans Audio Electroacoust AU-20(5):367–377
Wang C, Seneff S (2000) Robust pitch tracking for prosodic modeling in telephone speech. In: IEEE international conference on acoustics, speech and signal processing, vol 3, pp 1343–1346
Ahmadi S, Spanias AS (1999) Cepstrum-based pitch detection using a new statistical V/UV classification algorithm. IEEE Trans Speech Audio Process 7(3):333–338
Xu M, Duan LY, Cai J, Chia, LT, Xu C, Tian Q (2004) CHMM-based audio keyword generation. In: Lecture notes in computer science, vol 3333. Springer, Berlin, pp 566–574
Kim E, Hyun K, Kim S, Kwak Y (2009) Improved emotion recognition with a novel speaker-independent feature, IEEE/ASME Trans Mechatron. doi:10.1109/TMECH.2008.2008644
Welch P (1967) The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Trans Audio Electroacoust AU-15:70–73
Shibata T, Inoue K, Irie R (1996) Affective statesal robot for intelligent system—artificial affective statesal creature project. In: IEEE international workshop on robot and human communication, pp 466–471
Dao D, Sugiyama S (2006) Fabrication and characterization of 4-DOF soft-contact tactile sensor and application to robot fingers. In: International symposium on micro-NanoMechatronics and human science, pp 1–6
Tsetserukou D, Kawakami N, Tachi S (2008) An approach to contact force vector determination and its implementation to provide intelligent tactile interaction with environment. In: Lecture notes in computer science, vol 5024. Springer, Berlin, pp 151–156
Iwata H, Hoshino H, Morita T, Sugano S (2001) Force detectable surface covers for humanoid robots. In: International conference on advanced intelligent mechatronics, pp 1205–1210
Stiehl W, Breazeal C (2006) A sensitive skin for robotic companions featuring temperature, force, and electric field sensors. In: IEEE/RSJ international conference on intelligent robots and systems, pp 1952–1959
Stiehl W, Lieberman J, Breazeal C, Basel L, Lalla L, Wolf M (2005) Design of a therapeutic robotic companion for relational, affective touch. In: IEEE international workshop on robots and human interactive communication, pp 408–415
Shibata T (2004) Ubiquitous surface tactile sensor. In: IEEE technical exhibition based conference on robotics and automation, pp 5–6
Berger DA (1988) On using a tactile sensor for real-time feature extraction. Master’s thesis, Carnegie-Mellon University
Iwata H, Sugano S (2005) Human–robot-contact-state identification based on tactile recognition. IEEE Trans Ind Electron 52(6):1468–1477
Göger D, Gorges N, Wörn H (2009) Tactile sensing for an anthropomorphic robotic hand: hardware and signal processing. In: IEEE international conference on robotics and automation, pp 895–901
Carotenuto L, Famularo D, Muraca P, Raiconi G (1997) A fuzzy classifier for tactile sensing. J Intell Robot Syst Theory Appl 20(1):71–86
Glas DF, Miyashita T, Ishiguro H, Hagita N (2007) Laser tracking of human body motion using adaptive shape modeling. In: IEEE international conference on intelligent robots and systems, pp 602–608
Gockley R, Forlizzi J, Simmons R (2007) Natural person-following behavior for social robots. In: ACM/IEEE international conference on human-robot interaction, pp 17–24
Jung B, Sukhatme GS (2009) Real-time motion tracking from a mobile robot. Int J Soc Robot. doi:10.1007/s12369-009-0038
Glas DF, Miyashita T, Ishiguro H, Hagita N (2009) Laser-based tracking of human position and orientation using parametric shape modeling. Adv Robot 23:405–428
Morales J, Martinez JL, Mandow A, Pequeno-Boter A, Garcia-Cerezo A (2011) Design and development of a fast and precise low-cost 3D laser rangefinder. In: International conference on mechatronics, pp 621–626
Scholer F, Behley J, Steinhage V, Schulz D, Cremers AB (2011) Person tracking in three-dimensional laser range data with explicit occlusion adaption. In: International conference on robotics and automation, pp 1297–1303
Spinello L, Arras KO, Triebel R, Siegwart R (2010) A layered approach to people detection in 3D range data. In: Proceedings of the national conference on artificial intelligence, vol 3, pp 1625–1630
Navarro-Serment LE, Mertz C, Hebert M (2010) Pedestrian detection and tracking using three-dimensional LADAR data. Int J Robot Res 29(12):1516–1528
Harrison A, Newman P (2008) High quality 3D laser ranging under general vehicle motion. In: International conference on robotics and automation, pp 7–12
Pantic M, Rothkrantz LJM (2003) Toward an affect-sensitive multimodal human–computer interaction. Proc IEEE 91(9):1370–1390
Fragopanagos N, Taylor J (2005) Emotion recognition in human–computer interaction. Neural Netw 18(4):389–405
Zeng Z, Tu J, Brian M, Huang T (2008) Audio-visual affective expression recognition through multistream fused HMM. IEEE Trans Multimed 10(4):570–577
Johnson DO, Agah A (2009) Human robot interaction through semantic integration of multiple modalities, dialog management, and contexts. Int J Soc Robot 1:283–305
Spexard T, Hanheide M, Sagerer G (2007) Human-oriented interaction with an anthropomorphic robot. IEEE Trans Robot 23(5):852–862
Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3(1):71–86
Guillaume L, Miroslav R (2009) Directed reading: boosting algorithms
Inamura T, Toshima I, Nakamura Y (2003) Acquiring motion elements for bidirectional computation of motion recognition and generation. In: Experimental robotics VIII. Springer, Berlin, pp 372–381
Esau N, Kleinjohann L, Kleinjohann B (2005) An adaptable fuzzy emotion model for emotion recognition. In: European society for fuzzy logic and technology, pp 73–78
Altun H, Polat G (2009) Boosting selection of speech related features to improve performance of multi-class SVMs in emotion detection. Expert Syst Appl 36(4):8197–8203
Zhai Y, Yeary MB, Cheng S, Kehtarnavaz N (2009) An object-tracking algorithm based on multiple-model particle filtering with state partitioning. IEEE Trans Instrum Meas 58(5):1797–1809
Wu X, Gong H, Chen P, Zhong Z, Xu Y (2009) Surveillance robot utilizing video and audio information. J Intell Robot Syst 55(4–5):403–421
Nummiaro K, Koller-Meier E, Van Gool L (2003) An adaptive color-based particle filter. Image Vis Comput 21(1):99–110
Kwon HS, Kim YJ, Lim MT (2005) Person tracking with a mobile robot using particle filters in complex environment. In: International society for optical engineering, vol 6042. SPIE Press, Bellingham
Muñoz-Salinas R, García-Silvente M, Medina Carnicer R (2008) Adaptive multi-modal stereo people tracking without background modelling. J Vis Commun Image Represent 19(2):75–91
Tao Z, Biwen Z, Lee L, Kaber D (2008) Service robot anthropomorphism and interface design for emotion in human–robot interaction. In: IEEE conference on automation science and engineering, pp 674–679
Serrano A, de Diego IM, Conde C, Cabello E (2009) Recent advances in face biometrics with Gabor wavelets: a review. Pattern Recogn Lett. doi:10.1016/j.patrec.2009.11.002
Whitehill J, Littlewort G, Fasel I, Bartlett M, Movellan J (2009) Toward practical smile detection. IEEE Trans Pattern Anal Mach Intell 31(11):2106–2111
Burkhardt F, Paeschke A, Rolfes M, Sendlmeier W, Weiss B (2005) A database of German emotional speech. In: European conference on speech communication and technology, pp 1517–1520
Quast H (2001) Automatic recognition of nonverbal speech: an approach to model the perception of para- and extralinguistic vocal communication with neural networks. Master’s thesis, University of Göttingen
Kanade T, Cohn J, Tian YL (2000) Comprehensive database for facial expression analysis. In: IEEE international conference on face and gesture analysis, pp 46–53
Zhang W, Shan S, Gao W, Chen X, Zhang H (2005) Local Gabor binary pattern histogram sequence (LGBPHS): a novel non-statistical model for face representation and recognition. In: IEEE international conference on computer vision, vol 1, pp 786–791
Goodrich MA, Schultz AC (2007) Human–robot interaction: a survey. Found Trends Hum-Comput Interact 1(3):203–275
Drury JL, Scholtz J, Yanco HA (2004) Applying CSCW and HCI techniques to human–robot interaction. In: CHI 2004 workshop on shaping human–robot interaction
Moller A, Roalter L, Kranz M (2011) Cognitive objects for human–computer interaction and human–robot interaction. In: HRI2011, 6–9 March, Lausanne, Switzerland
Saldien J, Goris K, Vanderborght B, Vanderfaeillie J, Lefeber D (2010) Expressing emotions with the social robot Probo. Int J Soc Robot 2(4):377–389
Yan, H., Ang, M.H. & Poo, A.N. A Survey on Perception Methods for Human–Robot Interaction in Social Robots. Int J of Soc Robotics 6, 85–119 (2014). https://doi.org/10.1007/s12369-013-0199-6