Qualitative spatial logic descriptors from 3D indoor scenes to generate explanations in natural language

Falomir, Zoe; Kluth, Thomas

doi:10.1007/s10339-017-0824-7

Research Report
Published: 24 June 2017

Cognitive Processing, Volume 19, pages 265–284 (2018)

Abstract

This paper tackles the challenge of describing real 3D scenes using qualitative spatial descriptors. A key question is which qualitative descriptors to use and how they must be organized to produce a suitable cognitive explanation. To answer it, a survey was carried out in which human participants openly described a scene containing some pieces of furniture. The survey data were analysed and, taking the results into account, the QSn3D computational approach was developed, which uses an Xbox 360 Kinect to obtain 3D data from a real indoor scene. Object features are computed on these 3D data to identify objects in indoor scenes. The object orientation is computed, and qualitative spatial relations between the objects are extracted. These qualitative spatial relations are the input to a grammar which applies saliency rules obtained from the survey study and generates cognitive natural language descriptions of scenes. Moreover, these qualitative descriptors can be expressed as first-order logical facts in Prolog for further reasoning. Finally, a validation study is carried out to test whether the descriptions provided by the QSn3D approach are human readable. The results show that their acceptability is higher than 82%.
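
The step from extracted qualitative relations to Prolog facts and sentences can be illustrated with a minimal Python sketch. It is not the QSn3D implementation: the relation labels (`left_of`, `in_front_of`, `near`), the phrase table and all function names are assumptions made for illustration, and the saliency rules derived from the survey are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One qualitative spatial relation between two objects (hypothetical schema)."""
    name: str       # relation label, e.g. "left_of" (illustrative, not from the paper)
    located: str    # object being described
    reference: str  # object it is described relative to


# Illustrative mapping from relation labels to English phrases (assumed).
PHRASES = {
    "left_of": "to the left of",
    "in_front_of": "in front of",
    "near": "near",
}


def to_prolog_fact(rel: Relation) -> str:
    """Render the relation as a Prolog fact; atoms are kept lowercase."""
    return f"{rel.name}({rel.located}, {rel.reference})."


def to_sentence(rel: Relation) -> str:
    """Render the relation as a simple English sentence."""
    return f"The {rel.located} is {PHRASES[rel.name]} the {rel.reference}."


if __name__ == "__main__":
    scene = [
        Relation("left_of", "chair", "table"),
        Relation("near", "sofa", "table"),
    ]
    for rel in scene:
        print(to_prolog_fact(rel))
        print(to_sentence(rel))
```

Keeping the relation as structured data and rendering it twice, once as a logical fact and once as text, mirrors the two outputs the abstract mentions (facts for reasoning, narratives for readers) without committing to any particular grammar.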

Notes

Trade and company names are included for benefit of the reader and imply no endorsement or preferential treatment of the product by the authors.
For a cross-disciplinary taxonomy of reference frames see the work by Pederson (2003).
JARCA workshop: http://madeirasic.us.es/jarca16/?lang=en.
In general, different classification algorithms could be applied as well. In particular, zero-shot learning (e.g. Ji et al. 2017; Socher et al. 2013) might prove a useful improvement to the current implementation, as these methods do not require a training phase, which would make it easier to add new objects to the system.
http://www.ros.org.
http://www.openni.org.
http://www.pointclouds.org.
https://www.prolific.ac/.

References

Barclay M, Galton A (2013) Selection of reference objects for locative expressions: the importance of knowledge and perception. In: Tenbrink T, Wiener J, Claramunt C (eds) Representing space in cognition: interrelations of behavior, language, and formal models, explorations in language and space. Oxford University Press, Oxford, pp 57–169. doi:10.1093/acprof:oso/9780199679911.003.0005
Bo L, Lai K, Ren X, Fox D (2011a) Object recognition with hierarchical kernel descriptors. In: Proceedings of computer vision and pattern recognition
Bo L, Ren X, Fox D (2011b) Depth kernel descriptors for object recognition. In: 2011 IEEE/RSJ international conference on intelligent robots and systems, IROS 2011, San Francisco, CA, September 25–30, IEEE, pp 821–826
Carlson LA, Regier T, Lopez W, Corrigan B (2006) Attention unites form and function in spatial language. Spat Cogn Comput 6(4):295–308
Carlson LA, Skubic M, Miller J, Huo Z, Alexenko T (2014) Strategies for human-driven robot comprehension of spatial descriptions by older adults in a robot fetch task. Topics Cogn Sci 6(3):513–533. doi:10.1111/tops.12101
Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2:27:1–27:27, http://www.csie.ntu.edu.tw/~cjlin/libsvm
Clark HH (1996) Using language. Cambridge University Press, Cambridge
Du H, Henry P, Ren X, Cheng M, Goldman D, Seitz SM, Fox D (2011) Interactive 3D modeling of indoor environments with a consumer depth camera. In: Proceedings of 13th international conference on ubiquitous computing, ACM, New York, NY, UbiComp’11, pp 75–84
Falomir Z (2013) Towards cognitive image interpretation qualitative descriptors, domain knowledge and narrative generation. In: Gibert K, Reig-Balao VBR (eds) Artificial intelligence research and development, frontiers in artificial intelligence and applications. IOS Press, Amsterdam, pp 45–57
Falomir Z (2015) A qualitative model for reasoning about 3D objects using depth and different perspectives. In: Lechowski T, Walega P, Zawidzki M (eds) LQMR 2015 workshop, PTI, annals of computer science and information systems, vol 7, pp 3–11, doi:10.15439/2015F370
Falomir Z, Rahman S (2015) From qualitative descriptors of movement towards spatial logics for videos. In: Proceedings of 3rd workshop on recognition and action for scene understanding (REACTS), co-located at 16th international conference of computer analysis of images and patterns (CAIP), Valletta, Malta, pp 119–128
Falomir Z, Castelló V, Escrig MT, Peris JC (2011a) Fuzzy distance sensor data integration and interpretation. Int J Uncertainty Fuzziness Knowl Based Syst IJUFKS 19(3):499–528. doi:10.1142/S0218488511007106
Falomir Z, Jiménez-Ruiz E, Escrig MT, Museros L (2011b) Describing images using qualitative models and description logics. Spatial Cogn Comput 11(1):45–74. doi:10.1080/13875868.2010.545611
Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381–395
Genesereth MR, Nilsson NJ (1987) Logical foundations of artificial intelligence. Morgan Kaufmann Publishers, Burlington
Henry P, Krainin M, Herbst E, Ren X, Fox D (2010) RGBD mapping: using depth cameras for dense 3D modeling of indoor environments. In: RGB-D: advanced reasoning with depth cameras workshop in conjunction with RSS
Herbst E, Henry P, Ren X, Fox D (2011a) Toward object discovery and modeling via 3-D scene comparison. In: ICRA, IEEE, pp 2623–2629
Herbst E, Ren X, Fox D (2011b) RGB-D object discovery via multi-scene analysis. In: 2011 IEEE/RSJ international conference on intelligent robots and systems, IROS 2011, San Francisco, CA, September 25–30, IEEE, pp 4850–4856
Hernández D, Clementini E, Di Felice P (1995) Qualitative distances. In: Frank AU, Kuhn W (eds) Spatial information theory—a theoretical basis for GIS (COSIT’95). Springer, Berlin, pp 45–57
Huo Z, Skubic M (2016) Natural spatial description generation for human–robot interaction in indoor environments. In: 2016 IEEE international conference on smart computing (SMARTCOMP), pp 1–3, doi:10.1109/SMARTCOMP.2016.7501708
Ji Z, Yu Y, Pang Y, Chen L, Zhang Z (2017) Zero-shot learning with multi-battery factor analysis. Signal Process 138:265–272. doi:10.1016/j.sigpro.2017.03.023
Kluth T, Schultheis H (2014) Attentional distribution and spatial language. In: Freksa C, Nebel B, Hegarty M, Barkowsky T (eds) Spatial cognition IX, vol 8684. Lecture notes in computer science. Springer, Berlin, pp 76–91. doi:10.1007/978-3-319-11215-2_6
Kluth T, Burigo M, Knoeferle P (2017a) Modeling the directionality of attention during spatial language comprehension. In: Herik JVD, Filipe J (eds) Agents and artificial intelligence, vol 10162. Lecture notes in computer science. Springer, Berlin, pp 283–301
Kluth T, Burigo M, Schultheis H, Knoeferle P (2017b) Does direction matter? linguistic asymmetries reflected in visual attention. Cognition (to appear)
Krainin M, Henry P, Ren X, Fox D (2011) Manipulator and object tracking for in-hand 3D object modeling. Int J Robot Res 30(11):1311–1327. doi:10.1177/0278364911403178
Lai K, Bo L, Ren X, Fox D (2011a) Sparse distance learning for object recognition combining RGB and depth information. In: IEEE international conference on robotics and automation
Lai K, Bo L, Ren X, Fox D (2011b) A scalable tree-based approach for joint object and pose recognition. In: Twenty-fifth conference on artificial intelligence (AAAI)
Landau B (2016) Update on what and where in spatial language: a new division of labor for spatial terms. Cogn Sci. doi:10.1111/cogs.12410
Levinson S (2003) Space in language and cognition: explorations in cognitive diversity. Cambridge University Press, Cambridge
Lison P (2010) Robust processing of spoken situated dialogue. Diplomica Verlag, Hamburg
Lloyd JW (1987) Foundations of logic programming. Symbolic computation: artificial intelligence, 2nd edn. Springer, Berlin
Marton ZC, Pangercic D, Rusu RB, Holzbach A, Beetz M (2010) Hierarchical object geometric categorization and appearance classification for mobile manipulation. In: 2010 10th IEEE-RAS international conference on humanoid robots (humanoids), IEEE, pp 365–370
Mast V, Falomir Z, Wolter D (2016) Probabilistic reference and grounding with PRAGR for dialogues with robots. J Exper Theor Artif Intell 28(5):889–911. doi:10.1080/0952813X.2016.1154611
Moratz R, Tenbrink T (2006) Spatial reference in linguistic human–robot interaction: iterative, empirically supported development of a model of projective relations. Spatial Cogn Comput 6(1):63–106
Moratz R, Tenbrink T (2008) Affordance-based human–robot interaction. In: Proceedings of the 2006 international conference on towards affordance-based robot control. Springer, Berlin, pp 63–76
Museros L, Falomir Z, Sanz I, Gonzalez-Abril L (2014) Sketch retrieval based on qualitative shape similarity matching: towards a tool for teaching geometry to children. AI Commun 28(1):73–86. doi:10.3233/AIC-140614
Olszewska JI (2015a) 3D spatial reasoning using the clock model. In: Bramer M, Petridis M (eds) Research and development in intelligent systems XXXII: incorporating applications and innovations in intelligent systems XXIII. Springer, Cham, pp 147–154. doi:10.1007/978-3-319-25032-8_10
Olszewska JI (2015b) Where is my cup?—fully automatic detection and recognition of textureless objects in real-world images. In: Azzopardi G, Petkov N (eds) Computer analysis of images and patterns: 16th International Conference, CAIP 2015, Valletta, Malta, September 2–4, 2015 Proceedings, Part I, Springer, pp 501–512. doi:10.1007/978-3-319-23192-1_42
Olszewska JI (2016) Interest-point-based landmark computation for agents’ spatial description coordination. In: van den Herik HJ, Filipe J (eds) Proceedings of the 8th international conference on agents and artificial intelligence (ICAART 2016), Vol 2, Rome, February 24–26, SciTePress, pp 566–569, doi:10.5220/0005847705660569
Oppenheimer DM, Meyvis T, Davidenko N (2009) Instructional manipulation checks: detecting satisficing to increase statistical power. J Exper Soc Psychol 45(4):867–872. doi:10.1016/j.jesp.2009.03.009
Pederson E (2003) How many reference frames? In: Freksa C, Brauer W, Habel C, Wender KF (eds) Spatial cognition III: routes and navigation, human memory and learning, spatial representation and spatial learning. Springer, Berlin, pp 287–304. doi:10.1007/3-540-45004-1_17
Regier T, Carlson LA (2001) Grounding spatial language in perception: an empirical and computational investigation. J Exper Psychol Gen 130(2):273–298. doi:10.1037//0096-3445.130.2.273
Ruiz-Sarmiento JR (2016) Probabilistic techniques in semantic mapping for mobile robotics. Ph.D. thesis, Department of Systems Engineering and Automatics, University of Malaga, Malaga
Ruiz-Sarmiento JR, Galindo C, González-Jiménez J (2015) Olt: A toolkit for object labeling applied to robotic RGB-D datasets. In: European conference on mobile robots
Rusu RB, Bradski G, Thibaux R, Hsu J (2010) Fast 3D recognition and pose using the viewpoint feature histogram. In: Proceedings of the 23rd IEEE/RSJ international conference on intelligent robots and systems (IROS), Taipei, Taiwan
Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, Kipman A, Blake A (2011) Real-time human pose recognition in parts from single depth images. In: Proceedings of 2011 IEEE conference on computer vision and pattern recognition. IEEE computer society, Washington, DC, CVPR ’11, pp 1297–1304
Skubic M, Blisard S, Bailey C, Adams J, Matsakis P (2004) Qualitative analysis of sketched route maps: translating a sketch into linguistic descriptions. IEEE Trans Syst Man Cyber B Cybern 34(2):1275–1282
Socher R, Ganjoo M, Manning CD, Ng A (2013) Zero-shot learning through cross-modal transfer. In: Advances in neural information processing systems, pp 935–943
Steels L (2015) The talking heads experiment: origins of words and meanings. Computational models of language evolution. Language Science Press. doi:10.17169/langsci.b49.75 http://langsci-press.org/catalog/book/49
Steinhauer HJ (2005) A qualitative model for natural language communication about vehicle traffic. In: AAAI spring symposium: reasoning with mental and external diagrams: computational modeling and spatial assistance, AAAI, pp 52–57
Tenbrink T, Fischer K, Moratz R (2002) Spatial strategies in linguistic human–robot communication. In: Freksa C (ed) KI-Themenheft 4/02 spatial cognition. arenDTaP Verlag, Bremen, pp 19–23
Tenbrink T, Maiseyenka V, Moratz R (2007) Spatial reference in simulated human–robot interaction involving intrinsically oriented objects. In: Symposium spatial reasoning and communication at AISB’07 artificial and ambient intelligence, vol 7
Tenbrink T, Coventry KR, Andonova E (2011) Spatial strategies in the description of complex configurations. Discourse Process 48(4):237–266
Tenorth M, Beetz M (2013) Knowrob: a knowledge processing infrastructure for cognition-enabled robots. Int J Robot Res 32(5):566–590. doi:10.1177/0278364913481635
Waibel M, Beetz M, Civera J, D’Andrea R, Elfring J, Galvez-Lopez D, Haussermann K, Janssen R, Montiel J, Perzylo A, Schiessle B, Tenorth M, Zweigle O, van de Molengraft R (2011) Roboearth. IEEE Robot Autom Mag 18(2):69–82. doi:10.1109/MRA.2011.941632
Zhang X, Li QQ, Fang ZX, Lu SW, Shaw SL (2014) An assessment method for landmark recognition time in real scenes. J Environ Psychol 40:206–217. doi:10.1016/j.jenvp.2014.06.008

Acknowledgements

This work was conducted within the scope of the project Cognitive Qualitative Descriptions and Applications (CogQDA: https://sites.google.com/site/cogqda/), funded by the Central Research Development Fund (CRDF) at Universität Bremen through the 04-Independent Projects for Postdocs action. The authors also thank Niels Eicke, Susanne Knoop, Bengt Kohrt, Nico Lehmann and Mareike Picklum for helping with the implementation.

Author information

Authors and Affiliations

Bremen Spatial Cognition Centre (BSCC), University of Bremen, Enrique-Schmidt-Str. 5, 28359, Bremen, Germany
Zoe Falomir
Language and Cognition Group, Cognitive Interaction Technology Excellence Cluster (CITEC), Bielefeld University, Inspiration 1, 33615, Bielefeld, Germany
Thomas Kluth

Corresponding author

Correspondence to Zoe Falomir.

Additional information

Handling editor: Antonio Bandera (University of Malaga); Reviewers: Andrea Torsello (Ca’ Foscari University Venice), Ricardo Vázquez Martín (University of Malaga), Rebeca Marfil (University of Malaga).

This article is part of the Special Section on ‘Cognitive Robotics’ guest-edited by Antonio Bandera, Jorge Dias, and Luis Manso.

Appendix

More results obtained by QSn3D (narratives and logics) are shown in Tables 6, 7, 8 and 9.

Table 6 QSn3D narratives and logics obtained in the home scenario using 2 pieces of furniture: 1 oriented and 1 non-oriented

Table 7 QSn3D narratives and logics obtained in the home scenario using 3 pieces of furniture: 1 oriented and 2 non-oriented

Table 8 QSn3D narratives and logics obtained in the office scenario using 2 pieces of furniture: 1 oriented and 1 non-oriented

Table 9 QSn3D narratives and logics obtained in the office scenario using 3 pieces of furniture: 2 oriented and 1 non-oriented

About this article

Cite this article

Falomir, Z., Kluth, T. Qualitative spatial logic descriptors from 3D indoor scenes to generate explanations in natural language. Cogn Process 19, 265–284 (2018). https://doi.org/10.1007/s10339-017-0824-7

Received: 12 November 2016
Accepted: 14 June 2017
Published: 24 June 2017
Issue Date: May 2018
DOI: https://doi.org/10.1007/s10339-017-0824-7
