ABSTRACT
In this paper, we propose SocioPhone, a novel initiative to build a mobile platform for face-to-face interaction monitoring. Face-to-face interaction, especially conversation, is a fundamental part of everyday life. Interaction-aware applications aimed at facilitating group conversations have been proposed, but they have not yet proliferated. The contexts that are useful for capturing and supporting face-to-face interactions need to be explored more deeply. More importantly, recognizing delicate conversational contexts with commodity mobile devices requires solving a number of technical challenges. As a first step toward addressing these challenges, we identify useful meta-linguistic contexts of conversation, such as turn-taking, prosodic features, the dominant participant, and pace. These serve as cornerstones for building a variety of interaction-aware applications. SocioPhone abstracts these meta-linguistic contexts as a set of intuitive APIs. Its runtime efficiently monitors registered contexts during in-progress conversations and notifies applications on the fly. Importantly, we observe that online turn monitoring is the basic building block for extracting diverse meta-linguistic contexts, and we devise a novel volume-topography-based method for it. We show the usefulness of SocioPhone with several applications: SocioTherapist, SocioDigest, and Tug-of-War. We also show that our turn-monitoring technique is highly accurate and energy-efficient under diverse real-life situations.
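The abstract states that turn monitoring is the building block from which other meta-linguistic contexts are extracted. As a rough illustration only, the sketch below derives two such contexts, a dominant participant and a pace measure, from an already-detected sequence of turns. It is not SocioPhone's API or its volume-topography method: the `Turn` type, the function names, the use of total speaking time as the dominance criterion, and the use of speaker changes per minute as "pace" are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One detected speaking turn (hypothetical type, not from the paper)."""
    speaker: str
    start: float  # seconds since the conversation began
    end: float    # seconds since the conversation began


def speaking_time(turns):
    """Total seconds spoken by each participant."""
    totals = defaultdict(float)
    for turn in turns:
        totals[turn.speaker] += turn.end - turn.start
    return dict(totals)


def dominant_participant(turns):
    """Assumed criterion: the participant with the longest total speaking time."""
    totals = speaking_time(turns)
    return max(totals, key=totals.get) if totals else None


def turn_changes_per_minute(turns):
    """Assumed proxy for pace: speaker changes per minute of conversation."""
    if len(turns) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(turns, turns[1:]) if a.speaker != b.speaker)
    duration = turns[-1].end - turns[0].start
    return 60.0 * changes / duration if duration > 0 else 0.0


# Toy conversation: A speaks 10 s, B speaks 5 s, A speaks 15 s.
example_turns = [Turn("A", 0.0, 10.0), Turn("B", 10.0, 15.0), Turn("A", 15.0, 30.0)]
print(dominant_participant(example_turns))
print(turn_changes_per_minute(example_turns))
```

In this toy input, participant A holds the floor for 25 of 30 seconds, so A is reported as dominant under the assumed criterion. A real system would compute such values incrementally as turns arrive rather than over a completed list.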
SocioPhone: everyday face-to-face interaction monitoring platform using multi-phone sensor fusion
MobiSys '13: Proceeding of the 11th annual international conference on Mobile systems, applications, and services