Abstract
We introduce the automatic analysis of conversational vlogs (VlogSense, for short) as a new research domain in social media. Conversational vlogs are inherently multimodal, depict natural behavior, and are suitable for large-scale analysis. Given their diversity in terms of content, VlogSense requires the integration of robust methods for multimodal analysis and for social media understanding. We present an original study on the automatic characterization of vloggers' audiovisual nonverbal behavior, grounded in work from social psychology and behavioral computing. Our study on 2,269 vlogs from YouTube shows that several nonverbal cues are significantly correlated with the social attention received by videos.
Index Terms
- VlogSense: Conversational behavior and social attention in YouTube