Abstract
Social media users share a large amount of content, such as comments, news, photos, and videos. Automated systems can use this information to segment users and provide them with specific recommendations or focused content. One of the most popular ways to segment users is by age and gender. However, such demographic variables are frequently hidden, so it is useful to infer them indirectly. Commonly, these variables are learned from the text comments users publish, by analyzing writing style or word frequency. In this paper, we present a study of several machine learning models that use user-generated images and text, exploiting both types of information to infer the age and gender of Pinterest users. We experiment with the models on a dataset of 548,761 pins posted by 264 users. Each pin combines an image and a short comment. We transformed the images into a deep visual representation using the pretrained convolutional neural network ResNet-50, and transformed the comments using the tf-idf method. We compare the models with one another and across the two types of information using different performance metrics. Our experiments indicate that user-generated image and text content is a viable basis for characterizing users.
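As a rough illustration of the feature construction described in the abstract, the following minimal sketch computes tf-idf weights for a few pin comments and joins each text vector with a visual vector. It is not the authors' implementation: the toy comments, the two-dimensional stand-in visual vectors, the helper names `tfidf_vectors` and `combine`, the unsmoothed idf formula, and the use of simple concatenation to merge the two modalities are all assumptions made for the example. In practice the visual vectors would come from a pretrained ResNet-50, whose final pooling layer yields 2048 values per image.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) where each weight is tf * ln(N / df).

    tf is the term count divided by the document length; this unsmoothed
    variant is an assumption, not necessarily the one used in the paper.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty comments
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


def combine(text_vec, visual_vec):
    """Join both modalities by concatenation (an assumed fusion strategy)."""
    return list(text_vec) + list(visual_vec)


# Hypothetical toy data: three pin comments and stand-in visual features.
comments = [
    "easy vegan dinner recipe",
    "vintage car restoration",
    "vegan dessert recipe",
]
visual = [[0.1, 0.9], [0.8, 0.2], [0.2, 0.7]]  # placeholders for CNN features

vocab, text_vecs = tfidf_vectors(comments)
fused = [combine(t, v) for t, v in zip(text_vecs, visual)]
```

Each row of `fused` could then be passed to any standard classifier. Terms that appear in every comment receive a weight of zero under this idf definition, which is why smoothed variants are often preferred on small corpora.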
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Bravo-Marmolejo, SP., Moreno, J., Gomez, J.C., Pérez-Martínez, C., Ibarra-Manzano, MA., Almanza-Ojeda, DL. (2019). Identification of Age and Gender in Pinterest by Combining Textual and Deep Visual Features. In: Damaševičius, R., Vasiljevienė, G. (eds) Information and Software Technologies. ICIST 2019. Communications in Computer and Information Science, vol 1078. Springer, Cham. https://doi.org/10.1007/978-3-030-30275-7_24
DOI: https://doi.org/10.1007/978-3-030-30275-7_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30274-0
Online ISBN: 978-3-030-30275-7
eBook Packages: Computer Science, Computer Science (R0)