Multi-modal microblog classification via multi-task learning

Multimedia Tools and Applications

Abstract

Recent years have witnessed the flourishing of social media platforms (SMPs) such as Twitter, Facebook, and Sina Weibo. The rapid development of these SMPs has produced multimedia data at an increasingly large scale, and these data have been shown to carry considerable marketing value. There is an urgent need to classify social media data against a specified list of entities of interest, such as brands, products, and events, in order to analyze their sales, popularity, or influence. This task is challenging because microblogs are short, conversational, and diverse, and because their images and text are often incompatible. In this paper, we present a multi-modal microblog classification method built on a multi-task learning framework. First, features of different modalities are extracted for each microblog: TF-IDF features for the text, and low-level visual features and high-level semantic features for the image. Then, for each feature, multiple related classification tasks are learned simultaneously, which increases the sample size available to each task and improves prediction performance. Finally, the outputs for each feature are integrated by a Support Vector Machine that learns how to optimally combine and weight the features. We evaluate the proposed method on Brand-Social-Net, classifying the 100 brands it contains. Experimental results demonstrate the superiority of the proposed method over state-of-the-art approaches.
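To make the text-feature step concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It is an illustration only, not the authors' implementation: the abstract does not state which TF-IDF variant was used, so the weighting here (term frequency normalized by document length, inverse document frequency as log(N / df)) is an assumption, and the function name `tfidf` and the toy posts are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    The abstract does not say which TF-IDF weighting the authors used.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc)
        vectors.append({
            term: (count / length) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy, made-up microblog texts (not taken from Brand-Social-Net).
posts = [
    "new phone camera is great".split(),
    "great coffee this morning".split(),
    "phone battery died again".split(),
]
vectors = tfidf(posts)
```

Under this weighting, a term that appears in only one post (such as "camera") receives a higher weight than a term shared across posts (such as "great"), and a term present in every document receives weight zero. The resulting sparse vectors would be the text-modality input to the per-feature classifiers described above.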

Notes

  1. https://twitter.com

  2. https://www.facebook.com

  3. http://weibo.com

  4. www.image-net.org

  5. wordnet.princeton.edu

  6. http://www.MALSAR.org

  7. www.nextcenter.org/Brand-Social-Net/

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61472103) and Key Program (No. 61133003). Sicheng Zhao was also supported by the Ph.D. Short-Term Overseas Visiting Scholar Program of Harbin Institute of Technology.

Author information

Corresponding author

Correspondence to Hongxun Yao.

About this article

Cite this article

Zhao, S., Yao, H., Zhao, S. et al. Multi-modal microblog classification via multi-task learning. Multimed Tools Appl 75, 8921–8938 (2016). https://doi.org/10.1007/s11042-014-2342-2

  • DOI: https://doi.org/10.1007/s11042-014-2342-2
