Community-based location inference in social media using supervised learning approach

  • Original Article
  • Social Network Analysis and Mining

Abstract

Social media embed rich but noisy signals of their users’ physical locations. Accurately inferring a user’s location can significantly improve the user’s experience on social media and enable the development of new location-based applications. This paper adopts a supervised learning model, the generalized additive model (GAM), to find the best community in a user’s online neighborhood for predicting the user’s physical location. It proposes using geographical proximity, structural proximity, and generic attribute metrics to characterize the goodness of the communities in a user’s ego-net, and applies variable selection techniques to identify the community metrics that matter most for user location inference. Evaluating the GAM on real social media data, we find that it chooses better communities for location prediction than any individual metric does, and that it identifies median haversine distance, triangle participation ratio, and internal density as the three most significant metrics for community selection.
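To make the three metrics named above concrete, here is a minimal Python sketch of how they might be computed for one community. It is an illustration under stated assumptions, not the paper's implementation: median haversine distance is taken to be the median of pairwise great-circle distances among members with known coordinates, internal density and triangle participation ratio follow the common definitions in Yang and Leskovec (2015), and all function names, the Earth-radius constant, and the toy data are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, h)))

def median_haversine(coords):
    """Median pairwise distance among members (assumed definition)."""
    pairs = list(combinations(coords, 2))
    return median(haversine_miles(p, q) for p, q in pairs) if pairs else 0.0

def internal_density(nodes, edges):
    """Internal edges divided by the number of possible internal edges.

    Assumes each undirected edge appears once in `edges`.
    """
    n = len(nodes)
    internal = [e for e in edges if e[0] in nodes and e[1] in nodes]
    return len(internal) / (n * (n - 1) / 2) if n > 1 else 0.0

def triangle_participation_ratio(nodes, edges):
    """Fraction of community nodes that belong to an internal triangle."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        if u in adj and v in adj and u != v:
            adj[u].add(v)
            adj[v].add(u)
    in_triangle = set()
    for u in nodes:
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                in_triangle.update((u, v, w))
    return len(in_triangle) / len(nodes) if nodes else 0.0

# Toy community: a triangle a-b-c with one pendant node d.
community = {"a", "b", "c", "d"}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
density = internal_density(community, edges)          # 4 of 6 possible edges
tpr = triangle_participation_ratio(community, edges)  # 3 of 4 nodes
```

In a selection setting, scores like these would be computed for every community in a user's ego-net and passed to the model as predictors; how the paper combines them is not shown here.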

Notes

  1. In general, the GAM is allowed to be a generalized linear model that can accommodate categorical responses. Since the responses are continuous (distance) in our scenario, we use the linear form.

  2. We only look at community closeness at 50 miles, denoted by CC50, because 50 miles was identified as the best choice of distance in Wagenseller et al. (2019).

  3. We did not include the number of reciprocal contacts in the model because community size is linearly dependent on the numbers of friends, followers, and reciprocal contacts. Specifically, the sum of friends, followers, and reciprocal contacts equals community size, so any three of the four metrics carry the same information. Here, we omit reciprocal contacts.
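The linear dependence described in Note 3 can be checked numerically. The following Python sketch uses made-up counts (the data and the `matrix_rank` helper are hypothetical, written only for this illustration) to show that a design matrix containing all four metrics is rank-deficient, while dropping reciprocal contacts restores full column rank.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical per-community counts: (friends, followers, reciprocal contacts).
counts = [(3, 2, 1), (5, 0, 4), (1, 7, 2), (2, 2, 2), (6, 1, 0)]

# Community size is the sum of the three counts, as stated in Note 3.
full = [[f, w, r, f + w + r] for f, w, r in counts]
reduced = [[f, w, f + w + r] for f, w, r in counts]  # reciprocal contacts omitted

full_rank = matrix_rank(full)        # 4 columns, but only 3 are independent
reduced_rank = matrix_rank(reduced)  # 3 columns, all independent
```

With all four columns present, the coefficients of a linear fit would not be uniquely determined, which is why one of the four metrics has to be left out.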

References

  • Abrol S, Khan L (2010) Tweethood: Agglomerative clustering on fuzzy k-closest friends with variable depth for location mining. In: 2010 IEEE second international conference on social computing. IEEE, pp 153–160

  • Abrol S, Khan L, Thuraisingham B (2012) Tweeque: Spatio-temporal analysis of social networks for location mining using graph partitioning. In: 2012 international conference on social informatics. IEEE, pp 145–148

  • Backstrom L, Sun E, Marlow C (2010) Find me if you can: improving geographical prediction with social and spatial proximity. In: Proceedings of the 19th international conference on world wide web. ACM, pp 61–70

  • Cheng Z, Caverlee J, Lee K (2010) You are where you tweet: a content-based approach to geo-locating twitter users. In: Proceedings of the 19th ACM international conference on Information and knowledge management. ACM, pp 759–768

  • Chon J, Raymond R, Wang H, Wang F (2015) Modeling flu trends with real-time geo-tagged twitter data streams. In: International conference on wireless algorithms, systems, and applications. Springer, pp 60–69

  • Cresci S, Cimino A, Dell’Orletta F, Tesconi M (2015) Crisis mapping during natural disasters via text analysis of social media messages. In: International conference on web information systems engineering. Springer, pp 250–258

  • Dredze M, Paul M, Bergsma S, Tran H (2013) Carmen: A twitter geolocation system with applications to public health. In: Workshops at the twenty-seventh AAAI conference on artificial intelligence

  • Dunbar RI (2016) Do online social media cut through the constraints that limit the size of offline social networks? R Soc Open Sci 3(1):150292

  • Ghaffari M, Srinivasan A, Liu X (2019) High-resolution home location prediction from tweets using deep learning with dynamic structure. arXiv preprint arXiv:1902.03111

  • Hastie TJ (2017) Generalized additive models. In: Statistical models in S. Routledge, pp 249–307

  • Hecht B, Hong L, Suh B, Chi EH (2011) Tweets from Justin Bieber’s heart: the dynamics of the location field in user profiles. In: Proceedings of the ACM SIGCHI conference on human factors in computing systems, pp 237–246

  • Jurgens D (2013) That’s what friends are for: inferring location in online social media platforms based on social relationships. In: Seventh international AAAI conference on weblogs and social media

  • Jurgens D, Finethy T, McCorriston J, Xu YT, Ruths D (2015) Geolocation prediction in twitter using social networks: a critical analysis and review of current practice. In: Ninth international AAAI conference on web and social media

  • Kinsella S, Murdock V, O’Hare N (2011) I’m eating a sandwich in glasgow: modeling locations with tweets. In: Proceedings of the 3rd international workshop on Search and mining user-generated contents. ACM, pp 61–68

  • Kumar A, Singh JP (2019) Location reference identification from tweets during emergencies: a deep learning approach. Int J Disaster Risk Reduct 33:365–375

  • Leetaru K, Wang S, Cao G, Padmanabhan A, Shook E (2013) Mapping the global twitter heartbeat: the geography of twitter. First Monday. https://doi.org/10.5210/fm.v18i5.4366.

  • Leskovec J, Lang KJ, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th international conference on World wide web. ACM, pp 631–640

  • Li R, Wang S, Deng H, Wang R, Chang KCC (2012) Towards social user profiling: unified and discriminative influence model for inferring home locations. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1023–1031

  • McGee J, Caverlee J, Cheng Z (2013) Location prediction in social media based on tie strength. In: Proceedings of the 22nd ACM international conference on information and knowledge management, pp 459–468

  • Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci 105(4):1118–1123

  • Wagenseller P, Wang F, Wu W (2018) Size matters: a comparative analysis of community detection algorithms. IEEE Trans Comput Soc Syst 5(4):951–960

  • Wagenseller P, Avram A, Jiang E, Wang F, Zhao Y (2019) Location prediction with communities in user ego-net in social media. In: IEEE international conference on communications (ICC), pp 1–67

  • Weiszfeld E, Plastria F (2009) On the point for which the sum of the distances to n given points is minimum. Ann Oper Res 167(1):7–41

  • Xu C, Li J, Luo X, Pei J, Li C, Ji D (2019) Dlocrl: A deep learning pipeline for fine-grained location recognition and linking in tweets. In: The world wide web conference. ACM, pp 3391–3397

  • Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Knowl Inf Syst 42(1):181–213

  • Young JG, Hébert-Dufresne L, Allard A, Dubé LJ (2016) Growing networks of overlapping communities with internal structure. Phys Rev E 94(2):022317

  • Zheng X, Han J, Sun A (2018) A survey of location prediction on twitter. IEEE Trans Knowl Data Eng 30(9):1652–1671

Acknowledgements

This project is supported by NSF Grant ATD No. 1737861.

Author information

Corresponding author

Correspondence to Feng Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wagenseller, P., Zhao, Y., Wang, F. et al. Community-based location inference in social media using supervised learning approach. Soc. Netw. Anal. Min. 11, 64 (2021). https://doi.org/10.1007/s13278-021-00769-5

  • DOI: https://doi.org/10.1007/s13278-021-00769-5
