Abstract
Geolocation technologies have generated unprecedentedly rich datasets of people’s location information at very high fidelity. These datasets can be used to study human behavior; for example, social studies have shown that people who are frequently seen together at the same place and time are most likely socially related. In this article, we infer these social connections by analyzing people’s location information, which is useful in a variety of application domains, from sales and marketing to intelligence analysis. In particular, we propose an entropy-based model (EBM) that not only infers social connections but also estimates their strength by analyzing people’s co-occurrences in space and time. We examine two independent ways in which co-occurrences contribute to the strength of a social connection: diversity and weighted frequency. In addition, we take the characteristics of each location into account to compensate for cases where only limited location information is available. We also study the role of location semantics in improving our computation of social strength. We develop a parallel implementation of our algorithm using MapReduce to create a scalable and efficient solution for online applications. We conducted extensive experiments with real-world datasets that include both people’s location data and their social connections, using the latter as ground truth to verify the results of applying our approach to the former. We show that our approach is valid across different networks and outperforms competing approaches.
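To make the two signals named above concrete, the following is a minimal sketch and not the article's exact formulation. It assumes that diversity is the exponential of the Shannon entropy of a pair's co-occurrence counts across locations, and that weighted frequency discounts each co-occurrence by the exponential of the negative entropy of the location's visitor distribution, so that crowded places count less. The function names, the dictionary layout, and both formulas are illustrative assumptions.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a collection of non-negative counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts)


def diversity(pair_counts):
    """Effective number of distinct locations where a pair co-occurs.

    pair_counts maps location -> number of co-occurrences of the pair there.
    Assumption: diversity = exp(Shannon entropy of those counts).
    """
    return math.exp(shannon_entropy(pair_counts.values()))


def weighted_frequency(pair_counts, visits):
    """Co-occurrence count where each location is discounted by its crowdedness.

    visits maps location -> {user: number of visits}.
    Assumption: a location's weight is exp(-entropy of its visitor distribution),
    so a place visited evenly by many users contributes little.
    """
    return sum(
        count * math.exp(-shannon_entropy(visits[loc].values()))
        for loc, count in pair_counts.items()
    )


# Hypothetical data: one pair meets twice at a cafe and twice at a gym.
pair = {"cafe": 2, "gym": 2}
visits = {"cafe": {"a": 3, "b": 3}, "gym": {"a": 1}}
d = diversity(pair)
f = weighted_frequency(pair, visits)
```

Under these assumptions, a pair that meets at several different places scores higher on diversity than a pair with the same number of meetings at a single place, and meetings at a sparsely visited location raise weighted frequency more than meetings at a crowded one.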