CrisMap: a Big Data Crisis Mapping System Based on Damage Detection and Geoparsing

Information Systems Frontiers

Abstract

Natural disasters, as well as human-made disasters, can have a deep impact on wide geographic areas, and emergency responders can benefit from the early estimation of emergency consequences. This work presents CrisMap, a Big Data crisis mapping system capable of quickly collecting and analyzing social media data. CrisMap extracts potential crisis-related actionable information from tweets by adopting a classification technique based on word embeddings and by exploiting a combination of readily available semantic annotators to geoparse tweets. The enriched tweets are then visualized in customizable, Web-based dashboards, also leveraging ad-hoc quantitative visualizations like choropleth maps. The maps produced by our system help to estimate the impact of the emergency in its early phases, to identify areas that have been severely struck, and to acquire greater situational awareness. We extensively benchmark the performance of our system on two Italian natural disasters by validating our maps against authoritative data. Finally, we perform a qualitative case study of a recent devastating earthquake that occurred in Central Italy.

Notes

  1. https://blog.twitter.com/2014/using-twitter-to-measure-earthquake-impact-in-almost-real-time.

  2. https://www.ushahidi.com/.

  3. https://www.mapbox.com/.

  4. https://www.google.org/crisismap/.

  5. http://www.esri.com/arcgis/.

  6. https://crisiscommons.org/.

  7. http://earthquake.usgs.gov/research/dyfi/.

  8. https://developer.twitter.com/en/docs/tweets/filter-realtime/overview.

  9. https://kafka.apache.org.

  10. http://spark.apache.org.

  11. https://www.elastic.co/products/elasticsearch.

  12. https://lucene.apache.org/.

  13. https://www.elastic.co/products/kibana.

  14. https://www.elastic.co/products.

  15. The plugin is publicly available at https://github.com/marghe943/kibanaChoroplethMap.git.

  16. See Section 5 for more details about the proposed approach.

  17. https://github.com/dexter/dexter.

  18. https://github.com/dbpedia-spotlight/model-quickstarter.

  19. https://www.elastic.co/blog/elasticsearch-performance-indexing-2-0.

  20. https://www.elastic.co/guide/en/elasticsearch/reference/6.0/tune-for-indexing-speed.html.

  21. http://www.sobigdata.eu/.

  22. https://en.wikipedia.org/wiki/2009_L'Aquila_earthquake.

  23. https://en.wikipedia.org/wiki/2012_Northern_Italy_earthquakes.

  24. https://en.wikipedia.org/wiki/August_2016_Central_Italy_earthquake.

  25. https://en.wikipedia.org/wiki/2013_Sardinia_floods.

  26. https://dev.twitter.com/docs/api/streaming.

  27. http://gnip.com/sources/twitter/historical.

  28. As the software implementation, we used the SVC class available in the scikit-learn Python package.

  29. The meaning of this hypothesis is that words appearing in similar contexts often have a similar meaning.

  30. We did not use more sophisticated methods such as “Paragraph Vector” (Le and Mikolov 2014), because these statistical methods do not work well on short texts like tweets.

  31. We used the 'balanced' value for the class_weight parameter; see the scikit-learn documentation at http://bit.ly/2g5QSqk. This tells the SVM to weight the labels differently during training, giving more importance to errors (as measured by the loss function in use) made on the less frequent classes.

  32. When several configurations achieve the same F1, we prefer those with more balanced precision and recall values.

  33. http://en.wikipedia.org/wiki/Washington.

  34. https://tagme.d4science.org/tagme/.

  35. https://en.wikipedia.org/wiki/Choropleth_map.

  36. http://www.regione.sardegna.it/documenti/1_231_20140403083152.pdf - Italian Civil Protection report on damage to private properties, public infrastructures, and production facilities.
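
Notes 31 and 32 can be illustrated with a minimal Python sketch. It assumes the weighting formula that the scikit-learn documentation gives for class_weight='balanced', namely n_samples / (n_classes * count of the class), and it reads note 32 as a tie-break on the gap between precision and recall. The label names, class counts, configuration names, and the rounding used to decide that two F1 values are "equal" are hypothetical and are not taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Per-class weights following the formula documented for
    scikit-learn's class_weight='balanced':
    n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pick_configuration(configs, ndigits=3):
    """Return the configuration with the highest F1 (rounded to `ndigits`,
    an assumed tolerance); among ties, prefer the smallest gap between
    precision and recall."""
    def sort_key(cfg):
        f1 = round(f1_score(cfg["precision"], cfg["recall"]), ndigits)
        gap = abs(cfg["precision"] - cfg["recall"])
        return (-f1, gap)
    return min(configs, key=sort_key)


# Hypothetical, skewed training labels: the rarest class gets the largest weight.
labels = ["damage"] * 10 + ["no_damage"] * 70 + ["not_relevant"] * 20
weights = balanced_class_weights(labels)

# Hypothetical configurations: A and B share the same F1, C is worse.
configs = [
    {"name": "A", "precision": 0.90, "recall": 0.72},
    {"name": "B", "precision": 0.80, "recall": 0.80},
    {"name": "C", "precision": 0.70, "recall": 0.75},
]
best = pick_configuration(configs)

print(weights)
print(best["name"])
```

With these made-up inputs, the rare "damage" class receives the largest weight, and configuration B is preferred over A because both reach the same F1 while B has equal precision and recall.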

References

  • Avvenuti, M. et al. (2014a). EARS (Earthquake Alert and Report System): a real time decision support system for earthquake crisis management. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1749–1758). ACM.

  • Avvenuti, M. et al. (2014b). Earthquake emergency management by social sensing. In 2014 IEEE International conference on pervasive computing and communications workshops (PERCOM Workshops) (pp. 587–592). IEEE.

  • Avvenuti, M. et al. (2016a). A framework for detecting unfolding emergencies using humans as sensors. SpringerPlus, 5.1, 43.

  • Avvenuti, M. et al. (2016b). Impromptu crisis mapping to prioritize emergency response. Computer, 49.5, 28–37.

  • Avvenuti, M. et al. (2016c). Predictability or early warning: using social media in modern emergency response. IEEE Internet Computing, 20.6, 4–6.

  • Avvenuti, M. et al. (2017). Hybrid crowdsensing: a novel paradigm to combine the strengths of opportunistic and participatory crowdsensing. In Proceedings of the 26th international conference on World Wide Web companion (pp. 1413–1421). International World Wide Web Conferences Steering Committee.

  • Bauduy, J. (2010). Mapping a crisis, one text message at a time. Social Education, 74.3, 142–143.

  • Bengio, Y., Courville, A., Vincent, P. (2013). Representation learning: a review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35.8, 1798–1828.

  • Bengio, Y. et al. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3, 1137–1155.

  • Burks, L., Miller, M., Zadeh, R. (2014). Rapid estimate of ground shaking intensity by combining simple earthquake characteristics with tweets. In 10th US National conference on earthquake engineering.

  • Cheng, Z., Caverlee, J., Lee, K. (2010). You are where you tweet: a content-based approach to geo-locating twitter users. In Proceedings of the 19th ACM international conference on Information and knowledge management (pp. 759–768). ACM.

  • Cheong, F., & Cheong, C. (2011). Social media data mining: a social network analysis of tweets during the 2010-2011 Australian floods. PACIS, 11, 46–46.

  • Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20.3, 273–297.

  • Cresci, S. et al. (2015a). Crisis mapping during natural disasters via text analysis of social media messages. In International conference on Web information systems engineering–WISE 2015 (pp. 250–258). Springer.

  • Cresci, S. et al. (2015b). A linguistically-driven approach to cross-event damage assessment of natural disasters from social media messages. In Proceedings of the 24th international conference on World Wide Web companion (pp. 1195–1200). International World Wide Web Conferences Steering Committee.

  • Cresci, S. et al. (2017). Nowcasting of earthquake consequences using big social data. IEEE Internet Computing, 21.6, 37–45.

  • Dashti, S. et al. (2014). Supporting disaster reconnaissance with social media data: a design-oriented case study of the 2013 Colorado floods. In ISCRAM.

  • Dewan, P. et al. (2017). Towards understanding crisis events on online social networks through pictures. In Proc. of the IEEE/ACM international conference on advances in social networks analysis and mining (ASONAM). ACM.

  • de Oliveira, M.G. et al. (2015). Producing volunteered geographic information from social media for LBSN improvement. Journal of Information and Data Management, 6.1, 81.

  • Earle, P.S., Bowden, D. C., Guy, M. (2012). Twitter earthquake detection: earthquake monitoring in a social world. Annals of Geophysics, 54, 6.

  • Ferragina, P., & Scaiella, U. (2010). Tagme: on-the-fly annotation of short text fragments (by wikipedia entities). In Proceedings of the 19th ACM international conference on Information and knowledge management (pp. 1625–1628). ACM.

  • Gao, H., Barbier, G., Goolsby, R. (2011). Harnessing the crowdsourcing power of social media for disaster relief. IEEE Intelligent Systems, 26.3, 10–14.

  • Gelernter, J., & Balaji, S. (2013). An algorithm for local geoparsing of microtext. GeoInformatica, 17.4, 635–667.

  • Gelernter, J., & Mushegian, N. (2011). Geoparsing messages from microtext. Transactions in GIS, 15.6, 753–773.

  • Goolsby, R. (2010). Social media as crisis platform: the future of community maps/crisis maps. ACM Transactions on Intelligent Systems and Technology (TIST), 1.1, 7.

  • Gupta, A. et al. (2013a). Faking Sandy: characterizing and identifying fake images on twitter during hurricane Sandy. In Proceedings of the 22nd international conference on World Wide Web. WWW ’13 Companion (pp. 729–736). ACM.

  • Gupta, A., Lamba, H., Kumaraguru, P. (2013b). $1.00 per RT #BostonMarathon #PrayForBoston: Analyzing fake content on Twitter. In 2013 APWG eCrime researchers summit (pp. 1–12).

  • Guy, M. et al. (2014). Social media based earthquake detection and characterization. In KDD-LESI 2014: Proceedings of the 1st KDD workshop on learning about emergencies from social information at KDD14 (pp. 9–10).

  • Imran, M. et al. (2013). Extracting information nuggets from disaster-related messages in social media. In Proceedings of the 10th international ISCRAM conference (pp. 791–801).

  • Imran, M. et al. (2015). Processing social media messages in mass emergency: a survey. ACM Computing Surveys, 47.4, 67.

  • Järvelin, K., & Kekäläinen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20.4, 422–446.

  • Kropivnitskaya, Y. et al. (2017). The predictive relationship between earthquake intensity and tweets rate for real-time ground-motion estimation. In Seismological research letters.

  • Kryvasheyeu, Y. et al. (2016). Rapid assessment of disaster damage using social media activity. Science Advances, 2.3, e1500779.

  • Lagerstrom, R. et al. (2016). Image classification to support emergency situation awareness. Frontiers in Robotics and AI, 3, 54.

  • Le, Q.V., & Mikolov, T. (2014). Distributed representations of sentences and documents. In Proceedings of the 31st international conference on machine learning (ICML 2014) (pp. 1188–1196).

  • Lewis, G. (2007). Evaluating the use of a low-cost unmanned aerial vehicle platform in acquiring digital imagery for emergency response. In Geomatics solutions for disaster management (pp. 117–133). Springer.

  • Liang, Y., Caverlee, J., Mander, J. (2013). Text vs. images: on the viability of social media to assess earthquake damage. In Proceedings of the 22nd international conference on World Wide Web companion (pp. 1003–1006). International World Wide Web Conferences Steering Committee.

  • Meier, P. (2012). Crisis mapping in action: how open source software and global volunteer networks are changing the world, one map at a time. Journal of Map & Geography Libraries, 8.2, 89–100.

  • Middleton, S. E., Middleton, L., Modafferi, S. (2014). Real-time crisis mapping of natural disasters using social media. IEEE Intelligent Systems, 29.2, 9–17.

  • Mikolov, T. et al. (2013). Distributed representations of words and phrases and their compositionality. In Burges, C. J. C. et al. (Eds.) Advances in neural information processing systems (Vol. 26, pp. 3111–3119). Curran Associates, Inc.

  • Pablo, N. et al. (2011). DBpedia spotlight: shedding light on the web of documents. In Proceedings of the 7th international conference on semantic systems (pp. 1–8). ACM.

  • Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22.10, 1345–1359.

  • Sakaki, T., Okazaki, M., Matsuo, Y. (2013). Tweet analysis for real-time event detection and earthquake reporting system development. IEEE Transactions on Knowledge and Data Engineering, 25.4, 919–931.

  • Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34.1, 1–47.

  • Tassiulas, L., & Ephremides, A. (1992). Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks. IEEE Transactions on Automatic Control, 37.12, 1936–1948.

  • Trani, S. et al. (2014). Dexter 2.0: an open source tool for semantically enriching data. In Proceedings of the 2014 international conference on semantic web (Posters & Demos) (pp. 417–420). Springer.

  • Usbeck, R. et al. (2015). GERBIL: general entity annotator benchmarking framework. In Proceedings of the 24th international conference on World Wide Web (pp. 1133–1143). ACM.

  • Verma, S. et al. (2011). Natural language processing to the rescue? Extracting situational awareness tweets during mass emergency. In Proceedings of the 5th international AAAI conference on web and social media (ICWSM). AAAI.

  • Vieweg, S., & Hodges, A. (2014). Rethinking context: Leveraging human and machine computation in disaster response. Computer, 47.4, 22–27.

  • Wang, L., & Kant, K. (2014). Special issue on computational sustainability. IEEE Transactions on Emerging Topics in Computing, 2.2, 119–121.

  • Weber, I., & Garimella, V. R. K. (2014). Visualizing user-defined, discriminative geo-temporal Twitter activity. In ICWSM.

Acknowledgments

This research is supported in part by the EU H2020 Program under the scheme INFRAIA-1-2014-2015: Research Infrastructures grant agreement #654024 SoBigData: Social Mining & Big Data Ecosystem, and by the MIUR (Ministero dell’Istruzione, dell’Università e della Ricerca) and Regione Toscana (Tuscany, Italy) funding the SmartNews: Social sensing for Breaking News project: PAR-FAS 2007-2013.

Author information

Corresponding author

Correspondence to Stefano Cresci.

About this article

Cite this article

Avvenuti, M., Cresci, S., Del Vigna, F. et al. CrisMap: a Big Data Crisis Mapping System Based on Damage Detection and Geoparsing. Inf Syst Front 20, 993–1011 (2018). https://doi.org/10.1007/s10796-018-9833-z

  • DOI: https://doi.org/10.1007/s10796-018-9833-z
