Location reference identification from tweets during emergencies: A deep learning approach

https://doi.org/10.1016/j.ijdrr.2018.10.021

Abstract

Twitter has recently been used during crises to communicate with officials and to support rescue and relief operations in real time. The geographical location of the event, as well as of the users, is vitally important in such scenarios. Identifying geographic location is challenging because the location information fields, such as the user location and the place name of tweets, are not reliable. Extracting location information from tweet text is difficult because it contains a lot of non-standard English, grammatical errors, spelling mistakes, non-standard abbreviations, and so on. This research aims to extract the location words used in a tweet using a Convolutional Neural Network (CNN) based model. We achieved an exact matching score of 0.929, a Hamming loss of 0.002, and an F1-score of 0.96 for tweets related to the earthquake. Our model was able to extract even three- to four-word long location references, which is also evident from the exact matching score of over 92%. The findings of this paper can help in early event localization, emergency situations, real-time road traffic management, localized advertisement, and various location-based services.

Introduction

Tweets are very responsive to real-world events, and are sometimes even more immediate than traditional news channels. Therefore, it is possible to keep track of the latest information by following tweets. In several cases, news was first reported on Twitter, such as an airplane crash over the Hudson River in New York in 2009 [70], the death of former British Prime Minister Margaret Thatcher in April 2013, and the explosions at the Boston Marathon in 2013. In recent years, Twitter has been used extensively in the course of natural and human-made disasters such as earthquakes, floods, fires, terrorist attacks, civil unrest, and so on [3], [38], [40], [39], [48], [50], [51], [70], [74], [82]. Government and non-government agencies use Twitter during crises so that rescue operations can leap into action, to disseminate information to a wider audience, and to understand the ground reality [27], [26], [38], [40], [39], [69], [70], [86]. In an American Red Cross survey, individuals were asked whom they contacted in an emergency. Twenty-eight percent of Americans turned to Twitter for help if they were unable to reach the emergency contact number (911). Twitter is also used in real-time road traffic monitoring [18], [21], event localization [16], [64], and various location-based services [25], [80]. Estimating and detecting the location of events and users from tweets is a major concern for all of the above-mentioned tasks.

Twitter provides three location information fields for sharing a user's location: (1) user location; (2) place name; and (3) geo-coordinates. The user location field has 140 character spaces (previously it was limited to 30 characters) in which users can write their home location while creating their profile. This field is optional, and the user can write arbitrary words or leave it blank. In many instances, users write meaningless words that do not refer to any location name. Hecht et al. [19] found that 34% of users do not reveal their “user location” information. Cheng et al. [7] found that only 26% of users use city-level or below-city-level location names in their user location field. Moreover, this field cannot be treated as the current location of the user, as it is entered at the time of profile creation and is often not updated regularly. The second field is the “place name,” which can be attached to a tweet when it is posted. The place name is represented by a location name together with an array of latitude-longitude pairs that give the location's boundary coordinates. These place names are predefined in the Twitter database, but they do not provide granular location information. Kumar et al. [36] found that only 47.33% of tweets contain place names, and that 12% of those place names are incorrect in terms of their spatiotemporal information. The third field is the “geo-coordinates” (geographical footprints of latitude and longitude), which can be attached at the time of posting a tweet from a GPS (Global Positioning System) enabled device. Most researchers [23], [57], [83] have considered geo-coordinates, i.e., tweets associated with latitude-longitude information, to be the most explicit and precise location information. However, tweets with geo-coordinate information are infrequent. Cheng et al. [7], Morstatter et al. [55], and Kumar et al. [36] determined that only 0.42%, 3.17%, and 7.90% of tweets, respectively, are geo-tagged. Kumar et al. [36] further reported that although geo-coordinates are the most precise location information, they are not always authentic in terms of their spatiotemporal information if the tweet is posted from third-party applications such as Instagram. Hence, all three location information fields, available in tweets and user accounts, have their own limitations and cannot be completely relied on.

Along with the location fields mentioned above, people also make location references in their tweet texts when asking for help or reporting a disaster event. It has been found that people from a disaster-related area tend to mention their location in their tweet text [79]. The location information available in tweet texts is vitally important as it represents the location of an event or user during emergencies. Hence, the location information mentioned in the tweet text may be considered the most authentic source of geographic evidence in an emergency. The tweet text is a free-text field limited to 280 characters (previously 140 characters). Location information can be extracted from tweet texts using either the gazetteer-based approach [30], [41], [49], [53], [71], [84] or the Named Entity Recognition (NER) based approach [15], [16], [78]. A gazetteer is a corpus of location names (e.g., GeoNames). In the gazetteer-based approach, the words of tweets are looked up in the gazetteer to find location names. However, there are some inherent problems with this approach: (i) gazetteers are not available for all regions; and (ii) a location name mentioned in the text may have a non-geographic meaning in context, e.g., the word “Reading” may refer to a location in England or may be used in another sense. Another problem with this approach is geo-ambiguity (distinct locations have the same name, e.g., Paris has 140 possibilities). The second approach is Named Entity Recognition (NER). The NER technique generally tokenizes the tweet and tags each word using language-specific part-of-speech tagging, then detects the groups of words that probably refer to named entities. This approach works well for well-written English sentences, but it does not work well for tweet texts as they contain many grammatical mistakes, non-standard abbreviations, and spelling mistakes [1], [15], [63], [85].
Temnikova et al. [77] did an extensive analysis of the readability of tweets during crises and made several recommendations for writing understandable tweets. In many cases, a number of English language rules are violated, e.g., the first letters of proper nouns are usually not capitalized. The grammar is also incorrect in many cases, e.g., prepositions are missing. Further, most users do not use correct spelling in their tweets; they often shorten words by removing the vowels. To resolve the aforementioned problems and find location references, researchers have made several efforts: Lingad et al. [44] re-trained a Named Entity Recognition tool for the Twitter environment, while others [42], [46], [68] built their own Named Entity Recognition frameworks. Some other works combined the gazetteer and NER approaches to find named entities in tweets [14], [52].
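The gazetteer lookup described above, and the ambiguity it suffers from, can be illustrated with a minimal sketch. The tiny `GAZETTEER` set, the regex tokenizer, and the function names are hypothetical stand-ins chosen for illustration; they are not GeoNames data and not the method of any cited work.

```python
import re

# Toy gazetteer: a hypothetical stand-in for a real resource such as GeoNames.
GAZETTEER = {"reading", "paris", "new york", "nepal"}
MAX_NGRAM = 2  # longest entry in the toy gazetteer, in words


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def gazetteer_matches(text):
    """Return (start_index, phrase) for every n-gram found in the gazetteer.

    Longer n-grams are tried first so "new york" is matched as one phrase.
    """
    tokens = tokenize(text)
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(MAX_NGRAM, 0, -1):
            window = tokens[i:i + n]
            if len(window) == n and " ".join(window) in GAZETTEER:
                matches.append((i, " ".join(window)))
                i += n
                break
        else:
            i += 1  # no gazetteer entry starts at this token
    return matches


print(gazetteer_matches("Flooding reported near Reading station"))
print(gazetteer_matches("I am reading about the quake in Nepal"))
```

In the second call the verb "reading" is matched exactly like the town, which is the non-geographic-meaning problem noted above: a plain lookup has no access to context.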

In most of the earlier work, NER-based approaches used POS tagging and extracted all named entities such as person name, product, group, corporation, location, etc. In the current work, we concentrate on the extraction of location words and ignore other named entities. For this, instead of using POS tagging, we train a Convolutional Neural Network (CNN) based system to extract the location names present in the tweet. We represent the tweet text as a normal sentence and highlight the words containing location information. We assume that there is already a system that filters tweets based on their relatedness to a particular event; several works have been reported on this [9], [27], [59], [61], [74]. Once the tweets are found to be related to the event, our model finds the location-referring words in each tweet. We treat this problem under the supervised learning paradigm: a dataset of tweets and their corresponding location words is built to train the system. Since the input is a sentence (tweet text), we had several options for representing it, such as LDA [5], PLSA [22], and word embedding [65]. LDA and PLSA are generative statistical models that represent a document as a mixture of a small number of topics. They are widely used for grouping tweets related to a specific event. Our target is to preserve the sentence structure so that each word position can be marked as a location word or not. This is why we prefer word embedding over other techniques such as LDA or PLSA.
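The idea of marking each word position as a location word or not can be sketched as follows. This is only an illustration under stated assumptions: whitespace tokenization, a fixed length `MAX_LEN`, the example tweet, and the helper names `encode_labels` and `highlight` are all hypothetical and are not taken from the paper.

```python
MAX_LEN = 12  # assumed fixed tweet length in words (hypothetical value)


def encode_labels(tweet, location_words, max_len=MAX_LEN):
    """Return (tokens, labels): labels[i] is 1 if tokens[i] is a location word."""
    tokens = tweet.split()[:max_len]
    locations = {w.lower() for w in location_words}
    labels = [1 if tok.lower().strip(".,!?#") in locations else 0 for tok in tokens]
    # Pad so that every tweet yields a label vector of the same length.
    labels += [0] * (max_len - len(labels))
    return tokens, labels


def highlight(tokens, labels):
    """Wrap the words marked as locations in square brackets."""
    return " ".join(f"[{tok}]" if lab else tok for tok, lab in zip(tokens, labels))


tokens, labels = encode_labels(
    "Strong earthquake felt in Kathmandu Nepal send help",
    ["Kathmandu", "Nepal"],
)
print(labels)
print(highlight(tokens, labels))
```

Because the label vector is aligned with word positions, a representation that keeps word order (such as a sequence of word embeddings) is needed; a topic mixture would discard the positions that the labels refer to.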

For the supervised learning model, there are several options, ranging from traditional machine-learning models, such as SVM, Naive Bayes, and Random Forest, to deep-learning models, such as the Recurrent Neural Network (RNN) and the Convolutional Neural Network (CNN). Traditional machine-learning models require hand-crafted features to learn to associate input with output, so their performance depends heavily on feature engineering. This is why we choose deep learning models. RNNs are good for sequential or long-text data. Tweets have short sentences, which favor the use of CNN over RNN. The intuition behind using a CNN is that the convolutional layer can automatically learn a better representation of the input data, and the dense layers can then use this representation to identify location references. Our objectives for the current work are: (i) to find whether a tweet contains a location name; and (ii) if location names are present in a tweet, to highlight those words.

The remainder of the paper is organized as follows: in Section 2, we briefly present the related literature. Our proposed framework is presented in Section 3. The findings of the proposed system are presented in Section 4. Section 5 discusses the results and the implications of the current research. We conclude the paper in Section 6.

Section snippets

Related work

Recently, a number of works have been reported for better utilization of social media for emergency purposes [1], [3], [26], [63], [72], [85]. Olteanu et al. [62] investigated several natural hazards and human-induced disasters in a systematic way to better understand the effective use of social media for information gathering processes during emergencies. Most of the existing works focus on event detection and location estimation of events or users during emergencies. In event detection, some

Methodology

The proposed convolutional neural network-based model learns a continuous representation of tweets and then picks salient features from it to predict the location names present in the tweets. The proposed architecture has three parts: (i) a word embedding that represents tweets in vector form; (ii) a convolutional model that learns the salient features from the tweet representation; and (iii) a fully connected layer that interprets the extracted features to predict the output. The detailed
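The three parts listed above (embedding, convolution, fully connected layer) can be sketched as a forward pass in plain Python. This is a rough illustration, not the authors' implementation: the layer sizes, the random untrained weights, the use of ReLU and global max pooling, and the choice of one sigmoid output per word position are all assumptions made here, since the full methodology is not shown in this excerpt.

```python
import math
import random

rng = random.Random(0)
MAX_LEN, EMB_DIM, N_FILTERS, KERNEL = 8, 4, 3, 2  # hypothetical sizes


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def embed(tokens, table):
    """(i) Map each token to a vector; pad with zero vectors up to MAX_LEN."""
    vecs = []
    for tok in tokens[:MAX_LEN]:
        if tok not in table:
            table[tok] = [rng.uniform(-0.5, 0.5) for _ in range(EMB_DIM)]
        vecs.append(table[tok])
    return vecs + [[0.0] * EMB_DIM for _ in range(MAX_LEN - len(vecs))]


def conv_relu_maxpool(seq, filters):
    """(ii) Slide each filter over KERNEL consecutive words, apply ReLU,
    and keep the maximum response per filter (global max pooling)."""
    pooled = []
    for f in filters:  # f is a flat weight vector of length KERNEL * EMB_DIM
        responses = []
        for start in range(len(seq) - KERNEL + 1):
            window = [x for vec in seq[start:start + KERNEL] for x in vec]
            responses.append(max(0.0, sum(w * x for w, x in zip(f, window))))
        pooled.append(max(responses))
    return pooled


def dense_sigmoid(features, weights, bias):
    """(iii) Fully connected layer with one sigmoid output per word position."""
    outputs = []
    for row, b in zip(weights, bias):
        z = sum(w * x for w, x in zip(row, features)) + b
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


table = {}
filters = rand_matrix(N_FILTERS, KERNEL * EMB_DIM)
dense_w = rand_matrix(MAX_LEN, N_FILTERS)
dense_b = [0.0] * MAX_LEN

tokens = "earthquake felt in kathmandu".split()
features = conv_relu_maxpool(embed(tokens, table), filters)
probs = dense_sigmoid(features, dense_w, dense_b)
print([round(p, 3) for p in probs])
```

With untrained weights the printed probabilities carry no meaning; the sketch only shows how data would flow through the three stages. In practice the weights would be learned from labelled tweets, and positions whose output exceeds a threshold would be reported as location words.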

Result

We performed several experiments to evaluate our proposed model and extract location words from the earthquake-related tweets. To minimize the bias, we used 10-fold cross validation [35]. It is a technique to randomly partition the data sample into ten equal subsamples in which one subsample is used to validate the system, whereas the remaining nine subsamples are used to train the model. This process is repeated ten times, with each of the ten subsamples used just once as the validation data.
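The 10-fold procedure described above can be sketched in a few lines. The function name `k_fold_indices`, the fixed seed, and the sample count of 25 are hypothetical choices for illustration; the dataset size and splitting code used in the paper are not given in this excerpt.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random partition of the samples
    # Strided slicing gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


splits = list(k_fold_indices(25, k=10))
for fold_no, (train, val) in enumerate(splits, start=1):
    print(fold_no, len(train), len(val))
```

Each sample lands in the validation set exactly once, so the reported score is an average over ten models, each evaluated on data it did not see during training.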

Discussion and implications

In this work, we proposed a Convolutional Neural Network (CNN) based model that learns the salient features from tweets to predict the location references mentioned in them. The proposed CNN-based model could extract the location references from the tweets with high accuracy. In our case, the model can find location information at almost every granularity, such as street, building, city, district, and country names. The use of

Conclusion

The extraction of location information from tweets is a challenging task as tweets are noisy, containing grammatical mistakes, spelling mistakes, and non-standard abbreviations. We have proposed a convolutional neural network-based model for finding the location references present in tweets. We used earthquake-related tweets and evaluated several configurations of convolutional layers combined with dense layers. We achieved our best result with an F1-score of 0.96
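The three scores reported in this paper (exact matching score, Hamming loss, and F1-score) can be computed for per-word binary labels as sketched below. The toy label vectors are invented for illustration, and micro-averaging of the F1-score is an assumption, since this excerpt does not state which averaging the authors used.

```python
def exact_match_ratio(y_true, y_pred):
    """Fraction of tweets whose whole label vector is predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_loss(y_true, y_pred):
    """Fraction of individual word labels that are predicted incorrectly."""
    wrong = sum(a != b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    return wrong / sum(len(t) for t in y_true)


def micro_f1(y_true, y_pred):
    """F1 over all word labels pooled together (micro-averaging, an assumption)."""
    pairs = [(a, b) for t, p in zip(y_true, y_pred) for a, b in zip(t, p)]
    tp = sum(1 for a, b in pairs if a == 1 and b == 1)
    fp = sum(1 for a, b in pairs if a == 0 and b == 1)
    fn = sum(1 for a, b in pairs if a == 1 and b == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels for three 4-word tweets (1 = location word).
y_true = [[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
y_pred = [[0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 0]]

print(exact_match_ratio(y_true, y_pred))
print(hamming_loss(y_true, y_pred))
print(micro_f1(y_true, y_pred))
```

The exact matching score is the strictest of the three: a single wrong word makes the whole tweet count as a miss, which is why it is a useful indicator for multi-word location references.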

References (86)

  • D.M. Blei et al., Latent dirichlet allocation, J. Mach. Learn. Res. (2003)
  • F. Charte et al., Working with multilabel datasets in r: the mldr package, R. J. (2015)
  • Z. Cheng, J. Caverlee, K. Lee, You are where you tweet: a content-based approach to geo-locating twitter users,...
  • J.P. Chiu, E. Nichols, Named entity recognition with bidirectional lstm-cnns, arXiv:1511.08308,...
  • S.R. Chowdhury et al., Tweet4act: using incident-specific profiles for classifying crisis-related messages, ISCRAM Citeseer (2013)
  • R. Collobert et al., Natural language processing (almost) from scratch, J. Mach. Learn. Res. (2011)
  • P. Däniken, M. Cieliebak, Transfer learning and sentence level features for named entity recognition on tweets,...
  • T.H. Do, D.M. Nguyen, E. Tsiligianni, B. Cornelis, N. Deligiannis, Multiview deep learning for predicting twitter users...
  • R. Dutt, K. Hiware, A. Ghosh, R. Bhaskaran, Savitr: a system for real-time location extraction from microblogs during...
  • J. Gelernter et al., An algorithm for local geoparsing of microtext, GeoInformatica (2013)
  • J. Gelernter et al., Geo-parsing messages from microtext, Trans. GIS (2011)
  • P. Giridhar, T. Abdelzaher, J. George, L. Kaplan, On quality of event localization from social network feeds, in: IEEE...
  • Y. Goldberg, A primer on neural network models for natural language processing, J. Artif. Intell. Res. (JAIR) (2016)
  • B. Hecht, L. Hong, B. Suh, E.H. Chi, Tweets from justin bieber’s heart: the dynamics of the location field in user...
  • T. Hoang, P.H. Cher, P.K. Prasetyo, E.-P. Lim, Crowdsensing and analyzing micro-event tweets for public transportation...
  • T. Hofmann, Probabilistic latent semantic analysis, in: Proceedings of the Fifteenth Conference on Uncertainty in...
  • Q. Huang, G. Cao, C. Wang, From where do tweets originate?: a GIS approach for user location inference, in: Proceedings...
  • Z. Huang, W. Xu, K. Yu, Bidirectional lstm-crf models for sequence tagging, arXiv:1508.01991,...
  • Y. Ikawa, M. Enoki, M. Tatsubori, Location inference using microblog messages, in: Proceedings of the 21st...
  • M. Imran et al., Processing social media messages in mass emergency: a survey, ACM Comput. Surv. (CSUR) (2015)
  • M. Imran, C. Castillo, J. Lucas, P. Meier, J. Rogstadius, Coordinating human and machine intelligence to classify...
  • M. Imran, C. Castillo, J. Lucas, P. Meier, S. Vieweg, Aidr: artificial intelligence for disaster response,...
  • M. Imran, S. Elbassuoni, C. Castillo, F. Diaz, P. Meier, Extracting information nuggets from disaster-related messages...
  • M. Itoh, N. Yoshinaga, M. Toyoda, Spatio-temporal event visualization from a geo-parsed microblog stream, in: Companion...
  • D. Jurgens et al., Geolocation prediction in twitter using social networks: a critical analysis and review of current practice, ICWSM (2015)
  • N. Kalchbrenner, E. Grefenstette, P. Blunsom, A convolutional neural network for modelling sentences, arXiv:1404.2188,...
  • S. Karimi, J. Yin, Microtext annotation. Technical Report Technical Report EP13703, CSIRO,...
  • D.P. Kingma, J. Ba, Adam: a method for stochastic optimization, arXiv:1412.6980,...
  • R. Kohavi, et al. A study of cross-validation and bootstrap for accuracy estimation and model selection, Ijcai,...
  • A. Kumar, J.P. Singh, N.P. Rana, Authenticity of geo-location and place name in tweets, in: Proceedings of the 23rd...
  • G. Lample, M. Ballesteros, S. Subramanian, K. Kawakami, C. Dyer, Neural architectures for named entity recognition,...
  • F. Laylavi et al., A multi-element approach to location inference of twitter: a case for emergency response, ISPRS Int. J. Geo-Inf. (2016)
  • C. Li, A. Sun, Fine-grained location extraction from tweets with temporal awareness, in: Proceedings of the 37th...