Location reference identification from tweets during emergencies: A deep learning approach
Introduction
Tweets are highly responsive to real-world events and are sometimes more immediate than traditional news channels, so it is possible to keep track of the latest information by following tweets. Several news stories were first reported on Twitter, such as an airplane crash over the Hudson River in New York in 2009 [70], the death of former British Prime Minister Margaret Thatcher in April 2013, and the explosions at the Boston Marathon in 2013. In recent years, Twitter has been used extensively during natural and human-made disasters such as earthquakes, floods, fires, terrorist attacks, and civil unrest [3], [38], [40], [39], [48], [50], [51], [70], [74], [82]. Government and non-government agencies use Twitter during crises to set rescue operations in motion, disseminate information to a wider audience, and understand the ground reality [27], [26], [38], [40], [39], [69], [70], [86]. In an American Red Cross survey, individuals were asked whom they contacted in an emergency. Twenty-eight percent of Americans turned to Twitter for help if they were unable to reach the emergency contact number (911). Twitter is also used in real-time road traffic monitoring [18], [21], event localization [16], [64], and various location-based services [25], [80]. Estimating and detecting the location of events and users from tweets is a major concern for all of the above-mentioned tasks.
Twitter provides three location information fields for sharing a user's location: (1) user location; (2) place name; and (3) geo-coordinates. The user location field allows 140 characters (previously it was limited to 30 characters), in which users can write their home location when creating their profile. This field is optional, and users can write arbitrary words or leave it blank. In many instances, they write meaningless words that do not refer to any location name. Hecht et al. [19] found that 34% of users do not reveal their "user location" information. Cheng et al. [7] found that only 26% of users give city-level or finer location names in their user location field. Moreover, this field cannot be treated as the current location of the user, because it is entered when the profile is created and is rarely updated afterwards. The second field is the "place name," which can be attached to a tweet when it is posted. A place name is represented by a location name together with an array of latitude-longitude pairs that give the location's boundary coordinates. These place names are predefined in the Twitter database, but they do not provide granular location information. Kumar et al. [36] found that only 47.33% of tweets contain place names, and that 12% of those place names are incorrect in terms of their spatiotemporal information. The third field is the "geo-coordinates" (geographical footprints of latitude and longitude), which can be attached when a tweet is posted from a GPS- (Global Positioning System) enabled device. Most researchers [23], [57], [83] consider geo-coordinates, i.e., the latitude-longitude information associated with a tweet, to be the most explicit and precise location information. However, tweets with geo-coordinate information are infrequent. Cheng et al. [7], Morstatter et al. [55], and Kumar et al. [36] determined that only 0.42%, 3.17%, and 7.90% of tweets, respectively, are geo-tagged. Kumar et al. [36] further reported that although geo-coordinates are the most precise location information, they are not always authentic in terms of their spatiotemporal information if the tweet is posted from a third-party application such as Instagram. Hence, all three location information fields available in tweets and user accounts have their own limitations and cannot be completely relied on.
Along with the location fields mentioned above, people also make location references in their tweet texts when asking for help or reporting a disaster event. People in a disaster-affected area tend to include their location information in their tweet text [79]. The location information in tweet texts is vitally important, as it represents the location of an event or user during an emergency. Hence, the location information mentioned in the tweet text may be considered the most authentic source of geographic evidence in an emergency. The tweet text is a free-text field limited to 280 characters (previously 140 characters). Location information can be extracted from tweet texts using either the gazetteer-based approach [30], [41], [49], [53], [71], [84] or the Named Entity Recognition (NER) based approach [15], [16], [78]. A gazetteer is a corpus of location names (e.g., GeoNames). In the gazetteer-based approach, the words of a tweet are looked up in the gazetteer to find location names. However, this approach has some inherent problems: (i) gazetteers are not available for all regions; and (ii) a location name mentioned in the text may have a non-geographic meaning in context, e.g., the word "Reading" may refer to a town in England or may be used in another sense. A further problem is geo-ambiguity (distinct locations share the same name, e.g., Paris has 140 possibilities). The second approach is Named Entity Recognition (NER). An NER technique generally tokenizes the tweet and tags each word using language-specific part-of-speech tagging, then detects the groups of words that probably refer to named entities. This approach works well for well-written English sentences, but it does not work well for tweet texts, as they contain many grammatical mistakes, nonstandard abbreviations, and spelling mistakes [1], [15], [63], [85].
Temnikova et al. [77] extensively analyzed the readability of tweets during crises and made several recommendations for writing understandable tweets. In many cases, English language rules are violated, e.g., the first letters of proper nouns are usually not capitalized. The grammar is also incorrect in many cases, e.g., prepositions are missing. Further, most users do not use correct spelling in their tweets, and they often shorten words by removing the vowels. To resolve these problems and find location references, Lingad et al. [44] re-trained a Named Entity Recognition tool for the Twitter environment, while others [42], [46], [68] built their own Named Entity Recognition frameworks. Some other works combined the gazetteer and NER approaches to find named entities in tweets [14], [52].
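The gazetteer-based lookup described above, and the two ambiguity problems it runs into, can be illustrated with a minimal sketch. The gazetteer below is a hypothetical three-entry toy dictionary (not GeoNames data), and the function name `gazetteer_lookup` is an assumption made for illustration only.

```python
import re

# Hypothetical toy gazetteer: lowercase name -> candidate places.
# A real gazetteer is far larger; these entries are illustrative only.
TOY_GAZETTEER = {
    "reading": ["Reading, England"],
    "paris": ["Paris, France", "Paris, Texas, USA"],
    "kathmandu": ["Kathmandu, Nepal"],
}

def gazetteer_lookup(tweet, gazetteer):
    """Return (token, candidates) pairs for every token found in the gazetteer."""
    tokens = re.findall(r"[a-z]+", tweet.lower())
    return [(tok, gazetteer[tok]) for tok in tokens if tok in gazetteer]

# "Reading" is used as a verb here, but a plain lookup still flags it.
matches = gazetteer_lookup(
    "Reading the news about the earthquake in Kathmandu", TOY_GAZETTEER
)

# "Paris" matches more than one place, so the lookup alone cannot pick one.
paris_candidates = gazetteer_lookup("Flights to Paris", TOY_GAZETTEER)[0][1]
```

In this sketch, `matches` contains a false positive for "reading" alongside the true location "kathmandu", and `paris_candidates` holds two entries, which mirrors the non-geographic-meaning and geo-ambiguity problems respectively.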
Most earlier NER-based approaches used POS tagging and extracted all named entities, such as person, product, group, corporation, and location names. In the current work, we concentrate on extracting location words and ignore other named entities. For this, instead of using POS tagging, we train a Convolutional Neural Network- (CNN) based system to extract the location names present in a tweet. We represent the tweet text as a normal sentence and highlight the words containing location information. We assume that a system already exists that filters tweets based on their relatedness to a particular event; several works have been reported on this [9], [27], [59], [61], [74]. Once a tweet is found to be related to the event, our model finds the location-referring words in that tweet. We frame this problem under the supervised learning paradigm: a dataset of tweets and their corresponding location words is built to train the system. Since the input is a sentence (the tweet text), we had several options for representing it, such as LDA [5], PLSA [22], and word embedding [65]. LDA and PLSA are generative statistical models that represent a document as a mixture of a small number of topics, and they are widely used for grouping tweets related to a specific event. Our aim, however, is to preserve the sentence structure so that each word position can be marked as a location word or not. This is why we prefer word embedding over techniques such as LDA or PLSA.
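The idea of marking each word position as a location word or not can be sketched as a fixed-length 0/1 label vector aligned with the tweet's tokens. This is only an illustration: whitespace tokenization, the padding length `max_len=10`, and the function name `label_tokens` are assumptions, not details taken from the paper.

```python
def label_tokens(tokens, location_words, max_len=10):
    """Build a fixed-length 0/1 vector: position i is 1 when token i is a location word."""
    locations = {word.lower() for word in location_words}
    # Truncate to max_len, then mark each remaining position.
    labels = [1 if tok.lower() in locations else 0 for tok in tokens[:max_len]]
    # Pad with zeros so every tweet yields a vector of the same length.
    return labels + [0] * (max_len - len(labels))

# Hypothetical example tweet and annotation.
tokens = "Strong earthquake felt near Kathmandu Nepal".split()
labels = label_tokens(tokens, ["Kathmandu", "Nepal"])

# A tweet "contains a location" when at least one position is marked.
contains_location = any(labels)
```

Because the vector keeps word order, a prediction at position 4 can be mapped straight back to the token "Kathmandu", which a topic-mixture representation would not allow.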
For the supervised learning model, there are several options, ranging from classical machine-learning models such as SVM, Naive Bayes, and Random Forest to deep-learning models such as the Recurrent Neural Network (RNN) and the Convolutional Neural Network (CNN). Classical machine-learning models require input features to learn to associate input with output, so their performance depends heavily on feature engineering. This is why we choose deep-learning models. RNNs are well suited to sequential or long-text data, whereas tweets are short, which favors the use of a CNN over an RNN. The intuition behind using a CNN is that the convolutional layer can automatically learn a better representation of the input data, and the dense layers can then use this representation to identify location references. Our objectives for the current work are: (i) to find whether a tweet contains a location name; and (ii) if location names are present in a tweet, to highlight those words.
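The convolution-then-dense intuition can be made concrete with a toy forward pass written in plain Python. This is a sketch under stated assumptions, not the paper's implementation: the sizes (`MAX_LEN`, `EMB_DIM`, `WIDTH`), the single filter, the per-position sigmoid output, and the 0.5 threshold are all hypothetical, and the weights are random and untrained, so the predictions carry no meaning.

```python
import math
import random

def conv1d_relu(seq, kernel, bias=0.0):
    """Slide a (width x dim) kernel over a (length x dim) sequence, then apply ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(seq) - width + 1):
        total = bias
        for k in range(width):
            total += sum(x * w for x, w in zip(seq[start + k], kernel[k]))
        out.append(max(0.0, total))
    return out

def dense_sigmoid(features, weights, biases):
    """Fully connected layer producing one sigmoid output per weight row."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(f * w for f, w in zip(features, row)) + b
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs

random.seed(0)
MAX_LEN, EMB_DIM, WIDTH = 6, 4, 3  # hypothetical sizes

# Stand-in for a tweet after word embedding: MAX_LEN vectors of EMB_DIM values.
embedded_tweet = [[random.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(MAX_LEN)]
kernel = [[random.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(WIDTH)]

# Convolution extracts one feature per window of WIDTH consecutive words.
features = conv1d_relu(embedded_tweet, kernel)

# Dense layer maps the features to one score per word position.
dense_w = [[random.uniform(-1, 1) for _ in features] for _ in range(MAX_LEN)]
probs = dense_sigmoid(features, dense_w, [0.0] * MAX_LEN)
predicted = [1 if p >= 0.5 else 0 for p in probs]
```

In a trained model the kernel and dense weights would be learned from labeled tweets; here they only show how data flows from embedded words to per-position location scores.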
The remainder of the paper is organized as follows. Section 2 briefly presents the related literature. Section 3 presents our proposed framework. Section 4 presents the findings of the proposed system. Section 5 discusses the results and the implications of the current research. Section 6 concludes the paper.
Related work
Recently, a number of works have been reported for better utilization of social media for emergency purposes [1], [3], [26], [63], [72], [85]. Olteanu et al. [62] investigated several natural hazards and human-induced disasters in a systematic way to better understand the effective use of social media for information gathering processes during emergencies. Most of the existing works focus on event detection and location estimation of events or users during emergencies. In event detection, some
Methodology
The proposed convolutional neural network-based model learns a continuous representation of tweets and then picks salient features from it to predict the location names present in the tweets. The proposed architecture has three parts: (i) a word embedding that represents tweets in vector form; (ii) a convolutional model that learns the salient features from the tweet representation; and (iii) a fully connected layer that interprets the extracted features to predict the output. The detailed
Result
We performed several experiments to evaluate our proposed model and extract location words from the earthquake-related tweets. To minimize the bias, we used 10-fold cross validation [35]. It is a technique to randomly partition the data sample into ten equal subsamples in which one subsample is used to validate the system, whereas the remaining nine subsamples are used to train the model. This process is repeated ten times, with each of the ten subsamples used just once as the validation data.
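The 10-fold procedure described above can be sketched in a few lines of standard-library Python. The sample count of 100, the seed, and the function name `k_fold_splits` are assumptions chosen for illustration; the paper does not specify them here.

```python
import random

def k_fold_splits(n_samples, k=10, seed=42):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # random partition of the data
    folds = [indices[i::k] for i in range(k)]  # k near-equal subsamples
    for i in range(k):
        validation = folds[i]
        # The other k - 1 folds together form the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# Hypothetical dataset of 100 tweets, referred to by index.
splits = list(k_fold_splits(100, k=10))
```

Each index appears in exactly one validation fold, so every sample is used for validation once and for training in the other nine rounds.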
Discussion and implications
In this work, we proposed a Convolutional Neural Network- (CNN) based model that learns the salient features from tweets to predict the location references mentioned in them. The proposed CNN-based model extracts location references from tweets with high accuracy. It can find location information at almost every granularity, such as street, building, city, district, and country names. The use of
Conclusion
Extracting location information from tweets is a challenging task, as tweets are noisy, containing grammatical mistakes, spelling mistakes, and non-standard abbreviations. We have proposed a convolutional neural network-based model for finding the location references present in tweets. We used earthquake-related tweets and ran our implementation with several configurations of convolutional layers combined with dense layers. We achieved our best result with an F1-score of 0.96
References (86)
- et al. From twitter to detector: real-time traffic incident detection using social media data. Transp. Res. Part C: Emerg. Technol. (2016)
- Theory of the backpropagation neural network. Neural Networks for Perception (1992)
- et al. Using tweets to support disaster planning, warning and response. Saf. Sci. (2016)
- et al. Event relatedness assessment of twitter messages for emergency response. Inf. Process. Manag. (2017)
- et al. Crisis information to support spatial planning in post disaster recovery. Int. J. Disaster Risk Reduct. (2017)
- et al. Feasibility study of using crowdsourcing to identify critical affected areas for rapid damage assessment: hurricane matthew case study. Int. J. Disaster Risk Reduct. (2018)
- et al. A survey of location inference techniques on twitter. J. Inf. Sci. (2015)
- H.S. Al-Olimat, K. Thirunarayan, V. Shalin, A. Sheth, Location name extraction from targeted text streams using...
- Social media in disaster risk reduction and crisis management. Sci. Eng. Ethics (2014)
- et al. A survey of techniques for event detection in twitter. Comput. Intell. (2015)