Common statistical patterns in urban terrorism

The underlying reasons behind modern terrorism are seemingly complex and intangible. Despite diverse causal mechanisms, research has shown that there exist general statistical patterns at the global scale that can shed light on human confrontation behaviour. While many policing and counter-terrorism operations are conducted at a city level, there has been a lack of research in building city-level resolution prediction engines based on statistical patterns. For the first time, the paper shows that there exist general commonalities between global cities under frequent terrorist attacks. By examining over 30 000 geo-tagged terrorism acts over 7000 cities worldwide from 2002 to today, the results show the following. All cities experience attacks A that are uncorrelated with the population and separated by a time interval t that is negative exponentially distributed, with a death-toll per attack that follows a power-law distribution. The prediction parameters yield a high confidence of explaining up to 87% of the variations in frequency and 89% in the death-toll data. These findings show that the aggregate statistical behaviour of terror attacks is seemingly random and memoryless for all global cities. They enabled the author to develop a data-driven city-specific prediction system and to quantify its information-theoretic uncertainty and information loss. Further analysis shows that there appears to be an increase in the uncertainty over the predictability of attacks, challenging our ability to develop effective counter-terrorism capabilities.
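
As an illustration of the two distributional claims in the abstract, the following minimal Python sketch fits an exponential rate to inter-attack intervals and a power-law exponent to death tolls. It is not the author's pipeline: the data are synthetic, the parameter values (`true_rate`, `true_alpha`, `x_min`) are assumptions chosen for illustration, and the estimators are the standard maximum-likelihood forms rather than anything confirmed by the manuscript.

```python
import math
import random


def fit_exponential_rate(intervals):
    """Maximum-likelihood rate of an exponential distribution (1 / sample mean)."""
    return len(intervals) / sum(intervals)


def fit_power_law_exponent(values, x_min):
    """Continuous power-law MLE for p(x) ~ x^(-alpha) on x >= x_min."""
    tail = [v for v in values if v >= x_min]
    return 1.0 + len(tail) / sum(math.log(v / x_min) for v in tail)


# Synthetic stand-ins for one city's record (assumed values, not GTD data).
rng = random.Random(0)
true_rate, true_alpha, x_min = 0.5, 2.5, 1.0
n = 20000

# Inter-attack intervals drawn from an exponential distribution.
intervals = [rng.expovariate(true_rate) for _ in range(n)]

# Death tolls drawn from a continuous power law by inverse-transform sampling.
death_tolls = [
    x_min * (1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0)) for _ in range(n)
]

rate_hat = fit_exponential_rate(intervals)
alpha_hat = fit_power_law_exponent(death_tolls, x_min)
print(f"estimated rate: {rate_hat:.3f}, estimated exponent: {alpha_hat:.3f}")
```

With real event data, the choice of `x_min` and the discreteness of death tolls would need separate treatment; the sketch ignores both.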


09-Jul-2019
Dear Dr Guo, The editors assigned to your paper ("Common Statistical Patterns in Urban Terrorism") have now received comments from reviewers. We would like you to revise your paper in accordance with the referee and Associate Editor suggestions which can be found below (not including confidential reports to the Editor). Please note this decision does not guarantee eventual acceptance.
Please submit a copy of your revised paper before 01-Aug-2019. Please note that the revision deadline will expire at 00.00am on this date. If we do not hear from you within this time then it will be assumed that the paper has been withdrawn. In exceptional circumstances, extensions may be possible if agreed with the Editorial Office in advance. We do not allow multiple rounds of revision so we urge you to make every effort to fully address all of the comments at this stage. If deemed necessary by the Editors, your manuscript will be sent back to one or more of the original reviewers for assessment. If the original reviewers are not available, we may invite new reviewers.
To revise your manuscript, log into http://mc.manuscriptcentral.com/rsos and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision. Revise your manuscript and upload a new version through your Author Centre.
When submitting your revised manuscript, you must respond to the comments made by the referees and upload a file "Response to Referees" in "Section 6 -File Upload". Please use this to document how you have responded to the comments, and the adjustments you have made. In order to expedite the processing of the revised manuscript, please be as specific as possible in your response.
In addition to addressing all of the reviewers' and editor's comments please also ensure that your revised manuscript contains the following sections as appropriate before the reference list:
• Ethics statement (if applicable) If your study uses humans or animals please include details of the ethical approval received, including the name of the committee that granted approval. For human studies please also detail whether informed consent was obtained. For field studies on animals please include details of all permissions, licences and/or approvals granted to carry out the fieldwork.
• Data accessibility It is a condition of publication that all supporting data are made available either as supplementary information or preferably in a suitable permanent repository. The data accessibility section should state where the article's supporting data can be accessed. This section should also include details, where possible, of where other relevant research materials such as statistical tools, protocols, software etc. can be accessed. If the data have been deposited in an external repository this section should list the database, accession number and link to the DOI for all data from the article that have been made publicly available. Data sets that have been deposited in an external repository and have a DOI should also be appropriately cited in the manuscript and included in the reference list.
If you wish to submit your supporting data or code to Dryad (http://datadryad.org/), or modify your current submission to Dryad, please use the following link: http://datadryad.org/submit?journalID=RSOS&manu=RSOS-190645
• Competing interests Please declare any financial or non-financial competing interests, or state that you have no competing interests.
• Authors' contributions All submissions, other than those with a single author, must include an Authors' Contributions section which individually lists the specific contribution of each author. The list of Authors should meet all of the following criteria; 1) substantial contributions to conception and design, or acquisition of data, or analysis and interpretation of data; 2) drafting the article or revising it critically for important intellectual content; and 3) final approval of the version to be published.
All contributors who do not meet all of these criteria should be included in the acknowledgements.
We suggest the following format: AB carried out the molecular lab work, participated in data analysis, carried out sequence alignments, participated in the design of the study and drafted the manuscript; CD carried out the statistical analyses; EF collected field data; GH conceived of the study, designed the study, coordinated the study and helped draft the manuscript. All authors gave final approval for publication.
• Acknowledgements Please acknowledge anyone who contributed to the study but did not meet the authorship criteria.
• Funding statement Please list the source of funding for each author.
Once again, thank you for submitting your manuscript to Royal Society Open Science and I look forward to receiving your revision. If you have any questions at all, please do not hesitate to get in touch.

Comments to the Author(s)
I have read the manuscript RSOS-190645 `Common Statistical Patterns in Urban Terrorism' with much interest and it presents important insights to the relationship between the interval and intensity of terrorist attacks. However, there are some points that, from my perspective, need to be addressed before publication.
1) Framing: The framing of the paper could highlight the main contribution more clearly. My reading of the paper is that the findings in regard to the interval between attacks are most central and that the intensity findings (Results a page 3) are more of a side note.
2) Literature: There is an early literature on the linkage between severity and duration of conflict that might be relevant for this manuscript to consider. Weiss, 1993; Klingberg, 1966; Voevodsky, 1969.
3) GTD database. There probably needs to be a short discussion about how terrorism is defined (especially because the scientific and political use of the term can differ) in the context of the GTD database and the potential overlap with events that one could also consider asymmetric warfare (especially events in Iraq and Afghanistan). Also a short note on why the whole GTD is used and not a particular subset (e.g. target type or attack type) would be helpful.
4) Page 6 last paragraph. The manuscript claims that `the distribution given the attack intensity is robust across different urban scales and climates'. It is not clear how this robustness is established.
5) Page 7 first paragraph. The paragraph states that sequential attacks in each city are unrelated and that a `possible reason is that each terrorist attack depends on a large number of variables (i.e., organization, logistics, finance, personal, evading detection, and opportunity)'. This raises a question about the general modeling approach. Couldn't the relationship between number of attacks and time interval be modeled controlling for a number of factors?
6) The motivation for using the Kullback-Leibler Divergence to assess model fit/performance is not clear. How is this better than just comparing expected time intervals to true intervals (e.g. RMSE). In addition, it was not clear whether the predictions are assessed within the training sample or whether they were assessed on a test sample. It would be recommendable to assess predictions on a test-sample.
7) The motivation to use the spectrogram analysis could be made clearer on page 3 sixth paragraph. The main finding is that the growth in death-tolls from terrorist attacks is `due to an increase in the number of slower and bigger casualty attacks'. It would be good to explore or at least discuss to which extent this might be due to coding bias in GTD (see Iraq/Afghanistan and temporal trends in the data.).

Recommendation?
Accept as is

Comments to the Author(s)
The revisions sufficiently address my fundamental concerns about engaging with the existing scholarship on terrorism. The theoretical thrust of the paper is now much clearer and can offer citeable findings for the relevant sub-disciplines in terrorism research. Furthermore, the authors have clarified their findings on the memoryless aspect of terrorism and report to what degree spatial and temporal dynamics can and cannot explain the occurrence of terrorist attacks. In addition, these authors are now well-defended against possible attacks from the social science terrorism research fields on 'reinventing the wheel', because they address the most core works in this field. I find this study novel in its methods and empirical testing of existing theories of terrorism and believe that the study, in its current form, is a rigorous example of how computer science can contribute to our understanding of violence and how social scientists can use computational tools to harness violence event data.
I recommend acceptance in its current form.

Comments to the Author(s)
Thank you for addressing my concerns effectively.

16-Aug-2019
Dear Dr Guo, I am pleased to inform you that your manuscript entitled "Common Statistical Patterns in Urban Terrorism" is now accepted for publication in Royal Society Open Science.
You can expect to receive a proof of your article in the near future. Please contact the editorial office (openscience_proofs@royalsociety.org and openscience@royalsociety.org) to let us know if you are likely to be away from e-mail contact --if you are going to be away, please nominate a coauthor (if available) to manage the proofing process, and ensure they are copied into your email to the journal.
Due to rapid publication and an extremely tight schedule, if comments are not received, your paper may experience a delay in publication.
Royal Society Open Science operates under a continuous publication model (http://bit.ly/cpFAQ). Your article will be published straight into the next open issue and this will be the final version of the paper. As such, it can be cited immediately by other researchers. As the issue version of your paper will be the only version to be published I would advise you to check your proofs thoroughly as changes cannot be made once the paper is published.
This paper aims to tackle an important question: what drives spatial and temporal dynamics of terrorist attacks. In doing so, the study relies on Global Terrorism Database (GTD) through 2002-14, by focusing on 30,000 geo-tagged terrorism acts over 7000 cities worldwide. I'm happy to see that hard sciences scholars are developing interest in what is primarily and essentially a social sciences research question. The technical aspect of the paper is sound, but my comments are mostly directed towards how this paper can contribute to terrorism research by engaging more with the terrorism scholarship and by better contextualizing its results.
1. The paper will benefit a lot from engaging newer studies on the role of geography on terrorism to avoid re-inventing the wheel on how spatio-temporal dynamics affect conflict.
Some good examples that both offer significant results and also provide excellent literature reviews for the author's benefit are: "Early detection of terrorism outbreaks using prospective space-time scan statistics." The Professional Geographer 65, no. 4 (2013): 676-691.
2. The theoretical thrust of the paper is that there are tangible spatial determinants of terrorism at the city-level, and that 'terror attacks are seemingly random and memoryless for all global cities.' And that 'there appears to be an increase in the uncertainty over the predictability of attacks, challenging our ability to develop effective counter-terrorism capabilities.' Both of these findings are already well-evidenced in the existing terrorism literature. The paper has to make a more convincing case to argue that the spatial analysis conducted here goes beyond our existing limitations on predicting terrorist attacks in the cities. Mainly, by engaging with the scholarship above (and other relevant ones they cite), the study has to offer a truly new finding so that terrorism researchers can benefit and most importantly, cite.
3. The paper reports 'a growing uncertainty hidden in the complex process'. However, if the study will report the same uncertainty that other past studies did, it has to answer what  -9/11 (2002-14) terrorism patterns alone will provide us with overwhelming Islamist terrorism data. In contrast, prior to the Cold War, ideological drivers of terrorism used to be more nuanced. The paper must acknowledge that looking at post-9/11 will not provide us with a 'general theory of terrorism', but a temporally specific and contextual theory on Islamic terrorism alone. This is fine, because data became more available as US government agencies began funding dedicated research and data initiatives (like GTD) in the aftermath of 9-11, but of course, this temporal selection can't be generalized. The study has to clarify this critical point by perhaps changing the title and the abstract to let readers know that it is engaging overwhelmingly with Islamist terrorism. This is a very crucial distinction, because the study doesn't use the data on IRA, ETA and other European-American terrorist groups active through the Cold War.
6. 'The interesting observation is that most previous studies have considered low resolution conflicts (major wars) that span over 100 years, and it seems that the power law distribution remains valid even for high resolution terrorism and non-conventional conflict data in the modern era.' -This sentence is unclear. Does the author mean 'low-resolution conflict' to imply less granular data at the national-level? If so, sub-national data are in fact pretty much on the rise in terrorism studies; most newer studies combine national-level with sub-national (regional) data. This is not a valid criticism of the studies in the field in the last several years.
7. When and where terrorists attack are very context-specific behavioral types. Terrorist behavior in Iraq diverges from those in Syria, versus Ukraine, and others do so because such behavior is non-random. These are driven by intra-organizational dynamics (leadership challenge, preventing splits and bolster group identity), external resilience (capacity of state security agencies, terrain, social cohesion and social support for terrorist groups) and ideology of the group (religious, ethno-nationalist etc.) Where and how frequent ISIS attacks in Iraq is different than its behavior in Syria. There are even further divergences between terrorism in urban areas of failed (failing) states, democracies and authoritarian countries. Trying to impose a general 'supra-level' explanation to these varieties will yield inconclusive results, and the study reports this inconclusivity itself. The study can escape this deadlock by cross-checking with the UCDP/PRIO data (or ACLED) and see whether non-terrorist sub-national violence has any direct correlation with terrorism. This will enable the author to distinguish between areas that are suffering BOTH from high intensity of conflict AND terrorism, and those where the two phenomena are not directly related (high violence, low terrorism or low violence, high terrorism).
8. I think the methodical sophistication of the paper should more directly and robustly interact with the terrorism studies literature, especially with those studies that explore how spatio-temporal dynamics interact with the more relevant and important HUMAN and SOCIAL drivers of terrorism. While I understand the author's models, I'm a social scientist and the models themselves must be reviewed by a reviewer from mathematics, statistics or econometrics fields. But revising the paper in line of above comments will make this a highly citable and useful study for the terrorism literature.

Response to Reviewer Comments
Response 5. Thank you for this useful comment. As you pointed out, our main motivation for post-9/11 data analysis is the more consistent quality of geo-tagged event data across the whole world. Because the data span the whole world, a significant portion of them is not "War on Terror" related (e.g. Colombia, the narco-war in Mexico, political violence in Thailand & India). We have clarified this point and hope the reviewer can agree (to an extent) that our analysis has some degree of generality to it, even if some of the top violence cities are in the War on Terror area.
Comment 6. 'The interesting observation is that most previous studies have considered low resolution conflicts (major wars) that span over 100 years, and it seems that the power law distribution remains valid even for high resolution terrorism and non-conventional conflict data in the modern era.' -This sentence is unclear. Does the author mean 'low-resolution conflict' to imply less granular data at the national-level? If so, sub-national data are in fact pretty much on the rise in terrorism studies; most newer studies combine national-level with sub-national (regional) data. This is not a valid criticism of the studies in the field in the last several years.
Please check: Urdal, Henrik. "Population, resources, and political violence: A subnational study of India, 1956-2002." Journal of Conflict Resolution 52, no. 4 (2008 19, no. 1 (1994): 5-40. Witmer, Frank DW, Andrew M. Linke, John O'Loughlin, Andrew Gettelman, and Arlene Laing. "Subnational violent conflict forecasts for sub-Saharan Africa, 2015-65, using climate-sensitive models." Journal of Peace Research 54, no. 2 (2017
Response 6. Dear reviewer, yes this is our claim. You are right that most analysis is now focused on the sub-national level (economic regions, political zones), but very few studies are city/town specific. This is largely because the confounding causal mechanisms of interest are often not available at city or settlement level. However, for this paper we are still interested in whether statistical laws hold and how a prediction algorithm can be developed for city governors. We have included your recommendations above and updated our literature review to reflect this.
Comment 7. When and where terrorists attack are very context-specific behavioral types. Terrorist behavior in Iraq diverges from those in Syria, versus Ukraine, and others do so because such behavior is non-random. These are driven by intraorganizational dynamics (leadership challenge, preventing splits and bolster group identity), external resilience (capacity of state security agencies, terrain, social cohesion and social support for terrorist groups) and ideology of the group (religious, ethno-nationalist etc.) Where and how frequent ISIS attacks in Iraq is different than its behavior in Syria. There are even further divergences between terrorism in urban areas of failed (failing) states, democracies and authoritarian countries. Trying to impose a general 'supra-level' explanation to these varieties will yield inconclusive results, and the study reports this inconclusivity itself. The study can escape this deadlock by cross-checking with the UCDP/PRIO data (or ACLED) and see whether non-terrorist sub-national violence has any direct correlation with terrorism. This will enable the author to distinguish between areas that are suffering BOTH from high intensity of conflict AND terrorism, and those where the two phenomena are not directly related (high violence, low terrorism or low violence, high terrorism).
Response 7. Dear reviewer, indeed I agree with your viewpoint. I think there is value in getting a supra-level understanding. We have shown that the attacks, even across different urban locations and conflict genres, belong to a common random pattern. This could make sense (as it has been shown for many other fields) if the factors that contribute to it are numerous and independently distributed. Take buses arriving in busy cities (which are never on time). This has been shown to follow the same memoryless distribution as terrorism, because the multitude of factors that affect it (traffic, driver behaviour, route, passenger behaviour) are all broadly independent. As a result, the compound effect is a random distribution that is common across all bus arrivals. We have tried to capture the value of our statistical approach in this paper by saying the following in the Introduction: Statistical analysis of complex processes, even across diverse genres and mechanisms, has value in data-driven prediction. It has been shown that many complex processes with a multitude of different causal factors can exhibit common statistical patterns that aid prediction, e.g. bus arrival time in busy urban areas.
As such, whilst I totally agree with you that the detailed mechanisms are important and distinguish violence across genres, having a statistical understanding of the overall pattern is also useful from a data-driven prediction framework perspective.
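
The memoryless property referred to in this response can be checked numerically. The short Python sketch below is an illustration on synthetic waiting times only (the rate and the thresholds `s` and `t` are assumed values, not figures from the manuscript): for exponentially distributed intervals, the probability of waiting a further `t` units does not depend on the `s` units already waited.

```python
import math
import random


def survival(samples, t):
    """Empirical P(T > t)."""
    return sum(1 for x in samples if x > t) / len(samples)


def conditional_survival(samples, s, t):
    """Empirical P(T > s + t | T > s)."""
    survivors = [x for x in samples if x > s]
    return sum(1 for x in survivors if x > s + t) / len(survivors)


rng = random.Random(1)
rate = 0.2  # assumed events per unit time
waits = [rng.expovariate(rate) for _ in range(100000)]

s, t = 5.0, 3.0
p_uncond = survival(waits, t)               # P(T > t)
p_cond = conditional_survival(waits, s, t)  # P(T > s + t | T > s)
p_theory = math.exp(-rate * t)              # exact value for an exponential

print(f"P(T>t)={p_uncond:.3f}  P(T>s+t|T>s)={p_cond:.3f}  exp(-rate*t)={p_theory:.3f}")
```

If the two empirical probabilities differed systematically in real data, that would indicate memory in the process, for example clustering of attacks.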
Comment 8. I think the methodical sophistication of the paper should more directly and robustly interact with the terrorism studies literature, especially with those studies that explore how spatio-temporal dynamics interact with the more relevant and important HUMAN and SOCIAL drivers of terrorism. While I understand the author's models, I'm a social scientist and the models themselves must be reviewed by a reviewer from mathematics, statistics or econometrics fields. But revising the paper in line of above comments will make this a highly citable and useful study for the terrorism literature.
Response 8. I believe Reviewer 2 is a physical science / computer science reviewer, and his/her comments have been addressed below. I have endeavored to ensure that the statistical models are sound, and that there is a relevance and contribution to the humanities and social science research, as you point out.

I have read the manuscript RSOS-190645 `Common Statistical Patterns in Urban Terrorism' with much interest and it presents important insights to the relationship between the interval and intensity of terrorist attacks. However, there are some points that, from my perspective, need to be addressed before publication.
General Response. Dear reviewer, thank you for taking the interest and reviewing my manuscript. I have gone through your comments and tried to address the comments. I have highlighted revised new or significantly changed text in blue.
Comment 1) Framing: The framing of the paper could highlight the main contribution more clearly. My reading of the paper is that the findings in regard to the interval between attacks are most central and that the intensity findings (Results a page 3) are more of a side note.
Response 1) Indeed, and we have now sharpened this to say in the Contributions section: Here, we show that despite diverse conflict genres and multiple confounding mechanisms in play, all global cities suffer attacks describable by a common statistical pattern. The memoryless nature of this pattern suggests that multiple causal mechanisms are independent of each other and that prediction is not helped by the knowledge of previous attacks.
Comment 2) Literature: There is an early literature on the linkage between severity and duration of conflict that might be relevant for this manuscript to consider. Weiss, 1993; Klingberg, 1966; Voevodsky, 1969.
Response 2) Thank you, we have added some of these relevant references to the Introduction along with other references recommended by Reviewer 1. The older studies tend to focus on significant wars across long time scales, as opposed to detailed geo-tagged event data of today.
Comment 3) GTD database. There probably needs to be a short discussion about how terrorism is defined (especially because the scientific and political use of the term can differ) in the context of the GTD database and the potential overlap with events that one could also consider asymmetric warfare (especially events in Iraq and Afghanistan). Also a short note on why the whole GTD is used and not a particular subset (e.g. target type or attack type) would be helpful.
Response 3) Thank you, this is a very useful comment and was also tangentially pointed out by Reviewer 1. Since the end of the Cold War, terrorism, political violence and criminal enterprise (e.g. narcotics) have become interleaved. Often, trans-national organisations like ISIS participate in all of the above. As such, studies have shown that it has become difficult to separate the different genres of violence both statistically and contextually \cite{Findley12}. Therefore, it makes sense to consider GTD in its entirety, which covers violence between a non-state actor and other targets (state or non-state).
We have clarified this where we introduce the GTD data and also in the discussions.

Reviewer: 2
Comment 4) Page 6 last paragraph. The manuscript claims that `the distribution given the attack intensity is robust across different urban scales and climates'. It is not clear how this robustness is established.

Response 4)
We have added a comment pointing to the robustness of the K-L divergence in Figure 4, which shows information loss from the top 50 cities across different geographies, conflict genres, and conflict sizes. Here, we replot Figure 3 for different longitudes, and we can see that the information loss (-1.2 to -2.8 nats) is relatively small across diverse geographies. We have added this plot to the Methods section.
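
For readers unfamiliar with the measure, the following Python sketch shows one way a K-L divergence in nats between observed inter-attack intervals and a fitted exponential model could be computed. It is an assumption-laden illustration, not the procedure used in the manuscript: the data are synthetic, and the unit-width bins and the open-ended last bin are choices made here for simplicity.

```python
import math
import random


def empirical_bin_probs(samples, edges):
    """Fraction of samples falling in each half-open bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def exponential_bin_probs(rate, edges):
    """Probability mass an exponential(rate) model assigns to each bin."""
    return [
        math.exp(-rate * edges[i]) - math.exp(-rate * edges[i + 1])
        for i in range(len(edges) - 1)
    ]


def kl_divergence_nats(p, q):
    """D_KL(p || q) in nats; bins with p_i = 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


rng = random.Random(2)
true_rate = 0.3  # assumed value for the synthetic city
intervals = [rng.expovariate(true_rate) for _ in range(5000)]

# Unit-width bins up to 20 time units, plus one open-ended tail bin.
edges = [float(i) for i in range(21)] + [math.inf]

p_obs = empirical_bin_probs(intervals, edges)
fitted_rate = len(intervals) / sum(intervals)  # exponential MLE
kl_fitted = kl_divergence_nats(p_obs, exponential_bin_probs(fitted_rate, edges))
kl_wrong = kl_divergence_nats(p_obs, exponential_bin_probs(3 * fitted_rate, edges))

print(f"KL vs fitted model: {kl_fitted:.4f} nats; vs mis-specified model: {kl_wrong:.4f} nats")
```

A small divergence for the fitted model and a much larger one for the deliberately mis-specified rate is the pattern one would expect if the exponential description holds; evaluating on held-out intervals, as Comment 6 suggests, would be a natural extension.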
Comment 5) Page 7 first paragraph. The paragraph states that sequential attacks in each city are unrelated and that a `possible reason is that each terrorist attack depends on a large number of variables (i.e., organization, logistics, finance, personal, evading detection, and opportunity). This raises question about the general modeling approach. Couldn't the relationship between number of attacks and time interval be modelled controlling for a number of factors?
Response 5) If I understand you correctly, you are asking if we can check the contribution of each of the factors? This has indeed already been done extensively in the literature. Some factors are latent/hidden, others are not available at a city-level granularity, which is why most studies are at national or regional level. This paper tries to find an overall statistical pattern, which is still useful. As discussed with Reviewer 1, whilst I totally agree with you that the detailed mechanisms are important and distinguish violence across genres, having a statistical understanding of the overall pattern is also useful from a data-driven prediction framework perspective. We discuss this in the paper: Statistical analysis of complex processes, even across diverse genres and mechanisms, has value in data-driven prediction. It has been shown that many complex processes with a multitude of different causal factors can exhibit common statistical patterns that aid prediction, e.g. bus arrival time in busy urban areas.