Mapping terrorist groups using network analysis: Egypt case study

Purpose – The purpose of this study is twofold: first, to understand the underlying dynamics of the organizations behind terrorist attacks, and second, to investigate the dynamics of terrorist organizations in relation to one another to detect whether different organizations share patterns of terror. Design/methodology/approach – To achieve this purpose, the researcher proposes a computational algorithm that extracts data from the Global Terrorism Database (GTD); calculates similarity indices between different terrorist groups; generates a network data file from the calculated indices; and applies network analysis techniques to the extracted data. The proposed algorithm includes applying SQL database codes for data extraction, building tailored C# software to calculate similarity indices and generate similarity networks, and using Gephi software to visualize the generated network and calculate network metrics and measures. Findings – Applying the proposed algorithm to Egypt, the results reveal different shared patterns of terror among different terrorist groups. This helps in creating a terror landscape for the terrorist groups operating in Egypt. Originality/value – The importance of the study lies in proposing a new algorithm that combines network analysis with other data-manipulation techniques to generate a network of similar terror groups.


Introduction
Terrorism is one of the most challenging issues facing the world. With the rapid advancement of information and communication technologies, international terrorism has become an increasingly complex phenomenon. To study a complex phenomenon such as international terrorism, we need new ways of analysis. Network analysis is one of the recent approaches used to analyze terrorist organizations from within.
Network analysis enables the researcher to understand the internal structure of the terrorist organization, the pattern of communication, the flow and path of information sharing, the path of the command, the sub-groups or main components of this network, the most important actors in the network and the vulnerabilities of this network. Hence, network analysis has been used in counter-terrorism research as well.
However, few studies have used network analysis to map different terrorist groups together. This is because data about relationships between different terrorist groups are scarce, on the one hand, and because very few studies attempt to classify terrorist groups, on the other.
Some studies classify terrorist groups based on ideology, group size, location, etc. In this study, however, the researcher is interested in classifying terrorist organizations or groups based on the way they carry out terrorist incidents.
The main research question of this study is whether there exist patterns in terrorism incidents. In other words, can we classify terrorist groups based on the patterns they share in perpetrating their incidents?
To examine this question, the researcher used network analysis in addition to database management using SQL codes, and an algorithm to calculate similarity and generate network data from regular incident data. The researcher developed C# software to extract data from a SQL database and apply the proposed algorithm.

Literature review
2.1 Terrorism
Terrorism is such a complex phenomenon that lacks a concise agreed-upon definition. More than three decades ago, Alex Schmid (1983) recognized more than 100 definitions of the term terrorism in both the official and the academic fields.
For example, in the USA, the annual country reports on terrorism in the US Code Title 22 Chapter 38 Section 2656f discuss the concepts of international terrorism, terrorism and the terrorist group as follows.
"The term "international terrorism" means terrorism involving citizens or the territory of more than one country. The term "terrorism" means premeditated, politically motivated violence perpetrated against noncombatant targets by subnational groups or clandestine agents. The term "terrorist group" means any group practicing or which has significant subgroups, which practice, international terrorism" (22 U.S.C. 38 §2656f).
In Canada, terrorism is essentially defined as "acts of violence or threats of violence motivated by ideology and intended to intimidate the public or a segment of the public" (Criminal Code of Canada S83.01).
However, the Office of the United Nations High Commissioner for Human Rights states in Fact Sheet No. 32 that terrorism refers to "acts of violence that target civilians in the pursuit of political or ideological aims" (UN Office of the High Commissioner for Human Rights (OHCHR), Fact Sheet No. 32, 2019).
Finally, the UN General Assembly stated a consensus definition of terrorism, in which it describes terrorism as: "[. . .] any action [. . .] that is intended to cause death or serious bodily harm to civilians or noncombatants, when the purpose of such an act, by its nature or context, is to intimidate a population, or to compel a Government or an international organization to do or to abstain from doing any act" (United Nations, 2019).
However, even this definition did not gain international consensus.
A large body of research connects terrorism to crime, especially organized crime. This is based on the fact that terror acts usually inflict damage of some sort on a given target. The illegal use of violence is therefore the main reason why many studies connect terrorism to criminal acts. However, the illegal use of violence is only one dimension of terrorism. This clarifies why mafia crimes, for instance, are not considered terrorism.
JHASS 2,2
A second, and important, dimension of terrorism is politics: Differently from criminal activities, terrorism is defined by a clear political orientation and/or relevance. Such a political dimension reminds us that terrorists usually (although not always) have a political goal (Locatelli, 2014, p. 6).
"As terrorists generally challenge the monopoly of violence of the state and its ability to protect its citizens, terrorist acts obtain political significance even when the motivation for them is not primarily political" (Schmid, 2004, p. 200).
In addition, Locatelli (2014) highlighted five main elements of terrorism from literature; these elements are as follows: (1) The use of violence.
(5) The relative strength of offense over defense.
Summing up, he defined terrorism as "a peculiar form of political violence based on an indirect approach. It implies a patent breach of accepted rules and enjoys a tactical advantage over defense" (Locatelli, 2014, p. 10). Another branch of literature connects terrorism with psychology, specifically in studying the psychology of a terrorist. This is based on the idea of connecting terrorist behavior with some psychopathologies.
As Pearlstein (1991, p. 9) claims, "the individual who becomes and remains a political terrorist generally appears to be psychologically molded by certain narcissistic personality disturbances." In addition, many studies also claim the presence of the psychological mechanisms of externalizations and splitting, which are also found in individuals with borderline and narcissistic disturbances (Ferracuti, 1982; Ferracuti and Bruno, 1981; Laquer, 1987; Post, 1986a, 1986b). However, other studies rejected this idea, claiming that terrorism is a group activity, and therefore, psychopathology or a single personality type cannot solely result in terrorism. The proponents of this idea claim that shared values and ideological commitment and group solidarity are much more important than psychological factors in understanding terrorism (Crenshaw, 2000).
Based on this collective view of terrorism, another line of research emerged adopting the idea of the strategic logic of terrorist organizations. In other words, terrorist organizations follow a strategic logic in their activities and their planning of the attacks with the aim to pursue a political goal (Caruso and Locatelli, 2008; Della Porta, 1995; Della Porta and Diani, 2006; Tarrow, 1994; Zald and McCarthy, 1980). "In contemporary studies on terrorism, three different analytical levels are usually considered: (1) the individual level; (2) the organizational level; and (3) the systemic level" (Palano, 2014, p. 142).
This study is more concerned with the organizational or group level.
2.1.1 Classifying terrorist groups. Terrorist groups differ in many aspects, including membership, political goals and ideological foundations. However, few studies have aimed at finding a metric by which terrorist groups can be classified.
In this regard, Blomberg et al. (2011), in attempting to answer the question of what determines the survival or demise of a terrorist group, classified terrorist groups based on several variables: the groups' peak sizes, tactics, ideologies and base locations.
Later on, Locatelli (2014)
In this study, the researcher does not intend to classify terrorist groups based on a set of group properties or features; rather, the main aim is to test whether there is a specific pattern shared by different terrorist groups, and to map together the groups that share the same pattern.

Network analysis to study terrorism
A network is a collection of actors (e.g. persons, groups, organizations), represented by nodes, and relations between actors (connections, activities), represented by links (Wasserman and Faust, 1994). Network analysis has been a core technique of criminal intelligence analysis since the early 1970s, primarily in the form of link analysis, a specific adaptation of network analysis to criminal intelligence and investigation.
However, the tendency to regard terrorist organizations as networks (i.e. cellular structures rather than hierarchies) is relatively new. Terrorist groups or organizations can be viewed as "hub-and-spoke" networks, i.e. efficiently organized structures of connected cells that are resilient to disruption (van der Hulst, 2014).
Each terrorist group is also a network inside a wider network of supporters, suppliers, audiences and opponents. Network analysis concepts of particular value to the analysis of terrorism include:
(1) Centrality. Centrality describes the relative importance of an individual in a network. We can measure centrality in several ways, including:
Degree Centrality is the number of other people adjacent to the individual. The higher this measure is, the more direct associates the individual has in the known network. The person may be a formal leader, a skilled networker or poor at keeping his connections secret (Strang, 2014).
Betweenness Centrality is the number of geodesics the individual is on. The higher this measure is, the more indirect associates the individual has. He or she may be a central actor in the communications or exchange network and may be a key individual in holding the network together (Strang, 2014).
Eigenvector Centrality is the degree to which an actor is connected to highly connected peers, and it takes all direct and indirect network paths from the focal actor into account (Torfason and Kitts, 2011).
(2) Components and cliques. Components (sub-graphs) are those that divide the network into separate parts with each having several actors. Cliques are the (maximal) sub-graphs of nodes that have all possible ties present among themselves. That is, a clique is the largest possible collection of nodes (more than two) in which all actors are directly connected to all others (Hanneman and Riddle, 2014).
(3) Cutpoints and bridges. A cutpoint is a single node connecting two or more components of a network. Removing that node should disconnect those components. A bridge is a link between two nodes in different networks or network components, so this relationship is also the connection between the two networks or sub-networks (Strang, 2014).
(4) Clustering. One common way of measuring the extent to which a network displays clustering is to examine the local neighborhood of an actor (that is, all the actors who are directly connected to it), and to calculate the density in this neighborhood. After doing this for all actors in the whole network, we can characterize the degree of clustering as an average of all the neighborhoods in the whole network (Hanneman and Riddle, 2014).
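To make these concepts concrete, the following is a minimal Python sketch on a made-up toy network. The node labels and helper function names are hypothetical and are not taken from the study; the sketch only illustrates degree centrality, the local clustering coefficient and connected components as defined above.

```python
from collections import deque

# Hypothetical toy network: node labels are placeholders, not real groups.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("E", "F")]

# Build an undirected adjacency map: node -> set of neighbours.
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def degree_centrality(adj):
    """Number of nodes directly adjacent to each node."""
    return {node: len(neigh) for node, neigh in adj.items()}

def local_clustering(adj, node):
    """Density of ties among the direct neighbours of `node`."""
    neigh = list(adj[node])
    k = len(neigh)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neigh[j] in adj[neigh[i]]
    )
    return 2.0 * links / (k * (k - 1))

def components(adj):
    """Connected components found by breadth-first search."""
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, part = deque([start]), set()
        while queue:
            node = queue.popleft()
            part.add(node)
            for other in adj[node]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        parts.append(part)
    return parts

degrees = degree_centrality(adj)
parts = components(adj)
```

In this toy network, node C has the highest degree and acts as a cutpoint: removing it disconnects D from A and B, and the link between C and D is a bridge.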
Network analysis has been widely used in literature to study either the internal structure of a terrorist organization or the flow of information within it. In both cases, the main aim behind using network analysis is to find the most important elements in a network from the pattern of communication within this network, and therefore, detect the weak points in any terrorist network.
One of the first studies to apply network analysis on criminal organizations, in general, was that of Sutherland (1937), which highlighted the role played by the structure of the network, and the system of relations within it, in facilitating its criminal activities. Later on, Albini (1971) used network analysis to study the linkages between organized criminal Sicilian and Italian-American families. Arquilla and Ronfeldt (2001) highlighted the role of network structure in the performance and cohesion of terrorist organizations. They claimed that if we model and analyze the structure of terrorist networks, we could predict the future of transnational organized crime in general and terrorist organizations in particular.
Directly after 9/11, Krebs (2002) used publicly available information about the 19 hijackers to construct a network of weak and strong ties based on the nature of their relationships with one another. By computing several centrality measures, this paper depicts a sketch of the covert network behind the scenes, which allowed identifying the clear leader among the hijackers.

Moreover, Yang et al. (2006) explained the importance of visualizing the structure of the social network. They argue that if the structure of relationships and the dynamics of information circulation within the network are accurately defined, we can detect the most important actors in the network, trace information flows and how to track and disable them, and identify the weaknesses inside criminal networks.
Recently, network analysis has mainly been used to serve counter-terrorism by identifying the strengths of the network to target them, and its vulnerabilities to penetrate them. Among these studies, we find Moon et al.

Methodology
To understand the international terrorist landscape that performs attacks in Egypt, we have to construct a network of terrorist groups. However, we need first to extract network data from the GTD data. To achieve this task, the researcher used a modified version of the algorithm proposed by Alison et al. (2017) to calculate a similarity matrix between different terrorist groups.
In addition, the researcher designed and implemented a computer program using C# programming language to perform the following tasks: Extract data from the database using SQL Database codes. Calculate the similarity matrix using the proposed algorithm (that will be discussed in detail in the next section). Generate a network data file using the calculated similarity matrix and export it to an Excel file.
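The extraction step can be sketched as follows. The study's program is written in C#; this illustration uses Python with an in-memory SQLite database instead, purely as an assumption for the sake of a self-contained example. The table name `gtd` is hypothetical, the column names follow the GTD variable names as commonly distributed (`iyear`, `country_txt`, `gname`), and the rows are made-up placeholders rather than real GTD records.

```python
import sqlite3

# In-memory database standing in for the SQL database holding GTD data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE gtd (
           iyear INTEGER, country_txt TEXT, gname TEXT,
           latitude REAL, longitude REAL,
           attacktype1_txt TEXT, nkill INTEGER, nwound INTEGER)"""
)

# Made-up placeholder rows; not real incidents.
rows = [
    (2013, "Egypt", "Group A", 30.0, 31.2, "Bombing/Explosion", 1, 3),
    (2014, "Egypt", "Group A", 30.1, 31.3, "Bombing/Explosion", 0, 2),
    (2015, "Egypt", "Group B", 31.1, 33.8, "Armed Assault", 2, 0),
    (2009, "Egypt", "Group C", 30.0, 31.2, "Armed Assault", 0, 0),
    (2014, "Libya", "Group B", 32.9, 13.2, "Armed Assault", 1, 1),
]
conn.executemany("INSERT INTO gtd VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Select the groups active in one country within a year window,
# with the number of incidents attributed to each group.
QUERY = """
    SELECT gname, COUNT(*) AS incidents
    FROM gtd
    WHERE country_txt = ? AND iyear BETWEEN ? AND ?
    GROUP BY gname
    ORDER BY gname
"""
groups = conn.execute(QUERY, ("Egypt", 2011, 2017)).fetchall()
```

Parameterized queries keep the country and year window configurable, so the same query could be reused for other samples.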
Once the group similarity matrix has been calculated from the dataset, social network analysis is used to understand the relationships between different terrorist groups and to test whether there are similar patterns among terrorist groups or not. In this regard, Gephi software (version 0.9.2) is used to visualize the network and to calculate some useful measures for the analysis; these measures will be discussed in the next section. The results of the network analysis are then compared to the results of a two-step cluster analysis, to validate the proposed methodology.

Data
The global terrorism database (GTD) is an open-source database including data on terrorist events around the world, from 1970 until 2017. The National Consortium for the Study of Terrorism and Responses to Terrorism (START) makes GTD available online, with systematic data on domestic, as well as transnational and international terrorist incidents. GTD is currently considered as the most comprehensive unclassified database on terrorist attacks in the world.
GTD now includes more than 190,000 cases; for each incident, it includes data on more than 12 variables. Examples include the date and location of the incident, the weapons used and nature of the target, the number of casualties, and, when identifiable, the group or individual responsible.
GTD defines terrorism as "the threatened or actual use of illegal force and violence by a non-state actor to attain a political, economic, religious or social goal through fear, coercion or intimidation" (GTD Codebook, 2018). To consider an incident for inclusion in the GTD, all three of the following attributes must be present:
(1) The incident must be intentional, the result of a conscious calculation on the part of a perpetrator.
(2) The incident must entail some level of violence or immediate threat of violence, including property violence, as well as violence against people.
(3) The perpetrators of the incidents must be sub-national actors. The database does not include acts of state terrorism.

Data pre-processing
As previously mentioned, the study applies a modified version of the algorithm of Alison et al. (2017), which is based on some features of the GTD data. However, this study is interested in some additional features that were not included in Alison et al. (2017). The following features are selected for each terrorist incident in the GTD:
(1) Year: The year in which the incident occurred.
(2) Latitude and longitude: The location of the terrorist incident.
(3) Group name: The name of the perpetrator group that carried out the attack. To ensure consistency in the usage of group names, the GTD uses a standardized list of group names that has been established by project staff to serve as a reference for all subsequent entries. If no information about the perpetrator group is available, this field is coded as "Unknown".
(4) Inclusion criteria: These are three categorical variables, namely: Criterion 1, Criterion 2 and Criterion 3. For each variable, a case is coded as "1" if the criterion is met and "0" if the criterion is not met or there is no indication that it is met.
Criterion 1: The violent act aims at attaining a political, economic, religious or social goal.
Criterion 2: There is evidence of an intention to coerce, intimidate or publicize some message to a larger audience(s).
Criterion 3: The action is outside the context of legitimate warfare activities.
In addition to the previously selected features, the researcher calculates the incident severity for each incident, using the following equation (Alison et al., 2017):
$$\text{severity} = n_{\text{killed}} + \frac{n_{\text{wounded}}}{a}$$
where $n_{\text{killed}}$ and $n_{\text{wounded}}$ are the numbers of people killed and wounded in the incident, and $a$ is a parameter indicating how many wounds are equivalent to one death in terms of severity. Here, we use $a = 3$ as defined by Alison et al. (2017).
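A minimal sketch of the severity calculation follows. It assumes the linear form implied by the definition of a, in which each death counts as one unit and each wounded person counts as 1/a of a unit; the function and argument names are hypothetical.

```python
def incident_severity(killed, wounded, a=3.0):
    """Severity in death-equivalent units.

    Assumes `a` wounded persons are equivalent to one death,
    as the definition of the parameter a suggests.
    """
    return killed + wounded / a

# Two deaths and six wounded, with a = 3, count as four death-equivalents.
example = incident_severity(killed=2, wounded=6)
```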

Algorithm
Step 1: Calculate Group Attributes: For each terrorist group, calculate the following attributes:
Lethality.
Target type: The target type that the group attacked most.
Attack type: The attack type that the group used most.
Weapon type: The weapon type that the group most commonly used.
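The "most common" attributes in Step 1 amount to taking the mode of each categorical field per group. The sketch below illustrates this in Python; the incident records, field keys and function names are made-up placeholders and are not the study's C# implementation.

```python
from collections import Counter, defaultdict

# Made-up placeholder incidents; group names and field keys are hypothetical.
incidents = [
    {"group": "Group A", "attack": "Bombing/Explosion",
     "weapon": "Explosives", "target": "Police"},
    {"group": "Group A", "attack": "Bombing/Explosion",
     "weapon": "Explosives", "target": "Military"},
    {"group": "Group A", "attack": "Armed Assault",
     "weapon": "Firearms", "target": "Police"},
    {"group": "Group B", "attack": "Armed Assault",
     "weapon": "Firearms", "target": "Police"},
]

def most_common_value(values):
    """Return the most frequent value (ties go to the value seen first)."""
    return Counter(values).most_common(1)[0][0]

def group_attributes(incidents, fields=("attack", "weapon", "target")):
    """Compute the most common value of each field for every group."""
    by_group = defaultdict(list)
    for inc in incidents:
        by_group[inc["group"]].append(inc)
    return {
        group: {f: most_common_value([inc[f] for inc in rows]) for f in fields}
        for group, rows in by_group.items()
    }

attributes = group_attributes(incidents)
```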
Step 2: Calculate Similarity Matrix: For each two groups included in the data, calculate their similarity value as follows:
For each numeric feature (i), the similarity $S_i$ is computed from $v_1$ and $v_2$, where $v_1$ is the value of feature (i) in group 1 and $v_2$ is the value of feature (i) in group 2.
For each categorical feature (i), the similarity $S_i$ is computed from the categories of the two groups.
For longitude and latitude:
$$S_i = \exp(-\text{distance})$$
where distance is the geographical distance between the coordinates of the two locations, calculated using the Haversine formula:
$$d = 2r \arcsin\left(\sqrt{\sin^2\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos\varphi_1 \cos\varphi_2 \sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)$$
where: $r$ = radius of the earth (6371 km); $d$ = distance between the two points; $\varphi_1, \varphi_2$ = latitudes of the two points; and $\lambda_1, \lambda_2$ = longitudes of the two points, respectively.
Now, the similarity value between two terrorist groups is the sum of the similarities between these two groups over all selected features, as in the following equation:
$$S = \sum_i S_i$$
Step 3: Generate a Network Data File: Create a weighted undirected network between terrorist groups as follows:
The nodes of the network represent the terrorist groups included in the data.
The edges (linking two nodes) of the network represent whether these two nodes or groups are similar or not. In this study, an edge is drawn between two nodes if the similarity value between these two groups exceeds a threshold value u.
The weights of the edges represent the similarity value between the nodes linked by these edges.
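The location similarity and the edge-generation rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's C# program: the function names are hypothetical, and the `scale_km` parameter is an assumption added here because the source does not state the unit in which the distance enters the exponential (the default of 1.0 applies exp(-distance) to the distance in kilometres).

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the arcsin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))

def location_similarity(lat1, lon1, lat2, lon2, scale_km=1.0):
    """exp(-distance); `scale_km` is a hypothetical rescaling of the distance."""
    return math.exp(-haversine_km(lat1, lon1, lat2, lon2) / scale_km)

def build_edges(similarity, threshold):
    """Keep only the pairs whose similarity exceeds the threshold.

    `similarity` maps an unordered pair (group_1, group_2) to its value;
    each kept pair becomes a weighted undirected edge.
    """
    return [(a, b, s) for (a, b), s in sorted(similarity.items()) if s > threshold]

# Made-up similarity values between placeholder groups.
toy_similarity = {("A", "B"): 0.7, ("A", "C"): 0.5, ("B", "C"): 0.6}
edges = build_edges(toy_similarity, threshold=0.6)
```

Note that the comparison is strict, matching the wording "exceeds a threshold value": a pair whose similarity equals the threshold does not receive an edge.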
Using the generated network data file, the researcher uses Gephi software to visualize the network and calculate the following measures:
Number of components (communities) and cliques in the network.
Degree and weighted degree centrality of each group.
Betweenness centrality of each group.
Eigenvector centrality of each group.
Clustering coefficient of each group.

Results
In this study, the researcher applies the algorithm proposed in the previous section on a sample of the GTD for Egypt only, and from the year 2011 until the year 2017. In the following paragraphs, the main results are highlighted.
Using the tailored C# application to extract data from a SQL database containing the original GTD data, the program extracted 22 groups that carried out terrorist incidents in Egypt from 2011 until 2017. Table I shows the names of these 22 terrorist groups sorted alphabetically.
Applying Step 1 of our algorithm, the proposed computer program calculated the needed group attributes for each terrorist group. Table II shows the calculated categorical attributes (peak year, attack type, weapon type and target type) for each group, and Table III shows the calculated numerical attributes of these 22 groups (inclusion criteria, location, lethality and average lethality). Table II shows that there are differences between terrorist groups in terms of the peak year, the most common attack type, the most commonly used weapon type, and the most common target type. For instance, it is obvious here that: The
Applying Step 2 of our algorithm to the selected data using the proposed computer program, we get the similarity matrix between the 22 groups with (22 × 22 = 484) cells. Figure (1) shows this similarity matrix with 484 cells; each cell shows the similarity value between the two groups shown in the row and column of this cell. Cells are highlighted with colors: the darker the color of the cell, the greater the similarity between the two groups, and the lighter the color of the cell, the less similar they are.
Applying Step 3 of our algorithm, and setting a threshold value u = 0.6, i.e. to draw a link between two groups they must share a similarity value that exceeds 60 per cent, the proposed computer program generated a network data file, as shown in Table IV. It is obvious from Table IV that:
The most powerful link in the network lies between "Ajnad Misr" and "Muslim Brotherhood" with weight 8.41, meaning that there is more than 84 per cent similarity between the violent acts performed by these two groups in terms of the features selected in this study.
The second most powerful link lies between "Sinai Province of the Islamic State" and the "Unknown" group with weight 7.65, i.e. with similarity exceeding 76 per cent.
The third most powerful link lies between "Hasam Movement" and "Revolution's Brigade" with weight 7.53, i.e. with similarity exceeding 75 per cent.
The fourth most powerful links lie between "ISIL" and "Revolution's Brigade" with weight 7.3, and "ISIL" and "Revolutionary Punishment Movement" with weight 7.1, i.e. with similarity exceeding 70 per cent in both links.
The huge similarity between "Ajnad Misr" and "Muslim Brotherhood" can be justified by the fact that "Ajnad Misr" is an active Salafist Islamist militant terrorist group founded in 2013 directly after the removal of Mohamed Morsi from office following the 30 June revolution. In addition, it claims that its attacks are retribution for the breaking up of the Muslim Brotherhood sit-in in Raba'a Square in August 2013, which led to an armed conflict between the Muslim Brotherhood and the military and police forces.
Corollary 1: From the huge similarity between the "Ajnad Misr" terrorist group and the "Muslim Brotherhood" in terms of the way they perform their violent acts (incidents), we can conclude that "Ajnad Misr" is just a militant group formed from members of the "Muslim Brotherhood" organization. In other words, "Ajnad Misr" can be viewed as a military wing of the "Muslim Brotherhood".
Our generated network is a weighted undirected network that contains 17 nodes and 23 edges. Visualizing the network using Gephi software, we find that there are three components or communities within this network. The average clustering coefficient of the network is 0.792, and the graph density is 0.169. Figure (2) shows the network with its three components in three different colors, with line thickness representing the weight of the link.
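The reported graph density can be checked by hand with the standard density formula for an undirected simple graph, 2m / (n(n - 1)), where n is the number of nodes and m the number of edges. The short Python sketch below applies it to the reported 17 nodes and 23 edges; the function name is hypothetical.

```python
def undirected_density(n_nodes, n_edges):
    """Share of possible undirected links that are actually present."""
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))

# 17 nodes and 23 edges, as reported for the generated network.
density = undirected_density(17, 23)
```

With 17 nodes there are 136 possible links, so 23 edges give 23/136, which rounds to the reported 0.169.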

The connection of Clique 2 and Clique 3, through the link between "Ajnad Misr" and "Ansar Bayt al-Maqdis (Ansar Jerusalem)", can be justified by the fact that the founder of "Ajnad Misr", Humam Muhammed, was a member of the "Ansar Bayt al-Maqdis" militant group and then split away from it. Merging the cliques appearing in Figure (2) with the groups' attributes shown in Tables II and III, we find that:
Clique 1 {Groups 9, 11, 17, 18}: The most common attack type is "Armed Assault", the most common weapon type is "Firearms", the most common target type is "Police", the peak years are 2014, 2015 and 2016, and the main inclusion criteria are criteria 1 and 2 only.
Clique 2 {Groups 4, 14, 19, 21}: The most common attack type is "Bombing/Explosion", the most common weapon type is "Explosives", the most common target type is "Military", the peak years are 2013, 2014 and 2015, and the main inclusion criteria are criteria 1 and 2 only.
Clique 3 {Groups 1, 10, 13, 20}: The most common attack type is "Bombing/Explosion", the most common weapon type is "Explosives", the most common target type is "Police", the peak years are 2011, 2014 and 2017, and the main inclusion criteria are criteria 1, 2 and 3.
Corollary 2: There are two basic communities of terrorist groups in Egypt. Each community has its own pattern of attacks. The first community tends to carry out armed assaults using firearms, mainly attacking police officers. The second community, however, tends to carry out bombings using explosives, attacking both police and military officers.
Corollary 3: The Egyptian terrorism network reveals three cliques, one of which constitutes a community in itself: the first community in Corollary 2. The other two cliques differ in their pattern of attacks but, at the same time, are connected through a link between "Ajnad Misr" and "Ansar Bayt al-Maqdis (Ansar Jerusalem)." Cliques 2 and 3 share the same pattern of using explosives but differ in their targets: Clique 2 mainly attacks military forces, while Clique 3 mainly attacks police forces. They also differ in the inclusion criteria and peak years.
In addition to the network visualization, Gephi calculates some measures of centrality for each node in the network: degree, weighted degree, betweenness, clustering coefficient and eigenvector centrality. These centrality measures give us a notion of the importance of each node in the network. Table V shows the five centrality measures for the 17 nodes of the network, sorted in alphabetical order.
Finally, to validate the proposed algorithm, the researcher conducted a cluster analysis of the original GTD data without any processing. Using SPSS software, the cluster analysis with both Akaike's Information Criterion (AIC) and Schwarz's Bayesian Criterion (BIC) shows that the incidents can be grouped into two clusters.
Removing the groups that are not included in the network analysis, we find a strong resemblance between the clusters created by the cluster analysis and the communities created by the network analysis; however, network analysis gives the researcher far more detailed and accurate results, including centrality measures and weights of linkages between groups.

Conclusion and future research
From the previous analysis and results, we can conclude that there are shared patterns of violent acts among different terrorist groups. Data about terrorism incidents reveal different levels of similarity between terrorist groups in terms of the ways they perform their acts. Network analysis proved to be beneficial not only in studying the relationships between an individual actor inside a group and the rest of the group members but also in mapping terrorist groups and organizations themselves.
In addition, the results of mapping terrorist groups using network analysis were similar to the results of cluster analysis, which validates the proposed algorithm. Moreover, network analysis outperforms cluster analysis because it provides the analyst with much more detail than classification alone. For instance, it enables the analyst to visualize the linkages between different terrorist groups; it calculates several measures of centrality or prestige to detect the most important groups in the network; and it weights the linkages between different groups by the similarity measure, which enables more in-depth analysis.
Applying the proposed algorithm to a sample of terrorism incidents that happened in Egypt from 2011 to 2017, we found that the network of terrorist groups can be decomposed into two important components and three cliques. This means that although there are 22 terrorist groups registered in the GTD data for this sample, there are only two to three main shared patterns of terrorism. In other words, we have many group names for the same pattern. This result needs to be studied in further detail in future research.
"Ajnad Misr" is the most central terrorist group in terms of the number of links it has, the betweenness of its location in the network, and its connectedness to other central and powerful groups. In addition, the linkage between "Ajnad Misr" and "Ansar Bayt al-Maqdis (Ansar Jerusalem)" is the most important link in the network, as it acts as a bridge between two cliques (cliques 2 and 3 in the results).
The highest similarity measure lies between "Ajnad Misr" and "Muslim Brotherhood," which indicates that both groups are two sides of the same coin. This claim is supported by the claim of "Ajnad Misr" that its acts are retribution for the removal of Mohamed Morsi. However, "Muslim Brotherhood" has a 100 per cent clustering coefficient, meaning that it has the densest relationships with other groups in the network. The second highest similarity measure lies between the "Unknown" group and "Sinai Province of the Islamic State," which, in turn, reveals high similarity with "Ansar Bayt al-Maqdis (Ansar Jerusalem)", creating a powerful triangle between these three terrorist groups. This means that the terrorism incidents perpetrated by unknown groups reveal high similarity with the terrorism incidents perpetrated by "Sinai Province of the Islamic State" and "Ansar Bayt al-Maqdis (Ansar Jerusalem)".
Finally, this study proposes a new way of analysis of terrorism data that can be useful in understanding the landscape of terrorism.