Probabilistic seismic hazard analysis of the North-East India towards identification of contributing seismic sources

Abstract The North East (NE) India is regarded as one of the most seismically active regions of the globe. It is surrounded by two active plate boundaries (the Indian-Eurasian Plate boundary and the Indian-Burmese plate boundary), followed by rotation in the central part. As per the Seismic zonation map of India, most parts of NE India is classified under single seismic zone (zone V). However, numerous studies have shown significant variations in seismic activity as well as seismic hazard values across the NE. Present work attempts deaggregation of seismic hazard to identify seismic sources which control probabilistic seismic hazard values across the NE. With dominating fault mechanisms, seismic activity, and non-uniform past earthquake (EQ) records across NE, identifying controlling sources is crucial in order to understand the present and future seismic scenarios. Spatial distribution of seismic hazard levels and uncertainties are estimated using the Monte Carlo approach to sample a logic tree with alternate methods and models. As per the present findings, Manipur and Nagaland have highest seismic hazard in the NE, controlled by Indo-Burmese ranges. Further, seismic hazard of Guwahati, Shillong and Gangtok are significantly influenced by great EQs. In other parts, regional major to strong EQ sources control the seismic hazard values. Additionally, EQs with magnitude lesser than 4.5 have lower contributions to regional seismic hazards.


Introduction
NE India is one of the most EQ-prone regions across the globe.Notable EQs that caused induced effects in terms of ground shaking, landslides, liquefaction and infrastructural damage include 1897 Assam EQ (Mw-8.1,highest MSK intensity-IX, EMS intensity-X), 1869 Cachar EQ (Mw-7.6,Highest EMS intensity-IX), 1923 Meghalaya EQ (Ms-7.1), 1930 Dhubri EQ (Mw-7.1),1943 Assam EQ (Ms-7.2),1947 Arunachal Pradesh EQ (Ms-7.7),1950 Assam EQ (Mw-8.4), 1988Manipur EQ (Mw-7.2), 2009 Assam EQ (Mw-5.1),2011 Sikkim EQ (Mw-6.9)(Baro and Kumar 2015;Raghu Kanth et al. 2011;Kayal 2008;Oldham 1899;Fahmi et al. 1988; N. N. Ambraseys and Douglas 2004;England and Bilham 2015;Nicolas Ambraseys and Bilham 2003;Sabri 2002) etc.Recently, on April 28, 2021, an EQ of 6 Mw (as per USGS) occurred in Dhekiyajuli, Assam, triggering significant liquefaction.Before this EQ, another EQ (Mw-6.7)occurred in 2016 near Imphal, causing considerable damage (Gahalaut and Kundu 2016).High seismic activity of NE India is primarily due to the presence of two plate boundaries: the Indian Plate and the Eurasian plate subduction zone in the north, and the Indian Plate and the Burmese plate subduction zone in the east.In the light of significant EQs witnessed in the NE, IS-1893IS- (2016) ) part 1 classifies the entire NE India (excluding the state of Sikkim) as seismic zone V, the region of highest seismic hazard in the Indian subcontinent.However, numerous seismic hazard studies highlight that the seismic hazard across NE is not-uniform.The subsection below summarises various Seismic Hazard Analysis (SHA) studies for the NE and their associated limitations.

The existing SHA studies and their limitations
Besides IS-1893IS- (2016) ) part 1, numerous researchers attempted to estimate the level of seismic hazard in various parts of the NE India.Sharma and Malik (2006), NDMA (2010), Pallav et al. (2012), Sil et al. (2013), Das et al. (2016), Bahuguna and Sil (2020), Baro et al. (2018), Baro et al. (2020), Ghione et al. (2021) etc. utilised information about seismic source zones, past EQ data and GMPEs to determine the seismic hazard levels of the region.Most of the above studies utilised Probabilistic Seismic Hazard Analysis (PSHA) while estimating the seismic hazard level.PSHA method is primarily based on the work done by Cornell (1968) and Esteva (1969Esteva ( , 1970)).Accordingly, programs to perform PSHA were developed by McGuire (1976) and Bender and Perkins (1987).Earlier listed PSHA studies for the NE India were mainly based on the works by Cornell (1968), Esteva (1969Esteva ( , 1970)), McGuire (1976), and Bender and Perkins (1987).A careful observation on these studies for the NE India highlights that except Ghione et al. (2021), other SHA were based on one source zonation definition (refer to Table 1 for further detail).As per Roe et al. (2014), significant uncertainties exist regarding the position and geometry of seismic sources such as faults.Thus, considering multiple source models would be more accurate in accounting for these epistemic uncertainties (Gr€ unthal et al. 2018).
Seismic activity parameters provide information about the frequency of different magnitude EQs to occur on a seismic source.These can be estimated either by Least Square Method (LSM) or the Maximum Likelihood method (MLM).In LSM, linear regression is done to frequency-magnitude distribution (FMD) to obtain the seismicity parameters (Guttorp 1987).LSM is suitable for estimating the probabilities of the largest magnitude of EQs (Shi and Bolt 1982).MLM, on the other hand, takes into account EQ magnitudes greater than the observed maximum magnitude and the uncertainty in the magnitude of EQs (Kijko et al. 2016;Naylor et al. 2010;Weichart 1980).As per Shi and Bolt (1982), MLM, on the other hand, assumes that the data are exponentially distributed.Table 1 summarises the methods of seismicity parameter estimation used in earlier mentioned SHA studies for NE.According to Naylor et al. (2010), the standard assumption in MLM is that the data is not correlated.Thus, MLM is the best technique to fit data in a G-R recurrence relation and determine the seismicity parameters (Naylor et al. 2010;Weichart 1980).
GMPEs are empirical correlations between EQ characteristics and ground motion characteristics such as Peak Ground Acceleration (PGA) or Spectral Acceleration (Sa).For the NE India, few GMPEs were developed based on regional records or simulation-based SHA database (Anbazhagan et al. 2013;Gupta 2010;NDMA 2010).Some researchers (Sharma and Malik 2006;Anbazhagan et al. 2009;NDMA 2010;Pallav et al. 2012;Sil et al. 2013;Das et al. 2016;Bahuguna and Sil 2020;Baro et al. 2020; Baro et al. 2018) used the GMPEs (which were developed for other locations having similar tectonic settings as those of NE India region for regional SHA of the NE).Table 1 summarises GMPEs used by the previous seismic hazard studies for the NE region.It can be observed from Table 1 that many of the existing SHA works were based on a single GMPE.Further, other works though used more than one GMPE, the applicability of selected GMPEs for higher magnitude and larger distances was not checked.
Above discussion highlights the limitations of existing seismic hazard studies for the NE India.Overcoming the above limitations of existing studies for NE, the present study performs PSHA for NE India using two source models, MLM methods for seismicity parameter determination and multiple GMPEs using logic tree.The mean hazard levels and the uncertainties are estimated based on the Monte-Carlo sampling of the logic tree.While doing so contribution of different magnitude-distance ranges from different seismic sources, which control the regional seismic hazard of different parts of NE, are also studied.

Seismotectonics of the present study area
The NE India, surrounded by collision boundaries from two sides, has diverse geology and complex tectonic features.Hilly terrains such as Eastern Himalayan Region, Mikir Hill, Shillong Plateau, Naga Thrust, Mishmi hill and Indo Burmese ranges, and Plain lands such as Assam Valley and Bengal Basin are present in NE (see Figure 1).Past researchers had divided the NE region into various seismotectonic zones based on tectonic movements, direction, rate of movement and fault mechanisms (Kayal  Bahuguna and Sil (2020) Assam PSHA Linear Source LSM Baro et al. (2018) Shillong Plateau DSHA Linear Source not used Baro et al. (2020) Shillong Plateau PSHA Linear Source MLM Ghione et al. (2021) NE India and Bhutan
The Himalayan Region was formed due to the collision of the Indian Plate with the Eurasian Plate.This region shows diversity in seismic activity distribution and major structural units.It can further be divided into three subzones: the Bhutan Himalaya, the Arunachal Himalaya and the Mishmi Thrust (Angelier and Baruah 2009).This region has many thrust faults like Indus Suture Thrust (IST), Main Central Thrust (MCT), Main Boundary Thrust (MBT) and Himalayan Frontal Thrust (HFT) (Kayal 2008;Thingbaijam et al. 2008).1991 Uttarkashi EQ (Mw-6.8),1999 Chamoli EQ (Mw-6.6)and 1980 Gangtok EQ (Mw-6.3)are notable past EQs that had occurred on the MCT segment located west of the NE India (Mukhopadhyay 2011).Similarly, 1905 Kangra EQ (Ms-7.8),1934 Bihar-Nepal EQ (Mw-8.2) and 1950 Assam EQ (Mw-8.4)happened on the HFT (Nicholas Ambraseys and Bilham 2000; Pandey and Molnar 1988;Bilham 2001).As per Mittal et al. (2012), no major EQs have recently been detected on the NE side of the Himalayan region.
The Indo-Burma Ranges, located on the eastern side of NE India, came into existence due to the collision of the Indian and the Burmese plates.Kaladan faults, Naga thrust and Eastern boundary thrust are the main thrust faults in this region (Kayal 2008).According to Wang et al. (2014), the Eastern boundary thrust (also known as the Churachandpur-Mao Fault/CMF) exists in the region, located towards the east of the active Naga thrust.
To the north of the Bengal Basin region is the Shillong Plateau region.This region was uplifted during the Late Cenozoic (Rao and Kumar 1997).The northern edge of the Shillong Plateau was again uplifted during the 1897 Shillong EQ (Oldham 1899;Bilham and England 2001;Vernant et al. 2014).Based on GPS data, Vernant et al. (2014) found that the Shillong block is rotating clockwise at a rate of 1.15 /Myr between longitudes 89 E and 93 E. Numerous active faults such as Dauki fault, Brahmaputra fault, Dapsi Thrust, Barapani Shear Zone, and Oldham fault exist in the region.Dauki Fault, which is an active north dipping reverse fault (Morino et al. 2011) separates the Shillong Plateau and the Bengal basin region.Brahmaputra fault is situated to the north of the Shillong Plateau (Dasgupta and Nandy 1982).Oldham fault triggered the great 1897 Assam EQ (Mw-8.1)while Dhubri fault triggered 1930 Dhubri EQ (Ms-7.1)(Bilham and England 2001;Nandy 2001;Kayal 2008).
Similar to the Shillong block, the Assam block (from 93.5 E to 97 E) is also rotating clockwise at a rate of 1.13 /Myr (Vernant et al. 2014).Seismically, the Assam valley area, which includes Assam block, is less active than other regions.Kopili fault present in the region is responsible for the 1869 Cachar EQ (Mw-7.5)and the 1943 Assam EQ (Mw-7.2) (Kayal et al. 2010;Kayal 2008; N. N. Ambraseys and Douglas 2004).
In addition to regional seismicity in different parts of NE, there are seismic gaps existing in NE.As per Khattri (1987), Mittal et al. (2012), andSrivastava et al. (2015), the "Northeast Seismic Gap" is one such gap located between the epicentres of the 1934 Bihar-Nepal EQ (Mw-8.2) and the 1950 Assam EQ (Mw-8.4)(Mittal et al. 2012).Furthermore, as per Khattri et al. (1983), "the Assam Gap" exists in the upper Assam valley area.
Based on the discussion above, it can be seen that the governing tectonic features across NE are non-uniform.The presence of active faults and potential seismic gaps supports the chances of high to very high seismic hazards in the region.

EQ catalogue used for the analysis
For SHA, information about past EQs in terms of magnitude, epicentre coordinates, time of occurrence, focal depth etc. is crucial.For the present work, the declustered EQ catalogue developed by Borah et al. (2021) is considered.This declustered catalogue consists of a total of 5622 independent EQs.Out of these, numbers of EQ events having Mw < 4.0, 4.0 Mw < 5.0, 5.0 Mw < 6.0, 6.0 Mw < 7.0, 7.0 Mw < 8.0 and Mw !8.0 are 214, 4452, 754, 159, 36 and 7 respectively.(see Figure 2).The oldest EQs present in the catalogue had occurred in the years 825 AD and1100AD (NDMA 2010;Borah et al. 2021;Rajendran et al. 2004).

Source zone models
Section 2 earlier highlighted that different parts of the NE show variable tectonics as well as seismic activities.Hence, the region cannot be considered as a single seismic source zone.In this study, two models of seismic source zones are considered to account for epistemic uncertainties associated with the position and geometry of seismic sources (Gr€ unthal et al. 2018).Subsections below provide information on each source zone model considered.
4.1.Source model 1 (SM1) based on seismicity and tectonic features Gupta (2006) and, Kolathayar and Sitharam (2012) did seismic source delineation of NE based on visual observations of EQ events and tectonics.As a result, a total of 33 areal sources were identified by Gupta (2006) and Kolathayar and Sitharam (2012).Within these 33 areal sources, the Shillong Plateau, the Bengal Basin and a part of the Assam Valley are present in a single areal source.As per Angelier and Baruah (2009), however, all these three source zones (the Shillong Plateau, the Bengal Basin and one part of the Assam Valley) can be separated from each other on the basis of tectonics, regional geology, focal mechanism and spatial distribution of seismicity.Hence, in the present work, while the other 32 sources identified by Gupta (2006) and Kolathayar and Sitharam (2012) are as it is, the 33rd source zone (which contains the Shillong Plateau, the Bengal Basin and one part of Assam Valley) is further divided into three areal sources.This way, a total of 35 source zones (shown in Figure 3a) are considered in SM1 for the present analysis.Borah et al. (2021) highlighted that source zonations of NE done by prior studies were primarily based on the tectonic setting, geology etc.As a result, such zonations do not ensure uniform distribution of past EQs and seismicity parameters within each identified source zone.Highlighting these limitations of previous studies, Borah et al. (2021) used the hierarchical clustering method on past EQs for the NE region and identified 12 source zones, ensuring uniform distribution of past EQs in each zone.Further details on these identified source zones can be found in Borah et al. (2021).Figure 3b presents 12 source zones for NE India as per Borah et al. (2021), whch is used in the current work as SM2.

Determination of the seismic activity parameters for the NE India
The seismic activity parameters determine the frequency of various magnitude EQs to occur in the study region.In order to estimate these seismic activity parameters of the NE region, declustered EQ catalogue belonging to each source zone (in both source models) are to be separated first from the combined catalogue.While doing so, it is observed that some source zones have very less numbers for EQs than those required to estimate seismic activity parameters.For such source zones, EQs from adjoining source zones (which individually have less number of EQs) are combined to estimate seismic activity parameters for the combined source zone.For SM1 (Figure 3a), the source zones combinations considered are Similarly, for SM2 (Figure 3b), Zb5, Zb6, and Zb10 are source zones with very few number of EQ events.Thus, these source zones are combined (Zb5 þ Zb6 þ Zb10) to calculate the seismic activity parameters for SM2.Once zones with a sufficient number of EQs are ready, such catalogue is first to be checked for completeness with respect to magnitude and time as discussed in the next subsection, before estimating seismic activity parameters.

Completeness analyses
Before assessing seismic activity parameters from EQ catalogue, the completeness of catalogue with respect to magnitude and time must be checked.The magnitudes of completeness (Mc) for each of the earlier-mentioned zones (individual as well as combined) are determined using the maximum curvature (MAXC) method (Wiemer and Katsumata 1999;Wiemer and McNutt 1997;Wiemer and Wyss 2000).The plot of the total number of EQs greater than or equal to different magnitudes versus Once Mc values are known, completeness analysis with respect to time is performed using the CUVI method (Mulargia and Tinti 1985).For this, EQs in each source zone are divided into different magnitude classes (keeping DM ¼ 0.5 Mw) as: i)  5a, it can be seen observed that EQ occurrence is stable after the year 1963.Before 1963, the cumulative frequency of EQ occurrence does not follow any definite pattern.Thus, for magnitude class 4.3 Mw 4.7, the EQ catalogue for Zb1 is complete after 1963.Similar observations can be drawn from other plots of Figures 5a-f and 6a-f for Zb1 and Zb4, respectively.Summary of the duration of completeness for SM1 and SM2 are presented in Tables 2 and 3, respectively.
For the assessment of seismic activity parameters, only complete portion of EQ catalogues are used, as discussed in the next section.

Estimation of seismic activity parameters
The rate of exceedance (N) of EQ of a magnitude (M) in a region can be related to its M value as (Gutenberg and Richter 1956): Where "a" represents the logarithm of the total number of EQs with M !0 and "b" value is the relative chance of occurrence of smaller and greater magnitude EQs.Lower "b" values are generally associated with areas having high stress rates (Kalafat and G€ org€ un 2019).Thus, a temporal reduction in "b" value can be used to anticipate an impending main EQ event (G€ org€ un 2013).Further, low "b" values prior to a major seismic event are also found to be followed by high "b" values after the rupture (E.G€ org€ un et al. 2009).In the present work, seismic activity parameters are estimated by using MLM as LSM provides biased estimation of the "b" value.
In MLM, "b" (¼2.303b) is determined by solving the following maximum-likelihood condition (Weichart 1980); Where, m i ¼ value of the central magnitude of the i th interval, t i ¼ periods of completeness in number of years for the i th interval, m ¼ average value of magnitude in the catalogue.Once b (¼2.303b) is known, the total number of EQs per year greater than Mc (or N MC ) can be determined as (as per Weichart 1980); (3)  Table 2. Year of completeness for different ranges of Magnitude (Mw) for SM1.Where, N T stands for the total number of events.Once N MC is known, "a" parameter can be determined as; This way, the values of "a" and "b" can be obtained using MLM.Uncertainties in "a" and "b" values are also estimated by using Weichart (1980) method.Tables 4 and  5 summarise the values of "a" and "b" and the standard deviation in each parameter obtained from MLM for SM1 and SM2 respectively.

Source-wise distribution of seismic activity parameters
In the last section, values of seismic activity parameters are determined for the individual as well as combined source zones.In case a combined zone is considered for estimating seismic activity parameters, obtained seismic activity parameters must be redistributed to individual zones.Iyengar and Ghosh (2004) utilised the "Principle of Superposition" to redistribute "a" parameter from region to individual faults within that region on the basis of fault length and numbers of EQs.Iyengar and Ghosh (2004) considered the b value of all the faults within a region to be the same as b value of that region.
In the present study, referring to the work by Iyengar and Ghosh (2004), the "b" value of all individual source zones (which were combined to estimate seismicity parameters) is considered as same as the "b" value of the combined source zones.For the determination of a value, however, the number of EQs that had occurred on an individual source zone "k" is solely considered by defining the weight W k in redistribution as; Once W k is known, the activity "N k " for source zone, "k" can be obtained using Eq. ( 6) as;  Parameter N k in Eq. ( 6) is equal to aðm 0 ) (used in Eq. 9 later) which defines the frequency of m 0 (¼Mc) and above magnitude to occur on an individual source zone "k." Following the above procedure, aðm 0 ) and b values for all source zones of SM1 and SM2 are obtained.Tables 6 and 7 summarise aðm 0 ) and b values for each of the source zones of SM1 and SM2 respectively.
7. Maximum possible EQ magnitude (Mp) for each source zone EQ catalogue sometimes may not indicate the true potential of the seismic source which is very important for assessing the seismic hazard values.As a result, maximum possible magnitude (Mp) of a source can be larger than the largest observed magnitude (Mmax-obs) known on the source.Many statistical methods of Mp assessments (Kijko and Sellevoll 1989;Kijko and Graham 1998;Kijko 2004;Kijko and Singh 2011) are available which requires the information of the past EQs.Besides this, Mp also depends on other properties (length, rupture characteristics etc.) of the seismic source.Thus, in this work, two methods are used for calculating Mp.In the first method, based on fault length, Mmax (called as Mmax-fault) is calculated using the relation of Wells and Coppersmith (1994).For the calculation of Mmax-fault, the fault length is considered as the subsurface rupture length and the slip type is considered as all (Table 2A in Wells and Coppersmith 1994).
In the second method, Mmax is calculated using Robson-Whitlock-Cooke procedure (Kijko and Singh 2011).In this method, Mp is calculated by increasing the Mmax-obs by a suitable margin as suggested by Gupta (2002), which was widely used by various researchers (Bahuguna and Sil 2020;NDMA 2010) for the NE India.However, in Robson-Whitlock-Cooke procedure, the maximum possible magnitude (MmaxRWCÞ is estimated by using the following formula for truncated magnitude distribution (Kijko and Singh 2011).
Where; M obs max ¼maximum observed magnitude and M nÀ1 ¼ the second largest observed magnitude.
Once values of Mmax from both the above methods are known for each source zone, mentioned below four criteria are considered for choosing one Mp value for each source zone.Values of Mp for each source zone for SM1 and SM2, following the above-mentioned criteria are summarised in Tables 8 and 9, respectively.

GMPEs considered
In the declustered EQ catalogue developed earlier, M value varies from 4.3 to 8.8 (Mw).Further, based on the position of source zones, the epicentral distances can go as high as 400 km.Thus, referring to the works by Baro et al. (2018) and Baro et al. (2020), and taking the above range of magnitude and epicentral distance into account, three GMPEs, namely: Toro et al. (2002), NDMA (2010), and Anbazhagan et al. (2013) are selected for the current study.It should be noted here that GMPEs by NDMA (2010) and Anbazhagan et al. (2013) were developed for NE India and the Himalayan region, and GMPE by Toro et al. (2002) was developed for the subduction zone.All three GMPEs are valid for site class A condition.However to apply the GMPEs, the ranking (and subsequently the weight) of each GMPE is required to be found out.For the purpose, log-likelihood (LLH) ranking method is used in this work.In LLH ranking method, LLH factor is found out as (Scherbaum et al., 2009); Where, gðx i Þ ¼ likelihood of the model for observation x i and N ¼ number of observations.From LLH factors, the weights for each GMPEs can be obtained by the following formula; Where, LLH k ¼ LLH of the k-th GMPE and nG ¼ number of the GMPEs.
For the present work, a total 98 accelerograms from 17 EQs occurred in NE India, recorded at rock sites, are considered to estimate the LLH factor for all the GMPEs.The LLH factors and the weights obtained for each of the three GMPEs are shown in Table 10.

Uncertainties dealt in PSHA
PSHA methodology was developed primarily to account for uncertainty with respect to frequency of EQ magnitude, hypocentral distance and ground motion exceedance ( Reiter, 1990).Combining the above uncertainties, the frequency of ground motion exceedance (v z ) can be found as (EM-1110-2-6050, 1999); Where, v z ¼ annual frequency of exceedance of ground motion, P N ¼ summation of results from N numbers of seismic sources, k n ðm i Þ ¼ annual frequency of EQ occurrence of magnitude m i on n th seismic source, P n R ¼ r j m i j Þ À ¼ probability of occurrence of an EQ having magnitude m i on n th source at a distance r j away from the site of interest.P Z > zjm i r j À Á ¼ probability of exceedance of ground motion level z for an EQ of magnitude m i and located at a distance r j from the site of interest.It has to be mentioned here that source zones for SM1 and SM2 identified earlier are used as areal sources for PSHA in this work.Hence, in further discussion, the earlier mentioned source zones are called as areal sources.Further, only a brief discussion about the governing equations for quantifying previously mentioned uncertainties in PSHA is given.For detailed discussion, one can refer to EM-1110EM- -2-6050 (1999)).
The value of v z in Eq. ( 10) can be correlated with the probability of exceedance P E z ð Þ of ground motion z at a site, for a given exposure period (t) as; To solve Eq. ( 10), all the three uncertainties mentioned above are to be estimated individually first.In order to quantify the frequency of a particular magnitude to occur, firstly, the cumulative number of EQs of magnitude m [N M > m ð Þ] on a source can be quantified as; Where, b and aðm 0 Þ for all areal sources are taken from Tables 6 and 7. Further, Mc and Mp values estimated earlier for each areal source are m 0 and m u respectively for Eq. 10.Once the value of N M > m ð Þis known, the frequency of magnitude m i to occur on an areal source ½k n ðm i Þ can be calculated as (EM-1110-2-6050, 1999); Using Eqs. 12 and 13, the values of k n ðm i Þ for various values of m i are estimated for each areal source.
In the next step, probability of rupture with respect to location [P n R ¼ r j m i j Þ À ] is assessed.In doing so, it is presumed that the probability of EQ occurrence is same throughout the source.For an areal source, this means that within the areal source, the probability of rupture to occur is same throughout.Thus, the probability of occurrence of an EQ at a given epicentral distance "R" is calculated by taking the ratio of the area between R þ DR/2 and R-DR/2 (where DR is considered as 10 km in this work) to the total area of the areal source.This way, for each site of interest, the distance probability of rupture to take place is estimated.
Once the frequency of occurrence of various magnitude (m i ) of EQ in different areal sources and the probability of rupture to take place at a given distance (r j ) are estimated, uncertainty with respect to ground motion exceedance is to be dealt next.This uncertainty is accounted for by calculating the probability of exceedance of the ground motion parameter beyond a threshold value for a given m i and r j .The conditional probability distribution PðZ > zjm i r j Þ (as per Kramer 1996) can be estimated as; Where, F z ðZ Ã Þ is the value of the cumulative distribution function (CDF) of Z corresponding to m i and r j : In the case of GMPE, F z ðZ Ã Þ is primarily considered as lognormal distribution (EM-1110(EM- -2-6050, 1999) ) and can be expressed as follows; Where, ln z ð Þ ¼ logarithm of the ground motion level, ln z ð Þ ¼ mean natural logarithm of the ground motion level, S ln z ð Þ ½ ¼ standard deviation of the natural logarithm of the ground motion level, Ffg ¼ cumulative distribution function of a unit normal variable.Using Eqs. 14 and 15, the probabilities of ground motion exceedance using considered GMPEs are calculated based on earlier considered m i and r j values and considering z in increments of 0.05 g starting from a minimum value of 0.01 g.
Once all the three uncertainties mentioned above are known, these are combined using Eq. ( 10) while performing PSHA, as discussed in the next section.
10. Logic tree considered and overall uncertainty determination in PSHA Variations in seismotectonics, methods adopted and models used in SHA lead to variation in the estimation of the seismic hazard values at a particular location (Cramer 2001;Cramer et al. 2002).This variability is caused by both random natural processes (aleatory uncertainty) and uncertainties in knowledge and measurements (epistemic uncertainty).As discussed earlier, in order to incorporate uncertainty in predicted seismic hazard values, two source models (SM1 and SM2), and three GMPEs are considered in the present analysis using a logic tree (see Figure 7).Weights assigned to each component of the logic tree are shown in Figure 7.For the two source models, equal weights have been given for the analyses.For GMPEs, the weights obtained in section 8 are assigned, keeping the sum of the total weights as 1.0.
In the present study, by using the declustered EQ catalogue developed earlier, areal sources have been identified in section 4, values of parameters have been estimated in sections 5, 6 and 7, GMPEs selected is discussed in section 8, and the methodology adopted is described in section 9, PSHA is performed considering each logic tree branch (see Figure 7).There are two methods for estimating the final probable seismic hazard level from a logic tree.In the first method, the hazard level is first estimated for each alternative model, which is then weighted in accordance with the weights provided in the logic tree.In the second method, the logic tree is sampled by using a sampling technique to build a distribution of alternative models, and then hazard levels are estimated from the same (Cramer 2001), which in turn is used to estimate the uncertainty/variability of the final hazard levels.Generally, uncertainty/ variability is expressed in terms of either standard deviation or coefficient of variation (CoV) along with the mean hazard value.The CoV can be defined as the ratio of standard deviation to the mean of any distribution.Though the mean hazard level represents the best estimation of seismic hazard for the region (Cramer 2001), the CoV determination is also important as it describes the confidence in the mean hazard level estimation (Das et al. 2016).In other words, a CoV map gives a visual idea of the effect of the lack of knowledge in the estimated seismic hazard levels (Cramer et al. 2002;Cramer 2001).In the present work, the uncertainties associated with limited knowledge are addressed by using the Monte Carlo approach to sample the logic tree used for PSHA.A total 10000 Monte Carlo samples of the logic tree are generated to obtain the multiple values of possible seismic hazard levels.Based on these values, mean hazard level values and CoVs are calculated.Besides, hazard levels are also obtained for four percentiles, 5th, 16th, 95th and 84th.Hazard levels at different percentiles help in comparing the variability of the estimated hazard values.

Results and discussion
For the analysis, the entire NE is divided into grids of 0.5 Â 0.5 .Then for each grid mean hazard level and CoV of the hazard level are estimated for 10% and 2% probability of exceedance in 50 years by following the method described in section 10.
Further, considering the magnitude and distance bins listed in Table 11, the deaggregations of PSHA for mean PGA corresponding to 10% probabilities in 50 years and 2% probabilities in 50 years are carried out for eight cities of NE India.The selection of these magnitude and distance bins is inspired by the bins suggested by USNRC RG-1.2081RG-1. .208 (2007)).Exceedance approach is used for deaggregation of the seismic hazard.To do so, firstly, PGA values corresponding to 10% probabilities in 50 years and 2% probabilities in 50 years are estimated for the eight cities.Then the annual rate of exceedance of the estimated PGA values are obtained for the different combination of the selected magnitude and distance bins by using Eq.10.This way contribution of each combination of the magnitude and distance bins are obtained in the deaggregation of the seismic hazard.Besides this, the deaggregation of seismic hazard in latitude and longitude is also performed in the present study.The entire area is divided into 0.1 Â 0.1 grid and contribution toward PGA values from each Table 11.Various magnitude bins and distance bins considered for deaggregation of hazard.
Detailed discussions on the results obtained are given in different subsections below.

Spatial variations in spectral acceleration for NE
Figure 8a-d presents contours of mean Spectral Accelerations (Sa) at 0, 0.1, 0.2 and 1 s periods corresponding to 10% probability of exceedance in 50 years (also known as design basis earthquake or DBE) for NE.It can be observed from Figure 8a that the present study shows variation in PGA from 0.27 to 0.51 g within NE India.The highest seismic hazard level is observed in the eastern part of Manipur and in the southern part of Nagaland (Figures 8a).For Manipur, PGA varies from 0.35 to 0.51 g.Similarly, for Nagaland, PGA varies from 0.27 to 0.43 g.Considering seismic history (Figure 2), it can be observed that the density of past EQs and the magnitudes are very high near Manipur and Nagaland.Further, both these states lie in the Indo-Burma ranges, having high seismic activity.
The lowest seismic hazard value is observed in Sikkim and parts of Arunachal Pradesh and Assam.For Sikkim, the PGA varies from 0.27 to 0.31 g.Though some major and great EQs have occurred near Sikkim in the past, regional seismic activity is relatively low.This can be understood from the fact that the number of past EQs in the region is lower in comparison to adjoining regions.
A small segment in the eastern part of Assam and the south-central part of Arunachal Pradesh show PGA ranging from 0.23 to 0.27 g.Observing past EQs (Figure 2) indicate that this region has a significantly lesser number of past EQs along with the presence of major to great EQs.Further, a high PGA value (0.35-0.39 g) is observed in the middle parts of Assam.It is observed that this region falls close to the Bhutan Himalaya and the Shillong Plateau, where 1897 Assam EQ had occurred.Part of Assam (near the Indo-Burma ranges) shows higher PGA (>0.35 g).Similar observations can also be made from western and eastern parts of Arunachal Pradesh, where PGA varies from 0.35 to 0.39 g.These regions witnessed a high density of past EQs, and higher magnitude EQs (see Figure 2).Based on the above discussion, it can be concluded that both Assam and Arunachal Pradesh consists of regions which have experienced rare to very frequent EQs in the past (Figure 2).As a result, wide variation in PGA values can be witnessed in the above two states (from 0.23 to 0.39 g).
Other states of the NE do not show such wide variation in PGA values.Meghalaya has PGA varying from 0.27 to 0.39 g; Tripura has PGA variation from 0.27 to 0.35 g, and Mizoram has PGA variation from 0.31 to 0.39 g (see Figure 8a).It should be noted here that Meghalaya has a history of significant numbers of strong, major and great EQs.Similarly, many strong to major EQs had occurred near Tripura and Mizoram.However, based on the distribution of PGA for the NE, it can be stated that the highest PGA values are observed near the Indo-Burma ranges, followed by the Bhutan-Himalaya and then the Shillong Plateau.In other words, though the NE has experienced a significant number of EQs on regional faults, seismic sources governing the regional seismic hazard values are plate boundaries.Similar observations can be made from Figure 8b-d for 0.1, 0.2 and 1 s.
As mentioned earlier, to evaluate the variability/uncertainty of the estimated seismic hazard levels, CoVs are calculated.The CoV map of hazard level for 10% probability in 50 years is depicted in Figure 9a-d as contours corresponding to Sa at 0, 0.1, 0.2 and 1 s.The CoV of PGA within NE India varies from 0.1 to 0.4, which indicates that the variability in the estimated PGA value is low.However, for Sa at 1 s, location wise CoV values are higher as compared to PGA or other time periods.Figure 10a-d shows the spatial variation in PGA for 10% probability in 50 years for SC A at 5th, 16th, 84th, and 95th percentile.The mean PGA value is closer to the PGA value at the higher percentile.
Similarly, Figure 11a-d presents PSHA outcomes for 2% probability of exceedance in 50 years (also called Maximum Credible Earthquake or MCE) at 0, 0.1, 0.2 and 1 s, respectively.It is observed that for MCE, PGA values in the NE region vary from 0.40 to 0.85 g as per the present analysis.Further, similar to DBE, even MCE-based assessment shows that the dominating seismic sources include the Indo-Burmese range, followed by the Bhutan Himalayas and the Shillong Plateau.As mentioned earlier, the CoV values are calculated to find out the variability/uncertainty corresponding to MCE-based hazard level estimation.The CoV map of hazard levels for Sa at 0, 0.1, 0.2 and 1 s corresponding to 2% probability of exceedance in 50 years is depicted in Figure 12a-d.The CoV of PGA within NE India varies from 0.1 to 0.4.The location-wise CoV values corresponding to Sa at 0.1, 0.2 and 1 s are higher than the one estimated for PGA.It is notable that a similar phenomenon was observed for DBE as well.Figure 13a-d shows the spatial variation in PGA for 2% probability in 50 years for SC A. at 5th, 16th, 84th, and 95th percentile.Similar to the DBE, in case of MCE also the mean PGA value is closer to the PGA value at the higher percentile.
Table 13 compares DBE-based PGA values for eight cities (see Table 12 and Figure 14 for the location) of NE India obtained based on the present work and previous studies.The hazard curves for PGA values obtained in various MonteCarlo realisation for all the eight cities are also shown in Figure 15.It is observed that the present findings are slightly different from the findings by NDMA (2010) and Pallav et al. (2012).Further, present values are significantly higher than the works by Das et al. (2016), Baro et al. (2020), Sil et al. (2013) and lower than the studies by Sharma   and Malik (2006), Nath and Thingbaijam (2012), Bahuguna and Sil (2020) and Ghione et al. (2021).The spatial mean hazard level distribution of the present study shows comparable results with Sitharam and Kolathayar (2013), and Nath and Thingbaijam (2012).The primary reasons for variations in the seismic hazard values between any two studies are changes in EQ catalogues and the values of seismic activity parameters.The SM2 used in this work ensured uniform distribution of seismic activity in each source zone.This can be one significant attribute that makes the present findings more reliable than earlier studies.
In order to understand about contributing sources in a better way, deaggregations of EQ hazard in terms of magnitude and distance (M-R), and longitude and latitude are done, as discussed in the next section.

Deaggregation of EQ hazard for the important cities in NE India
This section explains the observations based on separate deaggregation plots  for 8 important cities of NE (see Figure 14).
Figures 16a and 17a show the M-R deaggregation plot (for all the seismic sources combined) for Guwahati city centre (91.74 E, 26.14 N) corresponding to DBE and    MCE, respectively.For DBE, EQs in magnitude bin of 6-6.5 Mw and distance bin of 25-50 km have the highest contribution towards PGA, as can be observed from Figures 16a.In addition, moderate EQs (Mw-5 to 5.9) with distance upto 100 km and major EQs (Mw-7 to 7.9) with distance from 50 km to 100 km are also contributing towards regional seismic hazard significantly.Based on the above distance range, the areal sources with higher contributions are Za7, Za8, Za11, Za12, and Zb7.Each of these seismic sources has experienced a large number of strong and major EQs that have occurred in the past.Further, from Figure 18a, it can be found that faults such as Atherkhet Fault, Kulsi Fault, Dudhonoi fault, Kapili fault, Dighalpani Kakijan fault, MCT, MBT, MFT, Dauki fault, Oldham Fault, and Barapani are contributing significantly toward hazard level.The epicentre of 1897 Shillong great EQ lies within 200 km of Guwahati city.While observing Figure 16a, it can be concluded that the contribution from great EQ towards the seismic hazard of Guwahati city centre though is significant, it is comparatively lesser than the contribution from strong to major EQs.In other words, EQs with magnitude in the range of moderate, strong and major EQs, happening within 100 km radial distance, are contributing more to the seismic hazard of Guwahati in comparison to great EQs located within 200 km distance.This is primarily due to the fact that the larger is the EQ magnitude, the lesser is the probability of its occurrence within a defined time period.Figure 17a presents deaggregation plots for MCE for Guwahati city centre.It can be observed from Figure 17a that EQs in the magnitude range of 7 to 7.5 shows the highest contribution for MCE.In other words, with an increase in the return period, there is an increase in percentage contribution from higher magnitude EQs.In similar manner, deaggregation for 8 cities of NE are presented in Figures 16a-h   17a-h corresponding to DBE and MCE, respectively.Further, Table 14 summarises the deaggregation results referring to Figures 16 and 17a-h.Collectively, it can be concluded that depending on the position of each city with respect to important sources, contributions from EQs in different magnitude and distance bins are different.These contributions mainly depend on the past seismic activity around the cities.Due to this reason, different cities in NE India show significantly varying seismic hazard values.In addition, current findings also show that, except for Itanagar, strong EQs play a dominant role in controlling PGA of DBE and MCE.For Itanagar city, however, the contribution from major EQs, which had happened at larger distances, is the most.Further, for all the cities, the contributions from larger magnitudes is more in the case of MCE than DBE.This is primarily due to the fact that MCE has a larger return period in comparison to DBE.
It is also observed from the present results that the contribution of the great EQs is significant for Guwahati, Gangtok and Shillong (Figures 16 and 17) primarily because these cities are close to past great EQs.Other cities, however, are either located far away from the locations of great EQs or have a larger number of strong to major EQs from nearby sources, which overshadow the contribution of the great EQs.However, such contribution from great EQs may increase with the increase in the return periods.
The present study also finds that irrespective of the magnitudes, EQs in distance bins >200 km have a negligible contribution to seismic hazard values (both in DBE and MCE).Further, the contribution from EQs with magnitudes less than 4.5 Mw towards regional seismic hazard is significantly lower.

Conclusions
The present work performs PSHA for NE India by considering a logic tree consisting of two source models, one method of determination of seismic activity parameters, and three GMPEs.In addition, a total of 10000 numbers of Monte Carlo sampling of the logic tree is carried out to obtain the mean hazard levels for DBE and MCE and the variability/uncertainty of the results throughout NE India.Based on the analysis, for DBE, mean PGA is found to vary from 0.27 to 0.51 g within NE India.CoV, which represents the uncertainty of the estimated PGA, is between 0.1 and 0.4 for DBE and MCE.Lower CoV values (<0.4) in the case of PGA determination represent a low variability in the estimated PGA values throughout NE India.Spatial distribution of PGA for 5th, 16th, 84th and 95th percentile are also found out for DBE and MCE in the present study.It is found that, the mean PGA values are more closer to higher percentile PGA values than the lower percentile PGA values.PGA values Based on the PSHA results, highest seismic hazard level is found in the eastern part of Manipur and the southern part of Nagaland.This is purely attributed to the very high density of past EQs and magnitude of above locations.Further, both these locations are close to Indo-Burma ranges.The lowest seismic hazard is found in Sikkim and parts of Arunachal Pradesh and Assam.It is observed that the density of past EQs is relatively low close to Sikkim and parts of Arunachal Pradesh and Assam.
The deaggregation plots obtained for eight important cities of NE based on SHA help in understanding the contribution of seismic sources around the cities.For Guwahati city centre, seismic sources such as the Atherkhet Fault, Kulsi Fault, Dudhonoi fault, Kapili fault, Dighalpani Kakijan fault, MCT, MBT, MFT, Dauki fault, Oldham Fault, and Barapani are controlling the seismic hazard value.Further, though the epicentre of great EQs are within 200 km radial distance from Guwahati, their contribution is relatively less.In fact, EQs in the magnitude range of 6 to 7.5 happening within 100 km radial distance have more contribution.For Agartala, EQs originating from the Dauki fault, and the Sylhet fault, having a magnitude range from 6 to 7.5, contribute most to the city's seismic hazard value.
Deaggregation plot for Imphal suggests that EQs from Indo-Burma Ranges, having magnitude in the range of 6 to 6.5, contribute most to seismic hazard values.For Shillong city, EQs originating from the Kulsi Fault, Dudhonoi fault, Kapili fault, Barapani shear, Dauki fault, Oldham Fault, Eocene Hinge Zone, and Sylhet fault, having magnitude range from 5.5 to 7.5 are found to be controlling the seismic hazard values.
For Aizawl, EQs originating from Indo-Burma Range, with a magnitude range from 6 to 7.5, contribute most to the city centre's seismic hazard value.Similarly, EQs originating from the CMF, Kapili faults, Disang Thrust, Dighalpani Kakijan fault and Naga Thrust having a magnitude range from 6 to 6.5, contribute most to Kohima's seismic hazard values.
For Gangtok city centre, EQs originating from MCT, MBT and MFT, with magnitudes ranging from 6.5 to 8, contribute most to the seismic hazard value.For Itanagar, on the other hand, EQs originating from Kopili fault, Dighalpani-Kakijan fault, Naga Thrust, MCT, MBT, and MFT, having magnitude range of 6.5 to 7.5 and epicentral distance of 25 to 100 km are contributing most to the seismic hazard value.
Overall, based on the present work, it has been observed that strong EQs mainly control the seismic hazard in NE India, both for MCE and DBE.While Itanagar receives most contributions from major EQs that had happened at a larger distance, other cities have significant contributions from larger EQs happening at close by distances.

Figure 2 .
Figure 2. Past EQs (main events only) in and around of NE India.

Figure 5 .
Figure 5. Plots of cumulative number of EQs versus year to obtain years of completeness for Zb1 of SM2.

Figure 6 .
Figure 6.Plots of cumulative number of EQs versus year to obtain years of completeness for Zb4 of SM2.
W k ¼ Number of EQs within individual source zone k Number of EQs in the combined source zone (5) 1.If Mmax-obs < Mmax-fault < MmaxRWC, then Mmax-fault is considered as Mp. 2. If Mmax-fault < Mmax-obs, Mmax-obs is considered as Mp. 3. If Mmax-fault > MmaxRWC by a small amount (up to 0.2 Mw), Mmax-fault is considered as Mp. 4. If Mmax-fault > MmaxRWC by larger amount (more than 0.2 Mw), MmaxRWC is considered as Mp.

Figure 7 .
Figure 7. Logic-tree considered for PSHA in this work.

Figure 8 .
Figure 8. Spatial variation in mean seismic hazard level for 10% probability in 50 years for SC A. (a) PGA (b) Sa at 0.1 s (c) Sa at 0.2 s (d) Sa at 1 s.

Figure 9 .
Figure 9. Spatial variation in CoV of seismic hazard level for 10% probability in 50 years for SC A. (a) PGA (b) Sa at 0.1 s (c) Sa at 0.2 s (d) Sa at 1 s.

Figure 11 .
Figure 11.Spatial variation in mean seismic hazard level for 2% probability in 50 years for SC A for (a) PGA (b) Sa at 0.1 s (c) Sa at 0.2 s (d) Sa at 1 s.

Figure 12 .
Figure 12.Spatial variation in CoV of seismic hazard level for 2% probability in 50 years for SC A. (a) PGA (b) Sa at 0.1 s (c) Sa at 0.2 s (d) Sa at 1 s.

Figure 14 .
Figure 14.Important cities in each of the eight states of NE India (Stars show the location of the cities) considered for deaggregation.

Figure 15 .
Figure 15.Hazard curves for PGA obtained in various MonteCarlo realisation for the eight cities of NE India.
Figure17apresents deaggregation plots for MCE for Guwahati city centre.It can be observed from Figure17athat EQs in the magnitude range of 7 to 7.5 shows the highest contribution for MCE.In other words, with an increase in the return period, there is an increase in percentage contribution from higher magnitude EQs.In similar manner, deaggregation for 8 cities of NE are presented in Figures16a-h and

Figure 16 .
Figure 16.Deaggregation plot for eight important cities considering seismic hazard level for 10% probability of exceedance in 50 years.

Figure 17 .
Figure 17.Deaggregation plot for eight important cities considering seismic hazard level for 2% probability of exceedance in 50 years.

Table 1 .
Summary of SHA for NE India done by previous researchers.
Mw !8.3.Figures 5a-f and 6a-f show typical plots of the cumulative number of EQs versus year for Zb1 and Zb4.From Figure

Table 3 .
Year of completeness for different ranges of Magnitude (Mw) for SM2.

Table 4 .
Seismicity parameters obtained for areal sources of SM1.

Table 5 .
Seismicity parameters obtained for areal sources of SM2.

Table 6 .
Seismicity parameters of areal sources of SM1 used for PSHA.

Table 7 .
Seismicity parameters of areal sources of SM2 used for PSHA.
Tables 8. Mp estimation for areal sources of SM1.

Table 10 .
Log Likelihood (LLH) values and weight obtained for the GMPEs.

Table 12 .
Most important cities in all the states of NE India.

Table 13 .
Comparison of DBE based PGA (in g) obtained in the present study with existing study.