Earthquake Prediction: Analogy with Forecasting Models for Cyber Attacks in Internet and Computer Systems


Introduction
Currently, the security of the cyber space (computer networks and the Internet) is mostly based on detecting and/or blocking attacks with Intrusion Detection and Prevention Systems (IDPS) (National Institute of Standards and Technology [NIST SP800-94], 2010). However, IDPS offer limited protection because they rely on postmortem approaches: threats and attacks are identified and/or blocked only when they can already inflict serious damage on the computer systems, either while the attacks are happening or after they have imposed losses on the systems (Haslum et al, 2008). Earthquakes show the same kind of limitation: once an earthquake has begun, devices can provide warnings only a few seconds before major shaking arrives at a given location (Bleier & Freund, 2005), (Su & Zhu, 2009). In the cyber space context, a small number of studies in the last few years have approached prediction techniques for cyber attacks in order to overcome the deficiency of late warnings (Pontes & Zucchi, 2010), (Haslum et al, 2008), (Lai-Chenq, 2007), (Yin et al 2004).

Motivation
Although studies based on 1) historical earthquake records and 2) monitoring of the earth's surface have helped to map affected regions, short-term earthquake predictions are not yet effective (Bleier & Freund, 2005). Some researchers are studying and correlating signals gathered in the ionosphere that can precede earthquakes, such as odd radio noise and lights in the sky. According to (Bleier & Freund, 2005), "both the lights and the radio waves appear to be electromagnetic disturbances that happen when crystalline rocks are deformed--or even broken--by the slow grinding of the earth that occurs just before the dramatic slip that is an earthquake". Some earthquakes were preceded by signals and disturbances such as the following reported ones:
- Loma Prieta, San Francisco, 1989: two weeks before a 7.1-magnitude earthquake, strong signals of magnetic disturbance (20 times the normal background noise at the 0.01 Hz frequency) were detected. Three times before the quake the signals jumped to 60 times normal size at the 0.01 Hz frequency;
- Spitak, Armenia, 1988: signals occurred shortly before a 6.9-magnitude quake;
- Guam, Pacific Ocean, 1993: signals were observed before an 8.0-magnitude quake;
- Parkfield, California, 2003: nine hours before a 6.0-magnitude quake, spikes of activity four to five times normal size (0.2 to 0.9 Hz frequency) were detected;
- Taiwan, 1999: sensors registered an unusually large disturbance in a normally quiet signal before the 7.7-magnitude earthquake. Researchers calculated the current required to generate those magnetic-field disturbances: between 1 million and 100 million amperes.

Those examples show that the occurrence of electromagnetic signals does not justify a public warning, but it is an important source of data for forecasters and is also useful for directing the course of research on earthquake prediction, such as research on changes in the conductivity of the air over the quake zone caused by current welling up from the ground, which contribute to the formation of the so-called earthquake lights in the Mojave Desert (Fig. 1).

Fig. 1. Earthquake lights (Bleier & Freund, 2005)

There are some theories about how these signals are generated, but the details are not yet conclusive. Nevertheless, the electromagnetic effects of the signals can be detected in a number of ways (see Fig. 2). Ground-based sensors monitor changes in the low-frequency magnetic field and measure changes in the conductivity level of the air. Satellites monitor the noise level at extremely low frequency and monitor the infrared light which is probably emitted when rocks are deformed or even broken. Some examples:
- after the 1989 Armenia earthquake, Extremely Low Frequency (ELF) electromagnetic disturbances were observed by a Soviet Cosmos satellite for a month;
- a U.S. satellite detected ELF bursts before and after a 6.5-magnitude earthquake in California in 2003;
- in 2004 France launched a satellite for Detection of Electro-Magnetic Emissions Transmitted from Earthquake Regions (DEMETER), which unfortunately malfunctioned.
Fig. 2. Electromagnetic signals detection (Bleier & Freund, 2005)

According to (Bleier & Freund, 2005), "infrared radiation detected by satellites may also prove to be a warning sign of earthquakes to come". In China, satellite-based instruments registered several instances of infrared signatures with a jump of 4 to 5 °C before some earthquakes during the past two decades. Sensors in NASA's Terra Earth Observing System satellite registered what NASA called a thermal anomaly on 21 January 2001 in Gujarat, India, just five days before a 7.7-magnitude quake there; the anomaly was gone a few days after the quake (Fig. 3). According to (Bleier & Freund, 2005), in both cases researchers believe these sensors detected an infrared luminescence generated by the recombination of electrons and holes, not a real temperature increase.
Fig. 3. Infrared radiation detected by satellites (Bleier & Freund, 2005)

The connection between large earthquakes and electromagnetic phenomena in the ground and in the ionosphere is becoming increasingly solid. Researchers in many countries, including China, France, Greece, Italy, Japan, Taiwan, and the United States, are now contributing to the data by monitoring known earthquake zones. Some correlations between historical data can be traced as well: monitoring 144 earthquakes (1997-1999), Taiwanese researchers noticed significant changes in the electron content of the ionosphere some days before quakes of magnitude higher than 6. Therefore, the integration of: (1) several types of sensors (ground and space-based), (2) a network to bring together those signals, (3) a good distribution of the sensors (several sensors in a large area), (4) several types of detection (Ultra Low Frequency (ULF), ELF and magnetic-field changes, ionospheric changes, infrared luminescence, and air-conductivity changes, along with traditional mechanical and GPS monitoring of movements of the earth's crust), and (5) the correlation of all data gathered, could make forecasts more reliable.

Analogy with forecasting in cyber security
Cyber attacks can be classified as a set of actions intended to compromise the integrity, confidentiality or availability of computer systems. Cyber attacks can be caused by users or malicious software, which try to obtain access, to use systems in an unauthorized way, or to enumerate privileges (NIST SP800-94, 2010). [...] (McPherson & Labovitz, 2010) as well. DDoS attacks are expected to become more common against independent media and human rights sites in 2011, following the recent highly publicized DDoS attacks on Wikileaks and the "Operation Payback" attacks by "Anonymous" on sites perceived to oppose Wikileaks (Zuckerman et al, 2010).
According to (Pontes et al, 2008), (Pontes & Guelfi, 2009a), (Pontes & Guelfi, 2009b), (Pontes & Zucchi, 2010), an early warning system that exposes, through forecasting analysis, a future trend of an increasing number of cyber attacks may influence decisions on the adoption of security devices (e.g. rules in IDPS combined with rules in firewalls) before incidents happen, according to the needs. However, three major gaps remain in the studies about forecasting of cyber attacks: a) the use of few sensors and/or sensors employed only locally; b) the use of just one forecasting technique; and c) the lack of information sharing among sensors to be used for correlation (Pontes & Guelfi, 2009a). Correlation of information between IDPS and forecasters means looking for similar characteristics that may be related (Pontes & Guelfi, 2009a), (Abad et al, 2003). Through correlation it is possible to eliminate redundant and false data, to discover attack patterns and to understand attack strategies (Zhay et al, 2006). Nevertheless, forecasts and alert correlation may be challenging, as they depend on the reliability of the source of the security alerts (Silva & Guelfi, 2010). Therefore, the precision level of the detection tools is an important issue for validating correlations. Multi-correlation, i.e. the integration of alerts with information from different sources (e.g. monitoring tools or operating system logs), can allow a new classification of alerts, improving the accuracy of the results (Abad et al, 2003), (Zhay et al, 2006). References (Abad et al, 2003), (Zhay et al, 2006), (Zhay et al, 2004) employed multi-correlation; however, they neither analyzed in detail the influence of isolated alerts on the false positive (FP) rates nor applied forecasting techniques to predict future attacks.

Forecasting analysis in the information security area can be similar to the forecasting methodologies used in other fields: meteorology, for instance, uses sensors to capture data about temperature, humidity, etc (Lajara et al, 2007), (Lorenz, 2005); seismology employs sensors to capture electromagnetic emissions from the rocks (Bleier & Freund, 2005); in economics, specifically the stock market, data is collected from diverse companies (annual profit, potential customers, assets, etc) to draw trends about the shares of companies (Prechter & Frost, 2002), (Mandelbrot & Hudson, 2006). For any field, formal models can be applied to predict events from the collected data. But before applying formal models, data regarding different kinds of variables should be correlated (Armstrong, 2002). According to (Armstrong, 2002), to obtain more accurate and realistic predictions it is suggested: (1) to use diverse forecasting techniques; (2) to analyze information regarding diverse variables and acquired data, from sensors for instance; (3) to employ diverse kinds of forecasting models. Concerning cyber attacks, (Lai-Chenq, 2007), (Yin et al 2004) employed forecasting models; however, they used just one formal method for predicting events and did not make use of any kind of correlation process.

In this chapter, security events are actions or processes that have an effect on the system, regardless of the kind of effect; in other words, actions that could have positive or negative effects on the system. Security alerts, on the other hand, are types of security events that indicate anomalous activities or cyber attacks (Silva & Guelfi, 2010). In our earlier works we proposed the Distributed Intrusion Forecasting System (DIFS) (Pontes & Guelfi, 2009), (Pontes & Zucchi, 2010), which covered the following gaps of today's forecasting techniques in IDPS: a) the use of few sensors and/or sensors employed only locally for capturing data; b) the use of just one forecasting technique; and c) the lack of information sharing among sensors to be used for correlation. Nevertheless, we faced a huge amount of alerts, which could have a negative influence on the forecasting results.

Proposal
The goal of this chapter is to propose a Distributed Intrusion Forecasting System (DIFS) with a two-stage system which allows: (1) in the first stage, correlating security alerts using an Event Analysis System (EAS); and (2) in the second stage, applying forecasting techniques to the data (historical series) generated by the previous stage (EAS). The DIFS works with prediction models and sensors acting at different network levels (host, border and backbone), which enables the use of different forecasting techniques (e.g. Fibonacci sequence and moving averages), cooperation among points of analysis and the correlation of predictions. In addition to the main goal, this chapter aims to propose an analogous approach for earthquake prediction. The intended results are to increase the reliability of incident predictions (e.g. earthquake incidents, cyber attacks), to prevent incidents in a proactive manner and to improve the risk management employed for the security of the homeland cyber space. A proof of concept of the DIFS architecture is presented, which supports conclusions about the improvement of forecasts in the cyber space; furthermore, tests applied over two datasets, (Defense Advanced Research Projects Agency [DARPA], 1998) and (Knowledge Discovery and Data Mining Tools Competition [KDD], 1999), with an IDPS have shown that the employed techniques define incident trends. This chapter is organized as follows: the state of the art concerning forecasting and event correlation in IDPS is in section 2. Section 3 introduces the proposal of this chapter: the DIFS and the two-stage system for correlation regarding cyber attacks. Section 4 presents details about the tests and the environment used to validate the proposal. Results are analyzed in section 5, and section 6 summarizes conclusions and suggestions for new studies.

State of the art - Cyber attacks, event correlation and forecasting
In this section we approach event correlation for detecting cyber attacks, the forecasting methods used to predict cyber attacks and the Distributed Intrusion Forecasting System (DIFS) (Pontes & Guelfi, 2009), (Pontes & Zucchi, 2010).

Unwanted internet traffic and cyber attacks
The expression "unwanted traffic" was first introduced in the eighties and it has always been related to malicious activity such as worms, viruses, intrusions, etc (Feitosa et al, 2008).
Reference (Feitosa et al, 2008) defines unwanted Internet traffic (UIT) as unproductive and useless traffic, with malicious (worms, scans, spam) and benign (wrong settings in the routers) events. Reference (Soto, 2005) completes this definition: UIT may result from noise in the telecommunication network. (Andersson et al, 2007) classified UIT as malicious or useless traffic, with the objective of compromising vulnerable hosts and spreading malicious code, spam, DoS and DDoS. UIT may also be junk traffic, background traffic and anomalous traffic. Symposiums and workshops have been held on the issue of UIT, like the ones promoted by the Internet Architecture Board (IAB) in March 2006 (Andersson et al, 2007) and April 2008: the intention was to share information among people from different fields and organizations, fostering an interchange of experiences, views, and ideas between the various research communities. As a result, Request for Comments (RFC) 4948 details the UIT types, the main causes, the existing solutions and the actions to be taken in the short and long term. It was decided in this workshop that some other research topics about UIT would be managed by the IAB, the Internet Engineering Task Force (IETF) and the Internet Research Task Force (IRTF).
According to (Feitosa et al, 2008), several of the losses caused by UIT are due to the inefficiency of today's techniques and security devices (anti-spam, antivirus, Intrusion Detection and Prevention Systems (IDPS) (NIST, 2010), firewalls), whether for detecting and preventing intrusions or for treating the UIT. Furthermore, the high rates of false positives and false negatives and the lack of a forecasting approach for Internet traffic are some of the reasons for the increase in UIT. Internet attacks continue apace, with UIT such as phishing, spam, and distributed denial of service attacks increasing steadily. However, it is important to classify whether traffic is unwanted or not, as in the cases of VoIP (Skype), peer-to-peer (P2P), instant messengers (MSN, Google Talk, ICQ) and online social networks. The classification may differ from one company to another, from user to user and from country to country. China, for instance, does not allow calls from Skype to telephones. Another example: UIT is classified differently in routers for backbone providers and in routers for small companies (Feitosa et al, 2008).

Approaches for correlation of security events
Correlation techniques for security events can be classified into three categories: (1) rule-based, (2) anomaly-based and (3) based on causes and consequences (Prerequisites and Consequences (PC)) (Abad et al, 2003). The rule-based method requires some prior knowledge about the attack, so the target machine has to pass through a preparation phase called training. The goal of this phase is to make the target machine able to precisely detect the vulnerabilities for which it was trained (Abad et al, 2003), (Mizoguchi, 2000). The gaps of the rule-based method are: (1) it is computationally intensive; (2) it results in lots of data; (3) it works only for known vulnerabilities.
The anomaly-based method analyzes the network data flow using correlation with statistical methods, the accumulation of gathered information and observations of the deviations that occur in the network data flow, in a manner that allows new attacks to be detected. For instance, (Manikopoulos & Papavassiliou, 2002) demonstrates a system for detecting anomalies which is characterized by monitoring several parameters simultaneously. Reference (Valdes & Skinner, 2001) presents a probabilistic correlation proposed for IDPS, based on data fusion and multiple sensors. However, the anomaly-based method cannot detect anomalous activity hidden in a normal process if it is performed at very low levels. Besides, as this method analyzes normal processes and reports only deviations from them, it is not suitable for finding the causes of attacks (Ning et al, 2001).
The PC method relies on connections between causes (conditions for an attack to be true) and consequences (results of the exploitation of a cause), in order to correlate alerts based on the gathered information. This method is suitable for discovering attack strategies. Both causes and consequences are composed of information concerning attributes of alerts (specific features belonging to each alert) and are correlated. An arrangement of attributes is called a tuple. According to Fig. 4, for a connection to be valid, a preparatory alert must have in its consequences at least one tuple which is repeated in the causes of the resulting alert.
In other words, the preparatory alert contributes to the construction of the resulting alert, and therefore the two can be correlated. For this connection, illustrated by Fig. 4, the timestamp of the preparatory alert has to come before that of the resulting alert (Silva & Guelfi, 2010), (Pontes & Guelfi, 2010), (Ning et al, 2001). In order to reduce complexity, correlation can be shown in graphs where alerts are represented by nodes and connections (correlations between alerts) are depicted by arrows. Yet some gaps in the PC method may be mentioned, such as the difficulty in obtaining the causes and consequences of alerts (Pietraszek & Tanner, 2005), the impossibility of analyzing isolated alerts (alerts that are not correlated) and the fact that missed attacks are hard to correlate. An alternative to minimize the problem is to apply complementary correlation techniques (Morin & Debar, 2003), using sensors that work in cooperation to supervise the environment and minimize missed detections. There are two techniques to map IDPS alerts to logs obtained from other sources: descending analysis and ascending analysis (Abad et al, 2003), (Silva, 2010). Descending analysis is based on the investigation of attacks that have occurred, verifying (correlating) whether or not other logs (e.g. logs from the O.S.) contain traces of the attack incident.
For an attack that has occurred, other traced logs (e.g. Operating System logs) can be analyzed based on timestamp. This type of analysis is useful to trace evidence about strategies of events, in order to map attacks to their source.
The ascending technique is used to discover attacks through the analysis of several logs. Once an anomaly is detected in one of these logs, the other logs are checked based on timestamp.
Although the ascending technique is computationally intensive, it allows new attacks to be detected.
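The prerequisite/consequence connection described above can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the `Alert` structure, the tuple layout and the alert names are hypothetical and are not taken from (Ning et al, 2001) or from the EAS implementation. Two alerts are connected when the preparatory alert is earlier in time and at least one tuple of its consequences is repeated in the causes of the resulting alert; alerts with no connection are reported as isolated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert record; field names are assumptions for illustration.
    name: str
    timestamp: float          # seconds since an arbitrary reference time
    causes: frozenset         # tuples that must hold for the attack to succeed
    consequences: frozenset   # tuples that hold once the attack has succeeded


def correlate(alerts):
    """Return (preparatory, resulting) name pairs linked by a shared tuple."""
    edges = []
    for prep in alerts:
        for result in alerts:
            earlier = prep.timestamp < result.timestamp
            shared = prep.consequences & result.causes
            if earlier and shared:
                edges.append((prep.name, result.name))
    return edges


def isolated(alerts, edges):
    """Alerts that take part in no connection at all."""
    linked = {name for edge in edges for name in edge}
    return [a.name for a in alerts if a.name not in linked]


open_ftp = ("open_port", "10.0.0.5", 21)
scan = Alert("port_scan", 10.0, frozenset(), frozenset({open_ftp}))
exploit = Alert("ftp_overflow", 25.0, frozenset({open_ftp}),
                frozenset({("root_shell", "10.0.0.5")}))
ping = Alert("icmp_ping", 30.0, frozenset(), frozenset())

alerts = [scan, exploit, ping]
edges = correlate(alerts)
lonely = isolated(alerts, edges)
```

The resulting `edges` list corresponds to the arrows of a correlation graph, and `lonely` holds the isolated alerts that the PC method alone cannot analyze.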
In an earlier work we proposed the EAS (Silva & Guelfi, 2010), (Silva, 2010), intending to improve the results of security event correlation and intrusion detection. EAS is able to perform multi-correlation of events from Operating Systems (OSs) and from IDPS (log analysis); consequently, EAS is also capable of verifying the influence of isolated alerts in the cyber security context.
The EAS architecture has 4 modules, as shown by [...]. According to (Silva, 2010), (Silva & Guelfi, 2010), the employment of the EAS made it possible to improve today's results of correlation regarding security events, considering the following issues: (1) traceability of causes and consequences within the PC-correlation method (with multi-correlation criteria, correlation analysis (ascending/descending) and identification of FP alerts through tables and graphs); and (2) the process of validating the correlation results. In (Silva, 2010), (Silva & Guelfi, 2010), the results of the correlation phase were evaluated in three steps (FP1, FP2 and FP3) using tables and graphs.
The stepwise analysis allowed comparison of the results. EAS achieved an increase of 112.09% in the identification of FP alerts after the multi-correlation. Another important result of EAS was the evidence of preparatory connections between individual alerts that are in fact part of larger and more elaborate attacks. In other words, EAS can show that individual alerts can be grouped into a single attack, since they are part of the same attack strategy (Silva, 2010), (Silva & Guelfi, 2010).

Related forecasting methodologies for earthquakes
Statistically based forecast methodologies are used to understand and predict earthquake signals (Kagan, & Jackson, 2000). It is important to discuss this other research in order to notice the variety of forecasting applications. Two forecast studies are summarized below. Holliday et al. (2005) based their forecast research on the association of the occurrence of small earthquakes with probable future large ones. In fact, the method does not predict earthquakes, but spots regions (hotspot regions) where they are most likely to occur in the future (about ten years).

Earthquake forecasting and its verification
Basically, the objective of the research is to narrow down risk areas by analyzing historical seismicity for anomalous behaviour.
The approach is based on a pattern informatics (PI) method which quantifies temporal variations in seismicity and is as follows [...]

4. The seismic intensity in box i, $I_i(t_b, t)$, between two times $t_b < t$, can then be defined as the average number of earthquakes with magnitudes greater than $M_c$ that occur in the box per unit time during the specified time interval $t_b$ to $t$. Therefore, using discrete notation, we can write:

$$I_i(t_b, t) = \frac{1}{t - t_b} \sum_{t' = t_b}^{t} N_i(t') \qquad (1)$$

where $N_i(t')$ is the number of earthquakes with magnitude greater than $M_c$ in box i at time $t'$, and the sum is performed over increments of the time series, say days.

5. In order to compare the intensities from different time intervals, it is required that they have the same statistical properties. Therefore, the seismic intensities are normalized by subtracting the mean seismic activity of all boxes and dividing by the standard deviation of the seismic activity in all boxes. The statistically normalized seismic intensity of box i during the time interval $t_b$ to $t$ is then defined by

$$\hat{I}_i(t_b, t) = \frac{I_i(t_b, t) - \langle I_i(t_b, t) \rangle}{\sigma(t_b, t)} \qquad (2)$$

where $\langle I_i(t_b, t) \rangle$ is the mean intensity averaged over all the boxes and $\sigma(t_b, t)$ is the standard deviation of intensity over all the boxes.

6. The measure of anomalous seismicity in box i is the difference between the two normalized seismic intensities:

$$\Delta I_i(t_b, t_1, t_2) = \hat{I}_i(t_b, t_2) - \hat{I}_i(t_b, t_1) \qquad (3)$$

7. To reduce the relative importance of random fluctuations (noise) in seismic activity, the average change in intensity, $\overline{\Delta I_i}(t_0, t_1, t_2)$, is computed over all possible pairs of normalized intensity maps having the same change interval:

$$\overline{\Delta I_i}(t_0, t_1, t_2) = \frac{1}{t_1 - t_0} \sum_{t_b = t_0}^{t_1} \Delta I_i(t_b, t_1, t_2) \qquad (4)$$

where the sum is performed over increments of the time series, which here are days.

8. The probability of a future earthquake in box i, $P_i(t_0, t_1, t_2)$, is defined as the square of the average intensity change:

$$P_i(t_0, t_1, t_2) = \left[ \overline{\Delta I_i}(t_0, t_1, t_2) \right]^2 \qquad (5)$$

9. To identify anomalous regions, it is desirable to compute the change in the probability $P_i(t_0, t_1, t_2)$ relative to the background, so the mean probability over all boxes is subtracted. This change in the probability is denoted by

$$\Delta P_i(t_0, t_1, t_2) = P_i(t_0, t_1, t_2) - \langle P_i(t_0, t_1, t_2) \rangle \qquad (6)$$

where $\langle P_i(t_0, t_1, t_2) \rangle$ is the background probability. Hotspots are defined to be the regions where $\Delta P_i(t_0, t_1, t_2)$ is positive. In these regions, $P_i(t_0, t_1, t_2)$ is larger than the average value for all boxes (the background level). Note that since the intensities are squared in defining probabilities, the hotspots may be due to either increases of seismic activity during the [...]

[...] (Holliday et al, 2005) could be used as an input in a larger forecast system like DIFSA, which would provide the communication and correlation of data with other, different models. (Kagan, & Jackson, 2000) developed research with both short-term and long-term forecast approaches, testing both with a likelihood function for quakes of magnitude 5.8 or larger.
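Steps 4 to 9 of the pattern informatics method above can be sketched in a few lines of Python. This is a hedged illustration, not the implementation of Holliday et al. (2005): the function names and the toy data are hypothetical, time is assumed to be an integer step index, each interval is treated as half-open (from `tb` up to but not including `t`), and the base time `tb` runs from `t0` to `t1 - 1` so that no interval is empty.

```python
from statistics import mean, pstdev


def intensity(counts, tb, t):
    """Step 4: average number of events per time step in each box."""
    return [sum(box[tb:t]) / (t - tb) for box in counts]


def normalized(counts, tb, t):
    """Step 5: subtract the mean over boxes, divide by the std over boxes."""
    raw = intensity(counts, tb, t)
    mu, sigma = mean(raw), pstdev(raw)
    return [(x - mu) / sigma if sigma > 0 else 0.0 for x in raw]


def hotspot_map(counts, t0, t1, t2):
    """Steps 6 to 9: return (hotspot box indices, delta_p per box)."""
    n_boxes = len(counts)
    avg_change = [0.0] * n_boxes
    for tb in range(t0, t1):
        i1 = normalized(counts, tb, t1)
        i2 = normalized(counts, tb, t2)
        for i in range(n_boxes):
            # Step 6 (difference) accumulated into the step 7 average.
            avg_change[i] += (i2[i] - i1[i]) / (t1 - t0)
    prob = [d * d for d in avg_change]             # step 8
    background = mean(prob)
    delta_p = [p - background for p in prob]       # step 9
    hot = [i for i, dp in enumerate(delta_p) if dp > 0]
    return hot, delta_p


# Toy catalogue: rows are boxes, columns are counts of events above M_c
# per time step (hypothetical numbers).
counts = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 3, 3, 3, 3],
]
hot, delta_p = hotspot_map(counts, t0=0, t1=4, t2=8)
```

In this toy catalogue only the third box changes its behaviour during the change interval, so it is the only box whose probability rises above the background level.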

Probabilistic forecasting of earthquakes
Although the long-term approach (see Table 1) is not completely developed, it is suitable for estimating the occurrence of earthquakes and is derived from statistical, physical and intuitive arguments, while the short-term seismicity forecast is based on a specific stochastic model and is updated daily (see Table 1).
The research assumes that the rate density (probability per unit area and time) is proportional to a smoothed version of past seismicity and depends approximately on a negative power of the epicentral distance and linearly on the magnitude of the past earthquakes.
The model (Kagan, & Jackson, 2000) does not use retrospective evaluation of seismic data. The parameters of the long-term approach are evaluated on the basis of success in forecasting seismic activity, which also indicates possible earthquake perturbations. A maximum likelihood procedure is applied to the short-term approach to infer optimal parameter values; this approach can be incorporated into real-time seismic networks to provide seismic hazard estimates. Regarding the scientific results, (Kagan, & Jackson, 2000) concluded that the research depicted a statistical relationship between successive earthquakes in a quantitative way that facilitates hypothesis testing. Regarding the practical results, the quantitative predictive assessment can be adopted in mitigation strategies. The versatility of the forecast-based methodology is evident in this work, which presents significant results. This scenario shows that quite different methods (e.g. methods that use and methods that do not use historical data) can be used in conjunction with an approach that uses DIFSA.

Forecasting for cyber attacks
The forecasting approaches in IDPS rely mainly on stochastic methods (Ramasubramanian & Kannan, 2004), (Alampalayam & Kumar, 2004), (Chung et al, 2006). Without addressing predictions, references (Ye et al, 2001), (Ye et al, 2003), (Wong et al, 2006) applied diverse probabilistic techniques (decision tree, Hotelling's T² test, chi-square multivariate, Markov chain and Exponentially Weighted Moving Average (EWMA)) to audit data as a way to analyze three properties of the UIT: frequency, duration, and ordering. References (Ye et al, 2001), (Ye et al, 2003) came to the following findings: 1) the sequence of events is necessary for IDPS, as a single audit event at a given time is not sufficient; 2) ordering (transaction (Wong et al, 2006)) provides an additional advantage over the frequency property, but it is computationally intensive. According to (Ye et al, 2001), (Ye et al, 2003), (Wong et al, 2006), the frequency property by itself provides good intrusion detection. References (Ye et al, 2001), (Ye et al, 2003), (Wong et al, 2006) did not approach correlation for IDPS. Moving averages (simple, weighted, EWMA, or central) with time series data are regularly used to smooth out fluctuations and highlight trends (NIST, 2009). EWMA may be applied to autocorrelated and uncorrelated data for detecting cyber attacks which manifest themselves through significant changes in the intensity of the events occurring (Ye et al, 2001). Both variants (EWMA for autocorrelated and for uncorrelated data) have shown good efficiency in detecting attacks. EWMA applies weighting factors which decrease, giving much more importance to recent observations while still not discarding older observations entirely. The statistic that is calculated is (NIST, 2009):

$$EWMA_t = \alpha Y_t + (1 - \alpha)\, EWMA_{t-1}, \qquad t = 1, 2, \ldots, n$$

where: $EWMA_0$ is the mean of historical data; $Y_t$ is the observation at time t; n is the number of observations to be monitored including $EWMA_0$; $0 < \alpha < 1$ is a constant that determines the depth of memory of the EWMA.
The parameter α determines the rate at which older data enter into the calculation of the EWMA statistic. So, a large value of α gives more weight to recent data and less weight to older data; a small value of α gives more weight to older data.
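The effect of α can be seen in a short sketch of the EWMA recursion described above, in which each new value is α times the latest observation plus (1 - α) times the previous EWMA value. The function name, the event counts and the starting value are hypothetical; the starting value stands in for the mean of historical data.

```python
def ewma(observations, alpha, ewma0):
    """Return the EWMA series for the observations, starting from ewma0."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    series = []
    previous = ewma0
    for y in observations:
        # Recent data get weight alpha, accumulated history gets 1 - alpha.
        previous = alpha * y + (1 - alpha) * previous
        series.append(previous)
    return series


# Hypothetical hourly counts of one alert type with a sudden burst at the end.
counts = [10, 10, 10, 50]
fast = ewma(counts, alpha=0.5, ewma0=10.0)   # short memory
slow = ewma(counts, alpha=0.1, ewma0=10.0)   # long memory
```

With the larger α the statistic reacts strongly to the burst in the last hour, while the smaller α keeps it close to the historical level, which is the trade-off between sensitivity and smoothing described above.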
Reference (Cisar and Cisar, 2007) gives an overview of adopting EWMA with adaptive thresholds, based on the normal profile of network traffic. The analysis of thresholds with EWMA may summarize huge amounts of data in network traffic (Zhay et al, 2006), (Pontes & Zucchi, 2010). Diverse moving averages, combined with the Fibonacci sequence forecasting approach, were also used by (Zuckerman et al, 2010) to spot trends of cyber attacks in the (DARPA, 1998) datasets. A simple moving average (SMA) is the unweighted mean of the previous n data points. For example, a 10-hour SMA of intrusive event X (e.g. DoS) is the mean of the counts of event X over the previous 10 hours. If those counts are $x_M, x_{M-1}, \ldots, x_{M-9}$, then the formula is (NIST, 2009), (Roberts, 1959):

$$SMA = \frac{x_M + x_{M-1} + \cdots + x_{M-9}}{10}$$

Nevertheless, the forecasting approaches which use moving averages to cope with cyber attacks in IDPS are limited to analyzing cyber attacks individually, e.g. in just one IDPS. Therefore, there is no collaboration among the forecasters. Besides, the concept of sensors is not adopted in (Pontes et al, 2008), (Pontes & Guelfi, 2009a), (Pontes & Guelfi, 2009b), (Pontes & Zucchi, 2010), (Ishida et al, 2005), (Viinikka et al, 2006), (Ye et al, 2003).
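The 10-hour simple moving average mentioned above, i.e. the unweighted mean of the previous 10 hourly counts of an event, can be sketched as follows. The function name and the hourly DoS counts are hypothetical and serve only to show how the average follows a trend.

```python
def sma_series(values, n):
    """Unweighted mean of each window of n consecutive values."""
    if n <= 0 or len(values) < n:
        raise ValueError("need at least n values and n > 0")
    return [sum(values[i - n + 1:i + 1]) / n
            for i in range(n - 1, len(values))]


# Hypothetical hourly counts of DoS alerts (11 hours).
dos_per_hour = [4, 6, 5, 7, 9, 8, 10, 12, 11, 13, 20]
trend = sma_series(dos_per_hour, n=10)
```

Each entry of `trend` is the mean of the 10 most recent hours at that point; a rising sequence of averages is the kind of smoothed trend that a forecaster in a single IDPS would observe.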

The distributed intrusion forecasting system with the two stage system (Pontes et al, 2011)
Intrusion Forecasting Systems (IFS) can work proactively in cyber security contexts, as early warning systems, in order to indicate or identify UIT (incidents, threats, attacks) in advance.IFS can also represent an improvement of IDPS, which is based on postmortem approaches (UIT is identified and/or blocked only after they can inflict serious damage to the computer systems).IFS predicts UIT by the use of different forecasting techniques (for instance, moving average, Fibonacci sequence etc) applied either for local or distributed environment.Additionally, for distributed environments, e.g.DIFS, the use of cooperative sensors can improve accuracy about predictions of incidents.Fig. 6 depicts the proposal of this chapter, i.e. the DIFS and the forecasting levels.
Similarly to forecasting methodologies used in other fields (e.g. meteorology), DIFS also spreads agents and/or sensors widely to make predictions about the different kinds of UIT (spam, virus, intrusion, abnormal network traffic). There are four levels of the IFS: level 1 (independent security devices of hosts), level 2 (integrated security devices of hosts), level 3 (the network level) and level 4 (the backbone level). All levels have some degree of communication with each other. In other words, the forecasts obtained from level 1 are shared and correlated with the forecasts of the other levels. Lower levels work as sensors for higher levels; consequently, feedback about the UIT trends may be exchanged from one level to another.

Level 1 concerns the trend analysis of incidents, alerts and diagnoses reported independently by the hosts' security devices (antivirus, antispyware, host-based IDPS and other anomaly detection systems). For each security device, individual forecasts may be provided, e.g. the trend of spam for the next hour or the next day, or the trend of virus infection. The next step of IFS level 1 is to help the hosts' security devices determine whether or not they should adopt countermeasures to stop UIT.

Level 2 involves the correlation of forecasts about the hosts' security devices. At this level, the analysis relies on two databases: a) all the historical data generated by each of the hosts' security devices are processed individually by IFS level 1 and then stored in a database; b) the network flow may also be recorded for further forecasting analysis. The next step for IFS level 2 is to query and analyze the trends (forecasts) of such databases. After analyzing them, IFS level 2 returns feedback to IFS level 1. It is important to notice that the databases of IFS level 1 work as sensors for IFS level 2.
Fig. 6. DIFS Architecture - adapted from (Pontes & Guelfi, 2009)

The implementation of IFS level 3 happens at the gateway of the LAN. IFS level 3 is analogous to IFS level 2, as it queries databases generated by IFS levels 1 and 2. As with IFS level 1, some security devices may be installed at the gateway (such as a firewall, a regular IDPS, etc.) and they may also be analyzed. The steps for analysis at this level are: a) network security devices record UIT in databases; IFS level 3 queries the databases provided by the lower levels and the current level; b) IFS level 3 analyzes the provided databases to define trends; c) IFS level 3 provides feedback of the trend analysis to the security devices; d) IFS level 3 may also give feedback to the lower levels. It is important to notice that the IFS level 1 and level 2 databases work as sensors for IFS level 3. The sensor elements may be more numerous at IFS level 3.

IFS level 4 is the highest level. It considers the structure of the backbone providers (an ISP, for instance). As with IFS levels 2 and 3, different security devices are linked to the backbone level. The steps for IFS level 4 to work are: a) backbone security devices record UIT in a database; b) IFS level 4 queries the databases provided by the lower levels and the current level; c) IFS level 4 analyzes the provided databases to define the trends; d) IFS level 4 provides feedback of the trend analysis to the current level; e) IFS level 4 may also give feedback to the lower levels. Similarly to the lower levels, IFS level 4 uses the same concept of sensors: the lower databases and the entire lower IFS levels are sensors for IFS level 4. An important note is that IFS level 4 may be shared and correlated among various backbone providers. Correlating the forecasts of IFS level 4 means providing the most realistic and integrated trend about UIT, as it may spread sensors along the network (Lajara et al, 2007).
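The record, query, analyze and feedback cycle shared by the levels can be sketched as follows. This is only an illustrative Python model under stated assumptions: the class `IFSLevel`, its fields, and the toy trend rule (comparing the mean of the newer half of the series with the older half) are hypothetical and do not reproduce the forecasting techniques of the DIFS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IFSLevel:
    """One DIFS level; lower levels act as its sensors (hypothetical structure)."""
    name: str
    sensors: List["IFSLevel"] = field(default_factory=list)
    local_counts: List[int] = field(default_factory=list)  # UIT counts per time slot
    feedback: List[str] = field(default_factory=list)

    def query(self) -> List[int]:
        """Sum UIT counts per time slot over this level and all lower levels."""
        series = list(self.local_counts)
        for sensor in self.sensors:
            lower = sensor.query()
            length = max(len(series), len(lower))
            series = [
                (series[i] if i < len(series) else 0)
                + (lower[i] if i < len(lower) else 0)
                for i in range(length)
            ]
        return series

    def analyze(self) -> str:
        """Toy trend rule: compare the mean of the newer half with the older half."""
        series = self.query()
        if len(series) < 2:
            return "unknown"
        half = len(series) // 2
        older = sum(series[:half]) / half
        newer = sum(series[half:]) / (len(series) - half)
        if newer > older:
            return "increasing"
        if newer < older:
            return "decreasing"
        return "stable"

    def send_feedback(self) -> None:
        """Record the trend at this level and pass it down to every lower level."""
        self._receive(f"{self.name}: UIT trend {self.analyze()}")

    def _receive(self, message: str) -> None:
        self.feedback.append(message)
        for sensor in self.sensors:
            sensor._receive(message)


# Hypothetical hosts (level 1), their integration (level 2) and a gateway (level 3).
host_a = IFSLevel("level 1 (host A)", local_counts=[1, 2, 4, 8])
host_b = IFSLevel("level 1 (host B)", local_counts=[0, 1, 1, 3])
lan = IFSLevel("level 2 (hosts)", sensors=[host_a, host_b])
gateway = IFSLevel("level 3 (gateway)", sensors=[lan], local_counts=[2, 2, 5, 9])
gateway.send_feedback()
```

A level 4 instance would simply list one or more gateways in its `sensors`, which mirrors the idea that entire lower levels act as sensors for the backbone.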
It is important to notice that for the IFS we implemented a two stage system (Pontes et al, 2011), intending to improve the forecasting results by the use of correlation. Fig. 7 presents the sequence of activities done by the system:
1. The first task is the multi-correlation, running the EAS, to filter FP and trace sophisticated attacks. During this step, the OS logs, IDPS logs, network traffic and other logs are analyzed by the EAS. According to Fig. 4, diverse logs and network traffic represent Entry 1 for the two stage system.
2. The second task is done by the IFS, applying forecasting techniques over the data generated by the EAS (historical series, without a considerable amount of FP). Several forecasting techniques may be adopted in this stage (e.g. EWMA, Fibonacci sequence, Markov chains). As illustrated by Fig. 7, the data generated by the EAS is Entry 2 for the two stage system. Step 2 of the two stage system considers just the data from Entry 2.
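The two steps can be pictured as a small pipeline: first filter, then forecast. The Python sketch below is an assumption-laden illustration, not the EAS or the IFS themselves. In particular, the `is_false_positive` flag stands in for the outcome of the EAS multi-correlation, whose internals are not described here, and EWMA with `alpha = 0.5` is just one of the techniques the text mentions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Alert:
    hour: int                # time slot of the alert (hypothetical granularity)
    signature: str           # IDPS signature or log type
    is_false_positive: bool  # assumed to be decided by the EAS correlation step


def stage1_filter(alerts: Iterable[Alert]) -> List[Alert]:
    """Stand-in for EAS filtering (step 1): keep only alerts not marked as FP."""
    return [a for a in alerts if not a.is_false_positive]


def to_hourly_series(alerts: Iterable[Alert], hours: int) -> List[int]:
    """Entry 2: historical series of alert counts per hour."""
    counts = Counter(a.hour for a in alerts)
    return [counts.get(h, 0) for h in range(hours)]


def stage2_forecast(series: List[int], alpha: float = 0.5) -> float:
    """Step 2: EWMA of the series, used here as a one-step-ahead forecast."""
    level = float(series[0])
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


# Entry 1: hypothetical raw alerts, some of them false positives.
raw = [
    Alert(0, "p2p", False), Alert(0, "scan", True),
    Alert(1, "p2p", False), Alert(1, "p2p", False), Alert(1, "scan", True),
    Alert(2, "p2p", False), Alert(2, "p2p", False),
    Alert(2, "p2p", False), Alert(2, "p2p", False),
]
filtered = stage1_filter(raw)                 # step 1
series = to_hourly_series(filtered, hours=3)  # Entry 2
forecast = stage2_forecast(series)            # step 2
```

Because the forecaster only sees the filtered series, thresholds caused purely by FP alerts never reach step 2.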

Proof of concept
In this section we describe two of the prototypes we have prepared and analysed. In the first one (Pontes et al, 2009), for the proof of concept, levels 1, 2 and 3 of the DIFS were implemented in three geographically separate sites (A, A' and A''). The following hardware and services were used: a) 1 Pentium core 2 quad 2.0 GHz, 8GB RAM; b) 2 Pentium core 2 duo 1.8 GHz, 4GB RAM; c) 10 virtual machines (Ubuntu 8.04), 512MB RAM; d) 4 virtual machines (Windows XP), 512MB RAM; e) Windows Vista (host for the virtual machines); VMware Player 2.51; Snort; Netfilter/Iptables; MySQL; OpenVPN.
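As in (Haslum et al, 2008), in this prototype the simulation of UIT was divided into just four types: 1) Denial of service (DoS): Ping of Death and SYN Flood are examples of this kind of UIT; 2) Remote to local (R2L): SQL injection is an example of this kind of UIT, where the typical vulnerabilities exploited are buffer overflow and poor environment sanitation; 3) User to root (U2R): SQL injection is also an example of this kind of UIT; 4) Probe (scanning): Nmap, IPsweep and Satan are examples of software for scanning. For eight weeks, we simulated usual network traffic and UIT among the hosts in each site. Normal network traffic and UIT were also simulated among sites. An H-IDPS (NIST SP800-94, 2007) was installed in each one of the hosts. An N-IDPS (NIST SP800-94, 2007) was installed at the gateway. Fig. 8 illustrates the sites, the hosts with normal activities and the infected hosts. Infected hosts inflict UIT on the hosts of each site and on hosts from other sites, as pointed out by the arrows. In this prototype, the propagation of UIT followed this sequence: from site A to site A', from sites A and A' to site A'', and from sites A, A' and A'' to site A. For this prototype, the IFS was developed in Java and it runs in the three levels of DIFS. The IDPS Snort was used to analyze the network traffic. All classified UIT is later recorded in a MySQL database. The IFS collects data from the database, analyzes them and next, when a particular threshold of UIT is exceeded, a warning is sent to the IFS collaborators.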
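The final step, sending a warning when a threshold of UIT is exceeded, can be sketched as below. The prototype itself was written in Java on top of MySQL; this Python fragment is only an illustration, and the function `check_thresholds`, the per-type limits and the counts are hypothetical values rather than figures from the experiment.

```python
from typing import Callable, Dict, List


def check_thresholds(
    uit_counts: Dict[str, int],
    thresholds: Dict[str, int],
    notify: Callable[[str], None],
) -> List[str]:
    """Send a warning for every UIT type whose count exceeds its threshold."""
    exceeded = []
    for uit_type, count in sorted(uit_counts.items()):
        limit = thresholds.get(uit_type)
        if limit is not None and count > limit:
            exceeded.append(uit_type)
            notify(f"warning: {uit_type} count {count} exceeds threshold {limit}")
    return exceeded


# Hypothetical counts per UIT type, as they might be read from the alert database.
sent: List[str] = []
counts = {"DoS": 120, "R2L": 3, "U2R": 1, "Probe": 45}
limits = {"DoS": 100, "R2L": 10, "U2R": 5, "Probe": 40}
exceeded = check_thresholds(counts, limits, sent.append)
```

Here `sent.append` stands in for whatever channel delivers warnings to the IFS collaborators at the other sites.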
For the second prototype (Pontes et al, 2009), the two stage system was implemented and employed in a wired LAN, specifically in a computer working as the gateway to the Internet (level 3 of the DIFS). Elements of level 1 (logs from the OS) were also used. Although level 3 of the DIFS was addressed, levels 1, 2 and 4 were disregarded in the second prototype. The reason for implementing only level 3 is the representativeness of the gateway level: (a) the simulated cyber-attacks and the real network traffic have just one path to reach the Internet, through the gateway; (b) at the gateway level it was possible to assure timestamp conditions for the correlation processes, as the IDPS, the EAS and the gateway run on the same machine. The computer working as gateway (DIFS level 3) was able to register all alerts of the network IDPS and the logs from its own OS. Table 2 details the services used in the two stage system, as well as the source machines and the destination reached for each service. In the test environment, multi-correlation was done between the alerts from an IDPS and the OS logs (Pontes et al, 2011). Table 3 presents the applications which were used in the prototype. The EAS was developed by the authors in Visual FoxPro. Finally, Table 3 shows the elapsed time for the prototype. Both simulated normal network traffic and simulated cyber-attacks were considered in the prototype. Normal network traffic was generated as well. Unlike (Pontes et al, 2008), (Pontes & Guelfi, 2009a), (Pontes & Guelfi, 2009b), (Pontes & Zucchi, 2010)

It is important to notice that the cyber-attacks considered in this prototype are, as a matter of fact, sets of events (alerts and logs) classified as a single and more elaborate attack. In our earlier works (Pontes et al, 2008), (Pontes & Guelfi, 2009a), (Pontes & Guelfi, 2009b), (Pontes & Zucchi, 2010), forecasting techniques considered just individual events in the cybersecurity context. Consequently, in this paper forecasting techniques are employed differently, considering the DIFS architecture, as the prototype deals with more refined sets of attacks. Details regarding the EAS and the IFS tasks are not reported in this chapter due to space limitations, but the reader may consult (Silva & Guelfi, 2010), (Silva, 2010) and (Pontes et al, 2008), (Pontes & Guelfi, 2009a), (Pontes & Guelfi, 2009b), (Pontes & Zucchi, 2010) for more information relating to the EAS and the IFS, respectively.

Results
Table 4 depicts the results of forecasting UIT in the first prototype. The UIT hit 4,320 thresholds from site A to site A' and this number gradually increased with the propagation of the UIT among the three sites. The total number of UIT thresholds among the three sites was about 16,416. In Table 4, correct forecasts are the number of times that it was possible to foresee the increasing and/or decreasing phases of UIT without any delay. The rate of correct predictions was about 60.71%. Forecasts with delay are the number of times the increasing and decreasing thresholds were identified late. In this prototype, the rate of forecasts with delay was about 34.74%. During the prototype tests, it was sometimes not possible to identify thresholds of decreasing or increasing UIT. The rate for the times we could not predict was about 4.95%.
In the second prototype, for the first step (EAS), results are obtained by analyzing consecutive graphs and tables from each phase. The quantities of alerts and correlations are counted independently, according to the registered route (source and destination). Alerts and correlations that involve the gateway, whether as source or destination, are registered as Gateway; those that do not involve the gateway are registered as Non-Gateway. Table 6 summarizes the prototype and some results. Correlation reveals a range of attack strategies. In each strategy, a number of different alerts are connected sequentially as if they were a single attack. A peer-to-peer (P2P) attack performed on machine 23 was chosen for the analysis of forecasting (Fig. 10, Fig. 11, Fig. 12 and Fig. 13), i.e. the IFS, after the employment of EAS filtering. In Fig. 13 it is possible to verify two thresholds pointing out the increase of events (as indicated by the red arrows) and one threshold pointing out the decrease of events (as shown by the black arrow). Notice that there is no significant occurrence of alerts at the beginning of the experiment and that two false thresholds regarding forecasts were eliminated. It is also important to observe that the FP in the second ellipse were eliminated after the EAS filtering; hence, another false threshold was wiped out as a consequence. More details regarding the results can be found in (Pontes et al, 2011).

Conclusion
In conclusion, this chapter has introduced the Distributed Intrusion Forecasting System (DIFS) (Pontes et al, 2009), addressing cyber attacks and UIT in the cyber space context. The DIFS was also presented with the two stage system, in which the EAS performs the multi-correlation (step 1) (Pontes et al, 2009) and forecasting techniques are then applied over the data generated by the EAS (step 2). The forecasting model presented in this chapter could be analogously employed for earthquake prediction, due to the following aspects: a) DIFS, with the two stage system and the EAS, was able to track in advance the increasing and decreasing rates of cyber attacks and UIT; hence such a methodology may be employed as an early warning system; b) DIFS considers just the frequency and temporal characteristics (timestamp) of events (UIT and cyber attacks), thus this approach can be similarly used in other areas.
Even though only 4.95% of the thresholds for increasing and decreasing UIT were not detectable, the value of the outcome is still questionable, as this early warning system still has 34.74% of its warnings reported late. The use of two forecasting techniques yielded better results than the use of only one prediction technique. The reason for the accuracy obtained with two forecasting techniques, according to (Pretcher and Frost, 2002), is the fact that the Fibonacci sequence depends on EWMA for marking the first wave. Thus, it was possible to observe just some of the trends drawn by the Fibonacci sequence.
Another characteristic of predictions with the Fibonacci sequence is the ability to forecast in the long term (2 or 3 days). EWMAs do not have this feature, so predictions using only EWMA lack long-term reach. Employing both techniques combines the strengths of each, making the forecast more accurate.
For the EAS, a standard was suggested to define causes and consequences within the PC-correlation method, combined with multi-correlation criteria, correlation analysis (ascending/descending) and identification of FP alerts through tables and graphs. An experiment was carried out with a prototype in a LAN with diverse machines and OSs, which used a gateway to access the Internet. The results obtained from the tests in our prototype indicate that level 3 of DIFS was improved, as some FPs were treated and predictions concerning cyber-attacks were more accurate. It is possible to come to this conclusion by verifying that, despite the high FP rates of FP1 (21.08%) and FP3 (44.72%) (see Table III), no TP alert was correlated exclusively as the result of an FP alert during the whole experiment. As an improvement to this work, we suggest automating the analysis processes that require user interpretation (table correlation and mapping), so that the EAS can be used in real time.
The accuracy of the results can be improved if the multi-correlation is extended to the entire LAN. Regarding the forecasting results, the suggestions for future work include the aggregation of fractal approaches (according to (Mandelbrot & Hudson, 2006)) and the use of other kinds of forecasting techniques (such as Markov chains and neural networks), following the advice of (Armstrong, 2002). It is also suggested to extend the employment of the EAS for the
Fig. 5: (a) converter: the aim of this module is to handle the input data into the system (IDPS signatures, alerts and logs from the OS); (b) updating: it controls the data which is going to be used by the system; (c) correlating: it does the mappings for the correlation processes, the FP identification and the identification of isolated alerts; (d) calculator: it analyzes and compares FP, based on the results from the correlating module.

Fig. 7. Sequence of Steps: (1) EAS Filtering - (2) IFS (Pontes et al, 2011)

four types: 1) Denial of service (DoS): Ping of Death and SYN Flood are examples of this kind of UIT; 2) Remote to local (R2L): SQL injection is an example of this kind of UIT, where the typical vulnerabilities exploited are buffer overflow and poor environment sanitation; 3) User to root (U2R): SQL injection is also an example of this kind of UIT; 4) Probe (scanning): Nmap, IPsweep and Satan are examples of software for scanning. For eight weeks, we simulated usual network traffic and UIT among the hosts in each site. Normal network traffic and UIT were also simulated among sites. An H-IDPS (NIST SP800-94, 2007) was installed in each one of the hosts. N-IDPS (NIST SP800

Fig. 8. DIFS Prototype - adapted from (Pontes & Guelfi, 2009)

Fig. 9 illustrates the LAN for the tests, which is based on diversity: diverse machines, settings, protocols and services are executed; furthermore, there are several OSs and free access to the Internet. Virtualized OSs (Linux Fedora) running on VMware and host operating systems with Windows 7 and Windows XP are used in the prototype.

Fig. 10 depicts the amount of FP which was detected, considering a preliminary correlation without FP filters. Notice that there are 17 alerts (nodes) with 69 correlations among them (connections between alerts represented by arrows). Fig. 10 denotes the first scenario for comparisons: DIFS level 3 working without the EAS.

Fig. 12. P2P Graph Attack (only TP alerts) (Pontes et al, 2009)

Holliday et al, (2005):
1. The region of interest is divided into $N_B$ square boxes with linear dimension $\Delta x$. Boxes are identified by a subscript $i$ and are centered at $x_i$. For each box, there is a time series $N_i(t)$, which is the number of earthquakes per unit time at time $t$ larger than the lower cutoff magnitude $M_c$. The time series in box $i$ is defined between a base time $t_b$ and the present time $t$.
2. All earthquakes in the region of interest with magnitudes greater than a lower cutoff magnitude $M_c$ are included. The lower cutoff magnitude $M_c$ is specified in order to ensure completeness of the data through time, from an initial time $t_0$ to a final time $t_2$.
3. Three time intervals are considered:
a. A reference time interval from $t_b$ to $t_1$.
b. A second time interval from $t_b$ to $t_2$, $t_2 > t_1$. The change interval over which seismic activity changes are determined is then $t_2 - t_1$. The time $t_b$ is chosen to lie between $t_0$ and $t_1$. Typically we take $t_0 = 1932$, $t_1 = 1990$, and $t_2 = 2000$. The objective is to quantify anomalous seismic activity in the change interval $t_1$ to $t_2$ relative to the reference interval $t_b$ to $t_1$.
c. The forecast time interval $t_2$ to $t_3$, for which the forecast is valid. The change and forecast intervals are taken to have the same length. For the above example, $t_3 = 2010$.
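The gridding and counting described in these steps can be illustrated with a short Python sketch. It covers only what is stated here (square boxes, a cutoff magnitude, and counts over the intervals starting at the base time) and not the remaining steps of the method of Holliday et al. The event tuples, the box size, the cutoff value and the choice of the base time as 1960 are hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, float, float, float]  # (x, y, time in years, magnitude)


def box_index(x: float, y: float, x0: float, y0: float, dx: float) -> Tuple[int, int]:
    """Index of the square box of side dx that contains the point (x, y)."""
    return (int((x - x0) // dx), int((y - y0) // dx))


def count_per_box(
    events: List[Event], x0: float, y0: float, dx: float,
    m_c: float, t_start: float, t_end: float,
) -> Dict[Tuple[int, int], int]:
    """Count events with magnitude > m_c and t_start <= time < t_end in each box."""
    counts: Dict[Tuple[int, int], int] = {}
    for x, y, t, mag in events:
        if mag > m_c and t_start <= t < t_end:
            key = box_index(x, y, x0, y0, dx)
            counts[key] = counts.get(key, 0) + 1
    return counts


# t_1 and t_2 follow the example in the text; t_b = 1960 is an illustrative choice.
t_b, t_1, t_2 = 1960.0, 1990.0, 2000.0
events = [
    (0.5, 0.5, 1970.0, 3.5),
    (0.7, 0.2, 1995.0, 4.1),
    (1.5, 0.5, 1985.0, 2.0),  # below the cutoff magnitude, ignored
    (1.5, 0.8, 1998.0, 3.2),
    (0.5, 0.9, 1950.0, 5.0),  # before the base time, ignored
]
reference = count_per_box(events, 0.0, 0.0, 1.0, 3.0, t_b, t_1)  # t_b to t_1
second = count_per_box(events, 0.0, 0.0, 1.0, 3.0, t_b, t_2)     # t_b to t_2
```

Comparing `second` with `reference` box by box shows where activity changed during the change interval, which is the quantity the method goes on to analyze.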

Table 1. Example of long- and short-term forecast, 1999 February 11, north of Philippines (Kagan & Jackson, 2000)

www.intechopen.com Earthquake Research and Analysis - Statistical Studies, Observations and Planning

When calculating successive values, a new value comes into the sum and an old value drops out, meaning a full summation each time is unnecessary,

Table 2. Services in the Prototype

Table 4. Results of Forecasting the UIT Propagation Using EWMA and Fibonacci Sequence (Pontes et al, 2009)

Table 5. Results of Forecasting the UIT Propagation Using Only Fibonacci Sequence (Pontes