The Automated Identification of Volcanic Earthquakes: Concepts, Applications and Challenges

Classifying seismic signals into their corresponding types of volcanic earthquakes is among the most important tasks for monitoring volcanic activity. Such a duty must be conducted routinely, on a daily basis, and therefore implies a significant workload for the personnel. The discipline of pattern recognition (PR) provides volcanic seismology practitioners with theories and methods to design classification systems and, together with digital signal processing (DSP) techniques, has given rise to promising and challenging opportunities for the automated identification of volcanic earthquakes.


Introduction
Classifying seismic signals into their corresponding types of volcanic earthquakes is among the most important tasks for monitoring volcanic activity. Such a duty must be conducted routinely, on a daily basis, and therefore implies a significant workload for the personnel. The discipline of pattern recognition (PR) provides volcanic seismology practitioners with theories and methods to design classification systems and, together with digital signal processing (DSP) techniques, has given rise to promising and challenging opportunities for the automated identification of volcanic earthquakes.
A wealth of recently published studies has demonstrated the applicability of PR tools to volcano-seismic monitoring. However, several cutting-edge approaches have not yet been applied to the problem, and there is still a gap between the research achievements reported in the literature and the deployment of custom solutions at volcano observatories. This chapter introduces fundamental concepts regarding seismic volcanic signals and PR systems, reviews research contributions and case studies, and highlights open issues, future research directions and the challenges of transferring prototype academic results into deployed technology.
In this preliminary section, important definitions and concepts from volcano seismology and PR are considered. First, fundamentals of measurement, data acquisition and telemetry are presented. This is followed by an overview of the different types of volcanic earthquakes, including concise explanations of their geophysical origin and importance for monitoring and forecasting volcanic activity. Advantages of using PR tools in the identification of seismic volcanic signals are discussed. Lastly, the stages of a PR system (namely detection or segmentation, representation and generalization) are introduced.

Measurement, data acquisition and data transmission
The foundation of volcano monitoring is the collection of experimental physical data and their subsequent analysis and correlation with the associated underlying phenomena. Measuring volcanic earthquakes is particularly important, since seismic events are a first sign of renewed volcanic activity (Chouet, 1996) and reveal processes such as transport of magma and gases or fracture of solid rock. Nowadays, seismic data collection is typically automated and telemetered. Both properties are required in order to guarantee (1) continuous records, 24 hours a day, (2) real-time surveillance, and (3) data acquisition in remote areas where frequent visits to collect data are not feasible.
The automated collection of seismic volcanic data can be divided into three stages: measurement, data acquisition and data transmission. Measurement is performed by sensing devices that convert ground motion into measurable electrical output signals, typically voltages. Data acquisition is composed, in turn, of several substages, including signal conditioning, analog-to-digital (A/D) conversion and further signal processing. Data transmission is performed by radio link systems, which are either analog or digital depending on whether the A/D conversion is carried out after or before transmission, respectively. A standard seismic monitoring station, loosely thought of as being composed of a buried sensor, an electronics box, a solar panel and a Yagi antenna, is shown in Fig. 1. Further descriptions regarding sensors and telemetry are given below. For a general introduction to data measurement and analysis, the reader is referred to (Brown & Musil, 2004) and the classic book by Bendat & Piersol (2010).

Seismic sensors
Comprehensive book chapters on seismic instruments have been written by Havskov & Alguacil (2004, Chap. 2), Bormann (2009, Chap. 5) and Havskov & Ottemöller (2010, Chap. 3). In spite of that and for the sake of a self-contained presentation, brief discussions on physical [...]

Fig. 2. Examples of seismic sensors installed in the field.

Telemetry, A/D conversion and data storage
Seismic stations may be designed to be either portable or permanent. Portable ones are equipped with on-site data storage devices, such as internal memories and external hard drives, and are typically deployed for medium-length periods. In order to avoid periodic visits to collect data in remote areas and to ensure continuity in the historical records, permanent stations are installed by applying telemetry technologies, see Fig. 1. A typical analog radio telemetry system comprises, on the transmitting side, a sensor (see Sec. 1.1.1), a modulator, a radio and an antenna; similarly, on the receiving side, it is composed of an antenna, a radio, a demodulator or discriminator and an A/D system coupled to a storage device. The modulator usually corresponds to a voltage-controlled oscillator with frequency modulation (Havskov & Alguacil, 2004, Chap. 8), followed by a second modulation introduced by the radio and aimed at transmitting the signals in the VHF or UHF bands¹. When signals are digitized on-site, a digital telemetry system is used with a variety of modulation schemes (Bormann, 2009, Chap. 7). Moreover, recent deployments of seismic arrays have taken advantage of mobile telephone networks and internet technologies (Vargas-Jimenez & Rincón-Botero, 2003; Werner-Allen et al., 2006). Readers who require a thorough introduction to data transmission are referred to (Temes & Schultz, 1998) and (Eskelinen, 2004) for the analog case and to (Hsu, 2003) for both the analog and digital cases.

¹ VHF band: 30 to 300 MHz; UHF band: 300 MHz to 3 GHz.
The digital acquisition of seismic signals involves stages for signal conditioning and A/D conversion. The first one includes amplifiers and antialias filters, required to scale the low-level outputs of passive sensors and to fulfill the Nyquist criterion², respectively. The A/D conversion is carried out by analog-to-digital converters (ADCs), typically having sampling rates of 50, 100 or 200 Hz and resolutions between 12 and 24 bit. Individual events are extracted from the continuous records by applying segmentation methods, see Sec. 1.3.1. Further details about A/D conversion and filtering can be found in publications by Scherbaum (1994; 2002; 2007).
Segmented seismic events can be stored in a variety of file formats. The choice of a particular format depends on technical convenience in terms of both space and compatibility. Plain text files are simple enough that most programs can read them because they use the ASCII standard to represent characters (Brown & Musil, 2004); however, text files are neither optimized in size according to the number of bits of the corresponding ADC nor suitable for embedding codes that indicate formatting and additional capabilities. These weaknesses are overcome by special binary formats such as the Seismic Unified Data System (SUDS), the Seismic Analysis Code (SAC), the SEISmic ANalysis system (SEISAN), the Guralp Compressed Format (GCF) and the Standard for the Exchange of Earthquake Data (SEED).
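The sampling rates and resolutions quoted above translate into two design quantities: the highest signal frequency that a sampling rate can represent, and the number of amplitude levels that a resolution provides. The following minimal sketch works them out under simplifying assumptions; the function names and the 20 V full-scale range are illustrative choices and are not taken from any particular digitizer.

```python
def nyquist_frequency(sampling_rate_hz):
    """Upper bound (exclusive) on signal frequencies that avoid aliasing."""
    return sampling_rate_hz / 2.0


def quantization_levels(bits):
    """Number of distinct output codes of an ideal ADC with this resolution."""
    return 2 ** bits


def quantization_step(full_scale_volts, bits):
    """Voltage increment per code, assuming an ideal uniform ADC."""
    return full_scale_volts / quantization_levels(bits)


FULL_SCALE_VOLTS = 20.0  # assumed input range, for illustration only

for rate in (50, 100, 200):
    print(f"{rate} Hz sampling: content must stay below {nyquist_frequency(rate)} Hz")

for bits in (12, 24):
    step = quantization_step(FULL_SCALE_VOLTS, bits)
    print(f"{bits} bit: {quantization_levels(bits)} levels, step {step:.3e} V")
```

In this idealized view, the antialias filter of a station sampling at 100 Hz has to suppress content at and above 50 Hz before conversion.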

Seismic waveforms and classes of volcanic earthquakes
Seismic signals reveal the propagation of elastic waves through the ground. An earthquake generates two different types of such waves, namely body waves and surface waves (Kayal, 2008). The former propagate within a body of rock; the latter travel along the ground surface. A further distinction is made in body waves between the primary wave (P-wave) and the secondary or shear wave (S-wave). The P-wave is faster than the S-wave; therefore, it appears before the S-wave in the seismograph record, as shown in Fig. 3.
The vibrations following the arrival of a wave are called coda. Since the coda of the P-wave is often hidden by the onset of the S-wave, the term coda usually refers to the S-coda (i.e. the trailing part of the seismogram) unless indicated otherwise. Refer again to Fig. 3.

² The sampling rate must be greater than twice the highest frequency component of the signal.

Earthquake Research and Analysis - Seismology, Seismotectonic and Earthquake Geology, www.intechopen.com

Volcanic earthquakes are typically categorized into four classes according to their mode of generation and the time-frequency behavior of their associated seismic signals. The first criterion, the mode of generation, corresponds to two distinct types of processes occurring either in the solid rock or in the magmatic and hydrothermal fluids within the volcanic edifice. A variety of names have been used to describe the four classes of volcanic earthquakes (McNutt, 2005; Zobin, 2003); however, nowadays, the following denominations are widely accepted: volcano tectonic (VT) events, long period (LP) events, tremors (TR), and hybrid (HB) events; see Fig. 4. Concise explanations including their geophysical origin, time-frequency characteristics and importance for monitoring and forecasting volcanic activity are given below. Some special events are observed at particular volcanoes, e.g. multiphase (MP) earthquakes at Mt. Merapi volcano (Hidayat et al., 2000); and flute tremors, spasmodic tremor (Gil-Cruz, 1999) and 'tornillo'-type signals at Galeras volcano (Narváez-M. et al., 1997).
Tectonic earthquakes such as teleseismic (TS), regional (RE) and local (TL) ones are also observed at the seismic volcanic stations. Furthermore, rock falls (RF), explosions (EX), landslides (LS), avalanches, icequakes (IC) and even lightning are also recorded by the instruments. Descriptions of those non-volcanic events are not given here due to space constraints. Details of the TS, RE and TL classes are available in (Kayal, 2008).


Volcano Tectonic (VT) earthquakes
These earthquakes are indicative of fractures in the solid rock, which are caused by either pressure from magmatic intrusion into the volcano or stress relaxation due to a withdrawal of magma in the crust (Guillier & Chatelain, 2006). VT waveforms are characterized by clear and impulsive arrivals of P and S waves and a short coda typically lasting 7 to 15 s. In the spectral domain, VT events are characterized by a relatively high-frequency content with energy peaking in the band from 6 to 8 or 10 Hz (Chouet, 1996; Guillier & Chatelain, 2006), little energy at frequencies below 3.5 Hz and significant components up to 15 or 20 Hz, see Fig. 4(a). It is important to monitor VT events because an increase in such seismic activity has often been found to be a first sign of volcanic unrest (Trombley, 2006); nonetheless, their consideration as eruption precursors may not be reliable since the activity may last from days to months or even years (Chouet, 1996). Therefore, VT events must always be correlated with their locations of occurrence and with the other classes of volcanic earthquakes (Londoño-Bonilla, 2010).

Long Period (LP) earthquakes
These events are caused by pressure changes in channels filled with magmatic and hydrothermal fluids. Such changes, in turn, are produced by unsteady mass transport and/or thermodynamics of the fluid (Chouet, 1996). The interaction between the surrounding solid and the aforementioned pressure fluctuations constitutes a resonator system (Kumagai & Chouet, 1999) that exhibits decaying harmonic oscillations. LP waveforms are characterized by more or less emergent first arrivals, a lack of clear S waves (Lesage, 2009) and coda waves lasting up to 1.5 minutes (Gil-Cruz & Chouet, 1997). In the spectral domain, energies are concentrated in low frequencies ranging from 0.5 to 3 Hz according to Trombley (2006) or up to 5 Hz according to Chouet (1996). Weak energies at higher frequencies, up to 13 Hz, are only present at the onset. These time and frequency properties can be examined in the sample signal shown in Fig. 4(b).
The forecasting potential of LP events has been pointed out by several studies. They commonly precede and accompany volcanic eruptions (Chouet, 2003) and their analysis may provide an understanding of the dynamic state and mechanical properties of the fluids at their sources.

Tremors (TR)
Tremors are produced by the same phenomena that cause LP earthquakes, but their oscillations may last from minutes to days, and sometimes for months or longer (Chouet, 1996). Such an extended manifestation reveals the presence of a sustained excitation. Trombley (2006) claims that this sustained excitation is caused by extra pushes that the pressure waves traveling through the magma receive as a result of pressure changes coming from below.
There is no significant difference between the signal characteristics of LP and TR events, except for the longer duration of the latter. The study of TR earthquakes is considered crucial for the investigation of gas/liquid within a magma conduit (Martinelli, 1997) and also for improving eruption forecasting since, like LP earthquakes, TR events have been frequently observed prior to volcanic eruptions (Lesage et al., 2002).

Hybrid (HB) earthquakes
The occurrence of a VT earthquake may trigger an LP event or vice versa (Trombley, 2006). As a result, a combined event, the so-called HB earthquake, appears, containing a mixture of the two. HB earthquakes may be episodic or be related to a steady process as, for instance, the interaction between magmatic heat and underground water systems (Guillier & Chatelain, 2006).
The longest HB events last a few tens of seconds (Neuberg, 2000). Chouet (1996) highlights two particular properties of HB seismic signals: a high-frequency onset and an LP-like coda. The first property is caused by a VT event preceding the LP event. The ambiguous physical origin of HB earthquakes limits their use for forecasting purposes (Harrington & Brodsky, 2007).

Pattern recognition systems
Duin et al. (2002) define PR as an engineering field that studies theories and methods for designing machines that are able to recognize patterns in noisy data. Many of the techniques and methods in the PR field are borrowed from other fundamental and applied disciplines such as DSP, statistics and machine learning. DSP techniques are mainly applied in the first two stages of the PR system pipeline, see Fig. 5. Statistical and machine learning methods are used in the classification task. The remaining stage, representation, is the focus of interest for PR practitioners and researchers working towards the solution of the following questions: (1) how can real-world objects or phenomena be represented in such a way that measurements coming from the sensor stage can be appropriately arranged, e.g. in a vector space, to be provided to the classification methods? and (2) is the representation technically suitable in terms of discriminant power and computational complexity? In addition, the PR community also works on modifying classification methods in order to adapt them to the particular technical requirements of the application.

Sensor subsystem
Consider the particular case of the automated identification of volcanic earthquakes and refer again to Fig. 5. Sensors, as described in Sec. 1.1.1, are seismometers. The subsequent stage, data processing, includes data storage and/or telemetered transmission, A/D conversion (Sec. 1.1.2), and segmentation. This last task in the data processing stage is carried out with a two-fold purpose: (1) to detect the events of interest in the whole continuous raw data; (2) to save space for data storage. In real-time implementations, the conventional method for segmenting seismic events is the so-called short-term average / long-term average (STA/LTA) trigger (Havskov & Ottemöller, 2010). Since a detailed discussion of the STA/LTA trigger method is beyond the scope of this chapter, the reader is referred to (Havskov & Alguacil, 2004).
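Although the details are left to the cited references, the basic idea of the STA/LTA trigger can be conveyed with a short sketch: the average signal energy in a short window is divided by the average in a longer window, and an event is declared while the ratio stays above a threshold. The version below is a simplified, non-recursive illustration; the use of squared amplitudes, the window lengths and both thresholds are assumptions chosen for the toy signal, not recommended operational settings.

```python
def sta_lta_ratio(signal, n_sta, n_lta):
    """Ratio of short-term to long-term average energy, sample by sample."""
    energy = [s * s for s in signal]
    # Prefix sums make every window average an O(1) operation.
    prefix = [0.0]
    for value in energy:
        prefix.append(prefix[-1] + value)
    ratio = [0.0] * len(signal)  # stays 0 until a full LTA window is available
    for i in range(n_lta - 1, len(signal)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio


def detect_events(ratio, on_threshold, off_threshold):
    """Return (start, end) sample indices of segments flagged by the trigger."""
    events, start = [], None
    for i, r in enumerate(ratio):
        if start is None and r >= on_threshold:
            start = i
        elif start is not None and r < off_threshold:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(ratio) - 1))
    return events


# Toy record: weak background noise with a 30-sample burst starting at sample 200.
noise_before = [0.1 * (-1) ** n for n in range(200)]
burst = [5.0 * (-1) ** n for n in range(30)]
noise_after = [0.1 * (-1) ** n for n in range(100)]
record = noise_before + burst + noise_after

ratio = sta_lta_ratio(record, n_sta=5, n_lta=50)
events = detect_events(ratio, on_threshold=4.0, off_threshold=1.5)
print(events)
```

Using a lower threshold for switching the trigger off than for switching it on is a common way of keeping the coda inside the segmented event.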


Representation approaches
The issue of representation has traditionally been addressed by extracting a set of discriminant features from the segmented sensor measurements. Those features span a vector space which is consequently known as the feature space. Good features should allow the building of accurate classifiers that partition the space into decision regions associated with the classes to be distinguished (types of volcanic earthquakes in this case). Let x(t) be a segment of the continuous record containing a seismic event and let x(n) be its associated discrete-time sequence. The N features extracted from x(n) are arranged in a feature vector $\mathbf{x} \in \mathbb{R}^{N}$. Typical features extracted from the morphology of a seismic signal in the time domain are amplitudes and durations of the waves shown in Fig. 3.
The dissimilarity representation has been proposed as a feasible alternative to represent signals for PR (Pekalska & Duin, 2005). For a given signal x(n), this representation approach consists of computing a dissimilarity measure between either x(n) or some associated transform and a set of M reference signals belonging to a so-called representation set. The reference signals are called prototypes whenever the set is composed of archetypal examples of each class. Similarly to the feature-based approach, dissimilarities are arranged as a dissimilarity vector $\mathbf{d} \in \mathbb{R}^{M}$ in the so-called dissimilarity space. Dissimilarity measures typically correspond to metric distances; however, relaxed versions of the metrics are also common in practical applications, e.g. the weighted edit distance and the modified Hausdorff distance, which are asymmetric. Pekalska & Duin (2005) advocate the use of dissimilarity representations instead of classical feature-based ones by presenting several conceptual and practical motivations. Two of the practical ones are worth mentioning here: dissimilarities can be derived from raw data such as images, spectra or time samples; and, according to these authors, classifiers built in the dissimilarity space outperform the nearest-neighbor rule.
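The construction of a dissimilarity vector can be sketched in a few lines. The example below is only an illustration under stated assumptions: the Euclidean distance is used as the dissimilarity measure, the objects are short equal-length vectors (for instance, coarse spectra), and the three prototypes are invented values rather than real data.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def dissimilarity_vector(x, representation_set, measure=euclidean):
    """Map an object to its dissimilarities to the M reference objects."""
    return [measure(x, reference) for reference in representation_set]


# Hypothetical representation set with M = 3 prototypes.
prototypes = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]

d = dissimilarity_vector([0.0, 0.0], prototypes)
print(d)  # one coordinate per prototype
```

Any classifier that operates on vectors can then be trained on such M-dimensional vectors; swapping `measure` for a non-metric dissimilarity leaves the rest of the construction unchanged.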

Classification approaches
The last block in Fig. 5 consists of applying classification algorithms to infer a class label $\omega(\mathbf{x}) \in \Omega$, where $\Omega = \{\omega_1, \ldots, \omega_K\}$ is the set of labels for the K different types of volcanic earthquakes to be identified. According to the nature of the classification algorithms, three different approaches can be distinguished (Jain et al., 2000): similarity-based classifiers, density-based classifiers and geometric classifiers. These approaches are succinctly described below, including the relatively recent strategy of combining multiple classifiers. A thorough presentation of the classification algorithms can be found in several good textbooks on the subject of PR, such as the ones by Duda et al. (2001), Webb (2002), van der Heijden et al. (2004), Theodoridis & Koutroumbas (2006) and Bishop (2006).

Similarity-based classifiers
This classification approach is based on the elementary rationale of resemblance, i.e. similar events (volcanic earthquakes in our problem) should be identified as belonging to the same class. Among the classifiers in this category, the following two are widely used: the nearest mean classifier (NMC) and the k-nearest neighbor (k-NN) rule. In the first one, the decision is taken by examining the class label of the closest vector among the per-class mean vectors; in the second one, the class label $\omega(\mathbf{x})$ assigned to a new incoming event is the most frequent label among its k closest training events in the vector space (for k = 1, the single closest event defines the label).
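Both rules can be written compactly, as the following sketch shows. The two features (a dominant frequency in Hz and a duration in s) and all numerical values are hypothetical and serve only to make the example concrete; the features are deliberately left unscaled to keep the code short, although some normalization would normally be advisable.

```python
import math
from collections import Counter


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def class_means(X, y):
    """Mean feature vector of each class."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nmc_predict(x, means):
    """Nearest mean classifier: label of the closest class mean."""
    return min(means, key=lambda label: distance(x, means[label]))


def knn_predict(x, X, y, k=3):
    """k-NN rule: majority label among the k closest training vectors."""
    neighbors = sorted(zip(X, y), key=lambda pair: distance(x, pair[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training set: [dominant frequency (Hz), duration (s)].
X_train = [[7.0, 10.0], [8.0, 12.0], [6.5, 9.0],
           [1.5, 40.0], [2.0, 60.0], [2.5, 50.0]]
y_train = ["VT", "VT", "VT", "LP", "LP", "LP"]

means = class_means(X_train, y_train)
print(nmc_predict([7.5, 11.0], means), knn_predict([7.5, 11.0], X_train, y_train))
print(nmc_predict([2.0, 45.0], means), knn_predict([2.0, 45.0], X_train, y_train))
```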

Density-based classifiers
These classifiers are based on the well-known Bayesian decision theory, i.e. on the application of the Bayes decision rule, which consists of maximizing the posterior probability $P(\omega_k \mid \mathbf{x})$ across $\Omega$. $P(\omega_k \mid \mathbf{x})$ is, in turn, proportional to the conditional probability density $p(\mathbf{x} \mid \omega_k)$ weighted by the prior probability $P(\omega_k)$. Costs of misclassifications are often included in the rule as an additional weighting parameter.
The key issue in this approach is the estimation of the conditional probability densities, i.e. $p(\mathbf{x} \mid \omega_k)$. A distinction between parametric and nonparametric estimates can be made (Jain et al., 2000). The parametric case corresponds to the assumption of a model for the probability density (e.g. a Gaussian distribution), whereas the nonparametric one consists of estimating the probability densities either by the standard histogramming technique or by defining window functions in the vector space. Such windows define the contribution of the samples contained in them to the estimate of the probability density. A further division in the window-based nonparametric case is the one between the Parzen window approach and the k-nearest-neighbor method, depending on whether the estimation process is space-invariant or not, respectively.
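Written out with the symbols used above, the rule reads as follows. The costs $\lambda_{ik}$ (the cost of deciding $\omega_i$ when the true class is $\omega_k$) and the risk $R$ are notation introduced here for illustration only.

```latex
\hat{\omega}(\mathbf{x}) = \arg\max_{\omega_k \in \Omega} P(\omega_k \mid \mathbf{x}),
\qquad
P(\omega_k \mid \mathbf{x}) =
  \frac{p(\mathbf{x} \mid \omega_k)\, P(\omega_k)}
       {\sum_{j=1}^{K} p(\mathbf{x} \mid \omega_j)\, P(\omega_j)}

% With misclassification costs, the decision minimizes the conditional risk:
\hat{\omega}(\mathbf{x}) = \arg\min_{\omega_i \in \Omega} R(\omega_i \mid \mathbf{x}),
\qquad
R(\omega_i \mid \mathbf{x}) = \sum_{k=1}^{K} \lambda_{ik}\, P(\omega_k \mid \mathbf{x})
```

Since the denominator is the same for every class, it can be dropped when only the maximizing label is needed; with zero cost for correct decisions and unit cost for every error, the minimum-risk rule reduces to the maximum-posterior rule.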
Consider again the parametric case and the assumption of Gaussian distributions. The parameters to be estimated are the mean vectors and the covariance matrices. According to the assumptions made about the latter, two well-known decision rules result: (1) the Bayes-normal-linear classifier (LDC), when the covariance matrices are assumed to be equal; (2) the Bayes-normal-quadratic classifier (QDC), when the covariance matrices are assumed to be different.
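The difference between the two rules can be illustrated with a deliberately reduced sketch that uses a single feature, so that each covariance matrix collapses to a scalar variance. The training values (dominant frequencies in Hz) are invented, priors are estimated from class frequencies, and the pooled variance is taken as the prior-weighted average of the class variances; all of these are assumptions of the example.

```python
import math


def fit_gaussians(samples_by_class, shared_variance):
    """Estimate mean, variance and prior per class (univariate case)."""
    n_total = sum(len(values) for values in samples_by_class.values())
    params = {}
    for label, values in samples_by_class.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        params[label] = {"mean": mean, "var": var, "prior": len(values) / n_total}
    if shared_variance:  # LDC assumption: one common variance for all classes
        pooled = sum(p["prior"] * p["var"] for p in params.values())
        for p in params.values():
            p["var"] = pooled
    return params


def log_score(x, p):
    """log p(x | class) + log P(class) for a univariate Gaussian model."""
    return (-0.5 * math.log(2.0 * math.pi * p["var"])
            - (x - p["mean"]) ** 2 / (2.0 * p["var"])
            + math.log(p["prior"]))


def classify(x, params):
    """Bayes rule: pick the class with the largest (log) posterior score."""
    return max(params, key=lambda label: log_score(x, params[label]))


# Hypothetical dominant frequencies (Hz) of labeled events.
training = {"VT": [6.0, 7.0, 8.0, 9.0], "LP": [1.0, 2.0, 3.0]}

ldc = fit_gaussians(training, shared_variance=True)
qdc = fit_gaussians(training, shared_variance=False)
print(classify(7.0, ldc), classify(2.5, qdc))
```

In this one-dimensional setting the LDC boundary is a single threshold, whereas the QDC may produce up to two thresholds because the quadratic terms of the two classes no longer cancel.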
Seismic volcanic signals are composed of sequential data, analogously to speech records and time series. A widely used tool for modeling and classifying such sequences is the hidden Markov model (HMM). An HMM is composed of a set of states, a matrix of transition probabilities between the states, a vector of initial probabilities and an emission model. HMM-based classification typically consists of training one HMM for each class and, afterwards, using a density-based classifier. Additional details of this method are not given here but can be found in (Rabiner, 1989) as well as in the reviewed studies referenced in Sec. 2.3.
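The classification step can be sketched as follows, assuming that one model per class is already available: the likelihood of the observed sequence under each model is computed with the scaled forward algorithm, and the class whose model gives the largest value is chosen (equal priors are assumed). The two-state left-to-right models, the binary observation alphabet (0 for a low-frequency frame, 1 for a high-frequency frame) and every probability below are hypothetical; training with the Baum-Welch algorithm is not shown.

```python
import math


def log_likelihood(obs, initial, transition, emission):
    """Scaled forward algorithm for a discrete-emission HMM."""
    n_states = len(initial)
    alpha = [initial[s] * emission[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * transition[r][s] for r in range(n_states))
                * emission[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob


# Hypothetical left-to-right models, one per class.
MODELS = {
    "LP": {"initial": [1.0, 0.0],
           "transition": [[0.9, 0.1], [0.0, 1.0]],
           "emission": [[0.9, 0.1], [0.95, 0.05]]},
    "VT": {"initial": [1.0, 0.0],
           "transition": [[0.8, 0.2], [0.0, 1.0]],
           "emission": [[0.1, 0.9], [0.3, 0.7]]},
}


def classify(obs, models=MODELS):
    """Pick the class whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda label: log_likelihood(obs, **models[label]))


print(classify([1, 1, 1, 1, 0, 1]), classify([0, 0, 0, 0, 0, 0]))
```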

Geometric classifiers
In these classifiers, decision boundaries are built by optimizing a performance criterion instead of considering proximities or densities as in the two previous approaches. Examples of geometric classifiers are Fisher's linear discriminant, decision trees, single- and multi-layer perceptrons (and, in general, artificial neural networks) and the support vector classifier. Here we only describe the last two classifiers in more detail since they are the most used in volcano seismology applications, as will be discussed in Sec. 2.
Artificial neural networks (ANNs) are able to implement linear as well as nonlinear classifiers, depending on their architecture (number of layers and number of neurons) and training method. In spite of their tricky tuning procedures, they are still extensively used due to their flexibility and potentially good performance. Nonetheless, the emergence of the support vector method has progressively displaced ANNs from their consideration as general solutions for classification and regression; indeed, over the last 15 years, the support vector method has gained a solid theoretical development and an overwhelming number of applications. In a few words, the basic principle of the support vector classifier (SVM) is to maximize the margin between two classes, which is defined by the so-called support vectors: the training examples closest to the decision boundary. The SVM is extended to nonlinear and multiclass problems by using strategies called the kernel trick and the one-against-rest approach, respectively. Further details can be found in some of the PR textbooks cited above as well as in the original work by Vapnik (1998).
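As a rough illustration of the margin principle, the sketch below trains a linear soft-margin classifier by batch subgradient descent on the regularized hinge loss. This is one simple way of approximating a linear SVM and is not the quadratic-programming formulation found in the textbooks; the kernel trick and the multiclass extension are not covered, and the data, regularization constant, learning rate and number of epochs are arbitrary assumptions.

```python
def train_linear_svm(X, y, lam=0.01, learning_rate=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean hinge loss by subgradient descent."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b
    return w, b


def predict(x, w, b):
    """Sign of the decision function, returned as +1 or -1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, linearly separable two-class data with labels +1 and -1.
X_train = [[2.0, 2.0], [3.0, 3.0], [3.0, 2.0],
           [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
y_train = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X_train, y_train)
print(w, b, [predict(x, w, b) for x in X_train])
```

Only the examples with a margin below one contribute to the update, which mirrors the role of the support vectors: points far from the boundary do not influence the solution.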

Combination of multiple classifiers
The strategy of combining multiple classifiers aims to exploit (1) the availability of multiple sources of data from different sensors or representations, and (2) the possibilities of training several classifiers on the same training set and of performing different tuning sessions for the same classifier. The data mentioned in item 1 may belong either to the same events or to different ones. Most seismic volcanic data sets are multiple in nature since they are acquired at multiple recording stations and across several months or years; thereby, multiple sources (stations) for the same events are often available and different sets of examples can be arranged by date of acquisition.
Several strategies for combining classifiers have been proposed. They are typically categorized according to their architecture into parallel, serial and hierarchical; or according to the combination rule into static and trainable (Kuncheva, 2004). PR systems that include these strategies are called multiple classifier systems. There has been a sustained interest in this field during the last decade, as evidenced by the series of workshops started by Kittler & Roli (2000) and recently organized by Gayar et al. (2010).
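Two static combination rules in a parallel architecture can be sketched as follows: majority voting over the labels produced by the individual classifiers, and the mean rule over their posterior estimates. The scenario of three classifiers, each fed by a different station, and the numerical posteriors are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Static rule on crisp outputs: the most frequent label wins."""
    return Counter(labels).most_common(1)[0][0]


def mean_rule(posteriors):
    """Static rule on soft outputs: average the posteriors, then maximize."""
    classes = posteriors[0].keys()
    average = {c: sum(p[c] for p in posteriors) / len(posteriors) for c in classes}
    return max(average, key=average.get), average


# Hypothetical outputs of three station-wise classifiers for the same event.
station_labels = ["VT", "LP", "VT"]
station_posteriors = [
    {"VT": 0.6, "LP": 0.4},
    {"VT": 0.2, "LP": 0.8},
    {"VT": 0.5, "LP": 0.5},
]

voted = majority_vote(station_labels)
fused, average = mean_rule(station_posteriors)
print(voted, fused, average)
```

The two rules need not agree, as this example shows: the vote ignores how confident each classifier is, whereas the mean rule lets one confident classifier outweigh two hesitant ones.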

A review of research on automated identification of volcanic earthquakes
This section is meant to be a compact but comprehensive survey of research efforts, achievements and case studies on the automatic classification of seismic volcanic signals. Reviewed studies are grouped into categories according to the various approaches and methods discussed in Secs. 1.3.2 and 1.3.3.

Research teams and study sites
A literature search was performed in the main technical databases. Most of the applications of the automated identification of volcanic earthquakes have been undertaken through inter-institutional and international research collaborations of four teams, composed of: (1) Departamento de Teoría de la Señal, Telemática y Comunicaciones, Universidad de Granada, Spain; and Instituto Andaluz de Geofísica, Universidad de Granada, Spain; (2) Dipartimento di Fisica, Università di Salerno, Italy; and Osservatorio Vesuviano, Istituto Nazionale di Geofisica e Vulcanologia, Italy; (3) Departamentos de Ingeniería Eléctrica y Física, Universidad de La Frontera, Temuco, Chile; and Observatorio Volcanológico de los Andes del Sur, Servicio Nacional de Geología y Minería, Chile; and (4) Departamento de Informática y Computación, Universidad Nacional de Colombia Sede Manizales, Colombia; Observatorio Vulcanológico y Sismológico de Manizales, INGEOMINAS, Colombia; and Pattern Recognition Lab, Delft University of Technology, The Netherlands. In these collaborative studies, it seems that spatial proximity between volcano observatories and at least one expert in DSP and/or PR encourages collaboration, probably due to the possibility of establishing informal communication, as pointed out by Katz & Martin (1997). Other active teams are composed of personnel from Istituto Nazionale di Geofisica e Vulcanologia, Catania, Italy; and Institut für Erd- und Umweltwissenschaften, Universität Potsdam, Germany.
The studies found were applied to data sets from the following volcanoes: Ambrym Volcano, Vanuatu (AMV); Deception Island Volcano, Antarctica (DIV); Etna Volcano, Italy (ETV); Las Cañadas Volcano, Tenerife, Spain (LCV); Llaima Volcano, Chile (LLV); Mt. Merapi Volcano, Indonesia (MMV); Mt. Vesuvius Volcano, Italy (MVV); Nevado del Ruiz Volcano, Colombia (NRV); Phlegraean Fields, Italy (PFV); San Cristóbal Volcano, Nicaragua (SCV); Soufrière Hills Volcano, Montserrat (SHV); Stromboli Volcano, Italy (STV); and Villarrica Volcano, Chile (VRV). Other studies are not applied to signals of volcanic origin but to tectonic seismic events. Given the affinity between these two problems, such studies have also been reviewed here. Data considered in those studies come from the European Broadband Network (EBN), the Mediterranean Seismic Network (MSN), the Hyblean Plateau network (HPN), the Marmara Region Network (MRN) and the Bavarian Earthquake Service Network (BEN). See Table 1 for associations between study sites and publications.

Applications and representation approaches
Raw seismic signals are the simplest and most straightforward representation to be provided to a classifier. That option exempts designers from the need to find good features and may be convenient if sufficient training examples are available. However, building a vector space from the original time samples has the following drawbacks: (1) it is mandatory to have equal-length and aligned signals, which is often not possible due to the intrinsically variable duration of seismic events; and (2) high-dimensional vector spaces are spanned by the samples and, thereby, large training sets are required in order to avoid the "curse of dimensionality" phenomenon. The second drawback can be overcome by applying dimensionality reduction techniques such as principal component analysis (PCA) and feature selection methods. Avossa et al. (2003) adopted this approach, reducing the dimension from 240 to 15. Langer & Falsaperla (2003), Ursino et al. (2001) and Langer et al. (2006) used the autocorrelation function instead of the original waveforms in order to avoid the phase alignment problem.
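The reason why the autocorrelation function sidesteps the alignment problem can be checked with a small sketch: two copies of the same pulse that start at different positions within a zero-padded window yield identical autocorrelation sequences. The normalization by the zero-lag value, the maximum lag and the toy pulse are choices made for this illustration and are not necessarily those of the cited studies.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation for lags 0..max_lag, normalized by the zero-lag value."""
    n = len(x)
    r0 = sum(v * v for v in x)
    acf = []
    for lag in range(max_lag + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        acf.append(r / r0 if r0 else 0.0)
    return acf


# The same pulse placed at two different positions in the analysis window.
early = [0, 0, 1, 2, 1, 0, 0, 0]
late = [0, 0, 0, 0, 1, 2, 1, 0]

acf_early = autocorrelation(early, max_lag=3)
acf_late = autocorrelation(late, max_lag=3)
print(acf_early)
print(acf_late)
```

The price paid for this shift invariance is the loss of phase information, since the autocorrelation depends only on the power spectrum of the segment.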
Morphological features can be extracted directly from the examination of the waveforms. Curilem et al. (2009) measured the following values from the absolute value of the signals: standard deviation, mean, median and maximum value, as well as kurtosis and skewness from a histogram of the signal amplitudes. Scarpetta et al. (2005) and Esposito et al. (2006) extracted time-domain information by computing properly normalized differences between the maximum and minimum signal amplitudes. Similarly, Ezin et al. (2002) measured maximum and minimum signal amplitudes, Yıldırım et al. (2011) obtained peak S-to-P amplitude ratios and complexity values, and Rouland et al. (2009) detected the presence or absence of S-waves. Signal envelopes, which are smoothed versions of the original waveforms, were also tested for data representation by Falsaperla et al. (1996), Langer & Falsaperla (2003) and Beyreuther et al. (2008). A collection of morphological and statistical attributes of the waveforms was considered in the study by Langer et al. (2006). The most specialized representation is that reported in (Ohrnberger, 2001, Chap. 7) and (Beyreuther & Wassermann, 2008), which includes several wavefield parameters.
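A feature vector of the kind described first in this paragraph (statistics of the rectified signal) might be assembled as in the following sketch. Note the simplification: skewness and kurtosis are computed here directly from the samples as standardized third and fourth moments, which is an assumption of this example and not necessarily the histogram-based procedure of the cited study.

```python
import statistics


def morphological_features(signal):
    """Simple statistics of the absolute value of a segmented event."""
    rectified = [abs(v) for v in signal]
    mean = statistics.fmean(rectified)
    std = statistics.pstdev(rectified, mu=mean)
    if std > 0:
        skewness = statistics.fmean([((v - mean) / std) ** 3 for v in rectified])
        kurtosis = statistics.fmean([((v - mean) / std) ** 4 for v in rectified])
    else:  # constant signal: higher moments are undefined, report zeros
        skewness = kurtosis = 0.0
    return {
        "std": std,
        "mean": mean,
        "median": statistics.median(rectified),
        "max": max(rectified),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


features = morphological_features([-1.0, 1.0, -3.0, 3.0])
print(features)
```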
An alternative consists of computing intermediate representations, usually spectra and spectrograms, because differences in spectral content allow a visual discrimination of the different types of volcanic earthquakes (Zobin, 2003, Chap. 9). This approach was followed by Orozco-Alzate et al. (2008 [...] (2006); and Porro-Muñoz et al. (2010a;b; 2011). In the first four studies, the computation of spectra was followed by dimensionality reduction techniques such as sequential feature selection, PCA and Fisher mapping. In the remaining ones, dissimilarity representations were computed after transforms to the frequency or the time-frequency domain. Porro-Muñoz et al. (2010a;b; 2011) included multiway data analysis techniques, see Sec. 3.5.
Additional features can be extracted from spectral representations by measuring morphological attributes such as the mean frequencies of the five highest peaks, energies in given frequency bands (Curilem et al., 2009; Romeo, 1994; Romeo et al., 1995) and the instantaneous frequency (Beyreuther et al., 2008), or by computing variables such as the Mel-frequency cepstral coefficients (MFCCs), their associated log-energies and the so-called delta and delta-delta coefficients (Benítez et al., 2007; Gutiérrez et al., 2009; 2006). Spectra and spectrograms are typically computed by using the Fourier or the cosine transforms. Other transforms, such as the Hilbert and wavelet transforms, have also been applied for representation; e.g. by Riggelsen et al. (2007), San-Martín et al. (2010) and Porro-Muñoz et al. (2010b).
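Band energies and a dominant frequency can be obtained from a power spectrum as sketched below. The example uses a direct evaluation of the discrete Fourier transform to stay within the standard library (a fast Fourier transform routine would normally be used instead); the band limits are loosely based on the LP and VT frequency ranges quoted in Sec. 1.2, while the sampling rate, segment length and synthetic sinusoids are assumptions of the example.

```python
import cmath
import math


def power_spectrum(x, fs):
    """One-sided power spectrum via a direct DFT (O(N^2), for illustration)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def band_energy(freqs, power, f_lo, f_hi):
    """Sum of spectral power in the half-open band [f_lo, f_hi)."""
    return sum(p for f, p in zip(freqs, power) if f_lo <= f < f_hi)


def dominant_frequency(freqs, power):
    """Frequency of the highest spectral peak."""
    return freqs[max(range(len(power)), key=power.__getitem__)]


FS = 100.0  # assumed sampling rate in Hz
N = 200     # two seconds of data

lp_like = [math.sin(2 * math.pi * 2.0 * t / FS) for t in range(N)]  # 2 Hz tone
vt_like = [math.sin(2 * math.pi * 8.0 * t / FS) for t in range(N)]  # 8 Hz tone

for name, sig in (("LP-like", lp_like), ("VT-like", vt_like)):
    f, p = power_spectrum(sig, FS)
    print(name, dominant_frequency(f, p),
          band_energy(f, p, 0.5, 5.0), band_energy(f, p, 6.0, 10.0))
```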
The linear predictive coding (LPC) coefficients have been widely used in speech recognition and, by extension, also chosen for representation in several projects of seismic signal classification (Del Pezzo et al., 2003; Esposito et al., 2007; 2006; 2005; Ezin et al., 2002; Scarpetta et al., 2005). They are intended to predict each sample as a linear combination of several previous ones, exploiting the correlation between successive samples in a seismic signal.
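To make the idea of linear prediction concrete, the sketch below estimates LPC coefficients with the autocorrelation method and the Levinson-Durbin recursion. The choice of this particular estimation method, the function names and the toy signal are assumptions made for illustration; the cited studies may have used other estimators and model orders.

```python
def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of a finite signal."""
    n = len(signal)
    return [sum(signal[t] * signal[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]

def lpc(signal, order):
    """Coefficients a[1..order] such that x[t] is approximated by sum_k a[k] * x[t - k].

    Returns the coefficients and the residual prediction-error energy.
    """
    r = autocorrelation(signal, order)
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for the model of order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error

# Hypothetical toy signal: impulse response of a stable second-order recursion.
true_a = (1.2, -0.81)
toy = [1.0, true_a[0]]
for _ in range(398):
    toy.append(true_a[0] * toy[-1] + true_a[1] * toy[-2])
coefficients, residual_energy = lpc(toy, order=2)
```

For this toy signal the recursion recovers coefficients very close to `true_a`; for real seismic records the coefficients are only an approximate, compact description of the spectral envelope.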

Applications and classification approaches
In the majority of the reviewed applications, ANNs have been used for classification, particularly multilayer perceptrons (MLPs). Summarized descriptions of publication references, input-hidden-output architecture (number of neurons per layer) and training method are shown in Table 2. Architecture and training method, in almost all the studies, were selected either by trial and error or by agreement with a previous publication. An exception is the study by Curilem et al. (2009), who optimized the size of the hidden layer and selected the training process by means of a genetic algorithm, finding that 14 hidden neurons and the Levenberg-Marquardt training algorithm were the optimal choice.
HMMs have been widely used in the speech recognition framework. Given the analogous nature of speech and seismic signals, authors have also successfully applied them to the automated classification of volcanic earthquakes. Similarly to the case of ANNs, the performance of HMMs is controlled by several free parameters, namely: the topology of the models, the number of states for the models, the number of multivariate Gaussian probability density functions and the number of iterations of the Baum-Welch algorithm for training.
Topology usually corresponds to a left-to-right configuration. Values used for the second parameter (the number of states) in the reviewed applications are listed in Table 3.
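The effect of the topology on the number of free parameters can be illustrated with a short sketch that counts the free transition probabilities of a left-to-right model and of an ergodic (fully connected) one. The function names and the `max_jump` parameter are assumptions for the example; a left-to-right model is taken here to allow only self-transitions and forward jumps of at most `max_jump` states.

```python
def transition_mask(n_states, topology, max_jump=1):
    """Boolean matrix marking which state transitions are allowed."""
    mask = []
    for i in range(n_states):
        if topology == "ergodic":
            row = [True] * n_states
        elif topology == "left-to-right":
            # Stay in the same state or move forward by at most max_jump states.
            row = [i <= j <= i + max_jump for j in range(n_states)]
        else:
            raise ValueError("unknown topology: " + topology)
        mask.append(row)
    return mask

def free_transition_parameters(mask):
    """Each row sums to one, so it has (allowed transitions - 1) free values."""
    return sum(max(sum(row) - 1, 0) for row in mask)

n_states = 5
ergodic_params = free_transition_parameters(transition_mask(n_states, "ergodic"))
left_to_right_params = free_transition_parameters(
    transition_mask(n_states, "left-to-right"))
```

With five states, the ergodic model has 20 free transition probabilities while the left-to-right model without skips has only 4, which reduces the amount of training data needed to estimate the model.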
A conceptual discussion on the use of wavelet-based HMMs for the classification of seismic volcanic signals is presented in (Alasonati et al., 2006). Several reasons have motivated researchers to prefer a left-to-right HMM topology over an ergodic one; Ohrnberger (2001, Chap. 7) points out the following: (1) seismic signals are causal in time; (2) seismic signals are analogous to speech signals, for which left-to-right models are widely used; and (3) for an equal number of states, a left-to-right topology has fewer degrees of freedom than an ergodic one. Readers are referred again to (Rabiner, 1989) for details on the difference between these two topologies. Dynamic Bayesian networks are a generalization of HMMs; Riggelsen et al. (2007) applied them to the real-time identification of seismic signals.
(2010). Authors of the first study built classifiers on top of a classical feature representation while the others employed simple ones, either in the dissimilarity space or to be combined in a second step of the classification process as explained at the end of Sec. 1.3.3. The reader is referred again to Table 1 to associate studies and classifiers.
Hoogenboezem (2010) presented a compendious survey of classifiers and representations applied to signals from NRV. However, more rigorous experimentation and statistical comparisons are a must when planning a comprehensive study.
Recommendations such as those made by Demšar (2006), Duin (1996) and Salzberg (1997), as well as those in Sec. 2.4, should be taken into account. An additional concern is the methodological rigor in the evaluation of performances for multiclass problems; even though most of the studies report confusion matrices, others draw conclusions from overall accuracies that are likely to be unreliable for multiclass and/or unbalanced data sets.
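The following sketch illustrates why an overall accuracy can be misleading on unbalanced multiclass data. The class labels and the counts are hypothetical and chosen only for the example; they are not results from any of the reviewed studies.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

def overall_accuracy(matrix):
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def per_class_recall(matrix):
    """Fraction of each true class that was correctly recognized."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(matrix)]

# Hypothetical unbalanced test set: 90 VT, 8 LP and 2 HB events.
classes = ["VT", "LP", "HB"]
true_labels = ["VT"] * 90 + ["LP"] * 8 + ["HB"] * 2
predicted = ["VT"] * 90 + ["VT"] * 6 + ["LP"] * 2 + ["VT"] * 2

matrix = confusion_matrix(true_labels, predicted, classes)
accuracy = overall_accuracy(matrix)
recalls = per_class_recall(matrix)
```

In this toy case the overall accuracy is 0.92, yet only a quarter of the LP events and none of the HB events are recognized, which only the confusion matrix or the per-class rates reveal.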
This subsection concludes with a mention of the following studies dealing with the unsupervised classification problem: (Ansari et al., 2009; Esposito et al., 2007; 2008; 2005; Orozco-Alzate & Castellanos-Domínguez, 2007). They are aimed at finding clusters in seismic volcanic data and understanding their structure. A separate chapter would be required to properly discuss them.

The need for a benchmarking data set
Classification accuracies and other performance measures reported in the literature are not comparable across the reviewed studies because, unfortunately, there are no standard and publicly available data sets of seismic volcanic signals. Furthermore, authors have used different sets even when they performed studies for the same volcano. Thus, the need for a benchmarking data set is evident. Researchers in this field are encouraged to define such a reference set to be made available for rigorous comparative studies. Ultimately, it is the only reliable way of measuring relative system performance.

Open issues and research opportunities
The area of PR has developed into a mature engineering field (Duin & Pekalska, 2005). As a result, in practical applications and particularly in volcano seismology, a number of recent approaches and techniques have not yet been explored. This section is concerned with future directions for research, considering not just the state of the art in PR but also possibilities offered by the development of sensors and computer resources. Prospective projects are briefly outlined, considering novel approaches such as multiple instance learning, one-class classification, adaptive single and multiple classifiers, classifier optimization and multiway representations.

Multiple instance learning
A multiple instance problem occurs when training objects are naturally organized into bags of feature vectors, also known as multisets, instead of being composed of individually labeled ones (Ray & Craven, 2005). It happens, for instance, when objects are too rich and contain too many details and too much information to be easily represented by a single feature vector (Tax & Duin, 2008), e.g. images that depict several objects in the same picture in addition to the one of interest, which is also known as the concept. Feature vectors in the bag (called instances in this framework) are assumed to be independent and are not individually labeled, since class labels are only assigned to the complete bags. In a two-class case, with a positive class and a negative class to be distinguished, a negative bag only contains vectors that are not members of the concept, whereas a positive bag contains at least one vector that is a member of the concept and, consequently, may contain other vectors that are not.
A prospective application of multiple instance learning to the automated identification of volcanic earthquakes would consider waveforms and spectrograms as bags of feature vectors.
In such a way, labels might be more accurately assigned to those segments in the signals or patches in the spectrograms clearly belonging to the concept class. Moreover, ill-defined classes, e.g. the HB events, might be treated more properly.
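The bag-labeling assumption described above can be sketched as follows, treating a waveform as a bag of fixed-length segments. This toy example only illustrates how bag labels relate to instance labels; it is not a multiple instance learning algorithm, and the segmentation scheme, the amplitude-based instance score and the threshold are hypothetical.

```python
def segment(signal, length, step):
    """Split a waveform into overlapping segments (the instances of a bag)."""
    return [signal[s:s + length]
            for s in range(0, len(signal) - length + 1, step)]

def instance_score(instance):
    # Hypothetical instance-level score: mean absolute amplitude.
    return sum(abs(x) for x in instance) / len(instance)

def positive_instances(bag, threshold):
    """Indices of the instances regarded as members of the concept."""
    return [i for i, inst in enumerate(bag) if instance_score(inst) > threshold]

def bag_label(bag, threshold):
    """A bag is positive if at least one of its instances is positive."""
    return len(positive_instances(bag, threshold)) > 0

# Hypothetical toy records: background noise, with and without a short event.
quiet = [0.01] * 100
event = [0.01] * 40 + [1.0] * 20 + [0.01] * 40

quiet_bag = segment(quiet, length=20, step=10)
event_bag = segment(event, length=20, step=10)
event_hits = positive_instances(event_bag, threshold=0.6)
```

Here the whole record containing the event is labeled positive, while `event_hits` points to the single segment that actually carries the concept.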

One-class classification
Seismic signal classification problems are unbalanced. Events of some classes are very common and, therefore, a lot of examples are available. In contrast, other classes are rare and just a few examples of them can be collected. Based on the given examples, only a boundary descriptor of the most frequent classes can be accurately built. Considering a rare type of seismic events as the outliers and the rest of the events as the target class clearly follows the definition of a one-class classification problem (Juszczak, 2006; Tax, 2001).
One-class classifiers are sound alternatives to multi-class ones for cases when rare or abnormal states are very infrequent, costly to force (e.g. faults in machinery) or impossible to obtain upon request: a person cannot be asked to get sick with particular symptoms and a volcano cannot be artificially induced to exhibit particular rare seismic events. This approach, to the best knowledge of the authors, has not yet been applied to the automated identification of earthquakes.
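A very simple one-class classifier can be sketched by describing the target class with a Gaussian model and rejecting objects that lie too far from it. The class name, the diagonal-covariance simplification, the rejection fraction and the toy data are all assumptions made for this example; the one-class methods discussed in the cited theses are considerably more elaborate.

```python
import statistics

class GaussianDataDescription:
    """Diagonal-covariance Gaussian description of the target class."""

    def __init__(self, reject_fraction=0.05):
        # Fraction of training targets allowed to fall outside the boundary.
        self.reject_fraction = reject_fraction

    def fit(self, targets):
        columns = list(zip(*targets))
        self.means = [statistics.fmean(c) for c in columns]
        # Assumes every feature has nonzero spread in the training set.
        self.stds = [statistics.pstdev(c) for c in columns]
        distances = sorted(self.distance(x) for x in targets)
        keep = max(int(len(distances) * (1 - self.reject_fraction)) - 1, 0)
        self.threshold = distances[keep]
        return self

    def distance(self, x):
        """Squared distance to the mean, scaled per feature."""
        return sum(((v - m) / s) ** 2
                   for v, m, s in zip(x, self.means, self.stds))

    def is_target(self, x):
        return self.distance(x) <= self.threshold

# Hypothetical two-dimensional feature vectors of frequent (target) events.
targets = [(i * 0.1, j * 0.1) for i in range(-5, 6) for j in range(-5, 6)]
model = GaussianDataDescription(reject_fraction=0.05).fit(targets)
accepted = model.is_target((0.0, 0.0))
rejected = not model.is_target((5.0, 5.0))
```

Only examples of the frequent events are needed for training; a rare event is flagged simply because it falls outside the description of the target class.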

Adaptive single and multiple classifiers
Seismic signals of the same events may look completely different across seismic stations, waveforms of the same classes of events differ among volcanoes and, moreover, volcano geophysical conditions change over time. This dynamic nature motivates the application of classifier adaptation strategies, either for single or multiple classifiers (Aksela, 2007), which make it possible to learn from the test set in order to adapt or modify the decision regions.
Individually adaptive classifiers have been employed in optical character recognition (OCR) in order to prevent accuracy deterioration due to the statistical dissimilarity between the training and test data (Veeramachaneni & Nagy, 2003). Such a dissimilarity is introduced in OCR by the proliferation of fonts and typefaces. Similarly, in speech recognition, adaptation has been extensively applied to deal with unseen conditions or time-variant speakers (Herbig et al., 2011). In summary, undertaking an exploratory study on the application of adaptive single and multiple classifiers may provide a convenient solution for seismic signal classification under the varying conditions mentioned above. It might indeed be an alternative to re-training or entirely re-designing deployed PR systems.

Classifier optimization
The relative importance of different classification outcomes must be taken into account when optimizing and evaluating the design of a PR system. Such differences are reflected in a trade-off between the values of true positive rate and false positive rate and can be represented in receiver operating characteristic (ROC) curves, whose examination gives the designer insights to tune the classifiers. Classical ROC curves are restricted to two-class problems, in which one class is designated as positive (target) and the other one is assumed to be negative.
The automated classification of seismic volcanic signals is a multiclass PR problem. Therefore, the application of classical ROC analysis is only possible under a one-against-rest approach. Nonetheless, recent research efforts have extended ROC analysis to multiclass cases while overcoming restrictive computational complexity issues that limit straightforward multiclass generalizations; see for instance (Landgrebe, 2007; Landgrebe & Paclík, 2010; Paclík et al., 2010). Optimal classification systems for the automated identification of volcanic earthquakes might be designed by using those novel ROC approaches.
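The classical one-against-rest analysis mentioned above can be sketched as follows: one class is designated as the target, and the ROC curve is traced by sweeping a decision threshold over the classifier scores for that class. The scores and labels are hypothetical, and the sketch assumes distinct score values (ties are not handled). It does not implement the multiclass extensions proposed in the cited works.

```python
def roc_points(scores, is_positive):
    """(false positive rate, true positive rate) pairs for a swept threshold."""
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    ranked = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, positive in ranked:
        if positive:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal integration of the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores for the target class "LP" and the true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = ["LP", "LP", "VT", "LP", "HB", "VT"]
is_lp = [label == "LP" for label in labels]  # one-against-rest relabeling

curve = roc_points(scores, is_lp)
auc = area_under_curve(curve)
```

Repeating the procedure with each class in turn as the target gives one curve per class, from which class-specific operating points can be chosen.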

Multiway representations
Multiway data analysis has been extensively used in chemometrics and psychometrics. It extends classical multivariate statistical techniques such as component analysis, factor analysis, cluster analysis, correspondence analysis, and multidimensional scaling to multiway data (Kroonenberg, 2008). Multiway means that data are arranged in high-order arrays instead of the usual two-dimensional matrices, in which each row represents an object and each column is associated with a feature or measurement.

Challenges and constraints in deploying automated systems
This section is devoted to a discussion on the difficulties and challenges for the design and deployment of custom solutions at the volcano observatories. Technical challenges and non-technical constraints are summarized. Lastly, a few remarks concerning industrial and commercial implementation alternatives are made.

Technical challenges and non-technical constraints
Technical challenges in the deployment of PR systems for the automated recognition of seismic volcanic signals are mainly related to the following issues: (1) computational aspects and (2) local conditions. The first issue depends on the actual computational requirements of classification algorithms and their associated demands for data storage. The latter is becoming less relevant since disk storage capacity has grown exponentially and hardware prices have declined. In spite of that, processing the stored data may still be cumbersome, especially when dealing with continuous recording as commented by Langer & Falsaperla (2003). Classification speed is of crucial importance for real-time applications. Computational complexities of all stages in the PR pipeline (see Fig. 5) must be carefully estimated in terms of orders or FLOPS in order to guarantee fast execution. Such a condition implies a reasonable trade-off between complexity and classification performance.
The second issue (local conditions) includes the consideration of several volcano-specific factors such as those mentioned at the beginning of Sec. 3.3. In addition, the so-called source, path and local site effects require special attention. They cause waveforms of the same seismic event recorded at different stations to exhibit distinct characteristics; for instance, time delays introduced by the physical distance between stations and amplifications or attenuations of signal components at certain frequencies due to geophysical properties that act as filters. See (Havskov & Alguacil, 2004, Chap. 9) and (McNutt, 2005) for further details about these effects, their characterizations and corrections.
Non-technical constraints are mainly related to budget limitations to undertake R&D projects at volcano observatories. Even though the research stage can be achieved in association with universities and institutes, as reflected in the discussion in Sec. 2.1, the development and implementation of in-house solutions is subject to organizational practices and policies at the observatories. Therefore, formalizing high-level collaboration is needed, in such a way that isolated partnerships between individuals become supported by inter-institutional cooperation agreements.

Industrial and commercial alternatives
Almost all the above-reported applications were developed in mathematical scripting languages, such as MATLAB and its free clones; see e.g. (Lesage, 2009). They certainly offer unparalleled advantages in the design of academic prototypes but are often not well suited to deliver tools for real-world applications. Main constraints include the inherent slowness of interpreted languages, external dependencies on third-party toolboxes and prohibitive licensing or pricing terms.
Two alternatives can be identified when economic constraints are critical: (1) developing the entire application from scratch in compiled languages such as Fortran and C, probably incorporating freely available numerical and graphical libraries, see for instance (Ottemöller et al., 2011); and (2) developing programs in high-level and free numerical languages such as OCTAVE and SCILAB, from where stand-alone compiled routines can be invoked, see (Laverde & Manzo, 2009) for an example.
A reasonable trade-off between affordability and performance is offered by a number of commercial software packages that also maintain the advantage of a faster development time in a scripting environment. PERCLASS (formerly PRSD STUDIO) allows the design and easy deployment of PR systems and has proved to be a successful solution in a variety of industrial applications. It is based on the MATLAB platform and follows the style of PRTOOLS (Duin et al., 2007), an academic toolbox for PR, but is not dependent on it. Another possibility is using the SIGNAL PROCESSING TOOLBOX together with the STATISTICS TOOLBOX, both by MATHWORKS, and translating the codes to platform-independent files such as DLLs.

Conclusion
Multiple research studies have shown that PR tools can be successfully used in the volcano-seismic monitoring task. Several data representations have been explored, including raw and processed signals in the time and/or frequency domain as well as other measurements related to geophysical wave properties. ANNs and HMMs have been preferred for the classification stage, thanks to their flexibility and in spite of being heavily parameterized. Other classifiers, on the contrary, do not demand much parameter adjustment and have been used in combination with novel representations such as dissimilarities and multiway configurations.
The state of the art in PR offers a number of new techniques and methods that might be suitably applied to the automated recognition of volcanic earthquakes. Such technological trends and research directions could effectively incorporate inherent properties of the problem, e.g. multiple channels (stations and components), variations over time and multiclass unbalanced nature. Results obtained by different research teams are unfortunately not comparable because different data sets were used across the studies. A rigorous and comprehensive comparison has not yet been made. If undertaken, defining a benchmark set of problems would be mandatory.
Transferring research achievements to the seismological practice demands careful feasibility evaluations of implementation alternatives and would greatly benefit from working cooperation agreements between volcano observatories and universities. One of the ways to achieve an effective technology transfer is the provision of grants and scholarships.

Fig. 4. Examples of seismic volcanic signals observed at Nevado del Ruiz Volcano, together with their associated spectrograms. Events were recorded at Olleta station in 2006. Spectrograms were scaled to highlight the top 50 dB of the signals.

Table 1. Summary of reviewed studies and their associated experimental setups.
Porro-Muñoz et al. (2010a;t times, conditions or locations are suitable to be considered as multiway data sets (Porro-Muñoz et al., 2009). Porro-Muñoz et al. (2010a;b; 2011) derived intuitive multiway representations for classifying seismic volcanic signals. Spectrograms and scalograms are computed for each segmented seismic signal and, afterwards, the whole set is arranged by stacking those initial two-dimensional representations. As a result, a so-called profile-data configuration is obtained, where the three dimensions are associated with signals, time and frequency, respectively. Further studies on the design of custom classifiers for multiway data sets are needed. Moreover, other multiway arrangements might be created by considering, for instance, the recording stations or the sensor components (vertical, North-South, and East-West) as additional ways, i.e. dimensions.
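The stacking of two-dimensional representations into a three-way array can be sketched as follows. The window length, the step, the toy signals and the use of a naive DFT-based spectrogram are assumptions made for this illustration, which also assumes that all segmented signals have the same length; it does not reproduce the processing of the cited studies.

```python
import cmath
import math

def spectrogram(signal, window, step):
    """Time-by-frequency power matrix from non-tapered, overlapping frames."""
    frames = [signal[s:s + window]
              for s in range(0, len(signal) - window + 1, step)]
    return [[abs(sum(x * cmath.exp(-2j * math.pi * k * t / window)
                     for t, x in enumerate(frame))) ** 2
             for k in range(window // 2 + 1)]
            for frame in frames]

def shape(array):
    """Sizes of a regular nested list, outermost dimension first."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)

# Hypothetical set of three equal-length segmented signals.
signals = [[math.sin(2 * math.pi * f * t / 64) for t in range(64)]
           for f in (2, 5, 9)]

# Three-way array: first way = signals, second = time frames, third = frequency.
three_way = [spectrogram(s, window=16, step=8) for s in signals]
three_way_shape = shape(three_way)
```

Additional ways, such as stations or sensor components, would simply add further nesting levels to the same structure.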