Evolutionary Factor Analysis

Because of chemical interconversion, the components of many chemical systems cannot be physically separated, which makes their identification and quantification difficult. The spectra (IR, UV, visible, Raman, CD, etc.) of such systems exhibit overlapping contributions from uncataloged components, confounding both identification and quantification. Strategies based on factor analysis [1], a chemometric technique for handling complex multidimensional problems, are ideally suited to such problems. Abstract factor analysis (AFA) reveals the number of spectroscopically visible components. Evolutionary factor analysis (EFA) [2-4] takes advantage of experimental variables that control the evolution of components, revealing not only the concentration profiles of the components but also their spectra, even when there are no unique concentrations or spectral regions. EFA makes use of the fact that each species has a single, unique maximum in its evolutionary concentration distribution curve.
We have recently applied this self-modeling method to the infrared spectra of stearyl alcohol in carbon tetrachloride solution. The evolutionary process was achieved by increasing the concentration of stearyl alcohol from 0.0090 to 0.0800 g/L in 15 stages, each time recording the IR spectrum from 3206 to 3826 cm⁻¹. The spectra were corrected for baseline shift, solvent absorption and reflectance losses. The 15 spectra were then digitized every 3 cm⁻¹ and assembled into a 35 x 15 absorbance matrix $[A]$ appropriate for factor analysis. The factor indicator function [1], the reduced eigenvalue [5] and cross validation [6] indicated that three species contribute to the observed spectra. Thus AFA expresses the data matrix as the product of a 35 x 3 abstract absorptivity matrix $[E]_{\mathrm{abst}}$ and a 3 x 15 abstract concentration matrix $[C]_{\mathrm{abst}}$.
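The AFA step described above can be sketched numerically. The following is a minimal illustration, not the authors' program: it assumes that the abstract matrices are taken from a truncated singular value decomposition and that the number of factors is read from the minimum of Malinowski's indicator function. The synthetic 35 x 15 data set (three Gaussian bands and three Gaussian evolutionary profiles, plus noise) and all function names are hypothetical.

```python
import numpy as np

def abstract_factor_analysis(A, n_factors):
    """Truncated SVD: A ~ E_abst @ C_abst; also returns all eigenvalues of A'A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    E_abst = U[:, :n_factors] * s[:n_factors]   # rows x n abstract absorptivities
    C_abst = Vt[:n_factors, :]                  # n x cols abstract concentrations
    return E_abst, C_abst, s ** 2

def indicator_function(eigvals, n_rows):
    """Indicator function IND(n) for n = 1 .. c-1, c being the number of eigenvalues."""
    c = len(eigvals)
    ind = []
    for n in range(1, c):
        real_error = np.sqrt(eigvals[n:].sum() / (n_rows * (c - n)))
        ind.append(real_error / (c - n) ** 2)
    return np.array(ind)

# Hypothetical synthetic system: 35 spectral points, 15 stages, 3 species.
rng = np.random.default_rng(0)
x = np.arange(35)[:, None]
stage = np.arange(15)[None, :]
E_true = np.exp(-0.5 * ((x - np.array([8, 17, 26])) / 4.0) ** 2)            # 35 x 3
C_true = np.exp(-0.5 * ((stage - np.array([[2], [7], [12]])) / 3.0) ** 2)   # 3 x 15
noise = 1e-3 * rng.standard_normal((35, 15))
A = E_true @ C_true + noise

# Estimate the number of factors, then form the abstract matrices.
_, _, eigvals = abstract_factor_analysis(A, 1)
n_est = int(np.argmin(indicator_function(eigvals, A.shape[0]))) + 1
E_abst, C_abst, _ = abstract_factor_analysis(A, n_est)
print(n_est, E_abst.shape, C_abst.shape)
```

The reduced eigenvalue and cross validation criteria mentioned in the text are not reproduced here; in practice several criteria would be compared before fixing the number of factors.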

In matrix notation,

$$[A] = [E]_{\mathrm{abst}}\,[C]_{\mathrm{abst}}$$

Because the abstract matrices are mathematical solutions devoid of chemical meaning, they must be transformed into physically meaningful absorptivities and concentrations. This is accomplished by target transformation factor analysis (TFA) [1], a powerful technique that allows one to test factors individually without requiring any a priori information concerning the other factors. A test vector $C_{\mathrm{test}}$ emulating an evolutionary profile can be converted into a predicted vector $C_{\mathrm{pred}}$ that lies completely inside the factor space by finding the transformation vector $T$ that minimizes the sum of squares of the differences between $C_{\mathrm{test}}$ and $C_{\mathrm{pred}}$. The three predicted vectors with maxima corresponding to the unique point of the respective test vector were retained as likely candidates. These crude profiles were refined individually by applying simplex optimization to the respective transformation vector, using a response function designed to minimize negative concentration points and double maxima in the profile.
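The target transformation step can be illustrated with a short least-squares sketch. It assumes that the abstract concentration matrix comes from a rank-3 singular value decomposition and that each test vector is a "uniqueness" vector (unity at one stage, zero elsewhere); both choices, the synthetic noise-free data and the function name `target_transform` are assumptions made for illustration. The simplex refinement of the transformation vectors is not shown.

```python
import numpy as np

def target_transform(C_abst, c_test):
    """Least-squares vector T such that T @ C_abst is as close as possible to c_test."""
    T, *_ = np.linalg.lstsq(C_abst.T, c_test, rcond=None)
    return T, T @ C_abst

# Hypothetical noise-free system: 35 spectral points, 15 stages, 3 species.
stage = np.arange(15)
C_true = np.exp(-0.5 * ((stage - np.array([[2], [7], [12]])) / 3.0) ** 2)   # 3 x 15
x = np.arange(35)[:, None]
E_true = np.exp(-0.5 * ((x - np.array([8, 17, 26])) / 4.0) ** 2)            # 35 x 3
A = E_true @ C_true

# Rows of C_abst span the factor space of the concentration profiles.
_, _, Vt = np.linalg.svd(A, full_matrices=False)
C_abst = Vt[:3, :]

# Test one uniqueness vector per stage; keep predictions that peak at the tested stage.
candidates = {}
for k in range(15):
    c_test = np.zeros(15)
    c_test[k] = 1.0
    _, c_pred = target_transform(C_abst, c_test)
    if int(np.argmax(c_pred)) == k:
        candidates[k] = c_pred
print(sorted(candidates))
```

This selection rule may retain more than three candidates; choosing among them, as the text describes, would require further chemical judgment or the response function used in the simplex step.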
Further refinement was achieved by the following iteration. Because negative regions are meaningless, all data beyond the boundaries marked by the first negative regions encountered to the left and to the right of the peak maximum were truncated. The profiles were normalized so that the sum of squares equals unity and then assembled into a concentration matrix $[C]$. A pseudoinverse of $[C]$ gave the corresponding spectra $[E]$ from $[A]$, and the pseudoinverse equation

$$[C] = \{[E]'[E]\}^{-1}[E]'[A]$$

was then used to recalculate the concentration profiles. This process (truncation, normalization and pseudoinverse followed by pseudoinverse) was repeated until no further refinement occurred.

Accuracy in Trace Analysis
The concentration profiles and spectra of the three unknown components of stearyl alcohol in carbon tetrachloride obtained in this manner were found to make chemical sense.
This EFA procedure, unlike others, was successful in extracting concentration profiles from situations where one component profile was completely encompassed underneath another component profile.
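The refinement iteration described above (truncation, normalization, and two successive pseudoinverse steps) can be sketched as follows. This is an illustrative reading rather than the authors' code: it assumes that the first pseudoinverse yields the spectra from the current profiles and the second yields new profiles from those spectra, and the convergence test, tolerance and function names are hypothetical.

```python
import numpy as np

def truncate_profile(c):
    """Zero everything beyond the first negative point on each side of the maximum."""
    c = np.asarray(c, dtype=float).copy()
    peak = int(np.argmax(c))
    left_neg = np.flatnonzero(c[:peak] < 0)
    if left_neg.size:
        c[: left_neg[-1] + 1] = 0.0           # nearest negative on the left, and beyond
    right_neg = np.flatnonzero(c[peak + 1:] < 0)
    if right_neg.size:
        c[peak + 1 + right_neg[0]:] = 0.0     # nearest negative on the right, and beyond
    return c

def refine(A, C0, max_iter=50, tol=1e-10):
    """Alternate truncation/normalization with two pseudoinverse steps."""
    C = np.asarray(C0, dtype=float)
    for _ in range(max_iter):
        C_trial = np.array([truncate_profile(row) for row in C])
        C_trial /= np.linalg.norm(C_trial, axis=1, keepdims=True)  # unit sum of squares
        E = A @ np.linalg.pinv(C_trial)       # spectra from the current profiles
        C_new = np.linalg.pinv(E) @ A         # profiles recalculated from the spectra
        converged = np.linalg.norm(C_new - C) < tol
        C = C_new
        if converged:
            break
    return E, C

# Hypothetical noise-free check: start from the true profiles of a synthetic system.
stage = np.arange(15)
C_true = np.exp(-0.5 * ((stage - np.array([[2], [7], [12]])) / 3.0) ** 2)   # 3 x 15
x = np.arange(35)[:, None]
E_true = np.exp(-0.5 * ((x - np.array([8, 17, 26])) / 4.0) ** 2)            # 35 x 3
A = E_true @ C_true
E_ref, C_ref = refine(A, C_true)
print(np.abs(E_ref @ C_ref - A).max())
```

In a real application the starting matrix `C0` would hold the crude profiles from target testing, which generally contain negative regions, so the truncation step would then do actual work.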

Institute for Analytical Chemistry, Micro- and Radiochemistry, Technical University Graz, A-8010 Graz, Austria
Chemometrics is a very international branch of science, perhaps more so than chemistry at large, and it is therefore appropriate to question the suitability of the topic to be presented. It is, however, the author's opinion that the profile of European chemometric research has a couple of distinct features that may originate more in the structure of the educational system than in the actual research topics. The profile as it will be presented is the one perceived by the author, and therefore comprises a very subjective selection of individual contributions to the field. Obviously, this is not the place to offer a review of chemometrics, let alone one that is restricted to a continent.
The definition of chemometrics [1] comprises three distinct areas characterized by the key words "optimal measurements," "maximum chemical information" and, for analytical chemistry, something that sounds like a synopsis of the other two: "optimal way [to obtain] relevant information."

Information Theory
Eckschlager and Stepanek [2-5] pioneered the adaptation and application of information theory in analytical chemistry. One of their important results gives the information gain of a quantitative determination [5]

$$I(q\|p) = \ln\frac{(x_2 - x_1)\sqrt{n_A}}{s\sqrt{2\pi e}} \qquad (1)$$

where $q$ and $p$ are the prior and posterior distributions of the analyte concentration, for the specific case of a rectangular prior distribution on $(x_1, x_2)$ and a Gaussian posterior with a standard deviation $s$ determined from $n_A$ independent results. The penalty for an inaccurate analysis is considerable and can be expressed as

$$I(r;q,p) = I(q\|p) - \frac{n_A}{2}\left(\frac{d}{s}\right)^{2} \qquad (2)$$

with $d$ the difference between the obtained value and the true value of $x$. The concept has also been extended to multicomponent analysis and multimethod characterization.
In the latter case, correlations between the information provided by the different methods need to be accounted for. Given the cost of and time needed for an analysis, information efficiency can be deduced in a straightforward manner [2]. Recently, work was published [5] suggesting the incorporation of various relevance coefficients; this, indeed, is a very important step since it provides a way to single out the information that is judged to be relevant for a given problem. It also opens up the possibility to draw on information theory for defining objective functions in computer-aided optimization of laboratory procedures and instruments.
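The information gain of a single quantitative determination, and its reduction for an inaccurate result, can be evaluated with a few lines of code. The sketch below reads the gain as ln[(x2 - x1) sqrt(n_A) / (s sqrt(2 pi e))] for a rectangular prior on (x1, x2) and a Gaussian posterior, and the penalty term as (n_A/2)(d/s)^2. The penalty term is a reconstruction that follows from these two distributions and should be treated as an assumption. The function names and the numerical values are hypothetical; results are in natural units (nats).

```python
import math

def information_gain(x1, x2, s, n):
    """Gain for a rectangular prior on (x1, x2) and a Gaussian posterior
    with standard deviation s determined from n independent results."""
    return math.log((x2 - x1) * math.sqrt(n) / (s * math.sqrt(2.0 * math.pi * math.e)))

def information_gain_inaccurate(x1, x2, s, n, d):
    """Gain reduced by the assumed penalty (n/2) * (d/s)**2 for a bias d."""
    return information_gain(x1, x2, s, n) - 0.5 * n * (d / s) ** 2

# Hypothetical example: analyte expected between 0 and 10 units,
# s = 0.1 units, 4 replicate results, with and without a bias of 0.1 units.
unbiased = information_gain(0.0, 10.0, 0.1, 4)
biased = information_gain_inaccurate(0.0, 10.0, 0.1, 4, 0.1)
print(unbiased, biased, unbiased - biased)
```

Under this reading, a bias equal to one standard deviation costs n_A/2 nats, so additional replicates increase the penalty of a biased result faster than they increase the gain of an unbiased one.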