2 Multivariate Modeling in Quality Control of Viscosity in Fuel: An Application in Oil Industry



Introduction
This chapter presents an alternative for the quality control of the viscosity of two fuels that are important on the international scene, aviation kerosene and diesel oil, by means of multivariate statistical modeling (Pasadakis et al., 2006).
Viscosity is one of the most important properties of fuels; it influences fuel circulation and fuel injection in injection engines. The efficiency of the combustion process in an engine depends on this property.
Out-of-specification values can decrease fuel volatilization, leading to incomplete combustion (Pontes et al., 2010). This physicochemical property can vary significantly when the crude slate changes during crude processing in a refinery (Figure 1), even when the production control conditions are kept the same, which compromises quality standards. Hence the need to determine or predict the viscosity as often as possible, instead of relying on the traditional spot analysis in the laboratory, which can take a long time.
According to Dave et al. (2003), using field instruments together with multivariate statistical techniques to determine product properties in real time is one way to optimize oil refining operations.
Each refinery has at least one primary distillation tower, where the components of crude oil are separated into different fractions according to their boiling points, and its own arrangement of conversion units. In general, the refining margin increases with the complexity of the refinery. Decisions about how to operate and monitor a refinery and how to build its units are factors that provide competitive advantages to oil companies.
Hydrotreatment is a catalytic process that removes large amounts of sulfur and nitrogen from the distillation fractions (Fernández et al., 1995). Fluidized catalytic cracking (FCC) is a process widely used in the petroleum refining industry. It consists of cracking large molecules into smaller ones at high temperatures. Thus, heavy oils are converted into products with higher added value (IEA, n.d.). In addition, some refineries use a coking unit to maximize the refining margin by converting the residue from the distillation towers.
Fuels for the various kinds of existing engines are important refinery products. Jet fuel is derived from petroleum and is suitable for power generation by combustion in aircraft gas turbine engines (see Fig. 2). It is produced by fractionating petroleum by distillation at atmospheric pressure, with a boiling range between 150 and 300 °C, followed by finishing treatments whose main aim is to eliminate the undesirable effects of sulfur, nitrogen and oxygen compounds.
The viscosity of kerosene is limited to a maximum value in order to keep the pressure loss in low-temperature flow to a minimum and to allow the use of adequate fuel spray nozzles, which improves the conditions of combustion. Viscosity can also significantly affect the lubricity of the fuel and, consequently, the life of the aircraft fuel pump.
Diesel fuel is a petroleum derivative used in compression-ignition internal combustion engines to move motor vehicles (Fig. 3). It can also be used in marine engines and as a fuel for home heating. It is composed mainly of paraffinic hydrocarbons, and the presence of olefins and aromatics is not desirable. Its normal boiling range is 100 to 390 °C, and the number of carbon atoms per molecule should be between six and eighteen.
The chemical composition of diesel oil directly affects its performance and is related to the type of crude oil used and to the processes adopted for its production in refineries.
Overall, this product is composed of one or more cuts from petroleum distillation, to which streams from other refining processes can be added, for example the catalytic cracking product called light cycle oil.

Fig. 2. Combustor and engine schema (Gomez et al., 2007).

The kinematic viscosity of this product is an important property because of its effect on power systems and on fuel injection. Both high and low viscosities are undesirable since they can cause, among other problems, poor fuel atomization. The formation of large droplets (high viscosity) or small droplets (low viscosity) can lead to a poor distribution of fuel and compromise the air-fuel mixture, resulting in incomplete combustion followed by loss of power and greater fuel consumption.
Section 2 presents a method for acquiring and constructing the database. Section 3 covers the theoretical foundations of the multivariate statistical methods used. Section 4 presents the application of the method and the results for a real production process. Finally, Section 5 discusses the conclusions reached. (Marshall, 2002)

Database acquisition
Multivariate analysis of infrared signals allows absorption data, called spectra, to be handled at more than one frequency or wavelength at the same time. In the oil industry, its applications are associated with predicting the quality of distillates such as naphtha, gasoline, diesel and kerosene (Kim, Cho, Park, 2000).
Chemical bonds of the carbon-hydrogen (CH), oxygen-hydrogen (OH) and nitrogen-hydrogen (NH) types are responsible for the absorption of infrared radiation, but these absorptions are not very intense and overlap, creating broad spectral bands that are correlated and difficult to interpret. However, a multivariate approach has proven quite adequate for modeling the physical and chemical properties of samples, with the absorbed infrared radiation as the input variables (Behzadian et al., 2010).
The polychromatic radiation emitted by the source has its wavelengths selected by a Michelson interferometer. The beam splitter has a refractive index such that approximately half of the radiation is directed to the fixed mirror and the other half is reflected toward the movable mirror; both beams are then reflected back by the mirrors. The movement of the movable mirror creates optical path differences, which cause the waves to interfere. Modern instrumentation uses an improved Michelson interferometer, the "wishbone" system, illustrated in Figure 4. In this system, instead of one movable mirror and one fixed mirror, both corner-cube mirrors move, attached to a single support.

Fig. 4. "Wishbone" interferometric system employed in modern NIR spectrophotometers based on the Fourier transform. A, beam splitter; B, corner-cube mirrors; C, anchor; D, "wishbone" (Pasquini, 2003).

An interferogram is obtained by plotting the signal intensity received by the detector against the difference in optical path traveled by the beams. Since the Fourier transform expresses a periodic phenomenon as a series of sines and cosines (see Fig. 5), it can be used to transform the interferogram into a transmittance spectrum. The amount of radiation absorbed is then determined by taking the cologarithm (negative logarithm) of the transmittance spectrum.
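The last two steps can be illustrated with a minimal sketch. It builds a synthetic interferogram from two spectral lines, recovers them with a discrete Fourier transform, and converts a transmittance spectrum to absorbance as A = -log10(T). The wavenumbers, amplitudes, sampling step and function names are assumptions chosen for illustration; they do not describe the instrument used in this work.

```python
import numpy as np

def interferogram_to_spectrum(interferogram):
    """Magnitude spectrum of a real-valued interferogram (one-sided FFT)."""
    return np.abs(np.fft.rfft(interferogram))

def transmittance_to_absorbance(sample, background):
    """Absorbance A = -log10(T), with transmittance T = sample / background."""
    return -np.log10(sample / background)

# Synthetic interferogram made of two cosine components (hypothetical lines).
n_points = 1024
step_cm = 1.0e-4                          # optical path difference step in cm (assumed)
path_difference = np.arange(n_points) * step_cm
line_a, line_b = 1250.0, 2500.0           # wavenumbers in cm^-1 (assumed)
signal = (1.0 * np.cos(2 * np.pi * line_a * path_difference)
          + 0.5 * np.cos(2 * np.pi * line_b * path_difference))

spectrum = interferogram_to_spectrum(signal)
wavenumbers = np.fft.rfftfreq(n_points, d=step_cm)   # axis in cm^-1
strongest = wavenumbers[np.argmax(spectrum)]          # location of the larger line

# Absorbance of a sample that transmits 10% and 100% of the background signal.
absorbance = transmittance_to_absorbance(np.array([0.1, 1.0]), np.array([1.0, 1.0]))
```

A real FT-NIR instrument also applies apodization and phase correction before the transform; those steps are left out of this sketch.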

Multivariate analysis
The most adequate mathematical models were identified using the multivariate technique of partial least squares (PLS) regression. In this technique, the original data matrix is represented by factors, or latent variables. Only the portion of the spectral data that correlates with the property assessed is included in this representation.
The first factor, calculated by the statistical program The Unscrambler®, has the highest correlation between the spectral data and the property of interest. The same program then evaluates the residual spectrum, which is the original spectrum minus the portion represented by the first factor. Thus, the second factor has the highest correlation between the residual spectrum and the property. This procedure is repeated until all the important information that correlates with the property under study is represented by the factors or latent variables. Some caution is needed in determining the appropriate number of factors: too few will not include all the necessary spectral information, and too many will add noise (see Fig. 6).

Fig. 6. Optimized number of components (Naes et al., 2002).

This regression model has the advantages of using the entire spectrum, being quick and offering a stable result. In addition, PLS regression, which uses fewer factors than the principal component regression (PCR) method, is more resistant to noise and to the presence of weaker correlations.
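The factor-by-factor procedure described above can be sketched with the NIPALS algorithm for a single response (PLS1). This is a generic textbook formulation written with NumPy, not the implementation inside The Unscrambler®; the function names and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def pls1_nipals(X, y, n_factors):
    """PLS1 by NIPALS: returns regression coefficients and the centering terms."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E = X - x_mean                 # residual spectra, deflated factor by factor
    f = y - y_mean                 # residual property values
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = E.T @ f                # direction of maximum covariance with the property
        w /= np.linalg.norm(w)
        t = E @ w                  # scores of the current factor
        tt = t @ t
        p = E.T @ t / tt           # spectral loadings
        c = f @ t / tt             # regression of the property residual on the scores
        E = E - np.outer(t, p)     # residual spectrum: remove what this factor explains
        f = f - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # coefficients in the original variables
    return coef, x_mean, y_mean

def pls1_predict(X, coef, x_mean, y_mean):
    """Predict the property for new spectra."""
    return (X - x_mean) @ coef + y_mean

# Synthetic example: 40 "spectra" with 50 variables driven by two latent sources.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
X = latent @ rng.normal(size=(2, 50)) + 0.01 * rng.normal(size=(40, 50))
y = latent @ np.array([1.5, -0.7]) + 0.01 * rng.normal(size=40)

def fit_rmse(n_factors):
    coef, x_mean, y_mean = pls1_nipals(X, y, n_factors)
    return float(np.sqrt(np.mean((pls1_predict(X, coef, x_mean, y_mean) - y) ** 2)))

rmse_one, rmse_two = fit_rmse(1), fit_rmse(2)
```

In this synthetic case the second factor captures the information the first one leaves in the residual, so the calibration error drops when it is added.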
The decomposition of the original variables into principal components can be represented by Equation (1), where, for the k principal components, t is the value called the score, which indicates the differences or similarities between samples, and γ is the parameter that relates the original variables to the latent variable. This parameter is called the loading; it represents how much a variable contributes to a principal component and accounts for the variation of the data.
The principal components are then related to the concentration, or property of interest, according to Equation (2), where h is the number of principal components used. Fig. 7 represents Equation (2) in matrix form.
Fig. 8 illustrates the absorbance at three wavelengths. The first principal component, PC1, is a linear combination of the absorbance values that represents the maximum variance between samples. The projection of a sample point onto the PC1 axis is its PC1 score (see Fig. 9). PC2 must be orthogonal to PC1 and is positioned to capture the maximum residual variance. When the data variability cannot all be explained by a single principal component (red and green samples), a second PC is needed, and so on. The PC2 score is obtained by projection, in the same way as for PC1 (Fig. 10).
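Equations (1) and (2), referred to above, can be written in a standard form that is consistent with the definitions given (scores t, loadings γ, k principal components, h components used in the regression). The residual terms E and e and the regression coefficients b are notational assumptions, not symbols taken from the original.

```latex
% Eq. (1): decomposition of the data matrix X into k principal components
X = t_1 \gamma_1^{\top} + t_2 \gamma_2^{\top} + \dots + t_k \gamma_k^{\top} + E

% Eq. (2): regression of the property y on the scores of the first h components
y = b_1 t_1 + b_2 t_2 + \dots + b_h t_h + e
```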

Fig. 10. Second principal component in 3D
The remaining principal components are obtained by the same procedure. Note that for a set of 100 different wavelengths, for example, 100 principal components are not needed to represent the data variability.
The variability of the spectrum can be compressed into a smaller number of principal components without significant loss of analytical information. After this compression, the scores are used as the independent variables in the regression to obtain the dependent variable y (concentration or physicochemical property).
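The compression step can be illustrated with a short sketch: 100 simulated wavelengths are reduced to three principal components through a singular value decomposition, and the scores are then used as regressors. The data are synthetic and the function names are hypothetical; this shows the general idea of regression on scores, not the exact procedure used in this work.

```python
import numpy as np

def pca_scores(X, n_components):
    """Scores, loadings and explained variance fractions of mean-centered X."""
    x_mean = X.mean(axis=0)
    centered = X - x_mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T              # shape (n_variables, n_components)
    scores = centered @ loadings                # projections onto the components
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, loadings, explained

def fit_on_scores(scores, y):
    """Least-squares regression of y on the scores, with an intercept."""
    design = np.column_stack([np.ones(len(y)), scores])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta, design @ beta

# Synthetic data: 30 samples, 100 wavelengths, three underlying sources.
rng = np.random.default_rng(1)
conc = rng.normal(size=(30, 3))
X = conc @ rng.normal(size=(3, 100)) + 0.01 * rng.normal(size=(30, 100))
y = conc @ np.array([2.0, -1.0, 0.5])

scores, loadings, explained = pca_scores(X, n_components=3)
beta, y_fit = fit_on_scores(scores, y)
rmse = float(np.sqrt(np.mean((y_fit - y) ** 2)))
```

Here three components carry almost all of the variance of the 100 variables, which is why the regression on the scores loses very little information.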
PLS determines the principal components that are most relevant to the variable y and that explain the variable x as well as possible (see Fig. 11).

Proposed method, results and analysis
The first step is to collect samples of kerosene and diesel oil over a reasonable period of time, to obtain a data set that best reflects all possible operational variability, such as changes in the crude slate and in the operating conditions of the units.
In the second step, laboratory-scale experiments are performed to characterize the product and determine, from the samples, the real kinematic viscosity to be modeled. The samples are also subjected to infrared radiation.
In the third step, the mathematical models are developed using The Unscrambler® and Excel® software, relating the absorbed infrared radiation to the physicochemical property. Finally, the model is implemented on an industrial scale to forecast the viscosity in real time, giving the production area greater decision-making power and making it possible to increase the profitability of the blending process.
For each fuel studied, a mathematical model with 800 input variables was developed. To help determine the number of latent variables and minimize the residual variance, the full cross-validation method was used: samples are left out of the data set in turn, and a model built from the remaining samples is tested by comparing its predictions with the true values of the excluded samples. The models were developed using The Unscrambler® program.
Several forms of preprocessing were evaluated to obtain the minimum value of the RMSECV (Root Mean Square Error of Cross-Validation). The preprocessing that gave the best result was the first derivative with a second-degree polynomial, as proposed by Savitzky and Golay (Galvão et al., 2007), which highlights the differences between samples and so helps the model explain the variance between them. Fig. 13 and Fig. 14 show the original spectra for jet fuel and diesel, and Fig. 15 and Fig. 16 show the same data after preprocessing with this derivative. The Savitzky-Golay derivative method smooths the spectrum with a moving polynomial; the derivative of the absorbance with respect to wavenumber is calculated from the fitted polynomial.
After taking the derivative of the spectra, the optimal number of latent variables was determined for each product. For kerosene, four latent variables were adopted. Above this number, the explained variance decreases because noise is incorporated into the model, as shown in Fig. 18.

For diesel, using the same procedure, eight latent variables were adopted, because above that number there is no significant gain in the explained variance.
The results of the models are statistically equivalent to those from the laboratory methods. Table 3 summarizes the results of the modeling.
[...] reprocessing, reducing costs and energy costs. In addition, better-quality fuel reduces the impact of burning on the environment.
This chapter opens up other possibilities, such as the simultaneous determination of several other parameters of oil products.

Fig. 12. Representation of online determination of viscosity by mathematical modeling (adapted from Early Jr, 1990)

Fig. 13. Set of spectra of kerosene used in modeling

Fig. 18. Explained variance (%) versus latent variables (kerosene).
The outliers were eliminated by inspecting the score plots of the first two principal components and the influence plots (residual variance in Y versus leverage). The first two principal components captured the largest variability in the data, and the samples of both diesel and kerosene are statistically close, as shown in Fig. 19 and Fig. 20.
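An influence check of this kind can be sketched as follows. The leverage is computed from mean-centered scores as h_i = 1/n + Σ_a t_ia² / (t_a' t_a), and a sample is flagged when both its leverage and its squared Y residual are well above the respective averages. The factor of 3 used for both thresholds and the function names are assumptions; the criteria actually applied in this work are not stated.

```python
import numpy as np

def leverage_from_scores(scores):
    """Leverage h_i = 1/n + sum_a t_ia^2 / (t_a' t_a) for mean-centered scores."""
    n = scores.shape[0]
    return 1.0 / n + np.sum(scores ** 2 / np.sum(scores ** 2, axis=0), axis=1)

def flag_influential(scores, y_residuals, leverage_factor=3.0, residual_factor=3.0):
    """Flag samples with both high leverage and a large squared Y residual."""
    h = leverage_from_scores(scores)
    r2 = y_residuals ** 2
    return (h > leverage_factor * h.mean()) & (r2 > residual_factor * r2.mean())

# Synthetic scores for 30 samples and 2 components, with one extreme sample.
rng = np.random.default_rng(0)
scores = rng.normal(size=(30, 2))
residuals = rng.normal(0.0, 0.1, size=30)
scores[0] = [10.0, 10.0]           # artificially extreme sample
residuals[0] = 5.0
scores -= scores.mean(axis=0)      # the leverage formula assumes centered scores

leverage = leverage_from_scores(scores)
flags = flag_influential(scores, residuals)
```

With this definition the leverages sum to one plus the number of components, which gives a natural reference value (the mean leverage) for the threshold.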

Fig. 19. Scores of the first two latent variables for kerosene.

Fig. 23. Comparison of the results provided by the PLS regression model and the results obtained by the reference laboratory (kerosene)

Table 3. Summary of modeling results