PARTICULATED SYSTEMS

Detection of outliers in a gas centrifuge experimental data

M. C. V. Andrade (I); C. A. O. Nascimento (II); S. C. P. Migliavacca (III),*

* To whom correspondence should be addressed

(I) CTMSP, Centro Tecnológico da Marinha em São Paulo, Phone: +(55)(11) 3817-7599, Av. Prof. Lineu Prestes 2242, CEP 05508-900, São Paulo - SP, Brazil. E-mail: monicva@ig.com.br

(II) Department of Chemical Engineering, Polytechnic School of the University of São Paulo, Phone: +(55)(11) 3091-2237, CEP 05508-900, São Paulo - SP, Brazil. E-mail: oller@usp.br

(III) IPEN, Instituto de Pesquisas Energéticas e Nucleares, Phone: +(55)(11) 3817-7599, Travessa R, 400, CEP 05508-900, São Paulo - SP, Brazil. E-mail: scpresti@usp.br

ABSTRACT

Isotope separation with a gas centrifuge is a very complex process. Development and optimization of a gas centrifuge require experimentation. The resulting data contain experimental errors and, like other experimental data, may include some gross errors, also known as outliers. The detection of outliers in gas centrifuge experimental data is quite complicated because there is not enough repetition for precise statistical determination and the physical equations may be applied only to control of the mass flow. Moreover, the concentrations are poorly predicted by phenomenological models. This paper presents the application of a three-layer feed-forward neural network to the detection of outliers in the analysis of data from a very extensive experiment.

Keywords: Isotope separation; Gas centrifugation; Uranium isotopes; Outlier detection; Neural network.

INTRODUCTION

Any experimental data set has an associated error, which may be systematic, random or, in most cases, both. Correct use of these experimental data requires a careful analysis of the associated errors, which indicates the reproducibility, the representativeness and the reliability of the data obtained. However, this analysis can be an arduous task in the case of a very complex process, where uncontrolled or unmonitored parameters influence the process or where large stochastic deviations are observed, resulting in a very dispersed data set (Alves and Nascimento, 2002).

Several procedures have been proposed for the treatment and analysis of experimental data sets, based on statistics or on the physics of the process (Himmelblau, 1970; Plácido and Loureiro, 1998). Nonetheless, those techniques can be difficult to apply, for instance when the physics of the process is not well known, i.e., it cannot be fully represented by equations, or when there are few data for each operational condition. Many processes, such as uranium enrichment by ultracentrifugation, have these characteristics (Migliavacca, 1999).

The detection of gross errors is analyzed here by applying a neural network to an experimental data set from isotope separation tests of ultracentrifuges. These data originated in the process of evaluation and optimization of ultracentrifuges developed at the Centro Tecnológico da Marinha em São Paulo (CTMSP) with the collaboration of the Instituto de Pesquisas Energéticas e Nucleares (IPEN). In an isotope separation test of a centrifuge, UF6 gas with a known isotope composition is sent through the centrifuge. The operation is performed for several flow conditions as well as at several pressures in the feed, product and waste lines (the product and waste being the material enriched and depleted in 235U, respectively). Similar results, differing because of technological changes, were published by Zippe (1960) and Jordan (1966).

THEORETICAL BASIS

The numerical estimate of the separation performance of a centrifuge requires calculation of the internal flow, whose solution is only feasible through mathematical simplifications (Kai, 1989; Olander, 1972 and 1981; Soubbaramayer, 1979). On the other hand, it is very difficult to relate the parameters used to describe the countercurrent numerically to the physical devices used in the centrifuge (Kai, 1989; Migliavacca et al., 1999; Migliavacca, 1999). Since the gas flows at supersonic speeds in this process, the resulting flow is very sensitive to variations in the construction characteristics of the centrifuge, and even to differences within the attainable mechanical precision.

Isotope Separation

Separation of uranium isotopes, like the separation of stable isotopes, is described by a general theory that is valid for any binary mixture. Due to the similar physical and chemical properties of the isotopes, separation is usually implemented in sequential stages, known as the isotope separation cascade. This theory of separation of uranium isotopes is described by Cohen (1951), Benedict et al. (1981) and Jordan (1966).

The smallest element that separates isotopes in an enrichment facility is called a separation unit. In the case of the gas ultracentrifuge, the separation unit is a single ultracentrifuge. Several ultracentrifuges connected in parallel constitute a cascade stage. All the separation units in a stage have inputs with the same isotopic concentration and produce equal outputs of the enriched and of the depleted fractions. A series of connected stages constitutes a cascade.

A simple separation element has one input feed flow F of the isotopic binary mixture with the composition xF of the light isotope (235U) and two output flows, the enriched product P with composition xP and the depleted waste W with composition xW. The abundance ratio R is defined by the relation between the composition of the light isotope and the composition of the heavy isotope [R=x/(1-x)]. The main parameters of the separation unit are defined as follows:

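Separation factor:

$$\alpha = \frac{R_P}{R_W} \tag{1}$$

Enrichment factor:

$$\beta = \frac{R_P}{R_F} \tag{2}$$

Tail factor:

$$\gamma = \frac{R_F}{R_W} \tag{3}$$

where RF, RP and RW are the abundance ratios of the feed, product and waste, so that α = βγ. A case in which the enrichment and tail factors are the same, β = γ, and thus α = β², is called a symmetric separation process.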

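Another important parameter is the cut θ, defined as the ratio of the product flow to the feed flow:

$$\theta = \frac{P}{F} \tag{4}$$

which, from the mass balances F = P + W and F xF = P xP + W xW, may be written as a function of the compositions:

$$\theta = \frac{x_F - x_W}{x_P - x_W} \tag{5}$$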
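As a sketch of how these parameters can be evaluated from measured compositions, the Python functions below compute the abundance ratios, the separation, enrichment and tail factors, and the cut. The function names and the numerical compositions are illustrative assumptions, not values from the experiments.

```python
def abundance_ratio(x):
    """Abundance ratio R = x / (1 - x) for a light-isotope fraction x."""
    return x / (1.0 - x)


def separation_parameters(x_feed, x_product, x_waste):
    """Separation parameters of one unit from the three stream compositions."""
    r_f = abundance_ratio(x_feed)
    r_p = abundance_ratio(x_product)
    r_w = abundance_ratio(x_waste)
    return {
        "separation_factor": r_p / r_w,   # product versus waste
        "enrichment_factor": r_p / r_f,   # product versus feed
        "tail_factor": r_f / r_w,         # feed versus waste
        # Cut from the total and light-isotope mass balances.
        "cut": (x_feed - x_waste) / (x_product - x_waste),
    }


# Illustrative compositions only (hypothetical, not measured data).
params = separation_parameters(x_feed=0.0072, x_product=0.0080, x_waste=0.0065)
```

The separation factor returned here equals the product of the enrichment and tail factors, which gives a quick consistency check on any set of measured compositions.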

The separation unit may be described by the separation parameters α, β and γ and by the flow variables F and θ (or P and W), but none of these variables alone represents the separation performance of the separation unit. A function that takes both the separation and the flow parameters into account is therefore needed to express the separative capacity of the unit. This function is the separative power δU, defined by Dirac and described by Cohen (1951). It relies on the idea of the value of a flow, U, which represents the value acquired by a material that is being processed in a separation unit. The value of a flow is obtained by multiplying the mass flow by the value function V(x). A separation element therefore has three flows with different values: UF = F V(xF), UP = P V(xP) and UW = W V(xW) for the feed, product and waste flows. The separative power δU is the change in value produced by the separation unit, given by

$$\delta U = P\,V(x_P) + W\,V(x_W) - F\,V(x_F) \tag{6}$$

The value function V(x) is a dimensionless function of the composition x, given by

$$V(x) = (2x - 1)\,\ln\frac{x}{1 - x} \tag{7}$$

This function is symmetric about x = 0.5, where V(0.5) = 0. For all other compositions it is positive, and it goes to infinity as x goes to 0 or 1. When this function V(x) is used, the separative power δU of the separation unit is independent of the concentration of the material being processed. The separative power δU was internationally adopted as the parameter to quantify the capacity of separation units and facilities.
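The value function and the separative power can be sketched in a few lines of Python. The sketch assumes the standard Dirac form of the value function, V(x) = (2x - 1) ln[x/(1 - x)], which matches the properties listed above; the function names and example numbers are illustrative assumptions.

```python
import math


def value_function(x):
    """Dimensionless value function V(x) = (2x - 1) * ln(x / (1 - x))."""
    return (2.0 * x - 1.0) * math.log(x / (1.0 - x))


def separative_power(feed, cut, x_feed, x_product, x_waste):
    """Value gained across the unit: P*V(xP) + W*V(xW) - F*V(xF)."""
    product = cut * feed      # product flow from the definition of the cut
    waste = feed - product    # total mass balance
    return (product * value_function(x_product)
            + waste * value_function(x_waste)
            - feed * value_function(x_feed))


# Illustrative numbers only; the cut is chosen to satisfy the isotope balance.
x_f, x_p, x_w = 0.0072, 0.0080, 0.0065
cut = (x_f - x_w) / (x_p - x_w)
delta_u = separative_power(feed=1.0, cut=cut, x_feed=x_f, x_product=x_p, x_waste=x_w)
```

The result has the units of the feed flow, since V(x) is dimensionless.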

Uranium Enrichment with a Gas Ultracentrifuge

A gas ultracentrifuge, as schematized in Figure 1, is composed of a long, thin vertical cylinder (rotor), rotating around its axis at a high velocity inside a case under vacuum. The process gas inside the rotor, assumed to be a binary isotopic mixture of 235UF6 and 238UF6, is subjected to a centrifugal force that is one thousand times stronger than gravity. A pressure gradient is established in the radial direction, with the pressure increasing from the center to the rotor wall. That pressure distribution is slightly different for the two isotopes because it depends on the molecular mass. This results in a partial separation in the radial direction. A countercurrent axial flow increases the radial separation.
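The dependence of the radial pressure profile on molecular mass can be made explicit with the standard result for an isothermal ideal gas in rigid-body rotation. The relation below is not one of this paper's own equations; the symbols ω (angular velocity), r (radial position), a (inner radius of the rotor), T (gas temperature), R_g (universal gas constant) and M_i (molar mass of species i) are introduced here only for illustration.

```latex
\begin{align}
p_i(r) &= p_i(0)\,\exp\!\left(\frac{M_i\,\omega^{2} r^{2}}{2 R_g T}\right),
\qquad i \in \{235,\ 238\} \\
\frac{p_{238}(a)/p_{238}(0)}{p_{235}(a)/p_{235}(0)}
&= \exp\!\left(\frac{(M_{238} - M_{235})\,\omega^{2} a^{2}}{2 R_g T}\right)
\end{align}
```

Because the exponent scales with the molar mass, the heavier 238UF6 is compressed slightly more toward the wall than 235UF6, which is the origin of the partial radial separation.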


The separation performance of a gas ultracentrifuge depends on the following (Avery and Davis, 1973; Benedict et al., 1981): (a) the characteristics of the ultracentrifuge, given by the angular velocity, length and diameter of the rotor; (b) the operational conditions, given by the feed and extraction mass flows and by the mass hold-up (indirectly measured by the pressure at the product or at the waste extraction); (c) the axial feed position inside the rotor; and (d) the strength and profile of the countercurrent axial flow. The countercurrent flow, shown in Figure 1 with vertical arrows, is induced by mechanical drives, as the rotating gas hits stationary obstacles inside the rotor (scoops or baffles), and by thermal drives, e.g., different temperatures at the end caps of the rotor.

EXPERIMENTAL DATA

The bench plant is composed of an ultracentrifuge and a UF6 container, interconnected by pipes and valves, with instruments and control valves located so as to control and monitor the whole process of injection and extraction of the process gas UF6 in the ultracentrifuge. The operational condition is defined by the pressures in the feed, product and waste lines (pF, pP and pW); by the feed flow F and the extraction flows of product P and waste W; or by the cut θ. Under each condition, samples are collected to verify the separation obtained by measuring the abundance ratios of the enriched and depleted streams, RP and RW, respectively. An isotope separation test consists of operating an ultracentrifuge under different operational conditions, defined by three of the process variables (the feed flow F, the pressure of the product extraction header pP and the cut θ), as shown in Figure 2. Several groups of data are thus generated, each of which is called a separation experiment. Each experiment is composed of the control variables that define the operational condition; the other process variables, which are the observed responses; and the respective separation parameters, calculated with Eqs. (1) to (6). These experiments can be divided into groups of experiments obtained under similar conditions.


Seven hundred and ninety-one experiments were performed; however, some were eliminated because the conditions were insufficient for sample analysis, resulting in a group of 764 experiments. The test was programmed to use the cascade design (this separation technology uses a cascade system of ultracentrifuges), covering the whole domain of interest for these variables, consisting of eight values of feed flow F, seven values of cut θ and five values of pressure pP. These data were organized and analyzed according to conventional statistical procedures, taking into consideration the macroscopic mass balance, cluster analysis and the statistical analysis of the errors in the experimental results, as described below. Then, a neural network technique was employed.

Macroscopic Mass Balance

During the isotope separation test, the macroscopic mass balance was constantly verified by comparing the cut obtained from the measured product and waste flows, θ = P/(P+W), with the cut obtained from the abundance ratios through Eq. (5). Experiments in which the difference between these two cut values was greater than 0.03 were disregarded and had to be repeated. This procedure is necessary because of the possibility of an error in the sampling procedure, which uses an external element (the sample vessel) and diverts the material flow coming from the ultracentrifuge to the sampling system, which may disturb the steady-state condition of the ultracentrifuge.

This procedure was applied to the complete group of experiments in order to identify those with gross errors. The 176 experiments that did not obey the mass balance were excluded from the original data set, leaving 588 experiments.
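The mass balance screening described above can be sketched as a simple comparison of two cut values. The 0.03 tolerance is the one quoted in the text; the function names and the example flows and compositions are illustrative assumptions.

```python
CUT_TOLERANCE = 0.03  # maximum accepted difference between the two cut values


def cut_from_flows(product, waste):
    """Cut from the measured product and waste mass flows."""
    return product / (product + waste)


def cut_from_compositions(x_feed, x_product, x_waste):
    """Cut from the light-isotope mass balance over the unit."""
    return (x_feed - x_waste) / (x_product - x_waste)


def passes_mass_balance(product, waste, x_feed, x_product, x_waste,
                        tolerance=CUT_TOLERANCE):
    """True when the two independent cut estimates agree within tolerance."""
    difference = abs(cut_from_flows(product, waste)
                     - cut_from_compositions(x_feed, x_product, x_waste))
    return difference <= tolerance


# Hypothetical experiments: the first is consistent, the second is not.
consistent = passes_mass_balance(0.45, 0.55, 0.0072, 0.0080, 0.0065)
inconsistent = passes_mass_balance(0.30, 0.70, 0.0072, 0.0080, 0.0065)
```

An experiment that fails this check would be repeated rather than corrected, since the disagreement points to a disturbed steady state or a sampling error.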

Statistical Analysis of the Errors

• Statistical Analysis by Hypothesis Test

The 588 remaining experiments were clustered by the primary process variables F and θ, which divided them into 158 groups, and the averages of RP and RW were obtained for each group. The error of each experimental value was then calculated in relation to these averages. For example, if a given group j has three samples (three experiments), its average for RP is RPavgj, and eRpi is the difference between RPi and RPavgj:

$$e_{Rp,i} = R_{P,i} - R_{Pavg,j}$$

where i = 1, ..., n indexes the n = 588 experiments of the data set and j = 1, ..., m indexes the m = 158 groups.

The same approach for RW was followed.

The analysis was then conducted for the errors in RP (eRp) and in RW (eRw).
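The grouping and error calculation can be sketched as follows. The record layout (dictionaries with keys `F`, `cut`, `RP` and `RW`) and the numbers are hypothetical, chosen only to illustrate the procedure.

```python
from collections import defaultdict


def group_errors(experiments):
    """Errors of RP and RW relative to the average of each (F, cut) group."""
    groups = defaultdict(list)
    for index, exp in enumerate(experiments):
        # Experiments sharing the same nominal set points form one group.
        groups[(exp["F"], exp["cut"])].append(index)

    errors_rp = [0.0] * len(experiments)
    errors_rw = [0.0] * len(experiments)
    for members in groups.values():
        mean_rp = sum(experiments[i]["RP"] for i in members) / len(members)
        mean_rw = sum(experiments[i]["RW"] for i in members) / len(members)
        for i in members:
            errors_rp[i] = experiments[i]["RP"] - mean_rp
            errors_rw[i] = experiments[i]["RW"] - mean_rw
    return errors_rp, errors_rw


# Hypothetical data: three repetitions of one condition and a single run of another.
experiments = [
    {"F": 10.0, "cut": 0.4, "RP": 0.0080, "RW": 0.0065},
    {"F": 10.0, "cut": 0.4, "RP": 0.0082, "RW": 0.0066},
    {"F": 10.0, "cut": 0.4, "RP": 0.0081, "RW": 0.0067},
    {"F": 12.0, "cut": 0.4, "RP": 0.0079, "RW": 0.0064},
]
errors_rp, errors_rw = group_errors(experiments)
```

A group with a single experiment always yields a zero error, which illustrates why the small number of repetitions limits this kind of statistical screening.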

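The resulting errors were assumed to be random samples of the normal random variables eRp and eRw, with sample averages eRpavg and eRwavg:

$$e_{Rp,avg} = \frac{1}{n}\sum_{i=1}^{n} e_{Rp,i}, \qquad e_{Rw,avg} = \frac{1}{n}\sum_{i=1}^{n} e_{Rw,i}$$

The statistical test, referred to as the extreme deviate statistic, involves the difference between the extreme value and the sample mean, where the residuals YRp and YRw are defined as:

$$Y_{Rp} = e_{Rp,i} - e_{Rp,avg}, \qquad Y_{Rw} = e_{Rw,i} - e_{Rw,avg}$$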

The statistical analysis was thus based on a hypothesis test for means, which involves a confidence interval estimate with a confidence level of 98%. The rejection rules suggested by Anscombe, as presented in Himmelblau (1970), give the critical value c implicitly in terms of the t distribution and explicitly through an approximate relation in terms of the F distribution.

Using this procedure, 15 experiments were suspected to contain gross errors in eRp, and nine in eRw.

Table 1 shows the coded number of the experiments detected by the statistical analysis.

• Cluster Analysis

The original data set was submitted to a cluster analysis. It was subdivided into groups in which two of the control variables were constant, and each response was analyzed by graphical inspection as a function of the third control variable. The analysis evaluated the response in terms of the separative power δU for each experiment, since this is the most significant variable in the analysis of the separation performance of an ultracentrifuge.

The following variables were used: the feed flow F at eight levels, the cut θ at seven levels and the pressure in the product extraction pP at five levels. When a different result for the separative power δU is observed in a block of similar data, the corresponding experiment may be rejected. Points that are scattered in relation to the trend lines were detected in each cluster analysis performed. Figure 3 shows one of these analyses.


The general behavior of the separative power δU as a function of the feed flow F, the cut θ or the pressure in the product line pP, with the other two variables kept constant, is known and has been presented by Migliavacca et al. (1999 and 2002).

The five suspect experiments detected by cluster analysis are shown in Table 1.

NEURAL NETWORKS

Neural networks are one of the fastest growing areas of artificial intelligence in chemical and nuclear engineering. Their main applications are in fault diagnosis, dynamic modeling and control of chemical processes (Bhat and McAvoy, 1990; Hoskins and Himmelblau, 1988) and in solving nonlinear optimization problems (Nascimento & Giudici, 1998; Nascimento et al., 2000), among others.

Artificial neural networks are made up of highly interconnected layers of simple neuron-like nodes. The neurons act as nonlinear processing elements within the network. Of the many existing artificial neural network paradigms, the three-layer feed-forward neural network, consisting of an input layer, a hidden layer and an output layer, is the most widely used in process modeling. The input layer consists of ni+1 neurons, where ni is the number of input variables, and there is no processing in this layer. Besides the inputs, a bias is given to the network. The number of neurons in the hidden layer is defined by the user. According to Pollard et al. (1992), the precision is only slightly sensitive to the number of neurons in the hidden layer above a minimum value. The output layer consists of a number of neurons equal to the number of outputs of the process. The interconnections between the artificial neurons are called the weights of the neural model. They are calculated iteratively using a backpropagation algorithm, i.e., an optimization routine based on steepest descent that minimizes a given objective function (Rumelhart & McClelland, 1986). The computations are carried out over the entire network, except the input layer. The mapping of each unit consists of a combination of all its inputs, followed by the application of a nonlinear function, called the activation function.
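The structure described above can be illustrated with a minimal pure-Python sketch of a three-layer feed-forward network trained by full-batch steepest descent with backpropagation. This is not the FORTRAN program used in this work; the sigmoid activation, the sum-of-squares objective, the learning rate, the number of hidden neurons and the toy data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


class ThreeLayerNetwork:
    """Input layer (no processing), one hidden layer and one output layer."""

    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = random.Random(seed)
        # Each neuron has one extra weight for the bias input (fixed at 1.0).
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_output = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                         for _ in range(n_outputs)]

    def forward(self, inputs):
        x = list(inputs) + [1.0]                    # inputs plus bias
        h = [sigmoid(sum(w * v for w, v in zip(row, x)))
             for row in self.w_hidden] + [1.0]      # hidden outputs plus bias
        out = [sigmoid(sum(w * v for w, v in zip(row, h)))
               for row in self.w_output]
        return x, h, out

    def loss(self, samples):
        """Half the sum of squared errors over all samples."""
        return 0.5 * sum((o - t) ** 2
                         for inputs, targets in samples
                         for o, t in zip(self.forward(inputs)[2], targets))

    def train_step(self, samples, learning_rate):
        """One full-batch steepest-descent update using backpropagation."""
        g_hidden = [[0.0] * len(row) for row in self.w_hidden]
        g_output = [[0.0] * len(row) for row in self.w_output]
        for inputs, targets in samples:
            x, h, out = self.forward(inputs)
            # Error signal at the output neurons.
            d_out = [(o - t) * o * (1.0 - o) for o, t in zip(out, targets)]
            # Error signal propagated back to the hidden neurons.
            d_hid = [h[j] * (1.0 - h[j])
                     * sum(d * self.w_output[k][j] for k, d in enumerate(d_out))
                     for j in range(len(self.w_hidden))]
            for k, d in enumerate(d_out):
                for j, v in enumerate(h):
                    g_output[k][j] += d * v
            for j, d in enumerate(d_hid):
                for i, v in enumerate(x):
                    g_hidden[j][i] += d * v
        for weights, grads in ((self.w_output, g_output), (self.w_hidden, g_hidden)):
            for row, g_row in zip(weights, grads):
                for idx, g in enumerate(g_row):
                    row[idx] -= learning_rate * g


# Toy data only (not centrifuge measurements): two inputs, one scaled output.
samples = [((0.0, 0.0), (0.1,)), ((0.0, 1.0), (0.5,)),
           ((1.0, 0.0), (0.5,)), ((1.0, 1.0), (0.9,))]
network = ThreeLayerNetwork(n_inputs=2, n_hidden=3, n_outputs=1)
initial_loss = network.loss(samples)
for _ in range(2000):
    network.train_step(samples, learning_rate=0.5)
final_loss = network.loss(samples)
```

Because the output neurons also use a sigmoid, the targets must be scaled to the interval between 0 and 1 before training, as is done for the toy data here.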

The application of neural networks in the simulation of chemical and nuclear processes, specifically in isotope separation with the gas ultracentrifuge, is of great interest due to the nonlinearity of these processes (Migliavacca et al., 1999 and 2002; Migliavacca, 1999). The success of this kind of modeling depends heavily on knowledge of the main variables affecting the process and on the availability of a good database with the necessary information on the desired domain. This work uses this data treatment technique to identify the outliers, based on the neural network modeling scheme defined by Migliavacca (1999). The software for training the neural network was developed in FORTRAN by the Laboratory of Process Simulation and Control of the Department of Chemical Engineering at the University of São Paulo (Nascimento, 1996). All data were included in the training data set, and several thousand iterations were performed during the training process.

The experiments took several months to complete, during which the calibration of some instruments (the flow meters) deteriorated. Around experiments 294 to 308, a severe error was detected through verification of the mass balance, and the instruments were therefore recalibrated. The same problem occurred after experiment number 604. These problems were also evident in the neural network results. Figures 4 and 5 show the error between the experimental and calculated values, in sequential order. These plots show three distinct regions of experiments. The data set was therefore divided into three groups of experiments, corresponding to the three different biases observed in the sequential experimental residuals:

First group: from experiment number 1 to 294

Second group: from experiment number 308 to 604

Third group: from experiment number 605 to 788
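The split into groups can be sketched as a lookup on the experiment number, followed by an average residual per group. Using the mean residual as a bias estimate is an assumption made here for illustration, as are the function names and the example residuals; the number ranges are those listed above.

```python
# Experiment-number ranges of the three groups, as listed in the text.
GROUP_RANGES = {"first": (1, 294), "second": (308, 604), "third": (605, 788)}


def group_of(experiment_number):
    """Return the group name, or None for numbers outside the listed ranges."""
    for name, (low, high) in GROUP_RANGES.items():
        if low <= experiment_number <= high:
            return name
    return None


def mean_residual_by_group(residuals):
    """Average residual per group; residuals maps experiment number to residual."""
    totals = {}
    for number, residual in residuals.items():
        name = group_of(number)
        if name is not None:
            total, count = totals.get(name, (0.0, 0))
            totals[name] = (total + residual, count + 1)
    return {name: total / count for name, (total, count) in totals.items()}


# Hypothetical residuals, for illustration only.
example = {1: 0.10, 2: 0.30, 300: 5.0, 400: -0.20, 700: 0.05}
biases = mean_residual_by_group(example)
```

Experiment numbers between 295 and 307 fall outside all three ranges and are ignored by this sketch, consistent with the recalibration period mentioned above.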



The points where errors between the experimental and calculated data appear to be scattered far from the majority of the values are probably outliers (Bülau et al., 1999).

The residuals of the first training run were analyzed by application of the statistical test described above, according to Himmelblau (1970).

The procedure was repeated several times until the scattered data showed no more abnormal points. Table 2 shows the results of this methodology and Table 3 shows the coded number of points eliminated from each run for the three groups in the data set.
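The repeated fit, test and eliminate cycle can be sketched as the loop below. To keep the example short and self-contained, a least-squares straight line stands in for the trained neural network, and a fixed multiple k of the residual standard deviation stands in for the statistical test; both substitutions, along with all names and data, are assumptions made for illustration.

```python
import statistics


def fit_line(points):
    """Stand-in model: ordinary least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def iterative_outlier_removal(points, fit=fit_line, k=3.0, max_runs=20):
    """Refit and drop flagged points until a run flags nothing."""
    kept = list(points)
    removed_per_run = []
    for _ in range(max_runs):
        model = fit(kept)
        residuals = [y - model(x) for x, y in kept]
        mean_r = statistics.mean(residuals)
        std_r = statistics.stdev(residuals)
        # Hypothetical rejection rule: residual more than k deviations from the mean.
        flagged = [p for p, r in zip(kept, residuals) if abs(r - mean_r) > k * std_r]
        if not flagged:
            break
        removed_per_run.append(flagged)
        kept = [p for p in kept if p not in flagged]
    return kept, removed_per_run


# Hypothetical data: a noisy straight line with one gross error at x = 10.
points = [(i, 2.0 * i + 1.0 + (0.01 if i % 2 == 0 else -0.01)) for i in range(20)]
points[10] = (10, 26.0)
kept, removed_per_run = iterative_outlier_removal(points)
```

Refitting after each elimination matters because a gross error inflates the residual scatter of the first run and can mask smaller outliers that only become visible in later runs.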

When the neural network was applied to the first group of the data set, previously corrected to account for the bias, no suspect experiment (with gross error) was detected, as shown in Figures 6a and 6b. With the same methodology, seven experiments with gross errors were detected in the first run for the second group, as shown in Figures 7a and 7b. These seven points were excluded from the data set and the neural network was applied again. In the second run, two additional experiments with gross errors were identified and eliminated from the data set, as shown in Figures 8a and 8b. The same procedure was followed until no experiment with gross error was detected. A total of 13 experiments were eliminated from the second group. The third group of experiments was carried out under extreme experimental conditions, under which the experimental process error was higher than for the two previous groups. Only one experiment was eliminated from this group, as shown in Figures 9a and 9b.

Thus, after the neural network analysis of the three groups, a new, single data set was formed with 574 experiments, which means that 14 suspect experiments were eliminated. Four of them are justified by the cluster and statistical analyses. Four other experiments, identified in the first run, can be justified by small deviations from the defined steady-state conditions. The neural network approach proved capable of identifying two different classes of errors: instrument bias and outliers.

CONCLUSIONS

The neural network model has been shown to be a very attractive tool for identifying systematic and gross errors. The statistical methods employed identified some experiments suspected of being outliers. A direct comparison between the statistical analysis employed in this work and the neural network analysis was not possible, since only the neural network was able to identify groups with different biases. Statistical analysis of the points with bias is not reliable.

The process operation of an ultracentrifuge is very complex. Under many conditions high variability may occur. Some abnormal results may appear and can be mistaken for gross errors. For this reason, all the results from any technique employed must be verified by the research staff.

NOMENCLATURE

eRp    error in abundance ratio of the product
eRw    error in abundance ratio of the waste
F    feed mass flow
n    size of the sample
P    product mass flow
pF    pressure in the feed line
pP    pressure in the product line
pW    pressure in the waste line
RP    abundance ratio of the product
RW    abundance ratio of the waste
v    degrees of freedom
W    waste mass flow
xF    light isotope concentration in the feed flow
xP    light isotope concentration in the product flow
xW    light isotope concentration in the waste flow
YRp    residual in abundance ratio of the product
YRw    residual in abundance ratio of the waste

Subscripts

avg    pertains to the average
i    pertains to the indices of the sample
j    pertains to the indices of the group of samples

Greek Symbols

θ    cut (= P/F)
α    separation factor (= RP/RW)
β    head separation factor (= RP/RF)
γ    tail separation factor (= RF/RW)
δU    separative power

ACKNOWLEDGMENTS

The authors would like to thank the Centro Tecnológico da Marinha em São Paulo for allowing the publication of this paper, and FAPESP for sponsoring one of the authors.

Received: February 27, 2004

Accepted: January 14, 2005

REFERENCES

  • Alves, R.B.M. and Nascimento, C.A.O., Gross Errors Detection of Industrial Data by Neural Network and Cluster Techniques, Brazilian J. of Chem. Eng., vol. 19, no. 4, pp. 483-489 (2002).
  • Avery, D.G. and Davis, E., Uranium Enrichment by Gas Centrifuges. London: Mills and Boon (1973).
  • Benedict, M., Pigford, T.H. and Levi, H.W., Nuclear Chemical Engineering. 2nd ed., New York, NY: McGraw-Hill (1981).
  • Bhat, N. and McAvoy, T., Use of Neural Nets for Dynamic Modeling and Control of Chemical Process Systems, Computers & Chemical Engineering, vol. 14(4/5), pp. 573-583 (1990).
  • Bülau, H.C., Ulrich, J., Guardani, R. and Nascimento, C.A.O., Application of Neural Networks to Data from a Melt Crystallization Process for the Detection of Outliers. In: AIDA International Seminar on Advances Data Analysis, 1999. Washington. Proceedings... Washington (1999).
  • Cohen, K., The Theory of Isotope Separation as Applied to the Large Scale Production of U235. New York, McGraw-Hill (1951).
  • Himmelblau, D.M., Process Analysis by Statistical Methods. New York, NY: Wiley (1970).
  • Hoskins, J.C. and Himmelblau, D.M., Artificial Neural Network Models of Knowledge Representation in Chemical Engineering, Computers & Chemical Engineering, vol. 12 (9/10), pp. 881-890 (1988).
  • Jordan, I., Separação dos Isótopos de Urânio pelo Processo da Centrifugação em Fase Gasosa. São Paulo, SP, Tese (Cátedra) Escola Politécnica da Universidade de São Paulo (1966).
  • Kai, T., Theoretical Research on Gas-centrifugal Separation for Uranium Enrichment, J. Nucl. Sci. and Technol., 26(1): 157-160 (1989).
  • Migliavacca, S.C.P., Modelagem do comportamento separativo de ultracentrífugas via rede neural. São Paulo, SP, Tese (Doutorado) - Instituto de Pesquisas Energéticas e Nucleares (1999).
  • Migliavacca, S.C.P., Nascimento, C.A.O. and Rodrigues, C., Use of Neural Network for the Simulation of a Gas Centrifuge. J. Nucl. Sci. Technol., vol. 36, no. 4, pp. 364-370 (1999).
  • Migliavacca, S.C.P., Nascimento, C.A.O. and Rodrigues, C., Analyses and Optimization of Gas-centrifugal Separation of Uranium Isotopes by Neural Networks, Brazilian J. Chem. Eng., vol. 19, no. 3, pp. 299-306 (2002).
  • Nascimento, C.A.O., NEURO14: Neural Networks Modeling Training Program for Multilayer Feedforward Networks Using: Backpropagation Algorithm; Momentum Algorithm or Backpropagation with Dynamic Dumping Factor Algorithm. Version 14 on 29.9.1996, (Roberto Guardani).
  • Nascimento, C.A.O. and Giudici, R., Neural Network Based Approach for Optimization Applied to Industrial Nylon-6,6 Polymerization Process, Comput. & Chem. Engng., no. 22, S595 (1998).
  • Nascimento, C.A.O., Giudici, R. and Guardani, R., Neural Network Based Approach for Optimization of Industrial Chemical Processes. Comput. & Chem. Engng., no. 24, pp. 2303-2314 (2000).
  • Olander, D.R., Technical Basis of the Gas Centrifuge. Adv. Nucl. Sci. Technol., no. 6, 105-174 (1972).
  • Olander, D.R., The Theory of Uranium Enrichment by the Gas Centrifuge. Prog. Nucl. Energy, vol. 8, pp. 1-33 (1981).
  • Plácido, J. and Loureiro, L.V., Industrial Application of Data Reconciliation. Comput. & Chem. Engng., no. 22, pp. 1035-1038 (1998).
  • Pollard, J.F., Broussard, M.R., Garrison, D.B. and San, K.Y., Process Identification Using Neural Networks, Computers & Chemical Engineering, vol. 16(4), pp. 253-270 (1992).
  • Rumelhart, D. and McClelland, J., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, chap. 8, MIT, Cambridge, Mass. (1986).
  • Soubbaramayer, Centrifugation. In: Villani, S., ed., Uranium Enrichment. Springer Verlag, Berlin, pp. 183-243 (1979).
  • Zippe, G., The Development of Short Bowl Ultracentrifuges. Final report ORO-315 (1960).
  • Publication Dates

    • Publication in this collection
      28 Sept 2005
    • Date of issue
      Sept 2005

    History

    • Accepted
      14 Feb 2005
    • Received
      27 Feb 2004
    Brazilian Society of Chemical Engineering, Rua Líbero Badaró, 152, 11. and., 01008-903 São Paulo - SP, Brazil. Tel.: +55 11 3107-8747, Fax: +55 11 3104-4649
    E-mail: rgiudici@usp.br