A case study to estimate costs using neural networks and regression based models

Article history: Received September 9, 2010; received in revised format June 5, 2012; accepted July 11, 2012; available online July 11, 2012.

Bombardier Aerospace’s high performance aircraft and services set the utmost standard for the aerospace industry. A case study in collaboration with Bombardier Aerospace is conducted in order to estimate the target cost of a landing gear. More precisely, the study uses both a parametric model and neural network models to estimate the cost of main landing gears, a major aircraft commodity. A comparative analysis of the parametric model and the neural network models is carried out to determine the most accurate method for predicting the cost of a main landing gear. Several trials are presented for the design and use of the neural network model. The analysis for the case under study shows the flexibility in the design of the neural network model. Furthermore, the performance of the neural network model is deemed superior to that of the parametric models for this case study. © 2012 Growing Science Ltd. All rights reserved.


Introduction
Will this be a feasible project? Do we know how much it will cost? Are we able to afford this? These are questions on the minds of most people, and especially management, when they are considering either introducing a new product to the market or purchasing a product. The ability to accurately predict costs is of utmost importance for a company seeking an advantage over its competitors and suppliers. This capability is especially vital in the aerospace industry, where some products cost several million dollars per ship set. Several case studies have shown that the costs of products and projects are not properly allocated and tend to go over budget (Assaf & Al-Hejji, 2006; Frimpong, 2000; Bounds, 1998; Murmann, 1994; Norris, 1971). Furthermore, determining target costs has other benefits with respect to forecasting, scheduling, bidding, fact based negotiation and strategic decision making. This paper presents a case study, in the form of an empirical study, carried out in collaboration with Bombardier Aerospace (BA). BA is well known as a global leader in the manufacturing and assembly of regional and business jets.
A variety of tools, models and methodologies are available in the cost estimation industry. A bottom up approach entails estimating the cost of each and every activity according to the manufacturing sequence, summing these activities and finally allocating appropriate overheads. On the other hand, the estimation techniques under study, namely parametric and neural network (NN) models, are considered top down approaches.
A top down approach takes into consideration the total effort of the project as a whole and assesses the most significant cost estimating relationship (CER), otherwise known as the target cost model. The two models previously mentioned are developed and compared in order to estimate the target cost of a main landing gear (MLG) at BA. Authors such as Colmer et al. (1999), Bounds (1998) and Norris (1971) have emphasized the importance of understanding the amount of resources, time, and cost required to complete a product. Even though this empirical study is a particular case at BA, the concepts and models used to estimate the cost of the MLG can be applied in other industries. The mathematical models presented in this paper can be used for a wide range of applications, such as cost estimation, make/buy decisions, project feasibility, and profitability analysis.
The rest of the paper is organized as follows. Section 2 presents the methodology utilized. It also gives an overview of the component under study, the MLG. Section 3 introduces both the parametric and neural network models developed for predicting the target costs. It also discusses the prominent factors selected as the cost drivers. The development of both types of models, along with a detailed analysis, is presented in Section 4. Finally, the conclusions and suggestions for future research are highlighted in Section 5.

Methodology
BA is interested in estimating the target costs of various commodities and systems for its aircraft. Even though several preliminary models have been created and some are in the process of being developed, this paper presents a regression based parametric model and an NN model developed to estimate the costs of the MLGs, which is a new approach at BA.

Landing Gear (LG)
The undercarriage of an aircraft, commonly referred to as the landing gear, serves as the interface between the aircraft and the ground. The complete landing gear assembly can be divided into sub-systems. The main sub-systems are the MLG, nose landing gear, extension and retraction system, alternate release system, steering system and brake control system. At BA, the landing gear is a major commodity that accounts for roughly 5-10% of the overall cost of the aircraft. Therefore, accurately predicting its cost can be of great benefit. The focus of this paper is to develop target cost models for MLGs. The main functions of the landing gear are to absorb loads upon landing, taxiing and braking. The load during landing is absorbed by the gears and is proportional to the maximum takeoff weight (MTOW). The MTOW, which is typically measured in pounds (lbs.), is the maximum weight at which an aircraft is allowed to take off while meeting the requirements of airworthiness authorities.

Target cost estimation
As previously mentioned, a comparative study was conducted between a regression based parametric model and a neural network model. The basic theory of both methods is discussed in the sections below. Both methods are used to predict the target costs of MLGs based upon certain input variables that were considered as cost drivers.

Cost drivers
After researching and analyzing the technical specifications and speaking to experts in the field of landing gears, the following three factors, which are provided to a potential supplier to make a bid, were selected as the cost drivers of the CER used to estimate the target cost:
1. Weight of the MLG
2. MTOW of the aircraft
3. Height of the MLG
The weight of the MLG is measured in pounds. The MTOW has already been discussed in the previous section, and the height of the MLG, measured in inches, is the vertical height of the MLG when it is fully extended.

Regression Based Parametric Models
When a desired output, y, depends on several input variables, as is the case in the problem under study, the multiple linear regression model (MLRM) can be utilized (Kutner et al., 2004). The basic equation of the MLRM is as follows:
$$y = \beta_0 + \sum_{j=1}^{k} \beta_j X_j + \varepsilon \qquad (1)$$
where $y$ is the response (dependent output) on $k$ predictor values, $\beta_j$ is the $j$-th regression coefficient, $X_j$ is the $j$-th independent variable, and $\varepsilon$ is the error.
The parametric model of Eq. (2) has recently been used in similar applications, to estimate the design effort required to complete a project, with promising results (Salam et al., 2008, 2009; Bashir & Thomson, 2004). Furthermore, this generic model has been previously validated with the empirical evidence of Boehm (1991) and Walston and Felix (1977). In Eq. (2), the estimated target cost is expressed in terms of the cost drivers, where $X_m$ is the $m$-th cost driver and $\beta_m$ is a constant (weight) estimated from historical data. It should be noted that Eq. (2) is not a linear equation and is not in a form suitable for an MLRM. To overcome this problem, the natural log (ln) is taken of both sides, which gives the form shown in Eq. (3).
In this form, Eq. (3) is suitable for carrying out the regression, provided the natural log of the input variables is also taken. However, as the data set is small, the jackknife technique for regression models is applied to improve the results.
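The log-transformed regression described above can be sketched as follows. This is a minimal illustration, assuming a multiplicative form cost = b0 · Π x_m^(b_m), which is one form consistent with the description of Eqs. (2) and (3); the function names (`solve`, `fit_log_linear`, `predict`) and the synthetic data are hypothetical and are not the data or code of the study.

```python
import math

def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_log_linear(drivers, costs):
    """Least-squares fit of ln(cost) = ln(b0) + sum_m b_m * ln(x_m) via the normal equations."""
    rows = [[1.0] + [math.log(v) for v in x] for x in drivers]
    y = [math.log(c) for c in costs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    return math.exp(coef[0]), coef[1:]

def predict(b0, exponents, x):
    """Evaluate the assumed multiplicative model cost = b0 * prod_m x_m ** b_m."""
    cost = b0
    for v, e in zip(x, exponents):
        cost *= v ** e
    return cost

# Synthetic, noise-free illustration (NOT the masked BA data).
drivers = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 2.0), (5.0, 7.0)]
costs = [3.0 * x1 ** 0.5 * x2 ** 1.5 for x1, x2 in drivers]
b0, exps = fit_log_linear(drivers, costs)
```

Because the synthetic costs are generated without noise, the fit should recover the generating constants up to rounding; with real data the residual error of the log-space fit would remain.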

Jackknife technique
The jackknife technique is used to determine the regression coefficients of each of the model parameters. This technique was originally a computer-based method for estimating biases and standard errors. According to Efron and Tibshirani (1993), it is commonly used not only to reduce the bias of estimates obtained from small samples, but also in situations where the distribution of the data is hard to analyze. In this technique, the data are divided into sub-samples, which are obtained by deleting one observation at a time. The calculations are carried out for each sub-sample. Given a data set $x = (x_1, x_2, x_3, \ldots, x_n)$, the $i$-th jackknife sample $x_{(i)}$ is defined to be $x$ with the $i$-th data point removed. The pseudo-values, $Ps_i$, are determined using the following equation:
$$Ps_i = ns\,\beta - (ns - 1)\,\beta_{-i} \qquad (4)$$
The jackknife estimator is determined as follows:
$$\beta_{J} = \frac{1}{ns} \sum_{i=1}^{ns} Ps_i \qquad (5)$$
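The leave-one-out procedure can be sketched as below. This is a minimal illustration using the standard jackknife pseudo-value, ns·β − (ns − 1)·β₋ᵢ, applied to the slope of a one-variable least-squares fit; the helper names (`ols_slope`, `jackknife`) and the sample data are hypothetical, and the study applies the idea to its own regression coefficients.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x (a simple stand-in for a regression coefficient)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def jackknife(xs, ys, estimator):
    """Return the jackknife estimate and the list of pseudo-values."""
    ns = len(xs)
    full = estimator(xs, ys)  # estimator on the whole sample
    pseudo = []
    for i in range(ns):
        # Delete observation i and re-estimate on the remaining ns - 1 points.
        loo = estimator(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        pseudo.append(ns * full - (ns - 1) * loo)
    return sum(pseudo) / ns, pseudo

# Illustrative data only (not from the study).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
estimate, pseudo_values = jackknife(xs, ys, ols_slope)
```

For data that lie exactly on a line, every leave-one-out slope equals the full-sample slope, so all pseudo-values coincide with it; the spread of the pseudo-values on noisy data is what the jackknife uses to assess bias and standard error.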

Complex Parametric Models
The CNLM used in this research has the form given in Eq. (6). As was the case in the regression models, the terms x i represent the cost drivers, and the remaining terms are the constants. As this equation is not in the form of a regression model, the constants cannot be obtained by regression and are instead determined numerically using the gradient descent algorithm (GDA). The GDA is an optimization tool for finding a local minimum of a function (Snyman, 2005). In order to determine the constants, the function to be minimized is the square error of the predicted versus the actual costs (i.e., the sum of the squared differences between the predicted and the actual costs).
The gradient with respect to each constant gives the amount by which the constant has to be changed (i.e., delta) in order to reduce the square error. The adjustment applied to each constant is this delta multiplied by a step rate, η. Each of the constants is adjusted at each iteration until the specified stopping criterion (i.e., an acceptable change in error) is fulfilled.
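The update loop can be sketched as follows. This is a minimal illustration under stated assumptions: the functional form of the CNLM is not reproduced here, so a hypothetical two-constant model cost = a · x^b stands in for it, and the names `model`, `sse`, and `fit_gda`, the step rate, the tolerance, and the data are all illustrative choices rather than values from the study.

```python
import math

def model(x, a, b):
    """Hypothetical non-linear cost model (a stand-in, NOT the paper's CNLM)."""
    return a * x ** b

def sse(xs, ys, a, b):
    """Square error of predicted versus actual costs."""
    return sum((model(x, a, b) - y) ** 2 for x, y in zip(xs, ys))

def fit_gda(xs, ys, a=1.0, b=1.0, eta=0.01, tol=1e-12, max_iter=100000):
    """Gradient descent on the square error; stops when the change in error is below tol."""
    prev = sse(xs, ys, a, b)
    cur = prev
    for _ in range(max_iter):
        # Partial derivatives of the square error with respect to each constant.
        grad_a = sum(2.0 * (model(x, a, b) - y) * x ** b for x, y in zip(xs, ys))
        grad_b = sum(2.0 * (model(x, a, b) - y) * a * x ** b * math.log(x)
                     for x, y in zip(xs, ys))
        # Each constant moves against its gradient, scaled by the step rate eta.
        a -= eta * grad_a
        b -= eta * grad_b
        cur = sse(xs, ys, a, b)
        if abs(prev - cur) < tol:
            break
        prev = cur
    return a, b, cur

# Synthetic, noise-free data generated with a = 2 and b = 1.5 (illustration only).
xs = [0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x ** 1.5 for x in xs]
a_fit, b_fit, err = fit_gda(xs, ys)
```

The step rate matters: a value that is too large makes the iteration diverge, while a very small one slows convergence, which is one reason the inputs are usually scaled to comparable magnitudes before running the GDA.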
Parametric CERs depend on a pre-determined cost function to establish the target cost. However, non-parametric models, such as those based upon artificial intelligence techniques like neural networks (NN), do not require pre-determined relationships. NN models have the ability to self-determine the relationships among the variables in order to predict the cost. Therefore, the next model developed is based upon NNs, and is discussed in the following section.

Neural Networks
Multilayer NNs trained by back propagation, explained below, are amongst the most popular forms of NNs. They are able to classify data and approximate functions based on a set of sample data, known as the training data. It is shown in the literature that an NN with a single hidden layer and a non-linear activation function can approximate decision boundaries of a wide range of complexity (Kecman, 2001). This property is used to investigate the applicability of neural networks for cost estimation of the MLGs for BA. The NN used in this study is depicted in Fig. 1.
[Figure 1: network diagram showing the input layer (Layer 0), the hidden layer (Layer 1) and the output layer (Layer 2), with bias neurons feeding the hidden and output layers]
Fig. 1. The single hidden layer neural network used in this study

An input vector representing the three cost drivers (see Section 3.1) is incident on the input layer. It is then distributed to the hidden layer and finally to the output layer via the weighted connections. As can be seen in the figure, a bias neuron is also present. The bias neuron always outputs unity without having an input. The bias neuron is very important, since an error back-propagation neural network without a bias neuron for the hidden layer will not learn (Kecman, 2001). Aside from the bias neuron, each of the remaining neurons in the network operates by taking the sum of its weighted inputs and passing the result through a nonlinear activation function. This is shown mathematically in Eq. (6).
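$$out_{i,l} = f\left(\sum_{j} w_{j,i,l}\, out_{j,l-1}\right) \qquad (6)$$
where $out_{i,l}$ is the output of the $i$-th neuron in layer $l$, $w_{j,i,l}$ is the weight for the connection between the $j$-th neuron of layer $l-1$ and the $i$-th neuron of layer $l$, and $f$ is the nonlinear activation function.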
There are several conventionally used choices for this activation function. One of the most frequently used functions (Pandya & Macy, 1996), also used in this study, is the sigmoid function given in Eq. (7).
The computational simplicity of the derivative of this function simplifies the formulation of the equations needed for the training process. Moreover, this function is bounded between ±1, ensuring that signals remain within a fixed range, and it introduces non-linearity to the model. The term Q in this equation is referred to as the temperature of the neuron, and it determines the shape of the sigmoid. It is used to tune the network in order to improve its convergence behavior.
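The forward pass of such a network can be sketched as below. This is a minimal illustration under stated assumptions: the exact form of Eq. (7) is not reproduced here, so the logistic function with temperature, 1 / (1 + exp(−x / Q)), is assumed; the names `sigmoid`, `layer_output`, and `forward`, the weight layout, and the example inputs are hypothetical.

```python
import math

def sigmoid(x, q=0.9):
    """Logistic sigmoid with temperature q (an ASSUMED form of the activation function)."""
    return 1.0 / (1.0 + math.exp(-x / q))

def layer_output(inputs, weights, q=0.9):
    """Outputs of one layer.

    weights[i] holds the weights into neuron i; its last entry multiplies
    the bias neuron, which always outputs 1.
    """
    extended = list(inputs) + [1.0]
    return [sigmoid(sum(w * v for w, v in zip(row, extended)), q) for row in weights]

def forward(x, w_hidden, w_output, q=0.9):
    """Propagate the three cost drivers through the hidden layer to the single output."""
    hidden = layer_output(x, w_hidden, q)
    return layer_output(hidden, w_output, q)[0]

# Illustrative normalized inputs and all-zero weights (5 hidden neurons, 1 output).
x = [0.2, 0.5, 0.1]
w_hidden = [[0.0] * 4 for _ in range(5)]   # 3 inputs + bias per hidden neuron
w_output = [[0.0] * 6]                     # 5 hidden outputs + bias
y = forward(x, w_hidden, w_output)
```

A smaller temperature makes the assumed sigmoid steeper around zero and a larger one flattens it, which is how Q can be used to tune convergence.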
For the NN to be able to predict a cost given the three cost drivers as inputs, it must first be trained using a training set. Training is the process of successive and systematic adjustment of the weights in order to minimize a defined measure of error, which is typically the square error. In this process, each time an input vector $p$ (the three cost drivers) from the training set is presented to the network, the difference between the desired and the actual output is computed and each weight is adjusted by an amount $\Delta w_{j,i,l}$, given in Eq. (8).
where η is the learning rate, $o_{p,j,l}$ is the output of the $j$-th neuron of layer $l$, and $\delta_{p,j,l}$ is the error signal at the $j$-th neuron of layer $l$.
The error signal for the neuron in the output layer is computed using Eq. (9).
It should be noted that $t_{p,1,2}$ is the desired value at the neuron in the output layer corresponding to the $p$-th presentation of the training set vector. The training algorithm tries to bring the output value as close as possible to $t_{p,1,2}$, thereby minimizing the error. Eq. (10) is used to compute the error signal for the neurons in the hidden layer.
This training algorithm is known as training by back-propagation, and the complete derivation of the algorithm can be seen in Pandya and Macy (1996).
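The training procedure can be sketched as follows. This is a minimal illustration of the textbook delta rule, assuming a logistic sigmoid with temperature Q (so that the derivative is out·(1 − out)/Q), per-pattern weight updates, and a stopping rule on the maximum training error; the data, initial weights, and function names are hypothetical, and the exact forms of Eqs. (8) to (10) in the study may differ.

```python
import math
import random

Q, ETA = 0.9, 0.1  # temperature and learning rate quoted in the analysis section

def f(x):
    """Assumed activation: logistic sigmoid with temperature Q."""
    return 1.0 / (1.0 + math.exp(-x / Q))

def forward(x, w_h, w_o):
    """Return hidden outputs and the network output; the last weight of each row is the bias."""
    hid = [f(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_h]
    out = f(sum(w * v for w, v in zip(w_o, hid + [1.0])))
    return hid, out

def train_step(x, t, w_h, w_o):
    """One back-propagation update for a single training pattern (x, t)."""
    hid, out = forward(x, w_h, w_o)
    d_out = (t - out) * out * (1.0 - out) / Q  # error signal at the output neuron
    # Hidden error signals use the output weights BEFORE they are updated.
    d_hid = [h * (1.0 - h) / Q * d_out * w_o[j] for j, h in enumerate(hid)]
    for j, v in enumerate(hid + [1.0]):
        w_o[j] += ETA * d_out * v
    for j, row in enumerate(w_h):
        for i, v in enumerate(x + [1.0]):
            row[i] += ETA * d_hid[j] * v

def sse(data, w_h, w_o):
    return sum((t - forward(x, w_h, w_o)[1]) ** 2 for x, t in data)

# Synthetic normalized patterns (three cost drivers -> cost); NOT the study's data.
data = [([0.2, 0.3, 0.4], 0.35), ([0.5, 0.6, 0.5], 0.55),
        ([0.8, 0.9, 0.7], 0.75), ([0.4, 0.2, 0.3], 0.30)]
rng = random.Random(0)
w_h = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(5)]  # 5 hidden neurons
w_o = [rng.uniform(-0.5, 0.5) for _ in range(6)]
initial_error = sse(data, w_h, w_o)
for epoch in range(5000):
    for x, t in data:
        train_step(x, t, w_h, w_o)
    # Stop once the largest training error falls below the accepted value.
    if max(abs(t - forward(x, w_h, w_o)[1]) for x, t in data) <= 0.05:
        break
final_error = sse(data, w_h, w_o)
```

The cap on the number of epochs plays the same role as a maximum iteration count: it guarantees termination if the error criterion is never met.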

Data Analysis
In order to conduct this empirical study using both the parametric and the NN models, historical data are required. Table 1 below shows the costs of thirteen MLGs, along with the corresponding technical data for the three cost drivers, for thirteen aircraft programs at BA. It should be noted that the "actual" costs are not revealed and are masked to protect confidential information.
The manner in which the data are masked is important. The characteristics of the original data should be maintained; thus the masking technique presented in Muralidhar et al. (1999) is utilized, and Bombardier's permission was obtained to use the masked data. For both types of models, sub-samples of the data were randomly generated, where ten programs were used to generate or train the model and the remaining three were used for validation purposes. Three trials were conducted for this analysis. For the purpose of validation, the programs (6, 7, 13), (3, 7, 9), and (2, 8, 12) were removed from the data for Trials 1, 2, and 3, respectively.
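The three hold-out splits can be written down directly. The sketch below simply encodes the program numbers stated above; the names `PROGRAMS`, `VALIDATION`, and `split` are illustrative.

```python
# Program identifiers 1..13 and the validation programs removed in each trial.
PROGRAMS = list(range(1, 14))
VALIDATION = {1: (6, 7, 13), 2: (3, 7, 9), 3: (2, 8, 12)}

def split(trial):
    """Return (training programs, validation programs) for the given trial."""
    held_out = set(VALIDATION[trial])
    train = [p for p in PROGRAMS if p not in held_out]
    return train, sorted(held_out)

train_1, valid_1 = split(1)
```

Each split leaves ten programs for building or training the model and three for validation, and program 7 is held out in both Trials 1 and 2.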

Analysis Based on Regression
For each trial an MLRM was fitted, resulting in an equation to estimate the target cost of MLGs. The errors on the testing and validation data are obtained and will be further discussed in Section 4.3. Furthermore, with the MLRM, an analysis of variance (ANOVA) is conducted to see whether the factors considered are significant. The confidence level for significance is set at 90%. The procedure highlighted by Kutner et al. (2004) is used to determine the significant factors. If any factors are found to be insignificant, one factor is removed at a time and the analysis is repeated until only significant factors remain. The averaged equations generated to estimate the target costs are summarized in Table 2 below. In all three trials, only one factor, MTOW, is found to be significant.

Analysis Based on Complex Parametric Models
For modeling purposes, the values of the input and cost data were normalized to facilitate the convergence of the model.
[Normalization equations: each variable is divided by a constant; the divisors are 1,000, 100,000, 1,000 and 1,000,000]
As mentioned in Section 3.2, the gradient descent algorithm (GDA) was used to determine the coefficients of the model. For all 3 trials, using the GDA, the models converged and resulted in the following equations. (13)

Analysis Based on Neural Networks
The NN analysis is carried out for all 3 trials, as was done for the linear regression. For each of the trials the number of neurons in the single hidden layer is set at 5. The temperature Q is set at 0.9, and the learning rate η is set at 0.1. The stopping criterion, δ, is the maximum acceptable error for the normalized training data. Its value was tuned for each of the trials in order to minimize the maximum of the testing and validation errors; the final selected value was 0.05. Furthermore, the maximum number of allowable iterations was set at 192,000,000 to prevent the model from "crashing" if no convergence is obtained. The value of delta was set at [...]. Fig. 2 depicts the corresponding plot of the data, showing the actual (Data) versus the predicted (NN) values for both the training and validation points for Trial 1. In the plot shown in Fig. 2, we can see that the NN is able to make a good prediction on both the training and the validation points. Similar results are also obtained in the other trials. This shows that the NN is able to generalize even after being trained with the limited data sets available in this problem. The comparison between the parametric model and the NN for all the trials is summarized in the following section.

Summary and Comparative Analysis
Tables 5-7 below summarize the results for the regression, the complex parametric and the NN models for Trials 1-3, respectively. They tabulate the maximum error for all the trials on the training data, the validation data and all the data combined. As can be seen from Tables 5-8, the regression model is the worst in terms of validation error for all 3 trials. The NN based model outperforms both of the other models in all 3 trials in terms of error on the training data, and is superior in the majority of the cases when the validation errors are compared. Furthermore, a t-test comparing the validation errors for the 3 models indicates that the NN is superior, followed by the complex parametric model (CPM), and finally by the regression model (RM). Therefore, it is reasonable to conclude that the NN model is the most suitable model for estimating the target cost. Further analysis pertaining to NN models is discussed below.
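The maximum and average errors used in such a comparison can be computed as sketched below. The error definition is an assumption: the absolute percentage error |predicted − actual| / actual × 100 is used here because the errors are quoted in percent, and the function names and sample numbers are hypothetical.

```python
def pct_errors(actual, predicted):
    """Absolute percentage error of each prediction (ASSUMED error definition)."""
    return [abs(p - a) / a * 100.0 for a, p in zip(actual, predicted)]

def summarize(actual, predicted):
    """Return (maximum error, average error) in percent."""
    errs = pct_errors(actual, predicted)
    return max(errs), sum(errs) / len(errs)

# Illustrative values only: two actual costs and two model predictions.
max_err, avg_err = summarize([100.0, 200.0], [110.0, 190.0])
```

Applying `summarize` separately to the training programs, the validation programs and all programs together gives the three groups of figures compared across the models.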

Further Analysis of NN Models
As mentioned in the previous section, the number of neurons in the hidden layer was set at 5. It is of interest to see the effect of varying the number of neurons in the hidden layer. Tables 8-10 show this effect for Trials 1, 2, and 3, respectively. As can be seen from the sensitivity analysis, the number of neurons to select based upon the validation errors for Trials 1-3 would be 8, 10, and 5, respectively. In order to conduct the comparative analysis, a single value has to be used for the number of neurons in the hidden layer. The value is set to 5 to minimize the number of weights created and thereby avoid over-training. Moreover, as can be seen from Fig. 3, for different values of N the error ranges from 5.6% to 6.2%. Thus, based upon this analysis, it can be concluded for this case that there is no great sensitivity to the number of neurons in the hidden layer, contrary to the claims of Huang and Huang (1991). Therefore, any selected value of N would yield reasonable results for the NN.

Conclusions and Future Research
When purchasing a new product, understanding the target cost is essential, as it will help in planning, bidding and make or buy decisions.The study presented in this paper is a case study to estimate the target costs of the main landing gears at Bombardier Aerospace.Even though this empirical study is for a specific application, the methodology utilized can be applied to various industries interested in estimating costs, as the model has the flexibility of being generalized.
The methods utilized to make the cost estimations are regression based parametric models and neural networks. This study shows that the overall performance of the neural network models is superior to that of the parametric models. In fact, over all the trials presented in this paper, the overall average error of the regression based parametric model is more than double that of the neural networks.
Furthermore, the number of neurons in the hidden layer does not have a great impact on the performance of the model.
Even though this study has promising results, the analysis is based upon scarce data and is not an exhaustive study of all the possible sub-samples that can be made. If all the sub-samples are analyzed and neural networks are trained for all of them, the average output obtained from all of the neural networks can be utilized as an estimate of the target cost. Furthermore, the weights generated in the neural networks are based on the gradient descent method, which can become trapped in a local optimum of the error surface. Other meta-heuristics, such as genetic algorithms or simulated annealing, can be utilized to obtain the weights and compared to the method presented in this paper. Moreover, the research can be extended by comparing the estimates of NNs to those of other techniques, such as complex non-linear models and fuzzy clustering.
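The exhaustive sub-sampling idea can be sketched as follows. This is a minimal illustration, assuming that a sub-sample means any choice of ten training programs out of thirteen; a placeholder predictor (the mean training cost) stands in for a trained neural network, and the names `fit_placeholder` and `ensemble_estimate` and the cost values are hypothetical.

```python
from itertools import combinations

def fit_placeholder(train_costs):
    """Stand-in for training a model: predicts the mean training cost (NOT a neural network)."""
    return sum(train_costs) / len(train_costs)

def ensemble_estimate(costs, train_size=10):
    """Average the predictions of one model per sub-sample; also return the sub-sample count."""
    preds = [fit_placeholder([costs[i] for i in idx])
             for idx in combinations(range(len(costs)), train_size)]
    return sum(preds) / len(preds), len(preds)

# Synthetic costs for thirteen programs (illustration only).
costs = [float(c) for c in range(1, 14)]
estimate, n_subsamples = ensemble_estimate(costs)
```

With thirteen programs and ten used for training there are 286 distinct sub-samples, so the exhaustive study would require training 286 networks rather than the three trials reported here.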
where, in the jackknife equations, $Ps_i$ is the pseudo-value for the entire sample omitting sub-sample $i$, $ns$ is the number of sub-samples, $\beta$ is the least-squares estimator of the whole sample, and $\beta_{-i}$ is the least-squares estimator for the entire sample omitting sub-sample $i$.

Fig. 2. Plot of the masked cost versus the prediction using NN

Table 2
Regression based parametric models for Trial 1

Table 9
Maximum and average errors for Trial 2