An Iterative Method for Curve Adjustment Based on Optimization of a Variable and its Application

An iterative method for the adjustment of curves is obtained by applying the least squares method reiteratively in functional subclasses, each defined by one parameter, after assigning values to the rest of the parameters that determine a previously fixed general functional class. To find the minimum of the sum of the squared deviations in each subclass, only optimization techniques for real functions of a real variable are used. The value of the parameter which gives the best approximation in an iteration is substituted in the general functional class, the variable character of the following parameter is retaken, and the process is repeated, producing a succession of functions. In the case of simple linear regression, the convergence of that succession to the least squares line is demonstrated, because the values of the parameters that define each approximation coincide with those obtained when applying the Gauss-Seidel method to the normal system of equations. This approach contributes to the teaching objective of improving the treatment of the essential ideas of curve adjustment, a very important topic in applications, which in turn gives greater importance to the optimization of one-variable functions.


Introduction
The method of regression is one of the most important statistical methods for higher education graduates. Its comprehension facilitates obtaining and correctly interpreting the results of the different types of models applied in their professional careers. Bibliographic research in the scientific literature shows the wide interest in and use of regression methods. From such bibliographic analysis, three approaches can be differentiated. The largest one is related to the application of regression methods to different fields and topics of science; see Braga, Silveira, Rodríguez, Henrique de Cerqueira, Aparecido & Barros (2009), Guzmán, Bolivar, Alepuz, González & Martin (2011), Ibarra & Arana (2011) and Santos da Silva, Estraviz, Caixeta & Carolina (2006). A second approach is related to the theoretical aspects of the topic; see Núñez, Steyerberg & Núñez (2011), Vega-Vilca & Guzmán (2011), Donal (2001), Ranganatham (2004), Kelley (1999) and Schmidt (2005). Lastly, a third group is related to the teaching of the method, i.e., how to help students, and professionals in general, to correctly apply regression and interpret its results; see Batanero, Burrill & Reading (2011), Gutiérrez de Ravé, Jiménez-Hornero & Giráldez (2011) and Wei, De Quan & Jian (2001).
Applications of regression methods are found in scientific papers related to agriculture, medicine, the environment, economics, sociology and different engineering areas. In a random sample of one hundred papers published during 2012, obtained by the authors from the Web of Knowledge, 32% contained a direct application of these methods and almost half (46%) contained a reference to regression. Such significant use of regression supports its inclusion in most university curricula, where it is generally covered in the statistics subject.
In terms of teaching, curve adjustment is generally explained once the methods of optimization of real functions of several real variables are known, along with the solution of systems of linear equations. This makes it possible to support the procedures that determine the values of the parameters characterizing the functional class of the best adjustment curve sought. Usually, in practice, computer packages are used to determine these parameters.
Taking into consideration the importance of curve adjustment for applications, it is advisable to teach students these ideas earlier than is usually done in university curricula. How can this purpose be achieved if curve adjustment is preceded by mathematical prerequisites which do not seem possible to sever? This paper presents an approach that permits teaching these basic ideas of regression at least one semester in advance, which is justified under the following hypotheses:
- The methods of optimization for real functions of several variables are set aside, with the immediate implication that partial derivatives are not required.
- A system of linear equations is not stated, so the corresponding theory is not necessary.
- The least squares method is applied reiteratively in functional classes determined by only one parameter, so that in each of them the corresponding sum of the squared deviations is a function of a single variable. Consequently, only optimization techniques for real functions of a real variable are required.
Though sufficient and varied bibliography about the least squares method is available, it was considered necessary to first make explicit some of its basic aspects, such as the expression taken by the sum of the squared deviations, as well as the normal system of equations formed when stating the necessary conditions for extremes, both in the case of simple linear regression.
Curve adjustment is possibly the most frequently used mathematical resource for solving one of the fundamental problems arising in numerous scientific areas: "reconstructing" a function starting from experimental data. Essentially, for the case of one-variable functions, this problem may be formulated through the following statement: "Given the set of n points {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}, where n is a natural number and every two abscissas x_k are different, determine the function y = f(x) which, within a given prefixed class of functions, best adjusts them".

Curve Adjustment and the Least Squares Method
Generally, the prefixed functional class depends on various parameters, and the purpose of the method used for their estimation is to satisfy some optimization criterion, which is characteristic of the method; in particular, the objective of the least squares method is to minimize the sum of the squared deviations. Two other frequently used alternatives are the maximum likelihood method, see Yoshimori & Lahiri (2014), Seo & Lindsay (2013) and Han & Phillips (2013), and the Bayesian regression method, see Zhao, Valle, Popescu, Zhang & Mallick (2013), Mudgal, Hallmark, Carriquiry & Gkritza (2014) and Choi & Hobert (2013).
In the probably most renowned and significant case of finding the best-adjusting function within the class of linear functions of one independent variable, f(x) = a_1 x + a_2, this problem is solved through the least squares method by determining the values of the parameters a_1 and a_2 (the slope and the intercept with the y-axis, respectively) which provide the minimum value of the sum of the squared deviations S(a_1, a_2):

$$S(a_1, a_2) = \sum_{k=1}^{n} (a_1 x_k + a_2 - y_k)^2$$

Determining the minimum of S(a_1, a_2) requires applying optimization techniques for real functions of two real variables, which initially require the use of the necessary condition on extreme points:

$$\frac{\partial S}{\partial a_1} = 0, \qquad \frac{\partial S}{\partial a_2} = 0 \qquad (1)$$

Afterwards, it requires solving the system of two linear equations resulting from it, with a_1 and a_2 as unknowns. This system is called the normal equation system, which is expressed as follows:

$$a_1 \sum_{k=1}^{n} x_k^2 + a_2 \sum_{k=1}^{n} x_k = \sum_{k=1}^{n} x_k y_k$$
$$a_1 \sum_{k=1}^{n} x_k + n\, a_2 = \sum_{k=1}^{n} y_k \qquad (2)$$

Applying any of the existing techniques for the resolution of system (2) yields a unique solution, given by the expressions:

$$a_1 = \frac{n \sum_{k=1}^{n} x_k y_k - \sum_{k=1}^{n} x_k \sum_{k=1}^{n} y_k}{n \sum_{k=1}^{n} x_k^2 - \left( \sum_{k=1}^{n} x_k \right)^2}, \qquad a_2 = \frac{1}{n} \left( \sum_{k=1}^{n} y_k - a_1 \sum_{k=1}^{n} x_k \right) \qquad (3)$$

The procedure herein presented is equivalent to applying the Gauss-Seidel method (McCracken & Dorn 1974) to the normal equation system (2). This is an iterative method for the resolution of linear equation systems such as system (2), or the one resulting from applying the necessary condition (1) to the sum of the squared deviations when the adjustment takes place in a functional class that is linear with respect to the parameters defining it.
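As a concrete companion to the closed-form solution (3), the following short sketch (illustrative code, not from the paper; the function name is an assumption) computes the slope and intercept directly from those expressions:

```python
def least_squares_line(xs, ys):
    """Return (a1, a2) minimizing S(a1, a2) = sum((a1*x + a2 - y)**2),
    using the closed-form solution of the normal equation system."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Unique solution of the normal system; valid when the abscissas are
    # not all equal, so that n*sxx - sx**2 != 0.
    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a2 = (sy - a1 * sx) / n
    return a1, a2
```

With the five points used later in the paper's worked example, A(1, 1), B(2, 3), C(3, 3), D(4, 5) and E(5, 5), this yields a_1 = 1 and a_2 = 0.4, i.e., the line y = x + 0.4.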
In terms of teaching organization, this approach gives more significance to the optimization methods of real functions of one real variable. At the same time, it permits the introduction of an important application, curve adjustment, at least one semester in advance.

An Iterative Method for the Process of Curve Adjustment
Solving the normal equation system (2) is not possible until a method for optimizing a differentiable function of two real variables is available. Alternatively, a sequence that only requires applying one-variable optimization techniques, one parameter at a time, may be followed. Such a method results from carrying out the following steps:
1. Prefix the functional class in which the adjustment process will be carried out.
As is known, the functional class is characterized by a functional expression involving the independent variable and the p parameters that define it, p being a positive integer. This class of functions is denoted by

$$y = f(x, a_1, a_2, \ldots, a_p)$$

where x is the independent variable and the parameters are denoted by a_1, a_2, ..., a_p, for which it is necessary to previously establish an order among them.
2. Keep the variable character of a_1 and assign values to the rest of the parameters.
The values assigned to the parameters are denoted by a_2^(0), a_3^(0), ..., a_p^(0), where the sub-index of each identifies the parameter and the supra-index 0 indicates that it is the initial assignment. These values may be arbitrary or follow a certain criterion, but this is irrelevant to the method being described. Thus, the set of functions

$$y = f(x, a_1, a_2^{(0)}, a_3^{(0)}, \ldots, a_p^{(0)})$$

is defined, formed by functions of the independent variable x depending on the parameter a_1, which obviously constitutes a subclass of the prefixed functional class.
3. Form the sum of the squared deviations in y = f(x, a_1, a_2^(0), ..., a_p^(0)). Given the set {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} of n points, the corresponding sum of the squared deviations to be minimized is formed, which is a function of the parameter a_1, defined by the expression

$$S(a_1) = \sum_{k=1}^{n} \left( f(x_k, a_1, a_2^{(0)}, \ldots, a_p^{(0)}) - y_k \right)^2$$

4. Apply the necessary extreme condition to S(a_1).
As S(a_1) is a one-variable function, it is enough to state S'(a_1) = 0 and determine the solution of this equation. This gives the value of the parameter a_1, denoted by a_1^(1), such that S(a_1^(1)) is the lowest value of S(a_1). It is important to note that the supra-index 1 in a_1^(1) means that this is the first value calculated for a_1.
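The steps above can be sketched in code. In this hedged illustration (the function names, the search interval and the use of golden-section search are assumptions for the sketch, not part of the paper's method description), all parameters but one are fixed and the resulting one-variable sum of squared deviations is minimized with a standard one-variable technique:

```python
def golden_min(g, lo, hi, tol=1e-10):
    """Minimize a unimodal one-variable function g on [lo, hi]
    by golden-section search (no derivatives required)."""
    phi = (5 ** 0.5 - 1) / 2
    x1 = hi - phi * (hi - lo)
    x2 = lo + phi * (hi - lo)
    while hi - lo > tol:
        if g(x1) <= g(x2):
            hi, x2 = x2, x1              # minimum lies in [lo, x2]
            x1 = hi - phi * (hi - lo)
        else:
            lo, x1 = x1, x2              # minimum lies in [x1, hi]
            x2 = lo + phi * (hi - lo)
    return (lo + hi) / 2

def best_parameter(f, params, j, xs, ys, lo=-100.0, hi=100.0):
    """Minimize the sum of squared deviations over params[j] alone,
    keeping the other parameters of the class y = f(x, a1, ..., ap) fixed."""
    def S(aj):
        p = params[:j] + [aj] + params[j + 1:]
        return sum((f(x, *p) - y) ** 2 for x, y in zip(xs, ys))
    return golden_min(S, lo, hi)
```

For the linear class f(x, a_1, a_2) = a_1 x + a_2 with a_2 fixed at 0, this recovers the one-variable minimizer of S(a_1) without any partial derivatives, in the spirit of the hypotheses stated in the introduction.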

Results and Discussion
The implementation of the previous process guarantees obtaining the function of best adjustment within the subclass y = f(x, a_1, a_2^(0), ..., a_p^(0)) of the initially prefixed functional class y = f(x, a_1, a_2, ..., a_p).
In general, the function y = f(x, a_1^(1), a_2^(0), ..., a_p^(0)), obtained by substituting a_1^(1) for a_1, is not expected to be a good approximation to the best adjustment within the general prefixed class y = f(x, a_1, a_2, ..., a_p).
The described process is repeated, leaving the next parameter (in this case a_2) with variable character and taking for a_1 the calculated value a_1^(1). As a result, the value of the parameter a_2, denoted by a_2^(1), will be obtained, offering the best adjustment function within the subclass

$$y = f(x, a_1^{(1)}, a_2, a_3^{(0)}, \ldots, a_p^{(0)})$$

Once the whole set of parameters has been covered by proceeding similarly, the following p functions are obtained:

$$y_1^{(1)} = f(x, a_1^{(1)}, a_2^{(0)}, a_3^{(0)}, \ldots, a_p^{(0)})$$
$$y_2^{(1)} = f(x, a_1^{(1)}, a_2^{(1)}, a_3^{(0)}, \ldots, a_p^{(0)})$$
$$\vdots$$
$$y_p^{(1)} = f(x, a_1^{(1)}, a_2^{(1)}, a_3^{(1)}, \ldots, a_p^{(1)})$$

where each of them is the best adjustment function within the corresponding functional subclass.
It can be verified that each of these functions is an approximation not worse than the previous one. Indeed, y_2^(1) is obtained by minimizing the sum of the squared deviations over a_2, and a_2^(0) is one of the admissible values for it. So, with the optimal value a_2^(1) for that parameter, the function y_2^(1) provides a value for the sum of squared deviations that is not greater than the one given by y_1^(1). This proves that y_2^(1) is an approximation not worse than y_1^(1). The same holds for the rest of the functions, and this step completes the first iteration.
As the values of the parameters a_1^(1), a_2^(1), ..., a_p^(1) obtained in the first iteration need not be the optimal ones for the general class, the process is repeated from them, producing new iterations and, with them, a succession of functions.

One Example of the Application of the Iterative Method
A table with arbitrary or hypothetical data, which determines five points of integer coordinates, A(1, 1), B(2, 3), C(3, 3), D(4, 5) and E(5, 5), is taken. In Figure 1, the regression line y = x + 0.4 is represented. It was obtained by the least squares method in the general functional class y = a_1 x + a_2, where the parameters are the slope a_1 and the intercept a_2. The sum of the squared deviations is a function of these two parameters, so obtaining y = x + 0.4 (that is, a_1 = 1 and a_2 = 0.4) required optimization techniques for functions of several variables and the exact resolution of the normal system of equations.
In Figure 2, a segment of the first approximation is represented. It is the line through the origin (with slope 61/55) that best adjusts to the five points. It is obtained by optimizing in the functional class y = a_1 x, where the parameter is the slope a_1, which follows from assigning the value zero to the parameter a_2 in y = a_1 x + a_2. Geometrically, this means optimizing in the class of all non-vertical lines passing through the origin. The sum of the squared deviations is a function of a_1 only, so determining this optimum (minimum) requires only optimization techniques for functions of one variable. It does not even require the ordinary derivative: observing that the sum of the squared deviations is a quadratic function of the variable a_1, whose graph is a parabola opening upwards, the minimum is reached at the abscissa of the vertex (the value 61/55). The segment of the regression line is maintained for purposes of comparison.

In Figure 3, a segment of the second approximation, which best adjusts to the five points among all the lines with slope 61/55, is represented. It is obtained by optimizing in the functional class y = (61/55)x + a_2, which results from the class y = a_1 x + a_2 by replacing a_1 = 61/55 and retaking the variable character of a_2 (notice that for a_2 the value 0 was initially assumed). Geometrically, this means that the line of equation y = (61/55)x is displaced parallel to itself up to the position that improves the adjustment (provides the minimum of the sum of the squared deviations, which now depends on a_2). The resulting value for the parameter is a_2 = 4/55.

For a new approximation, a_2 = 4/55 is replaced in the general class y = a_1 x + a_2 to optimize in the subclass y = a_1 x + 4/55, in which the slope a_1 is variable again, so that what is sought is the line that best adjusts to the data (in the sense of minimizing the corresponding sum of squared deviations) among all those passing through the point (0, 4/55) of the y-axis. The result is the new slope value a_1 = 659/605, which gives the line of best adjustment y = (659/605)x + 4/55, a segment of which is represented in Figure 4 together with that of the least squares regression line.

Continuing the process similarly, the next adjustment would take place in the functional subclass y = (659/605)x + a_2, where the variable character of the second parameter is retaken.
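The iterations of this worked example can be reproduced with a few lines of code (a sketch using the paper's five data points; the closed-form updates below are the vertex abscissas of the parabolas S(a_1) and S(a_2)):

```python
xs = [1, 2, 3, 4, 5]   # abscissas of A, B, C, D, E
ys = [1, 3, 3, 5, 5]   # ordinates

a2 = 0.0               # initial assignment: intercept fixed at 0
for m in range(100):
    # best slope among the lines through (0, a2): vertex of the parabola S(a1)
    a1 = sum(x * (y - a2) for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    # best intercept among the lines with slope a1: vertex of the parabola S(a2)
    a2 = sum(y - a1 * x for x, y in zip(xs, ys)) / len(xs)
    if m == 0:
        print(a1, a2)  # first pass: 61/55 and 4/55, as in the text

print(a1, a2)          # approaches the least squares line y = x + 0.4
```

The first pass produces the slope 61/55 and the intercept 4/55 described above, and subsequent passes approach the least squares line y = x + 0.4.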

The Gauss -Seidel Method and Convergence in the Case of Linear Adjustment
The issue related to the convergence of the described iterative method has an affirmative answer in the case of linear adjustment, provided the set of points fulfills the initially described characteristic; i.e., within the full set, every two points have different abscissas.
As the best adjusting straight line, with equation y = a_1 x + a_2, does exist, and its parameters are analytically determined by (3) as the unique solution of the normal equation system, an iterative method convergent to the solution of such a system would obviously provide, after an adequate number of iterations, approximations α ≈ a_1 and β ≈ a_2. One of the simplest iterative methods for the resolution of a linear equation system, easily programmed for its computerized application, is the Gauss-Seidel method.
Revista Colombiana de Estadística 37 (2014) 111-125

For the case of a system of two equations with two unknowns,

$$b_{11} a_1 + b_{12} a_2 = c_1$$
$$b_{21} a_1 + b_{22} a_2 = c_2$$

the method is described as follows. Supposing that in the coefficient matrix the elements of the main diagonal are not null, it is possible to solve for the unknowns a_1 and a_2:

$$a_1 = \frac{c_1 - b_{12} a_2}{b_{11}}, \qquad a_2 = \frac{c_2 - b_{21} a_1}{b_{22}} \qquad (6)$$

An arbitrary initial approximation a_2 = a_2^(0) is now defined for the solution, and it is used to find an approximation to the unknown a_1 from the first of the expressions in (6):

$$a_1^{(1)} = \frac{c_1 - b_{12} a_2^{(0)}}{b_{11}}$$

The calculated value a_1^(1) is substituted in the second of the expressions in (6) to determine an approximation to the unknown a_2:

$$a_2^{(1)} = \frac{c_2 - b_{21} a_1^{(1)}}{b_{22}}$$

At this point the first iteration is fulfilled.
The second iteration is implemented by taking the calculated approximation a_2^(1) as the new starting value and repeating the procedure, which leads to the iteration of order m defined by the expressions:

$$a_1^{(m)} = \frac{c_1 - b_{12} a_2^{(m-1)}}{b_{11}}, \qquad a_2^{(m)} = \frac{c_2 - b_{21} a_1^{(m)}}{b_{22}} \qquad (7)$$

A sufficient condition for the convergence of the iterations produced by the Gauss-Seidel method to the solution of the system lies in the coefficient matrix being diagonally dominant, which in this case means that the inequality |b_{11} b_{22}| > |b_{21} b_{12}| has to be fulfilled (McCracken & Dorn 1974). If we now define b_11 = ∑ x_k^2, b_12 = b_21 = ∑ x_k, b_22 = n, c_1 = ∑ x_k y_k and c_2 = ∑ y_k (all sums over k = 1, ..., n), then the normal equation system (2) can be expressed in the above two-equation form, so that the expressions in (7) allow determining an approximation to its solution.
It is not difficult to verify that the values given by the expressions (7) match, in each iteration m, those provided for the slope a_1 and the intercept a_2 by each iteration of the iterative adjustment process described here. Neither is it difficult to prove that the matrix of system (2) is diagonally dominant if, in the set of points {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}, where n is a natural number, every two of the abscissas x_k are different, which means that the inequality

$$n \sum_{k=1}^{n} x_k^2 > \left( \sum_{k=1}^{n} x_k \right)^2$$

is fulfilled.
Indeed, according to Bronshtein & Semendiaev (1971), in the inequality between the arithmetic and the quadratic means (which is strict if there are at least two different x_k values)

$$\frac{1}{n} \sum_{k=1}^{n} x_k \le \sqrt{\frac{1}{n} \sum_{k=1}^{n} x_k^2}$$

it suffices to square both sides to obtain, first,

$$\frac{1}{n^2} \left( \sum_{k=1}^{n} x_k \right)^2 \le \frac{1}{n} \sum_{k=1}^{n} x_k^2$$

Then, multiplying both sides by n^2 and expressing the sums in compact form, it results that

$$\left( \sum_{k=1}^{n} x_k \right)^2 \le n \sum_{k=1}^{n} x_k^2$$

As every two of the numbers x_k are supposed to be different, the fulfillment of the strict inequality n ∑ x_k^2 > (∑ x_k)^2 is finally guaranteed. This proves that the matrix of the normal equation system (2) is diagonally dominant, which in turn implies that the iterations (7), obtained by applying the Gauss-Seidel method, converge to the unique solution of such a system.
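This equivalence admits a short hedged sketch (the function name is illustrative, not the authors' code): Gauss-Seidel applied to the normal system (2), written with the coefficients b_11 = ∑ x_k^2, b_12 = b_21 = ∑ x_k and b_22 = n identified in the text, reproduces the slope and intercept updates of the iterative adjustment process:

```python
def gauss_seidel_line(xs, ys, iterations=100, a2=0.0):
    """Gauss-Seidel iterations (7) on the normal system (2) of y = a1*x + a2."""
    n = len(xs)
    b11 = sum(x * x for x in xs)             # coefficient of a1, 1st equation
    b12 = b21 = sum(xs)                      # off-diagonal coefficients
    b22 = n
    c1 = sum(x * y for x, y in zip(xs, ys))
    c2 = sum(ys)
    # Diagonal dominance n*sum(x^2) > (sum(x))^2 holds when at least two
    # abscissas differ, so the iterations converge to the solution (3).
    for _ in range(iterations):
        a1 = (c1 - b12 * a2) / b11           # first expression of (7)
        a2 = (c2 - b21 * a1) / b22           # second expression of (7)
    return a1, a2
```

With the five points of the worked example, the iterations converge to a_1 = 1 and a_2 = 0.4, the least squares line.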
At the same time, each iteration of the method was observed to coincide, in its parameter values, with those that result in each step from the function of best adjustment within the corresponding subclass. Therefore, it can be concluded that these functions converge to the least squares straight line of equation y = a_1 x + a_2, with a_1 and a_2 given by (3).

Conclusions
An iterative method has been proposed to obtain an approximation of the best adjustment function to a given set of points, consisting of determining, each time, the best adjustment function within a certain subclass of the prefixed functional class, each subclass depending on a single parameter.
As optimization is used on one variable only, it is not required to explicitly state the normal equation system.

Figure 2: Segment of the first approximation.

Figure 3: Segment of the second approximation.

Figure 4: Segment of the third approximation.