Support Vector Machines Approach for Estimating Work Function of Semiconductors: Addressing the Limitation of Metallic Plasma Model

Experimental determination of the work function of metals is not only difficult but also subject to a certain degree of inaccuracy. The success of the metallic plasma model (MPM) remains incomplete because it cannot correctly estimate the work functions of the elemental semiconductors (silicon and germanium). A support vector machine (SVM) approach, with an accuracy of over 98% in work function estimation, is proposed here to address this limitation of MPM. The work functions estimated using the SVM approach were compared with the generally accepted experimental work functions as well as with those obtained from MPM and another theoretical model. The work functions obtained using the SVM approach agree well with the experimental values.
Keywords: work function, support vector machines, coefficient of correlation, metallic plasma model


Introduction
The significance of the work function in surface science calls for in-depth knowledge of how electrons are transported from the Fermi surface to the vacuum, as well as of the ways in which the energy needed can be accurately estimated. Full knowledge of the work function and the surface energy of a metallic surface enhances understanding of the formation of grain boundaries, surface segregation and growth rates, to mention but a few (J. Wang & S. Q. Wang, 2014). Inaccuracies in the experimental determination of the minimum energy required to remove an electron from the surface of the metal (the work function) using the scanning Kelvin probe force microscope (SKPFM) (Rohwerder & Turcu, 2007) call for theoretical models. Theoretical calculation of work functions dates back to 1935 (Bardeen, 1935), when the estimate was made as the energy difference between two lattices with the same number of ions but differing by one electron. This qualitative estimate of the work function was far from the experimental values; consequently, the need for modification arose. The theory was later strengthened using pseudo-potential theory so that the effects of the ions could be included (Lang & Kohn, 1971). Only the work functions of simple metals (such as Li, Na, Al, Cs and others) could be estimated using this model (Lang & Kohn, 1971), while noble metals showed deviations from the experimental values. In the course of finding an accurate method of estimating the work function, the stabilized jellium model was proposed, using a constant potential in addition to the effective potential of the metal (Kiejna, 1999). Another set of work functions for body-centred cubic (BCC) and face-centred cubic (FCC) metals was recently presented using the density functional theory method (J. Wang & S. Q. Wang, 2014). The model proposed in this research work accurately estimates the work function of elemental semiconductors using an approach different from the aforementioned models.
The main goal of this research work is to address the limitation of the metallic plasma model (Olatunji, Selamat, & Raheem, 2011), recently adopted for estimating the work function of metals using the fundamental relationship between the work function, the electron density parameter and the Fermi energy. The SV regression model adopted in this work accurately estimates the work functions of germanium and silicon using the electron density parameter, the Fermi energy and the number of valence electrons of the elements concerned as predictors.
The inability of MPM to accurately estimate the work functions of elemental semiconductors was attributed to the strong localization of their valence electrons, which results in poor screening (Olatunji, Selamat, & Raheem, 2011). SV regression has an edge over MPM because it learns the relationship between the predictors (Fermi energy E_F (eV), electron density parameter R (a.u.) and the number of valence electrons) and the target (work function), and can thereby accurately predict the target for any given unseen predictors. The support vector machine is an algorithm from the field of artificial intelligence (AI). It has been widely deployed in several fields of study for classification and prediction. In the oil and gas industry, it has shown excellent performance in the prediction of the permeability of carbonate reservoirs (Olatunji, Selamat, & Abdulraheem, 2014) as well as other properties of crude oil (Olatunji, 2010). It is employed in the medical field for the identification of skin diseases (Olatunji & Arif, 2013) as well as for predicting prostate cancers (Shini, Laufer, & Rubinsky, 2011). SVM and other tools of artificial intelligence have also been applied in materials characterization (Swaddiwudhipong, Tho, Liu, Hua, & Ooi, 2005), predicting software maintainability of object-oriented systems (Olatunji & Hossain, 2012), handwritten Arabic recognition (Mahmoud & Olatunji, 2009), forecasting stock prices (S. O. Olatunji, 2013), predicting correlation properties of crude oil (Olatunji, Selamat, & Raheem, 2011), estimation of atomic radii of periodic elements (Owolabi, Akande, & Olatunji, 2014) and prediction of the compressive strength of concrete (Akande, Owolabi, Twaha, & Olatunji, 2014), to mention but a few. The success of SVM in these different fields of application, coupled with the need for an accurate means of estimating the work functions of semiconductors, prompted this research work.
The uniqueness of this research work is that it adopts a simple model (support vector regression) through which the work functions of elemental semiconductors are accurately determined. Empirical results from the simulations carried out indicate that SVM is able to predict the work functions of elemental semiconductors with an accuracy of up to 98% on the unseen dataset. This is a notable result and a motivation to further explore AI techniques for predicting other needed materials properties. The high accuracy obtained suggests that the developed model is an excellent method for predicting the work functions of elemental semiconductors using predictors which are available in the literature.

Proposed Method
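Support vector machine is a kind of artificial intelligence system that learns from experience. Statistical learning theory (Cortes & Vapnik, 1995b) forms the basis of support vector machines (SVM). SVM is widely used for classification and regression. Classification with SVM is based on a principle that employs optimal separation of classes. If the classes under consideration are not separable, SVM adopts the hyper-plane that maximizes the margin while minimizing a parameter which has direct influence on misclassification errors (Gupta, 2007). Support vector regression (SVR), as proposed by Vapnik (Cortes & Vapnik, 1995a), introduces the ε-insensitive loss function, which permits the application of the concept of margin to regression problems. The main goal of SV regression is to search for a flat function which deviates from the actual target by at most ε for all training data under consideration. For simplicity, consider the linear function in Equation (1),

$$f(x) = \langle w, x \rangle + d \tag{1}$$

where $w \in P$, $d \in \mathbb{R}$ and $\langle w, x \rangle$ represents the dot product in the input space $P$. Minimization of the Euclidean norm $\|w\|^2$ ensures the flatness of Equation (1). Therefore, the optimization attained in SV regression can be represented as

$$\text{minimize } \tfrac{1}{2}\|w\|^2 \quad \text{subject to} \quad \begin{cases} y_i - \langle w, x_i \rangle - d \le \varepsilon \\ \langle w, x_i \rangle + d - y_i \le \varepsilon \end{cases} \tag{2}$$

Equation (2) holds on the basis of the existence of a function which gives rise to an error that is less than ε for all training pairs $(x_i, y_i)$. To make room for other errors that might arise in real-life problems, slack variables ($\xi_i$, $\xi_i'$) are introduced. As a result, Equation (2) can further be written as

$$\text{minimize } \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i') \quad \text{subject to} \quad \begin{cases} y_i - \langle w, x_i \rangle - d \le \varepsilon + \xi_i \\ \langle w, x_i \rangle + d - y_i \le \varepsilon + \xi_i' \\ \xi_i,\ \xi_i' \ge 0 \end{cases} \tag{3}$$

The regularization factor C is determined by the user and measures the trade-off between the flatness of the generated function and the extent to which deviations from the target larger than ε are tolerated. Equation (3) can be transferred to a dual space representation using Lagrangian multipliers. To do this, the constraint equations are multiplied by Lagrangian multipliers ($\lambda_i$, $\lambda_i'$, $\eta_i$, $\eta_i'$, where $i = 1, \dots, n$) and then subtracted from the objective function. Therefore, the equation becomes

$$L = \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i') - \sum_{i=1}^{n} \lambda_i \left( \varepsilon + \xi_i - y_i + \langle w, x_i \rangle + d \right) - \sum_{i=1}^{n} \lambda_i' \left( \varepsilon + \xi_i' + y_i - \langle w, x_i \rangle - d \right) - \sum_{i=1}^{n} (\eta_i \xi_i + \eta_i' \xi_i') \tag{4}$$

In order to obtain the solution to the above optimization problem, the saddle points are located by equating the derivatives of the Lagrangian in Equation (4) (with respect to $w$, $d$, $\xi_i$ and $\xi_i'$) to zero:

$$\frac{\partial L}{\partial d} = \sum_{i=1}^{n} (\lambda_i' - \lambda_i) = 0, \qquad \frac{\partial L}{\partial w} = w - \sum_{i=1}^{n} (\lambda_i - \lambda_i')\, x_i = 0, \qquad \frac{\partial L}{\partial \xi_i} = C - \lambda_i - \eta_i = 0, \qquad \frac{\partial L}{\partial \xi_i'} = C - \lambda_i' - \eta_i' = 0 \tag{5}$$

The dual optimization problem is obtained by putting Equation (5) in Equation (4). We have

$$\text{maximize } -\tfrac{1}{2} \sum_{i,j=1}^{n} (\lambda_i - \lambda_i')(\lambda_j - \lambda_j') \langle x_i, x_j \rangle - \varepsilon \sum_{i=1}^{n} (\lambda_i + \lambda_i') + \sum_{i=1}^{n} y_i (\lambda_i - \lambda_i') \quad \text{subject to} \quad \sum_{i=1}^{n} (\lambda_i - \lambda_i') = 0, \quad \lambda_i,\ \lambda_i' \in [0, C] \tag{6}$$

Equation (1) can therefore be modified by utilizing the solutions ($\lambda_i$ and $\lambda_i'$) obtained from Equation (6). The modified equation becomes

$$f(x) = \sum_{i=1}^{n} (\lambda_i - \lambda_i') \langle x_i, x \rangle + d \tag{7}$$

Non-linear functions are treated in support vector regression by adopting the concept of kernel functions (Cortes & Vapnik, 1995a). The kernel function enables SV regression to perform linear regression in feature space by mapping the data into a high-dimensional space. The regression problem is written in feature space by replacing $x_i$ and $x_j$ in Equation (6) by $\Phi(x_i)$ and $\Phi(x_j)$. The optimization problem becomes

$$\text{maximize } -\tfrac{1}{2} \sum_{i,j=1}^{n} (\lambda_i - \lambda_i')(\lambda_j - \lambda_j') K(x_i, x_j) - \varepsilon \sum_{i=1}^{n} (\lambda_i + \lambda_i') + \sum_{i=1}^{n} y_i (\lambda_i - \lambda_i') \quad \text{subject to} \quad \sum_{i=1}^{n} (\lambda_i - \lambda_i') = 0, \quad \lambda_i,\ \lambda_i' \in [0, C] \tag{8}$$

where

$$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j) \tag{9}$$

The regression function now becomes

$$f(x) = \sum_{i=1}^{n} (\lambda_i - \lambda_i') K(x_i, x) + d \tag{10}$$

Kernel Function: The transformation of the dataset into a high-dimensional feature space is carried out by the kernel function (Olatunji, 2010). The complexity of the final solution is governed by the structure of the high-dimensional feature space, which is determined by the variables of the kernel. Hence, the variables of the kernel need to be accurately computed. Polynomial, linear, Gaussian and sigmoid kernels are the most commonly used kernel functions in the literature (Mahmoud & Olatunji, 2009).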
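As a rough illustration of the kernel regression function described above, the following Python sketch writes out the four kernel families named in the text and evaluates a prediction as a kernel-weighted sum over training points plus a bias. The function names, the default kernel parameters (`degree`, `coef0`, `sigma`, `scale`, `offset`) and the toy coefficients in the usage note are assumptions made for illustration; they are not the settings used in this work.

```python
import math


def linear_kernel(x, z):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel (<x, z> + coef0) ** degree; defaults are illustrative."""
    return (linear_kernel(x, z) + coef0) ** degree


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian kernel exp(-||x - z||^2 / (2 sigma^2)); sigma is illustrative."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigmoid_kernel(x, z, scale=1.0, offset=0.0):
    """Sigmoid kernel tanh(scale * <x, z> + offset); parameters are illustrative."""
    return math.tanh(scale * linear_kernel(x, z) + offset)


def svr_predict(x, support_vectors, dual_coefs, bias, kernel):
    """Evaluate f(x) = sum_i coef_i * K(x_i, x) + bias.

    Each entry of dual_coefs plays the role of (lambda_i - lambda_i')
    for the corresponding training point x_i.
    """
    weighted = sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs))
    return weighted + bias
```

For example, `svr_predict([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5], 0.1, linear_kernel)` combines two hypothetical support vectors with made-up coefficients; in a trained model these coefficients would come from solving the dual optimization problem.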

The Description of the Datasets
The actual data (obtained experimentally) employed in this research work were drawn from the literature (J. Wang & S. Q. Wang, 2014; Michaelson, 1977). A total of fifty-three data points was used for both training and testing, as presented in Table 1 (Olatunji, Selamat, & Raheem, 2011). This small dataset was adopted because the SV regression model is known to be highly stable and able to perform accurately even with few data points (Shin, Lee, & Kim, 2005). The correlation coefficients between the predictors (Fermi energy, electron density parameter and the number of valence electrons) and the target (work function) are not strong, so one might think that the work function is unlikely to be estimable from the Fermi energy and the electron density parameter. SV regression is able to learn both linear and non-linear relationships between the predictors and the target so as to establish a pattern through which unknown targets can be accurately predicted. The high accuracy of the trained system (on the basis of the coefficient of correlation) in predicting the work function gives assurance that a pattern exists between the predictors and the target, one which may be difficult to learn using linear regression.

Experimental Description
This research work was carried out in the MATLAB environment. To make the computation efficient, the dataset was normalized and randomly separated into training and testing sets in the ratio of 8 to 2 (that is, 80% training and 20% testing). The SVM model was trained using the training set, and its generalization accuracy was tested by predicting the work functions of the testing set. The trained and tested SVM model was then adopted to estimate the work functions of the elemental semiconductors.
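The preprocessing described above can be sketched as follows. This is a minimal Python illustration rather than the MATLAB code used in this work: the text does not say which normalization was applied, so min-max scaling to [0, 1] is an assumption, as are the function names, the random seed and the synthetic placeholder rows standing in for the fifty-three data points.

```python
import random


def min_max_normalize(rows):
    """Scale every column of `rows` to the range [0, 1] (assumed normalization)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def train_test_split(rows, targets, train_fraction=0.8, seed=0):
    """Randomly split predictors and targets into training and testing sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * len(rows))
    train_idx, test_idx = indices[:cut], indices[cut:]
    train = ([rows[i] for i in train_idx], [targets[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [targets[i] for i in test_idx])
    return train, test


# Synthetic placeholder data with three predictor columns; NOT the real dataset.
placeholder_rows = [[float(i), float(i % 7), float(i % 3 + 1)] for i in range(53)]
placeholder_targets = [4.0 + 0.01 * i for i in range(53)]

normalized = min_max_normalize(placeholder_rows)
(train_X, train_y), (test_X, test_y) = train_test_split(normalized, placeholder_targets)
```

With fifty-three rows and an 8 to 2 ratio, this split rule yields 42 training rows and 11 testing rows.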

Working Principle of the Adopted SV Regression
Support vector regression adopts the kernel trick to generate a pattern that relates the predictors to the target and through which unknown targets are predicted. It works on the principle of artificial intelligence and its subfield called "machine learning" (that is, learning from experience). During training, SV regression takes some input parameters (such as the hyper-parameter λ, the regularization factor C, the kernel option and epsilon ε) with which the chosen kernel function generates a relation between the predictors and the target. The hyper-parameter guides SVR to adopt the hyper-plane that maximizes the margin while minimizing a parameter which has direct influence on likely errors between the actual and the predicted values (Gupta, 2007). The epsilon ε represents the maximum tolerable deviation of the predicted values from the actual values. During training, SV regression is allowed to see the target for each data point so as to adjust the generated function until the error between the predicted value and the actual (desired) value is acceptable. The user defines and adjusts the input parameters until the maximum obtainable correlation is achieved between the actual and the predicted values. The regularization factor C (defined by the user) measures the trade-off between minimizing the training error and minimizing the complexity of the model. The input parameters that give the optimum correlation are referred to as the optimum parameters; the SV regression system is then said to be well trained and can be tested before being used.
The trained system needs to be tested in order to ascertain its efficiency, accuracy and fitness. In this case, the target values are not shown to the trained system, and a high correlation between the actual and the predicted values measures the accuracy and efficiency of the system.
In the case of our trained system, accuracies of over 87% and 98% were obtained for the training and testing datasets respectively. The high performance obtained in testing gives confidence in the developed system, which is therefore adopted to predict the work functions of elemental semiconductors.
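The roles of ε and C described in this section can be made concrete with a small Python sketch. It evaluates the ε-insensitive loss for single predictions and the regularized objective that SV regression trades off (flatness term plus C times the summed loss). The function names and the numbers in the usage note are illustrative assumptions, not values from this work.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Deviations within epsilon cost nothing; larger ones cost the excess."""
    return max(0.0, abs(actual - predicted) - epsilon)


def svr_objective(weights, actuals, predictions, C, epsilon):
    """Flatness term 0.5 * ||w||^2 plus C times the total epsilon-insensitive loss."""
    flatness = 0.5 * sum(w * w for w in weights)
    penalty = C * sum(
        epsilon_insensitive_loss(a, p, epsilon) for a, p in zip(actuals, predictions)
    )
    return flatness + penalty
```

For instance, with ε = 0.1 a prediction of 5.05 against a target of 5.0 incurs no loss, whereas a prediction of 4.7 incurs a loss of about 0.2. Raising C makes such excess deviations more expensive relative to the flatness of the function.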

Optimum Parameter Search Strategy Using Polynomial Kernel
3.4.1 Regularization factor: The optimum value of each search parameter was sought in turn. For the polynomial kernel, the effect of the regularization factor is depicted in Figure 1.
The coefficient of correlation between the actual and the predicted work functions (while training the system) remains constant as the regularization factor increases. The optimum value at which our system attains its maximum performance was obtained as 1. Figure 2 shows no change in the performance of the system as the value of the hyper-parameter rises. The value of the hyper-parameter at which our developed system attains its optimum performance was recorded as 1E-7. The kernel option shows no influence on the performance of the system, as can be seen in Figure 4. Other kernels were not used because of their poor performance on the dataset at hand.
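A search of this kind, in which one parameter is varied while the others are held fixed and the trend of the correlation coefficient is recorded, might be sketched in Python as below. The helper `search_one_parameter` and the scoring function `toy_evaluate` are hypothetical: in practice the evaluation step would train the SV regression with the given parameters and return the coefficient of correlation, whereas the stand-in here only returns a made-up score so that the sketch runs on its own.

```python
def search_one_parameter(evaluate, base_params, name, candidates):
    """Vary one parameter, holding the rest fixed.

    Returns (best_value, best_score, trend), where trend lists the
    (value, score) pairs in the order tried. Ties keep the first candidate.
    """
    trend = []
    for value in candidates:
        params = dict(base_params, **{name: value})
        trend.append((value, evaluate(params)))
    best_value, best_score = max(trend, key=lambda pair: pair[1])
    return best_value, best_score, trend


def toy_evaluate(params):
    """Hypothetical stand-in score; a real version would train and test a model."""
    return 1.0 - abs(params["epsilon"] - 0.1)


base = {"C": 1.0, "epsilon": 0.1}
best_eps, best_score, eps_trend = search_one_parameter(
    toy_evaluate, base, "epsilon", [0.01, 0.1, 0.5]
)
best_C, flat_score, C_trend = search_one_parameter(
    toy_evaluate, base, "C", [1.0, 10.0, 100.0]
)
```

A flat trend, such as the one `C_trend` shows for this toy score, corresponds to the situation reported above in which the correlation coefficient does not change as a parameter increases.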

Performance Quality Measures
The accuracy of our developed model is characterized by a low root mean square error (rmse), a low absolute error (Ea) and a high coefficient of correlation (cc). These quantities were estimated from the following relations,
where W_act and W_pre stand for the actual and predicted work functions respectively, and n is the number of available data points.
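The relations themselves are not reproduced in the text above. As a hedged illustration, the Python sketch below uses the standard textbook definitions of these three quantities; treating Ea as the mean absolute error and cc as the Pearson correlation coefficient is an assumption, and the function names are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mean_absolute_error(actual, predicted):
    """Mean absolute error (assumed meaning of Ea)."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient (assumed meaning of cc)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)
```

Under these definitions, a constant offset between predicted and actual values leaves cc at 1 while rmse and Ea both equal the size of the offset, which is why the three measures are reported together.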

Results and Discussion
The actual and predicted work functions while training the system are depicted in Figure 5. The work functions estimated using the developed SVM are in relatively good agreement with the experimental values, as can be seen from the training graph. The system performs better during testing. The testing performance of the system is illustrated in Figure 6. The work functions of the elemental semiconductors (silicon and germanium) were estimated using the developed SVM model. Accuracies of over 87% and 98% were obtained for the training and testing phases respectively. The high accuracy obtained in testing gives a high level of confidence in applying the developed system to estimate the work functions of semiconductors.
The work functions estimated using SVM were compared with the experimental results as well as with the results from other theoretical models. Our results show excellent agreement with the experimental work functions in comparison with MPM (Olatunji, Selamat, & Raheem, 2011), as illustrated in Table 6. For instance, our developed model estimates 5.01 eV and 5.05 eV as the work functions of germanium and silicon, against experimental values of 5.0 eV and 4.85 eV respectively. The stabilized jellium model gives no data (Olatunji, Selamat, & Raheem, 2011; Kiejna, 1999) for silicon and germanium, while the metallic plasma model estimates their work functions as 4.34 eV and 3.98 eV respectively.
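As a quick check on the SVM figures quoted above, the relative deviation from experiment can be worked out directly. The choice of relative deviation as the measure is an illustrative assumption; it is not the coefficient of correlation used elsewhere in this work to report accuracy.

$$\text{Ge: } \frac{|5.01 - 5.0|}{5.0} \times 100\% = 0.2\%, \qquad \text{Si: } \frac{|5.05 - 4.85|}{4.85} \times 100\% \approx 4.1\%$$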

Figure 1. The trend of the coefficient of correlation with respect to the regularization factor for both training and testing datasets using the polynomial kernel

Figure 2. The trend of the coefficient of correlation with respect to the hyper-parameter for both training and testing datasets using the polynomial kernel

Figure 3. The trend of the coefficient of correlation with respect to epsilon for both training and testing datasets using the polynomial kernel

Figure 4. The trend of the coefficient of correlation with respect to the kernel option for both training and testing datasets using the polynomial kernel

Figure 5. Actual and predicted work functions while training our model

Figure 6. Actual and predicted work functions while testing our model

Table 2. Statistical analysis of the data set

Table 3. Correlation between each pair of the attributes of the data set (column headings: r and W_0; E_F and W_0; valence electron and W_0)

Table 6. Comparison of the work functions obtained from our model (SVM) with the experimental values and other theoretical models

The work functions of silicon and germanium were estimated using a trained and tested SV regression model. Comparison of our results with the experimental work functions shows excellent agreement. The SVM model gives outstanding results for elemental semiconductors, whereas previously known theoretical models obtained work functions that are far from the experimental values. Hence, the extension of this model is recommended so as to address the inaccuracies and difficulties in the experimental determination of the work functions of other semiconductor compounds.