Development Study of Deep Learning Facial Age Estimation

Human age estimation is a challenging problem with many age-related applications, such as age-specific movies and age-specific computer applications or websites. This paper gives a brief overview of the development of age estimation research using deep learning. We explore three recent journal papers that made significant contributions to age estimation using deep learning. These papers all adopt classification methods, and they show gradual improvement both in results and in the choice of loss function. The best result is an MAE (mean absolute error) of 2.8 years, and VGG-16 is the most frequently selected CNN architecture.


Introduction
Human age can be estimated from facial appearance. Our faces show a distinct pattern at each stage of life, so they differ greatly between stages such as childhood and adulthood. For the same person, photos taken in different years reveal the aging process on the face. The longer the interval is, the more obvious the changes will be. A typical pipeline of the existing methods for age estimation usually consists of two modules: age image representation and age estimation techniques [1]. Recently, deep learning schemes, especially Convolutional Neural Networks (CNNs), have been successfully employed for many tasks related to facial analysis. This paper aims to provide a brief description of some papers that have carried out age estimation research using CNNs or deep learning. We limit the discussion to a few papers published in journals or conferences in the last 5 years that became important milestones in age estimation work. This paper is organized as follows: in section 2, the age estimation algorithm is explained, and in section 3, we explain the CNN architecture.

Age Estimation Algorithm
A significant volume of research has been done on age estimation. This paper focuses on a few papers that contributed significant developments, and we explain each together with the estimation algorithm it uses. Three kinds of methods have been applied to age estimation: classification, regression and ranking. In the classification method, human age is assigned to one of several age groups. A weakness of the classification method is that it treats age groups as independent classes, so it does not exploit the important information shared between adjacent age groups. Regression methods address this weakness and appear to perform better. Ranking methods are a different approach to the same challenge. The first paper we examine is the work of Rothe [2], the winner of the LAP 2015 challenge [3] on apparent age estimation. Rothe's age estimation is a classification method. Their method, called DEX (Deep Expectation), uses VGG-16 [4] as the base CNN architecture. Fig. 1 shows the pipeline of the DEX method. The system takes a face image and classifies it with the CNN into 101 classes. These classes describe the possible age groups of the face image samples. They train the CNN for classification and, at test time, compute the expected value over the softmax-normalized output probabilities of the |Y| neurons.

International Journal of Applied Sciences and Smart Technologies
$$E(O) = \sum_{i=1}^{|Y|} y_i \, o_i$$

where $O = \{1, 2, \ldots, |Y|\}$ is the $|Y|$-dimensional output layer, $o_i \in O$ is the softmax-normalized output probability of neuron $i$, and $y_i$ is the age in years represented by class $i$. Their research achieved an MAE (mean absolute error) of 3.09 years, using IMDB-WIKI [2] as the training dataset and FG-NET [5] as the testing dataset.
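The expected-value step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mapping of the 101 classes to ages 0 to 100, and the synthetic logits are assumptions made for illustration only. It contrasts the plain classification decision (the most probable class) with the expected value over the softmax output, and includes the MAE metric used to report results.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw network outputs."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_age(probs, ages):
    """Expected value over the class probabilities: sum of age * probability."""
    return sum(y * o for y, o in zip(ages, probs))


def mean_absolute_error(predicted, actual):
    """MAE in years between predicted and ground-truth ages."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Assumption: the 101 classes stand for ages 0..100, one class per year.
AGES = list(range(101))

# Hypothetical logits that peak around age 30 (not real network output).
logits = [-((age - 30) ** 2) / 50.0 for age in AGES]
probs = softmax(logits)

age_argmax = AGES[probs.index(max(probs))]  # plain classification decision
age_expected = expected_age(probs, AGES)    # expected value over softmax output
```

With real network output the two estimates generally differ, because the expected value takes the whole probability distribution into account rather than only its peak.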
Similar research was conducted by Antipov [6], who also used VGG-16 as the base CNN architecture. They carried out the research with three kinds of age encoding (Fig. 2): (

Figure 2. Example of encoding [6]. denotes encoding result and  is a hyper parameter of LDAE.

Figure 3. An overview of the method proposed by Hu [7].

Facial age estimation has potential applications such as age-specific movies, vending machines for age-restricted products like tobacco and alcohol, and other age-specific computer applications or websites.

Here the symbols of the loss function denote, in order, the number of images in a batch, the number of age classes, the targets, and the predictions. The loss function refers to a Gaussian distribution.
The loss function is what differentiates Rothe [2] from Antipov [6], although both remain classification methods. Antipov's research achieved an MAE of 2.84 years using FG-NET as the testing dataset.
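A loss that refers to a Gaussian distribution can be sketched as follows. This is an illustrative assumption, not the exact formulation of [6]: the target age is encoded as a normalized Gaussian over the age classes, and the loss is taken to be the cross-entropy between targets and predictions, averaged over the images in the batch. The function names, the width `SIGMA`, and the 101-class layout are hypothetical.

```python
import math


def gaussian_label_distribution(age, num_classes, sigma):
    """Encode a scalar age as a normalized Gaussian over the age classes."""
    weights = [math.exp(-((k - age) ** 2) / (2.0 * sigma ** 2))
               for k in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]


def cross_entropy_loss(targets, predictions, eps=1e-12):
    """Batch mean of -sum_j t_ij * log(p_ij); eps guards against log(0)."""
    batch_size = len(targets)
    loss = 0.0
    for t_row, p_row in zip(targets, predictions):
        loss -= sum(t * math.log(p + eps) for t, p in zip(t_row, p_row))
    return loss / batch_size


NUM_CLASSES = 101  # assumption: one class per year, ages 0..100
SIGMA = 2.5        # hypothetical width of the Gaussian encoding

# Targets for two images whose true ages are 25 and 60.
targets = [gaussian_label_distribution(a, NUM_CLASSES, SIGMA) for a in (25, 60)]

# Stand-ins for network output: one batch close to the truth, one far from it.
good_predictions = [gaussian_label_distribution(a, NUM_CLASSES, SIGMA) for a in (26, 59)]
bad_predictions = [gaussian_label_distribution(a, NUM_CLASSES, SIGMA) for a in (40, 30)]

loss_good = cross_entropy_loss(targets, good_predictions)
loss_bad = cross_entropy_loss(targets, bad_predictions)
```

Under this encoding a prediction that is off by one year is penalized far less than one that is off by decades, which a one-hot classification target cannot express.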