Yamatani Activation: Edge Homogeneous Response Super Resolution Neural Network

In this research, I propose “Yamatani”, a two-variable activation function that satisfies first-degree homogeneity, and use it to realize a super-resolution convolutional neural network that is independent of the dynamic range and symmetric under luminance inversion.


Introduction
The performance of single-image super-resolution, which enlarges an image while maintaining sharpness, has improved significantly since the introduction of Convolutional Neural Networks (hereinafter called CNNs).
However, because of bias parameters, CNN-based super-resolution does not produce the same result for images with different dynamic ranges, and there is no guarantee that the same sharpening pattern will be output at edges with different luminance differences from their surroundings. Furthermore, when a nonlinear activation function such as ReLU or Leaky ReLU is used, luminance inversion symmetry is not guaranteed.
This study proposes "Yamatani", an activation function that takes two variables as inputs and satisfies homogeneity. The name is derived from the origami terms "Yamaori" (mountain fold) and "Taniori" (valley fold). With this activation function, it is possible to realize super-resolution that does not depend on the dynamic range, in which a homogeneous residual responds to the luminance difference between an edge and its surrounding pixels.

Homogeneity, Additivity and Linearity
This section reviews homogeneity, additivity, and linearity.

Definition
A function f satisfies n-th degree homogeneity if the following equation holds for any input vector v and scalar a.

$$f(a\mathbf{v}) = a^{n} f(\mathbf{v})$$

A function f satisfies additivity if the following equation holds for any input vectors u and v.

$$f(\mathbf{u} + \mathbf{v}) = f(\mathbf{u}) + f(\mathbf{v})$$

A function f is linear if it satisfies both first-degree homogeneity and additivity.

* github: https://github.com/tk-yoshimura/

A CNN cannot approximate arbitrary functions without a nonlinear activation function. However, a function that satisfies homogeneity but not additivity is still nonlinear. Among univariate activation functions, the only first-degree homogeneous function is the proportional function, which is equivalent to using a linear function as the activation function. That is, with an activation function of one variable, a CNN that does not depend on the dynamic range and satisfies inversion symmetry cannot be constructed. Therefore, I introduce a two-variable activation function that satisfies first-degree homogeneity.
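The univariate argument above can be checked numerically. The following is a minimal sketch, not part of the original work: the helper names `is_homogeneous` and `is_additive`, the sample values, and the slope `k` are illustrative assumptions. It shows that ReLU scales correctly only for positive factors, while a proportional function is homogeneous for all tested factors but is also additive, and therefore linear.

```python
def relu(x):
    return max(x, 0.0)

def proportional(x, k=0.5):
    # k is an arbitrary illustrative slope
    return k * x

def is_homogeneous(f, xs, scales, tol=1e-12):
    """True if f(a * x) == a * f(x) for every sample x and scale a."""
    return all(abs(f(a * x) - a * f(x)) <= tol for x in xs for a in scales)

def is_additive(f, xs, tol=1e-12):
    """True if f(u + v) == f(u) + f(v) for every pair of samples."""
    return all(abs(f(u + v) - (f(u) + f(v))) <= tol for u in xs for v in xs)

samples = [-2.0, -0.5, 0.0, 1.0, 3.0]

# ReLU is homogeneous for positive scales only
relu_positive_ok = is_homogeneous(relu, samples, [0.5, 2.0])
# ReLU fails under luminance inversion (a = -1)
relu_negative_ok = is_homogeneous(relu, samples, [-1.0])

# The proportional function passes for every scale, but it is also additive
prop_homogeneous = is_homogeneous(proportional, samples, [-1.0, 0.5, 2.0])
prop_additive = is_additive(proportional, samples)
```

Here `relu_negative_ok` is false, which is the failure of inversion symmetry that motivates moving to two input variables.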

Yamatani activation
This section describes the definition and properties of Yamatani activation.

Definition
Yamatani activation is defined by the following equation, which takes two variables x, y and a constant s ≥ 0 as inputs. This activation forms a curved surface, as shown in Figure 1.
The convolutional layer with Yamatani activation as its activation function is defined as follows. In the expression, v is the input vector, W_1 and W_2 are the weight parameters of the convolution, and * is the convolution operator. No bias parameter is used.
The convolutional layer satisfies the first-degree homogeneity for the input vector v.
Obviously, even when used as a residual connection, it satisfies homogeneity.
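The homogeneity of a bias-free two-branch layer can be illustrated with a small sketch. The defining equation is not reproduced in this text, so the `yamatani` function below uses an assumed form: a linear term s(x + y) plus a fold term that is nonzero only when x and y share a sign. The weights, the inputs, and the 1D setting are likewise hypothetical and chosen only for illustration.

```python
def yamatani(x, y, s=0.1):
    """ASSUMED form, not taken from the text: s * (x + y) plus a fold term."""
    if x > 0 and y > 0:
        fold = min(x, y)      # both positive: take the smaller value
    elif x < 0 and y < 0:
        fold = max(x, y)      # both negative: take the value closer to zero
    else:
        fold = 0.0            # mixed signs: only the linear term remains
    return s * (x + y) + fold

def conv1d(v, w):
    """'Valid' 1D sliding-window product with no bias term."""
    k = len(w)
    return [sum(w[j] * v[i + j] for j in range(k)) for i in range(len(v) - k + 1)]

def yamatani_layer(v, w1, w2, s=0.1):
    """yamatani(W1 * v, W2 * v) applied element-wise, without bias."""
    return [yamatani(x, y, s) for x, y in zip(conv1d(v, w1), conv1d(v, w2))]

def scaled(vec, a):
    return [a * t for t in vec]

def max_abs_diff(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

# Hypothetical weights and inputs
w1 = [0.5, -1.0, 0.25]
w2 = [0.25, -0.5, 0.5]
v = [1.0, 4.0, -2.0, 3.0, 0.5, -1.0]
u = [2.0, -1.0, 0.5, 1.0, -3.0, 2.0]

base = yamatani_layer(v, w1, w2)

# First-degree homogeneity: layer(a * v) should equal a * layer(v), also for a < 0
homogeneity_error = max(
    max_abs_diff(yamatani_layer(scaled(v, a), w1, w2), scaled(base, a))
    for a in (-1.0, 0.5, 3.0)
)

# Additivity: layer(u + v) versus layer(u) + layer(v)
sum_uv = [a + b for a, b in zip(u, v)]
additivity_error = max_abs_diff(
    yamatani_layer(sum_uv, w1, w2),
    [a + b for a, b in zip(yamatani_layer(u, w1, w2), base)],
)
```

Under these assumptions `homogeneity_error` stays at rounding level while `additivity_error` is clearly nonzero, so the layer is homogeneous yet nonlinear.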

Properties
When the convolutional layer using Yamatani as the activation function is applied recursively, the following surface is obtained, which can approximate any first-degree homogeneous multivariable function.

The symmetries that a super-resolution algorithm should satisfy include spatial symmetry and luminance symmetry.

Spatial symmetry means that rotating or flipping an image, performing super-resolution, and then undoing the rotation or flip gives the same result as performing super-resolution on the untransformed image (Figure 3).

A conventional method for satisfying the spatial symmetry described in the previous section is self-ensemble, which applies super-resolution to the 8 rotated and flipped versions of an image and takes the average [1].

The following super-resolution model using Yamatani activation is proposed. The model is based on DRRN [2]. The Yamatani-based Homogeneous Super Resolution Network (hereinafter called YHSRN) consists of three blocks: an entry block, a residual block, and a terminal block. In the convolutional layer of the entry block, the sum of the parameters over the spatial directions is constrained to 0, so that only difference values of the image are output. The Pixel Shuffler follows [3].

The following table shows the hyperparameters related to the structure of YHSRN. Generally, I is 1 for the Y component of the YUV color space, or 3 for all components of the RGB or YUV color space. The number of output channels of the terminal block is IS.

Bilinear interpolation is performed by the following equation so that the interpolation coefficients do not depend on the pixel position. In the expression, x, y are the pixel indices, S_{x,y} are the source image values, and D_{x,y} are the destination image values. A reference to a position outside the image is reinterpreted as the nearest position inside the image. Here, the case where the enlargement ratio S is 2 is described.
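The zero-sum constraint on the entry block described above can be sketched as follows. How the constraint is enforced during training is not stated in this text; subtracting the kernel mean, as `zero_sum` does here, is one simple assumption. The kernel and image values are hypothetical. The sketch shows that a zero-sum kernel gives the same response when a constant luminance offset is added, so only differences between pixels are passed on.

```python
def zero_sum(kernel):
    """Project a 2D kernel onto the zero-sum constraint by subtracting its mean
    (an assumed way of enforcing the constraint)."""
    flat = [w for row in kernel for w in row]
    mean = sum(flat) / len(flat)
    return [[w - mean for w in row] for row in kernel]

def conv2d_valid(image, kernel):
    """'Valid' 2D sliding-window product with no bias term."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[y + i][x + j]
                for i in range(kh) for j in range(kw))
            for x in range(w - kw + 1)
        ]
        for y in range(h - kh + 1)
    ]

def max_abs_diff(p, q):
    return max(abs(a - b) for pr, qr in zip(p, q) for a, b in zip(pr, qr))

# Hypothetical unconstrained kernel and test image
raw_kernel = [[1.0, 0.0, -1.0],
              [2.0, 0.5, 0.0],
              [1.0, 1.0, 0.0]]
kernel = zero_sum(raw_kernel)

image = [[1.0, 2.0, 4.0, 3.0],
         [0.0, 5.0, 2.0, 1.0],
         [3.0, 3.0, 6.0, 2.0],
         [1.0, 0.0, 2.0, 4.0]]
shifted = [[p + 10.0 for p in row] for row in image]  # constant luminance offset

kernel_sum = sum(w for row in kernel for w in row)
offset_error = max_abs_diff(conv2d_valid(image, kernel), conv2d_valid(shifted, kernel))
raw_offset_error = max_abs_diff(conv2d_valid(image, raw_kernel),
                                conv2d_valid(shifted, raw_kernel))
```

With the constrained kernel `offset_error` is at rounding level, whereas the unconstrained kernel shifts its response in proportion to the offset.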
To check the homogeneous response of YHSRN, the network is trained on a single image that has not been linearly transformed, and inference is then run on linearly transformed versions of the image to verify that super-resolution is still performed properly (Figure 6). The hyperparameters in Table 1 are set to I = 3, K_E = 7, C = 32, N = 3, K_R = 3, K_T = 3, S = 2. Edge padding is used as the padding method.
Figure 6: Functional verification

I use the Lenna image reduced to half size as the verification image. The image before reduction is used as the training target (Figure 7). The results are shown below (Figure 8). Next, the input image was multiplied by -1 and the output image by -1. The same image was obtained, confirming luminance inversion symmetry (Figure 9). Then the input image was multiplied by 0.5 and the output image by 2. Again the same image was obtained, confirming that the result does not depend on the dynamic range (Figure 10).
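The verification procedure above (scale the input, run the model, undo the scaling on the output, compare) can be written as a small test harness. This is a sketch under stated assumptions: `symmetry_error` is a hypothetical helper, and the two models are stand-ins rather than YHSRN. A 3x3 median filter with edge padding serves as a simple nonlinear, first-degree homogeneous operator, and a ReLU with bias serves as a counterexample.

```python
def median3x3(image):
    """3x3 median filter with edge padding (out-of-range indices are clamped).
    Stand-in for a homogeneous nonlinear model; this is NOT YHSRN."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = sorted(
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            )
            row.append(window[4])  # middle of 9 sorted values
        out.append(row)
    return out

def biased_relu(image, bias=0.1):
    """Counterexample: per-pixel ReLU with a bias term."""
    return [[max(p + bias, 0.0) for p in row] for row in image]

def symmetry_error(model, image, a):
    """Max |model(a * image) / a - model(image)| over all pixels."""
    scaled_in = [[a * p for p in row] for row in image]
    restored = [[q / a for q in row] for row in model(scaled_in)]
    reference = model(image)
    return max(abs(r - q)
               for rr, qr in zip(restored, reference)
               for r, q in zip(rr, qr))

# Hypothetical 3x3 luminance patch
image = [[0.2, 0.8, 0.5],
         [0.9, 0.1, 0.4],
         [0.3, 0.7, 0.6]]

# a = -1 tests luminance inversion, a = 0.5 tests dynamic range
median_inversion = symmetry_error(median3x3, image, -1.0)
median_range = symmetry_error(median3x3, image, 0.5)
biased_inversion = symmetry_error(biased_relu, image, -1.0)
biased_range = symmetry_error(biased_relu, image, 0.5)
```

The same harness could be pointed at any model callable that maps an image to an image; a model that is first-degree homogeneous should give errors at rounding level for both scale factors.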

Figure 3: Spatial Symmetry

Luminance symmetry means that applying a linear transformation to each luminance value of an image, performing super-resolution, and then applying the inverse transformation to the luminance values gives an image equivalent to the result for an image that was not linearly transformed (Figure 4).