Fast Video Encoding Algorithm for the Internet of Things Environment Based on High Efficiency Video Coding

Video traffic on the Internet is increasing, and video data transmission must account for real-time processing in the Internet of Things (IoT). In the IoT environment, video applications will therefore be a valuable approach in networks of smart sensor devices. High Efficiency Video Coding (HEVC) has been developed by the Joint Collaborative Team on Video Coding (JCT-VC) as a new-generation video coding standard. HEVC now includes range extensions (RExt), scalable coding extensions, and multiview extensions. HEVC RExt provides high resolution video with a high bit-depth and a rich set of color formats. In this paper, a fast intraprediction unit decision method is proposed to reduce the computational complexity of the HEVC RExt encoder. To design the intramode decision algorithm, the Local Binary Pattern (LBP) of the current prediction unit is used as a texture feature. Experimental results show that encoding complexity can be reduced by 12.35% on average in the AI-Main profile configuration with only a small bit-rate increment and PSNR decrement, compared with the HEVC test model (HM) 12.0-RExt4.0 reference software.


Introduction
The Internet of Things (IoT) is a sensing network that connects any object with the Internet using many kinds of sensor equipment. Along with the rapid development of IoT applications, new generations of mobile broadband networks, cloud computing, and video coding technology for real-time video streaming all represent an interactive and realistic development direction for next-generation multimedia application networks. Such applications will play a valuable role in industrial, medical, and television fields [1][2][3].
MPEG has already started standardization activities to define network protocols for the Internet of Things (e.g., how to connect things). The variety and heterogeneity of "Things" make it difficult to standardize descriptions, data formats, and APIs in a global manner; however, once the environment is well established, this can be done. MPEG is therefore exploring representations of multimedia things as part of complex distributed systems involving interaction between things and between humans and things. The multimedia data type elements correspond to descriptions of devices and messages for "talking to" and "adapting to" either devices or services in the Internet of Things.
Recently, video communication services have shifted from lower resolution video to ultrahigh definition (UHD) video formats. Mobile device, storage, and network technologies are striving to keep pace with rapid changes in the market. Modern data compression techniques make it practical to store and transmit the significant amounts of data that UHD video content entails, since UHD video has a very large data rate. Existing video compression technology is used in many applications: broadcasting high definition (HD) TV signals over satellite, cable, and terrestrial transmission systems; video content acquisition and editing systems; camcorders; security applications; Internet and mobile network video; Blu-ray discs; and real-time conversational applications, such as video chat, video conferencing, and telepresence systems, for lower resolution video sequences.
However, the growing popularity of HD video, the increasing diversity of services, and the emergence of beyond-HD formats (4k × 2k or 8k × 4k resolution, called UHD) require video coding with efficiency superior to previous compression standards. Moreover, the traffic caused by video applications targeting mobile devices and tablet PCs, together with the transmission requirements of video-on-demand services, is imposing severe pressure on existing networks. An increased desire for higher quality and better resolution is also driving mobile applications.
H.264/MPEG-4 AVC [4] is still widely used in many applications, both real-time and non-real-time. However, this standard suffers a bit-rate increment and significant computational complexity for beyond-HD resolution applications.
A next-generation video coding scheme, called High Efficiency Video Coding (HEVC), was developed by the Joint Collaborative Team on Video Coding (JCT-VC) of ISO/IEC MPEG and ITU-T VCEG [5]. HEVC version 1 had the primary goal of achieving a roughly 50% higher compression rate than H.264/MPEG-4 AVC, with a primary focus on 8-bit/10-bit YUV 4:2:0 video. Although this standard supports a high compression rate using improved and modified coding tools, HEVC still requires a large amount of time for compression.
Extensions of HEVC are being developed to support several additional application scenarios, including professional uses with enhanced precision and color format support, scalable video coding, and 3D/stereo/multiview coding. Among these extensions, the HEVC range extension (RExt) provides a high bit-depth (larger than 10 bits) and different color formats for high resolution sequences.
HEVC RExt has the same structure as HEVC, but additional coding tool options have been added to support more than 10 bits per sample and different color formats. The 4:2:2 and 4:4:4 enhanced chroma sampling structures and sample bit-depths beyond 10 bits per sample are supported [6].
UHD resolution is expected to emerge in the near future and will be supported by next-generation displays. This kind of data rate increase will put additional pressure on all types of networks, and data rates for video content are increasing faster than network infrastructure capacities for economical delivery. HEVC and its extensions provide good coding performance at the cost of large computational complexity, because heavy and complicated coding tools are used to improve coding efficiency and to support deep color formats and a high bit-depth.
To reduce the computational complexity of the HEVC RExt encoder, a fast intramode decision algorithm is proposed based on block texture information. This paper is organized as follows: in Section 2, the HEVC structure and related works are introduced. Local Binary Patterns (LBPs) and the LBP-based fast intramode decision method of the proposed algorithm are described in Section 3. Section 4 presents the coding performance of the algorithm, and Section 5 presents concluding remarks.

HEVC Encoding Structure and Related Works
The HEVC standard has adopted highly flexible and efficient block partitioning based on the coding tree unit (CTU). Three block units are defined: the coding unit (CU), prediction unit (PU), and transform unit (TU). The CU is the basic block type, analogous to the macroblock of H.264/AVC. The PU is used for the coding mode decision, including motion estimation and rate-distortion (RD) optimization (RDO). Transform and entropy coding are performed on the TU. Initially, a frame is divided into blocks of the largest CU size, each called a coding tree unit (CTU). The CTU consists of coding tree blocks (CTBs): one luma block and two chroma blocks. Each CTB is an assemblage of square coding blocks (CBs) that are divided based on a quadtree structure. Each CB is square, and its size can be 8, 16, 32, or 64. This design is more effective and beneficial than the conventional H.264 method, which used a 16 × 16 macroblock (MB); a larger and more flexible block structure is effective for encoding high resolution video.
The CTU size is 2N × 2N, where N is 32, 16, or 8. A CTU can contain a single CU with a 2N × 2N dimension, or it can be split into four smaller CUs of equal size (N × N). Each CU is then recursively split into four CUs based on a quadtree structure.
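The recursive quadtree split described above can be sketched in a few lines of Python (a minimal illustration; `decide_split` is a hypothetical stand-in for the encoder's RD-cost-based split decision, not the HM reference software API):

```python
def enumerate_cu_sizes(ctu_size=64, min_cu_size=8):
    """List every CU size that can occur inside one CTU
    (64, 32, 16, 8 for a 64x64 CTU)."""
    sizes = []
    size = ctu_size
    while size >= min_cu_size:
        sizes.append(size)
        size //= 2
    return sizes

def split_cu(x, y, size, decide_split, leaves, min_cu_size=8):
    """Recursively split the CU at (x, y): each 2N x 2N CU either
    stays whole or becomes four N x N sub-CUs, down to 8x8."""
    if size > min_cu_size and decide_split(x, y, size):
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                split_cu(x + dx, y + dy, half, decide_split, leaves, min_cu_size)
    else:
        leaves.append((x, y, size))
```

Splitting everything yields the 64 smallest 8 × 8 leaves for a 64 × 64 CTU, while splitting nothing leaves the single 2N × 2N CU.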
Each CB is predicted based on an intra- or interprediction process that is performed in the PU. The intraprediction process uses two partition modes (PART_2N×2N and PART_N×N) based on the encoded size of the PU. Figures 1(a) and 1(b) show the different intraprediction directions and modes of HEVC and H.264/AVC. In order to improve coding efficiency, HEVC uses 35 intraprediction modes for PU sizes from 4 × 4 to 64 × 64. The H.264/AVC standard used only 9 and 4 intraprediction modes for 4 × 4 blocks and 16 × 16 macroblocks, respectively. The increased number of intraprediction modes increases the computational complexity of HEVC compared with H.264.
In order to reduce the time required for intraprediction coding, Yoo and Suh [8] proposed an early termination algorithm for inter- and intra-PUs that checks the coded block flag (CBF) value and the RD cost of the inter-PU. If conditions on these two values are satisfied, the remaining inter-PU and intra-PU evaluations are skipped. A two-stage prediction unit size decision method has also been presented [9], in which texture complexities are analyzed according to the video content using variance in order to filter out unnecessary PUs; for intraprediction coding, small PU sizes to skip are then selected based on the PU sizes of the encoded upper-left, upper, and left blocks. Other fast algorithms use dominant edge information [10] or a subset of tree-level PUs [11].
Cho and Kim [12] proposed fast CU splitting and pruning methods based on Bayes decision rules in order to reduce the computational complexity of the HEVC intraprediction process. Fast intraprediction approaches based on gradients have also been used [7, 13, 14]. Wang and Siu [15] reported an adaptive intramode skipping algorithm and signaling processes using statistical properties of reference samples. An intramode decision strategy that arranges candidate modes into different groups using the notation of a circle has been presented [16]. For HEVC RExt, an advanced color Table and Index Map (cTIM) [17], intrablock copy (IntraBC) [18], and angular prediction with a weight function and a modification filter based on a blending filter for DC mode [19] have been proposed.

Local Binary Patterns (LBPs)
The intraprediction process has usually been analyzed using image texture information. Local Binary Pattern (LBP) features were originally designed for texture description [20]. The LBP operator transforms an image into an array or image of integer labels describing the small-scale appearance of the image. These labels, or their statistics, most commonly in the form of a histogram, are then used for further image analysis. This approach has advantages such as gray-scale invariance and normalization. The LBP represents texture information with negligible time consumption because the LBP operator is simple to calculate.
The LBP operator is based on the assumption that texture has two locally complementary aspects: a pattern and a pattern strength. In H.264/AVC, the LBP has been used to extract moving objects with motion vectors and to exploit edge information in the motion estimation process [21][22][23]. The pixels in a block area are thresholded against the center pixel value, multiplied by powers of two, and then summed to obtain a label for the center pixel. If the neighborhood consists of 8 pixels, a total of 2^8 = 256 different labels can be obtained, depending on the relative gray values of the center and the neighborhood pixels.
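The thresholding just described can be written down directly for the common 3 × 3 (8-neighbor) case (a small sketch, not the paper's implementation):

```python
def lbp_3x3(block):
    """8-neighbor LBP of the center pixel of a 3x3 block: threshold
    each neighbor against the center, weight the resulting bits by
    powers of two, and sum them into a label in [0, 255]."""
    center = block[1][1]
    # neighbors visited in a fixed circular order around the center
    coords = [(0, 0), (0, 1), (0, 2), (1, 2),
              (2, 2), (2, 1), (2, 0), (1, 0)]
    label = 0
    for p, (r, c) in enumerate(coords):
        if block[r][c] >= center:  # bit is 1 when neighbor >= center
            label |= 1 << p
    return label
```

A flat block yields the all-ones label 255, while an isolated bright center pixel yields 0, matching the 2^8 = 256 possible labels.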
Circular symmetric neighbor sets for different (P, R) are illustrated in Figure 2. P is the number of neighboring pixels and R is the radius of the circle. By combining different values of P and R, a variety of LBP sets can be composed. Let g_p denote the gray value of a sampled pixel in an evenly spaced circular neighborhood of P sampling points of radius R around the point (x_c, y_c). I(x, y) and g_c denote the image of a frame and the gray level of the center position, respectively. The coordinates (x_p, y_p) are given by

(x_p, y_p) = (x_c − R sin(2πp/P), y_c + R cos(2πp/P)).  (1)

For analysis of local texture patterns, the joint distribution of differences with spatial characteristics can be modeled as

T ≈ t(g_0 − g_c, g_1 − g_c, ..., g_{P−1} − g_c).  (2)

The analyzed LBP is a discriminative pattern that captures the differences between the neighborhood pixels and the center pixel. The LBP code can represent a bright/dark spot, flat areas, edges, edge ends, and curves; the differences are zero in a constant region.
Equation (3) gives the binary bit value calculated at the pth neighbor. Let s(g_p − g_c) denote the binary bit value for the neighboring pixel intensity g_p, where g_c is the pixel value at the center position (0, 0), and the (x_p, y_p) coordinates of the circular neighbor set are given by (−R sin(2πp/P), R cos(2πp/P)):

s(x) = 1 if x ≥ 0, and s(x) = 0 otherwise.  (3)
The LBP operator when P is 8 and R is 1 is shown in Figure 2(b). The binary bits calculated with (3) as the thresholding function are combined into a bit stream and transformed into an integer pattern number using (4). For example, Figure 3 illustrates the LBP(8,1) operator:

LBP_{P,R} = Σ_{p=0}^{P−1} s(g_p − g_c) · 2^p.  (4)

Many texture analysis applications require invariance or robustness to rotations of the input image. LBP_{P,R} patterns are obtained by circularly sampling around the center pixel. Most Local Binary Patterns in natural images are uniform, and using uniform patterns provides statistical robustness. Local primitives detected by the LBP include spots, flat areas, edges, edge ends, and curves. Figure 4 illustrates examples with the LBP(8,R) operator, in which ones are represented as gray circles and zeros as white circles. The LBP distribution can successfully be used to recognize a wide variety of textures, to which statistical and structural methods have normally been applied separately.
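The "uniform pattern" notion mentioned above, a circular bit string with at most two 0/1 transitions (covering spots, flat areas, edges, edge ends, and curves), can be checked as follows (illustrative sketch):

```python
def is_uniform(pattern, bits=8):
    """True when the circular binary string of `pattern` contains at
    most two 0/1 transitions, i.e., the pattern is 'uniform'."""
    transitions = 0
    for p in range(bits):
        b_cur = (pattern >> p) & 1
        b_next = (pattern >> ((p + 1) % bits)) & 1
        if b_cur != b_next:
            transitions += 1
    return transitions <= 2
```

For example, 00001111 (an edge) is uniform, while the alternating pattern 01010101 has eight transitions and is not.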
The texture of the current PU can be identified as a discriminative texture using the LBP. In the HEVC encoder, interpolation is used to apply the LBP model at the coordinate locations of the neighboring node sets; the sample value at a noninteger position is estimated from the four nearest integer-position pixels:

Î(x_p, y_p) = (1 − Δx)(1 − Δy) I(x_0, y_0) + Δx(1 − Δy) I(x_0 + 1, y_0) + (1 − Δx)Δy I(x_0, y_0 + 1) + ΔxΔy I(x_0 + 1, y_0 + 1),  (5)

where (x_0, y_0) = (⌊x_p⌋, ⌊y_p⌋) and (Δx, Δy) = (x_p − x_0, y_p − y_0). Therefore, LBP(P,R) is calculated using g_p = Î(x_p, y_p) from (5) as the neighborhood pixel values for the application and analysis of local texture information in the HEVC block structure. For the analysis of binary texture information against intraprediction modes, the probability distribution of binary patterns occurring in test sequences of natural video content is analyzed first. Next, the relationship between the bit patterns that occur more frequently than others and the encoded intramodes is analyzed. The probability distributions of patterns and modes in four sequences with 20 frames, using LBP texture information, are shown in Figures 6 and 7.
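The circular sampling with interpolation can be sketched as follows (an illustrative assumption that bilinear interpolation over the four surrounding pixels is used; `img` is a row-major list of rows, indexed as `img[y][x]`):

```python
import math

def sample_circle(img, xc, yc, P=8, R=1.0):
    """Sample P points on a circle of radius R around (xc, yc);
    non-integer positions are estimated by bilinear interpolation
    of the four surrounding pixels (indices clamped to the image)."""
    h, w = len(img), len(img[0])
    vals = []
    for p in range(P):
        x = xc - R * math.sin(2.0 * math.pi * p / P)
        y = yc + R * math.cos(2.0 * math.pi * p / P)
        x0 = min(max(int(math.floor(x)), 0), w - 2)
        y0 = min(max(int(math.floor(y)), 0), h - 2)
        fx, fy = x - x0, y - y0
        v = (img[y0][x0] * (1 - fx) * (1 - fy)
             + img[y0][x0 + 1] * fx * (1 - fy)
             + img[y0 + 1][x0] * (1 - fx) * fy
             + img[y0 + 1][x0 + 1] * fx * fy)
        vals.append(v)
    return vals
```

On a constant region every interpolated sample equals the constant, so all differences g_p − g_c are zero, consistent with the flat-area pattern described above.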
The most probable patterns, that is, those with high distribution rates in the LBP statistics of the sequences, are prepared in advance in a look-up table.
The mode distribution for the different most probable patterns is shown in Figure 7; these patterns are consistently associated with the Intra Planar, Intra DC, and vertical modes (0, 1, and 26). Therefore, for blocks matching the most probable patterns, only the DC, Planar, and 26 modes are tested, enabling a quick mode decision using texture information.

The Overall Procedure of the Proposed Fast Scheme
To support high speed encoding, the intramode prediction process in the original HM-12.0-RExt4.0 is performed using a rough mode decision (RMD) and most probable modes (MPMs). Both RMD and MPM contribute to speeding up the intraprediction process. Intraprediction selects the N best candidate modes via RMD, in which all modes are tested based on the minimum sum of absolute Hadamard-transformed coefficients of the residual signal (HSAD) plus the number of mode bits. The number of N best RMD candidates is 8 for 4 × 4 and 8 × 8 PUs and 3 for 16 × 16, 32 × 32, and 64 × 64 PUs. Full RD optimization is applied only to the N + MPM candidates. However, the computational load on the encoder is still high. The overall procedure of the proposed fast mode decision scheme is illustrated in Figure 8 and proceeds as follows.
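The RMD candidate-count rule above can be captured in a one-liner (a sketch of the HM behavior as described here, not the reference software itself):

```python
def num_rmd_candidates(pu_size):
    """N-best candidates kept after rough mode decision (RMD) in HM:
    8 for 4x4 and 8x8 PUs, 3 for 16x16, 32x32, and 64x64 PUs."""
    return 8 if pu_size <= 8 else 3
```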
Step 1. Initially, the LBP is calculated for the current encoded PU.
Step 2. If the LBP is included in the most probable patterns, which are already defined in the look-up table, go to Step 3. Otherwise, go to Step 4.
Step 3. Prediction is performed only three times, for the candidate mode set {0, 1, 26}. Next, go to Step 5.
Step 4. Prediction is performed for the number of modes determined by the PU size. Go to Step 5.
Step 5. The best mode is selected with the minimum RD cost.
In the proposed scheme, the MPM condition is applied based on the most probable LBPs in the look-up table. If the local texture pattern of the LBP of the encoded block satisfies the condition, the intraprediction process is performed only three times, for the three modes Intra Planar (0), Intra DC (1), and vertical (26).
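Steps 1-5 above can be sketched as follows (the look-up table entries and the RD-cost callback are hypothetical placeholders for illustration, not the HM-12.0-RExt4.0 API):

```python
MOST_PROBABLE_PATTERNS = {0, 255}   # illustrative look-up table entries
FAST_MODES = [0, 1, 26]             # Intra Planar, Intra DC, vertical

def candidate_modes(lbp_label, full_modes):
    """Step 2: if the PU's LBP label is a most probable pattern,
    restrict the search to the three fast modes (Step 3);
    otherwise keep the full candidate list (Step 4)."""
    if lbp_label in MOST_PROBABLE_PATTERNS:
        return FAST_MODES
    return list(full_modes)

def best_mode(lbp_label, full_modes, rd_cost):
    """Step 5: choose the candidate with minimum RD cost."""
    return min(candidate_modes(lbp_label, full_modes), key=rd_cost)
```

The speed-up comes from Step 3: when the table matches, only 3 of the 35 HEVC intra modes are evaluated.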

Experimental Results
The proposed fast scheme was implemented in HM-12.0-RExt4.0 (the HEVC RExt reference software). The test configuration was all-intra, using AI-Main. For wireless video communication, the IPPP structure, in which one I frame is followed by P frames, was usually employed in the past. Recently, wireless video communication has been required to support high resolution video services due to rapid advances in network technology. To provide better quality than the IPPP structure, an all-intra structure should be used. Standard sequences with 50 frames were used, with three to four sequences per class and the quantization parameter (QP) range (12, 17, 22, and 27) defined by the super-high tier (SHT) [24]. Test sequences were classified by color format into RGB 4:4:4, YCbCr 4:4:4, and YCbCr 4:2:2. Each class had a 1920 × 1080 resolution. Details of the encoding environment can be found in JCTVC-N1006 [24].
To evaluate performance, ΔBit, ΔPSNR_Y, and ΔTime were measured, where ΔTime is a complexity comparison factor indicating the total encoding time saving:

ΔTime = (Time(HM) − Time(proposed)) / Time(HM) × 100 (%).  (8)

In (8), Time(m) indicates the total time consumed by method m for encoding.
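Assuming ΔTime measures the percentage saving relative to the reference encoder (a plausible reading of (8), not a verbatim reproduction of the paper's formula), the metric reduces to:

```python
def delta_time(time_hm, time_proposed):
    """Encoding-time saving in percent: positive values mean the
    proposed encoder is faster than the HM reference."""
    return (time_hm - time_proposed) / time_hm * 100.0
```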
For sequences with YCbCr 4:4:4 (Table 2), the proposed algorithm achieved a BDBR loss rate of 1.87% with a bit increment of 0.9% and a 0.08 dB PSNR decrement, on average. An 11.78% speed-up gain was achieved for the YCbCr 4:4:4 sequences, for which the performance of [7] is also reported. The proposed algorithm achieved a speed-up gain of up to 16.14% with a smaller bit increment for the Seeking sequence with QP = 12 and YCbCr 4:2:2. For the DucksAndLegs sequence with QP = 22 and YCbCr 4:2:2, Jiang's algorithm achieved a speed-up factor of up to 12.18% with a smaller bit-rate loss. The bit-rate increment was larger for nonnatural sequences and videos with many moving objects, such as the EBUGraphics sequence. Using the proposed approach, an average speed-up gain of over 12.35% was obtained compared with the original intraprediction process, with a negligible bit-rate increment. The gradient-based fast intramode decision algorithm of Jiang et al. [7] achieved an encoding speed-up gain of 8.98% on average. Although Jiang's algorithm gives better BDBR performance than the proposed method, the proposed method reduces more encoding time without significant performance degradation relative to the gradient-based scheme [7].
The RD performance for the test sequences, classified by color and line style, over the SHT QP range in AI-Main is shown in Figure 9. The largest loss of image quality relative to the original HM encoder was observed for the EBUGraphics sequence (up to 0.22 dB on average). However, the proposed method maintained high quality above 40 dB when the QP value was less than or equal to 22, and the video quality was greater than or equal to 50 dB when the QP value was set to 12, except for the OldTownCross sequence. Furthermore, the proposed fast intramode decision scheme supported rapid compression with only a small Y-PSNR loss. The proposed algorithm is thus efficient for use in a real-time encoding system, without significant degradation of encoding performance, for large video resolutions, deep color formats, and high bit-rate sequences.

Conclusions
A fast intramode decision scheme has been proposed for high resolution video with a high bit-depth and a rich color format. The proposed algorithm achieved a 12.35% time saving and a BDBR value of 1.96%, on average, over the original HM-12.0-RExt4.0 software, with a Y-PSNR loss of 0.12 dB and a 0.81% bit increment. The proposed algorithm is well suited to the Internet of Things (IoT) environment with real-time constraints and can be useful for real-time HEVC video encoding systems that must maintain video quality.

Figure 2: Different sets for Local Binary Patterns. P is the number of neighboring pixels and R is the radius of the circle.

Figure 6: Probability distribution of pattern numbers for various sequences.

Figure 8: The overall procedure of the proposed algorithm.

Figure 9: RD performance of the original standard and the proposed method. The two methods exhibited similar peaks in the graphs (Figures 9(a) and 9(b)), and the Seeking sequence with YCbCr 4:2:2 (Figure 9(c)) showed a negligible loss of bit-rate and quality.
Procedure of the Proposed Fast Scheme. The complexity of HEVC is significantly higher than that of H.264/AVC because of its improved encoding efficiency. Consequently, HEVC requires a faster coding process as well as a guarantee of efficient compression; to this end, HEVC includes fast encoding tools in each of the prediction, transform, and filtering processes.