Editorial:

In recent years, the widespread deployment of wireless sensor networks, cloud computing, 5G, robotics, embedded computing, and inexpensive sensors has advanced multimedia technologies and fostered a range of emerging applications. Future multimedia is a direct motivation for, and driver of, industrial upgrading. Supported by cognitive computing, one of the fundamental research areas and key techniques for implementing intelligent manufacturing, the Internet of Multimedia Things (IoMT) is becoming significantly smarter, and increasingly intelligent services and applications are emerging. The services of an intelligent IoMT integrated with cognitive computing can therefore be suggestive, prescriptive, or instructive in nature, and through deliberate design choices they can become more effective and influential, making a new class of problems computable.

IoMT has already shown great potential to change our lives, especially through ubiquitous sensing and sensory data, and cognitive IoMT technologies will make it possible to understand what is happening in the world more deeply. It is therefore necessary to address the technical challenges of designing, building, and deploying novel cognitive computing services and technologies that enable intelligent industrial IoMT services and applications.

This special issue features seven selected high-quality papers. The first article proposes a font repair method based on conditional generative adversarial networks (CGANs). It uses the content accuracy and style similarity of the repaired image as evaluation indices to assess how faithfully the style font is restored, and shows that the font content repaired by the CGAN-based method closely matches the correct content.

The second article, titled “ASDN: A Deep Convolutional Network for Arbitrary Scale Image Super-Resolution”, employs a Laplacian pyramid method to reconstruct any-scale high-resolution (HR) images using the high-frequency image details in a Laplacian Frequency Representation. For SR at small scales (between 1 and 2), images are constructed by interpolation from a sparse set of precalculated Laplacian pyramid levels. SR at larger scales is computed by recursion from small scales, which significantly reduces the computational cost. For a full comparison, fixed- and any-scale experiments are conducted on various benchmarks.
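The following is a minimal sketch of the any-scale strategy as described above, not the authors' implementation: `predict_details` is a hypothetical stand-in for the network that outputs the high-frequency residual at one of the precalculated Laplacian Frequency Representation levels, and the level count and bicubic resizing are assumptions.

```python
# Hedged sketch of any-scale SR via a Laplacian Frequency Representation (assumptions noted above).
import numpy as np
from scipy.ndimage import zoom  # cubic-spline resize as a bicubic stand-in

L = 11  # assumed number of precalculated levels covering scales in (1, 2]

def resize(img, scale):
    return zoom(img, (scale, scale, 1), order=3)  # img is H x W x C

def predict_details(lr_img, level):
    # Hypothetical placeholder for the network's high-frequency output at `level`.
    raise NotImplementedError

def sr_small_scale(lr_img, scale):
    """Scales in (1, 2]: blend the two nearest precalculated pyramid levels."""
    pos = (scale - 1.0) * (L - 1)                  # continuous position on the level grid
    lo, hi = int(np.floor(pos)), int(np.ceil(pos))
    w = pos - lo
    details = (1 - w) * predict_details(lr_img, lo) + w * predict_details(lr_img, hi)
    return resize(lr_img, scale) + resize(details, scale)

def sr_any_scale(lr_img, scale):
    """Scales > 2: recurse with x2 steps, then finish with one small-scale step."""
    img, s = lr_img, scale
    while s > 2.0:
        img = sr_small_scale(img, 2.0)
        s /= 2.0
    return sr_small_scale(img, s)
```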

In the next article, titled “Intelligent Handover Triggering Mechanism in 5G Ultra-Dense Networks via Clustering-based Reinforcement Learning”, the authors propose an intelligent handover triggering mechanism for user equipment (UE) based on a Q-learning framework and subtractive clustering. The input metrics are first converted to state vectors by subtractive clustering, which improves the efficiency and effectiveness of the training process. The Q-learning framework then learns the optimal handover triggering policy from the environment.
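A minimal, hedged sketch of this clustering-plus-Q-learning pipeline is given below; the clustering radius, reward design, action set, and hyperparameters are illustrative assumptions rather than the paper's settings.

```python
# Hedged sketch: subtractive clustering to discretise measurements, then tabular Q-learning.
import numpy as np

def subtractive_clustering(X, radius=0.5, n_centers=8):
    """Pick cluster centres by repeatedly selecting the point with highest 'potential'."""
    alpha = 4.0 / radius ** 2
    beta = 4.0 / (1.5 * radius) ** 2
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    potential = np.exp(-alpha * d2).sum(axis=1)
    centers = []
    for _ in range(n_centers):
        k = int(np.argmax(potential))
        centers.append(X[k])
        potential -= potential[k] * np.exp(-beta * ((X - X[k]) ** 2).sum(-1))
    return np.array(centers)

def to_state(x, centers):
    """Discretise a measurement vector (e.g. RSRP, SINR, UE speed) to its nearest centre."""
    return int(np.argmin(((centers - x) ** 2).sum(-1)))

# Tabular Q-learning over the clustered states; actions: 0 = stay, 1 = trigger handover.
n_actions, gamma, lr, eps = 2, 0.9, 0.1, 0.1
def q_update(Q, s, a, r, s_next):
    Q[s, a] += lr * (r + gamma * Q[s_next].max() - Q[s, a])

# Usage sketch (measurements, reward signal, and environment loop are assumed):
# centers = subtractive_clustering(measurement_history)          # offline
# Q = np.zeros((len(centers), n_actions))
# s = to_state(current_measurement, centers)
# a = np.random.randint(n_actions) if np.random.rand() < eps else int(np.argmax(Q[s]))
# ... apply the action, observe reward r and the next measurement ...
# q_update(Q, s, a, r, to_state(next_measurement, centers))
```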

The fourth article, titled “Region- and pixel-level multi-focus image fusion through convolutional neural networks”, proposes a region- and pixel-based method that recognizes focused and defocused regions or pixels from neighborhood information in the source images. The proposed method obtains satisfactory fusion results and achieves improved real-time performance.
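As a point of reference only, the sketch below illustrates pixel-level decision-map fusion with a classical focus measure (local Laplacian energy); it is a simple baseline for the general idea, not the CNN-based method the authors propose.

```python
# Hedged baseline: per-pixel focus decision map from local Laplacian energy.
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def focus_measure(img, window=9):
    """Per-pixel focus score: local energy of the Laplacian over a neighbourhood."""
    return uniform_filter(laplace(img.astype(np.float64)) ** 2, size=window)

def fuse_multifocus(img_a, img_b, window=9):
    """Pick, per pixel, the source image whose neighbourhood appears more in focus."""
    mask = focus_measure(img_a, window) >= focus_measure(img_b, window)
    return np.where(mask, img_a, img_b), mask
```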

Video deraining is an important problem in digital image processing and computer vision. The fifth article, “RoDeRain: Rotational Video Derain via Nonconvex and Nonsmooth Optimization”, proposes a novel rotational video deraining algorithm (RoDeRain) based on nonconvex and nonsmooth optimization. It removes rain streaks well not only in natural scenes but also in stochastic scenes.

The sixth article, titled “Multi-scale vehicle logo detector”, proposes a new approach called the multi-scale vehicle logo detector (SVLD), which is based on SSD. The method outperforms current detection methods by tuning the parameters of the preset boxes, changing the pre-training strategy, and adjusting the network structure.
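To illustrate what "parameters of the preset boxes" refers to in an SSD-style detector, here is a hedged sketch of default-box generation; the feature-map size, scale, and aspect ratios are illustrative values, not the settings tuned in the paper.

```python
# Hedged sketch of SSD-style preset (default) box generation for one feature map.
import numpy as np

def default_boxes(fmap_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Generate (cx, cy, w, h) default boxes, normalised to [0, 1]."""
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size
            for ar in aspect_ratios:
                w, h = scale * np.sqrt(ar), scale / np.sqrt(ar)
                boxes.append((cx, cy, w, h))
    return np.array(boxes)

# e.g. a 38x38 feature map with scale 0.1 (the small-object layer of an SSD300-style network)
boxes = default_boxes(fmap_size=38, scale=0.1)
```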

The last article, titled “RSANet: Towards Real-time Object Detection with Residual Semantic-guided Attention Feature Pyramid Network”, introduces a lightweight convolutional neural network called RSANet. RSANet consists of two parts: (a) a Lightweight Convolutional Network (LCNet) as the backbone, and (b) a Residual Semantic-guided Attention Feature Pyramid Network (RSAFPN) as the detection head. In LCNet, in contrast to recent lightweight networks that prefer pointwise convolution for changing the number of feature maps, the authors design a Constant Channel Module (CCM) to save Memory Access Cost (MAC) and a Down Sampling Module (DSM) to save computational cost.
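The arithmetic sketch below shows only the general principle behind keeping channel counts constant: for a 1x1 convolution with fixed FLOPs, MAC is minimised when the input and output channel counts are equal (the well-known ShuffleNetV2 guideline). It does not reproduce the internals of the CCM or DSM modules.

```python
# Hedged illustration: balanced channels reduce MAC for the same FLOPs (not the authors' modules).
def conv1x1_cost(h, w, c_in, c_out):
    flops = h * w * c_in * c_out                 # multiply-accumulates
    mac = h * w * (c_in + c_out) + c_in * c_out  # read input + write output + weights
    return flops, mac

# Same FLOPs, different channel ratios on a 56x56 feature map:
print(conv1x1_cost(56, 56, 128, 128))  # balanced channels    -> lower MAC
print(conv1x1_cost(56, 56, 64, 256))   # unbalanced (1:4 mix) -> same FLOPs, higher MAC
```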