1 Introduction

Recent advances in wired and wireless networks, networking and computing technologies, along with the emergence of multimedia applications and systems, are changing our life experience. These technological changes are creating a multimedia era, where customers can access and produce audio and video content ubiquitously and cost-effectively, while content providers explore new ways to increase their revenues. The distribution of multimedia services, such as High-Definition TV and public video surveillance content, over heterogeneous systems raises special network requirements in terms of delay, jitter, and loss tolerance, and must also meet the needs of end-users in terms of visual perception and satisfaction.

Quality of Service (QoS) support is crucial for successful multimedia systems. Existing wired (Differentiated Services (DiffServ)), wireless (IEEE 802.11e and IEEE 802.16e) and cellular (Universal Mobile Telecommunication System (UMTS)) QoS models offer different forwarding behaviors, control operations, and measurement schemes for multimedia packets. QoS metrics, such as packet loss rate, packet delay and throughput, are typically used to indicate the impact on the audio-video quality level from the network’s point of view, but do not reflect the user’s experience. Consequently, pure network-based QoS approaches fail to capture subjective aspects associated with human perception.
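To make the network-level view concrete, the following minimal sketch computes the three QoS metrics named above from a per-packet trace. The `PacketRecord` type and the `qos_summary` function are hypothetical names introduced only for this illustration, and the throughput window (first transmission to last reception) is an assumption rather than a standardized definition.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PacketRecord:
    """Hypothetical per-packet trace entry (not taken from any QoS model)."""
    size_bytes: int
    sent_at: float                 # send timestamp in seconds
    received_at: Optional[float]   # receive timestamp, or None if the packet was lost


def qos_summary(packets: List[PacketRecord]) -> Dict[str, float]:
    """Compute loss rate, mean one-way delay and throughput for one flow."""
    if not packets:
        raise ValueError("at least one packet record is required")
    received = [p for p in packets if p.received_at is not None]
    loss_rate = 1.0 - len(received) / len(packets)
    if not received:
        return {"loss_rate": loss_rate, "mean_delay_s": float("nan"), "throughput_bps": 0.0}
    delays = [p.received_at - p.sent_at for p in received]
    # Assumed observation window: first transmission to last reception.
    window = max(p.received_at for p in received) - min(p.sent_at for p in packets)
    delivered_bits = 8 * sum(p.size_bytes for p in received)
    return {
        "loss_rate": loss_rate,
        "mean_delay_s": sum(delays) / len(delays),
        "throughput_bps": delivered_bits / window if window > 0 else 0.0,
    }
```

Note that the summary carries no information about which packets were lost or what they contained, so two flows with identical values may be perceived very differently by a viewer.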

In order to support more accurate control and measurement of multimedia quality, novel user-aware and multimedia-aware approaches are required to increase the customer’s satisfaction while simultaneously improving the usage of network resources. Quality of Experience (QoE) [20, 33] schemes have been introduced to overcome the limitations of current QoS-aware solutions regarding multimedia coding unawareness, human perception, and subjective-related aspects. QoE assessment mechanisms and techniques show how a networking environment meets the viewer’s specific requirements [30], while QoE control schemes optimize network resources and improve the user perception [27] simultaneously.

The QoE applicability scenarios, requirements, evaluations, and assessment methodologies in multimedia systems have been investigated by several researchers and working groups, such as the International Telecommunication Union – Telecommunication Standardization Sector (ITU-T) [41], Video Quality Experts Group (VQEG) [16] and European Technical Committee for Speech, Transmission, Planning, and Quality of Service (ETSI STQ) [10]. QoE implementations can be found in Peer-to-Peer (P2P) networks [23], fixed wired and wireless systems [22] and are also attracting a lot of attention in mobility scenarios [3, 9].

The recent achievements made in multimedia and networking areas are key drivers enabling the deployment of new user-based QoS/QoE-sensitive services as well as providing new paradigms for the creation of new protocols, user perception metrics, routing approaches, mobility controllers and Content Delivery Networks (CDNs). This article presents some of the recent advances in multimedia networking with a focus on multimedia QoS, QoE and user perception, CDNs, Fourth Generation (4G) systems, mobility, and related standardization issues. Furthermore, this article also identifies some of the main challenges that still need to be addressed for future multimedia networking to become truly ubiquitous.

The rest of this paper is organized as follows. In Section 2, the Quality of Experience perception for multimedia users is introduced. Section 3 presents recent advances in content delivery networks. Section 4 discusses the distribution of multimedia content in fourth generation networks. In Section 5, we present recent results in mobile multimedia systems. Section 6 summarizes some important challenges for future multimedia networking. We make some concluding remarks in Section 7.

2 Quality of Experience (QoE) for multimedia users

Advances in multimedia user-centric systems are supporting new video quality level measurement schemes, as well as reduced-reference and no-reference metrics, in which the perception of QoE is deeply application-dependent. Several studies (discussed in detail in [30]) on video assessment models for future multimedia systems have focused on Standard-Definition (SD) and especially on High-Definition (HD) content. It is expected that a large amount of HDTV data will be distributed over heterogeneous environments in the near future.

QoE schemes developed for SD resolution will most likely fail in the case of HD quality broadcasting (unless the video assessment model is at least re-trained). Furthermore, there are current applications (such as 3DTV) in which several additional factors affect multimedia perception, and applications (such as Closed-circuit TV (CCTV)) in which quality factors are not necessarily strictly related to the actual viewer experience. Considering different types of multimedia applications for future networks (including resolutions, acquisition conditions, and other factors), various video assessment models are commonly developed independently, making them unsuitable for different usage scenarios.

Currently, the most influential working group in the video quality area is the Video Quality Experts Group (VQEG) [41]. This group runs test plans that aim at selecting the best quality models that could then be turned into ITU Recommendations. Among the recent VQEG achievements, it should be mentioned that an HDTV test plan [42] has just finished and the final report is expected to be published at the end of 2010. The goal of the HDTV test plan was to analyse the performance of models suitable for use in digital video quality measurement for HDTV applications.

A secondary goal of the HDTV test plan was to develop HDTV subjective datasets that may be used to improve HDTV objective models. The performance of objective models with HD signals was determined by comparing viewer ratings of a range of video samples, obtained in controlled subjective tests, with the quality predictions from the submitted models [43].
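The comparison of subjective ratings with model predictions can be illustrated with two generic agreement measures, the Pearson correlation coefficient and the root-mean-square error. The sketch below is an assumption about how such a comparison might be coded and is not the evaluation procedure of the VQEG test plan; the function names and any input scores are hypothetical.

```python
import math
from typing import Sequence


def pearson(subjective: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson linear correlation between viewer ratings and model predictions."""
    n = len(subjective)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least two samples")
    mean_s = sum(subjective) / n
    mean_p = sum(predicted) / n
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(subjective, predicted))
    var_s = sum((s - mean_s) ** 2 for s in subjective)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_s == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_s * var_p)


def rmse(subjective: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error of the predictions against the viewer ratings."""
    if len(subjective) != len(predicted) or not subjective:
        raise ValueError("need two equally long, non-empty sequences")
    squared = sum((s - p) ** 2 for s, p in zip(subjective, predicted))
    return math.sqrt(squared / len(subjective))
```

A model whose predictions track the viewer ratings closely would show a correlation near 1 and a small error; the two measures are complementary, since a model can rank samples correctly while still being offset in absolute terms.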

Furthermore, VQEG has recently started to turn its attention to 3DTV video quality metrics and models. This activity is related to the ITU-R Question 128/6. First, the group will attempt to investigate ways of measuring 3D quality, which will then be followed by the development of standard metrics and models. These tasks differ significantly from previous VQEG efforts towards 2D models. It is not straightforward to transfer 2D expertise into the 3D area. For example, apart from classical image quality aspects, metrics for depth map quality, presentation room quality or viewing comfort quality (how long the user can watch 3D) have to be developed.

Additionally, some problems that had already been solved for 2D video technology, such as blurriness, strike back in 3D technology (for example, crosstalk leading to ghosting images). The currently available quality metrics for stereoscopic images are not sufficient, as they are based on 2D ground truth. Nevertheless, the creation of 3D video quality metrics is currently being hampered by the lack of high-quality and realistic reference content. In order to gather a reference test set of 3D videos, the Consumer Digital Video Library (CDVL) [7] is currently being extended to accept and provide 3D content as well.

As far as the CCTV application is concerned, the most active working group in the area of surveillance video quality is the VQiPS (Video Quality in Public Safety) Workgroup [37], which was set up in 2008, is funded by the U.S. Department of Homeland Security, and is also related to VQEG. So far, VQiPS’s achievements include coordination of the various organizations whose goal is to create standards for surveillance video. Current work focuses on the education of users and the development of specifications for surveillance video quality. It has been observed [12] that even different surveillance video applications can share common elements that affect image quality specification. Consequently, VQiPS creates a set of use cases that are independent of the application, while at the same time enriching them with instructions for users to adapt the VQiPS specific standards to their own applications. VQiPS will also create a consistent terminology of concepts related to the quality of video utility and related equipment.

The results of video assessment schemes for multimedia applications are very important and will be used for different purposes, such as pricing, medium adaptation, and user-based optimizations.

3 Content Distribution Networks (CDNs)

CDNs are playing an important role in future multimedia networking and have evolved in recent years from infrastructures for Web documents to systems that support multimedia content and different forms of delivery such as streaming and Video on Demand (VoD) [28]. Traditionally, caching and replication techniques are used by CDNs to make content more widely accessible, distribute system load away from bottleneck servers and enable the overall system to react to changes in content popularity [21, 40].

Akamai was one of the first commercial CDNs using replication servers around the world to serve content. In Akamai, content is injected into the system by the content providers at so-called entry points. From these points, reflectors are responsible for distributing the content optimally throughout the infrastructure, i.e. placing it onto relevant edge servers [14]. Consumers are then redirected towards the optimal edge server using transparent Domain Name System (DNS) redirections. Akamai is currently the most popular CDN with a market share of 64% and a presence in 70 countries [14].

An issue that has been addressed more recently is grounded in the fact that users are often interested in content, yet have very little (or no) interest in where it comes from. This has led to research in the area of content-centric networking. The content-centric paradigm puts content at the heart of a network’s operation and allows hosts to interact with it using a content request/reply model. Therefore, instead of having packets routed to specific hosts, requests are routed to optimal content sources using unique content identifiers. One of the first systems built around this paradigm was the Data-Oriented Networking Architecture (DONA) [19]. The core principle behind its operation is to build a DNS-like infrastructure that allows hosts to resolve content identifiers to nearby sources. Providers can register content with DONA, whilst consumers can then query DONA to receive it from the optimal source. Importantly, this is done using solely the content’s unique identifier and not any location-based information. Note that DONA does not use content-centric routing at the network layer; instead, it deploys an indirection infrastructure (similar to i3 [32]).
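The register/query interaction described above can be sketched as a small in-memory resolver that maps content identifiers to sources and returns the nearest one for a given consumer. This is only an illustration of the general idea under stated assumptions: the `ContentResolver` class, its method names, and the caller-supplied distance function are hypothetical and do not reproduce DONA's protocol or data structures.

```python
from collections import defaultdict
from typing import Callable, Dict, Set


class ContentResolver:
    """Toy resolver: content identifier -> nearest registered source."""

    def __init__(self, distance: Callable[[str, str], float]) -> None:
        # distance(consumer, source) is an assumed, caller-supplied cost metric.
        self._distance = distance
        self._sources: Dict[str, Set[str]] = defaultdict(set)

    def register(self, content_id: str, source: str) -> None:
        """A provider announces that `source` holds a copy of `content_id`."""
        self._sources[content_id].add(source)

    def find(self, content_id: str, consumer: str) -> str:
        """Resolve by identifier only; the consumer never names a location."""
        sources = self._sources.get(content_id)
        if not sources:
            raise KeyError(f"no registered source for {content_id!r}")
        # Sorting first makes ties deterministic.
        return min(sorted(sources), key=lambda s: self._distance(consumer, s))
```

In this sketch, two consumers asking for the same identifier may be directed to different sources, which is the location independence that the paragraph attributes to the content-centric paradigm.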

Further research has also looked at introducing this paradigm at the lower layers of the network stack, with the potential to replace traditional IP-based networking. One such example is PARC’s Assurable Global Networking (AGN) [1, 17]. This also uses an infrastructure-based content-centric networking approach that routes requests using content identifiers. In AGN, routing is performed at the network layer alongside IP. This has resulted in efforts to integrate the work with routing protocols such as BGP and IS-IS to create a truly content-centric network. Although its real-world deployment is unlikely anytime soon, this is very much an active line of research.

Issues related to active and passive replication of content have also been recently investigated at the overlay level. For example, Corelli [36] is a system based on the P2P paradigm that supports decentralized CDN infrastructures. These kinds of distributed infrastructures are also receiving more attention from popular content providers. For example, the initial BBC iPlayer version was based on the P2P paradigm to better handle large-scale demand.

First and second generation CDNs have mainly concentrated on improving content availability through network caching and replication, but more recent content network research has looked into providing content support whilst still being able to integrate with existing systems. This has been necessary in order to accommodate the increasing number of networks, protocols (e.g., MESH, WiMAX, LTE, etc.) and devices (ranging from HDTV-capable TVs to handheld mobile devices) used for content consumption. As such, it has become important to be able to flexibly manage different kinds of networks and delivery types. To address this, middleware-based approaches have recently received more attention. An example of this is the Juno middleware, which uses local (re-)configuration to handle heterogeneity within the network and application [35]. Juno uses a range of different pluggable components for content discovery and content delivery. Different components can be used to interoperate with different systems and realize new concepts, allowing the system to adapt to network and end-system conditions, as well as flexibly supporting changing user requirements. This can even be done dynamically during deliveries. For instance, if a client-server content source gets overloaded, the system can dynamically switch to P2P-based content distribution. Hence, such a middleware-based content network provides optimal support for content-centric and delivery-centric content networking while still being able to interoperate with legacy CDNs.
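The idea of switching delivery components during a transfer can be sketched as follows. The `AdaptiveDelivery` class, the component names, and the load threshold are assumptions made for this illustration and are not part of the Juno middleware's actual interface.

```python
from typing import Callable, Dict, Tuple

# A delivery component maps a chunk index to the chunk payload (assumed shape).
DeliveryComponent = Callable[[int], bytes]


class AdaptiveDelivery:
    """Chooses a pluggable delivery component per chunk, based on current load."""

    def __init__(self, components: Dict[str, DeliveryComponent],
                 overload_threshold: float = 0.8) -> None:
        if "client-server" not in components:
            raise ValueError("a 'client-server' component is required as fallback")
        self.components = components
        self.overload_threshold = overload_threshold  # illustrative value

    def select(self, server_load: float, peers_available: int) -> str:
        # Switch to P2P only when the server is overloaded and peers exist.
        if (server_load >= self.overload_threshold
                and peers_available > 0
                and "p2p" in self.components):
            return "p2p"
        return "client-server"

    def fetch(self, chunk_index: int, server_load: float,
              peers_available: int) -> Tuple[str, bytes]:
        name = self.select(server_load, peers_available)
        return name, self.components[name](chunk_index)
```

Because the decision is re-evaluated for every chunk, a delivery that starts in client-server mode can move to P2P and back without the application being aware of the change.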

4 Multimedia communications and delivery over heterogeneous wireless networks

Next generation wireless networks, such as the Fourth Generation (4G) wireless networks, have recently gained a lot of attention. 4G environments are fully integrated all-IP packet-switched systems that promise to support the following features: a highly efficient spectral system; support for all types of multimedia content; access speeds of up to 1 Gbit/s for low mobility, such as local wireless access, and up to 100 Mbit/s for high mobility, such as vehicular scenarios; seamless handoff and global roaming across multiple heterogeneous networks; better scheduling and call admission control schemes; terminal heterogeneity; and several other interesting features [34].

In addition, 4G networks also provide high usability for end users, enabling them to customize their multimedia applications and receive their content with QoE assurances. 4G systems are expected to play a fundamental role in supporting truly ubiquitous multimedia communications in coming years. However, to satisfy the skyrocketing demand for multimedia service access over heterogeneous wireless network infrastructures at any time, from anywhere and from any device, several challenges need to be addressed at the network, device, and application levels.

It is well known that multimedia transmissions over wireless communication links must take into account link characteristics that differ from those of wired networks. Packet losses in wired networks are primarily due to congestion, whereas in wireless networks they are caused mainly by corruption of packets due to low Signal to Noise Ratio (SNR), interference from nearby transmissions, or multi-path signal fading [6]. Past efforts to mitigate delays and packet losses caused by wireless links have primarily focused on cross-layer design optimization techniques, where dependencies between protocol layers are exploited to improve the end-to-end performance delivered to end-users. Some of the recent cross-layer design architectures that have been proposed specifically for multimedia transmissions include [11, 18, 26, 38]. Most of these cross-layer approaches provide additional support through the implementation of new interfaces used for sharing information between adjacent layers or by adjusting parameters that span different transmission layers.

The layers exploited by several proposed cross-layer designs typically involve all or a subset of the application layer (even a user-based layer), the Media Access Control (MAC) layer, and the physical layer. For instance, Chilamkurti et al. [8] proposed a cross-layer design aimed at improving the quality of H.264 video over IEEE 802.11e-based networks. The design takes advantage of both application and MAC layer information to improve the transmission quality of H.264 video. Schaar et al. [39] proposed an approach that classifies multimedia packets into different classes and, depending on the underlying network conditions, transmits only specific packets. Their cross-layer design approach also exploits information from the MAC layer and the transport layer to optimize MPEG-4 video delivery and quality. Several other cross-layer designs for multimedia transmissions using different types of adaptation strategies (such as the integrated, the MAC-centric, or the top-down approaches) have been proposed in the literature.
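The class-based selective transmission idea can be sketched as below. This is not the algorithm of [39]: the use of I/P/B frame types as packet classes, the loss-rate thresholds, and all names are assumptions chosen only to show how a network condition reported by a lower layer might decide which classes are sent.

```python
from dataclasses import dataclass
from typing import List

# Assumed class priorities: lower number means more important for decoding.
PRIORITY = {"I": 0, "P": 1, "B": 2}


@dataclass(frozen=True)
class VideoPacket:
    seq: int
    frame_type: str   # "I", "P" or "B" (assumed classification)
    size_bytes: int


def max_priority_for(loss_rate: float) -> int:
    """Map a measured loss rate to the least important class still sent.

    The thresholds are illustrative assumptions, not measured values.
    """
    if loss_rate < 0.02:
        return 2   # good channel: send every class
    if loss_rate < 0.10:
        return 1   # degraded channel: drop B-frame packets
    return 0       # poor channel: send I-frame packets only


def select_for_transmission(packets: List[VideoPacket],
                            loss_rate: float) -> List[VideoPacket]:
    """Keep only packets whose class is allowed under current conditions."""
    limit = max_priority_for(loss_rate)
    return [p for p in packets if PRIORITY[p.frame_type] <= limit]
```

Filtering by class rather than by individual packet keeps the decision cheap and ensures that a less important packet is never sent while a more important class is being dropped.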

Advances in wireless QoS/QoE models and portable devices allow the distribution of high-quality multimedia content to fixed and mobile users. New strategies in routing, admission control, resource reservation, re-registration, and authorization, among others, have been discussed in the literature and implemented by service providers, and are creating wireless ubiquitous multimedia systems.

In addition to network-based efforts, mobile wireless devices are also attracting a lot of attention. Nowadays, it is possible to see fairly complex multi-function terminals capable of handling different media types. Mobile users are using such devices for different purposes, including the storage of personal information (e.g., an address book), various forms of interaction (e.g., voice, email, video phone), entertainment (e.g., gaming, on-demand video streaming), and information access (through web browsers) [29].

Wireless multimedia devices are placing increasing demands on designers and manufacturers to provide higher processing capabilities, a larger number of functions and usage modes, displays with higher resolutions, user-friendly/multimodal interaction modes, and support for multiple wireless interfaces that can connect to different types of wireless networks. In the context of future multimedia systems, fast detection of available access networks and selection of the “most appropriate” network interface based on factors such as user preference, cost, seamlessness, and application requirements are becoming increasingly important for many portable devices.
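One simple way to express the selection of the “most appropriate” interface is a weighted score over the candidate networks, after discarding those that cannot meet the application's minimum bandwidth. The sketch below is a generic simple-additive-weighting example; the attribute set, the weights, and the names are hypothetical and do not correspond to any particular device or standard.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AccessNetwork:
    name: str
    bandwidth_mbps: float
    cost_per_mb: float
    signal_quality: float   # assumed to be normalized to the range 0..1


def select_network(candidates: List[AccessNetwork],
                   weights: Dict[str, float],
                   min_bandwidth_mbps: float) -> Optional[AccessNetwork]:
    """Return the best-scoring network that meets the application requirement."""
    eligible = [n for n in candidates if n.bandwidth_mbps >= min_bandwidth_mbps]
    if not eligible:
        return None
    max_bw = max(n.bandwidth_mbps for n in eligible)
    # Fall back to 1.0 so that all-free networks do not divide by zero.
    max_cost = max(n.cost_per_mb for n in eligible) or 1.0

    def score(n: AccessNetwork) -> float:
        # Benefits add to the score; the normalized cost subtracts from it.
        return (weights["bandwidth"] * n.bandwidth_mbps / max_bw
                + weights["signal"] * n.signal_quality
                - weights["cost"] * n.cost_per_mb / max_cost)

    return max(eligible, key=score)
```

The weights stand in for user preference: a cost-sensitive user would raise the `cost` weight, while a video application would raise `min_bandwidth_mbps` and the `bandwidth` weight.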

Another important design consideration for future mobile multimedia devices is their conformance to well-known standards [29] (open application framework, standard interfaces, etc.) relieving end-users of the need to spend time learning proprietary technologies. By providing such design flexibility to end-users, software development on these platforms will be made much easier and more modular.

5 Multimedia mobile systems

Seamless multimedia mobility will be one of the dominating factors for the success of next generation systems. Besides the basic connectivity needed by any type of application, multimedia applications have stringent requirements from the network, which include bounds on the end-to-end delay, loss rate, and delay jitter. Voice services generally have low bandwidth requirements, which depend on the codec being used; on the other hand, voice is very sensitive to losses. Video services need a large amount of available bandwidth, and compression is mandatory for such applications. While jitter and delay are still an issue, some losses are bearable due to the recovery capabilities of compression mechanisms such as MPEG-4. Overall, multimedia applications need the quality of the communication to remain at acceptable levels, which naturally depend on the nature of each application.

Due to the innate characteristics of wireless mobile technologies, which have a limited capacity and are prone to interference, as well as to the challenges associated with mobility, the support of multimedia in such scenarios has motivated work at several levels, from the technology-dependent layers up to application adaptation, such as the compression mentioned before and base station selection.

The end-to-end support of multimedia applications requires the maintenance of the transmission quality level when multiple heterogeneous technologies are used and devices/users are mobile. One can imagine a Wireless LAN access and a WiMAX backhaul within a city scenario, a WiMAX access in rural areas with connection to some satellite or wired Internet access technology, or a train travelling across a country or between different countries. Besides the QoS capabilities of each technology, there is the need to guarantee an end-to-end quality level with QoE support. For such support, end-to-end QoS-aware mechanisms have been developed at technology independent layers. An example of such mechanisms is the Next Steps in Signalling Framework (NSIS) for IP signaling [13], including QoS signaling and resource reservation [24]. QoE requirements have also been integrated into signaling and reservation systems.

The use of NSIS to provide adequate quality levels for the transmission of multimedia applications has been explored within scenarios with and without mobility. When there is mobility, two broad approaches can be considered for QoS signaling for resource reservation, namely, “break-before-make” and “make-before-break”. In the first case resource reservation takes place after the handover procedure is completed while, in the second, resources are reserved prior to the handover decision. Despite the fact that both approaches have well known positive and negative aspects, NSIS has been used for the support of multimedia applications in mobility scenarios for the first approach [5] and also for the second [4, 25].
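The practical difference between the two approaches can be illustrated with a toy timing model that estimates how long a flow runs without reserved resources after the handover. The function name, its parameters, and the model itself are assumptions made for illustration; they are not part of NSIS, and the model ignores the difficulty of predicting the target access point that make-before-break entails.

```python
def unreserved_time(strategy: str, reservation_latency_s: float,
                    lead_time_s: float = 0.0) -> float:
    """Time after the handover during which no resources are reserved.

    reservation_latency_s: assumed time needed to signal and set up a reservation.
    lead_time_s: how long before the handover the reservation is started
                 (only meaningful for make-before-break).
    """
    if reservation_latency_s < 0 or lead_time_s < 0:
        raise ValueError("durations must be non-negative")
    if strategy == "break-before-make":
        # Reservation starts only once the handover has completed.
        return reservation_latency_s
    if strategy == "make-before-break":
        # Reservation started in advance; only the remainder is exposed.
        return max(0.0, reservation_latency_s - lead_time_s)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Under this simplified model, make-before-break removes the unreserved interval whenever the lead time covers the reservation latency, at the price of holding resources on a path that may never be used if the handover does not take place.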

Nowadays, mobility usually occurs between different access points, but within the same technology, which is known as horizontal handover. However, as devices become empowered with different technologies, which naturally have diverse quality and pricing characteristics, there is a need to develop mechanisms for handovers between different technologies, known as vertical handover. This situation raises the issue of inter-technology mobility, which poses additional challenges when multimedia applications are at stake.

In this context, the Media Independent Handover (MIH), identified as IEEE 802.21 [31], plays an important role for the handover between Global System for Mobile communications (GSM), Bluetooth, Wi-Fi and WiMAX networks. MIH follows a cross-layer paradigm where the information from the lower/technology dependent layers is used to enhance the handover process and to provide the grounds for a handover that is as seamless as possible. Several research results have shown how MIH can improve multimedia communications [2, 15] by taking advantage of the interaction among different layers.

Other approaches have been proposed to support seamless mobility for multimedia applications, such as pre-fetch and cache-based multimedia schemes, self-adaptive handover management solutions for mobile streaming continuity [31] and content-oriented mobility schemes. For example, the last approach performs handovers by considering the characteristics of the available wireless resources, the predicted user perception, and the on-going multimedia content.

6 Challenges for future multimedia networks

In recent years, several solutions have been proposed in academic and industry environments regarding multimedia assessment, content distribution, and optimization over heterogeneous wireless and wired networks and seamless multimedia mobility. However, there are still many important challenges that need to be addressed in future multimedia networks in several areas. It is not the goal of this paper to propose an integrated solution for QoE multimedia networking, but rather to identify the main issues from application to network layers.

Regarding application multimedia measurements, new schemes must be developed to assess the quality level of on-going real-time 2D and 3D applications, taking into account feasibility, performance, operational cost, and other issues. These mechanisms can be based on application-level measurements, where no-reference metrics are still needed. On the other hand, packet/network inspection-based (or even hybrid) approaches can be used to predict and assess video quality based on information gathered from packet and network conditions without accessing the decoded video. The results of assessment schemes are useful for pricing/billing, management and optimization operations in next generation multimedia systems.

While the focus of CDNs and content networks is still to improve content availability and content services for the users, the issues addressed by next generation multimedia networks are far more challenging. They have to provide optimal support for content delivery in heterogeneous environments, including a multitude of networks, devices and user requirements. Modular and component-based structures appear to offer the right kind of flexibility in this context, while still allowing the integration of legacy systems.

In future multimedia systems, new QoE-based application-, transport- and network-level optimization mechanisms (whether a cross-layer approach is used or not) are still needed, such as routing, inter/intra-session adaptation, resource reservation, traffic control, seamless multimedia mobility and base station selection/user experience IEEE 802.11k schemes. Additionally, the multi-homing capability of current devices can also provide improved performance for multimedia applications by taking advantage of the multiple connectivity levels of each wireless device.

The ability of wireless devices to satisfy many of the emerging multimedia applications and user requirements remains a significant challenge for designers and manufacturers because these portable wireless devices have limited resources (CPU, memory, battery power). Low cost, small size, multi-homing and high performance will continue to be dominating factors that dictate the adoption, proliferation, and ultimately the success of future wireless multimedia devices on the market. With cheap, highly portable devices, it will be possible to access and produce high-quality multimedia content ubiquitously and cost-effectively, and to adopt a user-centric approach.

7 Conclusions

Multimedia networking continues to be a strong area of research, as it has been for more than a decade. This research trend is expected to continue with various challenges emerging as a result of new services, mobility, emerging portable devices, changing user and terminal requirements, and highly heterogeneous networking infrastructures and devices. This paper is intended to highlight some of the important multimedia networking areas that need attention to address some of the most pressing challenges associated with them. We focused on four key areas. The first was user experience schemes, in which the perception of QoE is deeply application-related. Next generation multimedia content systems were also discussed. New QoE-aware mechanisms are needed to improve the overall network performance, reduce operational cost, and increase the user’s satisfaction. Heterogeneity and ubiquitous, seamless mobility support are also crucial issues that need to be addressed in future multimedia systems.

We hope that this work will help improve our understanding of the issues and challenges that lie ahead in multimedia networks and will serve as a catalyst for designers, engineers, and researchers to seek innovative solutions to address and solve those challenges.