Cross-layer architecture for scalable video transmission in wireless network

https://doi.org/10.1016/j.image.2006.12.011

Abstract

Multimedia applications such as video conferencing, digital video broadcasting (DVB), and streaming video and audio have been gaining popularity in recent years, and these services are increasingly being offered to mobile users. The demand for quality of service (QoS) for multimedia poses major challenges for network design, concerning not only the physical bandwidth but also protocol design and services. One of the goals of system design is to provide efficient solutions for adaptive multimedia transmission over different access networks in an all-IP environment. The joint source and channel coding/decoding (JSCC/D) approach has already given promising results in optimizing multimedia transmission. In practice, however, arranging the required control mechanisms and delivering the required side information through the network and the protocol stack have caused problems, and the impact of the network has quite often been neglected in studies. In this paper we propose efficient cross-layer communication methods and a protocol architecture to transmit the control information and to optimize multimedia transmission over wireless and wired IP networks. We also apply this architecture to the more specific case of streaming scalable video. Scalable video coding has been an active research topic recently, and it offers simple and flexible solutions for video transmission over heterogeneous networks to heterogeneous terminals. In addition, it provides easy adaptation to varying transmission conditions. We illustrate how scalable video transmission can be improved with efficient use of the proposed cross-layer design, adaptation mechanisms and control information.

Introduction

The evolution of wireless telecommunication systems can be divided into short-term and long-term evolution towards a global and integrated system that meets the requirements of both users and industry and makes efficient use of emerging technologies. The expectations for this evolution, whether short term (increasing the bandwidth with new radio and access technologies) or longer term (co-operative, converging networks), are in the end similar in many ways. On the end-user side, the expectations for the future system include, for example, good service quality and improved quality of experience (QoE), easy access to applications and services, improved usability of services, enhanced security and reasonable cost. Similarly, on the service and network provider side, the expected benefits are lower operational and capital expenditure through easy quality of service (QoS) provisioning and network/security management, flexible configuration and reconfigurability of the system, and maximized network capacity. Fulfilling these expectations will be a challenging task for system designers, who aim to produce flexible next-generation wireless systems that transparently interconnect a multitude of heterogeneous networks and systems. Optimal allocation of system and application resources can be achieved through co-operative optimization of communication system components in different layers, and this is particularly the case for multimedia processing and transmission. The growing number of wireless network components in the transmission chain and the demand for better QoS and QoE are steering the work towards closer co-operation between the different elements of the whole multimedia system.

Traditionally, the two encoding operations of compression and error correction are separated from each other, following Shannon's well-known separation theorem [18], which states that source coding and channel coding can, asymptotically with the length of the source data, be designed separately without any loss of performance for the overall system. However, it has been shown that separation does not necessarily lead to a less complex solution [11], nor is it always applicable [28], especially for multimedia transmission. Joint source and channel coding/decoding (JSCC/D) techniques, which co-ordinate the source and channel encoders, have therefore been investigated recently, and techniques have been developed [5], [17], [14] to improve both the encoding and decoding processes while keeping the overall complexity at an acceptable level [15].
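For reference, the separation result can be stated compactly. The notation below is standard textbook notation and is not taken from the paper: R(D) is the rate-distortion function of the source in bits per source sample, C is the channel capacity in bits per channel use, and \kappa is the assumed number of channel uses available per source sample.

```latex
% Asymptotically in the block length, distortion D is achievable
% with separately designed source and channel codes if
\[
  R(D) < \kappa \, C ,
\]
% and no scheme, joint or separate, can achieve distortion D if
\[
  R(D) > \kappa \, C .
\]
```

The statement holds only in the limit of long blocks, which is why delay-constrained multimedia transmission with finite block lengths leaves room for joint designs.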

In order to benefit from JSCC in real systems, control information needs to be transferred through the network and system layers. Unfortunately, the impact of the network and networking protocols is quite often neglected when joint source and channel coding systems are presented, and only minimal effort is put into finding efficient inter-layer and network signaling mechanisms. Some work has, however, been carried out on cross-layer protection strategies for video streaming over wireless networks, such as combining the adaptive selection of application-layer forward error correction (FEC) with medium access control (MAC) layer automatic repeat request (ARQ), as presented in [20], [22].

Some mechanisms are already in use for generic information exchange between the different system layers. Examples are the QoS mechanisms differentiated services (DiffServ) [29] and integrated services (IntServ), which provide means for an application to reserve resources and a specific service level from the interconnecting IP network by mapping the application requirements to the network protocol level. Another example of inter-layer signaling can be found in the IEEE 802.11e standard, where QoS provisioning is performed between the application and the medium access layers.
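As an illustration of how an application can map its requirements to the network protocol level with DiffServ, the following minimal Python sketch marks the outgoing packets of a UDP socket with a DiffServ code point (DSCP). It is not taken from the paper. The helper names are hypothetical, and whether the operating system and the network honour the marking is an assumption that depends on the platform and on router configuration.

```python
import socket

# Standard DSCP values defined in the DiffServ RFCs.
DSCP_BEST_EFFORT = 0   # default per-hop behaviour
DSCP_AF41 = 34         # assured forwarding, class 4, low drop precedence
DSCP_EF = 46           # expedited forwarding


def dscp_to_tos(dscp: int) -> int:
    """Place the 6-bit DSCP in the upper six bits of the 8-bit DS/TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Hypothetical helper: create an IPv4 UDP socket whose packets carry `dscp`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # IP_TOS sets the whole DS byte; some platforms ignore or restrict it.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

A sender could, for example, call `open_marked_udp_socket(DSCP_AF41)` for a video flow, so that DiffServ-aware routers can treat it differently from best-effort traffic.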

QoS information consisting of IP packet priorities, which allow packets to be dropped selectively, is not sufficient on its own as an optimization method for multimedia transmission. More detailed information needs to be delivered in order to fully optimize the end-to-end transmission in a cross-layer manner. Some of the possible methods for arranging control information delivery between the physical and application layers are discussed, for example, in [12], which describes the use of two additional adaptation layers in the protocol stack to transmit cross-layer information within the protocol stack and through the network. Another possible solution for transferring the required control information is to extend current protocols such as Internet Protocol version 6 (IPv6) or Internet Control Message Protocol version 6 (ICMPv6) through the definition of new options and message types.
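To make the idea of a new IPv6 option more concrete, the sketch below encodes a channel-state report as a type-length-value option inside an IPv6 Destination Options extension header, following the generic option layout of the IPv6 specification (RFC 8200). Everything specific to the report is an assumption made for illustration: the option type 0x1E is a placeholder (RFC 4727 sets aside option types for experimentation, and the exact value should be checked there), and the two payload fields, SNR and packet loss rate, are hypothetical and are not the message format defined by the PHOENIX project.

```python
import struct

OPT_PAD1 = 0x00         # one-byte padding option (RFC 8200)
OPT_PADN = 0x01         # multi-byte padding option (RFC 8200)
OPT_CROSS_LAYER = 0x1E  # hypothetical placeholder for an experimental option


def encode_option(opt_type: int, data: bytes) -> bytes:
    """Encode one option as: type (1 byte), data length (1 byte), data."""
    if len(data) > 255:
        raise ValueError("option data is limited to 255 bytes")
    return struct.pack("!BB", opt_type, len(data)) + data


def pad_options(options: bytes, used: int = 2) -> bytes:
    """Append Pad1/PadN so that `used + len(options)` is a multiple of 8."""
    missing = -(used + len(options)) % 8
    if missing == 0:
        return options
    if missing == 1:
        return options + bytes([OPT_PAD1])
    return options + bytes([OPT_PADN, missing - 2]) + bytes(missing - 2)


def build_dest_options_header(next_header: int, options: bytes) -> bytes:
    """Build the extension header: next header, length, padded options."""
    body = pad_options(options)
    # Hdr Ext Len counts 8-byte units, not including the first 8 bytes.
    hdr_ext_len = (2 + len(body)) // 8 - 1
    return struct.pack("!BB", next_header, hdr_ext_len) + body


def encode_channel_report(snr_db: float, loss_rate: float) -> bytes:
    """Hypothetical payload: SNR in 0.1 dB steps, loss rate in per mille."""
    data = struct.pack("!hH", round(snr_db * 10), round(loss_rate * 1000))
    return encode_option(OPT_CROSS_LAYER, data)


# Example: a report of 12.5 dB SNR and 2 % loss, followed by UDP (protocol 17).
header = build_dest_options_header(17, encode_channel_report(12.5, 0.02))
```

The sketch only builds the header bytes. Attaching them to outgoing packets would need platform-specific socket options or raw sockets, which are left out here.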

The solutions presented above are potential candidates for transferring control information through both wired and wireless networks, but they do not fully solve the problem of transferring control information through the protocol layers from the application layer to the physical layer and vice versa. Furthermore, they do not propose ways to use this information for end-to-end optimization, which requires taking into account all protocol layers, and applications in particular.

Within multimedia applications, scalable video coding (SVC) has recently gained a lot of attention from both researchers and standardization committees [13]. Scalability has been part of several recent video coding standards (e.g. MPEG-2, MPEG-4, H.263+), but it has not reached wide popularity in industry, mainly because of poor compression performance. A new SVC standard is currently being standardized by the Joint Video Team (JVT); it provides comparable compression efficiency and flexible 3-D scalability [26]. The layered structure of the scalable video stream and the different priorities of the layers support the use of unequal error protection (UEP) techniques [6], [21] and prioritization of layers at the network level, e.g. using DiffServ [19]. The combination of scalable video and different cross-layer solutions has been studied in several papers, for example in [27], which studies end-to-end transmission control protocols and congestion control for scalable video. In [20], the adaptive selection of application FEC and MAC layer ARQ is combined with scalable video.
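The interplay of layered video, unequal error protection and network-level prioritization can be sketched in a few lines of Python. The example is illustrative only and is not the adaptation algorithm of the paper: the layer bitrates, FEC code rates and DSCP assignments are assumed values, and the greedy selection rule is one simple possibility among many.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    source_kbps: float  # source bitrate of this layer (assumed value)
    code_rate: float    # FEC code rate k/n; a lower rate means stronger protection
    dscp: int           # DiffServ code point used to mark the layer's packets

    @property
    def channel_kbps(self) -> float:
        """Bitrate on the channel after FEC redundancy has been added."""
        return self.source_kbps / self.code_rate


# Layers in dependency order: each layer is only useful if all earlier
# layers are received. The base layer gets the strongest FEC (UEP) and
# the lowest drop precedence (AF41 = 34, AF42 = 36, AF43 = 38).
LAYERS = [
    Layer("base", 128.0, 0.5, 34),
    Layer("enh1", 192.0, 0.75, 36),
    Layer("enh2", 448.0, 0.875, 38),
]


def select_layers(layers, available_kbps):
    """Keep layers in dependency order while they fit the available rate."""
    chosen, used = [], 0.0
    for layer in layers:
        if used + layer.channel_kbps > available_kbps:
            break  # later layers depend on this one, so stop here
        chosen.append(layer)
        used += layer.channel_kbps
    return chosen
```

With these assumed numbers, a reported channel rate of 600 kbit/s lets the sender keep the base layer and the first enhancement layer, which together occupy 512 kbit/s after FEC. When the reported rate changes, calling `select_layers` again adapts the stream by adding or dropping enhancement layers.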

In this paper we introduce the innovative IST-PHOENIX cross-layer architecture, which considers not only the joint optimization of source and channel coding but also includes the interconnecting network in more detail than previous works. One original contribution of this paper is the definition of the required cross-layer information, together with solutions for delivering and using this information on the different system layers within the system architecture defined in Section 2. In Section 3 we describe the methods for optimizing multimedia transmission at the network level, and the mechanisms for transparently transferring the control information through the network and within the protocol stack. In Section 4 we concentrate in more detail on the optimization of scalable video transmission in the PHOENIX architecture, and we present some examples of how the proposed cross-layer information can be used to adapt scalable video to varying transmission conditions.

Section snippets

PHOENIX system architecture

The overall system architecture is represented in Fig. 1, including the informative signals and control signals for transmitting cross-layer control information through the system in order to implement efficient adaptation techniques such as UEP and soft-input source decoding schemes. The system uses several new control lines for control signal delivery, together with application and physical layer controller units, in such a way that they can also be exploited for cross-layer communication.

The PHOENIX

PHOENIX network architecture

Central concepts of the cross-layer system are Network Transparency and cross-layer information exchange, but modifications at the protocol level are also required in order to increase the overall performance of the multimedia transmission system. Network Transparency expresses the abstract idea of making the underlying network infrastructure almost invisible to all the entities involved in the system. The primary goal of Network Transparency is to transfer cross-layer control information

Cross-layer mechanisms for scalable video transmission

In the previous sections, we have defined a general architecture for cross-layer wireless video transmission, the different system blocks of the architecture, and cross-layer information transmission techniques. While the PHOENIX architecture proposed in [10], [8] is general enough to be used with several radios and video codecs, we would like to present some special aspects of scalable video transmission using the proposed architecture. The PHOENIX architecture can be also

Conclusions

In this paper we presented the PHOENIX architecture, which enhances the cross-layer approach for multimedia transmission in an all-IP environment. Our cross-layer architecture proposal includes the complete transmission chain from application layer source coding to wired and wireless channel models, required cross-layer signaling mechanisms and full network functionality. For cross-layer control information delivery, we have proposed several different protocol solutions which are optimal in

Acknowledgments

This work has been carried out in the PHOENIX project, which has been partially supported by the European Commission under Contract FP6-2002-IST-1-001812. The authors would like to thank especially Mikko Majanen and Konstantinos Pentikousis from VTT, Gábor Jeney and Gábor Fehér from Budapest University of Technology and Economics and Soon X. Ng from University of Southampton, and all the other colleagues who have participated in the PHOENIX project, given valuable contribution for the

References (28)

  • U. Horn et al., Robust internet video transmission based on scalable coding and unequal error protection, Signal Process. Image Commun. (September 1999)
  • I. Amonou, N. Cammas, S. Kervadec, S. Pateux, Layered quality optimization for SVM, ISO/IEC JTC 1/SC 29/WG 11, M11704,...
  • C. Bergeron et al., Soft-input decoding of variable-length codes applied to the H.264 standard
  • C. Bormann (Ed.), Robust Header Compression (ROHC): Framework and Four Profiles: RTP, UDP, ESP, and Uncompressed, IETF...
  • J. Hagenauer et al., Channel coding and transmission aspects for wireless multimedia, Proc. IEEE (October 1999)
  • E. Kohler, M. Handley, S. Floyd, Datagram Congestion Control Protocol, IETF RFC 4340, March...
  • C. Lamy-Bergot et al., Joint optimization of multimedia transmission over an IP wired/wireless link
  • L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson, G. Fairhurst, The Lightweight User Datagram Protocol (UDP-Lite), IETF...
  • M.G. Martini et al., Content adaptive network aware joint optimization of wireless video transmission, IEEE Commun. Mag. (January 2007)
  • J.L. Massey, Joint source and channel coding
  • S. Mérigeault, C. Lamy, Concepts for exchanging extra information between protocol layers transparently for the...
  • J.-R. Ohm, Advances in scalable video coding, Proc. IEEE (January 2005)
  • M. Park et al., Joint source-channel decoding for variable-length encoded data by exact and approximate MAP sequence estimation, IEEE Trans. Commun. (January 2000)
  • L. Perros-Meilhac, C. Lamy, Huffman tree based metric derivation for a low-complexity sequential soft VLC decoding, in:...

This work has been carried out in the PHOENIX project, which was partially funded by the European Commission within the EU Sixth Framework Programme and Information Society Technologies.
