360-ADAPT: An Open-RAN-Based Adaptive Scheme for Quality Enhancement of Opera 360° Content Distribution

There is increasing viewer interest in, and technological support for, streaming immersive clips over the Internet. There are, however, challenges in supporting a high quality of viewer experience, mostly due to the large amounts of data associated with immersive video and spatial audio (Ambisonics). In situations where network resources are limited, the streamed 360° content needs to be adjusted dynamically to meet the network constraints. Dynamic Adaptive Streaming over HTTP (DASH) is a key technology for delivering high-quality video over open radio access networks (RANs), as it allows video streams to be adapted efficiently to the available network conditions. This paper introduces 360-ADAPT, a DASH-based adaptation solution on an Open-RAN architecture for higher-quality remote 360° opera experiences. Unlike existing schemes, 360-ADAPT gives precedence to audio over video when selecting bitrates, increasing the overall quality of the artistic act and improving the use of resources and energy. The proposed 360-ADAPT was tested with real opera viewers in the context of an artistic-oriented platform for opera delivery, part of the Horizon 2020 TRACTION project. Results indicate that 360-ADAPT achieves higher perceived quality levels than alternative solutions in both QoS and QoE metrics.


Immersive 360° content is associated with large amounts of data which need to be exchanged over existing networks. Additionally, the delivery latency must be kept to a minimum in order to maintain high service quality and prevent user dizziness when the content is consumed via virtual reality (VR) headsets [2].
While VR graphics can be rendered in Web browsers, rendering them is computationally intensive. Therefore, hardware with modest graphics and processing specifications may not be able to provide a smooth user experience [3], [4]. Moreover, VR content involves increased amounts of data, and transferring large data volumes with strict timing requirements across existing networks is not trivial.
In order to address the challenges associated with timely networked delivery and processing of video data, adaptation solutions have been designed to adjust the video delivery process and playback according to the available resources. Their aim is to reduce latency, stalls, and buffering events and to improve the overall user experience. Some relevant adaptation schemes adjust the video content resolution while also considering viewer interest in their effort to improve video quality [5], [6]. Region-of-interest-based adaptive solutions compress certain areas within the video frames in a differentiated manner, relying on eye-tracking techniques to detect user interest. This results in higher user-perceived quality than when quality adjustment is performed uniformly across the whole video frame area.
A viable approach to achieving adaptation at the network level involves dividing the radio access network (RAN) into distinct network slices based on the specific functionalities needed. This concept is referred to as the Open-RAN architecture, promoting a software-centric framework that enables networks to behave differently according to the Quality of Service (QoS) requirements of various applications.
In open RANs, the RAN is composed of multiple disaggregated components, which can be sourced from different vendors. This flexibility enables operators to tailor the RAN to their specific needs and preferences. However, it also introduces challenges in managing and orchestrating the RAN, as well as in ensuring compatibility and interoperability between different components [7].
DASH adaptation, one of the main adaptation techniques for streaming on-demand content over HTTP, can play a significant role in addressing these challenges in open RANs. By dynamically adjusting the video bitrate and other parameters based on real-time network conditions, DASH can help to maintain smooth and uninterrupted playback when the network is congested or experiencing fluctuations, including for 360° content. This can significantly improve the user experience and reduce rebuffering, which can be disruptive and frustrating for viewers [8].
Based on their 3D geometry, 360° videos (as illustrated in Fig. 1) can be split into multiple tiles. More advanced adaptive solutions can adjust these tiles individually, assigning them different priorities based on user viewing patterns or interests [9], [10].
Consuming 360° content in VR headsets requires high-definition visual content, as the headset screens are placed very close to the viewers' eyes. Therefore, blurriness and imperfections become very evident. Apart from the works whose goal is to improve viewer quality [11], [12], [13], there are also works which suggest that improving audio quality creates a masking effect that minimizes user-perceived video quality imperfections [14], [15]. Audio is greatly enhanced in immersive experiences with the use of higher-order Ambisonics. Ambisonics enables the reproduction of high-resolution audio and, even though it presents an approximation of the sound sources in the 3D space, it can be reproduced using standard headphones and stereo speakers. Additionally, multichannel setups are supported [16].
Addressing challenges when streaming 360° content with high video and audio quality has been one of the research interests of the EU Horizon 2020 TRACTION project [17]. The project presents a set of production tools for opera co-creation and co-design. These tools support user-generated rich media content creation, user interaction and communications, adaptive immersive content distribution, media editing and the use of narrative engines [18]. The TRACTION toolset aims to process and stream opera content remotely with low latency, delivering to remote opera viewers a high-quality experience that is close to that of real-life performances. In this project, audio prioritization in the context of immersive media adaptation is a core area of research focus.
This paper describes 360-ADAPT, a DASH-based solution that dynamically adjusts streamed 360° [...] [19] for multimedia quality assessment, 24 individuals took part in the study. Participants used an immersive Web player built by the TRACTION project, watched multiple opera clips using multiple delivery solutions, including 360AAA, and evaluated their quality. The collected data was analyzed and showed that 360AAA surpassed alternative state-of-the-art solutions in terms of user-perceived quality and QoS metrics such as video and audio bitrate, time to achieve the highest bitrate, bitrate switches and average incurred throughput.
The rest of this paper is structured as follows. Section II presents related works. Section III describes the design of the Web player enhanced with the proposed adaptive scheme. Section IV details the 360-ADAPT mechanism and its algorithm. Section V details the performance, QoS analysis and impact on energy consumption, and Section VI presents the user study conducted to test the proposed solution. Section VII ends this paper by providing conclusions and directions for further research.

II. RELATED WORK
The MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) standard is one of the main video adaptation techniques for streaming on-demand content over the Internet. A server delivers content to users via HTTP. Content is organized as a Media Presentation, which comprises periods, adaptation sets, representations, and segments. These DASH media elements are described in the Media Presentation Description (MPD), a manifest file that also contains the segments' bitrates and their locations [20], [21]. While DASH defines the format of the MPD file and the media segments, it does not specify how the MPD and the media segments should be delivered, nor which solution should be used for determining what bitrate to request. Consequently, the DASH client needs to deploy adaptation algorithms that adjust the segments' bitrate to match the network conditions and ensure good viewer QoE. MPEG-DASH has been employed in the delivery of both two-dimensional and 360° media, adapting the video quality according to the network's capabilities [14], [22], [23].
According to reports, only a small percentage of computers and smartphones on the market are VR-ready, with approximately 1% of computers and 6.8% of smartphones meeting the requirements. This translates to a total of 13 million PCs and 200 million smartphones [24], [25]. These reports also mention that by 2020, it was estimated that only 100 million computers capable of supporting VR would be sold, which is a small fraction of the total 1.5 billion PCs currently in use worldwide. Based on these numbers, the dissemination of VR content needs to be integrated with existing regular hardware. This can be done, for instance, with the use of modern Web browsers, which are becoming capable of streaming immersive content. While MPEG-DASH is a major enabler in this context, more research focused on algorithms for 360° content adaptation is required.
Video services have benefited from the employment of Open-RAN architectures at the network level. The authors in [26] designed and integrated a QoE application within the Open-RAN architecture in order to improve users' perceived QoE of video services. The solution aims to solve the joint user association, resource allocation and power allocation problem with an adaptive genetic algorithm. A novel prefetching mechanism was proposed in [27], which forecasts the DASH media segments requested by media players. The solution was integrated into the Mobile Network Operator of a 5G RAN network, where a Multi-access Edge Computing (MEC) host is configured for forecasting and caching media segments based on media session information. RAN solutions running on MEC can also support DASH, leading to a more stable and overall higher video quality, as the RAN can quickly identify the maximum sustainable bitrates based on collected Key Performance Measurements [28], [29].
According to [30], [31], adaptive solutions such as DASH can be used to slice the RAN into multiple virtual RANs, each with its own dedicated resources. This can help to isolate different types of traffic and ensure that video streams receive priority over other types of data. Video streams can be transcoded in the cloud before they are transmitted over the RAN. This can help to reduce the latency of the video delivery and improve the overall user experience. DASH can also be used to monitor the network conditions in real time and adjust the video bitrate accordingly. This can help to ensure that video streams are delivered with the highest quality possible, even when the network is congested or experiencing fluctuations. DASH can be used with the RAN to dynamically adjust the modulation and coding scheme (MCS) used for the video transmission. This can help to improve the spectral efficiency of the RAN and reduce the amount of interference between different users. By leveraging these techniques, DASH adaptation can enable high-quality video delivery over open RANs, improving the user experience and reducing the cost of network operation.
The delivery of immersive multimedia content via Web browsers is mostly achieved with the use of WebVR and its successor WebXR, specifications for VR applications on the Web. WebXR 3D assets are designed to be compatible with Web browsers on both desktop and mobile devices. The 3D environments can be viewed using screen monitors, smartphone-compatible VR headsets (e.g., Google Cardboard) or standard VR headsets (e.g., Oculus Rift) [32].
High-quality spatial audio is crucial for immersive opera experiences. Web-based VR applications can employ Omnitone [33], a JavaScript ambisonic decoder that provides spatial audio capabilities. Through the use of Ambisonics, users can perceive audio direction as they navigate in the virtual space. Omnitone employs a rotation matrix that tracks users' head position, relying on sensor data or user interactions. Omnitone is based on the Web Audio API, which provides head-related transfer functions (HRTFs) and binaural rendering.
Popular video services such as Vimeo and YouTube are compatible with 360° videos. Web developers often employ existing libraries, such as Video.js and Three.js, to deploy, embed and customize immersive video players in websites. Video.js is a library that facilitates the development of video players for the Web by incorporating plugins with diverse functionalities. Three.js is a JavaScript library for designing and rendering 3D assets in Web browsers [34], [35].
The technologies described in the previous paragraphs provide the tools for crafting and delivering immersive and inventive Web-based multimedia content with Open-RAN. Nevertheless, these experiences may encounter challenges caused by network instability and limitations. Streaming immersive video and audio seamlessly is challenging; therefore, a number of research works have proposed solutions aiming to enhance the quality of immersive media delivery.
A study focused on understanding the impact of spatial audio on visual attention within 360° VR experiences is presented in [36]. The authors' findings indicate that users gained a more accurate understanding of the immersive space when high-quality spatial audio was employed.
In [37], an adaptive streaming solution for 360° videos is introduced. This approach employs multiple video quality levels that adapt to varying bandwidth conditions based on viewpoints. The authors' primary goal is to enhance QoE by offering higher video bitrates while minimizing interruptions or stall time during playback.
Numerous DASH-based adaptation algorithms [38], [39], [40] have been proposed in the literature. They can be sorted into three groups: buffer-based, bandwidth-based and hybrid. Buffer-based algorithms (e.g., BOLA [38]) employ only the playback buffer level to make decisions regarding the bitrate of segments. If the buffer level is high, a high bitrate is selected to avoid buffer overflow. In case of a low buffer level, low bitrates are favored to prevent buffer underflow, which can result in playback disruptions. Bandwidth-based algorithms (e.g., [39]) determine the appropriate bitrate solely by estimating the available network bandwidth: the higher the bandwidth, the higher the bitrate. Hybrid techniques (e.g., DYNAMIC [40]) select bitrates considering the estimated available bandwidth along with the buffer level, exploiting the advantages of both. DASH has also been adapted to employ energy-saving features, such as bitrate and video brightness adaptation [41] and evaluation of the streaming peak signal-to-noise ratio (PSNR) according to the capabilities of the target device screen [42].
These solutions, however, give precedence to video over audio when performing adaptation. This might negatively influence viewers' QoE, as audio is of great importance to performing arts pieces.

III. TRACTION WEB-BASED 360° PLAYER DESIGN
Fig. 2 illustrates the interface of the player designed for the EU TRACTION project. The immersive 360° player can play both regular 2D and 360° video content, is executed via Web browsers, and was used in the tests described in this paper. Users can engage with immersive content by manipulating the point of view through actions such as mouse dragging, touch-screen interactions, or head movements when using VR devices. The proposed player enhances a player created during the EU Horizon 2020 ImAc project [43], [44].
The player incorporates a range of features aimed at offering accessibility support. These features include subtitles, spatial audio descriptions, sign language interpretation, voice control, and enlarged menus specifically designed for individuals with visual impairments. Users can access these functions via the main player menu. The icons within the menu (i.e., [=], [>], [••] and [o]) correspond to text subtitles, sign language, audio subtitles, and audio description, respectively. Furthermore, the player has been extended to provide access to questionnaires once the clips have concluded, giving viewers an opportunity to provide feedback if desired.

IV. 360-ADAPT SOLUTION
This section describes the proposed 360-ADAPT solution, which dynamically selects the most appropriate bitrate for content delivery considering the available bandwidth, quality variation, and buffer level. The solution is integrated within the 360° Web-based player application.

A. Overall Solution Architecture
The TRACTION 360° player's architecture is composed of a Web server (e.g., Tomcat), responsible for hosting the Web application and the necessary libraries. Additionally, an HTTP server (e.g., Apache) is employed to host the content. Viewers access the player via a standard Web browser. The player utilizes several key libraries and technologies. Dash.js is used for video playback and adaptation, while Three.js is responsible for rendering 360° content. Additionally, the player integrates the Omnitone library to handle Ambisonics audio. The player is compatible with Web browsers on both mobile and desktop devices, and it supports immersive audio playback through either headphones or stereo speakers. The dash.js library of the player was enhanced to deploy the 360-ADAPT adaptive solution introduced in this paper.
As shown in Fig. 3, the DASH server indicates the segments' locations on the open-RAN (i.e., via URLs on the MPD files).
The Bandwidth Estimator (BE) estimates the network's available bandwidth using a smoothed moving average-based prediction approach. It considers the selected bitrate, the channel quality (provided by Open-RAN) and the actual download times of the previous segments. BE accounts for abrupt changes in bandwidth, which may distort the bandwidth estimation, in order to avoid high bitrate variability among the segments of a single stream.
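The behaviour of BE can be illustrated with a minimal sketch. It assumes an exponentially smoothed moving average over per-segment throughput samples and damps samples that differ sharply from the running estimate; the class name, the smoothing weights and the abrupt-change ratio are hypothetical values chosen for illustration, and the channel quality input is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BandwidthEstimator:
    """Illustrative smoothed moving-average estimator (not the paper's exact BE)."""
    alpha: float = 0.3          # assumed weight of a regular throughput sample
    abrupt_alpha: float = 0.1   # assumed (smaller) weight of an abrupt sample
    abrupt_ratio: float = 2.0   # assumed ratio that marks a change as abrupt
    estimate_kbps: Optional[float] = None

    def update(self, segment_bits: float, download_s: float) -> float:
        """Feed the size and download time of the last segment; return the estimate."""
        if segment_bits <= 0 or download_s <= 0:
            raise ValueError("segment size and download time must be positive")
        sample_kbps = segment_bits / download_s / 1000.0
        if self.estimate_kbps is None:
            # The first sample initialises the estimate.
            self.estimate_kbps = sample_kbps
            return self.estimate_kbps
        high = max(sample_kbps, self.estimate_kbps)
        low = min(sample_kbps, self.estimate_kbps)
        # Abrupt spikes or drops move the estimate more slowly than regular samples.
        weight = self.abrupt_alpha if high / low >= self.abrupt_ratio else self.alpha
        self.estimate_kbps += weight * (sample_kbps - self.estimate_kbps)
        return self.estimate_kbps
```

Damping abrupt samples is one simple way to keep short spikes from driving the selected bitrate up and down between consecutive segments.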
The Playback Unit (PU) keeps track of the buffer occupancy in order to support continuous playout. PU considers the time taken to download new segments, which is related to the bitrate selected by the adaptation algorithm. Higher bitrates are associated with longer download times, which are also observed in highly loaded networks. Long download times may cause buffer underflow, leading to playback interruptions. Conversely, short download times may cause segments to be dropped due to buffer overflow. To address these issues, two thresholds $B_l$ and $B_h$ are used. When the buffer level is below $B_l$, segments are downloaded at low bitrates to avoid playback interruptions. When the buffer level exceeds $B_h$, a deferring period is first triggered, and then high bitrates are selected to avert the buffer overflow problem.
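The two-threshold logic of PU can be sketched as a simple classification of the buffer level. The function and enum names are hypothetical; the default thresholds of 10 s and 22 s are the values reported for the evaluation later in the paper, and treating a level exactly equal to a threshold as belonging to the middle zone is an assumption.

```python
from enum import Enum


class BufferZone(Enum):
    LOW = "low"        # below B_l: favour low bitrates to avoid underflow
    STEADY = "steady"  # between B_l and B_h: regular bitrate selection
    HIGH = "high"      # above B_h: defer the next request, then allow high bitrates


def classify_buffer(level_s: float, b_low_s: float = 10.0, b_high_s: float = 22.0) -> BufferZone:
    """Map the current buffer level (seconds) to one of three zones."""
    if b_low_s >= b_high_s:
        raise ValueError("B_l must be smaller than B_h")
    if level_s < b_low_s:
        return BufferZone.LOW
    if level_s > b_high_s:
        return BufferZone.HIGH
    return BufferZone.STEADY
```

For example, with these defaults a buffer holding 4 s of content falls in the low zone, while 23 s triggers the deferring period.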
The Quality Variation Monitor (QVM) keeps track of the difference in bitrates among the segments that have already been downloaded. It uses a moving average approach to capture short-term bitrate fluctuations, assigning greater importance to recent segment bitrate changes since they have a higher chance of impacting users' perceived QoE.
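One way to realise such a recency-weighted average is sketched below. The exponential weighting and the `decay` value are assumptions made for illustration; the paper only states that recent bitrate changes receive greater importance.

```python
from typing import Sequence


def quality_variation(bitrates_kbps: Sequence[float], decay: float = 0.5) -> float:
    """Weighted average of absolute bitrate changes between consecutive segments.

    The most recent change has weight 1, the one before it `decay`,
    then `decay**2`, and so on (assumed weighting scheme).
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    diffs = [abs(b - a) for a, b in zip(bitrates_kbps, bitrates_kbps[1:])]
    if not diffs:
        return 0.0
    # The oldest change gets the largest exponent and therefore the smallest weight.
    weights = [decay ** age for age in range(len(diffs) - 1, -1, -1)]
    return sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
```

With this weighting, a 500 Kbps switch on the latest segment contributes twice as much as the same switch one segment earlier.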
The Bitrate Adaptive Unit (BAU) deploys the 360-ADAPT algorithm, which is responsible for choosing the suitable bitrate for the stream from the bitrate list $b$ made available by the DASH MPD. Its decision-making process uses the estimated bandwidth from BE, the buffer level from PU, and the last selected bitrate from QVM. Three bitrate groups are considered: high (H), medium (M), and low (L). BAU creates a table mapping each group to a set of bitrates; e.g., if the list of available video bitrates (Kbps) is $b = \{500, 1000, 1500, 2500, 4000\}$, BAU uses $b$ for group H, $b - \{4000\}$ for group M, and $b - \{4000, 2500\}$ for group L. Note that in case of high network bandwidth, BAU uses L for all priorities. The algorithm which decides the appropriate bitrates for the segments of the 360° stream is presented later in this section.
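The group table from the example above can be reproduced with a short sketch. The function name is hypothetical, and keeping at least the lowest bitrate in every group when the list is very short is an assumption not stated in the text.

```python
from typing import Dict, List, Sequence


def build_group_table(bitrates_kbps: Sequence[int]) -> Dict[str, List[int]]:
    """Map the groups H, M and L to progressively smaller sets of bitrates.

    H keeps the full list, M drops the highest bitrate and L drops the
    two highest, mirroring the worked example in the text.
    """
    ordered = sorted(bitrates_kbps)
    if not ordered:
        raise ValueError("at least one bitrate is required")

    def drop_top(count: int) -> List[int]:
        # Assumption: never return an empty group for short bitrate lists.
        return ordered[: max(1, len(ordered) - count)]

    return {"H": drop_top(0), "M": drop_top(1), "L": drop_top(2)}
```

Called with the list from the example, the table maps L to 500, 1000 and 1500 Kbps, M additionally to 2500 Kbps, and H to all five bitrates.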
Finally, the Scheduler sends HTTP GET requests to the DASH server to request the stream segments with the BAU selected bitrates.

B. Open-RAN Integration
Open-RAN enables hardware and software disaggregation, provides open interfaces and virtualisation, and makes it easy to upgrade the software located in the network. It involves the following components.
The RAN Intelligent Controller (RIC) is defined by the O-RAN Alliance [45] as a logical function within the RAN responsible for controlling and providing intelligence to optimise radio resource allocation, execute handovers, manage interference, and distribute the load between cells. It comprises a near-real-time (Near-RT) controller for tasks requiring a latency of less than 1 second and a non-real-time (Non-RT) controller for tasks with a latency of 1 second or more. Mobile operators can utilize the RIC to deploy and oversee their Open-RAN, ensuring interoperability, vendor diversity, predictive and intelligent resource management, and subscriber QoS.
The Non-Real-Time RIC (Non-RT RIC) is a logical function which allows for latency exceeding 1 second. It is a micro-service-based software platform hosting remote applications (rApps). It is available in two variations: virtual network functions and cloud-native network functions. Configuration management, device management, fault management, performance management, and lifecycle management for all network components are crucial aspects of the Non-RT RIC.
The Near-Real-Time RIC (Near-RT RIC) is a logical function that guarantees a latency of at most one second. It is a micro-service-based software platform hosting extensible applications (xApps), defined as micro-service-based applications that use standardized interfaces and service models to perform radio resource management. The Near-RT RIC, deployable as virtual network functions or cloud-native network functions, is responsible for handover management, real-time traffic and radio monitoring, QoS control, collection and maintenance of historical traffic data, and interaction with the Non-RT RIC.
The Distributed Unit (DU), which was introduced by the Third Generation Partnership Project (3GPP), is part of the evolution towards a disaggregated RAN. The DU is software installed on a commercial off-the-shelf server on-site, responsible for running the Radio Link Control, Medium Access Control, and parts of the Physical layer. Installed near the Radio Unit on-site, the DU includes a subset of the eNB/gNB functions, depending on the chosen functional split.
The Centralized Unit (CU) manages the Radio Resource Control and Packet Data Convergence Protocol layers and oversees DU operations. A gNodeB (gNB) comprises a CU and a DU linked through interfaces for the control and user planes. A CU can support multiple gNBs with multiple DUs. A 5G network can employ varying distributions of the protocol stack between CU and DUs based on mid-haul availability and network design. The CU performs functions such as user data transfer, mobility control, RAN sharing, location, and session management.
The Radio Unit (RU), which is where the radio frequency signals are transmitted, received, amplified and digitized.
For the deployment of 360-ADAPT in an Open-RAN environment, the Channel Quality Indicator Monitor (CQI Monitor) component needs to be deployed in the Near-RT RIC as an xApp. Its services are accessible via an E2 interface. The E2 interface is an open link between the Near-RT RIC and other nodes which uses the E2 Application Protocol (E2AP) to manage communication between the endpoints. E2AP messages can incorporate functionalities through the E2 Service Model (E2SM), including the E2Report service, which collects RAN metrics using the RIC Indication message of type 'report'. One of the metrics gathered by this service is the Channel Quality Indicator (CQI). The CQI data is needed by the 360-ADAPT player to perform its adaptation.
The channel quality, represented by the CQI, is a value ranging from 0 to 15 that indicates the modulation and coding scheme that should be used for the client's data transmissions. This directly affects the highest achievable throughput. With this information, the streaming service can make more informed decisions about suitable bitrates than when relying solely on transport layer information.
The client player, enhanced with the proposed 360-ADAPT, is aided by the Open-RAN CQI Monitor xApp. Following the player's request, the xApp computes an exponential moving average of the latest client CQI values, which is then forwarded to the client via E2AP and an E2Report. The client uses the proposed 360AAA to map the CQI average to the range of video bitrates most appropriate for adaptation [28]. This range is used for the 360AAA-based dynamic adjustment of the video stream's quality in line with the current network conditions. By leveraging the most recent network information via the Open-RAN-supported mechanism, the player adapts more effectively to changing network conditions and utilizes resources more efficiently. In contrast, the default DASH player aggressively increases the bitrate when the CQI rises, leading to congestion and forcing the player to eventually lower the bitrate significantly, causing frequent and sometimes prolonged buffer freezes. The proposed adaptive player, however, quickly identifies the maximum sustainable bitrates based on the CQI measurements and avoids any aggressive behavior, resulting in a more stable (i.e., reduced number of bitrate switches) and higher overall video quality. The joint use of Open-RAN and DASH mechanisms also contributes to savings in terms of energy consumption. These are described in more detail in Section V-B.
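The CQI averaging and the mapping to a bitrate range can be sketched as follows. The exponential moving average follows the description above; the mapping itself is not specified in this section, so the linear scaling of a hypothetical peak rate `peak_kbps` by CQI/15 is purely an illustrative assumption, as are the function names and the smoothing factor.

```python
from typing import List, Sequence


def cqi_ema(cqi_values: Sequence[int], alpha: float = 0.5) -> float:
    """Exponential moving average of CQI reports, oldest first (alpha is assumed)."""
    if not cqi_values:
        raise ValueError("at least one CQI value is required")
    if any(not 0 <= cqi <= 15 for cqi in cqi_values):
        raise ValueError("CQI values range from 0 to 15")
    ema = float(cqi_values[0])
    for cqi in cqi_values[1:]:
        ema = alpha * cqi + (1.0 - alpha) * ema
    return ema


def bitrate_range_for_cqi(avg_cqi: float, bitrates_kbps: Sequence[int],
                          peak_kbps: float = 6000.0) -> List[int]:
    """Return the bitrates considered sustainable for the averaged CQI.

    Hypothetical mapping: the sustainable rate is peak_kbps scaled by
    avg_cqi / 15. The lowest bitrate is always kept as a fallback.
    """
    cap_kbps = peak_kbps * avg_cqi / 15.0
    ordered = sorted(bitrates_kbps)
    eligible = [b for b in ordered if b <= cap_kbps]
    return eligible or ordered[:1]
```

A deployment would replace the linear scaling with a mapping calibrated on measured throughput per CQI level.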

C. 360-ADAPT Adaptation Algorithms (360AAA)
360-ADAPT uses a new adaptive bitrate selection mechanism to identify and dynamically set bitrates for streamed 360° content. The 360AAA algorithms are deployed at the level of BAU.
Videos are split into $k$ segments, each with a duration of $T$ seconds and encoded at $n$ different bitrates (i.e., $b = \{b_1, b_2, \ldots, b_n\}$). The DASH client must choose the bitrate $b_i$ of the $i$-th segment of the video based on the available bandwidth, the playback buffer level and the quality variation observed in the previous segments. The following segment (i.e., $i + 1$) is requested after segment $i$ is completely downloaded. After that, segments are decoded and stored in the playback buffer to be played at the appropriate time.
360AAA selects the most appropriate bitrate for the stream as follows. Let $b_{min}$ and $b_{max}$ be the dynamic bounds of the bitrate list $b$ provided by the DASH MPD. Initially, $b_{min}$ and $b_{max}$ are both set to the lowest bitrate $b_1$ in $b$, as no accurate bandwidth and channel quality estimates exist yet (see Algorithm 1). Following the CQI received from the Near-RT RIC and the downloaded segments, the bounds are adjusted to reflect the estimated bandwidth and channel quality. When there is a tendency for the bandwidth to increase, $b_{max}$ is set to the highest bitrate that remains below the estimated bandwidth $BW_{i-1}$, whereas $b_{min}$ is set to the next higher bitrate available in the list. If the estimated bandwidth decreases, $b_{min}$ is set to a bitrate two levels lower than $b_{max}$ in the list of available bitrates. This restricts the range of bitrates that the algorithm can select from, thereby preventing significant bitrate fluctuations in response to abrupt changes in bandwidth.
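The bound adjustment can be sketched with indices into the sorted bitrate list. Algorithm 1 is not reproduced in this text, so several details are assumptions: the trend is detected by comparing the current and previous bandwidth estimates, $b_{max}$ is recomputed from the estimate in both directions, "next higher" is read as one level above the previous $b_{min}$, and the lower bound never exceeds the upper bound. The function name is hypothetical.

```python
from bisect import bisect_left
from typing import Sequence, Tuple


def update_bounds(bitrates_kbps: Sequence[int], min_idx: int,
                  bw_kbps: float, prev_bw_kbps: float) -> Tuple[int, int]:
    """Return the new (min_idx, max_idx) into the ascending bitrate list."""
    # Index of the highest bitrate strictly below the estimate (0 if none is).
    max_idx = max(bisect_left(bitrates_kbps, bw_kbps) - 1, 0)
    if bw_kbps > prev_bw_kbps:
        # Rising bandwidth: move the lower bound up by one level.
        new_min = min(min_idx + 1, max_idx)
    elif bw_kbps < prev_bw_kbps:
        # Falling bandwidth: lower bound sits two levels below the upper bound.
        new_min = max(max_idx - 2, 0)
    else:
        new_min = min(min_idx, max_idx)
    return new_min, max_idx
```

Starting from both bounds at the lowest bitrate, a rise of the estimate to 3000 Kbps on the list {500, 1000, 1500, 2500, 4000} yields the range 1000 to 2500 Kbps under these assumptions.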
Algorithm 2 describes the bitrate selection process. 360AAA takes a more lenient approach, beginning the download process with the first segment at the lowest supported bitrate (i.e., due to missing bandwidth estimates), while for subsequent segments the bitrate selection relies on the estimated download time $d_i$ and the buffer level, which should not drop to 0. The algorithm maintains a consistent bitrate for successive segments until the buffer level surpasses $B_l$. When the buffer level falls between $B_l$ and $B_h$, 360AAA chooses the bitrate for segment $i$ based on the conditions outlined in Eq. (1). The first condition dictates that the bitrate of segment $i$ for the component with the highest quality (H) group $g$ must not exceed the estimated bandwidth. The second condition stipulates that the bitrate of segment $i$ for components belonging to the groups M and L should not be higher than the bitrate of the component with the highest group quality ($b_i^H$). The third condition specifies that the chosen bitrate must be the neighboring element in the bitrate list relative to $b_{i-1}$, either in descending or ascending order. The final condition indicates that the buffer level must remain above the threshold $B_l$ following the download of segment $i$.
Finally, if the buffer level surpasses $B_h$, the algorithm introduces a waiting period before the next segment request.
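The selection rules described above can be sketched for a single component as follows. Since Algorithm 2 and Eq. (1) are not reproduced in this text, this is an interpretation rather than the exact procedure: it covers the bandwidth, neighbour and buffer conditions, omits the cross-component cap for groups M and L, allows the previous bitrate as a candidate, estimates the download time as bitrate times segment duration over bandwidth, and waits until the buffer drains back to $B_h$. The function name and default parameter values are illustrative.

```python
from typing import Sequence, Tuple


def select_bitrate(bitrates_kbps: Sequence[int], prev_idx: int, bw_kbps: float,
                   buffer_s: float, segment_s: float = 3.0,
                   b_low_s: float = 10.0, b_high_s: float = 22.0) -> Tuple[int, float]:
    """Return (index of the chosen bitrate, waiting time in seconds)."""
    if buffer_s <= b_low_s:
        # Buffer not yet above B_l: keep the bitrate of the previous segment.
        return prev_idx, 0.0
    # Above B_h: defer the request (assumed: until the buffer is back at B_h).
    wait_s = max(buffer_s - b_high_s, 0.0)
    level_s = buffer_s - wait_s
    # Candidates: one level up, the same level, one level down (best first).
    for idx in (prev_idx + 1, prev_idx, prev_idx - 1):
        if not 0 <= idx < len(bitrates_kbps):
            continue
        download_s = bitrates_kbps[idx] * segment_s / bw_kbps
        level_after_s = level_s - download_s + segment_s
        if bitrates_kbps[idx] <= bw_kbps and level_after_s > b_low_s:
            return idx, wait_s
    # No candidate satisfies the conditions: step down one level.
    return max(prev_idx - 1, 0), wait_s
```

Restricting the candidates to neighbouring levels is what keeps the quality changes gradual, at the cost of reaching the highest sustainable bitrate over several segments.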

A. Technical Performance and QoS Analysis
The performance of the proposed solution is compared against other state-of-the-art approaches in terms of metrics such as bitrate behaviour and achieved throughput. Table I depicts the average QoS metrics collected from the player across four different video clips and four algorithms (BOLA, DYNAMIC, THROUGHPUT, and the novel 360-ADAPT solution deploying the 360AAA algorithm proposed in this paper). We observe that:
• 360-ADAPT provides the highest average video bitrate for all clips, reflecting the scores in question 3 of the user tests (see Section VI). 360-ADAPT's average video bitrates are higher than those of BOLA and THROUGHPUT in Clip 1 by 4.6% and 26%, respectively; higher than those of BOLA in Clip 2 by 1.4%; higher than those of BOLA, THROUGHPUT and DYNAMIC in Clip 3 by 0.7%, 0.3% and 0.1%, respectively; and higher than those of THROUGHPUT and DYNAMIC in Clip 4 by 0.3%.
• 360-ADAPT achieved the highest audio bitrate in Clip 1, and the second highest in Clips 3 and 4.
• 360-ADAPT required the shortest time to reach the highest video bitrate in all clips.
Additionally, 360-ADAPT has been compared to the other ABR algorithms using a network trace [46] consisting of periods with changing bandwidth and 100 ms round-trip latency. The network trace also contains the segment length $T$, the encoded bitrates, and the segment size matrix $C$, where $C[i, j]$ represents the size of the $i$-th segment of the video encoded at the $j$-th bitrate. The video description file [47] [...] throughput (ATH), the total reaction time (TRT), defined as the time videos take to begin rendering at the highest sustainable bitrate once the network bandwidth increases, and the accumulated played utility (AU), defined as $AU = \sum_{i=1}^{k} \log(b_i)$. Other parameter values include a segment length of 3 s and thresholds $B_{max}$ = 25 s (8 segments), $B_l$ = 10 s (3 segments) and $B_h$ = 22 s (7 segments).
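Two of these metrics are simple to compute from the sequence of played bitrates, as the following sketch shows. The natural logarithm is assumed for AU, since the base is not stated, and the function names are hypothetical.

```python
import math
from typing import Sequence


def accumulated_utility(bitrates_kbps: Sequence[float]) -> float:
    """AU: sum of log(b_i) over the k played segments (natural log assumed)."""
    return sum(math.log(b) for b in bitrates_kbps)


def bitrate_switches(bitrates_kbps: Sequence[float]) -> int:
    """BTS: number of consecutive segment pairs played at different bitrates."""
    return sum(1 for a, b in zip(bitrates_kbps, bitrates_kbps[1:]) if a != b)
```

Because AU grows logarithmically with the bitrate, a step from 500 to 1000 Kbps adds as much utility as a step from 2000 to 4000 Kbps, which rewards sustained mid-range quality over brief peaks.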
Fig. 4 presents the bitrate of the video playback as a function of the video play time using the network trace. 360-ADAPT and THROUGHPUT achieve a smoother video play quality with fewer bitrate switches than DYNAMIC and BOLA. THROUGHPUT provides the lowest number of bitrate switches, with 360-ADAPT generating a number of bitrate switches that is 40% and 30% lower than BOLA and DYNAMIC, respectively (see BTS in Table II). Fig. 5 shows the accumulated played utility as a function of the video play time using the network trace. 360-ADAPT achieves the highest AU among the ABR algorithms, with an average AU which is 11%, 13%, and 22% higher than that of DYNAMIC, BOLA, and THROUGHPUT, respectively.
There are a few reasons why 360-ADAPT achieves better performance than the other ABR approaches. First, 360-ADAPT limits the number of available bitrates by dynamically adjusting the values of $b_{min}$ and $b_{max}$, while it maintains the selected bitrate for as long as the buffer level allows in order to prevent rebuffering. Second, 360-ADAPT is slower to react to oscillations in bandwidth, to avoid bitrate changes for every short spike or drop in bandwidth (as seen in TRT in Table II). With this, 360-ADAPT incurs fewer bitrate switches than BOLA and DYNAMIC and the highest average throughput (as seen in ATH in Table II), achieving higher AU (i.e., perceived QoE) than the other algorithms. The THROUGHPUT approach performs bandwidth estimations only when it selects the bitrate of segments, implementing a conservative solution that quickly reacts to bandwidth variations in order to prevent the buffer underflow problem. This, however, results in lower average throughput and high reaction time. In this test, none of the tested ABR algorithms created buffering events.

B. Impact on the Energy Consumption
By using real-time CQI values to list suitable video bitrates for 360-ADAPT, Open-RAN can adjust the modulation and coding schemes for data transmissions. This ensures that resources are used efficiently, reducing the need for excessive retransmissions and thereby saving energy. The player with 360-ADAPT uses CQI data to compute an exponential moving average and map it to suitable video bitrates. This adaptive streaming prevents the aggressive bitrate increases seen in default DASH players, which often lead to TCP congestion and inefficient use of network resources. By avoiding unnecessary bitrate fluctuations and congestion, the Open-RAN system reduces the processing power required for handling frequent retransmissions and buffer management. This contributes to overall energy savings in both the network infrastructure and client devices. The ability to quickly identify the maximum sustainable bitrate based on CQI measurements leads to more stable network conditions. A stable network requires less frequent adjustments and error corrections, further conserving energy.
Especially in constrained devices such as smartphones, Open-RAN enables 360-ADAPT to use a suitable range of bitrates (based on CQI data), resulting in a more appropriate PSNR. The default DASH behavior is to increase video quality whenever the network bandwidth allows (e.g., when no buffering time is perceived). For a mobile device, increasing video quality beyond a threshold would not necessarily improve perceived QoE due to screen constraints, but would consume more energy. 360-ADAPT, with the help of the CQI data gathered by Open-RAN, allows high QoE to be achieved with fewer bitrate switches, increasing the smoothness of the immersive videos and maintaining bitrate levels appropriate to the device needs, thereby reducing the overall energy consumption.
Additionally, other energy consumption gains are inherent to the Open-RAN architecture employed by 360-ADAPT. The RU can employ efficient power amplifiers, better cooling systems, and advanced materials to reduce power consumption, and can implement dynamic sleep modes that power down the RU during periods of low traffic. Beamforming and MIMO can also improve the efficiency of spectrum usage, leading to reduced energy per transmitted bit. These approaches make use of KPIs collected at RUs, such as power consumption per RU, energy per transmitted bit, active vs. idle power consumption, sleep mode utilization, transceiver efficiency, and antenna utilization.
The DU can have its functions virtualised on general-purpose hardware, leading to better resource utilization and energy savings through dynamic scaling, edge computing, and efficient signal processing algorithms that reduce the computational load and power usage. This can be achieved by monitoring and reacting to diverse KPIs from the DU, such as CPU utilization, baseband processing efficiency, virtual machine resource utilization, power usage effectiveness, load balancing efficiency and data transfer energy cost. The CU may employ cloud-native technologies for more efficient scaling and resource allocation, improving energy use by monitoring KPIs such as server utilization, energy per processed bit, and cooling efficiency. Centralized processing also enables pooling of resources, which can be dynamically allocated to where they are needed most, reducing waste. Intelligent scheduling algorithms that consider energy consumption can balance the processing load and reduce overall power usage.
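The CQI smoothing step mentioned above can be illustrated with a short sketch that computes an exponential moving average over CQI reports and maps the result to a maximum video bitrate. The smoothing factor, the mapping table and the function names are hypothetical values chosen for illustration; the actual mapping used by 360-ADAPT is not reproduced here. The only standard fact relied upon is that CQI indices lie in the range 0 to 15.

```python
def ema(values, alpha=0.3):
    """Exponential moving average; alpha weights the newest sample."""
    avg = values[0]
    for v in values[1:]:
        avg = alpha * v + (1 - alpha) * avg
    return avg


# Hypothetical mapping: (minimum smoothed CQI, maximum video bitrate in kbps).
CQI_TO_MAX_BITRATE = [(0, 1000), (7, 2500), (10, 5000), (13, 8000)]


def max_bitrate_for_cqi(smoothed_cqi):
    """Return the bitrate cap of the highest tier whose CQI floor is met."""
    cap = CQI_TO_MAX_BITRATE[0][1]
    for min_cqi, bitrate in CQI_TO_MAX_BITRATE:
        if smoothed_cqi >= min_cqi:
            cap = bitrate
    return cap


# Hypothetical CQI reports with one short dip; the EMA damps the dip.
reports = [12, 12, 11, 4, 12, 12]
smoothed = ema(reports)
print(round(smoothed, 2), max_bitrate_for_cqi(smoothed))
```

Because the average moves only a fraction alpha towards each new report, a single poor CQI sample lowers the smoothed value without immediately pulling the bitrate cap down to the lowest tier.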

VI. USER STUDY
A subjective user study was conducted to evaluate the effectiveness of the 360AAA algorithm. This study was designed with reference to ITU-T Recommendation P.913 [19], considering aspects such as study duration and the number of participants. In this section, we provide information regarding the test participants, the assessment protocol, the questionnaires employed, as well as the QoE results and subsequent discussions.

A. Setup
The subjective testing setup was built within a room at the Performance Engineering Lab, Dublin City University (DCU), Ireland. The environment was arranged to minimize any possible disruptions during the testing sessions, adhering to the guidelines of ITU-T Recommendation P.913. All participants were seated 58 cm away from the computer screen, positioned directly in front of it. The seat height was individually adjusted for each user to ensure comfortable screen visibility. Throughout the test sessions, all windows in the room were kept closed to prevent any external noise from interfering with the testing environment.
Each participant viewed four 360-degree videos in different orders according to Table III, with Ambisonic audio (i.e., 4 channels, 48 kHz sample rate, 256 kbps bitrate). The videos were encoded in 3 different qualities: 7680×3840px, 3840×2160px and 1920×960px. The actual playout rate was dynamically selected by the adaptation algorithm employed during testing. Additionally, participants also evaluated the videos without adaptation, each video in a different fixed setting: high (7680×3840px), medium (3840×2160px) and low (1920×960px). All videos were tested across all qualities.
The participants viewed the content on a high-specification PC with the specifications indicated in Table IV. The player running on this PC also collected metrics related to real-time bitrates, bitrate switches and throughput, writing these values into separate text files for each video watched.

B. Participants
The subjective testing included a total of 24 participants, comprising 19 males and 5 females with diverse nationalities, who were invited to DCU. Participant recruitment was conducted through email, and individuals were required to complete a consent form before joining the study. Participants had the option to withdraw from the study at any point if they so desired. Additionally, participants were provided with a plain language statement and a data management plan. These documents offered a comprehensive explanation of the testing scenario, research objectives, data handling and analysis procedures and participant confidentiality safeguards. The study received ethical approval from the DCU Research Ethics Committee.

C. Assessment Protocol
The study focuses on testing the 360-ADAPT solution for adaptation of 360° opera content, which emphasizes the audio component. The solution was deployed in the dash.js library employed in the TRACTION 360 Web Player, which was the video player used in this experiment. Metrics such as average bitrates and quality switches are monitored to assess whether there are performance improvements, with a focus on Ambisonic audio.
Each video presented to participants was played with a different DASH-based adaptive algorithm: the baseline algorithms BOLA [38], DYNAMIC [40] and THROUGHPUT [48], and the novel 360-ADAPT solution deploying the 360AAA algorithm, described in this paper. BOLA uses the playback buffer level to make bitrate decisions for video segments. When the buffer level is high, it selects a high bitrate to avoid buffer overflow. Conversely, when the buffer level is low, it opts for lower bitrates to prevent buffer underflow, which can disrupt playback. THROUGHPUT, on the other hand, determines the appropriate bitrate solely based on estimating the available network bandwidth; higher bandwidth leads to higher bitrate selection. DYNAMIC is a hybrid approach that chooses bitrates by considering both the estimated available bandwidth and the buffer level, leveraging the strengths of both methods. 360-ADAPT, the solution proposed in this paper, maintains the selected bitrate as long as the buffer level allows, aiming to prevent rebuffering and reduce the frequency of bitrate changes due to short-term bandwidth fluctuations.
The participants were divided into four groups, so that the four videos were seen in a different order by each group. Each video was also tested across all four algorithms. The four 360-degree videos viewed by participants were: a) "Che Faro Senza Euridice", recorded in DCU, b) "This Hostel Life part 1", from the Irish National Opera (INO) in Dublin, Ireland, c) "This Hostel Life part 2", also from INO, and finally, d) a snippet of "Romeo and Juliet", from the Gran Teatre del Liceu in Barcelona, Spain. The video durations are 2m10s, 1m40s, 2m55s and 4m10s, respectively. Fig. 6 illustrates relevant frames from the four videos.
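One simple way to obtain the two properties stated above (each group sees the videos in a different order, and each video is played with every algorithm across groups) is a cyclic rotation of the video list. The sketch below is only an illustration of that idea; the actual assignment used in the study is the one given in Table III, and the function name and the position-based pairing rule are assumptions.

```python
def schedule(videos, algorithms):
    """Rotate the video order per group and pair each position with an algorithm.

    Group g sees videos in the order videos[g], videos[g+1], ... (cyclically);
    the video shown at position p is played with algorithms[p].
    """
    n = len(videos)
    if len(algorithms) != n:
        raise ValueError("need as many algorithms as videos")
    return [
        [(videos[(g + pos) % n], algorithms[pos]) for pos in range(n)]
        for g in range(n)
    ]


videos = ["a", "b", "c", "d"]
algorithms = ["BOLA", "DYNAMIC", "THROUGHPUT", "360-ADAPT"]

plan = schedule(videos, algorithms)
for group, sessions in enumerate(plan, start=1):
    print(f"Group {group}:", sessions)
```

This rotation guarantees that every (video, algorithm) pair occurs exactly once across the four groups, but it does not balance the order in which the algorithms themselves are presented; a fully counterbalanced design would need an additional rotation or randomization of the algorithm order.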

D. Questionnaires
Before commencing the experiment, participants were given a demographics questionnaire that covered various aspects of their user profile and familiarity with the technology.
Following each video clip, participants were requested to complete a QoE questionnaire consisting of the 15 questions below. The responses to the questionnaires were collected using a five-point Likert scale, where the options ranged from "strongly disagree" to "strongly agree":
1) The audio improved the immersiveness of the experience.
2) The audio quality was good.
3) The video quality was good.
4) I enjoyed the experience presented.
5) The immersive experience helped me to better assimilate the performance.
6) The immersive experience helped me to be more engaged in opera.
7) I enjoyed watching this opera piece as an immersive experience.
8) The 360°/VR effects were disturbing for me during the video.
9) The colors of the footage are clear/vivid.
10) I believe that the immersive experience is comparable to a live opera.
11) Please, rate the feature: audio quality
12) Please, rate the feature: video quality
13) Please, rate the feature: immersiveness
14) My enjoyment was negatively affected by the video quality.
15) My enjoyment was negatively affected by the audio quality.
At the conclusion of the study, a final usability questionnaire consisting of 9 questions was provided to participants. This questionnaire aimed to assess the viewers' experience with the TRACTION 360 Web Player in terms of user experience and interface design.
All questionnaires employed in the experiment were created with Google Forms and were automatically presented to users.

E. User Study Results and Discussions
The participants were deliberately kept unaware of the content delivery algorithm in use, as well as of the quality levels provided. The sequence of the video clips was also randomized to eliminate bias. The same set of questions was posed for all clips across all operas. Responses, recorded on a Likert scale ranging from 1 to 5 (i.e., (1) strongly disagree; (2) disagree; (3) neutral; (4) agree; (5) strongly agree), were aggregated and averaged for each question relating to the quality levels of the video or audio components.
1) Audio and Video: All four algorithms were applied to each video clip, and the average results for each algorithm are shown in Figs. 7 to 10, with 95% confidence bars.
The average score associated with the perceived audio quality question (Q2) shows that the proposed 360-ADAPT solution provides the highest perceived audio quality across all videos and viewers. A comparison between the four algorithms indicates that 360-ADAPT provides an average audio quality across all videos that is 6.7%, 18.9%, and 6.7% higher than that supported by BOLA, THROUGHPUT, and DYNAMIC, respectively.
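The averaging of Likert responses and the 95% confidence bars can be illustrated with the sketch below. It uses a normal approximation for the interval, which is an assumption made to keep the example within the Python standard library; with 24 participants a Student t quantile (about 2.07 instead of 1.96) would give slightly wider bars. The function name and the sample scores are hypothetical and are not data from the study.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_with_ci(scores, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    m = mean(scores)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * stdev(scores) / math.sqrt(len(scores))
    return m, m - half_width, m + half_width


# Hypothetical Likert responses (1 = strongly disagree ... 5 = strongly agree).
q2_scores = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4, 5, 4]

avg, low, high = mean_with_ci(q2_scores)
print(f"mean = {avg:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```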
Two-sample t-tests assuming unequal variances (Welch's t-test), with α = 0.05, performed on the perceived audio quality results indicated that:
• there is no statistically significant difference between 360-ADAPT and BOLA (t(43) = 0.89, p = 0.38);
• there is a statistically significant difference between 360-ADAPT and THROUGHPUT (t(39) = 2.24, p = 0.03); and
• there is no statistically significant difference between 360-ADAPT and DYNAMIC (t(43) = 0.92, p = 0.36).
Fig. 12. Average results per adaptation algorithm across all videos compared to fixed bitrate (i.e., no adaptation algorithms) in high, medium and low settings.
The average score associated with the perceived video quality question (Q3) shows that the proposed 360-ADAPT solution provides the highest average video quality across all videos. A comparison between the four algorithms tested indicates that the 360-ADAPT solution provides an average video quality across all videos that is 6.4%, 15.4%, and 1.3% higher than that of BOLA, THROUGHPUT, and DYNAMIC, respectively. This is shown in Fig. 11, along with the results of other questions.
Two-sample t-tests assuming unequal variances (Welch's t-test), with α = 0.05, performed on the perceived video quality results indicated that:
• there is no statistically significant difference between 360-ADAPT and BOLA (t(43) = 0.78, p = 0.44);
• there is no statistically significant difference between 360-ADAPT and THROUGHPUT (t(43) = 1.81, p = 0.08); and
• there is no statistically significant difference between 360-ADAPT and DYNAMIC (t(44) = 0.16, p = 0.87).
On average, the proposed 360-ADAPT algorithm outperformed the baseline algorithms owing to its ability to control bitrates and deliver higher quality audio, while maintaining video quality levels at least as good as those supported by alternative solutions. This is achieved by minimizing the number of video stalls and quality switches.
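The two-sample comparisons with unequal variances reported for Q2 and Q3 correspond to Welch's t-test. The sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library; the sample scores are hypothetical, and the p-value is deliberately not computed because it requires the Student t distribution (available, for example, through `scipy.stats.ttest_ind(a, b, equal_var=False)`).

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    va = variance(a) / len(a)   # squared standard error of sample a
    vb = variance(b) / len(b)   # squared standard error of sample b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical Likert scores for two algorithms (not data from the study).
scores_a = [4, 5, 4, 5, 4, 3, 5, 4]
scores_b = [3, 4, 3, 5, 2, 3, 4, 3]

t_stat, dof = welch_t(scores_a, scores_b)
print(f"t({dof:.0f}) = {t_stat:.2f}")
```

The Welch degrees of freedom can never exceed n1 + n2 - 2. Assuming 24 ratings per algorithm, this bound is 46, which is compatible with the values between 39 and 44 reported above.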
A comparison between the ABR algorithms and the scenarios without adaptation (i.e., fixed bitrate quality in low, medium and high settings) is presented in Fig. 12. Participants were asked Questions 1 to 13 from Section VI-D. In the scenarios without adaptation, video quality impacted user perception, especially in the scenario with low resolution (i.e., 1920×960px), with participants disagreeing that the video quality was good (Q3). Interestingly, participants seemed to have a similar and at times better perception of the medium quality footage, indicating that videos with a resolution of 3840×2160px leave a similar impression to videos with high resolution (i.e., 7680×3840px). The perception of the colors of the footage being clear/vivid (Q9) was also more affected in videos with low resolution. In general, all ABR algorithms perform better than fixed bitrate approaches, according to users. Across all questions, 360-ADAPT performed on average 29% better than the fixed bitrate on high settings, according to users.
2) Usability: The TRACTION 360 Web Player was also evaluated in terms of usability. As already mentioned, the participants in the experiment were asked to answer a few questions about the player features and its supported immersiveness. Fig. 13 presents the results of a number of questions that participants answered after watching the videos using the TRACTION 360 Web Player.
Most participants indicated that headphones help the audio to be better enjoyed (88.9%) and that immersive technology enhances opera experiences (82.2%). Participants also left the experiment willing to consume more opera content with immersive media (82.2%).
The majority of the participants (88.9%) agreed or strongly agreed that immersive experiences can help users to be more engaged when consuming multimedia content. Even though most participants disagreed that immersive media can be disturbing or distracting (57.8%), 26.7% were neutral and 20% agreed it can indeed be disturbing or distracting. Participants were more divided when asked if the immersive videos are comparable to live opera. Most participants agreed or strongly agreed that the player is easy to use, that it can be used even without additional written instructions, and that they learned how to use it quickly (86.7%, 95.5% and 95.6%, respectively).

VII. CONCLUSION AND FUTURE RESEARCH DIRECTIONS
This paper introduced 360-ADAPT, an Open-RAN and DASH-based adaptive multimedia streaming solution designed to improve the quality of 360° opera experiences. Unlike existing schemes, the 360-ADAPT solution gives precedence to the audio component over the video, increasing the overall quality of the artistic act. The paper presented a subjective study with real participants which assessed the impact of the delivery solution on both viewers' perceived audio and video quality. The experiment was conducted within the context of the TRACTION project, funded by the European Union's Horizon 2020 programme, which advocates for the collaborative creation and distribution of opera through innovative technologies. A total of 24 participants were involved in the study, where they viewed four opera clips featuring adaptive video and audio quality. The questionnaire responses and the collected metrics indicated that the 360-ADAPT solution effectively facilitated the delivery of superior audio quality while sustaining video quality levels comparable to those supported by other state-of-the-art solutions, while also offering energy consumption gains. For future work, we aim to support multi-stream bitrate adaptation for immersive live content streaming.

Algorithm 1: Dynamic Adjustment of $b_{min}$ and $b_{max}$
Result: $b_{min}$ and $b_{max}$

Fig. 4. Bitrate of the video playback as a function of the video play time for the various ABR algorithms.

Fig. 5. Total played utility as a function of the video play time.

Fig. 6. Videos employed in the experiment: a) "Che Faro Senza Euridice", recorded in DCU, Ireland, b) "This Hostel Life part 1", from INO, Ireland, c) "This Hostel Life part 2", also from INO, Ireland, and d) "Romeo and Juliet", from the Gran Teatre del Liceu, Spain.

Fig. 11. Average results per adaptation algorithm tested with the immersive adaptive player across all videos.
All ABR algorithms are evaluated in terms of the total number of bitrate switches (BTS), the average incurred