eTOM-Conformant IMS Assurance Management



Introduction
QoS management mechanisms as defined by 3GPP (Raouyane B. et al., 2009) can be viewed as a network-centric approach to QoS, providing a signalling chain able to automatically configure the network to provision a determined QoS to services on demand and in real time, for instance on top of a DiffServ-enabled network. However, deploying such technology in a carrier-grade context would require significant further effort. In particular, premium paid-for services with SLA (Service Level Agreement) contracts, such as those targeted by IMS networks (Poikselka and Georg, 2009), would require additional mechanisms able to provide some degree of monitoring in order to assess the SLAs, whereas IMS by itself does not provide such mechanisms.
The eTOM (enhanced Telecom Operations Map) (Creaner and Reilly, 2005) functional framework is a widespread reference used to model and analyze network and service activity. From an eTOM point of view, one could argue that IMS does indeed cover the Fulfilment part of service management, but lacks any means to carry out service Assurance. The eTOM framework proposes a complete set of hierarchically layered processes describing all operator activities in a standard way. It is furthermore supported by a parallel specification of a standard information model, the SID (Shared Information/Data) (TMF GB926 Release 4, 2004). It has to be noted, however, that both tools, the eTOM and the SID, are generic. Also, the eTOM was designed at a time when services were viewed as centrally controlled and managed, whereas the IMS is really a distributed, layered network.
The work presented in this contribution is an attempt to achieve Assurance functionality for QoS-enhanced IMS services following strictly the eTOM specification, thus filling the functional gap as analyzed earlier; furthermore, two architectures are proposed to be compared: a centralized one and a distributed one.

IMS and service provisioning
The composition of the supply chain in an NGN network is classically described with three layers. The access layer provides IP (v4 or v6) connectivity regardless of the access technology (wireless or wireline). The service layer therefore supports technology-agnostic services that are developed independently. The core layer, i.e. the control layer, is the IMS system, which provides the complex signalling responsible for routing sessions between users, invoking services and security-related tasks (Figure 1). Information processing and management are carried out by nodes called CSCF (Call Session Control Function) and HSS (Home Subscriber Server). The IMS system introduces a control environment similar to that of circuit-switched (CS) sessions, but in the packet-switched (PS) domain. In addition to access unification and diversity of services, IMS introduced a flexible and capable QoS management architecture which organizes exchanges of QoS-related requirements between the control and access layers, allowing resource reservation mechanisms to offer the best supply conditions for, e.g., multimedia services.
The service provisioning mechanism of IMS includes three consecutive steps impacting resources: Reservation, Activation and Release (Figure 2). When a user requests an IMS multimedia service by SIP signalling (Rosenberg et al., 2002) through its attached P-CSCF, the P-CSCF must ensure resource availability before forwarding this request; this verification is performed through the exchange of Diameter (Korhonen et al., 2010) messages during all media negotiation stages between the two ends (User and AS). An agreement between the client and server finally changes the resource status from reserved to activated. Naturally, the PCEF (Policy and Charging Enforcement Function) (3GPP TS 29.210, 2006) applies the relevant QoS policy according to the types of access and transport layers; the most used models are DiffServ (Blake et al., 1998), RSVP (Wroclawski J., 1997) and MPLS (Le Faucheur et al). Resources are released at the end of each session: the P-CSCF must announce the end of the multimedia session to the PCRF (Policy and Charging Rules Function) (3GPP TR 23.803, 2005), and the PCRF notifies the PCEF so that the reserved resources are released for other applications. QoS management in IMS is thus a quite flexible, on-demand mechanism.
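The three provisioning steps can be pictured as a small state machine. The following Python sketch is illustrative only: the class and state names are hypothetical, and the assumption that a reserved resource may be released without ever being activated (for instance when media negotiation fails) is ours, not taken from the 3GPP specifications.

```python
from enum import Enum


class ResourceState(Enum):
    IDLE = "idle"
    RESERVED = "reserved"
    ACTIVATED = "activated"
    RELEASED = "released"


# Allowed transitions for the Reservation, Activation and Release steps.
# Releasing directly from RESERVED is an assumption made for this sketch.
_TRANSITIONS = {
    ("reserve", ResourceState.IDLE): ResourceState.RESERVED,
    ("activate", ResourceState.RESERVED): ResourceState.ACTIVATED,
    ("release", ResourceState.RESERVED): ResourceState.RELEASED,
    ("release", ResourceState.ACTIVATED): ResourceState.RELEASED,
}


class MediaResource:
    """Hypothetical bookkeeping for the resources of one multimedia session."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.state = ResourceState.IDLE
        self.history = [self.state]

    def apply(self, step):
        """Apply 'reserve', 'activate' or 'release'; reject invalid orderings."""
        key = (step, self.state)
        if key not in _TRANSITIONS:
            raise ValueError(f"cannot {step} a resource in state {self.state.value}")
        self.state = _TRANSITIONS[key]
        self.history.append(self.state)
        return self.state
```

Calling `apply("reserve")`, `apply("activate")` and `apply("release")` in that order walks a session through its whole life cycle, while any other ordering raises an error.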

eTOM (enhanced Telecom Operations Map) architecture
The eTOM is a framework proposed by the TeleManagement Forum that provides a standardized, telecom-oriented Business Process map covering all functions of an operator, including service integration and supply. The decomposition layers and functional areas (Customer Service, Resource, and Enterprise) allow a detailed analysis of operations and the development of solutions within a well-defined environment. The eTOM has been standardized by the ITU-T (TeleManagement Forum GB921 D, 2010).
The operational part of the eTOM has three main areas: Fulfilment, Assurance and Billing. This section presents only the processes related to Assurance, and focuses on execution scenarios of SLA (Service Level Agreement)-enhanced services.

eTOM processes
The 'Operations' area is the traditional heart of the business of a service provider (SP). It includes all processes that support customer (and network) operations and management, combining customer support, management, provisioning and relationships with partners (Figure 3). The horizontal and vertical process groupings form a matrix of level 2 processes, many of them derived from the TOM, which are connected to customer and support operations (FAB). A more detailed view of the eTOM business process model (ITU-T Recommendation M.3050.3, 2004) shows a grouping of vertical processes called the FAB columns. These processes are necessary to support operations dedicated to customer satisfaction and operator management:
- Fulfilment: a vertical grouping of end-to-end (E2E) processes which provide requested services to customers in a timely and accurate manner. It reflects business activity. The processes inform customers of their order status and ensure on-time completion and customer satisfaction.
- Assurance: a vertical grouping of E2E processes responsible for carrying out proactive and reactive maintenance activities to ensure that services are always available and delivered correctly with respect to the SLA. The processes continuously monitor resource status and performance in a proactive way to detect possible defects. They collect and analyse performance data to identify potential problems. In case of trouble or SLA violation, the relevant processes are activated to inform the customer about service and trouble status, and to attempt restoration or repair.
- Billing: a vertical grouping of E2E processes responsible for collecting the appropriate usage records and producing accurate and timely bills, providing information on the resources and services used for the customer's payment processing. In addition, it handles billing requests from customers, reports billing status and investigations, and is responsible for resolving billing issues to the customer's satisfaction. These processes also support the prepayment of services.
In addition to the FAB process columns, the Operations area proposes the following horizontal process groupings.

Customer Relationship Management (CRM):
This group of processes supports knowledge of customer needs and includes all the features necessary for the acquisition, improvement and maintenance of a relationship with a customer. It focuses on service and support, and also on retention management, cross-selling, up-selling and direct marketing. CRM also covers the collection of customer and application information, and the customization of service delivery to customers. The processes are responsible for identifying opportunities to increase the value of the customer to the company. CRM applies to traditional interactions between customer and enterprise.

Service Management & Operations (SM&O):
This group focuses on services (access, connectivity, content, solution, composition, etc.). It includes all the features necessary for the management and operation of the communication and information services required by or proposed to customers. The focus is on service delivery and management, as opposed to the management of the underlying network and information technology. Some functions involve short-term service capacity planning for a service instance, applying a service design to specific customers or managing service improvement initiatives. These functions are closely related to the actual experience of the customer. The processes in this group are responsible for meeting, at a minimum, QoS goals, including process performance and customer satisfaction with service levels and service costs.

Resource Management & Operations (RM&O):
This group is responsible for managing all the resources (e.g. networks, computer systems, servers, routers, etc.) used to provide and support the services required by or proposed to customers. The group also contains all the features responsible for the direct management of these resources (network elements, routers, servers, etc.) used in business processes inside the operator. These processes are responsible for ensuring that the network infrastructure supports E2E service provisioning. They ensure that the infrastructure works properly and is available to services and managers as needed. The RM&O group also has a function that collects information from various sources (e.g. network elements (NE) and/or element management systems (EMS)), then integrates, correlates and, in many cases, summarizes the data to be transmitted as relevant information to the service management system. This group also includes the processes involved in traditional network element management (NEM), because these processes are essential elements of any resource management process. RM&O processes thus manage the service provider's network and overall infrastructure to ensure reliable interaction with other service providers.

Supplier/Partner Relationship Management (S/PRM):
This process group supports all FAB business processes: Fulfilment, Assurance and Billing. The processes include issuing requisitions and monitoring them until delivery, mediation of requests that must conform to external processes, validating billing and authorizing payment, as well as management quality of suppliers and partners. When an operator sells its products to a partner or supplier, this is done through the CRM business processes, acting on behalf of the supplier or enterprise in such cases.

Shared Information/Data (SID)
Naturally, the exchange of information between processes is crucial in the eTOM. The detailed specification of the information supporting the eTOM is provided by the SID information framework (Figure 4). The SID provides an information model capable of representing the dynamic and static information of business processes, and it respects the decomposition of the eTOM. The SID specification makes extensive use of UML class diagrams.

Execution workflows in the eTOM
The eTOM flows during execution scenarios of SLA-monitored service deliveries describe interactions between business processes as well as the information messages that are exchanged in order to handle both cases: the normal execution and the SLA violation.

Normal execution
Normal execution is the state of service delivery without SLA violation, in which the customer is billed according to the services offered and the resources reserved. The operation activates a set of processes and many messages are exchanged between them; SLA verification requires a mapping between Key Performance Indicators (KPIs) and Key Quality Indicators (KQIs) related to service and resource instances.
SLA verification activates a number of separate processes (Figure 5) which are able to assess QoS according to their positions in the different layers: Customer, Service and Resource. It involves the following processes:
- Resource Data Collection & Distribution: this process is responsible for the collection of indicators and performance data by contacting all resource agents that provide monitoring, configuration and performance data. The process is also responsible for collecting performance indicators (KPIs) and metrics for all services running in the network. Furthermore, it redistributes performance data to other processes after aggregation and structuring.
- Resource Performance Management: this process reports the collected KPIs after filtering and aggregation. The reports provide a structured view of the KPIs and a preliminary detection of exceeded thresholds.
- Service Quality Management: this process performs the mapping between KPIs and KQIs; it identifies the quality indicators (KQIs) of each service before determining the actions required to calculate them. KQI values are used to identify the causes of QoS degradation in an SLA violation, such as a resource failure or a lack of capacity.
- Customer QoS/SLA Management: this process is responsible for checking measured QoS against SLA thresholds. After retrieving the KQIs from the Service Quality Management process and receiving a preliminary report, the process imports the customer profile and SLA parameters to identify the thresholds for comparison. It also manages the reports of management systems and provides a comprehensive report on the service (metrics, KQIs, key performance indicators, resource use, etc.).
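The division of labour between Service Quality Management (KPI to KQI mapping) and Customer QoS/SLA Management (threshold comparison) can be sketched as follows. This is a minimal illustration, not the platform's code: the KPI names, the KQI formulas and the `SlaThreshold` structure are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaThreshold:
    """One SLA bound on a KQI (hypothetical structure)."""
    kqi: str
    limit: float
    higher_is_better: bool = False


def map_kpis_to_kqis(kpis):
    """Service Quality Management step: derive KQIs from raw KPIs.

    The formulas below are illustrative assumptions.
    """
    sent = kpis["packets_sent"]
    samples = kpis["delay_samples"]
    return {
        "loss_ratio": kpis["packets_lost"] / sent if sent else 0.0,
        "mean_delay_ms": kpis["delay_sum_ms"] / samples if samples else 0.0,
        "throughput_kbps": kpis["bytes_received"] * 8 / 1000 / kpis["interval_s"],
    }


def check_sla(kqis, thresholds):
    """Customer QoS/SLA Management step: return the names of violated KQIs."""
    violated = []
    for t in thresholds:
        value = kqis[t.kqi]
        ok = value >= t.limit if t.higher_is_better else value <= t.limit
        if not ok:
            violated.append(t.kqi)
    return violated
```

An empty list returned by `check_sla` corresponds to a normal execution; a non-empty list names the KQIs that should appear in the violation report.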
The workflow of the SLA verification consists of the following steps:
1. When a client requests an IMS service (e.g. VoD video streaming), the provisioning or "ordering" operation activates all agents in the network to monitor performance indicators and record their values in log files.
2. Resource Data Collection & Distribution retrieves the KPIs and metrics collected from the different entities in the network. Afterwards, it communicates with the RPM (Resource Performance Management) to identify critical values and generate performance reports.
3. The collected KPIs are sent as XML to Service Quality Management, which identifies the KQIs, performs the mapping function, and compares the results with the thresholds specific to the requested service.
4. Customer QoS/SLA Management uses the loaded customer profile to identify the product thresholds to apply to the collected data, prior to drafting the audit report of the SLA against QoS.
5. The Billing & Collection Management process performs the charging and rating functions with the received information in order to produce bills.
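Step 3 of the workflow states that KPIs travel as XML between processes, but the text does not give the schema. The sketch below therefore assumes a hypothetical `kpiReport` document with one `kpi` element per indicator, and shows the round trip with Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def kpis_to_xml(service_id, kpis):
    """Serialize KPIs into a hypothetical <kpiReport> document (assumed schema)."""
    root = ET.Element("kpiReport", {"service": service_id})
    for name, value in sorted(kpis.items()):
        item = ET.SubElement(root, "kpi", {"name": name})
        item.text = repr(float(value))
    return ET.tostring(root, encoding="unicode")


def xml_to_kpis(document):
    """Parse the document back into (service_id, {kpi_name: value})."""
    root = ET.fromstring(document)
    values = {k.get("name"): float(k.text) for k in root.iter("kpi")}
    return root.get("service"), values
```

In a deployment following the chapter's design, such a payload would be carried inside a SOAP body; the envelope is omitted here to keep the example short.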

SLA violation
The SLA violation scenario begins with a simple verification as above, but in this case a threshold violation occurs. The eTOM then provides an escalation mechanism: first, the Resource Layer attempts to solve the problem locally, while warning the Service Layer so that it can plan alternative solutions. If the trouble persists, the Service processes must perform an alternative service configuration. Real-time continuous monitoring of the provided services allows early alerts concerning exceeded thresholds and resource failure alarms, which are the main causes of SLA violations and non-conformance. Most interactions occur within the Assurance processes, but some also involve the Fulfilment processes, and the violation is taken into account for reimbursement through the Billing processes.
Two specific processes handle the escalation mechanism described above: the Service Problem Management and Resource Trouble Management processes (Figure 6). The goal of these two processes is to restore services and resources quickly, and to locate troubles before they spread, with an optional notification to the user.
The operation is initiated by the usual collection of data by the RDC&P process, when it detects an exceeded threshold. The process sends the relevant information to RPM to alert the RTM process; in case of a component failure the communication is done directly between RDC&P and RPM.
The RPM process sends details to the Service Quality Management (SQM) process and to the RTM process, depending on the type of trouble, in order to start the resource restoration procedures; for each attempt it notifies the Service Problem Management (SPM) process so that their information about troubles stays synchronized (Figure 7). If the cooperation between the Service Problem Management and Resource Trouble Management processes is unsuccessful, the SC&A process is activated to perform its own corrective action, such as a new configuration. The new configuration takes into account all resource constraints, infrastructure developments and service contract terms.
The reconfiguration proposed by SC&A follows exactly the steps of the Ordering operation; it is finalized by launching a normal SLA verification and by attempting to close all open trouble reports in SPM and RTM. The CQoS/SLAM process can inform the customer about service restoration and quality, with the possibility of sending a QoS report.
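The escalation described in this section (local correction by RTM, then SPM, then a new configuration by SC&A) can be summarized as a short control flow. The sketch below is a simplification under stated assumptions: each layer is represented by a hypothetical callable returning `True` when the service is back within its SLA, and the number of resource-level attempts is an arbitrary parameter.

```python
def handle_violation(resource_fix, service_fix, reconfigure, max_resource_attempts=2):
    """Escalate a violation: RTM (Resource) -> SPM (Service) -> SC&A (reconfiguration).

    All three arguments are hypothetical callables that return True on success.
    Returns the level that restored the service and a log of the steps taken.
    """
    log = []
    for attempt in range(1, max_resource_attempts + 1):
        # Each RTM attempt is reported to SPM so both keep the same trouble state.
        log.append(f"RTM attempt {attempt}; SPM notified")
        if resource_fix():
            log.append("restored by RTM")
            return "resource", log
    log.append("SPM attempts a service-level correction")
    if service_fix():
        log.append("restored by SPM")
        return "service", log
    log.append("SC&A applies a new configuration")
    if reconfigure():
        log.append("restored by SC&A; normal SLA verification relaunched")
        return "reconfiguration", log
    log.append("unresolved")
    return "unresolved", log
```

The returned level indicates how far the escalation had to go, which is the information needed later to close the corresponding trouble reports.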

Issues
3GPP specifications provide a basic QoS management architecture for the IMS network which ensures an adequate level of service compared to best effort service. However, the IMS services need to be monitored and managed by a set of mechanisms and methods taking into consideration constraints of the business enterprise. Such a set is explicitly proposed by the eTOM. The eTOM describes its operations and processes in ways that are generic and applicable to any transaction and promises to be fully applicable to the IMS architecture with no applicability constraints. The next step of the study is therefore to plan a mapping strategy in order to map eTOM processes to IMS functions.

Functional architecture
A first step in this undertaking is to match IMS functionality with eTOM processes. The resulting set has furthermore to be enriched by eTOM processes relevant to Assurance and Fulfilment. This broader set forms the basis to select different SID entities necessary to carry out these processes. The SLA execution procedure as defined in eTOM model requires the cooperation of several processes belonging to Assurance and Fulfilment of the 'FAB' area, and spanning the three business layers: Customer, Resource, and Service. These eTOM processes will be activated sequentially (Figure 8).

Fig. 8. eTOM and IMS interactions.
The processes belonging to the Assurance layer correspond to the monitoring aspect of this operation, related to Fulfilment for restoration and supply. In order to link eTOM processes to the IMS network, a new component entitled Monitoring, Configuration, Data Collection is required, which clusters the core modules to communicate with these entities.
In the IMS network, the diversity of entities and their various communication protocols require multi-protocol components which can implement all the necessary monitoring and correction operations. An additional constraint is that performance data collection and the detection of services should be executed in real time or near real time.

Design
The WSOA (Web Service Oriented Architecture) appears as a valid choice for such a distributed system. The SOA (Mark and Hansen, 2007) concepts make it possible to implement EJB-based (Rima, Gerald, and Micah, 2006) SOA modules supporting the processes of each component, and exposing web service communications via XML/SOAP (Simple Object Access Protocol)/HTTP (Newcomer E., 2002). Three SOA modules have been designed, each supporting a part of the targeted eTOM business processes and their associated SID entities. In addition, a BPEL (Business Process Execution Language) (Poornachandra, Matjaz, and Benny, 2006) component has been designed to orchestrate the various processes and to organize the desired operations (Figure 9).

Centralized architecture
The initial architecture is centralized and enables selective monitoring of the consecutive operations related to the SLA and its verification. This system made it possible to demonstrate the steps of the verification operations, the different KPIs and KQIs of a service, and some operational limitations (Raouyane B. et al., 2011).
In order to simplify SLM&M, the number of exposed web services has been limited to eTOM level 3 business processes. Naturally, level 4 processes are implemented via appropriate methods within the web services.
- The Application Server agent scans Application Server activity in order to identify the customer parameters;
- The Router agents perform network analysis tasks in order to calculate the KPIs that will be transmitted to the SOA modules.
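As an illustration of the kind of calculation a Router agent could perform, the sketch below derives three common KPIs (loss ratio, mean one-way delay, jitter) from per-packet timestamps. The input format is an assumption, and jitter is computed here as the mean absolute difference between consecutive delays, which is a simplification and not the smoothed estimator of RFC 3550.

```python
def qos_metrics(packets):
    """Compute basic KPIs from a list of (sent_s, received_s) pairs.

    received_s is None for a lost packet; packets are in sending order.
    The record layout is hypothetical.
    """
    delays = [rx - tx for tx, rx in packets if rx is not None]
    lost = len(packets) - len(delays)
    loss_ratio = lost / len(packets) if packets else 0.0
    mean_delay = sum(delays) / len(delays) if delays else 0.0
    # Simplified jitter: mean absolute variation between consecutive delays.
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return {"loss_ratio": loss_ratio, "mean_delay_s": mean_delay, "jitter_s": jitter}
```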

Distributed architecture and continuous monitoring
To reduce SLM&M complexity, an enhanced architecture splits the RDC&P into many smaller distributed and decentralized components. Additionally, continuous monitoring functionality has been added to the system (Figure 11). However, the centralized web services that allow KPI and KQI monitoring still rely on BPEL technology and remain very resource- and time-consuming. Indeed, distributing the Resource processes makes it possible not only to share the processing of KPIs but also to evaluate services locally. Thus, the distribution of EJB modules becomes necessary to incorporate mechanisms for local monitoring and also to allow a local correction of QoS degradation and anomalies.
The new functional architecture of SLM&M (Figure 12) therefore consists of two main modules:
- Assurance Layer: represents the SLA verification process as defined in the eTOM for both the Customer and Service layers. It hosts the processes related to the Fulfilment and Assurance operations; the associated information and data are stored in the Customer Inventory and Service Inventory.
- Monitoring Layer:
  • Is distributed, and contains a set of agents and probes able to retrieve all data in real time (signalling, logging, reservation, configuration, policy, router status, etc.); it implements all Resource layer processes for SLA verification, namely Resource Data Collection & Processing (RDC&P) and Resource Performance Management (RPM), which are related to each IMS layer (Access, Control, Service).
  • Contains a set of processes that are functional in ordering and other SLA operations of WS-Resource, as well as a synchronization module necessary for the detection and control of events in the network, such as planning activities and communications, on the one hand between the distributed modules (first part) and on the other hand between the exposed Web services (Layer).

The proposed functional architecture supports three communication channels between the different modules:
- TCP/IP between the agents and the synchronization module;
- SOAP/XML between the eTOM layers (Resource, Service, Customer) or WSs;
- optionally, XMLCONF between the synchronization module and management in case of SLA violation.
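The text specifies TCP/IP between the agents and the synchronization module but not the message format. As a hedged illustration, the sketch below assumes newline-delimited JSON records (the field names are hypothetical) and shows how a receiver can reassemble samples from a TCP byte stream, where a single read may contain a partial record or several records.

```python
import json


def encode_sample(agent_id, metric, value):
    """Serialize one sample as a newline-terminated JSON line (assumed format)."""
    record = {"agent": agent_id, "metric": metric, "value": value}
    return (json.dumps(record, sort_keys=True) + "\n").encode("utf-8")


class LineDecoder:
    """Reassemble samples from arbitrary chunks of a TCP byte stream."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add received bytes and return the list of complete samples so far."""
        self._buffer += chunk
        # Everything before the last newline is complete; the rest is kept.
        *lines, self._buffer = self._buffer.split(b"\n")
        return [json.loads(line) for line in lines if line]
```

An agent would pass the bytes from `encode_sample` to `socket.sendall`, and the synchronization module would call `feed` on each result of `socket.recv`.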

Correction architecture
The architecture of SLM&M consists of two layers, Assurance and Monitoring, by analogy with the previous (distributed) architecture. The Monitoring layer is distributed and contains only two main processes, for the collection and the processing of information.
New processes must be integrated in a centralized way; a distributed integration could overload the collection agents in the routers. In a router, for example, memory is crucial; hosting the collection agents and the RDC&P and RPM processes is reasonable as long as they are limited to performance collection and local data processing. However, the addition of another process could overload the router, which needs its capacity for traffic conditioning and processing.
The Resource Trouble Management (RTM) process catches alarms that reflect a degradation of service resulting from a physical or logical fault related to equipment; this process then tries to make a preliminary correction of the service and notifies WS-Service.
The WS-synchronization process is located in the same server as WS-Resource, so that this server can synchronize incoming events and data collection, and decide either to perform a normal SLA verification, or to report a violation. Also, the Resource Provisioning process is responsible for making resource reservations with respect to solution recommendations provided by WS-Service. WS-Service adds to its repertoire Service Problem Management (SPM) processes.
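The decision taken by WS-synchronization (launch a normal SLA verification or report a violation) can be sketched as a simple function over incoming events. The event structure, the source names and the rule that any alarm or any KPI above its threshold counts as a violation are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentEvent:
    """Hypothetical event received from a monitoring agent."""
    source: str          # e.g. "edge-router-1"
    kind: str            # "kpi" for a periodic sample, "alarm" for a failure
    metric: str = ""
    value: float = 0.0


def decide(events, thresholds):
    """Return ("violation", reasons) or ("verify", []) for a batch of events.

    thresholds maps a metric name to its upper bound (assumed convention).
    """
    reasons = []
    for ev in events:
        if ev.kind == "alarm":
            reasons.append(f"{ev.source}: alarm")
        elif ev.kind == "kpi" and ev.metric in thresholds:
            if ev.value > thresholds[ev.metric]:
                reasons.append(f"{ev.source}: {ev.metric} above threshold")
    return ("violation", reasons) if reasons else ("verify", [])
```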
The WS components of SLM&M interact through SOAP/HTTP, whereas the interaction between the Monitoring layer of SLM&M and the IMS layers uses Java-based client/server communication, with the fallback possibility of using XML/RPC (Mi-Jung et al., 2004) between the (Resource, IMS ASs) and WS-Resource modules.

Implementation and results
The implementation encompasses three fundamental components:
- The IMS network for service delivery: control entities (CSCF, HSS) and an Application Server for VoD video streaming.
- The QoS management: PCEF and PCRF.
- The monitoring and management system: SLM&M.

Trial infrastructure in SLA verification
The trial architecture exposes the function of each component.
The test bed (Figure 13) is composed of:
- a core router and two edge routers (Linux boxes) defining a DiffServ-enabled network to which an IMS terminal and an Application Server are connected;
- the OpenIMS (Open IMS Core) system, deployed in the core router Linux box, which controls this network;
- a management server that supports the QoS monitoring/Assurance functionality.

Fig. 13. Centralized trial infrastructure.

Trial infrastructure for SLA violation
The implementation architecture features two sub-architectures for the service provisioning and for service management and correction services.
The supply architecture contains an IMS network that includes both the signalling and the media planes. It includes three routers to transmit the media stream; the central router hosts the IMS system. The PCRF (Policy and Charging Rules Function) becomes an autonomous entity and includes other features such as policy management, and both edge routers include the PCEF (Policy and Charging Enforcement Function) functionality to receive and execute policies or PCC rules (Figure 14).
Monitoring and management architecture: SLM&M is divided into two layers:
- Monitoring Layer: contains the two WSs, Resource and Synchronization, together with the RP and RTM processes and the Resource Inventory, so that the layer includes the functionality of the PCRF for QoS management and control.
- Assurance Layer: contains the two servers WS-Customer and WS-Service, and integrates the Fulfilment and Assurance processes that are activated for SLA violation and correction.
The supply architecture provides a set of IMS services. When a client requests a VoD streaming service, the provisioning chain triggers the IMS entities to perform resource reservation and QoS management. After the ordering operation, the SLM&M starts collecting configuration data for a normal SLA execution. The detection of an anomaly or of an exceeded threshold activates the SLA violation processes in order to restore the service to its normal level.
The SLM&M must be reactive: rapid detection of QoS degradation or anomalies is followed by an attempt to resolve the trouble, which activates the Assurance processes and, if necessary, the Fulfilment processes.

Scenarios
The test scenario includes three cases for a customer, Alice, who has the Platinum class and requests a VoD service:
- the client receives the service with perfect QoS;
- the network is overloaded, with a slight QoS degradation;
- the network is congested and the SLA is violated.

Results
The results expose several parameters relating to monitoring service and performance of SLM&M in centralized and distributed architecture, and response time in trouble detection and SLA violation.

SLA verification in centralized architecture:
Case 1: the QoS offered to Alice and Bob matches the SLA contract, and the perceived video quality is satisfactory (Figure 15).
Case 2: the network conditions, and hence the video quality, deteriorate in proportion to the volume of competing services, resulting in lost packets and a reduced flow rate (Figure 16).
Case 3: competing services overload the routers: queues fill up in the gateways, impacting delay and jitter. Routers discard excess packets, which causes static pixels in the video (Figure 17) and in some cases service cancellation.
The platform succeeds in accurately identifying the deterioration of delivered services. The cost in terms of response time has been evaluated as well. It is observed that the response time of Resource-WS is much longer than that of the other web services, due to the complexity of its tasks (Figures 9, 10). The number of web services and their internal functionality have a considerable impact on the running time of SLA verification. This led to limiting the exposed eTOM processes to level 3 and to implementing sub-processes via internal Java methods.

Centralized vs distributed architecture
The execution time of SLA verification is composite, and is directly related to the processing time in each WS. This time varies depending on the number of planned operations, the WS state and the SLM&M design. Similarly, the nature of the communication technology between entities plays a vital role in reducing complexity and processing, which highlights the advantage of using TCP/IP to exchange service parameters and performance indicators and to transmit them to higher levels in order to achieve continuous monitoring.
The response time of WS-synchronization and of the other resource agents is short compared to that of WS-Resource in SLA verification, because processing is performed locally in each device. This has the potential to reduce traffic between the Monitoring and Assurance layers, and it reduces response time compared to the centralized architecture in the three cases (Figure 19). Processing the parameter flow at the entity level allows real-time control of multiple QoS parameters and services. However, a router must keep enough memory for traffic conditioning, and running control functions such as processing and comparison with thresholds can reduce its available CPU and memory capacity.
The distributed architecture provides a Monitoring layer that alerts the Assurance layer within a very short time, or in near real time, and allows rapid processing of SLA violations, compared to the other architecture, where several processing steps must be executed to detect QoS degradation or the failure of an entity.

SLA violation
Alice has registered in the IMS system with the QoS class Gold. The goal is to perform SLA Assurance tests in three representative cases and to compare the results of SLA correction with Assurance and with Fulfilment. The MOS-AV indicator reflects customer satisfaction. When detection thresholds are exceeded or MOS-V values become critical, the SLA violation procedure launches a first correction attempt, which succeeded in restoring normal service levels after 7 seconds. The second violation, which requires the intervention of both Assurance and Fulfilment, takes 17 seconds to restore the service. These results are explained by the architecture, which uses WSOA and relies on interactions between the different WSs and on successive attempted solutions.

Conclusion
The proposed approach is based on the QoS provisioning architecture proposed by 3GPP, extended with the eTOM Assurance capabilities for QoS monitoring. SLM&M uses the concepts of SOA and BPEL for managing and monitoring the network, and must also meet several instrumentation constraints. The first version of the platform was centralized, processing all performance data in a central node. This design offered SLA verification but remained isolated from the IMS network and from real events, whereas the IMS network requires permanent, real-time monitoring rather than just a sporadic SLA verification.
The solution of distributing the Resource layer processes of the eTOM and adapting them to each layer of the IMS (Access, Control, and Service) is appropriate, as is the creation of a WS-Synchronization, which synchronizes the operations of the WS-Resource processes and their agents in the network layer and, in addition, provides operations in the Monitoring layer of SLM&M. The distributed architecture demonstrates its capability in terms of response time and is preferable to a centralized SLM&M.
The life cycle of SLM&M has three main stages: real-time monitoring of services and resources to detect anomalies or degradation, followed by a reactive stage to correct troubles, and finally a proactive stage that estimates the behaviour of services and resources by correlation and root cause analysis of service impact. As future work, this proactive capability will be integrated with QoS mechanisms that predict behaviour from current data, using a mathematical model or stochastic processes.
