1 Introduction

Virtualization has been widely adopted in constructing the infrastructure of cloud computing platforms for its flexibility, elasticity and ease of management. It enables multiple virtual machines (VMs) to run on a single physical server for cost-effective consolidation, while preserving logical isolation among the co-located VMs. However, for network intensive applications, such as HPC MPI applications, Hadoop MapReduce jobs and Memcached processing, high density network traffic usually generates heavy workload stress for I/O data movement and I/O event processing across VM boundaries [1,2,3]. Inter-VM communication inefficiency is a well-known problem in virtualized cloud environments: the communication overhead between co-located VMs can even be as high as that between VMs residing on different hosts [4,5,6,7].

Recent research advances in network I/O virtualization have focused on improving inter-VM network communication performance through SDN and NFV [8, 9]. Representative technologies include SR-IOV, Virtio and DPDK [10,11,12,13,14]. They improve communication performance either by reducing the overhead between the VMs and the physical device or by supporting direct access to the packets on the NIC without going through the OS kernel. However, they cannot bypass the network stack and are not designed to mitigate the overhead of inter-VM communication between co-located VMs.

With the current trend towards more cores and increasing computing capability per physical server, the opportunity to deploy more co-located VMs on the same physical host is growing. Researchers have proposed approaches to reduce the latency and enhance the bandwidth of communication between co-located VMs. Among these approaches, residency aware inter-VM communication optimizations enable co-located VMs to transfer data via pre-established fast shared memory channels, bypassing the traditional long network I/O path used by VMs residing on separate physical hosts [2, 14,15,16,17,18,19,20,21,22,23]. If two VMs are co-located, they transfer data via a shared memory based channel (local mode); otherwise they use the traditional network path (remote mode).

However, co-located VM membership evolves due to the dynamic addition of new VMs or the removal of existing ones, caused by events such as VM creation, VM migration in/out and VM shutdown on the current host machine. Moreover, with the demand for live migration enabled load balancing, fault tolerance, and VM replacement and deployment for higher resource utilization, the frequency of such events keeps increasing. Furthermore, VMs are by design not directly aware of the existence of one another, due to the abstraction and isolation provided by virtualization, which makes it difficult to determine whether two VMs are co-located.

There are two categories of methods to maintain co-located VM membership in a deterministic way: static and dynamic. The static method collects or registers the membership of co-located VMs at system configuration time, prior to runtime [18, 19]. In contrast, the dynamic method automatically detects changes of co-located VM membership and updates the membership transparently and dynamically [2, 15,16,17]. For Xen, a dynamic polling method emerged, which is asynchronous and needs centralized management by the privileged domain (Dom0); XenLoop is a typical implementation [7]. XenLoop periodically gathers co-located VM membership from the VMs residing on the same host. After the membership changes, the co-location information is not sent to the VMs on the same host until the next polling cycle, which introduces delayed updates and may lead to some level of inconsistency. Moreover, the polling overhead is incurred whether or not the membership changes, and the overhead of periodic status collection rises with the number of co-located VMs.

In this paper, we focus on Xen based approaches. We propose CoKeeper, a dynamic event driven method for co-located VM detection and membership update. We design CoKeeper to meet the following criteria, which we believe are required for residency aware inter-VM communication optimizations:

  • Fast response: the detection and update of co-location information should be synchronous, and the refreshed co-located VM membership should become visible immediately upon dynamic VM addition or removal, within an acceptable response time, to avoid errors that may arise from delayed updates.

  • Low overhead: the detection of VM addition or removal should consume less computing and network resources than existing methods. The protocol for co-located VM membership update should therefore be designed and implemented to incur low CPU and network overhead.

  • Transparency: neither intervention from the cloud provider nor application level programming effort from the cloud tenant should be required. Neither the applications nor the OS kernel need to be modified to incorporate the proposed mechanism.

  • VM scalability: thanks to strengthened hardware and enhanced software stacks, the number of co-located VMs deployed on the same physical server keeps increasing. The proposed method should remain stable in overhead as the number of VMs deployed on the same host rises.

To demonstrate its ability, we implement a prototype of CoKeeper that satisfies the above criteria. CoKeeper requires no user level intervention and is transparent to both user level applications and the OS kernel. It does not require centralized involvement of Dom0 and provides fresher and more consistent VM co-location information. In contrast to the polling approach with fixed polling cycles, CoKeeper consumes less CPU, and its communication overhead is low and stays stable as the number of co-located VMs increases. We implement CoKeeper on Xen as an integral part of our XenVMC, a residency aware inter-VM communication accelerator for virtualized clouds.

The rest of the paper is organized as follows. Section 2 analyzes existing work. Section 3 discusses several key issues in designing and implementing a dynamic co-located VM detection and membership update mechanism. Section 4 presents our implementation. Section 5 presents the experimental evaluation and results. Section 6 concludes the paper.

2 Related Work

2.1 Detecting VM Co-residency with Heuristics for Cross-VM Side Channel Scenario

VMs in virtualized clouds offer large scale and flexible computing ability. However, virtualization also introduces a range of new vulnerabilities: malicious entities in an attack VM can extract sensitive information from other VMs via cross-VM side channel attacks, which break the isolation between VMs. The primary goal of security oriented heuristic methods is to ascertain whether a VM is co-located with another by carefully designed measurements on network traffic, CPU cache, etc., extracting characteristics such as workload or time consumed over a monitored cross-VM side channel [24,25,26,27].

HomeAlone is an early representative work that uses side channels to monitor for co-located attack VMs [24]. It exploits side channels via the L2 memory cache to detect undesired co-residency as a defensive mechanism: the cloud tenant coordinates its collaborating VMs so that they keep selected portions of the cache silent for a period of time and measures whether the cache has been accessed by other VMs during the resulting quiescent period, which would suggest the presence of a potential attack VM sharing the same host. From the perspective of a malicious VM, achieving co-residency with a victim VM in the cloud allows it to launch various side-channel attacks that target information leakage. [27] undertakes extensive experiments on EC2 to explore the ways in which, and the effectiveness with which, an attacker can achieve co-residency with a victim VM, and discovers timing based side channels that can be used to ascertain co-residency. [25, 26] propose co-located watermarking, a traffic analysis attack that allows a malicious co-located VM to inject a watermark signature into the network flow of a target VM; it leverages active traffic analysis, observing data such as the throughput of received traffic, to determine VM co-residency.

Heuristic methods require the intervention of the cloud provider or application level programming effort from the cloud tenant. Instead of maintaining the membership of all co-located VMs, the detection is conducted only between the attacker and the victim. It usually takes more than a few seconds to obtain the analysis results, and high accuracy detection requires even more time. High overhead and probabilistic inaccuracy prevent this category of approaches from being used for residency aware inter-VM communication optimizations.

2.2 Co-located VM Detection and Membership Update for Shared Memory Based Inter-VM Communication

For existing static co-located VM detection methods, co-location information is pre-configured [17, 18], and no membership update is allowed during runtime unless the static file with membership information is modified and the co-located VM list is updated manually via an extended API. XWAY refers to a static file that lists all co-located VMs. IVC initially registers the co-location information in the backend driver of Dom0 in a static manner. Since the static method is not user level transparent and cannot automatically detect runtime co-located VM membership updates, the dynamic method emerged [7]. For Xen, co-located VM membership changes can be detected by examining the per-domain information provided by XenStore [28]. XenStore is a configuration and status information storage space hosted by Dom0 and shared by all domains. The information in XenStore is organized hierarchically, like directories in a file system. Each domain gets its own path in the store, while Dom0 can access the entire hierarchy. Domains can be notified of status changes or updates in XenStore by setting watchers on items of interest in the hierarchy.

Among previous work, only XenLoop and MMNet support the dynamic method. XenLoop provides a polling approach in which a domain discovery module in Dom0 periodically scans all guest entries (items) in XenStore, where each entry represents a guest's state. The module then advertises the updated information to all the co-located VMs covered by the existing entries, and the VMs update their local co-located VM lists. MMNet does not require a coordinator in Dom0: each VM on the physical node writes to the XenStore directory to advertise its presence and watches for membership updates. When a VM addition, removal or migration occurs, the IP routing tables are updated accordingly. Technical details of MMNet are not available.
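For illustration, a Dom0 discovery module of the kind XenLoop uses could enumerate the guest entries with the Linux xenbus kernel API roughly as follows. This is a minimal sketch under our own naming, not XenLoop's actual code; error handling and the advertisement step are omitted.

```c
#include <linux/kernel.h>
#include <linux/err.h>
#include <linux/slab.h>
#include <xen/xenbus.h>

static void scan_colocated_domains(void)
{
	char **doms;
	unsigned int i, num;

	/* XenStore keeps one sub-directory per domain, named by its ID. */
	doms = xenbus_directory(XBT_NIL, "/local/domain", "", &num);
	if (IS_ERR(doms))
		return;

	for (i = 0; i < num; i++)
		pr_info("co-located domain id: %s\n", doms[i]);

	kfree(doms);	/* the returned vector is a single allocation */
}
```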

Different from heuristic methods for the security scenario, existing work on shared memory based inter-VM communication achieves lower overhead and can check co-location with deterministic results. Such work usually provides protocols for co-located VM detection and membership update covering all the VMs on a physical node, instead of only two or a subset of the co-located VMs.

3 Design Issues

3.1 Design Goals

VM live migration support is an important capability provided by the Xen platform. When VM migration occurs, if the co-located VM membership is not updated with fast response, race conditions may occur. For instance, if VM1 on host A is migrated to host B and its communicating peer VM2 on host A is not notified of the change, VM2 will still try to communicate with VM1 via the previously established shared memory channel, which will lead to connection failures and shared memory release errors, since VM1 is no longer present on host A. Similarly, whether for guest VM addition or for running guest VM removal, it is also necessary to ensure that the co-location information is made visible to the other VMs on the same physical node immediately after the event occurs and is kept up to date.

The primary goal of our design is fast response upon events such as VM addition, removal or migration, to avoid possible inconsistency caused by race conditions due to untimely perception and handling of co-located VM membership changes. All operations for status collection and update are expected to finish within milliseconds. Low overhead is another goal. To reduce resource consumption and to avoid race conditions, we employ the idea of actions on events in our design: co-located VM membership changes are caught as soon as they occur by the watchers set on XenStore items. Transparency is an important feature from the viewpoint of system administrators and end users. We believe that by encapsulating the expected functionality into loadable OS kernel modules, instead of placing it at user level, less overhead for data exchange between user applications and the OS kernel is introduced. A final design goal is VM scalability. With the polling method, Dom0 gathers co-located VM membership from all VMs one by one, which introduces overhead nearly linear in the number of co-located VMs. We need the mechanism to work stably as the number of VMs deployed on the same physical node grows from small to large.

3.2 Analysis of VM Membership Modifications

xl is a representative tool provided by Xen for administrators to manage VMs. It provides a set of console commands with parameters that affect the existence status of the operated VMs. We analyze the xl management commands and find that the commands which can change a guest VM's existence status can be semantically converted into three basic operations or combinations of them: VM addition, VM removal and VM migration. We summarize the substitution in Table 1.

Table 1. xl commands on guest VMs and their semantic substitutes.
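As an illustration of this substitution, each such command reduces to one of three basic operations; the specific command-to-operation pairs in the comments below are our assumptions about typical table entries, not a copy of Table 1.

```c
/* Semantic substitution behind Table 1 (illustrative). */
enum vm_basic_op {
	VM_ADDITION,	/* e.g. xl create, xl restore */
	VM_REMOVAL,	/* e.g. xl destroy, xl shutdown, xl save */
	VM_MIGRATION,	/* e.g. xl migrate; xl reboot = removal + addition */
};
```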

3.3 Event and Message Based Light Weight Protocol

Upon the occurrence of VM existence status changes, actions should be invoked directly or indirectly in Dom0 and DomU to react to and propagate such changes, by exchanging information and triggering related handlers in a collaborative way. However, Xen provides no protocol for such synchronous processing. Therefore, we propose CoKeeper, an event and message based lightweight protocol, which serves as a fundamental facility for exchanging residency status for dynamic co-located VM detection and membership update.

CoKeeper enables fast reaction and information propagation both inside and across the boundaries of VMs. Events and messages differ in that events are visible only to the VM where they are created, while messages are transmitted between VMs. When a VM existence status change is captured, an event is created directly inside a guest domain or the privileged domain and handled by the VM where it was created; the operations defined in the handlers then invoke message exchanges across VM boundaries, which in turn generate corresponding events indirectly in the related VMs. Different from the Xen event channel, CoKeeper is lightweight and does not incur context switches across VMs and the virtual machine monitor (VMM). The sketch below makes the distinction concrete.
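The following structures illustrate the event/message split; the field names are our own illustration, assuming peers are identified by a domain ID, and are not taken from the XenVMC source.

```c
#include <linux/list.h>
#include <xen/interface/xen.h>	/* domid_t */

/* An event is local: it lives in the event list of the VM that created
 * it and is handled there; it never leaves that VM. */
struct vmc_event {
	struct list_head link;	/* FIFO event link list (Sect. 4.1) */
	int type;		/* e.g. VM Self-Joining */
	domid_t domid;		/* domain the event refers to */
};

/* A message crosses VM boundaries and, on receipt, indirectly creates a
 * corresponding event inside the receiving VM. */
struct vmc_msg {
	int type;		/* e.g. DomU Registration */
	domid_t src, dst;	/* sender and receiver domains */
	/* payload: identity of the VM whose existence status changed */
};
```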

3.4 Co-located VM Detection and Membership Update: Overview

As an indispensable part of the XenVMC kernel modules, CoKeeper enables XenVMC to be aware of dynamic co-located VM existence changes and to update the membership adaptively. An overview of CoKeeper and its components (blue-colored) is shown in Fig. 1, which also illustrates CoKeeper's interactive control flows spanning the host domain and the guest domains, as well as the data delivery path and other control flows from a sender VM to a receiver VM through XenVMC's backend and frontend kernel modules. With watchers registered on specific items in XenStore, CoKeeper detects membership changes. Each domain maintains a local co-located VM list, and with the self-defined event and message protocol, CoKeeper refreshes the list for each VM dynamically.

Fig. 1. CoKeeper in XenVMC: overview.

Co-located VM Detection.

One of the key issues for co-located VM detection is to catch co-located VM existence status changes. Therefore, we register watchers on the related XenStore items in both Dom0 and DomU. As shown in Fig. 1, to discover VM addition and removal, the co-located VM existence detector in Dom0 registers a watcher on the item "/local/domain/<Dom-ID>". This item is only accessible by Dom0 and serves as a monitor of the guest VM's existence. For VM live migration support, a key issue is to perceive the migration so that it is possible to prepare for it beforehand. Thus, in the VM migration monitor of every DomU (shown in Fig. 1), we register a watcher on the item "/control/shutdown" in XenStore, whose value changes from null to "suspend" to indicate that the VM is about to migrate. To pause the original migration process for possible communication mode switches, pending data handling, etc., we unregister the original watcher and re-register it at the end of the FIFO watcher list, so that it is invoked after our pre-handling.
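A DomU-side migration watcher of this kind can be registered with the xenbus watch API of Linux 3.16, whose callback takes a vec/len pair. The sketch below uses our own handler and function names; the handler body only marks where the actual XenVMC logic would run.

```c
#include <linux/err.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <xen/xenbus.h>

static void shutdown_changed(struct xenbus_watch *watch,
			     const char **vec, unsigned int len)
{
	char *val;

	val = (char *)xenbus_read(XBT_NIL, "control", "shutdown", NULL);
	if (IS_ERR(val))
		return;
	if (strcmp(val, "suspend") == 0) {
		/* this DomU is about to migrate out: create a
		 * VM Self-Preparing to Migrate event here */
	}
	kfree(val);
}

static struct xenbus_watch shutdown_watch = {
	.node	  = "control/shutdown",
	.callback = shutdown_changed,
};

static int __init migration_watch_init(void)
{
	return register_xenbus_watch(&shutdown_watch);
}
module_init(migration_watch_init);
MODULE_LICENSE("GPL");
```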

Co-located VM Membership Update.

Since VMs are not directly aware of the existence of one another, we use the proposed event and message protocol to update the co-located VM lists of the VMs on the same physical node. The basic idea of the event driven approach is as follows: we define a series of events and messages and implement their handlers; upon detection of a VM co-location change event, the VM that discovers the change refreshes its local co-located VM list first and then propagates the messages invoked by the event; the VMs that receive the messages update their local co-located VM lists synchronously. If a VM is communicating with another guest VM and is about to migrate into or out of the current node, then before the migration: i) pending data remaining from former local/remote connections must be handled properly to avoid data loss, ii) the communication mode switches from local to remote, or vice versa, and iii) the shared memory is released if the previous communication mode was local. A sketch of the per-domain membership list appears below.
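The following is a minimal sketch of the local co-located VM list each domain is described as maintaining; field and function names are assumptions, not XenVMC's actual identifiers.

```c
#include <linux/errno.h>
#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <xen/interface/xen.h>	/* domid_t */

struct covm_entry {
	struct list_head link;
	domid_t domid;
	unsigned char mac[6];	/* lets a sender tell local peers from remote ones */
};

static LIST_HEAD(covm_list);
static DEFINE_SPINLOCK(covm_lock);

/* Called from an event handler when a peer VM appears on this host. */
static int covm_add(domid_t id, const unsigned char *mac)
{
	struct covm_entry *e = kmalloc(sizeof(*e), GFP_KERNEL);

	if (!e)
		return -ENOMEM;
	e->domid = id;
	memcpy(e->mac, mac, sizeof(e->mac));
	spin_lock(&covm_lock);
	list_add_tail(&e->link, &covm_list);
	spin_unlock(&covm_lock);
	return 0;
}
```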

4 Implementation

To ensure a high degree of transparency, we encapsulate the functional components of CoKeeper into XenVMC's frontend kernel modules in the DomUs and its backend kernel module in Dom0. Since Linux kernel modules can be dynamically loaded into and unloaded from the kernel, CoKeeper can be seamlessly incorporated into the kernel on demand, transparently, as an integral part of the XenVMC modules. We implement CoKeeper on Xen-4.6 with Linux kernel 3.16 LTS.

4.1 Events and Messages: Definition and Implementation

Dom0 and DomU play different roles in the processes of co-located VM status change detection and membership update. We define the events and messages for Dom0 and DomU separately, as illustrated in Tables 2 and 3.

Table 2. xl commands on guest VMs and the events created accordingly.
Table 3. Messages invoked and exchanged across VM boundaries.

As far as co-located VM membership is concerned, Dom0 handles three types of events and receives two types of messages, while DomU handles five types of events and receives four types of messages. Dom0 and DomU coordinate to propagate the status changes and update the co-located VM membership.
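For concreteness, the event and message sets can be written out as enums. The identifiers below are our reconstruction from the flows described in Sect. 4.2, not the names used in the XenVMC source; the counts match the figures above.

```c
enum cokeeper_event {
	/* handled by Dom0 (three types) */
	EV_DOMU_REGISTRATION,
	EV_DOMU_DELETION,
	EV_DOMU_MIGRATING,
	/* handled by DomU (five types) */
	EV_SELF_JOINING,
	EV_SELF_PREPARING_TO_MIGRATE,
	EV_OTHER_VM_ADDITION,
	EV_OTHER_VM_DELETION,
	EV_OTHER_VM_MIGRATING,
};

enum cokeeper_msg {
	/* received by Dom0 (two types) */
	MSG_DOMU_REGISTRATION,
	MSG_DOMU_MIGRATING,
	/* received by DomU (four types) */
	MSG_REGISTRATION_RESPONSE,
	MSG_OTHER_VM_ADDITION,
	MSG_OTHER_VM_DELETION,
	MSG_OTHER_VM_MIGRATING,
};
```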

We implement the event handlers and the message processing mechanism in both Dom0 and DomU. For event handling, Dom0 and each DomU maintain their own event link list, organized as a FIFO queue. The handlers are woken up by the insertion of new events into the list. In addition, when CoKeeper is enabled, a kernel thread runs in each domain to handle the events in its event link list. We implement the message mechanism in the XenVMC kernel modules to ensure user level transparency and fast message processing. We define ETH_P_VMC, a new packet type in the Linux kernel network stack, through which messages are transmitted between Dom0 and DomU in either direction. A sketch of this receive path follows.
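The sketch below shows how such a private packet type and handler thread can be wired together with standard Linux 3.16 kernel APIs (dev_add_pack, kthread_run). The ethertype value is a placeholder: the paper does not give the value XenVMC actually uses, and the handler bodies only mark where the real logic would go.

```c
#include <linux/err.h>
#include <linux/kthread.h>
#include <linux/list.h>
#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/spinlock.h>
#include <linux/wait.h>

#define ETH_P_VMC 0x88B6	/* placeholder: IEEE 802 local experimental ethertype */

static LIST_HEAD(event_fifo);	/* the FIFO event link list */
static DEFINE_SPINLOCK(ev_lock);
static DECLARE_WAIT_QUEUE_HEAD(ev_wq);
static struct task_struct *ev_task;

/* Runs whenever a frame of type ETH_P_VMC arrives. */
static int vmc_rcv(struct sk_buff *skb, struct net_device *dev,
		   struct packet_type *pt, struct net_device *orig_dev)
{
	/* translate the message in skb->data into a local event and
	 * insert it at the tail of event_fifo under ev_lock ... */
	wake_up_interruptible(&ev_wq);	/* wake the handler thread */
	consume_skb(skb);
	return NET_RX_SUCCESS;
}

static struct packet_type vmc_ptype = {
	.type = cpu_to_be16(ETH_P_VMC),
	.func = vmc_rcv,
};

/* Kernel thread that drains the FIFO and dispatches event handlers. */
static int vmc_event_thread(void *unused)
{
	while (!kthread_should_stop()) {
		wait_event_interruptible(ev_wq,
					 !list_empty(&event_fifo) ||
					 kthread_should_stop());
		/* pop events from event_fifo and run their handlers */
	}
	return 0;
}

static int __init vmc_msg_init(void)
{
	dev_add_pack(&vmc_ptype);
	ev_task = kthread_run(vmc_event_thread, NULL, "cokeeper-ev");
	return PTR_ERR_OR_ZERO(ev_task);
}
```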

4.2 Event and Message Based Protocol: A Global Perspective

VM Addition.

When a guest VM is added, the initialization function of CoKeeper creates a VM Self-Joining event. The event handling thread in this DomU is then woken up; it generates a DomU Registration message and sends it to Dom0, which creates a DomU Registration event accordingly. The handling thread in Dom0 deals with this event by adding the new DomU to its local co-located VM list. Afterwards, Dom0 sends a VM Registration Response message to the new DomU, which contains the information of all the other co-located VMs and triggers the update of the new DomU's co-located VM list. Dom0 then sends an Other VM Addition message, carrying the information of the newly added VM, to all the other DomUs. On receiving the message, these DomUs create Other VM Addition events and update their co-located VM lists accordingly. The process is shown in Fig. 2.
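The Dom0 side of this flow can be sketched as follows, reusing the hypothetical covm_* list and message enums from the sketches above; send_msg is a stub standing in for "build an ETH_P_VMC frame and transmit it to the given domain".

```c
static void send_msg(domid_t dst, enum cokeeper_msg type)
{
	/* fill a vmc_msg, wrap it in an ETH_P_VMC frame, transmit */
}

static void dom0_handle_registration(domid_t newdom,
				     const unsigned char *mac)
{
	struct covm_entry *peer;

	covm_add(newdom, mac);		/* 1. extend Dom0's local list */

	/* 2. reply with the full membership so the newcomer can build
	 *    its own co-located VM list */
	send_msg(newdom, MSG_REGISTRATION_RESPONSE);

	/* 3. tell every other DomU about the newly added VM */
	spin_lock(&covm_lock);
	list_for_each_entry(peer, &covm_list, link)
		if (peer->domid != newdom)
			send_msg(peer->domid, MSG_OTHER_VM_ADDITION);
	spin_unlock(&covm_lock);
}
```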

Fig. 2. The process of co-located VM membership update: VM addition.

VM Removal.

When a DomU is about to be removed, Dom0 detects the status change of the item "/local/domain/<Dom-ID>" in XenStore through the watcher registered on it. Dom0 then creates a DomU Deletion event, whose handler removes the DomU from the co-located VM list and creates an Other VM Deletion message. The message is sent to all the other co-located DomUs, which delete the removed DomU from their co-located VM lists. These DomUs then deal with the remaining operations related to the removed DomU, such as transferring the remaining data (if any) in the shared memory channel and releasing the shared memory buffer. The process is illustrated in Fig. 3.

Fig. 3. The process of co-located VM membership update: VM removal.

VM Migration.

The process for VM migration out is shown in Fig. 4. When a DomU is ready to migrate, the item "/control/shutdown" in XenStore for this DomU changes from null to "suspend".

Fig. 4. The process of co-located VM membership update: VM migration.

The change is caught by the watcher, which leads to the creation of a VM Self-Preparing to Migrate event in this DomU. The handler is then woken up, creates a DomU Migrating message and sends it to Dom0. Dom0 receives the message and creates a DomU Migrating event, which it handles by broadcasting an Other VM Migrating message to all the other DomUs on the physical node and removing the migrating DomU from its co-located VM list. When the other DomUs receive the message, they create Other VM Migrating events, which invoke the event handling threads to deal with the remaining operations related to the migrating DomU. The migrating VM is then removed from the local co-located VM lists of all the other DomUs. The process of VM migration in is similar.

5 Evaluation

CoKeeper is evaluated in four aspects: 1) we evaluate the response time to verify the effectiveness of CoKeeper over the polling based method; 2) we compare the CPU and network overhead of CoKeeper and the polling based method under two circumstances, i.e., no co-located VM membership change, and membership changes due to VM addition, deletion or migration; 3) we conduct experiments with real network workloads and confirm that CoKeeper is capable of handling co-located VM membership changes with user level and Linux kernel transparency guaranteed; 4) we show that the overhead of co-located VM membership maintenance for CoKeeper remains more stable than that of the polling based method as the number of co-located VMs increases.

Among existing Xen based work, only XenLoop and MMNet support dynamic co-located VM membership maintenance. Since MMNet is not open source, XenLoop is used as the representative implementation of the polling based method for comparison. Note that we do not compare CoKeeper with heuristics based approaches for the cross-VM side channel scenario: they often take several seconds to detect co-located VMs and produce probabilistic results, which makes them infeasible for fast inter-VM communication.

The experiments are carried out on our two-server test bed. The larger server has 8 Xeon E5 2.4 GHz CPUs and 64 GB main memory. The other one has an Intel i5-4460 3.2 GHz CPU and 4 GB main memory. All reported results were obtained on Xen-4.6 with Linux kernel 3.16 LTS. To enable the comparison, we ported XenLoop to the target Linux kernel and Xen versions without modifying its functionality.

5.1 Evaluation of Response Time

We measure the response time from the moment a co-located VM existence change occurs to the moment when the co-located VM lists of all the VMs on the same physical node are updated. For the polling based method, the response time is composed of three parts: i) the time from the occurrence of the event to the beginning of the next polling point, ii) the time to detect the membership change, and iii) the time to update the membership for all the co-located VMs. For CoKeeper, only the detection time and the update time remain (a simple model follows). The comparison is carried out mainly on the larger server; the other server serves as a destination machine for VMs on the larger one to migrate out.
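Assuming events arrive uniformly at random within a polling cycle of length C (a simplification of ours, not a claim from the measurements), the two response times decompose as:

```latex
T_{\mathrm{poll}} = t_{\mathrm{wait}} + t_{\mathrm{detect}} + t_{\mathrm{update}},
\qquad t_{\mathrm{wait}} \sim \mathcal{U}[0, C],\;
\mathbb{E}[t_{\mathrm{wait}}] = C/2,
\qquad
T_{\mathrm{CoKeeper}} = t_{\mathrm{detect}} + t_{\mathrm{update}}
```

With C = 1000 ms the waiting term alone averages about 500 ms, which is consistent with the "about half of the cycle" observation in the results below.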

The first experiment is conducted under scenarios where the number of co-located VMs varies from 1 to 10. Co-located VM membership change events are arranged in the sequence VM1 addition, …, VM10 addition, VM10 removal, …, VM1 removal, VM1 addition, …, VM10 addition, VM10 migration out, …, VM1 migration out. Since the process of VM migration in is similar to that of VM addition, we do not include VM migration in in the tests. We evaluate the response time by running the above sequence three times, both for the polling based method represented by XenLoop and for CoKeeper. During the tests, the clocks of all VMs are synchronized with that of Dom0, with synchronization accuracy kept below 100 ns, and the polling cycle of XenLoop is set to 1000 ms. Figure 5 shows the experimental results.

Fig. 5. Average response time of polling based method and event driven method.

Figure 5(a) shows that the response time of the polling based method fluctuates between 100 ms and 900 ms when the number of co-located VMs ranges from 1 to 10. As presented in Fig. 5(b), CoKeeper's response time is between 10 ms and 40 ms, more than an order of magnitude lower than that of the polling based method and more stable. The reason is that CoKeeper does not collect membership changes periodically; instead, changes are immediately made visible to all the other VMs through the synchronous notification approach.

We also evaluate the response time of the polling based method with different polling cycle configurations, setting the cycle to 500 ms, 1000 ms and 2000 ms respectively. Under each configuration, we randomly issue xl commands which lead to membership changes. Figure 6 shows the average response time of XenLoop and CoKeeper under scenarios with 10 and 20 deployed co-located VMs. Each bar in Fig. 6 represents the average response time over 10 events under the same configuration.

Fig. 6. Average response time comparison.

The experimental results show that the average response time of the polling based method is bounded by the polling cycle, since it must wait for the next polling point to detect events and then handle them. The average response time is about half of the cycle; it can be reduced by setting the polling cycle to a smaller value, at the price of other problems such as higher overhead. The response time of CoKeeper is between 10 ms and 40 ms, which makes an update visible to the co-located VMs almost immediately after the event occurs. CoKeeper thus achieves a much lower response time than the polling based method and is capable of avoiding race conditions at realistic frequencies of co-located VM membership change. In addition, we find that event detection and message exchange account for most of the consumed time, while the time spent on event and message handling is only a few milliseconds.

5.2 Evaluation of the CPU and Network Overhead

We deploy 10 VMs on the larger server, each with 1 VCPU and 1 GB RAM. Then we arrange two experiments, each lasting 30 min. The first covers the scenario of no co-located VM membership changes. The results show that the CPU overhead of the polling based method is higher than that of CoKeeper, and both are stable. For network overhead, we count how many packets are sent during the tests; the polling based method again incurs higher overhead, because it must check the status at the beginning of every cycle, whereas the event driven method only needs to initialize the co-located VM list when a VM is deployed and its shared memory based acceleration is enabled, with no additional maintenance while the VM existence status does not change. The second experiment covers the scenario of co-located VM membership changes: for the 10 deployed VMs, the membership changes every 3 min. We randomly pick a group of changes in the sequence illustrated in Table 4 and measure the CPU overhead and the number of network packets for membership maintenance every 2 min.

Table 4. Co-located VM membership changes.

The results are shown in Fig. 7.

Fig. 7. CPU and network overhead: polling based method vs. CoKeeper.

The overhead of the polling based method increases rapidly over time: from the 2nd minute to the 30th minute, the CPU overhead rises from 10^3 ms to 10^4 ms, and the network overhead goes up from 10^3 packets to over 10^4 packets. When the co-located VM membership changes, the overhead of the event driven method increases as well; its CPU and network overhead values are quite low, but not zero. Generally, for CoKeeper, the CPU overhead is a few milliseconds and the network overhead is on the order of tens of packets, because the event driven method does not create events and messages until the co-located VM existence status changes.

5.3 User Level and OS Kernel Transparency

We conduct a series of tests with VM addition, VM migration and VM removal to validate that CoKeeper ensures user level and OS kernel transparency, using four widely used real world network applications on Linux systems: ssh, scp, samba and httpd. For each group of tests, to enable the shared memory based inter-VM communication optimization, we load XenVMC's frontend and backend modules into the kernels of the guest and host domains of our two-server test bed.

First, VM1 and VM2 are deployed on the smaller and the larger server respectively. The tests are as follows: 1) log in to VM2 from VM1 and operate via ssh; 2) use scp to copy files from VM2 to VM1; 3) start the samba server in VM2 and mount its shared directory from VM1; 4) start the httpd server in VM2 and browse its html pages from VM1 with a web browser. The system logs show that the two VMs communicate in remote mode.

Then VM3, a newly added VM, is deployed on the larger server. VM3 communicates with VM2 using the same sequence of operations, with ssh, scp, samba and httpd as application level workloads, and the system logs indicate that they automatically communicate in local mode. During the communication, VM3 is live migrated from the larger server to the other server. The migration is detected and handled correctly without data loss, and the communication between VM3 and VM2 transparently switches from local mode to remote mode. After all the communication completes, we use scp to copy files from VM3 to VM1; the logs show that VM3 and VM1 communicate in local mode. Then VM1 is shut down, and from the logs we find that the co-located VM lists of both the host domain and VM3 are updated accordingly.

In the above tests, we use the original network applications and Linux kernels without code modification. The experimental results show that CoKeeper is capable of maintaining the co-located VM membership with user level and Linux kernel transparency.

5.4 Co-located VM Scalability

We measure the time spent on co-located VM membership maintenance under different numbers of VMs deployed on the same physical node, to evaluate the correlation between the number of co-located VMs and the overhead. The experiments are conducted on the larger server. The number of co-located VMs increases one by one across the experiments, with a random VM existence status change event in each experiment. If the event is a VM removal or migration out, an additional VM is deployed afterwards to ensure that the number of co-located VMs increases from 1 to 20 over the set of experiments as expected.

The results are illustrated in Fig. 8. For the polling based method, the time from the detection of an event until the co-located VM lists of all domains are updated is almost linearly proportional to the number of co-located VMs. The main reason is that with the polling based method the guest VMs cannot obtain each other's status directly, and the messages generated upon each membership change must be sent sequentially. Thus, when the number of co-located VMs grows large, a considerable share of time is spent collecting co-located VM membership information. The overhead can be mitigated by setting the polling cycle to a larger value, but this brings other problems such as a longer response time. A rough cost model is given below.
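A simple per-event cost model consistent with this observation (our own simplification, not taken from the paper's measurements):

```latex
T_{\mathrm{poll}}(n) \;\approx\; n \,(t_{\mathrm{scan}} + t_{\mathrm{adv}}) \;=\; \Theta(n),
\qquad
T_{\mathrm{CoKeeper}}(n) \;\approx\; t_{\mathrm{watch}} + t_{\mathrm{bcast}} \;=\; \Theta(1)
```

Here n is the number of co-located VMs; the event driven side stays roughly constant because detection is watcher based and a change is announced by broadcast rather than by sequential per-VM messages.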

Fig. 8. The comparison of co-located VM scalability.

Compared with the polling based method, the time for event driven co-located VM membership maintenance is much lower and more stable as the number of VMs deployed on the same physical node increases. The reason is that the event driven method reacts only when the co-located VM membership changes.

6 Conclusion and Future Work

We have presented CoKeeper, a dynamic co-located VM detection and membership update approach for residency aware inter-VM communication in virtualized clouds. CoKeeper detects co-located VM membership changes synchronously and refreshes the membership immediately upon dynamic VM addition, removal or migration, with faster response than polling based dynamic methods. Experimental results show that CoKeeper achieves much lower CPU and network overhead. Encapsulated as an integral part of the XenVMC modules, CoKeeper can be seamlessly and transparently incorporated without modifying the Linux kernel, and no extension or modification of existing applications is needed, so legacy applications can benefit from CoKeeper without code changes or recompilation. CoKeeper is also more stable and introduces less overhead than the polling based method as the number of co-located VMs scales up. In future work, we plan to apply our approach to a wider range of real-life software, and we are working on extending the approach to real situations such as environments with network failures.