Current cloud computing systems bill users based on processor utilization or on the number of virtual machines each user has allocated. However, a billing system based on processor utilization can be unfair when multiple virtual machines running on a single physical machine have different workloads. The Xen virtual machine monitor follows a shared device driver model to handle I/O requests from guest domains.
A virtualized system has a driver domain that performs I/O operations on behalf of the guest domains and uses its native device drivers to access the I/O devices directly. For this reason, the driver domain executes delegated instructions to process the I/O activities of the guest domains.
Introduction
Virtualization technology improves server stability, resource allocation flexibility, and service delivery scalability (Armbrust et al., 2009). Amazon Elastic Compute Cloud (EC2) (Ostermann et al.) and Rackspace Hosting are popular examples of IaaS, and these services provide VMs to customers. The cost of providing a VM naturally increases with the amount of resources the VM uses, and it even depends on which resources are used (Kim et al., 2011).
Xen (Barham et al., 2003), one of the hypervisors most widely used by cloud vendors, adopts a safe hardware interface that allows unmodified device drivers to be shared between isolated operating system instances. The Credit Scheduler, Xen's default scheduler, aims at a fair share of CPU time among domains on SMP hosts (Cherkasova et al., 2007).
Background and Motivation
Background
Xen device drivers typically consist of four parts: the real driver, the back-end driver, the front-end driver, and the I/O ring. The I/O ring is a kind of ring buffer that contains two types of data, requests and responses, updated by the back-end and front-end drivers. An outgoing packet normally travels through the TCP/IP stack, and at the bottom of the stack the front-end driver places it in shared memory.
The back-end driver, running in domain 0, reads the packet from the buffer and injects it into the network stack of the operating system, which routes it like a packet arriving from a real interface. Finally, the packet reaches the real device driver, and the physical network device sends it.
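As a rough illustration of this shared ring (a self-contained sketch of the producer/consumer idea only; the actual Xen ring types are generated by the DEFINE_RING_TYPES macros in the Xen interface headers), the ring can be pictured as a fixed-size array of slots with separate producer indices for requests and responses:

    #include <stdint.h>

    #define RING_SIZE 256                 /* slots available in the shared page */

    struct io_request  { uint16_t id; uint32_t gref; };   /* gref: grant reference of the data page */
    struct io_response { uint16_t id; int16_t status; };

    /* One page shared between the front-end (domain U) and the back-end
     * (domain 0).  The front end advances req_prod as it posts requests;
     * the back end advances rsp_prod as it posts responses.  Each side
     * keeps a private consumer index for the other side's entries. */
    struct io_ring {
        uint32_t req_prod, rsp_prod;
        union {
            struct io_request  req;
            struct io_response rsp;
        } slot[RING_SIZE];
    };

Because each side only appends at its own producer index and reads up to the other side's, a single-producer/single-consumer ring of this kind can be updated without a shared lock (memory barriers aside).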
Motivation
The front-end driver, running in domain U, initializes a memory page with the ring data structure and exports it to domain 0. The back-end driver takes care of multiplexing, so that more than one VM can use the device, and provides a generic interface to different operating systems. In fact, the Linux kernel functions normally available in domain 0 already provide such a device multiplexing service.
For example, hard disk access is multiplexed through the file system abstraction, network devices through the socket abstraction, and so on. For this reason, Xen drivers use the real drivers via these abstractions of the existing operating system (Chisnall, 2008). Following a packet sent from an application running in domain U to a remote node is a good example of how a Xen driver works.
We performed a group of experiments in a Xen virtualized environment running two domain U guests with exactly the same configuration and using domain 0 to provide the I/O functions. However, the CPU usage of domain 0 increased depending on the number of VMs running a network-intensive process. In short, VMs that used the network caused more CPU usage in domain 0 than those that did not.
Existing hypervisors, however, do not take this delegated CPU usage for domain U I/O into account. A facility that measures the delegated CPU usage per VM for I/O activity is necessary for accurate resource accounting.
Accounting per-VM processor usage
Accounting framework
The tasklet mechanism in the Linux kernel provides several useful properties: a tasklet can be registered much like a timer, can be scheduled to run with normal or high priority, may run immediately but never later than the next timer tick, and is strictly serialized with respect to itself. So we simply inserted a timestamp at each step of handling network requests and took the difference between the two timestamps. In addition, the measured CPU time must include time when the CPU was idle but occupied by a guest.
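A minimal sketch of this timestamp-difference idea, assuming an in-kernel monotonic clock such as ktime_get() (the exact clock source and the helper name timed_step are illustrative, not the code used in the prototype):

    #include <linux/ktime.h>
    #include <linux/types.h>

    /* Return the time, in nanoseconds, spent in one handling step.
     * handle_step() is a placeholder for whatever piece of back-end
     * work is being measured. */
    static u64 timed_step(void (*handle_step)(void))
    {
        ktime_t start = ktime_get();

        handle_step();
        return ktime_to_ns(ktime_sub(ktime_get(), start));
    }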
The back-end driver wakes up on a timer and at that point processes each request in the I/O ring. Basically, the back-end driver consists of a number of loops that remove an item from a queue and operate on it, repeatedly, until the queue is empty. In addition, each domain's virtual network device has a local device number, and the device has a name composed of the domain ID and the device number.
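In Xen, such back-end interfaces are conventionally named vif<domid>.<devid> (for example, vif3.0 for device 0 of domain 3), so the owning domain can be recovered from the name alone. A small sketch of such a lookup (the function name is ours, for illustration):

    #include <stdio.h>

    /* Extract the domain ID from a back-end interface name of the
     * form "vif<domid>.<devid>".  Returns -1 if the name does not match. */
    static int domid_from_vif_name(const char *name)
    {
        int domid, devid;

        if (sscanf(name, "vif%d.%d", &domid, &devid) == 2)
            return domid;
        return -1;
    }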
So our implementation measures the CPU time for each request and reads the domain ID from the processed data to categorize it. The source code listed in Figure 3.2 is part of the net_rx_action function. The structure has a domain ID field, so the domain that sent the request can be identified simply by reading it.
When a packet is converted into I/O ring data, the back-end driver also uses the device name to obtain the destination domain ID. In addition, the back-end driver releases the shared pages it allocated once they are no longer needed. Besides this deallocation, there is CPU usage for network activity that is common to all domain U guests, and we measured that separately.
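Putting these pieces together, the accounting amounts to adding each measured time delta either to the counters of the requesting domain or to a separate counter for the common work. The sketch below is a hypothetical reconstruction of that bookkeeping (MAX_DOMS and the counter names are chosen here for illustration and do not appear in the original code):

    #define MAX_DOMS 32                 /* illustrative upper bound on guest domains */

    static unsigned long long rx_stat[MAX_DOMS];  /* delegated time, receive path  */
    static unsigned long long tx_stat[MAX_DOMS];  /* delegated time, transmit path */
    static unsigned long long common_stat;        /* work not owned by one guest   */

    /* Charge delta (e.g. nanoseconds of domain 0 CPU time) to the guest
     * that caused it, or to the shared bucket when no single guest owns it. */
    static void account_delegated(int domid, unsigned long long delta, int is_tx)
    {
        if (domid < 0 || domid >= MAX_DOMS) {
            common_stat += delta;
            return;
        }
        if (is_tx)
            tx_stat[domid] += delta;
        else
            rx_stat[domid] += delta;
    }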
Interfaces and Monitor Application
The class has rx_stat and tx_stat attributes that represent the delegated CPU usage accumulated since a VM was launched. The show functions for rx_stat and tx_stat return a string built from the total delegated CPU usage of all drivers, with spaces as separators between the values of the individual domains. When the back-end driver is initialized, it obtains a page for each array to share using the __get_free_page function.
Then, using the virt_to_mfn function, the returned address is converted to a machine frame number, which is mapped to a pointer in the hypervisor with the map_domain_page function. Through the interface between user space and kernel memory, a user application obtains the profiled data with a simple file read; through the interface between the hypervisor and the domain 0 guest, the hypervisor can read the data from the shared memory directly.
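A sketch of what such a show function might look like is given below; the class-attribute show signature varies across kernel versions, and rx_stat_show, MAX_DOMS and the array itself are illustrative names rather than the authors' code:

    #include <linux/device.h>
    #include <linux/kernel.h>

    #define MAX_DOMS 32
    static unsigned long long rx_stat[MAX_DOMS];   /* filled in by the back end */

    /* Emit the per-domain delegated CPU usage as space-separated values. */
    static ssize_t rx_stat_show(struct class *class,
                                struct class_attribute *attr, char *buf)
    {
        ssize_t len = 0;
        int i;

        for (i = 0; i < MAX_DOMS; i++)
            len += scnprintf(buf + len, PAGE_SIZE - len, "%llu ", rx_stat[i]);
        len += scnprintf(buf + len, PAGE_SIZE - len, "\n");
        return len;
    }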
The monitor application can collect not only the delegated CPU usage for network activities, but also the network and CPU usage of the unprivileged domains. It reads the file under the path /sys/class/netback_stat to collect the delegated CPU usage. The returned string consists of profiled values separated by blanks, so the monitor must parse it.
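On the monitor side, collecting these numbers is a file read followed by splitting on blanks. A minimal user-space sketch, assuming the profiled data is exposed as a readable file at the path given above (the exact attribute file inside that class directory is not spelled out here):

    #include <stdio.h>
    #include <string.h>

    /* Read one profiling attribute and print the value delegated by each domain. */
    int main(void)
    {
        char line[4096];
        FILE *f = fopen("/sys/class/netback_stat", "r");

        if (!f || !fgets(line, sizeof(line), f)) {
            perror("netback_stat");
            return 1;
        }
        fclose(f);

        int dom = 0;
        for (char *tok = strtok(line, " \n"); tok; tok = strtok(NULL, " \n"))
            printf("dom %d: %s\n", dom++, tok);
        return 0;
    }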
The VM scheduler profiles the CPU usage of the VMs, so those data can be accessed with hypercalls. The network utilization of the VMs, such as the number of packets and bytes transferred, is taken from files in procfs. Using the monitor, users can compare and analyze the scheduled and delegated CPU usage for networking.
Evaluation
Experiments while VMs run various workload sets
In the first experiment, with the CPU-CPU workload pair, the total delegated CPU time is close to zero. The reason the value is not exactly zero is that the operating system still sends and receives a little traffic to keep its network state connected. The results of the experiments on workload pairs that include the network were analyzed from two aspects.
The first aspect is the relationship between network usage, measured as the number of transmitted packets, and the delegated CPU usage classified by the domain that generated the network requests. In the experiment with the CPU-Network workload pair, one domain, DomU(2), used the network alone, so it obtained much higher throughput. Even so, because of interference from the CPU-intensive workload, the network process could not generate enough packets to fully use the network bandwidth.
When two domains use the network simultaneously, both guests produce requests that domain 0 can consume at once, which may be more efficient. DomU(2) delegated most of the CPU time in the second experiment, while DomU(1) and DomU(2) in the third experiment each delegated slightly more than half of that. Compared with the domains of the third experiment running alone, the domain that did more networking delegated less CPU usage, but the values are fairly similar.
Considering the measurement error, we can say that a domain delegates more CPU usage when it does more networking. From this set of experiments, it is difficult to ascertain the reason for the increase in the shared usage. To find that reason, we conducted another experiment with different numbers of networking domains.
Experiments on various numbers of networking domains
The sum of the CPU usage delegated by each domain, excluding the general usage, stays at a similar value regardless of how many VMs use the network. When VMs share the network hardware, the networking domains delegate portions of this total in proportion to the network bandwidth they occupy. The overall CPU usage for the network operations of domain U accounts for a large portion of the entire CPU usage of domain 0.
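Put differently, if D denotes the (roughly constant) total delegated CPU usage and b_i the network bandwidth occupied by domain i, the usage delegated by domain i is approximately D * b_i / (b_1 + ... + b_n). This is only an approximate relation suggested by the measurements, not an exact model.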
The total including the general usage, however, did differ when different numbers of VMs were using the network. The graph shows that there is no relationship between the number of packets and the total CPU time. However, it is clear that an increase in the number of networking domains leads to an increase in the total CPU usage for network activities.
To show the relationship between delegation and network utilization, Figure 4.4 plots the categorized CPU utilization delegated from domain U against the number of packets sent or received.
Related work
It is not a profiler embedded in the system, and users must specify the start and end points themselves. Joulemeter (Kansal et al.) is a software approach to measuring the energy consumption of virtual machines in a consolidated server environment. Joulemeter estimates the amount of energy each virtual machine uses by dynamically monitoring its resource usage.
Conclusion
On a cloud service, it is therefore possible to build a fair billing system based on the real resource consumption of the VMs.