In the third contribution of the thesis, we discuss the reliability-aware scheduling of tasks with hard deadlines in the cloud environment. Notation: Cjvm denotes the maximum number of VMs that a PM can host, and fj denotes the probability of failure of machine Mj.
Overview of Cloud Computing
- Characteristics
- Service Models
- Cloud System Architecture
- Examples of Commercial Cloud Platforms
- Scheduling in Cloud Environment
SaaS: This is the highest layer in the cloud service stack and is responsible for providing software services to consumers. Cloud computing has several features that benefit both the user and the cloud service provider.
Deployment Models of Cloud
Virtualization Technology
Resource Allocation in Cloud Environment
Using VM migration in an efficient way maximizes resource utilization, improves performance, and reduces the power consumption of the cloud system [32]. The power consumption of the system can be reduced by reducing the number of idle systems and increasing resource utilization.
Brief Literature Review
- Related Research on Task Scheduling in Cloud Computing
- Related Research on Virtual Machine Allocation
- Related Research on Interference Aware Scheduling
- Related Research on Constraint Aware Scheduling
- Related Research on Profit Maximization Based Scheduling
- Related Research on Reliability Aware Scheduling in Cloud Systems
In the same way, resource utilization is considered as the total utilized capacity of cloud resources. Profit is one of the most important factors in the cloud economy and all cloud service providers want to maximize their profit.
Motivation
[93] proposed a resource allocation algorithm that introduces an analytical model to analyze system reliability. However, the replication-based approach is widely used to ensure system reliability.
Objectives
We want to incorporate a resource prediction model into the scheduler design to predict the number of PMs required for a job package based on the resource usage pattern of the previous job package. The goal of the scheduler is to optimize resource utilization by deploying fewer host servers.
Contributions
- Interference Aware Scheduling
- Constraint Aware Scheduling
- Reliability Aware Scheduling
- Reliability Ensured Scheduling
In this contribution, we consider a set of real-time tasks to be scheduled in a virtualized cloud environment where all the VMs of the cloud are homogeneous. Here we schedule the real-time tasks with priorities or weights (high-priority safety-critical and low-priority mission-critical tasks) while considering machine failures.
Thesis organization
Static Allocation
Static allocation strategies spend more time determining a mapping since it is performed in an offline mode, using estimated values of the application parameters. Many researchers worked on static allocation strategies in the cloud environment, and some of the works are discussed here.
Dynamic Allocation
Adjusting these constraints determines the effectiveness of their proposed scheduling algorithm in the cloud environment. Their approach takes the network traffic load over the network connections into account to minimize the problems associated with overprovisioning.
Optimization Objectives of Resource Allocation
Resource Consolidation
The problem was transformed into a stochastic bin packing (SBP) problem, and an online packing algorithm was proposed in which the number of servers required remains close to the optimum. Their model is used to improve the consolidation of the VMs so as to maximize resource utilization while meeting the application performance requirements.
Energy Consumption
[142] proposed makespan-conservative energy reduction along with simple energy-aware scheduling to find a trade-off between makespan and energy consumption; both the makespan and the energy consumption of a precedence-constrained task graph are reduced on heterogeneous multiprocessor systems supporting DVFS. [196] proposed an approach to optimize the lifetime reliability of applications and their energy consumption while guaranteeing QoS constraints.
User Cost
From the service provider's perspective, the goal is to minimize operating costs, which maximizes the overall profit of the system. From the user's perspective, minimizing execution costs while meeting budget constraints is required to increase service demand.
Evolutionary Approaches for Resource Allocation
On-line Resource Allocation
The resource allocation problem considered here can be formulated as a vector bin packing problem (VBPP), and some algorithms are available to solve it. The VBPP is a variant of the bin packing problem proposed by Garey [105].
Summary
The most promising solution to address many user requirements in a cloud system is virtualization technology [45]. To improve the utilization of cloud resources, the service provider allows multiple VMs to be hosted on a single physical machine, which can result in performance degradation due to interference between concurrently running applications.
Related Research Work
Problem Formulation
Task Environment
Machine Environment
For example, if λ = 0.9, then the total resource used by the VMs running on a physical machine should not exceed 90% of the total available resource of that physical machine. In this work we take the same value of λ for all resource types to simplify the model.
Optimization Goal
The resource request of the VMs deployed to run concurrently on a host machine must not exceed the upper threshold of the resources available on that host. Here, λ represents this upper threshold, meaning that the total resource consumption cannot exceed this limit, with 0 < λ ≤ 1.
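As a concrete illustration, the following minimal Python sketch (the function and variable names are ours, not from the thesis) checks whether adding a VM's resource request to a host keeps every resource dimension within the λ threshold.

```python
def fits_under_threshold(host_capacity, host_usage, vm_request, lam=0.9):
    """Return True if the VM can be placed without pushing any resource
    dimension (e.g. CPU, memory) of the host above the lambda threshold."""
    for resource, capacity in host_capacity.items():
        used = host_usage.get(resource, 0.0) + vm_request.get(resource, 0.0)
        if used > lam * capacity:
            return False
    return True

# Example: a host with 32 cores and 64 GB of memory, lambda = 0.9
capacity = {"cpu": 32, "mem": 64}
usage = {"cpu": 24, "mem": 40}
print(fits_under_threshold(capacity, usage, {"cpu": 4, "mem": 8}))  # True
print(fits_under_threshold(capacity, usage, {"cpu": 8, "mem": 8}))  # False: 32 > 0.9 * 32
```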
Methodology
System Architecture
Interference Prediction in Virtualized Cloud Environment
We choose the linear regression model to predict the application runtime because building a linear regression model at runtime takes less time than building a nonlinear model. The coefficients of the prediction model also change as more tasks are included in the model-building process at runtime.
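To make the idea concrete, the sketch below fits an ordinary least-squares model from already-completed tasks and uses it to predict the runtime of a new task; the two features shown (input size and number of co-located VMs) are illustrative assumptions, not necessarily the predictors used in the thesis.

```python
import numpy as np

def fit_runtime_model(features, runtimes):
    """Fit runtime ~ w.x + b by ordinary least squares at runtime.
    `features` is an (n_tasks, n_features) array of observed predictors,
    `runtimes` the corresponding observed runtimes."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    coeffs, *_ = np.linalg.lstsq(X, runtimes, rcond=None)
    return coeffs

def predict_runtime(coeffs, feature_vector):
    return float(np.dot(coeffs[:-1], feature_vector) + coeffs[-1])

# Illustrative use: four finished tasks, features = [input size (GB), co-located VMs]
X = np.array([[1.0, 1], [2.0, 2], [3.0, 1], [4.0, 3]])
y = np.array([10.0, 22.0, 28.0, 46.0])
w = fit_runtime_model(X, y)
print(predict_runtime(w, [2.5, 2]))
```

Refitting such a model as new tasks complete is cheap in the linear case, which is why the coefficients can be updated online as noted above.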
Prediction of Resource Usage of Online Tasks
In all cases, the predicted value of resource consumption is greater than the observed value, which helps our interference-aware scheduling approach deploy slightly more physical machines than strictly required and thereby reduce the number of deadline misses. Since a VM locks an entire core from the moment it is initialized until it is destroyed, the number of active machines required for deployment can be calculated using Eq.
Interference Aware Scheduling of Tasks with Admission Control
For the other cases, we take the overlapping time of the different tasks into account to capture the effect of interference. For the entire expected execution of a task (from its start time s to its finish time f), the expected utilization of each resource on the machine must remain below that resource's threshold utilization.
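A minimal sketch of this feasibility test follows (the helper names are ours, and resource demand is assumed constant over a task's execution interval): a candidate task with window [s, f) is accepted on a machine only if, at every instant where it overlaps already-placed tasks, the summed demand stays below the threshold.

```python
def can_place(candidate, placed, capacity, lam=0.9):
    """candidate and placed tasks are dicts: {"start": s, "finish": f, "cpu": u}.
    Accept the candidate only if the total CPU demand stays below lam * capacity
    at every instant of its execution window [start, finish)."""
    # Utilization only changes at task start/finish times, so it is enough to
    # check the demand at each such event inside the candidate's window.
    events = {candidate["start"]}
    for t in placed:
        for point in (t["start"], t["finish"]):
            if candidate["start"] <= point < candidate["finish"]:
                events.add(point)
    for point in sorted(events):
        demand = candidate["cpu"] + sum(
            t["cpu"] for t in placed if t["start"] <= point < t["finish"]
        )
        if demand > lam * capacity:
            return False
    return True
```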
Simulation Setup
Using DES and FUSD, we predict the number of physical machines that will be active in a given time window. Based on the forecast, we activate the same number of physical machines that will be deployed in that specific time window (5 min.).
Performance of the Scheduler for Google Cluster Data
The number of lost tasks for IARPS is lower than for the other approaches because it schedules tasks considering the effect of interference from co-allocated tasks, the deadline, and the admission control mechanism. Owing to the dynamic nature of the cloud environment, even unlimited resource availability would not allow us to reach 100% TGR and PGR.
Performance of the Scheduler with Different Co-allocated VMs
But the overhead associated with virtualization is about 15%, which reduces the profit for the cloud service provider. Missing the deadline for the task increases the penalty, thus reducing the profit for the service provider.
Related Research Work
These schedulers ignore hard and soft constraints for task placement in heterogeneous data centers. None of the well-known schedulers discussed above considers both the placement constraints and the task deadline.
Problem Formulation
- Machine Environment
- Task Environment
- Revenue Model
- Cost Model
- Penalty Model
- Problem Statement
The income (revenuei) collected for executing task Ti is defined by the corresponding revenue equation. The penalty cap is the maximum penalty that the service provider repays for violating the SLA.
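Since the exact revenue and penalty equations are given in the thesis and not reproduced here, the sketch below only illustrates the general shape of such a model: per-task profit as revenue minus execution cost minus an SLA penalty that grows with tardiness but is capped. All names and the linear penalty form are illustrative assumptions.

```python
def task_profit(revenue, exec_cost, tardiness, penalty_rate, penalty_cap):
    """Profit for one task: revenue minus execution cost minus the SLA penalty.
    The penalty is assumed to grow linearly with tardiness (time past the
    deadline) but is never repaid beyond the penalty cap."""
    penalty = min(penalty_rate * max(tardiness, 0.0), penalty_cap)
    return revenue - exec_cost - penalty

# A task finishing 3 time units late, with the penalty capped at 2.0
print(task_profit(revenue=10.0, exec_cost=4.0, tardiness=3.0,
                  penalty_rate=1.0, penalty_cap=2.0))  # 4.0
```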
System Architecture
Solution for CAPM
- Ordering of Tasks
- Task Mapping without Allowing Deadline Miss
- Task Mapping with Allowing Deadline Miss
- Simulated Annealing for CAPM
Our objective here is to map all the tasks of the task set (TG) to the machine set (MG) so that the total profit is maximized. We rank the tasks in descending order of their expected payoff (Eq. 4.7 without the penalty term), which is the difference between a task's revenue and its execution cost.
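A minimal sketch of this ranking step follows (the dictionary keys are our own illustrative names): tasks are sorted in non-increasing order of expected payoff, i.e. revenue minus execution cost, before the mapping phase.

```python
def order_by_expected_payoff(tasks):
    """Rank tasks by descending expected payoff = revenue - execution cost,
    ignoring the penalty term, before mapping them to machines."""
    return sorted(tasks, key=lambda t: t["revenue"] - t["exec_cost"], reverse=True)

tasks = [
    {"id": "T1", "revenue": 8.0, "exec_cost": 5.0},
    {"id": "T2", "revenue": 6.0, "exec_cost": 1.0},
    {"id": "T3", "revenue": 9.0, "exec_cost": 7.0},
]
print([t["id"] for t in order_by_expected_payoff(tasks)])  # ['T2', 'T1', 'T3']
```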
Experimental Setup and Results
- Simulation Setup
- Performance of Different Task Ordering
- Performance of HOM-CAPM with or without Allowing Tasks to be Missed
- Performance of Different Scheduling Approaches
- Impact of Threshold
- Performance of SA-CAPM and HOM-CAPM
Figure: profit versus number of tasks for HOM-CAPM with and without allowing tasks to be missed. The profit obtained with HOM-CAPM when tasks are allowed to be missed differs by about 3% from the case in which no task is allowed to be missed.
Summary
There are many causes of failure in a cloud environment, which affect the reliability of the cloud service [80]. Here we want to schedule real-time tasks with priorities or weights (high-priority safety-critical tasks and low-priority mission-critical tasks) considering machine failure.
Related Research Work
In general, the reliability of a parallel application is represented as the product of the reliability values of all tasks [262]. [35] developed an approach to reliable resource allocation that tries to increase reliability while minimizing costs.
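For reference, the product form cited above can be written as follows; this is a sketch using the exponential failure model assumed elsewhere in this thesis, where task Ti with execution time ei runs on a machine with failure rate f_{j(i)}.

```latex
R_{\mathrm{app}} \;=\; \prod_{i=1}^{n} R(T_i) \;=\; \prod_{i=1}^{n} e^{-f_{j(i)}\, e_i}
```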
Problem Statement
Task Environment
Machine Environment
Reliability Model
Optimization Goal
Study of Task Scheduling Approaches on Reliable Machines
Scheduling of Tasks with Common Deadline on Machines with Equal Failure Rate
Machines with the same failure rate can be represented as Mj(γj, fj = f), where the failure rate of all the machines is f. This case of the problem can be reduced to the 0-1 knapsack problem, which is pseudo-polynomially solvable [160].
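To illustrate the reduction, the sketch below solves the underlying 0-1 knapsack instance with the standard pseudo-polynomial dynamic program. The mapping used here (task weight as knapsack value, execution time as item size, the common deadline as capacity) is our simplified assumption and not necessarily the exact construction used in the thesis.

```python
def knapsack_max_weight(exec_times, weights, deadline):
    """0-1 knapsack DP: pick a subset of tasks whose total execution time fits
    within the common deadline and whose total weight is maximum.
    Runs in O(n * deadline), i.e. pseudo-polynomial time."""
    best = [0.0] * (deadline + 1)
    for e, w in zip(exec_times, weights):
        for cap in range(deadline, e - 1, -1):  # backwards: each task used at most once
            best[cap] = max(best[cap], best[cap - e] + w)
    return best[deadline]

# Three tasks with execution times 2, 3, 4 and weights 3, 4, 5; common deadline 6
print(knapsack_max_weight([2, 3, 4], [3.0, 4.0, 5.0], 6))  # 8.0 (tasks with times 2 and 4)
```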
Scheduling of Equal Execution Time Tasks on Unreliable Machines
Either a task is assigned to the most reliable machine or the least reliable machine, depending on the weight of the task. This approach assigns the current task to the most reliable machine (the machine with the lowest failure rate) that is available at the time the scheduling decision for the task is made.
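A minimal sketch of this rule follows (the helper and threshold names are ours): tasks whose weight is at or above a threshold go to the most reliable available machine, and the remaining tasks go to the least reliable one, with reliability ranked by failure rate.

```python
def pick_machine(task_weight, available_machines, weight_threshold):
    """available_machines: list of (machine_id, failure_rate) pairs.
    High-weight (safety-critical) tasks get the machine with the lowest
    failure rate; low-weight (mission-critical) tasks get the highest."""
    if task_weight >= weight_threshold:
        return min(available_machines, key=lambda m: m[1])  # most reliable
    return max(available_machines, key=lambda m: m[1])      # least reliable

machines = [("M1", 0.001), ("M2", 0.01), ("M3", 0.05)]
print(pick_machine(0.9, machines, weight_threshold=0.5))  # ('M1', 0.001)
print(pick_machine(0.2, machines, weight_threshold=0.5))  # ('M3', 0.05)
```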
Task Scheduling on Unreliable Machines with Repetition and Replication
- Parameter Setup
- Result of Different Task Ordering and Task Mapping
- Result for Different Task Mapping for Different Execution Time
- Result for Different Weight Distributions
- Result for Real World Traces
- Result for Task Repetition and Replication
Based on the values of α0 and β0, this approach allows up to one repetition or one replication of a task to improve performance. Here we report the performance of the different heuristics based on the weights assigned to the tasks.
Summary
VM replication [242] is used to deploy redundant copies of VMs to meet application reliability requirements. We propose a heuristic for efficient replication on machines with the same failure rate (called REFR), which uses the derived minimum number of replicas to ensure the reliability requirement of delay-sensitive independent jobs in the target environment.
Related Research Work
We also calculate and use the subreliability requirement of the jobs, together with other parameters, to deploy the minimum number of machines while efficiently scheduling jobs to meet their deadline and reliability requirements. We define and use two performance comparison metrics, namely average VMs per task (AVT) and reliability guarantee ratio (RGR), to compare the proposed approaches with other state-of-the-art approaches.
Problem Formulation
- Application Environment
- Machine Environment
- Reliability Model
- Optimization Goal
All tasks of a job can be executed in parallel on independent VMs (i.e. job Ji consists of a number of tasks) without any communication between them. Suppose a task Til with execution time ei is scheduled on a machine Mj with failure probability fj; then the reliability of task Til can be defined as R(Til) = e^{−fj·ei}.
State-of-the-art Approaches
K-redundancy Method (KR)
The term ki represents the number of additional VMs deployed for job Ji to meet its reliability requirement. By combining the two constraint equations (Eq. 6.7 and Eq. 6.9), the minimum number of hosting servers mKR required by the k-redundancy method can be derived.
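Because Eq. 6.7 and Eq. 6.9 are not reproduced here, the sketch below only shows the general shape such a bound can take, under our own simplifying assumption that every job Ji deploys vi + ki VMs and that each PM hosts at most Cvm VMs, so the server count is the total VM count divided by the per-host capacity, rounded up.

```python
import math

def min_hosts_k_redundancy(jobs, c_vm):
    """jobs: list of (v_i, k_i) pairs, where v_i is the number of tasks of job J_i
    and k_i the extra redundant VMs deployed for it (assumed model, see text).
    Assumed bound: total VMs divided by the per-PM VM capacity, rounded up."""
    total_vms = sum(v + k for v, k in jobs)
    return math.ceil(total_vms / c_vm)

print(min_hosts_k_redundancy([(3, 1), (2, 2), (4, 1)], c_vm=4))  # 13 VMs -> 4 PMs
```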
First-Fit Decreasing (FFD)
As the value of k increases, the reliability requirement is met, but the number of PMs required to host the VMs grows. For example, mFFD = 4 (machines M1, M2, M3, M4) is the minimum number of PMs required for a feasible job assignment.
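For clarity, here is a minimal first-fit decreasing sketch for a single resource dimension (the names are ours): VMs are sorted by decreasing demand, each VM is placed on the first already-open PM where it fits, and a new PM is opened otherwise.

```python
def first_fit_decreasing(vm_demands, pm_capacity):
    """Place all VMs with FFD and return the resulting per-PM loads."""
    pms = []                                    # current load of each open PM
    for demand in sorted(vm_demands, reverse=True):
        for i, load in enumerate(pms):
            if load + demand <= pm_capacity:    # first PM with enough room
                pms[i] = load + demand
                break
        else:
            pms.append(demand)                  # no open PM fits: open a new one
    return pms

loads = first_fit_decreasing([4, 3, 3, 2, 2, 2], pm_capacity=6)
print(len(loads), loads)  # 3 PMs with loads [6, 6, 4]
```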
Scheduling on Machines with Equal Failure Rate
Since the number of PMs (m) must be an integer, the lower bound on m is rounded up to the nearest integer.
R(Til) = 1 − (1 − e^{−f·ei})^k   (6.15)
At least vi copies must be scheduled for each task, and more copies of the tasks may be needed to meet the reliability requirement.
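Inverting Eq. 6.15 gives the smallest number of copies needed for one task; the sketch below does this computation (the helper name is ours), never returning fewer than the copies that must be scheduled anyway.

```python
import math

def min_replicas(f, exec_time, r_req, v_min=1):
    """Smallest k with 1 - (1 - exp(-f * exec_time))**k >= r_req (Eq. 6.15),
    but never fewer than the v_min copies that must be scheduled in any case."""
    p_fail = 1.0 - math.exp(-f * exec_time)   # failure probability of a single copy
    k = math.ceil(math.log(1.0 - r_req) / math.log(p_fail))
    return max(k, v_min)

# One copy succeeds with probability exp(-0.1 * 5) ~ 0.61; requiring 0.99 needs 5 copies
print(min_replicas(f=0.1, exec_time=5.0, r_req=0.99))  # 5
```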
Scheduling on Machines with Different Failure Rates
Task Prioritization
Longer-runtime (HDJ) jobs need to be scheduled on more reliable machines, regardless of their reliability requirement and the number of tasks per job. Medium-runtime (MDJ) jobs with high reliability requirements and a larger number of tasks per job also need to be scheduled on more reliable machines.
Host Machine Selection
Any job with a larger number of tasks should therefore be given priority for scheduling on highly reliable machines, as the reliability requirement of each of its tasks is high. For task Tivi, the subreliability requirement is calculated based on the actual allocation of its preceding tasks [261].
Overall Approach
Ensuring that all tasks satisfy their reliability requirement and finish their execution before the deadline is challenging. From the set of active machines on which the task can meet its deadline, the allocation policy selects the machine with the highest failure rate that still satisfies the task's reliability requirement.
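A minimal sketch of this allocation rule follows (the field names and the exponential reliability check are our own assumptions): among the active machines on which the task can still meet its deadline, pick the one with the highest failure rate that satisfies the task's subreliability requirement, keeping the most reliable machines free for tasks that need them more.

```python
import math

def select_host(task, active_machines):
    """task: {"exec_time": e, "deadline": d, "sub_rel": r}.
    active_machines: list of {"id": ..., "failure_rate": f, "free_at": t}.
    Return the feasible machine with the highest failure rate, or None."""
    feasible = []
    for m in active_machines:
        meets_deadline = m["free_at"] + task["exec_time"] <= task["deadline"]
        reliable_enough = math.exp(-m["failure_rate"] * task["exec_time"]) >= task["sub_rel"]
        if meets_deadline and reliable_enough:
            feasible.append(m)
    return max(feasible, key=lambda m: m["failure_rate"]) if feasible else None
```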
Experimental Setup and Results
Simulation Environment, Parameter Setup and Evaluation Criteria
As capacity increases, the number of PMs required to allocate the jobs decreases. As the value of k increases for KR and FFD, the number of jobs missing their reliability requirement decreases.
System architecture
Machine and task grouping
Example of problem formulation for simulated annealing
Arrival of different types of tasks in Google traces
Effect of task ordering on the profit
Profit considering HOM-CAPM approaches with or without allowing tasks to be missed
Performance of different scheduling approaches
Impact of ρ on system performance with different loads
Performance of HOM-CAPM against SA-CAPM
Performance of HOM-CAPM against SA-CAPM in terms of running time
Performance of various approaches on reliable machines
Scheduling methodology
Reliability distribution of tasks
Failure detection example
Reliability distribution of tasks
Reliability model
Results for the machines with equal failure rate
Results for the machines with different failure rate
Average VMs per task (AVT)
Reliability guarantee ratio (RGR)
Results by varying the reliability requirement level
Results by varying the number of VMs per host (v)
Google trace results