3.4 Methodology
3.4.4 Interference Aware Scheduling of Tasks with Admission Control 62
3.4. METHODOLOGY
interval. si and fi are the start time and finish time of the task Ti respectively. The final prediction of required number of active machines is calculated using this M Cw instead of xt (the observed value of Eq. 3.7 , Eq. 3.8, and Eq. 3.9).
3.4.4 Interference Aware Scheduling of Tasks with Admis-
Algorithm 1 Interference Aware Resource Provisioning and Scheduling
1: Qb ← batch of tasks (T)
2: for All the tasks Ti inQb do
3: Ti.schedule←T rue
4: Sort the tasks in Qb based on priority/deadline
5: for All the tasks Ti inQb do
6: for All the machinesMj in active machine setMa do
7: sij = IA EST(Ti, Mj)
8: fij = IA EF T(Ti, Mj)
9: for All the machinesMj in the increasing order offij do
10: if fij 6di && RU C(Ti, Mj, sij, fij) then
11: Allocate the task Ti to machine Mj
12: Increment resource utilization of the machine Mj by adding RU of Ti for time [sij :fij]
13: if suitable machine not found for the task Ti then
14: Reject task Ti
15: Ti.schedule←F alse
Algorithm 2 Check for Resource Utilization Constraint
1: ProcedureRUC(Ti, Mj, s, f)
2: if (Mj.mem[s :f] +umbwi )> λ·Cjmbw then
3: return FALSE
4: if (Mj.disk[s:f] +udbwi )> λ·Cjdbw then
5: return FALSE
6: if (Mj.cpu[s:f] +ucpui )> λ·Cjcpu then
7: return FALSE
8: return TRUE
allocated tasks and their expected finish time. As the resource usage pattern affects the estimated execution time of the task, the expected finish time calculation need to consider the usage pattern of already allocated or running tasks on the same machine.
Line 8 of the algorithm calculates the expected finish time of the task considering the interference prediction model which was already discussed in subsection 3.4.2.
At the time of calculation of estimated finish time IA EF T(Ti, Mj) of a task, Ti on machine Mj, the procedure IA EF T use the estimated start time of a task if it gets mapped to VM of the machine Mj based on current queued up workloads in the machineMj. The Eq. 3.5is used to compute the interference of BG tasks on FG task, when all the tasks are scheduled simultaneously on the machine Mj. So to compute the expected execution time of FG task, we need to consider all the BG tasks running concurrently on the same machine. We had slightly modified our model to minimize
3.4. METHODOLOGY
Figure 3.6: The different cases of background (B) or ith task and foreground (F) or jth task process execution
the impact of error as follows. The prediction error can be negative or positive, in case of negative prediction error, the predicted execution time will be less than actual execution time. So the task scheduling based on predicted execution time with negative error causes deadline miss for a task. As most of the cases, the error pattern follows Gaussian distribution and if we distribute our 20% error into four quartiles, then handling 15% of negative error will cater more than 99% of cases in our model.
So we had modified the predicted execution time (PET) as P ET = 1.15×P ET to nullify the 15% of negative error.
There are four possible cases arise when a new task arrives at the system for scheduling which is shown in Figure3.6. Out of four cases, the tasks which fall under the category of case-II will not interfere with each other and kept out for consideration to compute the interference factor. For other cases, we had taken the overlapping time of different tasks for the consideration of the effect of interference. The overlapping time period OTi for the foreground task Ti with respect to the background task Tj is computed as:
OTi =
fj −sj if si 6sj and fj < fi, (Case I) 0, if fi 6sj,(Case II)
fi−sj if si 6sj and fi 6fj,(Case III & IV).
(3.12)
The start time of background application cannot be higher than the start time of foreground application.
After computing the expected finish time of task Ti for all the active machines where the task Ti can be allocated, it greedily selects the machine where estimated finish
time of the task is minimal. It further check for the resource utilization constraint and deadline constraint on the selected machine. If all the constraints are satisfied then the scheduler allocate the task to that particular machine otherwise check for other active machines for allocation in the order of increasing estimated finish time.
If no suitable machines are available then the task get rejected by the scheduler.
Pseudocode for the procedure for checking of resource utilization constraint (RUC) for a task on a machine is shown in Algorithm 2. For the entire duration of the expected execution time of the task (from start time s to finish time f of task), the expected resource utilization should be below the threshold utilization for each resource of the machine. For example in line 2 of Algorithm2check the expected memory bandwidth utilization of task should be below λ·Cjmbw. Similarly the procedure check for other resources of the machine. The procedure returns true if it satisfies resource constraint for all types of resources of a machine, i.e., memory utilization, CPU utilization, and disk I/O bandwidth. We had taken the capacity constraint as defined in [95] to mitigate the interference. In our experimentation, we had made the 90% of the total capacity of the resources as the threshold so that the running task would not violate the SLA. Here we are not considering the migration of VMs because all the VMs are of similar type and running all the time to provide the services to the tasks.
Time Complexity: TheInterferenceAwareResourceProvisioning andScheduling (IARPS) algorithm as shown in Algorithm 1, selects a batch of tasks and order them based on their deadlines. From the Algorithm 1, we can see the complexity of the algorithm can largely be formulated as O(nlogn+nm), where n represents number of tasks, and m represents number of currently active machines. The computational complexity of the procedures used in this algorithm, likeIA EST,IA EF T andRU C are independent of number of tasks, i.e., n and takes constant time. As the number of tasks that arrives at the system is large,nlogndominates the termnm+nlogn. So the time complexity of IARPS approach is O(nlogn). Moreover, linear function used to predict the interference will not affect much to the complexity of the algorithm.
3.5 Experimental Setup and Results
We evaluated the effectiveness of the proposed approach (IARPS) using Google cluster data [5]. The simulation environment accepts a stream of online tasks with config- urable virtual machines and physical machines as input. The task, physical machine and virtual machine specifications accepted by the simulation environment are same
3.5. EXPERIMENTAL SETUP AND RESULTS
as defined in Section 3.3. The scheduling of tasks is done batch-wise manner, even though tasks arrived at the system dynamically. The adopted simulation environment is similar to Eagle [83] and included with other states of art scheduling modules along with two well known shortest job first (SJF) and earliest deadline first (EDF) schedul- ing approaches. We had implemented three scheduling modules, one of our proposed IARPS (described in the Algorithm 1) and other two approaches IA-LongShort, IA- Short and MIMP [257] to compare the performance.
In IA-LongShort approach, we segregate all the tasks to be either long tasks or short tasks. We allocate a proportional number of machines (which was predicted by our resource prediction model), depending on the number of long tasks and short tasks.
Long tasks get allocated to the set of machines which was reserved for long tasks and the same for short tasks. In IA-Short approach, we assume that short tasks do not create interference to any type (either long or short) of tasks, and hence only interference by long tasks are considered for scheduling. MIMP [257] minimizes CPU interference by allowing background batch processing jobs to execute only when other interactive jobs are not actively using CPU at the same time. The other state-of-art approach TRACON [73], which do not consider the online stream of real-time tasks, and in their approach they map the tasks to VMs where they get minimum inter- ference, and they schedule tasks batch-wise manner. So we compared our approach with SJF, IA-Short, IA-LongShort, MIMP, and EDF.