Parallel Machine Models (Deterministic)
5.6 Online Scheduling
In all previous sections the underlying assumption was that all the problem data (e.g., number of jobs, processing times, release dates, due dates, weights, and so on) are known in advance. The decision-maker can determine the entire schedule at time zero with all the information at his disposal. This most common paradigm is usually referred to as offline scheduling.
One category of parallel machine scheduling problems that has not yet been addressed in this chapter is the class of so-called online scheduling problems. In an online scheduling problem the decision-maker does not know in advance how many jobs have to be processed or what their processing times are. The decision-maker becomes aware of the existence of a job only when the job is released and presented to him. Jobs that are released at the same point in time are presented to the decision-maker one after another. The decision-maker only knows the number of jobs released at that point in time after the last one has been presented to him. The processing time of a job becomes known only when the job has been completed. If jobs are released at different points in time, then the decision-maker does not know at any given point in time how many jobs are still going to be released or what their release dates are going to be. (In an offline scheduling problem all information regarding all $n$ jobs is known a priori.)
An online counterpart of $Pm \mid\mid \gamma$ can be described as follows. The jobs are presented to the decision-maker one after another, going down a list. The decision-maker only knows how long the list is when the end of the list has been reached. When a job has been presented to the decision-maker (or, equivalently, when the decision-maker has taken a job from the list), he may have to wait until one (or more) machines have become idle before he assigns the job to a machine. After he has assigned the job to a machine, the decision-maker can consider the next job on the list. After a job has been put on a machine starting at a certain point in time, the decision-maker is not allowed to preempt and has to wait until the job is completed. If the objective function is a regular performance measure, then it may not make sense for the decision-maker to leave a machine idle when there are still one or more jobs on the list.
The objective functions in online scheduling are similar to those in offline scheduling. The effectiveness of an online scheduling algorithm is measured by its competitive ratio with respect to the objective function. An online algorithm is $\rho$-competitive if, for any problem instance, the objective value of the schedule generated by the algorithm is at most $\rho$ times the optimal objective value that would be obtained if the schedule had been created offline with all data known beforehand. The competitive ratio is basically equivalent to a worst case bound.
Consider the following online counterpart of $Pm \mid\mid C_{\max}$. There is a fixed number of machines ($m$) in parallel; this number is known to the decision-maker. The processing time of a job is not known to the decision-maker at time zero; it only becomes known upon the completion of the job. When a machine is freed the decision-maker has to decide whether to assign a job to that machine or keep it idle. He has to decide without knowing the remaining processing times of the jobs that are not yet completed and without knowing how many jobs are still waiting for processing. One well-known algorithm for this problem is usually referred to as the List Scheduling (LIST) algorithm. According to LIST, the jobs are presented to the decision-maker according to a list, and every time the decision-maker considers the assignment of a job to a machine, he takes the next job from the list. So, every time a machine completes a job, the decision-maker takes the next job from the list and assigns it to that machine (the decision-maker does not allow for any idle time on the machine).
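The LIST rule described above can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions: the function name `list_schedule` and its return format are hypothetical, and the simulator is handed the processing times solely so that it can advance the clock. The dispatching decision itself never inspects them, which is what makes the rule usable online.

```python
import heapq


def list_schedule(processing_times, m):
    """Simulate LIST on m identical machines (illustrative sketch).

    processing_times holds the jobs in the order they appear on the list.
    Returns (makespan, assignment) with assignment[j] = (machine, start).
    """
    # Heap of (time at which the machine becomes free, machine index).
    free_at = [(0, i) for i in range(m)]
    heapq.heapify(free_at)
    assignment = []
    makespan = 0
    for p in processing_times:
        # The next job on the list goes to the machine that is freed first.
        start, machine = heapq.heappop(free_at)
        finish = start + p
        assignment.append((machine, start))
        makespan = max(makespan, finish)
        heapq.heappush(free_at, (finish, machine))
    return makespan, assignment


makespan, assignment = list_schedule([3, 1, 4, 1, 5], m=2)
print(makespan)     # 9
print(assignment)   # [(0, 0), (1, 0), (1, 1), (0, 3), (0, 4)]
```

A heap keyed on the time each machine becomes free keeps the choice of "next freed machine" at logarithmic cost per job; ties are broken by machine index, which is an arbitrary choice of this sketch.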
Theorem 5.6.1. The competitive ratio of the LIST algorithm is $2 - 1/m$.
Proof. First, it has to be shown that the competitive ratio of LIST cannot be better (less) than $2 - 1/m$. Consider a sequence of $m(m-1)$ jobs with running time 1 followed by one job with running time $m$. Under LIST the unit jobs keep all $m$ machines busy until time $m-1$, so the long job runs from $m-1$ to $2m-1$ and the LIST schedule finishes at time $2m-1$. The optimal schedule has a makespan of $m$, which gives a ratio of $(2m-1)/m = 2 - 1/m$.
In order to show that the competitive ratio cannot be larger than $2 - 1/m$, consider the job that finishes last. Suppose it starts at time $t$ and its processing time is $p$. At all times before $t$ all machines must have been busy, otherwise the last job could have started earlier. The total amount of processing is therefore at least $mt + p$, and the optimal makespan $C_{\max}(OPT)$ must satisfy

$$C_{\max}(OPT) \ge t + \frac{p}{m}.$$

In addition, $C_{\max}(OPT) \ge p$, as the optimal schedule must process the last job.
From these two inequalities, it follows that the makespan of the online solution, $t + p$, is bounded from above by

$$t + p = t + \frac{p}{m} + \Bigl(1 - \frac{1}{m}\Bigr) p \le \Bigl(2 - \frac{1}{m}\Bigr) C_{\max}(OPT).$$
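Both halves of the proof can be checked numerically. The sketch below is an illustration, not part of the original argument; all function names are hypothetical. It verifies that the instance with $m(m-1)$ unit jobs followed by one job of length $m$ yields a LIST makespan of $2m-1$. Because computing the optimal makespan is hard in general, the upper bound is tested against the quantity $\max(\sum_j p_j / m, \max_j p_j)$, which never exceeds $C_{\max}(OPT)$ and is exactly what the two inequalities in the proof use.

```python
import heapq
import random
from fractions import Fraction


def list_makespan(processing_times, m):
    """Makespan of LIST: each job goes to the machine that frees up first."""
    free_at = [0] * m
    heapq.heapify(free_at)
    for p in processing_times:
        heapq.heappush(free_at, heapq.heappop(free_at) + p)
    return max(free_at)


def tight_instance(m):
    """m(m-1) unit jobs followed by one job of length m."""
    return [1] * (m * (m - 1)) + [m]


def offline_lower_bound(processing_times, m):
    """max(total work / m, longest job); this never exceeds the optimum."""
    return max(Fraction(sum(processing_times), m), max(processing_times))


# Lower-bound instance: LIST needs 2m - 1, while a makespan of m is attainable.
for m in range(1, 8):
    jobs = tight_instance(m)
    assert offline_lower_bound(jobs, m) == m
    assert list_makespan(jobs, m) == 2 * m - 1

# Upper bound on random integer instances (seed fixed for reproducibility).
rng = random.Random(0)
for _ in range(1000):
    m = rng.randint(1, 6)
    jobs = [rng.randint(1, 20) for _ in range(rng.randint(1, 30))]
    bound = (2 - Fraction(1, m)) * offline_lower_bound(jobs, m)
    assert list_makespan(jobs, m) <= bound
```

Exact rational arithmetic via `fractions.Fraction` avoids spurious failures from floating-point rounding when the bound is attained with equality.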
Consider now the online counterpart of $Pm \mid prmp \mid \sum C_j$. The decision-maker only finds out about the processing time of a job the moment it has been completed.
The following algorithm for this online scheduling problem is quite different from the LIST algorithm. The so-called Round Robin (RR) algorithm cycles through the list of jobs, giving each job a fixed unit of processing time in turn.
The Round Robin algorithm ensures that at all times any two uncompleted jobs have received an equal amount of processing time or one job has received just one unit of processing more than the other. If the unit of processing is made very small, then the Round Robin rule becomes equivalent to the Processor Sharing rule (see Example 5.2.9). If the total completion time is the objective to be minimized, then the competitive ratio of RR can be determined.
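A minimal sketch of Round Robin on $m$ machines follows. It rests on simplifying assumptions that are not taken from the text: rounds are synchronized across machines, and a machine whose job finishes in the middle of a quantum stays idle until the round ends. The function name `round_robin` is hypothetical. As with any simulation of a non-clairvoyant rule, the processing times are used only to detect completion.

```python
from collections import deque
from fractions import Fraction


def round_robin(processing_times, m, quantum):
    """Completion times under Round Robin on m machines (illustrative sketch).

    In each round, up to m jobs at the front of the queue receive one
    quantum (or whatever they still need, if that is less) and then
    rejoin the back of the queue.
    """
    remaining = [Fraction(p) for p in processing_times]
    completion = [None] * len(remaining)
    queue = deque(range(len(remaining)))
    now = Fraction(0)
    while queue:
        batch = [queue.popleft() for _ in range(min(m, len(queue)))]
        for j in batch:
            run = min(quantum, remaining[j])
            remaining[j] -= run
            if remaining[j] == 0:
                completion[j] = now + run
            else:
                queue.append(j)
        now += quantum  # simplification: rounds are synchronized
    return completion


# Four unit jobs on two machines with a quantum of 1/4.
print(round_robin([1, 1, 1, 1], m=2, quantum=Fraction(1, 4)))
# [Fraction(7, 4), Fraction(7, 4), Fraction(2, 1), Fraction(2, 1)]
```

Because jobs rejoin the queue at the back, the queue stays ordered by the amount of processing received, so any two uncompleted jobs differ by at most one quantum. Shrinking the quantum pushes all four completion times in the example toward $n/m = 2$, the Processor Sharing value.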
Theorem 5.6.2. The competitive ratio of the RR algorithm is 2.
Proof. Assume, for the time being, that the number of jobs, $n$, is known. In what follows, it will actually be shown that the worst case ratio of RR is $2 - 2m/(n+m)$.
In order to show that the worst case ratio cannot be better (lower) than $2 - 2m/(n+m)$, it suffices to find an example that attains this bound. Consider $n$ identical jobs with processing time equal to 1 and let $n$ be a multiple of $m$. It is clear that under the Round Robin rule all $n$ jobs are completed at time $n/m$, whereas under the nonpreemptive scheduling rule (which is also equivalent to SPT), $m$ jobs are completed at time 1, $m$ jobs at time 2, and so on. So for this example the ratio is $n^2/m$ divided by

$$\frac{m}{2} \cdot \frac{n}{m} \Bigl(\frac{n}{m} + 1\Bigr),$$

which equals $2 - 2m/(n+m)$.
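The arithmetic of this example is easy to confirm with exact rationals. The helper below is a hypothetical illustration: it takes Round Robin in its small-quantum limit, where every job completes at time $n/m$, and compares the resulting total completion time with that of SPT.

```python
from fractions import Fraction


def identical_jobs_ratio(n, m):
    """Total completion time of RR (small-quantum limit) divided by that of
    SPT, for n unit jobs on m machines, with n a multiple of m."""
    assert n % m == 0
    rounds = n // m
    rr_total = n * Fraction(n, m)               # every job completes at n/m
    spt_total = m * rounds * (rounds + 1) // 2  # m jobs finish at 1, 2, ...
    return rr_total / spt_total


for m in (1, 2, 5):
    for k in (1, 2, 10, 100):
        n = k * m
        assert identical_jobs_ratio(n, m) == 2 - Fraction(2 * m, n + m)

print(identical_jobs_ratio(6, 2))  # 3/2
```

For $n = 6$ and $m = 2$ the totals are 18 under RR and 12 under SPT, so the ratio is $3/2 = 2 - 4/8$.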
It remains to be shown that the worst case ratio cannot be worse (larger) than $2 - 2m/(n+m)$. Assume the processing times of the jobs are $p_1 \ge p_2 \ge \cdots \ge p_n$. Let $R(\ell)$, $\ell = 1, \ldots, n/m$, denote the subset of jobs $j$ that satisfy

$$(\ell - 1)m < j \le \ell m.$$

So $R(1)$ contains jobs $1, \ldots, m$ (the longest jobs); $R(2)$ contains jobs $m+1, m+2, \ldots, 2m$, and so on. It can be shown that the schedule that minimizes the total completion time is SPT and that the total completion time under SPT is

$$\sum_{j=1}^{n} C_j(OPT) = \sum_{j=1}^{n} C_j(SPT) = \sum_{\ell=1}^{n/m} \sum_{j \in R(\ell)} \ell \, p_j.$$
Consider now the total completion time under RR. Note that $C_j$ denotes the completion time of the $j$th longest job. It can be shown now that

$$C_j(RR) = C_{j+1}(RR) + \frac{j}{m}\,(p_j - p_{j+1})$$

for $j \ge m$, while

$$C_j(RR) = C_{m+1}(RR) + p_j - p_{m+1}$$

for $j < m$. Eliminating the recurrence yields for $j \ge m$

$$C_j(RR) = \frac{j}{m}\, p_j + \frac{1}{m} \sum_{k=j+1}^{n} p_k$$

and for $j < m$

$$C_j(RR) = p_j + \frac{1}{m} \sum_{k=m+1}^{n} p_k.$$

A simple calculation establishes that

$$\sum_{j=1}^{n} C_j(RR) = \sum_{j=1}^{m} p_j + \sum_{j=m+1}^{n} \frac{2j-1}{m}\, p_j.$$

The ratio $\sum C_j(RR) / \sum C_j(OPT)$ is maximized when all the jobs in the same subset have the same processing time. To see this, note that for OPT (SPT) the coefficient of $p_j$'s contribution to the total completion time is determined solely by its subset index. On the other hand, for RR, the coefficient is smaller for the longer jobs within a specific group. Thus, reducing the value of each $p_j$ to be equal to the smallest processing time of any job in its group can only increase the ratio. By a similar argument, it can be shown that the worst case ratio is achieved when $n$ is a multiple of $m$.
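The two closed-form totals can be compared against direct simulations. The following sketch is illustrative only and all function names are hypothetical. It models RR in its small-quantum limit as processor sharing, where each of $k$ uncompleted jobs is processed at rate $\min(1, m/k)$, and it builds the SPT schedule by placing the $i$th shortest job on machine $i \bmod m$.

```python
import random
from fractions import Fraction


def spt_total(p, m):
    """Total completion time of SPT, built explicitly on m machines."""
    loads = [Fraction(0)] * m
    total = Fraction(0)
    for i, pj in enumerate(sorted(p)):
        loads[i % m] += pj          # i-th shortest job goes to machine i mod m
        total += loads[i % m]
    return total


def processor_sharing_total(p, m):
    """Total completion time of RR in the limit of a vanishing quantum."""
    now = received = total = Fraction(0)
    jobs = sorted(p)                # the shortest uncompleted job finishes next
    for idx, pj in enumerate(jobs):
        k = len(jobs) - idx         # number of uncompleted jobs
        rate = min(Fraction(1), Fraction(m, k))
        now += (pj - received) / rate
        received = Fraction(pj)
        total += now
    return total


def rr_closed_form(p, m):
    """sum_{j<=m} p_j + sum_{j>m} (2j-1)/m p_j, with p_1 >= ... >= p_n."""
    total = Fraction(0)
    for j, pj in enumerate(sorted(p, reverse=True), start=1):
        total += pj if j <= m else Fraction(2 * j - 1, m) * pj
    return total


def spt_closed_form(p, m):
    """sum of l * p_j, where l = ceil(j/m) is the subset index of job j."""
    return sum(Fraction(-(-j // m) * pj)
               for j, pj in enumerate(sorted(p, reverse=True), start=1))


rng = random.Random(1)
for _ in range(500):
    m = rng.randint(1, 4)
    p = [rng.randint(1, 20) for _ in range(rng.randint(1, 12))]
    assert processor_sharing_total(p, m) == rr_closed_form(p, m)
    assert spt_total(p, m) == spt_closed_form(p, m)
    assert rr_closed_form(p, m) < 2 * spt_closed_form(p, m)

print(rr_closed_form([3, 2, 1], 2), spt_closed_form([3, 2, 1], 2))  # 15/2 7
```

The final assertion checks only that the ratio stays below 2 on these random instances, which follows from comparing the coefficients $(2j-1)/m$ and $\lceil j/m \rceil$ term by term.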
Assume now that each subset contains exactly $m$ jobs of the same length. Let $q_\ell$ denote the common processing time of any job in subset $R(\ell)$. Then, a simple calculation shows that

$$\sum_{j=1}^{n} C_j(OPT) = \sum_{\ell=1}^{n/m} m \, \ell \, q_\ell,$$

and

$$\sum_{j=1}^{n} C_j(RR) = \sum_{\ell=1}^{n/m} m \, (2\ell - 1) \, q_\ell.$$

Once again, the ratio is maximized when all the $q_\ell$ are equal, implying that the worst case ratio is exactly $2 - 2m/(n+m)$.
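The last step can be spelled out as a short worked calculation. The shorthand $k = n/m$ for the number of subsets and $q$ for the common processing time are introduced here only for this derivation; the identities $\sum_{\ell=1}^{k} (2\ell - 1) = k^2$ and $\sum_{\ell=1}^{k} \ell = k(k+1)/2$ do the rest.

```latex
\[
\frac{\sum_{j=1}^{n} C_j(RR)}{\sum_{j=1}^{n} C_j(OPT)}
  = \frac{m q \sum_{\ell=1}^{k} (2\ell - 1)}{m q \sum_{\ell=1}^{k} \ell}
  = \frac{k^2}{k(k+1)/2}
  = \frac{2k}{k+1}
  = \frac{2n}{n+m}
  = 2 - \frac{2m}{n+m}.
\]
```

For instance, with $m = 2$ and $n = 6$ this gives $12/8 = 3/2$, and the value increases toward 2 as $n$ grows with $m$ fixed.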
In online scheduling the number of jobs is typically not known in advance, so a competitive ratio is usually not expressed as a function of $n$; it has to hold for any value of $n$. Since $2 - 2m/(n+m)$ approaches 2 as $n$ grows, it follows that the competitive ratio for RR is equal to 2.
Actually, there are several other variants of the online scheduling paradigm.
The variant considered in this section assumes that the decision-maker does not know the processing time of a job when it is released. The decision-maker only finds out what the processing time is when the job is completed. This form of online scheduling is at times referred to as non-clairvoyant online scheduling.
In another variant of online scheduling, the processing time of a job becomes known to the decision-maker immediately upon the job’s release. This variant is often referred to as clairvoyant online scheduling. However, in clairvoyant online scheduling the decision-maker still does not know how many jobs are going to be released and when the releases will occur.
An entirely different class of online algorithms are the so-called randomized online algorithms. A randomized algorithm allows the decision-maker to make random choices (for example, instead of assigning a job to the machine with the smallest load, the decision-maker may assign a job to a machine at random). If randomization is allowed, then it is of interest to know the expected objective value, where the expectation is taken over the random choices of the algorithm. A randomized algorithm is $\sigma$-competitive if for each instance this expectation is within a factor of $\sigma$ of the optimal objective value.