
Contents lists available at SciVerse ScienceDirect

The Journal of Systems and Software

journal homepage: www.elsevier.com/locate/jss

On evaluating commercial Cloud services: A systematic review

Zheng Li^a, He Zhang^b,∗, Liam O’Brien^c, Rainbow Cai^e, Shayne Flint^d

^a NICTA, School of Computer Science, Australian National University, Canberra, Australia

^b State Key Laboratory of Novel Software Technology, Software Institute, Nanjing University, Jiangsu, China

^c Geoscience Australia, Canberra, Australia

^d School of Computer Science, Australian National University, Canberra, Australia

^e Division of Information, Australian National University, Canberra, Australia

Article info

Article history:

Received 30 September 2012

Received in revised form 15 February 2013

Accepted 6 April 2013

Available online 26 April 2013

Keywords:

Cloud Computing

Cloud service evaluation

Systematic literature review

Abstract

Background: Cloud Computing is increasingly booming in industry with many competing providers and services. Accordingly, evaluation of commercial Cloud services is necessary. However, the existing evaluation studies are relatively chaotic. There exists tremendous confusion and gap between practices and theory about Cloud services evaluation.

Aim: To facilitate relieving the aforementioned chaos, this work aims to synthesize the existing evaluation implementations to outline the state-of-the-practice and also identify research opportunities in Cloud services evaluation.

Method: Based on a conceptual evaluation model comprising six steps, the systematic literature review (SLR) method was employed to collect relevant evidence to investigate the Cloud services evaluation step by step.

Results: This SLR identified 82 relevant evaluation studies. The overall data collected from these studies essentially depicts the current practical landscape of implementing Cloud services evaluation, and in turn can be reused to facilitate future evaluation work.

Conclusions: Evaluation of commercial Cloud services has become a world-wide research topic. Some of the findings of this SLR identify several research gaps in the area of Cloud services evaluation (e.g., Elasticity and Security evaluation of commercial Cloud services could be a long-term challenge), while some other findings suggest the trend of applying commercial Cloud services (e.g., compared with PaaS, IaaS seems more suitable for customers and is particularly important in industry). This SLR study itself also confirms some previous experiences and records new evidence-based software engineering (EBSE) lessons.

1. Introduction

By allowing customers to access computing services without owning computing infrastructures, Cloud Computing has emerged as one of the most promising computing paradigms in industry (Buyya et al., 2009). Correspondingly, there are more and more commercial Cloud services supplied by an increasing number of providers available in the market (Prodan and Ostermann, 2009) [LYKZ10].¹ Since different and competitive Cloud services may be offered with different terminologies, definitions, and goals (Prodan and Ostermann, 2009), Cloud services evaluation would be crucial and beneficial for both service customers (e.g., cost–benefit analysis) and providers (e.g., direction of improvement) [LYKZ10].

However, the evaluation of commercial Cloud services is inevitably challenging for two main reasons. Firstly, previous evaluation results may become quickly out of date. Cloud providers may continually upgrade their hardware and software infrastructures, and new commercial Cloud services and technologies may gradually enter the market. For example, at the time of writing, Amazon is still acquiring additional sites for Cloud data centre expansion (Miller, 2011); Google is moving its App Engine service from CPU usage model to instance model (Alesandre, 2011); while IBM just offered a public and commercial Cloud (Harris, 2011). As a result, customers would have to continuously re-design and repeat evaluation for employing commercial Cloud services.

Secondly, the back-ends (e.g., configurations of physical infrastructure) of commercial Cloud services are uncontrollable (often invisible) from the perspective of customers. Unlike consumer-owned computing systems, customers have little knowledge or control over the precise nature of Cloud services even in the “locked down” environment [SSS⁺08]. Evaluations in the context of public Cloud Computing are then inevitably more challenging than that for systems where the customer is in direct control of all aspects [Sta09]. In fact, it is natural that the evaluation of uncontrollable systems would be more complex than that of controllable ones.

∗ Corresponding author. Tel.: +862583621369; fax: +862583621370.

E-mail addresses: [email protected] (Z. Li), [email protected] (H. Zhang), [email protected] (L. O’Brien), shayne.ﬂ[email protected] (S. Flint).

¹ We use two types of bibliography formats: the alphabetic format denotes the Cloud service evaluation studies (primary studies) of the SLR, while the name-year format (present in the “References” section) refers to the other references for this article.

0164-1212/$ – see front matter © 2013 Elsevier Inc. All rights reserved.

http://dx.doi.org/10.1016/j.jss.2013.04.021

Meanwhile, the existing Cloud services evaluation research is relatively chaotic. On one hand, the Cloud can be viewed from various perspectives (Stokes, 2011), which may result in market hype and also skepticism and confusion (Zhang et al., 2010). As such, it is hard to point out the range of Cloud Computing and a full scope of metrics to evaluate different commercial Cloud services. On the other hand, there exists a tremendous gap between practice and research about Cloud services evaluation. For example, although the traditional benchmarks have been recognized as being insufficient for evaluating commercial Cloud services [BKKL09], they are still predominately used in practice for Cloud services evaluation.

To facilitate relieving the aforementioned research chaos, it is necessary for researchers and practitioners to understand the state-of-the-practice of commercial Cloud services evaluation. For example, the existing evaluation implementations can be viewed as primary evidence for adjusting research directions or summarizing feasible evaluation guidelines. As the main methodology applied for evidence-based software engineering (EBSE) (Dybå et al., 2005), the Systematic Literature Review (SLR) has been widely accepted as a standard and rigorous approach to evidence aggregation for investigating specific research questions (Kitchenham and Charters, 2007; Zhang and Babar, 2011). Naturally, we adopted the SLR method to identify, assess and synthesize the relevant primary studies to investigate Cloud services evaluation. In fact, according to the popular aims of implementing a systematic review (Lisboa et al., 2010), the results of this SLR can help identify gaps in current research and also provide a solid background for future research activities in the field of Cloud services evaluation.

This paper outlines the work involved in conducting this SLR on evaluating commercial Cloud services. Benefitting from this SLR, we confirm the conceptual model of Cloud services evaluation; the state-of-the-practice of the Cloud services evaluation is finally revealed; and several findings are highlighted as suggestions for future Cloud services evaluation work. In addition to the SLR results, the lessons learned from performing this SLR are also reported in the end. By observing the detailed implementation of this SLR, we confirm some suggestions supplied by the previous SLR studies, and also summarize our own experiences that could be helpful in the community of EBSE (Dybå et al., 2005). In particular, to distinguish and elaborate some specific findings, three parts (namely evaluation taxonomy (Li et al., 2012a), metrics (Li et al., 2012c), and factors (Li et al., 2012b)) of the outcome derived from this SLR have been reported separately. To avoid duplication, the previously reported results are only briefly summarized (cf. Sections 5.3, 5.4, and 5.6) in this paper.

The remainder of this paper is organized as follows. Section 2 supplements the background of this SLR, which introduces a spatial perspective as prerequisite to investigating Cloud services evaluations. Section 3 elaborates the SLR method and procedure employed in this study. Section 4 briefly describes the SLR results, while Section 5 answers the predefined research questions and highlights the findings. Section 6 discusses our own experiences in using the SLR method, while Section 7 shows some limitations with this study. Conclusions and some future work are discussed in Section 8.

2. Related work and a conceptual model of Cloud services evaluation

Evaluation of commercial Cloud services emerged as soon as those services were published [Gar07b, HLM⁺10]. In fact, Cloud services evaluation has rapidly and increasingly become a world-wide research topic during recent years. As a result, numerous research results have been published, covering various aspects of Cloud services evaluation. Although it is impossible to enumerate all the existing evaluation-related studies, we can roughly distinguish between different studies according to the evaluation aspects on which they mainly focused. Note that, since we are interested in the practices of Cloud services evaluation, Experiment-Intensive Studies are the main review objects in this SLR. Based on the rough differentiation, the general process of Cloud services evaluation can be approximately summarized and profiled using a conceptual model.

2.1. Different studies of Cloud services evaluation

Service feature-emphasized studies:

Since Cloud services are concrete representations of the Cloud Computing paradigm, the Cloud service features to be evaluated have been discussed mainly in Cloud Computing-related introductions, surveys, or research agendas. For example, the characteristics and relationships of Clouds and related technologies were clarified in Buyya et al. (2009), Foster et al. (2008), and Zhang et al. (2010), which hinted at the features that commercial Cloud services may generally embrace. Habib et al. (2010) portrayed the landscape of Cloud Computing with regard to trust and reputation. Most of the studies (Armbrust et al., 2010; Buyya et al., 2009; Rimal et al., 2009; Zhang et al., 2010) also summarized and compared detailed features of typical Cloud services in the current market. In particular, the Berkeley view of Cloud Computing (Armbrust et al., 2010) emphasized the economics when employing Cloud services.

Metrics-emphasized studies:

When evaluating Cloud services, a set of suitable measurement criteria or metrics must be chosen. As such, every single evaluation study inevitably mentions particular metrics when reporting the evaluation process and/or result. However, we did not find any systematic discussion about metrics for evaluating Cloud services. Considering that the selection of metrics plays an essential role in evaluation implementations (Obaidat and Boudriga, 2010), we performed a comprehensive investigation into evaluation metrics in the Cloud Computing domain based on this SLR. The investigation result has been published in Li et al. (2012c). To the best of our knowledge, this is the only metrics-intensive study of Cloud services evaluation.

Benchmark-emphasized studies:

Although traditional benchmarks have been widely employed for evaluating commercial Cloud services, there are concerns that traditional benchmarks may not be sufficient to meet the idiosyncratic characteristics of Cloud Computing. Correspondingly, the authors of [BKKL09] theoretically portrayed what an ideal Cloud benchmark should be. In fact, several new Cloud benchmarks have been developed, for example Yahoo! Cloud Serving Benchmark (YCSB) [CST⁺10] and CloudStone [SSS⁺08]. In particular, six types of emerging scale-out workloads were collected to construct a benchmark suite, namely CloudSuite (Ferdman et al., 2012), to represent today’s dominant Cloud-based applications, such as Data Serving, MapReduce, Media Streaming, SAT Solver, Web Frontend, and Web Search.

Experiment-emphasized studies:

To reveal the rapidly changing and customer-uncontrollable nature of commercial Cloud services, evaluations have to be implemented through practical experiments. In detail, an evaluation experiment is composed of experimental environment and experimental manipulation. If only focusing on the Cloud side, experimental environment indicates the involved Cloud resources like amount [Sta09] or location [DPhC09] of service instances, while experimental manipulation refers to the necessary operations on the Cloud resources together with workloads, for example increasing resource amount [BIN10] or varying request frequency [ZLK10]. In fact, given the aforementioned motivation, the existing experiment-intensive studies have been identified and used as the review objects in this SLR.

Fig. 1. A conceptual model of the generic process of Cloud services evaluation. [Diagram labels: (1) Requirement; (2) Relevant Service Features; (3) Metrics; (4) Benchmarks; (5) Experimental Environment (Required Resources); (6) Experimental Manipulation (Operations on Resources); Evaluation of Commercial Cloud Services; Satisfying.]

2.2. A conceptual model of the generic process of Cloud services evaluation

As mentioned previously, Cloud Computing is an emerging computing paradigm (Buyya et al., 2009). When it comes to the evaluation of a computing system (commercial Cloud services in this case), one of the most common issues may be the performance evaluation. Therefore, we decided to borrow the existing lessons from performance evaluation of traditional computing systems to investigate the generic process of Cloud services evaluation. In fact, to avoid possible evaluation mistakes, the steps common to all performance evaluation projects have been summarized ranging from Stating Goals to Presenting Results (Jain, 1991). By adapting these steps to the above-discussed related work, we decomposed an evaluation implementation process into six common steps and built a conceptual model of Cloud services evaluation, as illustrated in Fig. 1 and specified below.

(1) First of all, the requirement should be specified to clarify the evaluation purpose, which essentially drives the remaining steps of the evaluation implementation.

(2) Based on the evaluation requirement, we can identify the relevant Cloud service features to be evaluated.

(3) To measure the relevant service features, suitable metrics should be determined.

(4) According to the determined metrics, we can employ corresponding benchmarks that may already exist or have to be developed.

(5) Before implementing the evaluation experiment, the experimental environment should be constructed. The environment includes not only the Cloud resources to be evaluated but also resources involved in the experiment.

(6) Given all the aforementioned preparation, the evaluation experiment can be done with human manipulations, which finally satisfies the evaluation requirement.

The conceptual model then played a background and foundation role in the conduction of this SLR. Note that this generic evaluation model can be viewed as an abstract of evaluating any computing paradigm. For Cloud services evaluation, the step adaptation is further explained and discussed as a potential validity threat of this study in Section 7.1.
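As an illustration only, the six steps of the conceptual model can be written down as a simple record whose fields follow the order of Fig. 1. The class name `EvaluationPlan`, the helper functions, and every example value below are hypothetical and are not taken from any primary study; the sketch merely shows how each step depends on the one before it being filled in.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class EvaluationPlan:
    """One field per step of the six-step conceptual model (hypothetical names)."""
    requirement: str                # (1) evaluation purpose
    service_features: List[str]     # (2) relevant Cloud service features
    metrics: List[str]              # (3) metrics measuring those features
    benchmarks: List[str]           # (4) benchmarks producing the metrics
    environment: List[str]          # (5) required resources
    manipulation: List[str]         # (6) operations on the resources


def step_names() -> List[str]:
    """Return the step names in the order in which the model defines them."""
    return [f.name for f in fields(EvaluationPlan)]


def unfilled_steps(plan: EvaluationPlan) -> List[str]:
    """List the steps that are still empty, in model order."""
    return [name for name in step_names() if not getattr(plan, name)]


# Hypothetical example values, for illustration only.
plan = EvaluationPlan(
    requirement="compare storage read latency of two services",
    service_features=["storage"],
    metrics=["read latency (ms)"],
    benchmarks=["custom read workload"],
    environment=["one VM instance per service"],
    manipulation=["vary request frequency"],
)
print(unfilled_steps(plan))
```

Keeping the steps in one ordered structure makes it easy to check that an experiment is not started before its requirement, features, metrics, and benchmarks have been stated.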

3. Review method

According to the guidelines for performing SLR (Kitchenham and Charters, 2007), we made minor adjustments and planned our study into a protocol. Following the protocol, we unfold this SLR within three stages.

Planning review:

• Justify the necessity of carrying out this SLR.

• Identify research questions for this SLR.

• Develop SLR protocol by defining search strategy, selection criteria, quality assessment standard, and data extraction schema for conducting review stage.

Conducting review:

• Exhaustively search relevant primary studies in the literature.

• Select relevant primary studies and assess their qualities for answering research questions.

• Extract useful data from the selected primary studies.

• Arrange and synthesize the initial results of our study into review notes.

Reporting review:

• Analyze and interpret the initial results together with review notes into interpretation notes.

• Finalize and polish the previous notes into an SLR report.

3.1. Research questions

Corresponding to the overall aim of this SLR that is to investigate the procedures and experiences of evaluation of commercial Cloud services, six research questions were determined mainly to address the individual steps of the general evaluation process, as listed in Table 1.

In particular, we borrowed the term “scene” from the drama domain for the research question RQ6. In the context of drama, a scene is an individual segment of a plot in a story, and usually settled in a single location. By analogy, here we use “setup scene” to represent an atomic unit for constructing a complete experiment for evaluating commercial Cloud services. Note that, for the convenience of discussion, we broke the investigation of Service features-oriented step into two research questions (RQ2 and RQ3), while we used one research question (RQ6) to cover both Experimental Environment and Experimental Manipulation steps of the evaluation process (cf. Table 1).

3.2. Research scope

We employed three points in advance to constrain the scope of this research. First, this study focused on the commercial Cloud services only to make our effort closer to industry’s needs. Second, this study paid attention to Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) without concerning Software as a Service (SaaS). Since SaaS is not used to further build individual business applications [BKKL09], various SaaS implementations may comprise infinite and exclusive functionalities to be evaluated, which could make this SLR out of control even if adopting extremely strict selection/exclusion criteria. Third, following the past SLR experiences (Ali et al., 2010), this study also concentrated on the formal reports in academia rather than the informal evaluation practices in other sources.

Table 1

Research questions.

| ID | Research question | Main motivation | Investigated step of the general evaluation process |
|---|---|---|---|
| RQ1 | What are the purposes of evaluating commercial Cloud services? | To identify the purposes/requirements of evaluating commercial Cloud services. | Requirement |
| RQ2 | What commercial Cloud services have been evaluated? | To identify the most popular Cloud service and its provider that has attracted the dominant research effort. | Service features |
| RQ3 | What aspects and their properties of commercial Cloud services have been evaluated? | To outline a full scope of aspects and their properties that should be concerned when evaluating Cloud services. | Service features |
| RQ4 | What metrics have been used for evaluation of commercial Cloud services? | To find metrics practically used in the evaluation of commercial Cloud services. | Metrics |
| RQ5 | What benchmarks have been used for evaluation of commercial Cloud services? | To find benchmarks practically used in the evaluation of commercial Cloud services. | Benchmarks |
| RQ6 | What experimental setup scenes have been adopted for evaluating commercial Cloud services? | To identify the components of environment and operations for building evaluation experiments. | Experimental environment & Experimental manipulation |

3.3. Roles and responsibilities

The members involved in this SLR include a PhD student, a two-people supervisory panel, and a two-people expert panel. The PhD student is new to the Cloud Computing domain, and plans to use this SLR to unfold his research topic. His two supervisors have expertise in the two fields of service computing and evidence-based software engineering respectively, while the expert panel has strong background of computer system evaluation and Cloud Computing. In detail, the expert panel was involved in the discussions about review background, research questions, and data extraction schema when developing the SLR protocol; the specific review process was implemented mainly by the PhD student while under close supervision; the supervisors randomly cross-checked the student’s work, for example the selected and excluded publications; regular meetings were held by the supervisory panel with the student to discuss and resolve divergences and confusions over paper selection, data extraction, etc.; unsure issues and data analysis were further discussed by the five members all together.

3.4. Search strategy and process

The rigor of the search process is one of the distinctive characteristics of systematic reviews (Zhang and Ali Babar, 2010). To try to implement an unbiased and strict search, we set a precise publication time span, employed popular literature libraries, alternatively used a set of short search strings, and supplemented a manual search to compensate the automated search for the lack of typical search keywords.

3.4.1. Publication time span

As the term “Cloud Computing” started to gain popularity in 2006 (Zhang et al., 2010), we focused on the literature published from the beginning of 2006. And also considering the possible delay of publishing, we restricted the publication time span between January 1st, 2006 and December 31st, 2011.

3.4.2. Search resources

With reference to the existing SLR protocols and reports for referential experiences, as well as the statistics of the literature search engines (Zhang et al., 2011), we believed that the following five electronic libraries give a broad enough coverage of relevant primary studies:

• ACM Digital Library (http://dl.acm.org/)

• Google Scholar (http://scholar.google.com)

• IEEE Xplore (http://ieeexplore.ieee.org)

• ScienceDirect (http://www.sciencedirect.com)

• SpringerLink (http://www.springer.com)

3.4.3. Proposing search string

We used a three-step approach to proposing search string for this SLR:

(1) Based on the keywords and their synonyms in the research questions, we first extracted potential search terms, such as: “cloud computing”, “cloud provider”, “cloud service”, evaluation, benchmark, metric, etc.

(2) Then, by rationally modifying and combining these search terms, we constructed a set of candidate search strings.

(3) At last, following the Quasi-Gold Standard (QGS) based systematic search approach (Zhang et al., 2011), we performed several pilot manual searches to determine the most suitable search string according to the search performance in terms of sensitivity and precision.
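Step (2), combining search terms into candidate search strings, can be sketched as follows. This is a minimal illustration under our own assumptions: the helper names `or_group` and `build_search_string` are hypothetical, and the example uses only the terms quoted in step (1), not the full string that was finally adopted.

```python
from typing import Iterable, List


def or_group(terms: Iterable[str]) -> str:
    """Join synonyms with OR; multi-word terms are quoted as exact phrases."""
    quoted: List[str] = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_search_string(*groups: Iterable[str]) -> str:
    """AND together several OR-groups to form one candidate search string."""
    return " AND ".join(or_group(g) for g in groups)


# Terms taken from step (1); the grouping itself is an assumption.
cloud_terms = ["cloud computing", "cloud provider", "cloud service"]
evaluation_terms = ["evaluation", "benchmark", "metric"]

candidate = build_search_string(cloud_terms, evaluation_terms)
print(candidate)
```

Generating candidates programmatically makes it straightforward to vary one group at a time and compare the resulting strings in the pilot searches of step (3).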

Particularly, the sensitivity and precision of a search string can be calculated as shown in Eqs. (1) and (2) respectively (Zhang et al., 2011).

\[
\text{Sensitivity} = \frac{\text{Number of relevant studies retrieved}}{\text{Total number of relevant studies}} \times 100\% \tag{1}
\]

\[
\text{Precision} = \frac{\text{Number of relevant studies retrieved}}{\text{Number of studies retrieved}} \times 100\% \tag{2}
\]

In detail, we selected seven Cloud-related conference proceedings (cf. Table 2) to test and contrast sensitivity and precision of different candidate search strings. According to the suggestions of search strategy scales (Zhang et al., 2011), we finally proposed a search string with the Optimum strategy, as shown below:

(“cloud computing” OR “cloud platform” OR “cloud provider” OR “cloud service” OR “cloud offering”) AND (evaluation OR evaluating OR evaluate OR evaluated OR experiment OR benchmark OR metric OR simulation) AND (<Cloud provider’s name> OR ...)

Note that the (<Cloud provider’s name> OR ...) denotes the “OR”-connected names of the top ten Cloud providers (SearchCloudComputing, 2010). The specific sensitivity and precision of this search string with respect to those seven proceedings are listed in Table 2. Given such high sensitivity and more than enough precision (Zhang et al., 2011), although the search string was locally optimized, we have more confidence to expect a globally acceptable search result.

Table 2

Sensitivity and precision of the search string with respect to several conference proceedings.

| Target proceedings | Sensitivity | Precision |
|---|---|---|
| CCGRID 2009 | 100% (1/1) | 100% (1/1) |
| CCGRID 2010 | N/A (0/0) | N/A (0/2) |
| CCGRID 2011 | 100% (1/1) | 50% (1/2) |
| CloudCom 2010 | 100% (3/3) | 27.3% (3/11) |
| CloudCom 2011 | 100% (2/2) | 33.3% (2/6) |
| CLOUD 2009 | N/A (0/0) | N/A (0/0) |
| CLOUD 2010 | N/A (0/0) | N/A (0/6) |
| CLOUD 2011 | 66.7% (2/3) | 25% (2/8) |
| GRID 2009 | 100% (1/1) | 50% (1/2) |
| GRID 2010 | 100% (1/1) | 100% (1/1) |
| GRID 2011 | N/A (0/0) | N/A (0/0) |
| Total | 91.7% (11/12) | 28.2% (11/39) |
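The totals in Table 2 follow directly from Eqs. (1) and (2). The short sketch below recomputes them from the per-proceedings counts given in the table; the function and variable names are ours, and returning `None` for a zero denominator is an assumption about how an undefined ratio should be represented.

```python
from typing import Dict, Optional, Tuple


def ratio_percent(hits: int, total: int) -> Optional[float]:
    """Return hits/total as a percentage, or None when the denominator is zero."""
    if total == 0:
        return None
    return 100.0 * hits / total


# (relevant studies retrieved, total relevant studies, studies retrieved),
# copied from the fractions reported in Table 2.
TABLE_2: Dict[str, Tuple[int, int, int]] = {
    "CCGRID 2009": (1, 1, 1),
    "CCGRID 2010": (0, 0, 2),
    "CCGRID 2011": (1, 1, 2),
    "CloudCom 2010": (3, 3, 11),
    "CloudCom 2011": (2, 2, 6),
    "CLOUD 2009": (0, 0, 0),
    "CLOUD 2010": (0, 0, 6),
    "CLOUD 2011": (2, 3, 8),
    "GRID 2009": (1, 1, 2),
    "GRID 2010": (1, 1, 1),
    "GRID 2011": (0, 0, 0),
}


def overall(table: Dict[str, Tuple[int, int, int]]) -> Tuple[Optional[float], Optional[float]]:
    """Aggregate the counts, then apply Eq. (1) (sensitivity) and Eq. (2) (precision)."""
    hits = sum(row[0] for row in table.values())
    relevant = sum(row[1] for row in table.values())
    retrieved = sum(row[2] for row in table.values())
    return ratio_percent(hits, relevant), ratio_percent(hits, retrieved)


sensitivity, precision = overall(TABLE_2)
print(f"sensitivity = {sensitivity:.1f}%, precision = {precision:.1f}%")
```

Note that the overall values are computed from the summed counts (11/12 and 11/39), not by averaging the per-proceedings percentages.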

3.4.4. Study identification process

There are three main activities in the study identification process, as listed below: Quickly Scanning based on the automated search, Entirely Reading and Team Meeting for the initially identified studies, and manual Reference Snowballing. The whole process of study identification has been illustrated as a sequence diagram in Fig. 2.

(1) Quickly scanning:

Given the pre-determined search strings, we unfolded automated search in the aforementioned electronic libraries respectively. Relevant primary studies were initially selected by scanning titles, keywords and abstracts.

(2) Entirely reading and team meeting:

The initially identified publications were decided by further reviewing the full-text, while the unsure ones were discussed in the team meeting.

(3) Reference snowballing:

To further find possibly missed publications, we also supplemented a manual search by snowballing the references (Kitchenham et al., 2011) of the selected papers found by the automated search. The new papers identified by reference snowballing were also read thoroughly and/or discussed.

Fig. 2. Study identification process in sequence diagram. The numbers in the brackets denote how many publications were identified/selected at different steps. [Sequence diagram across Quickly Scanning, Entirely Reading & Team Meeting, and Reference Snowballing; step labels: determine relevant studies from initially selected publications, further identify studies from references, snowballed publications, determine relevant studies from snowballed publications, finally selected relevant studies; bracketed counts shown: (4017), (132), (75), (18), (7), (82).]

3.5. Inclusion and exclusion criteria

In detail, the inclusion and exclusion criteria can be specified as:

Inclusion criteria:

(1) Publications that describe practical evaluation of commercial Cloud services.

(2) Publications that describe evaluation tool/method/framework for Cloud Computing, and include practical evaluation of commercial Cloud services as a demonstration or case study.

(3) Publications that describe practical evaluation of comparison or collaboration between different computing paradigms involving commercial Cloud services.

(4) Publications that describe case studies of adapting or deploying the existing applications or systems to public Cloud platforms with evaluations. This scenario can be viewed as using real applications to benchmark commercial Cloud services. Note the difference between this criterion and Exclusion Criterion (3).

(5) In particular, above inclusion criteria apply only to regular academic publications (Full journal/conference/workshop papers, technical reports, and book chapters).

Exclusion criteria:

(1) Publications that describe evaluation of non-commercial Cloud services in the private Cloud or open-source Cloud.

(2) Publications that describe only theoretical (non-practical) discussions, like [BKKL09] (cf. Table C.14), about evaluation for adopting Cloud Computing.

(3) Publications that propose new Cloud-based applications or systems, and the aim of the corresponding evaluation is merely to reflect the performance or other features of the proposed application/system. Note the difference between this criterion and Inclusion Criterion (4).

(4) Publications that are previous versions of the later published work.

(5) In addition, short/position papers, demo or industry publications are all excluded.

3.6. Quality assessment criteria

Since a relevant study can be assessed only through its report, and Cloud services evaluation belongs to the field of experimental computer science [Sta09], here we followed the reporting structure of experimental studies (cf. Table 9 in Runeson and Höst, 2009) to assess the reporting quality of one publication. In particular, we divided the reporting structural concerns into two categories: the generic Research reporting quality and the experimental Evaluation reporting quality.

• Research reporting: Is the paper or report well organized and presented following a regular research procedure?

• Evaluation reporting: Is the evaluation implementation work described thoroughly and appropriately?

Table 3

The data extraction schema.

| ID | Data extraction attribute | Data extraction question | Corresponding research question | Investigated step in the general evaluation process |
|---|---|---|---|---|
| (1) | Author | Who is/are the author(s)? | N/A (Metadata) | N/A (Generic investigation in SLR) |
| (2) | Affiliation | What is/are the authors’ affiliation(s)? | N/A (Metadata) | N/A (Generic investigation in SLR) |
| (3) | Publication title | What is the title of the publication? | N/A (Metadata) | N/A (Generic investigation in SLR) |
| (4) | Publication year | In which year was the evaluation work published? | N/A (Metadata) | N/A (Generic investigation in SLR) |
| (5) | Venue type | What type of the venue does the publication have? (Journal, Conference, Workshop, Book Chapter, or Technical Report) | N/A (Metadata) | N/A (Generic investigation in SLR) |
| (6) | Venue name | Where is the publication’s venue? (Acronym of name of journal, conference, workshop, or institute, e.g., ICSE, TSE) | N/A (Metadata) | N/A (Generic investigation in SLR) |
| (7) | Purpose | What is the purpose of the evaluation work in this study? | RQ1 | Requirement |
| (8) | Provider | By which commercial Cloud provider(s) are the evaluated services supplied? | RQ2 | Service features |
| (9) | Service | What commercial Cloud services were evaluated? | RQ2 | Service features |
| (10) | Service aspect | What aspect(s) of the commercial Cloud services was/were evaluated in this study? | RQ3 | Service features |
| (11) | Aspect property | What properties were concerned for the evaluated aspect(s)? | RQ3 | Service features |
| (12) | Metric | What evaluation metrics were used in this study? | RQ4 | Metrics |
| (13) | Benchmark | What evaluation benchmark(s) was/were used in this study? | RQ5 | Benchmarks |
| (14) | Environment | What environmental setup scene(s) were concerned in this study? | RQ6 | Experimental environment |
| (15) | Operation | What operational setup scene(s) were concerned in this study? | RQ6 | Experimental manipulation |
| (16) | Evaluation time | If specified, when was the time or period of the evaluation work? | N/A (Additional data) | N/A (To note evaluation time/period) |
| (17) | Configuration | What detailed configuration(s) was/were made in this study? | N/A (Additional data) | N/A (To facilitate possible replication of review) |

In detail, we proposed eight criteria as a checklist to examine different reporting concerns in a relevant study:

Criteria of research reporting quality:

(1) Is the research problem clearly specified?

(2) Are the research aim(s)/objective(s) clearly identified?

(3) Is the related work comprehensively reviewed?

(4) Are findings/results reported?

Criteria of evaluation reporting quality:

(5) Is the period of evaluation work specified?

(6) Is the evaluation environment clearly described?

(7) Is the evaluation approach clearly described?

(8) Is the evaluation result analyzed or discussed?

Each criterion was used to judge one aspect of the quality of a publication, and to assign a quality score for the corresponding aspect of the publication. The quality score can be 1, 0.5, or 0, which represent the quality from excellent to poor as answering Yes, Partial, or No respectively. The overall quality of a publication can then be calculated by summing up all the quality scores received.
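The scoring rule above can be sketched in a few lines. The mapping of Yes/Partial/No to 1/0.5/0 and the summation come from the text; the function name and the sample answers for one publication are hypothetical and serve only to show the arithmetic.

```python
from typing import Dict, List

# Score assigned to each checklist answer, as defined in the text.
ANSWER_SCORE: Dict[str, float] = {"yes": 1.0, "partial": 0.5, "no": 0.0}


def quality_score(answers: List[str]) -> float:
    """Sum the per-criterion scores for a list of Yes/Partial/No answers."""
    return sum(ANSWER_SCORE[a.strip().lower()] for a in answers)


# Hypothetical answers for one publication: criteria (1)-(4) and (5)-(8).
research_answers = ["Yes", "Yes", "Partial", "Yes"]
evaluation_answers = ["No", "Yes", "Yes", "Partial"]

research_quality = quality_score(research_answers)      # criteria (1)-(4)
evaluation_quality = quality_score(evaluation_answers)  # criteria (5)-(8)
print(research_quality, evaluation_quality)
```

With four criteria per category, each category score lies between 0 and 4 in steps of 0.5.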

3.7. Data extraction and analysis

According to the research questions we previously identified, this SLR used a data extraction schema to collect relevant data from primary studies, as listed in Table 3. The schema covers a set of attributes, and each attribute corresponds to a data extraction question. The relationships between the data extraction questions and predefined research questions are also specified.

In particular, the collected data can be distinguished between the metadata of publications and experimental data of evaluation work. The metadata was mainly used to perform statistical investigation of relevant publications, while the Cloud services evaluation data was analyzed to answer those predefined research questions. Moreover, the data of evaluation time collected by question (16) was used in the quality assessment; the data extraction question (17) about detailed configuration was to snapshot the evaluation experiments for possible replication of review.

4. Review results

To distinguish the metadata analysis from the evaluation data analysis in this SLR, we first summarize the results of metadata analysis and quality assessment in this section. The findings and answers to those predefined research questions are then discussed in the next section.

Following the search sequence (cf. Fig. 2), 82 relevant primary studies in total were identified. In detail, the proposed search string initially brought 1198, 917, 255, 366 and 1281 results from the ACM Digital Library, Google Scholar, IEEE Xplore, ScienceDirect, and SpringerLink respectively, as listed in the column Number of Retrieved Papers of Table 4.

By reading titles and abstracts, and quickly scanning publications in the automated search process, we initially gathered 132 papers. After entirely reading these papers, 75 were selected for this SLR. In particular, 17 undecided papers were finally excluded after our discussion in team meetings; two technical reports and four conference papers were excluded due to the duplication of their latter versions. A set of typical excluded papers (cf. Appendix E) were particularly explained to demonstrate the application of predefined exclusion criteria, as shown in Appendix C. Finally, seven more papers were chosen by reference snowballing in the manual search process. The finally selected 82 primary studies have been listed in Appendix D. The distribution of the identified publications from different electronic databases is listed in Table 4. Note that the four manually identified papers were further located by using Google Scholar.
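As a quick consistency check, the counts reported above and in Table 4 can be tallied as follows. The numbers are those listed in Table 4 and in the preceding paragraph; the variable names are ours.

```python
# Per-library counts as listed in Table 4: (retrieved papers, relevant papers).
LIBRARIES = {
    "ACM Digital Library": (1198, 21),
    "Google Scholar": (917, 14),
    "IEEE Xplore": (255, 36),
    "ScienceDirect": (366, 0),
    "SpringerLink": (1281, 11),
}

total_retrieved = sum(retrieved for retrieved, _ in LIBRARIES.values())
total_relevant = sum(relevant for _, relevant in LIBRARIES.values())

# Share of each library among all relevant papers, rounded to one decimal.
share = {
    name: round(100.0 * relevant / total_relevant, 1)
    for name, (_, relevant) in LIBRARIES.items()
}

# Selection steps from the text: 75 papers kept after full reading,
# plus 7 papers added by reference snowballing.
selected_after_reading = 75
added_by_snowballing = 7
final_selected = selected_after_reading + added_by_snowballing

print(total_retrieved, total_relevant, final_selected)
print(share)
```

The totals agree with the figures quoted in the text: 4017 retrieved papers and 82 finally selected primary studies.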

Table 4

Distribution of relevant studies over electronic libraries.

| Electronic library | Number of retrieved papers | Number of relevant papers | Percentage in total relevant papers |
|---|---|---|---|
| ACM Digital Library | 1198 | 21 | 25.6% |
| Google Scholar | 917 | 14 | 17.1% |
| IEEE Xplore | 255 | 36 | 43.9% |
| ScienceDirect | 366 | 0 | 0% |
| SpringerLink | 1281 | 11 | 13.4% |
| Total | 4017 | 82 | 100% |

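Table 5

Distribution of studies over quality.

| Type | Score | Number of papers | Percentage |
|---|---|---|---|
| Research reporting quality | 2 | 2 | 2.44% |
| | 2.5 | 2 | 2.44% |
| | 3 | 22 | 26.83% |
| | 3.5 | 3 | 3.66% |
| | 4 | 53 | 64.63% |
| | Total | 82 | 100% |
| Evaluation reporting quality | 1 | 1 | 1.22% |
| | 2 | 8 | 9.76% |
| | 2.5 | 13 | 15.85% |
| | 3 | 37 | 45.12% |
| | 3.5 | 13 | 15.85% |
| | 4 | 10 | 12.2% |
| | Total | 82 | 100% |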

These 82 primary studies were conducted by 244 authors (co-authors) in total. 40 authors were involved in more than one evaluation work. Interestingly, only four primary studies included co-authors with a direct affiliation with a Cloud services vendor (i.e. Microsoft). On one hand, it may be fairer and more acceptable for third parties’ evaluation work to be published. On the other hand, this phenomenon may result from the limitation with our research scope (cf. Section 7.2). To visibly illustrate the distribution of authors’ affiliations, we mark their locations on a map, as shown in Fig. 3. Note that the amount of authors’ affiliations is more than the total number of the selected primary studies, because some evaluation work could be collaborated between different research organizations or universities. The map shows that, although major research efforts were from USA, the topic of evaluation of commercial Cloud services has been researched world-wide.

Furthermore, we can make those affiliations accurate to: (1) the background universities of institutes, departments or schools; and (2) the background organizations of individual research laboratories or centers. In this paper, we only focus on the universities/organizations that have published three or more primary studies, as shown in Fig. 4. We believe these universities/organizations may have more potential to provide further and continual work on evaluation for commercial Cloud services in the future.

Fig. 3. Study distribution over the (co-)author's affiliations.

Fig. 4. Universities/organizations with three or more publications (bar chart of primary study amount).

The distribution of publishing time can be illustrated by grouping the primary studies into years, as shown in Fig. 5. It is clear that research interest in the evaluation of commercial Cloud services has increased rapidly during the past five years.

In addition, these 82 studies on the evaluation of commercial Cloud services were scattered over as many as 57 different venues. The publishing venues are more dispersed than we expected. Although there was no dense publication zone, in general those venues could be categorized into five different types: Book Chapter, Technical Report, Journal, Workshop, and Conference, as shown in Fig. 6. Not surprisingly, the publications of evaluation work were relatively concentrated in the Cloud and Distributed Computing related conferences, such as CCGrid, CloudCom, and IPDPS. Moreover, the emerging and Cloud-dedicated books, technical reports, and workshops were also typical publishing venues for Cloud services evaluation work.

As for the quality assessment, instead of listing the detailed quality scores in this paper, here we only show the distribution of the studies over their total reporting quality and total working quality respectively, as listed in Table 5.

Fig. 5. Study distribution over the publication years (primary study amount: 2006: 0; 2007: 2; 2008: 7; 2009: 13; 2010: 26; 2011: 34).


Fig. 6. Study distribution over the publishing venue types (Book Chapter: 3.7% (3); Technical Report: 2.4% (2); Journal Paper: 14.6% (12); Workshop Paper: 25.6% (21); Conference Paper: 53.7% (44)).

Table 6
Distribution of studies over evaluation purpose.

Purpose: Cloud resource exploration
Primary studies:
  [ADWC10] [BCA11] [BK09] [BL10] [BT11]
  [CA10] [CBH⁺11] [CHK⁺11] [dAadCB10]
  [GCR11] [GK11] [Gar07b] [HLM⁺10] [ILFL11]
  [IYE11] [LYKZ10] [LW09] [PEP11] [RD11]
  [RTSS09] [SDQR10] [Sta09] [SASA⁺11] [TYO10]
  [VDG11] [WN10] [WVX11] [YIEO09] [ZLK10]

Purpose: Business computing in the Cloud
Primary studies:
  [BS10] [BFG⁺08] [CMS11] [CRT⁺11] [DPhC09]
  [GBS11] [Gar07a] [GS11] [JMW⁺11] [KKL10]
  [LML⁺11] [LYKZ10] [SSS⁺08]

Purpose: Scientific computing in the Cloud
Primary studies:
  [AM10] [BIN10] [DDJ⁺10] [DSL⁺08] [EH08]
  [GWC⁺11] [Haz08] [HH09] [HHJ⁺11] [HZK⁺10]
  [INB11] [IOY⁺11] [JDV⁺09] [JDV⁺10] [JD11]
  [JMR⁺11] [JRM⁺10] [EKKJP10] [LYKZ10]
  [LHvI⁺10] [LJB10] [LJ10] [LML⁺11] [LZZ⁺11]
  [MF10] [NB09] [OIY⁺09] [PIRG08] [RSP11]
  [RVG⁺10] [SKP⁺11] [SMW⁺11] [TCM⁺11]
  [VJDR11] [MVML11] [VPB09] [Wal08]
  [WKF⁺10] [WWDM09] [ZG11]

Purpose: Comparison between computing paradigms
Primary studies:
  [CHS10] [IOY⁺11] [KJM⁺09] [ZLZ⁺11]

According to the quality assessment, we can highlight two limitations of the existing Cloud services evaluation work in particular. Firstly, less than 16% of the publications specifically recorded the time of the evaluation experiments. As mentioned earlier, since commercial Cloud services are changing rapidly, not reporting the experimental time inevitably hinders reusing evaluation results or tracking past data in the future. Secondly, some primary studies did not thoroughly specify the evaluation environments or experimental procedures. As a result, it would be hard for others to replicate the evaluation experiments or learn from the evaluation experiences reported in those studies, especially once their evaluation results become out of date.
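The two limitations above amount to missing metadata: the time of the experiment, and the environment and procedure used. As a minimal sketch of how such information could be captured alongside results, the record type below flags whichever of these items is absent. The class name, field names, and sample values are hypothetical illustrations and are not taken from any primary study.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class EvaluationRecord:
    """Hypothetical metadata record for one Cloud evaluation experiment."""

    service: str
    experiment_date: Optional[date] = None  # when the experiment was run
    environment: Dict[str, str] = field(default_factory=dict)  # e.g. region
    procedure: List[str] = field(default_factory=list)  # ordered steps

    def missing_fields(self) -> List[str]:
        """Return the reporting items that are still unspecified."""
        missing = []
        if self.experiment_date is None:
            missing.append("experiment_date")
        if not self.environment:
            missing.append("environment")
        if not self.procedure:
            missing.append("procedure")
        return missing


# Illustrative usage with made-up values.
incomplete = EvaluationRecord(service="Amazon EC2")
complete = EvaluationRecord(
    service="Amazon EC2",
    experiment_date=date(2011, 5, 1),
    environment={"instance_type": "example-small"},
    procedure=["provision instance", "run benchmark", "collect results"],
)
print(incomplete.missing_fields())
print(complete.missing_fields())
```

A check of this kind could be run before publishing results, so that the experiment time and environment are reported together with the measurements.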

5. Discussion addressing research questions

The discussion in this section is organized following the sequence of answers to the six predefined research questions.

5.1. RQ1: What are the purposes of evaluating commercial Cloud services?

After reviewing the selected publications, we have found mainly four different motivations behind the evaluations of commercial Cloud services, as illustrated in Fig. 7.

Fig. 7. Purposes of Cloud services evaluation (Cloud Resource Exploration: 29; Business Computing in the Cloud: 13; Scientific Computing in the Cloud: 40; Comparison between Computing Paradigms: 4).

Cloud Resource Exploration can be viewed as a root motivation. As the name suggests, it is to investigate the available resources, such as computation capability, supplied by commercial Cloud services. For example, the purpose of study [Sta09] was purely to understand the computation performance of Amazon EC2. The other three research motivations are essentially consistent with Cloud Resource Exploration, but they have specific intentions of applying Cloud resources: Scientific/Business Computing in the Cloud investigates applying Cloud Computing to scientific/business issues, and Comparison between Computing Paradigms compares Cloud Computing with other computing paradigms. For example, study [JRM⁺10] particularly investigated high-performance scientific computing using Amazon Web services; the benchmark Cloudstone [SSS⁺08] was proposed to evaluate the capability of the Cloud for hosting Web 2.0 applications; and study [CHS10] contrasted Cloud Computing with Community Computing with respect to cost effectiveness.

According to these four evaluation purposes, the reviewed primary studies can be differentiated into four categories, as listed in Table 6. Note that one primary study may have more than one evaluation purpose, and we judge the evaluation purposes of a study through its described application scenarios. For example, although the detailed evaluation contexts could range broadly from Cloud provider selection [LYKZ10] to application feasibility verification [VJDR11], we may generally recognize their purposes as Scientific Computing in the Cloud if these studies investigated scientific applications in the Cloud. On the other hand, studies like "performance evaluation of popular Cloud IaaS providers" [SASA⁺11] only have the motivation Cloud Resource Exploration if they did not specify any application scenario.

Table 7
Distribution of studies over Cloud service aspects/properties.

Aspect        Property                  #Papers   Percentage
Performance   Communication             24        29.27%
              Computation               20        24.39%
              Memory (Cache)            12        14.63%
              Storage                   28        34.15%
              Overall performance       48        58.54%
              Total                     78        95.12%
Economics     Cost                      35        42.68%
              Elasticity                9         10.98%
              Total                     40        48.78%
Security      Authentication            1         1.22%
              Data security             4         4.88%
              Infrastructural security  1         1.22%
              Overall security          1         1.22%
              Total                     6         7.32%
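Because one primary study may carry more than one evaluation purpose, the category sizes of Table 6 add up to more than the 82 selected studies. The sketch below reconciles the two numbers; the category sizes and the three multi-purpose studies are read from Table 6 and Fig. 7, while the variable names are illustrative assumptions.

```python
# Category sizes as reported in Fig. 7 and Table 6.
category_sizes = {
    "Cloud resource exploration": 29,
    "Business computing in the Cloud": 13,
    "Scientific computing in the Cloud": 40,
    "Comparison between computing paradigms": 4,
}

# Studies that Table 6 lists under more than one purpose.
multi_purpose = {
    "LYKZ10": {
        "Cloud resource exploration",
        "Business computing in the Cloud",
        "Scientific computing in the Cloud",
    },
    "LML+11": {
        "Business computing in the Cloud",
        "Scientific computing in the Cloud",
    },
    "IOY+11": {
        "Scientific computing in the Cloud",
        "Comparison between computing paradigms",
    },
}

# Total (study, purpose) assignments across all four categories.
assignments = sum(category_sizes.values())

# Each extra purpose of a multi-purpose study is one surplus assignment.
surplus = sum(len(purposes) - 1 for purposes in multi_purpose.values())

distinct_studies = assignments - surplus
print(assignments, surplus, distinct_studies)
```

Under these assumptions the 86 assignments reduce to 82 distinct studies, matching the number of selected primary studies.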

Apart from the evaluation work motivated by Cloud Resource Exploration, we found that about three times as much attention was paid to Scientific Computing in the Cloud (40 studies) as to Business Computing in the Cloud (13 studies). In fact, the studies aiming at Comparison between Computing Paradigms also intended to use Scientific Computing for their discussion and analysis [CHS10, KJM⁺09]. Given that Cloud Computing emerged as a business model (Zhang et al., 2010), public Cloud services are provided mainly to meet the technological and economic requirements of business enterprises, which does not match the characteristics of scientific computing workloads [HZK⁺10, OIY⁺09]. However, the study distribution over purposes (cf. Table 6) suggests that commercial Cloud Computing is still regarded as a potential and encouraging paradigm for dealing with academic issues. We can find a set of reasons for this:

• Since the relevant studies were all identified from academia (cf. Section 7), intuitively, Scientific Computing may seem more academic than Business Computing in the Cloud for researchers.

• Although the public Cloud is deficient for Scientific Computing on the whole due to its relatively poor performance and significant variability [BIN10, JRM⁺10, OIY⁺09], smaller-scale computations can particularly benefit from the moderate computing capability of the Cloud [CHS10, HH09, RVG⁺10].

• The on-demand resource provisioning in the Cloud can satisfy some high-priority or time-sensitive requirements of scientific work when in-house resource capacity is insufficient [CHS10, Haz08, OIY⁺09, WWDM09].

• It would be more cost effective to carry out temporary jobs on Cloud platforms to avoid the associated long-term overhead of powering and maintaining local computing systems [CHS10, OIY⁺09].

• Through appropriate optimizations, the current commercial Cloud can be improved for Scientific Computing [EH08, OIY⁺09].

• Once commercial Cloud vendors pay more attention to Scientific Computing, they can make the current Cloud more academia-friendly by slightly changing their existing infrastructures [HZK⁺10]. Interestingly, the industry has acknowledged the academic requirements and started offering services for solving complex science/engineering problems (Amazon, 2011).

5.2. RQ2: What commercial Cloud services have been evaluated?

Evaluations are based on services available from specific Cloud providers. Before discussing the individual Cloud services, we identify the service providers. Nine commercial Cloud providers have been identified in this SLR: Amazon, BlueLock, ElasticHosts, Flexiant, GoGrid, Google, IBM, Microsoft, and Rackspace. Mapping the 82 primary studies to these nine providers, as shown in Fig. 8, shows that the commercial Cloud services attracting the most evaluation effort are provided by Amazon. This is reasonable because Amazon has been treated as one of the top and key Cloud Computing providers in both industry and academia (Buyya et al., 2009; Zhang et al., 2010). Note that one primary study may cover more than one Cloud provider.

With different public Cloud providers, we have explored the evaluated Cloud services in the reviewed publications, as listed in Appendix B. Note that the Cloud services are identified according to their commercial definitions instead of functional descriptions. For example, the work [HLM⁺10] explains Azure Storage Service and Azure Computing Service respectively, whereas we treated them as two different functional resources in the same Windows Azure service. The distribution of reviewed publications over detailed services is illustrated in Fig. 9. Similarly, one primary study may perform evaluation of multiple commercial Cloud services. In particular, five services (namely Amazon EBS, EC2 and S3, Google AppEngine, and Microsoft Windows Azure) were the most frequently evaluated services compared with the others. Therefore, they can be viewed as the representative commercial Cloud services, at least in the context of Cloud services evaluation. Note that bias could be involved in the service identification in this work due to the pre-specified providers in the search string, as explained in Section 7.3.

Fig. 8. Distribution of primary studies over Cloud providers (axes: provider; percentage of primary studies (amount)).

Among these typical commercial Cloud services, Amazon EBS, EC2 and S3 belong to IaaS, Google AppEngine is PaaS, while Microsoft Windows Azure is recognized as a combination of IaaS and PaaS [ZLK10]. IaaS is the on-demand provisioning of infrastructural computing resources, and its most significant advantage is flexibility [BKKL09]. PaaS refers to the delivery of a platform-level environment including operating system, software development frameworks, and readily available tools, which limits customers' control while taking complete responsibility for maintaining the environment on behalf of customers [BKKL09]. The study distribution over services (cf. Fig. 9) indicates that IaaS attracts more attention from evaluation work than PaaS. Such a finding is essentially consistent with the previous discussion when answering RQ1: the flexible IaaS may better fit the diverse Scientific Computing. In fact, niche PaaS and SaaS are designed to provide additional benefits for their target applications, while IaaS is more immediately usable for particular and sophisticated applications [JD11] (Harris, 2012). In other words, given the diversity of requirements in the Cloud market, IaaS and PaaS would serve different types of customers, and they cannot replace each other. This finding can also be confirmed by a recent industry event: the traditional PaaS provider Google just offered a new IaaS, Compute Engine (Google, 2012).

Fig. 9. Distribution of primary studies over Cloud services.

Table 8
Distribution of metrics over Cloud service aspects/properties (based on Li et al., 2012c and updated).

Aspect        Property                  #Metrics
Performance   Communication             9
              Computation               7
              Memory (Cache)            7
              Storage                   11
              Overall performance       18
Economics     Cost                      18
              Elasticity                4
Security      Authentication            1
              Data security             3
              Infrastructural security  1
              Overall security          1

5.3. RQ3: What aspects and their properties of commercial Cloud services have been evaluated?

The aspects of commercial Cloud services can be initially investigated from general surveys and discussions about Cloud Computing. In brief, from the view of Berkeley (Armbrust et al., 2010), the Economics of Cloud Computing should be particularly emphasized in deciding whether or not to adopt the Cloud. Therefore, we considered Economics as an aspect when evaluating commercial Cloud services. Meanwhile, although we do not agree with all the parameters identified for selecting Cloud Computing/Provider in Habib et al. (2010), we accepted Performance and Security as two significant aspects of a commercial Cloud service. Such an initial investigation of service aspects has been verified by this SLR. Only Performance, Economics, and Security and their properties have been evaluated in the primary studies.

The detailed properties and the corresponding distribution of primary studies are listed in Table 7. Note that a primary study usually covers multiple Cloud service aspects and/or properties. In particular, we only take into account the physical properties for the Performance aspect in this paper. The capacities of different physical properties and their sophisticated correlations (cf. Fig. 10) have been specified in our previous work (Li et al., 2012a).

Fig. 10. The properties of the performance aspect (from Li et al., 2012a). Physical Property Part: Computation, Communication, Storage, Memory (Cache). Capacity Part: Availability, Latency (Time), Data Throughput (Bandwidth), Transaction Speed, Reliability, Variability, Scalability.

Overall, we find that the existing evaluation work overwhelmingly focused on the performance features of commercial Cloud services. Many other theoretical concerns about commercial Cloud Computing, Security in particular, have not yet been well evaluated in practice. Given the study distribution over service aspects/properties (cf. Table 7), several research gaps can be revealed or confirmed:

• Since memory/cache works closely with the computation and storage resources in computing jobs, it is hard to distinguish exactly the effect on performance brought by memory/cache, which may be the main reason why few dedicated Cloud memory/cache evaluation studies were found in the literature. In addition to memory performance, the memory hierarchy could be another interesting issue to be evaluated [OIY⁺09].

• Although one major benefit claimed for Cloud Computing is elasticity, it seems difficult for people to know how elastic a Cloud platform is. In fact, evaluating the elasticity of a Cloud service is not trivial (Kossmann and Kraska, 2010), and there is little explicit measurement to quantify the amount of elasticity in a Cloud platform (Islam et al., 2012).

• The security of commercial Cloud services has many dimensions and issues that people should be concerned with (Armbrust et al., 2010; Zhang et al., 2010). However, not many security evaluations were reflected in the identified primary studies. Similar to the above discussion about elasticity evaluation, the main reason may be that security is also hard to quantify (Brooks, 2010).

Therefore, we conclude that the Elasticity and Security evaluation of commercial Cloud services could be a long-term research challenge.

5.4. RQ4: What metrics have been used for evaluation of commercial Cloud services?

Benefiting from the above investigation of aspects and their properties of commercial Cloud services, we can conveniently identify and organize their corresponding evaluation metrics. In fact, more than 500 metrics, including duplications, have been isolated from the experiments described in the primary studies. After removing the duplications, we categorized and arranged the metrics following the aforementioned Cloud service aspects/properties. Note that we judged duplicate metrics according to their usage contexts instead of their names. Some metrics with different names could be essentially duplicates, while some metrics with identical names should be distinguished if they are used for different evaluation objectives. For example, the metric Upload/Download Data Throughput has been used for evaluating both Communication [Haz08] and Storage [PIRG08], and therefore it was arranged under both Cloud service properties.
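The rule of judging duplicates by usage context rather than by name can be sketched as a keyed grouping: a metric is identified by the pair (Cloud service property, canonical metric name), so the same name under two properties yields two entries, while two names for the same measurement collapse into one. This is only an illustration under stated assumptions; the alias table, the "Data Transfer Rate" name, and the "EXAMPLE" study key are hypothetical, and only the Upload/Download Data Throughput example comes from the text.

```python
from collections import defaultdict

# Hypothetical alias table: different names that denote the same metric.
ALIASES = {"Data Transfer Rate": "Upload/Download Data Throughput"}


def build_catalogue(records):
    """Group (metric name, property, study) records by usage context.

    The key is (property, canonical name), so an identical name used for
    different properties stays distinct, and aliased names are merged.
    """
    catalogue = defaultdict(set)
    for name, prop, study in records:
        canonical = ALIASES.get(name, name)
        catalogue[(prop, canonical)].add(study)
    return dict(catalogue)


records = [
    ("Upload/Download Data Throughput", "Communication", "Haz08"),
    ("Upload/Download Data Throughput", "Storage", "PIRG08"),
    ("Data Transfer Rate", "Storage", "EXAMPLE"),  # hypothetical record
]

catalogue = build_catalogue(records)
for (prop, metric), studies in sorted(catalogue.items()):
    print(prop, "|", metric, "|", sorted(studies))
```

In this sketch the three raw records reduce to two catalogue entries, one under Communication and one under Storage.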

Due to the limit of space, we do not elaborate on all the identified metrics in this paper. In fact, we have summarized the existing evaluation metrics into a catalogue to facilitate future practice and research in the area of Cloud services evaluation (Li et al., 2012c). Here we only give a quick impression of their usage by displaying the distribution of those metrics, as shown in Table 8.