II. The NIST Definition
2. Essential Characteristics
The NIST definition of cloud computing specifies five essential characteristics: (1) on-demand self-service, (2) broad network access, (3) resource pooling, (4) rapid elasticity, and (5) measured service (Mell & Grance 2011:2). These characteristics are constituent features of cloud computing systems. Thus, if a computing system delivers computing capabilities with these characteristics, it can be referred to as a cloud computing system.
The following subsections discuss these characteristics in detail.
2.1. On-demand Self-service
The on-demand self-service characteristic is defined as one where
“[a] consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service provider” (Mell & Grance 2011:2).
This characteristic thus specifies the temporal properties with which computing capabilities can be made available and the mode of interaction between consumers and providers to make them available.
On-demand can generally be understood as making something available “when requested or needed” (Merriam-Webster 2016:“on demand”) and “at any time” (Cambridge University Press 2016:“on demand”). It implies a low lead time for computing capabilities to be delivered and the lack of temporal constraints regarding when they can be delivered. As a result, cloud consumers are no longer required to plan far ahead (Armbrust et al. 2010:51; Zhang et al. 2010:7).
Self-service refers to a delivery concept in which consumers can access the provider’s services through a technological interface without the direct involvement of the provider’s staff (Meuter et al. 2000:50). In cloud computing, consumers can access computing capabilities through web portals that allow them to identify, select, order, and consume computing capabilities in a self-service manner (Cisco Systems Inc. 2011:2) and manage and monitor their own resource consumption at any time (Vernier et al. 2011). Cisco, a manufacturer of networking equipment, suggests structuring cloud computing web portals by means of service catalogs that contain an overview of the computing capabilities a provider offers (Cisco Systems Inc. 2011:2). Such catalogs describe computing services in a standardized way including, inter alia, information about pricing, service level commitments, and terms and conditions of service use (Cisco Systems Inc. 2011:2). To support self-service, the information included in service catalogs must be sufficient for consumers to decide unilaterally which service(s) to purchase to address their needs.
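The role of a service catalog in self-service can be illustrated with a minimal Python sketch. It is not taken from Cisco or any provider: the catalog entries, field names, prices, and the URL are hypothetical and chosen only to mirror the three kinds of information named above (pricing, service level commitments, and terms of use).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One standardized service description (all fields are hypothetical)."""
    name: str
    price_per_hour: float     # pricing information
    availability_sla: float   # service level commitment, e.g. 0.999
    terms_url: str            # terms and conditions of service use


# Hypothetical catalog; names, prices, and URLs are illustrative only.
CATALOG = {
    "small-vm": CatalogEntry("small-vm", 0.05, 0.995, "https://provider.example/terms"),
    "large-vm": CatalogEntry("large-vm", 0.40, 0.999, "https://provider.example/terms"),
}


def select(max_price: float, min_sla: float) -> list[str]:
    """Return the names of all entries that meet the consumer's constraints."""
    return sorted(
        name
        for name, entry in CATALOG.items()
        if entry.price_per_hour <= max_price and entry.availability_sla >= min_sla
    )


def order(name: str, hours: int) -> dict:
    """Place an order without any human interaction on the provider side."""
    entry = CATALOG[name]  # raises KeyError for services not in the catalog
    return {"service": entry.name, "hours": hours, "cost": entry.price_per_hour * hours}
```

Because every entry carries the same standardized fields, a consumer can identify, select, and order a service programmatically; a consumer would call, for example, `select(0.10, 0.99)` and then `order("small-vm", 10)`.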
On-demand and self-service are not new concepts, per se. However, cloud computing combines these concepts, creating significant flexibility for consumers: They neither need to plan ahead nor coordinate with a provider in person to satisfy newly emerging computing demands. Self-service makes the acclaimed concept of on-demand delivery tangible to consumers (Businesscloud.de 2012). The combination of these characteristics may also be why cloud computing is often described as “user friendly” (Vaquero et al. 2009:51ff.) and “convenient” (Mell & Grance 2011:2).
2.2. Broad Network Access
The characteristic of broad network access is defined as:
“Capabilities [that] are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, tablets, laptops, and workstations)” (Mell & Grance 2011:2).
Cloud computing harnesses the network, especially the internet, as a primary channel to deliver computing capabilities (Zhang et al. 2010:11). This has become possible because network costs have declined while network bandwidth, speed, and geographic coverage have increased significantly over recent decades (Williams 2012:5f.). Today, countless WLAN hot spots and mobile phone networks with high bandwidth provide ubiquitous internet access. Therefore, broad network access can be seen as both “a trait of cloud computing and as an enabler” (Williams 2012:6).
2.3. Resource Pooling
The characteristic of resource pooling is explained as follows:
“The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter)” (Mell & Grance 2011:2).
Resource pooling ultimately aims to realize economies of scale by increasing the utilization of resources (Zhang et al. 2010:11). In cloud computing, resource pooling extends to sharing computing infrastructure (e.g. storage capacity, processing power, memory, and network bandwidth) and applications (Mell & Grance 2011:2; Dillon et al. 2010:27ff.). The former is achieved through (1) virtualization, and the latter through (2) multi-tenancy.
Virtualization enables the sharing of computing infrastructure and is considered in more detail below (see Section D.III.2). For now, it is sufficient to understand the basic principle. Virtualization means the simulation of multiple computer systems (referred to as “virtual machines”) on a single physical machine (Goldberg 1974:34). Each consumer is then allocated a dedicated virtual machine, thus giving him or her the illusion of actually having a dedicated physical machine at hand (Sugerman et al. 2001:1). Consumers are thus logically isolated by means of virtual machines, but share the same physical hardware resources. Physical resources are dynamically assigned to each virtual machine (and thus to each consumer) according to current demand, thereby increasing physical resource utilization (Armbrust et al. 2010:52f.) and lowering costs (Zhang et al. 2010:7ff.). In fact, cloud computing massively exploits virtualization technology to achieve economies of scale by consolidating infrastructure in very large data centers at low-cost locations (Armbrust et al. 2010:52) while delivering computing capabilities to any location in the world via the internet. To enable consolidation and dynamic (re-)assignment of resources, providers require flexibility in managing resources, which is why consumers are only given the possibility to specify the geographic location of resources at a higher level of abstraction.
Multi-tenancy enables the sharing of applications. It is achieved via a design feature in the software architecture that allows multiple users, referred to as “tenants,” to simultaneously use the same application and database instance (Bezemer & Zaidman 2010:88), where each tenant is given the illusion of being the only consumer using the software, and other concurrent tenants remain invisible (Wilder 2012). Thus, unlike virtualization, multi-tenancy logically isolates different users on the application level. The larger the number of tenants that share an application, the more likely it is that tenants have slightly varying requirements for the shared application. Multi-tenant applications therefore need to allow tenants to configure and customize at least certain components of the application according to their needs. Generally speaking, multi-tenant applications need to make a tradeoff between standardization and customization. To achieve tenant-specific adjustments while sharing the same application, Mietzner et al. (2008:156ff.) propose that multi-tenant applications should consist of two parts: one that is fixed for all tenants and one that is customizable by each tenant. Customization can be achieved using so-called “application templates,” which define how tenants can adjust software components (modules) according to pre-defined alternatives. Service modularity, which is discussed in more detail below (see Section D.III.3), is thus a very important method for enabling application sharing (Azeez et al. 2010).
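The split into a fixed part and a tenant-customizable part can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the design proposed by Mietzner et al.: the setting names, the template contents, and the class `SharedApplication` are hypothetical, and a single in-memory object stands in for the shared application and database instance.

```python
# Fixed part: identical for every tenant (hypothetical settings).
FIXED_SETTINGS = {"storage_backend": "shared-db", "api_version": "v1"}

# Hypothetical "application template": for each customizable component,
# the pre-defined alternatives a tenant may choose from (first = default).
TEMPLATE = {
    "theme": ["light", "dark"],
    "invoice_module": ["basic", "eu-vat"],
}


class SharedApplication:
    """A single application instance that serves many tenants."""

    def __init__(self):
        self._tenant_choices = {}  # tenant id -> chosen alternatives
        self._records = {}         # tenant id -> that tenant's data

    def customize(self, tenant, component, choice):
        """Let a tenant pick one of the pre-defined alternatives."""
        if choice not in TEMPLATE.get(component, []):
            raise ValueError(
                f"{choice!r} is not a pre-defined alternative for {component!r}"
            )
        self._tenant_choices.setdefault(tenant, {})[component] = choice

    def config_for(self, tenant):
        """Merge the fixed part, the template defaults, and tenant choices."""
        defaults = {component: options[0] for component, options in TEMPLATE.items()}
        return {**FIXED_SETTINGS, **defaults, **self._tenant_choices.get(tenant, {})}

    def store(self, tenant, record):
        self._records.setdefault(tenant, []).append(record)

    def records_for(self, tenant):
        # Logical isolation: a tenant only ever sees its own records.
        return list(self._records.get(tenant, []))
```

Restricting `customize` to the alternatives listed in the template is one way to express the tradeoff between standardization and customization: tenants can vary the application, but only along dimensions the provider has anticipated.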
2.4. Rapid Elasticity
The characteristic of rapid elasticity is defined as
“[c]apabilities [that] can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time” (Mell & Grance 2011:2).
Elasticity seems to be the pivotal characteristic that distinguishes cloud computing from other computing paradigms (Galante & Bona 2012:263). As Owens (2010:46) remarks,
“[e]lasticity, in my very humble opinion, is the true golden nugget of cloud computing and what makes the entire concept extraordinarily evolutionary, if not revolutionary.”
In fact, elasticity is so important that Amazon even incorporates the term into the name of its cloud computing service, “Amazon Elastic Compute Cloud.”
Elasticity and scalability are often used as synonyms in the context of cloud computing; however, the concepts are different “and should never be used interchangeably” (Galante & Bona 2012:263). In fact, scalability is a prerequisite for elasticity (Herbst et al. 2013:24f.).
Scalability refers to a system’s ability to cope with an increasing workload in a graceful manner or its ability to increase throughput when additional resources are added (see e.g. Bondi 2000:195; Agrawal et al. 2011:5; Galante & Bona 2012:263; Herbst et al. 2013:25). Some systems have an upper scalability bound, which refers to the maximum number of resources that can be added to the system (Herbst et al. 2013:24). Scalability is a static property because it does not specify any temporal aspects regarding how fast, how often, and how many resources can be added to or removed from the system (Galante & Bona 2012:263; Agrawal et al. 2011:10).
Elasticity, by contrast, is a dynamic property that particularly focuses on the temporal aspects of the adaptation process when resources are added or removed (Galante & Bona 2012:263; Agrawal et al. 2011:10). It can be defined as
“the degree to which a system is able to adapt to workload changes by provisioning and deprovisioning resources in an autonomic manner, such that at each point in time the available resources match the current demand as closely as possible” (Herbst et al. 2013:24).
The nature of the adaptation processes can be characterized along two dimensions: (a) speed of scaling, which refers to the time necessary to move from an under-provisioned state to an optimal or over-provisioned state (scaling up) or to move from an over-provisioned state to an optimal or under-provisioned state (scaling down); and (b) precision of scaling, which refers to the absolute difference between the current resources allocated and those actually being demanded (Herbst et al. 2013:24f.). Cloud computing systems are defined to scale “rapidly,” which means that scaling happens with a lead time of minutes (Armbrust et al. 2010:53) or even in real time (Matros 2012:60f.). These prompt resource reconfigurations are technically enabled primarily through virtualization (Verma et al. 2010:11). In terms of precision, Armbrust et al. (2010:53) emphasize that cloud computing systems can add and remove resources at a “fine grain.” Thus, elasticity in cloud computing ultimately allows for significantly reducing (and ideally avoiding) under- and over-provisioning of resources by means of rapid and precise capacity adaptation processes.
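The two dimensions can be made concrete with a small Python sketch that operates on two equally long time series: the resources demanded and the resources allocated at each time step. The sketch is a simplification under stated assumptions rather than the metric definitions of Herbst et al.: time is discretized into steps, precision is averaged over the whole series, and the function names and sample data are hypothetical.

```python
def precision(demand, supply):
    """Mean absolute difference between allocated and demanded resources."""
    return sum(abs(s - d) for d, s in zip(demand, supply)) / len(demand)


def _episode_lengths(flags):
    """Lengths of consecutive runs of True values that are later resolved."""
    lengths, current = [], 0
    for flag in flags:
        if flag:
            current += 1
        elif current:
            lengths.append(current)
            current = 0
    return lengths  # a run still open at the end of the series is not counted


def scale_up_times(demand, supply):
    """Time steps needed to leave each under-provisioned state."""
    return _episode_lengths(s < d for d, s in zip(demand, supply))


def scale_down_times(demand, supply):
    """Time steps needed to leave each over-provisioned state."""
    return _episode_lengths(s > d for d, s in zip(demand, supply))


# Hypothetical sample data: resource units per time step.
demand = [2, 2, 5, 5, 5, 3, 3]
supply = [2, 2, 2, 4, 5, 5, 3]

print(precision(demand, supply))
print(scale_up_times(demand, supply))
print(scale_down_times(demand, supply))
```

In this sample, the system stays under-provisioned for two time steps after demand rises and over-provisioned for one step after demand falls; a more elastic system would shorten both episodes and bring the mean deviation closer to zero.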
Rapid elasticity is closely related to resource pooling. Elastic resource provisioning would be neither feasible nor economically viable without significant pooling of resources, as is done, for example, in extremely large data centers with hundreds of thousands of physical machines (Armbrust et al. 2010:52). The illusion of an infinite resource pool can only be achieved by creating resource pools large enough that individual consumers’ capacity fluctuations do not reach the upper scalability bound of the system. Moreover, this illusion can only be achieved in an economically viable manner if providers can dynamically (re-)assign resources to different consumers in accordance with their current capacity demand and thus use the aggregate capacity of these large resource pools efficiently (Verma et al. 2010:11).
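Why dynamic (re-)assignment makes large pools economical can be shown with simple arithmetic. The Python sketch below compares the capacity needed when every consumer receives dedicated resources sized for its own peak with the capacity needed when all consumers share one pool sized for the aggregate peak. The demand figures and function names are hypothetical, and the sketch assumes that resources can be reassigned between consumers at every time step without overhead.

```python
def dedicated_capacity(demands):
    """Capacity needed if each consumer is sized for its own peak."""
    return sum(max(series) for series in demands.values())


def pooled_capacity(demands):
    """Capacity needed if one shared pool is sized for the aggregate peak."""
    return max(sum(step) for step in zip(*demands.values()))


def utilization(demands, capacity):
    """Share of the provisioned capacity that is actually used over time."""
    steps = len(next(iter(demands.values())))
    total_used = sum(sum(series) for series in demands.values())
    return total_used / (capacity * steps)


# Hypothetical demand (resource units per time step); peaks do not coincide.
demands = {
    "consumer_a": [8, 2, 2, 2],
    "consumer_b": [2, 8, 2, 2],
    "consumer_c": [2, 2, 8, 2],
}

print(dedicated_capacity(demands), utilization(demands, dedicated_capacity(demands)))
print(pooled_capacity(demands), utilization(demands, pooled_capacity(demands)))
```

In this example, the pool needs half the capacity of the dedicated setup because the consumers' peaks do not coincide, and its utilization doubles accordingly. The benefit shrinks as demand peaks become more correlated.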
2.5. Measured Service
The characteristic of measured service is explained as follows:
“Cloud systems automatically control and optimize resource use by leveraging a metering capability¹ at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service” (Mell & Grance 2011:2, footnote original).
Metering is what makes pay-per-use charging possible. Although the idea of pay-per-use appears intuitive, it may be confused with renting resources. However, as Armbrust et al. (2010:53) explain, the two are essentially different: Renting resources means paying a negotiated price for the right to use them for a certain period of time, regardless of whether or not the resources are actually used. Conversely, pay-per-use means that actual resource usage is metered, and consumers pay only for the resources that are actually used, regardless of the period of time over which the usage has occurred. The measured service characteristic is anchored in the concept of utility computing, which aims to provide computing capabilities in a pay-per-use manner, similarly to the way utility services are provided for gas, water, and electricity (Buyya et al. 2008:5) without requiring consumers to make any upfront commitments (Armbrust et al. 2010:51f.; Zhang et al. 2010:7).
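The difference between renting and pay-per-use can be illustrated with a short Python calculation. The figures are hypothetical, and the sketch assumes, purely for comparability, the same price per unit-hour under both models; in practice, the two prices need not be equal.

```python
def rental_cost(units_reserved, hours_reserved, price_per_unit_hour):
    """Renting: pay for the reserved capacity over the whole period,
    regardless of whether the resources are actually used."""
    return units_reserved * hours_reserved * price_per_unit_hour


def pay_per_use_cost(metered_usage, price_per_unit_hour):
    """Pay-per-use: pay only for the metered usage.

    metered_usage lists the resource units actually used in each hour.
    """
    return sum(metered_usage) * price_per_unit_hour


# Hypothetical metering data for six hours; price in currency units.
usage = [0, 0, 4, 10, 4, 0]
price = 2

# A renter must reserve enough units to cover the peak hour.
print(rental_cost(max(usage), len(usage), price))
print(pay_per_use_cost(usage, price))
```

With these figures, the renter pays for 60 unit-hours, whereas the pay-per-use consumer pays for the 18 unit-hours that were metered.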