
WO2019215676A1 - A model-driven approach for design time generation and runtime usage of elasticity rules - Google Patents


Info

Publication number
WO2019215676A1
Authority
WO
WIPO (PCT)
Prior art keywords
action
entities
network node
elasticity
runtime
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/IB2019/053846
Other languages
French (fr)
Inventor
Mahin Abbasipour
Maria Toeroe
Ferhat Khendek
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget LM Ericsson AB filed Critical Telefonaktiebolaget LM Ericsson AB
Publication of WO2019215676A1 publication Critical patent/WO2019215676A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources

Definitions

  • Embodiments of the invention relate to the automatic generation of elasticity rules at system configuration design time and runtime usage of the elasticity rules.
  • A Service Level Agreement (SLA) is a contract negotiated and agreed on between a service provider and a customer; it defines the expected quality of the services provided. For instance, the level of service availability, i.e. the percentage of time the service is provided in a given period of time, may be part of the SLA. The rights and obligations of each party are also described in the SLA. When any of the parties fails to meet their respective obligations, SLA violations occur and the responsible party may be subject to penalties.
  • An elasticity rule may provide different actions that are applicable and can be performed in different situations.
  • An elasticity rule is generally invoked by a trigger, which is generated in reaction to a monitoring event.
  • Elasticity rules may be defined online with the help of agents that watch and learn about the behavior and the usage of the system using machine learning techniques.
  • a reward function can be used to evaluate the effectiveness of actions that the agents take towards a system goal. The goal of the agents, however, is different from the system goal. The agents try to maximize the rewards and therefore learn and repeat the actions that have the greatest return.
  • a drawback is that if the agents learn fast, they may learn problems such as Denial of Service (DoS) attacks and, therefore, they may incorporate the problems in the elasticity rules as well. In this case, the elasticity rules need to be updated and the system needs to be brought back to its normal state.
  • machine learning techniques can be very resource hungry and, therefore, can be inefficient with respect to resource usage.
  • a method for runtime reconfiguration of a system according to elasticity rules.
  • the elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process.
  • the method comprises: invoking one or more of the elasticity rules in response to a trigger; evaluating one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system; and executing the actions to scale the system with respect to one or more entities among service provider entities and service entities.
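The claimed method (invoke on trigger, evaluate formulas, execute scaling actions) can be sketched as follows. This is an illustrative reading only: the names `ElasticityRule`, `reconfigure` and the trigger dictionary layout are assumptions, not from the application.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ElasticityRule:
    """Hypothetical design-time-generated rule: the formula is the
    dimensioning formula re-evaluated with runtime measurements."""
    entity_type: str
    scaling_type: str                          # "Increase" or "Decrease"
    formula: Callable[[Dict[str, float]], int]
    action: Callable[[int], None]              # scales the system entities

def reconfigure(rules: List[ElasticityRule], trigger: Dict) -> List[int]:
    """Invoke the rules matching the trigger, evaluate their formulas
    with current measurements, and execute the resulting actions."""
    results = []
    for rule in rules:
        if (rule.entity_type == trigger["entity_type"]
                and rule.scaling_type == trigger["scaling_type"]):
            target = rule.formula(trigger["measurements"])  # new config value
            rule.action(target)                             # scale the system
            results.append(target)
    return results

# Example rule: re-applies #ActiveAssignments = ceil(Workload / AssignmentRate)
scale_out = ElasticityRule(
    entity_type="ServiceTypeA", scaling_type="Increase",
    formula=lambda m: math.ceil(m["workload"] / m["assignment_rate"]),
    action=lambda n: None)  # placeholder: would add/remove assignments
```

A real implementation would also evaluate conditions and prerequisites before executing an action; those elements are sketched separately below the metamodel description.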
  • a network node comprising processing circuitry and memory.
  • the memory contains instructions executable by the processing circuitry to reconfigure a system at runtime according to elasticity rules.
  • the network node is operative to perform the aforementioned method.
  • a network node operable to generate elasticity rules for a system.
  • the elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process.
  • the network node comprises a rule invocation module operative to invoke one or more of the elasticity rules in response to a trigger; an evaluation module operative to evaluate one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system; and an execution module operative to execute the actions to scale the system with respect to one or more entities among service provider entities and service entities.
  • Figure 1 illustrates an example of a system configuration at runtime according to one embodiment.
  • Figure 2 illustrates a portion of a configuration model for the example in Figure 1 according to one embodiment.
  • Figure 3 illustrates an elasticity rule metamodel according to one embodiment.
  • Figure 4 illustrates an extended configuration generation process according to one embodiment.
  • Figure 5 is a block diagram illustrating an overview of the design time generation and runtime usage of elasticity rules according to one embodiment.
  • Figure 6 is a flow diagram illustrating a method for runtime reconfiguration of a system according to elasticity rules according to an embodiment.
  • Figure 7 is a block diagram of a network node according to one embodiment.
  • Figure 8 is a block diagram of a network node according to another embodiment.
  • Figure 9 is an architectural overview of a cloud computing environment according to one embodiment.
  • Embodiments of the invention provide runtime reconfiguration of a system using elasticity rules.
  • the elasticity rules are defined and generated offline during a system configuration generation process using a model-driven approach. More specifically, the elasticity rules are defined and generated when the system is dimensioned and configured for providing required services with the highest expected workload. As a result, there is a clear relation between the generated elasticity rules and the system configuration to allow the elasticity rules to be traced if necessary. Moreover, the elasticity rules allow the rearrangement of resources as well as the addition and removal of the resources.
  • a set of formulas are used to dimension the system based on the required maximum workload to be supported by the system.
  • the elasticity rules including their associated conditions, actions, pre-requisites and follow-ups are generated for the configured entities or their types together with the appropriate thresholds.
  • the action of an applicable and feasible elasticity rule is executed, which recalculates the formula used at configuration generation, but with the current measurements of the system, thus providing a new configuration value for use under the changed circumstances.
  • the elasticity rules can be used to rearrange configuration entities according to the distribution principles of the system.
  • the dimensions of the system i.e. the maximum eligible number of entities in the system, are determined according to one or more formulas based on the characteristics of these entities, their relations and the maximum required workload to service. This information is captured in the form of equations which are used not only for system dimensioning and configuration but also for the definition of the elasticity rules that govern the system dynamic reconfiguration within the dimensioned scope. Since at runtime several elasticity rules may be invoked simultaneously, a framework is provided for an elasticity rule structure that allows action correlation to avoid conflicting reconfiguration actions.
  • a system configuration describes the entities composing the system, their relationships and their characteristics.
  • the entities can be classified into two categories: service entities to represent the provided services and service provider entities to describe the resources that can provide those services.
  • service entities to represent the provided services
  • service provider entities to describe the resources that can provide those services.
  • common characteristics of a set of entities are abstracted as entity types, which are also part of the system configuration.
  • Such a system can be viewed from two perspectives: service side and service provider side.
  • a workload unit (WU) is a service entity representing a chunk of the workload.
  • service provider entities are pieces of application software, physical or virtual computing nodes, which are tangible resources.
  • the application software entities are represented as serving units (SUs) capable of supporting/providing WUs. Each serving unit is deployed on a computing node.
  • the relation between service entities and service provider entities may or may not be one-to-one, and service entities can be assigned and re-assigned dynamically to different service provider entities, even to multiple of them simultaneously.
  • the provider side contains entities to provide and also to protect the service entities in case of provider side failure.
  • High availability is a basic requirement for carrier grade services. Accordingly, a set of redundant serving units can be grouped into a work pool, which can be organized with different redundancy models, namely 2N, N+M, N-way active, N-way and No redundancy.
  • a workload unit may have one or more active and zero or more standby assignments; each of which is assigned to a different serving unit of a work pool.
  • a serving unit with an active assignment provides the service.
  • a serving unit with a standby assignment does not provide the service but is ready to become active in a minimal amount of time.
  • the assignment rate of a workload unit is the workload capacity (e.g. the number of requests per second) represented by one active assignment of the workload unit.
  • an assignment rate of 400 for a workload unit means that each active assignment, when assigned, allocates a capacity capable of handling up to 400 requests per second.
  • if a serving unit with an active assignment fails, its assignment is automatically re-assigned to a redundant standby serving unit, if one exists.
  • the different serving units of a work pool are hosted on different physical nodes.
  • a work pool is deployed on a node group eligible for hosting its serving units. If the serving units are deployed on virtual computing nodes, the same principle applies to them, i.e. virtual computing nodes of a node group are to be hosted on different physical hosts. This imposes a constraint for the migration of a virtual computing node: the migration can happen only to physical hosts which are eligible for hosting the virtual computing node based on its node group and therefore the hosted serving units.
  • Table I summarizes the configuration entities and their definitions.
  • When a serving unit is not assigned any workload unit, it is removed from the system by “locking” it to reduce the resource/power consumption. The serving unit remains in the configuration, but it is said to be in the “locked” state and accordingly terminated or powered off. In contrast, when such a serving unit needs to be assigned some workload again, it is added back to the work pool (i.e. reconfiguring the work pool) by “unlocking” it. Thus, its state becomes “unlocked”, resulting in instantiation or power up, and the serving unit becomes available to provide services. An entire work pool or a node can also be added to (removed from) the system. In this case, the state of the work pool/node changes from “locked” to “unlocked” (from “unlocked” to “locked”). Similarly, on the service side, a workload unit can be in the “unlocked” or “locked” state depending on whether the chunk of workload it represents needs to be assigned or not.
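The lock/unlock behaviour described above can be sketched as a small state holder. The class and method names are illustrative assumptions, not identifiers from the application.

```python
class ServingUnit:
    """Minimal sketch of the "locked"/"unlocked" states of a serving unit:
    a locked unit stays in the configuration but is terminated/powered off."""
    def __init__(self, name: str):
        self.name = name
        self.state = "locked"       # initially not part of the work pool
        self.assignments = []       # active/standby assignments it supports

    def unlock(self):
        # Added back to the work pool: instantiated or powered up,
        # becoming available to take assignments.
        self.state = "unlocked"

    def lock(self):
        # Only a unit with no workload assigned may be removed
        # to reduce resource/power consumption.
        if self.assignments:
            raise ValueError("cannot lock a serving unit with assignments")
        self.state = "locked"
```

The same two-state pattern applies, per the text, to work pools, nodes and (on the service side) workload units.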
  • Figure 1 illustrates an example of a system configuration 100 at runtime according to one embodiment.
  • the availability of the service is maintained by assigning multiple active assignments (shown with solid lines in Figure 1) to different serving units of a protecting work pool (e.g., Work Pool1).
  • The “unlocked” and “locked” serving units of Work Pool1 are shown with solid and dashed rectangles, respectively.
  • Each serving unit is hosted on a separate node.
  • Serving Unit4 and its hosting node are“unlocked” to support the added assignment.
  • the SU and the node that were supporting the assignment are locked to save resources.
  • the service remains available. However, the total capacity is reduced until the failed SU is repaired. If the repair performed by the availability management is not successful, the increased load on the remaining serving units is handled by the elasticity management and the failed SU will be replaced by Serving Unit4, provided the maximum capacity was not already reached. Note that the maximum capacity of the system remains reduced until the failed unit is repaired.
  • Figure 2 illustrates a portion of a configuration model for the example in Figure 1 according to one embodiment.
  • WorkloadUnit1 realizes ServiceType1, which has three assignments and is protected by WorkPool1.
  • the redundancy model of this work pool is N-way active and each serving unit of WorkPool1 can handle at most one active assignment at a time.
  • the Service Availability Forum has defined configuration models (such as the configuration model shown in Figure 2) and a set of middleware services to enable the development of highly available systems and applications.
  • elasticity rules may be generated during the system configuration generation time.
  • the system configuration may be represented by a configuration model having a portion such as what is shown in Figure 2.
  • elasticity rules may be generated during the generation of a system configuration defined in another domain where similar principles hold.
  • The Elasticity Rule Metamodel: Figure 3 illustrates an elasticity rule metamodel 300 according to one embodiment.
  • entities can be adapted to workload changes at runtime.
  • elasticity rules are defined for entity types because instances of the same type share the same features and are subject to the same actions.
  • the EntityType metaclass 320 specifies the type of the configuration entities to which the elasticity rule applies.
  • An elasticity rule may consist of different actions 370, each applicable and feasible in a specific situation.
  • the applicability of an action is defined with a Boolean expression represented by the Condition metaclass 330 in the elasticity rule metamodel 300.
  • For an action to be considered for execution, its associated condition must evaluate to true. For example, to scale up a resizable virtual machine (VM), the condition checks whether the VM has not reached the maximum capacity that it can expand to. If the VM has already reached the maximum capacity, the scale up action is not applicable and cannot be considered for execution.
  • An applicable action may or may not be feasible. For instance, even though a VM has not yet reached its maximum capacity, the VM may not be expandable due to a lack of resources in the hosting node on which it depends. If some of these resources can be freed up, then the VM can be resized.
  • the Prerequisite metaclass 340 is defined to check the feasibility of an action. A prerequisite evaluating to false may be satisfied by first taking actions of other elasticity rules on sponsor entities. In this case, a trigger (which is an instance of the Trigger metaclass 350) to invoke the prerequisite elasticity rule is generated for providing the required sponsor resources first. Since prerequisite triggers initiate the allocation of prerequisite resources, the scalingType of these triggers is Increase.
  • a prerequisite Boolean expression may be defined as whether the hosting node has available resources for the VM to expand. If the hosting node has the resources for the VM to expand (i.e. the prerequisite evaluates to true), the corresponding action is taken to expand the VM capacity. If the hosting node lacks resources for the VM to expand (i.e. the prerequisite evaluates to false), then a trigger is generated to invoke the resource allocation action on the hosting node (i.e. the sponsor entity of the VM).
  • the Follow-Up metaclass 360 is defined to check whether to execute a follow-up action on a sponsor entity after the execution of an action on an entity (which depends on the sponsor entity).
  • a follow-up trigger (which is an instance of the trigger metaclass 350) may be generated to invoke an elasticity rule to execute a follow-up action on the sponsor entity. For instance, after the removal of workload units or assignments, a follow-up trigger may be generated to initiate an elasticity rule to remove any provider entity without assignments.
  • a follow-up trigger is generated when the scalingRule of the executed elasticity rule is Decrease.
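The interplay of condition (applicability), prerequisite (feasibility) and follow-up described above can be sketched as below. This is a reading of the metamodel, under the assumption that failed prerequisites and Decrease actions each raise a trigger on the sponsor entity; the function and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Action:
    """One action of an elasticity rule (metamodel sketch)."""
    condition: Callable[[dict], bool]      # applicability (Condition 330)
    prerequisite: Callable[[dict], bool]   # feasibility (Prerequisite 340)
    execute: Callable[[dict], None]        # the operation (Operation 380)
    scaling_type: str = "Increase"

def try_action(action: Action, ctx: dict, triggers: List[str]) -> bool:
    if not action.condition(ctx):
        return False                       # not applicable: skip entirely
    if not action.prerequisite(ctx):
        # Feasibility fails: generate a trigger to invoke the prerequisite
        # elasticity rule on the sponsor entity first (scalingType Increase).
        triggers.append("prerequisite:" + ctx["sponsor"])
        return False
    action.execute(ctx)
    if action.scaling_type == "Decrease":
        # After a Decrease, a follow-up trigger may release
        # now-unneeded sponsor resources.
        triggers.append("follow-up:" + ctx["sponsor"])
    return True
```

With the VM example from the text: the condition checks the VM is below its maximum size, and the prerequisite checks the hosting node (the sponsor) has free resources; when the prerequisite fails, a trigger is queued to allocate resources on the node first.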
  • An action contained in an elasticity rule is an operation 380, which has a method specified in a language such as the Object Constraint Language (OCL).
  • The method of an operation and the Boolean expressions of a condition, follow-up and prerequisite contain a number of parameters 390. These parameters belong to the entity type of the elasticity rule or its entities. The values of some of these parameters are set during the configuration generation process while others are obtained at runtime from the monitoring system or the configuration.
  • Each action of an elasticity rule has a cost.
  • the attribute midCost represents an approximate cost of the action and its value is the median of the minimum cost and the maximum cost. An action incurs the minimum cost when all the prerequisites are met; thus, it is the cost of the action. An action incurs the maximum cost when none of the prerequisites are met and all prerequisite actions are invoked.
  • the midCost for an action is calculated as part of the elasticity rule generation process. Recursively, all the prerequisite elasticity rules are generated with their actions to calculate it.
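A possible reading of the midCost computation is sketched below. The exact aggregation over prerequisites is an assumption: here the maximum cost adds the (recursively computed) maximum cost of every prerequisite action, and midCost is the midpoint of the minimum and maximum cost, matching the "median of the minimum cost and the maximum cost" wording for two values.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CostedAction:
    """Sketch of an action's cost attributes (names are assumptions)."""
    own_cost: float                          # cost when all prerequisites hold
    prerequisites: List["CostedAction"] = field(default_factory=list)

    def min_cost(self) -> float:
        # All prerequisites are met: only the action itself runs.
        return self.own_cost

    def max_cost(self) -> float:
        # No prerequisite is met: all prerequisite actions are invoked,
        # recursively, in addition to the action itself.
        return self.own_cost + sum(p.max_cost() for p in self.prerequisites)

    def mid_cost(self) -> float:
        return (self.min_cost() + self.max_cost()) / 2
```

For example, an action of cost 10 with one prerequisite action of cost 4 has min_cost 10, max_cost 14 and midCost 12.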
  • the elasticity rules may be used to reconfigure the system at runtime according to workload variations. As such, there is a tight coupling between the elasticity rules and the configuration. Therefore, the configuration generation process can be used to generate not only the configuration, but the elasticity rules as well.
  • the elasticity rules may be generated based on the formulas used for dimensioning the system (e.g., equations used for calculating lower and upper boundaries of configuration values). System dimensioning is a process or operation that determines the number of configuration entities on the service side as well as on the service provider side.
  • FIG. 4 illustrates an extended configuration generation process 400 according to one embodiment.
  • the user requirements 411, the service ontology 412 and the description of the available software (i.e. the software catalog 413) are used as input for the process 400.
  • the functional user requirements are decomposed (using the service ontology 412) to the level that they can be matched with serving unit types available in the software catalog supporting these functional requirements.
  • the matched serving unit types are the candidate serving unit types and at this stage they are“prototypes” as they allow for different deployment options.
  • the maximum workload that the system needs to be able to handle is one of the non-functional user requirements.
  • the maximum workload determines the service side of the system configuration in terms of the active capacity and is expressed as the number of active assignments of the different service types.
  • the service dimensioning step 420 considers the candidate serving unit types, which determine the service types to express the maximum requested workload to support. As part of the service side dimensioning, in step 420 the number of active assignments is calculated for each service type using appropriate equations.
  • When an equation is used for calculating the number of instances of an entity type, two elasticity rules are generated for the entity type: one with the scaling type Increase and one with the scaling type Decrease.
  • the equation used for the calculation is transformed into the operations of the actions of the generated elasticity rules 480 while the variables of the equations are transformed into the parameters of those operations. Further details of the elasticity rule generation are explained in detail later. To trigger these elasticity rules 480, thresholds are also generated.
  • a threshold represents a point at which one or more actions are taken to reconfigure the system. For example, when at runtime the workload for a workload unit of service type“A” reaches its maximum threshold, the elasticity rule with scaling type Increase is invoked for service type“A”. By executing the action of this elasticity rule, the equation based on which this action was defined (and was used to calculate the number of assignments of the service type) is re-applied with the parameters reflecting the current workload. As a result, the required number of active assignments for service type“A” is recalculated for the current workload. By changing the number of active assignments, the service side capacity of the system is reconfigured. Note that to actually perform this reconfiguration, prerequisites may need to be satisfied.
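The threshold mechanism just described can be sketched in a few lines: when the measured workload for a workload unit crosses its maximum threshold, the design-time equation is re-applied with the current workload. The function name and signature are assumptions for illustration.

```python
import math
from typing import Optional

def check_threshold(workload: float, max_threshold: float,
                    assignment_rate: float) -> Optional[int]:
    """Return the recalculated number of active assignments if the
    maximum threshold is crossed, else None (no reconfiguration)."""
    if workload <= max_threshold:
        return None
    # Re-apply the dimensioning equation with the runtime measurement:
    # #ActiveAssignments = ceil(Workload / AssignmentRate)
    return math.ceil(workload / assignment_rate)
```

As the text notes, actually applying the new number of assignments may still require prerequisites on the provider side to be satisfied first.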
  • the prototype selection step 430 considers the candidate serving unit types and their work pool types defined in the software catalog 413 for selecting those that can provide the requested availability of services. This requested availability is another non-functional user requirement. Based on availability estimations, the candidate serving unit types that cannot provide the service types with the requested level of availability are removed, together with the elasticity rules 480 generated for the service types they support. It is expected that the prototypes in the software catalog 413 are described by their vendor(s) in terms of their performance, availability and other characteristics, as well as their monitoring facilities that can be observed by a monitoring system. In this step, these metrics are extracted from the software catalog for the remaining candidate serving unit types and captured in a measurable metrics 432 model. The generated measurable metrics 432 model can be used later to specify the monitoring agents of a monitoring system.
  • in the type creation step 440, fully specified types are created for deployment by parametrization from the candidate prototypes offering different deployment options, and missing work pool types are added with an appropriate redundancy model. This in turn determines whether a workload unit of a service type can have one or many active assignments.
  • the required numbers of serving units and work pools are determined based on the previously calculated number of active assignments representing the maximum requested workload.
  • the system includes sufficient work pools and serving units to provide and protect all the active assignments. Therefore, this step involves the grouping of active assignments into workload units, adding the standby assignments necessary for the redundancy model and calculating the service provider entities for all.
  • the relation between the service side capacity and the service provider side capacity is often not 1:1, as the latter includes the active capacity as well as the capacity required for the protection of the active assignments by standbys and spare serving units. Since this step calculates the number of WPs and SUs using equations, the related elasticity rules 480 are also generated, as described in more detail later.
  • the required number of nodes (e.g. virtual) is calculated and entities among the nodes are distributed based on the cluster information 462 and templates 463.
  • the templates 463 are defined based on distribution principles. The distribution guarantees that SUs of the same work pool are configured on different virtual and physical nodes and this is maintained also in case of node migration.
  • the node configuration is generated.
  • the initial states of the entities, i.e. locked or unlocked, are set as part of the node configuration generation.
  • the elasticity rules 480 for the nodes are also generated.
  • a system configuration 470 is generated.
  • traceability information 490 is generated to maintain the relation between the generated configuration and the selections made as well as the elasticity rules generated. This traceability information could be used later, for example, to determine what part of the configuration needs to be regenerated when some user requirements change.
  • the workload that the system supports is specified as a range (e.g. minimum and maximum number of requests per second).
  • the configuration entities are generated for the maximum workload that the system can handle.
  • generating the configuration for the maximum workload does not mean that all entities of the configuration are instantiated in the system. Instead, it means that with all these entities instantiated (i.e. healthy and in the “unlocked” state) the system can handle the maximum workload according to the SLAs. This represents the configuration boundaries; therefore, the boundary thresholds are set at this point.
  • the workload may be not at the maximum.
  • the system may be initially dimensioned for the mid workload (i.e. median of the minimum and the maximum workload specified in the user requirements).
  • the initial capacity of the system is configured for handling the mid workload by locking configuration entities not needed to support any workload and setting the related attributes.
  • the values of different thresholds of the deployed system are determined at this step (i.e. the distribution of entities and node configuration generation step 460) to reflect the unlocked capacity of the system.
  • FIG. 5 is a block diagram illustrating an overview of the design time generation and runtime usage of elasticity rules according to one embodiment.
  • design time 510 (which is also referred to as the system configuration generation time)
  • a system configuration process is performed at step 511 using a set of formulas 550 including equations and inequalities.
  • the formulas 550 are used to dimension the system to thereby generate a system configuration 530 and elasticity rules 540 as described before with reference to Figure 4.
  • the system may be reconfigured at step 512 due to runtime events such as workload variations.
  • the system is reconfigured using the elasticity rules 540 generated at the design time 510 with the formulas 550 re-applied to entities of the system.
  • One or more of the formulas 550 may be evaluated using runtime measurements of the system and constants calculated during the system configuration generation process 511 as parameter values.
  • Threshold triggers are primarily issued on service entities (i.e. WUs) and computing nodes (either physical or virtual nodes).
  • the threshold triggers on service entities represent variations in the workload coming from users. As explained before, these triggers may lead to the generation of prerequisite and/or follow-up triggers on provider side entities.
  • the issued threshold trigger is not directly related to the workload variation; e.g. it may be related to the distribution of entities among the nodes.
  • each elasticity rule consists of action(s) and possibly condition, follow-ups, follow-up triggers, prerequisites and prerequisite triggers. Table II summarizes the parameters used to define these elements as well as their descriptions. These elements are specified in the process of generating the elasticity rules, which will be described in more detail later.
  • the entity types of the elasticity rules for the service side are the service types realized by workload units (e.g. ServiceType1 in Figure 2 is realized by WorkloadUnit1).
  • removeAssignment and removeWorkloadUnit are the actions defined for the case of workload decrease. These actions change the number of active assignments of an unlocked workload unit and/or the number of unlocked workload units in the system. These result in changes of the service side capacity of the system.
  • the assignment rate is calculated first (i.e. the workload capacity represented by one active assignment).
  • the assignment rate is calculated based on the characteristics of the given serving unit type and it remains fixed (until a major change, such as an upgrade, is performed that would change these characteristics).
  • the number of required active assignments, i.e. the active capacity of the service side, is calculated according to equation (1): #ActiveAssignments = ⌈Workload / AssignmentRate⌉ (1)
  • Equation (1) is re-used in the elasticity rule and transformed into the methods of the addAssignment and removeAssignment actions.
  • the variables of equation (1) are transformed into parameters by which the aforementioned actions are defined.
  • the variable #ActiveAssignments is transformed into an output parameter calculated by the methods of their operations.
  • the variable Workload is transformed into an input parameter and its value is provided at runtime by the monitoring system.
  • the value of AssignmentRate is constant, whose value is determined at the configuration generation.
  • a workload unit may group one or more active assignments. Equation (2) is used in the service dimensioning step 420 of Figure 4 to determine the number of required workload units from the calculated active assignments: #WorkloadUnits = ⌈#ActiveAssignments / max#ActiveAssignments_WU⌉ (2)
  • max#ActiveAssignments_WU is determined at service dimensioning time based on the maximum workload of a customer and the number of nodes that the customer is allowed to use. It remains constant, similarly to the assignment rate.
  • Equation (2) is transformed into the method of the addWorkloadUnit action to add workload units when the workload of a customer exceeds the capacity of the current workload units.
  • This equation is also re-used in the elasticity rules which have their scaling rule set to Decrease to define the method of the removeWorkloadUnit action; i.e. when fewer workload units are needed for the calculated number of active assignments and some workload units are to be locked as a follow-up action. Note that using these equations (1) and (2) guarantees that the system has the minimum required number of active assignments and WUs.
  • the service side of the system is re-dimensioned. For example, if the measurement from the monitoring system shows that the current workload represented by a workload unit with two active assignments and with the assignment rate of 400 requests per second has increased to 1100 requests per second, then the number of assignments for that workload unit is changed to three. On the other hand, if a similar increase is detected for a service type where each workload unit can have only one active assignment, then the increase needs three unlocked workload units in the system. If there are two unlocked workload units, a third one is added to the system.
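The re-dimensioning above can be reproduced with the two equations as recovered from the surrounding text (the ceiling form is implied by the worked numbers; the function names are illustrative).

```python
import math

def active_assignments(workload: float, assignment_rate: float) -> int:
    # Equation (1): #ActiveAssignments = ceil(Workload / AssignmentRate)
    return math.ceil(workload / assignment_rate)

def workload_units(n_assignments: int, max_assignments_per_wu: int) -> int:
    # Equation (2): #WorkloadUnits = ceil(#ActiveAssignments / max#ActiveAssignments_WU)
    return math.ceil(n_assignments / max_assignments_per_wu)
```

For the example in the text: a workload of 1100 requests per second at an assignment rate of 400 yields three active assignments; if each workload unit can have only one active assignment, three unlocked workload units are needed.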
  • the threshold values related to the appropriate services, i.e. for the service side of the system, are updated.
  • the equations used to determine the boundary thresholds in the service dimensioning step 420 in Figure 4 are also re-used to define the method for the updateThreshold operation. This operation is part of the add/removeAssignment action and it is executed when the add/removeAssignment operation is executed.
  • Service entities depend on service provider entities for being provided and protected, i.e. the service side relies on the resources of the service provider side.
  • services may depend on each other within the service side, i.e. to function one service may require another service. Therefore for each dependency, a prerequisite is generated for the case of the addition of an assignment or a workload unit, and a follow-up is generated for the removal case.
  • both the prerequisites and the follow-ups are generated as they are applicable to the same sponsor entities.
  • inequality (3) has an upper boundary which is equal to the required number of work pools for protection.
  • inequality (4) is used to check if a work pool is in excess (i.e. more than necessary) when an assignment or workload unit is removed.
  • for the service provider side prerequisite, the system starts with (3), and for the service provider side follow-up, the system starts with (4); both sides of the inequalities are then defined.
  • the current number of work pools, i.e. the left-hand sides of inequalities (3) and (4)
  • the number of work pools which are required for protecting workload units is determined based on the equations used in the service provider dimensioning step 450 of Figure 4 that calculates the number of required work pools.
  • the service provider side prerequisite and follow-up are generated at the same time when the number of work pools is determined.
  • Equation (5) is used in the service side elasticity rules to define the right-hand side of the Boolean expressions of the prerequisite (3) and follow-up (4).
  • the variables of (5) are transformed into parameters.
  • the number of workload units (#WUs) is transformed into a parameter whose value for the prerequisite is calculated using (1), i.e. the required number of workload units; while for the follow-up it is the current number of workload units and comes from the current configuration since it has changed as a result of performing the removeWorkloadUnit action.
  • max#StandbyAssignmentsPerServingUnit are both constant and their values are determined at configuration generation time.
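As an illustrative sketch (not the disclosure's exact formulas), the service provider side prerequisite (3) and follow-up (4) can be checked against a required number of work pools computed in the style of equation (5). All names and the exact form of the capacity computation here are assumptions:

```python
import math

def required_work_pools(num_wus: int, max_active_per_su: int,
                        sus_per_work_pool: int) -> int:
    # Equation (5), sketched: serving units (SUs) needed to carry the WUs'
    # active assignments, grouped into work pools. The per-SU capacities
    # are constants fixed at configuration generation time.
    sus_needed = math.ceil(num_wus / max_active_per_su)
    return math.ceil(sus_needed / sus_per_work_pool)

def prerequisite_met(current_pools: int, num_wus: int,
                     max_active_per_su: int, sus_per_pool: int) -> bool:
    # Inequality (3): enough work pools for the required workload units,
    # where num_wus is calculated with equation (1)
    return current_pools >= required_work_pools(num_wus, max_active_per_su,
                                                sus_per_pool)

def work_pools_in_excess(current_pools: int, num_wus: int,
                         max_active_per_su: int, sus_per_pool: int) -> bool:
    # Inequality (4): more work pools than necessary, where num_wus is the
    # current number of WUs taken from the configuration after a removal
    return current_pools > required_work_pools(num_wus, max_active_per_su,
                                               sus_per_pool)
```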
  • the prerequisite and the follow-up applicable at runtime can be generated to check if the current number of active assignments of the sponsor provides the required capacity for the active assignments of the dependent. From equation (6), inequality (7) is obtained as the Boolean expression for the prerequisite. The Boolean expression for the follow-up is defined similarly.
  • the current number of assignments of the sponsor is transformed into a parameter whose value is obtained at runtime from the system.
  • the required number of assignments of the dependent is transformed into a parameter whose value in the prerequisite is calculated using (1).
  • the value is obtained from the current configuration, which has changed as a result of performing removeAssignment action.
  • the prerequisite/follow-up trigger: for a prerequisite/follow-up that checks the capacity of workload units of a service type, the prerequisite/follow-up trigger is defined on the service type. Then at runtime, the follow-up/prerequisite trigger is issued on the work pool or workload unit sponsoring the workload unit for which the elasticity rule was invoked.
  • the scalingType of a prerequisite trigger is Increase, and that of a follow-up trigger is Decrease, because a prerequisite trigger initiates the allocation of the prerequisite resources and a follow-up trigger initiates the release of excess resources of the sponsors.
  • the attribute measurement of a prerequisite trigger represents the minimum sponsor capacity which is required to be added to meet the prerequisite Boolean expression.
  • the attribute measurement of the follow-up trigger represents the minimum sponsor capacity which is required to be removed so that the follow-up Boolean expression is evaluated to false indicating no extra sponsor resource.
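The trigger attributes just described can be sketched as a small record; the field and function names are illustrative assumptions, not the disclosure's metamodel:

```python
from dataclasses import dataclass

@dataclass
class Trigger:
    scaling_type: str   # "Increase" (prerequisite) or "Decrease" (follow-up)
    target: str         # entity the trigger is issued on at runtime
    measurement: float  # minimum sponsor capacity to add or remove

def prerequisite_trigger(target: str, missing_capacity: float) -> Trigger:
    # initiates the allocation of the prerequisite sponsor resources
    return Trigger("Increase", target, missing_capacity)

def follow_up_trigger(target: str, excess_capacity: float) -> Trigger:
    # initiates the release of excess sponsor resources
    return Trigger("Decrease", target, excess_capacity)
```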
  • the action removeAssignment is applicable if the workload unit contains some active assignments and removeWorkloadUnit is applicable when fewer workload units for grouping active assignments are required.
  • the current number of active assignments in a workload unit is transformed into a parameter by which the Boolean expressions of the conditions are defined.
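A minimal sketch of these condition Boolean expressions, with illustrative names:

```python
def remove_assignment_applicable(active_assignments_in_wu: int) -> bool:
    # removeAssignment applies only if the workload unit still contains
    # some active assignments
    return active_assignments_in_wu > 0

def remove_workload_unit_applicable(current_wus: int, required_wus: int) -> bool:
    # removeWorkloadUnit applies when fewer workload units are required
    # for grouping the calculated active assignments than currently exist
    return current_wus > required_wus
```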
  • elasticity rules for the service provider side such as work pools and nodes.
  • An elasticity rule for a work pool is triggered as a prerequisite when the workload increases and the current work pools cannot provide the added workload units/assignments, or as a follow-up action when the workload decreases and serving units and/or work pools become in excess and should be removed.
  • one of the actions may be to reconfigure the current work pools by adding or removing SUs.
  • the capacity of the system for providing workload units can change by reconfiguring its current work pools.
  • a work pool is reconfigured by changing the state of its constituting serving units. That is, the capacity of a work pool can increase by “unlocking” some of its “locked” serving units.
  • the work pool can be reconfigured by “locking” its unassigned serving units when the workload decreases.
  • the equation used to dimension the serving units at configuration design time is used in the work pool elasticity rule as the method of the reconfigureWorkPool operation.
  • Another action may be to add new work pools or to remove work pools in excess.
  • some of the work pools may not have any workload units to protect.
  • the action RemoveWorkPool is taken to lock the excess work pools and their serving units.
  • when a new workload unit is required to increase the capacity of the system, a new work pool may be required as the service provider entity.
  • by the action addWorkPool, the work pool and some of its serving units become “unlocked”.
  • when the operation addWorkPool or removeWorkPool is executed based on the required or removed workload units, the current number of unlocked work pools in the system changes.
  • the equation used in the service provider dimensioning step to calculate the number of work pools (i.e. equation (5)) is used to define the method of addWorkPool or removeWorkPool operations.
  • the required number of “unlocked” serving units in each work pool is determined according to the redundancy model of the work pool. For example, if the redundancy model is 2N, the required number of “unlocked” serving units is 2 (i.e. one serving unit for supporting the active assignments and one serving unit for supporting the standby assignments).
  • the definitions of prerequisite and follow-up for the service provider side are provided under Prerequisite and Follow-up Definitions.
  • Serving units are hosted on nodes; therefore to unlock a serving unit, as prerequisite the hosting node should be in the “unlocked” state and it should have enough capacity to host the added serving unit.
  • the load that is imposed on the node by requests for a service is estimated by a function at runtime. This function takes into account parameters that characterize the workload as well as the node (e.g. the types of workload the node currently supports, the operating system, etc.). By calculating the estimated load at runtime, the system can check if the underlying node has enough resources to support the required serving unit.
  • the Boolean expressions of the prerequisites are defined as (9) and (10) to check if the node is“unlocked” and if it has enough resources to host the required serving unit:
  • node.state = “unlocked” (9)
  • node.maxThreshold > node.load + su.requiredResource (10)
  • the resources of the hosting node may become in excess; thus a follow-up trigger on the node should be generated to initiate the removal of the node or its excess resources (if applicable) by a follow-up action.
  • the resources of the node are in excess if the current load on the node is less than its minimum threshold. Therefore, in the work pool elasticity rule model the follow-up is defined as (11): node.load < node.minThreshold.
  • the Boolean expressions (9), (10) and (11) are generated in the last step 460 of the configuration generation process 400 when the nodes for hosting the work pools are determined.
  • the variables maxThreshold, minThreshold, load and state are transformed into parameters that belong to the node.
  • the variable requiredResource of (10) is transformed into a parameter that belongs to the hosted SU.
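Expressions (9), (10) and (11) can be sketched as follows, using the attribute names from the text on an illustrative Node record:

```python
from dataclasses import dataclass

@dataclass
class Node:
    state: str
    load: float
    min_threshold: float
    max_threshold: float

def can_host_su(node: Node, su_required_resource: float) -> bool:
    # (9): the hosting node must be "unlocked", and
    # (10): it must have headroom for the serving unit to be added
    return (node.state == "unlocked"
            and node.max_threshold > node.load + su_required_resource)

def node_resources_in_excess(node: Node) -> bool:
    # (11): follow-up on the node after a removal — resources are in
    # excess when the current load drops below the minimum threshold
    return node.load < node.min_threshold
```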
  • prerequisite and follow-up triggers with respect to the service provider side are now provided. Since the SUs of a work pool may be hosted on different nodes of a node group, it may not be possible to specify offline on which node the prerequisite or follow-up trigger is to be generated at runtime. However, the node group to which a serving unit belongs can be specified. Therefore, at design time, the prerequisite/follow-up trigger is defined for the node group. At runtime, when the prerequisite for adding an SU is not met or when after the removal of the SU resources become extra, the trigger is generated for the hosting node.
  • the condition for reconfigureWorkPool is that the work pool on which this action is to be taken is unlocked; if the work pool is locked, the action addWorkPool is applicable instead. As a result, the state of the work pool is transformed into a parameter by which the Boolean expressions of these conditions are defined. In case of decrease, the action reconfigureWorkPool is applicable if the work pool still protects some workload units.
  • the action removeWorkPool is applicable when the work pool does not have any workload unit to protect and, as the result of the action, it should be locked. Therefore, the current number of protected WUs is transformed into a parameter by which the conditions of reconfigureWorkPool and removeWorkPool actions are defined. All the aforementioned conditions are generated when their corresponding actions are generated.
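A hedged sketch of the work pool action conditions described above (names are illustrative):

```python
def reconfigure_applicable(pool_state: str, protected_wus: int,
                           scaling_decrease: bool) -> bool:
    # reconfigureWorkPool applies to an unlocked pool; on decrease it
    # must also still protect some workload units
    if pool_state != "unlocked":
        return False
    return protected_wus > 0 if scaling_decrease else True

def add_work_pool_applicable(pool_state: str) -> bool:
    # addWorkPool applies when the pool is locked
    return pool_state == "locked"

def remove_work_pool_applicable(pool_state: str, protected_wus: int) -> bool:
    # removeWorkPool applies when the pool has no workload unit to protect
    return pool_state == "unlocked" and protected_wus == 0
```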
  • Elasticity rules for nodes are now discussed.
  • the configuration of nodes needs to support the work pools, their serving units and, if applicable, virtual compute nodes, which is guaranteed by the prerequisites and follow-ups of their elasticity rules.
  • these may not guarantee an optimal distribution of assignments to the SUs and work pools hosted on the nodes.
  • additional complementary actions may be needed. Therefore, the actions of node elasticity rules are categorized into: Actions to handle prerequisite or follow-up triggers; and actions to redistribute the hosted entities for better resource utilization. Note that, in turn, actions of the latter category may require actions of the first category as prerequisites/follow-ups.
  • templates (e.g., the templates 463 of Figure 4) are used to generate the different elements of the elasticity rules for the nodes. Since a node can be a member of multiple node groups, the elasticity rules are not defined per node group; they are defined per node at the last step of the configuration generation process 400, when the node configuration is generated.
  • Add or Remove a Node are defined for the cases of adding a node as a prerequisite or removing one as a follow-up action.
  • the action addNode is applicable when the state of the node is “locked” and by taking this action, the state will change to “unlocked”.
  • the action removeNode is applicable when an “unlocked” node has no services to support and by taking this action, the state of the node changes to “locked”.
  • the prerequisite for the addNode action is expressed as inequality (12). According to this prerequisite, if the node is hosted by another node (e.g. it is a VM hosted by a physical node), the hosting node should have enough resources for the hosted node (i.e. by unlocking the hosted node, the maximum threshold of its hosting node should not be reached).
  • the follow-up defined as inequality (13) is associated with the removeNode action to check if by the removal of the node, the resources of the hosting node are in excess. Since the nodes can migrate at runtime, at design time it may not be possible to specify on which hosting node the prerequisite and follow-up triggers are defined. Therefore, at design time, the prerequisite and follow-up triggers are defined on the group of nodes which can host the node.
  • Adding or Removing Virtual/Physical Resources to or from the Node are defined primarily for the cases of prerequisite/follow-up actions. They can be used also to avoid redistribution by adding/removing resources of the current node if it is resizable.
  • a node can be a resizable virtual machine or a hyperscale system, e.g. Ericsson HDS 8000. Resources can be added to a resizable node to decrease its resource utilization, or removed from it to increase it.
  • These actions are only included in the elasticity rules of resizable nodes. By these actions, the amount of resources allocated to a node changes.
  • a resizable computing node still has a maximum capacity that it can expand to. If the node has reached its maximum capacity, no more resources can be added to the node and this action is not applicable. Therefore, the condition for addResources action is defined as (14) to check whether the node has not reached its maximum capacity.
  • the node should have at least one running process.
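Condition (14) and the removal condition above can be sketched as follows; the names and the exact form of (14) are assumptions for illustration:

```python
def add_resources_applicable(allocated: float, max_capacity: float) -> bool:
    # (14), sketched: resources can be added only while the resizable
    # node has not reached the maximum capacity it can expand to
    return allocated < max_capacity

def remove_resources_applicable(running_processes: int) -> bool:
    # the node should have at least one running process for the
    # resource-removal action to be applicable
    return running_processes >= 1
```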
  • the prerequisite, prerequisite trigger, follow-up and follow-up trigger of these actions are similar to those of add/removeNode actions.
  • the supported services can be moved out to other nodes if as prerequisite there are service provider entities with enough capacity to host them.
  • when the node supports multiple services, it is decided, based on the estimated cost of releasing one unit of resource, which supported service should be moved out. To rearrange the workload, one of the following actions is defined:
  • Migration of hosted nodes: if the node is capable of hosting other nodes, some of its hosted nodes can be migrated to other hosting nodes to release the resources of the given node.
  • the prerequisite to migrate a hosted node is expressed as (15). Accordingly, in the hosting node group there should be a hosting node with enough capacity to host the hosted node to be migrated. As expressed in (16), this action is applicable when the hosting node hosts at least one hosted node.
  • the prerequisite and follow-up triggers are defined on the group of hosting nodes which are eligible to host the hosted node.
  • nodeGroup.nodes()->exists(n | n.maxThreshold > n.load + hostedNode.requiredResource) (15)
  • Moving assignments/workload units to other nodes: similar to the migration of hosted nodes, one way of releasing a node’s resources is to move the assignments or workload units supported by the node to other nodes.
  • the prerequisite, prerequisite trigger, follow-up and follow-up trigger of this action are similar to those of addWorkloadUnit/Assignment and removeWorkloadUnit/Assignment actions.
  • Adding workload unit/assignment to an additional node: by adding a workload unit or an assignment, the workload is shared among more nodes and therefore, less load will be imposed on the given node. This action is applicable if the boundary of the system from the service side has not been reached.
  • a set of elasticity rules are used to reconfigure the system dynamically.
  • This disclosure presents an approach to automatically generate elasticity rules at system configuration generation time, and to reconfigure the system at runtime using the elasticity rules. While the system’s configuration is designed (i.e. generated automatically), the calculations used to dimension the system as well as some computed parameters are reused to define the elasticity rules. The reuse of system dimensioning knowledge results in more accurate elasticity rules and less resource usage compared to runtime elasticity rule definition approaches that are based on learning techniques. Moreover, the elasticity rules considered in this disclosure are at a finer granularity than what is presented in the related work, as the rearrangement of resources, in addition to their addition and removal, is taken into account.
  • FIG. 6 is a flow diagram illustrating a method 600 for runtime reconfiguration of a system according to elasticity rules according to one embodiment.
  • the method 600 begins at step 610 when one or more of the elasticity rules are invoked in response to a trigger.
  • the elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process.
  • One or more of the formulas specified in the one or more elasticity rules are evaluated at step 620 to determine actions to be executed for reconfiguring the system.
  • the actions are executed at step 630 to scale the system with respect to one or more entities among service provider entities and service entities.
  • the formulas include equations and inequalities, which quantitatively describe required numbers of the entities for accommodating runtime workload variations and thresholds for the system containing the required numbers of the entities.
  • equations and/or inequalities may be used to calculate the number of entities needed for a given workload or the capacity provided by the entities for the workload. Results of the calculations can be used to determine how many entities need to be added or removed from the current number in the system, and to determine what the new thresholds are for these entities based on the capacity these entities provide.
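A minimal sketch of this add/remove decision (names are assumptions):

```python
import math

def entity_delta(workload: float, capacity_per_entity: float,
                 current_entities: int) -> int:
    # Compare the required number of entities for the measured workload
    # with the current number in the configuration:
    # > 0 means add entities, < 0 means remove, 0 means no change.
    required = math.ceil(workload / capacity_per_entity)
    return required - current_entities
```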
  • the trigger occurs when one of the thresholds is violated.
  • the service entities represent provided services and the service provider entities represent resources providing the services.
  • the elasticity rules may be applied on the system by executing corresponding actions which provision, de-provision and rearrange resources and services provided by the system.
  • in evaluating the formulas, the runtime measurements of the system and constants calculated during the system configuration generation process are obtained and used as parameter values for the one or more formulas.
  • each action is executed according to a method defined by at least one of the formulas.
  • an action on an entity is executed when a corresponding condition for the action is met and when all prerequisites of the action are met if the action has any prerequisites.
  • the corresponding condition for the action is a Boolean expression which indicates a status or a constraint of the system, the entity on which the action is executed, or the entities in the system for the action to be applicable.
  • the corresponding condition for the action indicates the applicability of the action. For example, a condition may indicate whether the system has the capacity for the action to apply.
  • a condition for a node may be whether the node is physical or virtual, as a virtual node such as a VM can be moved or resized (if the host has the capacity) but a physical node can neither be moved nor resized.
  • a condition may be used to select an action appropriate for the redundancy model of the system, e.g. 2N redundancy cannot have additional serving units while N-way redundancy can.
  • a prerequisite of an action is evaluated at runtime to determine whether to allocate prerequisite resources of the system.
  • the prerequisite is a Boolean expression defined by at least one of the formulas.
  • a follow-up of an action is evaluated at runtime to determine whether to release excess resources of the system. The follow-up is a Boolean expression defined by at least one of the formulas.
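The execution flow described in the last few bullets — condition first, then prerequisites, then the action's method, with follow-ups evaluated afterwards — can be sketched as follows. The Action record and all names are illustrative assumptions:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Action:
    name: str
    condition: Callable[[], bool]                 # applicability check
    prerequisites: List[Callable[[], bool]] = field(default_factory=list)
    follow_ups: List[Callable[[], bool]] = field(default_factory=list)
    method: Callable[[], None] = lambda: None     # defined by the formulas

def try_execute(action: Action) -> bool:
    """Run the action if its condition holds and all prerequisites are met."""
    if not action.condition():
        return False
    if not all(p() for p in action.prerequisites):
        # in the full system, a prerequisite trigger would be issued here
        # to allocate the missing sponsor resources
        return False
    action.method()
    # follow-ups evaluating to True would issue follow-up triggers to
    # release excess resources of the sponsors
    return True
```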
  • each elasticity rule is expressed by one or more actions and corresponding costs.
  • a corresponding cost of an action may be estimated during the system configuration generation process based on a minimum cost of the action where all prerequisites of the action are met and a maximum cost of the action where none of the prerequisites are met.
  • at least one action is identified to redistribute entities of the system on physical or virtual nodes of the system at runtime according to the corresponding costs.
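A sketch of cost-aware action selection under these assumptions; the linear interpolation between the minimum cost (all prerequisites met) and the maximum cost (none met) is an illustrative choice, not the disclosure's method:

```python
def estimated_cost(min_cost: float, max_cost: float,
                   prerequisites_met: int, total_prerequisites: int) -> float:
    # interpolate between the design-time min/max cost estimates based
    # on how many prerequisites currently hold
    if total_prerequisites == 0:
        return min_cost
    unmet = total_prerequisites - prerequisites_met
    return min_cost + (max_cost - min_cost) * unmet / total_prerequisites

def cheapest_action(actions):
    # actions: list of (name, cost) pairs for the applicable actions;
    # pick the one with the lowest estimated cost
    return min(actions, key=lambda a: a[1])[0]
```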
  • the system detects a threshold violation as a trigger, in response to which one or more of the elasticity rules are invoked.
  • the thresholds of the system indicate boundaries of unlocked capacity of the system and are generated at the system configuration generation time.
  • a configuration value may have a minimum threshold and a maximum threshold for its upper boundary and lower boundary, respectively.
  • a threshold violation means that the configuration value is near (i.e. within a predefined vicinity of) and may soon breach the corresponding boundary.
  • one or more of the thresholds may be updated.
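Threshold violation detection as described can be sketched as follows; the names are illustrative, and the vicinity parameter stands for the "predefined vicinity" mentioned above:

```python
def threshold_violated(value: float, min_threshold: float,
                       max_threshold: float, vicinity: float) -> bool:
    # a violation occurs when the monitored configuration value comes
    # within the predefined vicinity of either boundary of the
    # unlocked capacity
    near_upper = value >= max_threshold - vicinity
    near_lower = value <= min_threshold + vicinity
    return near_upper or near_lower
```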
  • each action is generated for a group of the entities having a same entity type, and each action is matched to an entity at runtime.
  • FIG. 7 is a block diagram illustrating a network node 700 according to an embodiment.
  • the network node 700 may be a server in an operator network or in a data center.
  • the network node 700 includes circuitry which further includes processing circuitry 702, a memory 704 or instruction repository and interface circuitry 706.
  • the interface circuitry 706 can include at least one input port and at least one output port.
  • the memory 704 contains instructions executable by the processing circuitry 702 whereby the network node 700 is operable to perform the various embodiments described herein.
  • Figure 8 is a block diagram of an example network node 800 for runtime reconfiguration of a system according to elasticity rules according to one embodiment.
  • the network node 800 may be a server in an operator network or in a data center.
  • the network node 800 comprises a rule invocation module 810, an evaluation module 820 and an execution module 830.
  • the elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process.
  • the rule invocation module 810 is operative to invoke one or more of the elasticity rules in response to a trigger.
  • the evaluation module 820 is operative to evaluate one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system.
  • the execution module 830 is operative to execute the actions to scale the system with respect to one or more entities among service provider entities and service entities.
  • the network node 800 can be configured to perform the various embodiments as have been described herein.
  • FIG. 9 is an architectural overview of a cloud computing environment 900 that comprises a hierarchy of cloud computing entities.
  • the cloud computing environment 900 can include a number of different data centers (DCs) 930 at different geographic sites connected over a network 935.
  • Each data center 930 site comprises a number of racks 920
  • each rack 920 comprises a number of servers 910.
  • a set of the servers 910 may be selected to host resources 940.
  • the servers 910 provide an execution environment for hosting entities and their hosted entities, where the hosting entities may be service providers and the hosted entities may be the services provided by the service providers.
  • hosting entities examples include virtual machines (which may host containers) and containers (which may host contained components), among others.
  • a container is a software component that can contain other components within itself. Multiple containers can share the same operating system (OS) instance, and each container provides an isolated execution environment for its contained component. As opposed to VMs, containers and their contained components share the same host OS instance and therefore create less overhead.
  • Each of the servers 910, the VMs, and the containers within the VMs may be configured to perform the various embodiments as have been described herein.
  • the cloud computing environment 900 comprises a general-purpose network device (e.g. server 910), which includes hardware comprising a set of one or more processor(s) 960, which can be commercial off-the-shelf (COTS) processors, dedicated Application Specific Integrated Circuits (ASICs), or any other type of processing circuit including digital or analog hardware components or special purpose processors, and network interface controller(s) 970 (NICs), also known as network interface cards, as well as non-transitory machine-readable storage media 990 having stored therein software and/or instructions executable by the processor(s) 960.
  • the processor(s) 960 execute the software to instantiate a hypervisor 950 and one or more VMs 941, 942 that are run by the hypervisor 950.
  • the hypervisor 950 and VMs 941, 942 are virtual resources, which may run node instances.
  • the node instance may be implemented on one or more of the VMs 941, 942 that run on the hypervisor 950 to perform the various embodiments as have been described herein.
  • the node instance may be instantiated as a network node performing the various embodiments as described herein.
  • the node instance instantiation can be initiated by a node 901, which may be a machine the same as or similar to the network node 700 of Figure 7.
  • the node 901 can communicate with the server 910 via the network 935.
  • the node 901 may dimension a system during a system configuration process and generate elasticity rules for the system.
  • Embodiments may be represented as a software product stored in a machine-readable medium (such as the non-transitory machine-readable storage media 990, also referred to as a computer-readable medium, a processor-readable medium, or a computer usable medium having a computer readable program code embodied therein).
  • the non-transitory machine-readable medium 990 may be any suitable tangible medium including a magnetic, optical, or electrical storage medium including a diskette, compact disk read-only memory (CD-ROM), digital versatile disc read-only memory (DVD-ROM), or a memory device (volatile or non-volatile) such as a hard drive or solid state drive, or similar storage mechanism.
  • the machine-readable medium may contain various sets of instructions, code sequences, configuration information, or other data, which, when executed, cause a processor to perform steps in a method according to an embodiment. Those of ordinary skill in the art will appreciate that other instructions and operations necessary to implement the described embodiments may also be stored on the machine-readable medium. Software running from the machine-readable medium may interface with circuitry to perform the described tasks.

Abstract

A system is reconfigured at runtime according to elasticity rules. The elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process. In response to a trigger, one or more of the elasticity rules are invoked. One or more of the formulas specified in the one or more elasticity rules are evaluated to determine actions to be executed for reconfiguring the system. The actions are executed to scale the system with respect to one or more entities among service provider entities and service entities.

Description

A MODEL-DRIVEN APPROACH FOR DESIGN TIME GENERATION AND RUNTIME
USAGE OF ELASTICITY RULES
TECHNICAL FIELD
[0001] Embodiments of the invention relate to the automatic generation of elasticity rules at system configuration design time and runtime usage of the elasticity rules.
BACKGROUND
[0002] A Service Level Agreement (SLA) is a contract negotiated and agreed on between a service provider and a customer; it defines the expected quality of the services provided. For instance, the level of service availability, i.e. the percentage of time the service is provided in a given period of time, may be part of the SLA. The rights and obligations of each party are also described in the SLA. When any of the parties fails to meet their respective obligations, SLA violations occur and the responsible party may be subject to penalties.
[0003] To increase revenue and reputation, service providers aim at avoiding SLA violations (i.e. penalties) while minimizing the system’s resources usage. To achieve this aim, a system needs to adapt to workload changes by having resources provisioned and de-provisioned dynamically to satisfy the actual workload according to the SLAs. In the cloud environment, this is known as elasticity, i.e. where a system evolves and adapts dynamically to workload variations by scaling up/down and in/out. Dynamic reconfiguration includes resource rearrangement in addition to resource provisioning and de-provisioning.
[0004] To scale a system dynamically, actions are taken according to a set of defined patterns, called elasticity rules. An elasticity rule may provide different actions that are applicable and can be performed in different situations. An elasticity rule is generally invoked by a trigger, which is generated in reaction to a monitoring event.
[0005] Elasticity rules may be defined online with the help of agents that watch and learn about the behavior and the usage of the system using machine learning techniques. A reward function can be used to evaluate the effectiveness of actions that the agents take towards a system goal. The goal of the agents, however, is different from the system goal. The agents try to maximize the rewards and therefore learn and repeat the actions that have the greatest return. A drawback is that if the agents learn fast, they may learn problems such as Denial of Service (DoS) attacks and, therefore, they may incorporate the problems in the elasticity rules as well. In this case, the elasticity rules need to be updated and the system needs to be brought back to its normal state. Since the relation between the generated elasticity rules and configuration is not clear, identifying the problems in the elasticity rules and updating them is difficult. In other words, such online approaches do not guarantee that the system evolves correctly within the designed boundaries. Moreover, depending on the learning method used, it may be difficult to come up with an appropriate reward function to drive the agents towards the system goal.
Furthermore, machine learning techniques can be very resource hungry and, therefore, can be inefficient with respect to resource usage.
SUMMARY
[0006] In one embodiment, a method is provided for runtime reconfiguration of a system according to elasticity rules. The elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process. The method comprises: invoking one or more of the elasticity rules in response to a trigger; evaluating one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system; and executing the actions to scale the system with respect to one or more entities among service provider entities and service entities.
[0007] In another embodiment, there is provided a network node comprising processing circuitry and memory. The memory contains instructions executable by the processing circuitry to reconfigure a system at runtime according to elasticity rules. The network node is operative to perform the aforementioned method.
[0008] In yet another embodiment, there is provided a network node operable to reconfigure a system at runtime according to elasticity rules. The elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process. The network node comprises a rule invocation module operative to invoke one or more of the elasticity rules in response to a trigger; an evaluation module operative to evaluate one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system; and an execution module operative to execute the actions to scale the system with respect to one or more entities among service provider entities and service entities.
[0009] Other aspects and features will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments in conjunction with the accompanying figures.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] Embodiments will now be described, by way of example only, with reference to the attached figures.
[0011] Figure 1 illustrates an example of a system configuration at runtime according to one embodiment.
[0012] Figure 2 illustrates a portion of a configuration model for the example in Figure 1 according to one embodiment.
[0013] Figure 3 illustrates an elasticity rule metamodel according to one embodiment.
[0014] Figure 4 illustrates an extended configuration generation process according to one embodiment.
[0015] Figure 5 is a block diagram illustrating an overview of the design time generation and runtime usage of elasticity rules according to one embodiment.
[0016] Figure 6 is a flow diagram illustrating a method for runtime reconfiguration of a system according to elasticity rules according to an embodiment.
[0017] Figure 7 is a block diagram of a network node according to one embodiment.
[0018] Figure 8 is a block diagram of a network node according to another embodiment.
[0019] Figure 9 is an architectural overview of a cloud computing environment according to one embodiment.
DETAILED DESCRIPTION
[0020] Reference may be made below to specific elements, numbered in accordance with the attached figures. The discussion below should be taken to be exemplary in nature, and should not be considered as limited by the implementation details described below, which as one skilled in the art will appreciate, can be modified by replacing elements with equivalent functional elements.
[0021] Embodiments of the invention provide runtime reconfiguration of a system using elasticity rules. The elasticity rules are defined and generated offline during a system configuration generation process using a model-driven approach. More specifically, the elasticity rules are defined and generated when the system is dimensioned and configured for providing required services with the highest expected workload. As a result, there is a clear relation between the generated elasticity rules and the system configuration to allow the elasticity rules to be traced if necessary. Moreover, the elasticity rules allow the rearrangement of resources as well as the addition and removal of the resources.
[0022] During the configuration generation process, a set of formulas are used to dimension the system based on the required maximum workload to be supported by the system. According to the formulas, the elasticity rules including their associated conditions, actions, pre-requisites and follow-ups are generated for the configured entities or their types together with the appropriate thresholds. At runtime when a threshold is violated, the action of an applicable and feasible elasticity rule is executed, which recalculates the formula used at configuration generation, but with the current measurements of the system, thus providing a new configuration value for use under the changed circumstances. In addition, the elasticity rules can be used to rearrange configuration entities according to the distribution principles of the system.
[0023] The dimensions of the system, i.e. the maximum eligible number of entities in the system, are determined according to one or more formulas based on the characteristics of these entities, their relations and the maximum required workload to be serviced. This information is captured in the form of equations which are used not only for system dimensioning and configuration but also for the definition of the elasticity rules that govern the system's dynamic reconfiguration within the dimensioned scope. Since at runtime several elasticity rules may be invoked simultaneously, a framework is provided for an elasticity rule structure that allows action correlation to avoid conflicting reconfiguration actions.
[0024] System Configuration and Representation. A system configuration describes the entities composing the system, their relationships and their characteristics. The entities can be classified into two categories: service entities to represent the provided services and service provider entities to describe the resources that can provide those services. In addition, the common characteristics of a set of entities are abstracted as entity types, which are also part of the system configuration.
[0025] Such a system can be viewed from two perspectives: service side and service provider side. On the service side, a workload unit (WU) is a service entity that represents a chunk of the workload for a given service type. It is a logical entity. On the provider side, service provider entities are pieces of application software, physical or virtual computing nodes, which are tangible resources. The application software entities are represented as serving units (SUs) capable of supporting/providing WUs. Each serving unit is deployed on a computing node.
[0026] One reason for the distinction between service entities and service provider entities is that the relation between them may or may not be one-to-one, and service entities can be assigned and re-assigned dynamically to different service provider entities, even to multiple of them simultaneously. To provide highly available services, the provider side contains entities to provide and also to protect the service entities in case of provider side failure. High availability is a basic requirement for carrier grade services. Accordingly, a set of redundant serving units can be grouped into a work pool, which can be organized with different redundancy models, namely 2N, N+M, N-way active, N-way and No redundancy. Depending on the redundancy model, at runtime a workload unit may have one or more active and zero or more standby assignments, each of which is assigned to a different serving unit of a work pool. A serving unit with an active assignment provides the service. A serving unit with a standby assignment does not provide the service but is ready to become active in a minimal amount of time.
[0027] The assignment rate of a workload unit is the workload capacity (e.g. the number of requests per second) represented by one active assignment of the workload unit. For example, an assignment rate of 400 for a workload unit means that each active assignment, when assigned, allocates a capacity capable of handling up to 400 requests per second.
[0028] When a serving unit with an active assignment fails, its assignment is automatically re-assigned to a redundant standby serving unit, if one exists. Thus, to protect a service from hardware failures, the different serving units of a work pool are hosted on different physical nodes. A work pool is deployed on a node group eligible for hosting its serving units. If the serving units are deployed on virtual computing nodes, the same principle applies to them, i.e. virtual computing nodes of a node group are to be hosted on different physical hosts. This imposes a constraint for the migration of a virtual computing node: the migration can happen only to physical hosts which are eligible for hosting the virtual computing node based on its node group and therefore the hosted serving units.
[0029] Table I summarizes the configuration entities and their definitions.
Entity               Definition
Workload unit (WU)   A logical service entity representing a chunk of the workload for a given service type.
Serving unit (SU)    A piece of application software capable of supporting/providing workload units; deployed on a computing node.
Work pool            A group of redundant serving units, organized according to a redundancy model, providing and protecting workload units.
Node                 A physical or virtual computing node hosting serving units.
Node group           A group of nodes eligible for hosting the serving units of a work pool.
Table I. Configuration Entities
[0030] When a serving unit is not assigned any workload unit, it is removed from the system by "locking" it to reduce the resource/power consumption. The serving unit remains in the configuration, but it is said to be in the "locked" state and accordingly terminated or powered off. In contrast, when such a serving unit needs to be assigned some workload again, it is added back to the work pool (i.e. reconfiguring the work pool) by "unlocking" it. Thus, its state becomes "unlocked", resulting in instantiation or power-up, and it becomes available to provide services. An entire work pool or a node can also be added to (removed from) the system. In this case, the state of the work pool/node changes from "locked" to "unlocked" (from "unlocked" to "locked"). Similarly, on the service side, a workload unit can be in the "unlocked" or "locked" state depending on whether the chunk of workload it represents needs to be assigned or not.
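By way of illustration only, the locking/unlocking behavior described above can be sketched as follows. The class and attribute names are hypothetical and are not part of the disclosed configuration model; the sketch merely captures that a locked entity remains in the configuration but is terminated/powered off, while an unlocked entity is instantiated and available:

```python
from enum import Enum

class AdminState(Enum):
    LOCKED = "locked"      # remains in the configuration but is terminated / powered off
    UNLOCKED = "unlocked"  # instantiated / powered up, available to provide services

class ServingUnit:
    def __init__(self, name: str):
        self.name = name
        self.state = AdminState.LOCKED
        self.assignments: list = []

    def unlock(self) -> None:
        # add the serving unit back to the work pool: instantiate or power it up
        self.state = AdminState.UNLOCKED

    def lock(self) -> None:
        # remove the serving unit from service to save resources;
        # it stays in the configuration, only its state changes
        if self.assignments:
            raise ValueError("only serving units without assignments are locked")
        self.state = AdminState.LOCKED
```

The same state change would apply analogously to work pools, nodes and, on the service side, workload units.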
[0031] Accordingly, a configuration is generated with all the entities necessary considering the maximum workload that the system is to provide and protect. When the workload varies at runtime, the system changes the state of the entities by unlocking and locking them to add or remove them to/from the system according to the current workload and thus reconfigure the running system dynamically.
[0032] Figure 1 illustrates an example of a system configuration 100 at runtime according to one embodiment. The availability of the service is maintained by assigning multiple active assignments (shown with solid lines in Figure 1) to different serving units of a protecting work pool (e.g., Work Pool1). The "unlocked" and "locked" serving units of Work Pool1 are shown with solid and dashed rectangles, respectively. Each serving unit is hosted on a separate node. In this example, if an assignment is added to WorkloadUnit1 due to workload increase, Serving Unit4 and its hosting node are "unlocked" to support the added assignment. Similarly, when one of the active assignments of WorkloadUnit1 is removed due to workload decrease, the SU and the node that were supporting the assignment are locked to save resources. If any of the serving units which provide the service fails, since the workload is shared among other serving units with the active role, the service remains available. However, the total capacity is reduced until the failed SU is repaired. If the repair performed by the availability management is not successful, the increased load on the remaining serving units is handled by the elasticity management and the failed SU will be replaced by Serving Unit4, provided the maximum capacity was not already reached. Note that the maximum capacity of the system remains reduced until the failed unit is repaired.
[0033] Figure 2 illustrates a portion of a configuration model for the example in Figure 1 according to one embodiment. As shown in Figure 2, there is one WU (WorkloadUnit1) of ServiceType1 which has three assignments and is protected by WorkPool1. The redundancy model of this work pool is N-way active and each serving unit of WorkPool1 can handle at most one active assignment at a time.
[0034] The Service Availability Forum has defined configuration models (such as the configuration model shown in Figure 2) and a set of middleware services to enable the development of highly available systems and applications. According to the embodiments disclosed herein, elasticity rules may be generated during the system configuration generation time. The system configuration may be represented by a configuration model having a portion such as what is shown in Figure 2. However, it is noted that in alternative embodiments elasticity rules may be generated during the generation of a system configuration defined in another domain where similar principles hold.
[0035] The Elasticity Rule Metamodel. Figure 3 illustrates an elasticity rule metamodel 300 according to one embodiment. By applying elasticity rules, entities can be adapted to workload changes at runtime. In this disclosure, elasticity rules are defined for entity types because instances of the same type share the same features and are subject to the same actions. The EntityType metaclass 320 specifies the type of the configuration entities to which the elasticity rule applies.
[0036] To reconfigure the system based on workload fluctuations, resources are either allocated or released at runtime. As a result, for an entity type two different elasticity rules are defined: one to expand the system and another one to shrink the system. The value of the attribute scalingRule in the Elasticity Rule metaclass 310 specifies if the elasticity rule is for expanding or shrinking the system.
[0037] An elasticity rule may consist of different actions 370, each applicable and feasible in a specific situation. The applicability of an action is defined with a Boolean expression represented by the Condition metaclass 330 in the elasticity rule metamodel 300. At runtime, for an action to be applicable and therefore considered for execution, its associated condition must evaluate to true. For example, to scale up a resizable virtual machine (VM), the condition checks whether the VM has not reached the maximum capacity that it can expand to. If the VM has already reached the maximum capacity, the scale up action is not applicable and cannot be considered for execution.
[0038] An applicable action may or may not be feasible. For instance, even though a VM has not yet reached its maximum capacity, the VM cannot be expanded due to the lack of resources in the hosting node which it depends on. If some of these resources can be freed up, then the VM can also be resized. The Prerequisite metaclass 340 is defined to check the feasibility of an action. A prerequisite evaluating to false may be satisfied by first taking actions of other elasticity rules on sponsor entities. In this case, a trigger (which is an instance of the Trigger metaclass 350) to invoke the prerequisite elasticity rule is generated for providing the required sponsor resources first. Since prerequisite triggers initiate the allocation of prerequisite resources, the scalingType of these triggers is Increase. In the VM expansion example, a prerequisite Boolean expression may be defined as whether the hosting node has available resources for the VM to expand. If the hosting node has the resources for the VM to expand (i.e. the prerequisite evaluates to true), the corresponding action is taken to expand the VM capacity. If the hosting node lacks resources for the VM to expand (i.e. the prerequisite evaluates to false), then a trigger is generated to invoke the resource allocation action on the hosting node (i.e. the sponsor entity of the VM).
[0039] The Follow-Up metaclass 360 is defined to check whether to execute a follow-up action on a sponsor entity after the execution of an action on an entity (which depends on the sponsor entity). According to the evaluation of a follow-up Boolean expression, a follow-up trigger (which is an instance of the trigger metaclass 350) may be generated to invoke an elasticity rule to execute a follow-up action on the sponsor entity. For instance, after the removal of workload units or assignments, a follow-up trigger may be generated to initiate an elasticity rule to remove any provider entity without assignments. A follow-up trigger is generated when the scalingRule of the executed elasticity rule is Decrease.
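The interplay of condition, prerequisite, action and follow-up described above can be sketched as follows. This is a minimal illustration only; the names, signatures and the trigger representation are assumptions of this sketch, not the disclosed metamodel's actual API:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Trigger:
    entity_type: str
    scaling_type: str  # prerequisite triggers use "Increase"

@dataclass
class Action:
    operation: Callable[[Dict], None]     # the reconfiguration operation (e.g. resize a VM)
    condition: Callable[[Dict], bool]     # applicability check (e.g. VM below its maximum capacity)
    prerequisite: Callable[[Dict], bool]  # feasibility check against the sponsor entity
    follow_up: Callable[[Dict], bool]     # whether a follow-up action on the sponsor is needed

def apply_action(action: Action, ctx: Dict) -> List[Trigger]:
    """Evaluate one action of an invoked elasticity rule; return any generated triggers."""
    if not action.condition(ctx):
        return []  # not applicable, hence not considered for execution
    if not action.prerequisite(ctx):
        # not feasible: first trigger a rule to allocate the required sponsor resources
        return [Trigger(ctx["sponsor_type"], "Increase")]
    action.operation(ctx)
    if ctx.get("scaling_rule") == "Decrease" and action.follow_up(ctx):
        # e.g. remove a provider entity left without assignments
        return [Trigger(ctx["sponsor_type"], "Decrease")]
    return []
```

In the VM expansion example, the condition checks that the VM is below its maximum capacity, the prerequisite checks that the hosting node (the sponsor entity) has free resources, and a failed prerequisite yields an Increase trigger on the hosting node.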
[0040] An action contained in an elasticity rule is an operation 380, which has a method specified with a language. For example, the Object Constraint Language (OCL) may be used for expressing the method of the operation and the Boolean expression of the conditions, follow-ups and prerequisites in the elasticity rule.
[0041] The method of an operation and Boolean expression of a condition, follow-up and prerequisite contain a number of parameters 390. These parameters belong to the entity type of the elasticity rule or its entities. The values of some of these parameters are set during the configuration generation process while others are obtained at runtime from the monitoring system or the configuration.
[0042] Each action of an elasticity rule has a cost. The attribute midCost represents an approximate cost of the action; its value is the median of the minimum cost and the maximum cost. An action incurs the minimum cost when all the prerequisites are met; thus, it is the cost of the action itself. An action incurs the maximum cost when none of the prerequisites are met and all prerequisite actions are invoked. The midCost for an action is calculated as part of the elasticity rule generation process: to calculate it, all the prerequisite elasticity rules are generated recursively with their actions.
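The midCost calculation can be sketched as follows, under the assumption (not stated explicitly in the text) that the maximum cost equals the action's own cost plus the midCost of every prerequisite action, each obtained recursively during rule generation:

```python
def mid_cost(action_cost: float, prerequisite_mid_costs: list) -> float:
    # minimum cost: every prerequisite is already met, so only the action itself runs
    min_cost = action_cost
    # maximum cost (an assumption of this sketch): no prerequisite is met, so every
    # prerequisite action is invoked, each estimated by its recursively computed midCost
    max_cost = action_cost + sum(prerequisite_mid_costs)
    # midCost is the median (midpoint) of the minimum and the maximum cost
    return (min_cost + max_cost) / 2.0
```

For an action with cost 10 and prerequisite midCosts of 4 and 6, the minimum cost is 10, the maximum cost is 20, and the midCost is 15.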
[0043] Next, the extended configuration generation approach is described. The elasticity rules may be used to reconfigure the system at runtime according to workload variations. As such, there is a tight coupling between the elasticity rules and the configuration. Therefore, the configuration generation process can be used to generate not only the configuration, but the elasticity rules as well. The elasticity rules may be generated based on the formulas used for dimensioning the system (e.g., equations used for calculating lower and upper boundaries of configuration values). System dimensioning is a process or operation that determines the number of configuration entities on the service side as well as on the service provider side.
[0044] Figure 4 illustrates an extended configuration generation process 400 according to one embodiment. The user requirements 411, the service ontology 412 and the description of the available software (i.e. the software catalog 413) are used as input for the process 400. First, in the user requirements decomposition and serving unit type selection step 410, the functional user requirements are decomposed (using the service ontology 412) to the level at which they can be matched with serving unit types available in the software catalog supporting these functional requirements. The matched serving unit type(s) are the candidate serving unit types; at this stage they are "prototypes" as they allow for different deployment options.
[0045] The maximum workload that the system needs to be able to handle is one of the non-functional user requirements. The maximum workload determines the service side of the system configuration in terms of the active capacity and is expressed as the number of active assignments of the different service types. The service dimensioning step 420 considers the candidate serving unit types, which determine the service types to express the maximum requested workload to support. As part of the service side dimensioning, in step 420 the number of active assignments is calculated for each service type using appropriate equations.
[0046] When an equation is used for calculating the number of instances of an entity type, two elasticity rules are generated for the entity type: one with the scaling type Increase and one with the scaling type Decrease. The equation used for the calculation is transformed into the operations of the actions of the generated elasticity rules 480, while the variables of the equations are transformed into the parameters of those operations. Further details of the elasticity rule generation are explained in detail later. To trigger these elasticity rules 480, thresholds are also generated.
[0047] According to embodiments described herein, a threshold represents a point at which one or more actions are taken to reconfigure the system. For example, when at runtime the workload for a workload unit of service type "A" reaches its maximum threshold, the elasticity rule with scaling type Increase is invoked for service type "A". By executing the action of this elasticity rule, the equation based on which this action was defined (and which was used to calculate the number of assignments of the service type) is re-applied with the parameters reflecting the current workload. As a result, the required number of active assignments for service type "A" is recalculated for the current workload. By changing the number of active assignments, the service side capacity of the system is reconfigured. Note that to actually perform this reconfiguration, prerequisites may need to be satisfied.
[0048] The prototype selection step 430 considers the candidate serving unit types and their work pool types defined in the software catalog 413 for selecting those that can provide the requested availability of services. This requested availability is another non-functional user requirement. Based on availability estimations, the candidate serving unit types that cannot provide the service types with the requested level of availability are removed, together with the elasticity rules 480 generated for the service types they support. It is expected that the prototypes in the software catalog 413 are described by their vendor(s) in terms of their performance, availability and other characteristics, as well as their monitoring facilities that can be observed by a monitoring system. In this step, for the remaining candidate serving unit types these metrics are extracted from the software catalog and captured in a measurable metrics 432 model. The generated measurable metrics 432 model can be used later to specify the monitoring agents of a monitoring system.
[0049] In the type creation step 440, from the candidate prototypes offering different deployment options, fully specified types are created for deployment by parametrization, and missing work pool types are added with an appropriate redundancy model. This in turn determines whether a workload unit of a service type can have one or many active assignments.
[0050] In the SUs and WPs dimensioning step 450 (also referred to as the service provider dimensioning step 450), the required numbers of serving units and work pools are determined based on the previously calculated number of active assignments representing the maximum requested workload. The system includes sufficient work pools and serving units to provide and protect all the active assignments. Therefore, this step involves the grouping of active assignments into workload units, adding the standby assignments necessary for the redundancy model and calculating the service provider entities for all. The relation between the service side capacity and the service provider side capacity is often not 1:1, as the latter includes the active capacity as well as the capacity required for the protection of the active assignments by standbys and spare serving units. Since this step calculates the number of WPs and SUs using equations, the related elasticity rules 480 are also generated, as described in more detail later.
[0051] In the last step, which is the distribution of entities and node configuration generation step 460, the required number of nodes (e.g. virtual) is calculated and the entities are distributed among the nodes based on the cluster information 462 and templates 463. The templates 463 are defined based on distribution principles. The distribution guarantees that SUs of the same work pool are configured on different virtual and physical nodes, and this is maintained also in case of node migration. In this step, the node configuration is generated. Moreover, the initial states of the entities (i.e. locked or unlocked) and their related thresholds are set. In this step, the elasticity rules 480 for the nodes are also generated. As a result of the aforementioned steps of process 400, a system configuration 470 is generated. In the different selection steps (410, 430) and in the dimensioning step (450) of the configuration generation process, traceability information 490 is generated to maintain the relation between the generated configuration and the selections made, as well as the elasticity rules generated. This traceability information can be used later, for example, to determine what part of the configuration needs to be regenerated when some user requirements change.
[0052] In the user requirements 411, the workload that the system supports is specified as a range (e.g. minimum and maximum number of requests per second). The configuration entities are generated for the maximum workload that the system can handle. However, generating the configuration for the maximum workload does not mean that all entities of the configuration are instantiated in the system. Instead, it means that with all these entities instantiated (i.e. healthy and in the "unlocked" state) the system can handle the maximum workload according to the SUAs. This represents the configuration boundaries; therefore, the boundary thresholds are set at this point.
[0053] When the system starts provisioning services, the workload may not be at the maximum. In one embodiment, the system may be initially dimensioned for the mid workload (i.e. the median of the minimum and the maximum workload specified in the user requirements). Using the same equations, the initial capacity of the system is configured for handling the mid workload by locking configuration entities not needed to support any workload and setting the related attributes. The values of the different thresholds of the deployed system are determined at this step (i.e. the distribution of entities and node configuration generation step 460) to reflect the unlocked capacity of the system.
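The initial mid-workload dimensioning can be sketched as follows, re-using the same ceiling-based dimensioning applied at design time (function names are illustrative, not from the disclosure):

```python
import math

def mid_workload(min_workload: float, max_workload: float) -> float:
    # the initial capacity targets the median of the requested workload range
    return (min_workload + max_workload) / 2.0

def initially_unlocked_assignments(min_workload: float, max_workload: float,
                                   assignment_rate: float,
                                   max_assignments: int) -> int:
    # re-apply the same equation used for maximum-workload dimensioning, but for
    # the mid workload; entities beyond this count start in the "locked" state
    needed = math.ceil(mid_workload(min_workload, max_workload) / assignment_rate)
    return min(needed, max_assignments)
```

For a requested range of 200 to 1000 requests per second and an assignment rate of 400, the mid workload is 600 and two assignments are initially unlocked; the remaining entities stay locked until thresholds trigger the elasticity rules.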
[0054] Figure 5 is a block diagram illustrating an overview of the design time generation and runtime usage of elasticity rules according to one embodiment. At design time 510 (which is also referred to as the system configuration generation time), a system configuration process is performed at step 511 using a set of formulas 550 including equations and inequalities. The formulas 550 are used to dimension the system to thereby generate a system configuration 530 and elasticity rules 540 as described before with reference to Figure 4. At runtime 520, the system may be reconfigured at step 512 due to runtime events such as workload variations. The system is reconfigured using the elasticity rules 540 generated at the design time 510 with the formulas 550 re-applied to entities of the system. One or more of the formulas 550 may be evaluated using runtime measurements of the system and constants calculated during the system configuration generation process 511 as parameter values.
[0055] At runtime, with the generated elasticity rules triggered by threshold violations, the system is reconfigured within the configuration boundary as the workload varies within the range of the minimum and the maximum workload specified in the user requirements. Threshold triggers are primarily issued on service entities (i.e. WUs) and computing nodes (either physical or virtual nodes). The threshold triggers on service entities represent variations in the workload coming from users. As explained before, these triggers may lead to the generation of prerequisite and/or follow-up triggers on provider side entities. In case a threshold trigger is issued on a node while no threshold trigger is generated on its supported WUs, the issued threshold trigger is not directly related to the workload variation; e.g. it may be related to the distribution of entities among the nodes. In the approach described herein, different estimates are used to reconfigure the system (e.g. estimation of thresholds, of cost and of load). Thus, the reconfiguration may not result in an optimal distribution. In this case, complementary actions are taken to rearrange the entities hosted by the node, which means the rearrangement of assignments or virtual computing nodes. Note that the capacity of an SU in terms of the number of assignments is checked as a prerequisite or follow-up.
[0056] As mentioned earlier, each elasticity rule consists of action(s) and possibly a condition, follow-ups, follow-up triggers, prerequisites and prerequisite triggers. Table II summarizes the parameters used to define these elements as well as their descriptions. These elements are specified in the process of generating the elasticity rules, which will be described in more detail later.
Table II. Description of parameters used for defining elasticity rules
[0057] The following description explains the generation of the elasticity rules for the service side in detail. The entity types of the elasticity rules for the service side are the service types realized by workload units (e.g. ServiceType1 in Figure 2 is realized by WorkloadUnit1).
[0058] The definition of action with respect to the service side is now provided. The actions addAssignment and addWorkloadUnit are defined for the case of workload increase, and removeAssignment and removeWorkloadUnit are the actions defined for the case of workload decrease. These actions change the number of active assignments of an unlocked workload unit and/or the number of unlocked workload units in the system. These changes result in changes of the service side capacity of the system.
#ActiveAssignments = ceil(Workload / AssignmentRate) (1)
[0060] Equation (1) is re-used in the elasticity rules and transformed into the methods of the addAssignment and removeAssignment actions. The variables of equation (1) are transformed into parameters by which the aforementioned actions are defined. The variable #ActiveAssignments is transformed into an output parameter calculated by the methods of the operations. The variable Workload is transformed into an input parameter whose value is provided at runtime by the monitoring system. AssignmentRate is a constant whose value is determined at configuration generation.
[0061] Depending on the applicable redundancy model, a workload unit may group one or more active assignments. Equation (2) is used in the service dimensioning step 420 of Figure 4 to determine the number of required workload units from the calculated active assignments.
#WUs = ceil(#ActiveAssignments / max#ActiveAssignments_WU) (2)
[0062] The value of max#ActiveAssignments_WU is determined at service dimensioning time based on the maximum workload of a customer and the number of nodes that the customer is allowed to use. Similar to the assignment rate, it remains constant.
[0063] Equation (2) is transformed into the method of the addWorkloadUnit action to add workload units when the workload of a customer exceeds the capacity of the current workload units. This equation is also re-used in the elasticity rules which have their scaling rule set to Decrease to define the method of the removeWorkloadUnit action; i.e. when fewer workload units are needed for the calculated number of active assignments and some workload units are to be locked as a follow-up action. Note that using these equations (1) and (2) guarantees that the system has the minimum required number of active assignments and WUs.
[0064] At runtime, when operations of these elasticity rules are executed the service side of the system is re-dimensioned. For example, if the measurement from the monitoring system shows that the current workload represented by a workload unit with two active assignments and with the assignment rate of 400 requests per second has increased to 1100 requests per second, then the number of assignments for that workload unit is changed to three. On the other hand, if a similar increase is detected for a service type where each workload unit can have only one active assignment, then the increase needs three unlocked workload units in the system. If there are two unlocked workload units, a third one is added to the system.
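The two runtime scenarios above can be reproduced numerically (the values are the illustrative ones from the text):

```python
from math import ceil

# Scenario 1: a workload unit with assignment rate 400 req/s sees the
# workload rise to 1100 req/s -> its number of active assignments becomes 3.
active = ceil(1100 / 400)  # equation (1)

# Scenario 2: each workload unit may hold only one active assignment, so the
# same increase needs three unlocked workload units in the system.
needed_wus = ceil(active / 1)  # equation (2) with max#ActiveAssignmentsWU = 1
wus_to_add = needed_wus - 2    # two workload units are currently unlocked
```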
[0065] As part of such reconfigurations, the threshold values related to the appropriate services, i.e. for the service side of the system, are updated. For this purpose, the equations used to determine the boundary thresholds in the service dimensioning step 420 in Figure 4 are also re-used to define the method for the updateThreshold operation. This operation is part of the add/removeAssignment action and it is executed when the adding/removeAssignment operation is executed.
[0066] The definitions of prerequisite and follow-up with respect to the service side are now provided. Service entities depend on service provider entities for being provided and protected, i.e. the service side relies on the resources of the service provider side. In addition, services may depend on each other within the service side, i.e. to function, one service may require another service. Therefore, for each dependency, a prerequisite is generated for the case of the addition of an assignment or a workload unit, and a follow-up is generated for the removal case. The prerequisites and the follow-ups are generated at the same time, as they are applicable to the same sponsor entities.
[0067] Prerequisites and follow-ups for checking the service provider side capacity are now explained. With respect to service provider side capacity, prerequisites are used to ensure that there are sufficient service provider entities to which the added active assignments or the assignments of the added workload units can be assigned. Additionally, follow-ups are used to ensure that provider entities without assignments are removed. The system also ensures that there are sufficient groups of serving units (i.e. work pools) to provide and protect the required number of workload units and their assignments. Therefore, the formulas used for dimensioning the work pools and serving units are reused to define the Boolean expressions of the prerequisite and follow-up checks. Inequality (3) must be respected when adding an assignment or workload unit:
Current #WPs ≥ Required #WPs for protection (3)
[0068] To avoid wasting resources, the left-hand side of inequality (3) has an upper boundary which is equal to the required number of work pools for protection. Hence inequality (4) is used to check if a work pool is in excess (i.e. more than necessary) when an assignment or workload unit is removed.
Current #WPs > Required #WPs for protection (4)
[0069] For the service provider side prerequisite, the system starts with (3) and for the service provider side follow-up, the system starts with (4), and both sides of the inequalities are defined. The current number of work pools (i.e. the left-hand sides of inequalities (3) and (4)) is obtained from the system at runtime. The number of work pools which are required for protecting workload units is determined based on the equations used in the service provider dimensioning step 450 of Figure 4 that calculates the number of required work pools. The service provider side prerequisite and follow-up are generated at the same time when the number of work pools is determined.
[0070] The number of work pools may be calculated differently for different redundancy models. For instance, for the 2N redundancy model where each active assignment requires a workload unit, Equation (5) is used to dimension the work pools:
#WPs = ceil (#WUs / min (max#ActiveAssignmentsPerServingUnit, max#StandbyAssignmentsPerServingUnit)) (5)
[0071] Equation (5) is used in the service side elasticity rules to define the right-hand side of the Boolean expressions of the prerequisite (3) and the follow-up (4). For this purpose, the variables of (5) are transformed into parameters. The number of workload units (#WUs) is transformed into a parameter whose value for the prerequisite is the required number of workload units calculated using (2); for the follow-up it is the current number of workload units, obtained from the current configuration since it has changed as a result of performing the removeAssignment/removeWorkloadUnit action of the elasticity rule in question. Similarly to the AssignmentRate, the variables max#ActiveAssignmentsPerServingUnit and max#StandbyAssignmentsPerServingUnit are both constant and their values are determined at configuration generation time.
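A sketch combining equation (5) with the Boolean expressions (3) and (4), reading the prerequisite comparison as "greater than or equal" (a sufficient current number of work pools satisfies it). All names are illustrative, not part of the disclosure.

```python
from math import ceil

def required_work_pools_2n(num_wus, max_active_per_su, max_standby_per_su):
    """Equation (5): work pools needed under the 2N redundancy model."""
    return ceil(num_wus / min(max_active_per_su, max_standby_per_su))

def work_pool_prerequisite(current_wps, required_wps):
    """Expression (3): enough work pools to protect the workload units."""
    return current_wps >= required_wps

def work_pool_in_excess(current_wps, required_wps):
    """Expression (4): strictly more work pools than protection requires."""
    return current_wps > required_wps
```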
[0072] Prerequisites and follow-ups for checking the service side capacity are now explained. If a service depends on another service (i.e. the sponsor), then the sponsor entity is dimensioned in terms of active assignments based on the dependent entity according to equation (6):
#ActiveAssignmentsSponsor = ceil (#ActiveAssignmentsDependent x AssignmentRateDependent / AssignmentRateSponsor) (6)
[0073] By rewriting (6), the prerequisite and the follow-up applicable at runtime can be generated to check if the current number of active assignments of the sponsor provides the required capacity for the active assignments of the dependent. From equation (6), inequality (7) is obtained as the Boolean expression for the prerequisite. The Boolean expression for the follow-up is defined similarly.
Current #ActiveAssignmentsSponsor ≥ ceil (required #ActiveAssignmentsDependent x AssignmentRateDependent / AssignmentRateSponsor) (7)
[0074] For the Boolean expressions, the current number of assignments of the sponsor is transformed into a parameter whose value is obtained at runtime from the system. The required number of assignments of the dependent is transformed into a parameter whose value in the prerequisite is calculated using (1). In the follow-up, the value is obtained from the current configuration, which has changed as a result of performing the removeAssignment action.
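The sponsor capacity check of inequality (7) can be sketched as below, reading the comparison as "greater than or equal" (a sponsor with exactly the required capacity satisfies the prerequisite); the names are illustrative.

```python
from math import ceil

def sponsor_capacity_sufficient(current_sponsor_assignments,
                                required_dependent_assignments,
                                rate_dependent, rate_sponsor):
    """Inequality (7): the sponsor's current active assignments cover the
    capacity needed by the dependent's required active assignments."""
    required_sponsor = ceil(required_dependent_assignments
                            * rate_dependent / rate_sponsor)
    return current_sponsor_assignments >= required_sponsor
```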
[0075] In addition, to check if there are enough unlocked workload units currently in the system to group the required number of active assignments, a prerequisite as well as a follow-up are generated based on (2). Inequality (8), obtained from (2), is used as the Boolean expression of the prerequisite. The Boolean expression of the follow-up is defined similarly.
Current #WUs x max#ActiveAssignmentsWU ≥ #RequiredActiveAssignments (8)
[0076] The definitions of prerequisite and follow-up triggers with respect to the service side are now provided. Triggers are normally generated on entities. At the time of the configuration generation, however, it may not be possible to specify on which entity a prerequisite or a follow-up trigger needs to be issued; only the type of this entity can be specified. As a result, the prerequisite and follow-up triggers are also defined for the appropriate type. This means that for a prerequisite/follow-up that checks the capacity of the work pools of a type, the corresponding prerequisite/follow-up trigger is defined on the work pool type. For a prerequisite/follow-up that checks the capacity of workload units of a service type, the prerequisite/follow-up trigger is defined on the service type. Then at runtime, the prerequisite/follow-up trigger is issued on the work pool or workload unit sponsoring the workload unit for which the elasticity rule was invoked. The scalingType of a prerequisite trigger is Increase and that of a follow-up trigger is Decrease, because a prerequisite trigger initiates the allocation of the prerequisite resources and a follow-up trigger initiates the release of excess resources of the sponsors. The attribute measurement of a prerequisite trigger represents the minimum sponsor capacity which needs to be added to satisfy the prerequisite Boolean expression. In contrast, the attribute measurement of the follow-up trigger represents the minimum sponsor capacity which needs to be removed so that the follow-up Boolean expression evaluates to false, indicating no extra sponsor resources. These prerequisite and follow-up triggers are defined together with their corresponding prerequisite and follow-up Boolean expressions.
[0077] The definition of condition with respect to the service side is now provided. The addWorkloadUnit, addAssignment, removeWorkloadUnit and removeAssignment actions are applicable when, by adding/removing workload units or assignments, the designed boundaries of the system are not violated. Moreover, the action addWorkloadUnit is applicable only if the workload unit on which the threshold trigger was issued is currently “locked”, as the action changes its status to “unlocked”. In contrast, the action addAssignment is applicable if the workload unit for which the threshold trigger is generated is currently “unlocked”. Therefore, the state of the workload unit is transformed into a parameter by which the Boolean expressions of these conditions are defined. The action removeAssignment is applicable if the workload unit contains some active assignments, and removeWorkloadUnit is applicable when fewer workload units are required for grouping active assignments. As a result, for the removeAssignment and removeWorkloadUnit actions, the current number of active assignments in a workload unit is transformed into a parameter by which the Boolean expressions of the conditions are defined.
All the aforementioned conditions are generated when their corresponding actions are generated.
[0078] The following description explains in detail the generation of elasticity rules for the service provider side, such as work pools and nodes.
[0079] An elasticity rule for a work pool is triggered as a prerequisite when the workload increases and the current work pools cannot provide the added workload units/assignments, or as a follow-up action when the workload decreases and serving units and/or work pools become in excess and should be removed.
[0080] The definition of action with respect to the service provider side is now provided. Depending on the situation, a number of actions may be performed:
[0081] For example, one of the actions may be to reconfigure the current work pools by adding or removing SUs. The capacity of the system for providing workload units can change by reconfiguring its current work pools. A work pool is reconfigured by changing the state of its constituent serving units. That is, the capacity of a work pool can be increased by “unlocking” some of its “locked” serving units. Similarly, the work pool can be reconfigured by “locking” its unassigned serving units when the workload decreases. By taking such actions, the number of “unlocked” serving units in a work pool changes. The equation used to dimension the serving units at configuration design time is used in the work pool elasticity rule as the method of the reconfigureWorkPool operation.
[0082] Another action may be to add new work pools or to remove work pools in excess. When the workload is not at its maximum, some of the work pools may not have any workload units to protect. To avoid wasting resources, the action removeWorkPool is taken to lock the excess work pools and their serving units. In contrast, when a new workload unit is required to increase the capacity of the system, a new work pool may be required as the service provider entity. By performing the action addWorkPool, the work pool and some of its serving units become “unlocked”. At runtime, when the operation addWorkPool or removeWorkPool is executed, the current number of unlocked work pools in the system changes based on the required or removed workload units. The equation used in the service provider dimensioning step to calculate the number of work pools (i.e. equation (5)) is used to define the method of the addWorkPool and removeWorkPool operations. The required number of “unlocked” serving units in each work pool is determined according to the redundancy model of the work pool. For example, if the redundancy model is 2N, the required number of “unlocked” serving units is 2 (i.e. one serving unit for supporting the active assignments and one serving unit for supporting the standby assignments).
[0083] The definitions of prerequisite and follow-up with respect to the service provider side are now provided. Serving units are hosted on nodes; therefore, to unlock a serving unit, as a prerequisite the hosting node should be in the “unlocked” state and it should have enough capacity to host the added serving unit. The load that requests for a service impose on the node is estimated by a function at runtime. This function takes into account parameters that characterize the workload as well as the node (e.g. the types of workload the node currently supports, the operating system, etc.). By calculating the estimated load at runtime, the system can check if the underlying node has enough resources to support the required serving unit. As a result, in the work pool elasticity rule model, the Boolean expressions of the prerequisites are defined as (9) and (10) to check if the node is “unlocked” and if it has enough resources to host the required serving unit:
node.state = “unlocked” (9)
node.maxThreshold > node.load + su.requiredResource (10)
[0084] In contrast, by putting a serving unit into the“locked” state, the resources of the hosting node may become in excess; thus a follow-up trigger on the node should be generated to initiate the removal of the node or its excess resources (if applicable) by a follow-up action. The resources of the node are in excess if the current load on the node is less than its minimum threshold. Therefore, in the work pool elasticity rule model the follow-up is defined as (11):
node.minThreshold > node.load (11)
[0085] The Boolean expressions (9), (10) and (11) are generated in the last step 460 of the configuration generation process 400 when the nodes for hosting the work pools are determined. At this step, the variables maxThreshold, minThreshold, load and state are transformed into parameters that belong to the node. The variable requiredResource of (10) is transformed into a parameter that belongs to the hosted SU.
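A sketch of how the node-side checks (9), (10) and (11) could be evaluated at runtime; the parameter names mirror the variables above, while the functions themselves are purely illustrative.

```python
def node_can_host_su(node_state, node_load, node_max_threshold,
                     su_required_resource):
    """Prerequisites (9) and (10): the node is "unlocked" and adding the
    serving unit keeps its load below the maximum threshold."""
    return (node_state == "unlocked"
            and node_max_threshold > node_load + su_required_resource)

def node_resources_in_excess(node_load, node_min_threshold):
    """Follow-up (11): the load has fallen below the node's minimum threshold."""
    return node_min_threshold > node_load
```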
[0086] The definitions of prerequisite and follow-up triggers with respect to the service provider side are now provided. Since the SUs of a work pool may be hosted on different nodes of a node group, it may not be possible to specify offline on which node the prerequisite or follow-up trigger is to be generated at runtime. However, the node group to which a serving unit belongs can be specified. Therefore, at design time, the prerequisite/follow-up trigger is defined for the node group. At runtime, when the prerequisite for adding an SU is not met, or when resources become excess after the removal of an SU, the trigger is generated for the hosting node.
[0087] The definition of condition with respect to the service provider side is now provided. In case of an increase, the action reconfigureWorkPool is applicable if the work pool on which this action is to be taken is unlocked. If the work pool is locked, the action addWorkPool is applicable instead. As a result, the state of the work pool is transformed into a parameter by which the Boolean expressions of these conditions are defined. In case of a decrease, the action reconfigureWorkPool is applicable if the work pool still protects some workload units. In contrast, the action removeWorkPool is applicable when the work pool does not have any workload units to protect and, as the result of the action, it should be locked. Therefore, the current number of protected WUs is transformed into a parameter by which the conditions of the reconfigureWorkPool and removeWorkPool actions are defined. All the aforementioned conditions are generated when their corresponding actions are generated.
[0088] Elasticity rules for nodes are now discussed. The configuration of nodes needs to support the work pools, their serving units and, if applicable, virtual compute nodes, which is guaranteed by the prerequisites and follow-ups of their elasticity rules. However, because of the different estimates used at their execution, these may not guarantee an optimal distribution of assignments to the SUs and work pools hosted on the nodes. To redistribute hosted entities, additional complementary actions may be needed. Therefore, the actions of node elasticity rules are categorized into: actions to handle prerequisite or follow-up triggers; and actions to redistribute the hosted entities for better resource utilization. Note that, in turn, actions of the latter category may require actions of the first category as prerequisites/follow-ups. To define the actions of the second category, templates (e.g., the templates 463 of Figure 4) are defined based on the distribution principles. These templates 463 are used to generate the different elements of the elasticity rules for the nodes. Since a node can be a member of multiple node groups, the elasticity rules are not defined per node group. Instead, the elasticity rules are defined per node at the last step of the configuration generation process 400 when the node configuration is generated.
[0089] In the following, the actions of the node elasticity rules are explained.
[0090] Add or Remove a Node. These actions are defined for the cases of adding a node as a prerequisite action or removing one as a follow-up action. The action addNode is applicable when the state of the node is “locked” and by taking this action, the state will change to “unlocked”. The action removeNode is applicable when an “unlocked” node has no services to support and by taking this action, the state of the node changes to “locked”.
[0091] The prerequisite for the addNode action is expressed as inequality (12). According to this prerequisite, if the node is hosted by another node (e.g. it is a VM hosted by a physical node), the hosting node should have enough resources for the hosted node (i.e. by unlocking the hosted node, the maximum threshold of its hosting node should not be reached).
node.hostingNode notEmpty() implies
node.hostingNode.load + requiredResource < node.hostingNode.maxThreshold (12)
[0092] If the node is hosted by another node, the follow-up defined as inequality (13) is associated with the removeNode action to check if by the removal of the node, the resources of the hosting node are in excess. Since the nodes can migrate at runtime, at design time it may not be possible to specify on which hosting node the prerequisite and follow-up triggers are defined. Therefore, at design time, the prerequisite and follow-up triggers are defined on the group of nodes which can host the node.
node.hostingNode notEmpty() implies
node.hostingNode.minThreshold > node.hostingNode.load (13)
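Expressions (12) and (13) use an implication: when the node has no hosting node (e.g. it is physical), the check holds trivially. A sketch, with None standing for an empty hostingNode; the excess check follows the document's semantics that a hosting node's resources are in excess when its load is below its minimum threshold.

```python
def add_node_prerequisite(hosting_load, required_resource,
                          hosting_max_threshold):
    """Inequality (12). Pass hosting_load=None when the node is not hosted;
    the implication then holds trivially."""
    if hosting_load is None:
        return True
    return hosting_load + required_resource < hosting_max_threshold

def hosting_resources_in_excess(hosting_load, hosting_min_threshold):
    """Follow-up (13): after removing the hosted node, the hosting node's
    load is below its minimum threshold. Only meaningful for hosted nodes."""
    if hosting_load is None:
        return False
    return hosting_min_threshold > hosting_load
```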
[0093] Adding or Removing Virtual/Physical Resources to or from the Node. These actions are defined primarily for the cases of prerequisite/follow-up actions. They can also be used to avoid redistribution by adding/removing resources of the current node if it is resizable. A node can be a resizable virtual machine or a hyperscale system, e.g. Ericsson HDS 8000. Resources can be added to a resizable node to decrease its resource utilization, or removed from it to increase utilization. These actions are only included in the elasticity rules of resizable nodes. By these actions, the amount of resources allocated to a node changes. A resizable computing node still has a maximum capacity to which it can expand. If the node has reached its maximum capacity, no more resources can be added and this action is not applicable. Therefore, the condition for the addResources action is defined as (14) to check whether the node has not reached its maximum capacity:
node.maxNodeBoundary > node.currentResource + requiredResource (14)
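Condition (14) for the addResources action can be sketched as follows (illustrative names):

```python
def can_add_resources(max_node_boundary, current_resource, required_resource):
    """Condition (14): the resizable node has not reached its maximum
    capacity, so the requested resources still fit within its boundary."""
    return max_node_boundary > current_resource + required_resource
```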
[0094] To take the removeResources action, as a condition, the node should have at least one running process. The prerequisite, prerequisite trigger, follow-up and follow-up trigger of these actions are similar to those of the addNode/removeNode actions.
[0095] Rearrangement of Workload. These actions are defined to redistribute the hosted entities of a node, i.e. to resolve the threshold trigger on the node by taking actions on its hosted entities. To decrease the load on a node, the supported services can be moved out to other nodes if, as a prerequisite, there are service provider entities with enough capacity to host them. At runtime, when the node supports multiple services, it is decided based on the estimated cost of releasing one unit of resource which supported service should be moved out. To rearrange the workload, one of the following actions is defined:
[0096] Migration of hosted nodes. If the node is capable of hosting other nodes, some of its hosted nodes can be migrated to other hosting nodes to release the resources of the given node. The prerequisite to migrate a hosted node is expressed as (15). Accordingly, in the hosting node group there should be a hosting node with enough capacity to host the hosted node to be migrated. As expressed in (16), this action is applicable when the hosting node hosts at least one hosted node. The prerequisite and follow-up triggers are defined on the group of hosting nodes which are eligible to host the hosted node.
nodeGroup.nodes() exists (n | n.load + requiredResource < maxNodeThreshold) (15)
node.hostedNodes.size() > 0 (16)
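A sketch of the migration checks (15) and (16); node_loads lists the current loads of the candidate hosting nodes in the group, and all names are illustrative.

```python
def migration_prerequisite(node_loads, required_resource, max_node_threshold):
    """Expression (15): some node in the hosting group can absorb the
    migrated node without reaching its maximum threshold."""
    return any(load + required_resource < max_node_threshold
               for load in node_loads)

def hosts_any_node(num_hosted_nodes):
    """Condition (16): the action applies only if at least one node is hosted."""
    return num_hosted_nodes > 0
```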
[0097] Moving assignments/workload units to other nodes. Similar to the migration of hosted nodes, one way of releasing a node’s resources is to move the assignments or workload units supported by the node to other nodes. The prerequisite, prerequisite trigger, follow-up and follow-up trigger of this action are similar to those of the addWorkloadUnit/Assignment and removeWorkloadUnit/Assignment actions.
[0098] Adding workload unit/assignment to an additional node. By adding a workload unit or an assignment, the workload is shared among more nodes and therefore, less load will be imposed on the given node. This action is applicable if the boundary of the system from the service side has not been reached.
[0099] Swapping the active and standby assignments. Standby assignments often impose less load on the nodes than active assignments; therefore, swapping the roles may reduce the load on the node having the active assignment. However, for services such as databases where the loads imposed by the active and the standby assignments are about the same, this action may not be effective. This action has no prerequisite and no prerequisite trigger; however, a condition is defined to check whether, for an active assignment supported by the node, there exists a standby assignment that imposes less load than the active assignment does. Depending on the redundancy model of the protecting work pool, other constraints may be needed too. After performing this action, if a failure happens, the node may experience high load again as the standby assignment becomes active due to the failure.
[00100] To maximize resource utilization and meet SLAs, service providers need to adapt systems dynamically to workload variations. A set of elasticity rules is used to reconfigure the system dynamically. This disclosure presents an approach to automatically generate elasticity rules at system configuration generation time, and to reconfigure the system at runtime using these elasticity rules. While the system’s configuration is designed (i.e. generated automatically), the calculations used to dimension the system as well as some computed parameters are reused to define the elasticity rules. The reuse of system dimensioning knowledge results in more accurate elasticity rules and less resource usage compared to runtime elasticity rule definition approaches that are based on learning techniques. Moreover, the elasticity rules considered in this disclosure are at a finer granularity than what is presented in the related work, as the rearrangement of resources, in addition to their addition and removal, is taken into account.
[00101] Figure 6 is a flow diagram illustrating a method 600 for runtime reconfiguration of a system according to elasticity rules according to one embodiment. The method 600 begins at step 610 when one or more of the elasticity rules are invoked in response to a trigger. The elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process. One or more of the formulas specified in the one or more elasticity rules are evaluated at step 620 to determine actions to be executed for reconfiguring the system. The actions are executed at step 630 to scale the system with respect to one or more entities among service provider entities and service entities.
[00102] In one embodiment, the formulas include equations and inequalities, which quantitatively describe required numbers of the entities for accommodating runtime workload variations and thresholds for the system containing the required numbers of the entities. For example, equations and/or inequalities may be used to calculate the number of entities needed for a given workload or the capacity provided by the entities for the workload. Results of the calculations can be used to determine how many entities need to be added or removed from the current number in the system, and to determine what the new thresholds are for these entities based on the capacity these entities provide.
[00103] In one embodiment, the trigger occurs when one of the thresholds is violated. In one embodiment, the service entities represent provided services and the service provider entities represent resources providing the services. In one embodiment, the elasticity rules may be applied on the system by executing corresponding actions which provision, de-provision and rearrange resources and services provided by the system.
[00104] In one embodiment, to evaluate the formulas, runtime measurements of the system and constants calculated during the system configuration generation process are obtained and used as parameter values for the one or more formulas. In one embodiment, each action is executed according to a method defined by at least one of the formulas.
[00105] In one embodiment, an action on an entity is executed when a corresponding condition for the action is met and when all prerequisites of the action are met if the action has any prerequisites. The corresponding condition for the action is a Boolean expression which indicates a status or a constraint of the system, the entity on which the action is executed, or the entities in the system for the action to be applicable. In general, the corresponding condition for the action indicates the applicability of the action. For example, a condition may indicate whether the system has the capacity for the action to apply. A condition for a node may be whether the node is physical or virtual, as a virtual node such as a VM can be moved or resized (if the host has the capacity) but a physical node can neither be moved nor resized. A condition may be used to select an action appropriate for the redundancy model of the system, e.g. 2N redundancy cannot have additional serving units while N-way redundancy can.
[00106] In one embodiment, a prerequisite of an action is evaluated at runtime to determine whether to allocate prerequisite resources of the system. The prerequisite is a Boolean expression defined by at least one of the formulas. In one embodiment, a follow-up of an action is evaluated at runtime to determine whether to release excess resources of the system. The follow-up is a Boolean expression defined by at least one of the formulas.
[00107] In one embodiment, each elasticity rule is expressed by one or more actions and corresponding costs. A corresponding cost of an action may be estimated during the system configuration generation process based on a minimum cost of the action where all prerequisites of the action are met and a maximum cost of the action where none of the prerequisites are met. In one embodiment, at least one action is identified to redistribute entities of the system on physical or virtual nodes of the system at runtime according to the corresponding costs.
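One way the runtime could use the pre-estimated costs to choose among applicable redistribution actions is sketched below; this selection function is purely illustrative and not part of the disclosed embodiments.

```python
def pick_cheapest_action(actions):
    """actions: iterable of (name, estimated_cost, applicable) tuples,
    where estimated_cost lies between the action's minimum cost (all
    prerequisites met) and maximum cost (none met).
    Returns the name of the cheapest applicable action, or None."""
    candidates = [(cost, name) for name, cost, applicable in actions
                  if applicable]
    return min(candidates)[1] if candidates else None
```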
[00108] In one embodiment, the system detects a threshold violation as a trigger, in response to which one or more of the elasticity rules are invoked. The thresholds of the system indicate boundaries of unlocked capacity of the system and are generated at the system configuration generation time. In some embodiments of the system, a configuration value may have a minimum threshold and a maximum threshold for its lower boundary and upper boundary, respectively. A threshold violation means that the configuration value is near (i.e. within a predefined vicinity of) and may soon breach the corresponding boundary. When reconfiguring the system, one or more of the thresholds may be updated.
[00109] In one embodiment, each action is generated for a group of the entities having a same entity type, and each action is matched to an entity at runtime.
[00110] Figure 7 is a block diagram illustrating a network node 700 according to an embodiment. In one embodiment, the network node 700 may be a server in an operator network or in a data center. The network node 700 includes circuitry which further includes processing circuitry 702, a memory 704 or instruction repository and interface circuitry 706. The interface circuitry 706 can include at least one input port and at least one output port. The memory 704 contains instructions executable by the processing circuitry 702 whereby the network node 700 is operable to perform the various embodiments described herein.
[00111] Figure 8 is a block diagram of an example network node 800 for runtime
reconfiguration of a system according to elasticity rules. In one embodiment, the network node 800 may be a server in an operator network or in a data center. The network node 800 comprises a rule invocation module 810, an evaluation module 820 and an execution module 830. The elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process. In one embodiment, the rule invocation module 810 is operative to invoke one or more of the elasticity rules in response to a trigger. The evaluation module 820 is operative to evaluate one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system. The execution module 830 is operative to execute the actions to scale the system with respect to one or more entities among service provider entities and service entities. The network node 800 can be configured to perform the various embodiments as have been described herein.
[00112] Figure 9 is an architectural overview of a cloud computing environment 900 that comprises a hierarchy of cloud computing entities. The cloud computing environment 900 can include a number of different data centers (DCs) 930 at different geographic sites connected over a network 935. Each data center 930 site comprises a number of racks 920, each rack 920 comprising a number of servers 910. It is understood that in alternative embodiments a cloud computing environment may include any number of data centers, racks and servers. A set of the servers 910 may be selected to host resources 940. In one embodiment, the servers 910 provide an execution environment for hosting entities and their hosted entities, where the hosting entities may be service providers and the hosted entities may be the services provided by the service providers. Examples of hosting entities include virtual machines (which may host containers) and containers (which may host contained components), among others. A container is a software component that can contain other components within itself. Multiple containers can share the same operating system (OS) instance, and each container provides an isolated execution environment for its contained component. As opposed to VMs, containers and their contained components share the same host OS instance and therefore create less overhead. Each of the servers 910, the VMs, and the containers within the VMs may be configured to perform the various embodiments as have been described herein.
[00114] Further details of the server 910 and its resources 940 are shown within a dotted circle 915 of Figure 9, according to one embodiment. The cloud computing environment 900 comprises a general-purpose network device (e.g. server 910), which includes hardware comprising a set of one or more processor(s) 960, which can be commercial off-the-shelf (COTS) processors, dedicated Application Specific Integrated Circuits (ASICs), or any other type of processing circuit including digital or analog hardware components or special purpose processors, and network interface controller(s) 970 (NICs), also known as network interface cards, as well as non-transitory machine-readable storage media 990 having stored therein software and/or instructions executable by the processor(s) 960.
[00114] During operation, the processor(s) 960 execute the software to instantiate a hypervisor 950 and one or more VMs 941, 942 that are run by the hypervisor 950. The hypervisor 950 and VMs 941, 942 are virtual resources, which may run node instances. In one embodiment, the node instance may be implemented on one or more of the VMs 941, 942 that run on the hypervisor 950 to perform the various embodiments as have been described herein. In one embodiment, the node instance may be instantiated as a network node performing the various embodiments as described herein.
[00115] In an embodiment, the node instance instantiation can be initiated by a node 901, which may be a machine the same as or similar to the network node 700 of Figure 7. For example, the node 901 can communicate with the server 910 via the network 935. In one embodiment, the node 901 may dimension a system during a system configuration process and generate elasticity rules for the system.
[00116] Embodiments may be represented as a software product stored in a machine-readable medium (such as the non-transitory machine-readable storage media 990, also referred to as a computer-readable medium, a processor-readable medium, or a computer usable medium having a computer readable program code embodied therein). The non-transitory machine-readable medium 990 may be any suitable tangible medium including a magnetic, optical, or electrical storage medium including a diskette, compact disk read only memory (CD-ROM), digital versatile disc read-only memory (DVD-ROM), or a memory device (volatile or non-volatile) such as a hard drive or solid state drive, or similar storage mechanism. The machine-readable medium may contain various sets of instructions, code sequences, configuration information, or other data, which, when executed, cause a processor to perform steps in a method according to an embodiment. Those of ordinary skill in the art will appreciate that other instructions and operations necessary to implement the described embodiments may also be stored on the machine-readable medium. Software running from the machine-readable medium may interface with circuitry to perform the described tasks.
[00117] The above-described embodiments are intended to be examples only. Alterations, modifications and variations may be effected to the particular embodiments by those of skill in the art without departing from the scope which is defined solely by the claims appended hereto.

Claims

What is claimed is:
1. A method for runtime reconfiguration of a system according to elasticity rules, comprising:
invoking one or more of the elasticity rules in response to a trigger, wherein the elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process;
evaluating one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system; and
executing the actions to scale the system with respect to one or more entities among service provider entities and service entities.
2. The method of claim 1, wherein the formulas include equations and inequalities, which quantitatively describe required numbers of the entities for accommodating runtime workload variations and thresholds for the system containing the required numbers of the entities.
3. The method of claim 2, wherein the trigger occurs when one of the thresholds is violated.
4. The method of claim 1, wherein the service entities represent provided services and the service provider entities represent resources providing the services.
5. The method of claim 1, further comprising:
applying the elasticity rules on the system by executing corresponding actions which provision, de-provision and re-arrange resources and services provided by the system.
6. The method of claim 1, wherein evaluating the one or more of the formulas further comprises:
obtaining runtime measurements of the system and constants calculated during the system configuration generation process as parameter values for the one or more formulas.
7. The method of claim 1, wherein each action is executed according to a method defined by at least one of the formulas.
8. The method of claim 1, further comprising:
executing an action on an entity when a corresponding condition for the action is met and when all prerequisites of the action are met if the action has any prerequisites.
9. The method of claim 8, wherein the corresponding condition for the action is a Boolean expression indicating a status or a constraint of the system, the entity or the entities in the system for the action to be applicable.
10. The method of claim 1, further comprising:
evaluating a prerequisite of an action at runtime to determine whether to allocate prerequisite resources of the system, wherein the prerequisite is a Boolean expression defined by at least one of the formulas.
11. The method of claim 1, further comprising:
evaluating a follow-up of an action at runtime to determine whether to release excess resources of the system, wherein the follow-up is a Boolean expression defined by at least one of the formulas.
12. The method of claim 1, wherein each elasticity rule is expressed by one or more actions and corresponding costs, and wherein a corresponding cost of an action is estimated during the system configuration generation process based on a minimum cost of the action where all prerequisites of the action are met and a maximum cost of the action where none of the prerequisites are met.
13. The method of claim 12, further comprising:
identifying at least one action to redistribute entities of the system on physical or virtual nodes of the system at runtime according to the corresponding costs.
14. The method of claim 1, wherein invoking the one or more of the elasticity rules in response to the trigger further comprises:
detecting a threshold violation as the trigger, wherein thresholds of the system indicate boundaries of unlocked capacity of the system and are generated at the system configuration generation time; and
updating one or more of the thresholds when reconfiguring the system.
15. The method of claim 1, wherein each action is generated for a group of the entities having a same entity type, and is matched to an entity at runtime.
16. A network node comprising:
processing circuitry; and
memory containing instructions executable by the processing circuitry to reconfigure a system at runtime according to elasticity rules, the network node operative to:
invoke one or more of the elasticity rules in response to a trigger, wherein the elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process;
evaluate one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system; and
execute the actions to scale the system with respect to one or more entities among service provider entities and service entities.
17. The network node of claim 16, wherein the formulas include equations and inequalities, which quantitatively describe required numbers of the entities for accommodating runtime workload variations and thresholds for the system containing the required numbers of the entities.
18. The network node of claim 17, wherein the trigger occurs when one of the thresholds is violated.
19. The network node of claim 16, wherein the service entities represent provided services and the service provider entities represent resources providing the services.
20. The network node of claim 16, wherein the network node is further operative to:
apply the elasticity rules on the system by executing corresponding actions which provision, de-provision and re-arrange resources and services provided by the system.
21. The network node of claim 16, wherein when evaluating the one or more of the formulas, the network node is further operative to:
obtain runtime measurements of the system and constants calculated during the system configuration generation process as parameter values for the one or more formulas.
22. The network node of claim 16, wherein each action is executed according to a method defined by at least one of the formulas.
23. The network node of claim 16, wherein the network node is further operative to:
execute an action on an entity when a corresponding condition for the action is met and when all prerequisites of the action are met if the action has any prerequisites.
24. The network node of claim 23, wherein the corresponding condition for the action is a Boolean expression indicating a status or a constraint of the system, the entity or the entities in the system for the action to be applicable.
25. The network node of claim 16, wherein the network node is further operative to:
evaluate a prerequisite of an action at runtime to determine whether to allocate prerequisite resources of the system, wherein the prerequisite is a Boolean expression defined by at least one of the formulas.
26. The network node of claim 16, wherein the network node is further operative to:
evaluate a follow-up of an action at runtime to determine whether to release excess resources of the system, wherein the follow-up is a Boolean expression defined by at least one of the formulas.
27. The network node of claim 16, wherein each elasticity rule is expressed by one or more actions and corresponding costs, and wherein a corresponding cost of an action is estimated during the system configuration generation process based on a minimum cost of the action where all prerequisites of the action are met and a maximum cost of the action where none of the prerequisites are met.
28. The network node of claim 27, wherein the network node is further operative to:
identify at least one action to redistribute entities of the system on physical or virtual nodes of the system at runtime according to the corresponding costs.
29. The network node of claim 16, wherein when invoking the one or more of the elasticity rules in response to the trigger, the network node is further operative to:
detect a threshold violation as the trigger, wherein thresholds of the system indicate boundaries of unlocked capacity of the system and are generated at the system configuration generation time; and
update one or more of the thresholds when reconfiguring the system.
30. The network node of claim 16, wherein each action is generated for a group of the entities having a same entity type, and is matched to an entity at runtime.
31. A network node operable to reconfigure a system at runtime according to elasticity rules, the network node comprising:
a rule invocation module operative to invoke one or more of the elasticity rules in response to a trigger, wherein the elasticity rules are generated offline and contain formulas used for dimensioning the system during a system configuration generation process;
an evaluation module operative to evaluate one or more of the formulas specified in the one or more elasticity rules to determine actions to be executed for reconfiguring the system; and
an execution module operative to execute the actions to scale the system with respect to one or more entities among service provider entities and service entities.
PCT/IB2019/053846 2018-05-10 2019-05-09 A model-driven approach for design time generation and runtime usage of elasticity rules Ceased WO2019215676A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862669657P 2018-05-10 2018-05-10
US62/669,657 2018-05-10

Publications (1)

Publication Number: WO2019215676A1; Publication Date: 2019-11-14
Family ID: 67003558

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160094635A1 (en) * 2014-09-25 2016-03-31 Oracle International Corporation System and method for rule-based elasticity in a multitenant application server environment
WO2016178951A1 (en) * 2015-05-01 2016-11-10 Amazon Technologies, Inc. Automatic scaling of resource instance groups within compute clusters



Legal Events

121 (EP): The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19733120; Country: EP; Kind code: A1.
NENP: Non-entry into the national phase. Ref country code: DE.
122 (EP): PCT application non-entry in European phase. Ref document number: 19733120; Country: EP; Kind code: A1.