WO2013185175A1 - Predictive analytics for resource provisioning in hybrid cloud - Google Patents
- Publication number
- WO2013185175A1 (PCT/AU2013/000624)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- demand
- data
- input stream
- implemented method
- computer implemented
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/147—Network analysis or design for predicting network behaviour
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3442—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for planning or managing the needed capacity
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3447—Performance evaluation by modeling
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/50—Network service management, e.g. ensuring proper service fulfilment according to agreements
- H04L41/508—Network service management, e.g. ensuring proper service fulfilment according to agreements based on type of value added network service under agreement
- H04L41/5096—Network service management, e.g. ensuring proper service fulfilment according to agreements based on type of value added network service under agreement wherein the managed service relates to distributed or central networked applications
Definitions
- IaaS: infrastructure as a service
- PaaS: platform as a service
- the pull method is similar to the push method, except that information is polled periodically from the monitoring client 308.
- the monitoring server 320 issues a poll periodically.
- the frequency of the poll is defined in the monitoring server 320 and could be changed to accommodate the need of different cloud consumers (e.g., by adjusting the granularity of the monitoring data).
- An example value of the polling period is 30 seconds. This value is chosen to ensure that the monitoring server 320 can keep up with the monitoring data workload (that is, the data size will not grow so rapidly that it floods the database server) without compromising too much on granularity.
- the default value for Amazon EC2 is 60 seconds, hence polling at 30-second intervals is sufficient for many applications.
- the format of the information is kept in its original form (i.e., no processing is done on the remote site 302) until it reaches the monitoring server 320 where there is a common translation module 404 to convert the monitoring data into a supported format.
- the purpose of the translation module 404 is to resolve discrepancies that could arise from different types of monitoring clients, and to translate the data into a common data format.
- the formatting module 504 is responsible for converting the dataset into a format that can be used as input to the learning module. In one example, this is a vector format.
- the vector format can be stored in memory if the dataset is of reasonable size, otherwise it can be dumped into a file (e.g., in comma-delimited format).
- the formatted data is transferred to the learning module 506 for further processing.
- the data from the monitoring server 320 comprises data from multiple different data sources, such as monitoring data translation module 404, human hints receiver 407 or internet traffic model receiver 410 of Fig. 4.
- the data is stored on database server 406 of Fig. 4.
- the monitoring data comes from the monitoring server 320 and is periodically updated based on information collected from each of the instances 306. For convenience, we use the monitoring data below in this example:
- v1 = [20.63, 26.98, 29.37]
- v2 = [25.02, 20.63, 26.98]
- v3 = [17.56, 25.02, 20.63]
- the first two values of each vector are treated as inputs and the third value, which is the most recent value, as the output.
- the learning algorithm allows us to find a relationship between the inputs and the output, where the output will be the estimated value.
- a query vector comprising historical demand data is received and a demand pattern is selected such that the input values of the demand pattern, that is all values except the most recent value, are closest to the historical demand data of the query vector. The most recent value of the demand pattern is then used to predict the future demand.
- the received query vector is also depicted in Fig. 9 with reference numeral 910.
- Fig. 10 illustrates the input vectors 901 to 906 of Fig. 9. Note that the learning modules for the two data sets do not have to be the same, as long as each learning module at the end returns one or more demand patterns based on the corresponding input data.
- the method 700 associates the following vectors to demand pattern estimates:
- Suitable computer readable media may include volatile (e.g. RAM) and/or non-volatile (e.g. ROM, disk) memory, carrier waves and transmission media.
- Exemplary carrier waves may take the form of electrical, electromagnetic or optical signals conveying digital data streams along a local network or a publicly accessible network such as the internet.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
This invention concerns the prediction of demand of a computing system. A processor predicts the demand of a computer system by iteratively determining plural demand patterns based on an input stream of historical demand data. In each iteration the processor merges substrings of the input stream based on a difference between multiple demand pattern estimates of the previous iteration and the substrings of the input stream. The processor then determines a demand prediction based on a difference between a demand query and each of the demand patterns. This way, the processor learns patterns from historical data and predicts the demand based on the learned data. These demand patterns represent patterns that occur consistently over a longer time period.
Description
Title
Predictive Analytics for Resource Provisioning in Hybrid Cloud

Technical Field
This invention concerns the prediction of demand of a computing system. More particularly, the invention concerns a method, a system and software for predicting demand of a computer system.
Background Art
Cloud computing has changed many workflows related to the Internet. The differences between traditional server-based web hosting and web-services in the cloud will now be described with reference to a simple example.
Fig. 1 illustrates a server-based web-hosting system 100 comprising a server 102 connected to a display 104, an input device 106, such as a keyboard and a computer network 108, such as the Internet comprising a domain name service (DNS) server 109. A computer 110 of a user 111 is connected to the Internet 108.
Stored on the server 102 there is a computer file 112 comprising computer code, such as html, that characterises the content, appearance and behaviour of a webpage. Typically, the server 102 stores more than one such computer file but only one is shown for the sake of clarity. The server 102 executes a text editor software 114, such as Vim, and a webserver software 116, such as Apache. Both the text editor and the webserver access the computer file 112.
When in use, the display 104 shows the text editor 114 to a web designer 120. The web designer 120 uses the input device 106 to alter the computer file 112 to create or modify the content, appearance and behaviour of the web page. The web designer 120 may alternatively edit a local copy of the web page offline and then transfer the updated copy to the server 102 for deployment. The web designer also registers an internet address, such as www.example.com, with the DNS server 109 such that the internet address is associated with the IP address of the server 102.
When the user 111 enters the internet address into a browser software executed on computer 110, the browser software queries the DNS server for the IP address related to the address. Then the browser software connects to the server 102 and retrieves the computer file 112. The browser software interprets the computer code of the computer file and displays the web page to the user 111.
Typically, many users access the webpage stored on server 102. The computer file may include complex instructions for processes executed on the server 102, such as an online shop. These processes require computing power, and the required computation power depends on the number of users that access the computer file 112. It is difficult for the web designer to decide how much investment into computing power is necessary to provide reliable web presence.
Fig. 2 illustrates a cloud-based web hosting system 200 comprising a client computer 202 connected to the display 104, input device 106 and a computing cloud 208. The computer 110 of the user 111 is also connected to the computing cloud 208. The computing cloud 208 comprises resources for data storage, computing power and network services. Unlike in the example of Fig. 1, the client computer 202 executes only a browser. The computer file 112 is now stored by the computing cloud 208. The text editor 114 as well as the webserver software 116 are executed by the computing cloud 208 and displayed to the web designer 120 via a browser executed by the client computer 202.
When a user 111 accesses the web page of web designer 120, the browser executed on computer 110 does not connect to the client computer 202. Instead, the computer 110 retrieves the computer file 112 from the computing cloud 208. The computing cloud 208 offers resources to a large number of providers such as web designer 120. These resources may be infrastructure, such as virtual servers with root access, which is referred to as Infrastructure as a Service (IaaS), or may be a platform, such as one offering the Java servlet platform, for example Tomcat, which is referred to as Platform as a Service (PaaS). These resources may also be software, such as an online shop platform, which is referred to as Software as a Service (SaaS).
Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.
Disclosure of Invention
In a first aspect there is provided a computer implemented method for predicting demand of a computer system, the method comprising:
iteratively determining plural demand patterns based on an input stream of historical demand data by each iteration merging substrings of the input stream based on a difference between multiple demand pattern estimates of the previous iteration and the substrings of the input stream; and
determining a demand prediction based on a difference between a demand query and each of the demand patterns.
It is an advantage that the method determines demand patterns and then determines a demand prediction based on a difference between a demand query and each of the demand patterns. As a result, the method learns patterns from historical data and predicts the demand based on the learned data.
It is a further advantage that the learning is done by merging substrings of the input stream based on a difference between multiple demand patterns and the substrings of the input stream. As a result, the demand patterns represent patterns that occur consistently over a longer time period and are therefore learned by the method.
Merging substrings of the input stream may comprise:
associating each substring of the input stream with one of the demand pattern estimates that has the smallest distance from that substring of the input stream; and updating each demand pattern estimate with a merged substring of the substrings of the input stream associated with that demand pattern estimate.
It is an advantage that the substrings are associated with the closest demand pattern estimates and that the demand pattern estimates are updated with a merged substring of the associated substrings. As a result, the demand pattern is refined by taking into account substrings that are close to the demand pattern.
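Read this way, the merging step resembles a k-means-style clustering of fixed-length substrings: each substring is associated with its nearest demand pattern estimate, and each estimate is then updated with the element-wise mean of its associated substrings. The following Python sketch illustrates that reading under these assumptions (squared Euclidean distance, averaging as the merge); it is not the exact procedure of the claimed method.

```python
import random

def learn_patterns(stream, length=3, k=2, iterations=10, seed=0):
    """Iteratively merge substrings of the input stream into k demand patterns."""
    # All substrings of a predetermined length.
    subs = [stream[i:i + length] for i in range(len(stream) - length + 1)]
    # Randomly selected substrings initialise the demand pattern estimates.
    estimates = random.Random(seed).sample(subs, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for s in subs:
            # Associate the substring with the closest estimate.
            closest = min(range(k), key=lambda j: sum(
                (a - b) ** 2 for a, b in zip(estimates[j], s)))
            groups[closest].append(s)
        for j, group in enumerate(groups):
            if group:
                # Update the estimate with the merged (averaged) substrings.
                estimates[j] = [sum(col) / len(group) for col in zip(*group)]
    return estimates

demand = [10, 20, 30, 10, 20, 30, 10, 20, 30]
for pattern in learn_patterns(demand):
    print([round(v, 1) for v in pattern])
```

With a demand stream that repeats every three steps, the estimates converge towards the recurring windows, which is the sense in which consistently occurring patterns are learned.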
Determining the demand prediction may comprise:
determining the demand pattern that has the smallest distance from the demand query; and
determining a demand prediction based on the determined demand pattern.
It is an advantage that the demand pattern with the smallest distance is determined for determining a demand prediction. As a result, the closest demand pattern is determined and the demand prediction is accurate.
The demand prediction may be the most recent value of the determined demand pattern.
It is an advantage that the most recent value of the pattern is the prediction value. As a result, the remaining values are used as input value for finding the closest demand pattern and the single output value gives the prediction.
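A minimal sketch of this split, assuming squared Euclidean distance on the input values (the distance measure is an assumption here): all but the most recent value of each demand pattern are matched against the query, and the most recent value of the closest pattern is returned as the prediction.

```python
def predict(patterns, query):
    """Return the most recent value of the demand pattern closest to the query."""
    def input_distance(pattern):
        # Compare all values except the most recent one against the query.
        return sum((a - b) ** 2 for a, b in zip(pattern[:-1], query))
    return min(patterns, key=input_distance)[-1]

# Illustrative demand patterns of length three (input, input, output).
patterns = [[17.56, 25.02, 20.63],
            [25.02, 20.63, 26.98],
            [20.63, 26.98, 29.37]]
print(predict(patterns, [20.0, 27.0]))  # → 29.37
```

The query [20.0, 27.0] is closest to the inputs [20.63, 26.98], so that pattern's most recent value, 29.37, becomes the prediction.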
The demand pattern estimates may be determined by randomly selecting substrings of the input stream.

It is an advantage that the substrings are selected randomly. As a result, the demand pattern estimates are determined in an efficient manner. It is a further advantage that these randomly selected substrings represent an initialisation for the demand pattern estimates.

The input stream may be based on monitoring data.
It is an advantage that monitoring data forms the basis of the prediction. As a result, the prediction learns patterns in the monitoring data which is a good indication of the load of the system.
The monitoring data may be one or more of:
computing time;
number of API calls;
amount of used memory;
amount of sent or received data;
number of messages sent or received; and
amount of bandwidth used.
It is an advantage that multiple different values of the monitoring data may be used. As a result, the method is flexible to be adapted to different scenarios and may also combine several different monitoring data.
The input stream may be based on human hints and a model of the monitoring data.
It is an advantage that the input stream is based on human hints and a model of the monitoring data. As a result, human hints can be incorporated into the learning of demand patterns, which allows the method to incorporate the knowledge of a user about events that will take place in the future and are not discernible from the monitoring data. The model may be one or more of:
linear regression;
autoregressive integrated moving average; and
neural network.

The input stream may be based on an internet traffic model.
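As a sketch of the first listed option, linear regression: fit demand ≈ a·t + b to the historical values by least squares and extrapolate one step ahead to produce model-based values for the input stream. This is a minimal standard-library illustration; ARIMA or a neural network would replace the fitting step.

```python
def linear_forecast(history):
    """Least-squares linear fit over the history, extrapolated one step ahead."""
    n = len(history)
    ts = range(n)
    t_mean = sum(ts) / n
    y_mean = sum(history) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, history))
             / sum((t - t_mean) ** 2 for t in ts))
    intercept = y_mean - slope * t_mean
    # Forecast for the next time step t = n.
    return slope * n + intercept

print(linear_forecast([10.0, 12.0, 14.0, 16.0]))  # → 18.0
```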
It is an advantage that the method also considers the input from an internet traffic model. As a result, the method can also predict the traffic demand which enables the automatic deployment of network bandwidth.
The computer implemented method may further comprise the steps of:
receiving or accessing the historical demand data from multiple different sources;
translating the historical demand data into a common data format; and determining the input stream based on the translated historical demand data.
It is an advantage that the method translates the historical demand data into a common format. As a result, the demand can be predicted for a wide range of different input data formats.
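The translation step can be sketched as follows, with invented field names: records from different sources arrive with source-specific keys and scales, which are mapped onto a common format before the input stream is determined. The alias table and the fraction-to-percent heuristic are illustrative assumptions, not part of the patent.

```python
def translate(record: dict) -> dict:
    """Translate a source-specific monitoring record into a common format."""
    # Map of known source-specific keys to the common key (illustrative).
    aliases = {"cpu": "cpu_percent", "cpu_util": "cpu_percent",
               "CPUUtilization": "cpu_percent"}
    common = {aliases.get(key, key): value for key, value in record.items()}
    # Heuristic: normalise a fractional utilisation (0..1) to percent.
    if "cpu_percent" in common and common["cpu_percent"] <= 1.0:
        common["cpu_percent"] = common["cpu_percent"] * 100.0
    return common

print(translate({"cpu": 0.25}))
print(translate({"CPUUtilization": 64}))
```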
The substrings may be of a predetermined length.
It is an advantage that the substrings are of a predetermined length. As a result, the merging of the substrings and the determination of the prediction is efficient.

The computer implemented method may further comprise determining multiple resources that together meet the determined demand prediction.
It is an advantage that the method determines multiple resources that meet the demand. As a result, the method provides information about the required resources and a user can employ or a system can automatically employ the determined multiple resources such that the demand is met.
The multiple resources may be a combination of resources of multiple different types.

It is an advantage that the method determines resources of different types. As a result, the method has a higher degree of freedom to determine the resources optimally.
The combination of resources of multiple different types may be based on a pricing of the resources.
It is an advantage that the method is based on the pricing of the resources. As a result, the method not only determines resources that meet the demand but also determines resources that meet the demand at a minimal cost.

The pricing of the resources may be based on usage time or processor cycles.
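The pricing-based combination can be sketched as a small search: among combinations of instance types, pick the cheapest one whose total capacity meets the predicted demand. The type names, capacities and hourly prices below are invented for illustration.

```python
from itertools import product

# Illustrative instance types: (capacity units, price per hour).
TYPES = {"small": (10, 0.05), "large": (45, 0.20)}

def cheapest_combination(demand, max_per_type=10):
    """Return (price, counts) for the cheapest combination meeting the demand."""
    best = None
    for counts in product(range(max_per_type + 1), repeat=len(TYPES)):
        capacity = sum(n * c for n, (c, _) in zip(counts, TYPES.values()))
        if capacity < demand:
            continue  # this combination does not meet the predicted demand
        price = sum(n * p for n, (_, p) in zip(counts, TYPES.values()))
        if best is None or price < best[0]:
            best = (price, dict(zip(TYPES, counts)))
    return best

print(cheapest_combination(100))
```

For a predicted demand of 100 units, one small and two large instances (capacity 100) are cheaper than ten small ones, so the search returns that mix.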
It is an advantage that usage time and processor cycles are readily available from many cloud providers. As a result, the pricing can be determined efficiently and accurately.

The computer implemented method may further comprise determining a difference between the determined multiple resources and currently employed multiple resources.
It is an advantage that the method determines the difference between required resources and currently deployed resources. As a result, a user gets information on how to reconfigure the current deployment of resources, such as by adding or removing resources or by changing the type of some of the resources.
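The difference step can be sketched as a per-type comparison between the determined resources and the currently employed ones; the resulting counts suggest which resources to add or remove. The dictionary shape is an illustrative assumption.

```python
def scaling_delta(required: dict, current: dict) -> dict:
    """Positive counts mean resources to acquire, negative counts to release."""
    types = set(required) | set(current)
    return {t: required.get(t, 0) - current.get(t, 0) for t in types}

print(scaling_delta({"small": 1, "large": 2}, {"small": 3}))
```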
The demand may be a demand for Internet traffic.
It is an advantage that the Internet traffic is a demand. As a result, the Internet traffic is predicted and the resources can be employed to meet that demand. The method is therefore useful not only in computationally intensive applications but also in data intensive applications, such as when starting a marketing project involving a downloadable video.

In a second aspect there is provided software that when installed on a computer causes the computer to perform the method for predicting demand of a computer system.
In a third aspect there is provided a prediction apparatus for predicting demand of a computer system, the apparatus comprising:
an input port to receive or access an input stream of historical demand data; a processor
to iteratively determine plural demand patterns based on the input stream by each iteration merging substrings of the input stream based on a difference between multiple demand pattern estimates of a previous iteration and the substrings of the input stream, and
to determine a demand prediction based on a difference between a demand query and each of the demand patterns; and
an effector to configure the computer system based on the determined demand.

Optional features described for any aspect, where appropriate, similarly apply to the other aspects also described here.
Brief Description of Drawings
Fig. 1 illustrates a server-based web-hosting system.
Fig. 2 illustrates a cloud-based web hosting system.
Examples will be described with reference to the following figures in which:
Fig. 3 illustrates a computer network.
Fig. 4 illustrates the Monitoring Server.
Fig. 5 illustrates the predictive analytics engine.
Fig. 6 illustrates the scaling effector.
Fig. 7 illustrates a computer implemented method for predicting demand of a computer system.
Fig. 8 shows a plot of data points.
Fig. 9 illustrates a first example of a set of substrings created from the input data streams in Fig. 8.
Fig. 10 illustrates a second example of a set of substrings created from the input data streams in Fig. 8.
Best Mode for Carrying Out the Invention
Fig. 3 illustrates a computer network 300 comprising a computer system of a remote site, such as computing cloud 302, connected to a prediction apparatus on a local site, such as server 310, via a data communication network, such as the Internet.
The computing cloud 302 comprises a cloud provider application programming interface (API) and controller 304, a number of cloud virtual machines 306 on which an application is running and a monitoring client 308. The cloud virtual machines 306 and the monitoring client 308 are connected via a cloud virtual network (not shown).
In one example, the server 310 comprises a memory 312 including program memory 314 and data memory 316. In a different example, a program, such as software, and data are stored on the same memory. The server 310 further comprises a processor 318 that implements the functionalities of a monitoring server 320, a predictive analytics engine 322 and a scaling effector 324. In another example, the server 310 has more than one processor, including physical and virtual processors.
The server 310 further comprises an input port 326, such as a network interface, to receive an input stream of historical demand data from the cloud 302. The input port 326 may also be a port of processor 318 so that processor 318 accesses the input stream of historical demand data from data memory 316. Software installed on program memory 314 causes the processor 318 to perform the method of Fig. 7.
It is to be understood that any kind of data port may be used to receive data, such as a network connection, a memory interface, a pin of the chip package of processor 318, or logical ports, such as IP sockets or parameters of functions stored on program memory 314 and executed by processor 318. For example, the predictive analytics engine 322 may receive the input stream as a return value after calling a function of the monitoring server 320. These parameters may also be handled by-value or by-reference in the source code. The processor 318 may receive data through all these interfaces, which includes memory access of volatile memory, such as cache or RAM, or non-volatile memory, such as an optical disk drive, hard disk drive, storage server or cloud storage. The computer system may further be implemented within a cloud computing environment, such as a managed group of interconnected servers hosting a dynamic number of virtual machines.
The server 310 is controlled by an administrator 340 using display device 342 and input device 344. The scaling effector 324 communicates with the cloud provider API and controller 304 via the Internet to control the deployment of resources, such as the number of application nodes, such as cloud virtual machines 306, to perform the application. The predictive analytics engine 322 communicates with monitoring client 308 to receive demand data from the cloud.
When in use, the administrator 340 purchases resources of the computing cloud at a cost to implement an application or web site, such as an online shop. In another example, the computer network 300 comprises more than one computing clouds 302 and the administrator 340 purchases resources from more than one computing cloud.
The local site 310 is where the processing of demand data takes place; it is also where scaling decisions are made and actions are triggered to accommodate the scaling decision. On the other hand, the remote site 302 is where the cloud provider resides and where demand data is collected and transferred back to the local site. In a hybrid cloud scenario, there can be one or more remote sites 302; however, the description below illustrates examples with only one remote site (or one cloud provider) for simplicity of explanation. The invention can nevertheless operate with multiple cloud providers at the same time.
In one example, the network 300 can be used by enterprise organisations wishing to deploy applications into the cloud, specifically with focus on e-commerce applications that are deployed by providers for infrastructure as a service (IaaS). In other examples, other application types such as platform as a service (PaaS) are used.
Based on the resource requirement of the application 306 (which is expected to change over time), the predictive analytics engine 322 will adapt to the changing needs by instructing the scaling effector 324 to acquire more (or less) resources from the cloud providers.
The specific resources that the scaling effector 324 acquires from the cloud provider are virtual instances with the application 306 deployed on them. These resources are more generally referred to as nodes. By dynamically provisioning and deprovisioning resources, the need for the consumer to perform manual capacity planning is removed.
The remote cloud site 302 is connected to the local site 310 through a high speed internet connection such as an ADSL or a broadband connection. The exact throughput requirement is dependent on the type and amount of demand data, such as monitoring information from the monitoring client 308. For example, monitoring at a high frequency (e.g., in the scale of seconds) can have a significant impact on the amount of data that needs to be sent through the internet connection, hence a higher throughput is required. Similarly, monitoring of a large number of applications/nodes can also increase the amount of data. Typically, a connection with a throughput of greater than 2 MB/s should be sufficient for monitoring tens of nodes 306 at a frequency of once per minute.
There are two units on the remote cloud side connecting to the local site via the internet connection, namely the monitoring client 308 and the cloud provider API and controller 304. The monitoring client 308 is a program that operates on a virtual machine and is responsible for collecting monitoring information from individual application nodes 306 in the remote cloud site 302. The collection of this information is via the virtual cloud network. That is, monitoring information is transferred within the remote cloud site 302 from individual application nodes 306 to the monitoring client 308. Each application node 306 contains a data collection agent which is a piece of software that collects system information such as CPU utilisation, computing time, number of API calls, number of messages sent or received, amount of bandwidth used, memory usage, network data throughput, etc. The agent can be implemented in different ways ranging from a simple script that executes specific system commands, such as the "top" command of the Linux operating system, to commercial/open-source monitoring tools such as Nagios.
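A data collection agent of the kind described above can be sketched as follows: it takes a periodic 'snapshot' of system information and emits it for the monitoring client. The metrics shown (load average as a stand-in for CPU utilisation) and field names are illustrative; a real agent would parse the output of commands such as "top" or rely on a tool such as Nagios.

```python
import json
import os
import time

def snapshot(node_id: str) -> str:
    """Take a snapshot of system information and return it as JSON."""
    info = {"node": node_id, "timestamp": time.time()}
    try:
        # os.getloadavg is only available on Unix-like systems.
        info["load_1m"], info["load_5m"], info["load_15m"] = os.getloadavg()
    except (AttributeError, OSError):
        info["load_1m"] = None  # metric unavailable on this platform
    return json.dumps(info)

print(snapshot("node-306-a"))
```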
The collected data is in semi-structured or structured formats, such as JSON, XML and OWL (but not restricted to these formats). Each agent in an application node 306 collects data in a periodic manner based on a timer set by the monitoring client. The default value is 30 seconds, meaning that a 'snapshot' of the system information is taken every 30 seconds and transferred to the monitoring client 308. The monitoring client 308 then combines the monitoring data from different application nodes 306 into a single data file (again, in semi-structured or structured formats) and transfers the data file to the local site 310.
It should be noted that the number of application nodes 306 running on the cloud virtual machines is not fixed but can change over time through scaling actions. To address this, there needs to be a subscription mechanism in place to ensure that every application node 306 is properly subscribed to the monitoring client 308 upon creation (i.e., monitoring client 308 will start receiving monitoring information from the application node 306) and unsubscribed upon deletion (i.e., monitoring client 308 will stop receiving monitoring information from the application node 306).
The subscription mechanism can be implemented by maintaining a list at the monitoring client 308 of live agents in application nodes. In one example for updating this list, the agent in each application node 306 is aware of the location (i.e., IP address or host name) of the monitoring client 308 and is added/deleted from the list upon creation/deletion of the application node 306. In a different example, the monitoring client 308 calls the provider's API 304 to determine the identifiers of live agents/application nodes. This API call is usually supported by cloud providers.
The cloud provider API and controller 304 is a large piece of software that is provided by the computing cloud 302. The server 310 interacts with this component through API calls only and expects that the controller will behave correctly and consistently over time. For example, if the server 310 issues an API call to the computing cloud 302 to request an additional virtual instance, the server 310 expects that a virtual instance is created after a certain time period.
To ensure this is, in fact, the case, the server 310 implements a timer which will elapse after a specified time period (e.g., 5 minutes). Once the timer expires, an alarm will be raised and an exception is thrown. The exception will be logged and serves as a trace for the system.
It is important to note that API calls to the computing cloud 302 do not always produce the same behaviour each time. For example, an API call to create an instance does not always succeed and even if it is successful it may take a long time to start an instance. It is unlikely that multiple API calls will have exactly the same spin-up times.
This is shown in experiments where variation in spin-up times in Amazon's EC2 was observed - occasionally with extreme outliers which take more than 5 minutes to start an instance. A similar behaviour was observed for spin-down times.
The local site 310 is composed of three main components: Monitoring server 320 that collects monitoring data from the remote site; Predictive Analytics Engine 322 that performs the data processing and returns the decision in terms of scaling actions; and Scaling effector 324 that interprets the scaling actions from the analytics engine and translates the actions into API calls of the specific provider.
In one example, the Predictive Analytics Engine 322 determines multiple resources that together meet the predicted demand. In one example, the predicted demand is 200 Elastic Compute Units (ECUs) and with a maximal load on the virtual machines of 50%, the Predictive Analytics Engine 322 determines 400 virtual machines if each machine provides one ECU. In a different example, the predictive analytics engine 322 determines a combination of resources of different types. The remote site may offer virtual machines with 1, 2 or 5 ECUs per virtual machine. The predictive analytics engine 322 then determines an optimal combination of virtual machines. In some examples, the pricing for the different types of machines is taken into account such that the optimal combination of types of virtual machine minimises the overall cost.
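As an illustrative sketch only (not part of the disclosed implementation), the sizing and combination logic described above might look as follows. The function names, and the dynamic programme over integer ECU targets, are assumptions for illustration:

```python
import math

def instances_needed(predicted_demand_ecu, max_load_fraction, ecu_per_vm):
    # Number of identical VMs so that each runs at or below max_load_fraction.
    required_capacity = predicted_demand_ecu / max_load_fraction
    return math.ceil(required_capacity / ecu_per_vm)

def cheapest_mix(required_ecu, vm_types):
    # Smallest-cost combination of VM types (ecu, hourly_cost) covering required_ecu.
    # Simple dynamic programme over integer ECU targets; best[t] = (cost, mix).
    best = {0: (0.0, {})}
    for target in range(1, required_ecu + 1):
        candidates = []
        for ecu, cost in vm_types:
            prev = best[max(0, target - ecu)]
            mix = dict(prev[1])
            mix[ecu] = mix.get(ecu, 0) + 1
            candidates.append((prev[0] + cost, mix))
        best[target] = min(candidates, key=lambda c: c[0])
    return best[required_ecu]
```

For the figures above, 200 ECUs of predicted demand at a maximal load of 50% and one ECU per machine yields 400 instances.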
In one example, there are two main network connections with the remote site 302. One is to maintain synchronisation of the monitoring information to the local site, and the other to execute cloud API commands to control the creation and deletion of virtual instances 306 on the remote site. Note that the monitoring server 320, the predictive analytics engine 322 and the scaling effector 324 in Fig. 3 are purely logical; they may
all be residing in a single machine or in multiple machines. Having them reside in a single machine would minimise the communication overhead between these components. The following part of the description describes in some detail each of the three components on the local site.
Fig. 4 illustrates the Monitoring Server 320: The Monitoring Server 320 acts as the main component that operates with the monitoring client 308 on the remote side. It consists of a monitoring data receiver 402 that directly interacts with the remote side monitoring client 308. It is responsible for receiving inputs from monitoring clients 308 of multiple cloud providers. There are two modes of operation for the monitoring server: push or pull. In the 'push' mode, the monitoring server acts as a listener to multiple cloud providers. The listener is a piece of program that waits for data requests coming from individual monitoring clients 308. Upon receiving data requests, individual monitoring clients 'push' data to the monitoring server where it is buffered and passed on to the data translation unit.
The pull method is similar to the push method but with the exception that information is polled periodically from the monitoring client 308. The monitoring server 320 issues a poll periodically. The frequency of the poll is defined in the monitoring server 320 and could be changed to accommodate the needs of different cloud consumers (e.g., by adjusting the granularity of the monitoring data). An example value of the polling period is 30 seconds. The reason for choosing this value is to ensure that the monitoring server 320 is capable of keeping up with the monitoring data workload (that is, the data size will not grow too rapidly and flood the database server) without compromising too much on the granularity. The default value for Amazon EC2 is 60 seconds, hence polling in 30 second intervals is sufficient for many applications.
It should be noted that the format of the information is kept in its original form (i.e., no processing is done on the remote site 302) until it reaches the monitoring server 320 where there is a common translation module 404 to convert the monitoring data into a supported format. The purpose of the translation module 404 is to resolve
discrepancies that could arise from different types of monitoring clients, and to translate the data into a common data format.
The translation procedure takes two arguments: the data in its original form and a set of business rules. The business rules describe details of the translation from individual elements in the original format to elements of the target format. Typically, the business rules are coded by hand once and incremental rules are added in case the translation process is changed. It is also possible to generate these mappings automatically based on the syntax of the two formats.
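A minimal sketch of such a rule-driven translation, assuming hypothetical field names and a simple source-to-target element mapping (the description does not fix a schema or a rule syntax):

```python
# Hypothetical business rules: source element -> target element.
BUSINESS_RULES = {
    "cpu_util": "cpu_percent",
    "mem_used": "memory_bytes",
    "ts": "timestamp",
}

def translate(record, rules):
    # Translate a monitoring record from its original form into the common format.
    # Elements with no matching rule are dropped, which is where information loss can occur.
    return {target: record[source] for source, target in rules.items() if source in record}
```

For example, translating a record containing a vendor-specific extra field keeps only the mapped elements.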
The translation process can sometimes lead to the loss of information. Once the data is translated, the original data will be dropped and will not be recoverable. The translated data is directly stored into a database server 406 to be read by other components. In one example, the database server 406 is implemented as a relational database solution with a schema defined to reflect the structure of the data. In a different example, it can be implemented with NoSQL database solutions, such as a key-value store or an RDF database. The former is likely to be more scalable but has the disadvantage of not being able to support the ACID properties. The latter is more suitable for graph-like data. Furthermore, the monitoring server 320 is responsible for receiving two other sources of information directly from the user 340, namely a human hints receiver 408 and an internet traffic model receiver 410. In one example, the monitoring server 320 receives the information from the user via a user interface presented to the user. In a different example, the monitoring server 320 receives the information via a programming interface or service interface, such as Representational State Transfer (REST). Unlike information from the monitoring data receiver, these two receivers do not receive information in a periodic manner. Instead, they receive information when certain events happen (such as a manager's decision to initiate a marketing campaign), which is often irregular and unpredictable. The input format for the human hints receiver 408 may be in the form of a text file in comma-delimited format with the first column indicating the time, and one or more columns indicating the values of the resource needed at the indicated time in that row.
Other formats are also suitable for representing human hints, such as XML and OWL. It depends on the type of human hints and the expressivity needed to represent them. For example, a function of time is more easily represented in XML than in a comma-delimited format. The input format for the internet traffic model receiver 410 can be represented in a similar way as the human hints. Internet traffic models are usually in the form of a function, which is more easily represented in XML as mentioned earlier. Fig. 5 illustrates the predictive analytics engine 322 in more detail. This module is the core of the engine where most of the processing takes place. It consists of four main units: a database query client 502, a data formatting module 504, a learning module 506 and a prediction module 508. The predictive analytics engine 322 executes in a periodic manner. On each execution the database query client executes a query to the monitoring server 320 for a window of the most up-to-date data. The size of this window can be defined by the user or adjusted automatically based on the historical data, such as by choosing a window size that yields the most predictive power in terms of accuracy. To reduce processing time, the server 310 limits the size of the dataset, otherwise the learning module 506 can become a performance bottleneck in the system. The exact size limit depends on the learning algorithm used, as well as the environment in which the algorithm is executed.
In one example, the algorithm is executed on Amazon EC2 with a medium size instance (2x EC2 Compute Units and 3.75GB memory) for running a neural network algorithm. The learning algorithm completes within a couple of seconds for a window size of 3 with roughly 300 entries of history data. It reaches its limit when the window size is around 10 with more than 2000 entries of history data. This may take up to half an hour to complete. Other learning algorithms would have different limitations. In cases where the dataset is large, other processing frameworks can be adopted to speed up the processing/learning. In one example, a parallelised/distributed framework is adopted to utilise the processing power of multiple processors/machines. The data is extracted from the database and fed into a formatting module 504. The formatting module 504 is responsible for converting the dataset into a format that can be used as input to the learning module. In one example, this is in some sort of vector format. The vector format can be stored in memory if the dataset is of reasonable size, otherwise it can be dumped into a file (e.g., in comma-delimited format). Upon successfully formatting the data, the formatted data is transferred to the learning module 506 for further processing.
In another example, the data from the monitoring server 320 comprises data from multiple different data sources, such as the monitoring data translation module 404, the human hints receiver 408 or the internet traffic model receiver 410 of Fig. 4. In this example, the data is stored on database server 406 in Fig. 4 as multiple data sets, one for each data source 404, 408 and 410, and the formatting module 504 also aggregates the data into a single data set. Therefore, the formatting module 504 is also referred to as aggregator or aggregator module. In a different example, the data is aggregated in monitoring server 320 and stored on database server 406 as one single data set. The learning module 506 analyses the formatted dataset from the formatting module 504 and generates a data stream of historical demand data, which is a generalisation of the given dataset. Once this data stream is created, the prediction module 508 uses it to make a prediction from the current values. The predicted value represents the amount of resources needed; it is then converted into the number of instances needed.
Fig. 6 illustrates the scaling effector 324 in more detail. The scaling effector 324 is the component that interprets the result of the decision making engine 322 and causes an adjustment in the level of available resources (i.e., number of operating virtual machines) on the remote site in the cloud 302. Based on the predicted number of required instances, the scaling effector 324 determines the number of instances 306 to be added or deleted.
One way to achieve this is by comparing the number of currently live instances with the desired number. If the desired number is higher, this unit will issue a scaling action to increase the number of instances; if the desired number is lower, a scaling action to decrease the number of instances is issued; otherwise nothing needs to be done.
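The comparison described above reduces to a small decision function. A sketch, with names assumed for illustration, returning the scaling action the effector would issue:

```python
def scaling_action(live_instances, desired_instances):
    # Compare currently live instances with the desired number and
    # return the scaling action ("add", "remove" or "none") plus a count.
    delta = desired_instances - live_instances
    if delta > 0:
        return ("add", delta)
    if delta < 0:
        return ("remove", -delta)
    return ("none", 0)
```

For example, with 7 live instances and a desired count of 10, the action is to add 3 instances.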
A hybrid cloud command generator 602 converts the scaling actions into API calls of the specific cloud platform. The hybrid cloud command generator 602 maintains internally a mapping for each supported cloud provider and each of these mappings maps a specific scaling action into the corresponding commands for the platform. Typically, these mappings are previously constructed by the user manually. The commands are sent to the cloud provider API and controller via common cloud interface 604.
Fig. 7 illustrates a computer implemented method 700 performed by prediction module 508 for predicting demand of a computer system. The method 700 is initialised with a substring length n, a number of patterns p to create and a number of iterations m which determines its convergence condition.
The method 700 commences by receiving 702 one or more input data streams. In this example, the method 700 receives two data streams, such as monitoring data from the translation module 404 and data from the human hints receiver 408, but in other examples any number of input data streams from a variety of sources may be used.
The monitoring data comes from the monitoring module 320 and it is periodically updated based on information collected from each of the instances 306. For convenience, we use the monitoring data below in this example:
Time Samples Average Sum Minimum Maximum Unit
2010-08-14 14:52:00 1.0 17.56 17.56 17.56 17.56 Percent
2010-08-14 14:53:00 1.0 25.02 25.02 25.02 25.02 Percent
2010-08-14 14:54:00 1.0 20.63 20.63 20.63 20.63 Percent
2010-08-14 14:55:00 1.0 26.98 26.98 26.98 26.98 Percent
2010-08-14 14:56:00 1.0 29.37 29.37 29.37 29.37 Percent
In this example, the computing cloud offers multiple different instance types, such as a small instance and a large instance. The data above shows that at time 2010-08-14 14:52:00, the average CPU load is 17.56% of the allocated instance type. In this specific case, an EC2 small instance is used which has one Elastic Compute Unit (lx ECU). The processing power of one ECU roughly corresponds to 1.2GHz Xeon. The units of the last column of the Table above can be safely ignored since the same unit is used throughout the whole example. In the case where the units vary, for example, if different instance types are used then it is necessary that we normalise the units before applying the algorithm.
The human hint information is a more reliable source of information given directly by human experts who have substantial knowledge about the domain of interest. In this example, we use the form of human hint mentioned earlier, where we assume a function f(t) to be the resource consumption over a period of time. This function can be specified manually by the human expert and it is intended to be a baseline for
specifying the estimated workload. Alternatively, this function can be deduced from a set of data like the one given above.
Using the set of data from above, we can translate it to a corresponding function f(t) (where 1 ≤ t ≤ 5) as follows. Alternatively, this function can be specified explicitly by the domain expert.
t f(t)
1 17.56
2 25.02
3 20.63
4 26.98
5 29.37
Fig. 8 shows a plot 800 comprising the above data as a first set of data points 802. From this set of data, we can use linear regression to model the monitoring data by determining the function f(t), provided that there exists a linear relationship in the data. In the simple form of linear regression where there is only one variable, we have f(t) = mt + b. The values m and b are constants that we need to determine based on the above data. For this given data set, we have calculated the values m = 2.558 and b = 16.238.
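The constants m and b can be reproduced with an ordinary least-squares fit. The sketch below is illustrative (the function name is assumed) and recovers the quoted values from the five data points:

```python
def fit_linear(ts, ys):
    # Ordinary least-squares fit of f(t) = m*t + b to the points (ts, ys).
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    m = sxy / sxx
    b = y_mean - m * t_mean
    return m, b

ts = [1, 2, 3, 4, 5]
ys = [17.56, 25.02, 20.63, 26.98, 29.37]
m, b = fit_linear(ts, ys)  # m ≈ 2.558, b ≈ 16.238
```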
Now, suppose that we have the following form of human hint: "The workload for the coming week will be five times that of usual"
For the form of human hint given above, we expect the workload to be five times that of the 'usual' workload. The term 'usual' here is relative to the baseline function f(t) and so, based on the linear regression above, we expect the estimated workload to be f'(t) = 5 × f(t) = 5 × (mt + b) over a given time interval.
In a different example, a neural network or autoregressive integrated moving average (ARIMA) is used to model the monitoring data and determine a function of the demand data over time. Of course, other models may also be adopted.
Example A
In Example A, we have two data sources going into the aggregator 504 via the database server 406: one from the monitoring data translation module 404 and one from the human hints receiver 408 in Fig. 4. One example for aggregating the data sources is
taking the set union of the two sources, assuming that the two sources of information are compatible with each other.
Before we aggregate the two data sources, we need to preprocess each of the data sources so that they are in the right format for the learning module 506. Also, we need to make sure that the two data sources are in identical formats so that the two data sets can be combined by the data formatting module 504. The first step to preprocess the data is to specify a substring length n. In one example, n is chosen by the domain experts. In another example, the method determines n automatically.
In this example n = 3 is chosen. The preprocessing step involves creating 704 input vectors by looking at 3 consecutive entries in the original input data stream, that is each input vector is a substring of the input data stream of length n = 3. By applying this preprocessing step to the above monitoring data, we obtain v1 = [20.63, 26.98, 29.37], v2 = [25.02, 20.63, 26.98] and v3 = [17.56, 25.02, 20.63].
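The substring creation can be sketched as a sliding window over the data stream; note that the text enumerates the resulting vectors as v1, v2 and v3 in the reverse of window order. The function name is assumed for illustration:

```python
def make_vectors(stream, n):
    # Create all substrings (input vectors) of length n from the input data stream.
    return [stream[i:i + n] for i in range(len(stream) - n + 1)]

stream = [17.56, 25.02, 20.63, 26.98, 29.37]
vectors = make_vectors(stream, 3)
# -> [[17.56, 25.02, 20.63], [25.02, 20.63, 26.98], [20.63, 26.98, 29.37]]
```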
In a different example, a window size of n = 8 is chosen for an online book store application with a specific workload. In some examples, the window size is not static, because the amount of workload is constantly changing which means new "patterns" would emerge. Similarly, the values m and p are very much application specific and change dynamically over time.
Fig. 9 illustrates a set of vectors 900 created from the input data streams in Fig. 8. It is noted that the horizontal location (x-value) of the vectors bears no meaning. It is merely for compactness of the presentation that some vectors are located at the same horizontal locations, that is the same x-values, as others. Vectors v1, v2 and v3 are denoted with reference numerals 901, 902 and 903, respectively.
Note that the first two values of each vector are treated as inputs to the vector and the third value, which is the most recent value, as the output. The learning algorithm will allow us to find a relationship between the inputs and the output, where the output will be the estimated value. In other words, once the prediction module 508 has learned demand patterns, a query vector comprising historical demand data is received and a demand pattern is selected such that the input values of the demand pattern, that is all values except the most recent value, are closest to the historical demand data of the
query vector. The most recent value of the demand pattern is then used to predict the future demand.
Next, we consider the preprocessing steps for human hints information. This depends on the type of the given human hints. Let's suppose we have the instance of human hints given above and the human hint tells us that we have a workload of 5·f(t) for a given duration. We then have the following:
t 5·f(t)
1 93.98
2 106.77
3 119.56
4 132.35
5 145.14
This data is also plotted in Fig. 8 as a second set of data points 804. From this data set, we can deduce v4 = [93.98, 106.77, 119.56], v5 = [106.77, 119.56, 132.35] and v6 = [119.56, 132.35, 145.14]. Vectors v4, v5 and v6 are denoted in Fig. 9 with reference numerals 904, 905 and 906, respectively. The union of the two sets above is:
{v1, v2, v3, v4, v5, v6}.
This aggregated set of data will serve as input to the learning module 506 to yield demand patterns. To start the learning by the learning module 506, we give a value to each of the inputs. Suppose m = 2 and p = 2, and let v1, ..., v6 be the vectors given above.
The method 700 initialises 706 the demand pattern estimates c by setting each of the patterns in {c1, c2, ..., cp} to a random vector in {v1, ..., vn}. As a result, the method determines the demand pattern estimates by randomly selecting substrings of the input stream. Let's suppose v3 is randomly selected for c1 and v5 for c2. The method continues by iterating through each of the vectors in {v1, ..., vn} and computing 708 the distance between each of {v1, ..., vn} and each of {c1, c2}.
Next, the method 700 associates 710 each of {v1, ..., vn} with its closest demand pattern estimate, that is one of {c1, c2}. Here we define dist(x, y) as the weighted distance between the vectors x and y; it is formally defined as follows:
dist(x, y) = (x1 - y1)^2 + ... + (xn - yn)^2
where n is the size of a vector (excluding the output value).
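As stated, the distance sums squared differences over the input components only, with the output value excluded. A sketch of the formula as written (function name assumed); note that the worked tables below appear to apply it to a different ordering of the printed vector components, so not every tabulated value is reproduced by a literal reading:

```python
def dist(x, y):
    # Squared-difference distance over the input components of two vectors,
    # excluding the final (output) value.
    n = len(x) - 1  # vector size excluding the output value
    return sum((x[i] - y[i]) ** 2 for i in range(n))
```

For three-element vectors only the first two components contribute, e.g. dist([1, 2, 99], [4, 6, 50]) = 3^2 + 4^2 = 25.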
The distances between the input vectors and the demand pattern estimates are given in the following table:
Inputs dist(x, y)
(v1, c1) 80.2292
(v2, c1) 59.5946
(v3, c1) 0
(v4, c1) 12063.285
(v5, c1) 16357.9112
(v6, c1) 21306.8738
(v1, c2) 14561.8164
(v2, c2) 16153.589
(v3, c2) 16357.9112
(v4, c2) 327.1682
(v5, c2) 0
(v6, c2) 327.1682
It is easy to see that dist(v1, c1) = 80.2292 < dist(v1, c2) = 14561.8164, so v1 is allocated to c1. Similarly, we can derive the following association where each substring of the input stream is associated with the demand pattern estimate that has the smallest distance from that substring of the input stream:
Vector Associated pattern (or winner)
v1 c1
v2 c1
v3 c1
v4 c2
v5 c2
v6 c2
The next step of method 700 is to update 712 the demand pattern estimates. We create two variables cc1 and cc2 for storing the vectors associated with c1 and c2 respectively:
cc1 = {v1, v2, v3}
cc2 = {v4, v5, v6}
Next, we merge all the vectors associated with a particular demand pattern estimate and form an updated demand pattern estimate. The merge function takes a set of vectors as input and returns a single vector as output. It is formally defined as follows:
merge(v1, ..., vn) = w1·v1 + ... + wn·vn, where the weights w1, ..., wn are constants provided by the user and w1 + ... + wn = 1, that is the sum of all weights is equal to one.
In this example, we assign different weights for human hint information and monitoring information. The former is a more reliable source so we allocate a higher weight to it. Vectors from the same source are allocated with equal weights.
In one example, the weights for each of the vectors are: w1 = w2 = w3 = 0.8/3 and w4 = w5 = w6 = 0.2/3. For cc1, all vectors come from the same source so they are assigned equal weights. Hence, the updated demand pattern estimates are c1 = 0.33*v1 + 0.33*v2 + 0.33*v3 = [25.66, 24.21, 21.07]. Similarly, for cc2, we have c2 = [106.77, 119.56, 132.35]. This is the end of the first iteration. The method 700 determines 714 whether the maximum number of iterations m has been reached. If the maximum number is not reached, the method 700 steps into the next iteration. The next iteration works the same way as before except that we now determine 708 the distances with the updated demand pattern estimates c1, c2. It is easy to see that the association (or winner) will not be changed in the second iteration and so we have cc1 = {v1, v2, v3} and cc2 = {v4, v5, v6} again. At the end of the second iteration, it will break out of the loop and return c1 = [25.66, 24.21, 21.07] (907 in Fig. 9) and c2 = [106.77, 119.56, 132.35] (908 in Fig. 9) as the determined demand patterns.
Let's now suppose that we want to estimate the upcoming workload based on the previous two values, that is, the method 700 receives 716 a demand query vector, for example vq = [60, 80, ?]. The received query vector is also depicted in Fig. 9 with reference numeral 910. Method 700 determines 718 a demand pattern that is closest to the query vector, that is the method finds an index i from the result of the previous steps where vq is closest to a pattern ci. Again, we apply the distance function as defined above and we get dist(c1, vq) = 4291.7597 and dist(c2, vq) = 3752.4265. Hence dist(c2, vq) < dist(c1, vq), so c2 (908 in Fig. 9) is closer to vq (910 in Fig. 9) than c1 (907 in Fig. 9). The prediction module 508 of predictive analytics engine 322 predicts 720 the demand by using the third value (output value) of c2 as the prediction value (i.e., 132.35).
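Example A can be sketched end-to-end as a small k-means-style procedure. This is an illustrative reconstruction rather than the disclosed code: it uses uniform within-cluster weights, takes the distance over the input components, fixes the initial estimates to the two vectors the example happens to draw, and lists the six vectors in window order. Under those assumptions it arrives at the same winning pattern c2 = [106.77, 119.56, 132.35] and the same predicted demand of 132.35:

```python
def dist(x, y):
    # Squared distance over the input components (all values except the output).
    return sum((a - b) ** 2 for a, b in zip(x[:-1], y[:-1]))

def merge(vectors):
    # Uniform-weight merge; the description allows user-supplied weights summing to one.
    w = 1 / len(vectors)
    return [sum(w * v[i] for v in vectors) for i in range(len(vectors[0]))]

def learn_patterns(vs, init, m):
    # Iteratively refine the demand pattern estimates for m iterations:
    # associate each vector with its closest estimate, then merge each cluster.
    cs = [list(c) for c in init]
    for _ in range(m):
        clusters = [[] for _ in cs]
        for v in vs:
            winner = min(range(len(cs)), key=lambda i: dist(v, cs[i]))
            clusters[winner].append(v)
        cs = [merge(cluster) if cluster else c for cluster, c in zip(clusters, cs)]
    return cs

def predict(cs, vq):
    # Output value of the pattern whose inputs are closest to the query vector.
    return min(cs, key=lambda c: dist(vq, c))[-1]

# The six vectors of Example A, in window order (the text's v3, v2, v1, v4, v5, v6).
vs = [[17.56, 25.02, 20.63], [25.02, 20.63, 26.98], [20.63, 26.98, 29.37],
      [93.98, 106.77, 119.56], [106.77, 119.56, 132.35], [119.56, 132.35, 145.14]]
patterns = learn_patterns(vs, init=[vs[0], vs[4]], m=2)
demand = predict(patterns, [60, 80, None])  # 132.35, as in the example
```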
Example B
In example B, each data source goes into a different learning procedure. The advantage of this approach is that the separation between the processing of the two data sources is clear, and so the processing can be performed in parallel. Assuming that we have two sets of input data {v1, v2, v3} and {v4, v5, v6} as shown in example A, we feed each of these sets into its corresponding learning module 506.
Fig. 10 illustrates the input vectors 901 to 906 of Fig. 9. Note that, the learning modules for the two data sets do not have to be the same as long as each learning module at the end returns one or more demand patterns based on the corresponding input data.
To simplify the scenario, we will use the same learning algorithm as in example A for the two data sources. We will repeat the procedure as shown in example A for each of the two data sources. We assume for both scenarios that we set m (number of iterations) and p (number of patterns) to be 2.
The method 700 commences as in example A by receiving 702 the input data stream and creating 704 input vectors. However, the two input streams are not aggregated in this example. Estimation from Monitoring Information: For the monitoring information data source (i.e., {v1, v2, v3}), the learning module 506 initialises the demand patterns by first allocating two random vectors to c1, c2. Suppose we have the following random initialisation selection: c1 = v1 and c2 = v3. The learning module
506 then determines 708 the distances as in example A:
Inputs dist(x, y)
(v1, c1) 0
(v2, c1) 46.0346
(v3, c1) 80.2292
(v1, c2) 80.2292
(v2, c2) 59.5946
(v3, c2) 0
Hence, the method 700 associates 710 the following vectors to the demand pattern estimates:
cc1 = {v1, v2}
cc2 = {v3}
Updating 712 the demand pattern estimates by applying the merge function to each of cc1 and cc2 yields:
c1 = [28.175, 23.805, 22.825] (1020 in Fig. 10)
c2 = [20.63, 25.02, 17.56] (1022 in Fig. 10)
Again, the second iteration will not change the associations, so c1 and c2 are the final demand patterns. It is noted that in this example, the decision 714 to perform another iteration may be based on the maximum number of iterations or on the values of the demand pattern estimates. If the demand pattern estimates do not change, or do not change by more than a predetermined threshold from one iteration to the next, the method proceeds with step 716, that is receiving 716 the query vector vq 910.
The method determines 718 that c1 (1020) is closer to vq than c2 (1022). Based on the monitoring information only, we determine a monitoring demand prediction of 22.825.
Estimation from Human Hint Information: For human hint information, we have v4, v5, v6 as the input vectors 904, 905 and 906, respectively. The method 700 initialises the demand pattern estimates 706 by randomly associating two of these vectors to c1 and c2, say c1 = v4 and c2 = v5, and then determines 708 the distance table below: Inputs dist(x, y)
(v4, c1) 0
(v5, c1) 327.1682
(v6, c1) 1308.6728
(v4, c2) 327.1682
(v5, c2) 0
(v6, c2) 327.1682
The method 700 associates the following vectors to demand pattern estimates:
cc1 = {v4}
cc2 = {v5, v6}
Updating 712 the demand pattern estimates by applying the merge function to each of cc1 and cc2 yields:
c1 = [93.98, 106.77, 119.56] (1024 in Fig. 10)
c2 = [113.165, 125.955, 138.745] (1026 in Fig. 10)
Since the allocation of vectors did not change in the second iteration, this is the result for the demand patterns. Comparing the distance of vq against c1 and c2, the method determines 718 that c1 (1024 in Fig. 10) is closer to vq than c2 (1026 in Fig. 10), so the human hint demand prediction is 119.56. Method 700 now predicts 720 the overall demand based on the demand predictions from the two learning modules. In one example, method 700 uses the average of the two predictions, resulting in an overall predicted demand of (22.825 + 119.56)/2 = 71.19. In the examples above, the prediction is based on the last value of the closest patterns. In other examples, this function is substituted with different functions depending on the intended domain and input data set.
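The final combination step of Example B can be sketched as a weighted average of the per-source predictions, uniform by default; the function name is assumed for illustration:

```python
def combine_predictions(predictions, weights=None):
    # Combine per-source demand predictions; the default is the plain average.
    if weights is None:
        weights = [1 / len(predictions)] * len(predictions)
    return sum(w * p for w, p in zip(weights, predictions))

overall = combine_predictions([22.825, 119.56])  # 71.1925, as in Example B
```

Non-uniform weights would let the more reliable human hint source dominate, mirroring the weighting used in the merge function.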
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the specific embodiments without departing from the scope as defined in the claims.
It should be understood that the techniques of the present disclosure might be implemented using a variety of technologies. For example, the methods described herein may be implemented by a series of computer executable instructions residing on a suitable computer readable medium. Suitable computer readable media may include volatile (e.g. RAM) and/or non-volatile (e.g. ROM, disk) memory, carrier waves and transmission media. Exemplary carrier waves may take the form of electrical, electromagnetic or optical signals conveying digital data streams along a local network or a publicly accessible network such as the internet.
It should also be understood that, unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as "estimating" or "processing" or "computing" or "calculating", "optimizing" or "determining" or "displaying" or "maximising" or the like, refer to the action and processes of a computer system, or similar electronic computing device, that
processes and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
Claims
1. A computer implemented method for predicting demand of a computer system, the method comprising:
iteratively determining plural demand patterns based on an input stream of historical demand data by each iteration merging substrings of the input stream based on a difference between multiple demand pattern estimates of the previous iteration and the substrings of the input stream; and
determining a demand prediction based on a difference between a demand query and each of the demand patterns.
2. The computer implemented method of claim 1, wherein merging substrings of the input stream comprises:
associating each substring of the input stream with one of the demand pattern estimates that has the smallest distance from that substring of the input stream; and
updating each demand pattern estimate with a merged substring of the substrings of the input stream associated with that demand pattern estimate.
3. The computer implemented method of claim 1 or 2, wherein determining the demand prediction comprises:
determining the demand pattern that has the smallest distance from the demand query; and
determining a demand prediction based on the determined demand pattern.
4. The computer implemented method of claim 3, wherein the demand prediction is a most recent value of the determined demand pattern.
5. The computer implemented method of any one of the preceding claims, wherein the demand pattern estimates are determined by randomly selecting substrings of the input stream.
6. The computer implemented method of any one of the preceding claims, wherein the input stream is based on monitoring data.
7. The computer implemented method of claim 6, wherein the monitoring data is one or more of:
computing time;
number of API calls;
amount of used memory;
amount of sent or received data;
number of messages sent or received; and
amount of bandwidth used.
8. The computer implemented method of any one of the preceding claims, wherein the input stream is based on human hints and a model of the monitoring data.
9. The computer implemented method of claim 8, wherein the model is one or more of:
linear regression;
autoregressive integrated moving average; and
neural network.
10. The computer implemented method of any one of the preceding claims, wherein the input stream is based on an internet traffic model.
11. The computer implemented method of any one of the preceding claims, further comprising the steps of:
receiving or accessing the historical demand data from multiple different sources;
translating the historical demand data into a common data format; and
determining the input stream based on the translated historical demand data.
12. The computer implemented method of any one of the preceding claims, wherein the substrings are of a predetermined length.
13. The computer implemented method of any one of the preceding claims, further comprising determining multiple resources that together meet the determined demand prediction.
14. The computer implemented method of claim 13, wherein the multiple resources are a combination of resources of multiple different types.
15. The computer implemented method of claim 14, wherein the combination of resources of multiple different types is based on a pricing of the resources.
16. The computer implemented method of claim 15, wherein the pricing of the resources is based on usage time or processor cycles.
17. The computer implemented method of any one of claims 13 to 16, further comprising determining a difference between the determined multiple resources and currently employed multiple resources.
18. The computer implemented method of any one of the preceding claims, wherein the demand is a demand for Internet traffic.
19. Software that, when installed on a computer, causes the computer to perform the method of any one or more of claims 1 to 18.
20. A prediction apparatus for predicting demand of a computer system, the apparatus comprising:
an input port to receive or access an input stream of historical demand data;
a processor
to iteratively determine plural demand patterns based on the input stream by each iteration merging substrings of the input stream based on a difference between multiple demand pattern estimates of a previous iteration and the substrings of the input stream, and
to determine a demand prediction based on a difference between a demand query and each of the demand patterns; and
an effector to configure the computer system based on the determined demand.
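The iterative pattern determination of claims 1 to 5 resembles a k-means-style clustering of fixed-length substrings of the demand history, followed by a nearest-pattern lookup for prediction. The following is a minimal sketch under assumed choices not fixed by the claims themselves (Euclidean distance, mean-based merging, random initialisation per claim 5); all function and variable names are illustrative.

```python
import math
import random

def distance(a, b):
    """Assumed distance measure: Euclidean distance between two sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def substrings(stream, length):
    """Fixed-length substrings of the input stream (claim 12)."""
    return [stream[i:i + length] for i in range(len(stream) - length + 1)]

def demand_patterns(stream, k, length, iterations=10):
    """Iteratively determine k demand patterns (claims 1, 2 and 5)."""
    subs = substrings(stream, length)
    # Initial estimates: randomly selected substrings (claim 5).
    estimates = random.sample(subs, k)
    for _ in range(iterations):
        # Associate each substring with the closest estimate (claim 2).
        groups = [[] for _ in range(k)]
        for s in subs:
            closest = min(range(k), key=lambda i: distance(estimates[i], s))
            groups[closest].append(s)
        # Update each estimate with the merged associated substrings,
        # here assumed to be an element-wise mean.
        estimates = [
            [sum(vals) / len(vals) for vals in zip(*g)] if g else e
            for g, e in zip(groups, estimates)
        ]
    return estimates

def predict(patterns, query):
    """Closest pattern's most recent value as the prediction (claims 3, 4)."""
    closest = min(patterns, key=lambda p: distance(p[:len(query)], query))
    return closest[-1]
```

As a usage example, clustering a periodic history such as `[10, 12, 11, 50, 52, 51] * 3` with `k=2` and `length=3` yields one low-demand and one high-demand pattern, and `predict(patterns, [10, 12])` then returns the last value of whichever pattern is closer to the query prefix.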
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2012902517A AU2012902517A0 (en) | 2012-06-15 | Predictive Analytics for Resource Provisoning in Hybrid Cloud | |
AU2012902517 | 2012-06-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013185175A1 (en) | 2013-12-19 |
Family
ID=49757329
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/AU2013/000624 WO2013185175A1 (en) | 2012-06-15 | 2013-06-12 | Predictive analytics for resource provisioning in hybrid cloud |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2013185175A1 (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5704012A (en) * | 1993-10-08 | 1997-12-30 | International Business Machines Corporation | Adaptive resource allocation using neural networks |
US20040122950A1 (en) * | 2002-12-20 | 2004-06-24 | Morgan Stephen Paul | Method for managing workloads in an autonomic computer system for improved performance |
US20050102398A1 (en) * | 2003-11-12 | 2005-05-12 | Alex Zhang | System and method for allocating server resources |
US20060092851A1 (en) * | 2004-10-29 | 2006-05-04 | Jeffrey Forrest Edlund | Method and apparatus for communicating predicted future network requirements of a data center to a number of adaptive network interfaces |
US20060218278A1 (en) * | 2005-03-24 | 2006-09-28 | Fujitsu Limited | Demand forecasting system for data center, demand forecasting method and recording medium with a demand forecasting program recorded thereon |
US7308687B2 (en) * | 2002-02-07 | 2007-12-11 | International Business Machines Corporation | Method and system for managing resources in a data center |
2013-06-12: WO PCT/AU2013/000624 patent/WO2013185175A1/en active Application Filing
Non-Patent Citations (2)
Title |
---|
ARDAGNA, D. ET AL.: "Dual time-scale distributed capacity allocation and load redirect algorithms for cloud systems", J. PARALLEL DISTRIB. COMPUT., vol. 72, 14 March 2012 (2012-03-14), pages 796 - 808 * |
GONG, Z. ET AL.: "PAC: Pattern-driven Application Consolidation for Efficient Cloud Computing", INTERNATIONAL SYMPOSIUM ON MODELLING, ANALYSIS & SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS (MASCOTS), IEEE, 17 August 2010 (2010-08-17), pages 24 - 33 * |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9967327B2 (en) | 2010-08-24 | 2018-05-08 | Solano Labs, Inc. | Method and apparatus for clearing cloud compute demand |
US9606840B2 (en) | 2013-06-27 | 2017-03-28 | Sap Se | Enterprise data-driven system for predictive resource provisioning in cloud environments |
US9336245B2 (en) | 2013-12-24 | 2016-05-10 | Sap Se | Systems and methods providing master data management statistics |
US9356883B1 (en) | 2014-05-29 | 2016-05-31 | Amazon Technologies, Inc. | Allocating cloud-hosted application resources using end-user metrics |
US10026070B2 (en) | 2015-04-28 | 2018-07-17 | Solano Labs, Inc. | Cost optimization of cloud computing resources |
WO2016176414A1 (en) * | 2015-04-28 | 2016-11-03 | Solano Labs, Inc. | Cost optimization of cloud computing resources |
AU2016254043B2 (en) * | 2015-04-28 | 2020-12-17 | Solano Labs, Inc. | Cost optimization of cloud computing resources |
GB2542870A (en) * | 2015-06-30 | 2017-04-05 | British Telecomm | Local and demand driven QoS models |
US10728157B2 (en) | 2015-06-30 | 2020-07-28 | British Telecommunications Public Limited Company | Local and demand driven QoS models |
US10171377B2 (en) | 2017-04-18 | 2019-01-01 | International Business Machines Corporation | Orchestrating computing resources between different computing environments |
US10735345B2 (en) | 2017-04-18 | 2020-08-04 | International Business Machines Corporation | Orchestrating computing resources between different computing environments |
US10509682B2 (en) | 2017-05-24 | 2019-12-17 | At&T Intellectual Property I, L.P. | De-allocation elasticity application system |
US9961675B1 (en) | 2017-05-25 | 2018-05-01 | At&T Intellectual Property I, L.P. | Multi-layer control plane automatic resource allocation system |
US10963294B2 (en) | 2018-07-02 | 2021-03-30 | International Business Machines Corporation | Cognitive cloud migration optimizer |
US11150931B2 (en) | 2018-10-30 | 2021-10-19 | Hewlett Packard Enterprise Development Lp | Virtual workload migrations |
CN114077492A (en) * | 2020-08-18 | 2022-02-22 | 中国电信股份有限公司 | Prediction model training and prediction method and system for cloud computing infrastructure resources |
CN118158176A (en) * | 2024-02-20 | 2024-06-07 | 北京白龙马云行科技有限公司 | Method and device for collecting and analyzing API call status based on multi-tenant SaaS gateway |
CN118158176B (en) * | 2024-02-20 | 2024-12-31 | 北京白龙马云行科技有限公司 | Method and device for collecting and analyzing API call status based on multi-tenant SaaS gateway |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2013185175A1 (en) | Predictive analytics for resource provisioning in hybrid cloud | |
US11106560B2 (en) | Adaptive thresholds for containers | |
Imdoukh et al. | Machine learning-based auto-scaling for containerized applications | |
US12066921B2 (en) | Agentless distributed monitoring of microservices through a virtual switch | |
US20230004434A1 (en) | Automated reconfiguration of real time data stream processing | |
US10810052B2 (en) | Methods and systems to proactively manage usage of computational resources of a distributed computing system | |
US11204811B2 (en) | Methods and systems for estimating time remaining and right sizing usable capacities of resources of a distributed computing system | |
US10841241B2 (en) | Intelligent placement within a data center | |
CN109074377B (en) | Managed function execution for real-time processing of data streams | |
US8359223B2 (en) | Intelligent management of virtualized resources for cloud database systems | |
US10320891B2 (en) | Node selection for message redistribution in an integrated application-aware load balancer incorporated within a distributed-service-application-controlled distributed computer system | |
CN113094136A (en) | Page display control method and device, storage medium and electronic equipment | |
US20190317817A1 (en) | Methods and systems to proactively manage usage of computational resources of a distributed computing system | |
KR20150030332A (en) | Distributed and parallel processing system on data and method of operating the same | |
TW201308094A (en) | Optimizing web crawling with user history | |
US20180165693A1 (en) | Methods and systems to determine correlated-extreme behavior consumers of data center resources | |
CN113886010A (en) | Control method and device for container resources and computer storage medium | |
CN107291744A (en) | It is determined that and with the method and device of the relationship between application program | |
CN103713935A (en) | Method and device for managing Hadoop cluster resources in online manner | |
Shi et al. | Nodens: Enabling resource efficient and fast {QoS} recovery of dynamic microservice applications in datacenters | |
US20240370307A1 (en) | Dynamic pod resource limit adjusting based on data analytics | |
Mohamed et al. | A survey of big data machine learning applications optimization in cloud data centers and networks | |
Liu et al. | OnlineElastMan: self-trained proactive elasticity manager for cloud-based storage services | |
CN114443310A (en) | Resource scheduling method, device, equipment, medium and program product | |
Abase et al. | Locality sim: cloud simulator with data locality |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 13804112; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase. Ref country code: DE |
122 | Ep: pct application non-entry in european phase. Ref document number: 13804112; Country of ref document: EP; Kind code of ref document: A1 |