Disclosure of Invention
The invention provides a cloud virtual machine load prediction method based on multi-scale analysis and a deep network model, which solves the problem that traditional prediction methods are not accurate enough when predicting cloud virtual machine load data that runs for a long time and has a large data volume.
The technical scheme adopted by the invention is that the cloud virtual machine load prediction method based on the multi-scale analysis and the deep network model is implemented according to the following steps:
step 1, collecting data indexes of load conditions of a cloud virtual machine;
step 2, acquiring time sequence data of cloud virtual machine resources and performance parameters;
step 3, performing wavelet transformation on the sequence data obtained in the step 2, and denoising the original data according to a set threshold;
step 4, preprocessing the cloud virtual machine load data subjected to denoising in the step 3;
step 5, dividing the cloud virtual machine load data preprocessed in the step 4 into a training set and a testing set;
step 6, carrying out normalization processing on the training set and the test set of the cloud virtual machine load data in the step 5;
step 7, constructing a DLSTM prediction model of a cloud virtual machine load data time sequence;
step 8, training the DLSTM prediction model constructed in the step 7 by using the normalized training set data in the step 6;
and step 9, predicting the normalized test set data in the step 6 by using the DLSTM prediction model trained in the step 8, and evaluating the performance of the DLSTM prediction model.
The invention is also characterized in that:
the time sequence data of the cloud virtual machine performance in the step 2 is CPU response time;
the specific contents of the wavelet transform denoising method in the step 3 are as follows: performing wavelet transformation on the noisy data, setting a threshold λ, denoising the original data, and reconstructing through inverse wavelet transformation to obtain the denoised data;
wherein the wavelet in step 3 is db8 in the Daubechies (dbN) wavelet family; after setting the threshold λ, a fixed threshold estimation method is adopted for denoising, the fixed threshold being:
λ = σ√(2 ln n) (1)
where σ is the estimated noise standard deviation and n is the length of the sequence;
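As a concrete illustration of this denoising step, the sketch below applies a single-level wavelet decomposition with the fixed threshold λ = σ√(2 ln n) and soft thresholding. It substitutes the Haar wavelet for db8 purely to stay dependency-free (a real implementation would use a wavelet library such as PyWavelets for a multi-level db8 decomposition); the function name and the median-based noise estimate are assumptions of this sketch, not part of the invention.

```python
import numpy as np

def haar_denoise(x, sigma=None):
    """Single-level Haar wavelet denoising with the fixed (universal)
    threshold lambda = sigma * sqrt(2 * ln(n)).  The patent uses db8;
    Haar is substituted here only to keep the sketch dependency-free."""
    x = np.asarray(x, dtype=float)
    n = len(x) - len(x) % 2                   # truncate to even length
    a = (x[0:n:2] + x[1:n:2]) / np.sqrt(2)    # approximation coefficients
    d = (x[0:n:2] - x[1:n:2]) / np.sqrt(2)    # detail coefficients
    if sigma is None:                         # noise level from detail median
        sigma = np.median(np.abs(d)) / 0.6745
    lam = sigma * np.sqrt(2 * np.log(n))      # fixed threshold, eq. (1)
    d = np.sign(d) * np.maximum(np.abs(d) - lam, 0.0)  # soft thresholding
    y = np.empty(n)
    y[0::2] = (a + d) / np.sqrt(2)            # inverse transform
    y[1::2] = (a - d) / np.sqrt(2)            # (reconstruction)
    return y
```

The same pattern extends to the multi-level db8 case: threshold the detail coefficients at each scale, then reconstruct.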
wherein the preprocessing process in the step 4 specifically comprises the following steps: firstly, performing a first-order difference on the denoised data; recording the denoised sequence as X = (x1, x2, ..., xn) (n is the length of the entire time series) and the differenced data sequence as Y = (y1, y2, ..., yn-1); subtracting the preceding value from the following value in the sequence, i.e.:
yi=xi+1-xi (2)
obtaining a first-order difference data sequence Y by using a formula (2), thereby eliminating the time dependence of the time sequence;
secondly, converting the first-order difference data sequence into a time-step matrix, wherein each unit in the matrix comprises a data segment of time-step length used for prediction; the time step used in the scheme is 2, and the construction process is as follows: converting the original sequence into an n × 1 matrix P1; inserting a 0 before the original sequence and converting it into an n × 1 matrix P2; merging the matrices P1 and P2 into an n × 2 matrix P'; i.e. P1 = [y1, y2, ..., yn]T, P2 = [0, y1, y2, ..., yn-1]T,
P′=[P2 P1] (3);
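The differencing and time-step-matrix construction above can be sketched as follows (the function name is illustrative; note that the differenced sequence has one fewer element than the original, so the matrix here has one row per difference):

```python
import numpy as np

def to_timestep_matrix(x):
    """Step 4: first-order difference (eq. 2), then build the
    time-step-2 matrix P' = [P2 P1] (eq. 3), where P2 is P1 shifted
    down by one position with a leading zero."""
    y = np.diff(np.asarray(x, dtype=float))          # y_i = x_{i+1} - x_i
    p1 = y.reshape(-1, 1)                            # column [y1, y2, ...]^T
    p2 = np.concatenate(([0.0], y[:-1])).reshape(-1, 1)  # leading 0, shifted
    return np.hstack([p2, p1])                       # each row: (prev, cur)
```

Each row of P' pairs a difference with its predecessor, which is exactly the 2-step window the model consumes.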
Wherein the normalization process in the step 6 specifically comprises the following steps: using
xi* = xi / |x|max
where xi* denotes the normalized value of xi and |x|max is the maximum of the absolute values of the differenced data, to normalize the data in the matrix P' to the [-1, 1] interval;
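A minimal sketch of this max-absolute normalization (the helper name is an assumption of this illustration):

```python
def normalize(values):
    """Step 6: scale each differenced value by the maximum absolute
    value, mapping the data into the interval [-1, 1]."""
    m = max(abs(v) for v in values)
    return [v / m for v in values]
```

The divisor |x|max must be remembered so the predictions can be scaled back later.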
the structure of the DLSTM in the step 7 is specifically as follows: the DLSTM is stacked from multiple LSTMs, each of which maintains the traditional structure; the input of the DLSTM model input layer at the moment t is recorded as xt, and the output of the output layer is ht; in order to prevent overfitting of the model, a dropout layer is arranged, so that the activation value of a neuron stops working with a certain probability p during forward propagation, and p is set to 0.3 in the scheme;
an activation layer is connected behind the hidden layer so that the matrix operation result has nonlinearity; the activation function of the forgetting gate and the output gate in the LSTM is the Sigmoid function, i.e.
σ(x) = 1/(1 + e^(−x))
which outputs a value between 0 and 1, where an output close to 0 indicates discarding the current information and an output close to 1 indicates retaining the current information;
the input gate activation function is the tanh function, i.e.
tanh(x) = (e^x − e^(−x))/(e^x + e^(−x))
which is used for calculating the candidate value vector information;
the input of the i-th layer LSTM of the DLSTM prediction model at the time t consists of the input xt at the time t, the output ht-1 at the time t-1, the module state Ct-1 at the time t-1, and the hidden state of the (i-1)-th layer LSTM; the output ht and the state Ct at the time t are transmitted to the t+1 moment; the output of the i-th layer LSTM at the time t is transferred to the next layer of LSTM for auxiliary prediction until the last layer of LSTM obtains the output value; that is, the hidden state of the first layer LSTM at the time t is transmitted to the second layer LSTM as input, and so on until the last LSTM outputs the result;
wherein the LSTM has a forgetting gate, an input gate and an output gate; the forgetting gate calculation method comprises the following steps:
ft = σ(Wf·[ht-1, xt] + bf) (4)
the input gate calculation method comprises the following steps:
it = σ(Wi·[ht-1, xt] + bi) (5)
the candidate value vector calculation method comprises the following steps:
C̃t = tanh(WC·[ht-1, xt] + bC) (6)
the state output is:
Ct = ft*Ct-1 + it*C̃t (7)
the output gate calculation method comprises the following steps:
ot = σ(Wo·[ht-1, xt] + bo) (8)
the output is:
ht=ot*tanh(Ct) (9);
in the expressions (4) to (9), W is weight information, h
tIs the output at time t, x
tFor the input at the time t, the input is,
hidden state of i-th layer LSTM at time t, and b is bias term
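Equations (4) to (9) describe one step of a standard LSTM cell and can be sketched in NumPy as follows; the layout of the weights over the concatenated vector [ht-1; xt] and the dictionary-based parameter passing are assumptions of this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM cell step, equations (4)-(9).  W and b map the gate
    names 'f', 'i', 'C', 'o' to weight matrices over [h_prev; x_t]
    and to bias vectors, respectively."""
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W['f'] @ z + b['f'])         # forgetting gate, eq. (4)
    i = sigmoid(W['i'] @ z + b['i'])         # input gate, eq. (5)
    C_cand = np.tanh(W['C'] @ z + b['C'])    # candidate vector, eq. (6)
    C_t = f * C_prev + i * C_cand            # state update, eq. (7)
    o = sigmoid(W['o'] @ z + b['o'])         # output gate, eq. (8)
    h_t = o * np.tanh(C_t)                   # output, eq. (9)
    return h_t, C_t
```

With all parameters at zero the gates open halfway (σ(0) = 0.5), which makes the state update easy to check by hand.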
Wherein, the output data in the step 8 and the step 9 needs to be de-normalized and un-differenced to obtain the predicted values; the prediction model is evaluated with the root mean square error RMSE and the root mean square prediction error RMSPE; the formulas are respectively as follows:
RMSE = sqrt( (1/N) Σi (yi − xi)² ) (10)
RMSPE = sqrt( (1/N) Σi ((yi − xi)/xi)² ) (11)
where N is the length of the data, yi is the predicted value, and xi is the raw cloud virtual machine load data.
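The de-normalization/un-differencing of the model output and the two evaluation metrics can be sketched as follows (helper names are illustrative; invert assumes the predictions are normalized first-order differences accumulated from the last known value of the series):

```python
import math

def invert(pred_norm, x_abs_max, x_last):
    """Undo step-6 normalization and step-4 differencing: rescale each
    predicted difference by max|x| and accumulate from the last known
    value of the (denoised) series."""
    out, level = [], x_last
    for d in pred_norm:
        level += d * x_abs_max
        out.append(level)
    return out

def rmse(pred, actual):
    """Root mean square error, eq. (10)."""
    n = len(actual)
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(pred, actual)) / n)

def rmspe(pred, actual):
    """Root mean square prediction (percentage) error, eq. (11)."""
    n = len(actual)
    return math.sqrt(sum(((y - x) / x) ** 2 for y, x in zip(pred, actual)) / n)
```

RMSPE divides each residual by the true value, so it is only defined where the raw load data is nonzero.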
The invention has the beneficial effects that:
the cloud virtual machine load prediction method based on the multi-scale analysis and the deep network model solves the problem that the traditional prediction method is low in accuracy in cloud virtual machine load prediction for long-time operation and large data volume. A wavelet transformation method is provided for carrying out multi-scale decomposition on the data, and the data are decomposed into a high-frequency subsequence and a low-frequency subsequence; denoising the wavelet sequence of each scale by using a proper threshold value; through wavelet inverse transformation reconstruction, denoised data are obtained so as to obtain a better prediction effect; compared with the traditional LSTM method, the DLSTM method has the advantages that each layer of LSTM in the DLSTM runs on different time scales, and the result is transmitted to the next layer of LSTM, so that the DLSTM can effectively utilize each layer of LSTM, and more complex time sequence data can be learned. The prediction accuracy of DLSTM is therefore higher when a large amount of data is predicted.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention provides a cloud virtual machine load prediction method based on multi-scale analysis and a Deep network model, the overall framework is shown in fig. 2, the method specifically comprises a method for predicting cloud virtual machine load by a Wavelet Transform (WT) and Deep Long Short Term Memory (DLSTM) neural network model, and the method is implemented by the following steps:
step 1, collecting data indexes of load conditions of a cloud virtual machine;
step 2, acquiring time series data of cloud virtual machine resources and performance parameters, as shown in fig. 4;
step 3, as shown in fig. 3, performing wavelet denoising on the sequence data obtained in step 2; performing wavelet transformation on the noisy data; setting a threshold λ, denoising the raw cloud virtual machine load data, and reconstructing through inverse wavelet transform to obtain the denoised data, as shown in fig. 5; wherein the fixed threshold is:
λ = σ√(2 ln n) (1)
where σ is the estimated noise standard deviation and n is the length of the sequence;
step 4, preprocessing the sequence data denoised in the step 3; firstly, performing a first-order difference on the data, specifically: recording the denoised sequence as X = (x1, x2, ..., xn) (n is the length of the time series) and the differenced data sequence as Y = (y1, y2, ..., yn-1); subtracting the preceding value from the following value in the sequence to obtain the first-order difference data sequence Y, thereby eliminating the time dependence of the time series; namely:
yi=xi+1-xi (2)
secondly, converting the first-order difference data sequence into a time-step matrix, wherein each unit in the matrix comprises a data segment of time-step length used for prediction; the time step used in the invention is 2, and the conversion process is as follows: converting the original sequence into an n × 1 matrix P1; inserting a 0 before the original sequence and converting it into an n × 1 matrix P2; merging the matrices P1 and P2 into an n × 2 matrix P'; i.e. P1 = [y1, y2, ..., yn]T, P2 = [0, y1, y2, ..., yn-1]T, the merging method being:
P′=[P2 P1] (3);
step 5, dividing the cloud virtual machine load data preprocessed in the step 4 into a training set and a testing set;
step 6, carrying out normalization processing on the training set and the test set of the cloud virtual machine load data in the step 5; using
xi* = xi / |x|max
(where xi* denotes the normalized value of xi, and |x|max is the maximum of the absolute values of the differenced data) to normalize the data to the [-1, 1] interval;
step 7, constructing a DLSTM prediction model of the cloud virtual machine load data time series; as shown in fig. 1, the DLSTM is made up of a stack of LSTMs, each of which maintains the conventional structure; the input of the DLSTM model input layer at the moment t is recorded as xt, and the output of the output layer is ht; in order to prevent overfitting of the model, a dropout layer is arranged, so that the activation value of a neuron stops working with a certain probability p during forward propagation, and p is set to 0.3 in the method; the DLSTM hidden layers perform multi-level abstraction of the input features; for the input of each node, the hidden layer has different connection weights, and the neurons of the output layer adjust the weights of the neurons of the hidden layer, so that the output result tends to the real data;
the activation layer is connected after the hidden layer so that the matrix operation result has nonlinearity; the activation function used by the forgetting gate and the output gate in the LSTM is the Sigmoid function, i.e.
σ(x) = 1/(1 + e^(−x))
which outputs a value between 0 and 1, where an output close to 0 indicates discarding the current information and an output close to 1 indicates retaining the current information; the input gate activation function is the tanh function, i.e.
tanh(x) = (e^x − e^(−x))/(e^x + e^(−x))
which is used for calculating the candidate value vector information;
the input of the i-th layer LSTM of the DLSTM prediction model at the time t consists of the input xt at the time t, the output ht-1 at the time t-1, the module state Ct-1 at the time t-1, and the hidden state of the (i-1)-th layer LSTM; the output ht and the state Ct at the time t are transmitted to the t+1 moment; the output of the i-th layer LSTM at the time t is passed to the next layer of LSTM to assist prediction until the last layer of LSTM obtains the output value; that is, the hidden state of the first layer LSTM at the time t is transmitted to the second layer LSTM as input, and so on until the last LSTM outputs the result;
each layer of LSTM has a forgetting gate, an input gate and an output gate;
the forgetting gate calculation method comprises the following steps:
ft = σ(Wf·[ht-1, xt] + bf) (4)
the input gate calculation method comprises the following steps:
it = σ(Wi·[ht-1, xt] + bi) (5)
the candidate value vector calculation method comprises the following steps:
C̃t = tanh(WC·[ht-1, xt] + bC) (6)
the state output is:
Ct = ft*Ct-1 + it*C̃t (7)
the output gate calculation method comprises the following steps:
ot = σ(Wo·[ht-1, xt] + bo) (8)
the output is:
ht=ot*tanh(Ct) (9);
where W is the weight information, ht is the output at the time t, xt is the input at the time t, ht(i) is the hidden state of the i-th layer LSTM at the time t, and b is the bias term;
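Putting step 7 together, a forward pass through a stack of LSTM layers with inter-layer dropout might look like the sketch below; the random weight initialization, the layer tuple format, and the inverted-dropout scaling are assumptions of this illustration, not the patented implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    # one LSTM cell step, equations (4)-(9)
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W['f'] @ z + b['f'])
    i = sigmoid(W['i'] @ z + b['i'])
    C_t = f * C_prev + i * np.tanh(W['C'] @ z + b['C'])
    o = sigmoid(W['o'] @ z + b['o'])
    return o * np.tanh(C_t), C_t

def make_layer(rng, in_dim, hidden):
    # small random weights; a layer is the tuple (W, b, hidden_size)
    W = {k: 0.1 * rng.standard_normal((hidden, hidden + in_dim)) for k in 'fiCo'}
    b = {k: np.zeros(hidden) for k in 'fiCo'}
    return W, b, hidden

def dlstm_forward(seq, layers, p=0.3, train=False, rng=None):
    """DLSTM forward pass: the per-time-step hidden states of layer
    i-1 feed layer i; during training an (inverted) dropout mask with
    rate p is applied between stacked layers."""
    inputs = [np.atleast_1d(x) for x in seq]
    for W, b, hidden in layers:
        h, C = np.zeros(hidden), np.zeros(hidden)
        outputs = []
        for x_t in inputs:
            h, C = lstm_step(x_t, h, C, W, b)
            outputs.append(h)
        if train:                       # dropout between stacked layers
            mask = (rng.random(hidden) >= p) / (1.0 - p)
            outputs = [o * mask for o in outputs]
        inputs = outputs
    return inputs[-1]   # last layer's hidden state at the last time step
```

Dropout is disabled at prediction time (train=False), matching the usual train/inference distinction.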
step 8, training the DLSTM prediction model constructed in the step 7 by using the normalized training set data in the step 6;
step 9, predicting the normalized test set data in the step 6 by using the DLSTM prediction model trained in the step 8, and evaluating the performance of the DLSTM prediction model, as shown in FIG. 6;
examples
The embodiment takes the load condition of a cloud virtual machine as an example; the prediction results of the LSTM and the DLSTM model based on multi-scale analysis are compared, and the result is shown in FIG. 7; the root mean square error (RMSE) and the root mean square prediction error (RMSPE) are taken as evaluation indexes, and the formulas are respectively shown as (10) to (11);
where N is the length of the data, yi is the predicted value, and xi is the raw data of the cloud virtual machine load;
the method comprises the following specific steps:
step 1, collecting data indexes of load conditions of a cloud virtual machine;
step 2, acquiring time sequence data of cloud virtual machine resources and performance parameters;
step 3.1, performing wavelet transformation on the sequence data obtained in the step 2;
step 3.2, setting a threshold value, and denoising the cloud virtual machine load sequence data;
step 3.3, obtaining the denoised data through inverse wavelet transform reconstruction, wherein the fixed threshold is λ = σ√(2 ln n);
Step 4.1, performing first-order difference on the denoised data in the step 3, and specifically comprising the following steps: recording the sequence after de-noising as X ═ X1,x2,...,xn) (n is the length of the entire time series), and the data series after the difference is Y ═ Y (Y is the length of the entire time series)1,y2,...,yn-1) Using the value after the sequence minus the value before, i.e. yi=xi+1-xiObtaining a first-order difference data sequence Y, thereby eliminating the time dependence of the time sequence;
step 4.2, converting the first-order difference data sequence into a time-step matrix, wherein each unit in the matrix comprises a data segment of time-step length used for prediction; the time step used in the invention is 2, and the construction process is as follows: converting the original sequence into an n × 1 matrix P1; inserting a 0 before the original sequence and converting it into an n × 1 matrix P2; i.e. P1 = [y1, y2, ..., yn]T, P2 = [0, y1, y2, ..., yn-1]T;
step 4.3, merging the matrices P1 and P2 into an n × 2 matrix P', i.e. P' = [P2 P1];
Step 5, dividing the cloud virtual machine load data preprocessed in the step 4 into a training set and a testing set;
step 6, carrying out normalization processing on the training set and the test set of the cloud virtual machine load data in the step 5; the normalization processing specifically comprises: using
xi* = xi / |x|max
(where xi* denotes the normalized value of xi, and |x|max is the maximum of the absolute values of the differenced data) to normalize the data in the matrix P' to the [-1, 1] interval;
step 7.1, constructing a DLSTM prediction model of the cloud virtual machine load data time series; the DLSTM is stacked from multiple LSTMs, each of which maintains the conventional structure; the input of the DLSTM model input layer at the moment t is recorded as xt, and the output of the output layer is ht; in order to prevent overfitting of the model, a dropout layer is arranged, so that the activation value of a neuron stops working with a certain probability p during forward propagation; in the method, p is set to 0.3, so that the model generalizes better and does not depend on particular local features; the DLSTM hidden layers perform multi-level abstraction of the input features; for the input of each node, the hidden layer has different connection weights, and the neurons of the output layer adjust the weights of the neurons of the hidden layer, so that the output result tends to the real data.
The activation layer is connected after the hidden layer so that the matrix operation result has nonlinearity. The activation function used by the forgetting gate and the output gate in the LSTM is the Sigmoid function, which outputs a value between 0 and 1, where an output close to 0 indicates discarding the current information and an output close to 1 indicates retaining the current information. The input gate activation function is the tanh function and is used for calculating the candidate value vector information;
step 8, training the DLSTM prediction model constructed in the step 7 by using the normalized training set data in the step 6;
and step 9, predicting the normalized test set data in the step 6 by using the DLSTM prediction model trained in the step 8, and evaluating the performance of the DLSTM prediction model.
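To make the data flow of steps 2 to 9 concrete, the sketch below wires difference → split → normalize → predict → de-normalize/un-difference → RMSE on a synthetic series. A trivial persistence predictor (repeat the previous difference) stands in for the trained DLSTM, denoising is skipped, and all names are assumptions of this illustration:

```python
import math

def pipeline_demo(series, train_frac=0.8):
    """Data flow of steps 2-9 with a persistence stand-in for the
    DLSTM.  Denoising is skipped (identity) to keep the demo
    dependency-free."""
    # step 4.1: first-order difference
    diffs = [b - a for a, b in zip(series, series[1:])]
    # step 5: train/test split
    split = int(len(diffs) * train_frac)
    train, test = diffs[:split], diffs[split:]
    # step 6: max-absolute normalization fitted on the training set
    m = max(abs(v) for v in train) or 1.0
    test_n = [v / m for v in test]
    # stand-in "model": predict each normalized diff as the previous one
    prev = train[-1] / m
    preds_n = []
    for v in test_n:
        preds_n.append(prev)
        prev = v
    # steps 8-9: de-normalize, un-difference, evaluate with RMSE
    level = series[split]                 # last value before the test region
    preds, actual = [], series[split + 1:]
    for d in preds_n:
        level += d * m
        preds.append(level)
        level = actual[len(preds) - 1]    # teacher-forced one-step-ahead
    n = len(actual)
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(preds, actual)) / n)
```

On a perfectly linear series the persistence stand-in is exact (RMSE 0); on an alternating series it is maximally wrong, which makes the wiring easy to sanity-check.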