CN111950810A

CN111950810A - A multivariate time series prediction method and equipment based on self-evolution pre-training

Info

Publication number: CN111950810A
Application number: CN202010876972.2A
Authority: CN
Inventors: 李文中; 万晨; 张治杰; 丁望祥; 叶保留; 陆桑璐
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2020-08-27
Filing date: 2020-08-27
Publication date: 2020-11-17
Anticipated expiration: 2040-08-27
Also published as: CN111950810B

Abstract

The invention discloses a multivariate time series prediction method and equipment based on self-evolution pre-training. The method builds on a pre-training strategy and a deep sequence model that combines a convolutional network with a long short-term memory (LSTM) network. It models univariate self-evolution information together with multivariate dependency information, yielding an optimized multivariate time series forecasting algorithm that balances overall prediction accuracy against the prediction accuracy of individual (local) variables. The invention achieves good overall prediction accuracy and safeguards the prediction accuracy of individual variables better than existing multivariate time series prediction methods.

Description

A multivariate time series prediction method and equipment based on self-evolution pre-training

Technical field

The invention relates to time series prediction, and in particular to a multivariate time series prediction method.

Background

Using time series forecasting to obtain predicted values for future time points is fundamental to supporting decisions, optimizing resource allocation, and taking intervention measures in advance, and it is widely applied in information systems that run in real time. Many systems generate large volumes of time series data, chiefly system performance monitoring data, and forecasting these metrics has considerable practical value: for example, predicting network traffic for resource planning and anomaly detection, or predicting disk failures in data centers so that disks can be replaced in advance. Forecasting future system performance therefore provides important support for the daily operation and maintenance of modern information systems. In recent years a large body of research has addressed time series forecasting for real-time systems. Linear statistical algorithms (for example AR and ARIMA) have influenced the field. However, because linear models lack applicability in many practical settings, nonlinear time series analysis and forecasting have been proposed and gradually adopted. Beyond classical statistical models, machine learning models have become important algorithms for time series forecasting. Researchers have applied machine learning models such as linear regression, support vector machines, and decision trees to time series analysis. Machine-learning-based forecasting algorithms treat time series prediction as a supervised learning task in which the input attributes are historically observed time series data and the output label is the future data. However, current multivariate forecasting algorithms that optimize an overall prediction objective tend to lose prediction accuracy on individual variables. For example, in a multivariate regression task, existing algorithms minimize the (weighted) sum of the prediction errors over all variables and improve overall accuracy by optimizing that objective.
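The supervised-learning framing described in the background can be sketched as follows. This is a minimal illustration in plain Python rather than code from the patent; the function name `make_supervised_pairs` and the toy values are assumptions introduced here.

```python
from typing import List, Tuple

Row = List[float]  # values of all N variables at a single time step


def make_supervised_pairs(series: List[Row], window: int) -> List[Tuple[List[Row], Row]]:
    """Turn a multivariate series into (history, next-step target) pairs.

    Each input is the `window` most recent observations; the label is the
    observation that immediately follows them.
    """
    if window < 1 or window >= len(series):
        raise ValueError("window must satisfy 1 <= window < len(series)")
    pairs = []
    for t in range(window, len(series)):
        history = series[t - window:t]  # observations t-window .. t-1
        target = series[t]              # observation at time t
        pairs.append((history, target))
    return pairs


# Toy data: 5 time steps of 2 hypothetical metrics (e.g. CPU load, memory use).
toy = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.6], [0.4, 0.7], [0.5, 0.8]]
pairs = make_supervised_pairs(toy, window=3)
```

With five time steps and a window of three, the sketch yields two training pairs; the history of the second pair overlaps the first by two steps, which is the usual sliding-window behavior.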

Summary of the invention

Purpose of the invention: To safeguard the prediction accuracy of individual (local) variables, a multivariate time series prediction algorithm based on self-evolution pre-training is proposed. It provides an optimized algorithm for multivariate time series prediction while balancing overall prediction accuracy against the prediction accuracy of individual variables.

Technical solution: To predict the performance of a system running in real time, so that the system can be operated and maintained appropriately based on the predicted values, in a first aspect the present invention proposes a multivariate time series prediction method based on self-evolution pre-training, comprising the following steps:

S1. Obtain N-dimensional indicator data characterizing the performance of a system running in real time, form N univariate time series as the historical input sequences, and preprocess them;

S2. Input the N univariate time series into the self-evolution pre-training model SE, which is constructed from difference features; establish a univariate linear regression model for each time series, and explicitly incorporate multi-order difference information into the linear autoregressive model to form the corresponding prediction output;

S3. Apply one-dimensional convolutions with kernels of different sizes to the two-dimensional tensor formed by the N univariate time series, extracting inter-variable dependency information to obtain multiple feature maps;

S4. Input the feature maps into long short-term memory (LSTM) networks to model the multi-scale feature maps over time, capturing the temporal information of the multivariate dependencies; the network component for multivariate modeling forms the corresponding output;

S5. Fuse the outputs of steps S2 and S4 by element-wise addition of the vectors to construct the multivariate prediction output, and train the network to obtain a trained model;

S6. Input the time series to be predicted into the trained model obtained in step S5 to obtain the predicted values of the time series.

In a second aspect, the present invention provides a computer device comprising one or more processors and a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to the first aspect of the invention.

Beneficial effects: The invention proposes a multivariate time series prediction method based on self-evolution pre-training. Building on a pre-training strategy and a deep sequence model that combines a convolutional network with an LSTM network, it models univariate self-evolution information together with multivariate dependency information, yielding an optimized multivariate time series forecasting algorithm that balances overall prediction accuracy against the prediction accuracy of individual variables. The invention achieves good overall prediction accuracy and safeguards the prediction accuracy of individual variables better than existing multivariate time series prediction methods.

Brief description of the drawings

FIG. 1 is an overall flow chart of the multivariate time series prediction method according to an embodiment of the invention;

FIG. 2 is a schematic diagram of the main process of the multivariate time series prediction method according to an embodiment of the invention;

FIG. 3 is a processing flow chart of the model training phase according to an embodiment of the invention;

FIG. 4 is a processing flow chart of the model running phase according to an embodiment of the invention.

Detailed description

The technical solution of the invention is described further below with reference to the accompanying drawings. The embodiments provided below serve only to disclose the invention thoroughly and completely and to convey its technical concept fully to those skilled in the art; the invention can also be implemented in many different forms and is not limited to the embodiments described here. The terms used in the exemplary embodiments shown in the drawings do not limit the invention.

The embodiments of the invention predict time series with a multivariate time series prediction method based on self-evolution pre-training. Building on a pre-training strategy and a deep sequence model that combines a convolutional network with an LSTM network, the method models univariate self-evolution information together with multivariate dependency information, yielding an optimized multivariate time series forecasting algorithm that balances overall prediction accuracy against the prediction accuracy of individual variables. Referring to FIG. 1 and FIG. 2, the method of the invention comprises the following steps:

Step S1: Collect the multivariate time series data from an operating system running in real time through a monitoring system. The data may include several key indicators such as CPU load, bandwidth utilization, bandwidth speed, memory utilization, response latency, and I/O reads and writes. In practice the data often have problems such as missing or irregular values, so they are preprocessed first and the preprocessed data are used as the model input. In addition, to model the time series more effectively, the periodicity of the series is evaluated to obtain the history window T.

Preprocessing means splitting the dataset along the time dimension, setting aside a training set for model training, and normalizing the data so that all values are mapped to the same interval, which helps the model learn. One option is min-max normalization:

$$x' = \frac{x - \min_x}{\max_x - \min_x}$$

where $\min_x$ is the minimum value of x and $\max_x$ is the maximum value of x. The periodicity of the time series is evaluated with the autocorrelation function (ACF) to obtain the historical input sequence length T. Each key indicator, such as CPU load, bandwidth utilization, bandwidth speed, memory utilization, response latency, or I/O reads and writes, forms one univariate time series of length T, and x denotes the vector of values of that series.
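The preprocessing in step S1 can be sketched in plain Python as follows. The sketch is illustrative only: the function names are hypothetical, the minimum and maximum are taken from the training split (an assumption; the text only says a training set is set aside), and choosing the lag with the highest autocorrelation is a simplifying assumption, since the text says only that the ACF is used to assess periodicity.

```python
from typing import List, Tuple


def fit_min_max(train: List[float]) -> Tuple[float, float]:
    """Return (min, max) computed on the training split only."""
    return min(train), max(train)


def min_max_scale(values: List[float], lo: float, hi: float) -> List[float]:
    """Apply x' = (x - min) / (max - min) to every value."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant series: mapped to 0 (assumption)
    return [(v - lo) / span for v in values]


def autocorrelation(x: List[float], lag: int) -> float:
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def estimate_window(x: List[float], max_lag: int) -> int:
    """Pick the lag in [2, max_lag] with the highest autocorrelation."""
    return max(range(2, max_lag + 1), key=lambda lag: autocorrelation(x, lag))


# Toy metric with a strict period of 4 samples (made-up values).
metric = [0.0, 0.0, 0.0, 1.0] * 10
lo, hi = fit_min_max(metric[:30])          # first 30 samples as the training split
scaled = min_max_scale(metric, lo, hi)
T = estimate_window(scaled, max_lag=10)
```

Values outside the training range map outside [0, 1] under this scheme, which is one reason a production pipeline might clip or re-fit the bounds.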

Step S2: The N univariate time series preprocessed in step S1, covering key indicators such as CPU load, bandwidth utilization, memory utilization, and response latency, are each input into the self-evolution (SE) pre-training model constructed from difference features, and a univariate linear regression model is established for each series. The purpose of this model is to guarantee the prediction accuracy of each univariate time series produced by the real-time system. In addition, because multi-order difference information improves applicability and robustness when predicting both stationary and non-stationary series, it is explicitly incorporated into the linear autoregressive model to form the corresponding prediction output. The step comprises the following sub-steps:

Step S2-1: Input the N univariate time series into the self-evolution pre-training model SE constructed from difference features, and establish a univariate linear regression model for each series. For the i-th series:

$$\hat{x}^{(i)}_t = \sum_{j=1}^{T} w^{(i)}_j \, x^{(i)}_{t-j}$$

where $x^{(i)}_{t-j}$ is the value of the i-th series at time t-j, $w^{(i)}_j$ is the corresponding linear weight, and T is the historical input sequence length obtained by evaluating the periodicity of the time series with the autocorrelation function (ACF).
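A univariate linear autoregression of the kind described in step S2-1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a model without a bias term, fits the weights by batch gradient descent on the mean squared error, and uses made-up data and hypothetical function names.

```python
from typing import List


def ar_predict(history: List[float], weights: List[float]) -> float:
    """Linear autoregression: sum over j = 1..T of weights[j-1] * x_{t-j}.

    `history` holds x_{t-T}, ..., x_{t-1} in chronological order, so
    x_{t-j} is history[-j].
    """
    return sum(w * history[-j] for j, w in enumerate(weights, start=1))


def mse(series: List[float], weights: List[float]) -> float:
    """Mean squared one-step-ahead error over the whole series."""
    T = len(weights)
    errors = [ar_predict(series[t - T:t], weights) - series[t]
              for t in range(T, len(series))]
    return sum(e * e for e in errors) / len(errors)


def fit_ar(series: List[float], T: int, lr: float = 0.1, epochs: int = 500) -> List[float]:
    """Fit the T weights by batch gradient descent on the MSE."""
    weights = [0.0] * T
    n = len(series) - T
    for _ in range(epochs):
        grad = [0.0] * T
        for t in range(T, len(series)):
            err = ar_predict(series[t - T:t], weights) - series[t]
            for j in range(1, T + 1):
                grad[j - 1] += 2.0 * err * series[t - j] / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


# Toy series already scaled to [0, 1]; the values are made up.
toy = [0.2, 0.4, 0.6, 0.8, 0.6, 0.4, 0.2, 0.4, 0.6, 0.8, 0.6, 0.4]
w = fit_ar(toy, T=2)
next_value = ar_predict(toy[-2:], w)
```

In the method described here one such model is fitted per variable, so N independent weight vectors would be learned.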

Step S2-2: For difference order q = 1, we have [equation], that is, [equation]. The predicted value under first-order differencing therefore contains [term]. Difference features such as [term] are explicitly incorporated into the linear autoregressive model to form the corresponding prediction output:

[equation]

In this formula, the fusion weights of the difference features are learned during training and constrained to be non-negative real numbers, softmax is the corresponding normalization operation, and Q is the maximum difference order. The formula also contains the standard linear autoregressive term, in which $x_{t-T:t-1}$ denotes the time series from time t-T to time t-1 and T is the historical input sequence length.
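The two ingredients named in step S2-2, multi-order differencing and softmax-normalized fusion weights, can be illustrated in isolation. The exact formula that combines them is not available in this text, so the `fuse` function below is a hypothetical weighted sum chosen only to show the mechanics, and the candidate predictions are made-up numbers.

```python
import math
from typing import List


def difference(x: List[float], order: int) -> List[float]:
    """Apply first-order differencing `order` times (order 0 returns a copy)."""
    for _ in range(order):
        x = [b - a for a, b in zip(x, x[1:])]
    return list(x)


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax; outputs are non-negative and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(candidates: List[float], scores: List[float]) -> float:
    """Softmax-weighted sum of per-order candidate predictions.

    Hypothetical form: one candidate prediction per difference order
    q = 0..Q, weighted by softmax(scores).
    """
    return sum(w * c for w, c in zip(softmax(scores), candidates))


history = [1.0, 3.0, 6.0, 10.0]
d1 = difference(history, 1)   # first-order differences
d2 = difference(history, 2)   # second-order differences

# Made-up candidate predictions for orders 0, 1, 2 with equal fusion scores.
fused = fuse([10.0, 14.0, 15.0], [0.0, 0.0, 0.0])
```

Passing the raw scores through softmax is one simple way to keep the effective fusion weights non-negative and normalized while the underlying scores remain unconstrained during training.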

Step S3: Apply one-dimensional convolutions with kernels of different sizes to the two-dimensional tensor formed by the N univariate time series, extracting inter-variable dependency information to obtain multiple feature maps.

The invention represents the complex dependencies among the multivariate time series with temporal convolution in a one-dimensional CNN layer that has multiple convolution kernels, so as to obtain feature maps that carry multi-level information about the local dependencies of the multivariate series. For the input sequence $X_{t-T:t-1}$, a one-dimensional convolution over the variable dimension is performed as follows:

$$C^{(k)} = W^{(k)} * X_{t-T:t-1}$$

$W^{(k)}$ is the k-th convolution kernel and $C^{(k)}$ is the convolution result, that is, the feature map extracted from the inter-variable dependencies. A convolutional network typically uses K kernels $W^{(1)}, W^{(2)}, \ldots, W^{(K)}$ to form feature maps $C^{(1)}, C^{(2)}, \ldots, C^{(K)}$ over multiple channels. Unlike existing algorithms, which give all kernels the same size, the invention uses kernels of variable size. Compared with same-scale kernels, multi-scale kernels explicitly describe the features of correlated time spans at different scales. $C^{(1)}, C^{(2)}, \ldots, C^{(K)}$ carry multi-level information about the local dependencies of the multivariate time series.
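A multi-scale one-dimensional convolution of the kind described in step S3 can be sketched as follows. The kernel layout is an assumption made for illustration: each kernel spans all N variables and slides along the time axis, each kernel produces a single channel, and the operation is computed as cross-correlation, as most deep-learning libraries do. Kernel weights and data are made up.

```python
from typing import List

Matrix = List[List[float]]  # shape N x T: one row per variable


def conv1d_over_time(X: Matrix, kernel: Matrix) -> List[float]:
    """Valid 1-D convolution of an N x T input with an N x width kernel.

    The kernel covers all N variables and slides along the time axis,
    so the result has T - width + 1 entries.
    """
    n_vars, T = len(X), len(X[0])
    width = len(kernel[0])
    out = []
    for start in range(T - width + 1):
        acc = 0.0
        for i in range(n_vars):
            for j in range(width):
                acc += kernel[i][j] * X[i][start + j]
        out.append(acc)
    return out


# N = 2 variables, T = 5 time steps (toy values).
X = [[1.0, 2.0, 3.0, 4.0, 5.0],
     [0.0, 1.0, 0.0, 1.0, 0.0]]

# Two kernels of different widths (2 and 3); all-ones weights for readability.
kernels = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
feature_maps = [conv1d_over_time(X, W) for W in kernels]
```

Because the kernels have different widths, the resulting feature maps have different lengths (4 and 3 here), which is why each one is later handled by its own sequence model.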

Step S4: Input the feature maps into long short-term memory (LSTM) networks to model the multi-scale feature maps over time and capture the temporal information of the multivariate dependencies. Each feature map is input to its own LSTM, which produces the corresponding output.

LSTM is a classic recurrent neural network that can handle long-term dependencies in sequences. In the invention, the network component for multivariate modeling is a structure made up of multiple basic LSTM memory units, each of which takes one feature map as input. Specifically, the feature maps $C^{(k)}$ of all channels of the k-th convolution kernel obtained in step S3 are concatenated along the time dimension, where the number of channels is determined by the dimensionality of the time series data itself; the concatenated feature map is then the input of the corresponding k-th LSTM. The hidden output of that network unit is fed to a fully connected layer, forming the output vector $h^{(k)} \in \mathbb{R}^{N \times 1}$. The K identical LSTMs output $h^{(1)}, h^{(2)}, \ldots, h^{(K)}$ in total, and these outputs carry the dependency information of the series. All vectors $h^{(1)}, h^{(2)}, \ldots, h^{(K)}$ are then added together to form the output of the multivariate dependency modeling.
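The structure of step S4, one LSTM per feature map followed by a fully connected layer and an element-wise sum of the K outputs, can be sketched in plain Python. This is a forward pass only, with randomly initialized weights, no bias terms, and scalar inputs per time step; all of these are simplifying assumptions, and the class name `TinyLSTM` is hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def matvec(M: Matrix, v: Vector) -> Vector:
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """Single-layer LSTM followed by a dense layer that outputs N values."""

    def __init__(self, input_size: int, hidden_size: int, n_vars: int, seed: int):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # One weight matrix per gate, acting on [input, previous hidden state].
        self.gates = {name: random_matrix(hidden_size, input_size + hidden_size, rng)
                      for name in ("input", "forget", "output", "cell")}
        self.dense = random_matrix(n_vars, hidden_size, rng)

    def forward(self, sequence: List[Vector]) -> Vector:
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            z = x + h  # concatenate input and previous hidden state
            i = [sigmoid(v) for v in matvec(self.gates["input"], z)]
            f = [sigmoid(v) for v in matvec(self.gates["forget"], z)]
            o = [sigmoid(v) for v in matvec(self.gates["output"], z)]
            g = [math.tanh(v) for v in matvec(self.gates["cell"], z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return matvec(self.dense, h)  # h^(k): one entry per variable


def sum_outputs(outputs: List[Vector]) -> Vector:
    """Element-wise sum of h^(1), ..., h^(K)."""
    return [sum(vals) for vals in zip(*outputs)]


# K = 2 feature maps of different lengths (toy values), N = 2 variables.
feature_maps = [[4.0, 6.0, 8.0, 10.0], [7.0, 11.0, 13.0]]
N = 2
lstms = [TinyLSTM(input_size=1, hidden_size=4, n_vars=N, seed=k)
         for k in range(len(feature_maps))]
outputs = [lstm.forward([[v] for v in fmap])
           for lstm, fmap in zip(lstms, feature_maps)]
dependency_output = sum_outputs(outputs)
```

The K networks share the same architecture but not their weights, which matches the reading of "K identical LSTMs" adopted in this sketch; a practical implementation would use a tensor library and train the weights jointly with the rest of the model.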

Step S5: Fuse the outputs of steps S2 and S4 by element-wise addition of the vectors to construct the multivariate prediction output, and train the network to obtain a trained model. The details are as follows:

The outputs of steps S2 and S4 are fused by element-wise addition of the vectors, using the following fusion formula:

[equation]

$W^{(se)}$ and $W^{(id)}$ are the fusion weights, which are set manually as hyperparameters of the model, and each weight is applied to its output by element-wise multiplication. This two-part additive fusion has a clear practical meaning and makes the algorithm more interpretable. Minimizing the loss function keeps the accuracy loss on individual variables from growing while also controlling the loss in overall prediction accuracy, which is the goal of the algorithm. For example, the predicted value of the i-th variable is the sum of two terms, and the relative magnitude of those two terms directly reflects the share that each of the two kinds of information contributes to the prediction. Once the model has been constructed, the loss function corresponding to the optimization objective is defined for training as follows:

[equation]

The overall model can be optimized by gradient descent. The weights learned during training are those of the Self-Evolution, CNN, and LSTM models. Because this is a single-step prediction model, the final output is a vector containing one predicted value per variable.
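The additive fusion of step S5 can be sketched as follows. The fusion function follows the description (element-wise weighting of each branch, then element-wise addition); the mean squared error is only a stand-in, because the loss formula is not available in this text, and all names and numbers are made up for illustration.

```python
from typing import List

Vector = List[float]


def fuse_outputs(se_out: Vector, id_out: Vector, w_se: Vector, w_id: Vector) -> Vector:
    """Element-wise weighted sum of the two branch outputs."""
    return [ws * s + wd * d for s, d, ws, wd in zip(se_out, id_out, w_se, w_id)]


def mean_squared_error(pred: Vector, target: Vector) -> float:
    """Stand-in loss (assumption): mean of squared errors over the N variables."""
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(pred)


# N = 3 variables.
se_out = [0.5, 0.25, 0.75]   # self-evolution branch (step S2)
id_out = [0.25, 0.5, 0.0]    # dependency branch (step S4)
w_se = [1.0, 0.5, 1.0]       # hand-set fusion weights (hyperparameters)
w_id = [1.0, 0.5, 0.5]

prediction = fuse_outputs(se_out, id_out, w_se, w_id)
loss = mean_squared_error(prediction, [0.75, 0.5, 0.5])

# Share of the self-evolution term in the prediction of variable 0.
se_share = (w_se[0] * se_out[0]) / prediction[0]
```

Comparing the two weighted terms for a given variable, as `se_share` does, is what makes the fused prediction interpretable: it shows how much of the forecast comes from the variable's own history and how much from inter-variable dependencies.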

Steps S1 to S5 above constitute the model training phase. As shown in FIG. 3, the multivariate time series prediction model based on self-evolution pre-training is trained on historical time series data.

Step S6: Use the trained model to predict the time series to be predicted.

As shown in FIG. 4, in the model running phase, the time series to be predicted is preprocessed and input into the trained model obtained in step S5, yielding the predicted values of the time series, which are the model's final predicted output for the real-time system. In this embodiment these are the predicted values of key indicators such as CPU load, bandwidth utilization, memory utilization, and response latency, which provide a strong reference for real-time monitoring and for operation and maintenance of the system.
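The running phase can be sketched as a small wrapper around a trained model. The sketch assumes min-max preprocessing with per-variable bounds from the training split and maps the prediction back to original units, a step the text does not mention; `persistence_model` is a stand-in for the trained network, and all names and numbers are hypothetical.

```python
from typing import Callable, List, Tuple

Row = List[float]  # values of all N variables at one time step


def predict_next(model: Callable[[List[Row]], Row],
                 raw_history: List[Row],
                 bounds: List[Tuple[float, float]]) -> Row:
    """Scale the latest window, run the model, and undo the scaling.

    `bounds[i]` is the (min, max) of variable i from the training split.
    """
    scaled = [[(v - lo) / (hi - lo) for v, (lo, hi) in zip(row, bounds)]
              for row in raw_history]
    scaled_pred = model(scaled)
    return [p * (hi - lo) + lo for p, (lo, hi) in zip(scaled_pred, bounds)]


def persistence_model(window: List[Row]) -> Row:
    """Stand-in for the trained network: repeat the last observation."""
    return list(window[-1])


bounds = [(0.0, 100.0), (0.0, 64.0)]   # e.g. CPU load in %, memory in GB (made up)
history = [[25.0, 16.0], [50.0, 32.0], [75.0, 48.0]]
forecast = predict_next(persistence_model, history, bounds)
```

Any callable that maps a scaled window to a scaled one-step prediction can be passed as `model`, so the same wrapper would serve the trained network.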

Those of ordinary skill in the art will understand that all or some of the steps of the above embodiments can be carried out in hardware, or by a program that instructs the relevant hardware, and that the program can be stored in a computer-readable storage medium. In the context of the invention, the computer-readable medium may be considered tangible and non-transitory. Non-limiting examples of non-transitory tangible computer-readable media include non-volatile memory circuits (for example flash memory circuits, erasable programmable read-only memory circuits, or mask read-only memory circuits), volatile memory circuits (for example static random access memory circuits or dynamic random access memory circuits), magnetic storage media (for example analog or digital magnetic tape or hard disk drives), and optical storage media (for example CD, DVD, or Blu-ray discs).

Program code for implementing the method of the invention may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions and operations specified in the flow charts and/or block diagrams to be carried out. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.

In addition, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve the desired results. In some circumstances, multitasking and parallel processing may be advantageous. Likewise, although the discussion above contains several specific implementation details, these should not be construed as limiting the scope of the invention. Certain features described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features described in the context of a single implementation can also be implemented in multiple implementations, separately or in any suitable sub-combination.

The preferred embodiments of the invention have been described in detail above. However, the invention is not limited to the specific details of those embodiments. Within the scope of the technical concept of the invention, various equivalent transformations can be made to its technical solution, and all such equivalent transformations fall within the scope of protection of the invention.

Claims

1. A multivariate time series prediction method based on self-evolution pre-training, characterized in that the method comprises the following steps:

S1. Obtain N-dimensional indicator data characterizing the performance of a system running in real time, form N univariate time series as the historical input sequences, and preprocess them;

S2. Input the N univariate time series into the self-evolution pre-training model constructed from difference features, establish a univariate linear regression model for each time series, and explicitly incorporate multi-order difference information into the linear regression model to form the corresponding prediction output;

S3. Apply one-dimensional convolutions with kernels of different sizes to the two-dimensional tensor formed by the N univariate time series, extracting inter-variable dependency information to obtain multiple feature maps;

S4. Input the feature maps into long short-term memory networks to model the multi-scale feature maps over time, capturing the temporal information of the multivariate dependencies; the network component for multivariate modeling forms the corresponding output;

S5. Fuse the outputs obtained in steps S2 and S4 by element-wise addition of the vectors to construct the multivariate prediction output model, and train it to obtain a trained model;

S6. Preprocess the time series to be predicted and input it into the trained model obtained in step S5 to obtain the predicted values of the time series.

2. The prediction method according to claim 1, wherein step S2 comprises:

S2-1. Input the N univariate time series into the self-evolution pre-training model constructed from difference features, and establish a univariate linear regression model for each series. For the i-th series:

$$\hat{x}^{(i)}_t = \sum_{j=1}^{T} w^{(i)}_j \, x^{(i)}_{t-j}$$

where $x^{(i)}_{t-j}$ is the value of the i-th series at time t-j, $w^{(i)}_j$ is the corresponding linear weight, and T is the length of the historical input sequence;

S2-2. Explicitly incorporate the multi-order difference information into the linear regression model to form the corresponding prediction output:

[equation]

In this formula, the difference features are combined through their fusion weights, softmax is the corresponding normalization operation, and Q is the maximum difference order. The formula also contains the standard linear autoregressive term, in which $x_{t-T:t-1}$ denotes the time series from time t-T to time t-1.

3. The prediction method according to claim 1, characterized in that, in step S3, the one-dimensional CNN convolution operation is as follows:

$$C^{(k)} = W^{(k)} * X_{t-T:t-1}$$

$W^{(k)}$ is the k-th convolution kernel, $C^{(k)}$ is the convolution result, that is, the feature map extracted from the inter-variable dependencies, and $X_{t-T:t-1}$ is the two-dimensional tensor formed by the N univariate time series, where $x_{t-T:t-1}$ denotes a time series from time t-T to time t-1 and T is the length of the historical input sequence.

4. The prediction method according to claim 1, characterized in that step S4 comprises: concatenating along the time dimension the feature maps $C^{(k)}$ of all channels of the k-th convolution kernel obtained in step S3, and using the result as the input of the corresponding k-th long short-term memory network; feeding the hidden output of that network unit to a fully connected layer to form the output vector $h^{(k)} \in \mathbb{R}^{N \times 1}$; the K identical LSTMs output $h^{(1)}, h^{(2)}, \ldots, h^{(K)}$ in total; and adding all vectors $h^{(1)}, h^{(2)}, \ldots, h^{(K)}$ to form the output of the multivariate dependency modeling.

5. The prediction method according to claim 1, characterized in that, in step S5, the outputs of steps S2 and S4 are fused by weighting, and the fusion method is as follows:

[equation]

where $W^{(se)}$ and $W^{(id)}$ are the fusion weights and each weight is applied to its output by element-wise multiplication.

6. The prediction method according to claim 5, characterized in that, in step S5, the model is trained, and the loss function corresponding to the optimization objective is constructed as follows:

[equation]

where N is the dimension of the multivariate time series.

7. A computer device, wherein the device comprises:

one or more processors;

memory; and

one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the programs, when executed by the processors, implement the steps of the method according to any one of claims 1 to 6.