CN108734338A - Credit risk forecast method and device based on LSTM models

Info

Publication number: CN108734338A
Application number: CN201810373757.3A
Authority: CN
Inventors: 洪满伙
Original assignee: Alibaba Group Holding Ltd
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Priority date: 2018-04-24
Filing date: 2018-04-24
Publication date: 2018-11-02
Also published as: WO2019209846A1; TWI788529B; US20190325514A1; TW201946013A

Abstract

A credit risk prediction method based on an LSTM model includes: obtaining user operation behavior data of a target account within a preset time period, where the preset time period is a time series composed of several time intervals with the same time step; generating, based on the user operation behavior data of the target account in each time interval, a user behavior vector sequence corresponding to each time interval; inputting the generated user behavior vector sequences into the LSTM encoder of a trained LSTM model based on an encoder-decoder architecture for calculation, to obtain a hidden state vector corresponding to each time interval, where the LSTM model includes an LSTM encoder and an LSTM decoder that introduces an attention mechanism; and inputting the hidden state vectors corresponding to the time intervals, as risk features, into the LSTM decoder for calculation, to obtain a risk score of the target account in the next time interval and a weight value of each hidden state vector with respect to the risk score.

Description

Credit risk prediction method and device based on LSTM model

Technical field

This specification relates to the communications field, and in particular to a credit risk prediction method and device based on an LSTM model.

Background

Credit risk prediction models are already widely used in existing credit risk prevention systems. A credit risk model is built by supplying a large number of risky transactions from risky accounts as training samples and extracting risk features from those transactions for training; the completed model is then used to predict and evaluate the credit risk of users' transaction accounts.

Summary of the invention

This specification proposes a credit risk prediction method based on an LSTM model, the method including:

obtaining user operation behavior data of a target account within a preset time period, where the preset time period is a time series composed of several time intervals with the same time step;

generating, based on the user operation behavior data of the target account in each time interval, a user behavior vector sequence corresponding to each time interval;

inputting the generated user behavior vector sequences corresponding to the time intervals into the LSTM encoder of a trained LSTM model based on an encoder-decoder architecture for calculation, to obtain a hidden state vector corresponding to each time interval, where the LSTM model includes an LSTM encoder and an LSTM decoder that introduces an attention mechanism;

inputting the hidden state vectors corresponding to the time intervals, as risk features, into the LSTM decoder for calculation, to obtain a risk score of the target account in the next time interval and a weight value of each hidden state vector with respect to the risk score, where the weight value represents the contribution of the hidden state vector to the risk score.

Optionally, the method also includes:

obtaining user operation behavior data, within the preset time period, of several sample accounts marked with risk labels;

generating, based on the user operation behavior data of the sample accounts in each time interval, a user behavior vector sequence corresponding to each time interval;

training the LSTM model based on the encoder-decoder architecture using the generated user behavior vector sequences as training samples.

Optionally, generating a user behavior vector sequence corresponding to each time interval based on an account's user operation behavior data in each time interval includes:

obtaining multiple kinds of user operation behavior data of the account in each time interval;

extracting key factors from the obtained user operation behavior data and digitizing the key factors to obtain user behavior vectors corresponding to the user operation behavior data;

splicing (concatenating) the user behavior vectors corresponding to the multiple kinds of user operation behavior data in each time interval to generate the user behavior vector sequence corresponding to each time interval.

Optionally, the multiple kinds of user behavior include credit performance behavior, user consumption behavior, and wealth management payment behavior;

the key factors include the loan order status and loan repayment amount corresponding to credit performance behavior, the user consumption category and number of consumption transactions corresponding to user consumption behavior, and the wealth management payment type and wealth management income amount corresponding to wealth management payment behavior.
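
The extract, digitize, and splice steps above can be sketched roughly as follows. This is only an illustration: the integer encodings, the record field names (`order_status`, `repayment_amount`, and so on), and the helper function names are all hypothetical and do not come from the specification, which leaves the digitization scheme open.

```python
# Hypothetical integer encodings for the categorical key factors (not from the source).
ORDER_STATUS = {"normal": 0, "overdue": 1, "settled": 2}
CONSUMPTION_CATEGORY = {"groceries": 0, "electronics": 1, "travel": 2}
PAYMENT_TYPE = {"fund_purchase": 0, "fund_redemption": 1}


def credit_vector(record):
    """Digitize credit performance factors: loan order status, repayment amount."""
    return [float(ORDER_STATUS[record["order_status"]]),
            float(record["repayment_amount"])]


def consumption_vector(record):
    """Digitize consumption factors: consumption category, number of transactions."""
    return [float(CONSUMPTION_CATEGORY[record["category"]]),
            float(record["transaction_count"])]


def wealth_vector(record):
    """Digitize wealth management factors: payment type, income amount."""
    return [float(PAYMENT_TYPE[record["payment_type"]]),
            float(record["income_amount"])]


def interval_vector(interval):
    """Splice (concatenate) the per-behavior vectors of one time interval."""
    return (credit_vector(interval["credit"])
            + consumption_vector(interval["consumption"])
            + wealth_vector(interval["wealth"]))


def behavior_sequence(intervals):
    """One spliced vector per time interval, in time order."""
    return [interval_vector(interval) for interval in intervals]


# Two illustrative monthly intervals for a single account (made-up values).
months = [
    {"credit": {"order_status": "normal", "repayment_amount": 1200},
     "consumption": {"category": "electronics", "transaction_count": 3},
     "wealth": {"payment_type": "fund_purchase", "income_amount": 15.5}},
    {"credit": {"order_status": "overdue", "repayment_amount": 0},
     "consumption": {"category": "travel", "transaction_count": 1},
     "wealth": {"payment_type": "fund_redemption", "income_amount": 2.0}},
]
sequence = behavior_sequence(months)
```

Each element of `sequence` is the input for one encoder data node. In practice the amounts would likely be normalized and the categories one-hot encoded or embedded; plain integer codes are used here only to keep the sketch short.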

Optionally, the LSTM encoder adopts a multi-layer many-to-one structure, and the LSTM decoder adopts a multi-layer many-to-many structure with symmetric numbers of input and output nodes.

Optionally, inputting the generated user behavior vector sequences corresponding to the time intervals into the LSTM encoder of the trained LSTM model based on the encoder-decoder architecture for calculation, to obtain a hidden state vector corresponding to each time interval, includes:

inputting the generated user behavior vector sequences corresponding to the time intervals into the LSTM encoder of the trained LSTM model based on the encoder-decoder architecture for bidirectional propagation calculation, to obtain a first hidden state vector from the forward propagation calculation and a second hidden state vector from the backward propagation calculation, where the user behavior vector sequences corresponding to the time intervals are input in opposite orders for the forward and backward propagation calculations;

splicing the first hidden state vector and the second hidden state vector to obtain the final hidden state vector corresponding to each time interval.
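
As a rough illustration of the bidirectional calculation, the sketch below runs a single-layer LSTM cell over the sequence in forward order and in reversed order, then concatenates the two hidden states for each time interval. It is a pure-Python toy under stated assumptions: the weights are random rather than trained, the sizes are arbitrary, only one layer is used (the specification describes a multi-layer encoder), and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(input_dim, hidden_dim, rng):
    """Random (untrained) weights for the input, forget, output and candidate gates."""
    def matrix():
        return [[rng.uniform(-0.5, 0.5) for _ in range(input_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    return {gate: (matrix(), [0.0] * hidden_dim) for gate in "ifog"}


def lstm_step(x, h, c, params):
    """One standard LSTM cell update for input x, hidden state h, cell state c."""
    z = x + h  # concatenated [x; h]
    pre = {gate: [sum(w * v for w, v in zip(row, z)) + b
                  for row, b in zip(weights, bias)]
           for gate, (weights, bias) in params.items()}
    i = [sigmoid(v) for v in pre["i"]]
    f = [sigmoid(v) for v in pre["f"]]
    o = [sigmoid(v) for v in pre["o"]]
    g = [math.tanh(v) for v in pre["g"]]
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def run_lstm(sequence, params, hidden_dim):
    """Return the hidden state produced at every step of the sequence."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    states = []
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
        states.append(h)
    return states


def bidirectional_encode(sequence, fwd_params, bwd_params, hidden_dim):
    """Concatenate forward and backward hidden states per time interval."""
    forward = run_lstm(sequence, fwd_params, hidden_dim)
    # The backward pass reads the intervals in the opposite order; its outputs
    # are reversed again so index t still refers to time interval t.
    backward = run_lstm(sequence[::-1], bwd_params, hidden_dim)[::-1]
    return [f + b for f, b in zip(forward, backward)]


rng = random.Random(0)
INPUT_DIM, HIDDEN_DIM, STEPS = 6, 3, 12  # arbitrary illustrative sizes
behavior_vectors = [[rng.random() for _ in range(INPUT_DIM)] for _ in range(STEPS)]
fwd_params = init_params(INPUT_DIM, HIDDEN_DIM, rng)
bwd_params = init_params(INPUT_DIM, HIDDEN_DIM, rng)
final_states = bidirectional_encode(behavior_vectors, fwd_params, bwd_params, HIDDEN_DIM)
```

Each entry of `final_states` has length `2 * HIDDEN_DIM`: the first half summarizes the intervals up to and including interval t, the second half the intervals from t onward.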

Optionally, inputting the hidden state vectors corresponding to the time intervals, as risk features, into the LSTM decoder for calculation, to obtain the risk score of the target account in the next time interval, includes:

inputting the hidden state vectors corresponding to the time intervals, as risk features, into the LSTM decoder for calculation, to obtain an output vector of the target account in the next time interval;

digitizing the output vector to obtain the risk score of the target account in the next time interval.

Optionally, the output vector is a multidimensional vector;

digitizing the output vector includes any one of the following:

extracting the value of a sub-vector of the output vector whose value lies between 0 and 1 as the risk score;

if the output vector contains multiple sub-vectors whose values lie between 0 and 1, calculating the average of the values of those sub-vectors as the risk score;

if the output vector contains multiple sub-vectors whose values lie between 0 and 1, extracting the maximum or minimum of the values of those sub-vectors as the risk score.
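
The three digitization options can be sketched as one small function. The sketch assumes that each "sub-vector" is a single scalar component of the output vector and that the bounds 0 and 1 are inclusive; the function name `risk_score` and the `mode` parameter are hypothetical.

```python
def risk_score(output_vector, mode="first"):
    """Reduce a multidimensional decoder output to one risk score.

    Only components whose value lies in [0, 1] are considered.
    mode: "first" (take one such component), "mean", "max" or "min".
    """
    candidates = [v for v in output_vector if 0.0 <= v <= 1.0]
    if not candidates:
        raise ValueError("output vector has no component between 0 and 1")
    if mode == "first":
        return candidates[0]
    if mode == "mean":
        return sum(candidates) / len(candidates)
    if mode == "max":
        return max(candidates)
    if mode == "min":
        return min(candidates)
    raise ValueError(f"unknown mode: {mode}")


example_output = [1.5, 0.25, -2.0, 0.75]  # made-up decoder output
mean_score = risk_score(example_output, "mean")
max_score = risk_score(example_output, "max")
```

Choosing the maximum gives the most conservative (highest-risk) reading of the output, while the mean smooths over the qualifying components; which is appropriate depends on the business requirement.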

This specification also proposes a credit risk prediction device based on an LSTM model, the device including:

an acquisition module that obtains user operation behavior data of a target account within a preset time period, where the preset time period is a time series composed of several time intervals with the same time step;

a generation module that generates, based on the user operation behavior data of the target account in each time interval, a user behavior vector sequence corresponding to each time interval;

a first calculation module that inputs the generated user behavior vector sequences corresponding to the time intervals into the LSTM encoder of a trained LSTM model based on an encoder-decoder architecture for calculation, to obtain a hidden state vector corresponding to each time interval, where the LSTM model includes an LSTM encoder and an LSTM decoder that introduces an attention mechanism;

a second calculation module that inputs the hidden state vectors corresponding to the time intervals, as risk features, into the LSTM decoder for calculation, to obtain a risk score of the target account in the next time interval and a weight value of each hidden state vector with respect to the risk score, where the weight value represents the contribution of the hidden state vector to the risk score.

Optionally, the acquisition module further:

the generation module further:

the device also includes:

a training module that uses the generated user behavior vector sequences as training samples to train the LSTM model based on the encoder-decoder architecture.

Optionally, the generation module further:

Optionally, the first calculation module:

Optionally, the second calculation module:

This specification also proposes an electronic device, including:

a processor;

a memory for storing machine-executable instructions;

where, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic of credit risk prediction based on the LSTM model, the processor is caused to:

Brief description of the drawings

Fig. 1 is a flowchart of a credit risk prediction method based on an LSTM model provided by an embodiment of this specification;

Fig. 2 shows an LSTM model based on an encoder-decoder architecture provided by an embodiment of this specification;

Fig. 3 is a schematic diagram of several multi-layer LSTM network architectures provided by an embodiment of this specification;

Fig. 4 is a schematic diagram of dividing users into groups provided by an embodiment of this specification;

Fig. 5 is a schematic diagram of constructing user behavior vector sequences for the data nodes in the LSTM encoder provided by an embodiment of this specification;

Fig. 6 is a hardware structure diagram of a server carrying a credit risk prediction device based on an LSTM model provided by an embodiment of this specification;

Fig. 7 is a logical block diagram of a credit risk prediction device based on an LSTM model provided by an embodiment of this specification.

Detailed description

This specification proposes a technical solution in which, in a scenario of predicting the credit risk of a target account, an LSTM model based on an encoder-decoder architecture is trained on user operation behavior data of the target account over a period of time, and the trained LSTM model is then used to predict the credit risk of the target account over a future period.

In implementation, the modeling party can predefine a target time period for which credit risk is to be predicted as the performance window, and predefine a preset time period over which the user behavior of the target account is observed as the observation window, and then organize the performance window and the observation window into time series based on a time step defined by the modeling party.

For example, suppose the modeling party needs to predict the credit risk of the target account over the next 6 months based on the target account's user operation behavior data over the past 12 months. The performance window can then be set to the next 6 months and the observation window to the past 12 months. Assuming the time step defined by the modeling party is 1 month, the performance window and the observation window can each be divided into several time intervals with a time step of 1 month to form time series. Each time interval is then called a data node in the time series.

The modeling party can prepare several sample accounts marked with risk labels and obtain the user operation behavior data of these sample accounts within the observation window. Based on each sample account's user operation behavior data in each time interval of the observation window, it constructs user behavior vector sequences corresponding to the time intervals as training samples, and uses them to train the LSTM model based on the encoder-decoder architecture. The LSTM model includes an LSTM encoder and an LSTM decoder that introduces an attention mechanism.

For example, the training samples can be input into the LSTM encoder for training calculation to train the LSTM encoder. The hidden state vectors corresponding to the time intervals, calculated from the training samples while training the LSTM encoder, are then input into the LSTM decoder as the feature variables required for training the decoder, so as to train the LSTM decoder. This process is executed iteratively until training of the LSTM model is complete.

When the modeling party uses the trained LSTM model to predict the credit risk of the target account in the performance window, it can, in the same way, obtain the user operation behavior data of the target account within the observation window and, based on that data in each time interval of the observation window, construct user behavior vector sequences corresponding to the time intervals as prediction samples. These prediction samples are then input into the LSTM encoder of the LSTM model for calculation to obtain the hidden state vectors corresponding to the time intervals.

Further, the hidden state vectors corresponding to the time intervals calculated by the LSTM encoder can be used as the risk features of the target account and input into the LSTM decoder of the LSTM model for calculation, which outputs the risk score of the target account and the weight value of each hidden state vector with respect to the risk score. The weight value represents the contribution of the hidden state vector to the risk score.

In the above technical solution, on the one hand, the user behavior vector sequences of the target account in each time interval are input directly into the LSTM encoder of the LSTM model based on the encoder-decoder architecture to obtain the hidden state vectors corresponding to the time intervals, and these hidden state vectors are then input as risk features into the LSTM decoder to complete the risk prediction of the target account and obtain a risk score. Modelers therefore do not need to develop and explore the feature variables required for modeling from the target account's user operation behavior data. This avoids the situation in which feature variables designed from the modelers' experience are not accurate enough, making it difficult to mine the information contained in the data in depth and reducing the accuracy of the model's risk prediction. In addition, manually designed feature variables do not need to be stored and maintained, which reduces the storage overhead of the system.

On the other hand, because an attention mechanism is introduced into the LSTM decoder of the LSTM model based on the encoder-decoder architecture, when the hidden feature variables corresponding to the time intervals obtained by the LSTM encoder are input as risk features into the LSTM decoder for risk prediction, the weight value of the hidden state vector of each time interval with respect to the final risk score can be obtained. The contribution of each hidden feature variable to the final risk score can thus be evaluated intuitively, which improves the interpretability of the LSTM model.

This specification is described below through specific embodiments in combination with specific application scenarios.

Please refer to Fig. 1, which shows a credit risk prediction method based on an LSTM model provided by an embodiment of this specification. The method is applied to a server and performs the following steps:

Step 102: obtain user operation behavior data of a target account within a preset time period, where the preset time period is a time series composed of several time intervals with the same time step;

Step 104: based on the user operation behavior data of the target account in each time interval, generate a user behavior vector sequence corresponding to each time interval;

Step 106: input the generated user behavior vector sequences corresponding to the time intervals into the LSTM encoder of a trained LSTM model based on an encoder-decoder architecture for calculation, to obtain a hidden state vector corresponding to each time interval, where the LSTM model includes an LSTM encoder and an LSTM decoder that introduces an attention mechanism;

Step 108: input the hidden state vectors corresponding to the time intervals, as risk features, into the LSTM decoder for calculation, to obtain a risk score of the target account in the next time interval and a weight value of each hidden state vector with respect to the risk score, where the weight value represents the contribution of the hidden state vector to the risk score.

The target account may include a user's payment account. The user can initiate a payment transaction by logging in to the target account on a corresponding payment client (such as a payment app).

The server may include a server, a server cluster, or a cloud platform built on a server cluster that provides services to the user-facing payment client and performs risk identification on the payment account the user uses to log in to the client.

The operation behavior data may include data generated by a series of transaction-related operations performed by the user after logging in to the target account on the client.

For example, the operation behaviors may include the user's credit performance behavior, consumption behavior, wealth management payment behavior, store operation behavior, daily social behavior, and so on. When the user performs these operations through the client, the client can upload the data generated by them to the server, and the server stores the data as events in its local database.

In this specification, the modeling party can predefine a target time period for which credit risk is to be predicted as the performance window, and predefine a preset time period over which the user behavior of the target account is observed as the observation window, and then organize the performance window and the observation window into time series based on a time step defined by the modeling party.

The lengths of the time periods corresponding to the performance window and the observation window can be set by the modeling party according to the actual prediction target and are not specifically limited in this specification. Likewise, the size of the time step can be set by the modeling party according to actual business requirements and is not specifically limited in this specification.

The following embodiments are described using an example in which the modeling party needs to predict the credit risk of the target account over the next 6 months based on the target account's user operation behavior data over the past 12 months, with a defined time step of 1 month.

In this case, the performance window can be set to the next 6 months and the observation window to the past 12 months. Further, according to the defined time step, the performance window can be divided into 6 time intervals of 1 month each, which are then organized into a time series; and the observation window can be divided into 12 time intervals of 1 month each, which are likewise organized into a time series.
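
The window division in this example can be sketched as follows. The sketch assumes calendar-month intervals and that the observation window ends immediately before the first month of the performance window; the function names and the `(year, month)` representation are hypothetical.

```python
def to_index(year, month):
    """Map a calendar month to a running month index."""
    return year * 12 + (month - 1)


def from_index(index):
    """Inverse of to_index: return (year, month) with month in 1..12."""
    year, month0 = divmod(index, 12)
    return (year, month0 + 1)


def build_windows(perf_year, perf_month, observation_len=12, performance_len=6):
    """Split both windows into 1-month time intervals.

    (perf_year, perf_month) is the first month to be predicted. The observation
    window is the `observation_len` months immediately before it, so the first
    decoder node directly follows the last encoder node.
    """
    start = to_index(perf_year, perf_month)
    observation = [from_index(i) for i in range(start - observation_len, start)]
    performance = [from_index(i) for i in range(start, start + performance_len)]
    return observation, performance


# Illustrative date only: predict from May 2018 onward.
observation_window, performance_window = build_windows(2018, 5)
```

Each tuple in `observation_window` corresponds to one data node of the LSTM encoder, and each tuple in `performance_window` to one data node of the LSTM decoder.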

Please refer to Fig. 2, which shows an LSTM model based on an encoder-decoder architecture described in this specification.

As shown in Fig. 2, the LSTM model based on the encoder-decoder architecture may include an LSTM encoder and an LSTM decoder that introduces an attention mechanism.

The LSTM encoder (Encoder) performs feature discovery on the user behavior vector sequences input at the data nodes of the observation window, and passes the hidden state vectors output by the data nodes (that is, the discovered features) on to the LSTM decoder. The data nodes in the LSTM encoder correspond to the time intervals in the observation window: each time interval in the observation window corresponds to one data node in the LSTM encoder.

The LSTM decoder (Decoder) predicts the credit risk at each data node of the performance window, based on the risk features that the LSTM encoder discovers from the input user behavior vector sequences and on the user's behavior at each data node of the observation window, and outputs a prediction result for each data node of the performance window. The data nodes in the LSTM decoder correspond to the time intervals in the performance window: each time interval in the performance window corresponds to one data node in the LSTM decoder.

It should be noted that the time interval corresponding to the first data node in the LSTM decoder is the time interval immediately following the one corresponding to the last data node in the encoder. For example, in Fig. 2, 0-M1 denotes the time interval corresponding to the month before the current moment; S denotes the time interval corresponding to the current month; and P-M1 denotes the time interval corresponding to the month after the current moment.

The attention mechanism (Attention) assigns, to the feature output by each data node of the LSTM encoder in the observation window, a weight value with respect to the prediction result output by each data node of the LSTM decoder in the performance window. The weight value represents the contribution (also called the degree of influence) of the feature output by each encoder data node in the observation window to the prediction result output by each decoder data node in the performance window.

By introducing the attention mechanism, the modeling party can see directly how much the features discovered by the LSTM encoder at each data node of the observation window contribute to the prediction results that the LSTM decoder finally outputs at each data node of the performance window, which improves the interpretability of the LSTM model.
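
A minimal sketch of how such weight values might be computed for one decoder data node is given below. The specification does not state the scoring function, so dot-product scoring followed by a softmax is used here purely as an assumption; the function names and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoder data node.

    weights[t] is the contribution of the encoder hidden state of time
    interval t; context is the weighted sum of the encoder hidden states.
    Dot-product scoring is an assumption, not taken from the specification.
    """
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[k] for w, state in zip(weights, encoder_states))
               for k in range(dim)]
    return weights, context


# Toy hidden states for three observation-window intervals (made-up values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

The weights are non-negative and sum to 1, so they can be read directly as the relative contribution of each observation-window interval to the prediction for that decoder node. In this toy input the first interval receives the largest weight because its hidden state is most aligned with the decoder state.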

In one illustrated embodiment, in order to characterize the user's operation behavior, both the LSTM encoder and the LSTM decoder may adopt a multi-layer LSTM network architecture (for example, more than 3 layers).

The specific form of the multi-layer LSTM network architecture adopted by the LSTM encoder and the LSTM decoder is not particularly limited in this specification. For example, as shown in Fig. 3, a multi-layer LSTM network architecture can generally take forms such as one-to-one, one-to-many, many-to-one, many-to-many with asymmetric numbers of input and output nodes, and many-to-many with symmetric numbers of input and output nodes.

In one illustrated embodiment, because the LSTM encoder ultimately needs to aggregate the hidden state vectors output by the data nodes of the observation window into a single input, the LSTM encoder may adopt the many-to-one structure shown in Fig. 3. Because the LSTM decoder ultimately needs to output a corresponding prediction result for each data node of the performance window, the LSTM decoder may adopt the many-to-many structure with symmetric numbers of input and output nodes shown in Fig. 3.

以下通过具体的实施例对以上示出的基于encoder–decoder架构的LSTM模型的训练以及使用过程进行详细描述。The training and use process of the LSTM model based on the encoder-decoder architecture shown above will be described in detail through specific embodiments.

1) User grouping

In this specification, different user groups differ considerably in data richness and in credit behavior. To keep these differences from affecting model accuracy, when modeling the users who require credit risk assessment, the users can be divided into groups according to these differences, and a separate LSTM model can then be trained for each group to assess the credit risk of the users in that group.

The features used to divide the users into groups, and the specific grouping method, are not particularly limited in this specification.

For example, in practical applications, users can be grouped according to features such as data richness, occupation, number of overdue payments, and age. As shown in FIG. 4, in one example all users are first divided into a data-sparse group and a data-rich group; the data-sparse group is then further divided by occupation into groups such as salaried workers and students, and the data-rich group is further divided by number of overdue payments into groups such as good credit and average credit.
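The two-level split in the FIG. 4 example can be sketched as follows. This is only an illustration: the `User` fields, the `rich_threshold` cut-off, and the group names are hypothetical and are not specified in this document.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: str
    record_count: int   # hypothetical proxy for data richness
    occupation: str
    overdue_count: int


def assign_group(user: User, rich_threshold: int = 100) -> str:
    """Assign a user to a group: first by data richness, then by a second feature."""
    if user.record_count < rich_threshold:
        # Data-sparse users are split further by occupation.
        if user.occupation in ("salaried", "student"):
            return "sparse/" + user.occupation
        return "sparse/other"
    # Data-rich users are split further by number of overdue payments.
    return "rich/good_credit" if user.overdue_count == 0 else "rich/average_credit"
```

One LSTM model would then be trained per returned group name.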

2) Training of the LSTM model based on the encoder-decoder architecture

In this specification, when training the LSTM model for a given user group, the modeling party can collect a large number of user accounts that belong to that group and are marked with risk labels as sample accounts.

The risk labels may include a label indicating that an account has credit risk and a label indicating that an account has no credit risk; for example, a sample account with credit risk can be marked with label 1, and a sample account without credit risk can be marked with label 0.

It should be noted that the ratio between sample accounts labeled as having credit risk and sample accounts labeled as having no credit risk is not particularly limited in this specification; the modeling party can set it according to actual modeling requirements.

Further, the modeling party can obtain the user operation behavior data that these labeled sample accounts generated in each time interval of the observation window, construct a corresponding user behavior vector sequence for each data node based on the data generated in the time interval corresponding to that data node, and then use the constructed user behavior vector sequences as training samples to train the LSTM model based on the encoder-decoder architecture.

In one embodiment shown, the modeling party can predefine multiple types of user operation behavior used to construct the user behavior vector sequences. When constructing the user behavior vector sequence for each data node in the observation window, the modeling party can obtain the user operation behavior data of each predefined type that the sample account generated in each time interval of the observation window, extract key factors from that data, and digitize the extracted key factors to obtain a user behavior vector corresponding to each type of user operation behavior data.

Further, after the user behavior vectors are obtained, the user behavior vectors of the multiple types of user operation behavior data within the time interval corresponding to each data node in the observation window can be concatenated to generate the user behavior vector sequence for that time interval.

The types of user operation behavior defined by the modeling party are not particularly limited in this specification and can be customized according to actual needs. The key factors extracted from the corresponding user operation behavior data are likewise not particularly limited; any important constituent element of the user operation behavior data can serve as a key factor.

Referring to FIG. 5, FIG. 5 is a schematic diagram, shown in this specification, of constructing a user behavior vector sequence for each data node in the LSTM encoder.

In one embodiment shown, the types of user operation behavior defined by the modeling party may include credit performance behavior, user consumption behavior, and wealth management payment behavior. Correspondingly, the key factors may include the loan order status and loan repayment amount for credit performance behavior, the consumption category and number of consumption transactions for user consumption behavior, and the wealth management payment type and wealth management income amount for wealth management payment behavior, and so on.

For each time interval in the observation window, the credit performance behavior data, user consumption behavior data, and wealth management payment behavior data that the sample account generated in that interval can be obtained. The loan order status (FIG. 5 shows two statuses, normal and overdue) and the loan repayment amount (FIG. 5 shows the actual loan amount and the overdue amount; for example, "overdue 1/50" indicates one overdue payment with an overdue amount of 50 yuan, and "normal/10" indicates a normal repayment of 10 yuan) are then extracted from the credit performance behavior data. The consumption category (FIG. 5 shows four categories: mobile phones, gold, recharge, and clothing) and the number of consumption transactions are extracted from the user consumption behavior data. The wealth management payment type (FIG. 5 shows two product types: money market fund and fund) and the wealth management income amount are extracted from the wealth management payment behavior data.

Further, the information extracted from the credit performance behavior data, user consumption behavior data, and wealth management payment behavior data can be digitized to obtain, for each type of user operation behavior data, a user behavior vector for each time interval. The user behavior vectors of the three types of data for a given time interval can then be concatenated to obtain the user behavior vector sequence corresponding to that time interval.
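The digitize-then-concatenate step described above can be sketched as follows. The encodings are assumptions made for illustration (a 0/1 status flag, per-category transaction counts, and a one-hot product type); this document does not prescribe a particular digitization, and the record field names are hypothetical.

```python
ORDER_STATUS = {"normal": 0.0, "overdue": 1.0}
CATEGORIES = ["mobile", "gold", "recharge", "clothing"]
FUND_TYPES = ["money_market_fund", "fund"]


def credit_vector(status, amount):
    """Loan order status and repayment (or overdue) amount."""
    return [ORDER_STATUS[status], float(amount)]


def consumption_vector(counts):
    """Number of consumption transactions per category, in a fixed order."""
    return [float(counts.get(c, 0)) for c in CATEGORIES]


def wealth_vector(fund_type, income):
    """One-hot wealth management product type followed by the income amount."""
    return [1.0 if fund_type == t else 0.0 for t in FUND_TYPES] + [float(income)]


def interval_vector(record):
    """Concatenate the three behavior vectors for one time interval."""
    return (credit_vector(record["status"], record["amount"])
            + consumption_vector(record["consumption"])
            + wealth_vector(record["fund_type"], record["income"]))


def behavior_sequence(records):
    """One concatenated vector per time interval of the observation window."""
    return [interval_vector(r) for r in records]
```

Each element of the returned list would feed one data node of the LSTM encoder.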

In this specification, the calculations performed by the LSTM encoder in the LSTM model based on the encoder-decoder architecture generally comprise four parts: the input gate calculation, the memory gate (also called forget gate) calculation, the cell state calculation, and the hidden state vector calculation. Because the hidden state vectors computed by the LSTM encoder are ultimately aggregated and used as the input of the LSTM decoder, the LSTM encoder need not involve an output gate. The formulas for these parts are as follows:

f(t) = f(W_f*X_t + U_f*h(t-1) + b_f)

i(t) = f(W_i*X_t + U_i*h(t-1) + b_i)

m(t) = tanh(W_m*X_t + U_m*h(t-1) + b_m)

h(t) = f(t)*h(t-1) + i(t)*m(t)

In the above formulas, X_t denotes the user behavior vector sequence input to the t-th data node of the LSTM encoder; f(t) denotes the memory gate of the t-th data node; i(t) denotes the input gate of the t-th data node; m(t) denotes the cell state (also called the candidate hidden state) of the t-th data node; h(t) denotes the hidden state vector corresponding to the t-th data node (that is, the t-th time interval); and h(t-1) denotes the hidden state vector corresponding to the data node preceding the t-th data node. The f applied on the right-hand side denotes a nonlinear activation function, which can be chosen according to actual needs; for the LSTM encoder, for example, f can be the sigmoid function. W_f and U_f denote the weight matrices of the memory gate, and b_f denotes its bias term. W_i and U_i denote the weight matrices of the input gate, and b_i denotes its bias term. W_m and U_m denote the weight matrices of the cell state, and b_m denotes its bias term.
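The encoder formulas above can be traced with a small pure-Python sketch. It assumes sigmoid as the activation f and a zero initial hidden state; the `params` dictionary layout (one `(W, U, b)` tuple per gate) is a hypothetical convenience, not something defined in this document.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Compute W*x + U*h + b, with matrices given as lists of rows."""
    return [
        sum(w * v for w, v in zip(w_row, x))
        + sum(u * v for u, v in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def encoder_step(x_t, h_prev, params):
    """One encoder data node: memory gate, input gate, cell state, hidden state."""
    W_f, U_f, b_f = params["f"]
    W_i, U_i, b_i = params["i"]
    W_m, U_m, b_m = params["m"]
    f_t = [sigmoid(z) for z in affine(W_f, x_t, U_f, h_prev, b_f)]
    i_t = [sigmoid(z) for z in affine(W_i, x_t, U_i, h_prev, b_i)]
    m_t = [math.tanh(z) for z in affine(W_m, x_t, U_m, h_prev, b_m)]
    # h(t) = f(t)*h(t-1) + i(t)*m(t); no output gate in the encoder.
    return [f * h + i * m for f, h, i, m in zip(f_t, h_prev, i_t, m_t)]


def encode(xs, params, hidden_size):
    """Run all data nodes in order and collect h(1)..h(T)."""
    h = [0.0] * hidden_size  # assumed initial hidden state
    states = []
    for x_t in xs:
        h = encoder_step(x_t, h, params)
        states.append(h)
    return states
```

The list returned by `encode` holds one hidden state vector per time interval, which is what the decoder's attention mechanism consumes.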

In this specification, the calculations of the attention mechanism introduced into the LSTM decoder of the LSTM model based on the encoder-decoder architecture generally comprise two parts: calculating the contribution value, and normalizing the contribution value (to between 0 and 1) to convert it into a weight value. The formulas for these parts are as follows:

etj = tanh(W_a*S(j-1) + U_a*h(t))

atj = exp(etj)/sum_T(exp(etj))

In the above formulas, etj denotes the contribution value of the hidden state vector corresponding to the t-th data node of the LSTM encoder to the prediction result corresponding to the j-th data node of the LSTM decoder; atj denotes the weight value obtained by normalizing etj; exp(etj) denotes applying the exponential function to etj; and sum_T(exp(etj)) denotes the sum of exp(etj) over all T data nodes of the LSTM encoder. S(j-1) denotes the hidden state vector corresponding to the data node preceding the j-th data node of the LSTM decoder. W_a and U_a are the weight matrices of the attention mechanism.

It should be noted that the above formulas normalize etj to the interval [0,1] by dividing the exponential of etj by the sum of the exponentials of etj over all T data nodes of the LSTM encoder. In practical applications, those skilled in the art may also adopt other normalization methods when implementing the technical solution of this specification; these are not enumerated here.
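The two attention steps above (contribution value, then normalization) can be sketched as follows. The sketch assumes that W_a and U_a are row vectors, so that each contribution value is a scalar; this document does not state their shapes, so treat that as an illustrative simplification.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_weights(s_prev, encoder_states, W_a, U_a):
    """Return one normalized weight per encoder hidden state h(1)..h(T)."""
    # Contribution value of each encoder hidden state for the current decoder node.
    scores = [math.tanh(dot(W_a, s_prev) + dot(U_a, h_t)) for h_t in encoder_states]
    # Normalize with exponentials so the weights lie in [0, 1] and sum to 1.
    exps = [math.exp(e) for e in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Because the weights sum to 1, each one can be read as the relative share of a time interval's hidden state in the corresponding prediction.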

In this specification, the calculations performed by the LSTM decoder in the LSTM model based on the encoder-decoder architecture generally comprise six parts: the input gate calculation, the memory gate calculation, the output gate calculation, the cell state calculation, the hidden state vector calculation, and the output vector calculation. The formulas for these parts are as follows:

F(j) = f(W_F*C_j + U_F*S(j-1) + K_F*y(j-1) + b_F)

I(j) = f(W_I*C_j + U_I*S(j-1) + K_I*y(j-1) + b_I)

O(j) = f(W_O*C_j + U_O*S(j-1) + K_O*y(j-1) + b_O)

n(j) = tanh(W_n*C_j + U_n*S(j-1) + K_n*y(j-1) + b_n)

S(j) = F(j)*S(j-1) + I(j)*n(j)

y(j) = O(j)*tanh(S(j))

C_j = sum_T(atj*h(t))

In the above formulas, F(j) denotes the memory gate of the j-th data node of the LSTM decoder; I(j) denotes the input gate of the j-th data node; O(j) denotes the output gate of the j-th data node; n(j) denotes the cell state of the j-th data node; S(j) denotes the hidden state vector corresponding to the j-th data node; S(j-1) denotes the hidden state vector corresponding to the data node preceding the j-th data node; and y(j) denotes the output vector of the j-th data node. f denotes a nonlinear activation function, which can be chosen according to actual needs; for the LSTM decoder, for example, f can also be the sigmoid function. C_j denotes the weighted sum obtained by multiplying the hidden state vector h(t) of each data node of the LSTM encoder by the attention weight atj calculated by the attention mechanism of the LSTM decoder. W_F, U_F and K_F denote the weight matrices of the memory gate, and b_F denotes its bias term. W_I, U_I and K_I denote the weight matrices of the input gate, and b_I denotes its bias term. W_O, U_O and K_O denote the weight matrices of the output gate, and b_O denotes its bias term. W_n, U_n and K_n denote the weight matrices of the cell state, and b_n denotes its bias term.
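The decoder formulas can be traced with the following pure-Python sketch. It assumes sigmoid as the activation f; the `params` layout (one `(W, U, K, b)` tuple per gate) and the function names are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gate(weights, c_j, s_prev, y_prev, activation):
    """activation(W*C_j + U*S(j-1) + K*y(j-1) + b) for one gate."""
    W, U, K, b = weights
    terms = zip(matvec(W, c_j), matvec(U, s_prev), matvec(K, y_prev), b)
    return [activation(wc + us + ky + bias) for wc, us, ky, bias in terms]


def context_vector(attention, encoder_states):
    """C_j: attention-weighted sum of the encoder hidden states h(1)..h(T)."""
    size = len(encoder_states[0])
    return [sum(a * h[k] for a, h in zip(attention, encoder_states))
            for k in range(size)]


def decoder_step(c_j, s_prev, y_prev, params):
    """One decoder data node; returns (S(j), y(j))."""
    F_j = gate(params["F"], c_j, s_prev, y_prev, sigmoid)    # memory gate
    I_j = gate(params["I"], c_j, s_prev, y_prev, sigmoid)    # input gate
    O_j = gate(params["O"], c_j, s_prev, y_prev, sigmoid)    # output gate
    n_j = gate(params["n"], c_j, s_prev, y_prev, math.tanh)  # cell state
    s_j = [f * s + i * n for f, s, i, n in zip(F_j, s_prev, I_j, n_j)]
    y_j = [o * math.tanh(s) for o, s in zip(O_j, s_j)]
    return s_j, y_j
```

In a full run, `context_vector` would be recomputed for every decoder data node from that node's attention weights before calling `decoder_step`.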

In this specification, the parameters W_f, U_f, b_f, W_i, U_i, b_i, W_m, U_m, b_m, W_a, U_a, W_F, U_F, K_F, b_F, W_I, U_I, K_I, b_I, W_O, U_O, K_O, b_O, W_n, U_n, K_n, and b_n shown in the above formulas are the model parameters that the LSTM model ultimately needs to learn through training.

When training the LSTM model, the user behavior vector sequences corresponding to each time interval, constructed from the user operation behavior data of the labeled sample accounts in each time interval of the observation window, can be used as training samples and input into the LSTM encoder for training calculation. The calculation results of the LSTM encoder are then input into the LSTM decoder for further training calculation, and the model parameters are adjusted continuously by iterating this training process. When the parameters have been adjusted to their optimal values, the training algorithm has converged and the training of the LSTM model is complete.

It should be noted that the training algorithm used to train the LSTM model is not particularly limited in this specification; for example, in one implementation, the gradient descent method can be used to perform the iterative calculations that train the LSTM model.
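As a reminder of what the iterative adjustment looks like, here is a generic gradient descent loop applied to a toy one-parameter objective. It is not the LSTM training procedure itself: the objective, learning rate, and step count are illustrative assumptions, and a real implementation would obtain the gradients of the model loss with respect to all of the parameters listed above.

```python
def gradient_descent(grad_fn, params, learning_rate=0.1, steps=200):
    """Repeatedly move the parameters against the gradient returned by grad_fn."""
    params = list(params)
    for _ in range(steps):
        grads = grad_fn(params)
        params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params


# Toy objective (w - 3)^2, whose gradient is 2 * (w - 3).
fitted = gradient_descent(lambda w: [2.0 * (w[0] - 3.0)], [0.0])
```

The loop stops after a fixed number of steps here; a convergence check on the loss or on the gradient norm would be a common alternative.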

3) Credit risk prediction with the LSTM model based on the encoder-decoder architecture

In this specification, following the model training process shown in the above embodiments, an LSTM model is trained for each user group, and the trained LSTM model is used to assess the credit risk of the user accounts belonging to that group.

When the modeling party needs to assess the risk of a target account, it can obtain the user operation behavior data that the target account generated in each time interval of the observation window and, based on the data generated in the time interval corresponding to each data node in the observation window, construct a corresponding user behavior vector sequence for each data node.

The process of constructing the user behavior vector sequences for the target account is not repeated here; reference can be made to the description of the previous embodiments. For example, the approach shown in FIG. 5 can again be used to construct, for the target account, the user behavior vector sequence corresponding to each time interval in the observation window.

After the user behavior vector sequences corresponding to the time intervals in the observation window have been constructed for the target account, the LSTM model corresponding to the user group to which the target account belongs can first be selected from the trained LSTM models. The user behavior vector sequences are then input as prediction samples to the data nodes of the LSTM encoder of that LSTM model for calculation.

An LSTM model usually uses either forward propagation calculation or backward propagation calculation. Forward propagation calculation means that the user behavior vector sequences corresponding to the time intervals in the observation window are input into the LSTM model in the same order as the propagation direction of the data nodes in the LSTM model. Conversely, backward propagation calculation means that the user behavior vector sequences are input into the LSTM model in the order opposite to the propagation direction of the data nodes.

In other words, the order in which the user behavior vector sequences of the time intervals in the observation window are input is exactly reversed between backward propagation calculation and forward propagation calculation.
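The difference between the two input orders can be shown in a few lines. The function name and the `direction` values are hypothetical; only the reversal of the input order comes from the text above.

```python
def ordered_inputs(sequence, direction="forward"):
    """Return the per-interval inputs in the order they are fed to the data nodes."""
    if direction == "forward":
        return list(sequence)
    if direction == "backward":
        return list(reversed(sequence))
    raise ValueError("direction must be 'forward' or 'backward'")
```

With a 12-interval observation window, the backward order feeds the 12th interval to the first data node and the 1st interval to the last.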

For example, taking forward propagation calculation, the user behavior vector sequence X₁ of the target account for the first time interval in the observation window (that is, the first month) can be used as the data input of the first data node in the propagation direction of the LSTM encoder. f(1), i(1), and m(1) are solved according to the LSTM encoder formulas shown above, and the hidden state vector h(1) corresponding to the first time interval is then solved from f(1), i(1), and m(1). Next, the user behavior vector sequence X₂ of the second time interval is used as the data input of the second data node in the propagation direction and calculated in the same way, and so on, until the hidden state vectors h(2) to h(12) corresponding to the 2nd to 12th time intervals have been calculated in turn.

As another example, taking backward propagation calculation, the user behavior vector sequence X₁₂ of the target account for the 12th (that is, the last) time interval in the observation window can be used as the data input of the first data node in the propagation direction of the LSTM encoder. Using the same calculation, f(1), i(1), and m(1) are solved, and the hidden state vector h(1) corresponding to the first time interval is then solved from them. Next, the user behavior vector sequence X₁₁ of the 11th time interval is used as the data input of the second data node in the propagation direction and calculated in the same way, and so on, until the hidden state vectors h(2) to h(12) corresponding to the 2nd to 12th time intervals have been calculated in turn.

In one embodiment shown, to improve the calculation accuracy of the LSTM encoder, the LSTM encoder can use bidirectional propagation calculation. After the backward propagation calculation and the forward propagation calculation have both been completed, each data node in the LSTM encoder has a first hidden state vector obtained by forward propagation calculation and a second hidden state vector obtained by backward propagation calculation.

In this case, the first hidden state vector and the second hidden state vector of each data node in the LSTM encoder can be concatenated to form the final hidden state vector of that data node. For example, for the t-th data node of the LSTM encoder, if the first hidden state vector is denoted ht_before, the second hidden state vector is denoted ht_after, and the final hidden state vector is denoted ht_final, then ht_final can be expressed as ht_final = [ht_before, ht_after].
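The concatenation ht_final = [ht_before, ht_after] can be sketched as below. The sketch assumes both lists are already indexed by the same data node; how the two passes are aligned is left to the implementer and is not spelled out here.

```python
def concat_bidirectional(forward_states, backward_states):
    """Concatenate the first and second hidden state vectors of each data node."""
    if len(forward_states) != len(backward_states):
        raise ValueError("both passes must cover the same number of data nodes")
    return [h_before + h_after
            for h_before, h_after in zip(forward_states, backward_states)]
```

The final hidden state vector of each data node therefore has twice the dimension of a single-direction hidden state vector.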

In this specification, after the user behavior vector sequences constructed for the target account for the time intervals in the observation window have been input as prediction samples to the data nodes of the LSTM encoder of the LSTM model and the calculation is complete, the hidden state vectors calculated by the data nodes of the LSTM encoder can be used as the risk features extracted from the user operation behavior data of the target account. They are then input to the LSTM decoder of the LSTM model and calculated according to the LSTM decoder formulas shown in the above embodiments, in order to predict the credit risk of the target account in each time interval of the performance window.

For example, the attention weight atj of the hidden state vector corresponding to each data node in the LSTM encoder can first be calculated based on the attention mechanism of the LSTM decoder, and the weighted sum C_j of the hidden state vectors multiplied by their corresponding attention weights atj can then be calculated. Next, based on the LSTM decoder formulas shown above, the output vector corresponding to the first data node of the LSTM decoder can be calculated to predict the credit risk of the target account in the first time interval of the performance window. In the same way, the output vector corresponding to each subsequent data node of the LSTM decoder can be calculated in turn to predict the credit risk of the target account in each subsequent time interval of the performance window.

In this specification, once the LSTM decoder calculation is complete, the attention weights atj of the hidden state vectors corresponding to the data nodes of the LSTM encoder, and the output vectors corresponding to the data nodes of the LSTM decoder, are obtained.

In one embodiment shown, the LSTM model can further digitize the output vector corresponding to each data node of the LSTM decoder, converting it into a risk score for that data node, which serves as the credit risk prediction result of the target account for the corresponding time interval of the performance window.

The specific way in which the output vector is digitized and converted into a risk score is not particularly limited in this specification.

For example, in one implementation, the final output vector is a multi-dimensional vector that usually contains a sub-vector (component) whose value lies between 0 and 1. In this case, the value of that component can be extracted directly as the risk score corresponding to the output vector.

In another implementation shown, if the output vector contains multiple components whose values lie between 0 and 1, the maximum or the minimum of those values can be extracted as the risk score corresponding to the output vector; alternatively, the average of those values can be calculated and used as the risk score.
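The max, min, and average options described above can be sketched as follows. The function name, the `mode` argument, and the decision to raise an error when no component lies in [0, 1] are illustrative assumptions.

```python
def risk_score(output_vector, mode="max"):
    """Reduce the components of an output vector that lie in [0, 1] to one score."""
    candidates = [v for v in output_vector if 0.0 <= v <= 1.0]
    if not candidates:
        raise ValueError("no component of the output vector lies in [0, 1]")
    if mode == "max":
        return max(candidates)
    if mode == "min":
        return min(candidates)
    if mode == "mean":
        return sum(candidates) / len(candidates)
    raise ValueError("mode must be 'max', 'min', or 'mean'")
```

When only one component lies in [0, 1], all three modes return that component, which matches the single-component case.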

After the above calculations are complete, the LSTM decoder can output, as the final prediction result, the risk score corresponding to each data node of the LSTM decoder together with the weight values, relative to those risk scores, of the hidden state vectors obtained by the data nodes of the LSTM encoder.

In one embodiment shown, the LSTM decoder can also aggregate the risk scores corresponding to its data nodes and convert them into a single prediction of whether the target account has credit risk in the performance window.

In one implementation, the LSTM decoder can sum the risk scores corresponding to its data nodes and compare the sum with a preset risk threshold. If the sum is greater than or equal to the risk threshold, it outputs 1, indicating that the target account has credit risk in the performance window; otherwise, if the sum is less than the risk threshold, it outputs 0, indicating that the target account has no credit risk in the performance window.
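The sum-and-threshold rule above amounts to a one-line comparison. The function name is hypothetical, and the threshold value is left to the caller because no value is given here.

```python
def has_credit_risk(scores, risk_threshold):
    """Return 1 if the summed risk scores reach the threshold, otherwise 0."""
    return 1 if sum(scores) >= risk_threshold else 0
```

For instance, scores of 0.2, 0.5, and 0.4 sum to about 1.1, so a threshold of 1.0 would yield 1.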

As the above embodiments show, on the one hand, the user behavior vector sequences of the target account in each time interval are input directly into the LSTM encoder of the LSTM model based on the encoder-decoder architecture to obtain the hidden state vectors corresponding to each time interval, and these hidden state vectors are then input as risk features into the LSTM decoder to complete the risk prediction for the target account and obtain a risk score. Modelers therefore do not need to develop and explore the feature variables required for modeling from the user operation behavior data of the target account. This avoids the situation in which feature variables designed from the modelers' experience are not accurate enough, making it hard to mine the information contained in the data in depth and reducing the accuracy of the model's risk prediction. Moreover, manually designed feature variables no longer need to be stored and maintained, which reduces the storage overhead of the system.

Corresponding to the foregoing method embodiments, this specification also provides an embodiment of a credit risk prediction device based on an LSTM model. The device embodiment can be applied to an electronic device and can be implemented in software, in hardware, or in a combination of the two. Taking a software implementation as an example, the device, in the logical sense, is formed when the processor of the electronic device on which it resides reads the corresponding computer program instructions from non-volatile memory into main memory and runs them. At the hardware level, Fig. 6 shows a hardware structure diagram of the electronic device on which the credit risk prediction device based on the LSTM model resides. In addition to the processor, memory, network interface, and non-volatile memory shown in Fig. 6, the electronic device may include other hardware depending on its actual function, which is not described further here.

Fig. 7 is a block diagram of a credit risk prediction device based on an LSTM model according to an exemplary embodiment of this specification.

Referring to Fig. 7, the credit risk prediction device 70 based on the LSTM model can be applied to the electronic device shown in Fig. 6 and includes an acquisition module 701, a generation module 702, a first calculation module 703, and a second calculation module 704.

The acquisition module 701 acquires user operation behavior data of the target account within a preset time period, where the preset time period is a time series composed of several time intervals with the same time step;

The generation module 702 generates, based on the user operation behavior data of the target account in each time interval, a user behavior vector sequence corresponding to each time interval;

The first calculation module 703 inputs the generated user behavior vector sequences corresponding to the respective time intervals into the LSTM encoder of the trained LSTM model based on the encoding-decoding architecture for calculation, and obtains the hidden state vectors corresponding to the respective time intervals, where the LSTM model includes an LSTM encoder and an LSTM decoder that introduces an attention mechanism;

The second calculation module 704 inputs the hidden state vectors corresponding to the respective time intervals, as risk features, into the LSTM decoder for calculation, and obtains the risk score of the target account in the next time interval together with the weight value of each hidden state vector with respect to the risk score, where the weight value represents the contribution of the hidden state vector to the risk score.
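To make the role of the weight values concrete, the following is a minimal, hedged Python sketch of how an attention mechanism can assign one normalized weight to each encoder hidden state. The dot-product alignment score and the softmax normalization are assumptions made for illustration; this passage does not specify the scoring function, and the name `attention_weights` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def attention_weights(
    decoder_state: Sequence[float],
    encoder_states: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Return (weights, context) for one decoder step.

    Assumption: alignment scores are plain dot products between the decoder
    state and each encoder hidden state, normalized with a softmax.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, h_vec))
        for h_vec in encoder_states
    ]
    # Subtract the maximum score for numerical stability before exponentiating.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    # The context vector is the weighted sum of the encoder hidden states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h_vec[i] for w, h_vec in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    # Three hypothetical hidden states, one per time interval.
    hidden = [[0.1, 0.9], [2.0, 0.0], [0.5, 0.5]]
    w, c = attention_weights([1.0, 0.0], hidden)
    print(w, c)
```

Because the weights are non-negative and sum to 1, each can be read as the relative contribution of the corresponding time interval's hidden state to the risk score.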

In this embodiment, the acquisition module 701 further:

The generation module 702 further:

The device 70 further includes:

A training module 705 (not shown in Fig. 7), which uses the generated user behavior vector sequences as training samples to train the LSTM model based on the encoding-decoding architecture.

In this embodiment, the generation module 702 further:

In this embodiment, the multiple kinds of user behavior include credit performance behavior, user consumption behavior, and financial management payment behavior;

In this embodiment, the LSTM encoder adopts a multi-layer many-to-one structure, and the LSTM decoder adopts a multi-layer many-to-many structure with the same number of input nodes and output nodes.

In this embodiment, the first calculation module 703:

In this embodiment, the second calculation module 704:

In this embodiment, the output vector is a multidimensional vector;

For the implementation of the functions and roles of each module in the above device, refer to the implementation of the corresponding steps in the above method; details are not repeated here.

Because the device embodiment basically corresponds to the method embodiment, the relevant parts of the method embodiment description apply. The device embodiments described above are only illustrative: modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in this specification. Those of ordinary skill in the art can understand and implement it without creative effort.

The systems, devices, and modules described in the above embodiments can be implemented by computer chips or entities, or by products having certain functions. A typical implementing device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, e-mail device, game console, tablet computer, wearable device, or a combination of any of these devices.

Corresponding to the foregoing method embodiments, this specification also provides an embodiment of an electronic device. The electronic device includes a processor and a memory for storing machine-executable instructions, where the processor and the memory are usually connected to each other through an internal bus. In other possible implementations, the device may further include an external interface so that it can communicate with other devices or components.

In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic of credit risk prediction based on the LSTM model, the processor is caused to:

In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic of credit risk prediction based on the LSTM model, the processor is further caused to:

Acquire user operation behavior data, within the preset time period, of several sample accounts marked with risk labels; generate, based on the user operation behavior data of the sample accounts in each time interval, user behavior vector sequences corresponding to the respective time intervals; and use the generated user behavior vector sequences as training samples to train the LSTM model based on the encoding-decoding architecture.
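The sample-preparation steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the account record layout (`intervals`, `risk_label`), the behavior type keys, and the function names `splice_interval` and `build_training_samples` are hypothetical, and the per-type vectors are assumed to have already been digitized. The model training itself is not shown.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: one dict per time interval, mapping a behavior type
# to its already-digitized vector for that interval.
IntervalVectors = Dict[str, List[float]]


def splice_interval(
    vectors_by_type: IntervalVectors, type_order: Sequence[str]
) -> List[float]:
    """Concatenate the per-type behavior vectors of one interval in a fixed order."""
    spliced: List[float] = []
    for behavior_type in type_order:
        spliced.extend(vectors_by_type[behavior_type])
    return spliced


def build_training_samples(
    accounts: Sequence[dict], type_order: Sequence[str]
) -> List[Tuple[List[List[float]], int]]:
    """Turn labeled sample accounts into (vector sequence, risk label) pairs."""
    samples = []
    for account in accounts:
        # One spliced vector per time interval, kept in chronological order.
        sequence = [splice_interval(iv, type_order) for iv in account["intervals"]]
        samples.append((sequence, account["risk_label"]))
    return samples


if __name__ == "__main__":
    order = ["credit", "consumption", "wealth"]
    accounts = [
        {
            "risk_label": 1,
            "intervals": [
                {"credit": [1.0, 200.0], "consumption": [3.0, 5.0], "wealth": [0.0, 0.0]},
                {"credit": [0.0, 0.0], "consumption": [2.0, 1.0], "wealth": [1.0, 12.5]},
            ],
        }
    ]
    print(build_training_samples(accounts, order))
```

Using a fixed `type_order` keeps every position of the spliced vector tied to the same behavior type across intervals and accounts, which a sequence model needs in order to learn consistent features.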

In this embodiment, the output vector is a multidimensional vector; by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic of credit risk prediction based on the LSTM model, the processor is further caused to perform any one of the following:

Other embodiments of this specification will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This specification is intended to cover any variations, uses, or adaptations of this specification that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed in this specification. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of this specification indicated by the following claims.

It should be understood that this specification is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of this specification is limited only by the appended claims.

The above are only preferred embodiments of this specification and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of this specification shall fall within its scope of protection.

Claims

1. A credit risk prediction method based on an LSTM model, the method comprising:

obtaining user operation behavior data of a target account within a preset time period, wherein the preset time period is a time series composed of several time intervals with the same time step;

generating, based on the user operation behavior data of the target account in each time interval, a user behavior vector sequence corresponding to each time interval;

inputting the generated user behavior vector sequences corresponding to the respective time intervals into an LSTM encoder of a trained LSTM model based on an encoding-decoding architecture for calculation, to obtain hidden state vectors corresponding to the respective time intervals, wherein the LSTM model includes the LSTM encoder and an LSTM decoder that introduces an attention mechanism;

inputting the hidden state vectors corresponding to the respective time intervals, as risk features, into the LSTM decoder for calculation, to obtain a risk score of the target account in the next time interval and a weight value of each hidden state vector with respect to the risk score, wherein the weight value represents the contribution of the hidden state vector to the risk score.

2. The method of claim 1, further comprising:

obtaining user operation behavior data, within the preset time period, of several sample accounts marked with risk labels;

generating, based on the user operation behavior data of the several sample accounts in each time interval, a user behavior vector sequence corresponding to each time interval;

using the generated user behavior vector sequences as training samples to train the LSTM model based on the encoding-decoding architecture.

3. The method according to claim 2, wherein generating, based on the user operation behavior data of the account in each time interval, a user behavior vector sequence corresponding to each time interval comprises:

obtaining multiple kinds of user operation behavior data of the account in each time interval;

extracting key factors from the obtained user operation behavior data, and digitizing the key factors to obtain user behavior vectors corresponding to the user operation behavior data;

splicing the user behavior vectors corresponding to the multiple kinds of user operation behavior data in each time interval to generate the user behavior vector sequence corresponding to each time interval.

4. The method according to claim 3, wherein the multiple kinds of user behavior include credit performance behavior, user consumption behavior, and financial management payment behavior;

the key factors include the loan order status and loan repayment amount corresponding to the credit performance behavior, the user consumption category and number of user consumption transactions corresponding to the user consumption behavior, and the financial management payment type and financial management income amount corresponding to the financial management payment behavior.

5. The method according to claim 1, wherein the LSTM encoder adopts a multi-layer many-to-one structure, and the LSTM decoder adopts a multi-layer many-to-many structure with the same number of input nodes and output nodes.

6. The method according to claim 1, wherein inputting the generated user behavior vector sequences corresponding to the respective time intervals into the LSTM encoder of the trained LSTM model based on the encoding-decoding architecture for calculation, to obtain the hidden state vectors corresponding to the respective time intervals, comprises:

inputting the generated user behavior vector sequences corresponding to the respective time intervals into the LSTM encoder of the trained LSTM model based on the encoding-decoding architecture to perform a bidirectional propagation calculation, and obtaining a first hidden state vector from the forward propagation calculation and a second hidden state vector from the backward propagation calculation, wherein the input order of the user behavior vector sequences corresponding to the respective time intervals in the backward propagation calculation is the reverse of that in the forward propagation calculation;

splicing the first hidden state vector and the second hidden state vector to obtain the final hidden state vectors corresponding to the respective time intervals.

7. The method according to claim 1, wherein inputting the hidden state vectors corresponding to the respective time intervals, as risk features, into the LSTM decoder for calculation, to obtain the risk score of the target account in the next time interval, comprises:

inputting the hidden state vectors corresponding to the respective time intervals, as risk features, into the LSTM decoder for calculation, to obtain an output vector of the target account in the next time interval;

digitizing the output vector to obtain the risk score of the target account in the next time interval.

8. The method according to claim 1, wherein the output vector is a multidimensional vector;

digitizing the output vector includes any one of the following:

extracting, as the risk score, the value of a sub-vector in the output vector whose value is between 0 and 1;

if the output vector contains multiple sub-vectors whose values are between 0 and 1, calculating the average of the values of the multiple sub-vectors as the risk score;

if the output vector contains multiple sub-vectors whose values are between 0 and 1, extracting the maximum or minimum of the values of the multiple sub-vectors as the risk score.

9. A credit risk prediction device based on an LSTM model, the device comprising:

an acquisition module that acquires user operation behavior data of a target account within a preset time period, wherein the preset time period is a time series composed of several time intervals with the same time step;

a generation module that generates, based on the user operation behavior data of the target account in each time interval, a user behavior vector sequence corresponding to each time interval;

a first calculation module that inputs the generated user behavior vector sequences corresponding to the respective time intervals into an LSTM encoder of a trained LSTM model based on an encoding-decoding architecture for calculation, to obtain hidden state vectors corresponding to the respective time intervals, wherein the LSTM model includes the LSTM encoder and an LSTM decoder that introduces an attention mechanism;

a second calculation module that inputs the hidden state vectors corresponding to the respective time intervals, as risk features, into the LSTM decoder for calculation, to obtain a risk score of the target account in the next time interval and a weight value of each hidden state vector with respect to the risk score, wherein the weight value represents the contribution of the hidden state vector to the risk score.

10. The device according to claim 9, wherein the acquisition module further:

the generation module further:

the device further includes:

a training module that uses the generated user behavior vector sequences as training samples to train the LSTM model based on the encoding-decoding architecture.

11. The device according to claim 10, wherein the generation module further:

12. The device according to claim 11, wherein the multiple kinds of user behavior include credit performance behavior, user consumption behavior, and financial management payment behavior;

13. The device according to claim 9, wherein the LSTM encoder adopts a multi-layer many-to-one structure, and the LSTM decoder adopts a multi-layer many-to-many structure with the same number of input nodes and output nodes.

14. The device according to claim 9, wherein the first calculation module:

15. The device according to claim 9, wherein the second calculation module:

16. The device according to claim 9, wherein the output vector is a multidimensional vector;

digitizing the output vector includes any one of the following:

17. An electronic device, comprising:

a processor;

a memory for storing machine-executable instructions;

wherein, by reading and executing the machine-executable instructions stored in the memory that correspond to the control logic of credit risk prediction based on the LSTM model, the processor is caused to: