
CN110232480B - Item recommendation method and model training method using variational regularization flow - Google Patents


Info

Publication number
CN110232480B
CN110232480B CN201910515356.1A CN201910515356A
Authority
CN
China
Prior art keywords
item
variable
sequence
session
loss
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910515356.1A
Other languages
Chinese (zh)
Other versions
CN110232480A (en)
Inventor
钟婷
温子敬
周帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Publication of CN110232480A publication Critical patent/CN110232480A/en
Application granted granted Critical
Publication of CN110232480B publication Critical patent/CN110232480B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067Enterprise or organisation modelling

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Educational Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides an item recommendation model training method and an item recommendation method implemented with a variational regularized flow. A recurrent neural network with an attention mechanism is adopted and variational inference with a regularized flow is added; by learning the latent variable of a session sequence, the next click item is recommended for the user. Using only the sequence data of the items the user has clicked, the next click item of the whole session sequence can be inferred approximately, stably, and effectively; moreover, the prediction model adds an attention mechanism to strengthen the weights of the important item points in the session, which greatly improves prediction accuracy. In addition, the invention adds a variational regularized flow to the model to learn the true distribution of the latent variable, which reduces the error of traditional variational models (such as the VAE) in session-based recommendation problems.

Description

Item recommendation method and model training method implemented with a variational regularized flow
Technical Field
The invention belongs to the field of neural networks in machine learning and relates to a deep-learning-based method that mainly uses a regularized flow to mine the latent information in session-based item sequences, learns a latent variable for each item sequence in a latent space, uses the latent variable to predict the item the user will click next, and recommends the predicted item to the user.
Background
First, the session-based concept is introduced: a session is a mechanism for recording and identifying a user on the server side. A typical scenario is a shopping cart: the server creates a specific session for a specific object to identify that object and track the user's browsing and click behavior, which generates a set of sequence data of the user's clicks.
The conventional method uses the similarity between items of the session sequence to predict the item clicked next. This method considers neither the order among items nor the information of the whole sequence and cannot make full use of the session data, so its prediction accuracy is not high. Another method uses a Markov decision process to describe the sequence information and predicts the next click item by computing transition probabilities; its disadvantage is that the number of states is huge and grows exponentially with the dimension, making the computation infeasible. Moreover, because session data lack user profile information and session sequences can be long (containing too many clicked items), such data easily lead to prediction errors.
Providing a deep-learning-based item prediction method that improves prediction accuracy by capturing the information in long session sequences, and thereby overcoming the low prediction accuracy, low prediction efficiency, and state-explosion problems of conventional item prediction methods, is the research task of the invention.
Disclosure of Invention
The invention aims to provide an item recommendation model training method implemented with a variational regularized flow, addressing the low prediction accuracy, low prediction efficiency, and similar technical shortcomings of conventional item prediction methods; the trained item recommendation model can accurately and efficiently predict the item a user is likely to click.
Another object of the present invention is to provide a method for recommending items to a user with the above item recommendation model.
The idea of the invention is to use a recurrent neural network (GRU or LSTM) and an attention mechanism to learn the latent information among the items of the session sequence, so that the order information of the items in the sequence is captured better, which benefits the learning of long sequences. In addition, the invention adds a variational inference step and learns the true distribution of the latent variable with a variational regularized flow, thereby largely avoiding the error of directly using a variational auto-encoder (VAE).
Based on the above inventive concept, the item recommendation model training method implemented with a variational regularized flow provided by the present invention is characterized by comprising the following steps:
S1, construct the training set: build the training set from the session-based sequences of the same user;
S2, preprocess the data: divide each session-based sequence into input data and a label item, and embed each item of the input data as an embedding vector;
S3, obtain the latent variable: input the input data, represented as embedding vectors, into a recurrent neural network, introduce a regularized-flow learning algorithm and an attention mechanism, construct a first latent variable and a second latent variable respectively, and concatenate the two latent variables to obtain the final latent variable;
S4, obtain the first loss: input the final latent variable into a classifier, output the predicted item, compute the cross-entropy loss between the predicted item and the label item of the session sequence, and take it as the first loss;
S5, obtain the second loss: input the first latent variable constructed through the regularized flow into a decoder to obtain the second loss;
S6, add the first loss and the second loss to produce the final total loss;
S7, repeat steps S2-S6 and minimize the total loss to obtain the item recommendation model.
In the above item recommendation model training method implemented with a variational regularized flow, step S1 processes the raw data to construct the training set: session sequences of length smaller than 5 are first deleted from the raw data, and the remaining session sequences form the training set.
In this training method, to use the session sequences more fully, each session sequence of the training set is divided into several subsequences, and these subsequences are added to the original training set.
In the above training method, step S2 divides each session sequence of the training set into input data and a label item, trains on the training set in word2vec fashion to obtain an initialized embedding-vector (item embedding) matrix of all items, and converts each session sequence into a vector representation. It specifically comprises the following sub-steps:
S21, take the last item of each session sequence as the label item of that sequence and the remaining items as input data;
S22, train on the training set with the word2vec method to obtain the initialized embedding-vector matrix $W_{emb}$ of all items and use it as a parameter matrix of the model; its size is N×M, where N is the total number of items and M is the dimension of the embedding vectors;
S23, represent each item of every input session sequence by its vector in the embedding matrix $W_{emb}$, forming the embedding-vector matrix X of the session sequence.
In the above training method, step S3 obtains a latent representation of the session-based sequence data. First, the input data represented as embedding vectors are fed into the recurrent neural network to obtain the hidden states, a first latent variable is constructed, and the first latent variable is optimized through the regularized-flow algorithm to obtain the optimized first latent variable; meanwhile, a second latent variable is obtained from the hidden states with the attention mechanism, and the optimized first latent variable and the second latent variable are concatenated to obtain the final latent variable. The sub-steps are as follows:
S31, encode the input data, represented as embedding vectors, with the recurrent neural network to obtain the hidden state $h_t$, and construct the first latent variable $z_0$:

$$z_0 = \mu + \sigma \odot \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I)$$

where $\mu = W_\mu h_t + b_\mu$ is the mean of the session distribution in the latent space and $\sigma = \exp(W_\sigma h_t + b_\sigma)$ is the variance of the session distribution in the latent space. The hidden state is

$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

where $h_t$ denotes the hidden state corresponding to the t-th item of each layer of the recurrent neural network, with update gate $z_t = \sigma(W_z x_t + U_z h_{t-1})$, candidate activation $\tilde{h}_t = \tanh(W_t x_t + U_t (r_t \odot h_{t-1}))$, and reset gate $r_t = \sigma(W_r x_t + U_r h_{t-1})$. In the gates, $\sigma$ denotes the sigmoid activation function and $\tanh$ is also an activation function; $x_t$ denotes the t-th item of each layer's input; $W_z$, $W_t$, $W_\mu$, $W_\sigma$, $U_z$, $U_t$, $U_r$, $b_\mu$, $b_\sigma$ all denote training parameters of the model; and $\varepsilon$ denotes a standard-normal variable used for random sampling;
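A minimal PyTorch sketch of step S31 follows for illustration; the layer names, dimensions, and the use of a log-variance parameterization are assumptions of this sketch, not a definitive implementation.

```python
# Sketch of step S31 (assumed names): GRU encoding plus reparameterized
# sampling of the first latent variable z0 = mu + sigma * eps.
import torch
import torch.nn as nn

class SessionEncoder(nn.Module):
    def __init__(self, emb_dim=50, hidden_dim=100, latent_dim=100):
        super().__init__()
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)      # W_mu, b_mu
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)  # W_sigma, b_sigma

    def forward(self, X):              # X: (batch, t, emb_dim)
        H, h_t = self.gru(X)           # H: all hidden states h_1..h_t
        h_t = h_t.squeeze(0)           # last hidden state, (batch, hidden_dim)
        mu = self.to_mu(h_t)
        logvar = self.to_logvar(h_t)
        eps = torch.randn_like(mu)     # epsilon ~ N(0, I)
        z0 = mu + torch.exp(0.5 * logvar) * eps
        return H, z0, mu, logvar
```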
S32, define a set of invertible functions $f = [f_1, \ldots, f_K]$, feed the first latent variable $z_0$ obtained in step S31 into this set of invertible functions for transformation, and learn the optimized first latent variable $z_K = f_K(f_{K-1}(\cdots f_1(z_0)))$. The invertible function f adopted in this step is the planar invertible function, defined as

$$f(z) = z + u \tanh(w^{\top} z + b)$$

where $u, w \in \mathbb{R}^d$ (d denotes their dimension), $b \in \mathbb{R}$, $\tanh$ is the activation function, and $u$, $w$, $b$ all denote parameters of the invertible function to be learned.
S33, based on the variational attention mechanism, compute according to the following formula the probability distribution $\alpha_{it}$ that the different item points of the session sequence assign to the current corresponding output value of the recurrent encoder:

$$\alpha_{it} = \frac{\exp\!\big(h_i W^{\top} h_t^{\top}\big)}{\sum_{j=1}^{t} \exp\!\big(h_j W^{\top} h_t^{\top}\big)}$$

where $h_i$ denotes the hidden state of the i-th item of each layer of the recurrent neural network, $i = 1, 2, \ldots, t$; $h_t$ denotes the hidden state of the t-th item, obtained in step S31; and $W^{\top}$ is the transpose of the parameter matrix W to be learned for the session sequence;
then compute the attention vector c as the weighted sum of the inputs and take it as the second latent variable:

$$c = \sum_{i=1}^{t} \alpha_{it} h_i$$
and S34, concatenate the optimized first latent variable obtained in step S32 and the second latent variable obtained in step S33 to obtain the final latent variable m.
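Steps S33 and S34 might be sketched as follows; the bilinear scoring form mirrors the α formula above, and all names and shapes are assumptions of this sketch.

```python
# Sketch of steps S33-S34 (assumed names): attention over the GRU hidden
# states followed by concatenation of z_K with the attention vector c.
import torch
import torch.nn as nn

class VariationalAttention(nn.Module):
    def __init__(self, hidden_dim=100):
        super().__init__()
        self.W = nn.Parameter(torch.randn(hidden_dim, hidden_dim) * 0.01)

    def forward(self, H):                  # H: (batch, t, hidden_dim)
        h_t = H[:, -1, :]                  # hidden state of the last item
        scores = torch.einsum('bid,de,be->bi', H, self.W, h_t)
        alpha = torch.softmax(scores, dim=1)        # alpha_{it} over i
        c = (alpha.unsqueeze(-1) * H).sum(dim=1)    # weighted sum of h_i
        return c

# step S34: m = torch.cat([z_K, c], dim=-1), the final latent variable
```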
In the above training method, steps S4 and S5 construct two loss functions: one is the cross-entropy loss between the predicted item and the label item; the other is a reconstruction loss that drives the learned latent-variable distribution closer to the true distribution. The invention uses the cross-entropy loss function and the reconstruction loss function together as the loss function, so the model better preserves the structure of the input session sequence, further improving the accuracy of the prediction result.
In step S4, the final latent variable m is input into a classifier, which outputs the user's next predicted item $\hat{y}$. With the label item y of the session sequence divided in step S2 and the predicted item $\hat{y}$, the cross-entropy loss $L_1$ is computed as

$$L_1 = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$

and taken as the first loss, N being the total number of items;
in step S5, the optimized first latent variable $z_K$ from the regularized flow is input to the decoder to obtain the second loss $L_2$:

$$L_2 = \underbrace{-\,\mathbb{E}_{q_0(z_0)}\!\big[\log p_\theta(x \mid z_K)\big]}_{\text{reconstruction loss}} \;+\; \underbrace{\mathbb{E}_{q_0(z_0)}\!\big[\log q_0(z_0) - \log p(z_K)\big]}_{\text{constant term}} \;-\; \beta\,\underbrace{\mathbb{E}_{q_0(z_0)}\!\Big[\sum_{k=1}^{K} \log\Big|\det \tfrac{\partial f_k}{\partial z_{k-1}}\Big|\Big]}_{\text{regularized flow}}$$

The posterior distribution $q_\phi(z \mid x)$, defined as $q(z_K)$, is used to approximate the true distribution $p_\theta(x, z)$ of the latent variable z; $p_\theta(x, z)$ denotes the true distribution of the input original session sequence; z and x denote the latent variable of the regularized-flow model and the input session-sequence data respectively; θ and φ denote the parameters of the probability distributions. In the formula above, the first term denotes the reconstruction loss, the second is a constant term, the third denotes the regularized flow, and β is a coefficient.
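For illustration, the two losses might be combined as in the sketch below; the handling of the constant term (folded into a Gaussian KL against a standard-normal prior) and the decoder negative log-likelihood are assumptions of this sketch.

```python
# Sketch of steps S4-S6 (assumed names): L1 is the cross-entropy over the
# N items; L2 follows the flow-ELBO form: reconstruction + KL-style term
# - beta * sum of log-determinants from the regularized flow.
import torch
import torch.nn.functional as F

def total_loss(logits, target, decoder_nll, mu, logvar, log_det_sum, beta=0.2):
    L1 = F.cross_entropy(logits, target)   # first loss, predicted vs. label
    # KL of the initial Gaussian q0(z0) against a standard-normal prior
    kl0 = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
    # second loss: decoder reconstruction minus the flow correction
    L2 = decoder_nll + kl0.mean() - beta * log_det_sum.mean()
    return L1 + L2                         # total loss of step S6
```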
The purpose of steps S6 and S7 is to add $L_1$ and $L_2$ to produce the final total loss L and to minimize this total loss function, completing the training of the model and yielding the item recommendation model.
In the above training method, the recurrent neural network adopted is a GRU (Gated Recurrent Unit), an LSTM (Long Short-Term Memory network), or the like; the classifier is composed of a fully connected network; the decoder is a GRU or an LSTM.
The invention further provides an item recommendation method implemented with the variational regularized flow, comprising the following steps:
S1', preprocess the data: embed each item of the session-based sequence as an embedding vector;
S2', obtain the user's predicted item: input the session sequence, represented as embedding vectors, into the model trained with the above item recommendation model training method to obtain the user's predicted item.
In this item recommendation method, step S2' assigns a score to each item with a scoring-matrix method and recommends items according to the scores, which further improves the accuracy and efficiency of item recommendation while obtaining the recommended items. It comprises the following sub-steps:
S21', input the session sequence, represented as embedding vectors, into the recurrent neural network, introduce the regularized-flow learning algorithm and the attention mechanism, construct the first latent variable and the second latent variable respectively, and concatenate them to obtain the final latent variable m;
S22', generate a scoring matrix $S = W_{emb}^{\top} B m$ from the obtained latent variable m; this scoring matrix S contains a score for each item, where $W_{emb}^{\top}$ denotes the transpose of $W_{emb}$, B denotes a model parameter to be trained, and m is the final latent variable; the top-n highest-scoring items in S are selected as the final prediction result.
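Step S22' might look as follows; the formula is read here under a convention in which one score per item results, and the shapes and names are illustrative assumptions.

```python
# Sketch of step S22' (assumed names): score every item from the final
# latent variable m and return the n highest-scoring items.
import torch

def recommend_top_n(m, W_emb, B, n=20):
    # m: (latent_dim,); W_emb: (N, emb_dim); B: (emb_dim, latent_dim)
    scores = W_emb @ (B @ m)               # one score per item, shape (N,)
    top_scores, top_items = torch.topk(scores, n)
    return top_items                       # IDs of the n recommended items
```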
The invention provides a variational session-based recommendation method (VASER) implemented with a variational regularized flow. It applies the variational regularized flow to a recurrent neural network (GRU or LSTM) with an attention mechanism, can capture the information in long session sequences, and can learn the true distribution of the latent variable through the regularized flow, thereby greatly improving prediction accuracy.
Compared with the prior art, the item recommendation method implemented with a variational regularized flow has the following beneficial effects:
1. The invention uses a recurrent neural network (GRU or LSTM) with an attention mechanism and adds variational inference with a regularized flow, recommending the next click item for the user by learning the latent variable of the session sequence.
2. The variational regularized flow added to the model to learn the true distribution of the latent variable reduces the error of traditional variational models (such as the VAE) in session-based recommendation: the traditional VAE must assume a prior distribution, which introduces a large error, whereas the regularized flow assumes no prior distribution, so the learned distribution is closer to the true distribution of the latent variable.
Drawings
Fig. 1 is an overall structure diagram of the item recommendation model implemented with a variational regularized flow.
Fig. 2 is a visualization of the latent variables of session sequences.
Fig. 3 is a diagram illustrating how the prediction results of different item recommendation methods vary with the length of the session sequence.
Fig. 4 is a diagram illustrating how the prediction result of the item recommendation model (VASER) varies with the coefficient β of the loss function.
Fig. 5 is a diagram illustrating how the prediction results of the item recommendation model (VASER) vary with the parameter K of the regularized flow on different data sets: (a) corresponds to the Yoochoose 1/64 data set and (b) to the Yoochoose 1/4 data set. K is the number of invertible functions (flows); the values 1, 2, 3, 4, 5 on the abscissa actually represent $2^1, 2^2, 2^3, 2^4, 2^5$.
Interpretation of terms
Word2vec is a vectorization technique that embeds discrete data into a continuous vector space. Word2vec predicts the context from a central word and optimizes the model with the difference between the predicted context and the real context, so that a suitable multidimensional vector representation of the data can be learned. These vectors contain the context information of the data and thus also represent the relationships among the original data.
Variational inference: simply put, a required distribution p must be inferred from existing data; when p is not easy to express and cannot be solved directly, variational inference can be tried. That is, a distribution q that is easy to express and solve is sought; when the difference between q and p is small, q can replace p as its approximate distribution. In the variational auto-encoder (VAE) an evidence lower bound (ELBO) is derived, and the approximate distribution q is learned by maximizing this lower bound.
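Written out explicitly, the bound just described takes the standard form below; this restatement is added only for clarity and matches the losses used in steps S4-S5.

$$\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big) \;=\; \mathrm{ELBO}$$

Maximizing the right-hand side with respect to the parameters of q drives the approximate distribution q toward p.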
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Embodiment: item recommendation model training implemented with a variational regularized flow
This embodiment takes the real data set YOOCHOOSE (available at https://2015.recsyschallenge.com/challenge.html) as the research object and explains the item recommendation model training method provided by the invention in detail.
As shown in fig. 1, the item recommendation model (VASER) implemented with a variational regularized flow provided by this embodiment mainly consists of an encoder, an attention layer, a regularized-flow layer, a classifier, and a decoder; its training method comprises the following steps:
S1, construct the training set: a training set is constructed using the session-based sequences of the same user.
Since the total amount of data in the data set YOOCHOOSE is large, this embodiment randomly takes the 1/4 and 1/64 parts of this data set as the two experimental data sets and trains two item recommendation models according to steps S2-S7, respectively.
TABLE 1 statistical information of data sets
As shown in Table 1, a training set and a test set were divided on each of the two experimental data sets, and session sequences of length smaller than 5 were deleted. Here, clicks refers to the total number of all clicked items (including duplicates), train sessions denotes the session sequences of the training set, and test sessions denotes those of the test set; the table also contains the total number of items and the average session-sequence length of each experimental data set.
To make full use of the session sequences, this embodiment divides each session sequence $x = \{x_1, x_2, \ldots, x_t\}$ into t-1 subsequences $x^1 = \{x_1, x_2, \ldots, x_{t-1}\}$, $x^2 = \{x_1, x_2, \ldots, x_{t-2}\}$, \ldots, $x^{t-1} = \{x_1, x_2\}$, and supplements the original training set and test set with these subsequences, as sketched below.
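A sketch of this prefix augmentation follows; the function name is an assumption of this sketch.

```python
# Sketch of the subsequence augmentation (assumed names): every session
# [x1, ..., xt] contributes its prefixes of length >= 2, each of which is
# later split into input items and a label item.
def augment_sessions(sessions, min_len=2):
    augmented = []
    for s in sessions:
        for end in range(min_len, len(s) + 1):
            augmented.append(s[:end])
    return augmented

# augment_sessions([[1, 2, 3, 4]]) -> [[1, 2], [1, 2, 3], [1, 2, 3, 4]]
```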
In this embodiment, the item recommendation model is mainly trained by using data in a training set, and the test set is used for the following test effect discussion.
S2, preprocess the data: divide the session-based sequence into input data and a label item, and embed each item of the input data as an embedding vector.
In this step, each session sequence of the training set is first divided into input data and a label item; then training on the training set in word2vec fashion yields the initialized embedding-vector (item embedding) matrix $W_{emb}$ of all items, and each session sequence is converted into a vector representation. The sub-steps are as follows:
S21, take the last item of each session sequence as the label item of the sequence and the remaining items as input data.
This step takes each session sequence $x = \{x_1, x_2, \ldots, x_t\}$, uses the last item $x_t$ as the label item y of the sequence, and uses the remaining items $\{x_1, x_2, \ldots, x_{t-1}\}$ as input data.
S22, train on the training set with the word2vec method to obtain the initialized embedding-vector matrix $W_{emb}$ of all items and use it as a parameter matrix of the model, with size N×M, where N is the total number of items and M is the dimension of the item embedding vectors.
This step first initializes the embedding-vector (item embedding) matrix $W_{emb}$ of all items with size N×M, where M = 50 here. For the training set, each item point is regarded as a word and each session sequence as a sentence, and training on the training set in word2vec fashion yields the initialized embedding-vector matrix of all items.
S23, represent each item of every input session sequence by its vector in the item embedding matrix $W_{emb}$, forming the embedding-vector matrix X of the session sequence.
This step represents each item of every input session sequence $x = \{x_1, x_2, \ldots, x_t\}$ by its vector in the item embedding matrix $W_{emb}$, forming a new session-sequence matrix X of size t×50.
S3, obtain the latent variable: input the input data, represented as embedding vectors, into a recurrent neural network and introduce the regularized-flow learning algorithm and the attention mechanism to obtain the latent variable.
First the input data, represented as embedding vectors, are fed into the recurrent neural network to obtain the hidden states, and the first latent variable is constructed and optimized through the regularized-flow algorithm to obtain the optimized first latent variable; meanwhile, the second latent variable is obtained from the hidden states with the attention mechanism, and the optimized first latent variable and the second latent variable are concatenated to obtain the final latent variable. The recurrent neural network adopted in this step is a GRU whose dimension is 100. The sub-steps are as follows:
S31, encode the input data, represented as embedding vectors, with the recurrent neural network to obtain the hidden state $h_t$ and construct the first latent variable $z_0$.
This step encodes the session-sequence matrix X, represented as embedding vectors, with GRUs; the update process of each layer is as follows:

hidden state: $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$, where $h_t$ is the hidden state corresponding to the t-th item of each layer of the GRU;

update gate: $z_t = \sigma(W_z x_t + U_z h_{t-1})$;

candidate activation: $\tilde{h}_t = \tanh(W_t x_t + U_t (r_t \odot h_{t-1}))$;

reset gate: $r_t = \sigma(W_r x_t + U_r h_{t-1})$.

This update process generates the mean $\mu = W_\mu h_t + b_\mu$ and variance $\sigma = \exp(W_\sigma h_t + b_\sigma)$ of the distribution of real sessions in the latent space; sampling from this distribution then yields the first latent-variable representation of the latent space of each session sequence:

$$z_0 = \mu + \sigma \odot \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, I)$$

In the gates, $\sigma$ denotes the sigmoid activation function; $x_t$ denotes the t-th item of each layer's input; $W_z$, $W_t$, $W_\mu$, $W_\sigma$, $U_z$, $U_t$, $U_r$, $b_\mu$, $b_\sigma$ all denote training parameters of the model; and $\varepsilon$ is a standard-normal variable used for random sampling.
S32, define a set of invertible functions $f = [f_1, \ldots, f_K]$, feed the first latent variable $z_0$ obtained in step S31 into this set of invertible functions for transformation, and learn the optimized first latent variable $z_K$.
Variational inference is commonly employed in the art to approximate an intractable true posterior distribution. For a given session sequence, its log probability is bounded as follows:

$$\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)$$

where $p_\theta(x \mid z)$ is the true distribution of the input session-sequence data, $p(z)$ is the pre-assumed prior distribution of the latent variable, usually set to a standard normal distribution, $q_\phi(z \mid x)$ is the posterior distribution to be learned, which approximates the true distribution of the latent variable, z is the latent variable, x is the input session-sequence data, and φ and θ are parameters of the probability distributions.
None of the existing methods completely eliminates the deviation between the approximate and the true posterior, which is mainly due to the term $\mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)$ being inaccurate under the assumed prior distribution. The regularized-flow method used here alleviates this deviation well and further improves the efficiency of approximating the real posterior distribution in variational inference.
The regularized flow is implemented in this embodiment as follows: define a set of invertible functions $f = [f_1, \ldots, f_K]$ and feed the first latent variable $z_0$ obtained in the previous step into this set of invertible functions for transformation, learning $z_K = f_K(f_{K-1}(\cdots f_1(z_0)))$; each invertible function may here be called a flow, and K is set to 16 here. Taking the invertible function called planar as an example, f can be defined as

$$f(z) = z + u \tanh(w^{\top} z + b)$$

where $u, w \in \mathbb{R}^d$ (d denotes their dimension), $b \in \mathbb{R}$, $\tanh$ is the activation function, and $u$, $w$, $b$ all denote parameters of the invertible function to be learned. Through the operation of this set of invertible functions, the learned optimized first latent variable $z_K$ is closer to the true latent-layer data distribution; the distribution of $z_K$ is learned as

$$\log q_K(z_K) = \log q_0(z_0) - \sum_{k=1}^{K} \log\left|\det \frac{\partial f_k}{\partial z_{k-1}}\right|$$

where $\det \frac{\partial f_k}{\partial z_{k-1}}$ is the Jacobian determinant given by the theory of invertible functions. The distribution $q_K(z_K)$ obtained from an initialized Gaussian distribution $q_0(z_0)$ through the set of invertible functions is called a regularized flow. Furthermore, with the regularized-flow process, the approximate posterior distribution that needs training is $q_\phi(z \mid x) = q_K(z_K)$, where $z_0 \sim q_\phi(z_0 \mid x)$ is a Gaussian distribution.
S33, obtain the second latent variable c based on the variational attention mechanism.
First compute according to the following formula the probability distribution $\alpha_{it}$ that the different item points of the session sequence assign to the current corresponding output value of the recurrent encoder:

$$\alpha_{it} = \frac{\exp\!\big(h_i W^{\top} h_t^{\top}\big)}{\sum_{j=1}^{t} \exp\!\big(h_j W^{\top} h_t^{\top}\big)}$$

where $h_i$ denotes the hidden state of the i-th item of each layer of the recurrent neural network, $i = 1, 2, \ldots, t$; $h_t$ denotes the hidden state of the t-th item, obtained in step S31; and $W^{\top}$ is the transpose of the parameter matrix W to be learned for this session sequence.
Then compute the attention vector c as the weighted sum of the inputs and take it as the second latent variable:

$$c = \sum_{i=1}^{t} \alpha_{it} h_i$$

S34, concatenate the optimized first latent variable obtained in step S32 and the second latent variable obtained in step S33, that is, append the second latent variable c to the optimized first latent variable $z_K$, to obtain the final latent variable m.
S4, obtain the first loss: input the final latent variable into the classifier, output the user's next predicted item as the predicted item, compute the cross-entropy loss between the predicted item and the label item, and take it as the first loss.
The classifier adopted in this step is composed of a fully connected network. The concatenated final latent variable m is input into the classifier, which outputs the predicted next user click item $\hat{y}$. Then, with the sequence's label item y obtained by the earlier division (i.e., the last item of each session sequence) and the predicted item $\hat{y}$, the cross-entropy loss is computed as

$$L_1 = -\sum_{i=1}^{N} y_i \log \hat{y}_i$$

and taken as the first loss, N being the total number of items.
S5, obtain the second loss: input the optimized first latent variable $z_K$ from the regularized flow into the decoder to obtain the second loss $L_2$.
The decoder used in this step is a GRU. Feeding the optimized first latent variable $z_K$ from the regularized flow into the decoder restores the most original input session sequence, and in this process the true distribution of the latent variable can be learned. The second loss constructed in this step is

$$L_2 = \underbrace{-\,\mathbb{E}_{q_0(z_0)}\!\big[\log p_\theta(x \mid z_K)\big]}_{\text{reconstruction loss}} \;+\; \underbrace{\mathbb{E}_{q_0(z_0)}\!\big[\log q_0(z_0) - \log p(z_K)\big]}_{\text{constant term}} \;-\; \beta\,\underbrace{\mathbb{E}_{q_0(z_0)}\!\Big[\sum_{k=1}^{K} \log\Big|\det \tfrac{\partial f_k}{\partial z_{k-1}}\Big|\Big]}_{\text{regularized flow}}$$

The posterior distribution $q_\phi(z \mid x)$, defined as $q(z_K)$, is used to approximate the true distribution $p_\theta(x, z)$ of the latent variable z; $p_\theta(x, z)$ denotes the true distribution of the input original session sequence; z and x denote the latent variable of the regularized-flow model and the input session-sequence data respectively; θ and φ denote the parameters of the probability distributions. In the formula above, the first term denotes the reconstruction loss, the second is a constant term, and the third denotes the regularized flow; β is a coefficient used to adjust the influence of the regularized flow on the prediction model and is taken as 0.2 in this step.
S6, add the first loss and the second loss to generate the final total loss.
This step adds $L_1$ and $L_2$ together to generate the final total loss L.
S7, repeat steps S2-S6 and minimize the total loss to obtain the item recommendation model.
The purpose of this step is to minimize the total loss. The first loss $L_1$ can be minimized by gradient descent. The second loss $L_2$ is minimized by maximizing the evidence lower bound (ELBO)

$$\mathrm{ELBO} = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)$$

To this end, the learning rate used in the item recommendation model training process is 0.001, and batch training is adopted with 512 session sequences of the data set per batch. If the total loss is excessive, the training can be improved by adjusting the dimension of the item-point embedding vectors and the dimension of the output vectors of the GRU model.
When the total loss L is stable, the training of the item recommendation model is finished.
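A condensed sketch of the resulting training loop under the stated hyperparameters follows; the choice of the Adam optimizer and the epoch count are assumptions of this sketch (the text only specifies gradient descent, the learning rate, and the batch size).

```python
# Sketch of steps S6-S7 (assumed names): minimize the total loss L = L1 + L2
# by batched gradient descent with lr = 0.001 and batch size 512.
import torch

def train(model, loader, epochs=30, lr=0.001):
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # optimizer assumed
    for _ in range(epochs):
        for X, y in loader:        # X: embedded sessions, y: label items
            loss = model(X, y)     # model is assumed to return L1 + L2
            opt.zero_grad()
            loss.backward()
            opt.step()
```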
Application example
For the test sets of the two experimental data sets, item recommendation is performed with the item recommendation models obtained by the training of the embodiment, according to the following steps:
S1', preprocess the data: embed each item of the input data of the session-based sequence as an embedding vector.
To run the test on the test set, this application example still divides the session-based sequences of the test set into input data and label items and embeds each item of the input data as an embedding vector; the processing of this step follows step S2 of the model training process.
The label item is used to judge the prediction accuracy in step S2'.
In practical application, when the trained item recommendation model is used to predict the user's next item, the session-based sequence need not be divided into input data and a label item; the whole sequence is processed as input data.
S2', obtain the user's predicted item: input the session sequence, represented as embedding vectors, into the corresponding item recommendation model obtained by the training of the embodiment to obtain the user's next predicted item.
This step assigns a score to each item with the scoring-matrix method and recommends items according to the scores, which further improves the accuracy and efficiency of item recommendation while obtaining the recommended items. The sub-steps are as follows:
S21', input the session sequence, represented as embedding vectors, into the recurrent neural network and introduce the regularized-flow learning algorithm and the attention mechanism to obtain the latent variable m.
The processing of this step follows step S3 of the model training process.
S22', generate a scoring matrix $S = W_{emb}^{\top} B m$ from the obtained latent variable m; this scoring matrix S contains a score for each item, where $W_{emb}^{\top}$ denotes the transpose of $W_{emb}$ and m is the final latent variable. The top n (n = 20) highest-scoring items in S are selected as the final prediction result; if the user's real label item is among these 20 items, the prediction is considered correct. In practical application, 20 items are recommended to the user each time, and the recommendation is successful if the user is interested in them.
The prediction effect of the above item recommendation model (VASER) on the test sets is shown in Table 2.
To further illustrate the prediction effect of the session-based item recommendation method implemented with a variational regularized flow provided by the invention, this application example also trains session-based sequential item recommendation (SBR) models on the two experimental data sets with six baseline methods (Item-KNN, GRU4REC, NARM, STAMP, ReLaVaR, and VRM) and then uses these six models to predict the next item to be clicked for the users of the test sets; their prediction results are shown in Table 2. The index Recall@20 denotes the proportion of test cases whose real label item appears among the first 20 items of the prediction result; Recall@20 does not care about the exact rank of the item, and the prediction counts as successful as long as the real label item occurs among the first 20 predicted items. The second index, MRR@20 (Mean Reciprocal Rank), is an internationally common mechanism for evaluating ranking results: the first 20 items are ranked by probability from high to low and the predicted items are checked one by one; if the correct item is ranked first the score is 1, if second the score is 0.5, if n-th the score is 1/n, and if no correct item is present the score is 0. The final MRR@20 score is the average of the scores of all sequence predictions of the test set. MRR takes the rank of the predicted item into account, which makes this index important in problems where the recommendation order matters. Both metrics can be computed as sketched below.
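```python
# Sketch of Recall@20 and MRR@20 (assumed names). Each row of `topk` holds
# the 20 predicted item IDs of one test sequence, ranked from high to low
# score; `labels` holds the true label item of each sequence.
import numpy as np

def recall_at_k(topk, labels):
    hits = [label in preds for preds, label in zip(topk, labels)]
    return float(np.mean(hits))

def mrr_at_k(topk, labels):
    scores = []
    for preds, label in zip(topk, labels):
        rank = next((i + 1 for i, p in enumerate(preds) if p == label), None)
        scores.append(1.0 / rank if rank else 0.0)
    return float(np.mean(scores))
```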
Table 2: performing a session-based sequential item recommendation model result on two sub data sets
The other methods in the table are described below:
Item-KNN: an item-to-item model that recommends items similar to the previously visited ones based on cosine similarity. Reference: Badrul Munir Sarwar, George Karypis, Joseph A. Konstan, and John Riedl. 2001. Item-based Collaborative Filtering Recommendation Algorithms. In WWW.
GRU4REC: a recurrent neural network (RNN) based deep learning model for session-based recommendation. It uses GRU units to capture sequence information and uses a parallel mini-batch trick and a ranking-based loss function during training. Reference: Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. 2016. Session-based Recommendations with Recurrent Neural Networks.
NARM: a recurrent neural network (RNN) based model that takes the main information from the hidden states with an attention mechanism and combines it with sequence information to generate recommended items. Reference: Jing Li, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Tao Lian, and Jun Ma. 2017. Neural Attentive Session-based Recommendation. In CIKM.
STAMP: a priority model that captures the user's general interests from the long-term memory of the session context and the current interests from the short-term memory of the most recently clicked items. Reference: Qiao Liu, Yifu Zeng, Refuoe Mokhosi, and Haibin Zhang. 2018. STAMP: Short-Term Attention/Memory Priority Model for Session-based Recommendation. In KDD.
ReLaVaR: a Bayesian version of GRU4Rec that regards the recurrent units of the network as random latent variables with a certain prior distribution and infers the corresponding posterior distribution to predict the next click item. It is an item-level variational inference method for SBR that takes independent Gaussians as the prior distribution of the items. Reference: Sotirios P. Chatzis, Panayiotis Christodoulou, and Andreas S. Andreou. 2017. Recurrent Latent Variable Networks for Session-Based Recommendation. In DLRS@RecSys.
VRM: a recently proposed method that directly applies VAEs to session-based recommendation. Unlike the item-level variational method ReLaVaR, VRM is a session-level variational inference method. Reference: Zhitao Wang, Chengyao Chen, Ke Zhang, Yu Lei, and Wenjie Li. 2018. Variational Recurrent Model for Session-based Recommendation. In CIKM.
As the prediction results in Table 2 show, the prediction accuracy of the session-based item recommendation method implemented with a variational regularized flow provided by the invention is higher than that of the existing methods.
To explain why the variational regularized flow improves the prediction results, this application example visually compares the latent variables produced by the VASER model of the embodiment with those of a general method, as shown in fig. 2: the latent variables learned by the VASER model are more dispersed, which is why VASER's prediction effect is higher than that of the other methods.
To illustrate the influence of session length on the prediction results, this application example divides the test set into several subsets by session length and then predicts the next item to be clicked on each test subset with the NARM, STAMP, and VASER models; the prediction results (Recall@20 index) are shown in fig. 3. The figure shows that prediction accuracy decreases as session-sequence length increases; however, the VASER model provided by the invention depends on session length somewhat less than the other models, which shows that it has better prediction stability.
To illustrate the influence of β on the prediction results, β of the VASER model is set to 0, 0.1, 0.2, 0.5, and 1 in this application example; models are trained on the training set of data set YOOCHOOSE 1/64 according to the method of the embodiment, and the trained VASER models predict the next item to be clicked on the test set of YOOCHOOSE 1/64. The prediction results (Recall@20 index) are shown in fig. 4: the effect is best when β = 0.2.
To illustrate the influence of the parameter K of the regularized flow on the prediction results, K of the VASER model is set to $2^1, 2^2, 2^3, 2^4, 2^5$ respectively; models are trained on the training sets of data sets Yoochoose 1/64 and Yoochoose 1/4 according to the method of the embodiment, and the trained VASER models predict the next item to be clicked on the test sets of Yoochoose 1/64 and Yoochoose 1/4. The prediction results (Recall@20 index) are shown in fig. 5: prediction accuracy increases with K, but the training time also grows with K; therefore, K is taken as $2^4$ in the invention.
In summary, the session-based item recommendation method implemented with a variational regularized flow can effectively handle the lack of user information in the data set, and the model adapts well to long-sequence session data. The added regularized flow compensates for the errors of previous variational inference processes (e.g., the VAE), so the model can learn the true distribution of the session sequence, which is very helpful for improving prediction accuracy.
It will be appreciated by those of ordinary skill in the art that the embodiments described here are intended to help the reader understand the principles of the invention and should not be construed as limiting the invention to the specifically described embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.

Claims (8)

1.一种利用变分的正则化流实现的项目推荐模型训练方法,其特征在于包括以下步骤:1. a project recommendation model training method utilizing the regularization flow realization of variation is characterized in that comprising the following steps: S1,构建训练集:利用同一用户的基于会话的序列构建训练集;S1, build a training set: use the session-based sequence of the same user to build a training set; S2,数据的预处理:将基于会话的序列划分为输入数据和标签项目,并将输入数据中的每一个项目进行嵌入,表示成嵌入向量;S2, data preprocessing: divide the session-based sequence into input data and label items, and embed each item in the input data as an embedding vector; S3,获取隐含变量,将表示成嵌入向量的输入数据输入循环神经网络,并引入正则化流学习算法和注意力机制,分别构建第一隐含变量和第二隐含变量,并将两个隐含变量拼接在一起获取最终的隐含变量;S3: Obtain hidden variables, input the input data expressed as embedded vectors into the recurrent neural network, and introduce a regularized flow learning algorithm and an attention mechanism, respectively construct the first hidden variable and the second hidden variable, and combine the two The hidden variables are spliced together to obtain the final hidden variable; 首先将表示成嵌入向量的输入数据输入循环神经网络,得到隐藏状态,并构建第一隐含变量,通过正则化流学习算法对第一隐含变量优化得到的优化后的第一隐含变量;同时依据得到的隐藏状态,利用注意力机制获取第二隐含变量,并将优化后的第一隐含变量和第二隐含变量拼接在一起得到最终的隐含变量;具体实现方式包括以下分步骤:First, input the input data expressed as an embedded vector into the recurrent neural network to obtain the hidden state, and construct the first latent variable, and optimize the first latent variable obtained by optimizing the first latent variable through the regularized flow learning algorithm; At the same time, according to the obtained hidden state, the attention mechanism is used to obtain the second latent variable, and the optimized first latent variable and the second latent variable are spliced together to obtain the final latent variable; the specific implementation method includes the following points: step: S31,利用循环神经网络对表示成嵌入向量的输入数据进行编码,得到隐藏状态ht,并构建第一隐含变化量z0S31, using a recurrent neural network to encode the input data expressed as an embedded vector to obtain a hidden state h t , and construct a first implicit variation z 0 ;
Figure FDA0003001994720000011
Figure FDA0003001994720000011
式中,
Figure FDA0003001994720000012
为潜在空间中会话分布的均值,
Figure FDA0003001994720000013
Figure FDA0003001994720000014
为潜在空间中会话分布的方差,
Figure FDA0003001994720000015
隐藏状态
Figure FDA0003001994720000016
其中ht表示循神经网络每一层的第t个项目对应的隐藏状态,更新门zt=σ(Wzxt+Uzht-1),候选的激活函数
Figure FDA0003001994720000017
重置门rt=σ(Wrxt+Urht-1),以上的σ表示为sigmoid激活函数,tanh也是一种激活函数,xt表示每一层输入的第t个项目,Wz、Wt、Wu、Wσ、Uz、Ut、Ur、bμ、bσ均表示模型的训练参数,ε表示一个用于随机采样的标准正态分布的变量;
In the formula,
Figure FDA0003001994720000012
is the mean of the session distribution in the latent space,
Figure FDA0003001994720000013
Figure FDA0003001994720000014
is the variance of the session distribution in the latent space,
Figure FDA0003001994720000015
hidden state
Figure FDA0003001994720000016
Where h t represents the hidden state corresponding to the t-th item of each layer of the recurrent neural network, the update gate z t =σ(W z x t +U z h t-1 ), the candidate activation function
Figure FDA0003001994720000017
Reset gate r t =σ(W r x t +U r h t-1 ), the above σ represents the sigmoid activation function, tanh is also an activation function, x t represents the t-th item of the input of each layer, W z , W t , Wu , W σ , U z , U t , Ur , b μ , b σ all represent the training parameters of the model, and ε represents a standard normal distribution variable for random sampling;
S32,定义一组可逆函数f=[f1,…,fK],将步骤S31得到的第一隐含变量z0放进输入到这一组可逆函数中进行变换,学习到得到优化的第一隐含变量zk,zk=fK(fK-1(zK-2))=fK(fK-1(…f1(z0)));S32, define a set of reversible functions f=[f 1 , . . . , f K ], put the first implicit variable z 0 obtained in step S31 into the set of reversible functions for transformation, and learn the optimized first variable an implicit variable z k , z k =f K (f K-1 (z K-2 ))=f K (f K-1 (...f 1 (z 0 ))); S33,基于变分注意力机制,首先按照以下公式计算得到会话序列中不同项目点对循环轨迹编码器当前对应输出值的概率分布αitS33 , based on the variational attention mechanism, first calculate the probability distribution α it of different item points in the conversation sequence to the current corresponding output values of the cyclic trajectory encoder according to the following formula,
Figure FDA0003001994720000021
Figure FDA0003001994720000021
式中,hi表示循环神经网络每一层的第i个项目的隐藏状态,i=1,2,…,t;ht表示第t个项目的隐藏状态,由步骤S31得到,WT为该会话序列需要学习的参数矩阵W的转置;In the formula, hi represents the hidden state of the i -th item in each layer of the recurrent neural network, i=1, 2, ..., t; h t represents the hidden state of the t-th item, obtained from step S31, and W T is The transpose of the parameter matrix W that this session sequence needs to learn; 然后,通过加权输入求和计算得到注意力向量c,并将其作为第二隐含变量:
Figure FDA0003001994720000022
Then, the attention vector c is calculated by summing the weighted inputs and used as the second latent variable:
Figure FDA0003001994720000022
S34,将步骤S32得到的优化后的第一隐含变量和步骤S33得到的第二隐含变量拼接在一起,得到最终的隐含变量m;S34, splicing together the optimized first implicit variable obtained in step S32 and the second implicit variable obtained in step S33 to obtain the final implicit variable m; S4,获取第一损失,将最终的隐含变量输入到分类器中,输出得到预测项目,利用预测项目和该会话序列的标签项目计算交叉熵损失,并将其作为第一损失;S4, obtain the first loss, input the final latent variable into the classifier, and output the predicted item, calculate the cross-entropy loss by using the predicted item and the label item of the session sequence, and use it as the first loss; S5,获取第二损失,将基于正则化流构建的第一隐含变量输入到解码器中得到第二损失;S5, obtain the second loss, and input the first hidden variable constructed based on the regularization stream into the decoder to obtain the second loss; S6,将第一损失与第二损失相加生成最后的总损失;S6, adding the first loss and the second loss to generate the final total loss; S7,重复步骤S2-S6,最小化总损失,即得到所述项目推荐模型。S7, repeating steps S2-S6 to minimize the total loss, that is, to obtain the item recommendation model.
2.根据权利要求1所述利用变分的正则化流实现的项目推荐模型训练方法,其特征在于步骤S1中,训练集内基于会话的所有会话序列的长度均不小于5。2 . The item recommendation model training method using variational regularization flow according to claim 1 , wherein in step S1 , the lengths of all session-based sequences in the training set are not less than 5. 3 . 3.根据权利要求1或2所述利用变分的正则化流实现的项目推荐模型训练方法,其特征在于将每一个会话序列划分为若干子序列,并将这些子序列补充到原有训练集中。3. The item recommendation model training method using variational regularization flow according to claim 1 or 2 is characterized in that each session sequence is divided into several subsequences, and these subsequences are supplemented into the original training set . 4.根据权利要求1所述利用变分的正则化流实现的项目推荐模型训练方法,其特征在于步骤S2包括以下分步骤:4. the item recommendation model training method that utilizes variational regularization flow to realize according to claim 1 is characterized in that step S2 comprises the following substeps: S21,将每一个会话序列的最后一个项目作为这条序列的标签项目,其余项目作为输入数据;S21, the last item of each session sequence is used as the label item of this sequence, and the remaining items are used as input data; S22,采用word2vec的方法在训练集进行训练得到初始化的所有项目的嵌入向量矩阵Wemb,并将其作为模型的参数矩阵,大小为N×M,其中N为项目的总数,M为嵌入向量的维数;S22, use the word2vec method to train in the training set to obtain the initialized embedding vector matrix Wemb of all items, and use it as the parameter matrix of the model, with a size of N×M, where N is the total number of items, and M is the embedding vector. dimension; S23,将每一个作为输入的会话序列中的每一个项目用项目嵌入矩阵Wemb中的向量表示出来,构成该会话序列的嵌入向量矩阵X。S23, each item in each input session sequence is represented by a vector in the item embedding matrix Wemb to form an embedding vector matrix X of the session sequence. 5.根据权利要求1所述利用变分的正则化流实现的项目推荐模型训练方法,其特征在于步骤S32中,采用的可逆函数f为planar可逆函数,f定义为:
Figure FDA0003001994720000031
d表示u,w的维度,b∈R,
Figure FDA0003001994720000032
为激活函数tanh,u,w和b表示需要学习的可逆函数的参数。
5. the item recommendation model training method that utilizes variational regularization flow to realize according to claim 1, it is characterized in that in step S32, the reversible function f that adopts is planar reversible function, and f is defined as:
Figure FDA0003001994720000031
d represents the dimension of u, w, b ∈ R,
Figure FDA0003001994720000032
For the activation function tanh, u, w and b denote the parameters of the invertible function to be learned.
6. The item recommendation model training method using variational regularization flow according to claim 1, wherein in step S4, the final latent variable m is input into a classifier, which outputs the user's next predicted item as the prediction ŷ; the cross-entropy loss L1 is computed from the label item y of the session sequence divided in step S2 and the predicted item ŷ as

L1 = -Σ_{i=1}^{N} y_i log(ŷ_i),

and taken as the first loss, where N is the total number of items.
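A toy numerical check of this L1 formula with a one-hot label y over N items (all values made up) is:

import torch
import torch.nn.functional as F

N = 5
logits = torch.tensor([[2.0, 0.5, 0.1, -1.0, 0.3]])   # classifier output for one session
y = torch.tensor([0])                                  # index of the label item
y_hat = torch.softmax(logits, dim=-1)                  # predicted distribution over N items
l1_manual = -torch.log(y_hat[0, y])                    # -sum_i y_i log y_hat_i with one-hot y
l1_builtin = F.cross_entropy(logits, y)                # the same quantity via the library
assert torch.allclose(l1_manual.squeeze(), l1_builtin)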
In step S5, the final latent variable m is input to the decoder to obtain the second loss L2:

L2 = -E_{qφ(z|x)}[log pθ(x|z_K)] + (E[log q(z_0)] - E[log p(z_K)]) - β · E[Σ_{k=1}^{K} log|det(∂f_k/∂z_{k-1})|],

where the posterior distribution qφ(z|x) is used to approximate the true distribution pθ(x,z) of the latent variable z, with qφ(z|x) defined as q(z_K); pθ(x,z) denotes the true distribution of the input original session sequence; z and x denote, respectively, the latent variable of the regularization-flow model and the input session sequence data; and θ and φ denote the parameters of the probability distributions. In the above formula, the first term denotes the reconstruction loss, the middle term is a constant term, and the last term denotes the regularization flow, with β as its coefficient.

In step S6, L1 and L2 are added together to generate the final total loss L.
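A hedged sketch of assembling L2 from the three terms described above, simplifying the decoder to a Gaussian (squared error up to constants) and taking the base distribution and prior as standard normals; the function name and shapes are illustrative assumptions:

import torch

def l2_loss(x, x_recon, zK, log_q0_z0, sum_log_det, beta=1.0):
    # term 1: -E[log p_theta(x | z_K)]; Gaussian decoder up to constants -> squared error
    recon = 0.5 * ((x - x_recon) ** 2).sum(dim=-1)
    # term 2: E[log q(z_0)] - E[log p(z_K)], with p(z_K) taken as a standard normal
    log_p_zK = -0.5 * (zK ** 2).sum(dim=-1) - 0.5 * zK.size(-1) * torch.log(torch.tensor(2 * torch.pi))
    const = log_q0_z0 - log_p_zK
    # term 3: -beta * E[sum_k log |det df_k/dz_{k-1}|], accumulated over the K flow steps
    return (recon + const - beta * sum_log_det).mean()

B, D = 32, 50
x, x_recon = torch.randn(B, D), torch.randn(B, D)
z0, zK = torch.randn(B, D), torch.randn(B, D)
log_q0_z0 = -0.5 * (z0 ** 2).sum(-1) - 0.5 * D * torch.log(torch.tensor(2 * torch.pi))
sum_log_det = torch.zeros(B)            # e.g. the summed log-dets from K planar steps
loss2 = l2_loss(x, x_recon, zK, log_q0_z0, sum_log_det, beta=0.5)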
7. An item recommendation method using variational regularization flow, characterized by comprising the following steps:
S1′, data preprocessing: embedding each item of the session-based sequence and representing it as an embedding vector;
S2′, obtaining the user's predicted items: inputting the session sequence represented as embedding vectors into a model trained by the item recommendation model training method according to any one of claims 1 to 6, to obtain the user's predicted items.

8. The item recommendation method using variational regularization flow according to claim 7, wherein step S2′ comprises the following substeps:
S21′, inputting the session sequence represented as embedding vectors into a recurrent neural network, introducing the regularization-flow learning algorithm and the attention mechanism to construct the first latent variable and the second latent variable respectively, and splicing the two latent variables together to obtain the final latent variable m;
S22′, generating a scoring matrix S = W_emb^T B m from the obtained final latent variable m, the scoring matrix S containing one score for each item, where W_emb^T denotes the transpose of W_emb, B denotes a model parameter to be learned, and m is the final latent variable; the top-n highest-scoring items in S are selected as the final prediction result.
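A sketch of the step-S22′ scoring and top-n selection might look as follows; here W_emb is stored as N×M, so the matrix-vector product W_emb(Bm) yields one score per item (the claim's transpose notation depends on storage orientation), and n = 20 is an arbitrary illustrative choice:

import torch

N, M, L = 1000, 100, 100                   # items, embedding dim, dim of the final latent m
W_emb = torch.randn(N, M)                  # item embedding matrix
B = torch.randn(M, L)                      # learned bilinear parameter
m = torch.randn(L)                         # final latent variable for one session

scores = W_emb @ (B @ m)                   # scoring matrix S: one score per item
top_n = torch.topk(scores, k=20).indices   # indices of the top-n highest-scoring items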
CN201910515356.1A 2019-03-01 2019-06-14 Item recommendation method and model training method using variational regularization flow Active CN110232480B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019101570128 2019-03-01
CN201910157012 2019-03-01

Publications (2)

Publication Number Publication Date
CN110232480A (en) 2019-09-13
CN110232480B (en) 2021-05-11

Family

ID=67859223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910515356.1A Active CN110232480B (en) 2019-03-01 2019-06-14 Item recommendation method and model training method using variational regularization flow

Country Status (1)

Country Link
CN (1) CN110232480B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110851705A (en) * 2019-10-09 2020-02-28 天津大学 Item-based collaborative storage recommendation method and recommendation device
CN110765353B (en) * 2019-10-16 2022-03-08 腾讯科技(深圳)有限公司 Processing method and device of project recommendation model, computer equipment and storage medium
CN110826698A (en) * 2019-11-04 2020-02-21 电子科技大学 Method for embedding and representing crowd moving mode through context-dependent graph
CN110781401A (en) * 2019-11-07 2020-02-11 电子科技大学 A Top-n Item Recommendation Method Based on Collaborative Autoregressive Flow
CN110837565B (en) * 2019-11-14 2022-08-12 中山大学 Model training device and computer equipment for drug recommendation
CN111046257B (en) * 2019-12-09 2023-07-04 北京百度网讯科技有限公司 Session recommendation method and device and electronic equipment
CN111258992A (en) * 2020-01-09 2020-06-09 电子科技大学 Seismic data expansion method based on variational autoencoder
CN111461175B (en) * 2020-03-06 2023-02-10 西北大学 Label recommendation model construction method and device based on self-attention and collaborative attention mechanism
CN111552881B (en) * 2020-05-09 2024-01-30 苏州市职业大学 Sequence recommendation method based on hierarchical variation attention
CN112085252B (en) * 2020-08-03 2024-01-05 清华大学 Counterfactual prediction method for set-type decision effects
CN112084415B (en) * 2020-09-17 2024-02-02 辽宁工程技术大学 Recommendation method based on analysis of long-term and short-term time coupling relation between user and project
CN112231582B (en) * 2020-11-10 2023-11-21 南京大学 Website recommendation method and equipment based on variational autoencoder data fusion
CN112435751B (en) * 2020-11-10 2023-01-03 中国船舶集团有限公司第七一六研究所 Peritoneal dialysis mode auxiliary recommendation system based on variational inference and deep learning
CN113837301B (en) * 2021-02-08 2025-02-11 宏龙科技(杭州)有限公司 A device and method for discrete semantic coding of multimodal data
CN113222700B (en) * 2021-05-17 2023-04-18 中国人民解放军国防科技大学 Session-based recommendation method and device
CN113395172B (en) * 2021-05-18 2022-11-11 中国电子科技集团公司第五十四研究所 Important user discovery and behavior prediction method based on communication network
CN113378383B (en) * 2021-06-10 2024-02-27 北京工商大学 Food supply chain hazard prediction method and device
CN113935811B (en) * 2021-10-26 2024-05-14 重庆理工大学 Session recommendation method based on topic guidance and dual global attention
CN114662004B (en) * 2022-04-21 2025-02-28 河海大学 A next item recommendation method integrating users' long-term and short-term preferences

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8370203B2 (en) * 2002-10-07 2013-02-05 Amazon Technologies, Inc. User interface and methods for recommending items to users
US8200602B2 (en) * 2009-02-02 2012-06-12 Napo Enterprises, Llc System and method for creating thematic listening experiences in a networked peer media recommendation environment
US8589429B1 (en) * 2011-07-11 2013-11-19 Amazon Technologies, Inc. System and method for providing query recommendations based on search activity of a user base
US10726325B2 (en) * 2017-04-13 2020-07-28 Adobe Inc. Facilitating machine-learning and data analysis by computing user-session representation vectors
CN108256631A (en) * 2018-01-26 2018-07-06 深圳市唯特视科技有限公司 A kind of user behavior recommendation system based on attention model
CN108470075A (en) * 2018-04-12 2018-08-31 重庆邮电大学 A kind of social recommendation method for ranking-oriented prediction
CN108763493B (en) * 2018-05-30 2022-06-21 深圳市思迪信息技术股份有限公司 Deep learning-based recommendation method

Also Published As

Publication number Publication date
CN110232480A (en) 2019-09-13

Similar Documents

Publication Publication Date Title
CN110232480B (en) Item recommendation method and model training method using variational regularization flow
Wu et al. Session-based recommendation with graph neural networks
US10783361B2 (en) Predictive analysis of target behaviors utilizing RNN-based user embeddings
CN109299396B (en) Convolutional neural network collaborative filtering recommendation method and system fusing attention model
Zhang et al. Dynamic attention-integrated neural network for session-based news recommendation
CN111737578B (en) Recommendation method and system
CN110008409A (en) Sequence recommendation method, device and equipment based on self-attention mechanism
CN110119467A (en) A kind of dialogue-based item recommendation method, device, equipment and storage medium
CN115658864A (en) Conversation recommendation method based on graph neural network and interest attention network
CN113822776B (en) Course recommendation method, device, equipment and storage medium
WO2021139415A1 (en) Data processing method and apparatus, computer readable storage medium, and electronic device
US20240386202A1 (en) Tuning generative models using latent-variable inference
KR102070049B1 (en) Apparatus and method for collaborative filtering using auxiliary information based on conditional variational autoencoder
CN114117229B (en) An item recommendation method based on graph neural network with directed and undirected structural information
CN110399553A (en) Conversation recommendation list generation method based on adversarial learning
CN114564651A (en) Self-supervised recommendation method combined with contrastive learning
US20240378636A1 (en) Asset Audience Gap Recommendation and Insight
Sohafi-Bonab et al. DCARS: Deep context-aware recommendation system based on session latent context
Bao et al. Multisource heterogeneous user-generated contents-driven interactive estimation of distribution algorithms for personalized search
CN112559905B (en) Conversation recommendation method based on dual-mode attention mechanism and social similarity
Ohtomo et al. AM-Bi-LSTM: Adaptive multi-modal Bi-LSTM for sequential recommendation
CN110555161A (en) Personalized recommendation method based on user trust and convolutional neural network
Lv et al. Xdm: Improving sequential deep matching with unclicked user behaviors for recommender system
CN114117233A (en) A Conversational News Recommendation Method and Recommendation System Based on User Implicit Feedback
US20250131321A1 (en) Efficient Training Mixture Calibration for Training Machine-Learned Models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant