Disclosure of Invention
The invention aims to provide an item recommendation model training method realized with a variational regularized flow, addressing the low prediction accuracy and low prediction efficiency of traditional item prediction methods; the trained item recommendation model can accurately and efficiently predict the item a user is likely to click next.
Another object of the present invention is to provide a method for recommending items to a user by using the above item recommendation model.
The idea of the invention is to use a recurrent neural network (GRU or LSTM) together with an attention mechanism to learn the latent information among the items of a session sequence, so that the sequential information of the items is better captured, which benefits the learning of long sequences. In addition, the invention adds a variational inference process and learns the true distribution of the hidden variable with a variational regularized flow, thereby largely avoiding the error introduced by directly using a variational auto-encoder (VAE).
Based on the above inventive concept, the item recommendation model training method realized with a variational regularized flow provided by the invention comprises the following steps:
S1, constructing a training set: construct a training set from the session-based sequences of the same user;
S2, preprocessing the data: divide each session-based sequence into input data and a label item, and embed each item of the input data as an embedding vector;
S3, obtaining the hidden variable: input the data, represented as embedding vectors, into a recurrent neural network, introduce a regularized-flow learning algorithm and an attention mechanism to construct a first hidden variable and a second hidden variable respectively, and splice the two hidden variables together to obtain the final hidden variable;
S4, obtaining the first loss: input the final hidden variable into a classifier, output a predicted item, compute the cross-entropy loss between the predicted item and the label item of the session sequence, and take this cross-entropy loss as the first loss;
S5, obtaining the second loss: input the first hidden variable constructed with the regularized flow into a decoder to obtain the second loss;
S6, add the first loss and the second loss to generate the final total loss;
S7, repeat steps S2-S6, minimizing the total loss to obtain the item recommendation model.
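For orientation, the following minimal sketch (PyTorch-style Python; the model attributes and helper names such as embed, encode, flows, attend, classify_loss, and reconstruct_loss are hypothetical) shows how one training iteration over steps S2-S6 fits together; the individual components are detailed in the embodiment below.

    import torch

    def train_step(model, batch, optimizer):
        """One hypothetical S2-S6 iteration; component sketches appear later."""
        inputs, labels = batch                        # S2: input items and label item
        x = model.embed(inputs)                       # S2: item embedding lookup
        h, z0, mu, logvar = model.encode(x)           # S31: GRU encoder + reparameterization
        zk, logdet = model.flows(z0)                  # S32: regularized (normalizing) flow
        c = model.attend(h)                           # S33: attention over hidden states
        m = torch.cat([zk, c], dim=-1)                # S34: final hidden variable
        loss = (model.classify_loss(m, labels)        # S4: first (cross-entropy) loss
                + model.reconstruct_loss(x, zk, logvar, logdet))  # S5: second loss
        optimizer.zero_grad()                         # S6/S7: minimize the total loss
        loss.backward()
        optimizer.step()
        return loss.item()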
In the above item recommendation model training method realized with a variational regularized flow, step S1 processes the raw data to construct the training set: session sequences whose length is smaller than 5 are first deleted from the raw data, and the remaining session sequences form the training set.
In this training method, in order to make fuller use of the session sequences, each session sequence in the training set is divided into a plurality of subsequences, and these subsequences are added to the original training set.
In this training method, step S2 divides each session sequence of the training set into input data and a label item, trains on the training set in a word2vec manner to obtain an initialized embedding-vector (item embedding) matrix of all items, and converts each session sequence into a vector representation; it specifically comprises the following sub-steps:
S21, take the last item of each session sequence as the label item of the sequence and the remaining items as the input data;
S22, train on the training set in the word2vec manner to obtain the initialized embedding-vector matrix W_emb of all items and use it as a parameter matrix of the model, whose size is N×M, where N is the total number of items and M is the dimension of the embedding vectors;
S23, represent each item of each input session sequence by its vector in the embedding matrix W_emb, forming the embedding-vector matrix X of the session sequence.
In this training method, step S3 obtains the hidden-layer representation of the session-based sequence data. The input data, represented as embedding vectors, are first input into the recurrent neural network to obtain the hidden states and construct a first hidden variable, which is optimized by the regularized-flow algorithm to obtain the optimized first hidden variable; at the same time, a second hidden variable is obtained from the hidden states with an attention mechanism, and the optimized first hidden variable and the second hidden variable are combined to obtain the final hidden variable. The method comprises the following steps:
S31, use the recurrent neural network to encode the input data, represented as embedding vectors, obtaining the hidden states h_t, and construct the first hidden variable z_0:

update gate: $z_t = \sigma(W_z x_t + U_z h_{t-1})$
reset gate: $r_t = \sigma(W_r x_t + U_r h_{t-1})$
candidate activation: $\tilde{h}_t = \tanh(W x_t + U(r_t \odot h_{t-1}))$
hidden state: $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
$\mu = W_\mu h_t + b_\mu$, $\sigma = W_\sigma h_t + b_\sigma$
$z_0 = \mu + \sigma \odot \varepsilon$

where $\mu$ is the mean of the distribution of the sessions in the latent space, $\sigma$ is the variance of the distribution of the sessions in the latent space, h_t denotes the hidden state corresponding to the t-th item of each layer of the recurrent neural network, x_t denotes the t-th item of each layer's input, $\sigma(\cdot)$ denotes the sigmoid activation function and tanh is also an activation function, $W_z$, $W_r$, $W$, $W_\mu$, $W_\sigma$, $U_z$, $U_r$, $U$, $b_\mu$, $b_\sigma$ all denote training parameters of the model, and $\varepsilon$ denotes a variable randomly sampled from the standard normal distribution;
S32, define a set of invertible functions $f = [f_1, \ldots, f_K]$, input the first hidden variable z_0 obtained in step S31 into this set of invertible functions for transformation, and learn the optimized first hidden variable z_K:

$z_K = f_K(f_{K-1}(\cdots f_1(z_0)))$

The invertible function f adopted in this step is the planar invertible function, defined as:

$f(z) = z + u \tanh(w^{\top} z + b)$

where $u, w \in \mathbb{R}^d$, $b \in \mathbb{R}$, d denotes the dimension of u and w, tanh is the activation function, and u, w, b all denote parameters of the invertible function to be learned.
S33, based on the variational attention mechanism, compute the probability distribution $\alpha_{it}$ of the different items in the session sequence with respect to the current output of the recurrent encoder:

$\alpha_{it} = \dfrac{\exp(h_i W^{\top} h_t^{\top})}{\sum_{j=1}^{t} \exp(h_j W^{\top} h_t^{\top})}$

where h_i denotes the hidden state of the i-th item of each layer of the recurrent neural network, i = 1, 2, ..., t; h_t denotes the hidden state of the t-th item, obtained in step S31; and $W^{\top}$ is the transpose of a parameter matrix W to be learned for the session sequence.

Then compute the attention vector c by weighted summation of the inputs and take it as the second hidden variable:

$c = \sum_{i=1}^{t} \alpha_{it} h_i$

S34, splice the optimized first hidden variable obtained in step S32 and the second hidden variable obtained in step S33 together to obtain the final hidden variable m.
In the above item recommendation model training method realized with a variational regularized flow, steps S4 and S5 construct two loss functions: one is the cross-entropy loss between the predicted item and the label item, and the other is a reconstruction loss that drives the learned hidden-variable distribution closer to the true distribution. The invention uses the cross-entropy loss and the reconstruction loss together as the loss function, so that the model better preserves the structure of the input session sequence and further improves the accuracy of the prediction results.

In step S4, the final hidden variable m is input into a classifier, which outputs the user's next predicted item $\hat{y}$; the cross-entropy loss $L_1$ is computed from the label item y of the session sequence divided in step S2 and the predicted item $\hat{y}$:

$L_1 = -\sum_{j=1}^{N} y_j \log \hat{y}_j$

and taken as the first loss, where N is the total number of items;
In step S5, the optimized first hidden variable z_K of the regularized flow is input into the decoder to obtain the second loss $L_2$:

$L_2 = \mathbb{E}_{q_\phi(z|x)}\big[-\log p_\theta(x \mid z_K)\big] + \mathrm{const} - \beta\, \mathbb{E}_{q_0(z_0)}\Big[\sum_{k=1}^{K} \log\Big|\det \frac{\partial f_k}{\partial z_{k-1}}\Big|\Big]$

The posterior distribution $q_\phi(z|x)$ is used to approximate the true distribution $p_\theta(x, z)$ of the hidden variable z, with $q_\phi(z|x)$ defined as $q(z_K)$; $p_\theta(x, z)$ denotes the true distribution of the input original session sequence; z and x denote the hidden variable of the regularized-flow model and the input session-sequence data, respectively; and $\theta$, $\phi$ denote the parameters of the probability distributions. In the above formula, the first term denotes the reconstruction loss, the second term is a constant, the third term represents the regularized flow, and $\beta$ is a coefficient.
Steps S6 and S7 add $L_1$ and $L_2$ together to generate the final total loss L; minimizing the total loss function completes the training of the model and yields the item recommendation model.
In the above item recommendation model training method realized with a variational regularized flow, the recurrent neural network adopted is a GRU (Gated Recurrent Unit), an LSTM (Long Short-Term Memory network), or the like; the classifier adopted consists of a fully connected network; and the decoder adopted is a GRU or an LSTM.
The invention further provides an item recommendation method realized with a variational regularized flow, comprising the following steps:
S1', preprocessing the data: embed each item of the session-based sequence as an embedding vector;
S2', obtaining the user's predicted item: input the session sequence, represented as embedding vectors, into the model trained by the above item recommendation model training method to obtain the user's predicted item.
In the above item recommendation method realized with a variational regularized flow, step S2' assigns a score to each item with a scoring-matrix method and recommends items according to the scores, which further improves the accuracy and efficiency of item recommendation while obtaining the recommended items. It comprises the following steps:
S21', input the session sequence, represented as embedding vectors, into the recurrent neural network, introduce the regularized-flow learning algorithm and the attention mechanism to construct the first hidden variable and the second hidden variable respectively, and splice the two hidden variables together to obtain the final hidden variable m;
S22', generate the scoring matrix $S = W_{emb}^{\top} B m$ from the obtained hidden variable m; this scoring matrix S contains a score for each item, where $W_{emb}^{\top}$ denotes the transpose of W_emb, B denotes a training parameter of the model, and m is the final hidden variable; select the top-n highest-scoring items in S as the final prediction results.
The invention provides a variational session-based item recommendation method (VASER) realized with a variational regularized flow, which applies the variational regularized flow to a recurrent neural network (GRU or LSTM) with an attention mechanism. It can capture the information in long session sequences and learn the true distribution of the hidden variables through the regularized flow, thereby greatly improving the prediction accuracy.
The invention provides an item recommendation method realized with a variational regularized flow. Compared with the prior art, it has the following beneficial effects:
1. The invention uses a recurrent neural network (GRU or LSTM) with an attention mechanism, adds variational inference with a regularized flow, and recommends the next click item for the user by learning the hidden variables of the session sequences.
2. The variational regularized flow added to the model learns the true distribution of the hidden variables, which reduces the error of traditional variational models (such as the VAE) in the session-based recommendation problem; the traditional VAE must assume a prior distribution, which introduces a large error, whereas the regularized flow assumes no prior distribution, so the learned distribution is closer to the true distribution of the hidden variables.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Embodiment: item recommendation model training realized with a variational regularized flow
This embodiment takes the real data set YOOCHOOSE (available at http://2015.recsyschallenge.com/challenge.html) as the research object and explains the item recommendation model training method provided by the invention in detail.
As shown in FIG. 1, the item recommendation model (VASER) realized with a variational regularized flow provided by this embodiment mainly consists of an encoder, an attention layer, a regularized-flow layer, a classifier, and a decoder, and the training method of the item recommendation model comprises the following steps:
S1, constructing a training set: a training set is constructed from the session-based sequences of the same user.
Since the total amount of data in the YOOCHOOSE data set is large, this embodiment randomly takes 1/4 and 1/64 of this data set as the two experimental data sets and trains two item recommendation models according to steps S2-S7, respectively.
Table 1: Statistics of the data sets
As shown in Table 1, a training set and a test set were divided on each of the two experimental data sets, and session sequences with length smaller than 5 were deleted. Here, clicks refers to the total number of clicked items (including duplicates), train sessions denotes the session sequences used as the training set, and test sessions denotes the session sequences used as the test set; Table 1 also gives the total number of items and the average session-sequence length of each experimental data set.
To make full use of the session sequences, this embodiment divides each session sequence $x = \{x_1, x_2, \ldots, x_t\}$ into t-1 subsequences $x^1 = \{x_1, x_2, \ldots, x_{t-1}\}$, $x^2 = \{x_1, x_2, \ldots, x_{t-2}\}$, ..., $x^{t-1} = \{x_1, x_2\}$, and adds these subsequences to the original training set and test set, as shown in the sketch below.
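A minimal sketch of this filtering and prefix augmentation (plain Python; the exact cut-off conventions are assumptions based on the description above):

    # Sketch of S1 filtering plus the prefix augmentation described above.
    # Assumption: prefixes of length 2 .. t-1 are added next to each
    # original sequence; the same routine is applied to the test set.
    def filter_and_augment(sessions, min_len=5):
        out = []
        for s in (s for s in sessions if len(s) >= min_len):
            out.append(s)                                        # original sequence x
            out.extend(s[:k] for k in range(len(s) - 1, 1, -1))  # x^1 .. x^(t-1)
        return out

    print(filter_and_augment([[10, 4, 7, 3, 9]]))
    # [[10, 4, 7, 3, 9], [10, 4, 7, 3], [10, 4, 7], [10, 4]]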
In this embodiment, the item recommendation model is trained mainly on the data of the training set; the test set is used in the discussion of the test effect below.
S2, preprocessing the data: each session-based sequence is divided into input data and a label item, and each item of the input data is embedded and represented as an embedding vector.
In this step, each session sequence of the training set is first divided into input data and a label item; training in the word2vec manner on the training set then yields the initialized embedding-vector (item embedding) matrix W_emb of all items, and each session sequence is converted into a vector representation. It specifically comprises the following sub-steps:
S21, the last item of each session sequence is taken as the label item of the sequence, and the remaining items are taken as the input data.
This step takes the last item $x_t$ of each session sequence $x = \{x_1, x_2, \ldots, x_t\}$ as the label item y of the sequence and $\{x_1, x_2, \ldots, x_{t-1}\}$ as the input data.
S22, training in the word2vec manner is performed on the training set to obtain the initialized embedding-vector matrix W_emb of all items, which is used as a parameter matrix of the model with size N×M, where N is the total number of items and M is the dimension of the item embedding vectors.
This step first initializes the embedding-vector (item embedding) matrix W_emb of all items with size N×M, where N is the total number of items and M is the dimension of the item embedding vectors; here M = 50. In this embodiment, each item is regarded as a word and each session sequence as a sentence, and training in the word2vec manner on the training set yields the initialized embedding-vector matrix of all items.
S23, each item of each input session sequence is represented by its vector in the embedding matrix W_emb, forming the embedding-vector matrix X of the session sequence.
This step represents each item of the input session sequence $x = \{x_1, x_2, \ldots, x_t\}$ by its vector in the item embedding matrix W_emb, forming a new session-sequence matrix X of size t×50.
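A minimal sketch of this initialization, assuming the gensim implementation of word2vec (item IDs are cast to strings so that they can serve as words; items absent from the training corpus keep a random initialization):

    # Sketch of S22/S23: word2vec initialization of W_emb and the lookup
    # that turns a session into its embedding matrix X.
    import numpy as np
    from gensim.models import Word2Vec

    def init_item_embeddings(train_sessions, n_items, dim=50):
        sentences = [[str(i) for i in s] for s in train_sessions]  # session = sentence
        w2v = Word2Vec(sentences, vector_size=dim, min_count=1)
        W_emb = np.random.normal(scale=0.1, size=(n_items, dim)).astype(np.float32)
        for item in range(n_items):
            if str(item) in w2v.wv:
                W_emb[item] = w2v.wv[str(item)]      # N x M parameter matrix
        return W_emb

    # S23: the embedding-vector matrix X of one session (size t x 50) is a lookup:
    # X = W_emb[[10, 4, 7, 3]]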
S3, obtaining the hidden variable: the input data, represented as embedding vectors, are input into a recurrent neural network, and a regularized-flow learning algorithm and an attention mechanism are introduced to obtain the hidden variable.
First, the input data, represented as embedding vectors, are input into the recurrent neural network to obtain the hidden states and construct a first hidden variable, which is optimized by the regularized-flow algorithm to obtain the optimized first hidden variable; at the same time, a second hidden variable is obtained from the hidden states with the attention mechanism, and the optimized first hidden variable and the second hidden variable are combined to obtain the final hidden variable. The recurrent neural network adopted in this step is a GRU with a hidden-layer size of 100. The method comprises the following steps:
S31, the recurrent neural network is used to encode the input data, represented as embedding vectors, obtaining the hidden states h_t and constructing the first hidden variable z_0.

This step encodes the session-sequence matrix X, represented as embedding vectors, with the GRU; the update process of each layer is as follows:

update gate: $z_t = \sigma(W_z x_t + U_z h_{t-1})$;
reset gate: $r_t = \sigma(W_r x_t + U_r h_{t-1})$;
candidate activation: $\tilde{h}_t = \tanh(W x_t + U(r_t \odot h_{t-1}))$;
hidden state: $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$,

where h_t is the hidden state corresponding to the t-th item of each layer of the GRU.

The above update process generates the mean $\mu = W_\mu h_t + b_\mu$ and the variance $\sigma = W_\sigma h_t + b_\sigma$ of the distribution of the real sessions in the latent space; sampling from this distribution then gives the first hidden-variable representation of the latent space of each session sequence:

$z_0 = \mu + \sigma \odot \varepsilon$

Here $\sigma(\cdot)$ denotes the sigmoid activation function, x_t denotes the t-th item of each layer's input, $W_z$, $W_r$, $W$, $W_\mu$, $W_\sigma$, $U_z$, $U_r$, $U$, $b_\mu$, $b_\sigma$ all denote training parameters of the model, and $\varepsilon$ is a variable randomly sampled from the standard normal distribution.
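A minimal PyTorch sketch of step S31, under the assumption that the variance branch is parameterized as a log-variance (a common reparameterization convention; the class and variable names are illustrative):

    # Sketch of S31: GRU encoding plus reparameterization z0 = mu + sigma * eps.
    import torch
    import torch.nn as nn

    class SessionEncoder(nn.Module):
        def __init__(self, emb_dim=50, hid_dim=100, z_dim=100):
            super().__init__()
            self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
            self.W_mu = nn.Linear(hid_dim, z_dim)      # mu = W_mu h_t + b_mu
            self.W_sigma = nn.Linear(hid_dim, z_dim)   # log sigma^2 = W_sigma h_t + b_sigma

        def forward(self, x):                          # x: (batch, t, emb_dim)
            h_all, h_t = self.gru(x)                   # h_all: every h_i; h_t: last hidden state
            h_t = h_t.squeeze(0)
            mu, logvar = self.W_mu(h_t), self.W_sigma(h_t)
            eps = torch.randn_like(mu)                 # eps ~ N(0, I)
            z0 = mu + torch.exp(0.5 * logvar) * eps    # first hidden variable z0
            return h_all, z0, mu, logvar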
S32, a set of invertible functions $f = [f_1, \ldots, f_K]$ is defined, and the first hidden variable z_0 obtained in step S31 is input into this set of invertible functions for transformation, learning the optimized first hidden variable z_K.

Variational inference is commonly employed in the art to approximate the true posterior distribution. For a given session sequence, its log probability decomposes as follows:

$\log p_\theta(x) = \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - \mathrm{KL}\big(q_\phi(z|x)\,\|\,p(z)\big) + \mathrm{KL}\big(q_\phi(z|x)\,\|\,p_\theta(z|x)\big)$

where $p_\theta(x|z)$ is the true distribution of the input session-sequence data, p(z) is the pre-assumed prior distribution of the hidden variable, usually set to a standard normal distribution, $q_\phi(z|x)$ is the posterior distribution to be learned, which approximates the true distribution of the hidden variable, z is the hidden variable, x is the input session-sequence data, and $\phi$ and $\theta$ are the parameters of the probability distributions.

None of the existing methods completely removes the deviation between the approximate and the true posterior, which is mainly caused by the term $\mathrm{KL}(q_\phi(z|x)\,\|\,p_\theta(z|x))$ and the inaccuracy of the assumption made for the prior distribution. The regularized-flow method used here alleviates this deviation well and further improves the efficiency of approximating the true posterior distribution in variational inference.
The process of realizing the regularized flow in this embodiment is as follows: a set of invertible functions $f = [f_1, \ldots, f_K]$ is defined, and the first hidden variable z_0 obtained in the previous step is input into this set of invertible functions for transformation, learning

$z_K = f_K(f_{K-1}(\cdots f_1(z_0)))$

where each invertible function may be called a flow; here K is set to 16. Taking the invertible function called planar as an example, f can be defined as

$f(z) = z + u \tanh(w^{\top} z + b)$

where $u, w \in \mathbb{R}^d$, $b \in \mathbb{R}$, d denotes the dimension of u and w, tanh is the activation function, and u, w, b all denote parameters of the invertible function to be learned. Through the operation of this set of invertible functions, the learned optimized first hidden variable z_K is closer to the true latent-layer data distribution; the distribution of z_K is learned as follows:
$\log q_K(z_K) = \log q_0(z_0) - \sum_{k=1}^{K} \log\Big|\det \frac{\partial f_k}{\partial z_{k-1}}\Big|$

Here $\det \frac{\partial f_k}{\partial z_{k-1}}$ is the Jacobian determinant given by the theory of invertible functions; for the planar flow, $\big|\det \frac{\partial f_k}{\partial z_{k-1}}\big| = \big|1 + u^{\top}\psi(z_{k-1})\big|$ with $\psi(z) = \tanh'(w^{\top} z + b)\, w$. The distribution $q_K(z_K)$ obtained by passing an initialized Gaussian distribution $q_0(z_0)$ through the set of invertible functions is called the regularized flow. Furthermore, with the regularized-flow process, the approximate posterior distribution that needs to be trained is $q_\phi(z|x) = q(z_K)$, where $z_0 \sim q_\phi(z_0|x)$ is a Gaussian distribution.
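A minimal PyTorch sketch of a planar flow layer and a K-step regularized flow (illustrative class names; the invertibility constraint on u and w is omitted for brevity):

    # Sketch of S32: planar flow f(z) = z + u * tanh(w^T z + b) with its
    # log|det| term |1 + u^T psi(z)|, composed K times.
    import torch
    import torch.nn as nn

    class PlanarFlow(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.u = nn.Parameter(torch.randn(dim) * 0.1)
            self.w = nn.Parameter(torch.randn(dim) * 0.1)
            self.b = nn.Parameter(torch.zeros(1))

        def forward(self, z):                           # z: (batch, dim)
            lin = z @ self.w + self.b                   # w^T z + b
            f_z = z + self.u * torch.tanh(lin).unsqueeze(-1)
            psi = (1 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w  # tanh'(.) w
            logdet = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
            return f_z, logdet

    class RegularizedFlow(nn.Module):
        def __init__(self, dim, K=16):
            super().__init__()
            self.flows = nn.ModuleList(PlanarFlow(dim) for _ in range(K))

        def forward(self, z0):
            z, logdet_sum = z0, 0.0
            for f in self.flows:                        # z_K = f_K(...f_1(z_0))
                z, logdet = f(z)
                logdet_sum = logdet_sum + logdet
            return z, logdet_sum                        # z_K and the summed log|det|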
S33, the second hidden variable c is obtained based on the variational attention mechanism.

First, the probability distribution $\alpha_{it}$ of the different items in the session sequence with respect to the current output of the recurrent encoder is computed as

$\alpha_{it} = \dfrac{\exp(h_i W^{\top} h_t^{\top})}{\sum_{j=1}^{t} \exp(h_j W^{\top} h_t^{\top})}$

where h_i denotes the hidden state of the i-th item of each layer of the recurrent neural network, i = 1, 2, ..., t; h_t denotes the hidden state of the t-th item, obtained in step S31; and $W^{\top}$ is the transpose of the parameter matrix W to be learned for the session sequence.

Then, the attention vector c is computed by weighted summation of the inputs and taken as the second hidden variable:

$c = \sum_{i=1}^{t} \alpha_{it} h_i$
S34, the optimized first hidden variable obtained in step S32 and the second hidden variable obtained in step S33 are spliced together, i.e., the second hidden variable c is appended to the optimized first hidden variable z_K, to obtain the final hidden variable m.
S4, obtaining the first loss: the final hidden variable is input into the classifier, which outputs the user's next predicted item as the predicted item; the cross-entropy loss is computed from the predicted item and the label item and taken as the first loss.
The classifier adopted in this step consists of a fully connected network. The combined final hidden variable m is input into the classifier, which outputs the predicted next click item of the user as the predicted item $\hat{y}$. The cross-entropy loss is then computed from the sequence label item y obtained in the earlier division (i.e., the last item of each session sequence) and the predicted item $\hat{y}$:

$L_1 = -\sum_{j=1}^{N} y_j \log \hat{y}_j$

and taken as the first loss, where N is the total number of items.
S5, obtaining the second loss: the optimized first hidden variable z_K of the regularized flow is input into the decoder to obtain the second loss $L_2$.
The decoder adopted in this step is a GRU. Inputting the optimized first hidden variable z_K of the regularized flow into the decoder restores the original input session sequence, and in this process the true distribution of the hidden variable can be learned. The second loss constructed in this step is:
$L_2 = \mathbb{E}_{q_\phi(z|x)}\big[-\log p_\theta(x \mid z_K)\big] + \mathrm{const} - \beta\, \mathbb{E}_{q_0(z_0)}\Big[\sum_{k=1}^{K} \log\Big|\det \frac{\partial f_k}{\partial z_{k-1}}\Big|\Big]$

The posterior distribution $q_\phi(z|x)$ is used to approximate the true distribution $p_\theta(x, z)$ of the hidden variable z, with $q_\phi(z|x)$ defined as $q(z_K)$; $p_\theta(x, z)$ denotes the true distribution of the input original session sequence; z and x denote the hidden variable of the regularized-flow model and the input session-sequence data, respectively; and $\theta$, $\phi$ denote the parameters of the probability distributions. In the above formula, the first term denotes the reconstruction loss, the second term is a constant, and the third term represents the regularized flow; $\beta$ is a coefficient used to adjust the degree of influence of the regularized flow on the prediction model, and is set to 0.2 in this step.
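A minimal PyTorch sketch of the two losses; the concrete reconstruction term (mean-squared error against the embedded inputs) and the grouping of the constant are assumptions made for illustration:

    # Sketch of S4/S5: first (cross-entropy) loss and second (flow ELBO) loss.
    import torch
    import torch.nn.functional as F

    def first_loss(logits, labels):
        # L1: cross entropy between the predicted item distribution and the label item
        return F.cross_entropy(logits, labels)

    def second_loss(x_recon, x, logvar, logdet_sum, beta=0.2):
        recon = F.mse_loss(x_recon, x)                  # reconstruction loss (assumed MSE)
        log_q0 = -0.5 * (1 + logvar).sum(-1).mean()     # E[log q0(z0)] up to a constant
        flow = logdet_sum.mean()                        # regularized-flow log-det term
        return recon + log_q0 - beta * flow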
S6, the first loss and the second loss are added to generate the final total loss.
This step adds $L_1$ and $L_2$ together to generate the final total loss L.
S7, steps S2-S6 are repeated, minimizing the total loss, to obtain the item recommendation model.
The purpose of this step is to minimize the total loss. The first loss $L_1$ can be minimized by gradient descent. The second loss $L_2$ can be minimized by maximizing the evidence lower bound (ELBO):

$\mathrm{ELBO} = \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - \mathrm{KL}\big(q_\phi(z|x)\,\|\,p(z)\big)$

To this end, the learning rate used in the item recommendation model training process is 0.001, and batch training is adopted with 512 session sequences of the data set per batch. If the total loss is too large, the training situation can be improved by adjusting the dimension of the item embedding vectors and the dimension of the output vector of the GRU model.
When the total loss L becomes stable, the training of the item recommendation model is finished.
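A minimal sketch of this batch training (PyTorch; the choice of the Adam optimizer and the stopping test are assumptions, since the text fixes only the learning rate 0.001 and the batch size 512). It reuses the train_step sketch given after the step overview above:

    # Sketch of S7: mini-batch training until the total loss stabilizes.
    import torch
    from torch.utils.data import DataLoader

    def train(model, dataset, epochs=20, tol=1e-4):
        loader = DataLoader(dataset, batch_size=512, shuffle=True)
        opt = torch.optim.Adam(model.parameters(), lr=0.001)
        prev = float('inf')
        for epoch in range(epochs):
            total = sum(train_step(model, batch, opt) for batch in loader)
            mean_loss = total / len(loader)
            print(f"epoch {epoch}: mean total loss {mean_loss:.4f}")
            if abs(prev - mean_loss) < tol:             # loss L has stabilized
                break
            prev = mean_loss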
Application example
For the test sets of the two experimental data sets, the item recommendation models trained in the embodiment are used to perform item recommendation according to the following steps:
S1', preprocessing the data: each item of the input data in the session-based sequence is embedded and represented as an embedding vector.
To carry out the test on the test sets, this application example still divides the session-based sequences of the test set into input data and label items, and embeds each item of the input data as an embedding vector. The processing of this step may refer to step S2 of the model training process.
The label item is used to judge the prediction accuracy in step S2'.
In practical applications, when the trained item recommendation model is used to predict the user's next item, the session-based sequence does not need to be divided into input data and a label item; the whole sequence is processed as input data.
S2', obtaining the user's predicted item: the session sequence, represented as embedding vectors, is input into the corresponding item recommendation model trained in the embodiment to obtain the user's next predicted item.
This step assigns a score to each item with the scoring-matrix method and recommends items according to the scores, which further improves the accuracy and efficiency of item recommendation while obtaining the recommended items. It comprises the following steps:
S21', the session sequence, represented as embedding vectors, is input into the recurrent neural network, and the regularized-flow learning algorithm and the attention mechanism are introduced to obtain the hidden variable m.
The processing of this step may refer to step S3 in the model training process.
S22', the scoring matrix $S = W_{emb}^{\top} B m$ is generated from the obtained hidden variable m; this scoring matrix S contains a score for each item, where $W_{emb}^{\top}$ denotes the transpose of W_emb, B denotes a training parameter of the model, and m is the final hidden variable. The top n (n = 20) highest-scoring items in S are selected as the final prediction result; if the user's true label item is among these 20 items, the prediction is considered correct. In practical applications, 20 items are recommended to the user each time, and the recommendation is successful if the user is interested in one of them.
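A minimal PyTorch sketch of this scoring step (the matrix orientations are assumptions chosen so that one score per item results):

    # Sketch of S22': score all N items and return the top-20 indices.
    import torch

    def recommend(W_emb, B, m, n=20):
        # W_emb: (N, M) item embeddings; B: (M, d) trained parameter;
        # m: (d,) final hidden variable  ->  scores: (N,)
        scores = W_emb @ (B @ m)
        return torch.topk(scores, n).indices          # the n highest-scoring items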
The prediction results of the above item recommendation model (VASER) on the test sets are shown in Table 2.
To further illustrate the prediction effect of the session-based item recommendation method realized with a variational regularized flow provided by the invention, this application example also trains session-based item recommendation models on the two experimental data sets with six baseline methods (Item-KNN, GRU4REC, NARM, STAMP, ReLaVaR, and CRM), and then uses these six models to predict the next item to be clicked for the users in the test sets; their prediction results are shown in Table 2. The index Recall@20 denotes the proportion of test cases whose true label item appears among the first 20 items of the prediction result; Recall@20 does not consider the actual ranking of the items, and the prediction counts as successful as long as the true label item is among the first 20 predicted items. The second index, MRR@20 (Mean Reciprocal Rank), is an internationally common mechanism for evaluating ranking results: the first 20 items are ranked from high to low by probability and the predicted items are checked one by one; if the correct item is ranked first the score is 1, if it is ranked second the score is 0.5, and if it is ranked n-th the score is 1/n; if no correct item appears, the score is 0. The final MRR@20 score is the average of these scores over all sequence predictions in the test set. MRR takes the ranking of the predicted items into account, which matters in problems where the recommendation order is important.
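The two metrics can be computed directly from their definitions above; a minimal sketch in plain Python (ranked_lists holds the top-20 predictions per test sequence as Python lists):

    # Sketch of the evaluation metrics Recall@20 and MRR@20.
    def recall_at_20(ranked_lists, labels):
        hits = sum(1 for ranked, y in zip(ranked_lists, labels) if y in ranked[:20])
        return hits / len(labels)

    def mrr_at_20(ranked_lists, labels):
        score = 0.0
        for ranked, y in zip(ranked_lists, labels):
            if y in ranked[:20]:
                score += 1.0 / (ranked[:20].index(y) + 1)   # rank n scores 1/n
        return score / len(labels)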
Table 2: performing a session-based sequential item recommendation model result on two sub data sets
The rest of the methods in the table are described below:
Item-KNN: it is an item-to-item model that recommends similar items to those previously visited based on cosine similarity. Reference may be made to the paper: badrul Munir Sarwar, George Karysis, Joseph A Konstan, and John Riedl.2001.item-based collectible filtering correspondence in WWW.
GRU4 REC: this is a cyclic neural network (RNN) based deep learning model for session-based recommendations. It uses a GRU unit to capture sequence information and uses parallel small batch processing skills and an order-based loss function in the training process. Reference may be made to the paper: bal, zsHidasi, Alexandros Karatzoglou, LinasBaltrunas, and Domonkos Tikk.2016.Session-based Recommendations with recovery Neural networks.
NARM: it is a Recurrent Neural Network (RNN) based model that takes the main information from hidden variables using an attention mechanism and combines it with sequence information to generate recommended items. Reference may be made to the paper: sting Li, Pengjie Ren, Zhumin Chen, Zhuochun Ren, Tao Lian, and Jun Ma.2017.neural Interactive Session-based recommendation. in CIKM.
STAMP: it is a priority model that captures the user's general interests from the long-term memory of the session context and the current interests from the short-term memory of the most recently clicked item. Reference may be made to the paper: qiao Liu, Yifu Zeng, RefuoBroksi, and Haibin Zhang.2018 StaMP, ShortTermtentintent/Memory Priority Model for Session-based recommendation. InKDD.
ReLaVaR, a Bayesian version of GRU4Rec, considers the recursion unit of the network as a random hidden variable with a certain prior distribution, and deduces the corresponding posterior distribution to predict the next click item. The method is an SBR-based project-level variation inference method which takes an independent Gaussian function as the prior distribution of projects. Reference may be made to the paper: sotirios P Chatzis, Panayotis Christodou, and Andrea S Androeou.2017. Current tension Variable Networks for Session-Based Recommendation. in DLRS @ RecSys.
CRM, which is a recently proposed method to directly apply VAEs to session-based recommendations. Unlike the project-level variation method ReLaVaR, VRM belongs to a session-level variation inference method. Reference may be made to the paper: zhitao Wang, Chengyao Chen, Ke Zhang, Yu Lei, and Wenjie Li.2018.variational Recurrent Model for Session-based recommendation. in CIKM.
As can be seen from the prediction results in Table 2, the accuracy of the prediction of the conversation-based item recommendation method implemented by using the regularized stream of the variation provided by the invention is higher than that of the existing methods.
To explain why the variational regularized flow improves the prediction results, this application example visually compares the hidden variables of the VASER model provided by the embodiment with the hidden variables of a general method, as shown in FIG. 2. It can be seen from FIG. 2 that the hidden variables learned by the VASER model are more dispersed, which is why the prediction effect of VASER is better than that of the other methods.
To illustrate the influence of the session length on the prediction results, this application example divides the test set into several subsets according to the session length and then predicts the next item to be clicked for the user on these test subsets with the NARM, STAMP, and VASER models; the prediction results (Recall@20) are shown in FIG. 3. It can be seen from the figure that the prediction accuracy decreases as the session-sequence length increases; however, the dependence of the VASER model provided by the invention on the session length is slightly weaker than that of the other models, which shows that the VASER model has better prediction stability.
To illustrate the influence of β on the prediction results, β of the VASER model in this application example is set to 0, 0.1, 0.2, 0.5, and 1; models are then trained on the training set of the YOOCHOOSE 1/64 data set according to the method given in the embodiment, and the trained VASER models are used on the test set of the YOOCHOOSE 1/64 data set to predict the next item to be clicked for the user. The prediction results (Recall@20) are shown in FIG. 4; as can be seen from the figure, the effect is best when β = 0.2.
To illustrate the influence of the parameter K of the regularized flow on the prediction results, K of the VASER model in this application example is set to 2^1, 2^2, 2^3, 2^4, and 2^5; models are then trained on the training sets of the YOOCHOOSE 1/64 and YOOCHOOSE 1/4 data sets according to the method given in the embodiment, and the trained VASER models are used on the corresponding test sets to predict the next item to be clicked for the user. The prediction results (Recall@20) are shown in FIG. 5; as can be seen from the figure, the prediction accuracy increases with K, but the training time also grows with K, so K is taken as 2^4 in the invention.
In summary, the session-based item recommendation method realized with a variational regularized flow can effectively handle the lack of user information in the data set, and the model adapts well to long session sequences. The added regularized flow is good at compensating for the errors of previous variational inference methods (e.g., the VAE), so that the model can learn the true distribution of the session sequences, which is very helpful for improving the prediction accuracy.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to assist the reader in understanding the principles of the invention, and that the scope of the invention is not limited to the specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and such changes and combinations remain within the scope of the invention.