CN108549729A

CN108549729A - Personalized user collaborative filtering recommending method based on Covering reduct

Info

Publication number: CN108549729A
Application number: CN201810486715.0A
Authority: CN
Inventors: 张志鹏; 任永功; 邹丽; 崔晓松
Original assignee: Liaoning Normal University
Current assignee: Dalian Houren Technology Co ltd
Priority date: 2018-05-21
Filing date: 2018-05-21
Publication date: 2018-09-18
Anticipated expiration: 2038-05-21
Also published as: CN108549729B

Abstract

The invention discloses a personalized user collaborative filtering recommendation method based on covering reduction. It explicitly defines the concept of a redundant user of a target user and, using the ability of covering reduction in covering-based rough sets to remove redundant elements, removes the target user's redundant users. This safeguards the quality of the target user's neighbors, whose rating information is then used to provide accurate and diverse personalized recommendations for the target user.

Description

Personalized user collaborative filtering recommendation method based on covering reduction

Technical field

The invention relates to the field of recommender systems, and in particular to a personalized user collaborative filtering recommendation method based on covering reduction that improves recommendation accuracy while keeping recommendations diverse.

Background

A recommender system infers a user's interests or needs from the user's personal information and delivers high-quality recommendations, which effectively mitigates the "information overload" problem. User-based collaborative filtering is one of the most widely used and successful techniques in the field. It assumes that users who had similar preferences in the past are likely to have similar preferences in the future, and it is simple to compute, efficient, and accurate. In existing user-based collaborative filtering algorithms, however, the neighbors of a target user tend to share the same preferences, so the items that receive high predicted ratings from these neighbors are concentrated in a few categories, or are simply popular items. As a result, the diversity of the recommendations is often unsatisfactory.

Summary of the invention

The invention aims to solve the above technical problem of the prior art by providing a personalized user collaborative filtering recommendation method based on covering reduction that improves recommendation accuracy while keeping recommendations diverse.

The technical solution of the invention is a personalized user collaborative filtering recommendation method based on covering reduction, characterized in that the following steps are carried out in order:

Step 1. Build the two-dimensional rating table:

From the users' ratings of items, form the two-dimensional rating table RM = {U, I, R∪{*}}. In RM, U is the set of users, I is the set of items, and R∪{*} is the set of ratings given by users to items, where * indicates that the user has not rated the item.

Let the rating of user $u \in U$ for item $i \in I$ be $r_{u,i} \in R \cup \{*\}$, and let $\bar{r}_u$ be the average rating of user $u$. $\theta$ is the rating threshold: if $r_{u,i} \ge \theta$, user $u$ likes item $i$. The set of items rated by user $u$ is $I_u = \{i \in I \mid r_{u,i} \ne *\}$, and $I_u^c = I - I_u$ is the set of items not rated by user $u$. The item attribute matrix is AM. In the user set U, if the set of items liked by user a is contained in the set of items liked by user b, then user a is called a redundant user of the target user.
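The Step 1 definitions can be sketched in Python as follows. This is only an illustration under stated assumptions: the rating table is stored as a nested dictionary in which a missing key plays the role of `*`, the ratings are hypothetical values (they are not the contents of Table 1), and the helper names `liked_items` and `is_redundant` are introduced here and do not come from the source.

```python
# Hypothetical ratings (NOT Table 1); a missing key plays the role of "*".
RM = {
    "a": {"i1": 4, "i2": 2, "i3": 5},
    "b": {"i1": 5, "i3": 3, "i4": 4},
    "c": {"i2": 4, "i4": 1},
}
THETA = 3  # rating threshold: r_{u,i} >= THETA means "user u likes item i"


def liked_items(ratings, theta=THETA):
    """Return the items rated at or above the threshold."""
    return {item for item, r in ratings.items() if r >= theta}


def is_redundant(a, b, rm=RM):
    """User a is redundant with respect to user b if a's liked set
    is contained in b's liked set."""
    return a != b and liked_items(rm[a]) <= liked_items(rm[b])


likes = {u: liked_items(r) for u, r in RM.items()}
print(likes["a"], likes["b"], is_redundant("a", "b"))
```

With these made-up ratings, user `a` likes {i1, i3}, which is contained in the set liked by `b`, so `a` is redundant with respect to `b` but not the other way round.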

Step 2. Remove redundant users with the covering reduction algorithm:

Step 2.1. Take the item set I as the universe I. In the universe I, the items liked by each user form a set. Extract the target user's favorite attributes from the item attribute matrix AM:

$$[at_1 = av_1] \wedge [at_2 = av_2] \wedge \dots \wedge [at_m = av_m] \tag{1}$$

In formula (1), $m$ is the number of attributes, $at_m$ is an attribute, and $av_m$ is the value of attribute $at_m$.

Step 2.2. Use the target user's favorite attributes to build the target user's decision set D, which consists of the items that have the favorite attributes:

$$D = \{\, i \in I \mid at_1(i) = av_1,\ at_2(i) = av_2,\ \dots,\ at_m(i) = av_m \,\} \tag{2}$$

In formula (2), $at_m(i) = av_m$ means that the value of item $i$ on attribute $at_m$ is $av_m$.

Step 2.3. Shrink the universe from the item set I to the target user's decision set D, that is, the universe D. For each user $u \in U$, build the set $C_u$ of items on the universe D that user $u$ likes:

$$C_u = \{\, i \in D \mid r_{u,i} \ge \theta \,\} \tag{3}$$

Let $C^* = D - \bigcup C_u$. Then $C = \{C_1, C_2, \dots, C_n, C^*\}$ is a covering C of the universe D for the target user.

Step 2.4. Use the covering reduction algorithm to remove the redundant elements from the covering C, obtaining the reduced covering reduct(C) and the reduced user set $U^r$:

$$U^r = \{\, u \in U \mid C_u \in \mathrm{reduct}(C) \,\} \tag{4}$$

Step 3. Take the reduced users $U^r$ as the candidate neighbors $u$ of the target user $au$.

Step 4. Compute the similarity between the target user and each candidate neighbor, and select the target user's neighbors:

Use the Pearson similarity function (5) to compute the similarity between the target user $au$ and each candidate neighbor $u \in U^r$:

$$\mathrm{sim}(au,u) = \frac{\sum_{i \in I_{au} \cap I_u} (r_{au,i} - \bar{r}_{au})(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i \in I_{au} \cap I_u} (r_{au,i} - \bar{r}_{au})^2}\ \sqrt{\sum_{i \in I_{au} \cap I_u} (r_{u,i} - \bar{r}_u)^2}} \tag{5}$$

In formula (5), $\mathrm{sim}(au,u)$ is the similarity between the target user $au$ and the candidate neighbor $u \in U^r$, $I_{au} = \{i \in I \mid r_{au,i} \ne *\}$ is the set of items rated by the target user $au$, and $\bar{r}_{au}$ is the target user's average rating.

Then select the $k$ candidate neighbors with the highest similarity as the neighbors $N_{au}(k)$ of the target user.

Step 5. Predict ratings for the items the target user has not rated:

Using the ratings of the target user's neighbors $N_{au}(k)$, apply the adjusted weighted sum function (6) to predict a rating for each item in the set $I_{au}^c$ of items not rated by the target user $au$, obtaining the target user's predicted rating table:

$$P_{au,i} = \bar{r}_{au} + \lambda \sum_{u \in N_{au}(k) \cap U_i} \mathrm{sim}(au,u)\,(r_{u,i} - \bar{r}_u) \tag{6}$$

In formula (6), $P_{au,i}$ is the predicted rating of the target user $au$ for item $i$, $U_i = \{u \in U \mid r_{u,i} \ne *\}$ is the set of users who have rated item $i$, and $\lambda$ is a normalization factor:

$$\lambda = \frac{1}{\sum_{u \in N_{au}(k) \cap U_i} |\mathrm{sim}(au,u)|} \tag{7}$$

Step 6. Select the N items with the highest predicted ratings as the recommendation result.

The invention explicitly defines the concept of a redundant user of a target user. Using the ability of covering reduction in covering-based rough sets to remove redundant elements, it removes the target user's redundant users and thereby safeguards the quality of the target user's neighbors. The rating information of these high-quality neighbors is then used to provide accurate and diverse personalized recommendations for the target user.

Description of the drawings

Fig. 1 is a flow chart of an embodiment of the invention.

Fig. 2 shows the accuracy metrics (MAE and RMSE) of the embodiment and of the comparative example as the number of neighbors of the target user varies.

Fig. 3 shows the diversity metric (Coverage) of the embodiment and of the comparative example as the number of neighbors of the target user varies.

Detailed description

The personalized user collaborative filtering recommendation method based on covering reduction of the invention is carried out in the following steps, as shown in Fig. 1:

For example, let the user set be U = {user 1, user 2, user 3, target user} and the item set be I = {item 1, item 2, item 3, item 4, item 5, item 6}, and let the ratings R take values in [1, 5]. The two-dimensional rating table RM is then as shown in Table 1:

Table 1

Let the rating threshold be 3, so that items rated 3 or higher count as a user's favorite items. From Table 1:

User 1's favorite items are {item 2, item 4, item 6};

User 2's favorite items are {item 4, item 6};

User 3's favorite items are {item 2, item 3, item 6};

The target user's favorite items are {item 1, item 3, item 4};

$$[at_1 = av_1] \wedge [at_2 = av_2] \wedge \dots \wedge [at_m = av_m] \tag{1}$$

For example, take the item set I = {item 1, item 2, item 3, item 4, item 5, item 6} as the universe. Table 2 gives the item attribute matrix AM. From Table 2 and the target user's favorite items, count how many of the favorite items have each attribute:

Comedy = 3, Thriller = 2, Action = 1, Drama = 1, Music = 1.

Taking the two attributes with the largest counts as the favorite attributes, the target user's favorite attributes are:

[Comedy = 1] ∧ [Thriller = 1] ∧ [Action = 0] ∧ [Drama = 0] ∧ [Music = 0]

Table 2

          Comedy  Thriller  Action  Drama  Music
Item 1    1       0         1       1      0
Item 2    1       1         0       1      0
Item 3    1       1         0       0      0
Item 4    1       1         0       0      1
Item 5    0       0         1       1      0
Item 6    1       1         1       0      1
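The attribute counting of this example can be reproduced with a short Python sketch. The attribute matrix is copied from Table 2 and the favorite items from the example; the variable names (`ATTRS`, `AM`, `counts`, `favorite`) and the use of `collections.Counter` are choices made for this illustration, not part of the source.

```python
from collections import Counter

ATTRS = ["comedy", "thriller", "action", "drama", "music"]
AM = {  # item attribute matrix, rows taken from Table 2
    1: [1, 0, 1, 1, 0],
    2: [1, 1, 0, 1, 0],
    3: [1, 1, 0, 0, 0],
    4: [1, 1, 0, 0, 1],
    5: [0, 0, 1, 1, 0],
    6: [1, 1, 1, 0, 1],
}
target_likes = {1, 3, 4}  # the target user's favorite items

# Count, for each attribute, how many favorite items carry it.
counts = Counter()
for item in target_likes:
    for attr, value in zip(ATTRS, AM[item]):
        counts[attr] += value

# Keep the two attributes with the largest counts as favorite attributes.
top2 = [attr for attr, _ in counts.most_common(2)]
favorite = {attr: int(attr in top2) for attr in ATTRS}
print(dict(counts), favorite)
```

The counts agree with the values stated in the example (comedy 3, thriller 2, and 1 for each of the others), and `favorite` sets comedy and thriller to 1 and the remaining attributes to 0.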

$$D = \{\, i \in I \mid at_1(i) = av_1,\ at_2(i) = av_2,\ \dots,\ at_m(i) = av_m \,\} \tag{2}$$

In formula (2), $at_m(i) = av_m$ means that the value of item $i$ on attribute $at_m$ is $av_m$.

For example, the target user's favorite attributes [Comedy = 1] ∧ [Thriller = 1] ∧ [Action = 0] ∧ [Drama = 0] ∧ [Music = 0] are used to build the target user's decision set D, which consists of all items that have both the Comedy and Thriller attributes.

From Table 2, the decision set is D = {item 2, item 3, item 4, item 6}.
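A minimal sketch of how the decision set of this example can be computed. It follows the worked example rather than a literal reading of formula (2): an item is kept if it has every favorite attribute set to 1, and attributes whose favorite value is 0 are ignored (item 2 has Drama = 1 and is still in D). That reading is an assumption of this sketch, as are the variable names.

```python
ATTRS = ["comedy", "thriller", "action", "drama", "music"]
AM = {  # item attribute matrix, rows taken from Table 2
    1: [1, 0, 1, 1, 0],
    2: [1, 1, 0, 1, 0],
    3: [1, 1, 0, 0, 0],
    4: [1, 1, 0, 0, 1],
    5: [0, 0, 1, 1, 0],
    6: [1, 1, 1, 0, 1],
}
favorite = {"comedy": 1, "thriller": 1, "action": 0, "drama": 0, "music": 0}

# Column indices of the attributes the target user favors.
required = [ATTRS.index(a) for a, v in favorite.items() if v == 1]

# Decision set: items that have all favored attributes.
D = {item for item, row in AM.items() if all(row[j] == 1 for j in required)}
print(sorted(D))
```

Item 1 is excluded because it lacks the thriller attribute and item 5 because it lacks both, which leaves items 2, 3, 4 and 6.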

Step 2.3. To remove as many of the target user's redundant users as possible, shrink the universe from the item set I to the target user's decision set D, that is, the universe D. For each user $u \in U$, build the set $C_u$ of items on the universe D that user $u$ likes:

$$C_u = \{\, i \in D \mid r_{u,i} \ge \theta \,\} \tag{3}$$

For example, $C_1$ = {item 2, item 4, item 6};

$C_2$ = {item 4, item 6};

$C_3$ = {item 2, item 3, item 6};

Then $C = \{C_1, C_2, C_3\}$ is a covering C of the target user's decision set D.
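The construction of the covering can be sketched as follows, using the favorite item sets and the decision set from this example. The dictionary layout and the names `cover` and `c_star` are assumptions of the sketch.

```python
D = {2, 3, 4, 6}  # decision set of the target user
likes = {          # favorite items of the other users, from the example
    "user1": {2, 4, 6},
    "user2": {4, 6},
    "user3": {2, 3, 6},
}

# C_u: the items user u likes, restricted to the universe D.
cover = {u: s & D for u, s in likes.items()}

# C*: items of D liked by none of the users; it completes the covering.
c_star = D - set().union(*cover.values())

# The sets C_u together with C* must cover D.
is_covering = set().union(*cover.values(), c_star) == D
print(cover, c_star, is_covering)
```

In this example the three sets already cover D, so `c_star` is empty.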

Step 2.4. Use the covering reduction algorithm to remove the redundant elements from the covering C, obtaining the reduced covering reduct(C). Once the redundant elements have been removed, all redundant users of the target user have been deleted, which gives the reduced user set $U^r$:

$$U^r = \{\, u \in U \mid C_u \in \mathrm{reduct}(C) \,\} \tag{4}$$

Since $C_2 \subset C_1$, the covering reduction algorithm treats $C_2$ as a redundant element and removes it from the covering C, so reduct(C) = {$C_1$, $C_3$}. User 2 is accordingly a redundant user of the target user and is removed, so the reduced user set is $U^r$ = {user 1, user 3}.
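The removal of redundant elements in this example can be sketched as below. The sketch implements only the containment rule used in the example (a set is dropped when it is a proper subset of another user's set); the choice of a strict subset, so that two users with identical sets do not eliminate each other, is an assumption, and `reduce_cover` is a hypothetical helper name.

```python
def reduce_cover(cover):
    """Drop every user whose set is a proper subset of another user's set."""
    return {
        u: s
        for u, s in cover.items()
        if not any(s < t for v, t in cover.items() if v != u)
    }


cover = {
    "user1": {2, 4, 6},
    "user2": {4, 6},      # proper subset of user1's set, hence redundant
    "user3": {2, 3, 6},
}
reduct = reduce_cover(cover)
reduced_users = set(reduct)
print(reduced_users)
```

Only user 2 is removed, which matches the reduced user set {user 1, user 3} of the example.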

Step 3. Take the reduced users $U^r$ as the candidate neighbors $u$ of the target user $au$; that is, the target user's candidate neighbors are {user 1, user 3}.

Use the Pearson similarity function (5) to compute the similarity between the target user $au$ and each candidate neighbor $u \in U^r$:

$$\mathrm{sim}(au,u) = \frac{\sum_{i \in I_{au} \cap I_u} (r_{au,i} - \bar{r}_{au})(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i \in I_{au} \cap I_u} (r_{au,i} - \bar{r}_{au})^2}\ \sqrt{\sum_{i \in I_{au} \cap I_u} (r_{u,i} - \bar{r}_u)^2}} \tag{5}$$

That is, the Pearson similarity function is used to compute the similarity between the target user and user 1, and between the target user and user 3:

sim(target user, user 1) = -0.76

sim(target user, user 3) = -0.53

If the two candidate neighbors with the highest similarity are selected as the target user's neighbors, then $N_{\text{target user}}(2)$ = {user 3, user 1}.
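The similarity computation and neighbor selection can be sketched in Python as follows. The sketch assumes the common form of the Pearson similarity in collaborative filtering: sums run over the items both users have rated, and each user's mean is taken over all of that user's ratings. The ratings are hypothetical (Table 1 is not reproduced here, so the values -0.76 and -0.53 are not recomputed), and `pearson` and `top_k_neighbors` are names introduced for the illustration.

```python
from math import sqrt


def mean(ratings):
    """Average of all ratings a user has given."""
    return sum(ratings.values()) / len(ratings)


def pearson(ra, rb):
    """Pearson similarity over the items rated by both users."""
    common = ra.keys() & rb.keys()
    ma, mb = mean(ra), mean(rb)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    da = sqrt(sum((ra[i] - ma) ** 2 for i in common))
    db = sqrt(sum((rb[i] - mb) ** 2 for i in common))
    if da == 0 or db == 0:
        return 0.0  # no variation on the common items: treat as uncorrelated
    return num / (da * db)


def top_k_neighbors(target, candidates, k):
    """Return the k candidate users most similar to the target user."""
    sims = {u: pearson(target, r) for u, r in candidates.items()}
    return sorted(sims, key=sims.get, reverse=True)[:k]


# Hypothetical ratings, for illustration only.
target = {1: 5, 2: 3, 3: 1}
candidates = {"x": {1: 4, 2: 3, 3: 2}, "y": {1: 1, 2: 3, 3: 5}}
print(top_k_neighbors(target, candidates, 1))
```

Here candidate `x` rates in the same direction as the target and `y` in the opposite direction, so `x` is selected as the single nearest neighbor.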

$$P_{au,i} = \bar{r}_{au} + \lambda \sum_{u \in N_{au}(k) \cap U_i} \mathrm{sim}(au,u)\,(r_{u,i} - \bar{r}_u) \tag{6}$$

$$\lambda = \frac{1}{\sum_{u \in N_{au}(k) \cap U_i} |\mathrm{sim}(au,u)|} \tag{7}$$

Using the ratings of the target user's neighbors, the adjusted weighted sum function predicts ratings for item 5 and item 6, which the target user has not rated. The results are:

$P_{\text{target user, item 5}}$ = 2.16

$P_{\text{target user, item 6}}$ = 4.93
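A minimal sketch of the adjusted weighted sum prediction, assuming its usual form: the target user's mean rating plus the similarity-weighted deviations of the neighbors from their own means, normalized by the sum of absolute similarities. The neighbor data are hypothetical, so the values 2.16 and 4.93 above are not recomputed, and the function name `predict` is introduced here.

```python
def predict(target_mean, neighbors, item):
    """Adjusted weighted sum prediction for one item.

    neighbors: list of (similarity, ratings) pairs, where ratings maps
    item -> rating for that neighbor.
    """
    num = 0.0
    den = 0.0
    for sim, ratings in neighbors:
        if item not in ratings:
            continue  # only neighbors who rated the item contribute
        neighbor_mean = sum(ratings.values()) / len(ratings)
        num += sim * (ratings[item] - neighbor_mean)
        den += abs(sim)
    if den == 0:
        return target_mean  # no usable neighbor: fall back to the mean
    return target_mean + num / den


# Hypothetical neighbors, for illustration only.
neighbors = [
    (0.5, {"a": 2, "b": 4}),
    (-0.5, {"a": 4, "b": 2}),
]
print(predict(3.0, neighbors, "b"))
```

A neighbor with negative similarity who rates an item below their own mean pushes the prediction up, which is why both hypothetical neighbors raise the predicted rating of item `b`.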

Step 6. If the single item with the highest predicted rating is selected as the recommendation result, item 6 is recommended to the target user.
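Selecting the top-N items from the predicted ratings can be sketched with the standard library. The predicted values are the two from this example; the function name `top_n` is an assumption of the sketch.

```python
import heapq

# Predicted ratings of the target user's unrated items, from the example.
predicted = {"item 5": 2.16, "item 6": 4.93}


def top_n(predictions, n):
    """Return the n items with the highest predicted ratings."""
    return heapq.nlargest(n, predictions, key=predictions.get)


print(top_n(predicted, 1))
```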

Experiments:

(1) Public dataset

The experiments use MovieLens, a public dataset commonly used to evaluate the performance of recommender systems. The dataset contains 943 users, 1,682 movies, and 100,000 ratings; ratings take values in {1, 2, 3, 4, 5}, and each user has rated at least 20 movies.

(2) Evaluation metrics

The invention uses the mean absolute error (MAE) and the root mean square error (RMSE) to measure the accuracy of the algorithm. Both measure the deviation between users' actual ratings and the predicted ratings, so smaller MAE and RMSE values mean more accurate recommendations:

(8)

(9)
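The two accuracy metrics can be sketched in Python using their standard definitions: MAE is the mean of the absolute differences between predicted and actual ratings, and RMSE is the square root of the mean squared difference. The notation of the patent's own formulas (8) and (9) is not available here, so the function signatures and the sample ratings below are assumptions.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error between actual and predicted ratings."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between actual and predicted ratings."""
    return sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical ratings, for illustration only.
actual = [4, 2]
predicted = [3, 4]
print(mae(actual, predicted), rmse(actual, predicted))
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.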

Coverage is used to measure the diversity of the recommendations. Coverage is the proportion of items that can be recommended to the target users among all items that the target users have not rated, so a higher Coverage means more diverse recommendations.

(10)

In formula (10), $S_{u,i}$ denotes the neighbors of user $u$ who have rated item $i$.

(3) Parameter settings

The invention uses the Pearson similarity function to compute user similarity and the adjusted weighted sum function to predict ratings for items the target user has not rated. The two item attributes the target user likes most are taken as the target user's favorite attributes. To compare the invention clearly with the traditional user-based collaborative filtering algorithm, the number of neighbors of the target user is varied over K∈{20, 25, 30, …, 60}. The number of recommended items is set to {2, 4, 6, 8, 10, 12}.

(4) Comparison and analysis of experimental results

The personalized user collaborative filtering recommendation method based on covering reduction of the invention is denoted CBCF, and the traditional user-based collaborative filtering algorithm is denoted UBCF. Fig. 2 shows the results for the accuracy metrics MAE and RMSE. As the number of neighbors of the target user increases, the MAE and RMSE of CBCF remain lower than those of UBCF. Since smaller MAE and RMSE values mean higher accuracy, CBCF recommends items more accurately than UBCF. Fig. 3 shows the results for the diversity metric Coverage. As the number of neighbors of the target user increases, the Coverage of CBCF is clearly higher than that of UBCF. Since a higher Coverage means more diverse recommendations, CBCF recommends a more diverse set of items than UBCF. Taken together, the experimental results show that the invention provides recommendations that are both accurate and diverse, and thereby achieves personalized recommendation for the target user.

Claims

1. A personalized user collaborative filtering recommendation method based on covering reduction, characterized in that the following steps are carried out in order:

Step 1. Build the two-dimensional rating table:

From the users' ratings of items, form the two-dimensional rating table RM = {U, I, R∪{*}}; in the two-dimensional rating table RM, U is the set of users, I is the set of items, and R∪{*} is the set of ratings given by users to items, where * indicates that the user has not rated the item;

Let the rating of user $u \in U$ for item $i \in I$ be $r_{u,i} \in R \cup \{*\}$, and let $\bar{r}_u$ be the average rating of user $u$; $\theta$ is the rating threshold, and if $r_{u,i} \ge \theta$, user $u$ likes item $i$; the set of items rated by user $u$ is $I_u = \{i \in I \mid r_{u,i} \ne *\}$; $I_u^c = I - I_u$ is the set of items not rated by user $u$; the item attribute matrix is AM; in the user set U, if the set of items liked by user a is contained in the set of items liked by user b, then user a is called a redundant user of the target user;

Step 2. Remove redundant users with the covering reduction algorithm:

Step 2.1. Take the item set I as the universe I; in the universe I, the items liked by each user form a set; extract the target user's favorite attributes from the item attribute matrix AM:

$$[at_1 = av_1] \wedge [at_2 = av_2] \wedge \dots \wedge [at_m = av_m] \tag{1}$$

In formula (1), $m$ is the number of attributes, $at_m$ is an attribute, and $av_m$ is the value of attribute $at_m$;

Step 2.2. Use the target user's favorite attributes to build the target user's decision set D, which consists of the items that have the favorite attributes:

$$D = \{\, i \in I \mid at_1(i) = av_1,\ at_2(i) = av_2,\ \dots,\ at_m(i) = av_m \,\} \tag{2}$$

In formula (2), $at_m(i) = av_m$ means that the value of item $i$ on attribute $at_m$ is $av_m$;

Step 2.3. Shrink the universe from the item set I to the target user's decision set D, that is, the universe D; for each user $u \in U$, build the set $C_u$ of items on the universe D that user $u$ likes:

$$C_u = \{\, i \in D \mid r_{u,i} \ge \theta \,\} \tag{3}$$

Let $C^* = D - \bigcup C_u$; then $C = \{C_1, C_2, \dots, C_n, C^*\}$ is a covering C of the universe D for the target user;

Step 2.4. Use the covering reduction algorithm to remove the redundant elements from the covering C, obtaining the reduced covering reduct(C) and the reduced user set $U^r$:

$$U^r = \{\, u \in U \mid C_u \in \mathrm{reduct}(C) \,\} \tag{4}$$

Step 3. Take the reduced users $U^r$ as the candidate neighbors $u$ of the target user $au$;

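Step 4. Compute the similarity between the target user and each candidate neighbor, and select the target user's neighbors:

Use the Pearson similarity function (5) to compute the similarity between the target user $au$ and each candidate neighbor $u \in U^r$:

$$\mathrm{sim}(au,u) = \frac{\sum_{i \in I_{au} \cap I_u} (r_{au,i} - \bar{r}_{au})(r_{u,i} - \bar{r}_u)}{\sqrt{\sum_{i \in I_{au} \cap I_u} (r_{au,i} - \bar{r}_{au})^2}\ \sqrt{\sum_{i \in I_{au} \cap I_u} (r_{u,i} - \bar{r}_u)^2}} \tag{5}$$

In formula (5), $\mathrm{sim}(au,u)$ is the similarity between the target user $au$ and the candidate neighbor $u \in U^r$, $I_{au} = \{i \in I \mid r_{au,i} \ne *\}$ is the set of items rated by the target user $au$, and $\bar{r}_{au}$ is the target user's average rating;

Then select the $k$ candidate neighbors with the highest similarity as the neighbors $N_{au}(k)$ of the target user;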

Step 5. Predict ratings for the items the target user has not rated:

Using the ratings of the target user's neighbors $N_{au}(k)$, apply the adjusted weighted sum function (6) to predict a rating for each item in the set $I_{au}^c$ of items not rated by the target user $au$, obtaining the target user's predicted rating table;

$$P_{au,i} = \bar{r}_{au} + \lambda \sum_{u \in N_{au}(k) \cap U_i} \mathrm{sim}(au,u)\,(r_{u,i} - \bar{r}_u) \tag{6}$$

In formula (6), $P_{au,i}$ is the predicted rating of the target user $au$ for item $i$, $U_i = \{u \in U \mid r_{u,i} \ne *\}$ is the set of users who have rated item $i$, and $\lambda$ is a normalization factor:

$$\lambda = \frac{1}{\sum_{u \in N_{au}(k) \cap U_i} |\mathrm{sim}(au,u)|} \tag{7}$$

Step 6. Select the N items with the highest predicted ratings as the recommendation result.