[go: up one dir, main page]

CN109344321B - System for obtaining user personalized features - Google Patents

System for obtaining user personalized features Download PDF

Info

Publication number
CN109344321B
CN109344321B CN201810879449.8A CN201810879449A CN109344321B CN 109344321 B CN109344321 B CN 109344321B CN 201810879449 A CN201810879449 A CN 201810879449A CN 109344321 B CN109344321 B CN 109344321B
Authority
CN
China
Prior art keywords
user
document
parameter
vector
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810879449.8A
Other languages
Chinese (zh)
Other versions
CN109344321A (en
Inventor
祁勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Dashu Life Health Technology Co ltd
Original Assignee
Weifang Jiubao Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weifang Jiubao Intelligent Technology Co ltd filed Critical Weifang Jiubao Intelligent Technology Co ltd
Priority to CN201810879449.8A priority Critical patent/CN109344321B/en
Publication of CN109344321A publication Critical patent/CN109344321A/en
Application granted granted Critical
Publication of CN109344321B publication Critical patent/CN109344321B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a system for acquiring personalized features of a user. The method automatically updates the personalized features of the user through the signal of the user accessing the user and the signal of the user accessing the document. The personalized features of the user can be updated according to the personalized features of other users who have accessed the user, and can also be updated according to the personalized features of documents that have accessed the user. According to the query conditions submitted by the query user and the personalized features of each user in the group of users, calculating the personalized ranking value of each user in the group of users, and according to the size of the personalized ranking value, sending the identification of at least one user in the group of users to the query user. The invention also provides a system for acquiring the personalized features of the user. The method can be used for searching the user or the user group with specific characteristics in the social network.

Description

System for obtaining user personalized features
The application is a divisional application of a patent with an application date of 2012, 5 and 8, and an application number of 201210152084.1, and is named as a method and a system for acquiring personalized features of a user.
Technical Field
The invention relates to the field of Internet, in particular to a method and a system for acquiring personalized features of a user.
Background
The primary portal to the early internet was the web portal, where users entered the web world through web pages manually edited by web site workers. Subsequently, due to the development of search technology, people began to use search engines as the main entrance of the internet, and users can conveniently retrieve information on the internet by inputting search keywords. In recent years, with the rapid development of applications such as Facebook (Facebook), Twitter (Twitter), Google +, and microblog, social networks are gradually becoming the main entrance of the internet.
On the social network, users filter and filter internet information mainly through a relationship network established by the users. For example, the user obtains the information published or forwarded by the person in question through adding attention (follow), and obtains the information published by the friends through adding friends. Research shows that the relationship network established by users on the social network mainly comprises a 'strong relationship' network such as family, friends and colleagues of the users, and the relationship network is essentially a copy of the entity relationship network under the subscriber line on line. While social networks have real potential to develop "weak-relationship" user resources that users are not currently aware of, such as domain experts, celebrities, acquaintances, etc. on the network. At present, the method for establishing a 'weak relationship' network on a social network is single, a user can easily find a few people with high known names or a large number of fans, but can hardly find a great number of people who have a good skill in a specific professional field or are interested in a specific subject. This is a disadvantage of existing social networking technologies.
One improvement is to set personalized features for each user on the social network, and then to find users or groups of users with specific features according to the personalized features of the users and add them to the users' own relationship network. However, it is difficult to obtain the personalized features of the user on the social network, and there are several difficulties.
The first is the problem of automatic retrieval of personalized information. At present, there are 9 hundred million users on the facial makeup and 1.4 hundred million users on the Xinlang microblog, and it is unrealistic if most users are required to manually maintain the personalized features of the users on a social network. How to automatically acquire the personalized features of the user is a difficult problem. The second is the update problem of personalized information. As time goes by, personal information such as user's hobbies, workplaces, engaged industries, and education levels may change, but it is difficult for most users to update their personalized information in real time. Thirdly, the semantic difference problem of the personalized information. Among the personalized features set by the user, the personalized features with different terms and the same semantics are difficult to classify. Fourth is the completeness of the personalized information. The personal information provided by the user on the website is generally short. For example, a typical description of a user's interest is a favorite of music, playing baseball, or reading a book, and it is cumbersome and difficult to ask the user to describe his or her interest field in a comprehensive and detailed manner.
In summary, how to effectively obtain the personalized features of the user and search the user and the user group with specific features in the social network according to the personalized features is a problem to be solved.
Disclosure of Invention
In view of the problems existing in the prior art, an object of the present invention is to provide a method and a system for obtaining a personalized feature of a user, so as to automatically obtain the personalized feature of the user, and further search a user and a user group with specific features in a social network according to the personalized feature.
In accordance with the above-mentioned objects, the present invention provides a method for obtaining personalized features of a user, wherein,
acquiring a plurality of users on the internet, and storing a user set U which consists of the users, wherein the user set U is {1, 2. Setting a plurality of features, and storing a feature set K ═ {1, 2.., L } consisting of the plurality of features;
setting initial values of parameter vectors for a plurality of users in the user set U;
the following steps are executed for a plurality of times:
receiving a signal of any user i (i belongs to U) accessing any user j (j belongs to U);
reading a parameter vector Ku (i) (uwi1, uwi 2., uwik., uwiL) of the user i, wherein the uwik represents the correlation degree of the user i and a feature K (K epsilon K);
reading a parameter vector ku (j) ═ uwj1, uwj 2.,. uwjk.,. uwjL of the user j, the uwjk representing a degree of correlation of the user j with a feature K (K e K);
updating the parameter vectors of the user i and the user j by using a parameter vector updating algorithm;
Ku*(i)=function1[Ku(i),Ku(j)];
Ku*(j)=function2[Ku(i),Ku(j)];
wherein, the Ku (i) and the Ku (i) represent parameter vectors of the user i before and after updating, respectively, and the Ku (j) represent parameter vectors of the user j before and after updating, respectively.
Compared with the prior art, the method reflects the personalized features of the user by the parameter vector of the user, solves the problems of automatic acquisition, real-time update and the like of the personalized features of the user through a parameter vector updating algorithm, can search the user and the user group with specific features in the social network according to the parameter vector of the user, and provides a new way for the user to establish a 'weak relationship' network.
Drawings
FIG. 1 is a method for representing a parameter vector for each user in a user set U;
FIG. 2 is a method of representing a parameter vector for each document in the document set D;
FIG. 3 is a method for obtaining user personalized features based on a user access user signal;
FIG. 4 is a method for obtaining personalized features of a user based on a signal from the user accessing a document;
FIG. 5 is a flow chart of a method of querying a user for a particular characteristic;
FIG. 6 is a flow chart of a method for querying a user group having a particular characteristic;
fig. 7 is a diagram of a system for obtaining personalized features of a user.
Detailed Description
The method of the present invention will be described in further detail with reference to the accompanying drawings.
The description of specific embodiments of the process of this patent includes the following sections. Firstly, a numbering method of users, documents and features and a parameter vector representation method of the users and the documents are explained; then, a parameter vector updating algorithm based on a user access user signal and a parameter vector updating algorithm based on a user access document signal are explained; then, a method of searching for a user or a user group on a social network according to a given feature is described; finally, a system for obtaining personalized features of a user is described.
The numbering method of the user, document and feature is explained first.
The method comprises the steps that a plurality of users are obtained on the Internet, each user has at least one user identification, and the user identification comprises one of a user account, a mobile phone number, a Cookie identification code, an IP address, an Email address and an instant communication number. The obtained multiple users are numbered uniformly, and the user numbers are collected together to form a user set U, wherein M is the number of the users, and each user has a unique user code. Also, a plurality of documents are acquired on the internet, and a plurality of Web pages are acquired by, for example, a spider program. Documents on the internet have a unique identification, such as the URL address of a Web page. The obtained multiple documents are numbered uniformly, and the document numbers are collected together to form a document set D {1, 2.
And uniformly numbering the features of each element in the user set U and the document set D to form a feature set K, wherein the feature set K is {1, 2. The features represent attributes of users and documents such as news, finance, science, music, military, and sports, among others.
The following describes a method of representing a parameter vector of a user and a document. The parameter vector representation method is similar to the vector representation method of the vector space model VSM, namely, the feature item is used as a basic unit of the user feature or the document feature. The parameter vector of the user is represented by a set of the relevance of the user to each feature, and the parameter vector of the document is represented by a set of the relevance of the document to each feature. If a user or document does not have a feature, the relevance of the user or document to the feature is zero.
Fig. 1 shows a method for representing a parameter vector of each user in a user set U. The parameter vector of any user i (i e U) in the user set U is set to be ku (i) ═ uwi1, uwi2,.., uwik.., uwiL), wherein uwik represents the degree of correlation of the user i with the feature K (K e K), uwik e [ a, b ], a and b are nonnegative constants. In addition, the relevancy of each user in the user set U and the kth feature of the feature set K is collected together to form a vector, which is called the kth user column vector of the user set U (uw1K, uw 2K.., uwMk).
FIG. 2 is a method for representing a parameter vector for each document in the document set D. The parameter vector of any one document n (n e D) in the document set D is set to kd (n) ═ (dwn1, dwn2,. rightwords, dwnk,. and dwnL), where dwnk represents the degree of correlation of the document n with the feature K (K e K), and dwnk e [ a, b ], a and b are nonnegative constants. In addition, the relevance of each document in the document set D and the kth feature of the feature set K are collected together to form a vector, which is called a kth document column vector (dw1K, dw2K,..., dwNk) of the document set D.
The degree of relevance is a real number that represents how closely a document or user is to a feature in the feature set K. If a document or user is associated with music features more and less with sports features, we say that the document or user has a high correlation with music features and a low correlation with sports features. In addition, some features have correlation, so that the dimension of the feature set K can be reduced by reducing the correlation among the features during feature selection, the requirement on a server storage space is reduced, and the algorithm efficiency is improved. Some features need not be listed directly in the feature set because the relevance of these features can be calculated from the relevance of one or several other features in the feature set K.
The following explains a setting method of an initial value of a parameter vector of a user or a document. The following three examples are given for illustration. If the parameter vector of the user or document is not set to an initial value, its initial value of the parameter vector is set to a zero vector by default.
Example 1 is a method of manually setting the initial value of a parameter vector for a user i (i ∈ U) or a document n (n ∈ D). For example, set total number of features L to 5, set feature set K to (science, finance, education, music, sports), and set ku (i) to (uwi1, uwi2, uwi3, uwi4, uwi5) to (0, 0.00032, 0, 0.00059, 0). Namely, the correlation between the user i and the "finance" feature is 0.00032, the correlation between the user i and the "music" feature is 0.00059, and the correlation between the user i and other features is zero. Using a similar method, an initial value of a parameter vector kd (n) ═ dwn1, dwn2,. and dwnk,. and dwnL) of any document n can be set.
Example 2 is a method of setting the initial value of the parameter vector of the user i (i ∈ U). By the user iSubmit a set of document sets H {
Figure DEST_PATH_IMAGE001
The parameter vector of the document r (r e H) is kd (r) (dwr1, dwr 2.., dwrL), so for each K e K, uwik (σ 1/s) · (r e H) [ dwrk/(Σ (K e K) dwrk)]Wherein s is the number of elements of the set H, and σ 1 is a set constant. Using a similar method, the user i may also select a group of users in the user set U to calculate the initial value of the parameter vector of the user i.
Example 3 is a method of setting initial values of parameter vectors of a document. A catalog is a special document with a corresponding document number. For example, web portals typically include catalogs of news, music, sports, finance, and technology. We assume that documents under the same category have some of the same characteristics, e.g., documents under sports category are all related to sports. If document n (n e D) is a document under directory h (h e D), the initial value of the parameter vector of document n is determined by the parameter vector of directory h. For example, for each K ∈ K, dwnk ═ σ 2 · dwhk is set, where σ 2 is a set constant.
Fig. 3 is a method for obtaining a user personalized feature based on a user access user signal. The method specifically comprises the following steps:
s11, acquiring a plurality of users on the Internet, and storing a user set U (1, 2.., M) consisting of the users; setting a plurality of features, and storing a feature set K ═ {1, 2.., L } consisting of the plurality of features;
s12, setting initial values of parameter vectors for a plurality of users in the user set U;
s13, receiving a signal of any user i (i belongs to U) for accessing any user j (j belongs to U);
s14. reading a parameter vector ku (i) ═ uwi1, uwi 2., uwik., uwiL of the user i, wherein uwik represents a degree of correlation of the user i with a feature K (K ∈ K);
s15. read a parameter vector ku (j) ═ uwj1, uwj 2.,. uwjk.,. uwjL for the user j, where uwjk represents a degree of correlation of the user j with a feature K (K e K);
s16, updating the parameter vectors of the user i and the user j by using a parameter vector updating algorithm,
Ku*(i)=function1[Ku(i),Ku(j)],
Ku*(j)=function2[Ku(i),Ku(j)];
return is made to step S13.
Wherein the Ku (i) and the Ku (i) represent parameter vectors of the user i before and after updating, respectively, and the Ku (j) represent parameter vectors of the user j before and after updating, respectively; the kux (i) ═ uwi1, (uwi 2., uwik.,. uwiL.), and the kux (j) ═ uwj 1.,. uwj 2.,. uwjk.,. uwjL.; said function1 indicates that Ku (i) is a function of Ku (i) and Ku (j), and said function2 indicates that Ku (j) is a function of Ku (i) and Ku (j); the user i and the user j represent any two users in the user set U without specific reference to any two users, for example, when the nth time step S13 is executed, i is 1023, j is 29328 in the signal, and when the n +1 th time step S13 is executed, i is 737443, j is 837487 in the signal.
In the method shown in fig. 3, after the step S16 is performed, the method further includes the step of updating the Ku (i) and the Ku (j), that is, assigning values Ku (i) and Ku (j).
In an example of application of the method of fig. 3, the type of the signal is one of the following types: t-1 indicates that the user i pays attention to (follow) the user j, T-2 indicates that the user i adds the user j as a friend, T-3 indicates that the user i transfers information of the user j, T-4 indicates that the user i reviews information issued by the user j, T-5 indicates that the user i collects information of the user j, T-6 indicates that the user i sends a private message to the user j, T-7 indicates that the user i tags the user j, and T-8 indicates that the user i likes information issued by the user j. In one example of an application of the method of fig. 3, the signal is collected from a system log.
In one example of an application of the method depicted in FIG. 3, the method satisfies Ku (i) ≧ Ku (i) and Ku (j) ≧ Ku (j). Wherein the inequality Ku (i) ≧ Ku (i) means that for each K ∈ K, uwik ≧ uwik; the inequality Ku (j) ≧ Ku (j) means that for each K ∈ K, uwjk ≧ uwjk.
In an example of application of the method of FIG. 3, for each K ∈ K, uwik is an increasing function of uwjk; for each K ∈ K, uwjk is an increasing function of uwik.
In an example of application of the method of fig. 3, for each K e K, uwiK is a decreasing function of Σ (K e K) uwjk; for each K e K, uwjK is a decreasing function of Σ (K e K) uwik.
In an example application of the method of fig. 3, the signal comprises user identifications of the user i and the user j, each user identification corresponding to a unique user code. The application example reads the user code of the user i according to the user identification of the user i and reads the parameter vector of the user i according to the user code of the user i; and reading the user code of the user j according to the user identifier of the user j, and reading the parameter vector of the user j according to the user code of the user j.
In the method shown in fig. 3, after the parameter vector updating algorithm is executed for a set number of times, for each feature K ∈ K, normalization processing (normalization) is required on the kth user column vector (uw1K, uw 2K.., uwMk) in the user set U. The meaning of performing the parameter vector update algorithm once is to bring the Ku (i) and Ku (j) into the function1 and function2, resulting in the Ku (j) and Ku (i) processes. The specific application example of the normalization method is as follows:
example 1: a method of normalizing a k-th user column vector (uw1k, uw2 k.., uwMk) in a user set U includes setting temp ∑ (t ∈ U) uwtk, and setting uwik ═ uwik/temp for each i ∈ U.
Example 2: the method for normalizing the kth user column vector (uw1k, uw2 k.., uwMk) in the user set U is as follows. Firstly, calculating temp ═ Σ (t ∈ U) uwtk, and calculating uwik ═ uwik/temp for each i ∈ U; then sequencing uw1k, uw2 k.,. uwMk and equally dividing uw1k, uw2 k.,. uwMk into r groups according to the sequencing result, and taking out the minimum data composition set { s1, s 2.,. sr } in each group, wherein s1 < s 2. < sr; finally, uw1k, uw2 k.., uwMk was treated as follows: for each i ∈ U, if uwik < s1, set uwik ∈ a; if sm is less than or equal to uwik less than or equal to sm +1, setting uwik equal to g (sm); if uwik > sr, then uwik ═ b is set. Wherein g (sm) is an increasing function, g (sm) epsilon (a, b), m is more than or equal to 1 and less than r, a and b are non-negative constants, and r is a set parameter.
Application example 1
This is an example of an application of the method described in fig. 3. In the method shown in fig. 3, the parameter vector updating algorithm updates the parameter vectors of the user i and the user j by the following specific application example:
uwik*=uwik+λ1(j,i,T)·f1[Ku(j)](for each k e UKi
Figure 100002_DEST_PATH_IMAGE002
K)
uwjk*=uwjk+λ2(i,j,T)·f2[Ku(i)](for each k e UKj
Figure 223873DEST_PATH_IMAGE002
K)
Wherein uwik and uwik represent the kth component of the parameter vector of user i before and after updating, respectively, and uwjk represent the kth component of the parameter vector of user j before and after updating, respectively; the λ 1(j, i, T) is an influence coefficient of the user j on the user i under the type T of the signal, and the λ 2(i, j, T) is an influence coefficient of the user i on the user j under the type T of the signal. The UKi is a set composed of features corresponding to Pi components with the largest values in the parameter vectors ku (i) of the user i (uwi1, uwi2,.., uwik,.., uwiL), the UKj is a set composed of features corresponding to Pj components with the largest values in the parameter vectors ku (j) (uwj1, uwj2,.., uwjk,.., uwjL) of the user j, Pi and Pj are set parameters, Pi is less than or equal to L, and Pj is less than or equal to L. For example, i-30, P30-3, UK30 { literature, computer, biology }; j 265, P265 2, and UK265 { music, history }. After the execution of the specific algorithm, a setting uwjk is made for each k e UKi; for each k e UKj, uwik is set to uwik.
In the application example 1, the specific algorithm may be further defined as satisfying uwjk ≧ uwjk for each k ∈ UKi; for each k ∈ UKj, uwik ≧ uwik is satisfied.
In the application example 1, f1[ ku (j) ] is a function of the parameter vector ku (j) of the user j, and f2Ku (i) ] is a function of the parameter vector ku (i) of the user i. Specific implementation methods of the f1[ ku (j)) ] and the f2[ ku (i)) ] include the following examples:
example 1: said f1[ Ku (j) ] is an increasing function of said uwjk, and a decreasing function of Σ (K ∈ K) uwjk; f2[ Ku (i) ], is an increasing function of uwik and a decreasing function of Σ (K ∈ K) uwik.
Example 2: f1[ ku (j)) ] ═ σ 3 · uwjk/(∑ (K ∈ K) uwjk), f2[ ku (i)) ] ═ σ 4 · uwik/(Σ (K ∈ K) uwik), where σ 3 and σ 4 are set constants.
Example 3: f1[ ku (j)) ] ═ σ 5 · uwjk, f2[ ku (i)) ] ═ σ 6 · uwik, where σ 5 and σ 6 are set constants.
Example 4: f1[ ku (j)) ] ═ σ 7 · {1/[1+ exp (-uwjk) ] }, f2[ ku (i)) ] ═ σ 8 · {1/[1+ exp (-uwik) ] }, where σ 7 and σ 8 are set constants.
In the application example 1, a specific implementation method of the λ 1(j, i, T) and the λ 2(i, j, T) includes:
example 1: the λ 1(j, i, T) and the λ 2(i, j, T) are functions of the similarity sim (i, j) between the parameter vectors of the user i and the user j, respectively. For example, λ 1(j, i, T) ═ c1 · sim (i, j), λ 2(i, j, T) ═ c2 · sim (i, j), and sim (i, j) | | | ku (i), ku (j) | | [ ∑ Σ (K ∈ K) (uwik · uwjk) ]/{ [ Σ (K ∈ K) (uwik)2]1/2 [ Σ (K ∈ K) (uwjk)2]1/2}, and c1 and c2 are set constants. The implication of this example is that the higher the similarity between the parameter vectors of user i and user j, the larger the scaling factor they "vote" for each other.
Example 2: the λ 1(j, i, T) ═ U1(j) · U2(i), and the λ 2(i, j, T) ═ U1(i) · U2(j), where U1(j) indicates whether the parameter vector of user j can be used for updating the parameter vectors of other users in the user set U, and U1(i) indicates whether the parameter vector of user i can be used for updating the parameter vectors of other users in the user set U; u2(i) indicates whether the parameter vector of user i can be updated by the parameter vectors of other users in the user set U; u2(j) indicates whether the parameter vector of user j can be updated by the parameter vectors of other users in the user set U. u1(j), u2(j), u1(i) and u2(i) are setting parameters, and their values are 0 or 1. 1 represents yes and 0 represents no. The meaning of this example is that in order to prevent malicious attacks, the parameter vector of a user who has not been authenticated by reliability cannot be updated to the parameter vectors of other users; some special users have their parameter vectors that cannot be updated by the parameter vectors of other users.
Example 3: the λ 1(j, i, T) ═ s1(T), and the λ 2(i, j, T) ═ s2 (T). Wherein the T is a type of user signal accessed by a user, the s1(T) and the s2(T) are functions of the T.
Example 4: the λ 1(j, i, T) and λ 2(i, j, T) were generated using a combination of the methods of examples 1 to 3 above. For example
λ1(j,i,T)={c1·sim(i,j)}·{u1(j)·u2(i)}·s1(T)
λ2(i,j,T)={c2·sim(i,j)}·{u1(i)·u2(j)}·s2(T)。
Example 5: the λ 1(j, i, T) is a function of the number of users in the relational network of the user j, and the λ 2(i, j, T) is a function of the number of users in the relational network of the user i.
Example 6: the λ 1(j, i, T) and the λ 2(i, j, T) are set constants.
In the application example 1, after the specific algorithm is executed for a set number of times, the K-th user column vector (uw1K, uw 2K.., uwMk) needs to be normalized for each feature K ∈ K.
Application example 2
This is an example of the method of application example 1. For the sake of illustration, we assume that there are three users on the internet, each having two features, i.e., user set U ═ {1, 2, 3}, and feature set K ═ 1, 2 }. The parameter vectors for user 1, user 2 and user 3 are (uw11, uw12), (uw21, uw22) and (uw31, uw32), respectively. Where uwik (i ∈ U, K ∈ K) represents the degree of correlation of the user i with the feature K.
If a signal that the user 2 accesses the user 3 is received and the signal type T is 1, updating the parameter vectors of the user 2 and the user 3 according to the following parameter vector updating algorithm:
uw21*=uw21+λ1(3,2,1){uw31/(uw31+uw32)}
uw22*=uw22+λ1(3,2,1){uw32/(uw31+uw32)}
uw31*=uw31+λ2(2,3,1){uw21/(uw21+uw22)}
uw32*=uw32+λ2(2,3,1){uw22/(uw21+uw22)}
where λ 1(3, 2, 1) represents an influence coefficient of the user 3 on the user 2 under the signal type T ═ 1; λ 2(2, 3, 1) represents the influence coefficient of the user 2 on the user 3 with the signal type T equal to 1. Let λ 1(3, 2, 1) ═ c1 · sim (2, 3) · u1(3) · u2(2) · s1 (1); λ 2(2, 3, 1) ═ c2 · sim (2, 3) · u1(2) · u2(3) · s2(1), where s1(1) ═ 3 and s2(1) · 1.5; c1 and c2 are set constants; u1(3) indicates whether the parameter vector of user 3 can be used to update the parameter vectors of other users in the user set U, U1(2) indicates whether the parameter vector of user 2 can be used to update the parameter vectors of other users in the user set U, U2(2) indicates whether the parameter vector of user 2 can be updated by the parameter vectors of other users in the user set U, U2(3) indicates whether the parameter vector of user 3 can be updated by the parameter vectors of other users in the user set U, and U1(2) ═ U2(2) ═ U1(3) ═ U2(3) ═ 1; the sim (2, 3) represents the similarity between the parameter vectors of the user 2 and the user 3, i.e. sim (2, 3) ═ w21 · uw31+ uw22 · uw32)/{ [ (uw21)2+ (uw22)2]1/2 · [ (uw31)2+ (uw32)2] 1/2.
After the above algorithm is executed, the parameter vectors of the user 2 and the user 3 are updated, i.e., uw31 ═ uw31, uw32 ═ uw32, uw21 ═ uw21, and uw22 ═ uw22 are set.
After the above algorithm is executed, the user column vectors (uw11, uw21, uw31) and (uw12, uw22, uw32) are normalized. The algorithm is as follows: if temp1 ═ uw11+ uw21+ uw31, uw11 ═ uw11/temp1, uw21 ═ uw21/temp1, and uw31 ═ uw31/temp1 are set for the characteristic k ═ 1; if temp2 is uw12+ uw22+ uw32, uw12 is uw12/temp2, uw22 is uw22/temp2, and uw32 is uw32/temp2 for the feature k 2.
Fig. 4 is a method for obtaining personalized features of a user based on a signal of the user accessing a document. The method specifically comprises the following steps:
s21, acquiring a plurality of users on the Internet, and storing a user set U (1, 2.., M) consisting of the users; acquiring a plurality of documents on the internet, and storing a document set D ═ 1, 2. Setting a plurality of features, and storing a feature set K ═ {1, 2.., L } consisting of the plurality of features;
s22, setting parameter vector initial values for a plurality of users in the user set U, and setting parameter vector initial values for a plurality of documents in the document set D;
s23, receiving a signal that any user m (m belongs to U) accesses any document n (n belongs to D);
s24, reading a parameter vector ku (m) ═ uwm1, uwm 2.,. uwmk.,. uwmml) of the user m, wherein uwmk represents a degree of correlation of the user m with a feature K (K ∈ K);
s25. reading a parameter vector kd (n) of the document n, (dwn1, dwn 2.., dwnk.,. d wnl), wherein dwnk represents a degree of correlation of the document n with a feature K (K e K);
s26, updating the parameter vectors of the user m and the document n by using a parameter vector updating algorithm 2,
Ku*(m)=function3[Ku(m),Kd(n)],
Kd*(n)=function4[Ku(m),Kd(n)];
after the step S26 is completed, the process returns to the step S23.
Wherein the Ku (m) and the Ku x (m) represent parameter vectors of the user m before and after updating, respectively, and the Kd (n) and the Kd x (n) represent parameter vectors of the document n before and after updating, respectively; the Kd (m) is (uwm1, uwm2, hw, uwmk, hw ml), and the Kd (n) is (dwn1, dwn2, dwnk, dwnL); said function3 indicates that Ku (m) is a function of Ku (m) and Kd (n), and said function4 indicates that Kd (n) is a function of Ku (m) and Kd (n); the user m represents any user in the user set U without referring to a user, and the document n represents any document in the document set D without referring to a document, for example, when the nth execution of step S23 is executed, m 1023 and n 3428 are included in the signal, and when the nth +1 execution of step S23 is executed, m 33456 and n 28477 are included in the signal.
In the method shown in fig. 4, after the step S26 is performed, the method further includes the step of updating the Ku (m) and the Kd (n), i.e., assigning Kd (n) ═ Kd (n) and Ku (m) ═ Ku (m).
In the method of fig. 4, the type of the signal is at least one of the following types: t-9 represents the user m clicks the link of the document n, T-10 represents the address of the user m typing the document n, T-11 represents the user m sets the label for the document n, T-12 represents the user m sets the document n as a bookmark, T-13 represents the user m sets the document n as a favorite (such as +1 of Like and google of facial makeup), T-14 represents the user m forwards the document n, T-15 represents the user m reviews the document n, and T-16 represents the user m collects the document n. In one example of the method of fig. 4, the signal is collected from a Web log. The Web logs comprise a server log (server log), an error log (error log), a Cookie log and the like.
In one example of use of the method described in FIG. 4, the method satisfies Ku (m) ≧ Ku (m) and Kd (n) ≧ Kd (n). Wherein the inequality Ku (m) ≧ Ku (m) means uwmk ≧ uwmk for each K ∈ K; the inequality Kd (n) ≧ Kd (n) means that for each K ∈ K, there is dwnk ≧ dwnk.
In an example of application of the method of fig. 4, for each K e K, uwmk is an increasing function of dwnk, and a decreasing function of Σ (K e K) dwnk; for each K e K, the dwnk is an increasing function of the uwmk and a decreasing function of Σ (K e K) uwmk.
In an example application of the method of fig. 4, the signal includes a user identifier of the user m and a document identifier of the document n, the user identifier corresponds to a unique user code, and the document identifier corresponds to a unique document code. The application example reads the user code of the user m through the user identifier of the user m and reads the parameter vector of the user m according to the user code of the user m; reading the document code of the document n through the document identification of the document n, and reading the parameter vector of the document n according to the document code of the document n.
In the method shown in fig. 4, after the parameter vector update algorithm 2 is executed for a set number of times t1, for each feature K e K, normalizing the kth user column vector (uw1K, uw2K,. once, uwMk); after the parameter vector updating algorithm 2 is executed for a set time t2, for each feature K ∈ K, normalizing the kth document column vector (dw1K, dw 2K., dwNk); wherein t1 and t2 are positive integers. Performing the parameter vector update algorithm 2 once means a process of substituting the Ku (m) and the Kd (n) into the function3 and the function4, resulting in the Ku (m) and the Kd (n). The specific application example of the normalization method is as follows:
example 1: a method of normalizing a kth user column vector (uw1k, uw2 k.., uwMk) in a user set U includes setting temp ∑ (t ∈ U) uwtk, and for each m ∈ U, setting uwMk ═ uwMk/temp. A method of normalizing the kth document column vector (dw1k, dw2 k.., dwNk) in the document set D includes setting temp ∑ (te e D) dwtk, and for each n e D, setting dwNk ═ dwNk/temp.
Example 2: the method for normalizing the kth document column vector (dw1k, dw2 k.., dwNk) in the document set D is as follows. First calculating temp ═ Σ (t ∈ D) dwtk, and dwnk ═ dwnk/temp for each n ∈ D; then sequencing dw1k, dw2 k.,. dwNk and equally dividing dw1k, dw2 k.., dwNk into r groups according to the sequencing result, and taking out the minimum data composition set { s1, s 2.., sr } in each group, wherein s1 < s 2. < sr; dw1k, dw2 k.., dwNk is finally treated as follows: for each n ∈ D, if dwnk < s1, then dwnk ═ a is set; if sm is less than or equal to dwnk is less than or equal to sm +1, then dwnk is set to g (sm); if dwnk > sr, dwnk is set to b. Wherein g (sm) is an increasing function, g (sm) epsilon (a, b), m is more than or equal to 1 and less than r, a and b are non-negative constants, and r is a set parameter. In the same way, the k-th user column vector in the user set U can be normalized.
Application example 3
This is an example of an application of the method described in fig. 4. The parameter vector updating algorithm 2 updates the parameter vectors of the user m and the document n by the following specific application examples:
uwmk*=uwmk+λ3(n,m,T)·f3[Kd(n)](for each k e DKn
Figure 875435DEST_PATH_IMAGE002
K)
dwnk*=dwnk+λ4(m,n,T)·f4[Ku(m)](for each k ∈ DKm
Figure 186330DEST_PATH_IMAGE002
K)
Wherein uwmk and uwmk represent the kth component of the parameter vector of the user m before and after updating, respectively, and dwnk represent the kth component of the parameter vector of the document n before and after updating, respectively; the λ 3(n, m, T) is the influence coefficient of the document n on the user m under the type T of the signal, and the λ 4(m, n, T) is the influence coefficient of the user m on the document n under the type T of the signal. The DKm is a set of features corresponding to Pm components with the largest values in parameter vectors ku (m) (uwm1, uwm 2.,. so, uwmk.,. so, uwmL) of the user m, the DKn is a set of features corresponding to Qn components with the largest values in parameter vectors kd (n) (dwn1, dwn 2.,. dwnk.,. so, dwnL), Pm and Qn are set parameters, Pm is equal to or less than L, and Qn is equal to or less than L. For example, m 30, P30 3, UK30 { music, sports, finance }; n is 265, Q265 is 2, and DK265 is { music, building }. In addition, after the above-described specific algorithm is executed, settings are made such that dwnk ═ dwnk ∈ DKm is set for each k ∈ DKm, and uwmk ═ uwmk is set for each k ∈ DKn.
In the application instance 3, the specific algorithm may be further defined as satisfying uwmk ≧ uwmk for each k ∈ DKn; for each k ∈ DKm, dwnk ≧ dwnk is satisfied.
In the application example 3, f3[ kd (n)) ] is a function of the parameter vector kd (n) of the document n, and f4[ ku (m)) ] is a function of the parameter vector ku (m) of the user m. The specific implementation method of the f3[ Kd (n)) ] and the f4[ Ku (m)) ] comprises the following steps:
example 1: said f3[ Kd (n) ] is an increasing function of said dwnk and a decreasing function of Σ (K ∈ K) dwnk; f4[ Ku (m) ] is an increasing function of uwmk and a decreasing function of ∑ (K ∈ K) uwmk.
Example 2: f3[ kd (n)) ] ═ σ 3 · dwnk/(∑ (K ∈ K) dwnk), f4[ ku (m)) ] ═ σ 4 · uwmk/(Σ (K ∈ K) uwmk), where σ 3 and σ 4 are set constants.
Example 3: f3[ kd (n)) ] ═ σ 5 · dwnk, f4[ ku (m)) ] ═ σ 6 · uwmk, where σ 5 and σ 6 are set constants.
Example 4: f3[ kd (n)) ] ═ σ 7 · {1/[1+ exp (-dwnk) ] }, f4[ ku (m)) ] ═ σ 8 · {1/[1+ exp (-uwmk) ] }, where σ 7 and σ 8 are set constants.
In the application example 3, specific implementation methods of the λ 3(n, m, T) and the λ 4(m, n, T) include the following examples:
example 1: the λ 3(n, m, T) and the λ 4(m, n, T) are functions of the similarity sim (m, n) between the parameter vectors of the user m and the document n, respectively. For example, λ 3(n, m, T) ═ c1 · sim (m, n), λ 4(m, n, T) ═ c2 · sim (m, n), and sim (m, n) | | ku (m), kd (n) | | [ ∑ Σ (K ∈ K) (uwmk · dwnk) ]/{ [ Σ (K ∈ K) (uwmk)2]1/2 [ Σ (K ∈ K) (dwnk)2]1/2}, and c1 and c2 are set constants. The implication of this example is that the higher the similarity between the parameter vectors of the user and the document, the larger the scaling factor they "vote" for each other.
Example 2: the λ 3(n, m, T) ═ U2(m) · D1(n), and the λ 4(n, m, T) · U1(m) · D2(n), where D1(n) indicates whether the parameter vector of document n can be used to update the parameter vectors of users in user set U, U2(m) indicates whether the parameter vector of user m can be updated by the parameter vector of documents in document set D, U1(m) indicates whether the parameter vector of user m can be used to update the parameter vector of documents in document set D, and D2(n) indicates whether the parameter vector of document n can be updated by the parameter vectors of users in user set U. u1(m), u2(m), d1(n) and d2(n) are setting parameters, and the values of the setting parameters are 0 or 1. 1 represents yes and 0 represents no. The meaning of this example is that in order to prevent malicious attacks, some documents (or users) cannot update the parameter vectors of other users (or documents) because the documents (or users) are not authenticated by reliability; some special documents (or users) have their parameter vectors that cannot be updated by the parameter vectors of other users (or documents).
Example 3: the λ 3(n, m, T) ═ s1(T), and the λ 4(m, n, T) ═ s2 (T). Wherein the T is a type of user access document signal, the s1(T) and the s2(T) are functions of the T.
Example 4: the λ 3(n, m, T) and λ 4(m, n, T) were generated using a combination of the methods of examples 1-3 above. Namely, it is
λ3(n,m,T)={c1·sim(m,n)}·{u2(m)·d1(n)}·s1(T)
λ4(m,n,T)={c2·sim(m,n)}·{u1(m)·d2(n)}·s2(T)。
Example 5: the λ 3(n, m, T) is a function of the number of times the document n is accessed or the PageRank value, and the λ 4(m, n, T) is a function of the number of users in the user's m relational network.
Example 6: the λ 3(n, m, T) and the λ 4(m, n, T) are set constants.
In the application example 3, after the specific parameter vector update algorithm is executed for the set number of times, the K-th document column vector (dw1K, dw 2K.., dwNk) and the K-th user column vector (uw1K, uw 2K.., uwMk) need to be normalized for each feature K ∈ K.
Application example 4
This is an application example of the method described in application example 3. For convenience of explanation, we assume that there are two users and three documents on the internet, each having two features, i.e., user set U ═ {1, 2}, document set D ═ 1, 2, 3}, and feature set K ═ 1, 2 }. The parameter vectors of user 1 and user 2 are (uw11, uw12) and (uw21, uw22), respectively, and the parameter vectors of document 1, document 2, and document 3 are (dw11, dw12), (dw21, dw22), and (dw31, dw32), respectively. Wherein uwmk (m is equal to U, K is equal to K) represents the correlation degree of the user m and the characteristic K; dwnk (n ∈ D, K ∈ K) represents the relevance of the document n to the feature K.
Assuming that a signal is received in the server that the user 2 accesses the document 3 and the signal type T is 9, the parameter vectors of the user 2 and the document 3 are updated according to the following algorithm:
uw21*=uw21+λ3(3,2,9){dw31/(dw31+dw32)}
uw22*=uw22+λ3(3,2,9){dw32/(dw31+dw32)}
dw31*=dw31+λ4(2,3,9){uw21/(uw21+uw22)}
dw32*=dw32+λ4(2,3,9){uw22/(uw21+uw22)}
wherein λ 3(3, 2, 9) represents an influence coefficient of the document 3 on the user 2 under the signal type T ═ 9; λ 4(2, 3, 9) represents an influence coefficient of the user 2 on the document 3 under the signal type T of 9. For example, λ 3(3, 2, 9) ═ c1 · sim (2, 3) · s1 (9); λ 4(2, 3, 9) ═ c2 · sim (2, 3) · s2(9), where s1(9) ═ 3 and s2(9) ═ 1.5; c1 and c2 are set constants; the sim (2, 3) represents the similarity between the parameter vectors of the user 2 and the document 3, namely: sim (2, 3) ═ w21 · dw31+ uw22 · dw32)/{ [ (uw21)2+ (uw22)2]1/2 · [ (dw31)2+ (dw32)2]1/2 }.
After the above algorithm is executed, the parameter vectors of the user 2 and the document 3 are updated, i.e., uw21 ═ uw21, uw22 ═ uw22, dw31 ═ dw31, and dw32 ═ dw32 are set.
After the above algorithm is executed, the user column vectors (uw11, uw21) and (uw12, uw22) are normalized, and the document column vectors (dw11, dw21, dw31) and (dw12, dw22, dw32) are normalized.
The algorithm for the normalization process of the user column vectors is as follows: if temp1 ═ uw11+ uw21, uw11 ═ uw11/temp1 and uw21 ═ uw21/temp1 are set for characteristic k ═ 1; if temp2 is uw12+ uw22, uw12 is uw12/temp2 and uw22 is uw22/temp2 for characteristic k 2.
The algorithm for the normalization process of the document column vector is as follows: if temp1 ═ dw11+ dw21+ dw31, dw11 ═ dw11/temp1, dw21 ═ dw21/temp1, dw31 ═ dw31/temp1 are set for the characteristic k ═ 1; if temp2 is dw12+ dw22+ dw32, dw12 is dw12/temp2, dw22 is dw22/temp2, and dw32 is dw32/temp2 for the feature k 2.
FIG. 5 is a flow chart of a method of querying a user for a particular characteristic. The method comprises the following steps executed in the server:
A11. receiving a query vector set by any query user e (e belongs to U);
A12. the query user e selects a group of users Q (Q e.g. U) in the user set U, for example, selects a group of users with ages in a given interval or all users with locations in a set geographical area in a social network; if the user does not select the data, the default value is Q ═ U;
A13. calculating a personalized ranking value UR (e, m) of each user m (m E Q) in the group of users Q according to the query vector and the parameter vector of each user in the group of users Q; the UR (e, m) represents a personalized rank value for the user m based on the user e's query vector;
A14. and sending the identifications of some users with the maximum personalized ranking value to the inquiring user e in the group of users Q.
In the method shown in fig. 5, the query vector set by the user e is ks (e) ═ ks (swe1, swe 2.., swek.,. swek.., sweL), where swek represents the degree of correlation between the user that the user e desires to query and the feature K (K ∈ K), and swek ∈ [ a, b ], a and b are preset nonnegative constants. The query vector ks (e) has the following setting methods.
The first is that the user e selects a feature from the feature set K and sets a feature correlation, for example, set swe 2-0.00023 and swe 6-0.00061, to the feature set K, and the correlation of the user e with other features is 0.
The second is to assign the parameter vector ku (e) of the user e to the query vector ks (e).
The third is that the user e submits a set of users or documents Se. When in use
Figure DEST_PATH_IMAGE003
Then, the parameter vector of the user r (r ∈ Se) is (uwr1, uwr 2.., uwrL), so the query vector of the user e is set to: for each feature K e K, swek ∈ K · (σ 9/s) · (r e Se) [ uwrk/(∑ (K e K) uwrk)](ii) a When in use
Figure DEST_PATH_IMAGE004
Then, the parameter vector of the document r (r ∈ Se) is (dwr1, dwr 2.,. dwrL), so the query vector of the user e is set to: for each feature K e K, swek ∈ K · (σ 10/s) · ∑ (r e Se) [ dwrk/(∑ (K e K) dwrk)]。
In an application example of the method shown in fig. 5, the personalized ranking value UR (e, m) is calculated by a query vector ks (e) (swe1, swe 2.,. swek.,. sweL.) of the user e and a parameter vector ku (m) of the user m (m e Q) (uwm1, uwm 2.,. uwmk.,. u, u ml), for example
Figure DEST_PATH_IMAGE005
FIG. 6 is a flow chart of a method for querying a user group having a particular characteristic. A plurality of user groups are obtained, and a user cluster G ═ 1, 2. A parameter vector of a group of users i (i e G) is set to (gwi1, gwi 2.,. gwik.,. gwiL), wherein the gwik represents a degree of correlation of the group of users i with a feature K (K e K). Therefore, the method for querying the user group with the specific characteristics is as follows:
A21. calculating a parameter vector of each user group in the user group G; the parameter vector of a user group is calculated by the parameter vector of each user contained in the user group; for example, assuming that all users in the user group i form the user set Bi, the parameter vector calculation method of the user group i is to set gwik ═ (σ 11/s) ·Σ (t ∈ Bi) [ uwtk/(Σ (K ∈ K) uwtk) ] for each feature K ∈ K, where s is the number of elements in the user set Bi, and σ 11 is a set constant.
A22. Receiving a query vector set by any query user e (e belongs to U);
A23. calculating a personalized ranking value GR (e, i) of each user group i (i belongs to G) in the user cluster G according to the query vector and the parameter vector of each user group in the user cluster G; the GR (e, i) represents a personalized ranking value for the group of users i based on the user e's query vector;
A24. and in the user cluster G, sending the identifications of some user groups with the maximum personalized ranking values to the inquiring user e.
In the method shown in fig. 6, the query vector set by the user e is ks (e) ═ ks (swe1, swe 2., swek., sweL), where swek represents the correlation between the user group that the user e desires to query and the feature K (K ∈ K), and swek ∈ [ a, b ], a and b are preset nonnegative constants. The query vector Ks (e) has four setting methods, the first three methods are the same as the three methods described in FIG. 5. And fourthly, the user e submits a user group, and the parameter vector of the user group is assigned to the Ks (e).
In the method for searching for the user group, the personalized ranking value GR (e, i) is calculated by a query vector ks (e) (swe1, swe 2.., swek.., sweL) of the user e and a parameter vector ku (i) (i ∈ G) (gwi1, gwi 2.., gwi.., ghil) of the user group i (i ∈ G), for example, the personalized ranking value GR (e, i) is calculated by the user e
Figure DEST_PATH_IMAGE006
Fig. 7 is a system structure diagram for acquiring the personalized features of the user. The system 200 comprises the following functional modules:
user, document and feature setting module 211: acquiring a plurality of users on the internet to form a user set U ═ 1, 2.., M }, and storing the user set U in a user database 220; acquiring a plurality of documents on the internet to form a document set D ═ 1, 2.. multidata, N }, and storing the document set D in a document database 230; setting a plurality of features, forming a feature set K ═ {1, 2.., L }, and storing the feature set in a feature database 240;
user and document parameter vector initial value setting module 212: setting initial values of parameter vectors for a plurality of users in the user set U, and storing the initial values in the user database 220; setting initial values of parameter vectors for a plurality of documents in the document set D and storing the initial values in the document database 230; the initial value of the parameter vector of the user and the document which are not set with the initial value of the parameter vector is set as a zero vector by default;
the user access user signal acquisition module 213: the system comprises a Web log database 250, a signal 1 and a signal processing module, wherein the signal 1 is used for collecting a signal 1 of any user i (i belongs to U) accessing any user j (j belongs to U), and the signal 1 is stored in the Web log database 250; a signal for the user i (101) to access the user j (102) is sent to a social network server 302;
user access document signal collection module 214: the method comprises the steps of collecting a signal 2 of any user m (m belongs to U) (103) accessing any document n (n belongs to D), wherein the signal 2 is stored in a Web log database 250; a signal for said user m (103) to access said document n, to be sent to at least one application server, said application server comprising a web portal server 301, a social network server 302, a search engine server 303 and an instant messaging server 304;
user and document parameter vector update module 215: reading the parameter vectors of the user i (101) and the user j (102) in the user database 220 according to the signal 1, then updating the parameter vectors of the user i (101) and the user j (102) through a parameter vector updating algorithm, and storing the updated parameter vectors of the user i and the user j in the user database 220; reading the parameter vector of the user m (103) in the user database 220 and the parameter vector of the document n in the document database 230 according to the signal 2, then updating the parameter vectors of the user m (103) and the document n by a parameter vector updating algorithm 2, and storing the updated parameter vector of the user m in the user database 220 and the updated parameter vector of the document n in the document database 230;
the user query module 216: the module has the inquiry function of users and user groups; the user query function includes: receiving a query vector set by a querying user and obtaining a set of users
Figure DEST_PATH_IMAGE007
Then, according to the query vector and the parameter vector of each user in the group of users Q, calculating a personalized ranking value of each user in the group of users Q, and according to the magnitude of the personalized ranking value, sending the identification of at least one user in the group of users Q to the query user, see steps A11-A14; the user group query function includes: receiving a query vector set by a query user, then calculating a personalized ranking value of each user group in the user group G according to the query vector and the parameter vector of each user group in the user group G, and sending an identifier of at least one user group in the user group G to the query user according to the size of the personalized ranking value, see steps A21-A24.
The above-mentioned application examples are only preferred application examples of the present invention, and are not intended to limit the scope of the present invention.

Claims (1)

1. A system for obtaining personalized features of a user, the system comprising:
user, document and feature setting module: acquiring a plurality of users on the internet to form a user set U {1, 2., M }, and storing the user set U in a user database; acquiring a plurality of documents on the internet to form a document set D {1, 2.,. N }, and storing the document set D in a document database; setting a plurality of features, forming a feature set K, wherein the feature set K is {1, 2.., L }, and storing the feature set K in a feature database;
the initial value setting module of the parameter vectors of the user and the document comprises: setting initial values of parameter vectors for a plurality of users in the user set U, and storing the initial values in the user database; setting initial values of parameter vectors for a plurality of documents in the document set D, and storing the initial values in the document database; the initial value of the parameter vector of the user and the document which are not set with the initial value of the parameter vector is set as a zero vector by default;
the user accesses the user signal acquisition module: the system comprises a signal 1, a Web log database and a data processing module, wherein the signal 1 is used for acquiring a signal 1 of any user i (i belongs to U) accessing any user j (j belongs to U), and the signal 1 is stored in the Web log database;
the user accesses the document signal acquisition module: the method comprises the steps that a signal 2 for any user m (m belongs to U) to access any document n (n belongs to D) is collected, and the signal 2 is stored in a Web log database;
the parameter vector updating module of the user: reading the parameter vectors of the user i and the user j in the user database according to the signal 1, and then updating the parameter vectors of the user i and the user j through a parameter vector updating algorithm; reading the parameter vector of the user m in the user database and the parameter vector of the document n in the document database according to the signal 2, and then updating the parameter vectors of the user m and the document n through a parameter vector updating algorithm 2;
a user query module: the module has the inquiry function of users and user groups; the user query function includes: receiving a query vector set by a query user and acquiring a group of users, then calculating the personalized ranking value of each user in the group of users Q according to the query vector and the parameter vector of each user in the group of users Q, and sending the identification of some users with the largest personalized ranking values to the query user e; the user group query function includes: receiving a query vector set by a query user, calculating an individualized ranking value of each user group in a user group G according to the query vector and a parameter vector of each user group in the user group G, and sending the identification of some user groups with the largest individualized ranking values to the query user e;
the parameter vector updating algorithm updates the parameter vectors of the user i and the user j by the following method:
uwik*=uwik+λ1(j,i,T)·f1[Ku(j)](for each k e UKi
Figure DEST_PATH_IMAGE002
K)
uwjk*=uwjk+λ2(i,j,T)·f2[Ku(i)](for each k e UKj
Figure 12356DEST_PATH_IMAGE002
K)
Wherein uwik and uwik represent the kth component of the parameter vector of user i before and after updating, respectively, and uwjk represent the kth component of the parameter vector of user j before and after updating, respectively; the λ 1(j, i, T) is an influence coefficient of the user j on the user i under the type T of the signal, and the λ 2(i, j, T) is an influence coefficient of the user i on the user j under the type T of the signal; UKi is a set consisting of features corresponding to Pi components with the largest values among the parameter vectors ku (i) of the user i (uwi1, uwi2,.., uwik,.., uwiL), UKj is a set consisting of features corresponding to Pj components with the largest values among the parameter vectors ku (j) (uwj1, uwj2,., uwjk,. uwjL, uwjL), Pi and Pj are set parameters, Pi is less than or equal to L, and Pj is less than or equal to L;
the parameter vector updating algorithm 2 updates the parameter vectors of the user m and the document n by the following method:
uwmk*=uwmk+λ3(n,m,T)·f3[Kd(n)](for each k e DKn
Figure 663917DEST_PATH_IMAGE002
K)
dwnk*=dwnk+λ4(m,n,T)·f4[Ku(m)](for each k ∈ DKm
Figure 850179DEST_PATH_IMAGE002
K)
Wherein uwmk and uwmk represent the kth component of the parameter vector of the user m before and after updating, respectively, and dwnk represent the kth component of the parameter vector of the document n before and after updating, respectively; the lambda 3(n, m, T) is the influence coefficient of the document n on the user m under the type T of the signal, and the lambda 4(m, n, T) is the influence coefficient of the user m on the document n under the type T of the signal; the DKm is a set of features corresponding to Pm components with the largest values in parameter vectors ku (m) (uwm1, uwm 2.,. so, uwmk.,. so, uwmL) of the user m, the DKn is a set of features corresponding to Qn components with the largest values in parameter vectors kd (n) (dwn1, dwn 2.,. dwnk.,. so, dwnL), Pm and Qn are set parameters, Pm is equal to or less than L, and Qn is equal to or less than L.
CN201810879449.8A 2012-05-08 2012-05-08 System for obtaining user personalized features Active CN109344321B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810879449.8A CN109344321B (en) 2012-05-08 2012-05-08 System for obtaining user personalized features

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810879449.8A CN109344321B (en) 2012-05-08 2012-05-08 System for obtaining user personalized features
CN201210152084.1A CN103390008B (en) 2012-05-08 2012-05-08 A method and system for obtaining user personalized features

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201210152084.1A Division CN103390008B (en) 2012-05-08 2012-05-08 A method and system for obtaining user personalized features

Publications (2)

Publication Number Publication Date
CN109344321A CN109344321A (en) 2019-02-15
CN109344321B true CN109344321B (en) 2021-11-02

Family

ID=49534284

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201210152084.1A Expired - Fee Related CN103390008B (en) 2012-05-08 2012-05-08 A method and system for obtaining user personalized features
CN201810879449.8A Active CN109344321B (en) 2012-05-08 2012-05-08 System for obtaining user personalized features

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201210152084.1A Expired - Fee Related CN103390008B (en) 2012-05-08 2012-05-08 A method and system for obtaining user personalized features

Country Status (1)

Country Link
CN (2) CN103390008B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101079064A (en) * 2007-06-25 2007-11-28 腾讯科技(深圳)有限公司 Web page sequencing method and device
CN101334773A (en) * 2007-06-28 2008-12-31 联想(北京)有限公司 Method for filtrating search engine searching result
CN101996229A (en) * 2009-08-18 2011-03-30 财团法人工业技术研究院 Video search method and device using motion vectors

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020103798A1 (en) * 2001-02-01 2002-08-01 Abrol Mani S. Adaptive document ranking method based on user behavior
US6892198B2 (en) * 2002-06-14 2005-05-10 Entopia, Inc. System and method for personalized information retrieval based on user expertise
CN1667607A (en) * 2004-03-11 2005-09-14 国际商业机器公司 Personalized category treatment method and system for document browsing
CN100527132C (en) * 2007-06-22 2009-08-12 腾讯科技(深圳)有限公司 Classified sample set optimizing method and content-related advertising server
CN101770520A (en) * 2010-03-05 2010-07-07 南京邮电大学 User interest modeling method based on user browsing behavior
CN102999540A (en) * 2011-09-10 2013-03-27 祁勇 Method and system for determining user features on Internet
CN103309900A (en) * 2012-03-06 2013-09-18 祁勇 Personalized multidimensional document sequencing method and system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101079064A (en) * 2007-06-25 2007-11-28 腾讯科技(深圳)有限公司 Web page sequencing method and device
CN101334773A (en) * 2007-06-28 2008-12-31 联想(北京)有限公司 Method for filtrating search engine searching result
CN101996229A (en) * 2009-08-18 2011-03-30 财团法人工业技术研究院 Video search method and device using motion vectors

Also Published As

Publication number Publication date
CN103390008A (en) 2013-11-13
CN103390008B (en) 2018-09-28
CN109344321A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
US10311478B2 (en) Recommending content based on user profiles clustered by subscription data
US8620948B2 (en) System and method for crowdsourced template based search
US9043358B2 (en) Enterprise search over private and public data
TWI636416B (en) Method and system for multi-phase ranking for content personalization
US8117256B2 (en) Methods and systems for exploring a corpus of content
US8805833B2 (en) Enhancing personal data search with information from social networks
CN110209827B (en) Search method, search device, computer-readable storage medium, and computer device
CN102054003B (en) Methods and systems for recommending network information and creating network resource index
US9060029B2 (en) System and method for target profiling using social network analysis
US9946799B2 (en) Federated search page construction based on machine learning
US20090094189A1 (en) Methods, systems, and computer program products for managing tags added by users engaged in social tagging of content
US9477720B1 (en) Social search endorsements
CN107155362A (en) Ranking search results for member profile queries
WO2009114204A2 (en) Systems and methods for content sharing
US20140365466A1 (en) Search result claiming
CN102930010A (en) Sequencing of entity attribute and relation
US9342603B2 (en) Experience graph
US10789293B2 (en) Automatic search dictionary and user interfaces
US9465875B2 (en) Searching based on an identifier of a searcher
CN108352032A (en) Systems and methods for returning prioritized content
US9519710B1 (en) Dynamic classification of attribute relevance and classification
CN108959579B (en) System for acquiring personalized features of user and document
CN103309900A (en) Personalized multidimensional document sequencing method and system
CN109344321B (en) System for obtaining user personalized features
Nakashe et al. Smart approach to crawl web interfaces using a two stage framework of crawler

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211015

Address after: 261061 No.3 apartment 2214, zone B, Weifang Gaoxin Central Business District, 6888 Jiankang East Street, Huayuan community, Qingchi street, Gaoxin District, Weifang City, Shandong Province

Applicant after: Weifang Jiubao Intelligent Technology Co.,Ltd.

Address before: 226661 No. 123 South Street, Qu Tang Town, Haian, Nantong, Jiangsu

Applicant before: Jing Zhuqiang

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221216

Address after: 274042 Zone B, Weifang High tech Central Business District, No. 6888, Jiankang East Street, Garden Community, Qingchi Street, High tech Zone, Heze City, Shandong Province

Patentee after: Shandong Dashu Life Health Technology Co.,Ltd.

Address before: 261061 No.3 apartment 2214, zone B, Weifang Gaoxin Central Business District, 6888 Jiankang East Street, Huayuan community, Qingchi street, Gaoxin District, Weifang City, Shandong Province

Patentee before: Weifang Jiubao Intelligent Technology Co.,Ltd.

TR01 Transfer of patent right