
CN112016836B - A method and device for determining similarity between objects - Google Patents

A method and device for determining similarity between objects

Info

Publication number
CN112016836B
CN112016836B CN202010896318.8A CN202010896318A CN112016836B CN 112016836 B CN112016836 B CN 112016836B CN 202010896318 A CN202010896318 A CN 202010896318A CN 112016836 B CN112016836 B CN 112016836B
Authority
CN
China
Prior art keywords
attribute
association
nodes
dimension
association network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010896318.8A
Other languages
Chinese (zh)
Other versions
CN112016836A (en)
Inventor
刘红宝
郑建宾
高鹏飞
贡钟瑞
孙权
孙郯
王臻
陈玥如
陈滢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Unionpay Co Ltd
Original Assignee
China Unionpay Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Unionpay Co Ltd filed Critical China Unionpay Co Ltd
Priority to CN202010896318.8A priority Critical patent/CN112016836B/en
Publication of CN112016836A publication Critical patent/CN112016836A/en
Priority to PCT/CN2020/139531 priority patent/WO2022041600A1/en
Priority to KR1020237009620A priority patent/KR102901463B1/en
Priority to TW110100218A priority patent/TWI842973B/en
Application granted granted Critical
Publication of CN112016836B publication Critical patent/CN112016836B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067Enterprise or organisation modelling
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12Discovery or management of network topologies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Tourism & Hospitality (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and device for determining similarity between objects. The method comprises: for at least one dimension attribute of any attribute group among a plurality of attribute groups, generating an attribute association network of a plurality of evaluation objects under the at least one dimension attribute; for any two attribute association networks, fusing the two attribute association networks to obtain a fusion association network; randomly traversing the nodes of each fusion association network to obtain a plurality of node sequences of the fusion association network; and determining, according to the plurality of node sequences, the similarity of any two evaluation objects among the plurality of evaluation objects under the fusion association network.

Description

Method and device for determining similarity between objects
Technical Field
The present application relates to the field of similarity analysis, and in particular, to a method and apparatus for determining similarity between objects.
Background
In its daily operation, an institution may deal with a wide variety of objects. Different objects may have different characteristics; for example, the types of business an institution can suitably conduct, and the ways of conducting it, may differ from area to area. To improve the institution's decision-making efficiency, the characteristics of different objects need to be examined so that reasonable, targeted decisions can be made for each of them.
If the performance of one object under a decision can be inferred, that performance can guide the decision for another, similar evaluation object. How to judge the similarity between evaluation objects is therefore of considerable research value for institutional decision-making. However, there is currently no method for evaluating the similarity of different objects, which is a problem to be solved.
Disclosure of Invention
The application provides a method and a device for determining similarity between objects, which solve the problem that the prior art does not have a method for evaluating the similarity of different objects.
In a first aspect, the present application provides a method for determining similarity between objects, including: for at least one dimension attribute of any one of a plurality of attribute groups, generating an attribute association network of a plurality of evaluation objects under the at least one dimension attribute; wherein each evaluation object has a uniquely mapped node in the attribute association network, and the side information (that is, the edge information) between the nodes in the attribute association network characterizes the degree of association between the evaluation objects under the at least one dimension attribute; for any two attribute association networks, fusing the two attribute association networks to obtain a fusion association network; wherein each evaluation object has a uniquely mapped node in the fusion association network, and the side information between the nodes in the fusion association network characterizes the comprehensive degree of association between the evaluation objects; randomly traversing the nodes of each fusion association network to obtain a plurality of node sequences of the fusion association network; and determining, according to the plurality of node sequences, the similarity of any two evaluation objects among the plurality of evaluation objects under the fusion association network.
In the above method, an attribute association network of the plurality of evaluation objects under the at least one dimension attribute is generated, and the degrees of association among the plurality of evaluation objects are represented in that network. Two attribute association networks are then fused to obtain a fusion association network, so that the comprehensive degrees of association of the plurality of evaluation objects with respect to regional characteristics are fully represented. The nodes of the fusion association network are traversed randomly to obtain a plurality of node sequences, from which the similarity of any two evaluation objects under the fusion association network is determined. A method for determining similarity between objects is thereby provided.
Optionally, the generating, for at least one dimension attribute of the evaluation object, an attribute association network of a plurality of evaluation objects under the at least one dimension attribute includes: for any two evaluation objects among the plurality of evaluation objects, determining the side information between the two nodes corresponding to the two evaluation objects in the attribute association network according to the attribute values of the two evaluation objects under the dimension attribute; and generating the attribute association network of the plurality of evaluation objects for the dimension attribute according to the side information between the nodes corresponding to the plurality of evaluation objects in the attribute association network.
In the above manner, the side information between the two nodes corresponding to the two evaluation objects is determined according to the attribute values of the two evaluation objects under the dimension attribute, and the attribute association network is generated from it. This provides a way of generating the attribute association network from attribute values under the same dimension attribute, and improves the flexibility of generating the attribute association network of the plurality of evaluation objects for the dimension attribute.
Optionally, the evaluation object is a region; the dimension attribute comprises a time sequence position attribute of a user in the area; the attribute values under the time sequence position attribute of the user in the area comprise: a user identification; the determining side information between the two nodes corresponding to the two evaluation objects in the attribute association network according to the attribute values of the two evaluation objects under the dimension attribute includes: determining the side information of the two areas according to the user identifications in the two areas.
In the above method, when the evaluation object is a region, the time sequence position attribute can characterize the relevance of regions over time, so the side information of the two regions can be determined more accurately according to the user identifications in the two regions.
Optionally, the at least one dimension attribute includes a first type attribute dimension and a second type attribute dimension; the first type attribute dimension and the second type attribute dimension are attribute dimensions which are preset to be associated; the generating, for at least one dimension attribute of the evaluation object, an attribute association network of a plurality of evaluation objects under the at least one dimension attribute includes: determining side information between two corresponding nodes of the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the first type attribute dimension and the attribute value of the second evaluation object in the second type attribute dimension, or determining side information between two corresponding nodes of the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the second type attribute dimension and the attribute value of the second evaluation object in the first type attribute dimension; and generating an attribute association network of the plurality of evaluation objects aiming at the first type attribute dimension and the second type attribute dimension according to the side information of the plurality of evaluation objects between the corresponding nodes in the attribute association network.
In the above manner, since the first type attribute dimension and the second type attribute dimension are attribute dimensions of preset association, the side information between two corresponding nodes in the attribute association network can be determined through the attribute values of different type attribute dimensions of the first evaluation object and the second evaluation object, so as to generate the attribute association network, thereby providing an attribute association network generation method aiming at different type attribute dimensions.
Optionally, the randomly traversing the nodes of each fusion association network to obtain a plurality of node sequences includes: determining random walk probability among nodes in the fusion association network according to the side information among the nodes in the fusion association network; and randomly traversing the nodes of the fusion association network based on random walk probability among the nodes in the fusion association network to obtain the plurality of node sequences.
In the above manner, the random walk probability among the nodes is determined according to the side information among the nodes in the fusion association network, so that the nodes of the fusion association network are traversed randomly on the basis of considering the random walk probability among the nodes, and the plurality of node sequences are obtained more accurately.
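The weighted traversal described above can be illustrated with a minimal sketch. It assumes, as one possible reading, that the random walk probability from a node to each neighbor is that neighbor's edge weight divided by the sum of the node's outgoing edge weights; the source does not fix this formula. The function names, the walk length, the number of walks per node and the example network are all hypothetical.

```python
import random


def transition_probabilities(weights):
    """Normalize each node's edge weights into random walk probabilities.

    Assumption: probability is proportional to edge weight.
    """
    probs = {}
    for node, neighbors in weights.items():
        total = sum(neighbors.values())
        if total > 0:
            probs[node] = {n: w / total for n, w in neighbors.items()}
        else:
            probs[node] = {}
    return probs


def random_walks(weights, walk_length, walks_per_node, seed=0):
    """Generate node sequences by weighted random traversal of the network."""
    rng = random.Random(seed)
    probs = transition_probabilities(weights)
    sequences = []
    for start in weights:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = probs.get(walk[-1], {})
                if not neighbors:
                    break  # dead end: stop this walk early
                nodes = list(neighbors)
                step = rng.choices(nodes, weights=[neighbors[n] for n in nodes])[0]
                walk.append(step)
            sequences.append(walk)
    return sequences


# Hypothetical fusion association network over three regions,
# stored as node -> {neighbor: comprehensive association weight}.
fused = {
    "A": {"B": 3.0, "C": 1.0},
    "B": {"A": 3.0, "C": 2.0},
    "C": {"A": 1.0, "B": 2.0},
}
walks = random_walks(fused, walk_length=5, walks_per_node=2)
```

Each entry of `walks` is one node sequence; strongly associated regions tend to appear next to each other more often, which is what a downstream embedding step can exploit.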
Optionally, the side information among the nodes in the attribute association network is an attribute association weight value among the nodes; the side information among the nodes in the fusion association network is a comprehensive association weight value among the nodes; the fusing the two attribute association networks to obtain a fused association network comprises the following steps: for any two nodes in the fusion association network, determining the comprehensive association weight value of the two nodes in the fusion association network according to the attribute association weight values and the weighting coefficients of the two nodes in the two attribute association networks; and generating the characteristic association network of the plurality of evaluation objects based on the comprehensive association weight values among the nodes in the fusion association network.
In the above manner, the side information between the nodes in the attribute association network is the attribute association weight value between the nodes, the attribute association weight values and the weighting coefficients of the two nodes in the two attribute association networks are comprehensively considered, and the feature association network of the plurality of evaluation objects is generated based on the comprehensive association weight values between the nodes in the fusion association network, so that the feature association network of the plurality of evaluation objects is more accurately generated.
Optionally, the determining, according to the plurality of node sequences, the similarity of any two evaluation objects in the plurality of evaluation objects includes: inputting the plurality of node sequences into a preset word-vector model to generate an embedding vector of the fusion association network; and determining the similarity of any two evaluation objects in the plurality of evaluation objects according to the embedding vector of the fusion association network.
In the above manner, after the embedding vector of the fusion association network is generated, the similarity of any two evaluation objects in the plurality of evaluation objects can be determined according to the embedding vector of the fusion association network, and the embedding vector can more fully and finely characterize the fusion association network, so that the similarity of any two evaluation objects in the plurality of evaluation objects can be more accurately determined.
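As an illustration of the last step only, the sketch below compares evaluation objects by the cosine similarity of their embedding vectors. Cosine similarity is an assumption made here for illustration; the source does not name the similarity measure. The embedding values are hypothetical placeholders standing in for the output of a word-vector model trained on the node sequences, and the function names are not from the source.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarity(embeddings):
    """Similarity of every pair of evaluation objects, keyed by sorted pair."""
    return {
        (x, y): cosine_similarity(embeddings[x], embeddings[y])
        for x, y in combinations(sorted(embeddings), 2)
    }


# Hypothetical embedding vectors, one per evaluation object (node).
embeddings = {"A": [1.0, 0.0], "B": [1.0, 1.0], "C": [0.0, 2.0]}
similarities = pairwise_similarity(embeddings)
```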
In a second aspect, the present application provides an apparatus for determining similarity between objects, including: a generating module, configured to generate, for at least one dimension attribute of any one of a plurality of attribute groups, an attribute association network of a plurality of evaluation objects under the at least one dimension attribute; wherein each evaluation object has a uniquely mapped node in the attribute association network, and the side information between the nodes in the attribute association network characterizes the degree of association between the evaluation objects under the at least one dimension attribute; a fusion module, configured to fuse, for any two attribute association networks, the two attribute association networks to obtain a fusion association network; wherein each evaluation object has a uniquely mapped node in the fusion association network, and the side information between the nodes in the fusion association network characterizes the comprehensive degree of association between the evaluation objects; and a processing module, configured to randomly traverse the nodes of each fusion association network to obtain a plurality of node sequences of the fusion association network, and to determine, according to the plurality of node sequences, the similarity of any two evaluation objects among the plurality of evaluation objects under the fusion association network.
Optionally, the generating module is specifically configured to: determining side information of any two evaluation objects in the attribute association network according to attribute values of the two evaluation objects under the dimension attribute aiming at any two evaluation objects in the plurality of evaluation objects; and generating an attribute association network of the plurality of evaluation objects aiming at the dimension attribute according to the side information of the plurality of evaluation objects among the nodes corresponding to the attribute association network.
Optionally, the evaluation object is a region; the dimension attribute comprises a time sequence position attribute of a user in the area; the attribute values under the time sequence position attribute of the user in the area comprise: a user identification; the generating module is specifically configured to: and determining the side information of the two areas according to the user identifications in the two areas.
Optionally, the at least one dimension attribute includes a first type attribute dimension and a second type attribute dimension; the first type attribute dimension and the second type attribute dimension are attribute dimensions which are preset to be associated; the generating module is specifically configured to: determining side information between two corresponding nodes of the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the first type attribute dimension and the attribute value of the second evaluation object in the second type attribute dimension, or determining side information between two corresponding nodes of the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the second type attribute dimension and the attribute value of the second evaluation object in the first type attribute dimension; and generating an attribute association network of the plurality of evaluation objects aiming at the first type attribute dimension and the second type attribute dimension according to the side information of the plurality of evaluation objects between the corresponding nodes in the attribute association network.
Optionally, the processing module is specifically configured to: determining random walk probability among nodes in the fusion association network according to the side information among the nodes in the fusion association network; and randomly traversing the nodes of the fusion association network based on random walk probability among the nodes in the fusion association network to obtain the plurality of node sequences.
Optionally, the side information among the nodes in the attribute association network is an attribute association weight value among the nodes; the side information among the nodes in the fusion association network is a comprehensive association weight value among the nodes; the fusion module is specifically used for: for any two nodes in the fusion association network, determining the comprehensive association weight value of the two nodes in the fusion association network according to the attribute association weight values and the weighting coefficients of the two nodes in the two attribute association networks; and generating the characteristic association network of the plurality of evaluation objects based on the comprehensive association weight values among the nodes in the fusion association network.
Optionally, the fusion module is specifically configured to: input the plurality of node sequences into a preset word-vector model to generate an embedding vector of the fusion association network; and determine the similarity of any two evaluation objects in the plurality of evaluation objects according to the embedding vector of the fusion association network.
The advantages of the foregoing second aspect and the advantages of the foregoing optional apparatuses of the second aspect may refer to the advantages of the foregoing first aspect and the advantages of the foregoing optional methods of the first aspect, and will not be described herein.
In a third aspect, the present application provides a computer device comprising a program or instructions which, when executed, is operable to perform the above-described first aspect and the respective alternative methods of the first aspect.
In a fourth aspect, the present application provides a storage medium comprising a program or instructions which, when executed, is adapted to carry out the above-described first aspect and the respective alternative methods of the first aspect.
Drawings
FIG. 1 is a flowchart illustrating steps of a method for determining similarity between objects according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating steps of a method for determining similarity between objects according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a device for determining similarity between objects according to an embodiment of the present application.
Detailed Description
In order to better understand the above technical solutions, the following detailed description will be made with reference to the accompanying drawings and specific embodiments of the present application, and it should be understood that specific features in the embodiments and examples of the present application are detailed descriptions of the technical solutions of the present application, and not limiting the technical solutions of the present application, and the technical features in the embodiments and examples of the present application may be combined with each other without conflict.
In its daily operation, an institution may deal with a wide variety of objects. If the performance of one object under a decision can be inferred, that performance can guide the decision for another, similar evaluation object. How to judge the similarity between evaluation objects is therefore of considerable research value for institutional decision-making. However, there is currently no method for evaluating the similarity of different objects, which is a problem to be solved. To this end, as shown in FIG. 1, the present application provides a method for determining similarity between objects.
Step 101: for at least one dimension attribute of any one of a plurality of attribute groups, generate an attribute association network of a plurality of evaluation objects under the at least one dimension attribute.
Step 102: for any two attribute association networks, fuse the two attribute association networks to obtain a fusion association network.
Step 103: randomly traverse the nodes of each fusion association network to obtain a plurality of node sequences of the fusion association network.
Step 104: determine, according to the plurality of node sequences, the similarity of any two evaluation objects among the plurality of evaluation objects under the fusion association network.
In steps 101 to 104, each evaluation object has a uniquely mapped node in the attribute association network; the side information between nodes in the attribute association network characterizes the degree of association between the evaluation objects under the at least one dimension attribute. Each evaluation object has a uniquely mapped node in the fusion association network; the side information between the nodes in the fusion association network characterizes the comprehensive degree of association between the evaluation objects. It should be noted that the evaluation object may take various forms, such as a region or an organization; the method may be used for similarity evaluation between regions, and may also be applied to similarity evaluation between recommendation systems. The attribute association networks may include an attribute association network of user behavior and an attribute association network of non-user behavior; when both are included, the side information between the nodes in the fusion association network characterizes the comprehensive degree of association between the evaluation objects under the influence of user behavior.
In an alternative embodiment, step 101 may specifically be:
Step (1-1): for any two evaluation objects among the plurality of evaluation objects, determine the side information between the two nodes corresponding to the two evaluation objects in the attribute association network according to the attribute values of the two evaluation objects under the dimension attribute.
It should be noted that the dimension attribute may include one or more dimension attributes.
For example, the dimension attributes may be a business number dimension attribute and a dimension attribute for the number of users of a set business. The attribute values for area A and area B are then the business number of area A and the number of users of the set business in area A, and the business number of area B and the number of users of the set business in area B.
Step (1-2): generate the attribute association network of the plurality of evaluation objects for the dimension attribute according to the side information between the nodes corresponding to the plurality of evaluation objects in the attribute association network.
For example, the side information of the two evaluation objects in step (1-2) may take the concrete form of an association weight value.
In an alternative embodiment, the evaluation object is a region; the dimension attribute comprises a time sequence position attribute of a user in the area; the attribute values under the time sequence position attribute of the user in the area comprise: a user identification; the step (1-2) specifically comprises the following steps:
determining the side information of the two areas according to the user identifications in the two areas.
Specifically, the side information of the two areas may be determined according to the number of users having the same user identifier in the two areas.
In the above embodiment, in addition to the user identifiers in the two areas, the side information of the two areas may be determined according to the stay time differences of the users having the same user identifier in the two areas.
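The shared-user variant above can be sketched as follows. The sketch takes the side information between two areas to be the number of user identifications that appear in both, as the text describes; leaving out pairs with no shared users is an assumption made here, and the function name and sample data are hypothetical. The stay-time variant is not covered.

```python
from itertools import combinations


def shared_user_weights(users_by_area):
    """Edge weight for each pair of areas = number of shared user IDs.

    Pairs with no shared user are omitted (assumption: no edge).
    """
    weights = {}
    for area_1, area_2 in combinations(sorted(users_by_area), 2):
        shared = len(set(users_by_area[area_1]) & set(users_by_area[area_2]))
        if shared > 0:
            weights[(area_1, area_2)] = shared
    return weights


# Hypothetical user identifications observed in each area.
users_by_area = {
    "A": {"u1", "u2", "u3"},
    "B": {"u2", "u3", "u4"},
    "C": {"u5"},
}
weights = shared_user_weights(users_by_area)
```

In this example only areas A and B share users, so the resulting attribute association network has a single edge between their nodes.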
In another alternative embodiment, the at least one dimension attribute includes a first type attribute dimension and a second type attribute dimension; the first type attribute dimension and the second type attribute dimension are attribute dimensions which are preset to be associated; step 101 may specifically be:
Step (2-1): for a first evaluation object and a second evaluation object among the plurality of evaluation objects, determine the side information between the two nodes corresponding to the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the first type attribute dimension and the attribute value of the second evaluation object in the second type attribute dimension, or according to the attribute value of the first evaluation object in the second type attribute dimension and the attribute value of the second evaluation object in the first type attribute dimension.
For example, when the evaluation objects are areas, the first type attribute dimension in step (2-1) may be the proportion of merchants of a set service in the area, and the second type attribute dimension may be the proportion of users of the set service in the area. The side information between the two corresponding nodes in the attribute association network may then be determined according to the proportion of merchants of the set service in the first area and the proportion of users of the set service in the second area, or according to the proportion of users of the set service in the first area and the proportion of merchants of the set service in the second area.
Step (2-2): generate the attribute association network of the plurality of evaluation objects for the first type attribute dimension and the second type attribute dimension according to the side information between the nodes corresponding to the plurality of evaluation objects in the attribute association network.
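The cross-dimension pairing of step (2-1) can be illustrated with a small sketch. The source does not say how the two attribute values are combined into side information; the sketch assumes, purely for illustration, a weight of one minus the absolute difference between the two proportions, so that closer proportions give a stronger association. The field names `merchant_share` and `user_share`, the function name and the sample values are hypothetical.

```python
def cross_dimension_weight(first, second, first_uses_merchants=True):
    """Edge weight from attribute values in two associated dimensions.

    Assumed formula (not from the source): 1 - |x - y|, where x comes
    from the first area and y from the second area, in different dimensions.
    """
    if first_uses_merchants:
        x, y = first["merchant_share"], second["user_share"]
    else:
        x, y = first["user_share"], second["merchant_share"]
    return 1.0 - abs(x - y)


# Hypothetical proportions for two areas.
area_1 = {"merchant_share": 0.4, "user_share": 0.1}
area_2 = {"merchant_share": 0.5, "user_share": 0.3}

# Merchants of the first area against users of the second area.
w_forward = cross_dimension_weight(area_1, area_2, first_uses_merchants=True)
# Users of the first area against merchants of the second area.
w_reverse = cross_dimension_weight(area_1, area_2, first_uses_merchants=False)
```

The two calls correspond to the two alternatives listed in step (2-1); which one applies, and the actual combining function, are left open by the source.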
In an optional implementation manner, the side information between the nodes in the attribute association network is an attribute association weight value between the nodes; the side information among the nodes in the fusion association network is a comprehensive association weight value among the nodes; step 102 may specifically be:
for any two nodes in the fusion association network, determining the comprehensive association weight value of the two nodes in the fusion association network according to the attribute association weight values and the weighting coefficients of the two nodes in the two attribute association networks; and generating the characteristic association network of the plurality of evaluation objects based on the comprehensive association weight values among the nodes in the fusion association network.
For example, the two attribute association networks may be an attribute association network of any user behavior and an attribute association network of any non-user behavior, where the attribute association weight value of the user behavior network is a first weight value with a first weighting coefficient, and the attribute association weight value of the non-user behavior network is a second weight value with a second weighting coefficient. The above shows only the comprehensive association weight value obtained from two attribute association networks; in practice more attribute association networks can be fused together, so there may be multiple first weight values and multiple second weight values. For example, the comprehensive association weight value may be calculated as follows:
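$$w_{\text{comprehensive}} = w_{1\text{-}1} \cdot a_{1\text{-}1} + w_{1\text{-}2} \cdot a_{1\text{-}2} + w_{1\text{-}3} \cdot a_{1\text{-}3} + \dots + w_{2\text{-}1} \cdot a_{2\text{-}1} + w_{2\text{-}2} \cdot a_{2\text{-}2} + w_{2\text{-}3} \cdot a_{2\text{-}3} + \dots$$
Wherein $w_{1\text{-}x}$ represents a first weight value, $w_{2\text{-}x}$ represents a second weight value, $a_{1\text{-}x}$ represents a first weighting coefficient, and $a_{2\text{-}x}$ represents a second weighting coefficient.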
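The weighted sum above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names, the dictionary representation `{(u, v): weight}`, and the choice to treat a missing edge as weight 0 are all assumptions.

```python
def fuse_edge_weight(weights, coefficients):
    """Comprehensive weight of one node pair: sum of weight * coefficient."""
    if len(weights) != len(coefficients):
        raise ValueError("each weight needs exactly one weighting coefficient")
    return sum(w * a for w, a in zip(weights, coefficients))


def fuse_networks(networks, coefficients):
    """Fuse several {(u, v): weight} networks into one fused network.

    A node pair missing from a network contributes weight 0 (an assumption).
    """
    pairs = set().union(*(net.keys() for net in networks))
    return {
        pair: fuse_edge_weight([net.get(pair, 0.0) for net in networks], coefficients)
        for pair in pairs
    }


# Illustrative networks: one from user behavior, one from a region profile.
behavior = {("A", "B"): 3.0, ("A", "C"): 1.0}
profile = {("A", "B"): 2.0, ("B", "C"): 4.0}
fused = fuse_networks([behavior, profile], coefficients=[0.5, 0.5])
```

Passing more than two networks, each with its own coefficient, corresponds to the case with multiple first and second weight values.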
In an alternative embodiment, step 103 may specifically be:
Step (3-1): determine the random walk probabilities among the nodes in the fusion association network according to the side information among the nodes in the fusion association network.
It should be noted that, in step (3-1), the edge between two nodes may be two directed edges, such as the edge from node A to node B and the edge from node B to node A. The side information may be the weight values of the two directed edges. The random walk probability from one node to another can be determined according to the proportion of the weight values. For example, if node A has edges to nodes B, C and D with weight values 3, 4 and 5 respectively, then the random walk probability from node A to node B is 3/12 = 1/4, from node A to node C is 4/12 = 1/3, and from node A to node D is 5/12.
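The normalization in this example can be checked with exact fractions. The helper name `transition_probabilities` is hypothetical; the weights 3, 4 and 5 are the ones from the example, taken as the weights of the edges from node A to nodes B, C and D.

```python
from fractions import Fraction


def transition_probabilities(out_weights):
    """Divide each outgoing edge weight by the sum of all outgoing weights."""
    total = sum(out_weights.values())
    if total <= 0:
        raise ValueError("a node needs positive total outgoing weight")
    return {node: Fraction(weight, total) for node, weight in out_weights.items()}


# Outgoing integer weights of node A toward B, C and D.
probs = transition_probabilities({"B": 3, "C": 4, "D": 5})
```

Here `probs` maps B to 1/4, C to 1/3 and D to 5/12, and the three values sum to 1.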
Step (3-2): randomly traverse the nodes of the fusion association network based on the random walk probabilities among the nodes in the fusion association network to obtain the plurality of node sequences.
It should be noted that each fusion association network may be traversed multiple times to obtain multiple node sequences; for example, the node sequences of a fusion association network may be ABCDE, ABCEF, ACEF and ABEC. The similarity of two evaluation objects under the fusion association network can be determined from statistics over the node sequences; for example, the larger the proportion of sequences in which the two nodes appear consecutively, the higher the similarity.
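One way to picture the traversal is a weighted random walk in which the next node is drawn with probability proportional to the outgoing edge weight. The sketch below is an assumption-laden illustration: the adjacency format `{node: {neighbor: weight}}`, the fixed walk length, the number of walks per node, and the seed are not specified in the source.

```python
import random


def random_walk(graph, start, length, rng):
    """Walk up to `length` nodes, choosing neighbors in proportion to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1], {})
        if not neighbors:
            break  # dead end: the current node has no outgoing edge
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def generate_sequences(graph, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node of the fused network."""
    rng = random.Random(seed)
    return [
        random_walk(graph, node, length, rng)
        for node in graph
        for _ in range(walks_per_node)
    ]


# Illustrative fused network; node A uses the weights 3, 4, 5 from the example.
graph = {
    "A": {"B": 3, "C": 4, "D": 5},
    "B": {"A": 1},
    "C": {"A": 1},
    "D": {"A": 1},
}
sequences = generate_sequences(graph, walks_per_node=2, length=5, seed=42)
```

Each element of `sequences` is a node sequence that can be counted directly or passed to a word vector model.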
In an alternative embodiment, step 104 may specifically be:
inputting the plurality of node sequences into a correlation model of a preset word vector to generate an embedded vector of the fusion correlation network; and determining the similarity of any two evaluation objects in the plurality of evaluation objects according to the embedded vector of the fusion association network.
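A word vector model of the Skip-Gram kind, which the overview below names, is trained on (center, context) pairs taken from a sliding window over each sequence. The sketch below shows only that pair-extraction stage, not the training itself; the function name and the window size are assumptions.

```python
def skipgram_pairs(sequence, window):
    """Return (center, context) pairs for every node and its window neighbors."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs


# One node sequence from the text, treated like a sentence of region "words".
pairs = skipgram_pairs(list("ABCDE"), window=1)
```

Regions that often share a window produce many common pairs, which is what pulls their embedded vectors together during training.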
The following describes the method for determining similarity between objects according to the present application in further detail by way of example, by combining the descriptions of steps 101 to 104 of the present application. Specifically, taking an example in which the evaluation object is a region, the procedure is summarized as:
The geographic space is divided into different areas by a spatial data processing method. A plurality of attribute association networks are constructed from users' GPS spatio-temporal data and regional feature attribute data, node sequences are generated by a probabilistic random walk, and finally a plurality of embedded vectors of the regional nodes are obtained with Skip-Gram and the similarity between regions is calculated. In this way, regions not yet expanded into that are similar to the currently expanded regions can be mined. The method takes into account both the closeness of association between areas and the similarity of their attribute portrait features, so as to mine areas that are more similar to the current area in terms of network structure and attribute information. The specific steps can be as follows:
(1) Form a spatio-temporal association network $G_{gps}$ between areas from users' GPS spatio-temporal migration data.
(2) Extract portrait features of the areas and construct a feature set $V = (v_1, v_2, v_3, \dots, v_n)$, such as the user age distribution of the area, the business distribution of the area, the user consumption level of the area, etc. Establish an edge between areas with similar attributes to construct the attribute association networks $G_{v1}, G_{v2}, \dots, G_{vn}$ between areas.
(3) Form fusion association networks by fusing the attribute association networks, and generate regional node sequences by probabilistic random walk.
(4) Generate, through a Skip-Gram model, a plurality of embedded vectors of the regions and the similarity among the regional nodes under the different fusion association networks.
(5) Comprehensively evaluate the similarity between areas by a weighted average method.
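Steps (4) and (5) can be sketched as computing a similarity per fusion network and then averaging. The text does not say how similarity is derived from the embedded vectors or how the averaging weights are chosen, so cosine similarity, the network names, and the weights below are assumptions for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_similarity(embeddings_by_network, weights, region_a, region_b):
    """Weighted average of the per-network similarities of two regions."""
    total = sum(weights.values())
    return sum(
        weights[name] * cosine_similarity(emb[region_a], emb[region_b])
        for name, emb in embeddings_by_network.items()
    ) / total


# Hypothetical embedded vectors of regions A and B under two fusion networks.
embeddings_by_network = {
    "gps+age": {"A": [1.0, 0.0], "B": [1.0, 0.0]},
    "gps+merchant": {"A": [1.0, 0.0], "B": [0.0, 1.0]},
}
weights = {"gps+age": 3.0, "gps+merchant": 1.0}
score = combined_similarity(embeddings_by_network, weights, "A", "B")
```

With these values the two regions are identical under the first network and orthogonal under the second, so the weighted average is (3 * 1 + 1 * 0) / 4 = 0.75.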
More specifically, as shown in fig. 2, taking an attribute association network of user behavior and an attribute association network of non-user behavior as an example, the process of steps 101 to 104 is as follows:
(1) Attribute association network generation of user behavior:
Users generate GPS position data while using the APP. Denote the position of a user at a certain time $t_1$ as $G_{t1}$, the position of the user at the next time as $G_{t2}$, and so on. An edge between positions $G_{t1}$ and $G_{t2}$ is formed through the behavior relationship of the user. Similarly, an attribute association network based on user behavior can be constructed among different areas from the position data of a plurality of users. The strength of the connection between two areas is determined by the number of users generating the edge and the time difference. Thus, the regional user behavior relationship network is a weighted directed association network.
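A minimal sketch of this construction follows. The text states only that edge strength depends on the number of users and the time difference; the decay rule `1 / (1 + hours)` used here is a hypothetical choice, as are the function name and the track format.

```python
from collections import defaultdict


def build_behavior_network(tracks):
    """Build a weighted directed network {(src, dst): weight} from user tracks.

    tracks: {user_id: [(timestamp_seconds, region), ...]} sorted by time.
    Every move between two different regions adds to the directed edge;
    faster moves add more (hypothetical decay, an assumption).
    """
    edges = defaultdict(float)
    for points in tracks.values():
        for (t1, r1), (t2, r2) in zip(points, points[1:]):
            if r1 == r2:
                continue  # staying in the same region creates no edge
            hours = (t2 - t1) / 3600.0
            edges[(r1, r2)] += 1.0 / (1.0 + hours)
    return dict(edges)


# Two illustrative users; timestamps are in seconds.
tracks = {
    "u1": [(0, "A"), (3600, "B")],
    "u2": [(0, "A"), (10800, "B"), (14400, "A")],
}
behavior_network = build_behavior_network(tracks)
```

The edge from A to B accumulates contributions from both users (0.5 + 0.25), while the edge from B to A comes only from the second user, which is why the network is directed.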
(2) Attribute association network generation of non-user behavior:
Areas without user GPS coverage cannot be added to the attribute association network of user behavior, yet such areas may also be high-potential expansion areas. For this reason, attribute association networks of non-user behavior can be designed based on regional portrait features.
The attribute association network of non-user behavior is constructed as follows. The portrait features of the areas are classified, including regional people flow density, regional business industry distribution, regional user age distribution, regional user consumption level, consumption preference of regional users and the like, and more regional portrait information can be added. The similarity of each feature between areas is calculated, and an edge is established between areas with similar features. If the merchants in area A and area B are mostly dining merchants, then when the association network is generated with merchant type distribution as the feature, an edge can be established between area A and area B. If the user ages in area C and area D are both concentrated in the range 25-40, then when the association network is generated with user age distribution as the feature, an edge can be established between area C and area D; other features are handled in the same way.
By dividing the non-user attributes in this way, a plurality of attribute association networks of non-user behavior can be generated.
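The idea of linking regions with similar feature distributions can be sketched as below. The source does not name a similarity measure or a cut-off, so cosine similarity over distribution dictionaries and the threshold 0.9 are assumptions, and the merchant mixes are invented for illustration.

```python
import math
from itertools import combinations


def cosine(p, q):
    """Cosine similarity of two sparse distributions given as dicts."""
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm = math.sqrt(sum(x * x for x in p.values())) * math.sqrt(
        sum(x * x for x in q.values())
    )
    return dot / norm if norm else 0.0


def build_profile_network(distributions, threshold=0.9):
    """Connect every pair of regions whose distributions are similar enough."""
    edges = {}
    for a, b in combinations(sorted(distributions), 2):
        similarity = cosine(distributions[a], distributions[b])
        if similarity >= threshold:
            edges[(a, b)] = similarity
    return edges


# Regions A and B are mostly dining merchants; region C is mostly retail.
merchant_mix = {
    "A": {"dining": 0.9, "retail": 0.1},
    "B": {"dining": 0.8, "retail": 0.2},
    "C": {"dining": 0.1, "retail": 0.9},
}
profile_network = build_profile_network(merchant_mix)
```

Only the pair (A, B) passes the threshold here. Running the same function on another feature, such as age distribution, would yield a separate attribute association network.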
(3) Fusion association network generation:
By fusing the attribute association networks of user behavior with the attribute association networks of non-user behavior, multiple fusion association networks may be generated.
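To make "multiple fusion association networks" concrete, the sketch below pairs one user behavior network with each non-user behavior network in turn. The equal coefficients, the naming scheme `behavior+<name>`, and the treatment of a missing edge as weight 0 are assumptions.

```python
def fuse_pair(net_a, net_b, coef_a=0.5, coef_b=0.5):
    """Weighted sum of two {(u, v): weight} networks; missing edges count as 0."""
    pairs = set(net_a) | set(net_b)
    return {
        pair: coef_a * net_a.get(pair, 0.0) + coef_b * net_b.get(pair, 0.0)
        for pair in pairs
    }


def build_fused_networks(behavior_net, profile_nets):
    """Produce one fused network per non-user behavior network."""
    return {
        f"behavior+{name}": fuse_pair(behavior_net, net)
        for name, net in profile_nets.items()
    }


# Illustrative inputs: one behavior network and two profile networks.
behavior_net = {("A", "B"): 2.0}
profile_nets = {
    "age": {("A", "B"): 1.0},
    "merchant": {("B", "C"): 4.0},
}
fused_networks = build_fused_networks(behavior_net, profile_nets)
```

Each entry of `fused_networks` can then be walked and embedded independently, giving one similarity per fusion network.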
(5) Regional node similarity calculation:
For the different fusion association networks, the embedded vectors of the regional nodes under each fusion association network are calculated in the manner described above, and the similarity among the nodes under each fusion association network is obtained.
As shown in fig. 3, the present application provides a device for determining similarity between objects, including: a generating module 301, configured to generate, for at least one dimension attribute of any one of a plurality of attribute groups, an attribute association network of a plurality of evaluation objects under the at least one dimension attribute; wherein each evaluation object has a uniquely mapped node in the attribute association network, and the side information among the nodes in the attribute association network characterizes the association degree among the evaluation objects under the at least one dimension attribute; a fusion module 302, configured to fuse any two attribute association networks to obtain a fusion association network; wherein each evaluation object has a uniquely mapped node in the fusion association network, and the side information among the nodes in the fusion association network characterizes the comprehensive association degree among the evaluation objects; and a processing module 303, configured to randomly traverse the nodes of each fusion association network to obtain a plurality of node sequences of the fusion association network, and to determine, according to the plurality of node sequences, the similarity of any two evaluation objects in the plurality of evaluation objects under the fusion association network.
Optionally, the generating module 301 is specifically configured to: determining side information of any two evaluation objects in the attribute association network according to attribute values of the two evaluation objects under the dimension attribute aiming at any two evaluation objects in the plurality of evaluation objects; and generating an attribute association network of the plurality of evaluation objects aiming at the dimension attribute according to the side information of the plurality of evaluation objects among the nodes corresponding to the attribute association network.
Optionally, the evaluation object is a region; the dimension attribute comprises a time sequence position attribute of a user in the area; the attribute values under the time sequence position attribute of the user in the area comprise: a user identification; the generating module 301 is specifically configured to: and determining the side information of the two areas according to the user identifications in the two areas.
Optionally, the at least one dimension attribute includes a first type attribute dimension and a second type attribute dimension; the first type attribute dimension and the second type attribute dimension are attribute dimensions which are preset to be associated; the generating module 301 is specifically configured to: determining side information between two corresponding nodes of the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the first type attribute dimension and the attribute value of the second evaluation object in the second type attribute dimension, or determining side information between two corresponding nodes of the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the second type attribute dimension and the attribute value of the second evaluation object in the first type attribute dimension; and generating an attribute association network of the plurality of evaluation objects aiming at the first type attribute dimension and the second type attribute dimension according to the side information of the plurality of evaluation objects between the corresponding nodes in the attribute association network.
Optionally, the processing module 303 is specifically configured to: determining random walk probability among nodes in the fusion association network according to the side information among the nodes in the fusion association network; and randomly traversing the nodes of the fusion association network based on random walk probability among the nodes in the fusion association network to obtain the plurality of node sequences.
Optionally, the side information among the nodes in the attribute association network is an attribute association weight value among the nodes; the side information among the nodes in the fusion association network is a comprehensive association weight value among the nodes; the fusion module 302 is specifically configured to: for any two nodes in the fusion association network, determining the comprehensive association weight value of the two nodes in the fusion association network according to the attribute association weight values and the weighting coefficients of the two nodes in the two attribute association networks; and generating the characteristic association network of the plurality of evaluation objects based on the comprehensive association weight values among the nodes in the fusion association network.
Optionally, the fusion module 302 is specifically configured to: inputting the plurality of node sequences into a correlation model of a preset word vector to generate an embedded vector of the fusion correlation network; and determining the similarity of any two evaluation objects in the plurality of evaluation objects according to the embedded vector of the fusion association network.
The embodiment of the application provides a computer device, which comprises a program or an instruction, and the program or the instruction is used for executing the method for determining the similarity between objects and any optional method provided by the embodiment of the application when being executed.
The embodiment of the application provides a computer readable storage medium, which comprises a program or an instruction, and when the program or the instruction is executed, the program or the instruction is used for executing the method for determining the similarity between objects and any optional method provided by the embodiment of the application.
Finally, it should be noted that: it will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A method for determining similarity between objects, comprising:
generating an attribute association network of a plurality of evaluation objects under at least one dimension attribute of any one of a plurality of attribute groups for the at least one dimension attribute of the evaluation object; wherein the evaluation object is a region; the dimension attribute comprises time sequence position attribute of the user in the area, set proportion of business merchants and set proportion of the user in the business; the attribute values under the time sequence position attribute of the user in the area comprise: a user identification; each evaluation object has a uniquely mapped node in the attribute association network; the side information among the nodes in the attribute association network characterizes the association degree among the evaluation objects under the at least one dimension attribute;
fusing any two attribute association networks aiming at the two attribute association networks to obtain a fused association network; each evaluation object has a uniquely mapped node in the fusion association network; the side information among the nodes in the fusion association network characterizes the comprehensive association degree among the evaluation objects;
randomly traversing the nodes of each fusion association network to obtain a plurality of node sequences of the fusion association network;
inputting the plurality of node sequences into a correlation model of a preset word vector to generate an embedded vector of the fusion correlation network; and determining the similarity of any two evaluation objects in the plurality of evaluation objects under the fusion association network according to the embedded vector of the fusion association network.
2. The method of claim 1, wherein the generating, for at least one dimension attribute of an evaluation object, an attribute association network for a plurality of evaluation objects under the at least one dimension attribute comprises:
determining side information of any two evaluation objects in the attribute association network according to attribute values of the two evaluation objects under the dimension attribute aiming at any two evaluation objects in the plurality of evaluation objects;
and generating an attribute association network of the plurality of evaluation objects aiming at the dimension attribute according to the side information of the plurality of evaluation objects among the nodes corresponding to the attribute association network.
3. The method of claim 2, wherein determining the side information of the two evaluation objects between the corresponding two nodes in the attribute-related network based on the attribute values of the two evaluation objects under the dimension attribute comprises:
and determining the side information of the two areas according to the user identifications in the two areas.
4. The method of claim 1, wherein the at least one dimension attribute comprises a first type attribute dimension and a second type attribute dimension; the first type attribute dimension and the second type attribute dimension are attribute dimensions which are preset to be associated; the generating, for at least one dimension attribute of the evaluation object, an attribute association network of a plurality of evaluation objects under the at least one dimension attribute includes:
determining side information between two corresponding nodes of the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the first type attribute dimension and the attribute value of the second evaluation object in the second type attribute dimension, or determining side information between two corresponding nodes of the two evaluation objects in the attribute association network according to the attribute value of the first evaluation object in the second type attribute dimension and the attribute value of the second evaluation object in the first type attribute dimension;
and generating an attribute association network of the plurality of evaluation objects aiming at the first type attribute dimension and the second type attribute dimension according to the side information of the plurality of evaluation objects between the corresponding nodes in the attribute association network.
5. The method according to any one of claims 1 to 4, wherein the randomly traversing the nodes of each fusion association network to obtain a plurality of node sequences comprises:
determining random walk probability among nodes in the fusion association network according to the side information among the nodes in the fusion association network;
and randomly traversing the nodes of the fusion association network based on random walk probability among the nodes in the fusion association network to obtain the plurality of node sequences.
6. The method according to any one of claims 1 to 4, wherein the side information between nodes in the attribute-related network is an attribute-related weight value between the nodes; the side information among the nodes in the fusion association network is a comprehensive association weight value among the nodes; the fusing the two attribute association networks to obtain a fused association network comprises the following steps:
for any two nodes in the fusion association network, determining the comprehensive association weight value of the two nodes in the fusion association network according to the attribute association weight values and the weighting coefficients of the two nodes in the two attribute association networks;
and generating the characteristic association network of the plurality of evaluation objects based on the comprehensive association weight values among the nodes in the fusion association network.
7. An apparatus for determining similarity between objects, comprising:
a generating module, configured to generate, for at least one dimension attribute of any one of a plurality of attribute groups, an attribute association network of a plurality of evaluation objects under the at least one dimension attribute for the at least one dimension attribute of the evaluation object; wherein the evaluation object is a region; the dimension attribute comprises time sequence position attribute of the user in the area, set proportion of business merchants and set proportion of the user in the business; the attribute values under the time sequence position attribute of the user in the area comprise: a user identification; each evaluation object has a uniquely mapped node in the attribute association network; the side information among the nodes in the attribute association network characterizes the association degree among the evaluation objects under the at least one dimension attribute;
the fusion module is used for fusing any two attribute association networks aiming at any two attribute association networks to obtain a fusion association network; each evaluation object has a uniquely mapped node in the fusion association network; the side information among the nodes in the fusion association network characterizes the comprehensive association degree among the evaluation objects;
the processing module is used for randomly traversing the nodes of each fusion association network to obtain a plurality of node sequences of the fusion association network; inputting the plurality of node sequences into a correlation model of a preset word vector to generate an embedded vector of the fusion correlation network; and determining the similarity of any two evaluation objects in the plurality of evaluation objects under the fusion association network according to the embedded vector of the fusion association network.
8. A computer device comprising a program or instructions which, when executed, performs the method of any of claims 1 to 6.
9. A computer readable storage medium comprising a program or instructions which, when executed, performs the method of any of claims 1 to 6.
CN202010896318.8A 2020-08-31 2020-08-31 A method and device for determining similarity between objects Active CN112016836B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202010896318.8A CN112016836B (en) 2020-08-31 2020-08-31 A method and device for determining similarity between objects
PCT/CN2020/139531 WO2022041600A1 (en) 2020-08-31 2020-12-25 Inter-object similarity determination method and apparatus
KR1020237009620A KR102901463B1 (en) 2020-08-31 2020-12-25 Method and device for determining similarity between objects
TW110100218A TWI842973B (en) 2020-08-31 2021-01-05 A method and device for determining similarity between objects

Publications (2)

Publication Number Publication Date
CN112016836A CN112016836A (en) 2020-12-01
CN112016836B true CN112016836B (en) 2023-11-03

Family

ID=73503484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010896318.8A Active CN112016836B (en) 2020-08-31 2020-08-31 A method and device for determining similarity between objects

Country Status (4)

Country Link
KR (1) KR102901463B1 (en)
CN (1) CN112016836B (en)
TW (1) TWI842973B (en)
WO (1) WO2022041600A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112016836B (en) * 2020-08-31 2023-11-03 中国银联股份有限公司 A method and device for determining similarity between objects
CN113362158B (en) * 2021-05-31 2024-06-11 中国银联股份有限公司 A credit assessment method, device and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8874616B1 (en) * 2011-07-11 2014-10-28 21Ct, Inc. Method and apparatus for fusion of multi-modal interaction data
GB201523224D0 (en) * 2015-12-31 2016-02-17 Murphy Dominic F Defining edges and their weights between nodes in a network
CN108132927A (en) * 2017-12-07 2018-06-08 西北师范大学 A kind of fusion graph structure and the associated keyword extracting method of node
US10129276B1 (en) * 2016-03-29 2018-11-13 EMC IP Holding Company LLC Methods and apparatus for identifying suspicious domains using common user clustering
CN110659799A (en) * 2019-08-14 2020-01-07 深圳壹账通智能科技有限公司 Attribute information processing method and device based on relational network, computer equipment and storage medium
CN110968701A (en) * 2019-11-05 2020-04-07 量子数聚(北京)科技有限公司 Relationship map establishing method, device and equipment for graph neural network

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013126144A2 (en) * 2012-02-20 2013-08-29 Aptima, Inc. Systems and methods for network pattern matching
US9686091B2 (en) * 2013-02-01 2017-06-20 Harman International Industries, Incorporated Network address management and functional object discovery system
US9946800B2 (en) * 2015-07-06 2018-04-17 International Business Machines Corporation Ranking related objects using blink model based relation strength determinations
CN105760503B (en) * 2016-02-23 2019-02-05 清华大学 A Fast Method to Calculate the Similarity of Graph Nodes
US9836183B1 (en) * 2016-09-14 2017-12-05 Quid, Inc. Summarized network graph for semantic similarity graphs of large corpora
CN109712678B (en) * 2018-12-12 2020-03-06 中国人民解放军军事科学院军事医学研究院 Relationship prediction method and device and electronic equipment
CN111523918B (en) * 2019-02-02 2023-09-19 北京极智嘉科技股份有限公司 Methods, devices, equipment and storage media for commodity clustering
CN110046301B (en) * 2019-01-24 2023-07-14 创新先进技术有限公司 Object recommendation method and device
CN112016836B (en) * 2020-08-31 2023-11-03 中国银联股份有限公司 A method and device for determining similarity between objects

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
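Link prediction method based on the SkipGram model; Zhao Chao; Zhu Fuxi; Liu Shichao; Computer Applications and Software (No. 10) *
Hypernetwork link prediction method based on spatio-temporal relations in location-based social networks; Hu Min; Chen Yuanhui; Huang Hongcheng; Journal of Computer Applications (No. 06) *
Hu Min; Chen Yuanhui; Huang Hongcheng. Hypernetwork link prediction method based on spatio-temporal relations in location-based social networks. Journal of Computer Applications. 2018, (No. 06), pp. 1682-1697. *
Zhao Chao; Zhu Fuxi; Liu Shichao. Link prediction method based on the SkipGram model. Computer Applications and Software. 2017, (No. 10), pp. 241-247. *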

Also Published As

Publication number Publication date
CN112016836A (en) 2020-12-01
TW202211052A (en) 2022-03-16
KR102901463B1 (en) 2025-12-16
TWI842973B (en) 2024-05-21
WO2022041600A1 (en) 2022-03-03
KR20230054438A (en) 2023-04-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40032061; Country of ref document: HK)
GR01 Patent grant