CN117194728A - A business data degree distribution analysis method and device based on graph theory - Google Patents

Info

Publication number
CN117194728A
Authority
CN
China
Prior art keywords
degree distribution
business data
data
network
degree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311156530.0A
Other languages
Chinese (zh)
Inventor
郭文鑫
李世明
胡亚平
赵瑞锋
谭慧娟
祝锦舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Electric Power Dispatch Control Center of Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Electric Power Dispatch Control Center of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd and Electric Power Dispatch Control Center of Guangdong Power Grid Co Ltd
Priority to CN202311156530.0A
Publication of CN117194728A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a graph-theory-based method and device for analyzing the degree distribution of business data. The method comprises: acquiring business data to be analyzed and converting it into a network graph representation to obtain a business data network graph; calculating the degree of each node from the business data network graph and drawing a business data degree distribution graph from those degrees; obtaining network structure information from the business data degree distribution graph and performing parameter fitting on that graph according to the network structure information; and correcting missing edges in the fitted degree distribution graph to obtain an analysis result of the business data degree distribution, which is then presented visually to complete the analysis. The invention addresses the technical problems in the prior art that degree distribution analysis of business data networks has low accuracy and is difficult to analyze and display.

Description

A business data degree distribution analysis method and device based on graph theory

Technical field

The present invention relates to the technical field of graph-theoretic data processing, and in particular to a business data degree distribution analysis method and device based on graph theory.

Background

Degree distribution is a concept found in both graph theory and network theory. A graph is an ordered pair of two sets: the first generally consists of a finite number of elements, called the vertices of the graph, and the second is likewise a finite set, each of whose elements is a non-negative integer. The mathematical framework of network theory is built on graph theory, so a network in network theory is simply a graph in graph theory.

At present, when business data is analyzed and processed, graph theory is applied and the degree distribution of the resulting graph is analyzed. That is, degree distribution analysis is used to characterize the similarity or correlation between business data items, which in turn helps to uncover business problems, and the relationship between those problems and the acquired data is revealed through data analysis and mining. The usual approach is to count node connections directly: the degree of a node is defined as the total number of connections attached to that node, and the degree distribution of a business data network is the probability or frequency distribution of the node degrees in the network. This approach, however, makes it inconvenient to analyze and display the characteristics of the degree distribution and difficult to verify and correct the degree statistics. As a result, degree distribution analysis of business data networks has low accuracy and is hard to use in practical analysis applications.

Therefore, there is an urgent need for a method that improves both the accuracy of degree distribution analysis of business data and the convenience of displaying it.

Summary of the invention

The present invention provides a business data degree distribution analysis method and device based on graph theory, to solve the technical problems in the prior art that degree distribution analysis of business data networks has low accuracy and is difficult to analyze and display.

To solve the above technical problems, an embodiment of the present invention provides a business data degree distribution analysis method based on graph theory, comprising:

acquiring business data to be analyzed, and converting the business data into a network graph representation, thereby obtaining a business data network graph;

calculating the degree of each node from the business data network graph, and drawing a business data degree distribution graph from the degree of each node;

obtaining network structure information from the business data degree distribution graph, and performing parameter fitting on the business data degree distribution graph according to the network structure information;

correcting missing edges in the fitted degree distribution graph to obtain an analysis result of the business data degree distribution, and presenting the analysis result visually, thereby completing the analysis of the business data degree distribution.

Preferably, the business data includes social network data, Internet routing relationship data, network behavior data, transaction behavior data and trajectory synchronization data.

Preferably, calculating the degree of each node from the business data network graph and drawing the business data degree distribution graph from the degree of each node specifically comprises:

calculating the degree of each node from each node in the business data network graph and the total number of edges connected to that node;

plotting the degree values on the horizontal axis and the corresponding number or proportion of nodes on the vertical axis, thereby drawing the business data degree distribution graph.

Preferably, obtaining the network structure information from the business data degree distribution graph and performing parameter fitting on the business data degree distribution graph according to the network structure information specifically comprises:

analyzing the business data degree distribution graph to obtain its network structure information, the network structure information being one of a power-law distribution, a Poisson distribution or a Gaussian distribution;

fitting the degree distribution parameters of the business data degree distribution graph, according to the type of network structure information, by least squares fitting, a linear mixed model, ridge regression, singular value decomposition or maximum likelihood estimation;

verifying the fitted business data degree distribution graph by hypothesis testing, thereby completing the parameter fitting of the business data degree distribution graph.

Preferably, correcting missing edges in the fitted degree distribution graph to obtain the analysis result of the business data degree distribution specifically comprises:

verifying the edges of each node in the business data network graph against the fitted degree distribution graph, and using the missing edges found by this verification to correct the fitted degree distribution graph, the corrected degree distribution graph serving as the analysis result of the business data degree distribution.

Preferably, presenting the analysis result of the business data degree distribution visually specifically comprises:

presenting the analysis result of the business data degree distribution visually by one or more of a histogram, a network graph, a fitted curve plot, a scatter plot, a box plot or a bar chart, so that the information of each node of the business data can be displayed.

Correspondingly, the present invention also provides a business data degree distribution analysis device based on graph theory, comprising an acquisition module, a drawing module, a fitting module and a visualization module;

the acquisition module is configured to acquire business data to be analyzed and convert the business data into a network graph representation, thereby obtaining a business data network graph;

the drawing module is configured to calculate the degree of each node from the business data network graph and draw a business data degree distribution graph from the degree of each node;

the fitting module is configured to obtain network structure information from the business data degree distribution graph and perform parameter fitting on the business data degree distribution graph according to the network structure information;

the visualization module is configured to correct missing edges in the fitted degree distribution graph to obtain an analysis result of the business data degree distribution, and to present the analysis result visually, thereby completing the analysis of the business data degree distribution.

Preferably, the business data includes social network data, Internet routing relationship data, network behavior data, transaction behavior data and trajectory synchronization data.

Correspondingly, the present invention also provides a terminal device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the business data degree distribution analysis method based on graph theory according to any one of the above.

Correspondingly, the present invention also provides a computer-readable storage medium comprising a stored computer program, wherein the computer program, when run, controls the device on which the computer-readable storage medium resides to execute the business data degree distribution analysis method based on graph theory according to any one of the above.

Compared with the prior art, the embodiments of the present invention have the following beneficial effects:

In the technical solution of the present invention, the business data to be analyzed is acquired and converted into a network graph representation to obtain a business data network graph, the degree of each node is calculated, and the business data degree distribution graph is drawn. Parameter fitting is then performed on the degree distribution graph using the network structure information derived from it, so as to obtain the best distribution fit and improve the accuracy of degree distribution analysis of the business data network. Missing edges in the fitted degree distribution graph are corrected, so that the characteristics of the degree distribution can be analyzed and omissions can be checked and corrected. Finally, the analysis result of the business data degree distribution is presented visually, so that the characteristics of the degree distribution can be observed across different types of visualization, derived, and explained, giving an intuitive understanding of the degree distribution.

Description of the drawings

Figure 1 is a flow chart of the steps of a business data degree distribution analysis method based on graph theory provided by an embodiment of the present invention;

Figure 2 is a structural diagram of a business data degree distribution analysis device based on graph theory provided by an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

Embodiment 1

Referring to Figure 1, an embodiment of the present invention provides a business data degree distribution analysis method based on graph theory, comprising the following steps S101-S104:

Step S101: Acquire the business data to be analyzed, and convert the business data into a network graph representation, thereby obtaining a business data network graph.

As a preferred solution of this embodiment, the business data includes social network data, Internet routing relationship data, network behavior data, transaction behavior data and trajectory synchronization data.

It should be noted that the business data to be analyzed includes social network data, Internet routing relationships, network behavior data, transaction behavior data and trajectory peer data, covering social network relationships, network topology, user behavior, transaction activity and other aspects. The data conversion process mainly involves cleaning, filtering and transforming the raw transaction behavior data for subsequent analysis and mining. For example, when processing a transaction record, key fields such as transaction time, transaction amount and transaction type can be extracted from the raw data; the data is then cleaned to remove outliers or erroneous entries, after which the transaction amount can be normalized into a relative proportion or indicator for subsequent statistics and analysis. In network analysis, degree usually indicates how strongly a node is connected or correlated.

In this embodiment, degree can be used to indicate how closely problems are connected, that is, the similarity or correlation between different problems. Analyzing the degrees between problems helps to discover the connections and correlations between them, so that problems can be classified, clustered, or used to recommend related problems. The relationship between problems and the acquired business data is revealed through data analysis and mining: by processing and analyzing the business data, the associations and correlations between problems and data can be found. For example, in network behavior data, a user's click and conversion behavior can be analyzed to infer the user's degree of interest in, or inclination toward, different problems, and relevant questions or suggestions can be offered to the user on that basis. By analyzing the degree distribution, the embodiment of the present invention obtains a clear and accurate analysis result, so that the administrator can obtain the corresponding business data information and formulate solutions or suggestions. This improves the administrator's experience, reduces the time and cost of the related work, and makes the work more convenient.

It should be noted that trajectory peer data is important user data in big data mining: it is obtained by sampling the movement of one or more mobile users and includes the sampling point location, sampling time, speed and so on.
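The cleaning and conversion in step S101 can be illustrated with a minimal sketch. The record layout (`payer`, `payee`, `amount`, `time`, `type`), the cleaning rules and the choice of an undirected graph are all assumptions made for illustration; the source does not specify them.

```python
from collections import defaultdict

# Hypothetical raw transaction records; the field names are assumptions.
RAW_TRANSACTIONS = [
    {"payer": "A", "payee": "B", "amount": 120.0, "time": "2023-01-01T10:00", "type": "transfer"},
    {"payer": "A", "payee": "C", "amount": 80.0, "time": "2023-01-01T11:00", "type": "payment"},
    {"payer": "B", "payee": "C", "amount": -5.0, "time": "2023-01-02T09:30", "type": "payment"},   # invalid amount
    {"payer": "C", "payee": "D", "amount": 40.0, "time": "2023-01-03T14:15", "type": "transfer"},
    {"payer": "D", "payee": None, "amount": 10.0, "time": "2023-01-03T15:00", "type": "payment"},  # missing party
]


def clean(records):
    """Keep only records with both parties present and a positive amount."""
    return [
        r for r in records
        if r.get("payer") and r.get("payee") and r.get("amount", 0) > 0
    ]


def normalise_amounts(records):
    """Attach each amount's share of the total as a relative proportion."""
    total = sum(r["amount"] for r in records)
    return [{**r, "share": r["amount"] / total} for r in records]


def build_graph(records):
    """Build an undirected adjacency map: node -> set of neighbouring nodes."""
    adjacency = defaultdict(set)
    for r in records:
        u, v = r["payer"], r["payee"]
        if u != v:  # self-loops are ignored in this sketch
            adjacency[u].add(v)
            adjacency[v].add(u)
    return dict(adjacency)


records = normalise_amounts(clean(RAW_TRANSACTIONS))
graph = build_graph(records)
print(graph)
```

Each party becomes a node and each valid transaction an edge; other business data types would need their own mapping from records to nodes and edges.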

Step S102: Calculate the degree of each node from the business data network graph, and draw a business data degree distribution graph from the degree of each node.

As a preferred solution of this embodiment, calculating the degree of each node from the business data network graph and drawing the business data degree distribution graph from the degree of each node specifically comprises:

The degree of each node is calculated from each node in the business data network graph and the total number of edges connected to that node. The degree values are then plotted on the horizontal axis and the corresponding number or proportion of nodes on the vertical axis, thereby drawing the business data degree distribution graph.

In this embodiment, the degree of a node is generally defined as the total number of connections attached to that node, and a formula for node degree makes this straightforward to calculate. The degree distribution of the network is the probability or frequency distribution of the degrees of its nodes.
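As a minimal sketch of step S102, the degree of each node and the resulting distribution can be computed as follows. The adjacency-map representation and the sample graph are assumptions made for illustration.

```python
from collections import Counter

# Small illustrative undirected graph (hypothetical data): node -> neighbours.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"A", "C"},
}


def node_degrees(adjacency):
    """Degree of a node = number of edges attached to it."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def degree_distribution(degrees):
    """Return (counts, proportions) keyed by degree value, sorted by degree."""
    counts = Counter(degrees.values())
    n = len(degrees)
    ordered = sorted(counts)
    return (
        {k: counts[k] for k in ordered},
        {k: counts[k] / n for k in ordered},
    )


degrees = node_degrees(graph)
counts, proportions = degree_distribution(degrees)
for k in counts:  # horizontal axis: degree k; vertical axis: count or proportion
    print(f"k={k}  nodes={counts[k]}  P(k)={proportions[k]:.2f}")
```

The `counts` mapping gives the vertical axis as a number of nodes and `proportions` gives it as a fraction of nodes, matching the two options named in the text.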

Step S103: Obtain network structure information from the business data degree distribution graph, and perform parameter fitting on the business data degree distribution graph according to the network structure information.

As a preferred solution of this embodiment, obtaining the network structure information from the business data degree distribution graph and performing parameter fitting on the business data degree distribution graph according to the network structure information specifically comprises:

The business data degree distribution graph is analyzed to obtain its network structure information, which is one of a power-law distribution, a Poisson distribution or a Gaussian distribution. Depending on the type of network structure information, the degree distribution parameters of the graph are fitted by least squares fitting, a linear mixed model, ridge regression, singular value decomposition or maximum likelihood estimation. The fitted business data degree distribution graph is then verified by hypothesis testing, completing the parameter fitting.

In this embodiment, the network structure information includes, but is not limited to, one or more of a power-law distribution, a Poisson distribution and a Gaussian distribution. A power-law distribution is one in which the probability distribution of a random variable has a power-law form: the probability density function decays as a power of the variable. In a power-law distribution some events occur with low frequency while others occur with high frequency, which gives the distribution its characteristic "long tail". The Poisson distribution describes the probability of a given number of events occurring within a certain range of time and space. It is usually parameterized by lambda (λ), the average rate of events per unit time and unit space; under a Poisson distribution, events occur independently and sparsely, and the average rate of occurrence is constant. The Gaussian distribution, also known as the normal distribution, is a continuous probability distribution widely used in the natural and social sciences. It is characterized by a symmetric bell-shaped curve, and its mean and standard deviation are the two key parameters describing the shape of the distribution. The characteristics of the network can be judged by observing the shape of the degree distribution; these characteristics include, but are not limited to, one or more of: whether core nodes exist, whether the core nodes follow a pattern, and the distribution characteristics of the network.
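For reference, the three candidate forms can be written as follows. These are the standard textbook expressions rather than formulas taken from the source; the symbols k (degree), γ (power-law exponent), μ and σ (Gaussian mean and standard deviation) are notation assumed here, and only λ appears in the text.

```latex
% Power law: the probability decays as a power of the degree k (exponent \gamma > 1)
P(k) \propto k^{-\gamma}
\quad\Longrightarrow\quad
\ln P(k) = -\gamma \ln k + c

% Poisson with mean rate \lambda
P(k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots

% Gaussian (normal) with mean \mu and standard deviation \sigma
f(k) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\frac{(k - \mu)^{2}}{2\sigma^{2}} \right)
```

The logarithmic form of the power law explains why a long-tailed degree distribution appears roughly as a straight line on a log-log plot, which is one practical way of telling it apart from the other two shapes.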

In this embodiment, the degree distribution parameter fitting methods include, but are not limited to, one or more of least squares fitting, a linear mixed model, ridge regression, singular value decomposition and maximum likelihood estimation. Least squares fitting estimates model parameters by minimizing the residual sum of squares; it assumes that the observed data follows a linear model and minimizes the difference between the observed values and the model's predictions, and it is commonly used to fit linear regression models. A linear mixed model is a parameter estimation method for data with random effects; it combines fixed and random effects and estimates parameters by minimizing the difference between the observed data and the model's predictions. Ridge regression is a regularized linear regression method that adds an L2 regularization term to the least squares objective to control the magnitude of the parameter estimates; it can cope with multicollinearity and in some cases gives more stable estimates. Singular value decomposition is a matrix factorization method for solving least squares problems; it finds the best parameter estimates by decomposing a matrix into the product of three matrices. Maximum likelihood estimation estimates model parameters by maximizing the likelihood function of the observed data; it assumes that the data follows a certain probability distribution and seeks the parameters that maximize the probability of the observed data. It is widely used in statistical inference to estimate parameters, test hypotheses and construct confidence intervals. To obtain the best distribution fit, the significance of the fitting result can be verified by hypothesis testing, which improves the accuracy of the degree distribution analysis.
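Two of the fitting methods named above, together with a test statistic for the verification step, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the source's procedure: the power-law exponent is fitted by least squares on log-transformed data (a rough estimate that is known to be biased on real data), the Poisson rate by maximum likelihood, and the Pearson chi-square statistic stands in for the unspecified hypothesis test. All function names and sample data are hypothetical.

```python
import math
from collections import Counter


def fit_power_law_least_squares(counts):
    """Fit ln P(k) = c - gamma * ln k by ordinary least squares; return gamma."""
    total = sum(counts.values())
    xs = [math.log(k) for k in counts if k > 0]
    ys = [math.log(counts[k] / total) for k in counts if k > 0]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope


def fit_poisson_mle(degrees):
    """The maximum likelihood estimate of the Poisson rate is the sample mean."""
    return sum(degrees) / len(degrees)


def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events at rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def chi_square_statistic(counts, expected_prob):
    """Pearson statistic sum((O - E)^2 / E) over the observed degree values."""
    n = sum(counts.values())
    return sum(
        (obs - n * expected_prob(k)) ** 2 / (n * expected_prob(k))
        for k, obs in counts.items()
    )


# Synthetic counts that follow k^-2 exactly, so the fit should recover gamma = 2.
observed_counts = {1: 6400, 2: 1600, 4: 400, 8: 100}
gamma = fit_power_law_least_squares(observed_counts)

# Synthetic degree sample for the Poisson fit.
sample_degrees = [1, 2, 3, 2, 2]
lam = fit_poisson_mle(sample_degrees)
stat = chi_square_statistic(Counter(sample_degrees), lambda k: poisson_pmf(k, lam))
print(gamma, lam, stat)
```

To complete the test, the statistic would be compared with a chi-square critical value for the appropriate degrees of freedom. A fuller treatment would also include degree values that were not observed and merge cells with small expected counts, both of which this sketch omits.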

Step S104: Correct missing edges in the fitted degree distribution graph to obtain an analysis result of the business data degree distribution, and present the analysis result visually, thereby completing the analysis of the business data degree distribution.

As a preferred solution of this embodiment, correcting missing edges in the fitted degree distribution graph to obtain the analysis result of the business data degree distribution specifically comprises:

The edges of each node in the business data network graph are verified against the fitted degree distribution graph, and the missing edges found by this verification are used to correct the fitted degree distribution graph. The corrected degree distribution graph serves as the analysis result of the business data degree distribution.

In this embodiment, missing edge correction infers unobserved edges from the existing degree distribution information to make the degree distribution more accurate. In some business data network graphs, certain edges are not observed because of sampling error or other causes; correcting these missing edges improves the accuracy of the degree distribution.
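The source does not say how the edges of each node are verified. One plausible check, assumed here purely for illustration, is a symmetry test on per-node edge records: an edge recorded on one endpoint but absent from the other is treated as a missing edge and restored before the degrees are recomputed. The function names and sample data are hypothetical.

```python
def find_missing_edges(adjacency):
    """Return edges (u, v) recorded at u but absent from v's neighbour set."""
    missing = set()
    for u, neighbours in adjacency.items():
        for v in neighbours:
            if u not in adjacency.get(v, set()):
                missing.add((u, v))
    return missing


def correct_missing_edges(adjacency):
    """Return a symmetric copy of the adjacency map with missing edges restored."""
    corrected = {node: set(neighbours) for node, neighbours in adjacency.items()}
    for u, v in find_missing_edges(adjacency):
        corrected.setdefault(v, set()).add(u)
    return corrected


# Hypothetical observed records with two omissions.
observed = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": set(),   # the A-C edge was not recorded on C's side
    "D": {"E"},   # E never appears as a key at all
}

corrected = correct_missing_edges(observed)
degrees_before = {node: len(nb) for node, nb in observed.items()}
degrees_after = {node: len(nb) for node, nb in corrected.items()}
print(degrees_before)
print(degrees_after)
```

After the correction, the degree distribution is recomputed from `degrees_after`. Inferring edges that are absent from both endpoints, as the text suggests may be done from the fitted distribution, would need a statistical model that this sketch does not attempt.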

作为本实施例的优选方案,所述将所述业务数据度分布的分析结果进行可视化呈现,具体为:As a preferred solution of this embodiment, the analysis results of the business data degree distribution are visually presented, specifically as follows:

通过直方图、网络图、拟合曲线图、散点图、箱线图或柱状图中的一种或多种,对所述业务数据度分布的分析结果进行可视化呈现,以使得所述业务数据各节点的信息能够进行显示。The analysis results of the degree distribution of the business data are visually presented through one or more of a histogram, a network graph, a fitting curve, a scatter plot, a box plot or a histogram, so that the business data The information of each node can be displayed.

In this embodiment, visualization means presenting the degree distribution and the related analysis results graphically. The presentation methods include, but are not limited to, one or more of a histogram, a network graph, a fitted curve plot, a scatter plot, a box plot, and a bar chart. A histogram divides the range of degree values into intervals; the horizontal axis shows the degree range and the vertical axis shows the number or frequency of nodes within each range, so the histogram gives a direct view of how degrees are distributed and supports preliminary observation and analysis of the data. In a scatter plot, the horizontal axis shows the degree of a node and the vertical axis shows another attribute of the node, which reveals the relationship between degree and that attribute and how different nodes are distributed along it. A network graph represents nodes as points and edges as lines; the degree of each node can be encoded by shape, size, or color to make the degree distribution more intuitive, and a force-directed algorithm can be used to optimize the layout so that the connections between nodes are clearer. A box plot shows five key statistics of the degree distribution: the minimum, lower quartile, median, upper quartile, and maximum; it gives a direct view of the dispersion of degrees and the presence of outliers. A bar chart compares the number of nodes at different degrees; by using different colors or shading for different degree values, the distribution of nodes across degree ranges can be observed quickly.
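As an illustration of the quantities behind the histogram and the box plot, the following minimal Python sketch computes degree intervals and the five key statistics from a list of node degrees. The function names `five_number_summary` and `histogram` and the parameter `bin_width` are illustrative assumptions and do not come from the embodiment; quartile conventions vary between tools, and this sketch uses the median-of-halves convention.

```python
from collections import Counter
from statistics import median

def five_number_summary(degrees):
    """Return (min, Q1, median, Q3, max) for a list of at least two node degrees.

    Quartiles are the medians of the lower and upper halves of the sorted
    data (the overall median is excluded from both halves when n is odd).
    """
    data = sorted(degrees)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

def histogram(degrees, bin_width):
    """Count nodes per degree interval [b, b + bin_width), keyed by lower bound b."""
    bins = Counter((d // bin_width) * bin_width for d in degrees)
    return dict(sorted(bins.items()))

# Hypothetical degrees of eight nodes.
degrees = [1, 1, 2, 2, 3, 4, 5, 9]
summary = five_number_summary(degrees)
bins = histogram(degrees, bin_width=2)
```

The resulting tuples and dictionaries can be passed to whichever plotting tool is chosen; the plotting step itself is left out so that the sketch has no external dependencies.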

It should be noted that interpreting the results means giving a reasonable explanation of, and drawing inferences from, the conclusions obtained by the analysis. Presenting the results as charts helps in understanding and explaining the data. The visual presentation can therefore be implemented with a variety of data visualization tools and programming languages, and choosing a visualization suited to the requirements and the characteristics of the data enables administrators to better understand and interpret the degree distribution and the related analysis results.

It should be noted that the results of degree distribution analysis can be applied to business data analysis in social networks, Internet routing relationships, network behavior analysis, and transaction behavior analysis. Analyzing the degree distribution of nodes in a social network can reveal the importance and influence of nodes; in social network marketing, the degree centrality of nodes can be used to identify important social media users or key nodes for targeted promotion and marketing strategies. Analyzing the degree distribution of nodes in Internet routing relationships makes it possible to evaluate the importance and stability of network nodes; in network optimization and fault diagnosis, node degree information can be used to identify key nodes in the network for routing optimization or troubleshooting. Analyzing the degree distribution of nodes in network behavior data makes it possible to infer how active and how connected users are; in user behavior analysis and personalized recommendation, node degree information can be used to identify active users and related users for precise personalized recommendation or advertising. Analyzing the degree distribution of nodes in transaction behavior data makes it possible to assess the risk and trust level of transactions; in finance or e-commerce, node degree information can be used to identify high-risk transactions or abnormal transaction patterns for risk control or fraud detection. More generally, degree distribution analysis can be applied across business data scenarios such as social networks, Internet routing, network behavior, and transaction behavior, helping to reveal the importance, influence, and connectivity of each node in the network graph formed by the business data, and to support risk assessment, routing optimization, and personalized services.
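The use of degree centrality to pick out key nodes can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is undirected and simple, degree centrality is normalized as degree divided by (n - 1), and the names `degree_centrality` and `top_k` as well as the sample edges are hypothetical rather than part of the described method.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Normalized degree centrality, degree / (n - 1), for an undirected simple graph."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:                 # self-loops are ignored in this sketch
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}

def top_k(centrality, k):
    """Return the k nodes with the highest centrality; ties are broken by node name."""
    return sorted(centrality, key=lambda node: (-centrality[node], node))[:k]

# Hypothetical relationships between four accounts.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
centrality = degree_centrality(edges)
key_nodes = top_k(centrality, 2)
```

In a marketing, routing, or risk-control setting, the nodes returned by `top_k` would be the candidates for closer inspection; the threshold `k` is a choice left to the analyst.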

Implementing the above embodiment has the following effects:

The technical solution of the present invention acquires the business data to be analyzed and converts it into a network graph representation, obtaining a business data network graph. After the degree of each node is calculated, a business data degree distribution diagram is drawn. Network structure information obtained from the degree distribution diagram is then used to perform parameter fitting on it, so as to obtain the best-fitting distribution and improve the accuracy of degree distribution analysis for the business data network. Missing-edge correction is performed on the fitted degree distribution diagram, so that the characteristics of the degree distribution are analyzed and missing edges are also checked and corrected. Finally, the analysis results of the business data degree distribution are visually presented, so that the characteristics of the degree distribution can be observed in different types of visualizations, derived, and explained, giving an intuitive understanding of the degree distribution.

Embodiment 2

Referring to Figure 2, the present invention provides a graph-theory-based business data degree distribution analysis device, comprising an acquisition module 201, a drawing module 202, a fitting module 203, and a visualization module 204.

The acquisition module 201 is configured to acquire business data to be analyzed and convert the business data into a network graph representation, thereby obtaining a business data network graph.

The drawing module 202 is configured to calculate the degree of each node from the business data network graph and to draw a business data degree distribution diagram from the degree of each node.

The fitting module 203 is configured to obtain network structure information from the business data degree distribution diagram and to perform parameter fitting on the business data degree distribution diagram according to the network structure information.

The visualization module 204 is configured to perform missing-edge correction on the fitted degree distribution diagram to obtain an analysis result of the business data degree distribution, and to visually present that analysis result, thereby completing the analysis of the business data degree distribution.

As a preferred solution, the business data includes social network data, Internet routing relationship data, network behavior data, transaction behavior data, and trajectory synchronization data.

As a preferred solution, calculating the degree of each node from the business data network graph and drawing the business data degree distribution diagram from the degree of each node specifically comprises:

calculating the degree of each node from each node in the business data network graph and the total number of edges connected to that node;

taking the degree value as the horizontal axis and the corresponding number or proportion of nodes as the vertical axis, thereby drawing the business data degree distribution diagram.
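The two steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming the business data network graph is available as an undirected edge list; the names `node_degrees` and `degree_distribution` and the sample edges are hypothetical. Nodes that appear in no edge are not visible in an edge list and would have to be supplied separately with degree 0.

```python
from collections import Counter

def node_degrees(edges):
    """Degree of each node: the number of edge endpoints incident to it (undirected)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1                # a self-loop (u == v) therefore counts twice
    return dict(deg)

def degree_distribution(degrees, normalize=False):
    """Map each degree value k to the number (or proportion) of nodes with degree k."""
    counts = Counter(degrees.values())
    total = len(degrees)
    return {k: (c / total if normalize else c) for k, c in sorted(counts.items())}

# Hypothetical edge list; keys of the result are the horizontal axis,
# values are the vertical axis of the degree distribution diagram.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
degrees = node_degrees(edges)
counts = degree_distribution(degrees)
proportions = degree_distribution(degrees, normalize=True)
```

Passing `normalize=True` gives the proportion form of the vertical axis, which is the form usually used when a probability distribution is fitted afterwards.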

As a preferred solution, obtaining network structure information from the business data degree distribution diagram and performing parameter fitting on the business data degree distribution diagram according to the network structure information specifically comprises:

analyzing the business data degree distribution diagram to obtain its network structure information, the network structure information being one of a power-law distribution, a Poisson distribution, or a Gaussian distribution;

according to the type of the network structure information, fitting the degree distribution parameters of the business data degree distribution diagram by least squares fitting, a linear mixed model, ridge regression, singular value decomposition, or maximum likelihood estimation;

verifying the fitted business data degree distribution diagram by hypothesis testing, thereby completing the parameter fitting of the business data degree distribution diagram.
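Two of the fitting options named above can be illustrated with a short Python sketch: a least squares fit of a power law on log-log axes, and the maximum likelihood estimate of a Poisson rate. The function names and the synthetic data are assumptions made for illustration; the hypothesis-testing step is not covered here. Note also that log-log least squares is sensitive to noise in the tail of an empirical distribution, so the result should be treated as a rough estimate.

```python
import math

def fit_power_law_least_squares(distribution):
    """Fit p(k) ~ C * k**(-alpha) by linear regression of log p(k) on log k.

    `distribution` maps degree k to its proportion p(k); points with
    k <= 0 or p <= 0 are skipped because their logarithm is undefined.
    """
    points = [(math.log(k), math.log(p))
              for k, p in distribution.items() if k > 0 and p > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive (k, p) points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)      # (alpha, C)

def fit_poisson_mle(degrees):
    """Maximum likelihood estimate of the Poisson rate: the sample mean degree."""
    return sum(degrees) / len(degrees)

# Synthetic, noise-free power law used only to show the call pattern.
synthetic = {k: 2.0 * k ** -2.5 for k in range(1, 50)}
alpha, c = fit_power_law_least_squares(synthetic)
rate = fit_poisson_mle([1, 2, 3, 4])
```

Which estimator is applied would follow from the network structure information identified in the previous step, for example the power-law fit for a heavy-tailed diagram and the Poisson estimate for a diagram concentrated around its mean.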

As a preferred solution, performing missing-edge correction on the fitted degree distribution diagram to obtain the analysis result of the business data degree distribution specifically comprises:

checking the edges of each node in the business data network graph against the fitted degree distribution diagram, and using the missing edges found by the check to correct the fitted degree distribution diagram, thereby obtaining a missing-edge-corrected degree distribution diagram as the analysis result of the business data degree distribution.
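The text does not state how missing edges are identified. One possible reading, assumed here purely for illustration, is that the edges of the network graph are reconciled against an independent list of relationships taken from the raw business records, and any relationship absent from the graph is added back before the degrees are recomputed. The function name `correct_missing_edges` and both inputs are hypothetical.

```python
def correct_missing_edges(graph_edges, record_edges):
    """Return (corrected_edges, missing) for an undirected graph.

    Assumption (not from the source): `record_edges` is an independent list
    of relationships from the raw business records, used as the reference.
    """
    def norm(edge):
        u, v = edge
        return (u, v) if u <= v else (v, u)   # undirected: endpoint order is irrelevant

    present = {norm(e) for e in graph_edges}
    missing = sorted({norm(e) for e in record_edges} - present)
    corrected = sorted(present | set(missing))
    return corrected, missing

# Hypothetical inputs: the graph lacks two relationships found in the records.
graph = [("a", "b"), ("b", "c")]
records = [("b", "a"), ("c", "d"), ("a", "c")]
corrected, missing = correct_missing_edges(graph, records)
```

After correction, the node degrees and the degree distribution diagram would be recomputed from `corrected`, and the refit can be compared with the earlier fit.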

As a preferred solution, visually presenting the analysis result of the business data degree distribution specifically comprises:

visually presenting the analysis result of the business data degree distribution through one or more of a histogram, a network graph, a fitted curve plot, a scatter plot, a box plot, or a bar chart, so that information about each node of the business data can be displayed.

Correspondingly, the present invention also provides a graph-theory-based business data degree distribution analysis device, comprising an acquisition module, a drawing module, a fitting module, and a visualization module;

the acquisition module is configured to acquire business data to be analyzed and convert the business data into a network graph representation, thereby obtaining a business data network graph;

the drawing module is configured to calculate the degree of each node from the business data network graph and to draw a business data degree distribution diagram from the degree of each node;

the fitting module is configured to obtain network structure information from the business data degree distribution diagram and to perform parameter fitting on the business data degree distribution diagram according to the network structure information;

the visualization module is configured to perform missing-edge correction on the fitted degree distribution diagram to obtain an analysis result of the business data degree distribution, and to visually present that analysis result, thereby completing the analysis of the business data degree distribution.

As a preferred solution, the business data includes social network data, Internet routing relationship data, network behavior data, transaction behavior data, and trajectory synchronization data.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the device described above may be found in the corresponding process of the foregoing method embodiment and is not repeated here.

Implementing the above embodiment has the following effects:

The technical solution of the present invention acquires the business data to be analyzed and converts it into a network graph representation, obtaining a business data network graph. After the degree of each node is calculated, a business data degree distribution diagram is drawn. Network structure information obtained from the degree distribution diagram is then used to perform parameter fitting on it, so as to obtain the best-fitting distribution and improve the accuracy of degree distribution analysis for the business data network. Missing-edge correction is performed on the fitted degree distribution diagram, so that the characteristics of the degree distribution are analyzed and missing edges are also checked and corrected. Finally, the analysis results of the business data degree distribution are visually presented, so that the characteristics of the degree distribution can be observed in different types of visualizations, derived, and explained, giving an intuitive understanding of the degree distribution.

Embodiment 3

Correspondingly, the present invention also provides a terminal device, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the graph-theory-based business data degree distribution analysis method according to any one of the above embodiments.

The terminal device of this embodiment comprises a processor, a memory, and a computer program and computer instructions stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of Embodiment 1 above, for example steps S101 to S104 shown in Figure 1. Alternatively, when executing the computer program, the processor implements the functions of the modules/units in the above device embodiment, for example the fitting module 203.

By way of example, the computer program may be divided into one or more modules/units, which are stored in the memory and executed by the processor to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program in the terminal device. For example, the fitting module 203 is configured to obtain network structure information from the business data degree distribution diagram and to perform parameter fitting on the business data degree distribution diagram according to the network structure information.

The terminal device may be a computing device such as a desktop computer, a notebook computer, a handheld computer, or a cloud server. The terminal device may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the schematic diagram is merely an example of a terminal device and does not limit it; the terminal device may include more or fewer components than shown, may combine certain components, or may use different components. For example, the terminal device may further include input/output devices, network access devices, buses, and the like.

The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the terminal device and connects the parts of the entire terminal device through various interfaces and lines.

The memory may be used to store the computer program and/or modules. The processor implements the various functions of the terminal device by running or executing the computer program and/or modules stored in the memory and by invoking data stored in the memory. The memory may mainly comprise a program storage area and a data storage area: the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created through use of the mobile terminal. In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.

Where the modules/units integrated in the terminal device are implemented as software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the methods in the above embodiments may also be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, under legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

Embodiment 4

Correspondingly, the present invention also provides a computer-readable storage medium comprising a stored computer program, wherein the computer program, when run, controls the device in which the computer-readable storage medium is located to execute the graph-theory-based business data degree distribution analysis method according to any one of the above embodiments.

The specific embodiments described above explain the purpose, technical solutions, and beneficial effects of the present invention in further detail. It should be understood that they are only specific embodiments of the present invention and are not intended to limit its scope of protection. In particular, for those skilled in the art, any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A graph-theory-based business data degree distribution analysis method, characterized by comprising the following steps:
acquiring business data to be analyzed, and converting the business data into a network graph representation to obtain a business data network graph;
calculating the degree of each node from the business data network graph, and drawing a business data degree distribution diagram from the degree of each node;
obtaining network structure information from the business data degree distribution diagram, and performing parameter fitting on the business data degree distribution diagram according to the network structure information;
and performing missing-edge correction on the fitted degree distribution diagram to obtain an analysis result of the business data degree distribution, and visually presenting the analysis result to complete the analysis of the business data degree distribution.
2. The graph-theory-based business data degree distribution analysis method according to claim 1, wherein the business data comprises social network data, Internet routing relationship data, network behavior data, transaction behavior data, and trajectory synchronization data.
3. The graph-theory-based business data degree distribution analysis method according to claim 1, wherein calculating the degree of each node from the business data network graph and drawing the business data degree distribution diagram from the degree of each node specifically comprises:
calculating the degree of each node from each node in the business data network graph and the total number of edges connected to that node;
and taking the degree value as the horizontal axis and the corresponding number or proportion of nodes as the vertical axis, so as to draw the business data degree distribution diagram.
4. The graph-theory-based business data degree distribution analysis method according to claim 3, wherein obtaining network structure information from the business data degree distribution diagram and performing parameter fitting on the business data degree distribution diagram according to the network structure information specifically comprises:
analyzing the business data degree distribution diagram to obtain its network structure information, the network structure information being one of a power-law distribution, a Poisson distribution, or a Gaussian distribution;
fitting the degree distribution parameters of the business data degree distribution diagram by least squares fitting, a linear mixed model, ridge regression, singular value decomposition, or maximum likelihood estimation, according to the type of the network structure information;
and verifying the fitted business data degree distribution diagram by hypothesis testing, thereby completing the parameter fitting of the business data degree distribution diagram.
5. The graph-theory-based business data degree distribution analysis method according to claim 4, wherein performing missing-edge correction on the fitted degree distribution diagram to obtain the analysis result of the business data degree distribution specifically comprises:
checking the edges of each node in the business data network graph against the fitted degree distribution diagram, and using the missing edges found by the check to correct the fitted degree distribution diagram, thereby obtaining a missing-edge-corrected degree distribution diagram as the analysis result of the business data degree distribution.
6. The graph-theory-based business data degree distribution analysis method according to claim 5, wherein visually presenting the analysis result of the business data degree distribution specifically comprises:
visually presenting the analysis result of the business data degree distribution through one or more of a histogram, a network graph, a fitted curve plot, a scatter plot, a box plot, or a bar chart, so that information about each node of the business data can be displayed.
7. A graph-theory-based business data degree distribution analysis device, comprising an acquisition module, a drawing module, a fitting module, and a visualization module;
the acquisition module is configured to acquire business data to be analyzed and convert the business data into a network graph representation to obtain a business data network graph;
the drawing module is configured to calculate the degree of each node from the business data network graph and to draw a business data degree distribution diagram from the degree of each node;
the fitting module is configured to obtain network structure information from the business data degree distribution diagram and to perform parameter fitting on the business data degree distribution diagram according to the network structure information;
and the visualization module is configured to perform missing-edge correction on the fitted degree distribution diagram to obtain an analysis result of the business data degree distribution, and to visually present the analysis result to complete the analysis of the business data degree distribution.
8. The graph-theory-based business data degree distribution analysis device according to claim 7, wherein the business data comprises social network data, Internet routing relationship data, network behavior data, transaction behavior data, and trajectory synchronization data.
9. A terminal device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the graph-theory-based business data degree distribution analysis method according to any one of claims 1 to 6.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored computer program, wherein the computer program, when run, controls the device in which the computer-readable storage medium is located to perform the graph-theory-based business data degree distribution analysis method according to any one of claims 1 to 6.
CN202311156530.0A 2023-09-07 2023-09-07 A business data degree distribution analysis method and device based on graph theory Pending CN117194728A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311156530.0A CN117194728A (en) 2023-09-07 2023-09-07 A business data degree distribution analysis method and device based on graph theory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311156530.0A CN117194728A (en) 2023-09-07 2023-09-07 A business data degree distribution analysis method and device based on graph theory

Publications (1)

Publication Number Publication Date
CN117194728A true CN117194728A (en) 2023-12-08

Family

ID=88993640

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311156530.0A Pending CN117194728A (en) 2023-09-07 2023-09-07 A business data degree distribution analysis method and device based on graph theory

Country Status (1)

Country Link
CN (1) CN117194728A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090303237A1 (en) * 2008-06-06 2009-12-10 International Business Machines Corporation Algorithms for identity anonymization on graphs
US20170177681A1 (en) * 2015-12-17 2017-06-22 Business Objects Software Limited Graph Database Querying and Visualization
CN110472068A (en) * 2019-08-20 2019-11-19 星环信息科技(上海)有限公司 Big data processing method, equipment and medium based on heterogeneous distributed knowledge mapping
CN112580931A (en) * 2020-11-23 2021-03-30 河北电力交易中心有限公司 Analysis method of bilateral transaction network and terminal equipment
CN112632196A (en) * 2021-01-06 2021-04-09 中国工商银行股份有限公司 Data visualization method and device and storage medium
CN115935941A (en) * 2022-12-14 2023-04-07 国网福建省电力有限公司 Electric power service system data alignment method based on graph convolution neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
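Li Jiming; Zhang Ning: "Topological structure analysis of the competition network of Chinese auto parts enterprises" (中国汽车零部件企业竞争网拓扑结构分析), Complex Systems and Complexity Science (复杂系统与复杂性科学), no. 02, 15 June 2008 (2008-06-15) *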

Similar Documents

Publication Publication Date Title
AU2015201161B2 (en) Event correlation
CN108322473B (en) User behavior analysis method and device
US7926026B2 (en) Graphical analysis to detect process object anomalies
CN114430365B (en) Fault root cause analysis method, device, electronic equipment and storage medium
CN114493255A (en) Enterprise anomaly monitoring method and related equipment based on knowledge graph
CN114661571B (en) Model evaluation method, device, electronic equipment and storage medium
CN117407513B (en) Question processing method, device, equipment and storage medium based on large language model
CN113609008B (en) Test result analysis method and device and electronic equipment
CN114598539A (en) Root cause positioning method and device, storage medium and electronic equipment
CN115857461A (en) Piglet premixed feed production online monitoring method and system
CN118245261A (en) Alarm analysis method, device, apparatus, medium and program product
CN116563035A (en) Analysis method and device of medical insurance data, electronic equipment and storage medium
CN108696397A (en) A kind of electric network information safety evaluation method and device based on AHP and big data
CN117035563B (en) Product quality safety risk monitoring method, device, monitoring system and medium
CN117194728A (en) A business data degree distribution analysis method and device based on graph theory
CN118673437A (en) Knowledge graph-based abnormal examination method and device for examination rules and electronic equipment
CN117993616A (en) Carbon emission data analysis management method, device, equipment and storage medium
CN118331831A (en) Application system performance evaluation method, device, electronic device and storage medium
CN113706098B (en) Business-based deviation reason identification method and device and electronic equipment
CN116304211A (en) Information interception method, device and equipment based on graph calculation and readable storage medium
CN114443630A (en) Data security auditing method, device, electronic device and storage medium
CN113190844A (en) Detection method, related method and related device
CN107566187B (en) A SLA violation monitoring method, device and system
CN112750047A (en) Behavior relation information extraction method and device, storage medium and electronic equipment
US20240112053A1 (en) Determination of an outlier score using extreme value theory (evt)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination