CN110837841B - KPI degradation root cause identification method and device based on random forest - Google Patents
- Publication number
- CN110837841B (application number CN201810938061.0A)
- Authority
- CN
- China
- Prior art keywords
- kpi
- root cause
- influence factor
- influence
- influence factors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Data Mining & Analysis (AREA)
- Entrepreneurship & Innovation (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Game Theory and Decision Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Quality & Reliability (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Operations Research (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a KPI degradation root cause identification method based on a random forest, comprising the following steps: acquiring the basic data needed to build a KPI degradation root cause analysis model, wherein the basic data comprises historical data of the influencing factors and actual data of the KPI to be analyzed; selecting a set proportion of the basic data as a training set, training a number of decision trees on the training set, and constructing the KPI degradation root cause analysis model from the decision trees; and taking the rest of the basic data as a test set, and using the KPI degradation root cause analysis model to obtain the influence factor (that is, the weight) of each influencing factor on each KPI. The invention also discloses a KPI degradation root cause identification device based on the random forest. The invention can improve the accuracy and efficiency of identifying the influence factors.
Description
Technical Field
The invention relates to the technical field of data analysis and machine learning in the communication industry, in particular to a technology for identifying a degradation root cause.
Background
In the operation and management of a mobile communication network, certain key KPIs, such as call drop rate and call loss, require attention. Beyond daily maintenance, operators want to know which factors influence these KPIs and how the KPIs relate to the network, so that network optimization tasks can later be assigned and carried out.
Traditional root cause identification methods have low operating efficiency and calculation accuracy, and accurate root causes are especially hard to obtain when there are thousands of input variables or when the training data has missing information.
Therefore, how to identify degradation root causes quickly and accurately is a problem to be solved.
Disclosure of Invention
The invention provides a KPI degradation root cause identification method based on a random forest, which comprises the following steps:
acquiring relevant basic data for establishing a KPI degradation root cause analysis model, wherein the basic data comprises historical data of influence factors and KPI actual data to be analyzed;
selecting a set proportion of the basic data as a training set, training a number of decision trees on the training set, and constructing the KPI degradation root cause analysis model from the decision trees;
and taking the rest of the basic data as a test set, and using the KPI degradation root cause analysis model to obtain the influence factor of each influencing factor on each KPI.
The influencing factors comprise a retentivity index, an access class index, a mobility index, a resource class index and a system capacity class index, and the influencing factors are sample characteristics.
Further, training a number of decision trees on the training set specifically comprises:
randomly drawing N training samples, with replacement, from a training set of size N, to serve as the training set of a decision tree;
with the feature dimension of each training sample being M, randomly selecting m (m ≤ M) features as a feature subset, and, at each split of the tree, selecting the optimal feature from the m features to split on, thereby obtaining a decision tree.
Further, constructing the KPI degradation root cause analysis model from the decision trees specifically comprises:
obtaining k decision trees to generate a random forest model, namely the KPI degradation root cause analysis model;
and averaging the results of all the decision trees to obtain the result of the random forest model, namely the analysis result of the KPI degradation root cause analysis model.
Specifically, each node of a decision tree is an influencing factor; the amount by which each influencing factor reduces the impurity of the decision trees on average is calculated, and this impurity reduction is taken as the influence factor of that influencing factor.
Preferably, the KPI degradation root cause analysis model is fitted using a variance or least squares method.
Preferably, the obtained influence factors of the KPI influencing factors are ranked, and the influencing factors and their influence factors are output according to the ranking result.
The invention also discloses a KPI degradation root cause identification device based on the random forest, which comprises:
The data acquisition module is used for acquiring relevant basic data for establishing a KPI degradation root cause analysis model, wherein the basic data comprises historical data of influence factors and KPI actual data to be analyzed;
The model building module is used for selecting a set proportion of the basic data acquired by the data acquisition module as a training set, training a number of decision trees on the training set, and building the KPI degradation root cause analysis model from the decision trees;
And the influence factor determining module is used for taking the basic data remaining after the model building module selects the training set as a test set, and using the KPI degradation root cause analysis model built by the model building module to obtain the influence factor of each influencing factor on each KPI.
The model building module further comprises:
The training set selecting unit is used for selecting a set proportion of the basic data as a training set and, if the size of the training set is N, randomly drawing N training samples with replacement from the training set to serve as the training set of a decision tree;
The decision tree acquisition unit is used for, if the feature dimension of each sample is M, designating a constant m < M, randomly selecting a subset of m features from the M features, and selecting the optimal feature from the m features for splitting, to acquire a decision tree;
The model building unit is used for repeating the method of the decision tree acquisition unit k times to build k decision trees and thereby obtain the KPI degradation root cause analysis model;
And the calculating unit is used for averaging the results of the k decision trees built by the model building unit and taking the average as the result of the KPI degradation root cause analysis model.
Specifically, each node of a decision tree is an influencing factor; the amount by which each influencing factor reduces the impurity of the decision trees on average is calculated, and this impurity reduction is taken as the influence factor of that influencing factor.
The influence factor determination module further includes:
the test set acquisition unit is used for taking the basic data after the training set selection unit selects the training set as a test set;
the test data acquisition unit is used for acquiring, from the test set obtained by the test set acquisition unit, historical data of the influencing factors over a certain time and actual data of the KPI indicators over a certain time;
The influence factor calculation unit is used for inputting the historical data and the actual data acquired by the test data acquisition unit into the KPI degradation root cause analysis model determined by the model building module, taking each node of the decision trees in the model as an influencing factor, and calculating the impurity value by which that influencing factor reduces the decision trees on average;
and the influence factor determining unit is used for determining the impurity value calculated by the influence factor calculation unit as the influence factor of the influencing factor on the KPI.
Preferably, the device further comprises a sorting module for sorting the determined influence factors according to a set rule;
and a main influence factor determining module for determining the main influencing factors affecting the KPI, and their influence factors, according to the influence factor sorting result of the sorting module.
According to the above technical scheme, the KPI degradation root cause identification method based on the random forest disclosed by the embodiment of the invention builds a random-forest-based KPI degradation root cause analysis model from the collected historical data of a plurality of influencing factors and the actual data of the KPI to be analyzed. The historical data of the influencing factors and the KPI data to be analyzed are input into this model to obtain the influence factors of the influencing factors on the KPI, and the main influencing factors are ranked and output, thereby improving the accuracy and efficiency of the result.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a KPI degradation root cause identification method based on a random forest, which is provided by the embodiment of the application;
FIG. 2 is a flow chart of a method according to a second embodiment of the present application;
FIG. 3 is a flow chart of a method according to a third embodiment of the present application;
FIG. 4 is a flow chart of a method according to a fourth embodiment of the present application;
FIG. 5 is a schematic structural diagram of a KPI degradation root cause identification device based on a random forest according to a fifth embodiment of the application.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, a method for identifying KPI degradation root causes based on random forest according to a first embodiment of the present invention is provided.
Step S01: acquire the basic data needed to build the KPI degradation root cause analysis model, wherein the basic data comprises historical data of the influencing factors and actual data of the KPI to be analyzed.
The influencing factors comprise a retentivity index, an access class index, a mobility index, a resource class index and a system capacity class index, and the influencing factors are sample characteristics.
Specifically, the influencing factors may number in the tens, for example total downtilt angle, azimuth angle, wireless utilization rate, average e-rab number, mechanical downtilt angle, latitude and longitude, and further kinds of influencing factors may be added as new data becomes available.
Step S02: select a set proportion of the basic data as a training set, train a number of decision trees on the training set, and construct the KPI degradation root cause analysis model from the decision trees.
The selection of the proportion of the training set can be flexibly set according to actual conditions.
For example, 40% of the underlying data may be selected as the training set.
Step S03: take the rest of the basic data as a test set, and use the KPI degradation root cause analysis model to obtain the influence factor of each influencing factor on each KPI.
That is, the data remaining after the training set is selected serves as the test set, and the KPI degradation root cause analysis model is used to obtain the influence factors of all influencing factors.
If 40% of the underlying data is selected as the training set, the remaining 60% of the data can be used as the test set.
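The split by a set proportion can be sketched as follows. This is a minimal illustration, assuming the basic data is held as a list of rows; shuffling before the cut, the fixed seed and the function name `split_by_proportion` are assumptions made for the example and are not specified in the description.

```python
import random


def split_by_proportion(rows, train_fraction=0.4, seed=0):
    """Shuffle the rows and split them into (training set, test set)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # Shuffling is an assumption; the text only fixes the proportion.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(100))  # stand-in for 100 rows of basic data
train, test = split_by_proportion(rows, train_fraction=0.4)
print(len(train), len(test))
```

With `train_fraction=0.4`, 40% of the rows go to the training set and the remaining 60% form the test set, matching the example proportions above.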
So that the influence factors of the target influencing factors can be obtained more conveniently as required, the method preferably further comprises the following step:
Step S04: rank the obtained influence factors of the KPI influencing factors, and output the influencing factors and their influence factors according to the ranking result.
The influence factors of the influencing factors are ranked from high to low. The ranking covers all influencing factors affecting the KPI, and the main influencing factors are obtained from it.
Thus, the embodiment of the invention discloses a random-forest-based method for identifying the degradation root cause of a KPI. Historical data of the influencing factor indicators is collected, the impurity reduction at the decision tree nodes in the random forest is used as the influence factor of each influencing factor, and the influence factors are ranked, so that the degradation root cause of the KPI can be located quickly and accurately, which greatly reduces the time spent on manual judgment and improves accuracy.
In order to better illustrate the present invention, a second embodiment is provided, and as shown in fig. 2, the model building process of the present invention is described in detail.
Step S201: randomly draw N training samples, with replacement, from a training set of size N, to serve as the training set of a decision tree.
Sampling with replacement is one of the modes of simple random sampling. The training samples in the population are numbered from 1 to N, and each number is put back into the population after it is drawn. Because the size of the population is unchanged, each of the N numbers is equally likely to be drawn on any single draw.
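Step S201 can be sketched with the standard library alone. The function name `bootstrap_sample` and the toy rows are assumptions for illustration.

```python
import random


def bootstrap_sample(training_set, rng):
    """Draw len(training_set) rows with replacement (a bootstrap sample)."""
    n = len(training_set)
    # Each draw picks an index uniformly from 0..n-1; the row is "put back",
    # so the same row may be drawn more than once.
    return [training_set[rng.randrange(n)] for _ in range(n)]


rng = random.Random(42)
training_set = [("row", i) for i in range(10)]  # toy training set, N = 10
sample = bootstrap_sample(training_set, rng)
print(len(sample), len(set(sample)))
```

The sample has the same size N as the training set, but it usually contains fewer than N distinct rows because some rows are drawn repeatedly.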
Step S202: with the feature dimension of each training sample being M, randomly select m (m ≤ M) features as a feature subset, and, at each split of the tree, select the optimal feature from the m features to split on, thereby obtaining a decision tree.
The method for determining the optimal division characteristics is as follows: so that the purity of the data of each node after splitting is the highest. That is, the samples included in the branch nodes classified by the feature are classified into the same class as much as possible.
Feature dimensions may be understood as the kind of features of the training sample, one feature dimension for each feature.
Specifically, the column attributes of the training data may be sampled: from the M columns of attributes, m (m ≤ M) attributes are drawn without replacement.
Selecting the optimal feature to split the decision tree means taking the optimal feature as a parent node and splitting completely on it according to a certain rule; each resulting child node is then taken as a parent node and split further, until no further split is possible.
Each node of the decision tree is therefore an influencing factor, and the influencing factor at a node distinguishes between the root causes of different KPIs. The optimal split is the one that separates the root causes most completely, which gives the highest node purity.
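Choosing the optimal feature by node purity can be illustrated as below. This sketch assumes variance as the impurity measure (the variance option mentioned for fitting the model) and candidate thresholds at midpoints between sorted feature values; both choices, and all function names, are assumptions for the example.

```python
def variance(values):
    """Population variance of a list of numbers, used here as node impurity."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    return (variance(parent)
            - (len(left) / n) * variance(left)
            - (len(right) / n) * variance(right))


def best_split(rows, targets, feature_indices):
    """Return (feature, threshold, decrease) with the largest impurity decrease."""
    best = (None, None, 0.0)
    for f in feature_indices:
        values = sorted({row[f] for row in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            gain = impurity_decrease(targets, left, right)
            if gain > best[2]:
                best = (f, threshold, gain)
    return best


# Toy data: feature 0 separates the KPI values perfectly, feature 1 does not.
rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
targets = [10.0, 10.0, 20.0, 20.0]
split = best_split(rows, targets, feature_indices=[0, 1])
print(split)
```

In this toy data the split on feature 0 leaves both child nodes with zero variance, so it is selected as the optimal feature.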
The impurity value by which each influencing factor reduces the decision tree on average is calculated, and this impurity value is taken as the influence factor of that influencing factor.
Step S203: repeat step S202 to obtain a random forest model formed by k decision trees, namely the KPI degradation root cause analysis model.
Step S204: average the results of all the decision trees to obtain the result of the random forest model, namely the analysis result of the KPI degradation root cause analysis model.
Step S205: fit the KPI degradation root cause analysis model using a variance or least squares method.
To explain in detail how the influence factors of the influencing factors are determined from the model, a third embodiment of the present invention is given, as shown in fig. 3.
Step S301: take the basic data remaining after the training set is selected as a test set.
Step S302: acquire, from the test set, historical data of the influencing factor indicators over a certain time and actual data of the KPI indicators over a certain time.
Step S303: input the historical data and the actual data into the KPI degradation root cause analysis model determined during model building.
Step S304: take each node of the decision trees in the KPI degradation root cause analysis model as an influencing factor, and calculate the impurity value by which that influencing factor reduces the decision trees on average.
Step S305: determine the calculated impurity value as the influence factor of the influencing factor on the KPI.
In order to describe the implementation of the present invention in more systematic detail, a fourth embodiment is given below in conjunction with an example, as shown in fig. 4.
Step S401: acquire the basic data needed to build the KPI degradation root cause analysis model, wherein the basic data comprises historical data of the influencing factors and actual data of the KPI to be analyzed.
In the fourth embodiment, the experimental data is a subset (18510 rows in total) of the data of a certain region, and the e-rab establishment success rate is taken as the example KPI. Study of the region's historical data shows that the data exhibit continuity, periodicity and correlation. The system determines the sample attributes as total downtilt, azimuth, wireless utilization, average e-rab number, mechanical downtilt, latitude, longitude, etc. (58 items in total). Sample data are shown in Table 1:
TABLE 1 sample data
Although the data volume in the experiment does not reach big-data scale, the experimental data can be used to verify the correctness of the algorithm, and can then be expanded to big-data scale for an algorithm prediction rate experiment.
Step S402: randomly sample the rows of the training data. For a total sample size S, k training samples are drawn with replacement.
Step S403: sample the column attributes of the training data. From the M columns of attributes, m attributes are drawn without replacement, and the number m of attributes randomly selected at each node is determined from the number M of attributes in the sample data. Typically, m is 1/3 of M in a regression model. The information content of each of the m attributes is calculated, and the attribute with the largest information content is selected for branching. A random forest regression method is used here, so m is taken as M/3. Of course, the ratio of m to M may also be determined according to the actual situation, which is not described further here.
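The column sampling of step S403 can be sketched as follows, using the 58 attributes of this embodiment and the one-third rule stated above. Rounding M/3 down with floor division is an assumption, as are the function name and the use of column indices in place of attribute names.

```python
import random


def choose_feature_subset(num_features, m, rng):
    """Pick m distinct column indices out of num_features, without replacement."""
    if not 1 <= m <= num_features:
        raise ValueError("m must satisfy 1 <= m <= num_features")
    return sorted(rng.sample(range(num_features), m))


M = 58      # number of sample attributes in the fourth embodiment
m = M // 3  # one third of M; rounding down is an assumption
rng = random.Random(7)
subset = choose_feature_subset(M, m, rng)
print(m, subset)
```

Because the draw is without replacement, no attribute appears twice in the subset; a fresh subset would be drawn each time the text calls for a new random selection.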
Step S404: and establishing a decision tree. A number of decision trees are built using a fully split approach for the sampled data.
Each decision tree classifier is combined to form a random forest.
Each decision tree produces a result; when the random forest is used for regression prediction, the k trees give k predictions $y_1, y_2, \ldots, y_k$.
Step S405: determine the result. The average of the decision trees' predicted values is calculated, and the final random forest output is recorded as the average of the k decision tree predictions, which can be expressed as:
$$\bar{y} = \frac{1}{k}\sum_{i=1}^{k} y_i$$
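The averaging of step S405 can be sketched as below. Each tree is represented by a plain callable that returns a prediction; the three constant "trees" and their values are hypothetical stand-ins, not results from the experiment.

```python
from statistics import fmean


def forest_predict(trees, features):
    """Average the predictions of all trees for one input row."""
    return fmean(tree(features) for tree in trees)


# Hypothetical stand-ins for k = 3 trained trees predicting a success rate.
trees = [lambda x: 0.90, lambda x: 0.94, lambda x: 0.98]
prediction = forest_predict(trees, features=[3.0, 120.0])
print(prediction)
```

In a real model each callable would walk a trained decision tree; only the final averaging is shown here.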
Step S406: feed the data in the test set into the random forest; each node in the decision trees is determined using impurity, measured by variance or least squares fitting.
Step S407: when training the decision trees, calculate how much each influencing factor reduces the impurity of the trees.
Step S408: the amount by which each influencing factor reduces the impurity of the trees is taken as the influence factor of that influencing factor on the KPI.
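Steps S407 and S408 can be sketched as an accumulation over the splits of every tree. The representation of a tree as a list of `(factor_name, impurity_decrease)` records, the function name and the numbers are all assumptions for illustration.

```python
from collections import defaultdict


def influence_factors(forest_splits):
    """Average, over all trees, the impurity decrease credited to each factor.

    forest_splits holds one list per tree; each entry is a
    (factor_name, impurity_decrease) record for one split node.
    """
    totals = defaultdict(float)
    for tree_splits in forest_splits:
        for name, decrease in tree_splits:
            totals[name] += decrease
    k = len(forest_splits)
    return {name: total / k for name, total in totals.items()}


# Hypothetical split records for a forest of k = 2 trees.
forest = [
    [("total_downtilt", 4.0), ("azimuth", 1.0)],
    [("total_downtilt", 2.0), ("wireless_utilization", 3.0)],
]
factors = influence_factors(forest)
print(factors)
```

A factor that never appears in a tree contributes zero for that tree, so factors used in many trees with large impurity decreases end up with the largest influence factors.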
Step S409: rank the influence factors of the influencing factors from high to low.
Step S410: from the high-to-low ranking, obtain the ranking of the influencing factors of the KPI, and obtain the main influencing factors and their influence factors.
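The ranking in steps S409 and S410 amounts to a descending sort. In this sketch the function name, the optional `top_n` cut-off and the example values are assumptions.

```python
def rank_influence_factors(factors, top_n=None):
    """Sort (factor_name, influence_factor) pairs from high to low."""
    ranked = sorted(factors.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


# Hypothetical influence factors for three influencing factors.
factors = {"azimuth": 0.5, "total_downtilt": 3.0, "wireless_utilization": 1.5}
main_factors = rank_influence_factors(factors, top_n=2)
print(main_factors)
```

Taking the first few entries of the ranking gives the main influencing factors together with their influence factors.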
The invention also discloses a KPI degradation root cause identification device based on random forests, and a fifth embodiment of the invention is provided firstly, as shown in fig. 5, for explaining the structural characteristics of the device.
The device comprises:
The data acquisition module 1 is used for acquiring relevant basic data for establishing a KPI degradation root cause analysis model, wherein the basic data comprises historical data of influence factors and KPI actual data to be analyzed;
The model building module 2 is used for selecting a set proportion of the basic data acquired by the data acquisition module as a training set, training a number of decision trees on the training set, and building the KPI degradation root cause analysis model from the decision trees.
Each node of a decision tree is an influencing factor; the amount by which each influencing factor reduces the impurity of the decision trees on average is calculated, and this impurity reduction is taken as the influence factor of that influencing factor.
Specifically, the model building module further includes:
The training set selecting unit 21 is used for selecting a set proportion of the basic data as a training set and, if the size of the training set is N, randomly drawing N training samples with replacement from the training set to serve as the training set of a decision tree.
The decision tree obtaining unit 22 is used for, if the feature dimension of each sample is M, designating a constant m < M, randomly selecting a subset of m features from the M features, and selecting the optimal feature from the m features for splitting, to obtain a decision tree.
The model building unit 23 is used for repeating the method of the decision tree obtaining unit k times to build k decision trees and thereby obtain the KPI degradation root cause analysis model.
And the calculating unit 24 is used for averaging the results of the k decision trees built by the model building unit and taking the average as the result of the KPI degradation root cause analysis model.
And the influence factor determining module 3 is used for taking the basic data remaining after the model building module selects the training set as a test set, and using the KPI degradation root cause analysis model built by the model building module to obtain the influence factor of each influencing factor on each KPI.
The influence factor determination module 3 further comprises:
The test set obtaining unit 31 is configured to use the basic data after the training set selection unit selects the training set as a test set.
A test data acquisition unit 32, configured to acquire, from the test set obtained by the test set obtaining unit, historical data of the influencing factors over a certain time and actual data of the KPI indicators over a certain time.
And an influence factor calculating unit 33, configured to input the historical data and the actual data acquired by the test data acquisition unit into the KPI degradation root cause analysis model determined by the model building module, take each node of the decision trees in the model as an influencing factor, and calculate the impurity value by which that influencing factor reduces the decision trees on average.
An influence factor determining unit 34, configured to determine the impurity value calculated by the influence factor calculating unit as the influence factor of the influencing factor on the KPI.
Preferably, for easier selection of the main influencing factor from among the influencing factors, the device may further comprise:
And the sorting module 4 is used for sorting the determined influence factors according to a set rule.
And the main influence factor determining module 5 is used for determining the main influencing factors affecting the KPI, and their influence factors, according to the influence factor sorting result of the sorting module.
It will be clear to those skilled in the art that, for convenience and brevity of description, the corresponding process in the above-described apparatus embodiment may refer to the specific working process of the foregoing method, which is not described herein again.
In the present specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be capable of operation in sequences other than those illustrated herein.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A KPI degradation root cause identification method based on a random forest, characterized by comprising the following steps:
acquiring the basic data needed to build a KPI degradation root cause analysis model, wherein the basic data comprises historical data of the influencing factors and actual data of the KPI to be analyzed;
selecting a set proportion of the basic data as a training set, training a number of decision trees on the training set, and constructing the KPI degradation root cause analysis model from the decision trees, which specifically comprises:
obtaining k decision trees to generate a random forest model, namely the KPI degradation root cause analysis model;
averaging the results of all the decision trees to obtain the result of the random forest model, namely the analysis result of the KPI degradation root cause analysis model;
taking the rest of the basic data as a test set, and using the KPI degradation root cause analysis model to obtain the influence factor of each influencing factor on each KPI;
The influence factors comprise a retentivity index, an access class index, a mobility index, a resource class index and a system capacity class index, and the influence factors are sample characteristics;
The influencing factors further comprise: total downtilt, azimuth, wireless utilization, average e-rab number, mechanical downtilt, latitude and longitude.
2. The method according to claim 1, wherein training a number of decision trees on the training set specifically comprises:
randomly drawing N training samples with replacement from a training set of size N, to be used as the training set of a decision tree;
wherein the feature dimension of each training sample is M, m features (m ≤ M) are randomly selected as a feature subset, and each time the tree splits, the optimal feature among the m features is selected for the split, thereby obtaining a decision tree.
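By way of non-limiting illustration, the two sampling steps of claim 2 might be sketched as follows. The helper names, the seed and the example feature names are assumptions. The sketch draws the feature subset once per tree, as the claim wording suggests; many random forest implementations redraw it at every split instead.

```python
import random


def bootstrap_sample(training_set, rng):
    """Draw N samples with replacement from a training set of size N."""
    n = len(training_set)
    return [training_set[rng.randrange(n)] for _ in range(n)]


def random_feature_subset(feature_names, m, rng):
    """Pick m of the M features (m <= M), each at most once."""
    return rng.sample(feature_names, m)


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
features = [
    "total_downtilt",
    "azimuth",
    "wireless_utilization",
    "average_erab_number",
    "mechanical_downtilt",
]
training_set = list(range(100))  # placeholder record identifiers

sample = bootstrap_sample(training_set, rng)        # per-tree training set
subset = random_feature_subset(features, 3, rng)    # m = 3 of M = 5
```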
3. The method according to any one of claims 1-2, characterized in that:
each node of the decision tree is an influencing factor; the average reduction in impurity attributable to each influencing factor is calculated, and this value is taken as the influence factor of that influencing factor.
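By way of non-limiting illustration, the average impurity reduction of claim 3 might be computed as follows. The node dictionary layout (`feature`, `n`, `imp`, `n_left`, `imp_left`, `n_right`, `imp_right`), the sample-count weighting and the numeric values are assumptions for this sketch; the claims do not specify a data structure or a weighting scheme.

```python
from collections import defaultdict


def tree_importance(nodes, n_total):
    """Sum the weighted impurity decrease of every split, grouped by feature."""
    scores = defaultdict(float)
    for node in nodes:
        n = node["n"]
        # Impurity left in the two children, weighted by their sample counts.
        children = (
            node["n_left"] * node["imp_left"]
            + node["n_right"] * node["imp_right"]
        ) / n
        scores[node["feature"]] += (n / n_total) * (node["imp"] - children)
    return dict(scores)


def mean_decrease_in_impurity(forest, n_total):
    """Average each feature's impurity decrease over all trees in the forest."""
    totals = defaultdict(float)
    for nodes in forest:
        for feature, score in tree_importance(nodes, n_total).items():
            totals[feature] += score
    return {feature: total / len(forest) for feature, total in totals.items()}


# Two hypothetical single-split trees, each trained on 100 samples.
forest = [
    [{"feature": "wireless_utilization", "n": 100, "imp": 4.0,
      "n_left": 50, "imp_left": 1.0, "n_right": 50, "imp_right": 1.0}],
    [{"feature": "azimuth", "n": 100, "imp": 4.0,
      "n_left": 50, "imp_left": 3.0, "n_right": 50, "imp_right": 3.0}],
]
importance = mean_decrease_in_impurity(forest, n_total=100)
```

In this made-up forest the split on `wireless_utilization` removes more impurity than the split on `azimuth`, so it receives the larger influence factor.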
4. The method according to claim 3, characterized in that:
the KPI degradation root cause analysis model is fitted using a variance or least-squares method.
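By way of non-limiting illustration, one common reading of the variance or least-squares fitting in claim 4 is that each split is chosen to minimize the sum of squared errors of the two resulting groups. The sketch below assumes that reading; the function names and the data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not values:
        return 0.0
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, error) of the least-squares split on one feature."""
    best = (None, sse(ys))  # baseline: no split at all
    for t in sorted(set(xs))[:-1]:  # the largest value cannot split the data
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (t, total)
    return best


# Hypothetical feature values (xs) and KPI values (ys).
xs = [0.2, 0.3, 0.4, 0.8, 0.9]
ys = [1.0, 1.0, 1.0, 5.0, 5.0]
threshold, error = best_split(xs, ys)
```

With these values the split at 0.4 separates the two KPI levels exactly, so the remaining squared error is zero.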
5. The method according to claim 3, characterized in that:
the influence factors obtained for the influencing factors of the KPI are sorted, and the influencing factors and their influence factors are output according to the sorting result.
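By way of non-limiting illustration, the sorting and output step of claim 5 might look like the following. The descending order, the function name and the numeric influence factors are assumptions made for this sketch.

```python
def rank_influence_factors(influence):
    """Sort (influencing factor, influence factor) pairs, largest first."""
    return sorted(influence.items(), key=lambda item: item[1], reverse=True)


# Hypothetical influence factors for one KPI (made-up values).
influence = {
    "azimuth": 0.10,
    "wireless_utilization": 0.45,
    "total_downtilt": 0.25,
    "average_erab_number": 0.20,
}

ranking = rank_influence_factors(influence)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```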
6. A KPI degradation root cause identification apparatus based on a random forest, the apparatus comprising:
a data acquisition module, configured to acquire the basic data needed to establish a KPI degradation root cause analysis model, wherein the basic data comprises historical data of the influencing factors and actual data of the KPIs to be analyzed; wherein
the influencing factors comprise a retainability index, an access-class index, a mobility index, a resource-class index and a system-capacity-class index, and the influencing factors are the sample features;
the influencing factors further comprise: total downtilt, azimuth, wireless utilization, average E-RAB number, mechanical downtilt, and latitude and longitude;
a model building module, configured to select a set proportion of the basic data acquired by the data acquisition module as a training set, train a number of decision trees on the training set, and build the KPI degradation root cause analysis model from the decision trees; the model building module comprises: a model building unit, configured to repeat k times the method by which the decision tree acquisition unit obtains a decision tree, thereby building k decision trees to obtain the KPI degradation root cause analysis model; and a calculating unit, configured to calculate the average of the k decision tree results established by the decision tree establishing unit and to take the average as the result of the KPI degradation root cause analysis model;
and an influence factor determining module, configured to take the basic data remaining after the model building module selects the training set as a test set, and to use the KPI degradation root cause analysis model built by the model building module to obtain the influence factors of the influencing factors affecting each KPI.
7. The apparatus of claim 6, wherein the model building module further comprises:
a training set selecting unit, configured to select a set proportion of the basic data as a training set and, where the size of the training set is N, to randomly draw N training samples from the training set with replacement to serve as the training set of a decision tree;
and a decision tree acquisition unit, configured to, where the feature dimension of each sample is M, specify a constant m < M, randomly select a subset of m features from the M features, and select the optimal feature among the m features for each split, thereby obtaining a decision tree.
8. The apparatus according to claim 7, wherein:
each node of the decision tree is an influencing factor; the average reduction in impurity attributable to each influencing factor is calculated, and this value is taken as the influence factor of that influencing factor.
9. The apparatus of claim 8, wherein the influence factor determination module further comprises:
a test set acquisition unit, configured to take the basic data remaining after the training set selecting unit selects the training set as a test set;
a test data acquisition unit, configured to acquire, from the test set of the test set acquisition unit, historical data of the influencing factors over a certain period and actual data of the KPIs over a certain period;
an influence factor calculation unit, configured to input the historical data and the actual data acquired by the test data acquisition unit into the KPI degradation root cause analysis model determined by the model building module, take each node of the decision trees in the KPI degradation root cause analysis model as an influencing factor, and calculate the average reduction in decision tree impurity attributable to that influencing factor;
and an influence factor determining unit, configured to determine the impurity value calculated by the influence factor calculation unit as the influence factor of the influencing factor on the KPI.
10. The apparatus according to any one of claims 6-9, wherein the apparatus further comprises:
a sorting module, configured to sort the determined influence factors according to a set rule;
and a main influencing factor determining module, configured to determine the main influencing factors affecting the KPI according to the influence factor sorting result of the sorting module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810938061.0A (CN110837841B) | 2018-08-17 | 2018-08-17 | KPI degradation root cause identification method and device based on random forest |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110837841A | 2020-02-25 |
CN110837841B | 2024-05-21 |
Family
ID=69573552
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201810938061.0A | Active | CN110837841B (en) | 2018-08-17 | 2018-08-17 | KPI degradation root cause identification method and device based on random forest |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110837841B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111553816B (en) * | 2020-04-20 | 2023-11-03 | 北京北大软件工程股份有限公司 | Administrative multiple-proposal influence factor analysis method and device |
CN113988488B (en) * | 2021-12-27 | 2022-06-21 | 上海一嗨成山汽车租赁南京有限公司 | Method for predicting ETC passing probability of vehicle by multiple factors |
CN116016303A (en) * | 2022-12-05 | 2023-04-25 | 浪潮通信信息系统有限公司 | Method for identifying service quality problem of core network based on artificial intelligence |
CN115809761B (en) * | 2023-01-19 | 2023-05-12 | 佰聆数据股份有限公司 | Voltage quality analysis method and system based on low-voltage transformer area |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104684004A (en) * | 2015-02-28 | 2015-06-03 | 浙江省通信产业服务有限公司 | Complex wireless communication network operation quality evaluation method based on fuzzy analysis |
WO2018014674A1 (en) * | 2016-07-20 | 2018-01-25 | 中兴通讯股份有限公司 | Method, apparatus, and system for determining degree of association of input and output of black box system |
CN108076197A (en) * | 2016-11-14 | 2018-05-25 | 中移(苏州)软件技术有限公司 | A kind of detection method and device of terminal network performance degradation |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7813298B2 (en) * | 2008-01-31 | 2010-10-12 | Telefonaktiebolaget Lm Ericsson | Root cause problem detection in network traffic information |
CN103138963B (en) * | 2011-11-25 | 2016-08-03 | 华为技术有限公司 | Network problem positioning method and device based on user perception |
US10230747B2 (en) * | 2014-07-15 | 2019-03-12 | Cisco Technology, Inc. | Explaining network anomalies using decision trees |
US20170019312A1 (en) * | 2015-07-17 | 2017-01-19 | Brocade Communications Systems, Inc. | Network analysis and management system |
US9961571B2 (en) * | 2015-09-24 | 2018-05-01 | Futurewei Technologies, Inc. | System and method for a multi view learning approach to anomaly detection and root cause analysis |
US10397810B2 (en) * | 2016-01-08 | 2019-08-27 | Futurewei Technologies, Inc. | Fingerprinting root cause analysis in cellular systems |
US20170364819A1 (en) * | 2016-06-17 | 2017-12-21 | Futurewei Technologies, Inc. | Root cause analysis in a communication network via probabilistic network structure |
US10361935B2 (en) * | 2017-01-31 | 2019-07-23 | Cisco Technology, Inc. | Probabilistic and proactive alerting in streaming data environments |
Also Published As
Publication number | Publication date |
---|---|
CN110837841A (en) | 2020-02-25 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110837841B (en) | KPI degradation root cause identification method and device based on random forest | |
CN109872036B (en) | Task allocation method and device based on classification algorithm and computer equipment | |
CN112365171B (en) | Knowledge graph-based risk prediction method, device, equipment and storage medium | |
CN105786860B (en) | A data processing method and device in data modeling | |
CN105718493B (en) | Search result ordering method and its device based on decision tree | |
CN111652468B (en) | Business process generation method, device, storage medium and computer equipment | |
CN105550583A (en) | Random forest classification method based detection method for malicious application in Android platform | |
CN113240400B (en) | A candidate determination method and device based on knowledge graph | |
CN112232171B (en) | Remote sensing image information extraction method and device based on random forest and storage medium | |
CN111178633A (en) | Method and device for predicting scenic spot passenger flow based on random forest algorithm | |
CN111027771A (en) | Scenic spot passenger flow volume estimation method, system and device and storable medium | |
CN114245392B (en) | 5G network optimization method and system | |
Carrillo et al. | How predicting the academic success of students of the ESPAM MFL?: a preliminary decision trees based study | |
CN111027599A (en) | Clustering visualization method and device based on random sampling | |
Arınık et al. | Multiplicity and diversity: analysing the optimal solution space of the correlation clustering problem on complete signed graphs | |
CN109802847A (en) | A kind of analysis method of network transmission service quality, device | |
CN108055638A (en) | Obtain method, apparatus, computer-readable medium and the equipment of target location | |
CN116304155A (en) | Three-dimensional member retrieval method, device, equipment and medium based on two-dimensional picture | |
CN105930453A (en) | Repeatability analyzing method and device | |
CN114757712A (en) | Recommended method, apparatus, electronic device and readable storage medium for site selection | |
CN104636474A (en) | Method and equipment for establishment of audio fingerprint database and method and equipment for retrieval of audio fingerprints | |
CN115660730A (en) | Loss user analysis method and system based on classification algorithm | |
CN114239849A (en) | Weather disaster prediction and model training method, device, equipment and storage medium | |
CN112990246B (en) | Method and device for establishing isolated tree model | |
CN114662574A (en) | Method and device for constructing decision tree, storage medium and electronic device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |