Disclosure of Invention
The invention aims to provide a large-model intelligent early warning and intervention method and system based on risk prevention and control, so as to solve the problems in the background technology.
To achieve the above purpose, the invention provides the following technical scheme: a large-model intelligent early warning and intervention method based on risk prevention and control, comprising the following steps:
Collecting data, namely collecting internal data of the enterprise and external data of related enterprises, wherein the internal data and the external data comprise historical user data and real-time data;
Preprocessing the internal data and the external data, namely setting a user in the historical user data who keeps using the enterprise's service throughout a subsequent continuous set time as a non-risk user, and setting a user who uses the enterprise's service for less than the set time as a risk user;
Establishing a preliminary prediction model, namely establishing the preliminary prediction model according to the historical user data, and explaining the loss risk of the client in the historical user data through the preliminary prediction model;
Calculating an external addition coefficient according to the external data, and calculating the external addition coefficient showing the change trend of the external data by an external addition coefficient calculation method;
Constructing a risk prediction model, namely combining the external addition coefficient with the preliminary prediction model to construct the risk prediction model, wherein the external addition coefficient compensates for the prediction lag of the preliminary prediction model on real-time data and improves the prediction accuracy;
Calculating an intervention grade, namely calculating the current value and the predicted value of each user in the real-time data and combining them into a comprehensive value, predicting the real-time data in the internal data through the risk prediction model to obtain a loss risk prediction value, calculating the intervention grade according to the comprehensive value and the loss risk prediction value, and carrying out preset intervention behaviors according to the intervention grade.
Preferably, the external addition coefficient calculation method includes:
both the historical user data and the real-time data include multiple data types;
Splitting the external data into different enterprise categories according to the source of the external data;
calculating, by taking the enterprise class as a unit, the growth proportion of each data type in the real-time data relative to the same data type in the historical user data;
Respectively calculating influence factors of different enterprise categories on the enterprise by an influence factor calculation method;
taking the respective influence factors of each enterprise class as weight coefficients of target enterprises, and carrying out weight summation on the increasing proportion of each data type to obtain the comprehensive increasing proportion of each data type;
and calculating the emphasis coefficients of the different data types by a data type emphasis coefficient calculation method, taking each emphasis coefficient as the weight of the comprehensive growth proportion of the corresponding data type, and carrying out weighted summation on the comprehensive growth proportions of the data types to obtain the external addition coefficient.
Preferably, the influence factor calculating method includes:
Counting the number and acceptance of users in each enterprise class at different time points;
Calculating the user quantity increasing rate and the acceptance increasing rate between every two time points by taking the enterprise type as a unit, and calculating the average increasing rate of the user quantity and the average increasing rate of the acceptance according to the user quantity increasing rates and the acceptance increasing rates;
Taking the average growth rate of the number of users as a weight coefficient of the number of users, taking the average growth rate of the acceptance as a weight coefficient of the acceptance, and carrying out weight summation on the number of users and the acceptance to obtain an influence false value;
And carrying out probability normalization processing on the influence false values corresponding to the enterprise types to obtain a plurality of influence factors with the sum of 1.
Preferably, the data type emphasis coefficient calculating method includes:
Sorting the enterprise categories according to the sizes of the influence factors, and selecting 50% of the enterprise categories from large to small according to the influence factors;
Taking the enterprise class as a unit, calculating the ratio among the increasing proportions of multiple data types in the target enterprise class as a unit ratio;
and calculating the average ratio of unit ratios corresponding to the selected enterprise categories, and carrying out probability normalization processing on each item in the average ratio to obtain a plurality of data type emphasis coefficients with the sum of 1.
Preferably, the data type emphasis coefficient calculating method includes:
counting the number of users in each enterprise category at different time points;
Calculating the user quantity increasing rate between every two time points by taking the enterprise type as a unit, and calculating the average increasing rate of the user quantity according to the plurality of user quantity increasing rates;
sorting the enterprise categories according to the average growth rate of the number of users, and selecting 50% of the enterprise categories from large to small according to the average growth rate of the number of users;
Taking the enterprise class as a unit, calculating the ratio among the increasing proportions of multiple data types in the target enterprise class as a unit ratio;
and calculating the average ratio of unit ratios corresponding to the selected enterprise categories, and carrying out probability normalization processing on each item in the average ratio to obtain a plurality of data type emphasis coefficients with the sum of 1.
Preferably, the data type emphasis coefficient calculating method includes:
sorting the enterprise categories according to the sizes of the influence factors, and selecting 50% of the enterprise categories from large to small according to the influence factors as class-one enterprise categories;
counting the number of users in each enterprise category at different time points;
Calculating the user quantity increasing rate between every two time points by taking the enterprise type as a unit, and calculating the average increasing rate of the user quantity according to the plurality of user quantity increasing rates;
Sorting the enterprise categories according to the average growth rate of the number of users, and selecting 50% of the enterprise categories from large to small according to the average growth rate of the number of users as class-two enterprise categories;
respectively calculating the average ratio among the growth proportions of the multiple data types in the class-one enterprise categories and in the class-two enterprise categories;
And calculating a weighted average ratio of the average ratio corresponding to the class-one enterprise categories and the average ratio corresponding to the class-two enterprise categories with weights of 6:4, and carrying out probability normalization processing on the weighted average ratio to obtain a plurality of data type emphasis coefficients with the sum of 1.
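As a non-limiting illustration of the final blending step of this variant, the following Python sketch combines the two average ratios with weights 0.6 and 0.4 and then normalizes the result. The function name `blend_emphasis` and the example ratios are hypothetical and are not taken from the invention.

```python
def blend_emphasis(avg_class_one, avg_class_two, w_one=0.6, w_two=0.4):
    """Blend two average ratios 6:4, then normalise so the items sum to 1."""
    if len(avg_class_one) != len(avg_class_two):
        raise ValueError("both ratios must cover the same data types")
    blended = [w_one * a + w_two * b
               for a, b in zip(avg_class_one, avg_class_two)]
    total = sum(blended)
    return [v / total for v in blended]


# Hypothetical average ratios over three data types.
coefficients = blend_emphasis([9.0, 2.5, 3.5], [8.0, 2.0, 5.0])
print([round(c, 4) for c in coefficients])
```

The 6:4 weights give the categories selected by influence factor slightly more say than those selected by growth rate, and the normalization keeps the result usable directly as weights.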
Preferably, the establishing the preliminary prediction model includes:
the internal data comprises a plurality of parameter types, the parameter types are numbered sequentially, the risk value of the risk user is numbered 1, and the risk value of the non-risk user is numbered 0;
Establishing a preliminary prediction model through a linear regression model, wherein the preliminary prediction model is specifically:
$$\hat{y} = \beta_0 + \sum_{h=1}^{n} \beta_h x_h$$
Wherein $\hat{y}$ represents the predicted risk value of the user, $\beta_0$ is the intercept term of the linear regression model, $x_h$ represents the characteristic value of the h-th parameter type in the internal data, $\beta_h$ represents the regression coefficient of $x_h$, and $n$ represents the total number of parameter types in the internal data;
Then, according to the historical user data in the internal data, minimizing the loss function through an optimization algorithm to obtain $\beta_0$ and $\beta_h$, thereby obtaining an accurate preliminary prediction model.
Preferably, the constructing the risk prediction model includes:
combining the preliminary prediction model with an external addition coefficient to obtain a risk prediction model, wherein the risk prediction model specifically comprises the following steps:
Wherein $R$ indicates the loss risk prediction value, a larger value indicating a larger loss risk, $\lambda$ represents the external addition coefficient, $\beta_0$ is the intercept term of the linear regression model, $x_h$ represents the characteristic value of the h-th parameter type in the internal data, $\beta_h$ represents the regression coefficient of $x_h$, and $n$ represents the total number of parameter types in the internal data.
Preferably, the calculating the intervention level includes:
the real-time data of the internal data includes monthly average consumption and service duration;
Respectively carrying out normalization processing on the monthly average consumption data and the service duration data of each user to obtain the current value and the predicted value of the user;
Calculating an average value of the current value and the predicted value as a comprehensive value;
predicting real-time data in the internal data through a risk prediction model to obtain a loss risk prediction value;
calculating the product of the comprehensive value and the loss risk predicted value as an intervention value;
and setting intervention thresholds of different grades, determining the intervention grade of the user according to the magnitude of the intervention value, and carrying out preset intervention behaviors according to the intervention grade.
Compared with the prior art, the invention has the beneficial effects that:
The external addition coefficient is calculated from the external data and reflects the influence of changes in the external environment on the user loss of the enterprise. Combining it with the preliminary prediction model established from the internal data of the enterprise yields a risk prediction model with higher prediction accuracy. Because the data changes of the enterprise are combined with the data changes of the external environment, the model reflects the competitiveness of the enterprise in its external environment, which improves the prediction accuracy of the user loss risk.
Meanwhile, the current value and the predicted value of each user are calculated from the internal data of the enterprise to judge the comprehensive value of the user, and the intervention grade is further refined according to the loss risk prediction value of the user. Different intervention behaviors are formulated for different intervention grades, so that the intervention behaviors are more targeted and more effective.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the present application, for the convenience of understanding, the steps of the method used need not be performed in the order of steps in the present embodiment, and in other embodiments, the steps may be performed synchronously or in a changed order.
Embodiment one:
When the risk model of the customer loss is constructed, the model is trained according to historical data of the enterprise, and then the influence of external data changes on the enterprise is calculated by combining the changes of the external environment, such as data of the enterprise of the same type and data of related enterprises, so that the prediction accuracy of the loss risk can be improved.
Referring to figs. 1-4 and 6, the invention provides a technical scheme: a large-model intelligent early warning and intervention method based on risk prevention and control, comprising the following steps:
Collecting data, namely collecting internal data of the enterprise and external data of related enterprises, wherein the internal data and the external data comprise historical user data and real-time data;
Preprocessing the internal data and the external data, namely setting a user in the historical user data who keeps using the enterprise's service throughout a subsequent continuous set time as a non-risk user, and setting a user who uses the enterprise's service for less than the set time as a risk user;
Establishing a preliminary prediction model, namely establishing the preliminary prediction model according to the historical user data, and explaining the loss risk of the client in the historical user data through the preliminary prediction model;
Calculating an external addition coefficient according to the external data, and calculating the external addition coefficient showing the change trend of the external data by an external addition coefficient calculation method;
Constructing a risk prediction model, namely combining the external addition coefficient with the preliminary prediction model to construct the risk prediction model, wherein the external addition coefficient compensates for the prediction lag of the preliminary prediction model on real-time data and improves the prediction accuracy;
Calculating an intervention grade, namely calculating the current value and the predicted value of each user in the real-time data and combining them into a comprehensive value, predicting the real-time data in the internal data through the risk prediction model to obtain a loss risk prediction value, calculating the intervention grade according to the comprehensive value and the loss risk prediction value, and carrying out preset intervention behaviors according to the intervention grade.
It should be noted that the data of the enterprise can be queried through the enterprise's internal system, and the external data of related enterprises can be learned from the public publicity materials of those enterprises or obtained statistically by means of questionnaires.
The establishing of the preliminary prediction model comprises the following steps:
the internal data comprises a plurality of parameter types, the parameter types are numbered sequentially, the risk value of the risk user is numbered 1, and the risk value of the non-risk user is numbered 0;
Establishing a preliminary prediction model through a linear regression model, wherein the preliminary prediction model is specifically:
$$\hat{y} = \beta_0 + \sum_{h=1}^{n} \beta_h x_h$$
Wherein $\hat{y}$ represents the predicted risk value of the user, $\beta_0$ is the intercept term of the linear regression model, $x_h$ represents the characteristic value of the h-th parameter type in the internal data, $\beta_h$ represents the regression coefficient of $x_h$, and $n$ represents the total number of parameter types in the internal data;
Then, according to the historical user data in the internal data, minimizing the loss function through an optimization algorithm to obtain $\beta_0$ and $\beta_h$, thereby obtaining an accurate preliminary prediction model.
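As a non-limiting illustration, the fitting step may be sketched in Python as follows. The sketch solves the least squares normal equations by Gaussian elimination, which is one possible optimization algorithm and not necessarily the one used by the invention. The function names, the two chosen parameter types (number of complaints and network access duration) and the sample values are hypothetical.

```python
def fit_least_squares(rows, targets):
    """Return [b0, b1, ..., bn] minimising the mean squared error."""
    # Design matrix with a leading 1 for the intercept term b0.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular system: more or less collinear samples needed")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coeffs, row):
    """Evaluate b0 + sum(b_h * x_h) for one user."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


# Hypothetical training data: (number of complaints, network access years).
rows = [(4, 3), (8, 2), (2, 5), (1, 6), (7, 1), (3, 4)]
risk_values = [0, 1, 0, 0, 1, 0]  # 1 = risk user, 0 = non-risk user
coeffs = fit_least_squares(rows, risk_values)
print(predict(coeffs, (6, 2)))
```

At least as many non-collinear samples as coefficients are needed for the normal equations to have a unique solution, which is why the sketch uses six users and two parameter types.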
It should be noted that, for ease of understanding, the simulation data are set as follows:
Taking a business hall as an example, the historical user data in the internal data from one year ago comprises the gender, age, number of complaints, monthly average consumption and network access duration of each user. These items form a feature vector, which is set as the independent variable (gender is coded as 1 for male and 0 for female). Whether a user continues to hold a number of the business hall during the following year is used as the judgment basis: a user who holds the number continuously for one year is set as a non-risk user with a risk value of 0, and a user who holds the number for less than one year is set as a risk user with a risk value of 1. The risk value is set as the dependent variable, and the model is trained on the data of a plurality of users;
Suppose that the data of the user is as follows:
The first user is a non-risk user: gender not stated, age 32, 4 complaints, monthly average consumption 68, network access duration 3 years;
The second user is a risk user: female, age 25, 8 complaints, monthly average consumption 108, network access duration 2 years;
The third user is a risk user: male, age 38, 2 complaints, monthly average consumption 48, network access duration 5 years;
In training, the mean square error is adopted as the loss function and the least squares method is adopted as the optimization algorithm to determine the specific values of $\beta_0$ and $\beta_h$. The calculation uses a prior-art method, which is not described in detail herein. The accurate preliminary prediction model calculated from the above hypothetical data is:
Wherein $\hat{y}$ represents the predicted risk value of the user, $x_1$ represents the gender of the user, $x_2$ represents the age of the user, $x_3$ represents the number of complaints of the user, $x_4$ represents the monthly average consumption of the user, and $x_5$ represents the network access duration of the user.
Next, before the external addition coefficient is calculated, the influence factor of each enterprise class and the data type emphasis coefficient of each data type are calculated first.
As shown in fig. 3, the influence factor calculation method includes:
Counting the number and acceptance of users in each enterprise class at different time points;
Calculating the user quantity increasing rate and the acceptance increasing rate between every two time points by taking the enterprise type as a unit, and calculating the average increasing rate of the user quantity and the average increasing rate of the acceptance according to the user quantity increasing rates and the acceptance increasing rates;
Taking the average growth rate of the number of users as a weight coefficient of the number of users, taking the average growth rate of the acceptance as a weight coefficient of the acceptance, and carrying out weight summation on the number of users and the acceptance to obtain an influence false value;
And carrying out probability normalization processing on the influence false values corresponding to the enterprise types to obtain a plurality of influence factors with the sum of 1.
It should be noted that, for ease of understanding, the following analog data are employed:
the external enterprises are three, namely a first enterprise, a second enterprise and a third enterprise;
Taking the first enterprise as an example, its current number of users (which can be found through a web search or on the official website of the first enterprise) and its acceptance (which can be obtained through a questionnaire or random interviews) are 1000 and 60% respectively;
Taking the number of users as an example, the numbers of users of the first enterprise in the two preceding months are 950 and 980 respectively;
First, the month-on-month growth proportion of the number of users of the first enterprise is calculated (there are three months of data, so there are two results): (980-950)/950 ≈ 0.03 and (1000-980)/980 ≈ 0.02. The average of the two growth proportions gives an average growth rate of the number of users of 0.025. The average growth rate of the acceptance of the first enterprise is calculated in the same way and is assumed to be 0.05;
The number of users and the acceptance are then weighted by the two average growth rates and summed to obtain the influence false value: 0.025 × 1000 + 0.05 × 60% = 25.03. The influence false values of the second enterprise and the third enterprise are calculated in the same way and are assumed to be 16.37 and 19.88 respectively. Probability normalization processing is then carried out on the three values: for the first enterprise, 25.03/(25.03+16.37+19.88) ≈ 0.40; for the second enterprise, 16.37/(25.03+16.37+19.88) ≈ 0.27; and for the third enterprise, 19.88/(25.03+16.37+19.88) ≈ 0.33. In this way the influence factor of each external enterprise on the user loss of the enterprise is obtained.
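As a non-limiting illustration, the influence factor calculation above may be sketched in Python as follows. The function names are hypothetical. The sketch reuses the rounded growth rate 0.025 and the assumed values 16.37 and 19.88 from the text; without intermediate rounding the three shares come out near 0.408, 0.267 and 0.324, which the example above reports as 0.40, 0.27 and 0.33.

```python
def average_growth_rate(series):
    """Mean of the period-over-period growth rates of a time series."""
    rates = [(cur - prev) / prev for prev, cur in zip(series, series[1:])]
    return sum(rates) / len(rates)


def influence_false_value(users, user_rate, acceptance, acceptance_rate):
    """Sum of user count and acceptance, each weighted by its average growth rate."""
    return user_rate * users + acceptance_rate * acceptance


def normalise(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


# Unrounded average growth of the first enterprise's user count (about 0.026).
exact_rate = average_growth_rate([950, 980, 1000])
# Rounded rates as used in the text: 0.025 for users, 0.05 for acceptance.
first = influence_false_value(1000, 0.025, 0.60, 0.05)
# 16.37 and 19.88 are the values assumed in the text for the other enterprises.
factors = normalise([first, 16.37, 19.88])
print(round(first, 2), [round(f, 3) for f in factors])
```

Because the user count (1000) is several orders of magnitude larger than the acceptance (0.60), the user term dominates the influence false value in this example.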
As shown in fig. 4, the data type emphasis coefficient calculation method includes:
both the historical user data and the real-time data include multiple data types;
sorting the enterprise categories according to the sizes of the influence factors, and selecting a certain number of enterprise categories from large to small according to the influence factors;
Taking the enterprise class as a unit, calculating the ratio among the increasing proportions of multiple data types in the target enterprise class as a unit ratio;
and calculating the average ratio of unit ratios corresponding to the selected enterprise categories, and carrying out probability normalization processing on each item in the average ratio to obtain a plurality of data type emphasis coefficients with the sum of 1.
It should be noted that, for ease of understanding, the following analog data are employed:
The data types are divided into advertisement delivery quantity, complaint quantity and package change quantity (the three types of data can be collected by querying the official website of each enterprise and by means of random interviews and questionnaires);
Continuing with the simulation data above, the ordering by influence factor is first enterprise > third enterprise > second enterprise. Two enterprises are selected (50% of the enterprises, with the number actually selected obtained by rounding; this is a simplified example), namely the first enterprise and the third enterprise;
assume that the ratio among the advertisement delivery quantity, complaint quantity and package change quantity of the first enterprise is 10:3:2, and the ratio among the advertisement delivery quantity, complaint quantity and package change quantity of the third enterprise is 8:2:5;
Calculating the average ratio of the two ratios to be 9:2.5:3.5;
Probability normalization processing is carried out on 9, 2.5 and 3.5 to obtain 9/(9+2.5+3.5) = 0.6, 2.5/(9+2.5+3.5) ≈ 0.17 and 3.5/(9+2.5+3.5) ≈ 0.23 respectively, so that the data type emphasis coefficients of the three data types (advertisement delivery quantity, complaint quantity and package change quantity) are 0.6, 0.17 and 0.23 respectively.
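As a non-limiting illustration, the selection and averaging steps above may be sketched in Python as follows. The function names are hypothetical, and rounding the 50% share half-up (so that 1.5 becomes 2) is an assumption about how the rounding mentioned in the example is performed.

```python
def select_top_half(scores):
    """Names of the top 50% by score; half-up rounding is an assumption."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * 0.5 + 0.5))
    return ranked[:keep]


def emphasis_coefficients(selected, unit_ratios):
    """Average the unit ratios of the selected enterprises, then normalise."""
    n_types = len(unit_ratios[selected[0]])
    average = [sum(unit_ratios[name][i] for name in selected) / len(selected)
               for i in range(n_types)]
    total = sum(average)
    return [v / total for v in average]


influence = {"first": 0.40, "second": 0.27, "third": 0.33}
# Ratios (advertisement : complaint : package change) given in the example.
unit_ratios = {"first": [10, 3, 2], "third": [8, 2, 5]}
selected = select_top_half(influence)
coefficients = emphasis_coefficients(selected, unit_ratios)
print(selected, [round(c, 2) for c in coefficients])
```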
After the influence factors and the data type emphasis coefficients are calculated, the external addition coefficients are calculated.
As shown in fig. 2, the external addition coefficient calculation method includes:
both the historical user data and the real-time data include multiple data types;
Splitting the external data into different enterprise categories according to the source of the external data;
calculating, by taking the enterprise class as a unit, the growth proportion of each data type in the real-time data relative to the same data type in the historical user data;
taking the respective influence factors of each enterprise class as weight coefficients of target enterprises, and carrying out weight summation on the increasing proportion of each data type to obtain the comprehensive increasing proportion of each data type;
And taking each data type emphasis coefficient as the weight of the comprehensive growth proportion of the corresponding data type, and carrying out weighted summation on the comprehensive growth proportions of the data types to obtain the external addition coefficient.
It should be noted that, for the sake of easy understanding, the above-described analog data is extended, and the remaining analog data is set as follows:
Taking the first enterprise as an example, the advertisement delivery quantity, the complaint quantity and the package change quantity in the historical user data (last year) are 8000, 1600 and 2000 respectively, and those in the real-time data (most recently acquired) are 10000, 2000 and 2500 respectively. The growth proportion of the advertisement delivery quantity is (10000-8000)/8000 = 0.25, the growth proportion of the complaint quantity is (2000-1600)/1600 = 0.25, and the growth proportion of the package change quantity is (2500-2000)/2000 = 0.25;
The method can calculate the growth proportion of various data types of the second enterprise and the third enterprise by the same calculation method, and the growth proportion of the advertisement delivery quantity of the second enterprise is 0.20, the growth proportion of the complaint quantity is 0.15, the growth proportion of the package change quantity is 0.18, the growth proportion of the advertisement delivery quantity of the third enterprise is 0.15, the growth proportion of the complaint quantity is 0.20, and the growth proportion of the package change quantity is 0.16;
Taking the advertisement delivery quantity as an example, the growth proportions of the first, second and third enterprises are 0.25, 0.20 and 0.15 respectively. Weighting them by 0.40, 0.27 and 0.33 (the influence factors of the enterprises calculated above) and summing gives the comprehensive growth proportion of the advertisement delivery quantity: 0.25 × 0.40 + 0.20 × 0.27 + 0.15 × 0.33 = 0.2035;
the comprehensive growth proportions of the complaint quantity and the package change quantity can be calculated in the same way, and are assumed here to be 0.1500 and 0.3000 respectively;
The three comprehensive growth proportions are then weighted by 0.6, 0.17 and 0.23 (the data type emphasis coefficients calculated above) and summed to obtain the external addition coefficient: 0.6 × 0.2035 + 0.17 × 0.1500 + 0.23 × 0.3000 = 0.2166, which reflects the degree of influence of external environment changes on the user loss of the enterprise.
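As a non-limiting illustration, the two weighted summations above may be sketched in Python as follows. The helper name `weighted_sum` is hypothetical. The comprehensive growth proportions of the complaint quantity and the package change quantity are entered as the assumed values 0.1500 and 0.3000 from the example rather than being derived from the per-enterprise growth proportions.

```python
def weighted_sum(weights, values):
    """Sum of weight * value over paired items."""
    return sum(w * v for w, v in zip(weights, values))


# Influence factors of the first, second and third enterprise.
factors = [0.40, 0.27, 0.33]
# Growth proportions of the advertisement delivery quantity per enterprise.
ad_growth = [0.25, 0.20, 0.15]
ad_composite = weighted_sum(factors, ad_growth)

# Comprehensive growth proportions: advertisement (computed), then the
# complaint and package change values assumed in the example.
composite = [ad_composite, 0.1500, 0.3000]
# Data type emphasis coefficients from the earlier example.
emphasis = [0.6, 0.17, 0.23]
external_coefficient = weighted_sum(emphasis, composite)
print(round(ad_composite, 4), round(external_coefficient, 4))
```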
Combining the external addition coefficient and the preliminary prediction model, constructing a risk prediction model comprises:
combining the preliminary prediction model with an external addition coefficient to obtain a risk prediction model, wherein the risk prediction model specifically comprises the following steps:
Wherein $R$ indicates the loss risk prediction value, a larger value indicating a larger loss risk, $\lambda$ represents the external addition coefficient, $\beta_0$ is the intercept term of the linear regression model, $x_h$ represents the characteristic value of the h-th parameter type in the internal data, $\beta_h$ represents the regression coefficient of $x_h$, and $n$ represents the total number of parameter types in the internal data.
It should be noted that, for convenience of understanding, continuing with the previous simulation data gives the following accurate risk prediction model:
After simplification, the following is obtained:
Wherein $R$ indicates the loss risk prediction value, a larger value indicating a larger loss risk, $x_1$ represents the gender of the user, $x_2$ represents the age of the user, $x_3$ represents the number of complaints of the user, $x_4$ represents the monthly average consumption of the user, and $x_5$ represents the network access duration of the user.
Assume that the real-time data in the internal data is the data of user A: female, age 48, 1 complaint, monthly average consumption 98, and network access duration 6 years. Substituting the data of user A into the risk prediction model gives a loss risk prediction value of 0.05.
And predicting the loss risk of each user in the real-time data according to the method, and obtaining a loss risk prediction value.
The intervention level of each user is then calculated and corresponding intervention is performed based on the intervention level.
As shown in fig. 6, calculating the intervention level includes:
Respectively carrying out normalization processing on the month average consumption data and the service duration data of each user to obtain the current value and the predicted value of the user;
Calculating an average value of the current value and the predicted value as a comprehensive value;
predicting real-time data in the internal data through a risk prediction model to obtain a loss risk prediction value;
calculating the product of the comprehensive value and the loss risk predicted value as an intervention value;
and setting intervention thresholds of different grades, determining the intervention grade of the user according to the magnitude of the intervention value, and carrying out preset intervention behaviors according to the intervention grade.
It should be noted that, for ease of understanding, the above data of user A is reused:
User A: monthly average consumption 98, network access duration 2 years, loss risk prediction value 0.05;
Further assume that:
User B: monthly average consumption 108, network access duration 4 years, loss risk prediction value 0.28;
User C: monthly average consumption 128, network access duration 6 years, loss risk prediction value 0.16;
User D: monthly average consumption 68, network access duration 1 year, loss risk prediction value 0.62;
After the monthly average consumption and the network access duration are respectively normalized, the two values of user A are 0.5 and 0.2 (few data are used for ease of calculation; normalization is prior art and the specific process is not repeated), and the comprehensive value is (0.5+0.2)/2 = 0.35;
the two values of user B are 0.7 and 0.6, and the comprehensive value is (0.7+0.6)/2 = 0.65;
the two values of user C are 1 and 1, and the comprehensive value is (1+1)/2 = 1;
the two values of user D are 0 and 0, and the comprehensive value is (0+0)/2 = 0;
Therefore, the intervention value of user A is 0.05 × 0.35 = 0.0175, the intervention value of user B is 0.28 × 0.65 = 0.182, the intervention value of user C is 0.16 × 1 = 0.16, and the intervention value of user D is 0.62 × 0 = 0.
Assume that the set intervention thresholds are 0.08 and 0.17: users with an intervention value less than 0.08 are regarded as non-intervention users, users with an intervention value greater than 0.17 are regarded as high-intervention users, and the rest are regarded as low-intervention users. Each grade corresponds to a different intervention behavior: no intervention is performed for non-intervention users, preferential package recommendations are made to low-intervention users, and a time-limited free package is given to high-intervention users;
Accordingly, no intervention is performed for user A and user D, user B is given a time-limited free package, and a preferential package recommendation is made to user C.
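As a non-limiting illustration, the intervention grade calculation above may be sketched in Python as follows. Min-max normalization is an assumption (the text only states that normalization is prior art), chosen because it reproduces the values 0.5 and 0.2 given for user A. The function names and the grade labels are hypothetical. Unrounded min-max normalization gives about 0.67 instead of 0.7 for the consumption of user B, which changes the intervention value slightly but not the resulting grade.

```python
def min_max(values):
    """Min-max normalisation to [0, 1]; all-equal input maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def intervention_level(value, low=0.08, high=0.17):
    """Map an intervention value to a grade using the two thresholds."""
    if value < low:
        return "none"
    if value > high:
        return "high"
    return "low"


users = ["A", "B", "C", "D"]
consumption = [98, 108, 128, 68]   # monthly average consumption
duration = [2, 4, 6, 1]            # network access duration in years
risk = [0.05, 0.28, 0.16, 0.62]    # loss risk prediction values

current = min_max(consumption)     # current value of each user
predicted = min_max(duration)      # predicted value of each user
levels = {}
for name, c, p, r in zip(users, current, predicted, risk):
    composite = (c + p) / 2        # comprehensive value
    levels[name] = intervention_level(composite * r)
print(levels)
```

User D illustrates the effect of multiplying by the comprehensive value: the loss risk is the highest of the four, but the comprehensive value of 0 leads to no intervention.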
In addition, when setting the intervention thresholds, staff can set them according to the intervention values of all users so as to control the proportions of low-intervention users, high-intervention users and non-intervention users. More intervention grades and more intervention means can also be set; the operating principle is the same and is not repeated here.
Different intervention behaviors are carried out on users with different intervention grades, so that the effectiveness of the intervention behaviors can be improved, and the effect of stabilizing the users is improved.
Embodiment two:
In the first embodiment, when calculating the data type emphasis coefficient, enterprise categories are selected by influence factor as the calculation objects, and the inputs of these enterprises across the different data types are used to calculate the influence of the various data types on user loss. However, some enterprises have a large number of users for historical reasons while their number of users grows slowly. A large influence factor is calculated for such enterprises, but the slow growth indicates that the changes these enterprises make across the different data types correspond poorly to the emphasis proportions of those data types, so the calculated data type emphasis coefficient is not accurate enough.
The data type emphasis coefficient calculation method comprises the following steps:
counting the number of users in each enterprise category at different time points;
calculating, for each enterprise category, the user quantity growth rate between every two time points, and calculating the average growth rate of the number of users from the plurality of user quantity growth rates;
sorting the enterprise categories by the average growth rate of the number of users, and selecting the top 50% of the enterprise categories in descending order of the average growth rate;
calculating, for each target enterprise category, the ratio among the growth proportions of the multiple data types as a unit ratio;
and calculating the average ratio of the unit ratios corresponding to the selected enterprise categories, and performing probability normalization on each item of the average ratio to obtain a plurality of data type emphasis coefficients whose sum is 1.
It should be noted that, for ease of understanding, the following simulated data are employed:
The average growth rate of the number of users of Enterprise One is 0.025, that of Enterprise Two is 0.018, and that of Enterprise Three is 0.016;
Sorting by the average growth rate of the number of users gives Enterprise One > Enterprise Two > Enterprise Three, and two enterprises are selected (50% is selected and the result is rounded to obtain the number of enterprises actually selected; the example is simplified here), namely Enterprise One and Enterprise Two;
Assume that the ratio among the advertisement delivery amount, the complaint amount and the package change amount is 10:3:2 for Enterprise One and 6:2:6 for Enterprise Two;
The average of the two ratios is calculated to be 8:2.5:4;
Performing probability normalization on 8, 2.5 and 4 gives 8÷(8+2.5+4)≈0.55, 2.5÷(8+2.5+4)≈0.17 and 4÷(8+2.5+4)≈0.28 respectively, so the data type emphasis coefficients of the three data types of advertisement delivery amount, complaint amount and package change amount are 0.55, 0.17 and 0.28 respectively.
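The steps of this embodiment can be sketched in Python as follows. The function names are hypothetical, and two points are assumptions not fixed by the text: the growth rate is taken between consecutive time points as (later − earlier) / earlier, and "rounding" when selecting 50% is read as rounding half up. Only the unit ratios of the selected enterprises are needed, so Enterprise Three's ratio is not supplied.

```python
def average_growth_rate(user_counts):
    """Mean growth rate between consecutive time points (assumed reading)."""
    rates = [
        (later - earlier) / earlier
        for earlier, later in zip(user_counts, user_counts[1:])
    ]
    return sum(rates) / len(rates)


def select_top_half(avg_growth):
    """Keep the top 50% of categories by average growth rate, rounding half up."""
    keep = int(len(avg_growth) * 0.5 + 0.5)
    ranked = sorted(avg_growth, key=avg_growth.get, reverse=True)
    return ranked[:keep]


def emphasis_coefficients(unit_ratios):
    """Average the unit ratios item by item, then normalize so they sum to 1."""
    count = len(unit_ratios)
    average = [sum(column) / count for column in zip(*unit_ratios)]
    total = sum(average)
    return [value / total for value in average]


# Simulated data from the example above.
avg_growth = {"enterprise_1": 0.025, "enterprise_2": 0.018, "enterprise_3": 0.016}
# (advertisement delivery amount, complaint amount, package change amount)
unit_ratio = {"enterprise_1": (10, 3, 2), "enterprise_2": (6, 2, 6)}

selected = select_top_half(avg_growth)
coefficients = emphasis_coefficients([unit_ratio[name] for name in selected])

print(selected)
print([round(c, 2) for c in coefficients])
```

For the example data this selects Enterprise One and Enterprise Two and yields coefficients that round to 0.55, 0.17 and 0.28.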
This calculation method considers only enterprises whose number of users has a high average growth rate. A high average growth rate indicates that the proportional relationship among the data types of such an enterprise is more attractive to users and can greatly influence the user loss of the present enterprise, so the data type emphasis coefficient calculated in this way is more accurate in the current environment.
Embodiment three:
The calculation methods of the data type emphasis coefficient in the first embodiment and the second embodiment respectively consider the actual number of users of an enterprise and the average growth rate of its number of users. However, the data type emphasis coefficient represents the influence of the ratio among the growth proportions of the various data types on user loss, and enterprises with both a large number of users and a large average growth rate of the number of users have a greater influence on the user loss of the present enterprise. This embodiment therefore provides another calculation method of the data type emphasis coefficient on the basis of the first embodiment and the second embodiment.
The data type emphasis coefficient calculation method comprises the following steps:
selecting the top 50% of the enterprise categories in descending order of influence factor as class-one enterprise categories;
counting the number of users in each enterprise category at different time points;
calculating, for each enterprise category, the user quantity growth rate between every two time points, and calculating the average growth rate of the number of users from the plurality of user quantity growth rates;
sorting the enterprise categories by the average growth rate of the number of users, and selecting the top 50% of the enterprise categories in descending order of the average growth rate as class-two enterprise categories;
respectively calculating the average ratio among the growth proportions of the multiple data types for the class-one enterprise categories and for the class-two enterprise categories;
and calculating, with weights of 6:4, the weighted average ratio of the average ratio corresponding to the class-one enterprise categories and the average ratio corresponding to the class-two enterprise categories, and performing probability normalization on the weighted average ratio to obtain a plurality of data type emphasis coefficients whose sum is 1.
It should be noted that, for ease of understanding, the following uses the above data:
the class-one enterprise categories (the calculation result of the first embodiment) comprise Enterprise One and Enterprise Three, and the class-two enterprise categories (the calculation result of the second embodiment) comprise Enterprise One and Enterprise Two;
the average ratio of the class-one enterprise categories is 9:2.5:3.5 (the calculation result of the first embodiment), and the average ratio of the class-two enterprise categories is 8:2.5:4 (the calculation result of the second embodiment);
Calculating, with weights of 6:4, the weighted average ratio of the average ratio corresponding to the class-one enterprise categories and the average ratio corresponding to the class-two enterprise categories gives 8.6:2.5:3.7;
Performing probability normalization on 8.6, 2.5 and 3.7 gives 8.6÷(8.6+2.5+3.7)≈0.58, 2.5÷(8.6+2.5+3.7)≈0.17 and 3.7÷(8.6+2.5+3.7)=0.25 respectively, so the data type emphasis coefficients of the three data types of advertisement delivery amount, complaint amount and package change amount are 0.58, 0.17 and 0.25 respectively.
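The 6:4 weighting and the subsequent normalization can be checked with the short Python sketch below. The function name `weighted_emphasis_coefficients` is hypothetical; the inputs are the two average ratios given above. Note that the third coefficient evaluates to 3.7÷14.8=0.25.

```python
def weighted_emphasis_coefficients(class_one_ratio, class_two_ratio, weights=(6, 4)):
    """Combine two average ratios with the given weights, then normalize to sum 1."""
    w1, w2 = weights
    combined = [
        (w1 * a + w2 * b) / (w1 + w2)
        for a, b in zip(class_one_ratio, class_two_ratio)
    ]
    total = sum(combined)
    return combined, [value / total for value in combined]


# Average ratios (advertisement delivery, complaints, package changes)
# from the first and second embodiments.
class_one = (9, 2.5, 3.5)
class_two = (8, 2.5, 4)

combined, coefficients = weighted_emphasis_coefficients(class_one, class_two)

print([round(v, 1) for v in combined])
print([round(c, 2) for c in coefficients])
```

The weights are passed as a parameter so that a proportion other than 6:4 can be tried without changing the function.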
This embodiment fully considers the influence that enterprises with more users and faster growth in the number of users have on the user loss of the present enterprise, which improves the accuracy of the data type emphasis coefficient. The coefficient can thus reflect the differences in how the growth proportions of the various data types affect the user loss of the present enterprise, which further improves the accuracy of predicting the user loss risk, makes it easier for the enterprise to execute more accurate and effective intervention behaviors, and improves user stickiness.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principle and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.