CN114169993A - Financial institution risk early warning method and device, electronic equipment and medium - Google Patents
Financial institution risk early warning method and device, electronic equipment and medium
- Publication number
- CN114169993A (application number CN202111643919.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- target
- abnormal operation
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Evolutionary Biology (AREA)
- Pure & Applied Mathematics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Development Economics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- General Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- Algebra (AREA)
- Game Theory and Decision Science (AREA)
- Artificial Intelligence (AREA)
- Educational Administration (AREA)
- Probability & Statistics with Applications (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Quality & Reliability (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Technology Law (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The application discloses a financial institution risk early warning method and device, electronic equipment, a medium and a program product, and relates to the technical field of big data. The method comprises the following steps: fitting a correlation model using the abnormal-operation historical data of each branch of the financial institution; calculating a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the correlation model; weighting and summing the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive correlation matrix; and, in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, determining the branches to be warned whose correlation with the first target branch meets a correlation threshold, according to the target row representing the first target branch in the comprehensive correlation matrix and the values of the elements in that row. The technical scheme gives early warning to branches of the financial institution that may be at risk of abnormal operation by analyzing the degree of correlation of abnormal operations among the branches.
Description
Technical Field
The present application relates to the field of big data technologies, and in particular, to a method, an apparatus, an electronic device, a medium, and a program product for risk early warning of a financial institution.
Background
A large financial institution often manages multiple local branches; for example, a bank manages many provincial branches, and a securities company has provincial branches of its own. The local branches of a financial institution usually cooperate in their work, so their work errors and abnormal operations tend to be linked: a problem in one branch often causes the branches on the same work chain to make errors as well.
Studying this linkage is therefore valuable for preventing the risks caused by abnormal operations, tracing the sources of anomalies, and clarifying risk control mechanisms.
Disclosure of Invention
The application provides a financial institution risk early warning method and device, electronic equipment, a medium and a program product, which are used for giving early warning of abnormal-operation risk to each branch of a financial institution according to the degree of correlation of abnormal operations among the branches.
In a first aspect, the present application provides a financial institution risk early warning method, including:
fitting a correlation model using the abnormal-operation historical data of each branch of the financial institution, wherein the correlation model is a dependence structure model that reflects the strength of the joint probability density distribution relationship among the components;
calculating a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the correlation model;
weighting and summing the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive correlation matrix, wherein the comprehensive correlation matrix is used for measuring the abnormal-operation correlation among all the branches;
in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, determining the branches to be warned whose correlation with the first target branch meets a correlation threshold, according to the target row representing the first target branch in the comprehensive correlation matrix and the values of the elements in the target row.
In a second aspect, the present application further provides a financial institution risk early warning device, which includes:
a model fitting module, used for fitting a correlation model using the abnormal-operation historical data of each branch of the financial institution, wherein the correlation model is a dependence structure model that reflects the strength of the joint probability density distribution relationship among the components;
a correlation coefficient calculation module, used for calculating a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the correlation model;
a comprehensive correlation matrix obtaining module, used for weighting and summing the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive correlation matrix, wherein the comprehensive correlation matrix is used for measuring the abnormal-operation correlation among all the branches;
and an early warning module, used for, in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, determining the branches to be warned whose correlation with the first target branch meets a correlation threshold, according to the target row representing the first target branch in the comprehensive correlation matrix and the values of the elements in the target row.
In a third aspect, the present application further provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the financial institution risk early warning method as described above.
In a fourth aspect, the present application also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the financial institution risk early warning method as described above.
In a fifth aspect, the present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the financial institution risk early warning method as described above.
According to the technical scheme, a correlation model is fitted using the abnormal-operation historical data of each branch of the financial institution; the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient are calculated from the model; a comprehensive correlation matrix is obtained by weighted summation; and the comprehensive correlation matrix is used to measure the abnormal-operation correlation among the branches. Specifically, the branches that are highly correlated with a branch having a large number of abnormal operations can be identified from the comprehensive correlation matrix and given an abnormal-operation risk warning, so that the spread of anomalies and the growth of losses are quickly contained.
Drawings
FIG. 1 is a flowchart of a financial institution risk early warning method in an embodiment of the present application;
FIG. 2 is a flowchart of a financial institution risk early warning method in an embodiment of the present application;
FIG. 3 is a flowchart of a financial institution risk early warning method in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a financial institution risk early warning device in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be further noted that, for the convenience of description, only some of the structures related to the present application are shown in the drawings, not all of the structures.
FIG. 1 is a flowchart of a financial institution risk early warning method in an embodiment of the present application. The embodiment is applicable to determining the degree of correlation of abnormal operations among the branches of a financial institution and, when a branch has a large number of abnormal operations, warning the branches that are highly correlated with it; it relates to the technical field of big data. The method may be performed by a financial institution risk early warning device, which may be implemented in software and/or hardware and is preferably configured in an electronic device such as a computer device or a server. As shown in FIG. 1, the method specifically includes:
S101, fitting a correlation model using the abnormal-operation historical data of each branch of the financial institution, wherein the correlation model is a dependence structure model that reflects the strength of the joint probability density distribution relationship among the components.
In the present application, the abnormal-operation data of all branches are collected and analyzed to uncover the mathematical relationship among the abnormal operations of different branches. Once this linkage is accurately captured, the overall distribution of abnormal operations across the financial institution can be monitored, the source of an anomaly and the risk conduction chain are easier to locate, and the relevant branches can be warned before the problem spreads, preventing further losses.
Specifically, the correlation model is fitted using the abnormal-operation historical data of each branch of the financial institution. The abnormal-operation historical data may include data on any kind of operational incident that falls outside the normal business events of the financial institution; the embodiment of the present application does not limit the specific events that count as abnormal operations. The correlation model is a dependence structure model that reflects the strength of the joint probability density distribution relationship among the components.
In one embodiment, a copula model may be employed as the correlation model. Copulas are highly varied, with thousands of types. Because the embodiment of the present application requires a highly precise mathematical model, the more complex and versatile R-vine copula model, or its simplified form, the truncated R-vine copula model, is more suitable. The R-vine copula is a general name for copulas with a particular architecture; its structure is highly flexible and it adapts very well to almost all types of data. Its only drawback is that it is complex and hard to compute: when the number of components is very large, the number of parameters grows quadratically, and the amount of computation is difficult for an ordinary computer or server to bear. The model can therefore be simplified: a truncated R-vine copula model reduces the complexity and computational difficulty while preserving accuracy.
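The quadratic growth mentioned above can be made concrete by counting pair-copulas. The sketch below relies on the standard property that tree i of a regular vine on d variables has d - i edges, each carrying one pair-copula; the function name and the truncation level of 3 are illustrative assumptions, not values taken from the application.

```python
def pair_copula_count(d, truncation_level=None):
    """Count the pair-copulas of an R-vine on d variables.

    Tree i (i = 1 .. d-1) of a regular vine has d - i edges, and each
    edge carries one pair-copula. Truncating after tree k keeps only
    the first k trees (later pair-copulas are set to independence).
    """
    if d < 2:
        return 0
    k = d - 1 if truncation_level is None else min(truncation_level, d - 1)
    return sum(d - i for i in range(1, k + 1))


# Hypothetical branch counts; the truncation level 3 is an assumption.
for d in (10, 50, 100):
    full = pair_copula_count(d)
    truncated = pair_copula_count(d, truncation_level=3)
    print(f"d={d}: full={full}, truncated at 3 trees={truncated}")
```

For 100 branches the full vine has 4950 pair-copulas to select and estimate, while a vine truncated after three trees has 294, which is why truncation becomes attractive as the dimension grows.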
S102, calculating a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the correlation model.
S103, weighting and summing the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive correlation matrix, wherein the comprehensive correlation matrix is used for measuring the abnormal-operation correlation among all the branches.
A copula model is essentially a variant of a joint probability distribution function. A copula is used here to reflect, through correlation in the sense of the joint probability distribution, how strongly the component data are linked to one another. This strength is shown mainly by the Kendall rank correlation coefficient (Kendall's tau) calculated from sample data simulated from the copula structure. Kendall's tau compares the rank ordering of two data sequences to judge whether they rise and fall in step, and thereby measures the correlation between the two sequences. Like the Spearman and Pearson correlation coefficients, Kendall's tau can be calculated from simulated sample data generated from the joint probability distribution function. In the present application, in order to measure the abnormal-operation correlation among all branches comprehensively, the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient are weighted and summed to obtain a comprehensive correlation matrix. That is, the three coefficients between every two components are weighted and summed to obtain a comprehensive correlation coefficient, and the comprehensive correlation coefficients form the comprehensive correlation matrix. The rows and columns of the matrix correspond to the components, namely the branches of the financial institution, and each element represents the abnormal-operation correlation between the branch represented by its row and the branch represented by its column.
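As an illustration of how Kendall's tau compares the ordering of two sequences, the following minimal sketch computes the tau-a variant (concordant pairs minus discordant pairs, divided by the number of pairs). The sample values are made up, and the application does not state which tau variant or which library it uses; in practice the samples would be simulated from the fitted copula.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a for two equally long sequences.

    A pair of observations is concordant when both sequences move in
    the same direction between them, and discordant when they move in
    opposite directions. Tied pairs count as neither.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical weekly anomaly counts for two branches.
branch_a = [3, 5, 2, 8, 7]
branch_b = [1, 4, 2, 9, 6]
print(kendall_tau(branch_a, branch_b))  # 9 concordant, 1 discordant -> 0.8
```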
The Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient complement one another well. For example, the Pearson correlation coefficient measures only linear correlation, so it fails when the relationship between sequences is not linear; the Spearman coefficient is relatively complex to calculate and is prone to error when the sample size is small. A comprehensive correlation matrix formed by weighting the three coefficients therefore adapts better to a variety of practical scenarios. Furthermore, because Kendall's tau is the most capable of the three, it should be given the highest weight, followed by the Spearman coefficient, with the Pearson coefficient given the lowest, for example: Kendall's tau 50%, Spearman 30%, Pearson 20%. If the weight of the Pearson coefficient is considered too high in a given embodiment, it may be reduced to 10% and the weight of the Spearman coefficient increased accordingly. The present application does not limit the weights of the correlation coefficients, which can be configured as required.
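The weighted summation can be sketched as an element-wise combination of three square matrices. The function name and the input matrices below are hypothetical; only the 50% / 30% / 20% split comes from the example weights in the text.

```python
def combine_correlations(kendall, spearman, pearson, weights=(0.5, 0.3, 0.2)):
    """Element-wise weighted sum of three correlation matrices.

    Each matrix is a list of rows; row i and column j both refer to
    branches, so all three matrices must share the same square shape.
    """
    w_k, w_s, w_p = weights
    n = len(kendall)
    if any(len(m) != n or any(len(row) != n for row in m)
           for m in (kendall, spearman, pearson)):
        raise ValueError("all matrices must be square and the same size")
    return [
        [w_k * kendall[i][j] + w_s * spearman[i][j] + w_p * pearson[i][j]
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical coefficients for two branches.
kendall = [[1.0, 0.6], [0.6, 1.0]]
spearman = [[1.0, 0.7], [0.7, 1.0]]
pearson = [[1.0, 0.5], [0.5, 1.0]]
combined = combine_correlations(kendall, spearman, pearson)
print(round(combined[0][1], 2))  # 0.5*0.6 + 0.3*0.7 + 0.2*0.5 = 0.61
```

Passing `weights=(0.5, 0.4, 0.1)` would correspond to the variant in which the Pearson weight is lowered to 10% and the Spearman weight raised.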
S104, in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, determining the branches to be warned whose correlation with the first target branch meets a correlation threshold, according to the target row representing the first target branch in the comprehensive correlation matrix and the values of the elements in the target row.
The rows and columns of the comprehensive correlation matrix correspond to the branches, and the value of any element represents the degree of correlation between the branch corresponding to the element's row and the branch corresponding to its column. Therefore, when the number of abnormal operations of any first target branch is found to be higher than the quantity threshold, that is, when the first target branch has a large number of abnormal operations, the target row representing the first target branch can be located in the comprehensive correlation matrix, and at least one element whose value exceeds a preset correlation threshold is identified in that row. The branch corresponding to the column of each such element is highly correlated with the first target branch and is taken as a branch to be warned.
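The row lookup described above can be sketched as follows. The branch names, counts and both thresholds are made-up values, and skipping the diagonal element (a branch's correlation with itself) is an assumption the text does not spell out.

```python
def branches_to_warn(matrix, names, anomaly_counts,
                     count_threshold, corr_threshold):
    """Map each over-threshold branch to the branches correlated with it.

    matrix[i][j] is the comprehensive correlation between names[i] and
    names[j]. The diagonal is skipped so a branch never warns itself.
    """
    warnings = {}
    for i, name in enumerate(names):
        if anomaly_counts[name] > count_threshold:
            warnings[name] = [
                names[j]
                for j, value in enumerate(matrix[i])
                if j != i and value > corr_threshold
            ]
    return warnings


# Hypothetical data for three branches.
names = ["A", "B", "C"]
matrix = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.4],
    [0.2, 0.4, 1.0],
]
anomaly_counts = {"A": 12, "B": 3, "C": 1}
result = branches_to_warn(matrix, names, anomaly_counts,
                          count_threshold=10, corr_threshold=0.6)
print(result)  # {'A': ['B']}
```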
According to the technical scheme of the embodiment of the present application, a correlation model is fitted using the abnormal-operation historical data of each branch of the financial institution; the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient are calculated from the model; a comprehensive correlation matrix is obtained by weighted summation; and the comprehensive correlation matrix is used to measure the abnormal-operation correlation among the branches. Specifically, the branches that are highly correlated with a branch having a large number of abnormal operations can be identified from the comprehensive correlation matrix and given an abnormal-operation risk warning, so that the spread of anomalies and the growth of losses are quickly contained.
Fig. 2 is a flowchart of a risk early warning method of a financial institution in the embodiment of the present application, and the embodiment is further optimized based on the above embodiment. As shown in fig. 2, the method includes:
S201, fitting a correlation model using the abnormal-operation historical data of each branch of the financial institution, wherein the correlation model is a dependence structure model that reflects the strength of the joint probability density distribution relationship among the components.
S202, calculating a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the correlation model.
S203, weighting and summing the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive correlation matrix, wherein the comprehensive correlation matrix is used for measuring the abnormal-operation correlation among all the branches.
S204, in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, determining the branches to be warned whose correlation with the first target branch meets a correlation threshold, according to the target row representing the first target branch in the comprehensive correlation matrix and the values of the elements in the target row.
The above is a passive early warning scheme: when a first target branch has a large number of abnormal operations, at least one branch highly correlated with it can be extracted, based on the abnormal-operation correlation between branches represented by the element values of the comprehensive correlation matrix, and taken as a branch to be warned. In addition, the present application can actively predict the likelihood that any target branch will experience a major outbreak of abnormal operations in the future, thereby realizing an active early warning scheme. This is described in detail below.
S205, for any second target branch, extracting at least one related branch whose correlation with the second target branch meets a correlation threshold, according to the element values in the comprehensive correlation matrix; and forming a risk conduction chain of abnormal operations from the second target branch and the at least one related branch.
Because the element values in the comprehensive correlation matrix measure the abnormal-operation correlation among the branches, at least one related branch whose correlation with any given second target branch meets the correlation threshold can be extracted according to those values. These related branches are highly correlated with the second target branch and together with it form a risk conduction chain of abnormal operations. This greatly helps the risk control department trace risk sources, conduction processes and conduction mechanisms, and thereby assists in thoroughly eliminating hidden dangers.
S206, inputting the abnormal-operation data of each related branch in the current time period into a pre-trained LightGBM model, and using the LightGBM model to predict the number of abnormal operations of the second target branch in a future time period after the current time period.
After the risk conduction chain is determined, the embodiment of the present application inputs, based on the conduction chain relationship, the abnormal-operation data of each related branch in the current time period into a pre-trained artificial intelligence model, LightGBM (Light Gradient Boosting Machine), and uses the LightGBM model to predict the number of abnormal operations of the second target branch in the future time period after the current time period, so as to estimate more accurately how likely the second target branch of interest is to experience abnormal operations in the future. For example, the abnormal-operation data of each related branch for the current week are input into the LightGBM model, and the model outputs the abnormal-operation data that the second target branch, which belongs to the same risk conduction chain as those related branches, may exhibit in the following week.
Wherein, the training process of the LightGBM model may include:
acquiring sample data for a plurality of second target branches and their risk conduction chains, wherein the sample data comprises the abnormal-operation data of each branch; inputting the abnormal-operation data of the at least one related branch in each risk conduction chain into a pre-built LightGBM model as feature data, and training the LightGBM model with the abnormal-operation data of the second target branch in each risk conduction chain as the prediction target; wherein the time period corresponding to the abnormal-operation data of the second target branch used for model training is later than the time period corresponding to the abnormal-operation data of the at least one related branch corresponding to that second target branch.
Specifically, the risk conduction chains of a plurality of second target branches are obtained according to the method of the embodiment of the present application. The abnormal-operation data of each related branch in each risk conduction chain are input into the LightGBM model as feature data, and the model is trained with the abnormal-operation data of the second target branch in the same chain as the prediction target. After training on a large amount of sample data, the model structure and parameters are finally determined, and the trained LightGBM model can predict the future number of abnormal operations of any second target branch currently of interest.
In the sample data used to fit the LightGBM model, the time period of the abnormal-operation data of the second target branch in a risk conduction chain always lags the time period of the abnormal-operation data of the related branches strongly correlated with it. That is, the time period corresponding to the abnormal-operation data of the second target branch used for model training is later than the time period corresponding to the abnormal-operation data of the at least one related branch corresponding to that second target branch. For example, if the abnormal-operation data of the related branches used for model training are the observations from week 1 to week 100, the abnormal-operation data of the corresponding second target branch should span week 2 to week 101. In other words, the data of the related branches in a given week are paired with the data of the second target branch in the following week, serving as the model's input data and prediction target respectively, so that the model learns to predict future data.
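The one-week offset between features and labels can be sketched as a simple slicing step performed before training. The function name, the `lag` parameter and the sample numbers are illustrative assumptions; the resulting pairs would then be handed to a gradient-boosting regressor such as LightGBM, which is not called here.

```python
def make_lagged_samples(related_rows, target_counts, lag=1):
    """Pair related-branch data from week t with the target of week t + lag.

    related_rows[t] holds the anomaly counts of the related branches in
    week t; target_counts[t] is the count of the second target branch
    in the same week. Both lists must cover the same weeks.
    """
    if len(related_rows) != len(target_counts):
        raise ValueError("both series must cover the same weeks")
    if not 1 <= lag < len(target_counts):
        raise ValueError("lag must be at least 1 and shorter than the series")
    features = related_rows[:-lag]   # first week .. last week minus lag
    labels = target_counts[lag:]     # first week plus lag .. last week
    return features, labels


# Hypothetical data: four weeks, two related branches.
related_rows = [[2, 0], [3, 1], [5, 2], [4, 4]]
target_counts = [1, 2, 4, 6]
features, labels = make_lagged_samples(related_rows, target_counts)
print(features)  # weeks 1-3 of the related branches
print(labels)    # weeks 2-4 of the second target branch
```

With 101 weeks of observations and `lag=1`, the features cover weeks 1 to 100 and the labels cover weeks 2 to 101, matching the example in the text.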
In the embodiment of the present application, "first" and "second" in "first target branch" and "second target branch" are used only for convenience of description and have no specific meaning; either may be any branch of the financial institution.
According to the technical scheme of the embodiment of the present application, a correlation model is fitted using the abnormal-operation historical data of each branch of the financial institution; the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient are calculated from the model; a comprehensive correlation matrix is obtained by weighted summation; and the comprehensive correlation matrix is used to measure the abnormal-operation correlation among the branches. Specifically, on the one hand, the branches that are highly correlated with a branch having a large number of abnormal operations can be identified from the comprehensive correlation matrix and given a passive abnormal-operation risk warning, so that the spread of anomalies and the growth of losses are quickly contained. On the other hand, an artificial intelligence algorithm can be used: the trained LightGBM model actively predicts the future abnormal-operation data of the second target branch in a risk conduction chain from the abnormal-operation data of the related branches in the same chain, thereby realizing active risk early warning.
Fig. 3 is a flowchart of a risk early warning method of a financial institution in the embodiment of the present application, and the embodiment is further optimized based on the above embodiment. As shown in fig. 3, the method includes:
S301, acquiring the abnormal-operation historical data of each branch over a set past period to obtain an abnormal-operation historical data matrix, wherein each column of the matrix corresponds to a branch and each row corresponds to the abnormal-operation historical data of one time period.
Specifically, the weekly abnormal-operation historical data of each branch over the set past period may be acquired and then preprocessed, for example by at least one of data cleaning, data completion, data removal, data normalization or data transformation. Finally, the abnormal-operation historical data matrix is formed from the preprocessed data.
S302, converting each row of data in the abnormal operation historical data matrix according to the sample quantile point value to obtain a converted abnormal operation historical data matrix.
Specifically, the conversion of the history data may be performed according to the following steps:
sequencing all rows of data in the abnormal operation historical data matrix according to the size sequence, and obtaining a sample total of all rows after sequencing; converting each data in each ordered row of data into a corresponding branch point value in each row of sample population, wherein the value range of the branch point value is 0-1; and obtaining a converted operation abnormity historical data matrix according to the quantile point value corresponding to each data.
It should be noted that the copula model requires input values between 0 and 1, because it is a structure dedicated to describing the relationship between probability distribution functions. The abnormal operation data of each branch therefore cannot be substituted into the model directly and must be converted first; that is, the input data must, like the values of a probability distribution function, lie in the range 0 to 1. In this application, an empirical distribution method can be adopted: the data of each branch is lined up to form a sample population, and each data item is converted into its corresponding quantile value in that population. In addition, since the 0% and 100% quantiles rarely occur in real cases, the quantiles corresponding to the minimum and maximum values can be scaled appropriately without changing the order of the data, for example to 0.01% and 99.99% respectively, so that the converted data contains neither 0 nor 1 and the result calculated by the model is more accurate.
It should be noted that, in this embodiment, both the sample data used to train the LightGBM model and the data input into the model during prediction are the weekly abnormal operation time series data themselves, not quantile values in the range 0 to 1; the data conversion described above is required only by the structure of the association degree model (e.g., the copula model).
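The empirical distribution conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions: the quantile of a value is taken as its 0-based rank divided by n - 1, ties share an average rank, and the extremes are clamped to 0.01% and 99.99%. The function names and the tie rule are assumptions, not details given in the application.

```python
def to_quantiles(column, eps=1e-4):
    """Map one branch's values to empirical quantiles strictly inside (0, 1)."""
    n = len(column)
    if n < 2:
        raise ValueError("need at least two observations")
    out = []
    for x in column:
        below = sum(1 for v in column if v < x)
        equal = sum(1 for v in column if v == x)
        rank = below + (equal - 1) / 2      # average 0-based rank among ties
        q = rank / (n - 1)                  # minimum -> 0.0, maximum -> 1.0
        out.append(min(max(q, eps), 1 - eps))  # scale 0% / 100% to eps / 1 - eps
    return out


def transform_matrix(matrix, eps=1e-4):
    """Apply the conversion column by column (one column per branch)."""
    converted = [to_quantiles(list(col), eps) for col in zip(*matrix)]
    return [list(row) for row in zip(*converted)]


history = [[3.0, 1.0], [5.0, 2.0], [4.0, 3.0]]  # rows: weeks, columns: branches
uniform = transform_matrix(history)
```

Because the clamp is monotone, the ordering of the values within each column is preserved, which is the property the copula fit depends on.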
S303, substituting the converted abnormal operation historical data matrix into a pre-constructed association degree model for fitting.
The converted abnormal operation historical data matrix can be substituted into the pre-constructed association degree model, and the model parameters are estimated by fitting, so that the final model structure is obtained.
Copula is a classic dependence structure model that comes in many varieties, some structurally simple and some complex. The objects studied in this embodiment are the branches of a financial institution, whose number generally does not exceed 100. For data of such moderate dimension, this embodiment can adopt a complex copula structure such as the R-vine copula model. This structure is highly flexible, adapts well to almost all types of data, and greatly improves the accuracy with which the model describes dependence. Its only disadvantage is that the structure is complex and has many parameters, so it is computationally expensive. It can handle data with only a few dozen dimensions; once the dimension reaches the hundreds, the R-vine copula becomes difficult to compute. If the dimension exceeds 50, computation is already laborious, and in another embodiment a truncated R-vine can be used for simplification while still maintaining high accuracy.
The quality of the fitted structure can be judged using the AIC criterion. That is, during fitting of the copula model, the fitted structure is evaluated using the AIC criterion in order to determine a global optimal solution from among several local optimal solutions of the model parameters, and the final copula model structure is determined from the evaluation result. The AIC criterion can be expressed by the following formula:
AIC=2p-2ln(L)
where p is the number of parameters of the copula model and L is the likelihood function of the sample, which can be taken as the copula density. The smaller the AIC value, the better the model.
It should be noted that, during fitting of the copula model, the limitations of iterative numerical computation mean that the global optimum of the parameters generally cannot be found directly; only local optima can be found, and there may be more than one. (Iterative methods usually require an initial value for each parameter; the parameters are then updated repeatedly from those initial values until they converge to a local optimal solution, so different initial values can lead to different local optimal solutions.) The final optimal solution is determined by comparing the AIC values of the local optima. In addition, if other dependence structure models are available as competitors, the AIC criterion can also be used to compare the current copula model with them and judge which model is better.
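The selection among local optima can be illustrated with a short sketch. It assumes each fitting run reports its parameter count and maximized log-likelihood ln(L); the run names and the numbers below are invented for illustration and are not results from the application.

```python
def aic(num_params, log_likelihood):
    """AIC = 2p - 2 ln(L), where log_likelihood is ln(L)."""
    return 2 * num_params - 2 * log_likelihood


def pick_best_fit(fits):
    """Return the candidate fit with the smallest AIC."""
    return min(fits, key=lambda f: aic(f["num_params"], f["log_likelihood"]))


# Hypothetical fits: two runs from different initial values, plus a
# truncated structure with fewer parameters as a competing model.
candidate_fits = [
    {"name": "start_1", "num_params": 12, "log_likelihood": 140.2},
    {"name": "start_2", "num_params": 12, "log_likelihood": 151.7},
    {"name": "truncated", "num_params": 7, "log_likelihood": 148.9},
]
best = pick_best_fit(candidate_fits)
```

In this made-up case the truncated candidate wins even though its likelihood is slightly lower, because the penalty term 2p rewards the smaller parameter count.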
S304, calculating a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the association degree model.
S305, performing weighted summation on the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive association degree matrix, wherein the comprehensive association degree matrix is used to measure the degree of association of abnormal operations among the branches.
S306, in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, determining, according to the target row representing the first target branch in the comprehensive association degree matrix and the values of the elements in the target row, the branches to be warned whose degree of association with the first target branch meets an association degree threshold.
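Steps S304 and S305 can be sketched as follows. Note the simplifying assumption: the application derives the three coefficients from the fitted association degree model, whereas this sketch computes sample versions from data columns (which could, for example, be samples simulated from the fitted model). Kendall's coefficient is computed here in its tau-a form, and the weights are arbitrary illustrative values.

```python
from math import sqrt


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (sx * sy)


def average_ranks(values):
    # 1-based ranks; tied values share the average of their positions.
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]


def spearman(x, y):
    # Spearman's coefficient is Pearson's coefficient applied to the ranks.
    return pearson(average_ranks(x), average_ranks(y))


def kendall(x, y):
    # Tau-a: (concordant pairs - discordant pairs) / total number of pairs.
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


def comprehensive_matrix(columns, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the Kendall, Spearman and Pearson matrices."""
    wk, ws, wp = weights
    k = len(columns)
    return [[wk * kendall(columns[i], columns[j])
             + ws * spearman(columns[i], columns[j])
             + wp * pearson(columns[i], columns[j])
             for j in range(k)] for i in range(k)]


# One list per branch; the values are made up.
columns = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
R = comprehensive_matrix(columns, weights=(0.4, 0.3, 0.3))
```

The result is a symmetric square matrix with one row and one column per branch, so row i can be read directly as the association of branch i with every other branch.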
According to the technical solution of this embodiment, the comprehensive association degree matrix is used to measure the degree of association of abnormal operations among the branches; other branches highly associated with a branch showing many abnormal operations are then identified from the matrix so that risk early warning of abnormal operations can be issued to them, quickly stopping the spread of anomalies and the growth of losses.
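The identification step (S306) amounts to a row lookup in the comprehensive association degree matrix. The sketch below assumes the matrix rows and columns follow the order of a `branches` list; the function name, the thresholds and the matrix values are hypothetical.

```python
def branches_to_warn(matrix, branches, counts, count_threshold, assoc_threshold):
    """For each branch over the count threshold, list its highly associated branches."""
    warnings = {}
    for i, name in enumerate(branches):
        if counts[name] > count_threshold:
            # Scan the target row and skip the diagonal (the branch itself).
            warnings[name] = [branches[j] for j, value in enumerate(matrix[i])
                              if j != i and value >= assoc_threshold]
    return warnings


# Made-up association matrix and current abnormal operation counts.
branches = ["A", "B", "C"]
matrix = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
counts = {"A": 12, "B": 3, "C": 4}
alerts = branches_to_warn(matrix, branches, counts,
                          count_threshold=10, assoc_threshold=0.6)
```

Here only branch A exceeds the count threshold, and only branch B is associated with it strongly enough to be warned.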
Fig. 4 is a schematic structural diagram of the financial institution risk early warning device in this embodiment. This embodiment is applicable to determining the degree of association of abnormal operations among the branches of a financial institution and, when a branch shows a large number of abnormal operations, warning the branches highly associated with it; it relates to the technical field of big data. The device can implement the financial institution risk early warning method of any embodiment of the application. As shown in fig. 4, the device specifically includes:
a model fitting module 401, configured to fit an association degree model using the abnormal operation historical data of each branch of the financial institution, where the association degree model is a dependence structure model reflecting the strength of the joint probability density distribution relationship among the branches;
a correlation coefficient calculation module 402, configured to calculate a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the association degree model;
a comprehensive association degree matrix obtaining module 403, configured to perform weighted summation on the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive association degree matrix, where the comprehensive association degree matrix is used to measure the degree of association of abnormal operations among the branches;
an early warning module 404, configured to determine, in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, the branches to be warned whose degree of association with the first target branch meets an association degree threshold, according to the target row representing the first target branch in the comprehensive association degree matrix and the values of the elements in the target row.
Optionally, the association degree model is an R-vine copula model or a truncated R-vine copula model.
Optionally, the apparatus further comprises:
a branch set extraction module, configured to extract from the comprehensive association degree matrix the set of branches corresponding to elements whose values are higher than a preset threshold, where the branches in the set together form a risk conduction chain of abnormal operations.
Optionally, the model fitting module 401 includes:
a historical data matrix obtaining unit, configured to acquire abnormal operation historical data of each branch within a past set time to obtain an abnormal operation historical data matrix, where each column of the matrix corresponds to one branch and each row corresponds to the abnormal operation historical data of one time period;
a data conversion unit, configured to convert each column of data in the abnormal operation historical data matrix according to sample quantile values to obtain a converted abnormal operation historical data matrix;
a fitting unit, configured to substitute the converted abnormal operation historical data matrix into a pre-constructed association degree model for fitting.
Optionally, the data conversion unit is specifically configured to:
sort each column of data in the abnormal operation historical data matrix by magnitude, the sorted column forming the sample population of that column;
convert each data item in each sorted column into its corresponding quantile value in the sample population of that column, where the quantile value ranges from 0 to 1;
obtain the converted abnormal operation historical data matrix from the quantile value corresponding to each data item.
Optionally, the data conversion unit is further configured to:
scale, without changing the sorted order, the extreme values among the quantile values corresponding to the data items in the sample population of each column.
Optionally, the apparatus further comprises:
an evaluation module, configured to evaluate the fitted structure using the AIC (Akaike information criterion) during fitting of the association degree model;
where the purpose of the evaluation is to determine a global optimal solution from among several local optimal solutions of the model parameters, and the association degree model is obtained according to the evaluation result.
Optionally, the historical data matrix obtaining unit is specifically configured to:
acquire the weekly abnormal operation historical data of each branch within the past set time;
perform data preprocessing on the abnormal operation historical data, where the data preprocessing includes at least one of: data cleaning, data completion, data removal, data normalization or data transformation;
form the abnormal operation historical data matrix from the preprocessed abnormal operation historical data.
The financial institution risk early warning device provided by the embodiment of the application can execute the financial institution risk early warning method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. FIG. 5 illustrates a block diagram of an exemplary electronic device 12 suitable for use in implementing embodiments of the present application. The electronic device 12 shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in FIG. 5, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, and commonly referred to as a "hard drive"). Although not shown in FIG. 5, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the application.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally perform the functions and/or methodologies of the embodiments described herein.
The processing unit 16 executes programs stored in the system memory 28 to execute various functional applications and data processing, for example, to implement the financial institution risk early warning method provided in the embodiment of the present application.
The embodiment of the application also provides a computer readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the financial institution risk early warning method provided by the embodiment of the application.
The computer storage media of the embodiments of the present application may take any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Furthermore, the present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the financial institution risk pre-warning method as described above.
In the technical solution of this embodiment, the collection, storage, use, processing, transmission, provision and disclosure of the personal information of the users concerned all comply with the relevant laws and regulations and do not violate public order and good morals.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present application and the technical principles employed. It will be understood by those skilled in the art that the present application is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the application. Therefore, although the present application has been described in more detail with reference to the above embodiments, the present application is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present application, and the scope of the present application is determined by the scope of the appended claims.
Claims (14)
1. A financial institution risk early warning method, characterized by comprising the following steps:
fitting an association degree model using the abnormal operation historical data of each branch of the financial institution, wherein the association degree model is a dependence structure model reflecting the strength of the joint probability density distribution relationship among the branches;
calculating a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the association degree model;
performing weighted summation on the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive association degree matrix, wherein the comprehensive association degree matrix is used to measure the degree of association of abnormal operations among the branches;
in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, determining, according to the target row representing the first target branch in the comprehensive association degree matrix and the values of the elements in the target row, the branches to be warned whose degree of association with the first target branch meets an association degree threshold.
2. The method according to claim 1, wherein the association degree model is an R-vine copula model or a truncated R-vine copula model.
3. The method of claim 1 or 2, further comprising:
for any second target branch, extracting, according to the values of the elements in the comprehensive association degree matrix, at least one associated branch whose degree of association with the second target branch meets the association degree threshold;
and forming the second target branch and the at least one associated branch into a risk conduction chain of abnormal operations.
4. The method of claim 3, further comprising:
inputting the abnormal operation data of each associated branch in the current time period into a pre-trained LightGBM model, and using the LightGBM model to predict the number of abnormal operations of the second target branch in a future time period after the current time period.
5. The method of claim 4, wherein the training process of the LightGBM model comprises:
acquiring sample data of second target branches and their risk conduction chains, wherein the sample data comprises the abnormal operation data of each branch;
inputting the abnormal operation data of the at least one associated branch in each risk conduction chain into a pre-built LightGBM model as feature data, and taking the abnormal operation data of the second target branch in each risk conduction chain as the prediction object, to train the LightGBM model;
wherein the time period corresponding to the abnormal operation data of the second target branch used for model training is later than the time period corresponding to the abnormal operation data of the at least one associated branch corresponding to the second target branch.
6. The method of claim 1, wherein fitting the association degree model using the abnormal operation historical data of each branch of the financial institution comprises:
acquiring abnormal operation historical data of each branch within a past set time to obtain an abnormal operation historical data matrix, wherein each column of the matrix corresponds to one branch and each row corresponds to the abnormal operation historical data of one time period;
converting each column of data in the abnormal operation historical data matrix according to sample quantile values to obtain a converted abnormal operation historical data matrix;
and substituting the converted abnormal operation historical data matrix into a pre-constructed association degree model for fitting.
7. The method according to claim 6, wherein converting each column of data in the abnormal operation historical data matrix according to sample quantile values to obtain a converted abnormal operation historical data matrix comprises:
sorting each column of data in the abnormal operation historical data matrix by magnitude, the sorted column forming the sample population of that column;
converting each data item in each sorted column into its corresponding quantile value in the sample population of that column, wherein the quantile value ranges from 0 to 1;
and obtaining the converted abnormal operation historical data matrix from the quantile value corresponding to each data item.
8. The method according to claim 7, wherein converting each column of data in the abnormal operation historical data matrix according to sample quantile values to obtain a converted abnormal operation historical data matrix further comprises:
scaling, without changing the sorted order, the extreme values among the quantile values corresponding to the data items in the sample population of each column.
9. The method of claim 1, further comprising:
evaluating the fitted structure using the AIC (Akaike information criterion) during fitting of the association degree model;
wherein the purpose of the evaluation is to determine a global optimal solution from among several local optimal solutions of the model parameters, and the association degree model is obtained according to the evaluation result.
10. The method of claim 6, wherein acquiring the abnormal operation historical data of each branch within a past set time to obtain an abnormal operation historical data matrix comprises:
acquiring the weekly abnormal operation historical data of each branch within the past set time;
performing data preprocessing on the abnormal operation historical data, wherein the data preprocessing comprises at least one of: data cleaning, data completion, data removal, data normalization or data transformation;
and forming the abnormal operation historical data matrix from the preprocessed abnormal operation historical data.
11. A financial institution risk early warning device, comprising:
a model fitting module, configured to fit an association degree model using the abnormal operation historical data of each branch of the financial institution, wherein the association degree model is a dependence structure model reflecting the strength of the joint probability density distribution relationship among the branches;
a correlation coefficient calculation module, configured to calculate a Kendall rank correlation coefficient, a Spearman correlation coefficient and a Pearson correlation coefficient according to the association degree model;
a comprehensive association degree matrix obtaining module, configured to perform weighted summation on the Kendall rank correlation coefficient, the Spearman correlation coefficient and the Pearson correlation coefficient to obtain a comprehensive association degree matrix, wherein the comprehensive association degree matrix is used to measure the degree of association of abnormal operations among the branches;
and an early warning module, configured to determine, in response to the existence of a first target branch whose number of abnormal operations is higher than a quantity threshold, the branches to be warned whose degree of association with the first target branch meets an association degree threshold, according to the target row representing the first target branch in the comprehensive association degree matrix and the values of the elements in the target row.
12. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the financial institution risk early warning method of any of claims 1-10.
13. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the financial institution risk early warning method of any one of claims 1 to 10.
14. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the financial institution risk early warning method of any of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111643919.9A CN114169993A (en) | 2021-12-29 | 2021-12-29 | Financial institution risk early warning method and device, electronic equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111643919.9A CN114169993A (en) | 2021-12-29 | 2021-12-29 | Financial institution risk early warning method and device, electronic equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114169993A true CN114169993A (en) | 2022-03-11 |
Family
ID=80488575
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111643919.9A Pending CN114169993A (en) | 2021-12-29 | 2021-12-29 | Financial institution risk early warning method and device, electronic equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114169993A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119130660A (en) * | 2024-09-06 | 2024-12-13 | 中科先智(北京)国际科技研究院 | A financial risk early warning system and method based on big data |
CN119129941A (en) * | 2024-11-12 | 2024-12-13 | 中国电建集团华东勘测设计研究院有限公司 | Monitoring equipment layout optimization method and device based on Kendall rank correlation analysis |
CN119359307A (en) * | 2024-09-18 | 2025-01-24 | 湖北联合资信评估有限公司 | A supply chain financial transaction security early warning method and system |
CN120065938A (en) * | 2025-02-20 | 2025-05-30 | 江苏中邦建通科技有限公司 | Multi-system-integrated factory monitoring and safety system and method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114169993A (en) | Financial institution risk early warning method and device, electronic equipment and medium | |
CN111563606A (en) | Equipment predictive maintenance method and device | |
CN106030589A (en) | Disease prediction system using open source data | |
US20210342735A1 (en) | Data model processing in machine learning using a reduced set of features | |
CN110688536A (en) | Label prediction method, device, equipment and storage medium | |
CN110543996A (en) | job salary assessment method, apparatus, server and storage medium | |
CN115936895A (en) | Risk assessment method, device and equipment based on artificial intelligence and storage medium | |
US12236368B1 (en) | Probabilistic accumulation approach to assess primary uncertainty in catastrophe models | |
CN117421311A (en) | Data verification method, device, equipment and storage medium based on artificial intelligence | |
CN112668238A (en) | Rainfall processing method, device, equipment and storage medium | |
CN114331682A (en) | Method and device for acquiring relevance of deposit amount between objects | |
CN119917837A (en) | Method and device for sensing risk status of power grid safety production operations and equipment | |
Colace et al. | Unsupervised learning techniques for vibration-based structural health monitoring systems driven by data: a general overview | |
CN113313304A (en) | Power grid accident abnormity analysis method and system based on big data decision tree | |
CN119129768A (en) | A learning method for incomplete multivariate time series forecasting | |
CN118469381A (en) | Information processing method, apparatus, device, medium, and program product | |
CN114266414A (en) | Loan amount prediction method, loan amount prediction device, loan amount prediction electronic device, and loan amount prediction medium | |
CN117875684A (en) | Enterprise quality risk assessment method, device, equipment and storage medium | |
Tumbelaka et al. | Analysis of the Effect of Accounting Information Systems on the Accuracy and Reliability of Financial Statements at Shipping Industry Companies in Surabaya | |
CN114154757A (en) | Enterprise risk prediction method, device, electronic equipment and medium | |
Rasinojehdehi et al. | Predicting the claim amount from car insurance using multiple linear regression: A case study of Iran insurance | |
CN112330139A (en) | Clustering-based employee behavior classification model data set generation method | |
CN119151302B (en) | Method and system for managing and controlling settlement inspection risk of transformer substation building engineering | |
CN112465379A (en) | Enterprise scoring method, device, equipment and medium based on hazardous waste tracing data | |
CN119761715B (en) | Cost control method, device, equipment, storage medium and product for engineering cost |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||