CN111179068A - A method, device and storage medium for determining the credit level of an enterprise's performance - Google Patents

A method, device and storage medium for determining the credit level of an enterprise's performance

Info

Publication number
CN111179068A
Authority
CN
China
Prior art keywords
performance
enterprise
data
determining
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010000250.0A
Other languages
Chinese (zh)
Inventor
陈杰
杨正道
马野
金鑫
赵培培
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baiwang Co ltd
Original Assignee
Baiwang Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baiwang Co ltd
Priority to CN202010000250.0A
Publication of CN111179068A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The present application discloses a method, device and storage medium for determining an enterprise's performance credit level. The method includes: acquiring invoice data of an enterprise; establishing a performance rating model according to the invoice data; calculating a performance score of the enterprise using the performance rating model; and determining the performance credit level according to the performance score. The method provided by this application is scientifically and reasonably designed, can effectively monitor financial risks and quickly determine an enterprise's credit level, greatly improves working efficiency, is simple and convenient to use, and meets the needs of practical applications well.

Figure 202010000250

Description

Method, device and storage medium for determining enterprise performance credit level
Technical Field
The application relates to the technical field of data processing, in particular to a method and a device for determining an enterprise performance credit level and a storage medium.
Background
In financial activities, financial institutions must continuously monitor financial risk. The traditional approach relies on expert judgment and on-site investigation, so working efficiency is low. As the volume of information and the number of enterprises grow, it becomes difficult to monitor the many risks effectively. How to monitor ever-increasing financial risk effectively on the basis of big data technology is a problem that urgently needs to be solved.
Disclosure of Invention
The application aims to provide a method, a device and a storage medium for determining an enterprise performance credit level. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended to neither identify key/critical elements nor delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
According to an aspect of an embodiment of the present application, there is provided a method for determining an enterprise performance credit level, including: acquiring invoice data of an enterprise;
establishing a performance rating model according to the invoice data;
calculating a performance score of the enterprise by using the performance rating model;
determining a level of performance credit based on the performance score.
Further, acquiring the invoice data of the enterprise comprises: extracting invoicing behavior data of the invoicing enterprise from the business system, performing ETL processing, and storing the data in a data warehouse.
Further, establishing a performance rating model according to the invoice data includes:
deriving from the invoice data features that affect enterprise performance risk, extracting the derived features, modeling the sample data, identifying the feature variables highly correlated with default risk, and establishing a corresponding logistic regression model.
Further, modeling the sample data after extracting the derived features includes:
finding the optimal number of bins and the bin boundaries for each feature;
binning each feature according to the optimal bin boundaries to obtain the feature's bin boundaries and WOE values;
processing the feature matrices of the training set and the test set, and replacing every value in each feature matrix with the WOE value of its bin;
modeling using the training set.
Further, binning each feature according to the optimal bin boundaries includes: first choosing a relatively large number of bins and performing equal-frequency binning; calculating the WOE value of each bin and the IV value of the feature; then merging similar bins according to the chi-square test value and recalculating the WOE value of each bin and the IV value of the feature, repeating until the number of bins reaches a relatively small value; and plotting the curve of IV value against number of bins to find the optimal number of bins and the boundaries of each bin.
According to an aspect of an embodiment of the present application, there is provided an apparatus for determining an enterprise performance credit level, including:
the acquisition module is used for acquiring invoice data of enterprises;
the modeling module is used for establishing a performance rating model according to the invoice data;
the calculation module is used for calculating the performance score of the enterprise by utilizing the performance rating model;
and the rating module is used for determining the performance credit level according to the performance score.
Further, the acquisition module is specifically used for extracting the billing behavior data of the billing enterprise from the business system, and storing the billing behavior data in the data warehouse after ETL processing.
Further, the modeling module is specifically configured to derive from the invoice data features that affect enterprise performance risk, extract the derived features, model the sample data, identify the feature variables highly correlated with default risk, and establish a corresponding logistic regression model.
According to another aspect of the embodiments of the present application, there is provided an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the method for determining the enterprise performance credit level.
According to another aspect of the embodiments of the present application, there is provided a computer-readable storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the method for determining the enterprise performance credit level.
The technical scheme provided by one aspect of the embodiment of the application can have the following beneficial effects:
the method for determining the enterprise performance credit level provided by the embodiment of the application has the advantages of scientific and reasonable design, capability of effectively monitoring financial risks and quickly determining the enterprise credit level, high efficiency, capability of greatly improving the working efficiency, simplicity and convenience in use and capability of well meeting the requirements of practical application.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some of the embodiments described in the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 illustrates a flowchart of a method for determining an enterprise performance credit level, according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is further described with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As shown in fig. 1, a first embodiment of the present application provides a method for determining a level of performance credit of an enterprise, including:
S1, acquiring invoice data of the enterprise;
S2, establishing a performance rating model according to the invoice data;
S3, calculating the performance score of the enterprise by using the performance rating model;
S4, determining the performance credit level according to the performance score.
In some embodiments, acquiring the invoice data of the enterprise comprises: extracting invoicing behavior data of the invoicing enterprise from the business system, performing ETL processing, and storing the data in a data warehouse.
In some embodiments, establishing a performance rating model according to the invoice data comprises:
deriving from the invoice data features that affect enterprise performance risk, extracting the derived features, modeling the sample data, identifying the feature variables highly correlated with default risk, and establishing a corresponding logistic regression model.
In some embodiments, modeling the sample data after extracting the derived features includes:
finding the optimal number of bins and the bin boundaries for each feature;
binning each feature according to the optimal bin boundaries to obtain the feature's bin boundaries and WOE values;
processing the feature matrices of the training set and the test set, and replacing every value in each feature matrix with the WOE value of its bin;
modeling using the training set.
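The WOE replacement step can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the application's implementation: the function name `woe_encode`, the cut points and the WOE values are hypothetical, and each bin is assumed to be closed on its upper boundary.

```python
import bisect

def woe_encode(column, upper_bounds, woe_values):
    """Replace each raw value with the WOE value of the bin it falls into.

    upper_bounds: sorted inner cut points; bin k holds values <= upper_bounds[k],
                  and the last bin is open-ended (assumed convention).
    woe_values:   one WOE value per bin, so len(woe_values) == len(upper_bounds) + 1.
    """
    return [woe_values[bisect.bisect_left(upper_bounds, v)] for v in column]

# Hypothetical feature: number of invoiced months, with cut points 3 and 9.
encoded = woe_encode([2, 3, 7, 12], upper_bounds=[3, 9], woe_values=[0.8, -0.1, -0.9])
print(encoded)
```

The same bin boundaries and WOE values would be applied to both the training set and the test set, so that the model sees a consistent encoding.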
In some embodiments, modeling with the training set establishes a logistic regression model, and this logistic regression model serves as the performance rating model.
In some embodiments, binning each feature according to the optimal bin boundaries comprises: first choosing a relatively large number of bins and performing equal-frequency binning; calculating the WOE value of each bin and the IV value of the feature; then merging similar bins according to the chi-square test value and recalculating the WOE value of each bin and the IV value of the feature, repeating until the number of bins reaches a relatively small value; and plotting the curve of IV value against number of bins to find the optimal number of bins and the boundaries of each bin.
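A rough sketch of this binning procedure is given below, using only the Python standard library. Several details are assumptions made for illustration and are not stated in the text: only adjacent bins are merged, the pair with the smallest chi-square statistic is treated as the most similar, a 0.5 smoothing term keeps the logarithm defined for bins with no good or no bad samples, and the sample data and all function names are hypothetical.

```python
import math

def equal_frequency_bins(values, labels, n_bins):
    """Sort by value and cut into n_bins groups of nearly equal size.
    Each bin is [low, high, good_count, bad_count]; label 1 means default."""
    pairs = sorted(zip(values, labels))
    size = len(pairs) / n_bins
    bins = []
    for k in range(n_bins):
        chunk = pairs[round(k * size):round((k + 1) * size)]
        bad = sum(label for _, label in chunk)
        bins.append([chunk[0][0], chunk[-1][0], len(chunk) - bad, bad])
    return bins

def chi_square(bin_a, bin_b):
    """Chi-square statistic of the 2x2 good/bad table of two bins."""
    rows = [bin_a[2:], bin_b[2:]]
    total = sum(map(sum, rows))
    chi2 = 0.0
    for row in rows:
        for j in range(2):
            expected = sum(row) * (rows[0][j] + rows[1][j]) / total
            if expected > 0:
                chi2 += (row[j] - expected) ** 2 / expected
    return chi2

def information_value(bins):
    """IV of a feature; the 0.5 smoothing is an assumption to avoid log(0)."""
    good_total = sum(b[2] for b in bins)
    bad_total = sum(b[3] for b in bins)
    iv = 0.0
    for _, _, good, bad in bins:
        g = (good + 0.5) / good_total
        b = (bad + 0.5) / bad_total
        iv += (b - g) * math.log(b / g)
    return iv

def iv_curve(values, labels, start_bins=10, min_bins=2):
    """Merge the most similar adjacent bins step by step and record,
    for every bin count, the IV value and a snapshot of the bins."""
    bins = equal_frequency_bins(values, labels, start_bins)
    curve = {len(bins): (information_value(bins), [b[:] for b in bins])}
    while len(bins) > min_bins:
        i = min(range(len(bins) - 1),
                key=lambda k: chi_square(bins[k], bins[k + 1]))
        a, b = bins[i], bins[i + 1]
        bins[i:i + 2] = [[a[0], b[1], a[2] + b[2], a[3] + b[3]]]
        curve[len(bins)] = (information_value(bins), [x[:] for x in bins])
    return curve

# Hypothetical sample: one feature with 40 observations and 12 defaults.
values = list(range(1, 41))
labels = [1 if (v <= 8 or v % 9 == 0) else 0 for v in values]
curve = iv_curve(values, labels, start_bins=10, min_bins=2)
for n in sorted(curve, reverse=True):
    print(n, round(curve[n][0], 3))
```

Plotting the printed pairs gives the curve of IV value against number of bins; the bin boundaries for any chosen bin count are available in the second element of `curve[n]`.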
The embodiment also provides an apparatus for determining the performance credit level of an enterprise, including:
the acquisition module is used for acquiring invoice data of enterprises;
the modeling module is used for establishing a performance rating model according to the invoice data;
the calculation module is used for calculating the performance score of the enterprise by utilizing the performance rating model;
and the rating module is used for determining the performance credit level according to the performance score.
In some embodiments, the acquisition module is specifically configured to extract the invoicing behavior data of the invoicing enterprise from the business system, perform ETL processing, and store the result in the data warehouse.
In some embodiments, the modeling module is specifically configured to combine the invoice data to derive features that affect the business performance risk, extract the derived features, model the sample data, find out feature variables highly related to the default risk, and establish a corresponding logistic regression model.
Compared with the prior art, the method of the embodiment can effectively save the cost of human resources, greatly improve the working efficiency, and is simple and convenient to use and low in maintenance cost.
The embodiment also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the method for determining the enterprise performance credit level.
The present embodiment also provides a computer-readable storage medium, on which a computer program is stored, the program being executed by a processor to implement the method for determining the enterprise performance credit level.
A second embodiment of the present application provides a method for determining an enterprise performance credit level, including:
1.1, extracting user invoicing data from the data system, and establishing a new data table for storage;
1.2, analyzing the factors that influence enterprise performance risk through the invoice data, and designing a logistic regression model:
Figure BSA0000199103770000052
First, features that affect enterprise performance risk are derived from the invoice data; after the derived features are extracted, the sample data is modeled, the feature variables highly correlated with default risk are identified, and a corresponding logistic regression model is established.
Figure BSA0000199103770000053
Logistic regression model:
The odds of an event (Odds) are the ratio of the probability that the event occurs to the probability that it does not occur. If a customer's default probability is p, the probability of not defaulting is 1-p, from which it follows that:
Odds = p/(1-p);
The customer's default probability p can then be expressed as:
$$p = \frac{\mathrm{Odds}}{1 + \mathrm{Odds}}$$
The scorecard expression is:
Score = A - B*log(Odds)
where A and B are constants. Because the log function is monotonically increasing on (0, +∞), the greater a user's default odds, the lower the Score.
Given
(1) the score S0 at a particular value of Odds; and
(2) PDO, the number of points by which the score changes when that Odds value is doubled (a decrease under the expression above);
A and B can be found by substituting the given S0 and PDO into the scorecard expression.
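As a worked illustration of this substitution, the sketch below solves the two equations S0 = A - B*log(Odds0) and S0 - PDO = A - B*log(2*Odds0), which give B = PDO/log(2) and A = S0 + B*log(Odds0). It assumes the natural logarithm and that the score falls by PDO when the default odds double; the numbers 600, 1/60 and 20 are hypothetical and do not come from the application.

```python
import math

def calibrate_scorecard(s0, base_odds, pdo):
    """Solve Score = A - B*log(Odds) for A and B.

    Assumes the score drops by `pdo` points when the default odds double.
    """
    b = pdo / math.log(2)
    a = s0 + b * math.log(base_odds)
    return a, b

def score_from_odds(a, b, odds):
    """Scorecard expression: Score = A - B*log(Odds)."""
    return a - b * math.log(odds)

# Hypothetical calibration: 600 points at default odds of 1:60, PDO of 20.
A, B = calibrate_scorecard(s0=600, base_odds=1 / 60, pdo=20)
print(round(score_from_odds(A, B, 1 / 60), 6))
print(round(score_from_odds(A, B, 2 / 60), 6))
```

The two printed scores correspond to the base odds and to doubled odds, so they differ by exactly PDO.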
Through the above analysis, the problem of scoring a user is reduced to the problem of finding the user's log odds of default, log(Odds).
A prediction function is constructed according to binary logistic regression:
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}$$
where h_θ(x) represents the probability that the result takes the value 1.
The log odds of the event are derived as follows:
$$\log(\mathrm{Odds}) = \log\frac{h_\theta(x)}{1 - h_\theta(x)} = \theta^{T} x$$
It can be seen that, in the logistic regression model, the log odds of the output Y = 1 is a linear function of the input x.
Figure BSA0000199103770000063
Figure BSA0000199103770000064
From this:
$$\log(\mathrm{Odds}) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n$$
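The linear relationship between the log odds and the inputs can be checked numerically. The sketch below is illustrative only; the coefficient vector `theta` and the input `x` are made-up values, and `theta[0]` plays the role of the intercept.

```python
import math

def predict_proba(theta, x):
    """Logistic prediction: 1 / (1 + exp(-(theta_0 + theta_1*x_1 + ...)))."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

theta = [-1.5, 0.8, -0.3]   # hypothetical intercept and two coefficients
x = [2.0, 1.0]              # hypothetical feature values
p = predict_proba(theta, x)
linear = theta[0] + theta[1] * x[0] + theta[2] * x[1]
print(p, log_odds(p), linear)
```

The last two printed values agree up to floating-point error, which is the statement that log(Odds) is linear in x.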
1.3, analyzing the sample data and establishing a performance rating model:
1) Preprocess the feature matrix X: remove duplicate values, fill missing values (when only a few samples lack a feature, they can be deleted directly; a feature such as family size can be filled with the mean; a feature such as income can be filled using a random forest; and so on), handle outliers, and handle sample imbalance (using oversampling and undersampling methods).
2) Find the optimal number of bins and the bin boundaries for each feature. The optimal number of bins is the one for which the feature's IV value falls within the optimal IV range as far as possible (keeping the importance of each feature as high as possible), with samples similar within a bin and clearly different between bins. The steps are therefore: first choose a relatively large number of bins and perform equal-frequency binning; calculate the WOE value of each bin and the IV value of the feature; then merge similar bins according to the chi-square test value and recalculate the WOE value of each bin and the IV value of the feature, repeating until the number of bins reaches a relatively small value. Plot the curve of IV value against number of bins to find the optimal number of bins and the boundaries of each bin.
3) Bin each feature according to the optimal bin boundaries, obtaining the feature's bin boundaries and WOE values.
4) Process the feature matrix X of the training set and the test set, replacing every value in the feature matrix with the WOE value of its bin.
5) Build the model on the training set, compute the model score on the test set, and use learning curves to tune the regularization coefficient C and the maximum number of iterations max_iter to improve the model score.
6) Build the scorecard.
Score = A - B*log(odds)
Calculate the values of the coefficients A and B from this formula, and obtain the intercept lr.intercept_ and the feature coefficients lr.coef_ from the logistic regression. The base score of the scorecard is calculated as base_score = A - B * lr.intercept_, and the list of scores for each feature (one score per bin) is calculated as col_score = whereall["i_colName"] * (-B * lr.coef_[0][i]), where whereall["i_colName"] holds the list of bin boundaries of feature i and the corresponding list of WOE values.
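The scorecard construction in step 6) can be sketched as follows. This is a hedged illustration rather than the application's code: the constants A and B, the intercept, the coefficients, the feature names and the WOE tables are all hypothetical, and the fitted model is assumed to have been trained on WOE-encoded features.

```python
def build_scorecard(a, b, intercept, coefs, woe_tables):
    """Return (base_score, {feature: {bin_label: points}}).

    coefs[i] is the coefficient of the i-th feature in woe_tables
    (dicts preserve insertion order in Python 3.7+).
    """
    base_score = a - b * intercept
    points = {}
    for coef, (name, woe_by_bin) in zip(coefs, woe_tables.items()):
        points[name] = {label: woe * (-b * coef)
                        for label, woe in woe_by_bin.items()}
    return base_score, points

# Hypothetical calibration constants, fitted parameters and WOE tables.
A, B = 500.0, 20.0
intercept, coefs = -2.0, [1.0, 0.5]
woe_tables = {
    "invoiced_months": {"<=6": 0.9, ">6": -0.4},
    "red_void_ratio": {"<=0.1": -0.3, ">0.1": 0.7},
}
base_score, points = build_scorecard(A, B, intercept, coefs, woe_tables)

# Score of an enterprise that falls into the ">6" and "<=0.1" bins.
total = (base_score
         + points["invoiced_months"][">6"]
         + points["red_void_ratio"]["<=0.1"])
print(base_score, total)
```

Summing the base score and one per-bin score for each feature reproduces Score = A - B*log(odds), because log(odds) is the intercept plus the coefficient-weighted WOE values.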
1.4, determining the enterprise performance credit level according to the performance rating model.
A third embodiment of the present application provides a method for determining an enterprise performance credit level, including:
S10, extracting user invoice data from the data system, and establishing a new data table for storage;
S20, preprocessing the invoice behavior data, and storing the preprocessed incremental data; the invoice behavior data includes the invoiced amount, changes in the invoiced amount, voided invoices, and invoicing time and behavior;
S30, analyzing the factors that influence enterprise performance risk through the invoice data, and designing a logistic regression model;
S40, analyzing the sample data and establishing a performance rating model;
S50, determining the enterprise performance credit level by using the performance rating model.
Step S40 includes: establishing a logistic regression model of the factors influencing enterprise performance credit from the invoice feature data of the selected samples;
Step S50 includes: calculating the performance risk score and level according to the logistic regression model.
Compared with the prior art, the method of the embodiment can effectively save the cost of human resources, greatly improve the working efficiency, and is simple and convenient to use and low in maintenance cost.
A fourth embodiment of the present application provides a method for determining an enterprise performance credit level, including:
Step 01, data extraction
Invoicing behavior data of the invoicing enterprise is extracted from the business system, processed by ETL, and stored in a data warehouse; the data items mainly include the invoiced amount, red-letter (reversal) invoices, voided invoices, invoicing time, and other fields of the invoice data.
To handle the large volume of enterprise invoice data found in practical applications, this embodiment uses a data warehouse: an ETL process extracts, transforms and loads the invoice data from the business system, and a customer-centric wide table with multiple attributes is built and stored in the data warehouse for later modeling and analysis. Here, data extraction means pulling data from the source system, converting it into the data structure required by the analysis, aggregating it, and loading it into the target data warehouse once conversion and aggregation are complete.
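The transform-and-aggregate part of such an ETL process might look like the sketch below, which collapses raw invoice rows into one wide record per customer. The field names (`customer_id`, `issued_on`, `amount`, `status`), the status values and the sample rows are assumptions made for illustration; a real system would read from the business database and write to the warehouse instead of using in-memory lists.

```python
from collections import defaultdict

def build_wide_table(invoice_rows):
    """Aggregate raw invoice rows into one wide record per customer."""
    wide = defaultdict(lambda: {"invoice_count": 0, "total_amount": 0.0,
                                "red_count": 0, "void_count": 0,
                                "months": set()})
    for row in invoice_rows:
        rec = wide[row["customer_id"]]
        rec["invoice_count"] += 1
        rec["months"].add(row["issued_on"][:7])   # "YYYY-MM" prefix
        if row["status"] == "red":
            rec["red_count"] += 1
        elif row["status"] == "void":
            rec["void_count"] += 1
        else:
            rec["total_amount"] += row["amount"]  # only normal invoices count
    # Replace the set of months by its size so every column is a scalar.
    return {cid: {**rec, "months": len(rec["months"])}
            for cid, rec in wide.items()}

# Hypothetical extracted rows.
rows = [
    {"customer_id": "C1", "issued_on": "2019-01-15", "amount": 100.0, "status": "normal"},
    {"customer_id": "C1", "issued_on": "2019-01-20", "amount": 100.0, "status": "red"},
    {"customer_id": "C1", "issued_on": "2019-03-02", "amount": 250.0, "status": "normal"},
    {"customer_id": "C2", "issued_on": "2019-02-11", "amount": 80.0, "status": "void"},
]
wide_table = build_wide_table(rows)
print(wide_table["C1"])
```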
Step 02, data processing
Four kinds of data problems are mainly handled: missing values, outliers, duplicates, and noisy data. To gain a preliminary understanding of the data as a whole and to draw on prior knowledge, an initial exploration of the data is performed first to obtain its basic attributes and distributions. In addition, univariate and multivariate analysis can be used for a preliminary exploration of the relationships among the features, in order to verify the hypotheses put forward during the business analysis stage.
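A minimal sketch of three of these cleaning operations (deduplication, mean filling of missing values, and clipping of outliers) is shown below. The record layout, the column name `amount` and the clipping bounds are hypothetical, and mean filling and clipping are just one possible choice for each problem.

```python
def drop_duplicates(rows):
    """Keep the first occurrence of each identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def fill_missing_with_mean(rows, column):
    """Replace None in `column` with the mean of the observed values."""
    present = [r[column] for r in rows if r[column] is not None]
    mean = sum(present) / len(present)
    return [{**r, column: mean if r[column] is None else r[column]}
            for r in rows]

def clip_outliers(rows, column, low, high):
    """Limit values of `column` to the range [low, high]."""
    return [{**r, column: min(max(r[column], low), high)} for r in rows]

# Hypothetical raw records with a duplicate, a missing value and an outlier.
raw = [
    {"id": 1, "amount": 100.0},
    {"id": 1, "amount": 100.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 300.0},
    {"id": 4, "amount": 500.0},
]
clean = clip_outliers(fill_missing_with_mean(drop_duplicates(raw), "amount"),
                      "amount", low=0.0, high=400.0)
print(clean)
```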
Invoiced amount: the invoiced amount reflects the enterprise's operating revenue. Different credit products target different customer groups, and the invoiced amount indicates how suitable a credit product is for enterprises of different operating scales.
Invoicing frequency: the number of consecutive months with invoices within a year is a typical signal. An enterprise operating normally invoices in most months, often in all 12; conversely, invoicing in markedly fewer than 12 months may indicate that the enterprise has problems, and is a risk signal. As for invoice type, an ordinary invoice cannot be used to deduct input tax, which clearly distinguishes it from a special VAT invoice, so the proportion of special invoices is also a dimension of credit performance.
Reversal and void ratio: when an invoice of a given amount is reversed (red-letter) or voided in the same month or the following month, the usual causes are returned goods or invoicing errors, which should not occur often in a normally operating enterprise. An excessively high ratio of reversed and voided invoices may indicate problems in the enterprise's operation or management, and is a potential risk point.
Downstream customer concentration: the more downstream customers an enterprise has, the wider its sales channels and the smaller the influence of any single downstream customer, so concentration indicators such as the number of downstream customers and their shares are added to the features.
Downstream customer stability: the more stable an enterprise's downstream customers, the more stable its operations and cash flow, so the proportions of lost and newly acquired downstream customers are added to the indicators.
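The features described above could be derived from invoice records roughly as follows. The record keys (`month`, `amount`, `buyer`, `status`, `kind`), the exact feature definitions and the sample data are illustrative assumptions; the sketch also assumes that the invoice list, the set of normal invoices and the set of previous buyers are non-empty.

```python
def derive_features(invoices, previous_buyers):
    """Derive invoice-based risk features for one enterprise.

    invoices: dicts with hypothetical keys month, amount, buyer,
              status ("normal", "red" or "void") and kind ("special" or "ordinary").
    previous_buyers: set of downstream customers in the previous period.
    """
    normal = [i for i in invoices if i["status"] == "normal"]
    total_amount = sum(i["amount"] for i in normal)
    by_buyer = {}
    for i in normal:
        by_buyer[i["buyer"]] = by_buyer.get(i["buyer"], 0.0) + i["amount"]
    buyers = set(by_buyer)
    return {
        "total_amount": total_amount,
        "invoiced_months": len({i["month"] for i in normal}),
        "special_ratio": sum(i["kind"] == "special" for i in normal) / len(normal),
        "red_void_ratio": (len(invoices) - len(normal)) / len(invoices),
        "buyer_count": len(buyers),
        "top_buyer_share": max(by_buyer.values()) / total_amount,
        "lost_buyer_ratio": len(previous_buyers - buyers) / len(previous_buyers),
        "new_buyer_ratio": len(buyers - previous_buyers) / len(buyers),
    }

# Hypothetical invoices of one enterprise.
invoices = [
    {"month": 1, "amount": 100.0, "buyer": "A", "status": "normal", "kind": "special"},
    {"month": 1, "amount": 50.0, "buyer": "B", "status": "normal", "kind": "ordinary"},
    {"month": 2, "amount": 150.0, "buyer": "A", "status": "normal", "kind": "special"},
    {"month": 2, "amount": 80.0, "buyer": "C", "status": "void", "kind": "ordinary"},
    {"month": 4, "amount": 100.0, "buyer": "C", "status": "normal", "kind": "special"},
]
features = derive_features(invoices, previous_buyers={"A", "B", "D"})
print(features)
```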
Step 03, single factor analysis
To ensure the modeling effect, trend analysis, correlation analysis, feature importance, collinearity among features, and significance are calculated for each feature, and the feature variables that have strong predictive power for the outcome, weak correlation with one another, and strong significance are retained.
WOE (weight of evidence) is a way of encoding the original independent variable; to WOE-encode a variable, it must first be binned. The WOE of a group expresses the difference between the ratio of responding customers to non-responding customers in that group and the same ratio across all samples. This difference is expressed as the ratio of the two ratios, of which the logarithm is then taken. The larger the WOE, the greater the difference and the more likely the samples in the group are to respond; the smaller the WOE, the smaller the difference and the less likely the samples in the group are to respond.
IV stands for Information Value, that is, the value or amount of information carried by a feature.
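The WOE definition above translates directly into a few lines of Python. The sketch assumes the natural logarithm, strictly positive counts in every bin, and the common definition of IV as the sum over bins of (share of responders minus share of non-responders) times WOE, which the text does not spell out; the counts are invented.

```python
import math

def woe_and_iv(bins):
    """bins: list of (responders, non_responders) counts, one pair per bin.

    Returns the list of WOE values and the feature's IV.
    All counts are assumed to be strictly positive.
    """
    resp_total = sum(r for r, _ in bins)
    non_total = sum(n for _, n in bins)
    woes, iv = [], 0.0
    for r, n in bins:
        # Ratio in the bin divided by the ratio over all samples, then log.
        woe = math.log((r / n) / (resp_total / non_total))
        woes.append(woe)
        iv += (r / resp_total - n / non_total) * woe
    return woes, iv

# Hypothetical feature with two bins: (defaults, non-defaults).
woes, iv = woe_and_iv([(30, 70), (10, 90)])
print(woes, iv)
```

A bin with a positive WOE contains proportionally more responding (defaulting) customers than the sample as a whole, and a bin with a negative WOE contains proportionally fewer.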
Step 04, logistic regression modeling
Logistic regression modeling is performed to predict the enterprise's default probability, and the enterprise's performance risk rating is determined according to the interval into which the default probability falls.
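Mapping a predicted default probability to a rating by interval can be sketched as below. The cut-off probabilities and the grade labels are purely hypothetical; the application does not specify the intervals.

```python
import bisect

def rating_from_probability(p, cutoffs=(0.02, 0.05, 0.10, 0.20),
                            grades=("A", "B", "C", "D", "E")):
    """Return the grade of the interval containing default probability p.

    Intervals are [0, 0.02), [0.02, 0.05), ..., [0.20, 1]; both the
    cut-offs and the grade names are illustrative assumptions.
    """
    return grades[bisect.bisect_right(cutoffs, p)]

for p in (0.01, 0.02, 0.07, 0.5):
    print(p, rating_from_probability(p))
```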
It should be noted that:
the term "module" is not intended to be limited to a particular physical form. Depending on the particular application, a module may be implemented as hardware, firmware, software, and/or combinations thereof. Furthermore, different modules may share common components or even be implemented by the same component. There may or may not be clear boundaries between the various modules.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose devices may be used with the teachings herein. The required structure for constructing such a device will be apparent from the description above. In addition, this application is not directed to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present application as described herein, and any descriptions of specific languages are provided above to disclose the best modes of the present application.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the application and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the device for determining an enterprise's performance credit level according to embodiments of the present application. The present application may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present application may be stored on a computer readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the application, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering. These words may be interpreted as names.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a portion of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same time and may be performed at different times; nor are they necessarily performed in sequence, and they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
The above-mentioned embodiments express only several embodiments of the present application. Their description is relatively specific and detailed, but it is not to be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. A method for determining an enterprise's performance credit level, characterized by comprising: obtaining invoice data of the enterprise; establishing a performance rating model according to the invoice data; calculating a performance score of the enterprise using the performance rating model; and determining the performance credit level according to the performance score.
2. The method according to claim 1, characterized in that obtaining the invoice data of the enterprise comprises: extracting invoicing behavior data of the invoicing enterprise from a business system, and storing the data in a data warehouse after ETL processing.
3. The method according to claim 1, characterized in that establishing the performance rating model according to the invoice data comprises: deriving, from the invoice data, features that affect the enterprise's performance risk; modeling sample data after the derived features are extracted; identifying the feature variables that are highly correlated with default risk; and establishing a corresponding logistic regression model.
4. The method according to claim 3, characterized in that modeling the sample data after the derived features are extracted comprises: finding the optimal number of bins and the bin boundaries for each feature; binning each feature according to its optimal bin boundaries to obtain the bin boundaries and WOE values of the feature; processing the feature matrices of the training set and the test set by replacing every value in the feature matrices with the WOE value of the corresponding bin; and modeling using the training set.
5. The method according to claim 4, characterized in that binning each feature according to the optimal bin boundaries comprises: first choosing a relatively large number of bins and performing equal-frequency binning; calculating the WOE value of each bin and the IV value of the feature; then merging similar bins according to the chi-square test value and recalculating the WOE value of each bin and the IV value of the feature, until the number of bins falls to a relatively small value; plotting the curve of IV value against number of bins; and finding the optimal number of bins and the boundaries of each bin.
6. A device for determining an enterprise's performance credit level, characterized by comprising: an acquisition module, configured to obtain invoice data of the enterprise; a modeling module, configured to establish a performance rating model according to the invoice data; a calculation module, configured to calculate a performance score of the enterprise using the performance rating model; and a rating module, configured to determine the performance credit level according to the performance score.
7. The device according to claim 6, characterized in that the acquisition module is specifically configured to extract invoicing behavior data of the invoicing enterprise from a business system and store the data in a data warehouse after ETL processing.
8. The device according to claim 6, characterized in that the modeling module is specifically configured to derive, from the invoice data, features that affect the enterprise's performance risk, model sample data after the derived features are extracted, identify the feature variables that are highly correlated with default risk, and establish a corresponding logistic regression model.
9. An electronic device, characterized by comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the program to implement the method for determining an enterprise's performance credit level according to any one of claims 1-5.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program is executed by a processor to implement the method for determining an enterprise's performance credit level according to any one of claims 1-5.
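The binning steps recited in claims 4 and 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names, the additive `smoothing` term, and the sign convention (WOE as the log of good share over bad share) are assumptions made for illustration, and the chi-square merging of similar bins and the plot of IV against bin count are not shown. Labels are assumed to be 1 for a defaulting enterprise and 0 for a performing one.

```python
import math


def equal_frequency_edges(values, n_bins):
    """Upper edges of roughly equal-frequency bins; the last edge is +inf."""
    if not 1 <= n_bins <= len(values):
        raise ValueError("n_bins must be between 1 and len(values)")
    ordered = sorted(values)
    n = len(ordered)
    # The k-th cut point is the largest value in the first k/n_bins of the data.
    cuts = {ordered[(k * n) // n_bins - 1] for k in range(1, n_bins)}
    return sorted(cuts) + [math.inf]


def assign_bin(value, edges):
    """Index of the first bin whose upper edge is >= value."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges) - 1


def woe_iv(values, labels, edges, smoothing=0.5):
    """Per-bin WOE and the feature's IV.

    WOE_i = ln(good_share_i / bad_share_i)
    IV    = sum_i (good_share_i - bad_share_i) * WOE_i
    The smoothing term (an assumption) avoids log(0) in empty cells.
    """
    good = [0.0] * len(edges)
    bad = [0.0] * len(edges)
    for v, y in zip(values, labels):
        i = assign_bin(v, edges)
        if y == 1:
            bad[i] += 1
        else:
            good[i] += 1
    good_total = sum(good) + smoothing * len(edges)
    bad_total = sum(bad) + smoothing * len(edges)
    woe, iv = [], 0.0
    for g, b in zip(good, bad):
        good_share = (g + smoothing) / good_total
        bad_share = (b + smoothing) / bad_total
        w = math.log(good_share / bad_share)
        woe.append(w)
        iv += (good_share - bad_share) * w
    return woe, iv


def woe_encode(values, edges, woe):
    """Replace each raw value with the WOE of the bin it falls into."""
    return [woe[assign_bin(v, edges)] for v in values]


# Hypothetical feature values and default labels, for demonstration only.
feature = [1, 2, 3, 4, 5, 6, 7, 8]
default = [0, 0, 0, 0, 0, 1, 1, 1]
edges = equal_frequency_edges(feature, 2)
woe, iv = woe_iv(feature, default, edges)
print(edges, woe, iv)
print(woe_encode(feature, edges, woe))
```

The WOE-encoded column produced by `woe_encode` is what would then be passed to the logistic regression of claim 3. In practice the bin edges and WOE values would be fitted on the training set only and reused unchanged on the test set.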
CN202010000250.0A 2020-01-02 2020-01-02 A method, device and storage medium for determining the credit level of an enterprise's performance Pending CN111179068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010000250.0A CN111179068A (en) 2020-01-02 2020-01-02 A method, device and storage medium for determining the credit level of an enterprise's performance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010000250.0A CN111179068A (en) 2020-01-02 2020-01-02 A method, device and storage medium for determining the credit level of an enterprise's performance

Publications (1)

Publication Number Publication Date
CN111179068A (en) 2020-05-19

Family

ID=70655998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010000250.0A Pending CN111179068A (en) 2020-01-02 2020-01-02 A method, device and storage medium for determining the credit level of an enterprise's performance

Country Status (1)

Country Link
CN (1) CN111179068A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561680A (en) * 2020-12-08 2021-03-26 爱信诺征信有限公司 Enterprise credit limit calculation system, method, electronic equipment and storage medium
CN113222255A (en) * 2021-05-17 2021-08-06 上海生腾数据科技有限公司 Method and device for contract performance quantification and short-term default prediction
CN113240224A (en) * 2021-02-05 2021-08-10 山东齐信数据服务有限公司 Method and system for evaluating performance capability of government procurement
CN115660774A (en) * 2022-10-14 2023-01-31 国网山东省电力公司物资公司 Material supply chain system credit evaluation method based on block chain

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130132269A1 (en) * 2010-08-06 2013-05-23 The Dun And Bradstreet Corporation Method and system for quantifying and rating default risk of business enterprises
CN106779457A (en) * 2016-12-29 2017-05-31 深圳微众税银信息服务有限公司 A kind of rating business credit method and system
CN109583773A (en) * 2018-12-04 2019-04-05 税友软件集团股份有限公司 A kind of method, system and relevant apparatus that taxpaying credit integral is determining
CN110110981A (en) * 2019-04-26 2019-08-09 重庆第二师范学院 A kind of credit rating Default Probability estimates and method for prewarning risk

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561680A (en) * 2020-12-08 2021-03-26 爱信诺征信有限公司 Enterprise credit limit calculation system, method, electronic equipment and storage medium
CN113240224A (en) * 2021-02-05 2021-08-10 山东齐信数据服务有限公司 Method and system for evaluating performance capability of government procurement
CN113222255A (en) * 2021-05-17 2021-08-06 上海生腾数据科技有限公司 Method and device for contract performance quantification and short-term default prediction
CN113222255B (en) * 2021-05-17 2024-03-05 上海生腾数据科技有限公司 Method and device for quantifying contract performance and predicting short-term violations
CN115660774A (en) * 2022-10-14 2023-01-31 国网山东省电力公司物资公司 Material supply chain system credit evaluation method based on block chain
CN115660774B (en) * 2022-10-14 2023-09-19 国网山东省电力公司物资公司 A credit evaluation method for material supply chain system based on blockchain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication; Application publication date: 20200519