CN110543904A - Enterprise risk classification model construction method based on Bayes - Google Patents
Enterprise risk classification model construction method based on Bayes
- Publication number: CN110543904A
- Application number: CN201910790138.9A
- Authority: CN (China)
- Prior art keywords: enterprise, information gain, data, classification model, bayes
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Human Resources & Organizations (AREA)
- Entrepreneurship & Innovation (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- General Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- Quality & Reliability (AREA)
- Operations Research (AREA)
- Marketing (AREA)
- Game Theory and Decision Science (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a Bayes-based method for constructing an enterprise risk classification model, and belongs to the technical field of computers. Step one: select the static features, tax payment behavior features and high tax risk features of an enterprise as data sources, and clean and de-noise the data. Step two: perform feature selection on the data obtained in step one using the information gain method, and store the resulting feature information gain values in a database table in descending order; the larger a feature's information gain value, the stronger its classification capability. Step three: according to the ranking of the feature information gain values, select the top 15 to 25 features as the input of the Bayesian algorithm model. Step four: establish the Bayes-based enterprise risk classification model and use it to classify enterprises and find abnormal ones, thereby providing early warning of tax risks.
Description
Technical Field
The invention relates to an enterprise risk classification model construction method based on Bayes, and belongs to the technical field of computers.
Background
With the development of informatization, enterprise systems have accumulated a large amount of financial and tax information. How to extract useful information from it and give early warning of tax payment risks deserves in-depth study. Traditional data and risk analysis relies largely on surveys and mathematical statistics; the data lag behind events and lack comprehensive consideration of multiple dimensions, so the state of an enterprise cannot be reported promptly and completely. Therefore, to control abnormal situations and the risk handling process and to prevent tax violations, a closed-loop risk monitoring and management process based on big data technology should be actively explored.
Disclosure of Invention
The invention aims to provide an enterprise risk classification model based on a Bayesian algorithm, using machine learning and data mining algorithms, so that the above problems can be effectively prevented. The model builds an analysis model from the enterprise's financial, tax and business data, identifies possible risk points in advance, and helps reduce enterprise tax evasion and other illegal behavior. The technical scheme is as follows:
A Bayes-based enterprise risk classification model construction method comprises the following steps:
Step one: select the static features, tax payment behavior features and high tax risk features of an enterprise as data sources, and clean and de-noise the data;
Step two: perform feature selection on the data obtained in step one using the information gain method, and store the information gain values in a database table in descending order;
Step three: according to the ranking of the feature information gain values, select the top 15 to 25 features as the input of the Bayesian algorithm model;
Step four: establish the Bayes-based enterprise risk classification model and use it to classify enterprises and find abnormal ones, thereby providing early warning of tax risks.
Further, the specific process of obtaining the feature information gain value in step two includes:
First, select an enterprise data set D, where |D| is the number of samples. There are K classes C_k, k = 1, 2, …, K, and |C_k| is the number of samples belonging to class C_k. Suppose the feature X takes n distinct values {x_1, x_2, …, x_n}; according to the value of X, D is divided into n subsets D_1, D_2, …, D_n, where |D_i| is the number of samples in the i-th subset. Let D_ik denote the set of samples in subset D_i that belong to class C_k, and |D_ik| the number of samples in D_ik;
Second, input the enterprise data set D and calculate its empirical entropy H(D):
H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log \frac{|C_k|}{|D|}
Third, calculate the empirical conditional entropy H(D|X) of the data set D given the feature X:
H(D|X) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \sum_{k=1}^{K} \frac{|D_{ik}|}{|D_i|} \log \frac{|D_{ik}|}{|D_i|}
Fourth, calculate the feature information gain value from the empirical entropy H(D) and the empirical conditional entropy H(D|X):
g(D,X)=H(D)-H(D|X)
This completes the acquisition of the feature information gain value.
Further, K = 2, indicating that enterprises are classified into normal enterprises and abnormal enterprises.
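The information gain procedure can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names, the toy labels and feature values, and the choice of base-2 logarithms (entropy in bits) are all assumptions made for the example.

```python
import math
from collections import Counter


def empirical_entropy(labels):
    """H(D): entropy of the class labels, in bits (base-2 log is an assumption)."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def empirical_conditional_entropy(feature_values, labels):
    """H(D|X): entropy of the labels inside each subset D_i, weighted by |D_i|/|D|."""
    total = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    return sum(
        (len(subset) / total) * empirical_entropy(subset)
        for subset in subsets.values()
    )


def information_gain(feature_values, labels):
    """g(D, X) = H(D) - H(D|X)."""
    return empirical_entropy(labels) - empirical_conditional_entropy(
        feature_values, labels
    )


# Hypothetical toy data: K = 2 classes and two already-discretized features.
labels = ["normal", "normal", "abnormal", "abnormal"]
perfect_feature = ["low", "low", "high", "high"]  # separates the two classes
useless_feature = ["low", "high", "low", "high"]  # carries no class information

gain_perfect = information_gain(perfect_feature, labels)
gain_useless = information_gain(useless_feature, labels)
```

With two equally frequent classes H(D) is 1 bit, so a feature that separates the classes completely reaches the maximum gain of 1, while a feature that is independent of the class has gain 0.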
Further, the process in step four of classifying enterprises with the Bayesian algorithm model comprises:
Step 1: after feature selection according to the feature information gain values, preprocess the enterprise feature data by normalization and discretization;
Step 2: divide the processed enterprise data into a training set and a test set in a ratio of 7:3;
Step 3: using the Bayesian algorithm model, count the enterprises in the training set whose class attribute is normal and those whose class attribute is abnormal, and calculate the prior probability P(Y = c_k) as follows:
P(Y = c_k) = \frac{1}{N} \sum_{j=1}^{N} I(y_j = c_k), k = 1, 2, …, K
Step 4: for each feature, calculate the conditional probability P(Y = c_k | X = x) of each class as follows:
P(Y = c_k | X = x) = \frac{P(X = x | Y = c_k) P(Y = c_k)}{\sum_{k'=1}^{K} P(X = x | Y = c_{k'}) P(Y = c_{k'})}
where x denotes a data sample; y denotes its class, namely normal enterprise or abnormal enterprise; N denotes the total number of samples; j indexes the j-th sample; and I(·) is the indicator function;
Step 5: use the test samples to obtain each enterprise's class from the enterprise risk classification model, which is as follows:
y = \arg\max_{c_k} P(Y = c_k | X = x)
where y denotes the class predicted for the test sample.
The invention has the following beneficial effects:
The Bayesian enterprise risk model is evaluated using recall, precision and the F1 value (F1-score). Recall is the proportion of all positive samples that the model judges to be positive; precision is the proportion of samples judged positive by the model that are truly positive. F1 = 2PR/(P + R), where P is precision and R is recall.
About 1 million enterprises in a city of Hebei province were selected as training samples, and 2000 enterprises were extracted as test samples to judge their risk types. The trained classification model achieved a recall of 75%, a precision of 88% and an F1 value of 81%. The experimental results show that the classification model of this patent classifies well and can find abnormal enterprises, thereby providing early warning of tax risks.
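The reported F1 value can be checked against the reported precision and recall with a short Python sketch. The function names are illustrative, and the confusion-matrix counts are hypothetical values chosen only because they reproduce 88% precision and 75% recall; the patent does not publish the actual counts.

```python
def precision_recall(true_positive, false_positive, false_negative):
    """Precision P = TP / (TP + FP) and recall R = TP / (TP + FN)."""
    precision = true_positive / (true_positive + false_positive)
    recall = true_positive / (true_positive + false_negative)
    return precision, recall


def f1_score(precision, recall):
    """F1 = 2PR / (P + R), the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Check the figures reported in the text: P = 88%, R = 75%.
reported_f1 = f1_score(0.88, 0.75)

# Hypothetical counts (not from the patent) that give the same P and R.
p, r = precision_recall(true_positive=66, false_positive=9, false_negative=22)
```

Here 2 × 0.88 × 0.75 / (0.88 + 0.75) = 1.32 / 1.63, which is about 0.81, so the three reported figures are mutually consistent.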
Drawings
FIG. 1 is a flow chart of a Bayesian-based enterprise risk classification model construction method of the present invention;
FIG. 2 is a schematic diagram of IG values of some features in the database and their ordering.
Detailed Description
The present invention will be further described with reference to the following specific examples, but the present invention is not limited to these examples.
Example 1:
A Bayes-based enterprise risk classification model construction method comprises the following steps:
Step one: select the static features, tax payment behavior features and high tax risk features of an enterprise as data sources, and clean and de-noise the data;
Step two: perform feature selection on the data obtained in step one using the information gain method, and store the information gain values in a database table in descending order;
Step three: according to the ranking of the feature information gain values, select the top 15 to 25 features as the input of the Bayesian algorithm model;
Step four: establish the Bayes-based enterprise risk classification model and use it to classify enterprises and find abnormal ones, thereby providing early warning of tax risks.
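Steps two and three (store the gain values in a database table, then take the top-ranked features) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the patent does not name a database, and the table name `feature_gain`, the feature names and the gain values below are all hypothetical.

```python
import sqlite3

# Hypothetical information gain values for four features (illustrative only).
gains = {
    "registered_capital": 0.02,
    "invoice_void_rate": 0.31,
    "tax_burden_rate": 0.27,
    "late_filing_count": 0.12,
}

# An in-memory database stands in for the patent's unspecified database table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feature_gain (feature TEXT PRIMARY KEY, gain REAL)")
conn.executemany("INSERT INTO feature_gain VALUES (?, ?)", gains.items())


def top_features(connection, k):
    """Return the names of the k features with the largest information gain."""
    rows = connection.execute(
        "SELECT feature FROM feature_gain ORDER BY gain DESC LIMIT ?", (k,)
    ).fetchall()
    return [row[0] for row in rows]


# The method selects the top 15 to 25 features; k = 2 keeps the toy example small.
selected = top_features(conn, 2)
```

Sorting at query time with `ORDER BY gain DESC` gives the same ranking as storing the rows in descending order, and it keeps the ranking correct if gain values are later recomputed.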
The static features of an enterprise are derived from its tax registration information, which is the enterprise's basic information; most of this data is entered when the enterprise first registers for tax. Once recorded, the data seldom changes, for example the enterprise's business address and registered capital. This embodiment classifies the features extracted from this part of the data as static features.
The tax payment behavior features reflect the fact that an enterprise's tax payment behavior often changes periodically because of the declaration period, the tax settlement period and other factors; this embodiment classifies such changes as tax payment behavior features. For features such as an enterprise's upstream and downstream relationships, the result can differ depending on the statistical period chosen, so different time granularities can be set for them. Although setting different granularities sacrifices some independence between features, it reflects the actual situation of an enterprise more comprehensively, and tax payment features can be refined at different period granularities.
The high tax risk features are the risk features of industries with a high incidence of tax risk, identified according to the industry subdivision results. Because these features apply only to specific industries and are absent, insignificant or meaningless for judgment in other industries, they are calculated during data processing only for enterprises belonging to those industries.
Combining the features from these three perspectives, this embodiment compiles fifty features as the basis for subsequent data analysis.
In step two, the specific process of obtaining the feature information gain value includes:
First, select an enterprise data set D, where |D| is the number of samples. There are K classes C_k, k = 1, 2, …, K, and |C_k| is the number of samples belonging to class C_k. Suppose the feature X takes n distinct values {x_1, x_2, …, x_n}; according to the value of X, D is divided into n subsets D_1, D_2, …, D_n, where |D_i| is the number of samples in the i-th subset. Let D_ik denote the set of samples in subset D_i that belong to class C_k, and |D_ik| the number of samples in D_ik;
Second, input the enterprise data set D and calculate its empirical entropy H(D):
H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log \frac{|C_k|}{|D|}
Third, calculate the empirical conditional entropy H(D|X) of the data set D given the feature X:
H(D|X) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \sum_{k=1}^{K} \frac{|D_{ik}|}{|D_i|} \log \frac{|D_{ik}|}{|D_i|}
Fourth, calculate the feature information gain value from the empirical entropy H(D) and the empirical conditional entropy H(D|X):
g(D,X)=H(D)-H(D|X)
This completes the acquisition of the feature information gain value.
Feature selection picks the features that contribute most to the classifier's classification capability, which improves classification efficiency and yields better classification results. This embodiment performs feature selection on the fifty compiled features, ranks them by classification capability, and selects the features with better classification capability. The information gain method is used for feature selection.
In information theory and probability statistics, entropy is a measure of the uncertainty of a random variable. The entropy of a random variable X is defined as:
H(X) = -\sum_{i=1}^{n} p_i \log p_i (1)
where p_i, i = 1, 2, …, n, is the probability distribution of X. The larger the entropy of the random variable X, the greater its uncertainty.
Conditional entropy, denoted H(Y|X), is the entropy of the conditional probability distribution of a random variable Y given the random variable X, averaged over the values of X:
H(Y|X) = \sum_{i=1}^{n} p_i H(Y | X = x_i) (2)
p_i = P(X = x_i), i = 1, 2, …, n (3)
When the entropy and the conditional entropy are estimated from data, they are called the empirical entropy and the empirical conditional entropy, respectively.
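The two definitions above can be illustrated numerically with a small Python sketch. The function names and example distributions are hypothetical, and base-2 logarithms are an assumption (the text does not fix the base), so the results are in bits.

```python
import math


def entropy(probabilities):
    """H(X) = -sum_i p_i log2 p_i; terms with p_i = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def conditional_entropy(joint):
    """H(Y|X) from a joint table where joint[i][k] = P(X = x_i, Y = y_k)."""
    total = 0.0
    for row in joint:
        p_i = sum(row)  # marginal probability P(X = x_i)
        if p_i > 0:
            # Entropy of Y given X = x_i, weighted by P(X = x_i).
            total += p_i * entropy([p / p_i for p in row])
    return total


fair = entropy([0.5, 0.5])    # maximal uncertainty for two outcomes
skewed = entropy([0.9, 0.1])  # lower uncertainty, about 0.469 bits

# Y independent of X: knowing X does not reduce the uncertainty about Y.
independent = conditional_entropy([[0.25, 0.25], [0.25, 0.25]])
# Y fully determined by X: no uncertainty about Y remains.
determined = conditional_entropy([[0.5, 0.0], [0.0, 0.5]])
```

The two joint tables mark the extremes: when X tells nothing about Y the conditional entropy equals H(Y), and when X determines Y it drops to zero.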
Information gain (IG) indicates the degree to which knowing the feature X reduces the uncertainty about the class Y. The information gain g(D, X) of the feature X on the data set D is defined as the difference between the empirical entropy H(D) of the data set D and the empirical conditional entropy H(D|X) of D given the feature X, i.e.
g(D,X)=H(D)-H(D|X) (4)
Here the empirical entropy H(D) represents the uncertainty of classifying the data set D, and the empirical conditional entropy H(D|X) represents the uncertainty of classifying D given the feature X. Their difference, the information gain, represents the degree to which the feature X reduces the classification uncertainty of D. According to formula (4), the information gain depends on the feature X: different features yield different information gain values, and features with larger information gain have stronger classification capability. Here K = 2, meaning that enterprises are divided into normal enterprises and abnormal enterprises.
In step four, the specific process of establishing the Bayes-based enterprise risk classification model is as follows:
Identifying the abnormal enterprises among a set of enterprises, that is, labeling each enterprise, is a classification process. This embodiment classifies enterprises with the naive Bayes algorithm to find abnormal ones, thereby providing early warning of tax risks. The Bayesian algorithm is used to learn the target P(Y = c_k | X = x) from the training data D, i.e. for a given sample x, the probability that the sample belongs to class c_k.
Assume the training data set is D = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)} and the set of output class labels is Y = {c_1, c_2, …, c_K} (in this patent K = 2, a two-class problem with a normal enterprise class and an abnormal enterprise class). The naive Bayes algorithm learns the joint probability distribution P(X, Y) from the training data set. Specifically, it learns the following prior probability distribution and conditional probability distribution. The prior probability distribution expresses the proportion of each class of samples in the sample space:
P(Y = c_k), k = 1, 2, …, K (8)
Conditional probability distribution, where x^{(i)} is the value of the i-th of the m selected features and the features are assumed conditionally independent given the class (the naive Bayes assumption):
P(X = x | Y = c_k) = \prod_{i=1}^{m} P(X^{(i)} = x^{(i)} | Y = c_k) (9)
In naive Bayes classification, the learned model is used to compute the posterior probability distribution P(Y = c_k | X = x) for an input sample x, and the class with the highest posterior probability is output as the class of x. The posterior probability is computed by Bayes' theorem:
P(Y = c_k | X = x) = \frac{P(X = x | Y = c_k) P(Y = c_k)}{\sum_{k'=1}^{K} P(X = x | Y = c_{k'}) P(Y = c_{k'})} (10)
Substituting equation (9) into equation (10) yields
P(Y = c_k | X = x) = \frac{P(Y = c_k) \prod_{i=1}^{m} P(X^{(i)} = x^{(i)} | Y = c_k)}{\sum_{k'=1}^{K} P(Y = c_{k'}) \prod_{i=1}^{m} P(X^{(i)} = x^{(i)} | Y = c_{k'})} (11)
The Bayesian classifier uses the argmax function for classification, so the naive Bayes classifier can be expressed as:
y = \arg\max_{c_k} \frac{P(Y = c_k) \prod_{i=1}^{m} P(X^{(i)} = x^{(i)} | Y = c_k)}{\sum_{k'=1}^{K} P(Y = c_{k'}) \prod_{i=1}^{m} P(X^{(i)} = x^{(i)} | Y = c_{k'})} (12)
The denominator in equation (12) is the same for all classes, so equation (12) can be simplified to
y = \arg\max_{c_k} P(Y = c_k) \prod_{i=1}^{m} P(X^{(i)} = x^{(i)} | Y = c_k) (13)
Formula (13) is the Bayes-based enterprise risk classification model.
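A naive Bayes classifier of this kind can be sketched in plain Python for discretized features. The class name, the toy feature values and labels are hypothetical, and the Laplace smoothing parameter `alpha` is an added assumption that the patent does not mention; setting `alpha=0` gives plain relative-frequency estimates.

```python
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical naive Bayes; alpha is Laplace smoothing (an assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, samples, labels):
        self.n_samples = len(labels)
        self.class_counts = Counter(labels)
        n_features = len(samples[0])
        # value_counts[i][c][v]: samples of class c whose i-th feature equals v
        self.value_counts = [defaultdict(Counter) for _ in range(n_features)]
        self.feature_values = [set() for _ in range(n_features)]
        for sample, label in zip(samples, labels):
            for i, value in enumerate(sample):
                self.value_counts[i][label][value] += 1
                self.feature_values[i].add(value)
        return self

    def prior(self, label):
        """P(Y = c_k): proportion of training samples in the class."""
        return self.class_counts[label] / self.n_samples

    def likelihood(self, i, value, label):
        """Smoothed estimate of P(X^(i) = value | Y = label)."""
        numerator = self.value_counts[i][label][value] + self.alpha
        denominator = (
            self.class_counts[label] + self.alpha * len(self.feature_values[i])
        )
        return numerator / denominator

    def predict(self, sample):
        """Return the class maximizing prior times the product of likelihoods."""
        scores = {}
        for label in self.class_counts:
            score = self.prior(label)
            for i, value in enumerate(sample):
                score *= self.likelihood(i, value, label)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical discretized training data: (tax burden level, filing timeliness).
samples = [
    ("high", "late"), ("high", "on_time"), ("low", "on_time"),
    ("low", "on_time"), ("high", "late"), ("low", "late"),
]
labels = ["abnormal", "abnormal", "normal", "normal", "abnormal", "normal"]

model = NaiveBayes().fit(samples, labels)
prediction = model.predict(("high", "late"))
```

With 15 to 25 features the product of many small probabilities can underflow, so a practical variant would sum log-probabilities instead; the argmax is unchanged because the logarithm is monotonic.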
In step four, the process of classifying enterprises with the Bayesian algorithm model comprises:
Step 1: after feature selection according to the feature information gain values, preprocess the enterprise feature data by normalization and discretization;
Step 2: divide the processed enterprise data into a training set and a test set in a ratio of 7:3;
Step 3: using the Bayesian algorithm model, count the enterprises in the training set whose class attribute is normal and those whose class attribute is abnormal, and calculate the prior probability P(Y = c_k) as follows:
P(Y = c_k) = \frac{1}{N} \sum_{j=1}^{N} I(y_j = c_k), k = 1, 2, …, K
Step 4: for each feature, calculate the conditional probability P(Y = c_k | X = x) of each class as follows:
P(Y = c_k | X = x) = \frac{P(X = x | Y = c_k) P(Y = c_k)}{\sum_{k'=1}^{K} P(X = x | Y = c_{k'}) P(Y = c_{k'})}
where x denotes a data sample; y denotes its class, namely normal enterprise or abnormal enterprise; N denotes the total number of samples; j indexes the j-th sample; and I(·) is the indicator function;
Step 5: use the test samples to obtain each enterprise's class from the enterprise risk classification model, which is as follows:
y = \arg\max_{c_k} P(Y = c_k | X = x)
where y denotes the class predicted for the test sample.
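The preprocessing in steps 1 and 2 can be sketched as follows. The patent does not say which normalization or discretization scheme it uses, so min-max scaling, equal-width binning into three bins, the random shuffle and all function names here are assumptions chosen for illustration.

```python
import random


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(normalized, n_bins=3):
    """Equal-width binning of values in [0, 1] into bins 0 .. n_bins - 1."""
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]


def train_test_split(rows, train_ratio=0.7, seed=0):
    """Shuffle a copy of the rows and split it, 7:3 by default."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical raw values of one numeric feature.
raw = [0.0, 25.0, 50.0, 75.0, 100.0]
bins = discretize(min_max_normalize(raw))

# Hypothetical enterprise identifiers standing in for preprocessed rows.
rows = list(range(10))
train, test = train_test_split(rows)
```

Discretization matters here because the frequency-count estimates of a categorical naive Bayes model need each feature to take a small number of distinct values.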
Although the present invention has been described with reference to the preferred embodiments, it should be understood that various changes and modifications can be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (4)
1. A Bayes-based enterprise risk classification model construction method, characterized by comprising the following steps:
Step one: select the static features, tax payment behavior features and high tax risk features of an enterprise as data sources, and clean and de-noise the data;
Step two: perform feature selection on the data obtained in step one using the information gain method, and store the information gain values in a database table in descending order;
Step three: according to the ranking of the feature information gain values, select the top 15 to 25 features as the input of the Bayesian algorithm model;
Step four: establish the Bayes-based enterprise risk classification model and use it to classify enterprises and find abnormal ones, thereby providing early warning of tax risks.
2. The Bayes-based enterprise risk classification model construction method according to claim 1, wherein the specific process of obtaining the feature information gain value in step two comprises:
First, select an enterprise data set D, where |D| is the number of samples. There are K classes C_k, k = 1, 2, …, K, and |C_k| is the number of samples belonging to class C_k. Suppose the feature X takes n distinct values {x_1, x_2, …, x_n}; according to the value of X, D is divided into n subsets D_1, D_2, …, D_n, where |D_i| is the number of samples in the i-th subset. Let D_ik denote the set of samples in subset D_i that belong to class C_k, and |D_ik| the number of samples in D_ik;
Second, input the enterprise data set D and calculate its empirical entropy H(D):
H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log \frac{|C_k|}{|D|}
Third, calculate the empirical conditional entropy H(D|X) of the data set D given the feature X:
H(D|X) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \sum_{k=1}^{K} \frac{|D_{ik}|}{|D_i|} \log \frac{|D_{ik}|}{|D_i|}
Fourth, calculate the feature information gain value from the empirical entropy H(D) and the empirical conditional entropy H(D|X):
g(D,X)=H(D)-H(D|X)
This completes the acquisition of the feature information gain value.
3. The Bayes-based enterprise risk classification model construction method according to claim 2, wherein K = 2, indicating that enterprises are classified into normal enterprises and abnormal enterprises.
4. The Bayes-based enterprise risk classification model construction method according to claim 1, wherein the process in step four of classifying enterprises with the Bayesian algorithm model comprises:
Step 1: after feature selection according to the feature information gain values, preprocess the enterprise feature data by normalization and discretization;
Step 2: divide the processed enterprise data into a training set and a test set in a ratio of 7:3;
Step 3: using the Bayesian algorithm model, count the enterprises in the training set whose class attribute is normal and those whose class attribute is abnormal, and calculate the prior probability P(Y = c_k) as follows:
P(Y = c_k) = \frac{1}{N} \sum_{j=1}^{N} I(y_j = c_k), k = 1, 2, …, K
Step 4: for each feature, calculate the conditional probability P(Y = c_k | X = x) of each class as follows:
P(Y = c_k | X = x) = \frac{P(X = x | Y = c_k) P(Y = c_k)}{\sum_{k'=1}^{K} P(X = x | Y = c_{k'}) P(Y = c_{k'})}
where x denotes a data sample; y denotes its class, namely normal enterprise or abnormal enterprise; N denotes the total number of samples; j indexes the j-th sample; and I(·) is the indicator function;
Step 5: use the test set to obtain each enterprise's class from the classification model, which is as follows:
y = \arg\max_{c_k} P(Y = c_k | X = x)
where y denotes the class predicted for the test sample.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910790138.9A CN110543904A (en) | 2019-08-26 | 2019-08-26 | Enterprise risk classification model construction method based on Bayes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910790138.9A CN110543904A (en) | 2019-08-26 | 2019-08-26 | Enterprise risk classification model construction method based on Bayes |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110543904A true CN110543904A (en) | 2019-12-06 |
Family
ID=68711999
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910790138.9A Pending CN110543904A (en) | 2019-08-26 | 2019-08-26 | Enterprise risk classification model construction method based on Bayes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110543904A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108491719A (en) * | 2018-03-15 | 2018-09-04 | 重庆邮电大学 | A kind of Android malware detection methods improving NB Algorithm |
CN108596276A (en) * | 2018-05-10 | 2018-09-28 | 重庆邮电大学 | The naive Bayesian microblog users sorting technique of feature based weighting |
CN109583682A (en) * | 2018-10-11 | 2019-04-05 | 平安科技(深圳)有限公司 | Recognition methods, device and the computer equipment of business finance fraud risk |
CN109409969A (en) * | 2018-10-24 | 2019-03-01 | 仲恺农业工程学院 | Enterprise tax fraud detection method, electronic equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
韩忠明等: "《数据分析与R》", 30 August 2014 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111324509A (en) * | 2020-02-18 | 2020-06-23 | 广东小天才科技有限公司 | Method and device for identifying application addiction |
CN113342903A (en) * | 2020-02-18 | 2021-09-03 | 北京沃东天骏信息技术有限公司 | Method and device for managing models in data warehouse |
CN111324509B (en) * | 2020-02-18 | 2023-07-11 | 广东小天才科技有限公司 | A method and device for identifying addiction to applications |
CN113205271A (en) * | 2021-05-12 | 2021-08-03 | 国家税务总局山东省税务局 | Method for evaluating enterprise income tax risk based on machine learning |
CN115641227A (en) * | 2022-09-05 | 2023-01-24 | 山西大学 | A Scaling Attack Detection Method Considering Diversity of Power Consumption Modes in Smart Grid Advanced Measurement System |
CN115641227B (en) * | 2022-09-05 | 2025-06-27 | 山西大学 | A scaling attack detection method considering the diversity of power consumption patterns in smart grid advanced measurement system |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191206 |