WO2019218699A1

WO2019218699A1 - Fraud transaction determining method and apparatus, computer device, and storage medium

Info

Publication number: WO2019218699A1
Application number: PCT/CN2019/070159
Authority: WO
Inventors: 石建; 陈飞腾
Original assignee: OneConnect Smart Technology Co Ltd
Current assignee: OneConnect Smart Technology Co Ltd
Priority date: 2018-05-18
Filing date: 2019-01-03
Publication date: 2019-11-21
Anticipated expiration: 2020-11-18
Also published as: CN108717638A

Abstract

A fraudulent transaction determination method, comprising: obtaining a current transaction; inputting the current transaction into a transaction anti-fraud analysis model to obtain a transaction fraud probability of the current transaction; identifying the current transaction by means of a pre-stored fraudulent transaction identification rule to obtain an identification result; and determining whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the identification result.

Description

Fraud transaction judgment method, device, computer device and storage medium

Cross-reference to related applications

This application claims priority to Chinese patent application No. 2018104790604, entitled "Fraudulent transaction determination method, apparatus, computer device and storage medium", filed with the Chinese Patent Office on May 18, 2018, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to a fraudulent transaction determination method, apparatus, computer device and storage medium.

Background

With the development of modern network technology, electronic transactions are used more and more widely in daily life. This brings the risk that an electronic transaction is fraudulent, for example through information leakage or an illegal transaction. The transaction data of the current transaction therefore needs to be analyzed for fraud during the transaction, and the current transaction is terminated when the analysis indicates that it may be fraudulent.

Traditionally, the current transaction is analyzed with a real-time transaction anti-fraud scoring model, such as FICO (a personal credit scoring method), SAS (Statistical Analysis System, a data analysis system; banks can perform anti-fraud analysis by deploying SAS fraud management), or the transaction anti-fraud model of Tongdun. However, the inventors realized that these models generalize poorly and judge the fraudulent transactions that occur in actual transactions with low accuracy.

Summary of the invention

According to various embodiments disclosed in the present application, a fraudulent transaction determination method, apparatus, computer device and storage medium are provided.

A fraudulent transaction determination method, the method comprising:

obtaining a current transaction;

inputting the current transaction into a transaction anti-fraud analysis model to obtain a transaction fraud probability of the current transaction;

identifying the current transaction by means of a pre-stored fraudulent transaction identification rule to obtain an identification result; and

determining whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the identification result.

A fraudulent transaction determination apparatus, the apparatus comprising:

a data acquisition module, configured to obtain a current transaction;

a probability analysis module, configured to input the current transaction into a transaction anti-fraud analysis model to obtain a transaction fraud probability of the current transaction;

a rule identification module, configured to identify the current transaction by means of a pre-stored fraudulent transaction identification rule to obtain an identification result; and

a result acquisition module, configured to determine whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the identification result.

A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:

obtaining a current transaction;

One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:

obtaining a current transaction;

Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will become apparent from the description, the drawings and the claims.

Brief description of the drawings

In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is an application scenario diagram of a fraudulent transaction determination method according to one or more embodiments.

FIG. 2 is a flow diagram of a fraudulent transaction determination method according to one or more embodiments.

FIG. 3 is a flow diagram of the model training steps according to one or more embodiments.

FIG. 4 is a flow diagram of step S304 of FIG. 3 according to one or more embodiments.

FIG. 5 is a block diagram of a fraudulent transaction determination apparatus according to one or more embodiments.

FIG. 6 is a block diagram of a computer device according to one or more embodiments.

Detailed description

In order to make the technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present application and are not intended to limit it.

The fraudulent transaction determination method provided by the present application can be applied to the application environment shown in FIG. 1, in which a terminal communicates with a server through a network. The terminal obtains the current transaction from the server, inputs the current transaction into a transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction, and identifies the current transaction by means of a fraudulent transaction identification rule. It then combines the transaction fraud probability with the identification result of the rule to determine whether the current transaction is a fraudulent transaction. The terminal may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer or portable wearable device. The server may be implemented as a standalone server or as a server cluster composed of multiple servers.

In some embodiments, as shown in FIG. 2, a fraudulent transaction determination method is provided. The method is described using its application to the terminal in FIG. 1 as an example, and includes the following steps:

S202: Obtain the current transaction.

The current transaction is a transaction that needs to be judged as fraudulent or not. If several transactions need to be judged, each of them is taken in turn as the current transaction for the subsequent operations. The current transaction may include information such as the transaction amount, time and account data of one transaction.

Specifically, the terminal obtains from the server the current transaction that needs to be judged. The server may be a business system or another transaction system; the business system may be a transaction website or a database storing transaction data, and the other transaction system may be, for example, a partner website.

S204: Input the current transaction into the transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction.

The transaction anti-fraud analysis model is a model for analyzing the transaction fraud probability of the input current transaction. It may be a neural network model or a rule model that a technician has trained with historical transaction data as sample data.

The transaction fraud probability is the probability, obtained by the terminal through the transaction anti-fraud analysis model, that the current transaction is a fraudulent transaction. It may be presented as a percentage or in another mathematical representation.

Specifically, the terminal inputs the obtained current transaction into the transaction anti-fraud analysis model, which was trained with historical transaction data and the fraud information in that data as sample data, and obtains the probability that the current transaction is fraudulent according to the rules learned in model training.

S206: Identify the current transaction by means of a pre-stored fraudulent transaction identification rule to obtain an identification result.

A fraudulent transaction identification rule is a rule, formulated from different transaction characteristics, for judging whether the current transaction is fraudulent. The rules may include limit rules on amount and count based on high-frequency transactions, and decision combinations of identification rules extracted from the characteristics of historical risk transactions. A limit rule based on high-frequency transactions checks the number of transactions and the transaction amount of the account executing the current transaction within a period of time: when the account makes several large transactions within a short time, its transactions are more likely to be fraudulent. A rule extracted from historical risk transaction characteristics may be formulated from the historical transaction information of the account executing the current transaction. For example, the device of the account has been used to log in to multiple accounts, or the account made several large transactions shortly after registration; either can serve as a criterion for judging the current transaction as fraudulent.

Specifically, the formulated fraudulent transaction identification rules are pre-stored in the terminal or the server. On receiving the current transaction, the terminal can first retrieve the pre-stored rules to identify the current transaction and determine whether it carries a fraud risk.
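The limit rule based on high-frequency transactions can be sketched as follows. This is only an illustration: the `Transaction` fields, the one-hour window and the limits `max_count` and `max_total` are assumptions chosen for the example and are not specified by the application.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transaction:
    account_id: str      # hypothetical field names
    amount: float
    timestamp: datetime


def violates_velocity_rule(current, history, window=timedelta(hours=1),
                           max_count=5, max_total=50_000.0):
    """Return True if the account exceeds the count or amount limit.

    `history` holds earlier transactions and does not include `current`.
    """
    start = current.timestamp - window
    recent = [t for t in history
              if t.account_id == current.account_id
              and start <= t.timestamp <= current.timestamp]
    count = len(recent) + 1                                   # include the current one
    total = sum(t.amount for t in recent) + current.amount
    return count > max_count or total > max_total
```

A rule of this kind yields a yes/no identification result that can later be combined with the model probability.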

S208: Determine whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the identification result.

Specifically, the transaction fraud probability obtained for the current transaction from the transaction anti-fraud analysis model in step S204 is combined with the identification result obtained for the current transaction from the pre-stored fraudulent transaction identification rule in step S206 to determine whether the current transaction is fraudulent.

Optionally, when the transaction fraud probability obtained from the transaction anti-fraud analysis model is greater than a preset value, the current transaction is likely to be fraudulent. If identification by the pre-stored fraudulent transaction identification rule also finds data such as the account information or transaction amount of the current transaction suspicious, the current transaction can be judged to be fraudulent. The terminal then hands the current transaction to a verification department for checking. If the check confirms that the current transaction is fraudulent, the account that initiated the transaction can be warned or blocked.

If the judgments of the transaction anti-fraud analysis model and of the pre-stored fraudulent transaction identification rule on whether the current transaction is a risk transaction differ greatly, the current transaction needs a further manual check. The reason for the large difference between the two judgments is then examined in light of the check result, so that the transaction anti-fraud analysis model or the fraudulent transaction identification rule can be corrected.
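The combination of the two judgments can be illustrated with a minimal sketch. It assumes that the rule result is a boolean flag, that the preset value is 0.7, and that any disagreement between model and rule counts as a large difference; all three are simplifications made for this example, and the names are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    FRAUD = "fraud"
    NORMAL = "normal"
    MANUAL_REVIEW = "manual_review"


def judge_transaction(fraud_probability, rule_flagged, threshold=0.7):
    """Combine the model probability with the rule identification result."""
    model_flagged = fraud_probability > threshold
    if model_flagged and rule_flagged:
        return Verdict.FRAUD           # both sources agree on fraud
    if not model_flagged and not rule_flagged:
        return Verdict.NORMAL          # both sources agree on normal
    return Verdict.MANUAL_REVIEW       # the two judgments differ
```

Transactions routed to manual review also provide the feedback used to correct the model or the rules.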

Alternatively, the above fraudulent transaction determination method may first apply the fraudulent transaction identification rule to the current transaction and then use the identification result of the rule as a reference factor for the judgment made by the transaction anti-fraud analysis model, so that the model analyzes the current transaction more comprehensively.

In the above embodiment, the terminal combines the transaction anti-fraud analysis model trained on actual transactions with the pre-stored fraudulent transaction identification rule to judge whether the current transaction is fraudulent, which improves the accuracy of judging the fraudulent transactions that occur in actual transactions.

In some embodiments, referring to FIG. 3, the above fraudulent transaction determination method may further include model training steps before the current transaction is obtained in step S202. The model training steps may include:

S302: Obtain sample data.

The sample data is the transaction data used to train the transaction anti-fraud analysis model, and may be historical data related to the current transaction. For example, when every transaction of the current year needs to be judged for fraud, the transaction data of the previous year can be used as sample data.

Specifically, the sample data for training must be obtained before the transaction anti-fraud analysis model is trained. Optionally, the sample data may include positive sample data and negative sample data. The positive sample data is normal transaction data; since most transactions are normal, the positive sample data can be drawn by random sampling from the normal transactions of a period of time. The negative sample data is the fraudulent transaction data extracted from the historical transaction data after business verification. The sample data may include transaction data, card binding data, account opening data and the like.
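A minimal sketch of this sampling, assuming each historical transaction is a dictionary carrying a hypothetical `is_fraud` label set by business verification. Following the text, normal transactions form the positive samples and verified fraud forms the negative samples; the sample size and seed are illustrative.

```python
import random


def build_training_sample(transactions, normal_sample_size, seed=0):
    """Return (positive_samples, negative_samples).

    Positive samples: a random sample of the normal transactions.
    Negative samples: every verified fraudulent transaction.
    """
    fraud = [t for t in transactions if t["is_fraud"]]
    normal = [t for t in transactions if not t["is_fraud"]]
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    sampled_normal = rng.sample(normal, min(normal_sample_size, len(normal)))
    return sampled_normal, fraud
```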

S304: Clean the sample data, and select training variables from the cleaned sample data.

The training variables are the variables input into the random forest algorithm model, such as the transaction amount and the model of the transaction device.

Specifically, the obtained sample data is cleaned, and data that does not meet the requirements is filtered out according to preset cleaning rules. For example, when the sample data of Apple accounts resembles the fraud data type, the sample data of payments made with Apple accounts is removed so that it does not affect the accuracy of the model. The sample data obtained in step S302 is transaction data, and each transaction may contain many variables, such as the transaction amount, customer name, IP address of the transaction device and serial number, whereas training the transaction anti-fraud analysis model requires only the variables relevant to judging transaction fraud. To extract these variables, the terminal first screens out the training variables to be input into the random forest algorithm.

Optionally, the cor function of R (a programming language), which computes the correlation coefficients between columns, can be used to remove highly correlated variables from the variables in the sample data and thereby clean the sample data.
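The text names the R `cor` function; the same idea can be sketched in Python with a hand-written Pearson coefficient. The 0.9 threshold and the rule of keeping the first variable of each correlated group are assumptions made for this illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / sqrt(vx * vy)


def drop_highly_correlated(columns, threshold=0.9):
    """Keep a column only if it is not highly correlated with one already kept.

    `columns` maps a variable name to its list of numeric values.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

For example, an amount column and the same amount expressed in cents are perfectly correlated, so only the first of the two is retained.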

S306: Select model input variables, and the model output variables corresponding to them, from the training variables by means of a random forest algorithm model.

The random forest algorithm model is an algorithm model for selecting model input variables from the training variables. A random forest is a classifier containing multiple decision trees: a forest is built in a random manner and consists of many decision trees, and the decision trees of a random forest are not related to one another. Once the forest has been built, each decision tree in the forest judges every new input separately, and the input data is classified by the multiple decision trees together.

A model input variable is a variable used to train the transaction anti-fraud analysis model. It is a variable relevant to judging transaction fraud, such as the transaction amount, that has been selected by the random forest algorithm model. A model output variable is the variable corresponding to a model input variable. For example, when the model input variable is the transaction amount, the model output variable is that a transaction with an amount greater than 100,000 yuan has a 70% probability of being fraudulent.

Specifically, the training variables are input into the random forest algorithm model, and the multiple decision trees of the algorithm are used to analyze how important each training variable is to the judgment of the transaction fraud probability. The training variables whose importance is greater than a preset value are finally taken as the model input variables, together with the corresponding model output variables used to calculate the transaction fraud probability.
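The importance-based selection can be illustrated with a deliberately reduced stand-in for a full random forest: an ensemble of one-split decision trees (stumps), each grown on a bootstrap sample with a random subset of variables, where the importance of a variable is its average Gini decrease. The function names, the number of trees and the importance threshold are assumptions of this sketch; a production system would more likely use a complete random forest implementation.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split_gain(xs, ys):
    """Largest Gini decrease achievable by one threshold split on xs."""
    parent, best = gini(ys), 0.0
    for thr in sorted(set(xs))[:-1]:  # skip the maximum so the right side is non-empty
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        child = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        best = max(best, parent - child)
    return best


def stump_forest_importance(X, y, n_trees=50, seed=0):
    """Average Gini decrease per variable over bootstrap-trained stumps.

    `X` maps a variable name to its list of values; `y` holds 0/1 fraud labels.
    """
    rng = random.Random(seed)
    names, n = list(X), len(y)
    k = max(1, int(len(names) ** 0.5))       # variables considered per tree
    totals = {name: 0.0 for name in names}
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        ys = [y[i] for i in idx]
        candidates = rng.sample(names, k)
        gains = {c: best_split_gain([X[c][i] for i in idx], ys) for c in candidates}
        winner = max(gains, key=gains.get)
        totals[winner] += gains[winner]
    return {name: total / n_trees for name, total in totals.items()}


def select_input_variables(importance, min_importance):
    """Keep the variables whose importance exceeds the preset value."""
    return [name for name, score in importance.items() if score > min_importance]
```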

S308: Obtain a preset initial neural network model.

Specifically, after the model input variables and the corresponding model output variables have been selected in step S306, a preset initial neural network model can be obtained for training.

S310: Train the initial neural network model according to the model input variables and the corresponding model output variables to obtain the transaction anti-fraud analysis model.

Specifically, the initial neural network is trained according to the model input variables and the corresponding model output variables obtained in step S306, yielding a transaction anti-fraud analysis model that can calculate the transaction fraud probability of an input current transaction.

In the above embodiment, after the obtained sample data has been cleaned and otherwise processed, the model input variables and the corresponding model output variables are selected with the trained random forest algorithm model and used to train the initial neural network model, yielding a transaction anti-fraud analysis model that can accurately judge the transaction fraud probability of an input current transaction. Because the sample data is historical transaction data related to the current transaction, the transaction fraud probability of the current transaction is calculated more accurately.

In some embodiments, step S304 of the above fraudulent transaction determination method, that is, the step of cleaning the sample data and selecting training variables from the cleaned sample data, may include:

S402: Obtain the data whose content is empty, and identify the data type corresponding to that data.

Data whose content is empty may be data whose content is the NULL value. The data type is the type, or one of several types, of the empty data that is used to formulate and retrieve the corresponding replacement logic, such as discrete or continuous.

Specifically, after the sample data has been obtained, the empty data in it must be judged reasonably so that no data is missed. It is necessary to determine whether the empty data in the sample data is meaningless, and to make that judgment the data type of the empty data must first be identified, that is, the meaningless data must be identified.

S404: Obtain the replacement logic corresponding to the data type, and replace the empty data in the sample data according to the obtained replacement logic.

The replacement logic is a logical rule for replacing empty data according to its data type, and can be formulated from experience. For example, in R, NULL values in discrete variables are all replaced by the character U, and NULL values in continuous variables (such as numeric values) are all replaced by the value 0. Once formulated, the replacement logic can be stored at a preset location, which may be called a logic library, and retrieved from the logic library by rules such as keywords when it is used.

Specifically, the corresponding replacement logic is retrieved from the logic library according to the data type of the sample data, and the empty data in the sample data is replaced according to that replacement logic.

Optionally, the terminal treats a field that has no actual meaning and whose proportion of NULL values exceeds a preset ratio as an empty field, which the terminal can remove directly. The preset ratio can be set from experience, for example 30%.
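Steps S402 and S404 can be sketched as follows. The sketch assumes that rows are dictionaries with identical keys, that `None` marks empty content, and that a column is continuous when all its non-empty values are numeric. It removes a field purely on its NULL ratio, whereas the text also requires the field to have no actual meaning, a judgment the code does not attempt.

```python
def clean_nulls(rows, discrete_fill="U", continuous_fill=0, max_null_ratio=0.3):
    """Drop mostly-empty fields and fill remaining empty values by data type."""
    if not rows:
        return []
    fills = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        null_ratio = sum(v is None for v in values) / len(values)
        if null_ratio > max_null_ratio:
            continue  # field is mostly empty: remove it
        non_null = [v for v in values if v is not None]
        is_continuous = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
        )
        # replacement logic chosen by data type
        fills[col] = continuous_fill if is_continuous else discrete_fill
    return [
        {col: (fill if r[col] is None else r[col]) for col, fill in fills.items()}
        for r in rows
    ]
```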

S406: Obtain the variables in the replaced sample data that are not highly correlated as first variables.

A first variable is a variable that is not highly correlated, such as a name, selected from the replaced sample data. The variables that are not highly correlated are those remaining after the highly correlated variables have been filtered out of the sample data. The correlation of the variables in the sample data can be analyzed with rules formulated according to business operations, so as to determine the correlation among multiple variables in the sample data. Variables such as full name and surname, for example, serve the same purpose in business operations and may therefore be regarded as highly correlated.

Specifically, the replaced sample data obtained in step S404 may include redundant variables that are highly correlated with one another; the variables that are not highly correlated are selected for the subsequent operations.

S408: Calculate the information value of the first variables.

The information value (IV) is an indicator of how strongly an independent variable affects the target variable: the larger the information value of a first variable, the more that variable affects the judgment of whether a transaction is fraudulent. The grouped first variable is converted to its WOE (Weight of Evidence) with the following formula:

$$\mathrm{WOE}_i = \ln\left(\frac{py_i}{pn_i}\right) = \ln\left(\frac{\#y_i / \#y_T}{\#n_i / \#n_T}\right) \tag{1}$$

where $py_i$ is the proportion of all fraudulent transactions that fall in group $i$, $pn_i$ is the proportion of all normal transactions that fall in group $i$, $\#y_i$ is the number of fraudulent transactions in the group, $\#n_i$ is the number of normal transactions in the group, $\#y_T$ is the number of all fraudulent transactions in all the transaction data, and $\#n_T$ is the number of all normal transactions in all the transaction data.

The information value of one group of the first variable follows from the WOE formula above:

$$\mathrm{IV}_i = (py_i - pn_i) \cdot \mathrm{WOE}_i \tag{2}$$

The IV of the first variable is then the sum of the information values $\mathrm{IV}_i$ of all its groups.

Specifically, after the first variables have been obtained in step S406, it must be determined whether every one of them affects the judgment of fraudulent transactions. A variable that does can be used to train the transaction anti-fraud analysis model; a variable that does not can be deleted. Optionally, the information value of a first variable can be calculated with a data algorithm package in R.
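As a worked illustration of the WOE and information value calculation, the sketch below assumes the standard definition WOE_i = ln(py_i / pn_i) and takes, for each group of a variable, the counts of fraudulent and normal transactions. It also assumes every group contains at least one transaction of each kind, since the logarithm is otherwise undefined; the function name is hypothetical.

```python
from math import log


def information_value(groups):
    """Return (woe_per_group, iv) for one variable.

    `groups` is a list of (fraud_count, normal_count) pairs, one per group.
    """
    total_fraud = sum(f for f, _ in groups)     # #y_T
    total_normal = sum(n for _, n in groups)    # #n_T
    woes, iv = [], 0.0
    for fraud, normal in groups:
        py = fraud / total_fraud                # py_i = #y_i / #y_T
        pn = normal / total_normal              # pn_i = #n_i / #n_T
        woe = log(py / pn)
        woes.append(woe)
        iv += (py - pn) * woe                   # IV_i summed over groups
    return woes, iv
```

With groups `[(10, 90), (30, 70)]`, the first group has a negative WOE (fraud is under-represented) and the second a positive one; a variable whose groups all have the same fraud-to-normal mix has an information value of zero.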

S410: Select the first variables whose information value is higher than a preset value as training variables.

Specifically, when the information value of a first variable is higher than the preset value, the variable can be considered to affect the judgment of fraudulent transactions and can be considered as a model input variable for training the transaction anti-fraud analysis model. The preset value can be set through big data analysis or from experience.

In the above embodiment, the empty data in the sample data is replaced accordingly, the highly correlated variables are deleted, and the variables whose information value is higher than the preset value are selected from the remainder. The model input variables for training the transaction anti-fraud analysis model are thereby selected more accurately, which improves the accuracy and efficiency of model training.

In some embodiments, cleaning the sample data in step S304 of the above fraudulent transaction determination method may include: obtaining second variables from the sample data, and obtaining the business rules corresponding to the second variables; selecting, according to the business rules, the second variables whose similarity in business function is greater than a preset value; and selecting one of those second variables as the cleaned sample data.

A second variable is a variable obtained from the sample data for similarity judgment, such as the transaction amount.

A business rule is a predefined rule for calculating the similarity between multiple variables. For example, when the variables to be compared are the ID card number, the transaction amount and the date of birth, the similarity between the ID card number and the date of birth can be set to 90%, and the similarity between the ID card number and the transaction amount to 0%. Once defined, the business rules can be stored in a rule library and called by keyword.

Specifically, the second variables for similarity judgment are obtained from the sample data, and the business rules related to the second variables are called from the predefined rule library. The similarity between the second variables is calculated according to the business rules. When the similarity between two or more second variables is higher than the preset value, these second variables can be considered to play similar roles in business activities, and only one of them needs to be kept for the subsequent operations.

上述实施例中，根据业务规则对样本数据中的变量进行相似度计算，多个相似度较高的变量仅保留一个，减少在交易反欺诈分析模型训练和实际应用时时的数据计算，提高了模型训练和应用的效率。In the above embodiment, the similarity calculation is performed on the variables in the sample data according to the business rule, and only a plurality of variables with higher similarity are retained, which reduces the data calculation during the training and practical application of the transaction anti-fraud analysis model, and improves the model. The efficiency of training and application.
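The rule-based cleaning described above can be sketched as follows. This is only an illustration: the rule base is modeled as a dictionary keyed by unordered variable pairs, and the variable names, the function name, and the preset value `0.8` are assumptions rather than anything specified in the text. Only the 90% and 0% similarities come from the example above.

```python
def drop_similar_variables(variables, similarity_rules, preset_value=0.8):
    """Keep one variable from each group whose rule-based similarity
    to an already kept variable exceeds preset_value."""
    kept = []
    for name in variables:
        is_redundant = any(
            similarity_rules.get(frozenset((name, other)), 0.0) > preset_value
            for other in kept
        )
        if not is_redundant:
            kept.append(name)
    return kept


# Hypothetical rule base using the similarities from the example in the text.
rules = {
    frozenset(("id_number", "birth_date")): 0.90,
    frozenset(("id_number", "transaction_amount")): 0.0,
}

cleaned = drop_similar_variables(
    ["id_number", "transaction_amount", "birth_date"], rules
)
```

Here `birth_date` is dropped because its similarity to the already kept `id_number` exceeds the assumed preset value, while `transaction_amount` is retained.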

In some embodiments, cleaning the sample data in step S304 of the above fraudulent transaction determination method may include: obtaining a third variable in the sample data; detecting the variable type of the third variable; and, when the variable type is continuous, obtaining a recursive segmentation algorithm according to a preset conditional inference tree and segmenting the third variable according to the recursive segmentation algorithm.

The third variable is a variable obtained from the sample data whose variable type is to be detected, such as the transaction amount.

A conditional inference tree (ctree) is a tree-based classification algorithm; the plot and text functions in the R language are given as examples.

Recursion refers to using a function itself within the definition of that function. A recursive segmentation algorithm is a machine algorithm that, based on recursion, splits an input variable multiple times; that is, the recursive segmentation algorithm is called multiple times to split the third variable into a discrete variable. For example, a transaction amount ranging from 0 to 1,000,000 is split by the recursive segmentation algorithm into segments such as 0 to 50,000 and 50,000 to 100,000.

Specifically, the third variable whose variable type is to be detected is obtained from the sample data. When the variable type of the third variable is detected to be continuous, the recursive segmentation algorithm for segmenting the third variable is obtained according to the preset conditional inference tree, and the third variable is then segmented according to that algorithm to obtain a variable form that is convenient for training the initial neural network model.

In the above embodiment, segmenting the continuous variables in the sample data makes the variables conform better to the representation required for model training, which facilitates training the initial neural network model.
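The idea of recursively segmenting a continuous variable can be sketched as below. Note that this is not a conditional inference tree: for brevity, the split criterion is a plain Gini impurity reduction rather than the statistical tests that conditional inference trees rely on. The function names and the parameters `max_depth`, `min_size`, and `min_gain` are assumptions for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def find_cut_points(values, labels, max_depth=2, min_size=2, min_gain=1e-3):
    """Recursively split a continuous variable and return sorted cut points."""
    pairs = sorted(zip(values, labels))
    cuts = []

    def split(lo, hi, depth):
        # pairs[lo:hi] is the segment currently being examined.
        n = hi - lo
        if depth >= max_depth or n < 2 * min_size:
            return
        parent = gini([y for _, y in pairs[lo:hi]])
        best_gain, best_i = 0.0, None
        for i in range(lo + min_size, hi - min_size + 1):
            if pairs[i - 1][0] == pairs[i][0]:
                continue  # no cut point exists between equal values
            left = [y for _, y in pairs[lo:i]]
            right = [y for _, y in pairs[i:hi]]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best_gain:
                best_gain, best_i = parent - child, i
        if best_i is None or best_gain < min_gain:
            return
        cuts.append((pairs[best_i - 1][0] + pairs[best_i][0]) / 2.0)
        split(lo, best_i, depth + 1)   # recurse into the lower segment
        split(best_i, hi, depth + 1)   # recurse into the upper segment

    split(0, len(pairs), 0)
    return sorted(cuts)


def to_segment(value, cuts):
    """Index of the segment that a value falls into."""
    return sum(value > c for c in cuts)
```

Each call to `split` either stops or adds one cut point and calls itself on the two resulting segments, which is the recursive behavior the text describes. `to_segment` then maps a raw amount to its discrete segment index.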

In some embodiments, after step S204 of the above fraudulent transaction determination method inputs the current transaction into the transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction, the method may further include: converting the transaction fraud probability into a transaction fraud score according to the formula Score = A - B·log(odds), where odds = p/(1-p), Score is the transaction fraud score, p is the transaction fraud probability, and A and B are preset constants. Determining whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the recognition result then includes: determining whether the current transaction is a fraudulent transaction according to the transaction fraud score of the current transaction and the recognition result.

Specifically, to present the model's analysis of whether the current transaction is fraudulent more intuitively to technical staff, the transaction fraud probability can be converted into a transaction fraud score using the formula Score = A - B·log(odds), where odds = p/(1-p), Score is the transaction fraud score, p is the transaction fraud probability, and A and B are preset constants that can be obtained from experience. The score reflects fraud risk intuitively: a high score represents low risk, and a low score represents high risk.

In the above embodiment, converting the transaction fraud probability of the current transaction into the corresponding fraud score makes the result of the transaction anti-fraud analysis model easier to analyze and review.
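The conversion formula can be written out as a short Python function. The constants `a=600.0` and `b=50.0` are purely illustrative, since the text says only that A and B are preset. The use of the natural logarithm is also an assumption, because the text does not state the base.

```python
import math


def fraud_score(p, a=600.0, b=50.0):
    """Score = A - B * log(odds), with odds = p / (1 - p).

    a and b are assumed example constants; log is taken as the natural log.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    odds = p / (1.0 - p)
    return a - b * math.log(odds)
```

With a positive B, a higher fraud probability gives higher odds and therefore a lower score, which matches the statement that low scores represent high risk. At p = 0.5 the odds are 1, so the score equals A.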

In some embodiments, the above fraudulent transaction determination method may further include: obtaining a verification transaction, and inputting the verification transaction into the transaction anti-fraud analysis model to obtain a verification fraud probability; obtaining the standard fraud probability corresponding to the verification transaction, and calculating the difference between the verification fraud probability and the standard fraud probability; and, when the difference is greater than a preset value, correcting the transaction anti-fraud analysis model according to the standard fraud probability.

A verification transaction is transaction data used to verify the transaction anti-fraud analysis model. For example, if the transaction anti-fraud analysis model was trained with the previous year's transactions as sample data, the transactions of January of the current year, or a random sample of them, can be used as verification transactions.

The verification fraud probability is the probability, obtained when the transaction anti-fraud analysis model judges the verification transaction, that the verification transaction is a fraudulent transaction. It has the same form as the transaction fraud probability and can be expressed as a percentage or in another mathematical representation.

The standard fraud probability is the probability that the verification transaction actually is a fraudulent transaction. It has the same form as the verification fraud probability and can be expressed as a percentage or in another mathematical representation.

Specifically, the verification transaction is read from the address at which verification transactions are stored and is input into the neural network model or rule model that was trained with historical transaction data, and the fraud information in that data, as sample data. According to the rules learned in training, the probability that the verification transaction is a fraudulent transaction is obtained. The standard fraud probability that the verification transaction actually is a fraudulent transaction is then obtained and compared with the verification fraud probability. When the difference between the two is small, the transaction anti-fraud analysis model is shown to be accurate; when the difference is large, the model may contain errors and is corrected according to the standard fraud probability.

Optionally, the historical transaction data may be partitioned by stratified sampling or random sampling, for example with a 20/80 split: 80% of the data is used as sample data to train the transaction anti-fraud analysis model, and 20% is used as verification transactions to verify it. Alternatively, all the historical transaction data, such as the sampled transaction data of the previous year, may be used as sample data, and transaction data dated after the sample data, such as the transaction data of January of the current year, may be used as verification transactions to verify the transaction anti-fraud analysis model.

In the above embodiment, the verification step of the transaction anti-fraud analysis model helps ensure the accuracy of the model's fraud probability judgment for the current transaction.
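The verification step can be sketched as follows, assuming a random 80/20 split and an absolute difference between the two probabilities. The text speaks only of "the difference", so the absolute value is an assumption. The model is represented by any callable that returns a probability; all names and the preset value `0.1` are hypothetical.

```python
import random


def split_history(transactions, train_fraction=0.8, seed=0):
    """Randomly split historical transactions into sample and verification data."""
    shuffled = list(transactions)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def transactions_needing_correction(model, verification_transactions,
                                    standard_probabilities, preset_value=0.1):
    """Return the verification transactions whose verification fraud probability
    differs from the standard fraud probability by more than preset_value."""
    flagged = []
    for transaction, standard in zip(verification_transactions,
                                     standard_probabilities):
        verification_probability = model(transaction)
        if abs(verification_probability - standard) > preset_value:
            flagged.append(transaction)
    return flagged
```

A non-empty result indicates that the model may need to be corrected. How the correction itself is carried out is not detailed in the text and is therefore left out of the sketch.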

It should be understood that although the steps in the flowcharts of FIGS. 2 to 4 are displayed in sequence as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, there is no strict ordering constraint on these steps, and they may be performed in other orders. Moreover, at least some of the steps in FIGS. 2 to 4 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be performed at different times; nor are they necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In some embodiments, as shown in FIG. 5, a fraudulent transaction determination apparatus is provided, including a data acquisition module 100, a probability analysis module 200, a rule identification module 300, and a result obtaining module 400, wherein:

The data acquisition module 100 is configured to obtain the current transaction.

The probability analysis module 200 is configured to input the current transaction into the transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction.

The rule identification module 300 is configured to identify the current transaction by means of a pre-stored fraudulent transaction identification rule to obtain a recognition result.

The result obtaining module 400 is configured to determine whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the recognition result.
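The cooperation of the four modules can be sketched as a small Python class. The way the probability and the recognition result are combined is an assumption made for illustration: here a transaction is treated as fraudulent if a rule matches or the probability exceeds an assumed threshold. The class name, field names, and the threshold `0.5` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Transaction = Dict[str, Any]


@dataclass
class FraudJudgmentApparatus:
    # Stand-in for the probability analysis module 200.
    probability_model: Callable[[Transaction], float]
    # Stand-in for the rule identification module 300.
    rule_matches: Callable[[Transaction], bool]
    # Assumed cut-off; not specified in the text.
    probability_threshold: float = 0.5

    def judge(self, transaction: Transaction) -> bool:
        """Stand-in for the result obtaining module 400."""
        probability = self.probability_model(transaction)
        rule_hit = self.rule_matches(transaction)
        # Assumed policy: either signal alone marks the transaction as fraud.
        return bool(rule_hit) or probability > self.probability_threshold


# The transaction dictionary passed in plays the role of data acquisition module 100.
apparatus = FraudJudgmentApparatus(
    probability_model=lambda tx: 0.9 if tx["amount"] > 1000 else 0.1,
    rule_matches=lambda tx: tx.get("blacklisted", False),
)
```

Keeping the model and the rule check as separate callables mirrors the separation between modules 200 and 300, so either can be replaced without touching the decision step.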

In some embodiments, the above fraudulent transaction determination apparatus may further include:

A sample acquisition module, configured to obtain sample data.

A cleaning module, configured to clean the sample data and select training variables from the cleaned sample data.

A variable selection module, configured to select, through a random forest algorithm model, model input variables and the model output variables corresponding to the model input variables from the training variables.

An initial model acquisition module, configured to obtain a preset initial neural network model.

A model training module, configured to train the initial neural network model according to the model input variables and the corresponding model output variables to obtain the transaction anti-fraud analysis model.

In some embodiments, the cleaning module in the above fraudulent transaction determination apparatus may include:

An empty data obtaining unit, configured to obtain data whose content is empty and identify the data type corresponding to that data.

A replacement unit, configured to obtain the replacement logic corresponding to the data type, and replace the empty data in the sample data according to the obtained replacement logic.

A correlation analysis unit, configured to obtain the variables in the replaced sample data that are not highly correlated as first variables.

A value calculation unit, configured to calculate the information value of the first variables.

A training variable selection unit, configured to select the first variables whose information value is higher than a preset value as training variables.

A business rule obtaining unit, configured to obtain second variables in the sample data and obtain the business rules corresponding to the second variables.

A variable similarity analysis unit, configured to select, according to the business rules, the second variables whose business-function similarity is greater than a preset value, and select one of those second variables as the cleaned sample data.

A sample variable obtaining unit, configured to obtain a third variable in the sample data.

A variable type identification unit, configured to detect the variable type of the third variable.

A continuous variable segmentation unit, configured to, when the variable type is continuous, obtain a recursive segmentation algorithm according to a preset conditional inference tree and segment the third variable according to the recursive segmentation algorithm.

A score conversion module, configured to convert the transaction fraud probability into a transaction fraud score according to the following formula:

Score = A - B·log(odds)

where odds = p/(1-p), Score is the transaction fraud score, p is the transaction fraud probability, and A and B are preset constants.

The result obtaining module 400 may then be configured to determine whether the current transaction is a fraudulent transaction according to the transaction fraud score of the current transaction and the recognition result.

A verification transaction acquisition module, configured to obtain a verification transaction.

A verification fraud probability acquisition module, configured to input the verification transaction into the transaction anti-fraud analysis model to obtain a verification fraud probability.

A difference calculation module, configured to obtain the standard fraud probability corresponding to the verification transaction and calculate the difference between the verification fraud probability and the standard fraud probability.

A correction module, configured to correct the transaction anti-fraud analysis model according to the standard fraud probability when the difference is greater than a preset value.

For the specific limitations of the fraudulent transaction determination apparatus, reference may be made to the limitations of the fraudulent transaction determination method above, which are not repeated here. Each module in the above fraudulent transaction determination apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in a computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.

In some embodiments, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and computer-readable instructions. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions, when executed by the processor, implement a fraudulent transaction determination method. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input device of the computer device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.

Those skilled in the art will understand that the structure shown in FIG. 6 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

A computer device includes a memory and one or more processors. The memory stores computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:

Obtaining the current transaction;

Inputting the current transaction into the transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction;

Identifying the current transaction by means of a pre-stored fraudulent transaction identification rule to obtain a recognition result; and

Determining whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the recognition result.

In some embodiments, before the obtaining of the current transaction implemented when the processor executes the computer-readable instructions, the steps may further include:

Obtaining sample data;

Cleaning the sample data, and selecting training variables from the cleaned sample data;

Selecting, through a random forest algorithm model, model input variables and the model output variables corresponding to the model input variables from the training variables;

Obtaining a preset initial neural network model; and

Training the initial neural network model according to the model input variables and the corresponding model output variables to obtain the transaction anti-fraud analysis model.

In some embodiments, the cleaning of the sample data and selection of training variables from the cleaned sample data, implemented when the processor executes the computer-readable instructions, may include:

Obtaining data whose content is empty, and identifying the data type corresponding to that data;

Obtaining the replacement logic corresponding to the data type, and replacing the empty data in the sample data according to the obtained replacement logic;

Obtaining the variables in the replaced sample data that are not highly correlated as first variables;

Calculating the information value of the first variables; and

Selecting the first variables whose information value is higher than a preset value as training variables.

In some embodiments, the cleaning of the sample data implemented when the processor executes the computer-readable instructions may include:

Obtaining second variables in the sample data, and obtaining the business rules corresponding to the second variables; and

Selecting, according to the business rules, the second variables whose business-function similarity is greater than a preset value, and selecting one of those second variables as the cleaned sample data.

In some embodiments, the cleaning of the sample data implemented when the processor executes the computer-readable instructions may include:

Obtaining a third variable in the sample data;

Detecting the variable type of the third variable; and

When the variable type is continuous, obtaining a recursive segmentation algorithm according to a preset conditional inference tree, and segmenting the third variable according to the recursive segmentation algorithm.

In some embodiments, after the inputting of the current transaction into the transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction, implemented when the processor executes the computer-readable instructions, the steps may further include:

Converting the transaction fraud probability into a transaction fraud score according to the following formula: Score = A - B·log(odds);

where odds = p/(1-p), Score is the transaction fraud score, p is the transaction fraud probability, and A and B are preset constants; and

The determining of whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the recognition result, implemented when the processor executes the computer-readable instructions, may include:

Determining whether the current transaction is a fraudulent transaction according to the transaction fraud score of the current transaction and the recognition result.

In some embodiments, the processor may also implement the following steps when executing the computer-readable instructions:

Obtaining a verification transaction;

Inputting the verification transaction into the transaction anti-fraud analysis model to obtain a verification fraud probability;

Obtaining the standard fraud probability corresponding to the verification transaction, and calculating the difference between the verification fraud probability and the standard fraud probability; and

When the difference is greater than a preset value, correcting the transaction anti-fraud analysis model according to the standard fraud probability.

When the computer-readable instructions are executed by a processor, the same steps are implemented as those listed above for the computer device, beginning with obtaining the current transaction.

In some embodiments, before the obtaining of the current transaction implemented when the computer-readable instructions are executed by the processor, the steps may further include obtaining sample data and the subsequent training steps listed above.

In some embodiments, the cleaning of the sample data and selection of training variables from the cleaned sample data, implemented when the computer-readable instructions are executed by the processor, may include the steps listed above, including obtaining the replacement logic corresponding to the data type and replacing the empty data in the sample data according to the obtained replacement logic.

In some embodiments, the cleaning of the sample data implemented when the computer-readable instructions are executed by the processor may include the second-variable and third-variable steps listed above.

In some embodiments, after the inputting of the current transaction into the transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction, implemented when the computer-readable instructions are executed by the processor, the steps may further include the score conversion listed above. The determining of whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the recognition result may then include the score-based determination listed above.

In some embodiments, the computer-readable instructions, when executed by the processor, may also implement the verification steps listed above, beginning with obtaining a verification transaction.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.

The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the scope of protection of this patent application shall be determined by the appended claims.

Claims

A method for judging a fraudulent transaction, the method comprising:

Obtaining a current transaction;

Inputting the current transaction into a transaction anti-fraud analysis model to obtain a transaction fraud probability of the current transaction;

Identifying the current transaction by a pre-stored fraudulent transaction identification rule to obtain a recognition result; and

Determining whether the current transaction is a fraudulent transaction based on the transaction fraud probability of the current transaction and the recognition result.

The method according to claim 1, wherein before the obtaining the current transaction, the method further comprises:

Obtaining sample data;

Cleaning the sample data and selecting training variables from the cleaned sample data;

Selecting, by a random forest algorithm model, model input variables and model output variables corresponding to the model input variables from the training variables;

Obtaining a preset initial neural network model; and

Training the initial neural network model according to the model input variables and the corresponding model output variables to obtain the transaction anti-fraud analysis model.

The method according to claim 2, wherein said cleaning said sample data and selecting training variables from said cleaned sample data comprises:

Obtaining data whose content is empty, and identifying a data type corresponding to the data whose content is empty;

Obtaining replacement logic corresponding to the data type, and replacing the data whose content is empty in the sample data according to the obtained replacement logic;

Obtaining, as a first variable, a variable that is not highly correlated in the sample data after replacement;

Calculating the information value of the first variable; and

Selecting, as a training variable, the first variable whose information value is higher than a preset value.
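The information value screening in the last two steps can be illustrated with the commonly used weight-of-evidence definition of information value, IV = Σ (p_fraud − p_normal) · ln(p_fraud / p_normal) summed over the groups of a variable. The claim does not give a formula, so this definition, the smoothing constant `eps`, the `min_iv` threshold and the function names are assumptions made for the sketch.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def information_value(groups: Sequence[str], labels: Sequence[int], eps: float = 0.5) -> float:
    """Information value of a grouped variable; labels are 1 (fraud) or 0 (normal).

    eps is additive smoothing so that empty groups do not produce log(0).
    """
    fraud = Counter(g for g, y in zip(groups, labels) if y == 1)
    normal = Counter(g for g, y in zip(groups, labels) if y == 0)
    names = set(fraud) | set(normal)
    fraud_total = sum(fraud.values()) + eps * len(names)
    normal_total = sum(normal.values()) + eps * len(names)
    iv = 0.0
    for name in names:
        p_fraud = (fraud[name] + eps) / fraud_total
        p_normal = (normal[name] + eps) / normal_total
        iv += (p_fraud - p_normal) * math.log(p_fraud / p_normal)
    return iv


def select_training_variables(columns: Dict[str, Sequence[str]],
                              labels: Sequence[int],
                              min_iv: float = 0.1) -> List[str]:
    """Keep the variables whose information value exceeds the preset value."""
    return [name for name, col in columns.items()
            if information_value(col, labels) > min_iv]
```

A variable whose groups have the same distribution among fraudulent and normal samples gets an information value of zero and is dropped; a variable that separates the two classes gets a larger value and is kept.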

The method according to claim 2, wherein the cleaning the sample data comprises:

Obtaining a second variable in the sample data, and obtaining a business rule corresponding to the second variable; and

Selecting, according to the business rule, second variables whose business similarity is greater than a preset value, and selecting one of those second variables as the cleaned sample data.

Obtaining a third variable in the sample data;

Detecting a variable type of the third variable; and

When the variable type is continuous, obtaining a recursive partitioning algorithm according to a preset conditional inference tree, and segmenting the third variable according to the recursive partitioning algorithm.
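The idea of segmenting a continuous variable by recursive partitioning can be sketched as below. This is not a conditional inference tree: a real one chooses and stops splits using permutation-based significance tests, whereas this simplified stand-in splits wherever the gap in fraud rate between the two sides is largest and stops when that gap falls below `min_gap`. The parameters `min_bin` and `min_gap` and both function names are assumptions.

```python
import bisect
from typing import List, Sequence


def recursive_cut_points(values: Sequence[float], labels: Sequence[int],
                         min_bin: int = 2, min_gap: float = 0.2) -> List[float]:
    """Return sorted cut points found by recursive binary splitting."""
    pairs = sorted(zip(values, labels))
    cuts: List[float] = []

    def rate(lo: int, hi: int) -> float:
        # Fraud rate of the samples in pairs[lo:hi].
        return sum(y for _, y in pairs[lo:hi]) / (hi - lo)

    def split(lo: int, hi: int) -> None:
        best = None
        for i in range(lo + min_bin, hi - min_bin + 1):
            if pairs[i - 1][0] == pairs[i][0]:
                continue  # no cut point exists between equal values
            gap = abs(rate(lo, i) - rate(i, hi))
            if best is None or gap > best[0]:
                best = (gap, i)
        if best is None or best[0] < min_gap:
            return  # segment is too small or too homogeneous to split
        _, i = best
        cuts.append((pairs[i - 1][0] + pairs[i][0]) / 2)
        split(lo, i)
        split(i, hi)

    split(0, len(pairs))
    return sorted(cuts)


def assign_segment(value: float, cuts: Sequence[float]) -> int:
    """Map a continuous value to the index of its segment."""
    return bisect.bisect_right(cuts, value)
```

Each continuous value is then replaced by its segment index, which turns the third variable into a discrete one.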

The method according to claim 1, wherein, after inputting the current transaction into the transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction, the method further comprises:

The transaction fraud probability is converted to a transaction fraud score according to the following formula:

Score = A - B·log(odds)

where odds = p/(1 - p), Score is the transaction fraud score, p is the transaction fraud probability, and A and B are preset constants;

and the determining whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the recognition result comprises:

Determining whether the current transaction is a fraudulent transaction based on the transaction fraud score of the current transaction and the recognition result.
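The conversion from probability to score can be written out directly. The claim does not fix the base of the logarithm or the values of A and B, so the natural logarithm and the defaults `a=600.0` and `b=50.0` below are illustrative assumptions.

```python
import math


def fraud_score(p: float, a: float = 600.0, b: float = 50.0) -> float:
    """Convert a fraud probability p into a score: Score = A - B * log(odds)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    odds = p / (1.0 - p)
    return a - b * math.log(odds)  # natural logarithm assumed
```

With a positive B, a higher fraud probability gives a lower score, and p = 0.5 (odds of 1) maps exactly to A.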

The method according to claim 1, further comprising:

Obtaining a verification transaction;

Entering the verification transaction into the transaction anti-fraud analysis model to obtain a verification fraud probability;

Obtaining a standard fraud probability corresponding to the verification transaction, and calculating a difference between the verification fraud probability and the standard fraud probability; and

When the difference is greater than a preset value, correcting the transaction anti-fraud analysis model according to the standard fraud probability.
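The verification check can be sketched as follows. The claim does not say how the model is corrected, so this sketch only identifies the verification transactions whose predicted probability deviates too far from the standard probability and returns them for a later correction step (for example, retraining). The function name, the `max_diff` value and the stub model are assumptions.

```python
from typing import Callable, Dict, List, Tuple

Transaction = Dict[str, float]


def find_corrections(model: Callable[[Transaction], float],
                     verification_set: List[Tuple[Transaction, float]],
                     max_diff: float = 0.1) -> List[Tuple[Transaction, float]]:
    """Return (transaction, standard probability) pairs the model gets wrong.

    A pair is returned when the absolute difference between the verification
    fraud probability and the standard fraud probability exceeds max_diff.
    """
    corrections = []
    for tx, standard_probability in verification_set:
        verification_probability = model(tx)
        if abs(verification_probability - standard_probability) > max_diff:
            corrections.append((tx, standard_probability))
    return corrections
```

An empty result means the model agrees with every standard probability to within the preset value, and no correction is triggered.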

A fraudulent transaction judging device, the device comprising:

a data acquisition module for acquiring a current transaction;

a probability analysis module, configured to input the current transaction into a transaction anti-fraud analysis model to obtain a transaction fraud probability of the current transaction;

a rule identification module, configured to identify the current transaction by a pre-stored fraudulent transaction identification rule to obtain a recognition result; and

a result obtaining module, configured to determine whether the current transaction is a fraudulent transaction according to the transaction fraud probability of the current transaction and the recognition result.

The device according to claim 8, wherein the device further comprises:

a sample acquisition module for acquiring sample data;

a cleaning module, configured to clean the sample data, and select a training variable from the cleaned sample data;

a variable selection module, configured to select, by using a random forest algorithm model, a modulo variable from the training variable and a modulo variable corresponding to the modulo variable;

an initial model acquisition module for acquiring a preset initial neural network model; and

The model training module is configured to train the initial neural network model according to the model variables corresponding to the model variables and the model variables to obtain a transaction anti-fraud analysis model.

A computer device comprising a memory and one or more processors, the memory storing computer readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:

Obtaining a current transaction;

The computer device according to claim 10, wherein, when executing the computer readable instructions, the processor further performs the following steps before obtaining the current transaction:

Obtaining sample data;

Obtaining a preset initial neural network model; and

The computer device according to claim 11, wherein the cleaning of the sample data and the selecting of training variables from the cleaned sample data, implemented when the processor executes the computer readable instructions, comprise:

Calculating the information value of the first variable; and

The computer device according to claim 11, wherein the cleaning of the sample data implemented by the processor when the computer readable instructions are executed comprises:

Obtaining a third variable in the sample data;

Detecting a variable type of the third variable; and

The computer device according to claim 10, wherein, when executing the computer readable instructions, the processor further performs the following after inputting the current transaction into the transaction anti-fraud analysis model to obtain the transaction fraud probability of the current transaction:

Score = A - B·log(odds)

The computer device according to claim 10, wherein the processor further performs the following steps when executing the computer readable instructions:

Obtaining a verification transaction;

One or more non-transitory computer readable storage media storing computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:

Obtaining a current transaction;

The storage medium according to claim 17, wherein, when executed by the processor, the computer readable instructions further cause the following steps to be performed before obtaining the current transaction:

Obtaining sample data;

Obtaining a preset initial neural network model; and

The storage medium according to claim 18, wherein the cleaning of the sample data and the selecting of training variables from the cleaned sample data, implemented when the computer readable instructions are executed by the processor, comprise:

Calculating the information value of the first variable; and