CN105550175A - Malicious account identification method and device

Info

Publication number: CN105550175A
Application number: CN201410588349.1A
Authority: CN
Inventors: 郑丹丹
Original assignee: Alibaba Group Holding Ltd
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Priority date: 2014-10-28
Filing date: 2014-10-28
Publication date: 2016-05-04
Anticipated expiration: 2034-10-28
Also published as: CN105550175B; CN110033302A; CN110033302B

Abstract

The invention provides a malicious account identification method and device. The method comprises: acquiring accounts to be identified; fuzzifying the accounts to be identified according to fuzzy processing indication information to obtain fuzzy accounts that retain part of the information in the accounts to be identified, wherein the fuzzy processing indication information is used to discover accounts with the same or similar information among the accounts to be identified; and performing malicious account identification on the fuzzy accounts to determine the malicious accounts among the accounts to be identified. The method and device can improve the accuracy of malicious account identification and reduce the misjudgment rate.

Description

Malicious account identification method and device

【Technical field】

The present application relates to the field of Internet technologies, and in particular to a malicious account identification method and device.

【Background】

With the development of Internet technology, application systems such as e-commerce systems have become increasingly numerous. To use an application system, a user generally needs to register an account, for example an electronic mailbox (email). The account serves as the user's virtual identity, and the user can log in to the application system with it to use the resources the system provides or to carry out related activities.

In practice, some malicious users register accounts in bulk in order to steal resources provided by the application system. Taking an e-commerce system as an example, a malicious user can log in with a large number of bulk-registered email addresses and thereby claim the red envelopes offered by the system many times. The application system therefore needs to identify malicious accounts.

The prior art includes a method that clusters the large number of accounts registered in the same period and identifies malicious accounts from the clustering result. A clustering algorithm requires certain parameters to be set, such as the number of clusters and the distance radius. Accounts, however, involve too many uncontrollable factors: it is impossible to predict how many users will register accounts in a given period or how many different types of accounts will appear, so the parameters of the clustering algorithm cannot be set well. As a result, this method is prone to misjudgment and identifies malicious accounts with low accuracy.

【Summary of the invention】

Various aspects of the present application provide a malicious account identification method and device for improving the accuracy of malicious account identification and reducing the misjudgment rate.

One aspect of the present application provides a malicious account identification method, including:

acquiring accounts to be identified;

fuzzifying the accounts to be identified according to fuzzy processing indication information to obtain fuzzy accounts that retain part of the information in the accounts to be identified, wherein the fuzzy processing indication information is used to discover accounts with the same or similar information among the accounts to be identified;

performing malicious account identification on the fuzzy accounts to determine the malicious accounts among the accounts to be identified.

Another aspect of the present application provides a malicious account identification device, including:

an acquisition module, configured to acquire accounts to be identified;

a fuzzification module, configured to fuzzify the accounts to be identified according to fuzzy processing indication information to obtain fuzzy accounts that retain part of the information in the accounts to be identified, wherein the fuzzy processing indication information is used to discover accounts with the same or similar information among the accounts to be identified;

an identification module, configured to perform malicious account identification on the fuzzy accounts to determine the malicious accounts among the accounts to be identified.

In the present application, accounts to be identified are acquired and fuzzified according to fuzzy processing indication information, yielding fuzzy accounts that retain part of the information in the accounts to be identified. Because the purpose of the fuzzy processing indication information is to discover accounts with the same or similar information, comparing the fuzzy accounts reveals the accounts to be identified that share the same or similar information, and such accounts are usually malicious. Performing malicious account identification on the basis of the fuzzy accounts therefore finds the malicious accounts among the accounts to be identified more accurately and reduces the misjudgment rate.

【Brief description of the drawings】

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of a malicious account identification method provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of a malicious account identification method provided by another embodiment of the present application;

FIG. 3 is a schematic structural diagram of a malicious account identification device provided by an embodiment of the present application.

【Detailed description】

To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art from the embodiments in the present application without creative effort fall within the scope of protection of the present application.

FIG. 1 is a schematic flowchart of a malicious account identification method provided by an embodiment of the present application. As shown in FIG. 1, the method includes:

101. Acquire the accounts to be identified.

102. Fuzzify the accounts to be identified according to fuzzy processing indication information to obtain fuzzy accounts that retain part of the information in the accounts to be identified, where the fuzzy processing indication information is used to discover accounts with the same or similar information among the accounts to be identified.

103. Perform malicious account identification on the fuzzy accounts to determine the malicious accounts among the accounts to be identified.

This embodiment provides a malicious account identification method that can be executed by a malicious account identification device. The device may be any equipment that needs to identify malicious accounts, for example an application server or an application client.

To identify malicious accounts, the malicious account identification device first acquires the accounts to be identified. These may include registered accounts that have not yet been identified as legitimate, as well as newly registered accounts. For example, at a specified time the device may acquire at least one account newly registered within a specified time interval as the accounts to be identified. More specifically, the device may periodically acquire at least one account newly registered within the current period. The period may be one day, two days, one week or longer.

It should be noted that the accounts in this embodiment may be any accounts used for login, for example, but not limited to, email addresses. An account in this embodiment generally has two parts, a prefix and a suffix. For an email address, the prefix is the part before the "@" sign and the rest is the suffix.

Malicious accounts generally follow obvious patterns. Their names, for example, are clearly regular: they have a fixed prefix and exactly the same suffix; they auto-increment using digits or letters as a sequence; they contain fixed characters with a recognizable meaning; and so on. Taking email addresses as an example, a malicious user may register addresses such as luha001@163.com, luha002@163.com, ......, luha007@163.com. Malicious accounts thus generally carry the same or similar information and closely resemble one another, and this similarity can be used to identify them.

To discover accounts with the same or similar information among the accounts to be identified, fuzzy processing indication information can be configured in advance for this purpose; that is, the fuzzy processing indication information makes it possible to discover such accounts. It mainly includes information specifying where fuzzification is applied, what is fuzzified, and how the fuzzification is performed.

The malicious account identification device fuzzifies the accounts to be identified according to the fuzzy processing indication information to obtain fuzzy accounts that retain part of the information in the accounts to be identified. Put simply, a fuzzy account keeps part of the information of the account to be identified while the other part is blurred out. Fuzzifying here essentially means abstracting: a concrete account to be identified is abstracted into a fuzzy account.

For example, take the accounts luha3902@163.com and luha244@163.com. The fuzzy processing indication information may specify that the digits in an account are to be blurred out while the number of blurred digits is kept. Fuzzification then yields the fuzzy accounts luha^^^^@163.com and luha^^^@163.com, where each "^" stands for one blurred digit and the number of "^" characters gives the number of blurred digits. The two fuzzy accounts have exactly the same suffix and the same remaining characters in the prefix; they differ only in the number of blurred digits. The corresponding accounts to be identified therefore share the leading characters "luha" and the suffix "163.com", and are similar accounts.

As another example, take the accounts luha3902@163.com and luha3903@163.com. With the same fuzzy processing indication information, which blurs out the digits while keeping their number, fuzzification yields the fuzzy accounts luha^^^^@163.com and luha^^^^@163.com. The two fuzzy accounts have the same suffix, the same remaining characters in the prefix, and the same number of blurred digits, so they are identical. The corresponding accounts to be identified therefore share the leading characters "luha", the suffix, and the number of digits, and are closely similar accounts.
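The digit-blurring rule in the two examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fuzzify` and the choice to blur digits in the prefix only are hypothetical, and the example addresses are written with an "@" separator between prefix and suffix.

```python
import re


def fuzzify(account: str, mask: str = "^") -> str:
    """Blur every digit in the prefix, keeping one mask character per digit.

    Assumption: the prefix is the part before "@"; the suffix is left untouched.
    """
    prefix, sep, suffix = account.partition("@")
    return re.sub(r"\d", mask, prefix) + sep + suffix


for acct in ["luha3902@163.com", "luha244@163.com", "luha3903@163.com"]:
    print(acct, "->", fuzzify(acct))
```

Because the suffix is never touched, the digits in "163.com" survive fuzzification, which matches the fuzzy accounts given in the examples.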

As the above shows, fuzzifying the accounts to be identified blurs out the information that may differ between them and produces fuzzy accounts. Fuzzy accounts are simpler and retain the information that is the same or similar across the accounts to be identified, so accounts with the same or similar information can be found directly from the fuzzy accounts with little risk of misjudgment. The malicious account identification device can therefore perform malicious account identification on the fuzzy accounts to determine the malicious accounts among the accounts to be identified.

In an optional implementation, the malicious account identification device may compare the fuzzy accounts directly, find identical fuzzy accounts, and determine the accounts to be identified that correspond to the identical fuzzy accounts to be malicious accounts. Alternatively, the device may compare the fuzzy accounts directly, find fuzzy accounts whose degree of similarity meets a preset similarity index, and treat the corresponding accounts to be identified as malicious accounts. The similarity index can be set to suit the application scenario. For example, it may require the same leading characters, the same suffix, and the same number of blurred digits; or it may require the same leading characters, the same suffix, and numbers of blurred digits that differ by one; and so on.
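A similarity index of the kind just described might be checked as follows. This is a minimal sketch, not the patented procedure: the helper names, the "^" mask, the "@" separator, and the simplification that the "leading characters" are whatever remains of the prefix once the mask characters are removed are all assumptions made for illustration.

```python
def split_fuzzy(fuzzy: str, mask: str = "^"):
    """Return (retained prefix characters, number of blurred digits, suffix)."""
    prefix, _, suffix = fuzzy.partition("@")
    return prefix.replace(mask, ""), prefix.count(mask), suffix


def is_similar(a: str, b: str, max_count_diff: int = 1) -> bool:
    """Same retained characters, same suffix, digit counts within max_count_diff."""
    lead_a, n_a, suffix_a = split_fuzzy(a)
    lead_b, n_b, suffix_b = split_fuzzy(b)
    return (lead_a == lead_b
            and suffix_a == suffix_b
            and abs(n_a - n_b) <= max_count_diff)


print(is_similar("luha^^^^@163.com", "luha^^^@163.com"))
print(is_similar("luha^^^^@163.com", "luha^^^@163.com", max_count_diff=0))
```

Setting `max_count_diff=0` corresponds to the stricter index (same number of blurred digits), while the default of 1 corresponds to the looser one.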

In an optional implementation, the malicious account identification device may group the fuzzy accounts so that identical or similar fuzzy accounts fall into the same group; evaluate the fuzzy accounts in each group according to evaluation parameters to obtain an evaluation result for each group; and then determine the accounts to be identified that correspond to the groups whose evaluation results satisfy a preset malicious condition to be malicious accounts.

Here, the accounts to be identified that correspond to a group whose evaluation result satisfies the preset malicious condition are the accounts to be identified that correspond to the fuzzy accounts in that group.

Optionally, a similarity index may be preset; whether two fuzzy accounts are similar is judged according to the similarity index, and the fuzzy accounts are divided into groups accordingly.
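One simple way to form such groups is to derive a grouping key from each account and collect accounts with equal keys. The sketch below assumes a deliberately loose similarity index that ignores how many digits were blurred (each run of digits in the prefix collapses to a single "^"); the function names and this particular key are hypothetical choices, not taken from the source.

```python
import re
from collections import defaultdict


def group_key(account: str, mask: str = "^") -> str:
    """Collapse each run of digits in the prefix to one mask character."""
    prefix, sep, suffix = account.partition("@")
    return re.sub(r"\d+", mask, prefix) + sep + suffix


def group_accounts(accounts):
    """Map each grouping key to the original accounts that produce it."""
    groups = defaultdict(list)
    for account in accounts:
        groups[group_key(account)].append(account)
    return dict(groups)


groups = group_accounts(["luha3902@163.com", "luha244@163.com", "alice@example.com"])
for key, members in groups.items():
    print(key, members)
```

A stricter index (for example, requiring the same number of blurred digits) would only change `group_key`; the grouping step itself stays the same.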

Besides carrying the same or similar information, malicious accounts also show distinct characteristics in registration time, registration volume, and information sharing. For example, the registration times of a batch of malicious accounts tend to be concentrated, such as all within the same day. The registration intervals within a batch follow a certain regularity, such as consecutive accounts being registered no more than 2 hours apart. The number of malicious accounts is generally large, possibly more than 100. In addition, malicious users generally reuse some of the same information when registering malicious accounts; for example, malicious accounts may share the same device Internet Protocol (IP) address, MAC address, UMID, TID, ID card number, telephone number and/or contact address.

On this basis, the evaluation parameters may include, but are not limited to, at least one of the following: the average registration interval, the registration time regularity, the number of fuzzy accounts in the group, the features of the group, the posterior probability of the group, the static sharing breadth index, the dynamic sharing breadth index, the static sharing density index, and the dynamic sharing density index.

The average registration interval is the average interval between the registration times of the fuzzy accounts in a group. The registration time of a fuzzy account is the registration time of the corresponding account to be identified. For each group, the malicious account identification device can sort the registration times of the fuzzy accounts in the group into a time series, obtain the interval between each pair of consecutive registration times, and compute the average registration interval from the sum of these intervals and their number.

The shorter the average registration interval of a group, the more likely it is that the fuzzy accounts in the group were registered in a concentrated burst, and hence the higher the risk that they are malicious accounts.
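The computation of the average registration interval can be sketched as follows: sort the registration times, take the gaps between consecutive times, and divide their sum by their count. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def mean_registration_interval(times):
    """Average gap between consecutive registration times, or None if fewer than two."""
    ordered = sorted(times)
    if len(ordered) < 2:
        return None
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps, timedelta()) / len(gaps)


sample = [
    datetime(2014, 10, 28, 9, 30),
    datetime(2014, 10, 28, 9, 0),
    datetime(2014, 10, 28, 9, 10),
]
print(mean_registration_interval(sample))  # gaps of 10 and 20 minutes average to 0:15:00
```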

The registration time regularity is the regularity exhibited by the registration times of the fuzzy accounts in a group. For each group, the malicious account identification device can sort the registration times of the fuzzy accounts in the group into a time series and then derive the regularity of the time series from its standard deviation.

If the registration times of a group are highly regular, the fuzzy accounts in the group are more likely to have been registered maliciously, and hence more likely to be malicious accounts.
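The source does not spell out how the standard deviation is applied. One plausible reading, assumed here purely for illustration, is to take the standard deviation of the gaps between consecutive registration times: a value near zero means the accounts were registered at almost constant intervals. The function name is hypothetical.

```python
from datetime import datetime
from statistics import pstdev


def interval_stddev_seconds(times):
    """Population standard deviation of consecutive registration gaps, in seconds."""
    ordered = sorted(times)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return None  # fewer than two registration times
    return pstdev(gaps)


regular = [datetime(2014, 10, 28, 9, m) for m in (0, 10, 20, 30)]
irregular = [datetime(2014, 10, 28, 9, m) for m in (0, 1, 25, 59)]
print(interval_stddev_seconds(regular), interval_stddev_seconds(irregular))
```

Under this reading, a smaller value indicates stronger regularity; any threshold separating "regular" from "irregular" would have to be chosen per application.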

The number of fuzzy accounts in a group is the count of fuzzy accounts within that group. In practice, different users are unlikely to register identical or similar accounts, and a large number of identical or similar accounts appearing at the same time is even less likely. Therefore, the more fuzzy accounts a group contains, the more likely they are to be malicious accounts.

The features of a group are the features common to the fuzzy accounts in the group, that is, the features of the group itself. For example, the fuzzy accounts in the group may all begin with the same characters, such as luha, and/or contain the same number of characters. The more features a group has, the more features its fuzzy accounts have in common, the more similar they are, and hence the more likely they are to be malicious accounts.

The posterior probability of a group is the probability that the group contains fuzzy accounts registered in the same period as previously identified malicious accounts. Legitimate users and malicious users are unlikely to register accounts in the same period; that is, in practice it is uncommon for malicious users to be registering malicious accounts at the same time as legitimate users are registering legitimate accounts. Therefore, if a group contains fuzzy accounts registered in the same period as already identified malicious accounts, the identical or similar fuzzy accounts in that group are very likely to be malicious accounts.

The static sharing breadth index indicates how often fuzzy accounts within a group share static information. Static information here mainly means information used when registering an account that rarely changes. It may be device information used at registration, such as the device's IP, MAC, UMID and/or TID; user information provided at registration, such as the user's ID card number, telephone number, name and/or contact address; or registration channel information, such as the registration source, the registration business source and/or the registration source website.

Two fuzzy accounts are considered to share static information as long as they use any one or more identical items of static information. For each group, the malicious account identification device can obtain the static information used by each fuzzy account in the group and, by comparing it, find out whether fuzzy accounts share static information.

Because accounts registered by different users are unlikely to use the same static information, the more often fuzzy accounts in a group share static information, the more likely the fuzzy accounts in that group are to be malicious accounts.
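As a rough illustration, a breadth figure of this kind could be obtained by counting the pairs of accounts in a group that have at least one static field in common. Counting pairs is an assumption made for this sketch; the source only says that the index reflects how often sharing occurs. The field names, function names and sample values are hypothetical.

```python
from itertools import combinations


def shares_any(a: dict, b: dict) -> bool:
    """True if the two records have an equal value for at least one common field."""
    return any(a[key] == b[key] for key in a.keys() & b.keys())


def sharing_breadth(records) -> int:
    """Number of account pairs in the group that share at least one field value."""
    return sum(1 for a, b in combinations(records, 2) if shares_any(a, b))


group = [
    {"ip": "192.0.2.1", "phone": "111"},
    {"ip": "192.0.2.1", "phone": "222"},
    {"ip": "198.51.100.7", "phone": "333"},
]
print(sharing_breadth(group))  # only the first two records share a value (the IP)
```

The same counting could be applied to dynamic information (logins, transactions and so on), optionally restricted to records from the same day.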

The dynamic sharing breadth index indicates how often fuzzy accounts within a group share dynamic information. Dynamic information here means account-related information that changes over time. It may be updates to the device information used when registering the account, such as an updated IP, MAC, UMID and/or TID; information on how the user uses the account, such as login events, transaction events, events audited by CTU, password changes and/or changes to other registration information; or transaction information generated by trading with the fuzzy account, such as information on the initiating party of a transaction, information on the passive party of a transaction, information on the traded goods (especially high-risk goods), the delivery address, information on the initiating party of the delivery and/or information on the passive party of the delivery.

Two fuzzy accounts are considered to share dynamic information as long as their use has produced any one or more identical items of dynamic information. For each group, the malicious account identification device can obtain the dynamic information corresponding to each fuzzy account in the group and, by comparing it, find out whether fuzzy accounts share dynamic information.

Further, the comparison may be restricted to whether two fuzzy accounts share dynamic information within a specified time, for example whether they share the same dynamic information within the same day.

Because different users are unlikely to produce the same behavior when using their accounts, the more often fuzzy accounts in a group share dynamic information, the more likely the fuzzy accounts in that group are to be malicious accounts.

The static sharing density index indicates how much static information is shared between those fuzzy accounts in a group that share static information. Static information is defined as above and is not described again here.

The more static information two fuzzy accounts share, the more similar they are and the more likely they are to be malicious accounts. For example, if the first and second fuzzy accounts share the device IP, the user ID card number and the contact address, while the first and third fuzzy accounts share only the device IP, then the first fuzzy account is more similar to the second than to the third. For each group, the malicious account identification device can first obtain the fuzzy accounts that share static information and then count how much static information they share.
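The example of the first, second and third fuzzy accounts can be reproduced with a small sketch that counts, for each pair of accounts sharing anything, how many fields they have in common. The record layout, field names and sample values are hypothetical, and reporting one count per pair is only one possible way to express the density.

```python
from itertools import combinations


def shared_field_counts(records):
    """Map each sharing pair (i, j) to the number of fields with equal values."""
    counts = {}
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        shared = [key for key in a.keys() & b.keys() if a[key] == b[key]]
        if shared:
            counts[(i, j)] = len(shared)
    return counts


accounts = [
    {"ip": "192.0.2.1", "id_number": "ID-1", "address": "Addr-1"},  # first
    {"ip": "192.0.2.1", "id_number": "ID-1", "address": "Addr-1"},  # second
    {"ip": "192.0.2.1", "id_number": "ID-3", "address": "Addr-3"},  # third
]
print(shared_field_counts(accounts))
```

Here the first and second records share three fields while the first and third share only the IP, so the first pair is the closer match. The same counting could be applied to dynamic information.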

The dynamic sharing density index indicates how much dynamic information is shared between those fuzzy accounts in a group that share dynamic information. Dynamic information is defined as above and is not described again here.

The more dynamic information two fuzzy accounts share, the more similar they are and the more likely they are to be malicious accounts. For example, if the first and second fuzzy accounts both logged in, changed their passwords and purchased the same high-risk product on the same day, while the first and third fuzzy accounts only both logged in on the same day, then the first fuzzy account is more similar to the second than to the third. For each group, the malicious account identification device can first obtain the fuzzy accounts that share dynamic information and then count how much dynamic information they share.

It should be noted that the fuzzy accounts in each group can be evaluated on the basis of the above evaluation parameters in several ways to obtain the evaluation result for each group. For example, the malicious account identification device may use any one of the evaluation parameters on its own to evaluate the fuzzy accounts in each group. As another example, the device may use several evaluation parameters together: it assigns a weight to each evaluation parameter, evaluates the fuzzy accounts in the group with each parameter separately to obtain an evaluation value per parameter, and then combines the evaluation values and the weights numerically to obtain the final evaluation result.
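The weighted variant might look like the sketch below. The source does not specify the "numerical processing", so a plain weighted sum is assumed here, along with evaluation values already normalized to the range 0 to 1 (higher meaning more suspicious); the parameter names, weights and the threshold standing in for the preset malicious condition are all hypothetical.

```python
def weighted_score(values: dict, weights: dict) -> float:
    """Weighted sum of per-parameter evaluation values (assumed to lie in [0, 1])."""
    return sum(weight * values[name] for name, weight in weights.items())


# Hypothetical weights and normalized evaluation values for one group.
weights = {"mean_interval": 0.5, "regularity": 0.3, "group_size": 0.2}
values = {"mean_interval": 0.9, "regularity": 0.8, "group_size": 1.0}

THRESHOLD = 0.7  # hypothetical stand-in for the preset malicious condition
score = weighted_score(values, weights)
print(score, score >= THRESHOLD)
```

Using a single evaluation parameter on its own is the special case in which that parameter has weight 1 and all others are omitted.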

For example, after the above evaluation, the malicious account identification device may determine that the accounts to be identified corresponding to a group with a short average registration time interval and a strongly regular registration pattern are malicious accounts. Alternatively, it may make that determination for a group with a short average registration time interval, a strongly regular registration pattern, and a large number of fuzzy accounts. Alternatively, it may make that determination for a group with a short average registration time interval, a strongly regular registration pattern, a large number of fuzzy accounts, a high static sharing breadth index, a high dynamic sharing breadth index, and so on.

In summary, in this embodiment the account to be identified is acquired and fuzzified according to the fuzzy processing indication information, yielding a fuzzy account that retains part of the information in the account to be identified. The fuzzy processing indication information serves to reveal accounts with the same or similar information among the accounts to be identified, so comparing fuzzy accounts exposes such accounts, which are usually malicious. Performing malicious account identification on the fuzzy accounts therefore finds the malicious accounts among the accounts to be identified more accurately and reduces the misjudgment rate.

FIG. 2 is a schematic flowchart of a malicious account identification method provided by another embodiment of the present application. As shown in FIG. 2, the method includes:

201. Acquire the accounts to be identified.

202. Fuzzify the accounts to be identified according to the fuzzification parameters of at least one fuzzy granularity included in the fuzzy processing indication information, so as to obtain, at each fuzzy granularity, fuzzy accounts that retain part of the information in the accounts to be identified.

203. Determine a target granularity from the at least one fuzzy granularity according to the business scenario.

204. Select the fuzzy accounts at the target granularity from all the fuzzy accounts.

205. Group the fuzzy accounts at the target granularity so that identical or similar fuzzy accounts fall into one group.

206. Evaluate the fuzzy accounts in each group according to the evaluation parameters to obtain the evaluation result for each group.

207. Determine that the accounts to be identified corresponding to a group whose evaluation result satisfies the preset malicious condition are malicious accounts.
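
Steps 202 to 207 can be sketched for a single, already chosen target granularity as shown below. This is an illustrative skeleton only: `identify_malicious` and its three callable arguments are hypothetical, and the sketch groups only identical fuzzy accounts, whereas the method also allows grouping similar ones.

```python
from collections import defaultdict

def identify_malicious(accounts, fuzzify, evaluate, is_malicious):
    """Fuzzify, group, evaluate, and flag accounts.

    fuzzify maps an account to its fuzzy account, evaluate maps a list of
    grouped accounts to an evaluation result, and is_malicious maps that
    result to a bool. All three are caller-supplied placeholders.
    """
    groups = defaultdict(list)
    for account in accounts:
        # Steps 202-205: accounts with the same fuzzy form share a group.
        groups[fuzzify(account)].append(account)
    flagged = []
    for members in groups.values():
        # Steps 206-207: evaluate the group and test the malicious condition.
        if is_malicious(evaluate(members)):
            flagged.extend(members)
    return flagged

# Toy usage: blur every digit, score a group by its size, flag groups of two or more.
blur_digits = lambda a: "".join("^" if ch.isdigit() else ch for ch in a)
flagged = identify_malicious(["abc12", "abc34", "xyz"], blur_digits, len, lambda n: n >= 2)
```

Here "abc12" and "abc34" both become "abc^^" and are flagged together, while "xyz" forms a group of one and is not.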

In this embodiment, the fuzzy processing indication information includes fuzzification parameters of at least one fuzzy granularity. Different fuzzy granularities blur out different information in the account to be identified. For example, the fuzzification parameters of the at least one fuzzy granularity may include, but are not limited to, the fuzzification parameters of a first, second, third, fourth, and fifth fuzzy granularity.

The fuzzification parameter of the first fuzzy granularity indicates: blur out all digits in the account prefix and retain the number of blurred digits.

The fuzzification parameter of the second fuzzy granularity indicates: blur out all digits in the account prefix, ignore the number of blurred digits, and mark the blurred part as digits.

The fuzzification parameter of the third fuzzy granularity indicates: blur out all digits in the account prefix and ignore the number of blurred digits; also blur out the non-digit characters in the account prefix other than those at specified positions, and retain the number of blurred non-digit characters.

The fuzzification parameter of the fourth fuzzy granularity indicates: blur out all digits in the account prefix and ignore the number of blurred digits; also blur out the non-digit characters in the account prefix other than those at specified positions, and ignore the number of blurred non-digit characters.

The fuzzification parameter of the fifth fuzzy granularity indicates: blur out all character combinations in the account prefix and ignore the number of characters in each blurred combination, where a character combination is any combination of characters other than the separator characters that split the prefix.

After the accounts to be identified are acquired, they can be fuzzified separately at each fuzzy granularity, which yields the fuzzy accounts at each granularity. Because different business scenarios call for different fuzzy granularities, the required target granularity can be determined from all the fuzzy granularities according to the business scenario, and the fuzzy accounts at the target granularity can then be selected from all the fuzzy accounts. Note that the target granularity may be one or more fuzzy granularities.
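
Selecting fuzzy accounts by target granularity can be sketched as a lookup, shown below. The scenario names and the scenario-to-granularity table are hypothetical; the text does not specify how a business scenario maps to granularities beyond one email registration example.

```python
# Hypothetical scenario table; each scenario may name one or more target granularities.
TARGET_GRANULARITIES = {
    "email_registration": [1],
    "promotion_signup": [1, 2],
}

def select_fuzzy_accounts(fuzzy_by_granularity, scenario):
    """Keep only the fuzzy accounts produced at the scenario's target granularities."""
    return {g: fuzzy_by_granularity[g] for g in TARGET_GRANULARITIES[scenario]}

# Fuzzy accounts already computed at each granularity (values taken from the text's example).
fuzzy_by_granularity = {
    1: ["luha^^^^163.com", "luh^^^163.com"],
    2: ["luha^163.com", "luh^163.com"],
}
selected = select_fuzzy_accounts(fuzzy_by_granularity, "email_registration")
```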

Malicious account identification is then performed on the fuzzy accounts at each target granularity. The identification process and the evaluation parameters are described in the foregoing embodiment and are not repeated here.

Take the accounts luha3902163.com and luh244163.com as examples. Fuzzification with the parameter of the first fuzzy granularity yields the fuzzy accounts luha^^^^163.com and luh^^^163.com. The second fuzzy granularity yields luha^163.com and luh^163.com. The third fuzzy granularity yields lucc^163.com and luc^163.com, where the non-digit characters at the specified positions are the first two non-digit characters. The fourth fuzzy granularity yields luc^163.com for both accounts. The fifth fuzzy granularity yields x163.com for both accounts. Here "^", "c", and "x" are identifiers that replace the original characters after fuzzification.
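
The five granularities can be sketched as one function over the account prefix, reproducing the example above. This is an illustrative sketch under stated assumptions: the prefixes are taken to be "luha3902" and "luh244" with suffix "163.com", the separator set `"._-"` is a guess because the text does not list the separator characters, and `fuzzify_prefix` is a hypothetical name.

```python
import re

def fuzzify_prefix(prefix, granularity, keep=2, separators="._-"):
    """Blur an account prefix at granularity 1-5 using '^', 'c', and 'x' as markers."""
    if granularity == 5:
        # Replace every run of non-separator characters with a single 'x'.
        return re.sub(rf"[^{re.escape(separators)}]+", "x", prefix)
    if granularity not in (1, 2, 3, 4):
        raise ValueError("granularity must be between 1 and 5")
    out = []          # (symbol, is_blurred) pairs
    non_digits = 0
    for ch in prefix:
        if "0" <= ch <= "9":
            token = ("^", True)
            collapse = granularity >= 2      # granularity 1 keeps the digit count
        else:
            non_digits += 1
            blur = granularity >= 3 and non_digits > keep
            token = ("c", True) if blur else (ch, False)
            collapse = granularity == 4      # granularity 3 keeps the non-digit count
        if token[1] and collapse and out and out[-1] == token:
            continue                         # merge a run of blurred characters
        out.append(token)
    return "".join(symbol for symbol, _ in out)

suffix = "163.com"  # assumed account suffix
fuzzy = {g: fuzzify_prefix("luha3902", g) + suffix for g in range(1, 6)}
```

Tracking whether each output symbol was blurred keeps an original letter "c" in the retained positions from being merged with the "c" marker.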

For example, for the business scenario of registering an email address, the first fuzzy granularity can be selected as the target granularity. The fuzzy accounts at the target granularity are then luha^^^^163.com and luh^^^163.com, and malicious account identification is performed on these two fuzzy accounts. Specifically, the malicious account identification device may group the two fuzzy accounts; assume they fall into one group. It then evaluates the fuzzy accounts in the group with the evaluation parameters to obtain an evaluation result. For example, with the average registration time interval and the static sharing breadth index as evaluation parameters, the device first obtains the registration time interval between the two fuzzy accounts and assigns the group a score based on it. It then determines whether the two fuzzy accounts share static information and assigns the group a second score based on the result. The final score of the group, that is, the evaluation result, is obtained from these two scores. The device then compares the final score with the score threshold in the preset malicious condition. If the score exceeds the threshold, the accounts to be identified corresponding to the group, namely luha3902163.com and luh244163.com, are determined to be malicious accounts; otherwise they are not.
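
The two-score evaluation and threshold test described above might look like the sketch below. The scoring formulas, the one-hour scale, the threshold of 1.5, and the static attribute strings are all assumptions chosen for illustration; the text does not define how either score is computed or combined.

```python
from datetime import datetime

def interval_score(registered_at, max_gap_seconds=3600.0):
    """Score 1.0 for simultaneous registrations, falling linearly to 0.0 at max_gap_seconds."""
    times = sorted(registered_at)
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
    if not gaps:
        return 0.0
    mean_gap = sum(gaps) / len(gaps)
    return max(0.0, 1.0 - mean_gap / max_gap_seconds)

def static_sharing_score(static_info):
    """Score 1.0 if any two accounts share a static attribute value, else 0.0."""
    seen = set()
    for attrs in static_info:
        if seen & attrs:
            return 1.0
        seen |= attrs
    return 0.0

def group_is_malicious(registered_at, static_info, threshold=1.5):
    """Add the two scores (an assumed combination) and compare with the threshold."""
    return interval_score(registered_at) + static_sharing_score(static_info) > threshold

# Hypothetical data for a group of two accounts.
registered_at = [datetime(2014, 10, 1, 10, 0), datetime(2014, 10, 1, 10, 5)]
static_info = [{"ip:10.0.0.1", "device:A"}, {"ip:10.0.0.1", "device:B"}]
flag = group_is_malicious(registered_at, static_info)
```

In this example the accounts registered five minutes apart and share an IP address, so the combined score exceeds the assumed threshold and the group is flagged.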

In this embodiment, the accounts to be identified are fuzzified at different fuzzy granularities to obtain fuzzy accounts at each granularity. The fuzzy accounts at the granularity required by the business scenario are then selected and grouped, each group is evaluated with the evaluation parameters, and malicious accounts are determined from the evaluation results. This finds the malicious accounts among the accounts to be identified more accurately and reduces the misjudgment rate.

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present application.

In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of the other embodiments.

FIG. 3 is a schematic structural diagram of a malicious account identification device provided by an embodiment of the present application. As shown in FIG. 3, the device includes an acquisition module 31, a fuzzification processing module 32, and an identification module 33.

The acquisition module 31 is configured to acquire an account to be identified.

The fuzzification processing module 32 is connected to the acquisition module 31 and is configured to fuzzify the account to be identified acquired by the acquisition module 31 according to the fuzzy processing indication information, so as to obtain a fuzzy account that retains part of the information in the account to be identified. The fuzzy processing indication information is used to find accounts with the same or similar information among the accounts to be identified.

The identification module 33 is connected to the fuzzification processing module 32 and is configured to perform malicious account identification on the fuzzy accounts obtained by the fuzzification processing module 32, so as to determine the malicious accounts among the accounts to be identified.

In an optional implementation, the identification module 33 may be specifically configured to: group the fuzzy accounts so that identical or similar fuzzy accounts fall into one group; evaluate the fuzzy accounts in each group according to the evaluation parameters to obtain the evaluation result for each group; and determine that the accounts to be identified corresponding to a group whose evaluation result satisfies the preset malicious condition are malicious accounts.

In an optional implementation, the fuzzy processing indication information includes fuzzification parameters of at least one fuzzy granularity. On this basis, the fuzzification processing module 32 may be specifically configured to fuzzify the account to be identified separately according to the fuzzification parameter of each fuzzy granularity in the at least one fuzzy granularity, so as to obtain, at each fuzzy granularity, a fuzzy account that retains part of the information in the account to be identified.

Optionally, the account to be identified in this embodiment may include an account prefix and an account suffix.

The fuzzification parameters of the at least one fuzzy granularity then include:

The fuzzification parameter of the first fuzzy granularity indicates: blur out all digits in the account prefix and retain the number of blurred digits;

The fuzzification parameter of the second fuzzy granularity indicates: blur out all digits in the account prefix, ignore the number of blurred digits, and mark the blurred part as digits;

The fuzzification parameter of the third fuzzy granularity indicates: blur out all digits in the account prefix and ignore the number of blurred digits; also blur out the non-digit characters in the account prefix other than those at specified positions, and retain the number of blurred non-digit characters;

The fuzzification parameter of the fourth fuzzy granularity indicates: blur out all digits in the account prefix and ignore the number of blurred digits; also blur out the non-digit characters in the account prefix other than those at specified positions, and ignore the number of blurred non-digit characters; and

The fuzzification parameter of the fifth fuzzy granularity indicates: blur out all character combinations in the account prefix and ignore the number of characters in each blurred combination, where a character combination is any combination of characters other than the separator characters that split the prefix.

Based on the fuzzification parameters of the at least one fuzzy granularity, the identification module 33 may group the fuzzy accounts, so that identical or similar fuzzy accounts fall into one group, specifically as follows:

Determine a target granularity from the at least one fuzzy granularity according to the business scenario;

Select the fuzzy accounts at the target granularity from all the fuzzy accounts;

Group the fuzzy accounts at the target granularity so that identical or similar fuzzy accounts fall into one group.

Optionally, the evaluation parameters may include at least one of the following: average registration time interval, registration time regularity, number of fuzzy accounts in the group, characteristics of the group, posterior probability of the group, static sharing breadth index, dynamic sharing breadth index, static sharing intensity index, and dynamic sharing intensity index.

The static sharing breadth index characterizes how often fuzzy accounts within a group share static information;

The dynamic sharing breadth index characterizes how often fuzzy accounts within a group share dynamic information;

The static sharing intensity index characterizes how much static information is shared between the fuzzy accounts in a group that share static information;

The dynamic sharing intensity index characterizes how much dynamic information is shared between the fuzzy accounts in a group that share dynamic information.

The malicious account identification device provided by this embodiment acquires the account to be identified and fuzzifies it according to the fuzzy processing indication information, yielding a fuzzy account that retains part of the information in the account to be identified. The fuzzy processing indication information serves to reveal accounts with the same or similar information among the accounts to be identified, so comparing fuzzy accounts exposes such accounts, which are usually malicious. Performing malicious account identification on the fuzzy accounts therefore finds the malicious accounts among the accounts to be identified more accurately and reduces the misjudgment rate.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in an actual implementation the division may differ, for instance multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may each exist physically on their own, or may be integrated two or more to a unit. The integrated unit may be implemented in hardware or as hardware combined with software functional units.

An integrated unit implemented in the form of software functional units may be stored in a computer-readable storage medium. The software functional units are stored in a storage medium and include instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present application. The storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, and that some of their technical features may be replaced with equivalents, without such modifications or replacements causing the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims

1. A malicious account identification method, characterized by comprising:

acquiring an account to be identified;

fuzzifying the account to be identified according to fuzzy processing indication information, so as to obtain a fuzzy account that retains part of the information in the account to be identified, wherein the fuzzy processing indication information is used to find accounts with the same or similar information among the accounts to be identified; and

performing malicious account identification on the fuzzy account, so as to determine a malicious account among the accounts to be identified.

2. The method according to claim 1, wherein performing malicious account identification on the fuzzy account to determine the malicious account among the accounts to be identified comprises:

grouping the fuzzy accounts so that identical or similar fuzzy accounts fall into one group;

evaluating the fuzzy accounts in each group according to evaluation parameters, so as to obtain an evaluation result for each group; and

determining that the accounts to be identified corresponding to a group whose evaluation result satisfies a preset malicious condition are malicious accounts.

3. The method according to claim 2, wherein the fuzzy processing indication information includes fuzzification parameters of at least one fuzzy granularity;

fuzzifying the account to be identified according to the fuzzy processing indication information, so as to obtain a fuzzy account that retains part of the information in the account to be identified, comprises:

fuzzifying the account to be identified separately according to the fuzzification parameter of each fuzzy granularity in the at least one fuzzy granularity, so as to obtain, at each fuzzy granularity, a fuzzy account that retains part of the information in the account to be identified.

4. The method according to claim 3, wherein the account to be identified comprises an account prefix and an account suffix;

the fuzzification parameters of the at least one fuzzy granularity include:

a fuzzification parameter of a first fuzzy granularity, which indicates: blur out all digits in the account prefix and retain the number of blurred digits;

a fuzzification parameter of a second fuzzy granularity, which indicates: blur out all digits in the account prefix, ignore the number of blurred digits, and mark the blurred part as digits;

a fuzzification parameter of a third fuzzy granularity, which indicates: blur out all digits in the account prefix and ignore the number of blurred digits; also blur out the non-digit characters in the account prefix other than those at specified positions, and retain the number of blurred non-digit characters;

a fuzzification parameter of a fourth fuzzy granularity, which indicates: blur out all digits in the account prefix and ignore the number of blurred digits; also blur out the non-digit characters in the account prefix other than those at specified positions, and ignore the number of blurred non-digit characters; and

a fuzzification parameter of a fifth fuzzy granularity, which indicates: blur out all character combinations in the account prefix and ignore the number of characters in each blurred combination, where a character combination is any combination of characters other than the separator characters that split the prefix.

5. The method according to claim 3 or 4, wherein grouping the fuzzy accounts so that identical or similar fuzzy accounts fall into one group comprises:

determining a target granularity from the at least one fuzzy granularity according to a business scenario;

selecting the fuzzy accounts at the target granularity from all the fuzzy accounts; and

grouping the fuzzy accounts at the target granularity so that identical or similar fuzzy accounts fall into one group.

6. The method according to claim 2, 3 or 4, wherein the evaluation parameters include at least one of the following: average registration time interval, registration time regularity, number of fuzzy accounts in the group, characteristics of the group, posterior probability of the group, static sharing breadth index, dynamic sharing breadth index, static sharing intensity index, and dynamic sharing intensity index;

wherein the static sharing breadth index characterizes how often fuzzy accounts within a group share static information;

the dynamic sharing breadth index characterizes how often fuzzy accounts within a group share dynamic information;

the static sharing intensity index characterizes how much static information is shared between the fuzzy accounts in a group that share static information; and

the dynamic sharing intensity index characterizes how much dynamic information is shared between the fuzzy accounts in a group that share dynamic information.

7. A malicious account identification device, characterized by comprising:

an acquisition module, configured to acquire an account to be identified;

a fuzzification processing module, configured to fuzzify the account to be identified according to fuzzy processing indication information, so as to obtain a fuzzy account that retains part of the information in the account to be identified, wherein the fuzzy processing indication information is used to find accounts with the same or similar information among the accounts to be identified; and

an identification module, configured to perform malicious account identification on the fuzzy account, so as to determine a malicious account among the accounts to be identified.

8. The device according to claim 7, wherein the identification module is specifically used for:

9. The device according to claim 8, wherein the fuzzy processing indication information includes fuzzification parameters of at least one fuzzy granularity;

the fuzzification processing module is specifically configured to fuzzify the account to be identified separately according to the fuzzification parameter of each fuzzy granularity in the at least one fuzzy granularity, so as to obtain, at each fuzzy granularity, a fuzzy account that retains part of the information in the account to be identified.

10. The device according to claim 9, wherein the account to be identified comprises: an account prefix and an account suffix;

The fuzzification parameters of the at least one fuzzy granularity include:

11. The device according to claim 9 or 10, wherein the identification module is specifically used for:

12. The device according to claim 8, 9 or 10, wherein the evaluation parameters include at least one of the following: average registration time interval, registration time regularity, number of fuzzy accounts in the group, characteristics of the group, posterior probability of the group, static sharing breadth index, dynamic sharing breadth index, static sharing intensity index, and dynamic sharing intensity index;