
CN105897714B - Botnet detection method based on DNS traffic characteristics - Google Patents

Botnet detection method based on DNS traffic characteristics

Info

Publication number: CN105897714B
Authority: CN (China)
Prior art keywords: domain name, domain, address, detection method, confidence
Legal status: Active
Application number: CN201610222674.5A
Other languages: Chinese (zh)
Other versions: CN105897714A (en)
Inventors: 喻梅, 李鑫, 于健, 王建荣, 赵越, 雷霆
Current assignee: Ningbo Zhiwei Ruichi Information Technology Co ltd
Original assignee: Tianjin University
Application filed by: Tianjin University
Priority to: CN201610222674.5A
Publication of CN105897714A
Publication of CN105897714B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2463/00 Additional details relating to network architectures or network communication protocols for network security covered by H04L63/00
    • H04L2463/144 Detection or countermeasures against botnets

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A botnet detection method based on DNS traffic characteristics comprises two parts. The Domain-Flux botnet detection method combines legitimate and illegitimate main domain names into a target set; processes the set and keeps the domain names longer than 6 characters as the objects of study; computes the domain name entropy, word-formation feature, phonetic feature and grouping feature of each; and feeds these features to a random forest classifier to obtain a trained model. The Fast-Flux botnet detection method, which builds on the Domain-Flux method, processes the raw data of the DNS server; evaluates the preprocessed domain names with the trained model above to obtain a DGA score; scores domain names and IP addresses against a whitelist, a blacklist and a graylist; computes the time characteristics of the IP addresses; computes the stability of the IP addresses; and feeds the results to a random forest classifier to obtain the trained model SFF. The experiments show a relatively high accuracy.

Description

Botnet detection method based on DNS traffic characteristics

Technical field

The invention relates to DNS domain name technology and machine-learning classification algorithms, and in particular to a botnet detection method based on DNS traffic characteristics.

Background

Current domain name generation techniques mainly include:

(1) Domain-Flux: the practice of continuously changing and assigning multiple domain names to one or more IP addresses.

(2) Fast-Flux: this technique has two variants, Single-Flux and Double-Flux.

Single-Flux can be understood by analogy with the Tor network. In a botnet built on Single-Flux, every bot host acts as a redirection node, and addressing is carried out by redirecting through different bot hosts. This keeps any single node from affecting the whole botnet and also makes the botnet hard for researchers to trace.

Compared with Single-Flux, Double-Flux adds a controllable DNS service layer: the controller manages the rights to modify and publish domain names itself, rather than relying on the resolution service of a public domain name provider. The resolving name servers are part of the Double-Flux architecture, and their addresses also change constantly.

Classification algorithms are widely used in many fields, especially data mining, where they are generally implemented with statistical models from probability theory. Commonly used classifiers include:

(1) Decision tree: a commonly used method for classification and prediction. A decision tree algorithm infers classification rules from a set of unordered, irregular known instances.

(2) Random forest: a classifier made up of multiple decision trees that are built independently of one another. Once the forest is built, an input sample is passed down each tree until it reaches a leaf node; the class of that leaf is the tree's prediction, and the forest combines the predictions of its trees, typically by majority vote.

Botnets combine traditional techniques such as network worms, Trojan backdoors and viruses with newer ones, and have become one of the most widespread and stealthy forms of malicious code. An operator with illicit aims spreads bot programs widely to form a botnet, then launches attacks through a command and control channel. Botnet platform technology is now mature, which makes attacks more profitable for attackers.

In recent years, researchers abroad have proposed detecting botnets by analyzing DNS data flows. Most existing DNS-based detection methods are validated on simulated botnets rather than tested on real network traffic. In addition, the data sets used in their tests are relatively small and do not represent the characteristics of real network traffic.

Summary of the invention

The technical problem to be solved by the invention is to provide a botnet detection method based on DNS traffic characteristics. By extracting and analyzing the query traffic of Domain Name System (DNS) servers, the method identifies the command and control (C&C) servers behind the query strategies of Domain-Flux and Fast-Flux botnets, two kinds of botnet with distinctive domain name resolution characteristics.

The technical solution adopted by the invention is a Domain-Flux botnet detection method based on DNS traffic characteristics, comprising the following steps:

1) Read domain names: read legitimate domain names and extract their main domain names, read illegitimate domain names generated by a domain generation algorithm (DGA) and extract their main domain names, then combine the legitimate and illegitimate main domain names into a target set;

2) Process the target set, compute the length of each processed domain name, and keep the domain names longer than 6 characters as the objects of study;

3) Compute the domain name entropy, the word-formation feature, the phonetic feature and the grouping feature of each domain name; these features are used to identify random domain names generated by a DGA;

4) Split the resulting entropy, word-formation, phonetic and grouping features into a training set and a test set, and feed them to a random forest classifier to obtain the trained model mDGA.

The processing in step 2) consists of removing junk data and splitting each record at the comma into a serial number part and a domain name part.
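The processing in step 2) can be sketched in Python as follows. This is a minimal sketch, not the patent's implementation: the function name `load_candidates` is hypothetical, and the definition of "junk data" (blank lines, lines without a comma, non-numeric serial numbers) is an assumption, since the text does not say what counts as junk.

```python
def load_candidates(lines, min_length=6):
    """Return the domain names longer than min_length from 'serial,domain' records."""
    kept = []
    for raw in lines:
        record = raw.strip().lower()
        # Assumed junk: blank lines and lines that lack the comma separator.
        if not record or "," not in record:
            continue
        serial, domain = record.split(",", 1)
        serial, domain = serial.strip(), domain.strip()
        # Assumed junk: a serial number that is not numeric, or an empty domain.
        if not serial.isdigit() or not domain:
            continue
        # Keep only domain names strictly longer than the threshold (6 in the text).
        if len(domain) > min_length:
            kept.append(domain)
    return kept
```

A call such as `load_candidates(["1,google", "2,wikipedia"])` keeps only `wikipedia`, because `google` has exactly 6 characters and the rule requires more than 6.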

The domain name entropy in step 3) is computed as the Shannon information entropy:

$$E = -\sum_{i} \frac{C_i}{L}\,\log\frac{C_i}{L} \qquad (1)$$

where $E$ is the Shannon information entropy of the domain name, a measure of how dispersed the characters of the string are; $L$ is the length of the string; $C_i$ is the number of occurrences of character $i$; and $i$ ranges over the distinct characters that appear in the string;
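As an illustration, the Shannon entropy of a domain name can be computed from character counts as below. The function name `domain_entropy` is hypothetical, and the base-2 logarithm is an assumption, because the text does not state the base.

```python
import math
from collections import Counter


def domain_entropy(name):
    """Shannon entropy of the characters in a domain name (base 2 assumed)."""
    length = len(name)
    if length == 0:
        return 0.0
    counts = Counter(name)  # C_i: occurrences of each distinct character
    return -sum((c / length) * math.log2(c / length) for c in counts.values())
```

A name made of one repeated character has entropy 0, while a name whose characters are all distinct reaches the maximum for its length, which is the property that separates random-looking DGA names from dictionary-like ones.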

The word-formation feature is computed with the basic N-gram model, which estimates the probability that a sentence occurs. Writing a domain name as a sequence of letters $S = w_1 w_2 w_3 \ldots w_n$, the probability $p(S)$ of the domain name, i.e. the word-formation feature, is:

$$p(S) = p(w_1)\,p(w_2 \mid w_1)\,p(w_3 \mid w_1 w_2)\cdots p(w_n \mid w_1 w_2 \ldots w_{n-1}) \qquad (2)$$

where each factor is the probability that the $i$-th letter $w_i$ appears given the letters before it, and $n$ is the length of the sequence $S$;
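A minimal sketch of an N-gram score is given below. It is an illustration under stated assumptions rather than the patent's model: it uses a bigram approximation (each letter conditioned only on the previous one), add-one smoothing, a 37-character alphabet, and a log probability instead of a raw product to avoid underflow. The names `train_bigram` and `log_prob` are hypothetical.

```python
import math
from collections import Counter

# Assumed alphabet for domain labels: letters, digits and the hyphen.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"


def train_bigram(names):
    """Count letter pairs in a corpus of known (e.g. legitimate) domain names."""
    pair_counts = Counter()
    context_counts = Counter()
    for name in names:
        padded = "^" + name  # "^" marks the start of the name
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return pair_counts, context_counts


def log_prob(name, model):
    """Log of p(S) under the bigram model, with add-one smoothing."""
    pair_counts, context_counts = model
    vocab = len(ALPHABET)
    total = 0.0
    padded = "^" + name
    for prev, cur in zip(padded, padded[1:]):
        p = (pair_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab)
        total += math.log(p)
    return total
```

Trained on a handful of legitimate names, the model assigns a higher log probability to a pronounceable name than to a random string of the same length, which is the signal the word-formation feature relies on.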

The phonetic feature is the ratio of the number of vowels to the total length of the domain name, used as the statistical characteristic of its vowels:

$$E = \frac{\sum_i d_i}{L} \qquad (3)$$

where $d_i$ is the number of occurrences of vowel $i$, $L$ is the length of the string, and $E$, which the method calls the entropy of the vowels, is the resulting ratio;

The grouping feature is the number of parts obtained when a domain name is split into runs of digits and runs of letters.
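The phonetic and grouping features can be sketched as follows. The function names are hypothetical; treating only `a, e, i, o, u` as vowels and letting any other character (such as a hyphen) act as a separator between groups are assumptions not stated in the text.

```python
import re

VOWELS = set("aeiou")  # assumed vowel set


def vowel_ratio(name):
    """Number of vowels divided by the total length of the domain name."""
    if not name:
        return 0.0
    return sum(ch in VOWELS for ch in name.lower()) / len(name)


def group_count(name):
    """Number of maximal runs of letters or digits in the domain name."""
    return len(re.findall(r"[a-z]+|[0-9]+", name.lower()))
```

For example, `abc123def` splits into three groups (`abc`, `123`, `def`), whereas a purely alphabetic name forms a single group.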

A Fast-Flux botnet detection method that builds on the Domain-Flux botnet detection method above comprises the following steps:

1) Process the raw data of the DNS server with the Passivedns tool, keeping only the A records returned by the DNS server, and preprocess the data;

2) Evaluate each preprocessed domain name with the model trained by the random forest classifier in the Domain-Flux botnet detection method, obtaining a DGA score;

3) Score domain names and IP addresses against a whitelist, a blacklist and a graylist, then cross-score the results to obtain the confidence of each domain name. The whitelist holds the domain names and IP addresses of trusted main server sites; the graylist holds the domain names and IP addresses of public cloud services offered by reasonably trustworthy companies; the blacklist holds malicious domain names and IP addresses known to be controlled by botnet owners;

4) Compute the time characteristics of the IP addresses;

5) Compute the stability of the IP addresses;

6) Split the DGA score, domain name confidence, IP address time characteristics and IP address stability into a training set and a test set, and feed them to a random forest classifier to obtain the trained model SFF.

The preprocessing in step 1) consists of processing the domain names and IP addresses and matching each IP address to its autonomous system (AS), using the AS data published by MaxMind as the AS number lookup dictionary.
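The A-record filtering and AS matching in step 1) might look like the sketch below. Several things here are assumptions: the `||`-delimited log layout (timestamp, client, server, class, query, type, answer, TTL, count) is assumed for the Passivedns output, the prefix table is a hypothetical stand-in for the MaxMind AS data, and the addresses and AS numbers in the usage note come from ranges reserved for documentation.

```python
import ipaddress


def parse_a_records(lines):
    """Keep (domain, ip) pairs from log lines whose record type is A."""
    records = []
    for line in lines:
        fields = line.strip().split("||")
        if len(fields) < 7:
            continue  # malformed line under the assumed layout
        query, rr_type, answer = fields[4], fields[5], fields[6]
        if rr_type != "A":
            continue  # drop AAAA, CNAME, NS and other record types
        records.append((query.rstrip(".").lower(), answer))
    return records


def lookup_asn(ip, prefix_table):
    """Return the AS number of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, asn in prefix_table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None
```

With a table such as `{"192.0.2.0/24": 64500, "192.0.0.0/16": 64501}`, the address `192.0.2.10` resolves to AS 64500 because the longer prefix wins. A linear scan is adequate for a sketch; a real AS table would call for a prefix tree.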

In step 3), domain names and IP addresses are scored as follows. If the domain name is on the whitelist its confidence is increased by 1, if it is on the blacklist its confidence is decreased by 1, and if it is on the graylist its confidence is increased by 0.5; the result is recorded as $P_{with}$. The IP address is scored by the same rule and the result is recorded as $P_{geo}$. If the domain name or IP address is on none of the three lists, its confidence is zero.

The cross-scoring in step 3) combines the domain name score and the IP score into the confidence of the domain name:

$$P_{domain} = \lambda P_{geo} + \mu P_{with} \qquad (4)$$

where $P_{domain}$ is the confidence of the domain name, $P_{geo}$ is the result of checking the IP address against the three lists, $P_{with}$ is the result of checking the domain name against the three lists, $\lambda$ is the weight of $P_{geo}$ and $\mu$ is the weight of $P_{with}$.
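The list scoring and the weighted combination can be sketched together as below. The function names are hypothetical, and the default weights of 0.5 are placeholders: the text gives no values for λ and μ. Handling a domain with several IP addresses (for instance by averaging their scores) is also left open by the text and is not attempted here.

```python
def list_score(item, whitelist, blacklist, graylist):
    """+1 if whitelisted, -1 if blacklisted, +0.5 if graylisted, else 0."""
    score = 0.0
    if item in whitelist:
        score += 1.0
    if item in blacklist:
        score -= 1.0
    if item in graylist:
        score += 0.5
    return score


def domain_confidence(domain, ip, domain_lists, ip_lists, lam=0.5, mu=0.5):
    """P_domain = lam * P_geo + mu * P_with; lam and mu are assumed weights."""
    p_with = list_score(domain, *domain_lists)  # domain name against the lists
    p_geo = list_score(ip, *ip_lists)           # IP address against the lists
    return lam * p_geo + mu * p_with
```

Each `*_lists` argument is a `(whitelist, blacklist, graylist)` tuple of sets. A whitelisted domain that resolves to a blacklisted IP address scores 0 with equal weights, so the choice of λ and μ decides which signal dominates.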

The time characteristic of the IP addresses in step 4) is the number of times the domain name's IP addresses have been resolved.

Step 4) also computes the statistical characteristics of the IP addresses.

A single formula is used to compute both the numerical characteristic of the IP addresses and the distribution characteristic of the autonomous systems (AS) they belong to.

In that formula, when computing the numerical characteristic of the IP addresses, X is an IP address of the domain name, α is the mean of the IP addresses and N is the number of IP addresses of the domain name; when computing the distribution characteristic of the autonomous systems, X is an autonomous system of the domain name, α is the mean over the autonomous systems and N is the number of autonomous systems.
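The formula itself is not reproduced in this text. Since it is defined in terms of values X, their mean α and their count N, one plausible reading is a dispersion measure around the mean. The sketch below assumes a population standard deviation, with IPv4 addresses converted to integers and autonomous systems represented by their AS numbers. All of this is an assumption for illustration, not the patent's formula, and the function names are hypothetical.

```python
import ipaddress
import statistics


def dispersion(values):
    """Population standard deviation of numeric values (assumed statistic)."""
    if not values:
        return 0.0
    return statistics.pstdev(values)


def ip_dispersion(ips):
    """Dispersion of IP addresses, each converted to its integer value."""
    return dispersion([int(ipaddress.ip_address(ip)) for ip in ips])


def as_dispersion(as_numbers):
    """Dispersion of the AS numbers the IP addresses belong to."""
    return dispersion(list(as_numbers))
```

Under this reading, a domain whose addresses sit in one small block of one AS yields values near zero, while a Fast-Flux domain spread over many networks yields large values.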

The IP address stability in step 5) is computed with a formula that combines the number of queries, the statistical characteristics of the IP addresses and the characteristics of DNS queries.

In that formula, $S$ is the IP address stability; $C_{ip}$ and $C_{hit}$ are the number of IP addresses obtained and the number of queries; both share a single threshold $c_{th}$, the upper limit on the number of IP addresses or queries; and $c_{ip}$ and $c_{hit}$ are the values of the IP address count and the query count after capping at $c_{th}$.

The botnet detection method based on DNS traffic characteristics offers a new approach to botnet detection: it applies pronunciation and word-formation features to the detection of malicious DGA domain names and introduces a confidence-based evaluation criterion. This enriches the available means of detecting botnets and improves detection accuracy. When computing the false positive and false negative rates, the invention uses averages, which makes it easy to pull out the false positive and false negative cases for further analysis and evaluation. The experimental accuracy is relatively high.

Description of the drawings

Fig. 1 is a flowchart of the Domain-Flux botnet detection method based on DNS traffic characteristics;

Fig. 2 compares the accuracy of the mDGA model of the invention with that of the existing tDGA in detecting DGA domain names;

Fig. 3 shows the Fast-Flux botnet detection method based on DNS traffic characteristics.

Detailed description

The botnet detection method based on DNS traffic characteristics is described in detail below with reference to the embodiments and the drawings.

The method comprises a Domain-Flux botnet detection method and a Fast-Flux botnet detection method, both based on DNS traffic characteristics.

As shown in Fig. 1, the Domain-Flux botnet detection method based on DNS traffic characteristics comprises the following steps:

1) Read domain names: read legitimate domain names and extract their main domain names, read illegitimate domain names generated by a domain generation algorithm (DGA) and extract their main domain names, then combine the legitimate and illegitimate main domain names into a target set;

2) Process the target set, compute the length of each processed domain name, and keep the domain names longer than 6 characters as the objects of study. The processing consists of removing junk data and splitting each record at the comma into a serial number part and a domain name part;

3) Compute the domain name entropy, the word-formation feature, the phonetic feature and the grouping feature of each domain name; these features are used to identify random domain names generated by a DGA. Specifically:

The domain name entropy is computed as the Shannon information entropy:

$$E = -\sum_{i} \frac{C_i}{L}\,\log\frac{C_i}{L} \qquad (1)$$

where $E$ is the Shannon information entropy of the domain name, a measure of how dispersed the characters of the string are; $L$ is the length of the string; $C_i$ is the number of occurrences of character $i$; and $i$ ranges over the distinct characters that appear in the string.

The word-formation feature is computed with the basic N-gram model, which estimates the probability that a sentence occurs. Writing a domain name as a sequence of letters $S = w_1 w_2 w_3 \ldots w_n$, the probability $p(S)$ of the domain name, i.e. the word-formation feature, is:

$$p(S) = p(w_1)\,p(w_2 \mid w_1)\,p(w_3 \mid w_1 w_2)\cdots p(w_n \mid w_1 w_2 \ldots w_{n-1}) \qquad (2)$$

where each factor is the probability that the $i$-th letter $w_i$ appears given the letters before it, and $n$ is the length of the sequence $S$;

The phonetic feature is the ratio of the number of vowels to the total length of the domain name, used as the statistical characteristic of its vowels:

$$E = \frac{\sum_i d_i}{L} \qquad (3)$$

where $d_i$ is the number of occurrences of vowel $i$, $L$ is the length of the string, and $E$, which the method calls the entropy of the vowels, is the resulting ratio;

The grouping feature is the number of parts obtained when a domain name is split into runs of digits and runs of letters.

4) Split the resulting entropy, word-formation, phonetic and grouping features into a training set and a test set, and feed them to a random forest classifier to obtain the trained model mDGA.

As shown in Fig. 3, the Fast-Flux botnet detection method based on DNS traffic characteristics uses the Domain-Flux botnet detection method described above. Working from DNS traffic data, it extends existing IP address analysis with the number of fuzzy autonomous systems and an entropy parameter, and proposes methods for computing an IP address confidence score and a domain name confidence. It comprises the following steps:

1) Process the raw data of the DNS server with the Passivedns tool, keeping only the A records returned by the DNS server, and preprocess the data;

The preprocessing consists of processing the domain names and IP addresses and matching each IP address to its autonomous system (AS), using the AS data published by MaxMind as the AS number lookup dictionary.

2) Evaluate each preprocessed domain name with the model trained by the random forest classifier in the Domain-Flux botnet detection method, obtaining a DGA score;

3) Score domain names and IP addresses against a whitelist, a blacklist and a graylist, then cross-score the results to obtain the confidence of each domain name. Specifically:

The whitelist holds the domain names and IP addresses of trusted main server sites; the graylist holds the domain names and IP addresses of public cloud services offered by reasonably trustworthy companies; the blacklist holds malicious domain names and IP addresses known to be controlled by botnet owners;

Domain names and IP addresses are scored as follows. If the domain name is on the whitelist its confidence is increased by 1, if it is on the blacklist its confidence is decreased by 1, and if it is on the graylist its confidence is increased by 0.5; the result is recorded as $P_{with}$. The IP address is scored by the same rule and the result is recorded as $P_{geo}$. If the domain name or IP address is on none of the three lists, its confidence is zero.

The cross-scoring combines the domain name score and the IP score into the confidence of the domain name:

$$P_{domain} = \lambda P_{geo} + \mu P_{with} \qquad (4)$$

where $P_{domain}$ is the confidence of the domain name, $P_{geo}$ is the result of checking the IP address against the three lists, $P_{with}$ is the result of checking the domain name against the three lists, $\lambda$ is the weight of $P_{geo}$ and $\mu$ is the weight of $P_{with}$.

4) Compute the time characteristics of the IP addresses. Specifically:

The time characteristic of the IP addresses is the number of times the domain name's IP addresses have been resolved.

The statistical characteristics of the IP addresses are also computed.

A single formula is used to compute both the numerical characteristic of the IP addresses and the distribution characteristic of the autonomous systems (AS) they belong to.

In that formula, when computing the numerical characteristic of the IP addresses, X is an IP address of the domain name, α is the mean of the IP addresses and N is the number of IP addresses of the domain name; when computing the distribution characteristic of the autonomous systems, X is an autonomous system of the domain name, α is the mean over the autonomous systems and N is the number of autonomous systems.

5) Compute the stability of the IP addresses. The stability is computed with a formula that combines the number of queries, the statistical characteristics of the IP addresses and the characteristics of DNS queries.

In that formula, $S$ is the IP address stability; $C_{ip}$ and $C_{hit}$ are the number of IP addresses obtained and the number of queries; both share a single threshold $c_{th}$, the upper limit on the number of IP addresses or queries; and $c_{ip}$ and $c_{hit}$ are the values of the IP address count and the query count after capping at $c_{th}$.

6) Split the DGA score, domain name confidence, IP address time characteristics and IP address stability into a training set and a test set, and feed them to a random forest classifier to obtain the trained model SFF.

The botnet detection method based on DNS traffic characteristics targets the domain name query characteristics of Domain-Flux and Fast-Flux botnets and proposes a DNS-traffic-based detection method for each. It offers a new approach to botnet detection: it applies pronunciation and word-formation features to the detection of malicious DGA domain names and introduces a confidence-based evaluation criterion. This enriches the available means of detecting botnets and improves detection accuracy.

The method of the invention is compared with the original algorithm below:

For Domain-Flux botnets, the experiments are evaluated by prediction accuracy, i.e. the proportion of correctly predicted samples among all samples, together with the metrics derived from the confusion matrix. The baseline is the DGA detection algorithm that does not use the phonetic and grouping features, denoted tDGA; the algorithm of the invention is denoted mDGA.

The accuracy of the invention is first compared with that of the original algorithm. The experiments use k-fold cross-validation with k set to the empirical value 10. The false positive and false negative rates are averaged, which makes it easy to pull out the false positive and false negative cases for further analysis and evaluation.
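The evaluation metrics named above can be computed from a confusion matrix as in the sketch below. The function names are hypothetical, and treating the DGA class as the positive label `1` is an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # benign domain flagged as DGA
        elif truth == positive:
            fn += 1  # DGA domain missed
        else:
            tn += 1
    return tp, fp, tn, fn


def rates(y_true, y_pred, positive=1):
    """Accuracy, false positive rate and false negative rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
    }
```

Averaging these rates over the k folds gives the figures described in the text, while the per-fold `fp` and `fn` cases remain available for closer inspection.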

实验结果如图2所示,本发明的算法记mDGA相比原算法tDGA在准确率上有所提升,但是提升效果并不明显,只提高了0.2%的准确率。这是因为测试集中样本数量较大,正常域名与DGA域名数目大约为3:1的关系,DGA域名较少。对于Fast-Flux僵尸网络,实验结果的评价标准是准确率,可以使用k-fold方法进行多次实验给出平均得分,也可以使用多次随机切分训练集和测试集,给出平均得分。本发明同时使用了这两种方式进行实验效果的检测。The experimental results are shown in Figure 2. Compared with the original algorithm tDGA, the algorithm mDGA of the present invention improves the accuracy rate, but the improvement effect is not obvious, and the accuracy rate is only increased by 0.2%. This is because the number of samples in the test set is large, the relationship between the number of normal domain names and the number of DGA domain names is about 3:1, and the number of DGA domain names is relatively small. For the Fast-Flux botnet, the evaluation standard of the experimental results is the accuracy rate. The k-fold method can be used to conduct multiple experiments to give an average score, or multiple random splits of the training set and test set can be used to give the average score. The present invention uses these two methods at the same time to detect the experimental effect.

The SFF model of the present invention was evaluated with 10-fold cross-validation using the k-fold method: each run yields 10 scores, and the run was repeated six times. In each fold, 90% of the data is used as the training set to train the random forest, and the remaining 10% is used as the test set. The performance of the resulting model is shown in Table 1.

Table 1. Scoring results using the SFF model

The experimental results show that the accuracy is high. Owing to the nature of DNS, queried domain names are widely scattered and numerous while malicious domain names are relatively few, so the whitelist and graylist data in the sample far outnumber the blacklist data. The high accuracy may therefore be a consequence of the whitelist data being classified relatively accurately.
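The effect of such an imbalance on accuracy can be checked with simple arithmetic. In the sketch below the sample counts and per-class recall values are purely hypothetical numbers chosen for illustration; they are not results reported in the patent.

```python
def accuracy(n_benign, n_malicious, benign_recall, malicious_recall):
    """Overall accuracy given class sizes and the per-class recall."""
    correct = n_benign * benign_recall + n_malicious * malicious_recall
    return correct / (n_benign + n_malicious)


# Hypothetical sample: 9000 benign (whitelist/graylist) and 1000 blacklist entries.
# A classifier that labels everything benign already reaches 90% accuracy.
always_benign = accuracy(9000, 1000, benign_recall=1.0, malicious_recall=0.0)

# Detecting half of the malicious entries only adds five percentage points.
half_detected = accuracy(9000, 1000, benign_recall=1.0, malicious_recall=0.5)
```

This is why accuracy alone can look high when the benign class dominates, and why per-class rates are worth reporting alongside it.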

For comparison, Etienne Stalmans' fastfluxanalysis project was used; it provides a set of algorithms for judging whether a domain name is a Fast-Flux domain name. The comparison shows that using only geographic location information, without considering the correlation between IPs and domain names, degrades the detection of botnet C&C nodes. This indicates that the IP and domain name confidence proposed by the present invention improves the detection of botnet control nodes to some extent.
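The IP and domain name confidence referred to here is specified in claims 5 and 6: list membership adds 1 (white list), subtracts 1 (black list) or adds 0.5 (gray list), and the two scores are combined with weights λ and μ. A minimal sketch follows; the function names and the default weights of 0.5 are assumptions, since the text gives no values for λ and μ.

```python
def list_score(item, whitelist, blacklist, graylist):
    """Score one domain name or IP against the three lists."""
    score = 0.0
    if item in whitelist:
        score += 1.0
    if item in blacklist:
        score -= 1.0
    if item in graylist:
        score += 0.5
    return score  # stays 0.0 when the item is in none of the lists


def domain_confidence(domain, ip, domain_lists, ip_lists, lam=0.5, mu=0.5):
    """Cross score: P_domain = lam * P_geo + mu * P_with.

    `domain_lists` and `ip_lists` are (whitelist, blacklist, graylist)
    tuples of sets. The weights lam and mu are hypothetical defaults.
    """
    p_with = list_score(domain, *domain_lists)  # domain name score
    p_geo = list_score(ip, *ip_lists)           # IP score
    return lam * p_geo + mu * p_with
```

For example, a whitelisted domain name resolving to a graylisted IP would score 0.5 × 0.5 + 0.5 × 1 = 0.75 under these assumed weights.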

Claims (9)

1. A Domain-Flux botnet detection method based on DNS traffic characteristics, characterized by comprising the following steps:
1) reading domain names, including reading legal domain names and extracting the legal main domain names, reading illegal domain names generated by a domain generation algorithm (DGA) and extracting the illegal main domain names, and combining the legal main domain names and the illegal main domain names to form a target set;
2) processing the obtained target set, extracting the length of each processed domain name, and taking the domain names whose length is greater than 6 as the research objects;
3) calculating a domain name entropy value, a morphology feature, a voice feature and a grouping feature, respectively, to identify random domain names generated by the DGA;
the domain name entropy value is calculated using the Shannon information entropy method, as shown in the following formula:
wherein E is the Shannon information entropy of the domain name, i.e. the degree of dispersion of the different characters in the character string, L is the length of the character string, C_i is the number of occurrences of the letter i, and i is a variable denoting a letter in the character string;
the morphology feature uses the basic N-gram model to evaluate the probability of a sentence: with a domain name expressed as the sequence S = w_1 w_2 w_3 ... w_n, the probability p(S) of the domain name, i.e. the lexical feature, is expressed by the following formula:
wherein w_i denotes the probability of the i-th letter, n is the length of the sequence S, and i is a variable denoting a letter appearing in the character string;
the voice feature takes the ratio of the number of vowels to the total length of the domain name as the statistical characteristic of vowels, as shown in the following formula:
wherein d_i denotes the number of occurrences of vowels, L is the length of the character string, and E is the entropy of the vowels;
the grouping feature is the number of parts into which each domain name is divided according to digits and letters;
4) dividing the obtained domain name entropy value, morphology feature, voice feature and grouping feature into a training set and a test set, and feeding them into a random forest classifier to obtain the trained model mDGA.
2. The method for detecting a Domain-Flux botnet based on DNS traffic characteristics according to claim 1, wherein the processing in step 2) includes removing impurity data and comma-separating the data into a sequence number part and a Domain name part.
3. A Fast-Flux botnet detection method based on the Domain-Flux botnet detection method based on DNS traffic characteristics of claim 1, comprising the following steps:
1) processing the raw DNS data with the Passivedns tool, retaining only the A records returned by DNS, and preprocessing the raw data;
2) evaluating the preprocessed domain names using the trained model obtained from the random forest classifier in the Domain-Flux botnet detection method, to obtain a DGA score;
3) scoring the domain name and the IP using a white list, a black list and a gray list, and then performing cross scoring to obtain the confidence of the domain name, wherein the white list contains the domain names and IPs of trusted main server sites, the gray list stores the domain names and IPs, held with a certain degree of confidence, of companies providing common cloud services, and the black list stores malicious domain names and IPs determined to be controlled by a botnet owner;
4) calculating the time characteristic of the IP address;
5) calculating the stability of the IP address;
6) dividing the obtained DGA score, the domain name confidence, and the time characteristic and stability of the IP address into a training set and a test set, and feeding them into a random forest classifier to obtain the trained model SFF.
4. The method of claim 3, wherein the preprocessing of step 1) comprises processing domain names and IPs, and using the AS information provided by MaxMind as the AS number lookup dictionary for matching the AS information of IPs.
5. The method of claim 3, wherein the scoring of the domain name and the IP in step 3) comprises: adding 1 to the confidence if the domain name is in the white list, subtracting 1 from the confidence if the domain name is in the black list, and adding 0.5 to the confidence if the domain name is in the gray list, the result being denoted P_with; adding 1 to the confidence if the IP is in the white list, subtracting 1 from the confidence if the IP is in the black list, and adding 0.5 to the confidence if the IP is in the gray list, the result being denoted P_geo; the confidence is zero if the domain name and the IP are in none of the white, black and gray lists.
6. The Domain-Flux botnet detection method of claim 3, wherein said cross scoring of step 3) combines the scores for the domain name and the IP to obtain the confidence of the domain name:
P_domain = λ·P_geo + μ·P_with    (4)
wherein P_domain is expressed as the IP confidence, P_geo and P_with respectively indicate whether the IP is in the three lists and whether the domain name is in the three lists, λ is the weight of P_geo, and μ is the weight of P_with.
7. The method of claim 3, wherein the time characteristic of processing IP in step 4) is the resolution times of Domain name IP.
8. The method of Domain-Flux botnet detection according to claim 3, wherein said step 4) of calculating statistical properties of IP addresses comprises:
respectively calculating the numerical characteristics of the IP address and the distribution characteristics of the autonomous domain corresponding to the IP address by using the following formula:
when the distribution characteristics of the autonomous domains corresponding to the IP addresses are calculated, X represents the autonomous domains corresponding to the domain names, alpha represents the average value of the autonomous domains, and N represents the number of the autonomous domains.
9. The Domain-Flux botnet detection method according to claim 3, wherein the stability of the IP address computed in step 5) is given by combining the number of queries, the statistical characteristics of the IP address and the characteristics of DNS queries:
wherein S denotes the IP address stability, C_ip and C_hit respectively denote the number of IP addresses obtained and the number of queries, C_ip and C_hit share the same threshold c_th, i.e. the upper limit on the number of IP addresses or on the number of queries, and c_ip and c_hit are respectively the number of IP addresses and the number of queries as limited by the threshold c_th, i.e. with c_th as their upper limit.
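The character-level features recited in claim 1 can be illustrated with a short sketch. The formulas themselves are not reproduced in this text, so the code rests on assumptions: the entropy is taken as the standard Shannon form E = -Σ (C_i / L) · log2(C_i / L) with a base-2 logarithm, the voice feature as the plain ratio of vowels to length, and the grouping feature as the number of maximal runs of letters or digits. All function names are hypothetical, and the N-gram feature is omitted because it requires a trained letter model.

```python
import math
import re
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(name):
    """Assumed form: E = -sum((C_i / L) * log2(C_i / L)) over distinct characters."""
    length = len(name)
    if length == 0:
        return 0.0
    counts = Counter(name)
    return -sum((c / length) * math.log2(c / length) for c in counts.values())


def vowel_ratio(name):
    """Number of vowels divided by the total length of the name."""
    return sum(ch in VOWELS for ch in name) / len(name) if name else 0.0


def group_count(name):
    """Number of maximal letter runs and digit runs, e.g. 'abc123def' has 3."""
    return len(re.findall(r"[a-z]+|[0-9]+", name))


def extract_features(main_domain):
    """Return the feature dict, or None for names that are not longer than 6."""
    name = main_domain.lower()
    if len(name) <= 6:
        return None
    return {
        "length": len(name),
        "entropy": shannon_entropy(name),
        "vowel_ratio": vowel_ratio(name),
        "groups": group_count(name),
    }
```

Feature vectors of this kind, together with the N-gram probability, would then be split into training and test sets and passed to a random forest classifier as described in step 4) of claim 1.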
CN201610222674.5A 2016-04-11 2016-04-11 Botnet detection method based on DNS traffic characteristics Active CN105897714B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610222674.5A CN105897714B (en) 2016-04-11 2016-04-11 Botnet detection method based on DNS traffic characteristics

Publications (2)

Publication Number Publication Date
CN105897714A CN105897714A (en) 2016-08-24
CN105897714B true CN105897714B (en) 2018-11-09

Family

ID=57012774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610222674.5A Active CN105897714B (en) 2016-04-11 2016-04-11 Botnet detection method based on DNS traffic characteristics

Country Status (1)

Country Link
CN (1) CN105897714B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210701

Address after: Room 819, block C, Hebang building, 899 Tiantong North Road, Zhonghe street, Yinzhou District, Ningbo City, Zhejiang Province, 315100

Patentee after: NINGBO ZHIWEI RUICHI INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 300072 Tianjin City, Nankai District Wei Jin Road No. 92

Patentee before: Tianjin University

TR01 Transfer of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Botnet detection method based on DNS traffic characteristics

Effective date of registration: 20211124

Granted publication date: 20181109

Pledgee: Ningbo Tianjin Enterprise Service Co.,Ltd.

Pledgor: NINGBO ZHIWEI RUICHI INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2021330002348

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20221111

Granted publication date: 20181109

Pledgee: Ningbo Tianjin Enterprise Service Co.,Ltd.

Pledgor: NINGBO ZHIWEI RUICHI INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2021330002348

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Botnet Detection Method Based on DNS Traffic Characteristics

Effective date of registration: 20221114

Granted publication date: 20181109

Pledgee: Ningbo Tianjin Enterprise Service Co.,Ltd.

Pledgor: NINGBO ZHIWEI RUICHI INFORMATION TECHNOLOGY Co.,Ltd.

Registration number: Y2022980021792