WO2025190232A1

WO2025190232A1 - Traffic extraction method and apparatus for vulnerability scanner, and electronic device and storage medium

Info

Publication number: WO2025190232A1
Application number: PCT/CN2025/081660
Authority: WO
Inventors: 曾伟明
Original assignee: Shanghai Douxiang Information Technology Co Ltd
Current assignee: Shanghai Douxiang Information Technology Co Ltd
Priority date: 2024-03-15
Filing date: 2025-03-10
Publication date: 2025-09-18
Anticipated expiration: 2026-09-15
Also published as: CN118075008A

Abstract

The present application relates to the technical field of traffic extraction. Disclosed are a traffic extraction method and apparatus for a vulnerability scanner, and an electronic device and a storage medium. The method comprises: triggering a vulnerability scanner to send request traffic; determining whether the request traffic is redundant traffic, wherein the redundant traffic comprises universal traffic and/or repeated traffic, the repeated traffic being traffic identical to the previously received request traffic, and the universal traffic being request traffic for the vulnerability scanner to verify the survival of a site; and if the request traffic is non-redundant traffic, storing the request traffic. In this way, in the present application, whether request traffic is redundant traffic is determined, i.e., whether the request traffic is invalid traffic is determined, such that the request traffic that is not invalid traffic is stored, thereby reducing invalid traffic among stored request traffic.

Description

Traffic extraction method and apparatus for vulnerability scanner, electronic device, and storage medium

Cross-Reference to Related Applications

This application claims priority to Chinese Patent Application No. 202410300947.8, filed on March 15, 2024 and entitled "Traffic extraction method and apparatus for vulnerability scanner, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.

Technical Field

The present application relates to the technical field of traffic extraction, and in particular to a traffic extraction method and apparatus for a vulnerability scanner, an electronic device, and a storage medium.

Background Art

At present, to help researchers derive test methods for vulnerability verification, the request traffic of a vulnerability scanner is usually captured and stored first. In the related art, a vulnerable target program containing known vulnerabilities is deployed first; such vulnerabilities include XSS (cross-site scripting) vulnerabilities, SQL (Structured Query Language) injection vulnerabilities, and the like. The target program is then placed behind an Nginx reverse proxy (Nginx being a web server built on an asynchronous framework), an IP (Internet Protocol) address is published so that the program can be accessed through HTTP (Hypertext Transfer Protocol) requests, and storage for HTTP traffic logs is configured. Next, a vulnerability scanning task is issued on the vulnerability scanner with the target scan address set to the address published by Nginx, so that the scanner sends vulnerability-scanning request traffic to that address. After receiving the request traffic, Nginx writes the HTTP traffic log to a data file on the server, and a traffic parsing tool parses the log and stores the result, thereby obtaining the request traffic of the vulnerability scanner.

In the course of implementing the embodiments of the present application, at least the following problem was found in the related art:

The stored request traffic of the vulnerability scanner contains a large amount of invalid traffic.

It should be noted that the information disclosed in the Background Art section above is intended only to aid understanding of the background of the present application, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.

Summary of the Invention

To provide a basic understanding of some aspects of the disclosed embodiments, a brief summary is given below. The summary is not an extensive overview, nor is it intended to identify key or critical elements or to delineate the scope of protection of these embodiments; it serves as a prelude to the detailed description that follows.

Embodiments of the present application provide a traffic extraction method and apparatus for a vulnerability scanner, an electronic device, and a storage medium, so as to reduce the invalid traffic in stored request traffic.

An embodiment of the present application provides a traffic extraction method for a vulnerability scanner, including: triggering a vulnerability scanner to send request traffic; determining whether the request traffic is redundant traffic, where the redundant traffic includes universal traffic and/or repeated traffic, the repeated traffic is traffic identical to previously received request traffic, and the universal traffic is request traffic that the vulnerability scanner uses to verify that a site is alive; and, if the request traffic is non-redundant traffic, storing the request traffic.

In the above embodiment, the request traffic sent by the vulnerability scanner contains some test traffic, such as traffic that verifies whether a page is alive, and the scanner may also send repeated request traffic when testing different web pages. Both kinds of traffic are invalid traffic. Therefore, if all request traffic sent by the vulnerability scanner is stored directly, the stored request traffic contains a large amount of invalid traffic. By determining whether request traffic is universal traffic and storing only request traffic that is not universal traffic, the present application removes part of the invalid traffic from the stored request traffic. Similarly, by determining whether request traffic is repeated traffic and storing only request traffic that is not repeated traffic, the present application removes another part of the invalid traffic. When the present application determines whether request traffic is redundant traffic and stores only request traffic that is not redundant traffic, the invalid traffic in the stored request traffic is reduced further.

Further, the redundant traffic includes universal traffic, and determining whether the request traffic is redundant traffic includes: computing a first hash value of the request traffic using a preset hash algorithm; determining whether the first hash value is in a preset universal set; and, if so, determining that the request traffic corresponding to the first hash value is universal traffic.

In the above embodiment, identical request traffic yields identical hash values. The hash values corresponding to universal traffic can therefore be computed in advance and stored in the universal set. Comparing the hash value of subsequently received request traffic with the hash values stored in the universal set then accurately determines whether the request traffic is universal traffic, which in turn allows non-redundant traffic to be stored accurately.

Further, the universal set is obtained as follows: triggering the vulnerability scanner to send sample request traffic carrying a test object; computing hash values of the sample request traffic using a preset hash algorithm and forming hash tables; classifying the hash tables by test object to form N hash sets, where N equals the number of kinds of test object; and identifying the repeated hash tables in each hash set and forming the universal set from the repeated hash tables.

In the above embodiment, when different test objects are tested, the request traffic sent by the vulnerability scanner contains multiple requests for verifying that the site is alive, and these requests are identical. Therefore, by forming hash tables from the received sample request traffic, classifying them by test object to obtain hash sets, and extracting the repeated hash tables from the hash sets, the universal set corresponding to universal traffic can be determined. Received request traffic can subsequently be checked directly against the universal set to determine whether it is universal traffic, which allows non-redundant traffic to be stored accurately.

Further, the test object includes at least one of url, host, and directory.

Further, the redundant traffic includes repeated traffic, and determining whether the request traffic is redundant traffic includes: replacing the paths in the URI field and in the Referer field of the request traffic with a preset path; computing a second hash value of the request traffic after the path replacement using a preset hash algorithm; and checking whether the second hash value is recorded in a preset hash value record table. If not, the request traffic is determined to be non-redundant traffic, the second hash value is added to the hash value record table, and the request traffic corresponding to the second hash value is stored. If so, the request traffic is determined to be repeated traffic.

In the above embodiment, because the vulnerability scanner may send the same request traffic to test different web pages, the received request traffic may contain some repeated traffic. After the paths in the URI field and the Referer field are replaced with the preset path, requests that are otherwise identical produce identical hash values. Comparing the second hash value of the path-replaced request traffic with the second hash values recorded earlier therefore accurately determines whether the received request traffic repeats earlier traffic, which allows non-redundant traffic to be stored accurately.
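The repeated-traffic check described above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the dictionary layout of a request, the placeholder value `PRESET_PATH`, the choice of SHA-256, and all function names are assumptions made for the example.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

PRESET_PATH = "/__preset__"  # assumed placeholder; the source only says "preset path"


def replace_path(url: str) -> str:
    """Replace the path component of a URI or URL with the preset path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, PRESET_PATH, parts.query, parts.fragment))


def second_hash(request: dict) -> str:
    """Hash a request after normalizing the paths in its URI and Referer fields."""
    headers = dict(request.get("headers", {}))
    if "Referer" in headers:
        headers["Referer"] = replace_path(headers["Referer"])
    canonical = "\n".join([
        request["method"],
        replace_path(request["uri"]),
        "\n".join(f"{name}: {value}" for name, value in sorted(headers.items())),
        request.get("body", ""),
    ])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_repeated(request: dict, record_table: set) -> bool:
    """Return True for repeated traffic; otherwise record the hash and return False."""
    digest = second_hash(request)
    if digest in record_table:
        return True
    record_table.add(digest)
    return False
```

With this sketch, two requests that carry the same payload but target `/a` and `/b` hash to the same value, so only the first one would be stored.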

Further, storing the request traffic includes: obtaining the vulnerability name corresponding to the request traffic; tagging the request traffic with the vulnerability name; and storing the tagged request traffic.

In the above embodiment, tagging the request traffic with its vulnerability name before storing it keeps the stored request traffic organized and makes it easier for users to work with later.

Further, storing the request traffic also includes: sending the request traffic tagged with the vulnerability name to a preset message queue, so that a message consumer can obtain the request traffic from the message queue and process it; and obtaining and storing the processing result of the request traffic.

In the above embodiment, sending the request traffic tagged with the vulnerability name to a preset message queue turns request processing into asynchronous processing, which improves the response speed of the system.
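The tagging and asynchronous hand-off can be illustrated with the sketch below. The source does not name a message queue product, so Python's in-process `queue.Queue` stands in for it here; the field name `vuln_name`, the summary format, and the function names are likewise assumptions for illustration.

```python
import queue
import threading


def tag_request(request: dict, vuln_name: str) -> dict:
    """Return a copy of the request tagged with its vulnerability name."""
    return {**request, "vuln_name": vuln_name}


def consume(messages: queue.Queue, results: list) -> None:
    """Message consumer: process tagged requests until a None sentinel arrives."""
    while True:
        item = messages.get()
        if item is None:
            messages.task_done()
            break
        # Stand-in for real processing; only a short summary is kept here.
        results.append({"vuln_name": item["vuln_name"],
                        "summary": f'{item["method"]} {item["uri"]}'})
        messages.task_done()


def run_demo() -> list:
    messages = queue.Queue()
    results = []
    worker = threading.Thread(target=consume, args=(messages, results))
    worker.start()
    # The producer returns immediately after enqueueing; processing is asynchronous.
    messages.put(tag_request({"method": "GET", "uri": "/item?id=1"}, "sql_injection"))
    messages.put(None)
    messages.join()
    worker.join()
    return results
```

The producer only enqueues and moves on, which is the property the embodiment relies on to keep the receiving side responsive.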

Further, after triggering the vulnerability scanner to send request traffic, the method further includes: parsing the URI field in the request traffic; determining a response template according to the URI field; and sending the response template to the vulnerability scanner.

In the above embodiment, a vulnerability scanner may stop sending request traffic if it does not receive a corresponding response after sending a request. Therefore, by constructing in advance the response templates corresponding to the URI fields in the request traffic, every request from the vulnerability scanner can be answered correctly, the scanner does not stop sending request traffic, and more request traffic can be obtained.
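A minimal sketch of selecting a response template from the URI follows. The template contents, the lookup by exact path, and the fallback `DEFAULT_TEMPLATE` are hypothetical; the source states only that a template is determined from the URI field.

```python
from urllib.parse import urlsplit

# Hypothetical templates keyed by URI path; the source does not list concrete ones.
RESPONSE_TEMPLATES = {
    "/login": "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<form method=\"post\"></form>",
    "/api/user": "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{\"id\": 1}",
}
# Assumed fallback so that every request still receives a response.
DEFAULT_TEMPLATE = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nok"


def choose_response_template(uri: str) -> str:
    """Pick the response template for the path part of the request URI."""
    path = urlsplit(uri).path
    return RESPONSE_TEMPLATES.get(path, DEFAULT_TEMPLATE)
```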

An embodiment of the present application provides a traffic extraction apparatus for a vulnerability scanner, including: a trigger module configured to trigger a vulnerability scanner to send request traffic; a determination module configured to determine whether the request traffic is redundant traffic, where the redundant traffic includes universal traffic and/or repeated traffic, the repeated traffic is traffic identical to previously received request traffic, and the universal traffic is request traffic that the vulnerability scanner uses to verify that a site is alive; and a storage module configured to store the request traffic if the request traffic is non-redundant traffic.

An embodiment of the present application provides an electronic device including a processor and a memory, where the memory stores computer-executable instructions that can be executed by the processor, and the processor executes the computer-executable instructions to implement the above traffic extraction method for a vulnerability scanner.

The above general description and the following description are exemplary and explanatory only and are not intended to limit the present application.

Brief Description of the Drawings

One or more embodiments are illustrated by way of example in the corresponding accompanying drawings. These illustrations and drawings do not limit the embodiments. Elements with the same reference numerals in the drawings denote similar elements, and the drawings are not limited as to scale. In the drawings:

FIG. 1 is a schematic diagram of a traffic extraction method for a vulnerability scanner provided in an embodiment of the present application;

FIG. 2 is a schematic diagram of another traffic extraction method for a vulnerability scanner provided in an embodiment of the present application;

FIG. 3 is a schematic diagram of a traffic extraction apparatus for a vulnerability scanner provided in an embodiment of the present application;

FIG. 4 is a schematic diagram of an electronic device provided in an embodiment of the present application.

Reference numerals: trigger module 1; determination module 2; storage module 3; processor 4; memory 5; communication interface 6; bus 7.

Detailed Description

To allow the features and technical content of the embodiments of the present application to be understood in more detail, the implementation of the embodiments is described below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present application. In the following technical description, numerous details are given for ease of explanation so as to provide a full understanding of the disclosed embodiments. One or more embodiments may nevertheless be practiced without these details. In other cases, well-known structures and devices may be shown in simplified form to simplify the drawings.

In the description and claims of the embodiments of the present application and in the above drawings, the terms "first", "second", and the like are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that terms used in this way are interchangeable where appropriate in the embodiments of the present application described herein. In addition, the terms "including" and "having", and any variations thereof, are intended to cover non-exclusive inclusion.

Unless otherwise stated, the term "plurality" means two or more.

The term "correspond" may refer to an association or binding relationship: A corresponding to B means that there is an association or binding relationship between A and B.

Embodiment 1

An embodiment of the present application provides a traffic extraction method for a vulnerability scanner. Referring to FIG. 1, which is a schematic diagram of the basic flow of the method, the method includes:

Step S101: trigger the vulnerability scanner to send request traffic.

In some embodiments, for a vulnerability scanner that provides an API (Application Programming Interface), the scanner can be controlled by calling the API of the vulnerability scanner to be triggered.

In other embodiments, for a vulnerability scanner that does not provide an API, the scanner can be controlled by a program that simulates browser clicks on the scanner's interface or simulates the scanner's control requests.

For example, engineers can configure vulnerability scanning templates for the vulnerability scanner in advance according to vulnerability name, test object, vulnerability category, and so on, with each vulnerability scanning template corresponding to a preset tag type ID (identifier). When the vulnerability scanner receives a test request, it sends request traffic according to the vulnerability scanning template corresponding to that test request, and the request traffic carries the tag type ID of that template.

The format of the tag type ID may be scanner/{vulnerability category}/{test object}/{vulnerability name}. For example, in the tag type ID scanner/web/per_url/sql_injection, scanner denotes the scanner, web is a vulnerability category, per_url is a test object, and sql_injection is a vulnerability name (the SQL injection vulnerability). The vulnerability scanning template corresponding to this tag type ID is an SQL injection template that performs an SQL injection test on every URL.
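As an illustration of this format, a tag type ID can be split into its three components as sketched below. The `TagTypeId` type and `parse_tag_type_id` function are hypothetical names introduced for the example; only the four-segment layout comes from the text above.

```python
from typing import NamedTuple


class TagTypeId(NamedTuple):
    category: str     # vulnerability category, e.g. "web"
    test_object: str  # test object, e.g. "per_url"
    vuln_name: str    # vulnerability name, e.g. "sql_injection"


def parse_tag_type_id(tag: str) -> TagTypeId:
    """Parse 'scanner/{category}/{test object}/{vulnerability name}'."""
    parts = tag.split("/")
    if len(parts) != 4 or parts[0] != "scanner" or not all(parts):
        raise ValueError(f"not a valid tag type ID: {tag!r}")
    return TagTypeId(*parts[1:])
```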

Vulnerability categories include, for example, security vulnerabilities of the Web platform and security vulnerabilities of the application itself.

The test object includes at least one of url, host, and directory. When the test object is url, the request traffic is classified or processed according to the request URI. When the test object is host, the request traffic is classified or processed according to the host name or IP address of the request. When the test object is directory, the request traffic is classified or processed according to the directory path of the requested resource.

Vulnerability names include, for example: SQL injection vulnerabilities, XSS (cross-site scripting) vulnerabilities, code injection vulnerabilities, code execution vulnerabilities, file creation, file deletion, file inclusion, and file upload vulnerabilities, CRLF (carriage return and line feed) injection vulnerabilities (HTTP response splitting), directory traversal vulnerabilities, LDAP (Lightweight Directory Access Protocol) injection vulnerabilities, SSRF (server-side request forgery) vulnerabilities, URL (Uniform Resource Locator) redirection vulnerabilities, XML (Extensible Markup Language) external entity vulnerabilities, XPath (XML Path Language) injection vulnerabilities, Java deserialization vulnerabilities, the Apache Log4j remote code execution vulnerability, directory listing vulnerabilities, and session fixation vulnerabilities. The vulnerability name may also be any other existing vulnerability name, which is not limited here.

In the above example, the tag type ID corresponding to the vulnerability scanning template may be carried in a request header of the request traffic.

Step S102: determine whether the request traffic is redundant traffic.

The redundant traffic may include universal traffic and/or repeated traffic.

Repeated traffic may be traffic identical to previously received request traffic. Universal traffic may be request traffic that the vulnerability scanner uses to verify that a site is alive.

In some embodiments, the redundant traffic includes universal traffic. Step S102 may be: computing a first hash value of the request traffic using a preset hash algorithm; determining whether the first hash value is in a preset universal set; and, if so, determining that the request traffic corresponding to the first hash value is universal traffic. Correspondingly, if not, the request traffic corresponding to the first hash value is determined not to be universal traffic.

For example, an existing hash algorithm may be used, such as MD5 (Message Digest Algorithm 5), SHA-1 (Secure Hash Algorithm 1), SHA-3 (Secure Hash Algorithm 3), or CRC32 (32-bit cyclic redundancy check), which is not limited here.
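The universal-traffic check in step S102 can be sketched as follows, assuming the raw request bytes are hashed as a whole and the universal set is held as an in-memory set of hex digests. The function names are hypothetical, and SHA-1 is used only because it is one of the algorithms listed above.

```python
import hashlib


def first_hash(raw_request: bytes, algorithm: str = "sha1") -> str:
    """Compute the first hash value with a hashlib algorithm such as 'sha1' or 'sha3_256'."""
    return hashlib.new(algorithm, raw_request).hexdigest()


def is_universal(raw_request: bytes, universal_set: set, algorithm: str = "sha1") -> bool:
    """Universal traffic is traffic whose first hash value is in the preset universal set."""
    return first_hash(raw_request, algorithm) in universal_set
```

CRC32 is not exposed through `hashlib`; in Python it would be computed with `zlib.crc32` instead.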

In one optional implementation of the above embodiment, the universal set can be obtained as follows: trigger the vulnerability scanner to send sample request traffic carrying a test object; compute hash values of the sample request traffic using a preset hash algorithm and form hash tables; classify the hash tables by test object to form N hash sets; identify the repeated hash tables in each hash set; and form the universal set from the repeated hash tables. Here, N equals the number of kinds of test object.

In this optional implementation, computing hash values of the sample request traffic and forming hash tables may be done as follows. For each request: obtain the vulnerability name and test object corresponding to the request, and compute a third hash value over specific fields of the request using the preset hash algorithm. The third hash values of requests with the same vulnerability name and the same test object are then grouped into one hash table. The specific fields can be selected based on engineering experience, for example the request path field (Path), the request body field (body), the request header field (Header), the request method field (method), and the request protocol field (schema).
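The grouping of third hash values into hash tables can be sketched as below. The choice of fields in `SPECIFIC_FIELDS`, the use of SHA-256, the dictionary layout of a sample, and the representation of a hash table as a Python set are all assumptions for illustration.

```python
import hashlib
from collections import defaultdict

# Assumed selection; the source leaves the choice of fields to the engineer.
SPECIFIC_FIELDS = ("method", "path", "body")


def third_hash(request: dict) -> str:
    """Hash only the selected fields of a sample request."""
    material = "\n".join(str(request.get(field, "")) for field in SPECIFIC_FIELDS)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def build_hash_tables(samples: list) -> dict:
    """Group third hash values by (vulnerability name, test object)."""
    tables = defaultdict(set)
    for sample in samples:
        key = (sample["vuln_name"], sample["test_object"])
        tables[key].add(third_hash(sample))
    return dict(tables)
```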

For example, the vulnerability scanner sends request traffic carrying a tag type ID that contains the vulnerability name and the test object. The vulnerability name corresponding to the request traffic can then be read from the tag type ID of the request traffic, and the test object can be read from the tag type ID in the same way.

In this optional implementation, classifying the hash tables by test object to form N hash sets may be done by obtaining the test object corresponding to each hash table and grouping the hash tables that share a test object into one hash set.

In this optional implementation, identifying the repeated hash tables in each hash set and forming the universal set from them may be done as follows. For each hash table: compare the hash table in turn with the other hash tables in the same hash set, and, if any other hash table is identical or similar to it, treat the hash table as a repeated hash table. All repeated hash tables then form the universal set.

Whether two hash tables are similar or identical can be determined as follows: find the third hash values that appear in both hash tables and count them. If the ratio of this count to the total number of third hash values is greater than a preset ratio, the two hash tables are determined to be similar. If the ratio equals 1, the two hash tables are determined to be identical. Here, the total number of third hash values is the total number of third hash values in one of the two hash tables being compared. The preset ratio can be selected based on engineering experience.
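The comparison above can be sketched as follows, under several assumptions: each hash table is a Python set, the total is taken from the table being examined, the default preset ratio of 0.8 is arbitrary, and "forming the universal set" is read as taking the union of the hash values in the repeated tables. All function names are hypothetical.

```python
def overlap_ratio(table: set, other: set) -> float:
    """Shared hash values divided by the size of the table being examined."""
    if not table:
        return 0.0
    return len(table & other) / len(table)


def repeated_tables(hash_set: dict, preset_ratio: float = 0.8) -> list:
    """Names of tables that are identical or similar to another table in the same hash set."""
    repeated = []
    for name, table in hash_set.items():
        for other_name, other in hash_set.items():
            if other_name == name:
                continue
            ratio = overlap_ratio(table, other)
            if ratio == 1.0 or ratio > preset_ratio:
                repeated.append(name)
                break
    return repeated


def universal_set_from(hash_set: dict, preset_ratio: float = 0.8) -> set:
    """Union of the hash values of all repeated tables (an assumed reading of the text)."""
    universal = set()
    for name in repeated_tables(hash_set, preset_ratio):
        universal |= hash_set[name]
    return universal
```

Note that, with the total taken from only one table, a ratio of 1 means that every hash value of that table also appears in the other table.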

After the third hash value of a sample request is computed using the preset hash algorithm, it can be stored in a cache component (Redis) as a HASH structure, so that the stored hash values can later be used to form the universal set.

In another optional implementation of the above embodiment, the universal set can also be obtained as follows: trigger the vulnerability scanner to send sample request traffic carrying a test object; compute a fourth hash value for each sample request using a preset hash algorithm; classify the fourth hash values by test object to form N hash sets; and select, in each hash set, the fourth hash values that are repeated a preset number of times and form the universal set from them. The preset number of times can be selected based on engineering experience; for example, it may equal the number of kinds of vulnerability name.
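A sketch of this count-based variant is given below. It assumes each sample has already been reduced to a `(test_object, fourth_hash)` pair, and it treats "repeated a preset number of times" as "occurring at least that many times", which is an interpretation rather than something the text states.

```python
from collections import Counter, defaultdict


def universal_by_count(samples: list, preset_times: int) -> set:
    """Collect fourth hash values seen at least preset_times within one test object."""
    per_object = defaultdict(Counter)  # one counter per hash set (test object)
    for test_object, fourth_hash in samples:
        per_object[test_object][fourth_hash] += 1
    universal = set()
    for counter in per_object.values():
        universal |= {h for h, count in counter.items() if count >= preset_times}
    return universal
```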

In yet another optional implementation of the above embodiment, the universal set can be obtained as follows: trigger the vulnerability scanner to send sample request traffic carrying a test object and a vulnerability name; compute a fourth hash value for each sample request using a preset hash algorithm; classify the fourth hash values by test object and vulnerability name to form multiple hash tables; classify the hash tables by test object to form N hash sets; and, for each hash set, find the fourth hash values that are repeated across different hash tables in that hash set and form the universal set from the repeated fourth hash values.

For example, a field indicating whether the task is an initialization task is added to the request header of the request traffic: for an initialization task, the field Task-Mode: init-task is inserted. Request traffic carrying the field Task-Mode: init-task is treated as sample request traffic.

In the above optional implementation, classifying the fourth hash values by test object and vulnerability name to form multiple hash tables may mean grouping the fourth hash values that share both the same vulnerability name and the same test object into one hash table. Detecting a single vulnerability name may require several sample requests, all of which share the same vulnerability name and test object, so this grouping yields one hash table per vulnerability name. Under the same test object, the sample requests sent for each vulnerability name include requests that verify whether the site is alive. The universal traffic can therefore be identified by selecting the fourth hash values that are repeated across different hash tables within a hash set.

In the above optional implementation, classifying the hash tables by test object to form N hash sets may mean obtaining the test object of each hash table and grouping the hash tables that share the same test object into one hash set.

For example, forming the universal set from the repeated fourth hash values may mean that each hash set has its own universal set, labeled with the test object of that hash set. Accordingly, determining whether the first hash value is in the preset universal set may mean obtaining the test object of the request traffic corresponding to the first hash value and determining whether the first hash value is in the universal set labeled with that test object.

Alternatively, forming the universal set from the repeated fourth hash values may mean selecting, for each hash set separately, the fourth hash values that are repeated across different hash tables in that hash set, and combining all selected fourth hash values into a single universal set.

It can be understood that, after the sample request traffic is hashed with the preset hash algorithm to obtain the fourth hash value, the fourth hash value can be stored in the cache component as a HASH structure, so that the fourth hash values can later be used to form the universal set. Once the universal set has been obtained, it can be stored in the cache component as a MAP structure, with the test object as the key and the universal set as the value.

For example, several requests sent by the vulnerability scanner are obtained and parsed to obtain the test object and vulnerability name of each. Request A has test object per_url and vulnerability name SQL injection vulnerability. Request B has test object per_url and vulnerability name SQL injection vulnerability. Request C has test object per_host and vulnerability name code injection vulnerability. Request D has test object per_host and vulnerability name code injection vulnerability. Request E has test object per_url and vulnerability name directory traversal vulnerability. Request F has test object per_url and vulnerability name directory traversal vulnerability. Each sample request is hashed with the preset hash algorithm to obtain its fourth hash value: request A corresponds to fourth hash value a, request B to b, request C to c, request D to d, request E to e, and request F to f. The fourth hash values a and b of sample requests A and B, which share the same vulnerability name and test object, form hash table G. The fourth hash values c and d of sample requests C and D, which share the same vulnerability name and test object, form hash table H. The fourth hash values e and f of sample requests E and F, which share the same vulnerability name and test object, form hash table I. The hash tables are then classified by test object to form N hash sets; for example, hash tables G and I, which share the test object per_url, form hash set J. Finally, the fourth hash values that appear in both hash table G and hash table I of hash set J are selected and become part of the universal set.
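The grouping in this example can be sketched in a few lines. The function name `build_universal_sets` and the tuple layout of the input are assumptions made for illustration; hash values are passed in as plain strings so that the sketch does not depend on any particular hash algorithm.

```python
from collections import defaultdict
from itertools import combinations


def build_universal_sets(samples):
    """samples: iterable of (test_object, vulnerability_name, hash_value)."""
    # One hash table per (test object, vulnerability name) pair.
    hash_tables = defaultdict(set)
    for test_object, vuln_name, hash_value in samples:
        hash_tables[(test_object, vuln_name)].add(hash_value)

    # One hash set per test object, holding that object's hash tables.
    hash_sets = defaultdict(list)
    for (test_object, _vuln_name), table in hash_tables.items():
        hash_sets[test_object].append(table)

    # Hash values repeated across different tables of the same hash set.
    universal = {}
    for test_object, tables in hash_sets.items():
        repeated = set()
        for left, right in combinations(tables, 2):
            repeated |= left & right
        universal[test_object] = repeated
    return universal
```

This variant keeps one universal set per test object, as in the per-test-object option described earlier; taking the union of the dictionary values would give the single combined universal set instead.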

In some embodiments, the redundant traffic includes repeated traffic. Determining whether the request traffic is redundant traffic may include: replacing the path in the URI field and the path in the Referer field of the request traffic with a preset path; hashing the request traffic after path replacement with the preset hash algorithm to obtain a second hash value; and checking whether the second hash value is recorded in a preset hash value record table. If it is not recorded, the request traffic is determined to be non-redundant traffic, the second hash value is added to the hash value record table, and the request traffic corresponding to the second hash value is stored. If it is recorded, the request traffic is determined to be repeated traffic.

For example, after the request traffic sent by the vulnerability scanner is received, it is parsed into fields such as the method field, host field, URI field, path field, content field, query field, headers field, scheme field, and port field. The path in the URI field and the path in the Referer field (within the headers field) are then replaced with the preset path. For example, /auth/index/ is replaced with the preset /nTE6Enq1L9; /form_data_post is replaced with the preset /nTE6Enq1L9; and http://xxx.com/form_data_post in the Referer field is replaced with http://xxx.com/nTE6Enq1L9.

Here, the method field represents the request method, the host field the target host, the URI field the request URI, the path field the request path, the content field the request body, the query field the request parameters, the headers field the request headers, the scheme field the request protocol, the port field the target port, and the Referer field the HTTP source address.
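The path replacement and record-table check can be sketched as below. The preset path /nTE6Enq1L9 comes from the text; everything else is an assumption for illustration: SHA-256 as the preset hash algorithm, the canonical serialization used before hashing, a Python set as the hash value record table, and the helper names `replace_path`, `second_hash`, and `is_repeated`.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

PRESET_PATH = "/nTE6Enq1L9"  # preset path taken from the example in the text


def replace_path(url: str) -> str:
    # Swap only the path; scheme, host, query, and fragment are kept.
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, parts.netloc, PRESET_PATH, parts.query, parts.fragment)
    )


def second_hash(method: str, uri: str, headers: dict, body: bytes = b"") -> str:
    # Assumption: the request is serialized as method, URI, sorted headers, body.
    normalized = dict(headers)
    if "Referer" in normalized:
        normalized["Referer"] = replace_path(normalized["Referer"])
    lines = [method, replace_path(uri)]
    lines += [f"{name}: {value}" for name, value in sorted(normalized.items())]
    canonical = "\n".join(lines).encode("utf-8") + b"\n\n" + body
    return hashlib.sha256(canonical).hexdigest()


def is_repeated(record_table: set, digest: str) -> bool:
    # A digest already in the record table marks repeated traffic;
    # otherwise it is recorded and the request counts as non-redundant.
    if digest in record_table:
        return True
    record_table.add(digest)
    return False
```

Because the path is replaced before hashing, two requests that differ only in path (and in the path of their Referer) produce the same second hash value, while a difference in query parameters or body still yields a different value.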

Step S103: if the request traffic is non-redundant traffic, store the request traffic.

In some embodiments, storing the request traffic may include: obtaining the vulnerability name corresponding to the request traffic; marking the request traffic with the vulnerability name; and storing the request traffic marked with the vulnerability name.

For example, the vulnerability name corresponding to the request traffic is obtained by parsing the tag type ID in the request header, and the request traffic is marked with that vulnerability name. For instance, if the tag type ID of the request traffic is scanner/web/per_url/sql_injection, where sql_injection denotes an SQL injection vulnerability, the request traffic is marked with the vulnerability name SQL injection vulnerability.
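A minimal sketch of this parsing step follows. It assumes that the last segment of the tag type ID identifies the vulnerability and that the second-to-last segment is the test object (suggested by per_url in the example, but not stated in the text); the mapping table `VULN_NAMES` and the function name `parse_tag_type_id` are hypothetical.

```python
# Hypothetical mapping from tag segments to display names.
VULN_NAMES = {
    "sql_injection": "SQL injection vulnerability",
    "code_injection": "code injection vulnerability",
    "directory_traversal": "directory traversal vulnerability",
}


def parse_tag_type_id(tag_type_id: str):
    """Return (test_object, vulnerability_name) for a tag type ID."""
    segments = tag_type_id.strip("/").split("/")
    if len(segments) < 2:
        raise ValueError(f"unexpected tag type ID: {tag_type_id!r}")
    # Assumption: '<...>/<test_object>/<vulnerability_key>' layout.
    test_object, vuln_key = segments[-2], segments[-1]
    # Unknown keys fall back to the raw segment instead of failing.
    return test_object, VULN_NAMES.get(vuln_key, vuln_key)
```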

In the above embodiment, storing the request traffic may further include: sending the request traffic marked with the vulnerability name to a preset message queue, so that a message consumer can obtain the request traffic from the message queue and process it; and obtaining and storing the processing result of the request traffic.

For example, the vulnerability name corresponding to the request traffic is obtained by parsing the tag type ID in the request header, and the request traffic is marked with that vulnerability name. The marked request traffic is sent to a preset message queue; a message consumer obtains the request traffic from the queue, processes it, and saves the result to a preset database or a preset file.
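The producer/consumer hand-off can be illustrated with Python's in-process `queue.Queue`. The text does not name a message broker, so the in-memory queue, the `None` shutdown sentinel, the shape of a "processing result", and the function names `consume` and `run_pipeline` are all assumptions of this sketch.

```python
import queue
import threading


def consume(q: queue.Queue, results: list) -> None:
    # Message consumer: take tagged requests until the sentinel arrives.
    while True:
        item = q.get()
        if item is None:  # assumption: None marks the end of the stream
            q.task_done()
            break
        # Placeholder processing: keep the tag and the request size.
        results.append(
            {"vulnerability": item["vulnerability"], "size": len(item["raw"])}
        )
        q.task_done()


def run_pipeline(tagged_requests) -> list:
    q = queue.Queue()
    results = []  # stands in for the preset database or file
    worker = threading.Thread(target=consume, args=(q, results))
    worker.start()
    for request in tagged_requests:
        q.put(request)
    q.put(None)
    worker.join()
    return results
```

Decoupling storage from capture in this way lets the component that answers the scanner return quickly, while the slower processing and persistence run in the consumer.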

In some embodiments, after the vulnerability scanner is triggered to send request traffic, the traffic extraction method may further include: parsing the URI field in the request traffic; determining a response template according to the URI field; and sending the response template to the vulnerability scanner.

In the above embodiment, determining the response template according to the URI field may mean looking up the URI field in a preset response database to obtain the corresponding response template. The response database stores at least the correspondence between URI fields and response templates.

In an optional implementation of the above embodiment, the response templates include, for example, a root directory template and sub-directory templates. When the URI field refers to the root directory, the root directory template is returned; the root directory template may contain the condition for navigating to a sub-directory template. When the URI field is a specified URI, the corresponding sub-directory template is returned directly. A sub-directory template may be plain response content, or it may be a page built from custom response content so that it meets the conditions under which the vulnerability scanner sends request traffic.

For example, the basic response page template for the root directory is the response content returned when the vulnerability scanner requests the root directory. That is, the root directory template is the root directory page returned to the vulnerability scanner.

When a sub-directory template is a page built from custom response content so that it meets the conditions under which the vulnerability scanner sends request traffic, the sub-directory template may be a dictionary containing the specified URI, the response content, and the response status code.

Take /auth/index/ and /auth/ as an example. /auth/index/ is the specified URI of a response template: when the vulnerability scanner accesses this URI, the corresponding sub-directory template is returned. When a sub-directory template is constructed, a trigger tag for it usually also has to be inserted into the root directory template, for example <a href="/auth/index">auth</a>. When the vulnerability scanner finds the a tag in the root directory template, it requests the link in that tag and thereby requests the /auth/index/ directory path. After confirming that the page exists, the vulnerability scanner generally splits the path and accesses each parent directory, which triggers a request for /auth/. The /auth/ page then returns the WWW-Authenticate field to indicate that the page requires authorization, at which point the vulnerability scanner uses the AUTH authorization method to run a brute-force authorization test against the page or an injection test against the Basic Auth field.
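The template lookup described above can be sketched with plain dictionaries. The URIs, the trigger tag, and the WWW-Authenticate field come from the text; the 401 and 404 status codes, the realm string, the fallback for unknown URIs, the trailing-slash normalization, and the names `ROOT_TEMPLATE`, `SUB_TEMPLATES`, and `select_template` are assumptions of this sketch.

```python
# Root directory template carrying the trigger tag for /auth/index/.
ROOT_TEMPLATE = {
    "status": 200,
    "headers": {},
    "body": '<html><body><a href="/auth/index">auth</a></body></html>',
}

# Sub-directory templates keyed by specified URI.
SUB_TEMPLATES = {
    "/auth/index/": {
        "status": 200,
        "headers": {},
        "body": "<html><body>auth index</body></html>",
    },
    "/auth/": {
        "status": 401,  # assumption: 401 accompanies WWW-Authenticate
        "headers": {"WWW-Authenticate": 'Basic realm="test"'},
        "body": "",
    },
}

# Assumption: unknown URIs get a 404; the text does not cover this case.
NOT_FOUND = {"status": 404, "headers": {}, "body": ""}


def select_template(uri: str) -> dict:
    # Drop the query string and normalize to a trailing slash so that
    # /auth/index and /auth/index/ resolve to the same template.
    path = uri.split("?", 1)[0].rstrip("/") + "/"
    if path == "/":
        return ROOT_TEMPLATE
    return SUB_TEMPLATES.get(path, NOT_FOUND)
```

With these templates, a scanner that crawls the root page follows the a tag to /auth/index, then probes the parent directory /auth/ and receives the authorization challenge, which is the sequence the paragraph describes.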

The following table gives examples, for the case where the URI field is a specified URI, of the correspondence between specified URIs, their response content, and the content that must be inserted into the root directory template. For example, when the specified URI is "/query_get/idnex?id=hello", the corresponding response content is "status code: 200", and the content inserted into the root directory template is "<a href="/query_get/idnex.php?id=hello">auth</a>".

Table 1

Embodiment 2

This embodiment of the present application provides another traffic extraction method for a vulnerability scanner. FIG. 2 is a schematic diagram of the basic flow of this method, which includes:

Step S201: trigger the vulnerability scanner to send request traffic, then execute step S202.

Step S202: after the request traffic is received, extract the Task-Mode field from its request header using the traffic parsing function, then execute step S203.

Step S203: determine, according to the Task-Mode field, whether the task corresponding to the request traffic is an initialization task, then execute step S204.

Step S204: if the task corresponding to the request traffic is not an initialization task, hash the request traffic with the preset hash algorithm to obtain the first hash value, then execute step S205.

Step S205: determine whether the first hash value is in the preset universal set. If it is not, execute step S206. If it is, execute step S210.

Step S206: replace the path in the URI field and the path in the Referer field of the request traffic with the preset path; hash the request traffic after path replacement with the preset hash algorithm to obtain the second hash value, then execute step S207.

Step S207: check whether the second hash value is recorded in the preset hash value record table. If it is not recorded, execute step S208. If it is recorded, execute step S210.

Step S208: determine that the request traffic is non-redundant traffic, add the second hash value to the hash value record table, and store the request traffic corresponding to the second hash value, then execute step S209.

Step S209: obtain the vulnerability name corresponding to the request traffic; mark the request traffic with the vulnerability name and store the marked request traffic, then execute step S210.

Step S210: parse the URI field in the request traffic; determine the response template according to the URI field; and send the response template to the vulnerability scanner.
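The decision logic of steps S202 to S208 can be condensed into one function. This is a sketch under stated assumptions, not the claimed method itself: SHA-256 stands in for the preset hash algorithm, the request is a plain dictionary whose "normalized" entry already holds the bytes after path replacement, initialization-task requests are simply reported as samples, and the names `sha256_hex` and `classify_request` are hypothetical. Tagging (S209) and the response (S210) are left out.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    # Assumption: SHA-256 plays the role of the preset hash algorithm.
    return hashlib.sha256(data).hexdigest()


def classify_request(request, universal_set, record_table, stored):
    """Return 'sample', 'universal', 'repeated', or 'stored'."""
    # S202/S203: read Task-Mode to detect initialization tasks.
    if request["headers"].get("Task-Mode") == "init-task":
        return "sample"

    # S204/S205: the first hash is checked against the universal set.
    if sha256_hex(request["raw"]) in universal_set:
        return "universal"

    # S206/S207: the second hash is computed on the path-replaced request.
    second = sha256_hex(request["normalized"])
    if second in record_table:
        return "repeated"

    # S208: record the hash and keep the non-redundant request.
    record_table.add(second)
    stored.append(request)
    return "stored"
```

In every branch the flow in FIG. 2 still ends with step S210, so a caller would send the response template to the scanner regardless of the returned label.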

Embodiment 3

Based on the same inventive concept, this embodiment of the present application provides a traffic extraction apparatus for a vulnerability scanner. As shown in FIG. 3, the apparatus includes a triggering module 1, a determining module 2, and a storage module 3. The triggering module 1 is configured to trigger the vulnerability scanner to send request traffic. The determining module 2 is configured to determine whether the request traffic is redundant traffic, where the redundant traffic includes universal traffic and/or repeated traffic, the repeated traffic is traffic identical to previously received request traffic, and the universal traffic is request traffic used by the vulnerability scanner to verify that a site is alive. The storage module 3 is configured to store the request traffic if it is non-redundant traffic.

In some embodiments, the redundant traffic includes universal traffic. The determining module 2 is configured to determine whether the request traffic is redundant traffic by: hashing the request traffic with the preset hash algorithm to obtain the first hash value; determining whether the first hash value is in the preset universal set; and, if so, determining that the request traffic corresponding to the first hash value is universal traffic.

In some embodiments, the traffic extraction apparatus further includes a universal set acquisition module, configured to obtain the universal set by: triggering the vulnerability scanner to send sample request traffic carrying a test object; hashing each sample request with the preset hash algorithm and forming hash tables; classifying the hash tables by test object to form N hash sets, where N equals the number of test object types; and selecting the repeated hash tables in each hash set and forming the universal set based on them.

In some embodiments, the redundant traffic includes repeated traffic. The determining module 2 is configured to determine whether the request traffic is redundant traffic by: replacing the path in the URI field and the path in the Referer field of the request traffic with the preset path; hashing the request traffic after path replacement with the preset hash algorithm to obtain the second hash value; and checking whether the second hash value is recorded in the preset hash value record table. If it is not recorded, the request traffic is determined to be non-redundant traffic, the second hash value is added to the hash value record table, and the request traffic corresponding to the second hash value is stored. If it is recorded, the request traffic is determined to be repeated traffic.

In some embodiments, the storage module 3 is configured to store the request traffic by: obtaining the vulnerability name corresponding to the request traffic; marking the request traffic with the vulnerability name; and storing the request traffic marked with the vulnerability name.

In the above embodiment, the storage module 3 is further configured to send the request traffic marked with the vulnerability name to a preset message queue, so that a message consumer can obtain the request traffic from the message queue and process it, and to obtain and store the processing result of the request traffic.

In some embodiments, the traffic extraction apparatus further includes a response module. After the vulnerability scanner is triggered to send request traffic, the response module parses the URI field in the request traffic, determines a response template according to the URI field, and sends the response template to the vulnerability scanner.

It can be understood that the implementations described in Embodiment 1 also apply to Embodiment 3 where they do not conflict; for brevity, they are not repeated here.

Embodiment 4

As shown in FIG. 4, this embodiment of the present application provides an electronic device including a processor 4 and a memory 5. Optionally, the electronic device may further include a communication interface 6 and a bus 7. The processor 4, the communication interface 6, and the memory 5 communicate with one another over the bus 7. The communication interface 6 may be used for information transmission. The processor 4 may invoke logic instructions in the memory 5 to execute the traffic extraction method for a vulnerability scanner of the above embodiments.

In addition, the logic instructions in the memory 5 may be implemented as software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium.

As a computer-readable storage medium, the memory 5 can store software programs and computer-executable programs, such as the program instructions/modules corresponding to the methods in the embodiments of the present application. By running the program instructions/modules stored in the memory 5, the processor 4 executes functional applications and data processing, thereby implementing the traffic extraction method for a vulnerability scanner of the above embodiments.

The memory 5 may include a program storage area and a data storage area. The program storage area may store an operating system and the application required for at least one function; the data storage area may store data created through use of the terminal device. In addition, the memory 5 may include high-speed random access memory and may also include non-volatile memory.

The electronic device may be, for example, a computer or a server.

An embodiment of the present application provides a storage medium storing computer-executable instructions configured to execute the above traffic extraction method for a vulnerability scanner.

An embodiment of the present application provides a computer program product including a computer program stored on a storage medium. The computer program includes program instructions that, when executed by a computer, cause the computer to execute the above traffic extraction method for a vulnerability scanner.

The above computer-readable storage medium may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.

The technical solutions of the embodiments of the present application may be embodied as a software product stored in a storage medium and including one or more instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The storage medium may be a non-transitory storage medium, including any medium capable of storing program code such as a USB flash drive, a removable hard disk, read-only memory, random access memory, a magnetic disk, or an optical disc, or it may be a transitory storage medium.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; multiple units or components may be combined or integrated into another system, and some features may be omitted or not executed.

The above are merely embodiments of the present application and are not intended to limit its scope of protection. Those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection. In addition, the above embodiments may be combined with one another to form new embodiments where they do not conflict.

Claims

1. A traffic extraction method for a vulnerability scanner, comprising:

triggering the vulnerability scanner to send request traffic;

determining whether the request traffic is redundant traffic, wherein the redundant traffic includes universal traffic and/or repeated traffic, the repeated traffic is traffic identical to previously received request traffic, and the universal traffic is request traffic used by the vulnerability scanner to verify that a site is alive; and

if the request traffic is non-redundant traffic, storing the request traffic.

2. The method according to claim 1, wherein the redundant traffic includes universal traffic, and determining whether the request traffic is redundant traffic comprises:

calculating the request traffic using a preset hash algorithm to obtain a first hash value corresponding to the request traffic;

determining whether the first hash value is in a preset universal set; and

if so, determining that the request traffic corresponding to the first hash value is universal traffic.

3. The method according to claim 2, wherein the universal set is obtained by:

triggering the vulnerability scanner to send sample request traffic carrying a test object;

calculating each sample request traffic using the preset hash algorithm and forming hash tables;

classifying the hash tables according to the test object to form N hash sets, wherein the value of N is the same as the number of types of test objects; and

selecting the repeated hash tables in each hash set, and forming the universal set based on the repeated hash tables.

4. The method according to claim 3, wherein the test object includes at least one of url, host, and directory.

5. The method according to claim 1, wherein the redundant traffic includes repeated traffic, and determining whether the request traffic is redundant traffic comprises:

replacing the paths corresponding to the URI field in the request traffic and the Referer field in the request traffic with a preset path;

calculating the request traffic after path replacement using a preset hash algorithm to obtain a second hash value;

checking whether the second hash value is recorded in a preset hash value record table; if not, determining that the request traffic is non-redundant traffic, adding the second hash value to the hash value record table, and storing the request traffic corresponding to the second hash value; and

if so, determining that the request traffic is repeated traffic.

6. The method according to any one of claims 1 to 5, wherein storing the request traffic comprises:

obtaining the vulnerability name corresponding to the request traffic; and

marking the request traffic according to the vulnerability name, and storing the request traffic marked with the vulnerability name.

7. The method according to claim 6, wherein storing the request traffic further comprises:

sending the request traffic marked with the vulnerability name to a preset message queue, so that a message consumer can obtain the request traffic from the message queue and process it; and

obtaining and storing the processing result of the request traffic.

8. The method according to any one of claims 1 to 5, wherein after triggering the vulnerability scanner to send request traffic, the method further comprises:

parsing the URI field in the request traffic;

determining a response template according to the URI field; and

sending the response template to the vulnerability scanner.

9. A traffic extraction apparatus for a vulnerability scanner, comprising:

a triggering module, configured to trigger the vulnerability scanner to send request traffic;

a determining module, configured to determine whether the request traffic is redundant traffic, wherein the redundant traffic includes universal traffic and/or repeated traffic, the repeated traffic is traffic identical to previously received request traffic, and the universal traffic is request traffic used by the vulnerability scanner to verify that a site is alive; and

a storage module, configured to store the request traffic if the request traffic is non-redundant traffic.

10. An electronic device, comprising a processor and a memory, wherein the memory stores computer-executable instructions executable by the processor, and the processor executes the computer-executable instructions to implement the traffic extraction method for a vulnerability scanner according to any one of claims 1 to 8.