
CN115344491A - Code detection method, device, equipment and storage medium - Google Patents

Code detection method, device, equipment and storage medium

Info

Publication number
CN115344491A
Authority
CN
China
Prior art keywords
code
preset
defect
target
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210998471.0A
Other languages
Chinese (zh)
Inventor
肖孟孟
张�浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Anheng Smart City Security Technology Co ltd
Original Assignee
Shanghai Anheng Smart City Security Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Anheng Smart City Security Technology Co ltd
Priority to CN202210998471.0A
Publication of CN115344491A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/36: Prevention of errors by analysis, debugging or testing of software
    • G06F11/3604: Analysis of software for verifying properties of programs
    • G06F11/3608: Analysis of software for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a code detection method, apparatus, device and storage medium in the field of computer technology. The method comprises: determining defect code in the code to be analyzed by a preset screening method; extracting target information of the defect code with a preset tool, and constructing a code slice based on the target information; dividing the code slice according to a preset rule to obtain corresponding components; formally abstracting the components to obtain target binary information; and detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result, and judging from the detection result whether the code to be analyzed is qualified. By generating feature elements from the defect code and extracting code structure information, the method discards the large volume of redundant information; by judging whether the code is qualified through detection on the binary information, it improves the reliability of software detection, the code detection speed and development efficiency, and reduces software development and maintenance costs.

Description

Code detection method, device, equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a code detection method, apparatus, device, and storage medium.
Background
With the development of information technology, software plays an increasingly important role in delivering business capabilities. To keep the quality of software products under control, software development organizations develop them according to engineering methods. Managing software code defects is an important part of software engineering; the main purpose of code review is to improve software quality, find defects as early as possible, and prevent the larger losses that code defects can cause. In the prior art, a code coverage tool runs a coverage test on the code under test. When the code under test executes abnormally, the code executed during a period before the point of the abnormality is taken as the defective code, based on the coverage information generated by the tool. Testers therefore do not need to manually check all code that might be abnormal, which improves the accuracy and efficiency of software testing. Although the prior art removes the work of manually checking logs for problem code and reduces part of the developers' workload, developers still have to iterate over the nodes of a program's abstract syntax tree (AST) or its assembly instruction nodes one by one. This is time-consuming, and the problem code cannot be located accurately when compilation fails. ASTs also usually retain all the information in the source code, and the volume of redundant information they contain is very large, so the detection workload and detection cost rise sharply. Assembly nodes, in turn, are tied to a specific system architecture, which limits their generality.
Disclosure of Invention
In view of the above, the present invention provides a code detection method, apparatus, device and storage medium, which can improve reliability of software detection, code detection speed and development efficiency, and reduce software development and maintenance costs. The specific scheme is as follows:
In a first aspect, the present application discloses a code detection method, including:
determining a defect code from the code to be analyzed by a preset screening method;
extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information;
dividing the code slices according to a preset rule to obtain corresponding components;
formally abstracting the components to obtain target binary information;
and detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result, and judging whether the code to be analyzed is qualified or not based on the detection result.
Optionally, the determining a defect code from a code to be analyzed by using a preset screening method includes:
converting the code to be analyzed into a code attribute graph;
performing feature selection and quantization operation on the code attribute graph to obtain a target code attribute graph;
and screening the target code attribute graph according to a preset screening rule to obtain the defect code.
Optionally, the extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information includes:
extracting abstract syntax trees and data stream information of the defect codes by using a preset tool based on syntax analysis and lexical analysis;
constructing a code slice based on the abstract syntax tree and the data stream;
or generating the code slice based on the SARD and the code of the preset tool.
Optionally, the dividing the code slice according to a preset rule to obtain corresponding components includes:
and dividing the code slices according to the execution relation between the code slices and the program statements to obtain corresponding components.
Optionally, before detecting the target binary information according to a preset code security determination criterion, the method further includes:
acquiring a source program with a preset type code security defect, and marking a program statement contained in a defect source code in the source program;
compiling the source program into a binary program;
acquiring an instruction register execution path of the binary program through a preset application tool to obtain a corresponding characteristic element set;
and constructing a check sample set based on the characteristic element set, and generating a convolutional neural network model for judging whether the target binary information has defects or not based on the check sample set.
Optionally, the detecting the target binary information according to a preset code security determination standard to obtain a corresponding detection result includes:
establishing the preset code safety judgment standard based on the convolutional neural network model; the convolutional neural network model comprises the preset code safety judgment standard;
detecting the target binary information by using the convolutional neural network model to obtain a first detection result;
and if the first detection result is that the target binary information has defects, flipping the target binary information, and inputting the flipped target binary information into the convolutional neural network model to obtain a second detection result.
Optionally, the determining, based on the detection result, whether the code to be analyzed is qualified includes:
if the second detection result is that the target binary information has defects, judging that the code to be analyzed is unqualified;
and if the first detection result indicates that no defect exists in the target binary information, judging that the code to be analyzed is qualified, and deleting the code to be analyzed in the convolutional neural network model.
In a second aspect, the present application discloses a code detection apparatus, comprising:
the defect code determining module is used for determining a defect code from the codes to be analyzed through a preset screening method;
the code slice construction module is used for extracting target information of the defect code by using a preset tool and constructing a code slice based on the target information;
the code slice dividing module is used for dividing the code slices according to a preset rule to obtain corresponding components;
the information acquisition module is used for performing formal abstraction on the components to obtain target binary information;
and the code detection module is used for detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result and judging whether the code to be analyzed is qualified or not based on the detection result.
In a third aspect, the present application discloses an electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the code detection method as disclosed in the foregoing.
In a fourth aspect, the present application discloses a computer readable storage medium for storing a computer program; wherein the computer program realizes the code detection method as disclosed in the foregoing when executed by a processor.
As can be seen, the present application provides a code detection method, comprising: determining a defect code from the code to be analyzed by a preset screening method; extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information; dividing the code slices according to a preset rule to obtain corresponding components; formally abstracting the components to obtain target binary information; and detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result, and judging whether the code to be analyzed is qualified or not based on the detection result. Therefore, the target information of the defect codes is extracted, the code slices are constructed on the basis of the target information, formal abstraction is performed to obtain the target binary information, redundant information with huge data volume is eliminated, whether the codes are qualified or not is judged through detection of the binary information, the reliability of software detection, the code detection speed and the development efficiency are improved, and the software development and maintenance cost is reduced.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a flow chart of a code detection method disclosed herein;
FIG. 2 is a flow chart of a specific code detection method disclosed herein;
FIG. 3 is a system block diagram of a code detection method disclosed herein;
FIG. 4 is a flowchart of a specific code detection method disclosed herein;
FIG. 5 is a flow chart of a specific code detection method disclosed herein;
FIG. 6 is a schematic structural diagram of a code detection apparatus provided in the present application;
FIG. 7 is a block diagram of an electronic device provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, although the prior art removes the work of manually checking logs for problem code and reduces part of the developers' workload, developers still have to iterate over the nodes of a program's abstract syntax tree (AST) or its assembly instruction nodes one by one. This is time-consuming, and the problem code cannot be located accurately when compilation fails. ASTs also usually retain all the information in the source code, and the volume of redundant information they contain is very large, so the detection workload and detection cost rise sharply. Assembly nodes are tied to a specific system architecture, which limits their generality. The present application therefore provides a code detection method that can improve the reliability of software detection, the code detection speed and development efficiency, and reduce software development and maintenance costs.
The embodiment of the invention discloses a code detection method which, as shown in FIG. 1, comprises the following steps:
Step S11: determining a defect code from the code to be analyzed by a preset screening method.
In this embodiment, a defect code is determined from the code to be analyzed by a preset screening method. That is, the defect code is located first. In one specific embodiment, the defect code may be determined by multiple rounds of screening that progressively narrow the location range. In another specific embodiment, the defect code may be located by comparing the code before and after a defect is repaired, which yields more previously unknown defect samples for detection; comparing code between versions also makes up for cases in which a standard patch file cannot be obtained.
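The second localization approach, comparing code before and after a repair, can be illustrated with a minimal sketch. It assumes the two versions are available as plain text and uses Python's standard `difflib`; the function name `changed_lines` and the sample C snippets are hypothetical and are not taken from the application.

```python
import difflib

def changed_lines(before: str, after: str) -> list:
    """Return 1-based line numbers of `before` that the repair replaced or deleted."""
    a = before.splitlines()
    b = after.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    suspects = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # Lines that only exist in the old version are candidate defect code.
        if tag in ("replace", "delete"):
            suspects.extend(range(i1 + 1, i2 + 1))
    return suspects

# Hypothetical pre-fix and post-fix versions of the same fragment.
BEFORE = """char buf[8];
strcpy(buf, input);
return 0;"""

AFTER = """char buf[8];
strncpy(buf, input, 7);
buf[7] = 0;
return 0;"""

if __name__ == "__main__":
    print(changed_lines(BEFORE, AFTER))
```

Lines reported this way are only candidates; a repair may also touch lines that were never defective, so a further screening step would still be needed.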
Step S12: extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information.
In this embodiment, after a defect code is determined from the code to be analyzed by the preset screening method, target information of the defect code is extracted by using a preset tool, and a code slice is constructed based on the target information. That is, once the defect code has been determined, it undergoes a preprocessing operation. In a specific embodiment, the open source tool Joern (an open source analysis tool for C/C++ projects) is used to extract the abstract syntax tree and data flow information of the defect code based on syntax analysis and lexical analysis, and this information serves as the basis for constructing cross-function code slices. Joern is a code vulnerability inspection tool that traverses the code attribute graph; it can extract the code attribute graph of each function from a whole project, a single source file, or even a function code fragment, without compiling the code and without depending on libraries. In another specific embodiment, code preprocessing can generate a large amount of code slice data rich in data dependencies from SARD (the Software Assurance Reference Dataset) and open source software code, which makes the data set more practical and supports detection of multiple types of source code defects.
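As an illustration of building a slice from data flow information, the following minimal sketch computes a backward slice over straight-line code. It assumes the definitions and uses of each statement have already been extracted (in the application this information comes from the preset tool); the `Stmt` type, the `backward_slice` function and the sample statements are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stmt:
    line: int
    text: str
    defs: frozenset   # variables written by the statement
    uses: frozenset   # variables read by the statement

def backward_slice(stmts, criterion_line):
    """Keep the statements the criterion line depends on through data flow.

    Assumption: straight-line code, so the nearest earlier definition of a
    variable is the one that reaches the use.
    """
    by_line = {s.line: s for s in stmts}
    needed = set(by_line[criterion_line].uses)
    keep = {criterion_line}
    for s in sorted(stmts, key=lambda st: st.line, reverse=True):
        if s.line >= criterion_line:
            continue
        if s.defs & needed:
            keep.add(s.line)
            needed -= s.defs      # this definition satisfies the pending use
            needed |= s.uses      # and introduces its own dependencies
    return [by_line[n] for n in sorted(keep)]

PROGRAM = [
    Stmt(1, "n = read_len();", frozenset({"n"}), frozenset()),
    Stmt(2, "buf = alloc(8);", frozenset({"buf"}), frozenset()),
    Stmt(3, "msg = \"hi\";", frozenset({"msg"}), frozenset()),
    Stmt(4, "m = n * 2;", frozenset({"m"}), frozenset({"n"})),
    Stmt(5, "copy(buf, src, m);", frozenset(), frozenset({"buf", "src", "m"})),
]

if __name__ == "__main__":
    for stmt in backward_slice(PROGRAM, 5):
        print(stmt.line, stmt.text)
```

The statement on line 3 is dropped because nothing on the path to the copy call reads `msg`, which is the kind of redundancy a slice is meant to remove.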
Step S13: dividing the code slice according to a preset rule to obtain corresponding components.
In this embodiment, after a code slice is constructed based on the target information, the code slice is divided according to a preset rule to obtain corresponding components. Specifically, the code slice is divided into the components of the code features by using the execution relation between program statements. The execution relation between program statements here means that a subsequent operation can be executed only after the previous operation (program segment) has completed, as in a sequential relation.
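The application does not spell out the preset rule, so the following is only one possible reading, offered as a sketch: statements of a cross-function slice are listed in execution order, and each maximal run of statements that execute one after another inside the same function forms one component. The tuple layout and the name `split_components` are assumptions.

```python
from itertools import groupby

def split_components(slice_stmts):
    """Split a slice into components of sequentially executed statements.

    slice_stmts: list of (function_name, line_no, text) in execution order.
    Assumption: a component is a maximal run of consecutive statements
    that belong to the same function.
    """
    components = []
    for _func, run in groupby(slice_stmts, key=lambda s: s[0]):
        components.append([text for _f, _line, text in run])
    return components

SLICE = [
    ("main", 3, "n = read_len();"),
    ("main", 4, "process(src, n);"),
    ("process", 10, "char buf[8];"),
    ("process", 11, "memcpy(buf, src, n);"),
    ("main", 5, "return 0;"),
]

if __name__ == "__main__":
    for index, component in enumerate(split_components(SLICE), start=1):
        print(index, component)
```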
Step S14: performing formal abstraction on the components to obtain target binary information.
In this embodiment, after the code slice is divided according to a preset rule and corresponding components are obtained, the components are formally abstracted to obtain target binary information. That is, each component is formally abstracted, which yields the complete binary information of the code structure.
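The application does not define the abstraction scheme. A minimal sketch of one plausible scheme follows, under two stated assumptions: user-defined identifiers are replaced by symbolic names (`VAR1`, `FUN1`, and so on), and each resulting token is mapped to a fixed-width bit string. The reserved-word list, the function names and the 8-bit width are all hypothetical choices.

```python
import re

# Assumption: a small reserved set of keywords and library calls is kept verbatim.
RESERVED = {"char", "int", "return", "if", "else", "for", "while", "sizeof",
            "memcpy", "strcpy", "malloc", "free"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")
IDENT_RE = re.compile(r"[A-Za-z_]\w*")

def abstract_tokens(statements):
    """Tokenize statements and rename user identifiers to VARn / FUNn."""
    names = {}
    counters = {"VAR": 0, "FUN": 0}
    out = []
    for stmt in statements:
        tokens = TOKEN_RE.findall(stmt)
        for i, tok in enumerate(tokens):
            if IDENT_RE.fullmatch(tok) and tok not in RESERVED:
                if tok not in names:
                    # An identifier directly followed by "(" is treated as a function.
                    prefix = "FUN" if tokens[i + 1:i + 2] == ["("] else "VAR"
                    counters[prefix] += 1
                    names[tok] = f"{prefix}{counters[prefix]}"
                tok = names[tok]
            out.append(tok)
    return out

def to_bits(tokens, width=8):
    """Map each distinct token to an index and emit it as a fixed-width bit string."""
    vocab = {}
    bits = []
    for tok in tokens:
        idx = vocab.setdefault(tok, len(vocab))
        if idx >= 2 ** width:
            raise ValueError("vocabulary does not fit in the chosen width")
        bits.append(format(idx, f"0{width}b"))
    return bits

COMPONENT = ["char buf[8];", "memcpy(buf, src, n);", "check(n);"]

if __name__ == "__main__":
    symbolic = abstract_tokens(COMPONENT)
    print(symbolic)
    print(to_bits(symbolic))
```

Renaming identifiers removes project-specific naming while keeping the structure, so two slices with the same shape produce the same bit sequence.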
Step S15: detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result, and judging whether the code to be analyzed is qualified or not based on the detection result.
In this embodiment, after the components are abstracted in a formalized manner to obtain target binary information, the target binary information is detected by a preset code security judgment standard to obtain a corresponding detection result, and whether the code to be analyzed is qualified is judged based on the detection result. Specifically, a judgment standard of the security defect of the binary code is established according to the information in the convolutional neural network model, and the code is detected.
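The two-pass judgment described in the first aspect (a first detection, then a second detection on flipped input when the first reports a defect) can be sketched as plain control flow. The convolutional neural network is replaced here by a stand-in callable, and "flipping" is assumed to mean reversing the order of the sequence; both are assumptions. The application does not say what happens when the first pass reports a defect and the second does not, so the sketch returns a separate "undetermined" value for that case.

```python
from typing import Callable, Sequence

def judge(bits: Sequence[int], model: Callable[[Sequence[int]], bool]) -> str:
    """Apply the two-pass check; `model` returns True when it sees a defect."""
    if not model(bits):
        return "qualified"            # first detection result: no defect
    flipped = list(reversed(bits))    # assumption: flipping reverses the sequence
    if model(flipped):
        return "unqualified"          # second detection result: defect confirmed
    return "undetermined"             # case not covered by the application

def contains(pattern):
    """Stand-in for the trained model: flags any input containing `pattern`."""
    def model(bits):
        n = len(pattern)
        return any(list(bits[i:i + n]) == list(pattern)
                   for i in range(len(bits) - n + 1))
    return model

if __name__ == "__main__":
    symmetric = contains([1, 1, 1])
    print(judge([0, 1, 1, 1, 0], symmetric))
    print(judge([0, 1, 0], symmetric))
```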
In the prior art, the nodes of a program's AST or its assembly instruction nodes must be analyzed iteratively one by one, and ASTs usually retain all the information in the source code. This is time-consuming and involves a very large volume of redundant information, so the detection workload and detection cost rise sharply. The present scheme is proposed to overcome these defects. Its specific flow is shown in FIG. 2: locating the defect code; preprocessing the code; characterizing the sample data; and detecting defects with the convolutional neural network model. If the code is found to be qualified, detection ends; if it is found to be unqualified, code preprocessing is performed again.
Further, as shown in the system block diagram of FIG. 3, the system includes a data layer, a service support layer, a service logic layer, and a display layer. The data layer stores the data required for system operation, including a defect database and a system operation database, and is built on an SQL (Structured Query Language) Server database. The service support layer supports the implementation of service activities and consists of a network link communication engine, a data access engine, a database engine and a report generation engine. The service logic layer implements service functions according to service logic rules. The display layer presents the system's data visually through a human-computer interaction interface, covering data address configuration, service parameter configuration, query tracking, and report export of situation analysis results. The defect database stores defect code and its tracking data, including defect items and their attributes, defect state changes, and the data recorded as a defect moves through the processing flow. The system operation database stores the data that supports system operation, including data address configuration records, service parameter configuration records and system operation log data. The network link communication engine uses TCP (Transmission Control Protocol) so that the client communicates with the server over Ethernet. The data access engine is used to access the original record data of test results.
After the third-party scheduling engine extracts code from the configuration management library and compiles and tests it, a software defect report is generated, or the defect attributes of the software code are updated, based on the original test records, and the corresponding responsible person is notified to handle the defect. When the corresponding task is finished, the update and automatic circulation of the defect attributes are triggered and the corresponding responsible person is notified; defect data are tracked and analyzed in the software code defect management system.
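To make the defect database concrete, the following sketch stores defect items and their state changes. The application names an SQL Server database; `sqlite3` from the Python standard library is used here only as a stand-in, and every table name, column name and state value is hypothetical.

```python
import sqlite3

def open_defect_db():
    """Create an in-memory stand-in for the defect database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE defect (
            id    INTEGER PRIMARY KEY,
            file  TEXT NOT NULL,
            line  INTEGER NOT NULL,
            kind  TEXT NOT NULL,
            state TEXT NOT NULL DEFAULT 'open'
        );
        CREATE TABLE defect_state_change (
            defect_id  INTEGER NOT NULL REFERENCES defect(id),
            old_state  TEXT NOT NULL,
            new_state  TEXT NOT NULL,
            changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
    """)
    return conn

def change_state(conn, defect_id, new_state):
    """Update a defect's state and record the transition for later tracking."""
    row = conn.execute(
        "SELECT state FROM defect WHERE id = ?", (defect_id,)).fetchone()
    if row is None:
        raise KeyError(defect_id)
    with conn:  # commit both statements together, or roll both back
        conn.execute(
            "UPDATE defect SET state = ? WHERE id = ?", (new_state, defect_id))
        conn.execute(
            "INSERT INTO defect_state_change (defect_id, old_state, new_state) "
            "VALUES (?, ?, ?)", (defect_id, row[0], new_state))

if __name__ == "__main__":
    db = open_defect_db()
    cur = db.execute(
        "INSERT INTO defect (file, line, kind) VALUES (?, ?, ?)",
        ("parser.c", 11, "buffer-overflow"))
    change_state(db, cur.lastrowid, "fixed")
    print(db.execute("SELECT old_state, new_state FROM defect_state_change").fetchall())
```

Keeping state changes in their own table preserves the history needed for the tracking and analysis that the system description mentions.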
As can be seen, the present application provides a code detection method, comprising: determining a defect code from the code to be analyzed by a preset screening method; extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information; dividing the code slices according to a preset rule to obtain corresponding components; formally abstracting the components to obtain target binary information; and detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result, and judging whether the code to be analyzed is qualified or not based on the detection result. Therefore, the target information of the defect codes is extracted, the code slice is constructed based on the target information, then the formal abstraction is carried out, the target binary information is obtained, the redundant information with huge data volume is eliminated, whether the codes are qualified or not is judged through the detection of the binary information, the reliability of software detection, the code detection speed and the development efficiency are improved, and the software development and maintenance cost is reduced.
Referring to FIG. 4, the embodiment of the present invention discloses a code detection method, and compared with the previous embodiment, the present embodiment further describes and optimizes the technical solution.
Step S21: converting the code to be analyzed into a code attribute graph.
In this embodiment, the defect code is first located. Specifically, the code to be analyzed is converted into code attribute graph form, with the function as the unit. The code attribute graph is a graph structure that combines an abstract syntax tree, a control flow graph and a data flow graph, so that a program's execution process, data transfer process and so on are represented together in one graph.
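A minimal data structure for such a combined graph might look like the following sketch: one set of nodes per function and three families of edges over the same nodes. The class name, the edge labels `AST`, `CFG` and `DFG`, and the example function are illustrative assumptions, not the representation used by any particular tool.

```python
from collections import defaultdict

EDGE_KINDS = ("AST", "CFG", "DFG")

class CodeAttributeGraph:
    """One graph per function: shared nodes, three families of edges."""

    def __init__(self):
        self.nodes = {}                 # node id -> (kind, code text)
        self.edges = defaultdict(set)   # edge kind -> {(src, dst), ...}

    def add_node(self, node_id, kind, code):
        self.nodes[node_id] = (kind, code)

    def add_edge(self, edge_kind, src, dst):
        if edge_kind not in EDGE_KINDS:
            raise ValueError(f"unknown edge kind: {edge_kind}")
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[edge_kind].add((src, dst))

    def successors(self, node_id, edge_kind):
        return sorted(d for s, d in self.edges[edge_kind] if s == node_id)

def build_example():
    g = CodeAttributeGraph()
    g.add_node(0, "function", "void f(char *src, int n)")
    g.add_node(1, "declaration", "char buf[8];")
    g.add_node(2, "call", "memcpy(buf, src, n);")
    g.add_edge("AST", 0, 1)   # syntax: both statements are children of f
    g.add_edge("AST", 0, 2)
    g.add_edge("CFG", 1, 2)   # control flow: the declaration runs before the call
    g.add_edge("DFG", 1, 2)   # data flow: buf declared at node 1 is used at node 2
    return g

if __name__ == "__main__":
    graph = build_example()
    print(graph.successors(0, "AST"), graph.successors(1, "DFG"))
```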
Step S22: performing feature selection and quantization operations on the code attribute graph to obtain a target code attribute graph.
In this embodiment, after the code to be analyzed is converted into the code attribute graph, feature selection and quantization operations are performed on the code attribute graph to obtain a target code attribute graph. That is, the code attribute graph obtained from the code to be analyzed first undergoes selection and quantization, and the resulting target code attribute graph then enters the subsequent screening step.
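The application does not say which features are selected or how they are quantized. As a sketch under that caveat, the example below selects a fixed list of node kinds and quantizes a function's graph into a count vector over them; the feature names and the function `quantize` are hypothetical.

```python
from collections import Counter

# Assumption: these node kinds are the selected features, in a fixed order.
FEATURES = ("call", "array_access", "pointer_deref", "assignment")

def quantize(node_kinds, selected=FEATURES):
    """Turn the node kinds of one function's graph into a numeric vector."""
    counts = Counter(node_kinds)
    return [counts[kind] for kind in selected]

if __name__ == "__main__":
    kinds = ["call", "assignment", "call", "declaration"]
    print(quantize(kinds))
```

Node kinds outside the selected list, such as `declaration` here, are simply ignored, which is the selection half of the operation.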
Step S23: screening the target code attribute graph according to a preset screening rule to obtain the defect code.
In this embodiment, after feature selection and quantization operations are performed on the code attribute graph to obtain a target code attribute graph, the target code attribute graph is screened according to a preset screening rule to obtain the defect code. The target code attribute graph undergoes multiple rounds of screening, and the location range is narrowed step by step until the defect code is determined.
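Multi-round screening can be pictured as applying a sequence of rules, each of which keeps only the candidates that still look suspicious. In the sketch below the candidates are per-function feature records; the feature names, thresholds and the function `screen` are hypothetical, since the application leaves the preset screening rule open.

```python
def screen(candidates, rules):
    """Apply the rules in order; return the survivors and the size after each round."""
    remaining = list(candidates)
    sizes = [len(remaining)]
    for rule in rules:
        remaining = [c for c in remaining if rule(c)]
        sizes.append(len(remaining))
    return remaining, sizes

# Hypothetical quantized features for three functions.
FUNCTIONS = [
    {"name": "parse_header", "calls": 3, "array_access": 2, "unchecked_copy": 1},
    {"name": "log_message", "calls": 1, "array_access": 0, "unchecked_copy": 0},
    {"name": "read_config", "calls": 4, "array_access": 1, "unchecked_copy": 0},
]

# Hypothetical rules, ordered from coarse to fine.
RULES = [
    lambda f: f["calls"] >= 2,
    lambda f: f["array_access"] >= 1,
    lambda f: f["unchecked_copy"] >= 1,
]

if __name__ == "__main__":
    survivors, sizes = screen(FUNCTIONS, RULES)
    print([f["name"] for f in survivors], sizes)
```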
Step S24: extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information.
In this embodiment, after the defect code is located, a sample data characterization operation is performed; that is, target information of the defect code is extracted by using a preset tool, and a code slice is constructed based on the target information. In a specific embodiment, the abstract syntax tree and data flow information of the defect code are extracted by a preset tool based on syntax analysis and lexical analysis, and a code slice is then constructed based on the abstract syntax tree and the data flow. In another specific embodiment, the code slice is generated based on SARD and the code of the preset tool; specifically, a large amount of code slice data rich in data dependencies can be generated from SARD and the code of open source software (e.g. Joern), which makes the data set more practical and supports detection of multiple types of source code defects.
Further, the sample data characterization operation comprises: establishing a source code element set; compiling the source program into a binary program, and acquiring the IR (instruction register) execution path of the binary program by using VTS (Vendor Test Suite); establishing an IR code element set and constructing a check sample set; and, after the feature element set is constructed, generating a convolutional neural network model of binary code security defects.
Step S25: dividing the code slice according to the execution relation between the code slice and the program statements to obtain the corresponding components.
Step S26: performing formal abstraction on the components to obtain target binary information.
Step S27: detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result, and judging whether the code to be analyzed is qualified or not based on the detection result.
For the details of the steps S25 to S27, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Therefore, the code to be analyzed is converted into the code attribute graph; performing feature selection and quantization operation on the code attribute graph to obtain a target code attribute graph; screening the target code attribute graph according to a preset screening rule to obtain the defect code; extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information; dividing the code slices according to the execution relation between the code slices and the program statements to obtain corresponding components; formally abstracting the components to obtain target binary information; and detecting the target binary information by a preset code safety judgment standard to obtain a corresponding detection result, and judging whether the code to be analyzed is qualified or not based on the detection result, so that the reliability of software detection, the code detection speed and the development efficiency are improved, and the software development and maintenance cost is reduced.
Referring to FIG. 5, the embodiment of the present invention discloses a code detection method, and compared with the previous embodiment, the present embodiment further describes and optimizes the technical solution.
Step S31: determining a defect code from the code to be analyzed by a preset screening method.
Step S32: extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information.
Step S33: dividing the code slice according to a preset rule to obtain corresponding components.
Step S34: performing formal abstraction on the components to obtain target binary information.
Step S35: acquiring a source program with a preset type of code security defect, and marking the program statements contained in the defect source code in the source program.
In this embodiment, after the components are abstracted formally to obtain the target binary information, a source program with a preset type code security defect is obtained, and a program statement included in a defect source code in the source program is marked. It can be understood that, because the IR statements and the machine instructions are in one-to-one correspondence and have the same semantics, the structural feature information obtained in the above process accurately describes the specific existence form of the security defect of the binary code, and can be used as an effective basis for detecting the security defect of the code in the binary program.
Specifically, a source code element set is established: source programs with a specific type of code security defect are acquired, representative source programs are selected, and the program statements contained in the defect source code are marked and expressed in vector form.
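Marking statements and expressing them as vectors could be sketched as follows. The defect line numbers are assumed to be known in advance (for example from a labeled data set), and the vector form is assumed to be a simple count over a small vocabulary; the function names, the vocabulary and the sample program are hypothetical.

```python
import re

def label_statements(source_lines, defect_lines):
    """Pair every statement with 1 if its 1-based line number is marked defective, else 0."""
    return [(text, 1 if number in defect_lines else 0)
            for number, text in enumerate(source_lines, start=1)]

def vectorize(statement, vocabulary):
    """Express a statement as counts of the vocabulary words it contains."""
    tokens = re.findall(r"[A-Za-z_]\w*", statement)
    return [tokens.count(word) for word in vocabulary]

SOURCE = ["char buf[8];", "strcpy(buf, input);", "return 0;"]
VOCABULARY = ("strcpy", "memcpy", "buf")

if __name__ == "__main__":
    for text, label in label_statements(SOURCE, {2}):
        print(label, vectorize(text, VOCABULARY), text)
```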
Step S36: compiling the source program into a binary program.
In this embodiment, a source program with a preset type code security defect is obtained, and after a program statement included in a defect source code in the source program is marked, the source program is compiled into a binary program.
Step S37: acquiring an instruction register execution path of the binary program through a preset application tool to obtain a corresponding feature element set.
In this embodiment, after the source program is compiled into a binary program, an instruction register execution path of the binary program is obtained by a preset application tool, so as to obtain a corresponding feature element set. The IR execution path of the binary program is acquired, for example, by the application tool VTS.
Step S38: and constructing a check sample set based on the characteristic element set, and generating a convolutional neural network model for judging whether the target binary information has defects or not based on the check sample set.
In this embodiment, after the instruction register execution path of the binary program is obtained through a preset application tool and the corresponding characteristic element set is obtained, a check sample set is constructed based on the characteristic element set, and a convolutional neural network model for determining whether the target binary information has a defect is generated based on the check sample set. In other words, the IR code elements are established and used to construct the check sample set. It can be understood that, once the check sample set has been constructed, a convolutional neural network model of binary code security defects is generated from it.
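One way to picture the construction of the check sample set is shown below. It is a sketch under stated assumptions: each feature element is taken to be an execution path given as a list of operation names with a defect label, and the helpers `build_vocab`, `encode` and `build_check_samples` are hypothetical. The fixed-length integer encoding is simply a common input format for a convolutional model, not something the specification prescribes.

```python
from typing import Dict, List, Tuple

PAD = 0  # index reserved for padding (also used for unknown operations)


def build_vocab(paths: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct operation an integer id, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for path in paths:
        for op in path:
            if op not in vocab:
                vocab[op] = len(vocab) + 1
    return vocab


def encode(path: List[str], vocab: Dict[str, int], length: int) -> List[int]:
    """Map a path to ids, then truncate or pad it to a fixed length."""
    ids = [vocab.get(op, PAD) for op in path][:length]
    return ids + [PAD] * (length - len(ids))


def build_check_samples(
    labelled_paths: List[Tuple[List[str], int]], length: int
) -> Tuple[List[Tuple[List[int], int]], Dict[str, int]]:
    """Turn (path, label) pairs into fixed-length (ids, label) samples."""
    vocab = build_vocab([path for path, _ in labelled_paths])
    samples = [(encode(path, vocab, length), label) for path, label in labelled_paths]
    return samples, vocab


# Hypothetical feature elements: label 1 = defective path, 0 = clean path.
feature_elements = [
    (["load", "call", "store"], 1),
    (["load", "cmp", "branch", "call", "store"], 0),
]
samples, vocab = build_check_samples(feature_elements, length=4)
```

The resulting `samples` list could then be fed to whatever training routine generates the convolutional neural network model.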
Step S39: detecting the target binary information through a preset code safety judgment standard to obtain a corresponding detection result.
In this embodiment, the target binary information is detected according to a preset code safety judgment standard, so as to obtain a corresponding detection result. Specifically, the preset code safety judgment standard is established based on the convolutional neural network model, so that the convolutional neural network model comprises the preset code safety judgment standard; the target binary information is detected by using the convolutional neural network model to obtain a first detection result; and if the first detection result is that the target binary information has defects, the target binary information is turned over, and the turned target binary information is input into the convolutional neural network model to obtain a second detection result. For example, a judgment standard for binary code security defects is established based on the information in the convolutional neural network model, and the code is detected; if the code has a defective part, the relevant data is turned over and fed into the convolutional neural network model for a second judgment. The decision method of the convolutional neural network model is a feedforward neural network that takes the sum of squared errors as its objective function; for each input pattern x, the operation formula is as follows:
Figure BDA0003806596270000111
The letters in the formula are assigned as follows: x denotes the sum of squared errors between the expected output values and the actual output values of the data code; a indexes the output units; and ma and na are the target value (ma) and the actual value (na), respectively, of the a-th unit in the output pattern. If the data code is smaller than the actual value, the data code is determined to be a defect code.
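The formula itself survives here only as an image reference. Going by the surrounding definitions alone (a sum of squared errors over the output units, with target value ma and actual value na for the a-th unit), one plausible reading is the expression below. The symbol E_x and the upper bound A (the number of output units) are introduced here for illustration, and any constant factor in the original, such as the 1/2 that is common in feedforward network objectives, cannot be confirmed from the text.

```latex
E_x = \sum_{a=1}^{A} \left( m_a - n_a \right)^2
```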
Step S310: judging whether the code to be analyzed is qualified or not based on the detection result.
In this embodiment, whether the code to be analyzed is qualified is determined based on the detection result. If the second detection result is that the target binary information has defects, judging that the code to be analyzed is unqualified; and if the first detection result indicates that no defect exists in the target binary information, judging that the code to be analyzed is qualified, and deleting the code to be analyzed in the convolutional neural network model. For example, if the code has a defect part, the related data is turned over and then introduced into the convolutional neural network model for judgment; if the defective part still exists in the secondary judgment, judging that the part is unqualified; if the code has no defect part, the code is excluded from the convolution neural network model.
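The two-pass judgment can be sketched as follows. Several points are assumptions: the model is represented by any callable that returns True when it reports a defect, "turning over" is read as reversing the order of the sequence (bit inversion would be another possible reading), and the outcome "undetermined" is introduced here because the text does not say what happens when the second pass reports no defect.

```python
from typing import Callable, List, Sequence

# A detector stands in for the trained model: True means "defect reported".
Detector = Callable[[Sequence[int]], bool]


def turn_over(bits: Sequence[int]) -> List[int]:
    """Assumed meaning of 'turning': reverse the order of the elements."""
    return list(reversed(bits))


def judge(bits: Sequence[int], detect: Detector) -> str:
    """Apply the first detection, and a second one on the turned data if needed."""
    if not detect(bits):
        return "qualified"        # first result: no defect
    if detect(turn_over(bits)):
        return "unqualified"      # second result: defect still present
    return "undetermined"         # case not covered by the text (assumption)


def toy_detector(bits: Sequence[int]) -> bool:
    """Stand-in detector: reports a defect when the first element is 1."""
    return bits[0] == 1


first = judge([0, 1, 1], toy_detector)
second = judge([1, 0, 1], toy_detector)
third = judge([1, 0, 0], toy_detector)
```

Keeping the detector as a parameter means the same control flow works with a real trained model in place of `toy_detector`.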
It can be understood that a characteristic element set is generated from the defective source code element set, code structure information is extracted, and a convolutional neural network model for judgment is constructed. Iterative analysis of the assembly instruction nodes one by one is therefore not required, and redundant information with a huge data volume is eliminated. Effective detection of code security defects in a binary program is realized through the conversion relation between IR codes and binary codes, which improves the reliability, maintainability and intelligibility of software detection, increases the code detection speed and the development efficiency, and reduces the software development and maintenance cost.
For the specific contents of the above steps S31 to S34, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and details are not repeated herein.
Therefore, the defect codes are determined from the codes to be analyzed through a preset screening method; extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information; dividing the code slices according to a preset rule to obtain corresponding components; formally abstracting the components to obtain target binary information; acquiring a source program with preset type code security defects, and marking program statements contained in defect source codes in the source program; compiling the source program into a binary program; acquiring an instruction register execution path of the binary program through a preset application tool to obtain a corresponding characteristic element set; constructing a check sample set based on the characteristic element set, and generating a convolutional neural network model for judging whether the target binary information has defects or not based on the check sample set; detecting the target binary information through a preset code safety judgment standard to obtain a corresponding detection result; and judging whether the code to be analyzed is qualified or not based on the detection result, so that the reliability, the code detection speed and the development efficiency of software detection are improved, and the software development and maintenance cost is reduced.
Referring to fig. 6, an embodiment of the present application further discloses a code detection apparatus, which includes:
a defect code determining module 11, configured to determine a defect code from codes to be analyzed by using a preset screening method;
a code slice construction module 12, configured to extract target information of the defect code by using a preset tool, and construct a code slice based on the target information;
a code slice dividing module 13, configured to divide the code slices according to a preset rule to obtain corresponding components;
an information acquisition module 14, configured to perform formal abstraction on the components to obtain target binary information;
a code detection module 15, configured to detect the target binary information according to a preset code security judgment standard to obtain a corresponding detection result, and determine whether the code to be analyzed is qualified based on the detection result.
As can be seen, the present application includes: determining a defect code from the code to be analyzed by a preset screening method; extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information; dividing the code slices according to a preset rule to obtain corresponding components; formally abstracting the components to obtain target binary information; and detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result, and judging whether the code to be analyzed is qualified or not based on the detection result. Therefore, the target information of the defect codes is extracted, the code slices are constructed on the basis of the target information, formal abstraction is performed to obtain the target binary information, redundant information with huge data volume is eliminated, whether the codes are qualified or not is judged through detection of the binary information, the reliability of software detection, the code detection speed and the development efficiency are improved, and the software development and maintenance cost is reduced.
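The way the five modules hand data to one another can be sketched as a simple pipeline. The class name `CodeDetectionApparatus`, the field names, and the toy stand-in functions are hypothetical; each real module would wrap the corresponding step of the method.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class CodeDetectionApparatus:
    determine_defect_code: Callable[[str], Any]  # module 11
    build_code_slice: Callable[[Any], Any]       # module 12
    divide_code_slice: Callable[[Any], Any]      # module 13
    acquire_binary_info: Callable[[Any], Any]    # module 14
    detect: Callable[[Any], bool]                # module 15; True = qualified

    def run(self, code: str) -> bool:
        """Pass the code to be analyzed through the modules in order."""
        defect_code = self.determine_defect_code(code)
        code_slice = self.build_code_slice(defect_code)
        components = self.divide_code_slice(code_slice)
        binary_info = self.acquire_binary_info(components)
        return self.detect(binary_info)


# Toy stand-ins so the pipeline can be exercised end to end.
apparatus = CodeDetectionApparatus(
    determine_defect_code=lambda code: [l for l in code.splitlines() if "strcpy" in l],
    build_code_slice=lambda lines: lines,
    divide_code_slice=lambda sl: [s.split("(")[0].strip() for s in sl],
    acquire_binary_info=lambda parts: [1 if p == "strcpy" else 0 for p in parts],
    detect=lambda bits: not any(bits),
)

risky = apparatus.run("int n = 0;\nstrcpy(buf, src);")
clean = apparatus.run("int n = 0;")
```

Passing the modules in as callables keeps each one replaceable, which mirrors the module-by-module description of the apparatus.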
In some specific embodiments, the defect code determining module 11 specifically includes:
the code conversion unit to be analyzed is used for converting the code to be analyzed into a code attribute graph;
the target code attribute graph acquisition unit is used for carrying out feature selection and quantization operation on the code attribute graph to obtain a target code attribute graph;
and the defect code acquisition unit is used for screening the target code attribute graph according to a preset screening rule to obtain the defect code.
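The three units above can be illustrated with a toy graph. Everything concrete here is an assumption: the node records, the single selected feature (a count of calls from a hypothetical `RISKY_CALLS` list) and the threshold-based screening rule are stand-ins for the preset rules, which the specification does not spell out.

```python
from typing import Dict, List

# Hypothetical node records of a code attribute graph: node id -> attributes.
GRAPH: Dict[int, dict] = {
    1: {"code": "char buf[8];", "kind": "decl", "calls": []},
    2: {"code": "strcpy(buf, src);", "kind": "call", "calls": ["strcpy"]},
    3: {"code": "return 0;", "kind": "return", "calls": []},
}

RISKY_CALLS = {"strcpy", "gets", "sprintf"}  # assumed screening vocabulary


def select_and_quantize(graph: Dict[int, dict]) -> Dict[int, int]:
    """Keep one feature per node and quantize it: the number of risky calls."""
    return {
        node_id: sum(call in RISKY_CALLS for call in attrs["calls"])
        for node_id, attrs in graph.items()
    }


def screen(graph: Dict[int, dict], scores: Dict[int, int], threshold: int = 1) -> List[str]:
    """Assumed screening rule: keep nodes whose score reaches the threshold."""
    return [graph[node_id]["code"] for node_id, score in sorted(scores.items()) if score >= threshold]


scores = select_and_quantize(GRAPH)
defect_code = screen(GRAPH, scores)
```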
In some embodiments, the code slice building module 12 specifically includes:
the information extraction unit is used for extracting the abstract syntax tree and the data stream information of the defect code by using a preset tool and based on syntactic analysis and lexical analysis;
a first code slice generation unit for constructing a code slice based on the abstract syntax tree and the data stream;
a second code slice generation unit for generating the code slice based on the SARD and the code of the preset tool.
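Building a slice from an abstract syntax tree plus data flow information can be illustrated with Python's standard `ast` module. This is only a sketch: the preset tool is not named in the text and the analyzed code would not necessarily be Python, the helper names are hypothetical, and the slicer handles only straight-line, top-level assignments.

```python
import ast
from typing import List, Set


def names(node: ast.AST, ctx_type: type) -> Set[str]:
    """Variable names in a statement that are stored (ast.Store) or read (ast.Load)."""
    return {
        n.id
        for n in ast.walk(node)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ctx_type)
    }


def backward_slice(source: str, target_line: int) -> List[str]:
    """Collect the top-level statements that the target statement depends on."""
    lines = source.splitlines()
    statements = ast.parse(source).body
    needed: Set[str] = set()   # variables whose definitions are still required
    kept: List[int] = []
    for stmt in reversed(statements):
        defined = names(stmt, ast.Store)
        if stmt.lineno == target_line or defined & needed:
            kept.append(stmt.lineno)
            needed -= defined
            needed |= names(stmt, ast.Load)
    return [lines[n - 1] for n in sorted(kept)]


# Hypothetical program: the slice for line 5 should skip the b/d statements.
SOURCE = """a = read()
b = 2
c = a + 1
d = b * 3
out = c * 2"""

code_slice = backward_slice(SOURCE, target_line=5)
```

The source is only parsed, never executed, so the undefined `read` in the sample program is harmless.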
In some specific embodiments, the code slice dividing module 13 specifically includes:
and the code slice dividing unit is used for dividing the code slices according to the execution relation between the code slices and the program statements to obtain corresponding components.
In some specific embodiments, the information obtaining module 14 specifically includes:
and the target binary information acquisition unit is used for performing formal abstraction on the components to obtain the target binary information.
In some specific embodiments, the code detection module 15 specifically includes:
the source program acquiring unit is used for acquiring a source program with preset type code security defects and marking program statements contained in the defect source codes in the source program;
a source program compiling unit for compiling the source program into a binary program;
the characteristic element set acquisition unit is used for acquiring an instruction register execution path of the binary program through a preset application tool so as to obtain a corresponding characteristic element set;
the convolutional neural network model generating unit is used for constructing a check sample set based on the characteristic element set and generating a convolutional neural network model for judging whether the target binary information has defects or not based on the check sample set;
a preset code safety judgment standard establishing unit, configured to establish the preset code safety judgment standard based on the convolutional neural network model; the convolutional neural network model comprises the preset code safety judgment standard;
the first detection result acquisition unit is used for detecting the target binary information by using the convolutional neural network model to obtain a first detection result;
the information turning unit is used for turning the target binary information if the first detection result indicates that the target binary information has defects;
the second detection result acquisition unit is used for inputting the overturned target binary information into the convolutional neural network model to obtain a second detection result;
an unqualified determination unit, configured to determine that the code to be analyzed is unqualified if the second detection result indicates that a defect exists in the target binary information;
a qualification judging unit, configured to judge that the code to be analyzed is qualified if the first detection result indicates that no defect exists in the target binary information;
and the code deleting unit is used for deleting the code to be analyzed in the convolutional neural network model.
Furthermore, an embodiment of the present application also provides an electronic device. FIG. 7 is a block diagram illustrating an electronic device 20 according to an exemplary embodiment, and the contents of the diagram should not be construed as limiting the scope of use of the present application in any way.
Fig. 7 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present disclosure. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. Wherein, the memory 22 is used for storing a computer program, and the computer program is loaded and executed by the processor 21 to implement the relevant steps in the code detection method disclosed in any one of the foregoing embodiments. In addition, the electronic device 20 in this embodiment may be specifically an electronic computer.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and a communication protocol followed by the communication interface is any communication protocol that can be applied to the technical solution of the present application, and is not specifically limited herein; the input/output interface 25 is configured to obtain external input data or output data to the outside, and a specific interface type thereof may be selected according to specific application requirements, which is not specifically limited herein.
In addition, the memory 22 serves as a carrier for resource storage and may be a read-only memory, a random access memory, a magnetic disk, an optical disk, or the like; the resources stored thereon may include an operating system 221, a computer program 222, and the like, and the storage may be transient or permanent.
The operating system 221 is used for managing and controlling each hardware device on the electronic device 20 and the computer program 222, and may be Windows Server, NetWare, Unix, Linux, or the like. In addition to the computer program that performs the code detection method disclosed in any of the foregoing embodiments on the electronic device 20, the computer program 222 may further include computer programs that perform other specific tasks.
Further, an embodiment of the present application further discloses a storage medium, where a computer program is stored, and when the computer program is loaded and executed by a processor, the steps of the code detection method disclosed in any one of the foregoing embodiments are implemented.
In the present specification, the embodiments are described in a progressive manner, and each embodiment focuses on differences from other embodiments, and the same or similar parts between the embodiments are referred to each other. The device disclosed in the embodiment corresponds to the method disclosed in the embodiment, so that the description is simple, and the relevant points can be referred to the description of the method part.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.
The above detailed description is provided for a code detection method, apparatus, device and storage medium, and the specific examples are applied herein to explain the principles and embodiments of the present invention, and the descriptions of the above embodiments are only used to help understand the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A code detection method, comprising:
determining a defect code from the code to be analyzed by a preset screening method;
extracting target information of the defect code by using a preset tool, and constructing a code slice based on the target information;
dividing the code slices according to a preset rule to obtain corresponding components;
formally abstracting the components to obtain target binary information;
and detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result, and judging whether the code to be analyzed is qualified or not based on the detection result.
2. The code detection method according to claim 1, wherein the determining a defect code from a code to be analyzed by a preset screening method comprises:
converting the code to be analyzed into a code attribute graph;
performing feature selection and quantization operation on the code attribute graph to obtain a target code attribute graph;
and screening the target code attribute graph according to a preset screening rule to obtain the defect code.
3. The code detection method according to claim 1, wherein the extracting target information of the defect code by using a preset tool and constructing a code slice based on the target information comprises:
extracting abstract syntax trees and data stream information of the defect codes by using a preset tool based on syntax analysis and lexical analysis;
constructing a code slice based on the abstract syntax tree and the data stream;
or generating the code slice based on the SARD and the code of the preset tool.
4. The code detection method according to claim 1, wherein the dividing the code slice according to the preset rule to obtain the corresponding component part comprises:
and dividing the code slice according to the execution relation between the code slice and the program statement to obtain the corresponding component.
5. The code detection method according to any one of claims 1 to 4, wherein before detecting the target binary information by using a preset code security judgment criterion, the method further comprises:
acquiring a source program with a preset type code security defect, and marking a program statement contained in a defect source code in the source program;
compiling the source program into a binary program;
acquiring an instruction register execution path of the binary program through a preset application tool to obtain a corresponding characteristic element set;
and constructing a check sample set based on the characteristic element set, and generating a convolutional neural network model for judging whether the target binary information has defects or not based on the check sample set.
6. The code detection method according to claim 5, wherein the detecting the target binary information by a preset code security judgment criterion to obtain a corresponding detection result comprises:
establishing the preset code safety judgment standard based on the convolutional neural network model; the convolutional neural network model comprises the preset code safety judgment standard;
detecting the target binary information by using the convolutional neural network model to obtain a first detection result;
and if the first detection result is that the target binary information has defects, turning the target binary information, and inputting the turned target binary information into the convolutional neural network model to obtain a second detection result.
7. The code detection method according to claim 6, wherein the determining whether the code to be analyzed is qualified based on the detection result comprises:
if the second detection result is that the target binary information has defects, judging that the code to be analyzed is unqualified;
and if the first detection result indicates that no defect exists in the target binary information, judging that the code to be analyzed is qualified, and deleting the code to be analyzed in the convolutional neural network model.
8. A code detection apparatus, comprising:
the defect code determining module is used for determining a defect code from the codes to be analyzed through a preset screening method;
the code slice construction module is used for extracting target information of the defect code by using a preset tool and constructing a code slice based on the target information;
the code slice dividing module is used for dividing the code slices according to a preset rule to obtain corresponding components;
the information acquisition module is used for formally abstracting the components to obtain target binary information;
and the code detection module is used for detecting the target binary information according to a preset code safety judgment standard to obtain a corresponding detection result and judging whether the code to be analyzed is qualified or not based on the detection result.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the steps of the code detection method of any of claims 1 to 7.
10. A computer-readable storage medium for storing a computer program; wherein the computer program, when executed by a processor, implements a code detection method as claimed in any one of claims 1 to 7.
CN202210998471.0A 2022-08-19 2022-08-19 Code detection method, device, equipment and storage medium Pending CN115344491A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210998471.0A CN115344491A (en) 2022-08-19 2022-08-19 Code detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210998471.0A CN115344491A (en) 2022-08-19 2022-08-19 Code detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115344491A true CN115344491A (en) 2022-11-15

Family

ID=83954288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210998471.0A Pending CN115344491A (en) 2022-08-19 2022-08-19 Code detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115344491A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119883869A (en) * 2025-03-26 2025-04-25 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Interpretable defect prediction method and related equipment
CN120523846A (en) * 2025-07-23 2025-08-22 宁波银行股份有限公司 Open source defect data processing method and device for service system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119883869A (en) * 2025-03-26 2025-04-25 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Interpretable defect prediction method and related equipment
CN119883869B (en) * 2025-03-26 2025-05-30 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) An explainable defect prediction method and related equipment
CN120523846A (en) * 2025-07-23 2025-08-22 宁波银行股份有限公司 Open source defect data processing method and device for service system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination