Summary of the invention:
The technical matters that the present invention will solve is: can produce state space blast problem to model checking method; The method that proposes a kind of parallelization model detection comes buffer-overflow vulnerability is verified; This method can be applied in large-scale program and the embedded OS module practically at present, and has the advantage that reduces state space, low rate of false alarm.
The technical scheme that the present invention adopts is: based on the parallelization security flaw detection method of function call figure; It is characterized in that: through analyzing the function calling relationship generating function calling graph of program to be detected; Find out the leaf node function in the calling graph tree; Restart parallelisation procedure, utilize model checking tools to accomplish detection simultaneously, thereby judge and analyze security breaches and cause the path the leaf node function.
The present invention adopts the parallel optimization method based on function and multithreading.At first, utilize register transfer language (RTL) specificity analysis of GCC to go out the function calling relationship in each file in the module to be detected simultaneously to the multifile module of input, at this moment, generating function concerns calling graph.Then, the funtcional relationship calling graph is analyzed, finding in-degree is zero function node (leaf node).At last, the file that leaf node is belonged to carries out pre-service, starts multithreading and also utilizes model checking tools BLAST to carry out the accessibility checking through pretreated file.
Whole function call figure mainly divides following several sections:
(1) calling graph generation module, this module is accomplished the function calling relationship map generalization, through generating the function calls graph of a relation of file module to be detected, can begin to analyze and detect from the bottom function easily, also can dispatch whole testing process easily;
(2) analyze scheduler module; The task of this module is the whole testing process of scheduling on the basis of calling graph; Because the independence between module file, the function that calls subtree from difference generally has less correlativity or does not have correlativity, can be conveniently this type of function be carried out parallel detection;
(3) pre-processing module; This module has been set up the constraint Analysis mechanism of a cover to buffer-overflow vulnerability, for buffer zone increases attribute length information mainly in the enterprising row constraint analysis of the abstract syntax tree of GCC; Different buffer zone action statement generates corresponding to different attribute constraints; The process of completion code plug-in mounting is analyzed then and is asserted the distribution situation of assert, for the model detection module provides the basis;
(4) detection module, this module testing tool BLAST that uses a model accomplishes the file after handling through pre-processing module, and detection system starts multithreading and detects simultaneously.
Detecting based on the model of function and multi-threaded parallelization finally is a kind ofly to verify whether given system satisfies the technology of specific character.A given property description that system to be detected is relevant with system; Execution through the model detection algorithm; Algorithm can prove whether this system satisfies given character, if system does not satisfy given character, system can provide the error reporting that comprises counter-example with; Thereby detect security breaches, so the security breaches problem successfully has been converted into the Reachability question to error label ERROR.
Embodiment:
The present invention utilizes static analysis, concerns function calls relation in the method tracing program of calling graph through generating function, for analyzing scheduler module corresponding scheduling information is provided; Then; On this basis, the file of required plug-in mounting is handled through pre-processing module, last, realize that through detection module parallel model detects; To realize check and analysis, guaranteed accurate detection to the source code leak for the security breaches problem.Structural drawing such as Fig. 1.
1. calling graph generation module
At first this module generates the function calling relationship figure of software to be detected, and it can begin to analyze and detect from the bottom function easily, also can dispatch whole testing process easily.
In the process that realizes, we have utilized the characteristic of the register transfer language (RTL) of GCC.It is through in the frontal chromatography code of GCC, generating abstract syntax tree (AST), in the rear end of GCC, is compiled into the RTL language to AST as the transition that is compiled into final code.
Through analyzing the RTL file, we find function definition can in the RTL file, generate "; Function fun-name (fun-name) " format string; wherein fun-name is meant actual function name; when running into call function, and the RTL file can generate the RTL instruction of call, and regular expression provides general format and is " (call.* (fun-name) .*) "; wherein, fun-name is the function name of calling.Therefore, we obtain generating main algorithm such as Fig. 2 of calling graph part, and the main contents of algorithm are:
At first, utilize GCC to generate the RTL file, RTL analyzes to confirm the function calls relation, carries out information extraction in the RTL file each is capable then, if having satisfied "; Function " row of form; just note RTL file content represented function name and file name thereof; and to zone bit of this function setup, if the row of satisfied (call) form is arranged, just record the represented homophony of RTL fileinfo in another file with the function and the function that is called; preserve corresponding function information with this, think that analyzing scheduler module provides reference information.
2. analysis scheduler module
On the basis in a last step, accomplished the associative operation of funtcional relationship calling graph, the main task of analyzing scheduler module is to analyze calling graph, the whole testing process of scheduling on the basis of calling graph.Because the modularity characteristic of software to be detected, it is very little or do not have correlativity to call the functional dependence of subtree from difference, and therefore, through the analysis to the funtcional relationship calling graph, we can do parallel detection to some such functions.On the one hand, we have also considered the sensitivity transmission of interprocedual, and just the parameter of callee is handled the situation that can influence caller, with this, should begin from the function than bottom to detect; On the other hand, owing to be that bottom-up scheduling detects, and the function of bottom generally is distributed in the different module, and correlativity is smaller, so it is parallel relatively easy to realize.
In analyzing scheduling; The funtcional relationship calling graph that we at first analyze module information to be measured and are generated; Find out the leaf node that concerns in the calling graph, the mark function and the file thereof that need detect well dispatched testing tool at last these functions to be detected and file carried out parallel detection then.Thus, we have obtained analyzing main algorithm such as Fig. 3 of scheduler module, and the main contents of algorithm are:
At first, to the homophony function and the function that is called, a homophony function whenever calls a function in the function call figure each, and its out-degree count value adds at every turn be called once its in-degree count value of 1, one called function and adds 1; Then, search all out-degree count values in the module file and be 0 function, and be labeled as leaf node, join in the leaf node array; Secondly, the fileinfo that belongs to all leaf nodes and leaf node is archived to during specific leaf sets a file, and carries out Hole Detection conveniently to utilize model checking tools.
3. pre-processing module
The main completion code plug-in mounting of pre-processing module with assert the function of analyzing.Code instrumentation adopts syntax-directed method; In the enterprising row constraint analysis of the abstract syntax tree of GCC; Set up the constraint Analysis mechanism of a cover to buffer-overflow vulnerability, for buffer zone increases attribute length information, different buffer zone action statement generates corresponding to different attribute constraints.For describing whole constraint Analysis process; At first carry out abstract to the C language; The non-stream sensitivity of analytic process; Can the processing controls stream information, analyze the present invention and the C language syntax of being concerned about is carried out abstract for simplifying, comprise pointer variable, integer variable, function call, Memory Allocation and assignment statement.According to the abstract operation of these grammers, generates corresponding property and handle statement, and create-rule is to be provided by the xml configuration file that this document will directly instruct the foundation of buffer zone attribute model to buffer zone.The good code of Fig. 4 a plug-in mounting is the C language codes that can compile, like Fig. 4 b.Because the singularity that parallel model detects and the limitation of BLAST model checking tools, we have done corresponding change by the C language codes after to plug-in mounting, comment out promptly that to comprise the C language codes that assert asserts capable.Activating assert to be detected one by one when analyzing conveniently to assert asserts.
After pre-processing module is accomplished plug-in mounting to file, also need analyze and assert the distribution situation of assert, judge the function at assert place, the binding analysis scheduler module provides function name to be detected and filename, for the model detection module provides the basis.Assert and analyze in search assert, can analyze and write down the residing action scope of statement, thereby analyze function definition.The specific algorithm of this pre-processing module such as Fig. 5, the main contents of algorithm are:
At first, for all leaf nodes, read out the C language file at leaf node place line by line; When tracking the leaf node function, correspondingly do a sign, if dangerous point function in the leaf node; The assert that is plug-in mounting asserts; Activate this to be detected asserting so, and be saved in it in the corresponding file, carry out parallel detection to make things convenient for the model detection module.Then, in the leaf function, continue to search, assert, carry out according to top method so if find assert once more, otherwise, jump out leaf node, then reset sign, thereby continue to follow the trail of other the leaf node and the C language file at place thereof.
Assert through only comprising an activation in each file that obtains after the pre-processing module processing asserts; To make things convenient for model checking tools to detect; Simultaneously owing to correlativity between the leaf node is less relatively; So the method that we can launch multithreading detects the C language file of handling the back generation through pre-processing module.
4. detection module
The detection module testing tool BLAST that uses a model detects through pretreated C language file, and this module starts multithreading and detects simultaneously.Detection module carries out the accessibility checking to the function name of analyzing the scheduler module generation with asserting, the testing result that obtains is scheduler module processing by analysis also.
Blast is the model checking tools of analyzing between a control stream sensitivity and supporting process.People such as Thomas A.Henzinger exploitation by University of California.Blast has carried out accurate inspection to the null pointer misquotation of C language codes, and rate of false alarm is very low; Blast can carry out the approachability analysis of code in addition, and just whether determining program can begin to carry out and arrive certain appointed positions from entering the mouth out.
Use Blast that the attribute model of being set up in the last step is carried out approachability analysis.Attribute constraint for being generated in the attribute model changes the label (Fig. 4 b) that Blast need verify into through macro substitution, if this label can reach, shows that then there are security breaches in this place.Blast uses the method for counter-example guidance to carry out approachability analysis, can the execution route from program entry to this label be noted, and through can analyze the execution route that produces security breaches to path trace, makes things convenient for that the program personnel are manual to search and confirm.Fig. 6 illustrates resulting Hole Detection result, in this result, shows the position and the filename of leak path file, opens the specifying information that this document can be seen the leak path.
The main thought of detection module is exactly to carry out the accessibility checking with function entrance.Because it is that elementary cell is carried out that model detects with the function, with respect to the accessibility that begins from the main function, the number of times that calls the theorem prover inquiry can reduce relatively, and the searching route of checking also can shorten relatively.In addition, another the obvious benefit that detects based on function makes things convenient for parallel detection to handle exactly, begins from leaf node because model detects, and the correlativity between the leaf node function is less or do not have a correlativity.The parallel detection here is to adopt POSIX threads (being called for short Pthreads).Pthreads is that a cover is commonly used in the parallel storehouse of carrying out multiple programming on the multi-core platform, and it has made full use of the multinuclear resource on the machine.