CN116257447B - A method and device for inspecting website backend after bug repair - Google Patents

Info

Publication number
CN116257447B
CN116257447B CN202310210845.2A CN202310210845A CN116257447B CN 116257447 B CN116257447 B CN 116257447B CN 202310210845 A CN202310210845 A CN 202310210845A CN 116257447 B CN116257447 B CN 116257447B
Authority
CN
China
Prior art keywords
patch
bug
snapshot
test
program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310210845.2A
Other languages
Chinese (zh)
Other versions
CN116257447A (en
Inventor
王兴起
周旋
邵艳利
方景龙
魏丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202310210845.2A
Publication of CN116257447A
Application granted
Publication of CN116257447B

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Prevention of errors by analysis, debugging or testing of software
    • G06F11/3668 Testing of software
    • G06F11/3672 Test management
    • G06F11/3684 Test management for test design, e.g. generating new test cases
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G06F11/3692 Test management for test results analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses a method and device for inspecting a website backend after a bug fix. For the repaired backend code, the method finds the snapshot with the maximum suspiciousness, identifies overfitting patches, and classifies them. It rests on one observation: for a passing test case, the dynamic behavior of the buggy program and of a correctly patched program is the same, whereas for a failing test case it differs. A snapshot that captures the behavior causing the error is constructed from the buggy program and the test set, the same snapshot is read from the patched program, and the patch is judged to be overfitting or not according to whether the snapshot's value changes once the patch is applied. The invention reinterprets patch similarity in terms of program invariants and program expressions and proposes a five-tuple representation for computing patch similarity, which is used to identify and subdivide overfitting patches in automatic patch generation.

Description

Method and device for inspecting a website backend after bug repair
Technical Field
The invention belongs to the technical field of bug repair and relates to a method and device for inspecting a website backend after bug repair, used to identify overfitting patches among the patches generated by automatic software repair tools and to classify those overfitting patches.
Background
Automated Program Repair (APR) has been studied extensively over the last decade, and many repair techniques have been proposed, most of them based on test sets. A test-set-based repair tool treats the given test set as the oracle: if a generated patch passes the test set, the patch is considered correct. In practice, however, the test set is weak and does not fully express the expected behavior of the program, so a patch that passes every test case is not necessarily correct. Patches that pass all test cases but are still wrong are called overfitting patches. As a result, APR techniques produce a large number of ineffective patches. Current repair techniques are far from mature, and most simply accept any patch that passes the test set. To filter out the wrong ones, developers must verify a large number of patches by hand, which consumes considerable resources. Solving the patch overfitting problem is therefore an urgent issue.
Since a patch is considered correct when it passes the test set, the number of overfitting patches can be reduced by strengthening the test set. However, automatic test generation tools can only generate test inputs; the expected outputs still have to be determined manually. Even then, the test set does not express the complete oracle, and for large projects a complete oracle is very difficult to obtain. Existing techniques can identify whether a patch is overfitting. Rapid identification of overfitting patches improves the success rate of APR techniques and helps developers fix errors; if a technique can also subdivide overfitting patches, developers can fix program errors faster still.
In general, overfitting patches fall into three categories: (1) A-Overfitting Patch: the patch neither completely repairs the incorrect behavior nor breaks the originally correct behavior; (2) B-Overfitting Patch: the patch repairs the originally incorrect behavior but breaks originally correct behavior, which is known as a regression error; (3) AB-Overfitting Patch: the patch does not repair the incorrect behavior and also breaks originally correct behavior. Several overfitting detection methods have been proposed. One strategy mines the test set and the deep behavior of the program and, using the principle of patch similarity, reaches an overfitting patch identification success rate of 56%.
Definition of Terms
Program abstract state: the abstract value of program behavior while the website backend code is running.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art and provides a method and device for inspecting a website backend after bug repair. It proposes a new technique, PatchID, whose core idea is that for passing test cases the dynamic behavior of the buggy program and of a correctly patched program is the same, while for failing test cases it differs. PatchID first constructs, from the buggy program and the test set, the dynamic behavior expression snapshot that causes the program to go wrong; it then generates new test cases to strengthen the original test set; finally it reads the same snapshot from the patched program and judges whether the patch is overfitting according to whether the value of the snapshot changes when the patch is applied.
In a first aspect, the invention provides a method for inspecting a website backend after bug repair, comprising the following steps:
Step 1: find the dynamic behavior expression snapshot with the maximum suspiciousness for the website backend code after bug repair.
Step 1-1: obtain the dynamic behavior expression snapshot of each test case.
Run the backend website as it was before the bug repair, together with its test set $t_o$. First construct the Boolean expressions required by the snapshot, giving the Boolean expression set $B_{bug}$, and collect the program abstract state of each test case during the run. Then evaluate every Boolean expression in $B_{bug}$ to generate the dynamic behavior expression snapshot of each test case in the test set.
The dynamic behavior expression snapshot is expressed as a five-tuple based on the patch similarity principle:
$$snapshot = \langle l, b, ?, i, v_i \rangle \tag{1}$$
where $l$ is the unique location identifier of a statement, $b$ is the Boolean expression, $?$ is the value of $b$, $i$ is the unique serial number of a test case in the test set, and $v_i$ is the actual value of $b$ for test case $t_i$ during execution of the buggy program.
Step 1-2: compute the suspiciousness of each dynamic behavior expression snapshot. The calculation formula is defined as follows:
where $ed_s$ is the dependence variable (syntactic analysis of expression dependence) and $dy_s$ is the dynamic analysis variable (dynamic analysis).
Each snapshot has a suspiciousness determined by two quantities: (1) $ed_s$ and (2) $dy_s$. $ed_s$ grows with the number of occurrences of $b$ in the preceding and following statements. $dy_s$ grows the more often $b$ takes this value in failing test cases and the less often it does so in passing test cases.
Step 1-3: select the dynamic behavior expression snapshot with the maximum suspiciousness, denoted $s_{max}$. This snapshot is taken as the dynamic behavior expression snapshot of the bug, which yields the snapshot set $s_{bug}$ corresponding to the test set $t_o$.
Step 2: augment the test set $t_o$ to obtain the augmented test set $t_e$.
Randomly generate a number of new test cases with EvoSuite, substitute them for the test set in step 1-1, and repeat step 1-1 to compute their dynamic behavior expression snapshots, denoted $s_{new}$. If $s_{new}$ is the same as $s_{max}$, add the corresponding test case to $t_o$; otherwise discard it. The result is the augmented test set $t_e$.
Step 3: identify overfitting patches and classify them, as follows.
Step 3-1: obtain the location $l_{patch}$ to be monitored in the patch used for the bug repair.
The location of the bug before the backend repair cannot be monitored directly in the patch, so a location $l_{patch}$ in the patch must be chosen at which to monitor the same Boolean expression $b$ as in the bug's dynamic behavior expression snapshot. Whatever the repair operation, the program can behave correctly only after the repair operation has finished. Therefore the first statement that differs between the buggy program and the patch is defined as $start_s$, the last differing statement as $end_s$, and the monitoring location is chosen by the following rules:
1) If $start_s$ is a block statement (for, while, or if) and $end_s$ lies inside $start_s$, then $l_{patch}$ is the statement that follows the end of the block statement.
2) If $start_s$ is not a block statement, check whether $end_s$ is the last statement: if not, $l_{patch}$ is the statement following $end_s$; if so, $l_{patch} = end_s$.
Step 3-2: run the repaired backend website with the augmented test set $t_e$, obtain the program abstract state of each test case in $t_e$ at $l_{patch}$, derive the dynamic behavior expression snapshots from it, and so obtain the snapshot set $s_{patch}$ corresponding to $t_e$; $s_{patch}$ and $s_{bug}$ have the same Boolean expression $b$ and the same $?$.
Step 3-3: compare the sets $s_{bug}$ and $s_{patch}$ test case by test case, following the serial numbers in $t_e$, to obtain $N_f$, the number of failing test cases whose $v$ is the same in $s_{bug}$ and $s_{patch}$, and $N_p$, the number of passing test cases whose $v$ differs between $s_{bug}$ and $s_{patch}$.
The type of the patch is identified according to formula (3):
where correct denotes a correct patch; A denotes a type-A overfitting patch, which neither completely repairs the incorrect behavior nor breaks the originally correct behavior; B denotes a type-B overfitting patch, which repairs the originally incorrect behavior but breaks originally correct behavior (a regression error); and AB denotes a type-AB overfitting patch, which does not repair the incorrect behavior and also breaks originally correct behavior.
In a second aspect, an inspection apparatus is provided, comprising:
a maximum-suspiciousness snapshot search module, which finds the dynamic behavior expression snapshot with the maximum suspiciousness for the website backend code after bug repair;
a test data augmentation module, which augments the test set $t_o$; and
an overfitting patch identification and classification module.
In a third aspect, a computer-readable storage medium is provided, storing a computer program that, when executed on a computer, causes the computer to perform the method.
In a fourth aspect, a computing device is provided, comprising a memory and a processor, the memory storing executable code that, when executed by the processor, implements the method.
The beneficial effects of the invention are:
1. The invention reinterprets patch similarity in terms of program invariants and program expressions and provides a five-tuple representation for computing patch similarity, used to identify and subdivide overfitting patches in automatic patch generation.
2. The invention identifies 63 overfitting patches and 15 correct patches on the classic Java dataset Defects4J, and the experimental data show that the method outperforms existing comparable methods. Because the technique subdivides overfitting patches, developers can turn an overfitting patch into a correct one more quickly.
Drawings
FIG. 1 is an overall flow chart of the method of the present invention.
Detailed Description
The invention is described in detail below in connection with a software automatic repair technique according to the accompanying drawings. The whole flow of the invention is shown in figure 1 of the accompanying drawings, and the specific steps are as follows:
Step 1: find the dynamic behavior expression snapshot with the maximum suspiciousness for the website backend code after bug repair.
Step 1-1: obtain the dynamic behavior expression snapshot of each test case.
Run the backend website as it was before the bug repair, together with its test set $t_o$. First construct the Boolean expressions required by the snapshot, giving the Boolean expression set $B_{bug}$, and collect the program abstract state of each test case during the run. Then evaluate every Boolean expression in $B_{bug}$ to generate the dynamic behavior expression snapshot of each test case in the test set.
Each Boolean expression is formed by combining variables of the same type with comparison operators (<, ≥, >, ...).
The dynamic behavior expression snapshot is expressed as a five-tuple based on the patch similarity principle:
$$snapshot = \langle l, b, ?, i, v_i \rangle \tag{1}$$
where $l$ is the unique location identifier of a statement, $b$ is the Boolean expression, $?$ is the value of $b$, $i$ is the unique serial number of a test case in the test set, and $v_i$ is the actual value of $b$ for test case $t_i$ during execution of the buggy program.
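As an illustration of the five-tuple, the sketch below models a snapshot as a small Python record and fills it in for two test cases. The class name `Snapshot`, the helper `take_snapshots`, the dictionary-based abstract state, and the location string are all assumptions made for this example; the source does not say how the value `?` is chosen, so here it is simply passed in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]  # hypothetical abstract state: variable name -> value


@dataclass(frozen=True)
class Snapshot:
    location: str  # l: unique identifier of the monitored statement
    expr: str      # b: the Boolean expression, kept as text
    value: bool    # ?: the value of b that this snapshot refers to
    test_id: int   # i: serial number of the test case
    actual: bool   # v_i: value of b observed for test case t_i


def take_snapshots(location: str, expr: str, predicate: Callable[[State], bool],
                   value: bool, states: Dict[int, State]) -> List[Snapshot]:
    """Evaluate b on the state recorded for each test case, in test-id order."""
    return [Snapshot(location, expr, value, i, predicate(state))
            for i, state in sorted(states.items())]


# Made-up states recorded at one statement for two test cases.
states = {1: {"x": 3, "y": 5}, 2: {"x": 7, "y": 2}}
snaps = take_snapshots("Foo.java:42", "x < y", lambda s: s["x"] < s["y"], True, states)
```

Keeping `value` and `actual` as separate fields mirrors the later comparison step, in which $b$ and $?$ stay fixed while $v_i$ may change once the patch is applied.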
With a correct patch applied, a passing test case yields the same Boolean expression value as before, whereas a failing test case should yield a different one. Suppose that in a buggy program every passing test case makes a Boolean expression $b$ evaluate to false and every failing test case makes it evaluate to true. Then whether a patch is overfitting need not be judged only by observing the program's output; it can also be judged by comparing the value of $b$ at a given statement before and after the patch is applied. When a passing test case is run against the patch, the value of $b$ should agree with that in the buggy program; when a failing test case is run against the patch, the value of $b$ should differ from that in the buggy program.
Step 1-2: compute the suspiciousness of each dynamic behavior expression snapshot. The calculation formula is defined as follows:
where $ed_s$ is the dependence variable (syntactic analysis of expression dependence) and $dy_s$ is the dynamic analysis variable (dynamic analysis).
Step 1-3: select the dynamic behavior expression snapshot with the maximum suspiciousness, denoted $s_{max}$. This snapshot is taken as the dynamic behavior expression snapshot of the bug, which yields the snapshot set $s_{bug}$ corresponding to the test set $t_o$.
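The selection in step 1-3 amounts to taking the maximum over per-snapshot scores. The sketch below assumes the suspiciousness values have already been computed; it does not implement formula (2), and the scores, keys, and the function name `most_suspicious` are made up for illustration.

```python
from typing import Dict, Tuple

# Key: (location l, Boolean expression b, value ?); value: suspiciousness score.
# The scores are made-up inputs; this sketch does not compute formula (2).
Key = Tuple[str, str, bool]


def most_suspicious(scores: Dict[Key, float]) -> Key:
    """Return the snapshot key with the highest suspiciousness."""
    if not scores:
        raise ValueError("no snapshots to rank")
    # Ties go to the first key in insertion order, a choice the source does not address.
    return max(scores, key=scores.get)


scores = {
    ("Foo.java:40", "x < y", True): 0.31,
    ("Foo.java:42", "x < y", True): 0.87,
    ("Foo.java:42", "n == 0", False): 0.52,
}
s_max = most_suspicious(scores)
```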
Step 2: augment the test set $t_o$ to obtain the augmented test set $t_e$.
Randomly generate a number of new test cases with EvoSuite, substitute them for the test set in step 1-1, and repeat step 1-1 to compute their dynamic behavior expression snapshots, denoted $s_{new}$. If $s_{new}$ is the same as $s_{max}$, add the corresponding test case to $t_o$; otherwise discard it. The result is the augmented test set $t_e$.
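The filtering rule of step 2 can be sketched as follows. Test generation itself is outside the sketch: the generated test cases are passed in as plain values, and `snapshot_of` is a hypothetical callback standing in for a rerun of step 1-1 on one test case.

```python
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def augment_tests(original: List[T],
                  generated: Iterable[T],
                  snapshot_of: Callable[[T], Hashable],
                  s_max: Hashable) -> List[T]:
    """Return t_e: the original tests plus generated tests whose snapshot equals s_max."""
    enhanced = list(original)
    for test in generated:
        if snapshot_of(test) == s_max:
            enhanced.append(test)  # reaches the same suspicious snapshot: keep it
        # any other generated test is discarded
    return enhanced


# Toy setting: a test case is just an integer input x, and the snapshot records
# whether "x < 0" holds at a made-up location.
target = ("Foo.java:42", "x < 0", True)
t_e = augment_tests([-1, 5], [3, -7, -2, 10],
                    lambda x: ("Foo.java:42", "x < 0", x < 0), target)
```

In this toy run only the generated inputs -7 and -2 reproduce the target snapshot, so they are the only additions to the original two tests.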
Step 3: identify overfitting patches and classify them, as follows.
Step 3-1: obtain the location $l_{patch}$ to be monitored in the patch used for the bug repair.
The location of the bug before the backend repair cannot be monitored directly in the patch, so a location $l_{patch}$ in the patch must be chosen at which to monitor the same Boolean expression $b$ as in the bug's dynamic behavior expression snapshot. The operations a patch applies to a buggy program are generally insert, delete, replace, and update. Whatever the repair operation, the program can behave correctly only after the repair operation has finished. Therefore the first statement that differs between the buggy program and the patch is defined as $start_s$, the last differing statement as $end_s$, and the monitoring location is chosen by the following rules:
1) If $start_s$ is a block statement (for, while, or if) and $end_s$ lies inside $start_s$, then $l_{patch}$ is the statement that follows the end of the block statement.
2) If $start_s$ is not a block statement, check whether $end_s$ is the last statement: if not, $l_{patch}$ is the statement following $end_s$; if so, $l_{patch} = end_s$.
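The two rules can be written as a small function over statement indices. This is a sketch under assumptions: statements are numbered consecutively, "the last statement" is read as the last statement of the enclosing method, and situations the rules do not mention (for example, $end_s$ lying outside the block) return `None`. The function and parameter names are made up.

```python
from typing import Optional


def monitor_location(start: int,
                     end: int,
                     last: int,
                     block_end: Optional[int] = None) -> Optional[int]:
    """Pick l_patch as a statement index in the patched code.

    start, end : indices of the first and last differing statements (start_s, end_s)
    last       : index of the final statement of the enclosing method (assumed scope)
    block_end  : index of the last statement of start_s's block when start_s is a
                 for/while/if statement, otherwise None
    Returns None for cases the two rules do not cover.
    """
    if block_end is not None:
        # Rule 1: start_s is a block statement and end_s lies inside it.
        if start <= end <= block_end and block_end < last:
            return block_end + 1
        return None
    # Rule 2: start_s is not a block statement.
    return end + 1 if end < last else end


inside_block = monitor_location(3, 5, 10, block_end=6)
plain_change = monitor_location(3, 4, 10)
change_at_end = monitor_location(9, 10, 10)
```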
Step 3-2: run the repaired backend website with the augmented test set $t_e$, obtain the program abstract state of each test case in $t_e$ at $l_{patch}$, derive the dynamic behavior expression snapshots from it, and so obtain the snapshot set $s_{patch}$ corresponding to $t_e$; $s_{patch}$ and $s_{bug}$ have the same Boolean expression $b$ and the same $?$.
Step 3-3: compare the sets $s_{bug}$ and $s_{patch}$ test case by test case, following the serial numbers in $t_e$, to obtain $N_f$, the number of failing test cases whose $v$ is the same in $s_{bug}$ and $s_{patch}$, and $N_p$, the number of passing test cases whose $v$ differs between $s_{bug}$ and $s_{patch}$.
The type of the patch is identified according to formula (3):
where correct denotes a correct patch; A denotes a type-A overfitting patch, which neither completely repairs the incorrect behavior nor breaks the originally correct behavior; B denotes a type-B overfitting patch, which repairs the originally incorrect behavior but breaks originally correct behavior (a regression error); and AB denotes a type-AB overfitting patch, which does not repair the incorrect behavior and also breaks originally correct behavior.
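Formula (3) itself does not appear in this text. One reading that is consistent with the definitions of $N_f$, $N_p$, and the four patch types is sketched below: a patch is correct when both counts are zero, type A when only $N_f$ is positive, type B when only $N_p$ is positive, and type AB when both are. This mapping is an assumption, as are the function names and the dictionary representation of $v$ per test case.

```python
from typing import Dict, Set, Tuple


def count_nf_np(v_bug: Dict[int, bool],
                v_patch: Dict[int, bool],
                failing: Set[int]) -> Tuple[int, int]:
    """N_f: failing tests whose v is unchanged; N_p: passing tests whose v changed."""
    n_f = sum(1 for i in v_bug if i in failing and v_bug[i] == v_patch[i])
    n_p = sum(1 for i in v_bug if i not in failing and v_bug[i] != v_patch[i])
    return n_f, n_p


def classify(n_f: int, n_p: int) -> str:
    """Assumed reading of formula (3); the source text does not reproduce it."""
    if n_f == 0 and n_p == 0:
        return "correct"
    if n_p == 0:
        return "A"   # bug not fully repaired, nothing broken
    if n_f == 0:
        return "B"   # bug repaired, but previously correct behavior broken
    return "AB"      # bug not repaired and correct behavior broken


# Made-up values of v per test id; test 3 is the failing test on the buggy program.
v_bug = {1: False, 2: False, 3: True}
v_patch = {1: False, 2: True, 3: False}
counts = count_nf_np(v_bug, v_patch, failing={3})
label = classify(*counts)
```

In the example the failing test's value changes (so $N_f = 0$) while one passing test's value also changes (so $N_p = 1$), which this reading labels as a type-B patch.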
The invention was verified experimentally on two datasets. The first is based on Defects4J and consists of patches generated by 6 APR tools on Defects4J. The second is the Java+JML dataset created by Nilizadeh et al.
Defects4J, proposed by Just et al., is currently the most widely used Java program dataset in the field of automatic program repair. Defects4J has 17 projects so far, containing 835 defects. Each bug in the dataset comes with at least one test case that triggers it. The method uses the 6 most commonly used projects in the dataset: Chart, Time, Math, Lang, Closure, and Mockito. Chart is a project for rendering charts, Time is a date and time processing library, Math is a scientific computing library, Lang is a set of additional utility methods for operating on JDK classes, Closure is an optimizing compiler for JavaScript, and Mockito is a mocking framework for unit testing. The number of bugs in each project is shown in Table 1.
Table 1: Defects4J projects
Project    Number of bugs
Chart      26
Time       26
Math       106
Lang       64
Closure    174
Mockito    38
Total      434
The method uses 6 existing repair tools on the Defects4J dataset to obtain candidate patches: jGenProg, Nopol2015, Nopol2017, ACS, HDRepair, and jKali. jGenProg is a Java implementation of GenProg, a heuristic search repair tool based on a genetic algorithm. Nopol is a repair technique for conditional statement errors in Java programs that applies a different strategy depending on the type of the faulty statement: if the located fault is a conditional statement, the patch generated by Nopol usually modifies the original condition; if it is a non-conditional statement, the repair adds a new condition that skips execution of that statement. The dataset includes both the 2015 and the 2017 versions of Nopol. ACS is a high-precision condition synthesis tool that extracts patch templates for repair based on statistical analysis. HDRepair is a repair tool based on statistical analysis. jKali is a Java reimplementation of Kali, a repair tool that works by deleting functionality.
The Java+JML dataset proposed by Nilizadeh is the first validated, publicly available Java program dataset. It consists of four parts: correct programs, mutated buggy programs, test suites, and APR-generated patches. The programs in this dataset carry JML specifications for experimental evaluation. The dataset implements various classic algorithms and data structures, such as bubble sort, factorization, and queues. They are all small programs with formal specifications written in JML and can therefore be regarded as programs with an oracle. The test suites were created with an AFL-based fuzzing tool and are classified by the number of generated test cases as Small or Medium. The buggy programs were created with PITest, a Java mutation tool, by injecting a single error into each Java program. PITest generates errors by changing control conditions, changing assignment expressions, deleting method calls, and changing return values. The APR-generated patches were obtained with the following repair tools: ARJAE, Cardumen, jGenProg, jKali, jMutRepair, Kali-A, and Nopol.
Experimental results:
Performance on Defects4J: The APR tools generated 220 patches on the Defects4J dataset, and all 220 were tested to judge whether they are overfitting patches. Runs completed for 166 patches; the rest were terminated for exceeding the execution time limit and produced no final result. Of the 166 patches, the method gave a determination for 157 and none for the other 9. The specific patch determination results are shown in Table 2.
Table 2: Defects4J dataset
Tables 3 and 4 show the results of the method by repair tool and by project, respectively. As the tables show, PatchID correctly classified 78 of the 157 patches: 63 overfitting patches and 15 correct patches. PatchID further divided the 63 overfitting patches into three categories: 50 A-Overfitting Patches, 8 B-Overfitting Patches, and 5 AB-Overfitting Patches.
Table 3: Results by APR tool
Tool        Correct   Overfitting   Correct detected   Overfitting detected   A    B   AB
Nopol2015   5         20            2 (40%)            10 (50%)               9    0   1
Nopol2017   3         68            2 (66.66%)         36 (52.94%)            25   8   3
HDRepair    4         5             3 (75%)            1 (20%)                1    0   0
ACS         11        6             7 (63.63%)         1 (16.66%)             1    0   0
jKali       1         14            0                  8 (57.14%)             8    0   0
jGenprog    6         14            1 (16.67%)         7 (50%)                6    0   1
Total       30        127           15 (50%)           63 (49.61%)            50   8   5
"Correct detected" and "Overfitting detected" give the number of correct and overfitting patches, respectively, that the method of the invention classified correctly.
A = A-Overfitting Patch, B = B-Overfitting Patch, AB = AB-Overfitting Patch
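The totals row and the overall success rate quoted in the text can be rechecked from the per-tool counts in the table above. The short script below copies those counts and recomputes the three rates; the variable and function names are made up for the example.

```python
# (correct, overfitting, correct detected, overfitting detected) per tool,
# copied from the per-tool results table above.
table3 = {
    "Nopol2015": (5, 20, 2, 10),
    "Nopol2017": (3, 68, 2, 36),
    "HDRepair": (4, 5, 3, 1),
    "ACS": (11, 6, 7, 1),
    "jKali": (1, 14, 0, 8),
    "jGenprog": (6, 14, 1, 7),
}


def rate(detected: int, total: int) -> float:
    """Detection rate in percent, rounded to two decimals."""
    return round(100.0 * detected / total, 2)


# Column sums: correct, overfitting, correct detected, overfitting detected.
totals = [sum(row[k] for row in table3.values()) for k in range(4)]
correct_rate = rate(totals[2], totals[0])  # 15 of 30 correct patches
overfit_rate = rate(totals[3], totals[1])  # 63 of 127 overfitting patches
overall_rate = rate(totals[2] + totals[3], totals[0] + totals[1])  # 78 of 157 patches
```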
Table 4: Results by project
Project   Correct   Overfitting   Correct detected   Overfitting detected   A    B   AB
Lang      6         10            2 (33.33%)         3 (50%)                3    0   0
Math      16        49            8 (50%)            22 (44.90%)            20   1   1
Chart     3         21            1 (33.33%)         12 (57.14%)            10   0   2
Time      2         10            2 (100%)           6 (60%)                5    1   0
Closure   2         37            1 (50%)            20 (54.05%)            12   6   2
Mockito   1         0             1 (100%)           0                      0    0   0
Total     30        127           15 (50%)           63 (49.61%)            50   8   5
Overfitting patches: Table 3 shows that PatchID works well on four repair tools, Nopol2015, Nopol2017, jKali, and jGenprog (success rate of at least 50%), but poorly on ACS and HDRepair (success rate of at most 20%). We also find that, among the overfitting patches produced by these 6 tools, patches that fail to fix the original error are the most common, while patches that break the originally correct behavior of the program are fewer. However, Nopol2015 and Nopol2017 (both tools repair bugs by modifying conditional statements) together produced 12 patches that break correct behavior, whereas among the other tools only jGenprog produced one AB-Overfitting Patch. We hypothesize that modifying conditional statements makes it relatively easy to introduce new errors.
By project, the success rate of overfitting patch identification is relatively stable, ranging from 43% to 60%. PatchID has its highest success rate on the Time project, at 60%, and its lowest on the Math project, at 44.90%. Closure has the most patches that break the originally correct behavior of the program, 8 in total.
Correct patches: of the 157 patches, 30 are correct, and PatchID judges 15 of them correctly, a success rate of 50%. This is an encouraging result; to the best of our knowledge, no existing tool reaches such a success rate. On the patches generated by Nopol2017, HDRepair, and ACS, PatchID's success rate exceeds 60%, peaking at 75%. By project, the success rate is reasonably high everywhere except Lang and Chart, and it reaches 100% on Mockito and Time.
Compared with the results of Xiong on this dataset, PatchID identified one more overfitting patch than Xiong's method; in addition, Xiong's method identified no correct patches, while PatchID identified 15. Of the 220 patches, Xiong's method identified 62 in total and PatchID identified 78. Xiong, however, raised the identification success rate to 56.3% by pruning the average strategy, against 49.7% for PatchID. Furthermore, Xiong's method can only judge patches for four projects (Chart, Lang, Math, and Time), while PatchID handles patches for 6 projects, so PatchID is the more general of the two.
Performance on the Java+JML dataset: From the Java+JML dataset we selected 236 overfitting patches based on the Medium test suite and 336 overfitting patches based on the Small test suite; all were judged against the JML specification and confirmed to be overfitting patches. There are also 21 false negative patches (cases in which the JML specification mistakenly considers a correctly repaired program to be overfitted). PatchID was run on all 593 patches and produced results for 380 of them; the specific results are shown in Table 5.
Table 5: Java+JML dataset
Patch type        Collected   Validated
Medium            236         144
Small             336         221
False negatives   21          15
Total             593         380
The data in Table 6 show that PatchID's success rate is 50% on patches based on the Medium test suite and 41.62% on patches based on the Small test suite; on the false negative patches, the success rate is 33.33%.
In terms of the overfitting classification, PatchID did not identify any B-Overfitting Patch on this dataset. Apart from 4 AB-Overfitting Patches, all the overfitting patches it identified are of type A-Overfitting.
The success rates for Medium and Small show that as the number of test cases in the test suite decreases, the success rate decreases as well. This indicates that weak test suites reduce the success rate of PatchID.
Table 6: Results by patch type
Patch type        Correct detected   Overfitting detected   A     B   AB
Medium            72 (50%)           72 (50%)               72    0   0
Small             129 (58.37%)       92 (41.62%)            88    0   4
False negatives   5 (33.33%)         10 (66.67%)            10    0   0
Total             206                174                    170   0   4

Claims (4)

1. A method for inspecting the backend of a website after a bug is fixed, characterized in that the method comprises the following steps:
Step 1: Find the dynamic behavior expression (snapshot) with the maximum suspiciousness in the website backend code after the bug is fixed.
Step 1-1: Obtain the snapshot of each test case. Run the website backend as it was before the bug fix, together with its corresponding test set t_o. First construct the Boolean expressions required for the snapshots to obtain the Boolean expression set B_bug, and collect the abstract program state during the run of each test case; then evaluate each Boolean expression in B_bug to generate the snapshot of each test case in the test set.
The snapshot is expressed as a five-tuple based on the patch similarity principle:
[five-tuple formula not reproduced in the source text]
where the position element (its symbol is not reproduced in the source text) is the unique position identifier of each statement, b is a Boolean expression, ? is the value of b, i is the unique serial number of each test case in the test set, and v_i is the actual value of b while test case t_i executes on the buggy program.
Step 1-2: Calculate the suspiciousness of each snapshot. The calculation formula is defined as follows:
[suspiciousness formula not reproduced in the source text]
where ed_s represents dependent variables and dy_s represents dynamic analysis variables.
Step 1-3: Select the snapshot with the maximum suspiciousness, denoted s_max. This snapshot is taken as the snapshot of the bug, and from it the snapshot set s_bug corresponding to the test set t_o is obtained.
Step 2: Perform data enhancement on the test set t_o to obtain the enhanced test set t_e. Randomly generate multiple new test cases with the EvoSuite software, replace the test set in Step 1-1 with these new test cases, and repeat Step 1-1 to compute their snapshots, denoted s_new. If s_new is the same as s_max, add the test case corresponding to s_new to the test set t_o; otherwise discard that test case. The result is the enhanced test set t_e.
Step 3: Identify overfitting patches and classify the identified overfitting patches, as follows:
Step 3-1: Obtain the location to be monitored in the patch used for the bug fix. Because the location of the bug in the pre-fix website backend cannot be monitored directly in the patch, a location in the patch must be reselected at which to monitor the same Boolean expression b as in the snapshot of the bug. Whatever the repair operation, the program can exhibit correct behavior only after the repair operation has completed. Therefore, the first statement that differs between the bug and the patch is denoted start_s, the last differing statement is denoted end_s, and the monitoring location is selected by the following rules:
1) If start_s is a block statement and end_s is inside start_s, the monitoring location is the statement following the end of the block statement; a block statement is a for, while, or if statement.
2) If start_s is not a block statement, determine whether end_s is the last statement. If it is not, the monitoring location is the statement following end_s; if it is, [monitoring location for this case not reproduced in the source text].
Step 3-2: Run the website backend after the bug fix with the enhanced test set t_e, obtain the abstract program state of each test case in t_e at the monitoring location, derive the corresponding snapshots, and finally obtain the snapshot set s_patch corresponding to t_e, where s_patch and s_bug share the same Boolean expression b and the same ?.
Step 3-3: Compare the two sets s_bug and s_patch by the serial numbers of the test cases in t_e to obtain N_f, the number of failing test cases whose v is the same in s_bug and s_patch, and N_p, the number of passing test cases whose v differs between s_bug and s_patch. Identify the type of the patch according to formula (3):
[formula (3) not reproduced in the source text]
where correct denotes a correct patch; A denotes an overfitting patch of type A, i.e., the patch neither completely fixes the incorrect behavior nor breaks the originally correct behavior; B denotes an overfitting patch of type B, i.e., the patch fixes the originally incorrect behavior but breaks the originally correct behavior, which is called a regression error; AB denotes an overfitting patch of type AB, i.e., the patch not only fails to fix the incorrect behavior but also breaks the originally correct behavior.
2. An inspection device for implementing the method of claim 1, characterized by comprising:
a maximum-suspiciousness snapshot search module, used to find the snapshot with the maximum suspiciousness in the website backend code after the bug is fixed;
a test data enhancement module, used to perform data enhancement on the test set t_o;
an overfitting patch identification and classification module.
3. A computer-readable storage medium having a computer program stored thereon, which, when executed in a computer, causes the computer to execute the method of claim 1.
4. A computing device comprising a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, the method of claim 1 is implemented.
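The comparison in Step 3-3 of claim 1 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the source: each snapshot set is reduced to a mapping from test-case serial number i to the observed value v, the set `failing` holds the serial numbers of test cases that fail on the buggy program, and the function names `count_mismatches` and `classify_patch` are hypothetical. Formula (3) is not reproduced in this text, so the decision rule below is inferred from the verbal definitions of the types correct, A, B, and AB and may differ from the patented formula.

```python
from typing import Dict, Set, Tuple


def count_mismatches(
    s_bug: Dict[int, bool],
    s_patch: Dict[int, bool],
    failing: Set[int],
) -> Tuple[int, int]:
    """Return (N_f, N_p) for snapshot values keyed by test-case serial number."""
    n_f = 0  # failing test cases whose v is unchanged by the patch
    n_p = 0  # passing test cases whose v is changed by the patch
    for i, v_bug in s_bug.items():
        if i not in s_patch:
            continue  # compare only test cases present in both sets
        same = v_bug == s_patch[i]
        if i in failing and same:
            n_f += 1
        elif i not in failing and not same:
            n_p += 1
    return n_f, n_p


def classify_patch(n_f: int, n_p: int) -> str:
    """Assumed decision rule; formula (3) itself is not reproduced in the text."""
    if n_f == 0 and n_p == 0:
        return "correct"  # failing behavior changed, passing behavior preserved
    if n_p == 0:
        return "A"        # some failing behavior unchanged, nothing broken
    if n_f == 0:
        return "B"        # failing behavior changed, some passing behavior broken
    return "AB"           # both problems at once


# Illustrative data only: test case 1 fails on the buggy program.
s_bug = {1: True, 2: False, 3: True}
s_patch = {1: False, 2: True, 3: True}
print(classify_patch(*count_mismatches(s_bug, s_patch, failing={1})))  # prints "B"
```

In this example the patch changes the value observed for the failing test case 1 but also changes the value for the passing test case 2, so under the assumed rule it is reported as a type B (regression) overfitting patch.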
CN202310210845.2A 2023-03-07 2023-03-07 A method and device for inspecting website backend after bug repair Active CN116257447B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310210845.2A CN116257447B (en) 2023-03-07 2023-03-07 A method and device for inspecting website backend after bug repair

Publications (2)

Publication Number Publication Date
CN116257447A (en) 2023-06-13
CN116257447B (en) 2025-05-13

Family

ID=86686046

Country Status (1)

Country Link
CN (1) CN116257447B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101415602A (en) * 2003-09-30 2009-04-22 空间数据公司 System and application of lighter-than-air (LTA) platform
CN113300873A (en) * 2021-02-05 2021-08-24 阿里巴巴集团控股有限公司 Five-tuple hash path-based fault bypassing method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1431438A (en) * 1972-09-11 1976-04-07 Nat Res Dev Pattern recognition systems and apparatus
US7505606B2 (en) * 2005-05-19 2009-03-17 Microsoft Corporation Detecting doctored images using camera response normality and consistency
CN113704359B (en) * 2021-09-03 2024-04-26 优刻得科技股份有限公司 Method, system and server for synchronizing multiple data copies of time sequence database
CN115640155A (en) * 2022-09-16 2023-01-24 南京航空航天大学 Method and system for automatic program repair based on statement dependency and patch similarity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant