CN117992742A - Method, device and storage medium for detecting smart contract vulnerabilities
- Publication number
- CN117992742A (application CN202311228142.9A)
- Authority
- CN
- China
- Prior art keywords
- hypergraph
- smart contract
- source code
- constructing
- intelligent contract
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2132—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
- G06F18/21322—Rendering the within-class scatter matrix non-singular
- G06F18/21326—Rendering the within-class scatter matrix non-singular involving optimisations, e.g. using regularisation techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
- G06F18/256—Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/57—Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
- G06F21/577—Assessing vulnerabilities and evaluating computer system security
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Stored Programmes (AREA)
Abstract
Description
Technical Field
The present application relates to the field of smart contract technology, and for example to a method, a device, and a storage medium for detecting smart contract vulnerabilities.
Background
Ethereum is an open-source public blockchain platform with the blockchain characteristics of decentralization, consensus, and a distributed ledger. It provides developers with a smart contract platform based on the Solidity language, along with various modules that let users create smart contracts for their own projects.
A smart contract is a computer protocol designed to propagate, verify, or execute contracts in digital form. Smart contracts allow trusted transactions without a third party; these transactions are traceable and irreversible. The purpose of smart contracts is to provide security superior to that of traditional contracts and to reduce other contract-related transaction costs. However, smart contracts inevitably contain security vulnerabilities.
To detect smart contract vulnerabilities, the related art discloses a deep-learning-based smart contract vulnerability detection method, including: obtaining a smart contract data set according to the smart contract vulnerability type; converting the smart contracts in the data set into opcode sequences and using the opcode sequences as training samples; training a smart contract vulnerability detection model with the training samples to obtain a trained model, where the model is based on an averaged stochastic gradient descent weighted long short-term memory network; converting the smart contract to be detected into an opcode sequence, which serves as the opcode sequence to be tested; and inputting the opcode sequence to be tested into the trained model to obtain the corresponding detection result. In this way, the related art simplifies the opcode sequence to reduce the influence of redundant opcodes on the detection result. Context information is extracted around the key opcodes identified for the corresponding vulnerability type, so the final integrated opcode sequence reflects vulnerability information in a more targeted way. Compared with using the raw opcode sequence directly, the sequence is much shorter, which both shortens model training time and addresses the vanishing gradients and model instability caused by overly long sequences.
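The opcode simplification described for the related art can be pictured with a minimal sketch. It assumes that "extracting context" means keeping a fixed window of opcodes around each key opcode; the function name `simplify_opcodes`, the `window` parameter, and the choice of `CALL` and `SSTORE` as key opcodes are illustrative assumptions, not details taken from the related art.

```python
def simplify_opcodes(opcodes, key_opcodes, window=2):
    """Keep only opcodes within `window` positions of a key opcode.

    Hypothetical helper: the related art does not specify how context
    is selected, so a symmetric fixed-size window is assumed here.
    """
    keep = set()
    for i, op in enumerate(opcodes):
        if op in key_opcodes:
            lo = max(0, i - window)
            hi = min(len(opcodes), i + window + 1)
            keep.update(range(lo, hi))
    # Preserve the original execution order of the retained opcodes.
    return [opcodes[i] for i in sorted(keep)]


# Toy opcode sequence (illustrative, not disassembled from a real contract).
sequence = ["PUSH1", "PUSH1", "MSTORE", "CALLVALUE", "DUP1", "ISZERO",
            "PUSH2", "JUMPI", "CALLER", "GAS", "CALL", "SWAP1", "POP",
            "SLOAD", "SUB", "SSTORE", "STOP"]
reentrancy_keys = {"CALL", "SSTORE"}  # assumed key opcodes
short = simplify_opcodes(sequence, reentrancy_keys, window=1)
print(len(sequence), "->", len(short), short)
```

The shortened sequence keeps only the neighbourhood of the external call and the storage write, which is the kind of length reduction the related art credits with faster training.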
In the process of implementing the embodiments of the present disclosure, at least the following problem was found in the related art:
Although the smart contract vulnerability detection model constructed in the related art can detect smart contract vulnerabilities, shorten the model's training time, and improve the model's stability, the related art does not consider the accuracy of the model when detecting smart contract vulnerabilities.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the present application, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the Invention
To provide a basic understanding of some aspects of the disclosed embodiments, a brief summary is given below. The summary is not an extensive overview, nor is it intended to identify key or critical elements or to delineate the scope of protection of these embodiments; it serves as a prelude to the detailed description that follows.
Embodiments of the present disclosure provide a method, a device, and a storage medium for detecting smart contract vulnerabilities, which can improve the accuracy of smart contract vulnerability detection.
In some embodiments, the method includes:
constructing a hypergraph corresponding to the smart contract source code;
converting the hypergraph into matrix form to form a multimodal hypergraph;
executing a message passing algorithm based on the multimodal hypergraph to obtain sample data;
training a graph neural network model with the sample data to obtain a smart contract vulnerability detection model; and
classifying the multimodal hypergraph using the smart contract vulnerability detection model.
Optionally, constructing the hypergraph corresponding to the smart contract source code includes: constructing, from the smart contract source code, the nodes, timing edges, scope hyperedges, and data-flow hyperedges of the hypergraph.
Optionally, constructing the nodes of the hypergraph from the smart contract source code includes: obtaining the type of vulnerability to be detected; and constructing, according to that type, nodes corresponding to the functions and variables in the smart contract source code.
Optionally, constructing the timing edges of the hypergraph from the smart contract source code includes: constructing the timing edges according to the execution order of the smart contract source code.
Optionally, constructing the scope hyperedges of the hypergraph from the smart contract source code includes: constructing the scope hyperedges according to the scopes in the smart contract source code.
Optionally, constructing the data-flow hyperedges of the hypergraph from the smart contract source code includes: constructing the data-flow hyperedges according to the call relationships among functions in the smart contract source code.
Optionally, executing the message passing algorithm based on the multimodal hypergraph includes: calculating node degrees according to the hyperedges of the multimodal hypergraph; and calculating edge degrees according to the nodes of the multimodal hypergraph.
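The matrix form and the two degree computations can be illustrated with a small sketch. It assumes the matrix is the usual node-by-hyperedge incidence matrix and that degrees are unweighted counts; the text above does not fix either choice, and the function names are hypothetical.

```python
def incidence_matrix(num_nodes, hyperedges):
    """Return H where H[v][e] == 1 if node v belongs to hyperedge e."""
    H = [[0] * len(hyperedges) for _ in range(num_nodes)]
    for e, members in enumerate(hyperedges):
        for v in members:
            H[v][e] = 1
    return H


def node_degrees(H):
    """Node degree: number of hyperedges that contain the node (row sums)."""
    return [sum(row) for row in H]


def edge_degrees(H):
    """Edge degree: number of nodes inside the hyperedge (column sums)."""
    return [sum(col) for col in zip(*H)]


# Toy hypergraph with 4 nodes and 3 hyperedges (illustrative data only).
hyperedges = [{0, 1, 2}, {2, 3}, {0, 3}]
H = incidence_matrix(4, hyperedges)
print(node_degrees(H))
print(edge_degrees(H))
```

Weighted hyperedges would replace the plain row sums with weighted sums; that variant is omitted because no weighting scheme is stated here.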
Optionally, classifying the multimodal hypergraph using the smart contract vulnerability detection model includes: performing data modeling on the multimodal hypergraph; performing message passing according to the hyperedges in the multimodal hypergraph; and classifying the features of the multimodal hypergraph after message passing.
In some embodiments, the device includes a processor and a memory storing program instructions, where the processor is configured to execute the aforementioned method for detecting smart contract vulnerabilities when running the program instructions.
In some embodiments, the storage medium stores program instructions that, when run, execute the aforementioned method for detecting smart contract vulnerabilities.
The method, device, and storage medium for detecting smart contract vulnerabilities provided by the embodiments of the present disclosure can achieve the following technical effects:
By constructing a hypergraph corresponding to the smart contract source code, multi-dimensional and multi-scale features can be extracted from the source code, which eases the difficulty of extracting its high-dimensional features. In addition, when the graph neural network model is trained on the sample data obtained by executing the message passing algorithm on the multimodal hypergraph, the topology of the graph is preserved during message passing, so the complex relationships between local and global features can be extracted. This improves the detection accuracy of the smart contract vulnerability detection model obtained by training the graph neural network model. Consequently, when the trained model is used to detect vulnerabilities in smart contracts, the accuracy of smart contract vulnerability detection can be improved.
The general description above and the description below are exemplary and explanatory only and are not intended to limit the present application.
Brief Description of the Drawings
One or more embodiments are illustrated by way of example in the corresponding drawings. These illustrations and drawings do not limit the embodiments. Elements with the same reference numerals in the drawings denote similar elements, and the drawings are not drawn to scale. In the drawings:
FIG. 1 is a flowchart of a method for detecting smart contract vulnerabilities provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of hypergraph timing edges provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a device for detecting smart contract vulnerabilities provided by an embodiment of the present disclosure.
Detailed Description
To provide a more thorough understanding of the features and technical content of the embodiments of the present disclosure, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present disclosure. In the following technical description, numerous details are given, for ease of explanation, to provide a full understanding of the disclosed embodiments. However, one or more embodiments may still be practiced without these details. In other cases, well-known structures and devices may be shown in simplified form to simplify the drawings.
The terms "first", "second", and the like in the specification, the claims, and the above drawings of the embodiments of the present disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present disclosure described herein can be implemented. In addition, the terms "include" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion.
Unless otherwise stated, the term "plurality" means two or more.
In the embodiments of the present disclosure, the character "/" indicates an "or" relationship between the objects before and after it. For example, A/B means A or B.
The term "and/or" describes an association between objects and indicates that three relationships are possible. For example, "A and/or B" means: A alone, B alone, or both A and B.
The term "corresponding" may refer to an association or binding relationship; "A corresponds to B" means that an association or binding relationship exists between A and B.
A smart contract is a computer protocol designed to propagate, verify, or execute contracts in digital form. Smart contracts allow trusted transactions without a third party; these transactions are traceable and irreversible. The purpose of smart contracts is to provide security superior to that of traditional contracts and to reduce other contract-related transaction costs. However, smart contracts inevitably contain security vulnerabilities.
Traditional smart contract vulnerability detection methods are mainly inspired by program code vulnerability detection methods and rely on static analysis or dynamic execution. Their detection suffers from low accuracy and a high false-positive rate, and the types of vulnerabilities they can detect are limited.
In addition, the detection tools used in traditional methods depend on manually defined detection rules, which cannot accommodate the diversity of smart contracts. Moreover, detecting smart contracts based on manually defined rules is prone to false positives and false negatives.
In the prior art, detection based on deep neural networks improves accuracy compared with the traditional methods above. For example, detection methods based on graph convolutional networks have certain advantages over traditional smart contract detection methods in terms of automation and detection accuracy.
A detection method based on a graph convolutional network (GCN) can convert the control-flow and data-flow semantics in smart contract source code into a graph, which can improve the accuracy of smart contract vulnerability detection to some extent. However, deep learning methods designed for structured data do not fit the non-Euclidean structural features of the smart contract graph; that is, they cannot extract the complex relationships between a contract's local and global features. As a result, the extracted data features cannot represent the multi-dimensional, multi-scale spatiotemporal relationships in the smart contract source code. Moreover, extraction algorithms based on graph convolutional networks cannot preserve the topology of the original feature graph, which reduces the information in the feature graph. It is therefore difficult to uncover the implicit connections among vulnerability features across scales in the spatiotemporal dimension, so the accuracy of GCN-based smart contract vulnerability detection remains low.
Embodiments of the present disclosure provide a method, a device, and a storage medium for detecting smart contract vulnerabilities that can represent the function call relationships, timing control, and data dependencies in smart contract source code in a hypergraph. Functions and variables in smart contract source code are related not only by simple sequential call relationships but also by scope relationships and data-flow call relationships. Therefore, in implementing the embodiments of the present disclosure, both the scope relationships and the data-flow relationships in the source code need to be reflected in the hypergraph structure; specifically, the data-flow call relationships are constructed as data-flow hyperedges in the hypergraph. When the graph neural network model performs the classification task, the timing edges in the hypergraph are used to reconstruct the graph's timing information. This improves the detection accuracy of the smart contract vulnerability detection model obtained by training the graph neural network model. Consequently, when the trained model is used to detect vulnerabilities in smart contracts, the accuracy of smart contract vulnerability detection can be improved.
As shown in FIG. 1, an embodiment of the present disclosure provides a method for detecting smart contract vulnerabilities, including:
S11, constructing a hypergraph corresponding to the smart contract source code.
S12, converting the hypergraph into matrix form to form a multimodal hypergraph.
S13, executing a message passing algorithm based on the multimodal hypergraph to obtain sample data.
S14, training a graph neural network model with the sample data to obtain a smart contract vulnerability detection model.
S15, classifying the multimodal hypergraph using the smart contract vulnerability detection model.
With the method for detecting smart contract vulnerabilities provided by the embodiments of the present disclosure, a hypergraph corresponding to the smart contract source code is constructed, and multi-dimensional, multi-scale features can be extracted from the source code through the hypergraph, which eases the difficulty of extracting its high-dimensional features. In addition, when the graph neural network model is trained on the sample data obtained by executing the message passing algorithm on the multimodal hypergraph, the topology of the graph is preserved during message passing, so the complex relationships between local and global features can be extracted. This improves the detection accuracy of the smart contract vulnerability detection model obtained by training the graph neural network model. Consequently, when the trained model is used to detect vulnerabilities in smart contracts, the accuracy of smart contract vulnerability detection can be improved.
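To make the message passing in S13 concrete, the following is a minimal sketch of one round of hypergraph message passing. It assumes the common two-stage scheme (nodes to hyperedges, then hyperedges back to nodes) with plain mean aggregation; the update rule, the function names, and the toy features are assumptions, since the embodiment does not state a specific formula here.

```python
def hypergraph_message_pass(features, hyperedges):
    """One round of node -> hyperedge -> node mean aggregation (assumed rule)."""
    dim = len(features[0])
    # Stage 1: each hyperedge averages the features of its member nodes.
    edge_feats = [
        [sum(features[v][d] for v in members) / len(members) for d in range(dim)]
        for members in hyperedges
    ]
    # Stage 2: each node averages the features of the hyperedges containing it.
    updated = []
    for v in range(len(features)):
        incident = [e for e, members in enumerate(hyperedges) if v in members]
        if not incident:
            updated.append(list(features[v]))  # isolated node keeps its features
            continue
        updated.append(
            [sum(edge_feats[e][d] for e in incident) / len(incident) for d in range(dim)]
        )
    return updated


def readout(features):
    """Graph-level feature: mean over all node features (assumed pooling)."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


# Toy data: 4 nodes with 2-dimensional features and 2 hyperedges.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
hyperedges = [[0, 1, 2], [2, 3]]
updated = hypergraph_message_pass(features, hyperedges)
graph_vector = readout(updated)
print(updated)
print(graph_vector)
```

Because aggregation follows hyperedge membership directly, the structure of the hypergraph determines which nodes exchange information. A trained model would apply learned weights and a nonlinearity at each stage before a classifier consumes the graph-level vector.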
Optionally, constructing the hypergraph corresponding to the smart contract source code includes: constructing, from the smart contract source code, the nodes, timing edges, scope hyperedges, and data-flow hyperedges of the hypergraph.
In the hypergraph corresponding to the smart contract source code, timing edges represent the order in which the source code executes, scope hyperedges represent the scope relationships in the source code, and data-flow hyperedges represent the data-flow call relationships in the source code. In this way, the hypergraph can represent the complex relationships between the global and local features of the smart contract source code, as well as the high-dimensional features in the source code.
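The four components named above can be sketched as a small data structure. The sketch assumes that the execution order, scopes, and call relationships have already been extracted from the source code by some front end; the class `ContractHypergraph`, the function `build_hypergraph`, and the toy contract elements are hypothetical and serve only to show how the three kinds of edges differ.

```python
from dataclasses import dataclass, field


@dataclass
class ContractHypergraph:
    nodes: list = field(default_factory=list)                # node names
    timing_edges: list = field(default_factory=list)         # ordered pairs (u, v)
    scope_hyperedges: list = field(default_factory=list)     # tuples of node ids
    dataflow_hyperedges: list = field(default_factory=list)  # tuples of node ids

    def node_id(self, name):
        """Return the id of `name`, adding the node on first use."""
        if name not in self.nodes:
            self.nodes.append(name)
        return self.nodes.index(name)


def build_hypergraph(execution_order, scopes, calls):
    g = ContractHypergraph()
    ids = [g.node_id(name) for name in execution_order]
    # Timing edges link each element to the one executed next.
    g.timing_edges = list(zip(ids, ids[1:]))
    # One scope hyperedge per scope, covering everything declared inside it.
    for members in scopes.values():
        g.scope_hyperedges.append(tuple(sorted(g.node_id(m) for m in members)))
    # One data-flow hyperedge per caller, covering the caller and what it uses.
    for caller, callees in calls.items():
        group = {g.node_id(caller)} | {g.node_id(c) for c in callees}
        g.dataflow_hyperedges.append(tuple(sorted(group)))
    return g


# Toy input standing in for the output of a Solidity front end (illustrative).
order = ["withdraw", "balances", "msg.sender.call", "fallback"]
scopes = {"Bank": ["withdraw", "balances"], "Attacker": ["fallback"]}
calls = {"withdraw": ["balances", "msg.sender.call"], "fallback": ["withdraw"]}
g = build_hypergraph(order, scopes, calls)
print(g.timing_edges, g.scope_hyperedges, g.dataflow_hyperedges)
```

Timing edges stay pairwise and ordered, while the two hyperedge kinds group any number of nodes at once. That grouping is what lets a single edge express a whole scope or call context.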
Optionally, constructing the nodes of the hypergraph from the smart contract source code includes: obtaining the type of vulnerability to be detected; and constructing, according to that type, nodes corresponding to the functions and variables in the smart contract source code.
In the embodiments of the present disclosure, the types of vulnerabilities include at least the following four:
Reentrancy vulnerability. Normally, when a non-recursive function is called during execution, no new invocation is entered before the current execution finishes. Smart contract source code behaves differently: because Solidity-based smart contracts have a distinctive callback mechanism, an attacker may re-enter the called function before its execution has finished. For example, an attacker places malicious code in the Fallback function of smart contract source code and recursively calls the transfer function in the smart contract source code to steal Ether.
Timestamp vulnerability. Smart contracts usually use block timestamps, confirmed by nodes or users in the blockchain network, to implement time constraints. A smart contract can retrieve the block timestamp, and all transactions in a block share the same timestamp, which ensures a consistent state after the smart contract executes. However, users who create smart contracts can profit by deliberately choosing favorable timestamps.
Infinite loop vulnerability. Infinite loops are a fairly common vulnerability in smart contract source code. A loop in the source code (for example, a for loop, a while loop, or a self-calling loop) that has no exit condition, or whose exit condition can never be reached, is an infinite loop.
Integer overflow vulnerability. A value in the smart contract source code may exceed the range of its integer type.
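Two of the vulnerability types above, reentrancy and integer overflow, can be illustrated with a plain Python simulation. This is not Solidity and not part of the disclosed method; the classes `VulnerableBank` and `Attacker`, the callback standing in for a Fallback function, and the 8-bit width are all assumptions chosen to keep the example small.

```python
class VulnerableBank:
    """Pays out BEFORE clearing the balance, which permits re-entry."""

    def __init__(self):
        self.balances = {}
        self.vault = 0

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.vault += amount

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.vault < amount:
            return
        self.vault -= amount
        on_receive(amount)       # external call; plays the Fallback function
        self.balances[who] = 0   # state update happens too late


class Attacker:
    def __init__(self, bank, max_reentries):
        self.bank = bank
        self.max_reentries = max_reentries
        self.reentries = 0
        self.stolen = 0

    def receive(self, amount):
        self.stolen += amount
        if self.reentries < self.max_reentries:
            self.reentries += 1
            # Re-enter withdraw while the recorded balance is still non-zero.
            self.bank.withdraw("attacker", self.receive)


def add_uint8(a, b):
    """Unchecked 8-bit addition: the result wraps around modulo 256."""
    return (a + b) % 256


bank = VulnerableBank()
bank.deposit("victim", 30)
bank.deposit("attacker", 10)
attacker = Attacker(bank, max_reentries=2)
bank.withdraw("attacker", attacker.receive)
print(attacker.stolen, bank.vault)
print(add_uint8(250, 10))
```

In this run the attacker deposits 10 but collects 30, because each re-entry sees the stale balance. Moving the balance update ahead of the external call removes the problem in this simulation. The wraparound helper models unchecked arithmetic only; Solidity from version 0.8 reverts on overflow by default.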
本公开实施例中,根据上述待检测漏洞的类型,构建超图的节点的方式,可使所构建的超图表征不同类型的待检测漏洞。从而,可提升后续基于超图训练得出的智能合约漏洞检测模型的泛化能力,以利用智能合约漏洞检测模型对不同类型的漏洞进行检测。In the disclosed embodiment, according to the types of the above-mentioned vulnerabilities to be detected, the nodes of the hypergraph are constructed in a manner that enables the constructed hypergraph to represent different types of vulnerabilities to be detected. Thus, the generalization ability of the subsequent smart contract vulnerability detection model obtained by hypergraph training can be improved, so that different types of vulnerabilities can be detected using the smart contract vulnerability detection model.
可选地,根据待检测漏洞的类型,构建智能合约源代码中函数和变量对应的节点,包括:根据待检测漏洞的类型,确定节点的类型。根据节点的类型,构建智能合约源代码中函数和变量对应的节点。Optionally, constructing nodes corresponding to functions and variables in the smart contract source code according to the type of vulnerability to be detected includes: determining the type of node according to the type of vulnerability to be detected. Constructing nodes corresponding to functions and variables in the smart contract source code according to the type of node.
本公开实施例中,由于智能合约源代码中不同类型漏洞对应的原因也不相同,漏洞的原因与源代码中关键函数和变量相关。例如,重入程序漏洞是由Fallback函数存在外部智能合约函数的递归调用关系,而导致对智能合约的数据存储出现攻击。整数溢出漏洞是由于数值超过了整数类型的范围,原因是由关键变量引起的攻击。这样,基于代码程序的关键函数和变量对漏洞类型影响程度,生成四种类型的节点,即内置函数节点、自定义函数节点、关键变量节点和Fallback函数节点。In the disclosed embodiment, since the causes of different types of vulnerabilities in the smart contract source code are different, the causes of the vulnerabilities are related to the key functions and variables in the source code. For example, the reentry vulnerability is caused by the recursive call relationship between the Fallback function and the external smart contract function, which leads to an attack on the data storage of the smart contract. The integer overflow vulnerability is caused by the attack caused by the key variable due to the value exceeding the range of the integer type. In this way, based on the degree of influence of the key functions and variables of the code program on the vulnerability type, four types of nodes are generated, namely built-in function nodes, custom function nodes, key variable nodes and Fallback function nodes.
A built-in function is a function predefined in the Solidity language. The detection of reentrancy vulnerabilities, timestamp vulnerabilities, and integer overflow vulnerabilities is associated with built-in functions. In the embodiments of the present disclosure, M in $M \in \{M_1, M_2, M_3, \ldots, M_n \mid n \in N\}$ denotes a built-in function node, n denotes the number of built-in function nodes, and N denotes the natural numbers. A custom function is an encapsulated function defined in the smart contract source code. F in $F \in \{F_1, F_2, F_3, \ldots, F_n \mid n \in N\}$ denotes a custom function node, n denotes the number of custom function nodes, and N denotes the natural numbers. A key variable is a variable in the source code that influences the result of the smart contract. S in $S \in \{S_1, S_2, S_3, \ldots, S_n \mid n \in N\}$ denotes a key variable node, n denotes the number of key variable nodes, and N denotes the natural numbers. The Fallback function is code in the source code that can call an external smart contract. C in $C \in \{C_1, C_2, C_3, \ldots, C_n \mid n \in N\}$ denotes a fallback function node, n denotes the number of fallback function nodes, and N denotes the natural numbers.
In addition, detection methods based on graph convolutional networks generate a graph structure from a single .sol file only and do not consider the interdependencies among multiple files. In view of this, the embodiments of the present disclosure generate corresponding nodes for multiple files and represent the dependency and call relationships among those files.
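The four node types and their M/F/S/C numbering can be illustrated with a small data structure. This is a minimal sketch, not the disclosed implementation: the names `NodeKind`, `Node`, and `make_nodes`, the `source_file` field, and the sample identifiers are all assumptions introduced for illustration, and the parsing step that would produce the input tuples is not shown.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    """The four node types; values reuse the letters M, F, S, C from the text."""
    BUILTIN_FUNCTION = "M"
    CUSTOM_FUNCTION = "F"
    KEY_VARIABLE = "S"
    FALLBACK_FUNCTION = "C"


@dataclass(frozen=True)
class Node:
    kind: NodeKind
    index: int        # 1-based position within its kind, e.g. M1, M2, ...
    name: str         # identifier as written in the source code
    source_file: str  # hypothetical field: the .sol file the identifier came from

    @property
    def label(self) -> str:
        return f"{self.kind.value}{self.index}"


def make_nodes(items):
    """Number nodes per kind in order of appearance.

    `items` is an iterable of (kind, name, source_file) tuples, assumed to
    come from an earlier parsing step that is not shown here.
    """
    counters = {kind: 0 for kind in NodeKind}
    nodes = []
    for kind, name, source_file in items:
        counters[kind] += 1
        nodes.append(Node(kind, counters[kind], name, source_file))
    return nodes


# Illustrative input spanning two files (made-up identifiers).
nodes = make_nodes([
    (NodeKind.CUSTOM_FUNCTION, "withdraw", "Bank.sol"),
    (NodeKind.BUILTIN_FUNCTION, "call.value", "Bank.sol"),
    (NodeKind.KEY_VARIABLE, "balances", "Bank.sol"),
    (NodeKind.FALLBACK_FUNCTION, "fallback", "Attacker.sol"),
    (NodeKind.CUSTOM_FUNCTION, "attack", "Attacker.sol"),
])
labels = [n.label for n in nodes]
```

Keeping the originating file on each node is one simple way to let later steps relate nodes across several files.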
It should be noted that the .sol files referred to in the context of smart contracts are Solidity source files. The same extension is also used by Flash local shared objects (Flash cookies), which video websites use to store playback-related settings and which are unrelated to smart contracts. Depending on the case, there may be a single .sol file or several, for example five or six.
Optionally, constructing the timing edges of the hypergraph according to the smart contract source code includes: constructing the timing edges according to the execution order of the smart contract source code.
Because smart contracts involve transfer-related operations, calls to internal and external functions in the smart contract source code are executed in strict chronological order. In addition, infinite loop vulnerabilities are related to both the timing and the order of the program. In view of this, the embodiments of the present disclosure construct timing edges to capture the high-order temporal semantic relationships of the program in the source code, thereby enabling detection of vulnerabilities associated with infinite loops and time. Specifically, the hypergraph timing edges constructed in the embodiments of the present disclosure are shown in FIG. 2, in which Timing edge denotes a timing edge, Hyperedge denotes a hyperedge, Data Representation denotes the data representation, built-in function Node denotes a built-in function node, Custom function Node denotes a custom function node, Key Variable Node denotes a key variable node, Fallback Node denotes a fallback function node, Data flow edge denotes a data flow hyperedge, Fallback Edge denotes a fallback edge, and Call edge denotes a call edge.
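As an illustration of building timing edges from an execution order, the sketch below links each executed node to the next one. It assumes the execution order is already available as a list of node labels; the function name `build_timing_edges` and the sample order are hypothetical, and the revisit check is only a crude hint of a possible loop, not a detection method from the disclosure.

```python
from collections import Counter


def build_timing_edges(execution_order):
    """Return directed (earlier, later) pairs between consecutive steps."""
    return list(zip(execution_order, execution_order[1:]))


# Hypothetical execution order of node labels (F = custom function,
# M = built-in function, C = fallback function, S = key variable).
order = ["F1", "M1", "C1", "F1", "S1"]
timing_edges = build_timing_edges(order)

# Nodes that appear more than once in the order are revisited at run time.
revisited = sorted(label for label, count in Counter(order).items() if count > 1)
```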
Optionally, constructing the scope hyperedges of the hypergraph according to the smart contract source code includes: constructing the scope hyperedges according to the scopes of the smart contract source code.
When a smart contract performs a currency transaction, it usually calls an external smart contract or sends currency to an external address, so the source code contains many function call relationships. Besides sequential call relationships, the source code also contains high-order scope relationships. In Solidity source code, identical variable names and function names may appear in scopes that belong to different classes or different contracts; although the names are the same, they do not refer to the same function or variable. For a reentrancy vulnerability, however, if the smart contract calls a malicious contract controlled by an attacker, the attacker can use the call.value transfer function to repeatedly invoke the rewritten Fallback function. In view of this, in the embodiments of the present disclosure, the nodes of function names and variable names that lie in the same scope in the smart contract source code are connected by a hyperedge denoted SH. The granularity of the scope hyperedge can also be controlled. For example, the same function may be considered within the scope of different files, within the scope of a single contract, or within the scope of a single smart contract file; the granularity selected in the embodiments of the present disclosure may be the scope within a single contract. This helps improve the accuracy of smart contract vulnerability detection.
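The grouping of same-scope nodes into one hyperedge, with a selectable granularity, can be sketched as follows. This is an illustrative assumption about the data layout: each declaration is a hypothetical `(label, file, contract)` tuple, and `build_scope_hyperedges` is a name introduced here, not one taken from the disclosure.

```python
from collections import defaultdict


def build_scope_hyperedges(declarations, granularity="contract"):
    """Group node labels that share a scope into one hyperedge per scope.

    declarations: iterable of (label, file, contract) tuples.
    granularity:  "file" groups by file; "contract" groups by (file, contract).
    """
    key_len = {"file": 1, "contract": 2}[granularity]
    groups = defaultdict(set)
    for label, *scope in declarations:
        groups[tuple(scope[:key_len])].add(label)
    return dict(groups)


# Two contracts in Bank.sol both define a function named "withdraw" (F1, F2);
# at contract granularity they end up in different hyperedges.
decls = [
    ("F1", "Bank.sol", "Bank"),
    ("S1", "Bank.sol", "Bank"),
    ("F2", "Bank.sol", "Vault"),
    ("C1", "Attacker.sol", "Attacker"),
]
by_contract = build_scope_hyperedges(decls, "contract")
by_file = build_scope_hyperedges(decls, "file")
```

At file granularity the two same-named functions fall into one hyperedge, which shows why the finer contract-level granularity distinguishes them.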
Optionally, constructing the data flow hyperedges of the hypergraph according to the smart contract source code includes: constructing the data flow hyperedges according to the call relationships of the functions in the smart contract source code.
Data flow represents the topology of function calls in the smart contract source code. Within the same smart contract source code, when different functions operate on the same variable in a non-standard way, an overload vulnerability can easily arise. In view of this, the embodiments of the present disclosure use a hyperedge denoted ed to associate the call, write, and read operations of the data flow, so as to obtain high-dimensional feature information, which facilitates training the graph neural network model to obtain the smart contract vulnerability detection model.
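One way to picture a data flow hyperedge is as the set containing a variable together with every function that calls, writes, or reads it. The sketch below assumes access records of the hypothetical form `(function_label, variable_label, mode)`; the function name and the record layout are illustrative assumptions rather than the disclosed construction.

```python
from collections import defaultdict


def build_data_flow_hyperedges(accesses):
    """Build one hyperedge per variable from (function, variable, mode) records.

    Each hyperedge contains the variable itself and every function that
    touches it, regardless of whether the mode is "call", "write", or "read".
    Also returns, per variable, the set of functions that write to it.
    """
    hyperedges = defaultdict(set)
    writers = defaultdict(set)
    for func, var, mode in accesses:
        hyperedges[var].update({func, var})
        if mode == "write":
            writers[var].add(func)
    return dict(hyperedges), dict(writers)


accesses = [
    ("F1", "S1", "read"),
    ("F1", "S1", "write"),
    ("F2", "S1", "write"),
    ("F3", "S2", "read"),
]
data_flow_edges, writers = build_data_flow_hyperedges(accesses)
```

Here `writers["S1"]` contains two functions, which is the situation the text describes: different functions operating on the same variable.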
Optionally, when converting the hypergraph into matrix form, the hypergraph can be written as $G=(v,\varepsilon,w)$, where G is the hypergraph, v is the set of nodes in the hypergraph, ε is the set of hyperedges in the hypergraph, and w is the weight of the hyperedges in the hypergraph. The hyperedges are represented by a matrix during construction: an entry is 1 when a node is linked by a hyperedge and 0 otherwise. In the embodiments of the present disclosure, all hyperedges are built into a single hyperedge matrix, which represents the multimodal information of the Solidity source code. The matrix is expressed as follows:

$$h(v,e)=\begin{cases}1, & \text{if } v\in e\\ 0, & \text{if } v\notin e\end{cases}$$
where h denotes the hypergraph in matrix form, v is a vertex of the hypergraph, and e is a hyperedge of the hypergraph. If the node lies on the hyperedge, the relationship between the node and the hyperedge is represented by 1; otherwise it is 0.
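The 0/1 rule for the hyperedge matrix can be made concrete with a small example. The sketch below assumes hyperedges are given as sets of node labels; `incidence_matrix` and the sample labels are illustrative names, with rows indexed by node and columns by hyperedge.

```python
def incidence_matrix(nodes, hyperedges):
    """Return H where H[i][j] is 1 if nodes[i] lies on hyperedges[j], else 0."""
    return [[1 if v in e else 0 for e in hyperedges] for v in nodes]


nodes = ["F1", "F2", "S1", "C1"]
hyperedges = [
    {"F1", "S1"},        # e.g. a scope hyperedge
    {"F1", "F2", "S1"},  # e.g. a data flow hyperedge
    {"F2", "C1"},        # e.g. another scope hyperedge
]
H = incidence_matrix(nodes, hyperedges)
for label, row in zip(nodes, H):
    print(label, row)
```

Placing hyperedges of different kinds side by side as columns of the same matrix corresponds to building all hyperedges into a single hyperedge matrix.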
Optionally, executing the message passing algorithm based on the multimodal hypergraph includes: calculating the node degrees according to the hyperedges of the multimodal hypergraph; and calculating the edge degrees according to the nodes of the multimodal hypergraph.
The node degree is the total number of edges linked to a node. The node degree is calculated according to the following formula:
$$d(v)=\sum_{e\in\varepsilon} w(e)\,h(v,e)$$
where d(v) is the node degree, v is a vertex of the hypergraph, h(v, e) is the multimodal hypergraph in matrix form, w(e) is the weight of hyperedge e, and ε is the set of hyperedges.
The edge degree is the total number of nodes linked by a hyperedge. The edge degree is calculated according to the following formula:
$$\delta(e)=\sum_{v\in\gamma} h(v,e)$$
where δ(e) is the edge degree, h(v, e) is the multimodal hypergraph in matrix form, and γ is the set of nodes.
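A short worked example of the two degree formulas follows. The matrix `H`, the weights `w`, and the function names are made-up illustrations: rows of `H` are nodes, columns are hyperedges, and `w[j]` is the weight of hyperedge j.

```python
def node_degrees(H, w):
    """d(v): weighted count of the hyperedges that contain each node."""
    return [sum(w_e * h for w_e, h in zip(w, row)) for row in H]


def edge_degrees(H):
    """delta(e): number of nodes contained in each hyperedge."""
    return [sum(column) for column in zip(*H)]


H = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]
w = [1.0, 2.0, 0.5]

d = node_degrees(H, w)    # first node: 1.0*1 + 2.0*1 + 0.5*0
delta = edge_degrees(H)   # first hyperedge contains two nodes
```

With all weights equal to 1, d(v) reduces to the plain number of hyperedges linked to the node.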
In the process of calculating the edge degree, the nodes need to be classified according to the following formula:
$$\mathrm{arg}=\min_{f}\{R_{emp}(f)+\Omega(f)\}$$
where $R_{emp}(f)$ is the supervised empirical loss, Ω(f) is the regularization term on the hypergraph, f is the function of the hypergraph neural network, and arg denotes the result of the edge degree calculation.
The expansion of Ω(f) in the embodiments of the present disclosure is as follows:
Letting $\Delta = I-\theta$, the expansion of Ω(f) is rewritten as follows:
$$\Omega(f)=f^{T}\Delta$$
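The expansion of Ω(f) and the definition of θ are not present in this text. For orientation only, the normalized hypergraph regularizer commonly used in hypergraph learning is written below in the notation of this section (γ for the node set, ε for the hyperedge set). It is an assumption, not something stated in the source, that the disclosure follows this standard form; $D_v$ and $D_e$ denote the diagonal matrices of node and edge degrees, and W the diagonal matrix of hyperedge weights.

$$\Omega(f)=\frac{1}{2}\sum_{e\in\varepsilon}\sum_{\{u,v\}\subseteq\gamma}\frac{w(e)\,h(u,e)\,h(v,e)}{\delta(e)}\left(\frac{f(u)}{\sqrt{d(u)}}-\frac{f(v)}{\sqrt{d(v)}}\right)^{2}$$

$$\theta=D_v^{-1/2}\,H\,W\,D_e^{-1}\,H^{T}\,D_v^{-1/2},\qquad \Delta=I-\theta$$

Under these assumed definitions the double sum collapses to the quadratic form $f^{T}\Delta f$, which is a scalar, whereas the expression as it appears in this text reads $f^{T}\Delta$.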
Optionally, the hypergraph is updated according to the following formula:
where $D_v$ is the node degree of the hypergraph, $D_e$ is the edge degree of the hypergraph, H is the adjacency matrix of the hypergraph, W is the weight matrix of the hyperedges of the hypergraph, $X^{(l)}$ is the feature at layer l of the graph convolutional network, and the remaining parameter, whose symbol is missing from this text, performs the feature transformation on X.
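The update formula itself is missing from this text. The sketch below shows one common hypergraph convolution step built from the same ingredients ($D_v$, $D_e$, H, W, X), namely $\sigma(D_v^{-1/2} H W D_e^{-1} H^{T} D_v^{-1/2} X \Theta)$. That this is the disclosed update is an assumption; the symbol `theta` for the feature transformation, the ReLU activation, and all function names are likewise assumptions. H is taken as the node-by-hyperedge 0/1 matrix, and every node and hyperedge is assumed to have a positive degree.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def hypergraph_conv(H, w, X, theta):
    """One assumed update: ReLU(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X theta)."""
    d_v = [sum(w_e * h for w_e, h in zip(w, row)) for row in H]  # node degrees
    d_e = [sum(col) for col in zip(*H)]                          # edge degrees
    dv_inv_sqrt = diag([1.0 / math.sqrt(d) for d in d_v])
    de_inv = diag([1.0 / d for d in d_e])
    out = dv_inv_sqrt
    for m in (H, diag(w), de_inv, transpose(H), dv_inv_sqrt, X, theta):
        out = matmul(out, m)
    return [[max(0.0, x) for x in row] for row in out]           # ReLU


# Two nodes joined by a single hyperedge: each node receives the average
# of the features on that hyperedge.
H = [[1], [1]]
w = [1.0]
X = [[1.0], [3.0]]
theta = [[1.0]]
X_next = hypergraph_conv(H, w, X, theta)
```

In this toy case both nodes end up with the mean of the two input features, which illustrates how messages pass from nodes to a hyperedge and back.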
Optionally, classifying the multimodal hypergraph using the smart contract vulnerability detection model includes: performing data modeling on the multimodal hypergraph; performing message passing according to the hyperedges of the multimodal hypergraph; and classifying the features of the multimodal hypergraph after message passing.
In the disclosed embodiments, scripts built according to the smart contract source code can be used to perform data modeling on the multimodal hypergraph. Specifically, script one extracts the data flow hyperedges and scope hyperedges of the hypergraph. Constructing these hyperedges preserves the high-dimensional information and feature information of the source code, so that high-order data structure features can be passed to the graph convolutional network with less information loss. Script two extracts the timing features of the source code, which further improves on the high-order information already being passed.
The constructed multimodal hypergraph is input into the graph neural network model, which outputs hypergraph features that retain the original graph structure. All nodes in the hypergraph features are then arranged according to the call order represented by the hypergraph timing edges to construct serialized node features.
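Arranging node features in the order given by the timing edges can be sketched as below. The sketch assumes the timing edges form a single chain of `(earlier, later)` pairs and that node features are available as a dictionary from label to vector; the function name and sample values are hypothetical.

```python
def serialize_node_features(features, timing_edges):
    """Order node feature vectors by the call order encoded in timing edges.

    features:     dict mapping node label -> feature vector (list of floats)
    timing_edges: list of (earlier, later) pairs assumed to form one chain
    """
    if not timing_edges:
        return list(features.values())
    order = [timing_edges[0][0]] + [later for _, later in timing_edges]
    return [features[label] for label in order]


features = {
    "M1": [0.3, 0.4],
    "F1": [0.1, 0.2],
    "C1": [0.5, 0.6],
}
timing_edges = [("F1", "M1"), ("M1", "C1")]
sequence = serialize_node_features(features, timing_edges)
```

The resulting list is a sequence in execution order, which is the kind of input a sequence model expects.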
In the embodiments of the present disclosure, to improve the accuracy of smart contract vulnerability detection, the classification network can be rebuilt according to the timing edges constructed for the hypergraph, that is, according to the timing information flow. Adding timing information to the graph convolutional neural network model when it performs the hypergraph classification task can improve the accuracy of smart contract vulnerability detection.
In addition, to make use of the serialized information of the nodes in the graph, a transformer can be used. A transformer is a machine learning model architecture for natural language processing and other sequence modeling tasks. Its self-attention mechanism helps capture the dependency features and call relationship features of the functions and variables in a smart contract, so that, compared with a convolutional neural network model, it can better identify the type of vulnerability.
In the embodiments of the present disclosure, when the trained smart contract vulnerability detection model processes the serialized node features, the output is embedded (that is, high-dimensional data is mapped to a low-dimensional space) and then passed to the fully connected layer of the model, which performs the classification task. The labels of the classification task fall into two classes, 0 and 1: 0 indicates that the graph has no vulnerability, and 1 indicates that the graph has a vulnerability. Detection of smart contract vulnerabilities is thereby achieved.
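The final binary decision can be illustrated with a toy classification head. This is a simplified stand-in, not the disclosed model: mean pooling replaces the transformer and embedding stages, a single linear layer with a sigmoid stands in for the fully connected layer, and the weights, bias, and 0.5 threshold are made-up values.

```python
import math


def classify(sequence, weights, bias, threshold=0.5):
    """Mean-pool a sequence of node features, then apply a linear layer.

    Returns (label, probability), where label 1 means "vulnerable" and
    label 0 means "no vulnerability".
    """
    dim = len(weights)
    pooled = [sum(vec[i] for vec in sequence) / len(sequence) for i in range(dim)]
    logit = sum(w * x for w, x in zip(weights, pooled)) + bias
    prob = 1.0 / (1.0 + math.exp(-logit))
    return (1 if prob >= threshold else 0), prob


sequence = [[1.0, 0.0], [3.0, 2.0]]  # pooled to [2.0, 1.0]
label_a, prob_a = classify(sequence, weights=[1.0, -1.0], bias=0.0)  # logit  1.0
label_b, prob_b = classify(sequence, weights=[-1.0, 1.0], bias=0.0)  # logit -1.0
```

In a trained model the weights would come from training on labeled graphs rather than being chosen by hand.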
With reference to FIG. 3, an embodiment of the present disclosure provides a device 300 for detecting smart contract vulnerabilities, including a processor 100 and a memory 101. Optionally, the device may further include a communication interface 102 and a bus 103. The processor 100, the communication interface 102, and the memory 101 can communicate with one another through the bus 103. The communication interface 102 can be used for information transmission. The processor 100 can call logic instructions in the memory 101 to execute the method for detecting smart contract vulnerabilities of the above embodiments.
In addition, the logic instructions in the memory 101 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium.
The memory 101, as a computer-readable storage medium, can be used to store software programs and computer-executable programs, such as the program instructions/modules corresponding to the methods in the embodiments of the present disclosure. By running the program instructions/modules stored in the memory 101, the processor 100 executes functional applications and data processing, that is, implements the method for detecting smart contract vulnerabilities in the above embodiments.
The memory 101 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created through use of the terminal device, and the like. In addition, the memory 101 may include high-speed random access memory and may also include non-volatile memory.
An embodiment of the present disclosure provides a storage medium storing computer-executable instructions, the computer-executable instructions being configured to execute the above method for detecting smart contract vulnerabilities.
The above storage medium may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
The technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or any other medium that can store program code, or it may be a transitory storage medium.
The above description and the accompanying drawings sufficiently illustrate the embodiments of the present disclosure to enable those skilled in the art to practice them. Other embodiments may include structural, logical, electrical, procedural, and other changes. The embodiments represent only possible variations. Unless explicitly required, individual components and functions are optional, and the order of operations may vary. Parts and features of some embodiments may be included in, or substituted for, those of other embodiments. Moreover, the terms used in this application are only used to describe the embodiments and are not intended to limit the claims. As used in the description of the embodiments and in the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application refers to any and all possible combinations of one or more of the associated listed items. In addition, when used in this application, the term "comprise" and its variants "comprises" and/or "comprising" specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of additional identical elements in the process, method, or device that comprises the element. Herein, each embodiment may focus on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. For the methods, products, and the like disclosed in the embodiments, if they correspond to the method part disclosed in the embodiments, reference may be made to the description of the method part for the relevant details.
Those skilled in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software may depend on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the embodiments of the present disclosure. Skilled persons will also clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the embodiments disclosed herein, the disclosed methods and products (including but not limited to devices and equipment) may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units may be only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of another form. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to implement this embodiment. In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
The flowcharts and block diagrams in the accompanying drawings show possible architectures, functions, and operations of systems, methods, and computer program products according to the embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. In the descriptions corresponding to the flowcharts and block diagrams, the operations or steps corresponding to different blocks may likewise occur in an order different from that disclosed in the description, and sometimes there is no specific order between different operations or steps. Each block in the block diagrams and/or flowcharts, and each combination of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311228142.9A CN117992742A (en) | 2023-09-22 | 2023-09-22 | Method, device and storage medium for detecting smart contract vulnerabilities |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311228142.9A CN117992742A (en) | 2023-09-22 | 2023-09-22 | Method, device and storage medium for detecting smart contract vulnerabilities |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN117992742A true CN117992742A (en) | 2024-05-07 |
Family
ID=90898278
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202311228142.9A Pending CN117992742A (en) | 2023-09-22 | 2023-09-22 | Method, device and storage medium for detecting smart contract vulnerabilities |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117992742A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119149433A (en) * | 2024-11-11 | 2024-12-17 | 烟台大学 | Intelligent contract vulnerability detection method, system and equipment based on multi-mode fusion |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US10755048B2 (en) | Artificial intelligence based method and apparatus for segmenting sentence | |
| Zeng et al. | Ethergis: A vulnerability detection framework for ethereum smart contracts based on graph learning features | |
| JP2020027649A (en) | Method, apparatus, device and storage medium for generating entity relationship data | |
| CN111177569A (en) | Recommendation processing method, device and equipment based on artificial intelligence | |
| WO2021017735A1 (en) | Smart contract formal verification method, electronic apparatus and storage medium | |
| TW202020691A (en) | Feature word determination method and device and server | |
| CN111737997A (en) | A text similarity determination method, device and storage medium | |
| CN108256164A (en) | Boolean logic in state machine lattice | |
| CN111092894A (en) | Webshell detection method based on incremental learning, terminal device and storage medium | |
| WO2022083093A1 (en) | Probability calculation method and apparatus in graph, computer device and storage medium | |
| CN113032001B (en) | Intelligent contract classification method and device | |
| CN117874234A (en) | Semantic-based text classification method, device, computer equipment and storage medium | |
| CN118378620A (en) | A method, device, equipment and storage medium for detecting abnormality of prompt words | |
| Ahsaan et al. | Big data analytics: challenges and technologies | |
| CN115577701A (en) | Risk behavior identification method, device, equipment and medium for big data security | |
| CN118337453A (en) | Automatic attack tracing method, terminal device and storage medium | |
| CN111783427B (en) | Methods, devices, equipment and storage media for training models and outputting information | |
| CN117992742A (en) | Method, device and storage medium for detecting smart contract vulnerabilities | |
| CN120492467B (en) | A data integrity verification method based on blockchain | |
| CN114756690A (en) | Method and device for constructing knowledge graph, electronic equipment and storage medium | |
| CN114398138A (en) | Interface generation method and device, computer equipment and storage medium | |
| CN116204877B (en) | A code vulnerability detection method based on graph contrast learning | |
| CN114722800B (en) | Method, device, equipment, medium and product for extracting entity | |
| CN111143461A (en) | Mapping relation processing system and method and electronic equipment | |
| CN116955751A (en) | Crawler identification method, crawler identification device, computer equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||