CN114780889B - A cache replacement system and method based on imitation learning - Google Patents
Description
Technical Field

The present invention relates to the field of network caching, and in particular to a cache replacement system and method based on imitation learning.

Background Art

A cache is a temporary high-speed memory that ensures frequently accessed content can be retrieved quickly. Cache replacement is the policy for updating the cache contents when the cache is full; its goal is to evict infrequently accessed content and retain frequently accessed content, so that a cache of limited size is used as efficiently as possible. The quality of the cache replacement model directly determines the performance of a computer system or network server.

The fundamental goal of this system is an imitation-learning-based cache replacement system built around the content delivery network (CDN) as the core of content management and network traffic management, ensuring that network edge cache servers provide users with more efficient service and thereby establishing a high-quality, high-efficiency, well-ordered network application service model.

In the field of network caching, the quality of the cache replacement model is the core determinant of whether cache resources can be used fully and efficiently. In recent years, with the growing demand for Internet data flow and network content access, cache replacement algorithms for network edge caches have attracted increasing attention from researchers. Common cache replacement models currently fall into two types. The first is the heuristic cache replacement model and its variants, which replaces and models the cache according to heuristic rules and is relatively simple to implement. The second is the machine-learning-based cache replacement model, whose idea is to continuously adapt the model to different cache access patterns through supervised learning.

The first, heuristic type has weak generalization ability, shows large performance gaps across different cache access patterns, is strongly affected by the quality of the cache access sequence, and performs poorly in network caches with highly variable content access. The second, machine-learning type improves generalization considerably, but requires a large number of training samples and suffers from feedback delay, which can lead to slow reaction in highly dynamic environments and ineffective use of historical access information, reducing the cache replacement hit rate.
Summary of the Invention

The object of the present invention is to provide a cache replacement system based on imitation learning, comprising an access data acquisition module, a cache replacement prediction module, a cache replacement module, and a database.

The access data acquisition module acquires a cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, …, s_{t-1}, s_t} and transmits it to the cache replacement prediction module, where s_t denotes the cache access at time t and H is the number of historical cache accesses.

The cache replacement prediction module stores a neural network.

The cache replacement prediction module uses the cache eviction prediction neural network to predict cache line eviction probabilities and transmits them to the cache replacement module.

The cache replacement module stores the current cache access request in the cache according to the cache line eviction probabilities.

The database stores the data of the access data acquisition module, the cache replacement prediction module, and the cache replacement module.
Furthermore, the cache eviction prediction neural network comprises a temporal convolutional neural network, an attention mechanism neural network, a receiving layer, and an output layer.
Furthermore, the temporal convolutional neural network comprises an input layer, hidden layers, and an output layer.

The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, …, s_{t-1}, s_t}, and its output is the embedding of the cache access request sequence, e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), …, e(s_{t-1}), e(s_t)}, where e(s_t) denotes the embedding of the cache access at time t.

The input of the first hidden layer of the temporal convolutional neural network is the output e(s) of the input layer.

The input of the (j+1)-th hidden layer is the output h^j(s) of the j-th hidden layer, and its output is h^{j+1}(s).

The hidden layer output h^{j+1}(s_t) is given by:

h^{j+1}(s_t) = Activation(h^j(s_t) + F(s_t))   (1)

where F(s_t) denotes the output of the dilated convolution at time t and Activation is the activation function.

The output F(s_t) of the dilated convolution of the temporal convolutional neural network at time t is:

F(s_t) = Σ_{i=0}^{k-1} f(i) · h^j(s_{t-d·i})   (2)

where h^j(s_{t-d·i}) is the information of the access data at time t-d·i in the j-th hidden layer, f(i) is the convolution filter, k is the convolution kernel size, and d is the dilation factor; for the j-th layer, d = 2^j.

The output of the output layer of the temporal convolutional neural network is the output of the last hidden layer, expressed as the historical information of the cache access request sequence, h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), …, h(s_{t-1}), h(s_t)}, where h(s_t) denotes the historical information of the content of the cache access at time t.
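As a concrete illustration of Equations (1) and (2), the dilated causal convolution with residual connection can be sketched in pure Python for a single scalar channel. The single-channel simplification, the helper names, and the tanh activation are our assumptions for illustration, not part of the patent:

```python
import math

def dilated_causal_conv(h, f, d):
    """Eq. 2: F(s_t) = sum_i f(i) * h^j(s_{t - d*i}) for every time step.

    h : list of hidden activations h^j(s_t), one per time step
    f : list of filter taps f(i), length k
    d : dilation factor (d = 2**j for the j-th layer)
    """
    T, k = len(h), len(f)
    # Taps that reach before the start of the sequence are treated as
    # zero, which keeps the convolution causal.
    return [sum(f[i] * h[t - d * i] for i in range(k) if t - d * i >= 0)
            for t in range(T)]

def tcn_layer(h, f, d, activation=math.tanh):
    # Eq. 1: residual connection between the layer input and F(s_t),
    # followed by the activation function.
    F = dilated_causal_conv(h, f, d)
    return [activation(h_t + F_t) for h_t, F_t in zip(h, F)]
```

Stacking such layers with d = 1, 2, 4, … gives each position an exponentially growing causal receptive field over the access history.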
Furthermore, the input of the attention mechanism neural network is the historical information of the current cache access request sequence, h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), …, h(s_{t-1}), h(s_t)}, and the embedding of the cache line contents at the current time step, e(l) = {e(l_1), e(l_2), e(l_3), …, e(l_W)}; its output is the context vector of each cache line, g = {g_1, g_2, …, g_W}, where l_w denotes the w-th cache line in the cache, W denotes the number of cache lines in the cache, g_w denotes the context of the w-th cache line, and w = 1, 2, …, W.

The context vector g_w is given by:

g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})   (3)

where the coefficient α_i is:

α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))   (4)

where W_e is a trainable parameter.
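The attention computation of Equations (3) and (4) can be sketched as follows; the vector dimensions, helper names, and the use of a fixed weight matrix (rather than a trained one) are illustrative assumptions:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def cache_line_context(e_l, h_s, W_e):
    """Context vector g_w for every cache line (Eqs. 3-4).

    e_l : list of W cache-line embeddings e(l_w), each a list of D floats
    h_s : list of H history vectors h(s_{t-H+i})
    W_e : D x D weight matrix (trainable in the patent; fixed here)
    """
    contexts = []
    for e_lw in e_l:
        # Eq. 4: one bilinear score per history step, normalized by softmax.
        alpha = softmax([dot(e_lw, matvec(W_e, h_i)) for h_i in h_s])
        # Eq. 3: context = attention-weighted sum of the history vectors.
        D = len(h_s[0])
        contexts.append([sum(a * h_i[j] for a, h_i in zip(alpha, h_s))
                         for j in range(D)])
    return contexts
```

Each cache line thus queries the shared access history and receives its own summary of the steps most relevant to it.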
Furthermore, the input of the receiving layer is the context vector g_w, and its output is the reuse distance of each cache line, dis = {dis_1, dis_2, …, dis_W}, where dis_w denotes the reuse distance of the w-th cache line.

The reuse distance dis of each cache line is given by:

dis = Linear(Relu(Linear(g_w)))   (5)

where Relu is the activation function and Linear is a linear (fully connected) function.

The input of the output layer is the reuse distance dis of each cache line, and its output is the eviction probability of each cache line, prob = {p_1, p_2, …, p_W}, where p_w denotes the eviction probability of the w-th cache line in the cache.

The eviction probability prob of each cache line is given by:

prob = Sigmoid(dis)   (6)

where Sigmoid is the activation function.
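Equations (5) and (6) describe a small two-layer head mapping each context vector to a scalar reuse distance and then to an eviction probability. A minimal sketch, with illustrative weight shapes that are our assumptions:

```python
import math

def relu(x):
    return x if x > 0.0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def eviction_head(g, W1, b1, w2, b2):
    """Eqs. 5-6: context vector -> reuse distance -> eviction probability.

    g  : list of W context vectors g_w, each a list of D floats
    W1 : D_h x D matrix, b1 : length-D_h bias (inner Linear)
    w2 : length-D_h vector, b2 : scalar (outer Linear, scalar output)
    """
    dis = []
    for g_w in g:
        # Eq. 5: Linear -> Relu -> Linear, one scalar distance per line.
        hidden = [relu(sum(wij * gj for wij, gj in zip(row, g_w)) + bi)
                  for row, bi in zip(W1, b1)]
        dis.append(sum(w * h for w, h in zip(w2, hidden)) + b2)
    prob = [sigmoid(d) for d in dis]  # Eq. 6
    return dis, prob
```

A larger predicted reuse distance is pushed toward probability 1 by the sigmoid, i.e. lines needed furthest in the future become the preferred eviction candidates.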
Furthermore, the system also comprises a neural network training module.

The neural network training module is used to train the cache eviction prediction neural network of the cache replacement prediction module.
Furthermore, the steps of training the cache eviction prediction neural network include:

1) Determine whether the current cache is not full or the current cache access request is already in the cache. If so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2).

2) Build the cache eviction prediction neural network.

3) Use the Belady algorithm to compute the true reuse distance of each cache access request, dis^B = {dis^B_1, dis^B_2, …, dis^B_n}, where n is the total number of accesses and dis^B_n denotes the true reuse distance of the n-th cache access request.

4) Use the cache eviction prediction neural network to generate the eviction probability of each cache line, prob = {p_1, p_2, …, p_W}.

5) Compute the error Err_t between the cache line eviction probabilities output by the cache eviction prediction network at time t and those of the Belady algorithm, namely:

Err_t = Bceloss(prob_t, prob^B_t) = -(1/W) · Σ_{i=1}^{W} [p^B_i · log(p_i) + (1 - p^B_i) · log(1 - p_i)]   (7)

where prob_t and prob^B_t are the cache line eviction probabilities of the cache eviction prediction network and of the Belady algorithm at time t, respectively; p_i and p^B_i are their respective probabilities of evicting the i-th cache line at time t; and W is the cache size.

The Belady cache line eviction probability p^B_i satisfies:

p^B_i = 1, if i ∈ seq[0 : topk]   (8)
p^B_i = 0, otherwise   (9)

where the parameter topk and the sequence seq are given by:

topk = int(W * α)   (10)
seq = decend_sort(dis^B)   (11)

where α is the eviction percentage, int() is the floor function, and decend_sort() is the descending sort function.

6) Update the neural network parameters ω using the gradient of the cache eviction prediction network, and determine whether the loss function value has reached its minimum or the number of training rounds has reached the preset value. If so, end the training of the cache eviction prediction neural network; otherwise, go to step 7).

The parameter ω of the cache eviction prediction neural network is updated as follows:

ω′ = ω - ε_t · ∇_ω Bceloss(prob_t, prob^B_t)   (12)

where ω is the current network parameter, ω′ is the updated parameter, ε_t is the current learning rate, ∇ is the gradient operator, and Bceloss is the loss function.

7) The cache replacement module selects a cache line to evict and stores the current access in the cache.

8) Obtain a new cache access request and return to step 1).
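The Belady supervision of steps 3)-5) can be sketched as follows. Representing requests as plain keys and assigning a line that is never requested again a reuse distance equal to the remaining trace length are our assumptions for illustration:

```python
import math

def belady_reuse_distances(lines, future_accesses):
    """Step 3: true reuse distance of each cached line under the Belady
    oracle; a line never requested again gets len(future_accesses),
    i.e. effectively infinity."""
    return [future_accesses.index(line) if line in future_accesses
            else len(future_accesses) for line in lines]

def belady_targets(dis_true, alpha):
    """Eqs. 8-11: the topk lines with the largest true reuse distance
    get eviction probability 1, all others 0."""
    W = len(dis_true)
    topk = int(W * alpha)                               # Eq. 10
    seq = sorted(range(W), key=lambda i: -dis_true[i])  # Eq. 11 (descending)
    target = [0.0] * W
    for i in seq[:topk]:                                # Eqs. 8-9
        target[i] = 1.0
    return target

def bce_loss(prob, target, eps=1e-9):
    """Eq. 7: binary cross-entropy between the network's eviction
    probabilities and the Belady targets."""
    return -sum(t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
                for p, t in zip(prob, target)) / len(prob)
```

The gradient of `bce_loss` with respect to the network parameters then drives the update of Equation (12) in whatever autodiff framework the network is implemented in.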
Furthermore, the steps by which the cache replacement module stores the current cache access request in the cache include:

1) Determine whether the current cache is not full or the current cache access request is already in the cache. If so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2).

2) The cache replacement module retrieves the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy.

3) The cache replacement module evicts a cache line according to the cache line eviction policy and thereby stores the current cache access request in the cache.

Furthermore, the cache line eviction policy is as follows:

index = decend_sort(prob)   (13)
evict_index = index(0)   (14)

where prob is the cache line eviction probability vector, decend_sort() is the descending sort function, index is the resulting index order, index(0) denotes the cache line with the highest eviction probability, and evict_index is the eviction decision.
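The eviction policy of Equations (13) and (14) amounts to an argmax over the probability vector; a minimal sketch (the function name is ours):

```python
def evict_index(prob):
    """Eqs. 13-14: sort cache lines by descending eviction probability
    and pick the first, i.e. the line the network is most confident
    should be evicted."""
    index = sorted(range(len(prob)), key=lambda i: -prob[i])  # Eq. 13
    return index[0]                                           # Eq. 14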
A method for using the cache replacement system based on imitation learning comprises the following steps:

1) Build the cache replacement system based on imitation learning.

2) Obtain the cache access request sequence.

3) Determine whether the current cache is not full or the current access is already in the cache. If so, put the current access directly into the cache or fetch it from the cache, and go to step 9); otherwise, go to step 4).

4) Train the cache replacement prediction neural network of the cache replacement prediction module.

5) Input the cache access request sequence into the trained cache replacement prediction neural network, compute the cache line eviction probabilities, and transmit them to the cache replacement module.

6) The cache replacement module determines whether the current cache is not full or the current cache access request is already in the cache. If so, it puts the current cache access request directly into the cache or fetches it from the cache; otherwise, go to step 7).

7) The cache replacement module retrieves the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy.

8) The cache replacement module evicts a cache line according to the cache line eviction policy and thereby stores the current cache access request in the cache.

9) Obtain a new cache access request sequence and return to step 3).
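The serving loop of steps 3)-9) can be sketched as follows, with `predict_probs` standing in for the trained prediction network (a hypothetical callable, not part of the patent), and the cache simplified to a key-to-content mapping:

```python
def serve_request(cache, capacity, request, predict_probs):
    """One pass of steps 3)-9): hit / free-space check first, otherwise
    model-guided eviction before inserting the new request.

    cache         : dict mapping cached key -> content
    capacity      : maximum number of cache lines W
    predict_probs : callable returning one eviction probability per
                    cached key (the trained prediction network)
    """
    if request in cache:            # hit: serve directly from the cache
        return "hit"
    if len(cache) < capacity:       # free space: insert without eviction
        cache[request] = object()
        return "insert"
    keys = list(cache)
    probs = predict_probs(keys)     # cache line eviction probabilities
    victim = keys[max(range(len(keys)), key=lambda i: probs[i])]
    del cache[victim]               # evict the most probable line
    cache[request] = object()
    return f"evict {victim}"
```

Training (step 4) is triggered separately; at serving time the loop only queries the already-trained network.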
The technical effect of the present invention is unquestionable. The present invention builds its neural network model from sequence-feature extractors such as a temporal convolutional neural network and an attention mechanism, fully extracting the historical information of the access sequence and the context information of the cache lines in the cache, which improves the prediction performance of the model.

The present invention uses imitation learning to imitate the behavior of the Belady expert, training the neural network model on the gap between the output of the cache replacement prediction module and the output of the Belady expert, which improves the hit rate and the range of applicability of the model.

Brief Description of the Drawings

FIG. 1 is the overall algorithm flow chart of a cache replacement method based on imitation learning;

FIG. 2 is the structure diagram of the neural network model of a cache replacement method based on imitation learning;

FIG. 3 is the neural network training flow chart of a cache replacement method based on imitation learning.
Detailed Description

The present invention is further described below in conjunction with embodiments, but the scope of the above subject matter of the present invention should not be understood as limited to the following embodiments. Any substitutions and modifications made according to common technical knowledge and customary means in the art, without departing from the above technical idea of the present invention, shall all fall within the protection scope of the present invention.

Embodiment 1:
Referring to FIG. 1 to FIG. 3, a cache replacement system based on imitation learning comprises an access data acquisition module, a cache replacement prediction module, a cache replacement module, and a database.

The access data acquisition module acquires a cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, …, s_{t-1}, s_t} and transmits it to the cache replacement prediction module, where s_t denotes the cache access at time t and H is the number of historical cache accesses.

The cache replacement prediction module stores a neural network.

The cache replacement prediction module uses the cache eviction prediction neural network to predict cache line eviction probabilities and transmits them to the cache replacement module.

The cache replacement module stores the current cache access request in the cache according to the cache line eviction probabilities.

The database stores the data of the access data acquisition module, the cache replacement prediction module, and the cache replacement module.
The cache eviction prediction neural network comprises a temporal convolutional neural network, an attention mechanism neural network, a receiving layer, and an output layer.

The temporal convolutional neural network comprises an input layer, hidden layers, and an output layer.

The input of the input layer of the temporal convolutional neural network is the cache access request sequence S = {s_{t-H+1}, s_{t-H+2}, …, s_{t-1}, s_t}, and its output is the embedding of the cache access request sequence, e(s) = {e(s_{t-H+1}), e(s_{t-H+2}), …, e(s_{t-1}), e(s_t)}, where e(s_t) denotes the embedding of the cache access at time t.

The input of the first hidden layer of the temporal convolutional neural network is the output e(s) of the input layer.

The input of the (j+1)-th hidden layer is the output h^j(s) of the j-th hidden layer, and its output is h^{j+1}(s).

The hidden layer output h^{j+1}(s_t) is given by:

h^{j+1}(s_t) = Activation(h^j(s_t) + F(s_t))   (1)

where F(s_t) denotes the output of the dilated convolution at time t and Activation is the activation function.

The output F(s_t) of the dilated convolution of the temporal convolutional neural network at time t is:

F(s_t) = Σ_{i=0}^{k-1} f(i) · h^j(s_{t-d·i})   (2)

where h^j(s_{t-d·i}) is the information of the access data at time t-d·i in the j-th hidden layer, f(i) is the convolution filter, k is the convolution kernel size, and d is the dilation factor; for the j-th layer, d = 2^j.

The output of the output layer of the temporal convolutional neural network is the output of the last hidden layer, expressed as the historical information of the cache access request sequence, h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), …, h(s_{t-1}), h(s_t)}, where h(s_t) denotes the historical information of the content of the cache access at time t.
F(s_t) in Equation (2) is the dilated convolution output of the temporal convolutional neural network at time t. F(s_t) and the input at time t are combined through a residual connection to produce the output h^{j+1}(s_t), so F(s_t) can be regarded as an intermediate output of the temporal convolutional neural network at time t, and the output of the residual connection between it and the current input is the actual output of the layer at the current time. Here the output h(s) of the output layer of the temporal convolutional neural network is the actual output of the last temporal convolutional layer.
The input of the attention mechanism neural network is the historical information of the current cache access request sequence, h(s) = {h(s_{t-H+1}), h(s_{t-H+2}), …, h(s_{t-1}), h(s_t)}, and the embedding of the cache line contents at the current time step, e(l) = {e(l_1), e(l_2), e(l_3), …, e(l_W)}; its output is the context vector of each cache line, g = {g_1, g_2, …, g_W}, where l_w denotes the w-th cache line in the cache, W denotes the number of cache lines in the cache, g_w denotes the context of the w-th cache line, and w = 1, 2, …, W.

The context vector g_w is given by:

g_w = Σ_{i=1}^{H} α_i · h(s_{t-H+i})   (3)

where the coefficient α_i is:

α_i = softmax(e(l_w)^T · W_e · h(s_{t-H+i}))   (4)

where W_e is a trainable parameter.
The input of the receiving layer is the context vector g_w, and its output is the reuse distance of each cache line, dis = {dis_1, dis_2, …, dis_W}, where dis_w denotes the reuse distance of the w-th cache line.

The reuse distance dis of each cache line is given by:

dis = Linear(Relu(Linear(g_w)))   (5)

where Relu is the activation function and Linear is a linear (fully connected) function.

The input of the output layer is the reuse distance dis of each cache line, and its output is the eviction probability of each cache line, prob = {p_1, p_2, …, p_W}, where p_w denotes the eviction probability of the w-th cache line in the cache.

The eviction probability prob of each cache line is given by:

prob = Sigmoid(dis)   (6)

where Sigmoid is the activation function.
The system also comprises a neural network training module.

The neural network training module is used to train the cache eviction prediction neural network of the cache replacement prediction module.
The steps of training the cache eviction prediction neural network include:

1) Determine whether the current cache is not full or the current cache access request is already in the cache. If so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2).

2) Build the cache eviction prediction neural network.

3) Use the Belady algorithm to compute the true reuse distance of each cache access request, dis^B = {dis^B_1, dis^B_2, …, dis^B_n}, where n is the total number of accesses and dis^B_n denotes the true reuse distance of the n-th cache access request.

4) Use the cache eviction prediction neural network to generate the eviction probability of each cache line, prob = {p_1, p_2, …, p_W}.

5) Compute the error Err_t between the cache line eviction probabilities output by the cache eviction prediction network at time t and those of the Belady algorithm, namely:

Err_t = Bceloss(prob_t, prob^B_t) = -(1/W) · Σ_{i=1}^{W} [p^B_i · log(p_i) + (1 - p^B_i) · log(1 - p_i)]   (7)

where prob_t and prob^B_t are the cache line eviction probabilities of the cache eviction prediction network and of the Belady algorithm at time t, respectively; p_i and p^B_i are their respective probabilities of evicting the i-th cache line at time t; and W is the cache size.

The Belady cache line eviction probability p^B_i satisfies:

p^B_i = 1, if i ∈ seq[0 : topk]   (8)
p^B_i = 0, otherwise   (9)

where the parameter topk and the sequence seq are given by:

topk = int(W * α)   (10)
seq = decend_sort(dis^B)   (11)

where α is the eviction percentage, int() is the floor function, and decend_sort() is the descending sort function.

6) Update the neural network parameters ω using the gradient of the cache eviction prediction network, and determine whether the loss function value has reached its minimum or the number of training rounds has reached the preset value. If so, end the training of the cache eviction prediction neural network; otherwise, go to step 7).

The parameter ω of the cache eviction prediction neural network is updated as follows:

ω′ = ω - ε_t · ∇_ω Bceloss(prob_t, prob^B_t)   (12)

where ω is the current network parameter, ω′ is the updated parameter, ε_t is the current learning rate, ∇ is the gradient operator, and Bceloss is the binary cross-entropy loss function.

7) The cache replacement module selects a cache line to evict and stores the current access in the cache.

8) Obtain a new cache access request and return to step 1).
The steps by which the cache replacement module stores the current cache access request in the cache include:

1) Determine whether the current cache is not full or the current cache access request is already in the cache. If so, put the current cache access request directly into the cache or fetch it from the cache; otherwise, go to step 2).

2) The cache replacement module retrieves the cache line eviction probabilities from the cache replacement prediction module and generates a cache line eviction policy.

3) The cache replacement module evicts a cache line according to the cache line eviction policy and thereby stores the current cache access request in the cache.

The cache line eviction policy is as follows:

index = decend_sort(prob)   (13)
evict_index = index(0)   (14)

where prob is the cache line eviction probability vector, decend_sort() is the descending sort function, index is the resulting index order, index(0) denotes the cache line with the highest eviction probability, and evict_index is the eviction decision.
一种基于模仿学习的缓存替换系统的使用方法,包括以下步骤:A method for using a cache replacement system based on imitation learning comprises the following steps:
1)搭建基于模仿学习的缓存替换系统;1) Build a cache replacement system based on imitation learning;
2)获取缓存访问请求序列;2) Obtain cache access request sequence;
3)判断当前缓存是否未满或者当前访问是否在缓存内,若是则直接将当前访问放入缓存中或从缓存中获取当前缓存访问并进入步骤9),否则,进入步骤4)。3) Determine whether the current cache is not full or whether the current access is in the cache. If so, directly put the current access into the cache or obtain the current cache access from the cache and go to step 9). Otherwise, go to step 4).
4)对缓存替换预测模块的缓存替换预测神经网络进行训练;4) training a cache replacement prediction neural network of a cache replacement prediction module;
5)将缓存访问请求序列输入到训练好的缓存替换预测神经网络中,计算得到缓存行逐出概率,并传输至缓存替换模块;5) Input the cache access request sequence into the trained cache replacement prediction neural network, calculate the cache line eviction probability, and transmit it to the cache replacement module;
6)缓存替换模块判断当前缓存是否未满或者当前缓存访问请求是否已经在缓存中,若是,则直接将当前缓存访问请求放入缓存中或从缓存中获取当前缓存访问请求,否则,进入步骤7)。6) The cache replacement module determines whether the current cache is not full or whether the current cache access request is already in the cache. If so, the current cache access request is directly placed in the cache or obtained from the cache. Otherwise, it proceeds to step 7).
7)所述缓存替换模块从缓存替换预测模块调取缓存行逐出概率,并生成缓存行逐出策略;7) The cache replacement module retrieves the cache line eviction probability from the cache replacement prediction module and generates a cache line eviction strategy;
8)所述缓存替换模块根据缓存行逐出策略进行缓存行的逐出,从而将当前缓存访问请求存入缓存中。8) The cache replacement module evicts the cache line according to the cache line eviction policy, thereby storing the current cache access request in the cache.
9)获取新的缓存访问请求序列,并返回步骤3)。9) Obtain a new cache access request sequence and return to step 3).
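Steps 3)-9) above describe a standard hit/insert/evict loop. A hypothetical end-to-end sketch under stated assumptions — the Cache class and the predict_probs callback are illustrative stand-ins for the trained prediction network, not names from the patent:

```python
# Hypothetical sketch of steps 3)-9); names are illustrative.
class Cache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = []

    def access(self, item, predict_probs):
        if item in self.lines:               # hit: serve from cache
            return True
        if len(self.lines) < self.capacity:  # cache not full: insert directly
            self.lines.append(item)
            return False
        probs = predict_probs(self.lines)    # eviction probability per line
        victim = max(range(len(probs)), key=probs.__getitem__)
        self.lines[victim] = item            # evict and store the new request
        return False

cache = Cache(capacity=2)
# a uniform dummy predictor stands in for the neural network
hits = [cache.access(x, lambda lines: [0.5] * len(lines)) for x in "abab"]
print(hits)  # [False, False, True, True]
```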
实施例2:Embodiment 2:
基于模仿学习的缓存替换系统,包括访问数据获取模块、缓存替换预测模块、缓存替换模块和数据库。A cache replacement system based on imitation learning includes an access data acquisition module, a cache replacement prediction module, a cache replacement module and a database.
所述访问数据获取模块获取对缓存的访问序列。S={st-H+1,st-H+2,…,st-1,st}。其中st表示t时刻的缓存访问,H为所述缓存访问序列中历史缓存访问的数量。判断当前缓存是否未满或者当前所访问内容是否已经在缓存中,若是,则直接将当前访问放入缓存中或从缓存中获取当前缓存访问,否则,将访问序列发送至缓存替换预测模块。The access data acquisition module obtains the access sequence to the cache. S = { st-H+1 , st -H+2 , ..., st -1 , st }. Where st represents the cache access at time t, and H is the number of historical cache accesses in the cache access sequence. Determine whether the current cache is not full or whether the currently accessed content is already in the cache. If so, directly put the current access into the cache or obtain the current cache access from the cache. Otherwise, send the access sequence to the cache replacement prediction module.
所述缓存替换预测模块包括时间卷积神经网络、注意力机制神经网络、承接层和输出层,并将输出的缓存行逐出概率发送至缓存替换模块。The cache replacement prediction module includes a temporal convolutional neural network, an attention mechanism neural network, a receiving layer and an output layer, and sends the output cache line eviction probability to the cache replacement module.
时间卷积神经网络和注意力机制神经网络具有先后关系，其中时间卷积神经网络实时提取访问序列中每个缓存访问的历史信息，注意力机制神经网络将所提取的历史信息与当前缓存中缓存行的嵌入进行计算，生成每个缓存行的上下文。它们和承接层、输出层共同组成神经网络模型，结构包括：The temporal convolutional neural network and the attention mechanism neural network operate in sequence: the temporal convolutional neural network extracts, in real time, the historical information of each cache access in the access sequence, and the attention mechanism neural network combines this historical information with the embeddings of the cache lines currently in the cache to generate a context for each cache line. Together with the receiving layer and the output layer, they form the neural network model, whose structure comprises:
Ⅰ)时间卷积神经网络提取访问序列历史信息;Ⅰ) Temporal convolutional neural network extracts access sequence history information;
Ⅱ)注意力机制网络计算上下文向量;II) The attention network calculates the context vector;
Ⅲ)承接层输出每个缓存行的重用距离预测;III) The receiving layer outputs the reuse distance prediction of each cache line;
Ⅳ)输出层输出每个缓存行的逐出概率预测;IV) The output layer outputs the eviction probability prediction for each cache line;
时间卷积神经网络:时间卷积神经网络包括输入层、隐藏层和输出层。Temporal Convolutional Neural Network: Temporal Convolutional Neural Network consists of input layer, hidden layer and output layer.
时间卷积神经网络输入层的输入为缓存访问请求序列S={st-H+1,st-H+2,…,st-1,st},输出为缓存访问序列的嵌入e(s)={e(st-H+1),e(st-H+2),…,e(st-1),e(st)},其中e(st)表示t时刻缓存访问的嵌入。将输入层的输出作为隐藏层第一层的输入。The input of the temporal convolutional neural network input layer is the cache access request sequence S = { st-H+1 , st -H+2 , ..., st -1 , st }, and the output is the embedding of the cache access sequence e(s) = {e( st-H+1 ), e( st-H+2 ), ..., e( st-1 ), e( st )}, where e( st ) represents the embedding of the cache access at time t. The output of the input layer is used as the input of the first hidden layer.
时间卷积神经网络第j+1层隐藏层的输入为第j层隐藏层的输出hiddenj，计算步骤如下：The input of hidden layer j+1 of the temporal convolutional neural network is the output hiddenj of hidden layer j, computed as follows:
F(st)=Σi=0…k-1 f(i)·hiddenj(st-d·i) (1)
hiddenj+1(st)=Activation(hiddenj(st)+F(st)) (2)
式中，F(st)表示扩张卷积的输出；hiddenj(st-d·i)为t-d·i时刻的访问数据在第j层隐藏层的信息；f:{0,…,k-1}→R为卷积滤波器，k为卷积核大小，d为膨胀系数，对于第j层，d=2^j；Activation为激活函数。为了使梯度不会消失，时间卷积神经网络在每层使用残差连接来获得下一层的输出hiddenj+1。Where F(st) is the output of the dilated convolution; hiddenj(st-d·i) is the information of the access at time t-d·i in hidden layer j; f:{0,…,k-1}→R is the convolution filter, k the kernel size, and d the dilation factor, with d=2^j for layer j; Activation is the activation function. To keep gradients from vanishing, each layer uses a residual connection to obtain the next layer's output hiddenj+1.
时间卷积神经网络输出层将最后一层隐藏层的信息作为每个缓存访问数据的历史信息h(s)={h(st-H+1),h(st-H+2),…,h(st-1),h(st)}输出,其中h(st)表示t时刻缓存访问内容的历史信息。The output layer of the temporal convolutional neural network outputs the information of the last hidden layer as the historical information of each cache access data h(s) = {h(s t-H+1 ), h(s t-H+2 ),…, h(s t-1 ), h(s t )}, where h(s t ) represents the historical information of the cache access content at time t.
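The dilated causal convolution with residual connection of Eqs. (1)-(2) can be sketched as below. This is a simplified formulation with scalar filter taps shared across feature dimensions, and the activation is assumed to be ReLU (the patent leaves Activation generic):

```python
import numpy as np

def dilated_causal_layer(hidden, f, d):
    """One TCN hidden layer (Eqs. (1)-(2)): for each time step t, sum the k
    past positions spaced d apart (causal dilated convolution), apply the
    activation, then add a residual connection."""
    H, dim = hidden.shape
    k = f.shape[0]                      # kernel size
    out = np.zeros_like(hidden)
    for t in range(H):
        acc = np.zeros(dim)
        for i in range(k):
            idx = t - d * i             # access information at time t - d*i
            if idx >= 0:                # causal: never look into the future
                acc += f[i] * hidden[idx]
        out[t] = np.maximum(acc, 0.0)   # Activation (ReLU assumed)
    return out + hidden                 # residual connection -> hidden_{j+1}

# stack layers with dilation d = 2**j, as the patent describes
h = np.random.randn(8, 4)
for j in range(3):
    h = dilated_causal_layer(h, f=np.ones(2) / 2, d=2 ** j)
print(h.shape)  # (8, 4)
```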
注意力网络：对于任意时间步t，所述注意力网络的输入为当前缓存访问请求序列的历史信息h(s)={h(st-H+1),h(st-H+2),…,h(st-1),h(st)}和当前时间步缓存行内容的嵌入e(l)={e(l1),e(l2),e(l3),…,e(lW)}，其中lw表示缓存中的第w个缓存行，W表示缓存中缓存行的数量。输出为各缓存行的上下文向量g={g1,g2,…,gW}，其中gw表示第w个缓存行的上下文内容。Attention network: For any time step t, the input of the attention network is the historical information of the current cache access request sequence h(s)={h(st-H+1), h(st-H+2), …, h(st-1), h(st)} and the embeddings of the cache line contents at the current time step e(l)={e(l1), e(l2), e(l3), …, e(lW)}, where lw denotes the wth cache line in the cache and W the number of cache lines. The output is the set of context vectors g={g1, g2, …, gW}, where gw denotes the context of the wth cache line.
上下文内容gw如下所示：The context gw is computed as:
αi=softmax(e(lw)TWeh(st-H+i)) (3)
gw=Σi=1…H αi·h(st-H+i) (4)
式中，αi为第i个历史访问的注意力权重，We是可训练的参数。Where αi is the attention weight assigned to the ith historical access and We is a trainable parameter matrix.
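Eqs. (3)-(4) compute, for each cache line, attention weights over the H history states and a weighted-sum context. A minimal sketch in which We and the random inputs are illustrative placeholders for the trained parameters and real embeddings:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())   # shift for numerical stability
    return z / z.sum()

def context_vectors(e_l, h_s, W_e):
    """Eqs. (3)-(4): the weight of cache line w on history step i is
    softmax(e(l_w)^T W_e h(s_i)); the context g_w is the weighted sum of
    the history states h(s_i)."""
    contexts = []
    for e_lw in e_l:                     # one cache line at a time
        scores = np.array([e_lw @ W_e @ h_i for h_i in h_s])
        alpha = softmax(scores)          # attention weights over history
        contexts.append(alpha @ h_s)     # g_w = sum_i alpha_i * h(s_i)
    return np.stack(contexts)

H, dim, W = 6, 4, 3
g = context_vectors(np.random.randn(W, dim), np.random.randn(H, dim),
                    np.random.randn(dim, dim))
print(g.shape)  # (3, 4)
```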
承接层：所述承接层的输入是上下文向量gw，输出是每个缓存行的重用距离dis={dis1,dis2,…,disW}，其中disw表示第w个缓存行的重用距离。Receiving layer: The input of the receiving layer is the context vector gw, and the output is the reuse distance of each cache line dis={dis 1 , dis 2 , ..., dis W }, where dis w denotes the reuse distance of the wth cache line.
dis=Linear(Relu(Linear(gw))) (5)dis=Linear(Relu(Linear(gw))) (5)
承接层的激活函数为RELU：f(x)=max(0,x)，线性函数为Linear：y=xAT+b。The activation function of the receiving layer is ReLU: f(x)=max(0,x), and the linear function is Linear: y=xA T +b.
输出层:所述输出层的输入为每个缓存行的重用距离dis,输出是每个缓存行的逐出概率prob={p1,p2,…,pW},其中pw表示缓存中第w个缓存行的逐出概率。Output layer: The input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p 1 , p 2 , ..., p W } of each cache line, where p w represents the eviction probability of the w-th cache line in the cache.
prob=Sigmoid(dis) (6)prob=Sigmoid(dis) (6)
输出层的激活函数为Sigmoid：f(x)=1/(1+e−x)。The activation function of the output layer is Sigmoid: f(x)=1/(1+e −x ).
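The receiving layer and output layer of Eqs. (5)-(6) form a small MLP head: Linear → ReLU → Linear yields the reuse distance, and a Sigmoid maps it to an eviction probability. A sketch with placeholder weights standing in for the trained parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def prediction_head(gw, A1, b1, A2, b2):
    """Eqs. (5)-(6): dis = Linear(ReLU(Linear(g_w))) per cache line,
    then prob = Sigmoid(dis). A1, b1, A2, b2 are illustrative stand-ins
    for the trained weights."""
    dis = np.maximum(gw @ A1.T + b1, 0.0) @ A2.T + b2  # reuse distance
    prob = sigmoid(dis)                                # eviction probability
    return dis, prob

W, dim, hid = 4, 8, 16
dis, prob = prediction_head(np.random.randn(W, dim),
                            np.random.randn(hid, dim), np.zeros(hid),
                            np.random.randn(1, hid), np.zeros(1))
print(prob.shape)  # (4, 1)
```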
所述缓存替换模块根据缓存中每个缓存行的逐出概率选择一个缓存行进行逐出并存入当前访问数据。The cache replacement module selects a cache line to evict and stores the currently accessed data according to the eviction probability of each cache line in the cache.
缓存行的逐出选择步骤如下:The cache line eviction selection steps are as follows:
index=decend_sort(prob) (7)index=decend_sort(prob) (7)
evict_index=index(0) (8)evict_index=index(0) (8)
式中,prob表示缓存行逐出概率向量,decend_sort()是降序排序函数,index(0)表示逐出概率最高的缓存行。Where prob represents the cache line eviction probability vector, decend_sort() is the descending sort function, and index(0) represents the cache line with the highest eviction probability.
所述神经网络训练模块对神经网络模型进行训练,得到训练好的神经网络模型。The neural network training module trains the neural network model to obtain a trained neural network model.
对神经网络模型进行训练的步骤包括:The steps to train a neural network model include:
1)判断当前缓存是否未满或者当前所访问内容是否已经在缓存中,若是,则直接将当前访问放入缓存中或从缓存中获取当前缓存访问并进入步骤7),否则,进入步骤2)。1) Determine whether the current cache is not full or whether the currently accessed content is already in the cache. If so, directly put the current access into the cache or obtain the current cache access from the cache and go to step 7). Otherwise, go to step 2).
2)Belady专家策略根据所有访问数据计算出每个访问数据的真实重用距离dis*={dis*1,dis*2,…,dis*n}，其中n为总的访问数。2) The Belady expert policy computes, from the full access trace, the true reuse distance of each access, dis*={dis* 1 , dis* 2 , …, dis* n }, where n is the total number of accesses.
3)建立缓存逐出预测网络,所述缓存逐出预测网络包括缓存替换预测模块和缓存逐出模块。3) Establishing a cache eviction prediction network, wherein the cache eviction prediction network includes a cache replacement prediction module and a cache eviction module.
在神经网络训练过程中，搭建了一个缓存逐出预测网络模型，即如上面Ⅰ)Ⅱ)Ⅲ)Ⅳ)描述的神经网络模型。神经网络训练和生成实际缓存替换策略的区别在于：训练时，Belady专家实时产生的缓存逐出内容与模型产生的缓存行逐出概率进行误差计算并进行参数更新；而在实际缓存替换策略中，仅利用训练好的模型进行缓存行逐出概率计算。During neural network training, a cache eviction prediction network model is built, namely the neural network model described in Ⅰ), Ⅱ), Ⅲ), and Ⅳ) above. The difference between training and generating the actual cache replacement policy is that, during training, the eviction decisions produced in real time by the Belady expert and the cache line eviction probabilities produced by the model are compared to compute the error and update the parameters, whereas in the actual cache replacement policy only the trained model is used to compute the cache line eviction probabilities.
4)缓存替换预测模块生成每个缓存行的逐出概率prob={p1,p2,…,pW}。4) The cache replacement prediction module generates the eviction probability prob = {p 1 , p 2 , ..., p W } for each cache line.
5)计算所述缓存逐出预测网络输出的缓存行逐出概率与Belady专家策略之间的误差。5) Calculate the error between the cache line eviction probability output by the cache eviction prediction network and the Belady expert strategy.
Losst=−(1/W)Σi=1…W[p*t,i·log(pt,i)+(1−p*t,i)·log(1−pt,i)] (9)
式中，probt、prob*t分别为模型和Belady算法在t时刻对缓存内容进行驱逐的概率向量，pt,i、p*t,i分别为模型和Belady算法在t时刻对第i个缓存行进行驱逐的概率，W为缓存大小。Where prob t and prob* t are the eviction probability vectors produced at time t by the model and by the Belady algorithm, p t,i and p* t,i are their respective probabilities of evicting the ith cache line at time t, and W is the cache size.
prob*t的计算如下所示：prob* t is computed as follows:
topk=int(W*α) (10)
p*t,i=1，若i∈decend_sort(dis*)(0:topk)；否则p*t,i=0 (11)
式中，α是驱逐百分比，int()是向下取整函数，decend_sort()是降序排序函数，即按真实重用距离降序排列的前topk个缓存行被标记为驱逐。Where α is the eviction percentage, int() is the floor function, and decend_sort() is the descending sort function; that is, the topk cache lines with the largest true reuse distances are labeled for eviction.
缓存逐出预测网络参数ω更新如下:The cache eviction prediction network parameter ω is updated as follows:
ω′=ω−εt·∇ωLosst (12)
其中，ω为网络当前参数，ω′为更新后的参数，εt为当前学习率，∇ω为梯度符号。Where ω is the current network parameter, ω′ the updated parameter, ε t the current learning rate, and ∇ ω the gradient operator.
6)使用缓存逐出预测网络的梯度对全局神经网络参数进行更新。6) Use the gradient of the cached eviction prediction network to update the global neural network parameters.
7)缓存替换模块选择一个缓存行进行逐出并将当前访问存入缓存中。7) The cache replacement module selects a cache line for eviction and stores the current access in the cache.
8)获取新的缓存访问序列并返回步骤1)。8) Get a new cache access sequence and return to step 1).
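The Belady expert of step 2) and the target construction of Eqs. (10)-(11) can be sketched as follows. Reading Eq. (11) as a binary top-k labeling — the int(W*α) lines with the largest true reuse distances get label 1 — is an interpretation, and the function names are illustrative:

```python
def belady_reuse_distance(accesses):
    """Step 2): true reuse distance of each access, looking forward in the
    full trace; accesses never seen again get an infinite distance."""
    dist = []
    for t, a in enumerate(accesses):
        try:
            dist.append(accesses.index(a, t + 1) - t)  # next reuse of a
        except ValueError:
            dist.append(float("inf"))                  # never reused
    return dist

def belady_target(dis_star, alpha):
    """Eqs. (10)-(11) as interpreted here: the top int(W*alpha) cache lines
    by true reuse distance are labeled 1 (evict), the rest 0."""
    W = len(dis_star)
    topk = int(W * alpha)
    order = sorted(range(W), key=lambda i: dis_star[i], reverse=True)
    target = [0.0] * W
    for i in order[:topk]:
        target[i] = 1.0
    return target

print(belady_reuse_distance(["a", "b", "a", "c"]))      # [2, inf, inf, inf]
print(belady_target([5.0, 1.0, 9.0, 3.0], alpha=0.5))   # [1.0, 0.0, 1.0, 0.0]
```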
所述缓存替换模块进行缓存替换的步骤包括：The cache replacement module performs cache replacement through the following steps:
1)判断当前缓存是否未满或者当前所访问内容是否已经在缓存中,若是,则直接将当前访问放入缓存中或从缓存中获取当前缓存访问,否则,进入步骤2)。1) Determine whether the current cache is not full or whether the currently accessed content is already in the cache. If so, directly put the current access into the cache or obtain the current cache access from the cache. Otherwise, go to step 2).
2)利用训练好的神经网络提取当前缓存序列中每个缓存访问内容的重用距离,步骤包括:2) Using the trained neural network to extract the reuse distance of each cache access content in the current cache sequence, the steps include:
2.1)以时间卷积网络最后一层隐藏层状态和缓存行的嵌入作为输入,得到每个缓存行的上下文信息gw。2.1) Taking the last hidden layer state of the temporal convolutional network and the embedding of the cache line as input, we get the context information gw of each cache line.
2.2)每个缓存行的上下文信息gw经过承接层输出缓存行的重用距离dis。2.2) The context information gw of each cache line is output through the receiving layer to obtain the reuse distance dis of the cache line.
3)利用训练好的神经网络将所提取的缓存重用距离经过输出层输出每个缓存行的逐出概率。3) Use the trained neural network to output the eviction probability of each cache line through the output layer based on the extracted cache reuse distance.
4)利用缓存替换模块进行缓存内容的替换。4) Use the cache replacement module to replace cache content.
所述数据库存储访问数据获取模块、缓存替换预测模块、缓存替换模块、神经网络训练模块的数据。The database stores data of the access data acquisition module, the cache replacement prediction module, the cache replacement module, and the neural network training module.
实施例3:Embodiment 3:
参见图1至图3,基于模仿学习的缓存替换方法,包括以下步骤:Referring to FIG. 1 to FIG. 3 , the cache replacement method based on imitation learning includes the following steps:
1)获取对缓存的访问序列。1) Get the access sequence to the cache.
所述访问数据获取模块获取对缓存的访问序列。S={st-H+1,st-H+2,…,st-1,st}。其中st表示t时刻的缓存访问,H为所述缓存访问序列中历史缓存访问的数量。判断当前缓存是否未满或者当前所访问内容是否已经在缓存中,若是,则直接将当前访问放入缓存中或从缓存中获取当前缓存访问,否则,将访问序列发送至缓存替换预测模块。The access data acquisition module obtains the access sequence to the cache. S = { st-H+1 , st -H+2 , ..., st -1 , st }. Where st represents the cache access at time t, and H is the number of historical cache accesses in the cache access sequence. Determine whether the current cache is not full or whether the currently accessed content is already in the cache. If so, directly put the current access into the cache or obtain the current cache access from the cache. Otherwise, send the access sequence to the cache replacement prediction module.
2)建立缓存替换预测模型,将输出的缓存行逐出概率发送至缓存替换模块。2) Establish a cache replacement prediction model and send the output cache line eviction probability to the cache replacement module.
所述缓存替换预测模块包括时间卷积神经网络、注意力机制神经网络、承接层和输出层,并将输出的缓存行逐出概率发送至缓存替换模块。The cache replacement prediction module includes a temporal convolutional neural network, an attention mechanism neural network, a receiving layer and an output layer, and sends the output cache line eviction probability to the cache replacement module.
所述时间卷积神经网络包括输入层、隐藏层和输出层。The temporal convolutional neural network includes an input layer, a hidden layer and an output layer.
时间卷积神经网络输入层的输入为缓存访问请求序列S={st-H+1,st-H+2,…,st-1,st},输出为缓存访问序列的嵌入e(s)={e(st-H+1),e(st-H+2),…,e(st-1),e(st)},其中e(st)表示t时刻缓存访问的嵌入。将输入层的输出作为隐藏层第一层的输入。The input of the temporal convolutional neural network input layer is the cache access request sequence S = {s t-H+1 ,s t-H+2 ,…,s t-1 ,s t }, and the output is the embedding of the cache access sequence e(s) = {e(s t-H+1 ),e(s t-H+2 ),…,e(s t-1 ),e(s t )}, where e(s t ) represents the embedding of the cache access at time t. The output of the input layer is used as the input of the first hidden layer.
时间卷积神经网络第j+1层隐藏层的输入为第j层隐藏层的输出hiddenj，计算步骤如下：The input of hidden layer j+1 of the temporal convolutional neural network is the output hiddenj of hidden layer j, computed as follows:
F(st)=Σi=0…k-1 f(i)·hiddenj(st-d·i) (1)
hiddenj+1(st)=Activation(hiddenj(st)+F(st)) (2)
式中，F(st)表示扩张卷积的输出；hiddenj(st-d·i)为t-d·i时刻的访问数据在第j层隐藏层的信息；f:{0,…,k-1}→R为卷积滤波器，k为卷积核大小，d为膨胀系数，对于第j层，d=2^j；Activation为激活函数。为了使梯度不会消失，时间卷积神经网络在每层使用残差连接来获得下一层的输出hiddenj+1。Where F(st) is the output of the dilated convolution; hiddenj(st-d·i) is the information of the access at time t-d·i in hidden layer j; f:{0,…,k-1}→R is the convolution filter, k the kernel size, and d the dilation factor, with d=2^j for layer j; Activation is the activation function. To keep gradients from vanishing, each layer uses a residual connection to obtain the next layer's output hiddenj+1.
时间卷积神经网络输出层将最后一层隐藏层的信息作为每个缓存访问数据的历史信息h(s)={h(st-H+1),h(st-H+2),…,h(st-1),h(st)}输出,其中h(st)表示t时刻缓存访问内容的历史信息。The output layer of the temporal convolutional neural network outputs the information of the last hidden layer as the historical information of each cache access data h(s) = {h(s t-H+1 ), h(s t-H+2 ),…, h(s t-1 ), h(s t )}, where h(s t ) represents the historical information of the cache access content at time t.
对于任意时间步t，所述注意力网络获取时间卷积网络输出层的输出h(s)={h(st-H+1),h(st-H+2),…,h(st-1),h(st)}和当前时间步缓存行内容的嵌入e(l)={e(l1),e(l2),e(l3),…,e(lW)}作为注意力网络的输入，其中lw表示缓存中的第w个缓存行，W表示缓存中缓存行的数量。输出为各缓存行的上下文向量g={g1,g2,…,gW}，其中gw表示第w个缓存行的上下文内容。For any time step t, the attention network takes as input the output of the temporal convolutional network output layer h(s)={h(st-H+1), h(st-H+2), …, h(st-1), h(st)} and the embeddings of the cache line contents at the current time step e(l)={e(l1), e(l2), e(l3), …, e(lW)}, where lw denotes the wth cache line in the cache and W the number of cache lines. The output is the set of context vectors g={g1, g2, …, gW}, where gw denotes the context of the wth cache line.
上下文内容gw如下所示：The context gw is computed as:
αi=softmax(e(lw)TWeh(st-H+i)) (3)
gw=Σi=1…H αi·h(st-H+i) (4)
式中，αi为第i个历史访问的注意力权重，We是可训练的参数。Where αi is the attention weight assigned to the ith historical access and We is a trainable parameter matrix.
所述承接层的输入是上下文向量gw,输出是每个缓存行的重用距离dis={dis1,dis2,…,disW},其中disw表示第w个缓存行的重用距离。The input of the receiving layer is the context vector gw, and the output is the reuse distance dis={dis 1 , dis 2 , …, dis W } of each cache line, where dis w represents the reuse distance of the w-th cache line.
dis=Linear(Relu(Linear(gw))) (5)dis=Linear(Relu(Linear(gw))) (5)
承接层的激活函数为RELU：f(x)=max(0,x)，线性函数为Linear：y=xAT+b。The activation function of the receiving layer is ReLU: f(x)=max(0,x), and the linear function is Linear: y=xA T +b.
所述输出层的输入为每个缓存行的重用距离dis,输出是每个缓存行的逐出概率prob={p1,p2,…,pW},其中pw表示缓存中第w个缓存行的逐出概率。The input of the output layer is the reuse distance dis of each cache line, and the output is the eviction probability prob = {p 1 , p 2 , ..., p W } of each cache line, where p w represents the eviction probability of the wth cache line in the cache.
prob=Sigmoid(dis) (6)prob=Sigmoid(dis) (6)
输出层的激活函数为Sigmoid：f(x)=1/(1+e−x)。The activation function of the output layer is Sigmoid: f(x)=1/(1+e −x ).
3)将缓存替换预测模块的输出输入到缓存替换模块,以将当前缓存访问替换缓存中某个缓存行。3) The output of the cache replacement prediction module is input to the cache replacement module to replace a cache line in the cache with the current cache access.
所述缓存替换模块根据缓存中每个缓存行的逐出概率选择一个缓存行进行逐出并存入当前访问数据。The cache replacement module selects a cache line to evict and stores the currently accessed data according to the eviction probability of each cache line in the cache.
缓存行的逐出选择步骤如下:The cache line eviction selection steps are as follows:
index=decend_sort(prob) (7)index=decend_sort(prob) (7)
evict_index=index(0) (8)evict_index=index(0) (8)
式中,prob表示缓存行逐出概率向量,decend_sort()是降序排序函数,index(0)表示逐出概率最高的缓存行。Where prob represents the cache line eviction probability vector, decend_sort() is the descending sort function, and index(0) represents the cache line with the highest eviction probability.
4)建立神经网络训练模块,对缓存替换预测模型中的神经网络进行训练,得到训练好的神经网络模型。4) Establishing a neural network training module, training the neural network in the cache replacement prediction model, and obtaining a trained neural network model.
对神经网络模型进行训练的步骤包括:The steps to train a neural network model include:
4.1)判断当前缓存是否未满或者当前所访问内容是否已经在缓存中，若是，则直接将当前访问放入缓存中或从缓存中获取当前缓存访问并进入步骤4.7)，否则，进入步骤4.2)。4.1) Determine whether the current cache is not full or whether the currently accessed content is already in the cache. If so, directly put the current access into the cache or obtain the current access from the cache and proceed to step 4.7); otherwise, proceed to step 4.2).
4.2)Belady专家策略根据所有访问数据计算出每个访问数据的真实重用距离dis*={dis*1,dis*2,…,dis*n}，其中n为总的访问数。4.2) The Belady expert policy computes, from the full access trace, the true reuse distance of each access, dis*={dis* 1 , dis* 2 , …, dis* n }, where n is the total number of accesses.
4.3)建立缓存逐出预测网络,所述缓存逐出预测网络包括缓存替换预测模块和缓存逐出模块。4.3) Establishing a cache eviction prediction network, wherein the cache eviction prediction network includes a cache replacement prediction module and a cache eviction module.
4.4)缓存替换预测模块生成每个缓存行的逐出概率prob={p1,p2,…,pW}。4.4) The cache replacement prediction module generates the eviction probability prob = {p 1 , p 2 , …, p W } for each cache line.
4.5)计算所述缓存逐出预测网络输出的缓存行逐出概率与Belady专家策略之间的误差。4.5) Calculate the error between the cache line eviction probability output by the cache eviction prediction network and the Belady expert strategy.
Losst=−(1/W)Σi=1…W[p*t,i·log(pt,i)+(1−p*t,i)·log(1−pt,i)] (9)
式中，probt、prob*t分别为模型和Belady算法在t时刻对缓存内容进行驱逐的概率向量，pt,i、p*t,i分别为模型和Belady算法在t时刻对第i个缓存行进行驱逐的概率，W为缓存大小。Where prob t and prob* t are the eviction probability vectors produced at time t by the model and by the Belady algorithm, p t,i and p* t,i are their respective probabilities of evicting the ith cache line at time t, and W is the cache size.
prob*t的计算如下所示：prob* t is computed as follows:
topk=int(W*α) (10)
p*t,i=1，若i∈decend_sort(dis*)(0:topk)；否则p*t,i=0 (11)
式中，α是驱逐百分比，int()是向下取整函数，decend_sort()是降序排序函数，即按真实重用距离降序排列的前topk个缓存行被标记为驱逐。Where α is the eviction percentage, int() is the floor function, and decend_sort() is the descending sort function; that is, the topk cache lines with the largest true reuse distances are labeled for eviction.
缓存逐出预测网络参数ω更新如下:The cache eviction prediction network parameter ω is updated as follows:
ω′=ω−εt·∇ωLosst (12)
其中，ω为网络当前参数，ω′为更新后的参数，εt为当前学习率，∇ω为梯度符号。Where ω is the current network parameter, ω′ the updated parameter, ε t the current learning rate, and ∇ ω the gradient operator.
4.6)使用缓存逐出预测网络的梯度对全局神经网络参数进行更新。4.6) Use the gradient of the cached eviction prediction network to update the global neural network parameters.
4.7)缓存替换模块选择一个缓存行进行逐出并将当前访问存入缓存中。4.7) The cache replacement module selects a cache line for eviction and stores the current access in the cache.
4.8)获取新的缓存访问序列并返回步骤4.1)。4.8) Obtain a new cache access sequence and return to step 4.1).
5)将访问序列输入训练好的神经网络模型中,利用缓存替换模块完成缓存替换操作。5) Input the access sequence into the trained neural network model and use the cache replacement module to complete the cache replacement operation.
利用训练好的神经网络对访问数据进行缓存替换预测和缓存替换,步骤包括:The trained neural network is used to predict and replace cache access data. The steps include:
5.1)判断当前缓存是否未满或者当前所访问内容是否已经在缓存中，若是，则直接将当前访问放入缓存中或从缓存中获取当前缓存访问，否则，进入步骤5.2)。5.1) Determine whether the current cache is not full or whether the currently accessed content is already in the cache. If so, directly put the current access into the cache or obtain the current access from the cache; otherwise, proceed to step 5.2).
5.2)利用训练好的神经网络提取当前缓存序列中每个缓存访问内容的重用距离,步骤包括:5.2) Using the trained neural network to extract the reuse distance of each cache access content in the current cache sequence, the steps include:
5.2.1)以时间卷积网络最后一层隐藏层状态和缓存行的嵌入作为输入,得到每个缓存行的上下文信息gw。5.2.1) Taking the last hidden layer state of the temporal convolutional network and the embedding of the cache line as input, we get the context information gw of each cache line.
5.2.2)每个缓存行的上下文信息gw经过承接层输出缓存行的重用距离dis。5.2.2) The context information gw of each cache line is output through the receiving layer to obtain the reuse distance dis of the cache line.
5.3)利用训练好的神经网络将所提取的缓存重用距离经过输出层输出每个缓存行的逐出概率。5.3) Use the trained neural network to output the eviction probability of each cache line through the output layer based on the extracted cache reuse distance.
5.4)利用训练好的缓存替换模块进行缓存内容的替换。5.4) Use the trained cache replacement module to replace cache content.
Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210491621.9A CN114780889B (en) | 2022-05-07 | 2022-05-07 | A cache replacement system and method based on imitation learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN114780889A CN114780889A (en) | 2022-07-22 |
| CN114780889B true CN114780889B (en) | 2024-06-25 |