CN111859703B - A heat-aware data center energy-saving data replica placement method - Google Patents
- Publication number
- CN111859703B (application number CN202010748759.3A)
- Authority
- CN
- China
- Prior art keywords
- copy
- disk
- energy consumption
- data
- active
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0625—Power saving in storage systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0647—Migration mechanisms
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
- G06F3/0674—Disk device
- G06F3/0676—Magnetic disk device
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/08—Fluids
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2119/00—Details relating to the type or aim of the analysis or the optimisation
- G06F2119/08—Thermal analysis or thermal optimisation
Description
Technical Field
The invention relates to the technical field of replica placement in data centers, and in particular to a heat-aware, energy-saving data replica placement method for data centers.
Background
With the exponential growth of data, the number of data centers that host it is rising rapidly, and both the number and the scale of machine rooms under construction keep expanding. At the same time, the power consumption and carbon emissions of data centers are growing geometrically, and their energy consumption is an increasingly prominent problem. For storage-type data centers, how to reduce energy consumption effectively without affecting data access services urgently needs to be solved.
Existing energy-saving methods for storage-type data centers save energy mainly by adjusting disk arrays, planning how disks are combined, and adjusting the layout of data sets. However, their energy savings are limited, the local hot spots they create incur a large cooling energy cost, and the disk-array computations involved are relatively complex. Few existing studies apply an airflow organization model to replica placement in storage-type data centers in order to reduce energy consumption.
Summary of the Invention
The purpose of the invention is to address the high operating energy cost and low energy efficiency of storage-type data centers by proposing a heat-aware, energy-saving data replica placement method that can effectively reduce the total energy consumption of a storage-type data center.
The purpose of the invention can be achieved by the following technical solution:
A heat-aware, energy-saving data replica placement method for data centers, comprising the following steps:
S1. Build an energy consumption model for the storage-type data center based on the data center's airflow organization characteristics and heat recirculation;
S2. With minimum total energy consumption as the objective, generate a heat-aware disk sequence DS and divide DS into an active replica area and a redundant replica area, where a replica is a replica of a data set, and a data set is a single data block or a collection of multiple data blocks;
S3. For the disks in the active replica area and in the redundant replica area separately, apply the energy consumption model of the storage-type data center again, with minimum total energy consumption as the objective, to generate the optimized disk sequence DS_new;
S4. Define and initialize the replica table ReplicaTable, use it to manage replicas, and place replicas sequentially in the active replica area and the redundant replica area, following the order of the optimized disk sequence DS_new;
S5. Collect replica access statistics over multiple periods, classify replicas as hot or cold, migrate the hot and cold replicas accordingly, and update ReplicaTable.
Further, step S1, building an energy consumption model for the storage-type data center based on the data center's airflow organization characteristics and heat recirculation, is implemented as follows:
Each chassis on each rack of the storage-type data center is treated as a node, so the data center is divided into a number of nodes. Each node contains several disks, and the disks within a node share one node power supply. Based on the airflow organization characteristics and heat recirculation, a node disk energy consumption model is obtained as a function of the number of active disks in the node. The data center cooling energy consumption model and total energy consumption model are then obtained from the inter-node heat recirculation coefficient matrix and the node disk energy consumption.
Further, the heat recirculation coefficient matrix between nodes is calculated from the data center's airflow organization characteristics and heat recirculation.
Further, in the node disk energy consumption model, which depends on the number of disks in each state within a node, disks are classified by service state as off, sleeping, or active. Each state consumes a different amount of energy, and the disks in their various states together make up the node disk energy consumption. In addition, Requests(s) is defined as a data access request for data set s received by the storage-type data center, where s is the number of the data set the request asks to access; the disk holding a replica of data set s must then be in the active state.
Further, in step S2, the process of generating the heat-aware disk sequence DS with minimum total energy consumption as the objective, and of dividing DS into an active replica area and a redundant replica area, is as follows:
S201. Using the energy consumption model, with minimum total energy consumption as the objective and following a greedy approach, traverse all nodes and select the first node and disk number to be switched on and to receive a replica, then record that disk number in the disk sequence DS. Disk numbers are selected in ascending order, and disk numbers already in DS are not traversed again;
S202. Using the energy consumption model, with minimum total energy consumption as the objective, select the next node and disk number to be switched on and to receive a replica, and record that disk number in DS;
S203. Keeping the disk numbers already recorded in DS fixed, repeat step S202 until the complete disk sequence DS is obtained;
S204. Divide DS into an active replica area and a redundant replica area according to the ratio of active replicas to redundant replicas.
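Steps S201 to S203 can be sketched as a greedy loop. The sketch below is illustrative only: the function names, the global disk numbering (disk k of node i gets number i * m + k), and the `toy_energy` cost are assumptions made for the example, and the cost function is a hypothetical stand-in for the energy consumption model rather than the model itself.

```python
from typing import Callable, List, Sequence


def greedy_disk_sequence(
    n_nodes: int,
    disks_per_node: int,
    total_energy: Callable[[Sequence[int]], float],
) -> List[int]:
    """Order all disks greedily by the total energy each activation would cause."""
    active = [0] * n_nodes  # active[i] = disks already switched on in node i
    ds: List[int] = []
    while len(ds) < n_nodes * disks_per_node:
        best_node, best_cost = -1, float("inf")
        for i in range(n_nodes):
            if active[i] == disks_per_node:
                continue  # node i has no unused disk left
            active[i] += 1  # tentatively switch on the next disk of node i
            cost = total_energy(active)
            active[i] -= 1  # undo the trial
            if cost < best_cost:
                best_node, best_cost = i, cost
        # Assumed numbering: disk k of node i has global number i * m + k.
        ds.append(best_node * disks_per_node + active[best_node])
        active[best_node] += 1  # keep the choice fixed and continue (S203)
    return ds


def toy_energy(active: Sequence[int]) -> float:
    """Hypothetical stand-in cost, NOT the patent's energy model."""
    heat_weight = [1.5, 1.0]  # assumed: node 0 recirculates more heat
    return sum((40.0 if y else 0.0) + 8.0 * y * w
               for y, w in zip(active, heat_weight))


print(greedy_disk_sequence(2, 2, toy_energy))  # [2, 3, 0, 1]
```

With this toy cost the cooler node 1 is filled first, so its disks (2 and 3) precede those of node 0 in the sequence. Each selection evaluates the cost once per node, which is what keeps the greedy approach cheap.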
Further, in step S204, according to the configured ratio of the number of active replicas, active, to the number of redundant replicas, redundant, the first $\lceil d \cdot \mathrm{active} / (\mathrm{active} + \mathrm{redundant}) \rceil$ disks of the disk sequence DS form the active replica area and the remaining disks form the redundant replica area, where d is the total number of disks in DS and $\lceil \cdot \rceil$ denotes rounding up.
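The split in step S204 can be illustrated with a short sketch. The cut-point formula used here, the ceiling of d * active / (active + redundant), is an assumption inferred from the stated ratio, the disk total d, and the rounding-up rule, because the formula itself is not reproduced in the extracted text. The function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_disk_sequence(
    ds: Sequence[int], active: int, redundant: int
) -> Tuple[List[int], List[int]]:
    """Split DS into (active replica area, redundant replica area)."""
    d = len(ds)
    total = active + redundant
    # Assumed cut point: ceil(d * active / (active + redundant)),
    # computed with integers to avoid floating-point rounding.
    cut = (d * active + total - 1) // total
    return list(ds[:cut]), list(ds[cut:])


active_area, redundant_area = split_disk_sequence(list(range(10)), 1, 2)
print(active_area)     # [0, 1, 2, 3]
print(redundant_area)  # [4, 5, 6, 7, 8, 9]
```

Because DS is ordered by energy cost, the active replica area receives the disks that are cheapest to keep running.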
Further, step S3 proceeds as follows:
S301. Define the set of disk numbers in the active replica area as S_Active and the set of disk numbers in the redundant replica area as S_Redundant;
S302. Using the energy consumption model, with minimum total energy consumption as the objective, traverse S_Active, select the first active disk numbers to be switched on and to receive replicas, and record them in the active replica area of the optimized disk sequence DS_new, where active is the number of active replicas;
S303. Using the energy consumption model, with minimum total energy consumption as the objective, traverse S_Redundant, select the next redundant disk numbers to be switched on and to receive replicas, and record them in the redundant replica area of the optimized disk sequence DS_new, where redundant is the number of redundant replicas;
S304. Keeping the disk numbers already recorded in DS_new fixed, repeat steps S302 and S303 until the complete disk sequence DS_new is obtained.
Further, the process of defining and initializing the replica table ReplicaTable in step S4 is as follows:
First initialize ReplicaTable. According to the data access requests, back up the data as active - 1 active replicas and redundant redundant replicas, where active is the number of active replicas and redundant is the number of redundant replicas. Following the disk number order of the disk sequence DS_new, place active replicas in the active replica area and redundant replicas in the redundant replica area, and write each replica's location into ReplicaTable as follows:
$$\mathrm{ReplicaTable}_{s,k} = j$$
where s is the data set number, k denotes the k-th replica of that data set, and j is the number of the disk holding that replica; that is, $\mathrm{ReplicaTable}_{s,k} = j$ states that the k-th replica of data set s is stored on disk j.
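One plausible in-memory form of ReplicaTable is a mapping from the pair (s, k) to the disk number j. The sketch below assumes that representation; the helper names `record_replica` and `disks_holding` are hypothetical and not taken from the source.

```python
from typing import Dict, List, Tuple

# ReplicaTable[(s, k)] = j: the k-th replica of data set s lives on disk j.
ReplicaTable = Dict[Tuple[int, int], int]


def record_replica(table: ReplicaTable, s: int, k: int, j: int) -> None:
    """Write the location of the k-th replica of data set s."""
    table[(s, k)] = j


def disks_holding(table: ReplicaTable, s: int) -> List[int]:
    """Disk numbers that hold any replica of data set s, in ascending order."""
    return sorted({j for (data_set, _k), j in table.items() if data_set == s})


table: ReplicaTable = {}
record_replica(table, 3, 1, 7)
record_replica(table, 3, 2, 12)
record_replica(table, 5, 1, 7)
print(disks_holding(table, 3))  # [7, 12]
```

A lookup such as `disks_holding` is what a request Requests(s) would need in order to find which disks hold a replica of data set s.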
Further, step S5 proceeds as follows:
S501. Record and collect replica access statistics over multiple periods, including the access frequency and access time of data replicas;
S502. Compare the access frequencies of the data sets within each period and sort them in descending order. The top 20% of data sets are defined as hot data and their replicas as hot replicas; the remaining data sets are cold data and their replicas are cold replicas. The access frequency of a data set in the current period is the sum of the access frequencies of all of its replicas in that period;
S503. Use double exponential smoothing to predict data set access frequencies for the next D periods, and reclassify hot and cold replicas accordingly;
S504. Migrate the newly classified hot replicas in the active replica area of the optimized disk sequence DS_new to the first 20% of the disks in that area, and migrate the newly classified cold replicas in that area to its remaining disks;
S505. Migrate the newly classified hot replicas in the redundant replica area of the optimized disk sequence DS_new to the first 20% of the disks in that area, and migrate the newly classified cold replicas in that area to its remaining disks. This completes the migration and placement of hot and cold replicas; ReplicaTable is updated at the same time.
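Steps S502 and S503 can be sketched as follows. The source names double exponential smoothing but gives no parameters, so this sketch assumes Brown's single-parameter variant with smoothing factor `alpha = 0.5`, initializes both smoothed series with the first observation, ranks data sets by the sum of the forecasts over the horizon, and rounds the 20% share up. All of these choices, and the function names, are assumptions.

```python
from typing import Dict, Sequence, Set


def forecast_total(history: Sequence[float], periods: int,
                   alpha: float = 0.5) -> float:
    """Brown's double exponential smoothing on a non-empty history.

    Returns the summed forecast for the next `periods` periods.
    """
    s1 = s2 = float(history[0])  # assumed initialization
    for x in history[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # first smoothing pass
        s2 = alpha * s1 + (1 - alpha) * s2  # second smoothing pass
    level = 2 * s1 - s2
    trend = alpha / (1 - alpha) * (s1 - s2)
    return sum(level + trend * h for h in range(1, periods + 1))


def hot_data_sets(histories: Dict[int, Sequence[float]],
                  periods: int) -> Set[int]:
    """Top 20% of data sets by forecast access frequency (rounded up)."""
    ranked = sorted(histories,
                    key=lambda s: forecast_total(histories[s], periods),
                    reverse=True)
    n_hot = (len(ranked) + 4) // 5  # integer ceiling of 20%
    return set(ranked[:n_hot])


histories = {0: [1, 1, 1], 1: [9, 9, 9], 2: [3, 3, 3], 3: [2, 2, 2], 4: [4, 4, 4]}
print(hot_data_sets(histories, periods=2))  # {1}
```

Each history entry is the per-period access frequency of a data set, that is, the sum over all of its replicas. Data sets outside the returned set are cold, and steps S504 and S505 then move hot replicas onto the first 20% of the disks in each area.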
Further, the energy consumption model, built with minimum total energy consumption as the objective, is as follows:
$$Y = f(\mathrm{ReplicaTable}, \mathrm{Requests}(s))$$
where
$$P_{node} = P_{using} Y + P_{idle} \lambda$$
$$COP = 0.0068\, t_{sup}^{2} + 0.0008\, t_{sup} + 0.4580$$
$$t_{sup} = \min(t_{critical} - D P_{node})$$
Here the storage-type data center is assumed to have n nodes, and the energy consumption of the i-th node is the i-th component of the vector $P_{node}$. $\lambda_i$ is a binary variable (0 or 1) indicating whether node i is active (0 for no, 1 for yes), and $\lambda$ is the vector of the $\lambda_i$. $Y$ is the vector of the number of active disks in each node, determined by the data access requests Requests(s) and the replica table ReplicaTable. $P_{using}$ is the additional energy a single node consumes for each disk switched to the active state, and $P_{idle}$ is the total energy consumption of a node when all of its disks are sleeping. $COP$ is the coefficient of performance of the cooling equipment and $t_{sup}$ is its supply temperature. The temperature threshold $t_{critical}$ is the temperature below which a node's disks must stay in order to provide data access services. The heat recirculation matrix $D$ holds the coefficients that relate the nodes' energy consumption to the heat they impose on one another.
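The relations above can be evaluated with a few lines of Python. This is a minimal sketch under stated assumptions: the objective is taken to be node energy plus cooling energy, with cooling energy equal to node energy divided by COP (a common formulation that the extracted text does not spell out); a node is treated as active when it has at least one active disk; and the parameter values (8.0, 40.0, 45.0) and the matrix entries are invented for illustration.

```python
from typing import List, Sequence


def cop(t_sup: float) -> float:
    """Coefficient of performance at supply temperature t_sup (from the text)."""
    return 0.0068 * t_sup ** 2 + 0.0008 * t_sup + 0.4580


def node_energy(y: Sequence[int], lam: Sequence[int],
                p_using: float, p_idle: float) -> List[float]:
    """P_node = P_using * Y + P_idle * lambda, element by element."""
    return [p_using * yi + p_idle * li for yi, li in zip(y, lam)]


def supply_temperature(p_node: Sequence[float],
                       d: Sequence[Sequence[float]],
                       t_critical: float) -> float:
    """t_sup = min over nodes of (t_critical - (D P_node)_i)."""
    return min(t_critical - sum(dij * pj for dij, pj in zip(row, p_node))
               for row in d)


def total_energy(y: Sequence[int], d: Sequence[Sequence[float]],
                 p_using: float = 8.0, p_idle: float = 40.0,
                 t_critical: float = 45.0) -> float:
    # Assumption: node i is active when it has at least one active disk.
    lam = [1 if yi > 0 else 0 for yi in y]
    p_node = node_energy(y, lam, p_using, p_idle)
    t_sup = supply_temperature(p_node, d, t_critical)
    # Assumption: cooling energy = node energy / COP(t_sup).
    return sum(p_node) * (1.0 + 1.0 / cop(t_sup))


D = [[0.01, 0.005], [0.005, 0.01]]  # hypothetical recirculation coefficients
print(total_energy([1, 0], D) < total_energy([1, 1], D))  # True
```

Switching on a second node raises the node energy and, through D, lowers the admissible supply temperature, which lowers COP and makes cooling more expensive. This coupling is what the placement method exploits.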
Compared with the prior art, the invention has the following advantages and effects:
(1) The invention models the data center with an airflow organization model, which accounts jointly for storage energy consumption and for cooling temperature and cooling energy consumption, and exploits heat recirculation characteristics to reduce data center energy consumption.
(2) The invention follows a greedy approach with low time complexity, so it can be used in the operation of online storage-type data centers.
(3) The invention predicts access requests, which allows hot and cold data replicas to be classified more accurately, and further reduces data center energy consumption through replica migration and placement.
Description of the Drawings
Fig. 1 is a flowchart of the heat-aware disk sequence calculation method proposed by the invention;
Fig. 2 is a schematic diagram of the active replica partition and the redundant replica partition of the invention;
Fig. 3 is a flowchart of the optimized heat-aware disk sequence calculation method proposed by the invention.
Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
Embodiment
This embodiment discloses a heat-aware, energy-saving data replica placement method for data centers, comprising the following steps:
S1. Build an energy consumption model for the storage-type data center based on the data center's airflow organization characteristics and heat recirculation.
Each chassis on each rack of the storage-type data center is treated as a node, so the data center is divided into a number of nodes. Each node contains several disks, and the disks within a node share one node power supply. The heat recirculation coefficient matrix between nodes is calculated from the data center's airflow organization characteristics and heat recirculation.
Disks are classified by service state as off, sleeping, or active. Each state consumes a different amount of energy, and the disks in their various states together make up the node disk energy consumption. Requests(s) denotes a data access request for data set s received by the storage-type data center, where s is the number of the data set the request asks to access; the disk holding a replica of data set s must then be in the active state.
From the node heat recirculation coefficient matrix and the node disk energy consumption model, which depends on the number of active disks in each node, the data center cooling energy consumption model and total energy consumption model can be obtained.
Assume the storage-type data center has n nodes, and let the energy consumption of the i-th node be the i-th component of the vector $P_{node}$. $\lambda_i$ is a binary variable (0 or 1) indicating whether node i is active (0 for no, 1 for yes), and $\lambda$ is the vector of the $\lambda_i$. $Y$ is the vector of the number of active disks in each node, determined as a function of the data access requests and the replica table. $P_{using}$ is the additional energy a single node consumes for each disk switched to the active state, and $P_{idle}$ is the total energy consumption of a node when all of its disks are sleeping. $COP$ is the coefficient of performance of the cooling equipment and $t_{sup}$ is its supply temperature. The temperature threshold $t_{critical}$ is the temperature below which a node's disks must stay in order to provide data access services. The heat recirculation matrix $D$ holds the coefficients that relate the nodes' energy consumption to the heat they impose on one another. With minimum energy consumption as the objective, the total energy consumption model is then expressed as follows:
$$Y = f(\mathrm{ReplicaTable}, \mathrm{Requests}(s))$$
where
$$P_{node} = P_{using} Y + P_{idle} \lambda$$
$$COP = 0.0068\, t_{sup}^{2} + 0.0008\, t_{sup} + 0.4580$$
$$t_{sup} = \min(t_{critical} - D P_{node})$$
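As a worked illustration of the COP relation, the supply temperatures 20 and 25 below are arbitrary example values chosen for this note and do not come from the source:

```latex
\begin{align*}
COP(20) &= 0.0068 \cdot 20^{2} + 0.0008 \cdot 20 + 0.4580
         = 2.72 + 0.016 + 0.458 = 3.194 \\
COP(25) &= 0.0068 \cdot 25^{2} + 0.0008 \cdot 25 + 0.4580
         = 4.25 + 0.02 + 0.458 = 4.728
\end{align*}
```

If cooling energy is taken to be node energy divided by COP, a common assumption that the extracted text does not state explicitly, each unit of node energy costs about 1/3.194 ≈ 0.313 units of cooling at a supply temperature of 20 and about 1/4.728 ≈ 0.212 units at 25. A placement that allows a higher supply temperature therefore reduces cooling energy even when node energy is unchanged.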
S2. With minimum total energy consumption as the objective, generate the heat-aware disk sequence DS and divide DS into an active replica area and a redundant replica area.
As shown in Fig. 1, the heat-aware disk sequence is generated as follows:
(1) Set the number of nodes n and the number of disks per node m, and start traversing from node number i = 1;
(2) If the number of active disks Y_i in node i is less than the number of disks per node m, increment Y_i by 1; otherwise increment i by 1 and repeat step (2);
(3) Build the data center total energy consumption model and compute the total energy consumption under the current disk allocation. If this total is the minimum, record disk number j of node i in the disk sequence DS, with disk numbers selected in ascending order; otherwise decrement Y_i by 1, increment i by 1, and go to step (2);
(4) Check whether the disk sequence DS is complete. If so, output DS; otherwise go to step (1).
Fig. 2 is a schematic diagram of the division into the active replica area and the redundant replica area.
The disk sequence DS is divided into an active replica area and a redundant replica area according to the ratio of active replicas to redundant replicas. Given the configured ratio of the number of active replicas, active, to the number of redundant replicas, redundant, the first $\lceil d \cdot \mathrm{active} / (\mathrm{active} + \mathrm{redundant}) \rceil$ disks of DS form the active replica area and the remaining disks form the redundant replica area, where d is the total number of disks in DS.
S3. For the disks in the active replica area and in the redundant replica area separately, apply the storage-type data center energy consumption model again, with minimum total energy consumption as the objective, to generate the optimized disk sequence DS_new.
Fig. 3 shows the process for generating the optimized heat-aware disk sequence:
(1) Take the disk sequence DS obtained in step S2, define the set of disk numbers in the active replica area as S_Active and the set of disk numbers in the redundant replica area as S_Redundant, and start traversing S_Active from disk index p = 1;
(2) Check whether the disk number at index p is already active. If so, increment p by 1 and repeat step (2); otherwise build the data center total energy consumption model;
(3) Compute the total energy consumption under the current disk allocation. If this total is the minimum, record the disk number at index p in the active replica area of the optimized disk sequence DS_new, increment p by 1, and go to step (2);
(4) Check whether the number of disks switched to active in this round equals active. If so, start traversing S_Redundant from disk index q = 1; otherwise increment p by 1 and go to step (2);
(5) Check whether the disk number at index q is already active. If so, increment q by 1 and repeat step (5); otherwise build the data center total energy consumption model;
(6) Compute the total energy consumption under the current disk allocation. If this total is the minimum, record the disk number at index q in the redundant replica area of the optimized disk sequence DS_new, increment q by 1, and go to step (5);
(7) Check whether the number of disks switched to active in this round equals redundant. If so, go to step (8); otherwise increment q by 1 and go to step (5);
(8) If the optimized disk sequence DS_new is complete, output DS_new; otherwise go to step (2).
S4. Define and initialize the replica table ReplicaTable, then use it to manage replicas, placing them in the active replica area and the redundant replica area in the order of the optimized disk sequence DS_new. The replica table is defined and initialized as follows:
Initialize the replica table ReplicaTable. In response to data access requests, create active-1 active replicas and redundant redundant replicas of the data. Following the order of the disk numbers in DS_new, place the active replicas in the active replica area and the redundant replicas in the redundant replica area, and write each replica's location to the replica table.
Each replica location is written to the replica table as follows:
$ReplicaTable_{s,k} = j$
where s is the dataset number, k identifies the k-th replica of that dataset, and j is the number of the disk holding that replica. That is, $ReplicaTable_{s,k} = j$ means that the k-th replica of dataset s is stored on disk j.
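As an illustration of the mapping ReplicaTable_{s,k} = j, the replica table can be sketched as a dictionary keyed by (s, k). This is a minimal Python sketch under assumptions: the function `place_replicas` and its parameters are hypothetical, the caller decides how many replicas go to each area, and replicas are assigned to disks simply in list order, which is only one way to follow the DS_new ordering.

```python
from typing import Dict, List, Tuple

# (dataset number s, replica index k) -> disk number j
ReplicaTable = Dict[Tuple[int, int], int]


def place_replicas(
    table: ReplicaTable,
    s: int,
    active_disks: List[int],
    redundant_disks: List[int],
    n_active: int,
    n_redundant: int,
) -> None:
    """Record where each replica of dataset s is stored."""
    if n_active > len(active_disks) or n_redundant > len(redundant_disks):
        raise ValueError("not enough disks for the requested replicas")
    # Active replicas come first, then redundant ones, both in disk-sequence order.
    targets = active_disks[:n_active] + redundant_disks[:n_redundant]
    for k, j in enumerate(targets, start=1):
        table[(s, k)] = j  # ReplicaTable_{s,k} = j
```

A lookup such as `table[(7, 2)]` then returns the disk that holds the second replica of dataset 7.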
S5. Collect replica access statistics over multiple cycles, classify replicas as hot or cold, migrate and place the hot and cold replicas accordingly, and update the replica table ReplicaTable.
(1) Record and aggregate replica access over multiple cycles, including the access frequency and access time of each data replica, among other measures;
(2) Compare the access frequencies of the datasets within each cycle and sort them in descending order. Datasets ranked in the top 20% are defined as hot data, and their replicas as hot replicas; the remaining datasets are cold data, and their replicas are cold replicas;
The access frequency of a dataset in the current cycle is the sum of the access frequencies of all of its replicas in that cycle.
(3) Use double (second-order) exponential smoothing to predict the dataset access frequencies for the next D cycles, and redefine the hot and cold replicas based on the prediction;
(4) Migrate the newly defined hot replicas in the active replica area of the optimized disk sequence DS_new to the first 20% of the disks in that area, and migrate the newly defined cold replicas in that area to its remaining disks;
(5) Migrate the newly defined hot replicas in the redundant replica area of the optimized disk sequence DS_new to the first 20% of the disks in that area, and migrate the newly defined cold replicas in that area to its remaining disks. This completes the migration and placement of hot and cold replicas; update the replica table ReplicaTable at the same time.
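The classification and forecasting steps of S5 can be sketched in Python as follows. This sketch rests on several assumptions that the text does not state: the smoothing is taken to be Brown's linear form of double exponential smoothing, the smoothing factor `alpha` defaults to 0.5, both smoothed series start from the first observation, and the 20% share is rounded up. All function names are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def dataset_frequency(replica_counts: Sequence[int]) -> int:
    """Access frequency of a dataset: the sum over all of its replicas."""
    return sum(replica_counts)


def brown_forecast(history: Sequence[float], horizon: int, alpha: float = 0.5) -> float:
    """Forecast `horizon` cycles ahead with Brown's double exponential smoothing."""
    s1 = s2 = history[0]  # assumed initialisation
    for x in history[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # first smoothing pass
        s2 = alpha * s1 + (1 - alpha) * s2  # second pass over the smoothed series
    level = 2 * s1 - s2
    trend = alpha / (1 - alpha) * (s1 - s2)
    return level + trend * horizon


def top_count(n: int, percent: int = 20) -> int:
    """Size of the top `percent` share of n items, rounded up (an assumption)."""
    return (n * percent + 99) // 100


def split_hot_cold(freq: Dict[int, float]) -> Tuple[Set[int], Set[int]]:
    """Rank datasets by frequency; the top 20% are hot, the rest cold."""
    ranked = sorted(freq, key=lambda s: freq[s], reverse=True)
    n_hot = top_count(len(ranked))
    return set(ranked[:n_hot]), set(ranked[n_hot:])


def split_area(disks: List[int]) -> Tuple[List[int], List[int]]:
    """First 20% of an area's disks receive hot replicas, the rest cold ones."""
    n_hot = top_count(len(disks))
    return disks[:n_hot], disks[n_hot:]
```

In use, `brown_forecast` would be applied to each dataset's per-cycle frequency history, the forecasts passed to `split_hot_cold`, and `split_area` applied separately to the active and redundant replica areas to find the migration targets.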
In summary, this embodiment models the data center with an airflow organization model, which accounts for storage energy consumption together with cooling temperature and cooling energy consumption, and makes full use of heat recirculation characteristics to reduce data center energy consumption. The heat-aware disk sequence is generated with a greedy approach, which has low time complexity and can be used in running online storage data centers. Predicting access requests allows hot and cold data replicas to be classified more accurately, and migrating and placing the replicas accordingly further reduces data center energy consumption.
The embodiment above is a preferred embodiment of the present invention, but embodiments of the invention are not limited to it. Any other change, modification, substitution, combination, or simplification made without departing from the spirit and principle of the invention shall be regarded as an equivalent replacement and falls within the scope of protection of the invention.
Claims (9)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010748759.3A CN111859703B (en) | 2020-07-30 | 2020-07-30 | A heat-aware data center energy-saving data replica placement method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010748759.3A CN111859703B (en) | 2020-07-30 | 2020-07-30 | A heat-aware data center energy-saving data replica placement method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111859703A (en) | 2020-10-30 |
| CN111859703B (en) | 2022-05-10 |
Family
ID=72946392
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010748759.3A Active CN111859703B (en) | 2020-07-30 | 2020-07-30 | A heat-aware data center energy-saving data replica placement method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111859703B (en) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117931513A (en) * | 2023-12-12 | 2024-04-26 | 天翼云科技有限公司 | Mixed cloud data backup management method based on multi-objective optimal copy management strategy |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130232310A1 (en) * | 2012-03-05 | 2013-09-05 | Nec Laboratories America, Inc. | Energy efficiency in a distributed storage system |
| CN103294167B (en) * | 2013-05-21 | 2016-02-10 | 暨南大学 | A kind of low energy consumption cluster-based storage reproducing unit based on data behavior and method |
| CN103530317B (en) * | 2013-09-12 | 2017-07-07 | 杭州电子科技大学 | A kind of copy management method of energy consumption self adaptation in cloud storage system |
| CN105701028B (en) * | 2014-11-28 | 2018-10-09 | 国际商业机器公司 | Disk management method in distributed memory system and equipment |
| CN105607967A (en) * | 2015-11-27 | 2016-05-25 | 北京航空航天大学 | Data center-oriented energy consumption perception-based data backup method |
| CN105681052B (en) * | 2016-01-11 | 2018-11-27 | 天津大学 | A kind of power-economizing method for the storage of data center's distributed document |
| CN110941396A (en) * | 2019-11-22 | 2020-03-31 | 暨南大学 | A copy placement method based on airflow organization for cloud data center |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111859703A (en) | 2020-10-30 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN104699424B (en) | A kind of isomery EMS memory management process based on page temperature | |
| CN110096350B (en) | Cold and hot area division energy-saving storage method based on cluster node load state prediction | |
| CN102981971B (en) | A kind of phase transition storage loss equalizing method of quick response | |
| JP2010027026A (en) | Memory storage device and control method thereof | |
| CN116737064B (en) | Data management method and system for solid state disk | |
| CN111427969A (en) | Data replacement method of hierarchical storage system | |
| CN108334541B (en) | A kind of date storage method, device, equipment and storage medium | |
| CN115470216B (en) | FTL-based intelligent Internet of things table storage management method and storage medium | |
| CN110795363A (en) | Hot page prediction method and page scheduling method for storage medium | |
| CN105574153A (en) | Transcript placement method based on file heat analysis and K-means | |
| JP6642495B2 (en) | Storage management system | |
| CN111859703B (en) | A heat-aware data center energy-saving data replica placement method | |
| CN118859769A (en) | A method for energy-saving intelligent control of water-cooled air conditioners in computer rooms based on deep reinforcement learning | |
| CN108572799B (en) | Data page migration method of heterogeneous memory system of bidirectional hash chain table | |
| CN116991580A (en) | Distributed database system load balancing method and device | |
| CN106898368B (en) | CD server switch controlling device, method, equipment and optical-disk type data center | |
| CN111078143A (en) | Hybrid storage method and system for data layout and scheduling based on segment mapping | |
| CN118502679A (en) | Data access scheduling method and device for memory | |
| CN112328171A (en) | Data distribution prediction method, data equalization method, device and storage medium | |
| CN110149341B (en) | Cloud system user access control method based on sleep mode | |
| CN116301282B (en) | Low-power consumption control method and device for multi-core processor chip | |
| CN116257128B (en) | A low power consumption control method and device for a multi-core heterogeneous chip | |
| CN114020443B (en) | Supercomputer resource scheduling method, electronic device and medium | |
| CN110941396A (en) | A copy placement method based on airflow organization for cloud data center | |
| CN117130549A (en) | Data storage methods, devices, computer equipment and storage media thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |