CN111385200B

CN111385200B - Control method and device for data block repair

Info

Publication number: CN111385200B
Application number: CN202010144314.4A
Authority: CN
Inventors: 郭得科; 夏俊旭; 李江帆; 李尚森
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2020-03-04
Filing date: 2020-03-04
Publication date: 2022-03-04
Anticipated expiration: 2040-03-04
Also published as: CN111385200A

Abstract

A control method and device for data block repair are provided, wherein the method comprises the following steps: acquiring a topological structure of a data center and a repair information providing node and a target repair node for repairing a data block; determining a maximum bottleneck bandwidth in the topological structure according to the repair information providing node and the target repair node; the bottleneck bandwidth is the minimum available bandwidth of a link connecting the repair information providing node and the target repair node; determining a link with the minimum transmission cost in the topological structure according to the maximum bottleneck bandwidth; and repairing the data block according to the repair information providing node, the target repair node, the maximum bottleneck bandwidth and the link with the minimum transmission cost, thereby effectively improving the repair efficiency of the data block.

Description

Control method and device for data block repair

Technical Field

The present application relates to the field of data repair technologies, and in particular, to a control method and apparatus for repairing a data block.

Background

With the increasing demand for data storage, many applications typically store large amounts of data generated in a data center. To ensure storage requirements, the demand on storage systems is also increasing.

Disclosure of Invention

The invention aims to provide a control method and a control device for data block repair, which effectively improve the repair efficiency of a data block by maximizing bottleneck bandwidth and then minimizing transmission cost.

To solve the above problem, a first aspect of the present invention provides a control method for data block repair, including: acquiring a topological structure of a data center and a repair information providing node and a target repair node for repairing a data block; determining a maximum bottleneck bandwidth in the topological structure according to the repair information providing node and the target repair node; the bottleneck bandwidth is the minimum available bandwidth of a link connecting the repair information providing node and the target repair node; determining a link with the minimum transmission cost in the topological structure according to the maximum bottleneck bandwidth; and repairing the data block according to the repair information providing node, the target repair node, the maximum bottleneck bandwidth and the link with the minimum transmission cost.

Further, the determining, according to the repair information, a maximum bottleneck bandwidth in the topology structure by the providing node and the target repair node further includes: acquiring bottleneck bandwidth between any two nodes in the topological structure; identifying a maximum value of a bottleneck bandwidth in a link connecting the repair information providing node and the target repair node.

Further, the determining, in the topology structure, a link with a minimum transmission cost according to the maximum bottleneck bandwidth further includes: deleting a link, of which the bottleneck bandwidth is smaller than the maximum bottleneck bandwidth, in a link connecting the repair information providing node and the target repair node; identifying a minimum Steiner tree connecting the repair information providing node and the target repair node in remaining links.

Further, identifying coincident nodes of any two links in the minimum Steiner tree; and at a first one of the coincident nodes, aggregating information provided by the repair information providing nodes in the two links.

Further, the repairing the data block according to the link with the smallest repair information providing node, the target repair node, the maximum bottleneck bandwidth, and the transmission cost further includes: and aggregating the information provided by all the repair information providing nodes at the target repair node so as to repair the data block.

According to another aspect of the present invention, there is provided a control apparatus for data block repair, comprising: the data block recovery system comprises an acquisition module, a recovery module and a recovery module, wherein the acquisition module is used for acquiring a topological structure of a data center and a recovery information providing node and a target recovery node which are used for recovering a data block; a first determining module, configured to determine, according to the repair information providing node and the target repair node, a maximum bottleneck bandwidth in the topology; the bottleneck bandwidth is the minimum available bandwidth of a link connecting the repair information providing node and the target repair node; a second determining module, configured to determine, according to the maximum bottleneck bandwidth, a link with a minimum transmission cost in the topology; and the repair module is used for repairing the data block according to the repair information providing node, the target repair node, the link with the minimum maximum bottleneck bandwidth and the minimum transmission cost.

The technical scheme of the invention has the following beneficial technical effects: not only has fast repair speed, but also has lower bandwidth consumption. It is worth noting that the core of the present application is to implement intra-network aggregation that can efficiently repair failed data blocks while achieving lower bandwidth consumption than existing approaches.

Drawings

FIG. 1 is a schematic topology diagram of an embodiment of the present invention;

FIG. 2 is a block diagram illustrating a method for repairing failed data blocks using a tree structure according to an embodiment of the present invention;

FIG. 3 is a block diagram illustrating an aggregation using servers on intermediate links according to an embodiment of the present invention;

fig. 4 is a schematic diagram of an aggregation using an intermediate switch according to an embodiment of the present invention;

fig. 5 is a schematic diagram of a pruning method for finding a shortest path according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a pruning method for finding a minimum Steiner tree according to an embodiment of the present invention;

FIG. 7 is a flowchart of a control method for data block repair according to an embodiment of the present invention;

FIG. 8 is a flow chart of a control method for data block repair according to one embodiment of the present invention;

FIG. 9 is a flowchart of a control method for data block repair according to another embodiment of the present invention;

FIG. 10 is a schematic structural diagram of a control apparatus for data block repair according to an embodiment of the present invention;

FIG. 11 is a graph illustrating the comparison of bottleneck bandwidths of different methods according to an embodiment of the present invention;

FIG. 12 is a graph illustrating the results of comparing the transmission costs of different methods according to one embodiment of the present invention;

FIG. 13 is a comparison of bottleneck bandwidths of the present application in different topologies according to an embodiment of the present invention;

FIG. 14 illustrates the transmission cost savings of the present application compared to a bandwidth optimization method in an embodiment of the present invention;

FIG. 15 shows that the transmission cost of the present application varies with the number of providers and the size of the topology according to an embodiment of the present invention;

FIG. 16 is a transmission cost under a tree topology according to an embodiment of the present invention;

FIG. 17 illustrates the increased bottleneck bandwidth and saved transmission cost of the present application compared to the tree structure approach in an embodiment of the present invention;

FIG. 18 illustrates a comparison of bottleneck bandwidth values for a 12 number provider in an embodiment of the invention;

FIG. 19 is a comparison of transmission costs in an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings in conjunction with the following detailed description. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.

In the drawings a schematic view of a layer structure according to an embodiment of the invention is shown. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity. The shapes of various regions, layers, and relative sizes and positional relationships therebetween shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, as actually required.

It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In the description of the present invention, it should be noted that the terms "first", "second", and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.

In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

The invention will be described in more detail below with reference to the accompanying drawings. Like elements in the various figures are denoted by like reference numerals. For purposes of clarity, the various features in the drawings are not necessarily drawn to scale.

In the following description, numerous specific details of the invention, such as structure, materials, dimensions, processing techniques and techniques of the devices are described in order to provide a more thorough understanding of the invention. However, as will be understood by those skilled in the art, the present invention may be practiced without these specific details. Unless otherwise specifically noted below, various portions of the semiconductor device may be composed of materials well known to those skilled in the art.

It should be noted that, in a data center for storing data, a distributed storage system implements large-scale storage of data by deploying a large number of storage nodes. To improve the storage service's instructions, many distributed storage systems, such as google and microsoft storage systems, use erasure codes to improve storage reliability and reduce storage space overhead. These storage systems typically distribute large amounts of data among multiple servers across a data center, where each server holds a portion of the encoded data, avoiding file loss due to machine failure.

Even with preventive measures, data loss is common in data centers. Moreover, data center hardware damage can also cause data loss, such as disk damage and server failure. When a data block fails due to a failure of the original storage node, a number of related data blocks from other storage servers (becoming providers) will be transferred to a new node (called the newcomer) to repair the failed data block. The repair process typically consumes a large amount of bandwidth, placing considerable pressure on the data center network. In order to solve the problem of broadband consumption, a scheme is needed for guiding data blocks to be aggregated in the repair process. However, these schemes focus on the scheduling between providers and newcomers, which partially aggregate data blocks according to a certain scheme, and eventually complete the repair of failed blocks. Although these methods achieve some improvement, scheduling only between providers still incurs a lot of network overhead.

It is well known that in server-centric data centers, servers are often used as switching nodes. Of course, these servers may also be used to aggregate data blocks. In addition, switches are also often deployed to process and cache some intermediate data. Therefore, unlike previous work, the present application will enable a significant reduction in traffic volume using intra-network aggregation during transmission. When data blocks are transmitted from a provider to a new recipient, they pass through many intermediate nodes in the network, which mainly include some servers and switches. The application may aggregate these data blocks at the intermediate nodes rather than direct them to a particular storage node. The respective data streams may then be naturally aggregated along the transmission path.

To repair failed data blocks efficiently, two major challenges need to be addressed first. The first is the heterogeneity of available bandwidth across different links. Efficient data block repair is difficult to achieve by means of hardware alone, and repair of a failed data block is a cooperative process among a plurality of related nodes. Congestion in one link can slow down the entire repair process. A second challenge is how to efficiently select the transmission link. This is because different links may incur different transmission costs. Simply looking for a link with a high available bandwidth usually entails considerable transmission costs, since this may take up more links for transmission over links with as high an available bandwidth as possible. Therefore, an efficient scheme is needed to control the transmission cost during the repair process.

The following describes a cloud-computing data center.

Cloud computing typically requires the storage and processing of large amounts of data. Data fault tolerance has important significance for cloud computing. In a distributed system, data is typically broken up into multiple data blocks, stored on different nodes, which consist of a large number of storage devices distributed in a data center. However, stored data is often lost or invalidated due to disk failures, downtime, and the like. To this end, distributed systems employ redundancy mechanisms to ensure the reliability of data storage.

Duplication is a classical solution. The original data is replicated in multiple copies and stored on different nodes. Although the method of replication is simple and practical, it consumes a large amount of storage space, and for this reason, many distributed storage systems employ new fault-tolerant methods, such as erasure codes, network coding, etc., to reduce the consumption of storage space.

The storage method of the erasure code divides original data into a plurality of data blocks and then generates a plurality of redundant blocks. Specifically, in a (k, m) -erasure code based storage system, a source node splits a source file into k blocks of the same size, denoted as D₁,D₂,D₃，…，D_k. The encoding process can be seen as a matrix-vector multiplication, where the vectors of k blocks are multiplied by a particular matrix a, as shown in equation (1). The matrix a is referred to as a generator matrix, wherein,

representing the galois field over q, i 1,2, …, m, j 1,2, …, k, resulting in m redundant blocks, R₁,R₂,…,R_m。

When the downloading node wants to restore the original file, it only needs to obtain any k blocks P from the k + m data blocks₁,P₂,…,P_kThe original data, D, can be recovered₁,D₂,…,D_k. As shown in formula (2), A' is represented by P₁,P₂,…,P_kThe decoding process is a linear transformation.

An erasure code based storage system will recover the original file when the application needs to read the file. However, a more common operation in daily storage is to repair failed data blocks. In an erasure coding system, the system can recover the source file as long as there are k available data blocks. However, if the failed block is not repaired in time, the data will eventually be unusable due to too much damage. Node failures in data centers are common, and therefore, the repair of failed data blocks is a routine task in distributed storage systems.

When a data block fails due to a failure of the original storage node, k data block blocks from k providers need to be transferred to a new storage node, called the newcomer, which receives the blocks and regenerates the originally failed data block to maintain redundancy. This repair process is a linear combination of k data blocks, as shown in equation (3).

P₁,P₂,…,P_kAre k blocks of data from k providers, the corresponding matrix of decoding coefficients is denoted as beta₁,β₂,…,

Representing a galois field over q. P^*The data blocks to be repaired are represented, the repairing process can be regarded as a linear combination of k data blocks, and the size of each added data block is kept unchanged in the whole repairing process.

The conventional data block repair method is called a star structured scheme. The newcomer receives all the involved data blocks P₁,P₂,…,P_kAnd then aggregate them locally. This often results in link congestion at the newcomer. It has been pointed out in the related art that inefficient repair of erasure codes has become a bottleneck that restricts its widespread use. Therefore, there is a need to improve the original data block repair method to improve the efficiency of data block repair.

Further, in order to repair a failed data block, the network transmission time takes up to 94% of the repair time. Therefore, the existing data block optimization repair method mainly focuses on network transmission. Another related art proposes a method for constructing a spanning tree to repair a failed data block, which becomes a tree structure scheme. The core idea of this tree structure scheme is to find a spanning tree based on the available bandwidth between the provider and the new receiver. Unlike the conventional star-structured scheme, this tree-structured scheme requires the storage node to transmit the processed data block β_iP_iInstead of the original data block P_i. The intermediate nodes in the spanning tree receive the data blocks from their child nodes and then aggregate them with the local data blocks, generating new intermediate data blocks. There is also a tree-structured scheme to reduce the path length of the transmission. The difference is that the goal is to reduce the transmission cost, but this method cannot effectively guarantee the transmission speed.

The goal of the tree structure method is to find a connection liftSpanning trees for the donor and the newcomer. For (k, m) -based erasure code storage systems, the present application represents a node as V ═ V₀,v₁,v₂,…,v_kIn which v is₀e.V is a new party, other V₁,v₂,…,v_kIs a provider. The newcomers receive the data blocks from the providers. Side set v₁,v₂,…,v_kRepresenting a slave node v_iTo node v_jThe path of (2). Each edge has a weight w (v)_i,v_j) W (v) if the maximum transmission speed is desired_i,v_j) Representing a node v_iAnd node v_jOf the available bandwidth in between, w (v) if the minimum transmission cost is desired_i,v_j) Representing a node v_iAnd node v_jThe path length in between. The repair model for a failed data block may be represented as a weighted undirected complete graph G ═ V, E, W. The goal of the tree structure approach is to find a maximum/minimum spanning tree in graph G.

Note that the nodes in the weighted undirected graph G ═ V, E, W are providers and newcomers, where the set of nodes is V, the set of edges is E, and the set of edge weights is W. The tree structure approach only focuses on building spanning trees between these providers and new receivers, ignoring nodes on the transmission path. This may result in some physical links in the actual topology of the data center being repeatedly occupied, resulting in a true transmission speed that is lower than the theoretical value. In contrast, the method is based on a complete data center topological structure, and nodes except the storage node on the transmission path participate in the repair process, so that the efficiency of repairing the failed data block is further improved.

Specifically, first, data block repair refers to building a spanning tree connection provider and newcomers. In this spanning tree, providers (leaf nodes or intermediate nodes) transmit data blocks β_iP_iIntermediate nodes in the spanning tree to the new-from (root node.) aggregate data blocks from children and send the aggregated data blocks to their parents. The newcomer then receives all the data blocks and generates a final data block

k represents a total of k providers.

In order to optimize the transmission speed of data block repair, the current tree structure method constructs a maximum spanning tree according to a weighted undirected complete graph. This spanning tree connects the provider and the newcomer. Providers on leaf nodes transfer blocks of data to their parent nodes. Furthermore, a provider acting as an intermediate node in the spanning tree may aggregate these data blocks from its children nodes and then transfer the aggregated data blocks to the parent node. The new party performs as a root node the reception of data blocks from these child nodes and the repair of failed data blocks. In erasure code based storage systems, data blocks are encoded in units of bits, and thus, intermediate nodes can aggregate data immediately upon receiving the data, much like an assembly pipeline. That is, regardless of the computation time, the minimum bandwidth of the spanning tree determines the repair speed.

Fig. 1 is a schematic diagram of a topology (a topology composed of 6 servers and 2 switches) according to an embodiment of the present invention. As shown in FIG. 1, server s₂Is a new party to repair failed data blocks, and server s₃、s₄And s₆Is a provider. In this topology, the servers also act as switches, which is very common in server-centric topologies, where the weights of the edges represent the available bandwidth between the nodes.

Fig. 2 is a schematic structural diagram of repairing a failed data block by using a tree structure method according to an embodiment of the present invention. Wherein the provider is v₁＝s₃，v₂＝s₄And v₃＝s₆And the new one is v₀＝s₂. The tree structure approach aims at constructing a maximum spanning tree, where the weight values of the edges represent the maximum available bandwidth between the nodes. The repair method of the above example is node v₁Transmitting coded data blocks, e.g. beta₁P₁To node v₂. The node encodes its own block of coded data, e.g. beta₂P₂And encoding it into a data block beta₂P₂And beta₁P₁And (4) polymerizing. Aggregated data block beta₁P₁+β₂P₂Is transmitted to a new party v₀. Notably, the aggregated data block β₁P₁+β₂P₂And coded data block beta₁P₁The sizes are the same. Node v₃Transmitting its coded data block beta₃P₃Coming directly to the new person v₀Finally, the new person v₀Repairing failed data blocks P^*，P^*＝β₁P₁+β₂P₂+β₃P₃. As can be seen from fig. 2, the minimum available bandwidth of the spanning tree is 43.

From this, it can be seen that the undirected graph G is represented by the node V (V, E, W)₀＝s₂,v₁＝s₃,v₂＝s₄And v₃＝s₆Structure wherein V ═ { V ═ V₀,v₁,v₂,v₃}. W middle edge weight W (v)_i,v_j) Representing the maximum available bandwidth between these nodes. Further, the complete transmission path of the topology in fig. 1 is:

path(v₁,v₂)＝{s₃,w₂,s₅,s₄}，path(v₂,v₀)＝{s₄,s₁,s₂}，

path(v₃,v₀)＝{s₆,s₅,s₄,s₁,s₂}

it can be seen that the link(s)₅,s₄),(s₄,s₁) And(s)₁,s₂) Are repeatedly occupied, and therefore, the tree structure scheme cannot effectively guarantee the actual bandwidth.

A more reasonable approach is to find the spanning tree from the global topology and aggregate the data blocks along the path at some intersecting intermediate nodes. E.g. servers s₅Aggregating data from a server s₃And s₆Then server s₅Sending the aggregated data block to s₄As shown in fig. 3. Such asThe minimum bandwidth of the scheme is also 43, and no link is reused, which is an advantage of the aggregation method within the network. In particular, as shown in fig. 4, when aggregation is performed using a switch, only 5 links may be used instead of 6 links, thereby further saving transmission costs.

For undirected complete graph G ═ (V, E, W) obtained from a data center, the set of providers and newcomers is denoted as M ═ { V ═ V₀,v₁,v₂,…,v_kIn which v is₀Indicates a new party, v₁,v₂,…,v_kIs taken as a representative provider. Node v may also aggregate data blocks since the switches of the data center_iWhich may be a switch or server of node assembly V. Wherein | V | is used in the present application to represent the number of nodes in the node set V. Edge E in E_ijRepresenting a connecting node v_iAnd v_j(v_i,v_jE.v). Edge weight W (e) in W_ij) Represents an edge e_ijThe available bandwidth of (a). In order to efficiently repair failed data blocks, the core of the aggregation mechanism of the application is to find a spanning tree T_MTo connect all nodes in M, (V ', E'), wherein,

further, the bottleneck bandwidth of the spanning tree refers to the minimum available bandwidth of all links in the spanning tree. Each link in the data center transports data from a different service. Occupying a link reduces its available bandwidth and affects the transmission of other services. In addition, the bottleneck bandwidth determines the repair speed of the failed data block. Therefore, a first objective of the present application is to find the maximum bottleneck bandwidth B, thereby guaranteeing the highest transmission speed. The spanning tree with the largest bottleneck bandwidth is called the maximum bottleneck spanning tree. Formalizing the target as equation (4), where w (e)_ij) Represents an edge e_ijE, available bandwidth of E. The first constraint in equation (4) is to ensure that an edge E is selected from the set of edges E_ijE.g. E. The second constraint is to guarantee T_MIn the set M consisting of all providers and new receivers, (V ', E') may be connectedAnd (4) nodes.

Further, the transmission cost refers to the number of links in the spanning tree.

Specifically, in an erasure-coded storage system, original data is divided into k data blocks, each of which is the same size. In repairing failed data blocks, the bandwidth consumed is proportional to the number of occupied links, since each occupied link transmits a data block of the same size. Thus, the transmission cost refers to the number of links connecting k providers and new receivers in the corresponding spanning tree. In addition, the more links a transmission occupies, the more bandwidth it consumes. Therefore, a second objective of the present application is to construct a spanning tree with the lowest transmission cost. Wherein the second objective is as shown in equation (5), the objective in equation (5) guarantees the lowest transmission cost, wherein the second constraint guarantees that the available bandwidth of the used link is higher than the bottleneck bandwidth B.

Based on the first and second objectives, the present application can find a spanning tree T with the largest bottleneck bandwidth_M(V ', E') while the transmission cost is as small as possible.

It can be seen that the pure pursuit of bandwidth maximization often brings unnecessary extra transmission cost, as shown in fig. 1, if only the first objective is considered, it is easy to find the slave node v₀The maximum bottleneck path to each storage node, the spanning tree used is then composed as the path in fig. 3. However, the bottleneck bandwidth of the spanning tree determines the transmission speed. It is not necessary to always search for the link with the highest bandwidth. In other words, as long as the link bandwidth in the spanning tree is greater than or equal to the bottleneck bandwidth of the spanning tree, the transmission speed is not affected. For the example in fig. 3, the bottleneck bandwidth is 43, so links with a bandwidth greater than 43 can be used to transmit dataBlocks without reducing the speed of repair, rather than directing data blocks to the server s₅Polymerization is performed so that the transmission cost can be further reduced. The bottleneck bandwidth in fig. 4 is also 43, and the transmission cost is significantly lower than that in fig. 3, so that the transmission cost can be further reduced under the condition that the spanning tree bottleneck bandwidth is kept unchanged.

The following provides a control method and apparatus for data block repair, which is proposed by the present application with reference to the accompanying drawings.

Based on the above analysis, the first step of the method proposed in this application is to establish a maximum bottleneck spanning tree, find its maximum bottleneck bandwidth B, as shown in equation (4), this maximum bottleneck spanning tree should include all providers and newcomers. Providers in this spanning tree act as leaf nodes or intermediate nodes, while newcomers act as root nodes. Other nodes of the data center may participate in the repair process of the data block. They are responsible for forwarding data blocks or aggregating some intermediate blocks. This is different from previous tree structure patterns, where only the provider and the newcomer can aggregate these intermediate data blocks.

In addition, the following is v_iTo v_jThe maximum bottleneck path of (c) is the connection v in the weighted undirected graph_iAnd v_jAnd bottleneck the path with the largest bandwidth.

To find the maximum bottleneck bandwidth B, a maximum bottleneck spanning tree is first constructed, and in a weighted undirected graph G ═ V, E, W, a maximum bottleneck path from all providers to the new receiver is first constructed. Intermediate nodes on these paths may forward the data blocks and aggregate some intermediate blocks. All paths then form a spanning tree T_s＝(V_s,E_s,W_s). This spanning tree T_sIncluding all nodes in M and some others in V. These nodes together form V_s. Each path from the newcomer to the provider is made up of a plurality of links e_ijComposition, all links contained in these paths constitute an edge set E_s. Their weights form a weight set W.

Furthermore, the bottleneck bandwidth of the spanning tree refers to the minimum available bandwidth between all links in the spanning treeBandwidth. To get the maximum bottleneck spanning tree T_s＝(V_s,E_s,W_s) The minimum weight is searched for the bottleneck bandwidth B to obtain the bottleneck bandwidth B as max (minw (e)_ij)),e_ij∈E_s。

The present application assumes that there is a spanning tree T_s', with a bottleneck value greater than T_sSince the bottleneck bandwidth of the spanning tree is the minimum bottleneck value of all paths from the leaf node to the root node, the bottleneck value of the spanning tree is B, and T is_s' in this example, the minimum bottleneck value of all paths from leaf node to root node is greater than T_sIs contradictory, because at T, there is a bottleneck value of B_sIn this example, each path from a leaf node to a root node is a maximum bottleneck path. Thus, the maximum bottleneck spanning tree T_s＝(V_s,E_s,W_s) Any other spanning tree containing node set M in (V, E, W) has the same or higher bottleneck bandwidth than G.

Therefore, the method constructs a maximum bottleneck spanning tree, and finds the maximum bottleneck path between each provider and each new receiver respectively. Using MaxBottlePath (v)_i,v_j) To be calculated to pair node v_iAnd v_jThe largest bottleneck path in between. In having a source node v_iAnd a target node v_jIn the weighted undirected graph G, each edge e_ij,e_ijE, its weight w (E)_ij),w(e_ij) e.W are arranged in descending order. From the source node v_iInitially, according to the weight of the undirected graph G, selecting an edge joining path from the graph G to form a path connection v_iAnd v_jWithout creating a loop. Thus, this path connects v_iAnd v_jBecause the most weighted edge is selected to connect the two nodes each time.

After obtaining the maximum bottleneck paths from each provider to the new receiver, a spanning tree, i.e. a maximum bottleneck spanning tree, can be formed according to the maximum bottleneck paths, and all the providers and the new receivers are connected. Compared with a tree structure scheme in the related technology, the maximum bottleneck spanning tree obtained by the method guarantees the maximum bottleneck bandwidth based on the global topology. In contrast, the tree structure scheme may cause congestion of some links in the actual topology, thereby reducing the transmission speed.

Taking the topology of FIG. 1 as an example, where v₀＝s₂Is a new one, and v₁＝s₃,v₂＝s₄And is v₃＝s₆A provider. As previously described, to build the maximum bottleneck spanning tree, a maximum bottleneck path is searched between the new caller and each provider. The following paths are obtained:

from v₁To v₀:s₃→w₂→s₅→s₄→s₁→s₂；

From v₂To v₀:s₄→s₁→s₂；

From v₃To v₀:s₆→s₅→s₄→s_j→s₂；

In this maximum bottleneck spanning tree, node s₅From v₁＝s₃And v₃＝s₆And aggregating the data blocks. v. of₂＝s₄Polymerized from s₅And s₄The data block of (1). The result is shown in fig. 3, where the bottleneck bandwidth B is max (minw (e)_ij))＝43,w(e_ij)∈W_s. In this way, it is guaranteed that the transmission bandwidth of the spanning tree is maximal.

The second stage of the present application is to reduce the transmission cost while ensuring the maximum bottleneck bandwidth. Since the transmission speed of the spanning tree is related to its bottleneck bandwidth. Finding the maximum bottleneck spanning tree may guarantee maximum transmission speed, but its transmission cost may be relatively high. Therefore, blindly finding the link with the largest available bandwidth often results in extra consumption, resulting in a transmission path that is too long and takes too many links. In fact, only links with a larger available bandwidth than the bottleneck bandwidth B of the spanning tree need to be found, which does not affect the transmission speed, but also a large amount of available bandwidth can be saved for other transmission tasks.

In constructing the maximum bottleneck spanning tree T_s＝(V_s,E_s,W_s) In time, the tree T is generated without considering the transmission cost_sThe objective of minimizing the transmission cost in equation (5) may not be satisfied, and therefore, it is necessary to further reduce the transmission cost without affecting the transmission speed.

Since the number of links in the spanning tree affects the transmission cost, it is necessary to find the maximum bottleneck spanning tree with the least number of links. In order to achieve this goal, the present application further needs to modify the original topology map G ═ (V, E, W) according to the maximum bottleneck bandwidth B, and specifically may adopt the function updatetopology (B). If the weight of one link in the weighted undirected graph G is lower than the threshold B, it is removed from the graph G. Thus, the weights of the remaining links are all greater than or equal to the bottleneck bandwidth B. Further, the modified graph is represented as updatetopology (b). Obviously, minw (e)_ij) Not less than B, wherein

Note that the threshold B is difficult to set manually in advance, and if the preset threshold is too large, the pruned graph G' may not connect all providers and newcomers, and conversely, if the preset threshold is too small, the transmission speed is limited.

Thus, the pruned graph G' can guarantee the connection between the provider and the newcomer. Wherein, the spanning tree T_sIs the largest bottleneck spanning tree in graph G, T_sAll providers and newcomers may be connected. At the same time, E_sThe edge weights in (1) are not less than B. That is, T_sAll edges in (a) are derived from graph G, i.e.

Thus, at least a spanning tree T is generated_sIt connects all the providers and newcomers in the graph G'. This is illustrated in FIG. T_sIn this case, a spanning tree must be found whose transmission cost is less than or equal to the maximum bottleneck spanning tree T_sConnecting the provider and the newcomer. Therefore, in order to reduce transmission costs, it is desirable to connect these providers and newcomers with as few links as possible.

In particular, reduce the outgoing costA simple method is to find the shortest path between each provider and the new receiver in G' in the graph. Defining a set of binding groups C, wherein C (e)_ij)＝1,c(e_ij) E.g. C. For each edge e of the graph G_ij∈E_cWeighted value of edge w (e)_ij) Replacement by c (e)_ij) E.g. C, to obtain a new graph G_c′＝(V,E_cAnd C). That is, graph G_cAll edges in' are weighted by 1. Then, in the graph G_c' finding the shortest paths from each provider to the new provider, these paths forming a spanning tree

Wherein

Nodes at path intersections aggregate data blocks. Spanning tree T_MIs equal to the maximum bottleneck bandwidth B, but the transmission cost is further reduced.

It can be seen through analysis that the above-mentioned deletion method can reduce the transmission cost, but the effect is not perfect.

The maximum bottleneck spanning tree with the lowest transmission cost is connected with all providers and newcomers, and the minimum links are consumed on the premise of ensuring that the bottleneck bandwidth is unchanged. In the figure T_MFinding this spanning tree allows for the addition of other intermediate nodes

Joining spanning tree T_MTo reduce transmission costs, which is the minimum steiner tree problem.

As shown in FIGS. 5 and 6, node v₁And v₂Is a provider, v₀Is a new one. V is respectively obtained by using a deleting method for obtaining the shortest path₁And v₀、v₂And v₀The shortest path between them. As can be seen from fig. 5, the transmission cost is 3+ 4-7, but if the data block is aggregated to the node v₃In this way, the transmission cost can be further reduced, as shown in fig. 6, which is 6. Therefore, it is necessary from a global perspectiveA spanning tree is constructed to minimize the total transmission cost. Therein, it should be understood that in practical applications, when the transmission link as shown in fig. 6 is employed, also at node v₃Go up to pair node v₁And v₂Aggregation of information to reduce newcomers v₀The pressure of the polymerization.

Thus, the method of the present application can be understood as: input data center topology G and set M ═ v₀,v₁,v₂,…,v_kAnd M comprises a provider and a newcomer. The output is a spanning tree T_MThe bottleneck bandwidth is the largest and the transmission cost is the lowest. Finding node v by MaxBottlePath function_iAnd v_jThe largest bottleneck path in between, and then the edge whose weight is smaller than the bottleneck bandwidth B is deleted from the graph G by UpdateTopology. The algorithm first calculates the maximum bottleneck path between the newcomer and each provider and then records their bottleneck values. The minimum bottleneck value is the maximum bottleneck bandwidth B of the maximum bottleneck spanning tree. And then, deleting the original topology according to the maximum bottleneck bandwidth B, and deleting the link with the bandwidth lower than that of the bottleneck B. Using the remaining topology, a minimal steiner tree is found connecting all providers and the new node.

Fig. 7 is a flowchart of a control method for data block repair according to an embodiment of the present invention. As shown in fig. 7, the control method for data block repair according to the embodiment of the present invention includes the following steps:

s101: and acquiring a topological structure of the data center and a repair information providing node and a target repair node for repairing the data block.

It should be noted that the repair information providing node is the provider, and the target repair node is the newcomer.

S102: and determining the maximum bottleneck bandwidth in the topological structure according to the repair information providing node and the target repair node.

Where the bottleneck bandwidth is the minimum available bandwidth in the link connecting the provider and the new receiver.

According to an embodiment of the present invention, as shown in fig. 8, step S102 further includes:

s201: and acquiring the bottleneck bandwidth between any two nodes in the topological structure.

S202: identifying a maximum value of a bottleneck bandwidth in a link connecting the repair information providing node and the target repair node.

S103: and determining the link with the minimum transmission cost in the topological structure according to the maximum bottleneck bandwidth.

According to an embodiment of the present invention, as shown in fig. 9, step S103 further includes:

s301: and deleting the link with the bottleneck bandwidth smaller than the maximum bottleneck bandwidth in the links connecting the repair information providing node and the target repair node.

S302: a minimum steiner tree connecting the repair information providing node and the target repair node is identified in the remaining links.

It should be further noted that, in the embodiment of the present invention, after the minimum steiner tree is obtained, the method further includes identifying a coincident node of at least two links in the minimum steiner tree, and aggregating information provided by a repair information providing node in a link at a first coincident node in the coincident node, for example, in a topology structure as shown in fig. 6, at a node v, aggregating information provided by repair information providing nodes in links₃Middle first pair of nodes v₁And node v₂The provided information is aggregated in the network and then is transmitted by the node v₃And transmitting the data block to the target node for data block repair.

S104: and repairing the data block according to the repair information providing node, the target repair node, the maximum bottleneck bandwidth and the link with the minimum transmission cost.

Specifically, the information provided by all the repair information providing nodes is aggregated at the target repair node to repair the data block.

It should be noted that, in the foregoing content of the present application, the method of the present application has been analyzed in detail, and is not repeated again.

Specifically, the above method can be implemented by using the following codes:

fig. 10 is a control apparatus for data block repair according to an embodiment of the present invention. As shown in fig. 10, the control apparatus 100 for data block repair includes an obtaining module 10, a first determining module 20, a second determining module 30, and a repairing module 30.

The obtaining module 10 is configured to obtain a topology structure of a data center and a repair information providing node and a target repair node for repairing a data block; the first determining module 20 is configured to determine a maximum bottleneck bandwidth in the topology according to the repair information providing node and the target repair node; the bottleneck bandwidth is the minimum available bandwidth of a link connecting the repair information providing node and the target repair node; the second determining module 30 is configured to determine, according to the maximum bottleneck bandwidth, a link with the minimum transmission cost in the topology; the repair module 30 is configured to repair the data block according to the repair information providing node, the target repair node, the link with the smallest maximum bottleneck bandwidth and the smallest transmission cost.

Further, in order to verify the advantages of the present application, the method proposed by the present application is further verified.

Where 100 trials were conducted comparing bottleneck bandwidth and transmission cost at 12 providers. As can be seen from fig. 11, the method proposed in the present application achieves higher bottleneck bandwidth than the tree structure method and the cost optimization method. Compared with a tree structure method, the method has the advantage that the bottleneck bandwidth is increased by 4.06 times on average. This is because the tree structure method does not consider the data center topology, and due to repeated occupation of links, the tree structure method is difficult to reach the theoretical bottleneck bandwidth value, so that the actual bottleneck bandwidth is only a little higher than the cost optimization method. The cost-optimized approach ignores the available bandwidth of the link, and the bottleneck bandwidth is always kept at a low value. The bandwidth optimization method can obtain the same bottleneck bandwidth as the present application, and thus, it is not shown in the figure.

Fig. 12 compares the transmission costs of the different methods. The tree structure approach incurs higher transmission costs due to the reuse of links. In contrast, the bandwidth optimization method utilizes the intermediate node to aggregate the data blocks, thereby avoiding the possibility of repeatedly occupying the link and reducing the transmission cost. However, this increases the link consumption since it always selects links with as high bandwidth available as possible. The goal of the cost-optimized approach is to build a spanning tree with the lowest transmission cost, and therefore its transmission cost is the lowest of the three approaches. The transmission cost generated by the method adopted by the application is slightly higher than that of the cost-optimized method, because the transmission cost is effectively reduced in the second stage of the application. Compared with the tree structure method, the method reduces the transmission cost by 80.86%.

To further evaluate the influence of the topology size, it can be seen in fig. 13 that the larger the topology size is, the higher the bottleneck bandwidth obtained by the present application is. The reason is that as the size of the topology grows, more links with high available bandwidth in the data center can be selected, which enables a better construction of spanning trees with larger bottleneck bandwidths.

As shown in fig. 14, the transmission cost of the method and the bandwidth optimization method proposed by the present application is low. In different topologies, although the bandwidth optimization method can obtain the same bottleneck bandwidth as the method provided by the application, the transmission cost is far higher than the method provided by the application. The larger the topology scale is, the more the transmission cost can be reduced by the method provided by the application. The reason is that the number of links with high available bandwidth increases, which provides more alternative links to reduce transmission costs.

FIG. 15 evaluates the impact of a change in the number of providers on the cost of transmission. As the number of providers increases, the transmission cost of the method proposed by the present application increases slightly. In different fat-tree topologies, the difference in transmission cost is not significant because the fat-tree topology is composed of three-tier switches. Because the sizes of different topologies of fat-trees are determined by the parameter k, and the change of k does not affect the distance between nodes in the topology.

The transmission costs for the different methods are shown in fig. 16. The transmission cost of the tree structure method and the bandwidth optimization method is much higher than that of the method and the cost optimization method provided by the application under different provider numbers. When the number of providers is small, the bandwidth-optimized approach may consume more transmission cost than the tree-structured approach, since its goal is to obtain the highest bottleneck bandwidth, ignoring the number of consumed links. As the number of providers increases, the gains from aggregation within the network make the transmission cost of the bandwidth-optimized approach lower than the tree-structured approach. The transmission cost used by the method provided by the application is only slightly higher than that of the cost-optimal method, but the bottleneck bandwidth increase is very obvious, and the effectiveness of the method provided by the application is proved.

In fig. 17, the transmission cost and the bottleneck bandwidth of the method proposed by the present application and the tree structure method are compared. Compared with the tree structure method, the bottleneck bandwidth saving rate of the method provided by the application is higher and higher along with the increase of the number of providers. Since the more providers, the greater the likelihood of link collisions in the tree structure approach, the higher the transmission cost. The gain in bottleneck bandwidth is more pronounced as the number of providers increases. This is because in the tree structure approach, the increased link collisions greatly reduce its actual available bandwidth. Especially, when there are 12 providers, compared with the tree structure method, the method provided by the application can improve the bottleneck bandwidth by 2.86 times and reduce the transmission cost by 81.68%.

Further experiments were conducted to evaluate the bottleneck bandwidth when the number of providers was 12, as shown in fig. 18. Similar to the result of fat-tree topology, the method provided by the application obtains the highest bottleneck bandwidth compared with a tree structure method and a cost optimization method. Fig. 19 compares the transmission costs of these four methods in BCube topology. It can be seen that the transmission cost of the method proposed by the present application is slightly higher than that of the cost-optimized method, but is lower than that of the bandwidth-optimized method and the tree structure method. Somewhat different from the results of the fat-tree topology, the method proposed by the present application consumes higher transmission costs. This is because changing paths in BCube is likely to increase the path length, whereas in fat-tree, there are multiple paths with the same length between nodes. Nevertheless, the transmission cost of the proposed method is still significantly lower than the bandwidth optimized method and the tree structure method.

It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

In the above description, the technical details of patterning, etching, and the like of each layer are not described in detail. It will be understood by those skilled in the art that layers, regions, etc. of the desired shape may be formed by various means known in the art. In addition, in order to form the same structure, those skilled in the art can also design a method which is not exactly the same as the method described above.

The invention has been described above with reference to embodiments thereof. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. The scope of the invention is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the invention, and these alternatives and modifications are intended to be within the scope of the invention.

Although the embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations can be made hereto without departing from the spirit and scope of the invention.

It should be understood that the above examples are only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. And obvious variations or modifications therefrom are within the scope of the invention.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims

1. a control method for data block repair, is characterized in that, comprises:

Obtain the topology of the data center and the repair information providing nodes and target repair nodes for repairing data blocks;

determining a maximum bottleneck bandwidth in the topology according to the repair information providing node and the target repair node;

Obtain the bottleneck bandwidth between any two nodes in the topology structure;

Identifying the maximum value of the bottleneck bandwidth in the link connecting the repair information providing node and the target repair node;

The bottleneck bandwidth is the minimum available bandwidth of the link connecting the repair information providing node and the target repair node;

According to the maximum bottleneck bandwidth, determine the link with the smallest transmission cost in the topology structure;

Delete the link whose bottleneck bandwidth is less than the maximum bottleneck bandwidth in the link connecting the repair information providing node and the target repair node;

identifying a minimum Steiner tree connecting the repair information providing node and the target repair node in the remaining links;

Repairing the data block according to the repair information providing node, the target repair node, the maximum bottleneck bandwidth and the link with the smallest transmission cost;

The information provided by all the repair information providing nodes is aggregated at the target repair node to repair the data block.

2. The control method for data block repairing according to claim 1, characterized in that,

The MaxBottlePath function is used to identify the maximum value of the bottleneck bandwidth in the link connecting the repair information providing node and the target repair node.

3. The control method for data block repair according to claim 1, wherein,

The UpdateTopology (B) function is used to delete a link whose bottleneck bandwidth is less than the maximum bottleneck bandwidth in the link connecting the repair information providing node and the target repair node.

4. The control method for data block repair according to claim 3, further comprising:

identifying coincident nodes of at least two links in the minimal Steiner tree;

At the first coincident node among the coincident nodes, the information provided by the repair information providing node in the link is aggregated.