
CN116302406A - Flow control and data replication method, node, system and storage medium

Flow control and data replication method, node, system and storage medium

Info

Publication number
CN116302406A
CN116302406A (application number CN202310164756.9A)
Authority
CN
China
Prior art keywords
flow
flow control
data replication
control period
task
Prior art date
Legal status
Pending
Application number
CN202310164756.9A
Other languages
Chinese (zh)
Inventor
卢玥
孔伟康
庄灿伟
王竹凡
杨绣
董元元
Current Assignee
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202310164756.9A
Publication of CN116302406A
Priority to PCT/CN2024/074833 (published as WO2024164894A1)
Legal status: Pending (current)

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/543User-generated data transfer, e.g. clipboards, dynamic data exchange [DDE], object linking and embedding [OLE]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5021Priority

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present application provide a flow control and data replication method, a node, a system, and a storage medium. In the embodiments, a flow control node periodically predicts the flow demand of data read-write tasks and the flow demand of data replication tasks in the next flow control period. By dynamically sensing both demands, it can reasonably and dynamically allocate to the working nodes the flow quotas required for executing data replication tasks, and issue the quotas to each working node to achieve flow control over the replication tasks. During flow control, the flow quota allocated to a working node is adaptively adjusted as the flow demand of the data read-write tasks changes, which reduces the influence of the replication process on read-write performance, allows replication to complete as soon as possible, ensures system reliability, and improves the system's quality of service and resource utilization. In addition, because quota adjustment requires no manual intervention, the labor cost of operation and maintenance is reduced.

Description

Flow control and data replication method, node, system and storage medium
Technical Field
The present disclosure relates to the field of database technologies, and in particular to a flow control and data replication method, node, system, and storage medium.
Background
In a distributed storage system, when data on a failed storage node cannot be read, data replication technology recovers the unreadable data from other storage nodes that store it and copies the data to another normal storage node. This restores redundant backup of the data, effectively prevents data loss, and thereby improves system reliability. Moreover, the more Input/Output (IO) traffic of the storage nodes the data replication process occupies, the faster the missing replica data is recovered, and the shorter the period during which system reliability is degraded by the storage node failure.
While data replication is in progress, clients of the distributed storage system can still read and write data, so replication operations and read-write operations share the disk traffic of the storage nodes in the system. The more disk traffic the replication process occupies, the less the system can offer to read-write operations, and the more the data read-write performance suffers. Reducing the influence of the replication process on read-write performance while completing replication as soon as possible to ensure system reliability is a major challenge for distributed storage systems.
Disclosure of Invention
Aspects of the present application provide a flow control and data replication method, node, system, and storage medium, which reduce the influence of the data replication process on data read-write performance, complete data replication as soon as possible, and ensure system reliability.
In a first aspect, an embodiment of the present application provides a flow control method applied to a flow control node in a storage system. The method includes: predicting a first flow demand corresponding to data read-write tasks in the next flow control period according to the IO traffic consumed by at least one client executing data read-write tasks on storage nodes in at least one prior flow control period, where a prior flow control period is a flow control period before the next flow control period; predicting a second flow demand corresponding to data replication tasks in the next flow control period according to the IO traffic consumed by at least one working node executing data replication tasks between storage nodes in the at least one prior flow control period; allocating, from the IO traffic of the storage system, each working node's target flow quota for the next flow control period according to the first flow demand and the second flow demand; and providing each working node's target flow quota for the next flow control period to the at least one working node, so that the at least one working node executes data replication tasks in the next flow control period according to its respective target flow quota.
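For illustration only, the four steps of the first aspect can be read as one periodic control loop. The following Python sketch assumes an injected prediction function, allocation function, and quota-issuing function, as well as a fixed period length; none of these names or values come from the patent itself:

    import time

    FLOW_CONTROL_PERIOD_S = 5.0  # assumed period length; the patent leaves it open

    def flow_control_loop(system_io_traffic, clients, workers,
                          predict_demand, allocate_quotas, send_quota):
        """One flow control period per iteration: collect, predict, allocate, issue."""
        rw_history, copy_history = [], []
        while True:
            # collect the IO traffic consumed in the period that just ended
            rw_history.append(sum(c.report_first_io() for c in clients))
            copy_history.append(sum(w.report_second_io() for w in workers))
            # predict both flow demands for the next flow control period
            first_demand = predict_demand(rw_history)
            second_demand = predict_demand(copy_history)
            # allocate each worker's target flow quota from the system IO traffic
            quotas = allocate_quotas(system_io_traffic, first_demand,
                                     second_demand, workers)
            # issue the quotas; workers apply them in the next period
            for worker, quota in quotas.items():
                send_quota(worker, quota)
            time.sleep(FLOW_CONTROL_PERIOD_S)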
In a second aspect, an embodiment of the present application further provides a data replication method applied to any working node in a storage system. The method includes: determining the priority of a data replication task to be executed according to the data loss state corresponding to the task; receiving the working node's target flow quota for the next flow control period, provided by the flow control node in the system; and executing the data replication task between the storage nodes associated with it in the system in the next flow control period according to the target flow quota and the task's priority. The target flow quota is allocated by the flow control node from the IO traffic of the storage system according to the predicted first flow demand corresponding to data read-write tasks and second flow demand corresponding to data replication tasks in the next flow control period.
In a third aspect, an embodiment of the present application provides a flow control device applied to a flow control node in a storage system. The device includes:
a first prediction module, configured to predict a first flow demand corresponding to data read-write tasks in the next flow control period according to the IO traffic consumed by at least one client executing data read-write tasks on storage nodes in at least one prior flow control period, where a prior flow control period is a flow control period before the next flow control period;
a second prediction module, configured to predict a second flow demand corresponding to data replication tasks in the next flow control period according to the IO traffic consumed by the at least one working node executing data replication tasks between storage nodes in the at least one prior flow control period;
a flow quota allocation module, configured to allocate, from the IO traffic of the storage system, each working node's target flow quota for the next flow control period according to the first flow demand and the second flow demand; and
a flow quota providing module, configured to provide each working node's target flow quota for the next flow control period to the at least one working node, so that the at least one working node executes data replication tasks in the next flow control period according to its respective target flow quota.
In a fourth aspect, an embodiment of the present application provides a data replication device applied to any working node in a storage system. The device includes:
a priority determining module, configured to determine the priority of a data replication task to be executed according to the data loss state corresponding to the task;
a flow resource management module, configured to receive the working node's target flow quota for the next flow control period, provided by the flow control node in the system; and
a flow resource access control module, configured to execute the data replication task between the storage nodes associated with it in the system in the next flow control period according to the target flow quota and the task's priority;
where the target flow quota is allocated by the flow control node from the IO traffic of the storage system according to the predicted first flow demand corresponding to data read-write tasks and second flow demand corresponding to data replication tasks in the next flow control period.
In a fifth aspect, an embodiment of the present application provides a flow control method applied to a flow control node in a service cluster. The method includes:
predicting a first flow demand corresponding to a first task in the next flow control period according to the IO traffic consumed by at least one client executing the first task on service nodes in at least one prior flow control period, where a prior flow control period is a flow control period before the next flow control period;
predicting a second flow demand corresponding to a second task in the next flow control period according to the IO traffic consumed by at least one working node executing the second task between service nodes in the at least one prior flow control period;
allocating, from the IO traffic of the service cluster, each working node's target flow quota for the next flow control period according to the first flow demand and the second flow demand; and
providing each working node's target flow quota for the next flow control period to the at least one working node, so that the at least one working node executes the second task in the next flow control period according to its respective target flow quota.
In a sixth aspect, an embodiment of the present application provides a flow control node, including: a memory and a processor; the memory is configured to store a computer program; and the processor, coupled to the memory, is configured to execute the computer program to perform the steps of the method provided in the first or fifth aspect.
In a seventh aspect, an embodiment of the present application provides a working node, including: a memory and a processor; the memory is configured to store a computer program; and the processor, coupled to the memory, is configured to execute the computer program to perform the steps of the method provided in the second aspect.
In an eighth aspect, an embodiment of the present application provides a storage system, including: a plurality of storage nodes providing data storage services for at least one client, at least one working node as provided in the seventh aspect for executing data replication tasks between the storage nodes, and a flow control node as provided in the sixth aspect.
In a ninth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the steps in the method provided in the first or second aspect.
In the embodiments of the present application, within a storage system, the flow control node predicts the flow demand corresponding to data read-write tasks in the next flow control period according to the IO traffic consumed by clients executing read-write tasks on storage nodes in at least one prior flow control period, and predicts the flow demand corresponding to data replication tasks in the next flow control period according to the IO traffic consumed by working nodes executing replication tasks between storage nodes in at least one prior flow control period. By dynamically sensing both demands, it can reasonably and dynamically allocate to the working nodes the flow quotas required for executing replication tasks and issue them to each working node, achieving flow control over the replication tasks. During flow control, the flow quota allocated to a working node is adaptively adjusted in view of the flow demand of the read-write tasks, which reduces the influence of the replication process on read-write performance, allows replication to complete as soon as possible, ensures system reliability, and improves the system's quality of service and resource utilization. In addition, quota adjustment requires no manual intervention, reducing the labor cost of operation and maintenance.
Further, in the embodiments of the present application, a working node can determine the priority of each data replication task, so that when executing replication tasks under its allocated flow quota it also takes their priorities into account: high-priority tasks are executed first, replication of severely missing data is completed first, and the data reliability of the system is further improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application without unduly limiting it. In the drawings:
fig. 1 is a schematic structural diagram of a storage system 100 according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a flow management model according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of an internal structure of a working node according to an embodiment of the present application;
fig. 4 is a schematic diagram of scheduling a data replication task according to a flow quota and priority according to an embodiment of the present application;
fig. 5 is a schematic diagram of an internal structure of a flow control node according to an embodiment of the present application;
Fig. 6 is a schematic flow chart of a flow control method according to an embodiment of the present application;
fig. 7a is a schematic flow chart of a data replication method according to an embodiment of the present application;
fig. 7b is a flow chart of a flow control method for a service cluster according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of another flow control node according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of another working node according to an embodiment of the present application.
Detailed Description
To make the purposes, technical solutions, and advantages of the present application clearer, the technical solutions of the present application will be described clearly and completely below with reference to specific embodiments of the present application and the corresponding drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the present disclosure without inventive effort fall within the scope of the present disclosure.
It should be noted that the user information (including but not limited to user equipment information and user personal information) and data (including but not limited to data for analysis, stored data, and presented data) involved in the present application are information and data authorized by the user or fully authorized by all parties. The collection, use, and processing of such data must comply with the relevant laws, regulations, and standards of the relevant countries and regions, and corresponding operation entries are provided for users to choose to authorize or refuse.
In a distributed storage system where data replication tasks and data read-write tasks coexist, it is desirable that the read-write tasks not be disturbed by the replication process and that replication complete as soon as possible to ensure system reliability; moreover, when the IO traffic required by the read-write tasks is low, the IO traffic occupied by replication tasks can be raised appropriately, effectively improving storage resource utilization. In response to these needs of distributed storage systems, embodiments of the present application provide a flow control and data replication scheme.
In the scheme of the embodiments of the present application, the flow control node periodically predicts, per flow control period, the flow demand of data read-write tasks and the flow demand of data replication tasks in the next flow control period. By dynamically sensing both demands, it can reasonably and dynamically allocate to the working nodes the flow quotas required for executing replication tasks and issue them to each working node, achieving flow control over the replication tasks. During flow control, the flow quota allocated to a working node is adaptively adjusted as the flow demand of the read-write tasks changes, which reduces the influence of the replication process on read-write performance, allows replication to complete as soon as possible, ensures system reliability, and improves the system's quality of service and resource utilization; in addition, quota adjustment requires no manual intervention, reducing the labor cost of operation and maintenance.
Optionally, when the flow demand of the read-write tasks in the next flow control period is low, the flow quota allocated to the working nodes can be dynamically increased, tilting idle IO traffic toward the working nodes for replication, so that replication completes as soon as possible and system reliability is ensured; when the flow demand of the read-write tasks in the next flow control period rises, the flow quota allocated to the working nodes can be dynamically reduced, tilting more IO traffic toward the clients for read-write tasks and reducing the influence of replication on read-write performance.
Further, in the scheme of the embodiments of the present application, a working node can determine the priority of data replication tasks, so that when executing them under the allocated flow quota it also takes their priorities into account: high-priority replication tasks are executed first, replication of severely missing data is completed first, and the data reliability of the system is further improved.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a storage system 100 according to an embodiment of the present application. As shown in fig. 1, the system 100 includes multiple storage nodes 101 distributed across different physical devices. The physical device carrying a storage node 101 may be any electronic device with a storage medium, for example but not limited to: a server, gateway device, or base station in a machine room or cluster; a terminal device such as a mobile phone, notebook computer, desktop computer, tablet computer, or smart speaker; or a dedicated storage device.
In the present embodiment, the type of storage medium used by the storage node 101 is not limited; it may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disc.
Based on the data storage function of storage node 101, storage system 100 of the present embodiment may provide data storage services for at least one client. The client of the storage system 100 may be any object with a data access requirement, for example, may be an application system, an application program, a functional module or a process with a data access requirement, or may be various cloud computing instances with a data access requirement in a proprietary cloud, a public cloud or an edge cloud network. When any client needs to write data into the storage system 100, a write request can be submitted to the storage system 100, and the storage system 100 writes the data to be written into the corresponding storage node 101 according to the write request; when any client needs to read data from the storage system 100, a read request may be submitted to the storage system 100, and the storage system 100 reads the data required by the client from the corresponding storage node 101 according to the read request and provides the data to the client.
To better provide data storage services for clients, as shown in FIG. 1, the storage system 100 further includes: a master node (Master) 102, at least one working node (Worker) 103, and a flow control node 104. In deployment, the master node 102, working nodes 103, and flow control node 104 may each run on any physical device; they may share one physical device or be spread across different devices, without limitation. In addition, the physical device carrying the master node 102, a working node 103, or the flow control node 104 may simultaneously carry a storage node 101, or may be another physical device carrying no storage node 101, which is likewise not limited.
In this embodiment, the master node 102 is responsible for managing and maintaining information related to the storage nodes 101: it monitors and maintains the usage status of each storage node 101 and, where a storage node 101 contains multiple data disks (e.g. magnetic disks), the usage status of each data disk; it also maintains the storage mapping relationship among storage nodes, data disks, and data, recording which data disks each storage node contains and which data is stored on each data disk. In addition, the master node 102 sits between the clients and the storage nodes 101: it receives the read-write requests initiated by the clients and load-balances them across different storage nodes 101, which then provide the data access services for the clients.
Specifically, when a client needs to read or write data, it sends a read-write request to the storage system 100, and the request arrives at the master node 102. Here, "read-write request" is a collective term covering both read requests and write requests. For a write request, the master node 102 uses a load balancing algorithm to distribute the request to a storage node 101, which provides the data storage service for the client, i.e., stores the data to be written and updates the maintained storage mapping relationship. For a read request, the master node 102 determines from the maintained storage mapping relationship which storage node 101 the client needs to read from, and distributes the request to that storage node 101, which provides the data reading service for the client, i.e., reads the data the client requires and returns it to the client.
In the above procedure, when distributing read requests and write requests, the master node 102 aims to balance the read-write traffic on each storage node 101 as much as possible. For example, when distributing write requests, it may route them as much as possible to storage nodes 101 that have few read requests or have received none; when distributing a read request, if multiple storage nodes 101 store the data to be read, it may prefer a storage node 101 with few or no write requests to provide the data reading service for the client.
In the storage system 100, to ensure data reliability and system reliability, data is stored redundantly: when data is written, redundant backups of the data are placed on different storage nodes 101, so that if one storage node 101 fails, the data can be recovered or read from other storage nodes and data loss is avoided. The specific implementation of redundant storage is not limited in this embodiment. In an alternative embodiment, multi-copy redundant storage may be adopted, i.e., when writing data, multiple copies of the data are stored on different storage nodes 101 at the same time; the number of copies is not limited, and may be, for example, two, three, or more. In another alternative embodiment, redundant storage can be achieved by Erasure Code (EC) coding, a coding technique that encodes n pieces of original data into n + m pieces such that the original data can be restored from any n of the n + m pieces; in other words, as long as no more than m pieces are lost, the original data can still be restored from the remaining pieces, where n and m are positive integers.
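As a concrete illustration of the n + m property, the simplest special case m = 1 can be realized with XOR parity; this is not a construction prescribed by the patent, which leaves the specific code open, and real systems typically use Reed-Solomon-style codes for general m:

    from functools import reduce

    def xor_blocks(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def ec_encode(data_blocks):
        """n equal-sized data blocks -> n + m blocks, here with m = 1 (XOR parity)."""
        parity = reduce(xor_blocks, data_blocks)
        return list(data_blocks) + [parity]

    def ec_recover_one(surviving_blocks):
        """With m = 1, XOR-ing the n surviving blocks rebuilds the one lost block."""
        return reduce(xor_blocks, surviving_blocks)

    # n = 2, m = 1: any 2 of the 3 stored blocks recover the original data
    d1, d2 = b"\x0a\x0b", b"\x01\x02"
    stored = ec_encode([d1, d2])
    assert ec_recover_one([stored[1], stored[2]]) == d1  # d1 lost, rebuilt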
To maintain redundant storage of data, the storage system 100 of this embodiment also supports data replication, a technique for data backup that keeps the copies of the same piece of data stored on different storage nodes 101 consistent. Based on data replication, when a storage node or data disk fails and data becomes unreadable, the data can be obtained from other storage nodes (with multi-copy redundant storage, a data copy can be read directly from another storage node; with EC coding, the data can be restored from n pieces on other storage nodes according to the EC coding principle) and copied to another normal storage node to replace the data on the failed storage node or data disk, thereby restoring redundant storage. Data replication effectively prevents data loss and further improves system reliability.
To implement data replication, the master node 102 also monitors the usage status of each storage node 101 and of the data disks on the storage nodes 101. When it finds that a storage node 101 or a data disk has failed, the data on the failed disk or node needs to be replicated, so the master node generates data replication tasks for that data and distributes them to at least one working node 103, which executes the tasks between storage nodes 101. Each data replication task involves a source storage node, a target storage node, and identification information of the data to be replicated. The working node 103 executing the task obtains the data to be replicated from the source storage node according to the identification information and then writes it into the target storage node. How the working node 103 obtains the data from the source storage node depends on the redundant storage mode: with multi-copy redundant storage, the working node 103 can read the data copy directly from the source storage node; with EC coding, it can restore the data to be replicated from the n pieces stored on the source storage nodes according to the EC coding principle. The source and target storage nodes are normal storage nodes other than the failed one: the source storage node is a normal storage node that stores a copy, or EC-encoded pieces, of the data to be replicated, and the target storage node is a normal storage node that does not.
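A minimal sketch of how a working node might carry out one such task; the task fields and all helper names (read_copy, read_n_blocks, ec_reconstruct, write_block) are assumptions for illustration, not names taken from the patent:

    def run_replication_task(task):
        # obtain the data to be replicated from the source storage node(s)
        if task.redundancy == "multi-copy":
            # a full copy exists on the source node and can be read directly
            data = read_copy(task.source_node, task.data_id)
        else:
            # EC mode: rebuild the data from n surviving encoded pieces
            blocks = read_n_blocks(task.source_nodes, task.data_id)
            data = ec_reconstruct(blocks)
        # write the recovered data into the normal target storage node
        write_block(task.target_node, task.data_id, data)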
In this embodiment, the determination modes of the source storage node and the target storage node are not limited, and the source storage node and the target storage node may be determined by the master node 102 according to the maintained storage mapping relationship and the use states of the storage nodes; alternatively, the working node 103 that receives the data replication task may determine the source storage node and the target storage node according to the storage mapping relationship and the use state of each storage node; the storage mapping relationship and the use state of each storage node may be provided to the working node 103 by the master node 102. The identification information of the data to be copied may be determined by the master node 102 and provided to the working node 103, or alternatively, the master node 102 may carry the identification information of the data to be copied as attribute information of the data copying task in the data copying task and provide the attribute information to the working node 103, or may separately provide the identification information of the data to be copied to the working node 103, which is not limited. Accordingly, in the scheme in which the master node 102 determines the source storage node and the target storage node, the master node 102 may carry the identification information of the source storage node and the identification information of the target storage node as attribute information of the data replication task in the data replication task and provide them to the working node 103, or may separately provide the identification information of the source storage node and the identification information of the target storage node to the working node 103, which is not limited.
In the present embodiment, the number of the working nodes 103 is not limited, and may be one or a plurality of. Alternatively, in the case where there are multiple worker nodes 103, the master node 102 may employ a load balancing algorithm to distribute the data replication tasks to the different worker nodes 103, targeting as much balancing of the data replication traffic on each worker node 103 as possible.
In this embodiment, the read-write performance a storage node 101 can provide is limited. Optionally, it may be represented by the amount of data the storage node 101 supports reading and writing per unit time (for example, 1 s), referred to simply as the IO traffic of the storage node. The IO traffic of different storage nodes 101 may be the same or different. Further, a storage node 101 may include one or more data disks, each with its own IO traffic, which may likewise be represented by the amount of data the disk supports reading and writing per unit time; the IO traffic of different data disks may be the same or different. The IO traffic of a storage node may be the sum of the IO traffic of the data disks it contains, and the IO traffic of the storage system 100 is the sum of the IO traffic of the storage nodes it contains. Representing IO traffic by the amount of data supported per unit time is only one implementation; for example, IO traffic may instead be represented by the number of read-write operations supported per unit time, or by that number together with the amount of data supported per operation.
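Under the data-amount-per-unit-time representation, the additive relationship just described (disk -> node -> system) can be sketched as follows; the class and field names are illustrative only:

    from dataclasses import dataclass, field

    @dataclass
    class DataDisk:
        io_traffic: float  # bytes per second the disk supports reading/writing

    @dataclass
    class StorageNode:
        disks: list = field(default_factory=list)

        @property
        def io_traffic(self):
            # a storage node's IO traffic = sum over its data disks
            return sum(d.io_traffic for d in self.disks)

    def system_io_traffic(nodes):
        # the storage system's IO traffic = sum over its storage nodes
        return sum(n.io_traffic for n in nodes)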
In this embodiment, whether a client executes a data read-write task on a storage node 101 or a working node executes a data replication task between storage nodes 101, the IO traffic of the storage nodes is consumed. Viewed from either kind of task alone, the more IO traffic it can occupy, the faster the reading-writing or replication proceeds and the better its performance. However, the IO traffic of the whole storage system 100 is limited. Since the read-write tasks and the replication tasks share the system's IO traffic, the more IO traffic the replication tasks occupy, the less the system can provide to the read-write tasks, which lowers the read-write rate and degrades read-write performance. For the storage system 100, it is desirable to guarantee read-write performance first and reduce the interference of replication with it, while also, when the IO traffic occupied by the read-write tasks is low, appropriately raising the IO traffic occupied by replication so that replication completes as soon as possible and system reliability is ensured, thereby effectively improving system resource utilization.
Therefore, in this embodiment, the flow control node 104 periodically senses the flow demands of the data replication tasks and the data read-write tasks per flow control period. By dynamically sensing both demands, it reasonably and dynamically allocates to the working nodes the flow quotas required for executing replication tasks, adjusting them as the read-write flow demand changes, and issues them to each working node so that each executes its replication tasks according to its allocated quota, thereby achieving flow control over replication. In this process, the flow control node 104 adaptively adjusts the quota allocated to the working nodes according to changes in the read-write flow demand, which both reduces the influence of replication on read-write performance and, when the read-write demand is low, allows replication to complete as soon as possible, ensuring system reliability and improving the system's quality of service and resource utilization. For example, when the read-write flow demand is low, the flow control node 104 can dynamically increase the quota allocated to the working nodes, tilting idle IO traffic toward them for replication so that replication completes as soon as possible and system reliability is ensured; when the read-write flow demand rises, it can dynamically reduce the quota, tilting more IO traffic toward the clients for read-write tasks and reducing the influence of replication on read-write performance. Moreover, because the quota is adjusted adaptively by the flow control node according to changes in the read-write flow demand, no manual intervention is needed, reducing the labor cost of operation and maintenance.
In this embodiment of the present application, the flow control node 104 may be a newly added node in the system, or may be obtained by expanding the functions of the existing functional nodes in the system 100, which is not limited thereto. For example, a function expansion related to flow control may be performed on a QoS (Quality of Service) management node included in the storage system 100, and the QoS management node after the function expansion may be implemented as the flow control node 104 in the embodiment of the present application. The QoS management and control node after the function expansion still has the function of performing QoS related management and control on various resources in the storage system.
The process of flow control by the flow control node 104 for data replication tasks is described in detail below in connection with the system architecture shown in fig. 1:
in this embodiment, the flow control node 104, as a coordinated management node of the IO traffic resource of the storage system 100, has the following functions:
Function one: according to the set flow collection period, periodically access each storage node 101 to collect the IO traffic of each storage node 101, and sum the IO traffic of all storage nodes 101 to obtain the IO traffic of the storage system 100.
Optionally, the flow control node 104 may periodically send a first flow request to each storage node 101; each storage node then sums the IO traffic of its data disks according to the first flow request, takes the sum as its own IO traffic, and reports it to the flow control node 104. Alternatively, the flow control node 104 issues the flow collection period to each storage node 101 in advance, and each storage node 101 actively reports its IO traffic to the flow control node 104 according to that period. After receiving the IO traffic reported by each storage node 101, the flow control node 104 sums them to obtain the IO traffic of the storage system 100.
In this embodiment, the length of the flow collection period is not limited; it may be, for example, 1 minute, 10 minutes, 1 hour, one day, three days, or one week, and can be set flexibly according to application requirements. The smaller the flow collection period, the more accurately the flow control node 104 can track the IO traffic of the storage system 100, which benefits flow control accuracy. Note that the IO traffic of a storage node 101 generally does not change much unless special conditions such as node failure or serious performance degradation occur, so the flow collection period can be made appropriately long, which reduces the resources consumed by collecting the system's IO traffic. Preferably, the flow collection period is greater than the flow control period.
Here, if the IO traffic of each storage node 101 is highly stable, the IO traffic of the storage system 100 is unchanged or essentially unchanged; in that case the flow control node 104 may maintain the IO traffic of the storage system 100 in advance without collecting it periodically. In other words, function one is an optional capability of the flow control node 104 rather than a mandatory one.
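A pull-style sketch of function one, assuming a hypothetical rpc helper standing in for the first flow request (the push variant differs only in which side initiates the report):

    def collect_system_io_traffic(storage_nodes, rpc):
        # each storage node sums its data disks' IO traffic and reports it;
        # the flow control node then sums the reports over all storage nodes
        return sum(rpc(node, "report_io_traffic") for node in storage_nodes)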
Function two: maintain the flow control period and, per flow control period, collect and maintain the IO traffic each client consumes executing data read-write tasks on the storage nodes 101, denoted the first IO traffic, and the IO traffic each working node 103 consumes executing data replication tasks between the storage nodes 101, denoted the second IO traffic. The first IO traffic consumed by the same client in different flow control periods may be the same or different, as may the first IO traffic consumed by different clients in the same period. Likewise, the second IO traffic consumed by the same working node in different flow control periods may be the same or different, as may the second IO traffic consumed by different working nodes in the same period.
For any client, the first IO traffic is the IO traffic of storage nodes 101 consumed by that client executing data read-write tasks on the storage nodes 101 during one flow control period. A single flow control period may contain only data reading tasks, only data writing tasks, or both, and these tasks may involve different storage nodes; the first IO traffic comprises the IO traffic consumed on each storage node by each data reading task and/or data writing task in that period.
For any working node, the second IO traffic refers to the IO traffic of the storage node 101 consumed by the working node to perform a data replication task between the storage nodes 101 in one flow control period, where the second IO traffic includes the IO traffic consumed by reading data from one storage node and the IO traffic consumed by writing data to another storage node. In one flow control period, one or more data replication tasks may be involved, and the second IO traffic includes the IO traffic consumed by each data replication task in the flow control period on each storage node.
For each client, the amount of data read from and/or written to the storage nodes 101 in each flow control period can be counted, and that amount divided by the length of the flow control period is reported to the flow control node 104 as the first IO traffic the client consumed executing data read-write tasks in that period. Correspondingly, for each working node, the sum of the amount of data read from storage nodes 101 and the amount written to storage nodes 101 in each flow control period can be counted, and that sum divided by the length of the flow control period is reported to the flow control node 104 as the IO traffic the working node consumed executing data replication tasks between storage nodes 101 in that period.
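The per-period statistics described above reduce to two small formulas; the function and parameter names below are illustrative only:

    def client_first_io_traffic(bytes_read, bytes_written, period_s):
        # first IO traffic: data read and/or written in the period, per second
        return (bytes_read + bytes_written) / period_s

    def worker_second_io_traffic(bytes_read_from_src, bytes_written_to_dst, period_s):
        # second IO traffic: a replication task consumes read IO on the source
        # storage node plus write IO on the target storage node
        return (bytes_read_from_src + bytes_written_to_dst) / period_s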
Optionally, according to the flow control period, the flow control node 104 may periodically send a second flow request to each client and each working node; each client then counts, per the second flow request, the first IO traffic it consumed executing data read-write tasks on the storage nodes 101 in the previous flow control period and reports it to the flow control node 104, and each working node 103 counts the second IO traffic it consumed executing data replication tasks between the storage nodes 101 in the previous flow control period and reports it to the flow control node 104. Alternatively, the flow control node 104 issues the flow control period to each client and each working node in advance; each client then actively counts and reports its first IO traffic for each flow control period, and each working node 103 actively counts and reports its second IO traffic for each flow control period.
In this embodiment, the time length of the flow control period is not limited, and may be, for example, 1 second, 5 seconds, 1 minute, 5 minutes, or the like. The smaller the flow control period, the higher the accuracy of flow control and the higher the processing power requirement of the flow control node 104. The value of the flow control period can be flexibly set according to the requirement of flow control precision.
Function three: according to the information collected by function one and function two, periodically allocate a flow quota to each working node per flow control period, and provide it to each working node so that each can execute its data replication tasks according to the allocated quota.
In this embodiment, the flow control node 104 periodically allocates flow quotas to the working nodes as follows. It predicts the flow demand corresponding to data read-write tasks in the next flow control period, denoted the first flow demand, according to the first IO traffic the clients consumed executing read-write tasks on storage nodes in at least one prior flow control period before the next one (for ease of description and distinction, a flow control period before the next flow control period is called a prior flow control period). It predicts the flow demand corresponding to data replication tasks in the next flow control period, denoted the second flow demand, according to the second IO traffic the at least one working node consumed executing replication tasks between storage nodes in the at least one prior flow control period. According to the first flow demand and the second flow demand, it allocates from the IO traffic of the storage system 100 each working node's target flow quota for the next flow control period, the target flow quota being the quota allocated to that working node for executing data replication tasks in the next period. It then provides each working node's target flow quota for the next flow control period to the at least one working node, so that each can execute its data replication tasks in the next period according to its own target quota.
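One allocation policy consistent with this description, sketched below, reserves the first flow demand out of the system's IO traffic first and grants the remainder, capped by the second flow demand, to the workers; the patent leaves the exact policy open, and the even split across workers is likewise an assumption:

    def allocate_quotas(system_io_traffic, first_demand, second_demand, workers):
        # read-write tasks are served first; replication gets what is left,
        # but never more than its own predicted demand
        leftover = max(system_io_traffic - first_demand, 0.0)
        replication_budget = min(leftover, second_demand)
        per_worker = replication_budget / len(workers) if workers else 0.0
        return {w: per_worker for w in workers}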
The "at least one prior flow control period" used here is not limited in this embodiment; it may be, for example, the most recent one or more prior flow control periods, the prior flow control periods within a set time span, or a historically corresponding flow control period. If it is the most recent one or more prior periods, the first flow demand in the next period is predicted from the IO traffic each client consumed executing read-write tasks on the storage nodes in those most recent periods, and the second flow demand from the IO traffic each working node consumed executing replication tasks between the storage nodes in the same periods. If it is the prior flow control periods within a specific time span before the next period, the first flow demand is predicted from the IO traffic each client consumed executing read-write tasks in the prior periods within that span, and the second flow demand from the IO traffic each working node consumed executing replication tasks in those same periods. Alternatively, if it is the prior flow control period with the most data read-write tasks within a set time span before the next period, the first flow demand is predicted from the IO traffic each client consumed executing read-write tasks in that busiest period, and the second flow demand from the IO traffic each working node consumed executing replication tasks in that same period. The "at least one prior flow control period" used to predict the first flow demand and the one used to predict the second flow demand may be the same or different flow control periods; preferably they are the same.
In the present embodiment, the main problems for the flow control node 104 are how to predict the first flow demand and the second flow demand, and how to apportion IO traffic between the data read-write tasks and the data replication tasks according to them. To facilitate prediction, this embodiment defines the concept of a Resource Pool, a logical abstraction of IO traffic resources: IO traffic resources in different resource pools are isolated from each other, while those within the same resource pool are shared. Here, the IO traffic resource of the storage system 100 can be regarded as one resource pool. On this basis, this embodiment also provides a traffic management model, shown in fig. 2, in which the resource pool organizes and manages IO traffic resources hierarchically, from top to bottom, as cluster (Cluster) -> service (Service) -> clients (Client) and working nodes (Worker). The top layer, Cluster, represents the IO traffic of the storage system 100. The next layer, Service, comprises a frontend service (Frontend Service) and a background service (Background Service): Frontend Service represents the IO traffic of data read-write tasks, which may be generated by any of elastic block storage (Elastic Block Store, EBS), object storage (Object Storage Service, OSS), table storage (Open Table Service, OTS), log service (SLS), and the like; Background Service represents the IO traffic of data replication tasks. The bottom layer consists of Client and Worker: a Client uses IO traffic resources under Frontend Service, and a Worker uses IO traffic resources under Background Service.
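The hierarchy of the fig. 2 model can be sketched as a small tree; the structure is taken from the text above, while the class and instance names are illustrative:

    from dataclasses import dataclass, field

    @dataclass
    class ResourcePool:
        name: str
        children: list = field(default_factory=list)

    cluster = ResourcePool("Cluster", [                    # IO traffic of system 100
        ResourcePool("Frontend Service",                   # data read-write tasks
                     [ResourcePool("Client-1"), ResourcePool("Client-2")]),
        ResourcePool("Background Service",                 # data replication tasks
                     [ResourcePool("Worker-1"), ResourcePool("Worker-2")]),
    ])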
Based on the traffic management model, the flow control node 104 may predict the first flow demand and the second flow demand in a bottom-up manner, and allocate each working node's target flow quota for the next flow control period in a top-down manner.
The first flow demand is predicted as follows: generate the first flow demand along the Frontend Service -> Client hierarchy from the first IO traffic the at least one client consumed executing data read-write tasks on storage nodes in at least one prior flow control period. Generating the first flow demand along the Frontend Service -> Client hierarchy may adopt, without limitation, the following embodiments:
In an alternative embodiment A1, the flow demand of the at least one client in the next flow control period may be predicted from the first IO traffic the at least one client consumed executing data read-write tasks on storage nodes in the at least one prior flow control period; the clients' flow demands in the next period are then aggregated along the Frontend Service -> Client hierarchy to obtain the first flow demand. Taking the most recent prior flow control period as an example, the flow control node 104 may receive the first IO traffic reported by the at least one client, predict each client's flow demand in the next period from its first IO traffic, and then aggregate those demands along the Frontend Service -> Client hierarchy to obtain the first flow demand.
The above manner of predicting the traffic demand of each client in the next traffic control period according to the first IO traffic of each client includes, but is not limited to: directly taking the first IO flow of the client as the flow demand of the client in the next flow control period; or, according to a set numerical calculation mode, performing numerical calculation on the first IO flow of the client, and taking the result of the numerical calculation as the flow demand of the client in the next flow control period. For example, the first IO traffic of the client may be multiplied by a coefficient, which may be greater than 1 or less than 1, or an exponential calculation may be performed on the first IO traffic of the client, or the like.
The method for summarizing the flow requirements of at least one client in the next flow control period to obtain the first flow requirement includes, but is not limited to: directly summing the flow demands of each client in the next flow control period, and taking the summed result as a first flow demand; or, according to a preset weight coefficient, carrying out weighted summation on the flow demands of each client in the next flow control period, and taking the weighted summation result as a first flow demand.
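For illustration only, embodiment A1 can be sketched as follows; the scaling coefficient of 1.2, the optional per-client weights and all function names are hypothetical assumptions, not part of the claimed method:

```python
# Minimal sketch of embodiment A1 (hypothetical names and values).
from typing import Optional

def predict_client_demand(first_io: float, coefficient: float = 1.2) -> float:
    """Predict a client's demand for the next flow control period by
    numerically scaling the first IO flow it consumed previously; the
    coefficient may be greater than 1 or less than 1."""
    return first_io * coefficient

def first_flow_demand(client_io: dict[str, float],
                      weights: Optional[dict[str, float]] = None) -> float:
    """Summarize per-client demands along Frontend Service -> Client,
    by direct summation or by weighted summation."""
    demands = {c: predict_client_demand(io) for c, io in client_io.items()}
    if weights is None:
        return sum(demands.values())
    return sum(weights[c] * d for c, d in demands.items())

# Two clients reported 100 and 60 units in the previous flow control period:
print(first_flow_demand({"client-a": 100.0, "client-b": 60.0}))  # 192.0
```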
In an alternative embodiment A2, the first IO flow consumed by the at least one client for executing the data read-write task on the storage node in at least one previous flow control period is summarized according to the Frontend Service -> Client hierarchical relationship to obtain the first flow demand. Taking the case where there is one previous flow control period as an example, the flow control node may receive the first IO flow reported by the at least one client and summarize the first IO flow of each client according to the Frontend Service -> Client hierarchical relationship to obtain the first flow demand.
The above manner of summarizing the first IO traffic of each client to obtain the first traffic demand includes, but is not limited to: directly summing the first IO flows of all clients, and taking the sum result as a first flow demand; or, according to a preset weight coefficient, carrying out weighted summation on the first IO flows of the clients, and taking the weighted summation result as a first flow demand.
Similarly, the second flow demand may be predicted as follows: the second flow demand is generated, according to the Background Service -> Worker hierarchical relationship, from the second IO flow consumed by the at least one working node for executing the data replication task between the storage nodes in at least one previous flow control period. When generating the second flow demand according to the Background Service -> Worker hierarchical relationship, the following embodiments may be adopted, but are not limited to:
in an alternative embodiment B1, the flow demand of the at least one working node in the next flow control period may be predicted according to the second IO flow consumed by the at least one working node for executing the data replication task between the storage nodes in at least one previous flow control period; the flow demands of the at least one working node in the next flow control period are then summarized according to the Background Service -> Worker hierarchical relationship to obtain the second flow demand. Taking the case where there is one previous flow control period as an example, the flow control node 104 may receive the second IO flow reported by the at least one working node and predict, from each working node's second IO flow, that node's flow demand in the next flow control period; the flow demands of the working nodes in the next flow control period are then summarized according to the Background Service -> Worker hierarchical relationship to obtain the second flow demand.
The above manner of predicting the flow demand of each working node in the next flow control period according to the second IO flow of each working node includes, but is not limited to: directly taking the second IO flow of the working node as the flow demand of the working node in the next flow control period; or, according to a set numerical calculation mode, performing numerical calculation on the second IO flow of the working node, and taking the result of the numerical calculation as the flow demand of the working node in the next flow control period. For example, the second IO flow rate may be multiplied by a certain coefficient, which may be greater than 1 or less than 1, or an exponential calculation may be performed on the second IO flow rate.
The method for summarizing the flow demand of at least one working node in the next flow control period to obtain the second flow demand includes, but is not limited to: directly summing the flow demands of each working node in the next flow control period, and taking the summed result as a second flow demand; or, according to a preset weight coefficient, carrying out weighted summation on the flow demands of each working node in the next flow control period, and taking the weighted summation result as a second flow demand.
In an alternative embodiment B2, the second IO flow consumed by the at least one working node for executing the data replication task between the storage nodes in at least one previous flow control period is summarized according to the Background Service -> Worker hierarchical relationship to obtain the second flow demand. Taking the case where there is one previous flow control period as an example, the flow control node may receive the second IO flow reported by the at least one working node and summarize the second IO flow of each working node according to the Background Service -> Worker hierarchical relationship to obtain the second flow demand.
The above manner of summarizing the second IO traffic of each working node to obtain the second traffic demand includes, but is not limited to: directly summing the second IO flows of all the working nodes, and taking the sum result as a second flow demand; or, according to a preset weight coefficient, carrying out weighted summation on the second IO flows of the working nodes, and taking the weighted summation result as a second flow demand.
Based on the predicted first and second flow demands, the flow control node 104 may allocate, from the IO flow of the storage system 100, a target flow quota for each of the at least one working node in the next flow control period. The target flow quota of each working node in the next flow control period may be allocated in, but is not limited to, the following alternative manners.
In an alternative embodiment C1, the flow control node 104 may determine, according to the first flow demand and the second flow demand, a global flow quota for the data replication task in the next flow control period from the IO flow of the storage system; the global flow quota is then allocated to the at least one working node to obtain the target flow quota that each of the at least one working node may use for executing data replication tasks in the next flow control period.
In an alternative embodiment C2, the flow control node 104 may directly determine, according to the first flow demand and the second flow demand and in combination with the flow demand of each working node in the next flow control period, each working node's target flow quota in the next flow control period from the IO flow of the storage system. For the manner of obtaining the flow demand of each working node in the next flow control period, reference may be made to the foregoing embodiments, which are not repeated here.
Further, in alternative embodiment C1, the implementation of determining the global flow quota for the data replication task in the next flow control period is not limited. For example, the weight ratio between the data read-write task and the data replication task in the next flow control period may be obtained; according to this weight ratio, the IO flow of the storage system 100 is allocated between the data read-write task and the data replication task to obtain a first initial flow quota and a second initial flow quota, where the first initial flow quota is the initial flow quota allocated to the data read-write task in the next flow control period and the second initial flow quota is the initial flow quota allocated to the data replication task in the next flow control period. If the first initial flow quota is greater than the first flow demand and the second initial flow quota is less than the second flow demand, flow quota continues to be allocated to the data replication task from the flow difference between the first initial flow quota and the first flow demand, until the quota of the data replication task is greater than or equal to the second flow demand or the flow difference is used up, so as to obtain the global flow quota for the data replication task in the next flow control period. The global flow quota may be regarded as the flow quota of Background Service in the next flow control period.
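For illustration only, a minimal sketch of this global quota determination follows; the 70/30 weight split and all numeric values are hypothetical assumptions:

```python
# Minimal sketch of determining the Background Service global quota
# (embodiment C1); all names and values are hypothetical.

def global_replication_quota(total_io: float,
                             frontend_weight: float,
                             first_demand: float,
                             second_demand: float) -> float:
    """Split the total IO flow by weight ratio, then move any surplus of
    the read-write share (beyond the first flow demand) over to the
    replication side, capped at the second flow demand."""
    first_initial = total_io * frontend_weight
    second_initial = total_io - first_initial
    quota = second_initial
    if first_initial > first_demand and second_initial < second_demand:
        surplus = first_initial - first_demand
        quota += min(surplus, second_demand - quota)
    return quota

# Example: 100 units of IO, a 70/30 split, demands of 40 and 50.
print(global_replication_quota(100.0, 0.7, 40.0, 50.0))  # 50.0
```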
In this embodiment, the manner of acquiring the weight ratio between the data read-write task and the data replication task in the next flow control period is not limited. Optionally, the weight ratio between the two tasks may be preset according to application requirements, in which case the preset weight ratio is obtained directly. When presetting the weight ratio, if the read-write performance of the data read-write task is to be guaranteed preferentially, the weight of the data read-write task may be set higher than that of the data replication task; if the performance of the data replication task is to be guaranteed preferentially, the weight of the data replication task may be set higher than that of the data read-write task.
Alternatively, the weight ratio between the data read-write task and the data replication task in the next flow control period may be determined according to the first flow demand corresponding to the data read-write task and the second flow demand corresponding to the data replication task in the next flow control period. In this manner, since the flow demands of the data read-write task and the data replication task change dynamically across flow control periods, the weight ratio between the two tasks also changes dynamically across flow control periods. For example, in the case where the first flow demand is greater than the second flow demand, or the first flow demand is greater than a set first threshold, or the first flow demand is less than the second flow demand but the difference from the second flow demand is less than a set second threshold, it may be determined that the weight ratio of the data read-write task is higher than that of the data replication task in the next flow control period; otherwise, it may be determined that the weight ratio of the data read-write task in the next flow control period is lower than that of the data replication task. The specific values of the weight ratio are not limited in this embodiment; for example, the data read-write task may take 70% and the data replication task 30%, or the data read-write task 60% and the data replication task 40%. The weight ratio between the two tasks may also be determined directly from the ratio of the first flow demand to the second flow demand.
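For illustration only, one of the optional rules above for deriving the weight ratio from the two predicted demands might look as follows; the thresholds and the 70%/30% and 40%/60% splits are hypothetical assumptions:

```python
# Minimal sketch of a dynamic weight-ratio rule (hypothetical values).

def frontend_weight(first_demand: float, second_demand: float,
                    first_threshold: float = 80.0,
                    second_threshold: float = 15.0) -> float:
    """Return the weight share of the data read-write task for the next
    flow control period; the remainder goes to the data replication task."""
    if (first_demand > second_demand
            or first_demand > first_threshold
            or second_demand - first_demand < second_threshold):
        return 0.7  # read-write task weighted higher, e.g. a 70%/30% split
    return 0.4      # replication task weighted higher, e.g. a 40%/60% split

# First demand 40, second demand 50: the gap (10) is below the second
# threshold (15), so the read-write task still gets the higher weight.
print(frontend_weight(40.0, 50.0))  # 0.7
```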
After the global flow quota for the data replication task in the next flow control period is calculated, it may be allocated to the at least one working node to obtain each working node's target flow quota in the next flow control period. Optionally, the global flow quota may be distributed evenly among the at least one working node, distributed randomly, or distributed according to the flow demand of the at least one working node in the next flow control period. The flow demand of the at least one working node in the next flow control period may be predicted from the second IO flow it consumed executing data replication tasks between the storage nodes in at least one previous flow control period; for the specific prediction manner, reference may be made to the foregoing embodiments, which are not repeated here.
Further optionally, a weighted max-min fairness (weighted max min fairness) algorithm may be employed to allocate the global flow quota to the at least one working node according to the flow demand of the at least one working node in the next flow control period. Specifically, the weight ratio among the at least one working node is determined according to the flow demand of the at least one working node in the next flow control period; the global flow quota is allocated among the at least one working node according to this weight ratio to obtain a flow quota for each working node; it is then judged whether there simultaneously exist, among the at least one working node, a first working node whose flow quota is greater than its flow demand and a second working node whose flow quota is less than its flow demand; if so, the portion of the first working node's flow quota that exceeds its flow demand continues to be allocated among the second working nodes according to the weight ratio among the second working nodes, until the flow quota allocated to every working node is greater than or equal to its flow demand, or no working node whose flow quota exceeds its flow demand remains, thereby obtaining the target flow quota of each working node in the next flow control period.
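For illustration only, the weighted max-min fairness allocation described above can be sketched as follows; the node names, weights and demands are hypothetical:

```python
# Minimal sketch of weighted max-min fairness (hypothetical names).

def weighted_max_min(quota: float,
                     demands: dict[str, float],
                     weights: dict[str, float]) -> dict[str, float]:
    """Allocate quota by weight, then iteratively hand each node's
    surplus (above its demand) back to the still-unsatisfied nodes."""
    alloc = {n: 0.0 for n in demands}
    active = set(demands)
    remaining = quota
    while active and remaining > 1e-9:
        total_w = sum(weights[n] for n in active)
        surplus = 0.0
        for n in list(active):
            share = remaining * weights[n] / total_w
            need = demands[n] - alloc[n]
            take = min(share, need)
            alloc[n] += take
            surplus += share - take  # unneeded share goes back to the pool
            if demands[n] - alloc[n] <= 1e-9:
                active.discard(n)    # this node's demand is now satisfied
        remaining = surplus
    return alloc

# Three workers with equal weights and demands of 20, 30 and 70 units:
print(weighted_max_min(100.0, {"w1": 20.0, "w2": 30.0, "w3": 70.0},
                       {"w1": 1.0, "w2": 1.0, "w3": 1.0}))
# {'w1': 20.0, 'w2': 30.0, 'w3': 50.0}
```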
In the above allocation of target flow quotas, when allocating between Frontend Service and Background Service, both the weight ratio of the two services and their flow demands in the next flow control period are considered, which brings the following advantages: when the flow demands of Frontend Service and Background Service are both large and the two compete for IO flow resources, the more important (i.e., higher-weighted) Frontend Service is guaranteed to be allocated more flow resources; when the flow demand of Frontend Service is small, the flow resources allocated to Background Service can be moderately increased to improve data replication efficiency and thereby improve the utilization of IO flow resources in the storage system. Further, when flow quotas are allocated among different Workers, the flow demand of each Worker in the next flow control period is considered, so that flow can be allocated reasonably among the Workers: Workers with larger flow demands are allocated larger flow quotas, and Workers with smaller flow demands are allocated smaller ones.
It should be noted that the data read-write task is sensitive to performance indexes such as latency and throughput, and these indexes must still be guaranteed even when the flow demand of Frontend Service bursts upward from a small value. Therefore, to avoid the adverse effect that throttling Frontend Service would have on read-write performance, in the above target flow quota allocation flow, no flow quota may be set for the Clients under Frontend Service. In addition, by setting a reasonable flow control period, the frequency at which the flow control node and the Clients or Workers interact to obtain IO flow can be increased (for example, once every 1 s), so that the flow control node can perceive a rise in the flow demand of Frontend Service in time, promptly reduce the flow quota allocated to the Workers, and thereby guarantee performance indexes such as read-write latency and throughput.
After the target flow quota of each working node in the next flow control period is obtained, each working node may be provided with its target flow quota in the next flow control period. Optionally, the flow control node 104 may send each working node its target flow quota in the next flow control period; alternatively, each working node may periodically request its target flow quota for the next flow control period from the flow control node. The working node 103 is the executing node that implements data replication and the related flow control in the storage system 100: on the one hand, it may periodically count, per flow control period, the second IO flow consumed by executing data replication tasks and report it to the flow control node 104; on the other hand, it may receive the target flow quota of the working node in the next flow control period provided by the flow control node 104 and execute data replication tasks in the next flow control period according to that quota. Since each working node executes data replication tasks according to its received target flow quota in the same or a similar way, the process is described below taking any one working node as an example.
In this embodiment, data replication tasks are divided into different data loss states according to the number of lost data copies, where different data loss states represent different severities of data loss. Optionally, taking multi-copy redundant storage as an example, the following data loss states may be defined:
normal state, denoted Normal: no data copy is lost;
a first data loss state, denoted LessMax: some data copies are lost, but the current copy count is still greater than a preset minimum copy count (Min Copy);
a second data loss state, denoted LessMin: some data copies are lost, and the current copy count is less than the preset Min Copy;
a third data loss state, denoted 1Replica: data copies are lost and only one copy remains; if it is lost as well, the user data will be lost completely;
a fourth data loss state, denoted NoneCopy: all data copies have been lost, and the user data is completely lost.
From top to bottom, the above states grow progressively more serious, the threat to system reliability increases, and the urgency of completing data recovery through data replication rises accordingly. For this scenario, the working node of this embodiment has the function of marking priorities for data replication tasks, and can attach different priority labels to data replication tasks in different data loss states. For example, four priority labels P1-P4 may be set from high to low: a data replication task in the 1Replica state is labeled P1, one in the LessMin state P2, one in the LessMax state P3, and one in the Normal state P4. As for the NoneCopy state, since all data copies are lost, data replication can no longer be completed; the corresponding data replication task may be ignored, or a prompt message may be output for related personnel (such as system maintainers or customers) to intervene manually, which is not limited here.
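For illustration only, the mapping from data loss states to the P1-P4 labels can be sketched as follows; the enum and function names are hypothetical:

```python
# Minimal sketch of the P1-P4 priority labelling (hypothetical names).
from enum import Enum

class LossState(Enum):
    NORMAL = "Normal"         # no copy lost
    LESS_MAX = "LessMax"      # copies lost, still above Min Copy
    LESS_MIN = "LessMin"      # copies lost, below Min Copy
    ONE_REPLICA = "1Replica"  # only one copy left
    NONE_COPY = "NoneCopy"    # all copies lost; cannot be repaired

PRIORITY = {
    LossState.ONE_REPLICA: "P1",
    LossState.LESS_MIN: "P2",
    LossState.LESS_MAX: "P3",
    LossState.NORMAL: "P4",
}

def priority_of(state: LossState) -> str:
    if state is LossState.NONE_COPY:
        raise ValueError("all copies lost; manual intervention required")
    return PRIORITY[state]

print(priority_of(LossState.ONE_REPLICA))  # P1
```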
Based on the above, when generating a data replication task, the master node 102 may further determine the data loss state corresponding to the task and provide it to the working node 103 as attribute information of the task. The working node 103 receives the data replication task distributed by the master node 102, obtains the corresponding data loss state from the task's attribute information, and determines the priority of the task according to that state. On this basis, the working node 103 may execute, in the next flow control period, the data replication task between the storage nodes associated with it, according to both its received target flow quota and the priority of the data replication tasks to be executed in that period. The priority of a data replication task determines the order in which it is scheduled for execution, while the target flow quota is used to throttle the data replication tasks: each executed task consumes a certain amount of the quota, and the quota is consumed continuously as tasks execute, until it is used up or the remaining available amount is insufficient for the next task. For example, with a target flow quota of 10, if the first data replication task consumes 2, the second consumes 4 and the third consumes 3, the remaining available amount is 1; if the fourth task requires an IO flow of 3, the remaining amount is insufficient to execute it. In this example, the unit of the flow quota and of the flow consumed by data replication tasks may be, but is not limited to, bytes per second or bytes per flow control period.
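For illustration only, the quota-consumption behaviour in the numeric example above can be sketched as follows (units are hypothetical, e.g. bytes per flow control period):

```python
# Minimal sketch of quota consumption within one flow control period.

def run_tasks(quota: float, task_costs: list[float]) -> list[bool]:
    """Execute tasks in order while the remaining quota covers them."""
    executed = []
    for cost in task_costs:
        if cost <= quota:
            quota -= cost
            executed.append(True)
        else:
            executed.append(False)  # deferred to a later period
    return executed

# Quota 10; tasks cost 2, 4, 3 and 3: the fourth no longer fits.
print(run_tasks(10.0, [2.0, 4.0, 3.0, 3.0]))  # [True, True, True, False]
```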
Further, in this embodiment, the working node executes data replication tasks using a plurality of data replication threads, each of which is responsible for at least one data replication task; executing data replication tasks concurrently on multiple threads is beneficial to execution efficiency. Optionally, after receiving a data replication task distributed by the master node 102, the working node 103 may determine its priority in the manner described above and then assign the task to one of the plurality of data replication threads. The working node 103 may assign tasks to the threads randomly, or preferentially assign a task to a more lightly loaded thread according to the load of each thread, so as to achieve load balancing.
In addition, in the embodiment of the present application, the working node 103 may adopt a two-level (Two-level) scheduling architecture to decouple the scheduling of flow quotas from the scheduling of data replication tasks. Scheduling of flow quotas refers to the process of distributing the working node's flow quota to the data replication threads; scheduling of data replication tasks refers to the process by which a data replication thread schedules and executes its data replication tasks according to its allocated flow quota. The two-level scheduling process is as follows: after receiving the target flow quota provided by the flow control node, the working node 103 may predict the flow demands of the data replication threads in the next flow control period according to the IO flow each thread consumed executing data replication tasks in the previous flow control period; the working node's target flow quota for the next flow control period is then distributed to the threads according to those demands.
In this embodiment, the implementation of predicting the flow demands of the plurality of data replication threads in the next flow control period from the IO flow they consumed executing data replication tasks in the previous flow control period is not limited. For example, for each data replication thread, the IO flow it consumed in the previous flow control period may be taken directly as its flow demand in the next flow control period; or that IO flow may be multiplied by a coefficient, which may be greater than 1 or less than 1, and the product taken as the thread's flow demand in the next flow control period; or another numerical calculation may be performed on that IO flow and the result taken as the thread's flow demand in the next flow control period.
Similarly, in this embodiment, the implementation of distributing the working node's target flow quota for the next flow control period to the plurality of data replication threads according to their flow demands is not limited. For example, a weighted max-min fairness algorithm may be employed. Specifically, the weight ratio among the data replication threads is determined according to their flow demands in the next flow control period; the working node's target flow quota for the next flow control period is allocated among the threads according to this weight ratio to obtain each thread's flow quota; if there exist, among the threads, a first data replication thread whose allocated quota exceeds its flow demand and a second data replication thread whose allocated quota is below its flow demand, the portion of the first thread's quota that exceeds its demand continues to be allocated among the second threads according to the weight ratio among the second threads, until every thread's allocated quota is greater than or equal to its demand or no thread with a quota exceeding its demand remains, thereby obtaining each data replication thread's flow quota.
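For illustration only, the first scheduling level (splitting a working node's target quota across its data replication threads) can be sketched with a simple demand-proportional variant; the weighted max-min fairness routine sketched earlier could equally be reused here at thread granularity. Thread names and values are hypothetical:

```python
# Minimal sketch of the first scheduling level: splitting a worker's
# target quota across replication threads in proportion to their
# predicted demands (hypothetical names).

def split_quota(target_quota: float,
                thread_demands: dict[str, float]) -> dict[str, float]:
    total = sum(thread_demands.values())
    if total <= 0:
        even = target_quota / max(len(thread_demands), 1)
        return {t: even for t in thread_demands}
    return {t: target_quota * d / total for t, d in thread_demands.items()}

# Demands predicted from each thread's IO in the previous period:
print(split_quota(50.0, {"t1": 10.0, "t2": 30.0}))  # {'t1': 12.5, 't2': 37.5}
```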
Each data replication thread may then, in the next flow control period, schedule and execute the data replication tasks it is responsible for according to its allocated flow quota and the priorities of those tasks. The priority of a data replication task determines the order in which the thread schedules it; the flow quota allocated to the thread determines whether the currently scheduled task can be executed: when the available amount of the thread's quota is greater than the IO flow required by the currently scheduled task, the task can be executed, the thread executes it, and the available amount of the thread's quota is updated according to the IO flow the task consumed. As tasks are executed, the available amount of the quota allocated to the thread gradually decreases.
Each data replication thread executes data replication tasks in the next flow control period according to its allocated flow quota and task priorities as follows. For any data replication thread, when the next flow control period arrives (i.e., becomes the current flow control period), it is judged whether a data replication task exists in at least one priority queue corresponding to the thread; a priority queue stores data replication tasks that were not executed in an earlier flow control period because the available amount of the target flow quota was insufficient. If such tasks exist, the data replication tasks in the at least one priority queue are scheduled according to a set priority scheduling policy, and when the available amount of the current flow control period's target flow quota is greater than the IO flow required by the currently scheduled target data replication task, the thread is controlled to execute the target data replication task between the storage nodes associated with it. The storage nodes associated with the target data replication task include a source storage node and a target storage node, determined as described in the foregoing embodiments, which are not repeated here.
Further, under the condition of executing the target data replication task, the available amount of the target flow quota distributed by the current flow control period is updated according to the IO flow required by the target data replication task. The available amount of the target flow quota refers to a flow quota after the IO flow consumed by the executed data replication task is deducted from the target flow quota.
In this embodiment, the implementation of scheduling the data replication tasks in the at least one priority queue according to the priority scheduling policy is not limited. For example, the priority queues may be polled in order of priority from high to low; if a data replication task exists in the polled queue, one task in that queue (e.g., the task added to the queue earliest) is scheduled, and then the next queue is polled. As another example, the priority queues may be polled in order of priority from high to low, and if tasks exist in the polled queue, they are scheduled until the queue is drained before polling the next queue. As yet another example, a weighted round robin (Weighted Round Robin, WRR) scheduling algorithm may be employed to schedule the data replication tasks in the at least one priority queue.
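For illustration only, the second polling strategy above (draining each priority queue before moving to the next), combined with the quota check, can be sketched as follows; the queue, task and cost values are hypothetical:

```python
# Minimal sketch of the second scheduling level: a replication thread
# polling its P1-P4 queues, highest priority first, and executing tasks
# while the remaining quota allows (hypothetical names).
from collections import deque

def schedule_period(quota: float, queues: dict[str, deque]) -> list[str]:
    """queues maps 'P1'..'P4' to FIFO queues of (task_id, io_cost)."""
    executed = []
    for level in ("P1", "P2", "P3", "P4"):
        queue = queues[level]
        while queue and queue[0][1] <= quota:
            task_id, cost = queue.popleft()
            quota -= cost  # update the quota's available amount
            executed.append(task_id)
        # tasks that did not fit stay queued for the next period
    return executed

queues = {"P1": deque([("a", 3.0)]), "P2": deque([("b", 5.0)]),
          "P3": deque([("c", 4.0)]), "P4": deque()}
print(schedule_period(10.0, queues))  # ['a', 'b']; 'c' waits for next period
```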
Further, under the condition that the data replication task exists in at least one priority queue, the data replication task in the priority queue can be scheduled preferentially, and if a new data replication task is received in the scheduling process, the new data replication task is added into the corresponding priority queue according to the priority of the new data replication task to wait for scheduling.
Further, under the condition that no data replication task exists in at least one priority queue, according to the sequence of receiving the data replication tasks, when the available amount of the target flow quota is larger than the IO flow required by the latest received data replication task, the data replication thread is controlled to execute the latest received data replication task.
Further, when the available amount of any data replication thread's target flow quota is insufficient, a received but unexecuted data replication task may be added to the corresponding priority queue according to its priority, where it waits for the thread to be allocated a flow quota for the next flow control period and then to be scheduled in that period according to the priority scheduling policy.
In this embodiment, the working node determines priorities for data replication tasks and, while throttling them according to the target flow quota, schedules them by priority, so that high-priority data replication tasks (for example, those in the 1Replica state) are guaranteed to execute first; once high-priority tasks are completed, low-priority tasks can occupy the idle flow resources, which both ensures system reliability and avoids wasting resources.
In the embodiment of the present application, the implementation structure of the working node is not limited. As shown in fig. 3, one implementation structure of the working node includes: a traffic resource management module (Resource Manager) 31, a traffic resource access control module (Resource Guard) 32, and a priority determination module 33.
The traffic resource management module 31 may act as an agent of the flow control node and interact with it periodically: on the one hand, it collects the second IO flow consumed by the working node executing data replication tasks in each flow control period and reports it to the flow control node; on the other hand, it receives the target flow quota of the working node in the next flow control period provided by the flow control node. Optionally, the traffic resource management module 31 may be deployed on a background thread of the working node, collect in a lightweight manner the IO flow consumed by the multiple data replication threads in each flow control period, summarize it as the second IO flow, and report it to the flow control node. The traffic resource access control module 32 is configured to execute the data replication task between the storage nodes associated with it in the system in the next flow control period, according to the target flow quota received by the traffic resource management module 31 and the priority of the data replication task.
The priority determining module 33 is configured to determine the priority of the data replication task according to the data loss state corresponding to the data replication task to be executed. Optionally, the priority determining module 33 is specifically configured to: receiving a data replication task distributed by a main node in a system, wherein attribute information of the data replication task comprises a data loss state corresponding to the data replication task; and determining the priority of the data replication task according to the data loss state corresponding to the data replication task, and distributing the data replication task to one data replication thread in the plurality of data replication threads.
Further, the traffic resource management module 31 is further configured to coordinate the allocation of the target flow quota received by the working node among the data replication threads. The traffic resource management module 31 adopts the two-level (Two-level) scheduling architecture to decouple the scheduling of the target flow quota from the scheduling of data replication tasks. For the detailed implementation of this part, reference may be made to the foregoing embodiments, which are not repeated here.
The traffic resource access control module 32 is specifically configured to: according to the flow quotas allocated to the plurality of data replication threads by the traffic resource management module 31, control the data replication threads to execute data replication tasks in the next flow control period according to their allocated flow quotas and the priorities of the tasks they are responsible for. Optionally, the traffic resource access control module 32 may be deployed on each data replication thread, responsible for controlling the thread it resides on to execute data replication tasks in the next flow control period according to the allocated flow quota and task priorities, and for maintaining the usage of the flow quota allocated to that thread.
The process by which the traffic resource access control module 32 controls a data replication thread to execute data replication tasks is shown in fig. 4. When the available amount of the target flow quota in the current flow control period is insufficient and a data replication task cannot be executed, the task is added to the corresponding priority queue; fig. 4 shows four priority queues P1-P4, whose priorities run from high to low. The module then waits for the target flow quota of the next flow control period; when that quota has been allocated and the next flow control period arrives, data replication tasks are scheduled from the four priority queues P1-P4 according to the priority scheduling policy, and when the available amount of the target flow quota suffices, a scheduled data replication task is executed between the relevant storage nodes, i.e., the data to be replicated is read from the source storage node and written to the target storage node.
In this embodiment, the implementation structure of the flow control node is not limited, and as shown in fig. 5, one implementation structure of the flow control node includes: a first prediction module 51, a second prediction module 52, a flow quota allocation module 53, and a flow quota providing module 54.
The first prediction module 51 is configured to predict, according to the IO traffic consumed by at least one client in executing the data read-write task on the storage node in at least one previous flow control period, a first traffic demand corresponding to the data read-write task in a next flow control period, where the previous flow control period is a flow control period before the next flow control period.
The second prediction module 52 is configured to predict, according to the IO traffic consumed by the at least one working node to perform the data replication task between the storage nodes in the at least one previous flow control period, a second traffic demand corresponding to the data replication task in the next flow control period.
The flow quota allocation module 53 is configured to allocate, from the IO traffic of the storage system, a target flow quota of each of the at least one working node in a next flow control period according to the first flow requirement and the second flow requirement.
The flow quota providing module 54 is configured to provide the target flow quota of each of the at least one working node in the next flow control period, so that the at least one working node performs the data replication task in the next flow control period according to the respective target flow quota.
In an alternative embodiment, the flow quota allocation module 53 is specifically configured to: according to the first flow demand and the second flow demand, determining a global flow quota for the data replication task in a next flow control period from IO flow of the storage system; the global flow quota is allocated to the at least one working node to obtain a target flow quota of each of the at least one working node in a next flow control period.
In an alternative embodiment, the flow quota allocation module 53 is specifically configured to, when determining the global flow quota: according to the weight ratio between the data read-write task and the data replication task in the next flow control period, the IO flow of the storage system is distributed between the data read-write task and the data replication task, so that a first initial flow quota and a second initial flow quota are obtained; if the first initial flow quota is greater than the first flow requirement and the second initial flow quota is less than the second flow requirement, continuing to distribute the flow quota for the data replication task from the flow difference between the first initial flow quota and the first flow requirement so as to obtain the global flow quota for the data replication task in the next flow control period.
In an alternative embodiment, the flow quota allocation module 53 is specifically configured to, when allocating a global flow quota to at least one working node: predicting the flow demand of at least one working node in the next flow control period according to the IO flow consumed by the at least one working node in the at least one previous flow control period by executing the data replication task between the storage nodes; and distributing the global flow quota to the at least one working node according to the flow demand of the at least one working node in the next flow control period so as to obtain the target flow quota of each working node in the next flow control period.
Further optionally, the flow quota allocation module 53 is specifically configured to, when allocating the global flow quota to the at least one working node according to the flow requirement of the at least one working node in the next flow control period: determining a weight duty ratio between at least one working node according to the flow demand of the at least one working node in the next flow control period; distributing the global flow quota among the at least one working node according to the weight ratio among the at least one working node so as to obtain the flow quota of the at least one working node; if the first working node with the flow quota larger than the corresponding flow requirement and the second working node with the flow quota smaller than the corresponding flow requirement exist, continuing to distribute the flow quota distributed by the first working node and exceeding the corresponding flow requirement among the second working nodes according to the weight ratio among the second working nodes until the flow quota distributed by each working node is larger than the corresponding flow requirement or no working node with the flow quota larger than the corresponding flow requirement exists.
In an alternative embodiment, the first prediction module 51 is specifically configured to: predicting the flow demand of at least one client in the next flow control period according to the IO flow consumed by the at least one client in the at least one previous flow control period by executing the data read-write task on the storage node; and generating a first flow demand according to the flow demand of at least one client in the next flow control period. The second prediction module 52 is specifically configured to: predicting the flow demand of at least one working node in the next flow control period according to the IO flow consumed by the at least one working node in the at least one previous flow control period by executing the data replication task between the storage nodes; and generating a second flow demand according to the flow demand of at least one working node in the next flow control period.
The specific manner in which the various modules in the working node embodiment of fig. 3 and the flow control node embodiment of fig. 5 perform operations has been described in detail in the system embodiment described above and will not be described in detail herein. The advantages brought by the working node shown in fig. 3 and the flow control node shown in fig. 5 can also be seen in the foregoing embodiments, and will not be described herein again.
Fig. 6 is a flow chart of a flow control method according to an embodiment of the present application. The method may be performed by the flow control node in the foregoing embodiment, as shown in fig. 6, and includes:
601. predicting a first flow demand corresponding to a data read-write task in a next flow control period according to IO flow consumed by at least one client for executing the data read-write task on a storage node in at least one previous flow control period, wherein the previous flow control period is a flow control period before the next flow control period;
602. predicting a second flow demand corresponding to the data replication task in the next flow control period according to IO flow consumed by at least one working node for executing the data replication task between storage nodes in at least one previous flow control period;
603. according to the first flow demand and the second flow demand, distributing target flow quota of at least one working node in a next flow control period from IO flow of a storage system where the storage node is located;
604. providing each of the at least one working node with its target flow quota in the next flow control period, so that the at least one working node executes the data replication task in the next flow control period according to its respective target flow quota.
In an optional embodiment, the allocating, according to the first flow requirement and the second flow requirement, a target flow quota of each of the at least one working node in a next flow control period from the IO flow of the storage system where the storage node is located includes: according to the first flow demand and the second flow demand, determining a global flow quota for the data replication task in a next flow control period from IO flow of the storage system; the global flow quota is allocated to the at least one working node to obtain a target flow quota of each of the at least one working node in a next flow control period.
In an alternative embodiment, the determining, according to the first flow requirement and the second flow requirement, the global flow quota for the data replication task in the next flow control period from the IO flow of the storage system includes: according to the weight ratio between the data read-write task and the data replication task in the next flow control period, the IO flow of the storage system is distributed between the data read-write task and the data replication task, so that a first initial flow quota and a second initial flow quota are obtained; if the first initial flow quota is greater than the first flow requirement and the second initial flow quota is less than the second flow requirement, continuing to distribute the flow quota for the data replication task from the flow difference between the first initial flow quota and the first flow requirement so as to obtain the global flow quota for the data replication task in the next flow control period.
In an optional embodiment, the allocating the global traffic quota to the at least one working node to obtain a target traffic quota of each of the at least one working node in a next flow control period includes: predicting the flow demand of at least one working node in the next flow control period according to the IO flow consumed by the at least one working node in the at least one previous flow control period by executing the data replication task between the storage nodes; and distributing the global flow quota to the at least one working node according to the flow demand of the at least one working node in the next flow control period so as to obtain the target flow quota of each working node in the next flow control period.
In an optional embodiment, the allocating the global flow quota to the at least one working node according to the flow requirement of the at least one working node in the next flow control period to obtain the target flow quota of each of the at least one working node in the next flow control period includes: determining a weight duty ratio between at least one working node according to the flow demand of the at least one working node in the next flow control period; distributing the global flow quota among the at least one working node according to the weight ratio among the at least one working node so as to obtain the flow quota of the at least one working node; if the first working node with the flow quota larger than the corresponding flow requirement and the second working node with the flow quota smaller than the corresponding flow requirement exist, continuing to distribute the flow quota distributed by the first working node and exceeding the corresponding flow requirement among the second working nodes according to the weight ratio among the second working nodes until the flow quota distributed by each working node is larger than the corresponding flow requirement or no working node with the flow quota larger than the corresponding flow requirement exists.
In an optional embodiment, predicting, according to the IO traffic consumed by the at least one client in the at least one previous flow control period for executing the data read-write task on the storage node, a first traffic demand corresponding to the data read-write task in a next flow control period includes: predicting the flow demand of at least one client in the next flow control period according to the IO flow consumed by the at least one client in the at least one previous flow control period by executing the data read-write task on the storage node; and generating a first flow demand according to the flow demand of at least one client in the next flow control period.
Correspondingly, predicting the second traffic demand corresponding to the data replication task in the next flow control period according to the IO traffic consumed by the at least one working node to perform the data replication task between the storage nodes in the at least one previous flow control period, including: predicting the flow demand of at least one working node in the next flow control period according to the IO flow consumed by the at least one working node in the at least one previous flow control period by executing the data replication task between the storage nodes; and generating a second flow demand according to the flow demand of at least one working node in the next flow control period.
The detailed implementation and the beneficial effects of each step in the method of this embodiment have been described in the foregoing embodiments, and will not be described in detail herein.
Fig. 7a is a flowchart of a data replication method according to an embodiment of the present application. The method may be performed by the working node in the previous embodiment, as shown in fig. 7a, the method comprising:
701. determining the priority of the data replication task according to the data loss state corresponding to the data replication task to be executed;
702. receiving a target flow quota of a working node in a next flow control period, which is provided by a flow control node in a storage system;
703. according to the target flow quota and the priority of the data replication task, executing the data replication task between storage nodes associated with the data replication task in the system in the next flow control period; the target flow quota is distributed from the IO flow of the storage system by the flow control node according to the predicted first flow demand corresponding to the data read-write task in the next flow control period and the second flow demand corresponding to the data replication task in the next flow control period.
In an optional embodiment, according to the traffic quota and the priority of the data replication task, the performing, in a next flow control period, the data replication task between storage nodes associated with the data replication task in the system includes: predicting the flow demand of the data replication threads in the next flow control period according to the IO flow consumed by the data replication threads for executing the data replication tasks in the previous flow control period; distributing flow quota to a plurality of data replication threads according to the flow demand of the plurality of data replication threads in the next flow control period, wherein one data replication thread is responsible for at least one data replication task; and controlling a plurality of data replication threads to execute the data replication tasks among the storage nodes associated with the data replication tasks in the next flow control period according to the allocated flow quota and the priority of the responsible data replication tasks.
In an optional embodiment, determining the priority of the data replication task according to the data loss state corresponding to the data replication task to be executed includes: receiving a data replication task distributed by a main node in a system, wherein attribute information of the data replication task comprises a data loss state corresponding to the data replication task; and determining the priority of the data replication task according to the data loss state corresponding to the data replication task, and distributing the data replication task to one data replication thread in the plurality of data replication threads.
In an optional embodiment, the controlling the plurality of data replication threads to execute the data replication task between storage nodes associated with the data replication task in a next streaming control period according to the allocated traffic quota and the priority of the responsible data replication task includes: for any data copying thread, when the next flow control period is reached, judging whether a data copying task exists in at least one priority queue corresponding to the data copying thread, wherein the priority queue is used for storing the data copying task which is not executed due to insufficient available amount of a target flow quota of the previous flow control period; if the target data replication task exists, scheduling the data replication task in at least one priority queue according to a set priority scheduling strategy, and controlling the data replication thread to execute the target data replication task among storage nodes associated with the target data replication task when the available amount of the target flow quota of the current flow control period is larger than the IO flow required by the target data replication task which is currently scheduled.
In an alternative embodiment, the method further comprises at least one of the following operations:
under the condition of executing the target data replication task, updating the available amount of the target flow quota of the current flow control period according to the IO flow required by the target data replication task;
under the condition that a data replication task exists in at least one priority queue, if a new data replication task is received in the scheduling process, adding the new data replication task into the corresponding priority queue according to the priority of the new data replication task to wait for scheduling;
and under the condition that no data replication task exists in at least one priority queue, according to the sequence of the received data replication tasks, when the available amount of the target flow quota of the current flow control period is larger than the IO flow required by the latest received data replication task, controlling the data replication thread to execute the latest received data replication task.
The detailed implementation and the beneficial effects of each step in the method of this embodiment have been described in the foregoing embodiments, and will not be described in detail herein.
It should be noted that some of the above embodiments and the flows described in the drawings include a plurality of operations that appear in a specific order, but it should be clearly understood that these operations may be performed out of the order in which they appear herein or in parallel. Sequence numbers such as 601 and 602 are merely used to distinguish the operations and do not by themselves represent any execution order. In addition, the flows may include more or fewer operations, and those operations may be performed sequentially or in parallel. It should also be noted that the descriptions of "first" and "second" herein are used to distinguish different messages, devices, modules, and the like; they do not represent a sequence, nor do they limit the "first" and the "second" to being of different types.
In the above embodiments, the flow control method provided in the embodiments of the present application is described in detail by taking a storage system as an example, but the method may also be applied to other systems or scenarios with flow control requirements, such as service clusters. A service cluster comprises a plurality of service nodes that provide corresponding services externally; a service node may be a computing node providing computing services, a management node providing management services, a load balancing node providing load balancing services, a gateway node providing gateway services, and so on. Whichever services they provide, these service nodes may be deployed in a distributed manner on physical devices with network and computing resources, which may be conventional servers, cloud servers, or various terminal devices.
In this embodiment, the service cluster may provide a corresponding service for at least one client. For example, a client may initiate a service request to a service node in the service cluster, and the service node executes a first task according to that request, providing the corresponding service to the client that initiated it. Taking a computing node as an example, the computing node may perform computing tasks to provide computing services to the requesting clients. The service cluster further includes a working node that is responsible for performing a second task between the service nodes. The association between the first task and the second task is not limited in this embodiment: the two tasks may be unrelated, or the second task may be associated with the first task, for example by providing a service for it. If the first task is a computing task, the second task may be a data migration task that migrates data from a first computing node to a second computing node, freeing storage space on the first computing node so that it can better execute its computing tasks. The first task and the second task here are merely examples; those skilled in the art will understand that their implementations may differ for service clusters providing different services.
When the first task and the second task exist simultaneously, both occupy the IO traffic of the service nodes, but the IO traffic that the service nodes and the service cluster can support is limited. To use this IO traffic more reasonably, the performance of the first task should be guaranteed preferentially and the interference of the second task with the first task reduced; at the same time, when the IO traffic occupied by the first task is low, the IO traffic allotted to the second task should be raised appropriately so that the second task completes as soon as possible, which safeguards the reliability of the service cluster and effectively improves the utilization of traffic resources.
In view of this, the service cluster of this embodiment includes a flow control node in addition to the service nodes and the working node. With the flow control node, the service nodes, and the working node cooperating, the IO traffic resources of the service cluster can be reasonably allocated between the first task and the second task. The process by which the flow control node performs flow control for the service cluster is shown in fig. 7b and includes the following steps, with an illustrative sketch after the list:
71. predicting a first flow demand corresponding to the first task in the next flow control period according to the IO traffic consumed by at least one client executing the first task on the service nodes in at least one prior flow control period, wherein a prior flow control period is a flow control period before the next flow control period;
72. predicting a second flow demand corresponding to the second task in the next flow control period according to the IO traffic consumed by at least one working node executing the second task between the service nodes in at least one prior flow control period;
73. allocating, from the IO traffic of the service cluster, a target flow quota for each working node in the next flow control period according to the first flow demand and the second flow demand;
74. providing the respective target flow quotas to the at least one working node, so that the at least one working node can execute the second task in the next flow control period according to the respective target flow quota.
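A hedged end-to-end sketch of this loop, reusing `predict_demand` and `allocate_thread_quotas` from the earlier sketch (applied here to working nodes rather than threads); the reserve-then-share policy in step 73 is an assumed simplification of the weighted scheme described elsewhere for the storage system:

```python
def flow_control_tick(cluster_io, client_history, worker_history):
    """cluster_io: total IO traffic the service cluster can support per period.
    client_history / worker_history: {entity_id: [IO consumed per prior period]}."""
    # Steps 71/72: predict next-period demands from prior consumption.
    first_demand = sum(predict_demand(h) for h in client_history.values())
    second_demand = sum(predict_demand(h) for h in worker_history.values())
    # Step 73: guarantee the first task's demand, then grant the second task a
    # quota from whatever IO traffic remains (one possible policy).
    global_quota = min(second_demand, max(cluster_io - first_demand, 0.0))
    # Step 74: split the global quota among working nodes by predicted demand.
    return allocate_thread_quotas(global_quota, worker_history)
```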
For detailed descriptions of the working node and the flow control node, reference may be made to the foregoing embodiments, and the definition of the IO traffic of the service cluster likewise follows the definition of the IO traffic of the storage system given above. The detailed implementation of each of the above operations has also been described in the foregoing embodiments and is not repeated here.
Fig. 8 is a schematic structural diagram of another flow control node according to an embodiment of the present application. As shown in fig. 8, the flow control node includes: a memory 81 and a processor 82.
The memory 81 is used for storing a computer program and may be configured to store various other data to support operations on the flow control node. Examples of such data include instructions for any application or method operating on the flow control node, messages, pictures, videos, and the like.
The processor 82 is coupled to the memory 81 for executing the computer program in the memory 81 for: predicting a first flow demand corresponding to the data read-write task in the next flow control period according to IO flow consumed by at least one client for executing the data read-write task on a storage node in at least one previous flow control period, wherein the previous flow control period is a flow control period before the next flow control period; predicting a second flow demand corresponding to the data replication task in the next flow control period according to IO flow consumed by at least one working node for executing the data replication task between storage nodes in at least one previous flow control period; distributing a target flow quota for each working node in the next flow control period from the IO flow of the storage system according to the first flow demand and the second flow demand; and providing the respective target flow quotas to the at least one working node, so that the at least one working node can execute the data replication task in the next flow control period according to the respective target flow quota.
In an alternative embodiment, processor 82 is specifically configured to, when allocating the target flow quota for each of the at least one working node in the next flow control period: determining a global flow quota for the data replication task in the next flow control period from the IO flow of the storage system according to the first flow demand and the second flow demand; and allocating the global flow quota to the at least one working node to obtain the target flow quota of each of the at least one working node in the next flow control period.
In an alternative embodiment, processor 82, when determining the global flow quota, is specifically configured to: according to the weight ratio between the data read-write task and the data replication task in the next flow control period, the IO flow of the storage system is distributed between the data read-write task and the data replication task, so that a first initial flow quota and a second initial flow quota are obtained; if the first initial flow quota is greater than the first flow requirement and the second initial flow quota is less than the second flow requirement, continuing to distribute the flow quota for the data replication task from the flow difference between the first initial flow quota and the first flow requirement so as to obtain the global flow quota for the data replication task in the next flow control period.
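A minimal sketch of this two-stage determination, assuming numeric weights for the two task types (how the weight ratio is obtained is not specified above):

```python
def global_replication_quota(total_io, w_rw, w_rep, first_demand, second_demand):
    """Split the system's IO flow by weight, then top up the replication share
    from any surplus of the read-write share. Illustrative only."""
    first_initial = total_io * w_rw / (w_rw + w_rep)
    second_initial = total_io * w_rep / (w_rw + w_rep)
    if first_initial > first_demand and second_initial < second_demand:
        surplus = first_initial - first_demand
        # Continue allocating to the replication task from the surplus, capped
        # at what the replication task actually needs.
        second_initial += min(surplus, second_demand - second_initial)
    return second_initial  # global flow quota for the data replication task
```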
In an alternative embodiment, processor 82, when assigning a global traffic quota to at least one working node, is specifically configured to: predicting the flow demand of at least one working node in the next flow control period according to the IO flow consumed by the at least one working node in the at least one previous flow control period by executing the data replication task between the storage nodes; and distributing the global flow quota to the at least one working node according to the flow demand of the at least one working node in the next flow control period so as to obtain the target flow quota of each working node in the next flow control period.
In an alternative embodiment, processor 82 is specifically configured to, when allocating the global traffic quota to the at least one working node according to the traffic demand of the at least one working node in the next flow control period: determining a weight ratio between the at least one working node according to the flow demand of the at least one working node in the next flow control period; distributing the global flow quota among the at least one working node according to the weight ratio among the at least one working node so as to obtain the flow quota of the at least one working node; and if a first working node with a flow quota larger than its corresponding flow requirement and a second working node with a flow quota smaller than its corresponding flow requirement exist, continuing to distribute the portion of the first working node's flow quota that exceeds its flow requirement among the second working nodes according to the weight ratio among the second working nodes, until the flow quota distributed to each working node is larger than its corresponding flow requirement or no working node with a flow quota larger than its corresponding flow requirement exists.
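The redistribution rule just described — cap each over-served node at its own demand and hand the excess, weighted, to nodes still below their demand — can be sketched as an iterative loop. Everything here is an assumed illustration, including the use of each node's demand share as its weight:

```python
def distribute_by_weight(global_quota, demands):
    """demands: {node_id: predicted flow demand}. Returns {node_id: quota}."""
    quotas = {n: 0.0 for n in demands}
    remaining = global_quota
    needy = {n for n, d in demands.items() if d > 0}
    while remaining > 1e-9 and needy:
        total_w = sum(demands[n] for n in needy)  # weight = share of demand
        for n in needy:
            quotas[n] += remaining * demands[n] / total_w
        remaining = 0.0
        satisfied = set()
        for n in needy:
            excess = quotas[n] - demands[n]
            if excess >= 0:
                quotas[n] = demands[n]  # cap at the node's own demand
                remaining += excess     # excess goes back for redistribution
                satisfied.add(n)
        needy -= satisfied
    return quotas  # any leftover quota simply stays unallocated in this sketch
```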
In an alternative embodiment, processor 82 is specifically configured to, when predicting a first flow demand: predicting the flow demand of at least one client in the next flow control period according to the IO flow consumed by the at least one client in the at least one previous flow control period by executing the data read-write task on the storage node; and generating a first flow demand according to the flow demand of at least one client in the next flow control period. Accordingly, the processor 82, when predicting the second flow demand, is specifically configured to: predicting the flow demand of at least one working node in the next flow control period according to the IO flow consumed by the at least one working node in the at least one previous flow control period by executing the data replication task between the storage nodes; and generating a second flow demand according to the flow demand of at least one working node in the next flow control period.
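No prediction formula is fixed by these embodiments. One hedged choice would be exponential smoothing over each entity's consumption history, aggregated per client into the first flow demand (and analogously per working node into the second):

```python
def ewma_predict(consumed_history, alpha=0.5):
    """Exponentially weighted estimate of next-period IO flow from the IO
    consumed in prior periods (oldest first). One possible predictor only."""
    if not consumed_history:
        return 0.0
    estimate = consumed_history[0]
    for consumed in consumed_history[1:]:
        estimate = alpha * consumed + (1 - alpha) * estimate
    return estimate

def first_flow_demand(client_history):
    # Sum per-client predictions to obtain the first flow demand; the second
    # flow demand would aggregate per-working-node predictions the same way.
    return sum(ewma_predict(h) for h in client_history.values())
```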
Further, as shown in fig. 8, the flow control node further includes: a communication component 83, a display 84, a power component 85, an audio component 86, and other components. Only some components are shown schematically in fig. 8, which does not mean that the flow control node comprises only those components. In addition, the components within the dashed box in fig. 8 are optional rather than mandatory, depending on the product form of the flow control node. The flow control node in this embodiment may be implemented as a terminal device such as a desktop computer, a notebook computer, a smartphone, or an IoT device, or as a server-side device such as a conventional server, a cloud server, or a server array. If implemented as a terminal device, the flow control node may include the components within the dashed box in fig. 8; if implemented as a server-side device, it may omit them.
The embodiment of the present application also provides another flow control node with the same or similar structure as the flow control node shown in fig. 8; the difference lies in the functions implemented by the processor when executing the computer program stored in the memory. The processor in the flow control node of this embodiment executes the computer program stored in the memory to implement: predicting a first flow demand corresponding to a first task in a next flow control period according to the IO traffic consumed by at least one client executing the first task on a service node in at least one prior flow control period, wherein a prior flow control period is a flow control period before the next flow control period; predicting a second flow demand corresponding to a second task in the next flow control period according to the IO traffic consumed by at least one working node executing the second task between service nodes in at least one prior flow control period; allocating, from the IO traffic of the service cluster, a target flow quota for each working node in the next flow control period according to the first flow demand and the second flow demand; and providing the respective target flow quotas to the at least one working node, so that the at least one working node can execute the second task in the next flow control period according to the respective target flow quota. The foregoing embodiments may be referred to for a detailed description of each operation, which is not repeated herein.
Fig. 9 is a schematic structural diagram of another working node according to an embodiment of the present application. As shown in fig. 9, the working node includes: a memory 91 and a processor 92.
The memory 91 is used for storing computer programs and may be configured to store various other data to support operations on the working node. Examples of such data include instructions, messages, pictures, videos, etc. for any application or method operating on the working node.
The processor 92 is coupled to the memory 91 for executing the computer program in the memory 91 for: determining the priority of the data replication task according to the data loss state corresponding to the data replication task to be executed; receiving a target flow quota of a working node in a next flow control period, which is provided by a flow control node in a storage system; according to the target flow quota and the priority of the data replication task, executing the data replication task between storage nodes associated with the data replication task in the system in the next flow control period; the target flow quota is distributed from the IO flow of the storage system by the flow control node according to the predicted first flow demand corresponding to the data read-write task in the next flow control period and the second flow demand corresponding to the data replication task in the next flow control period.
In an alternative embodiment, processor 92 is specifically configured to, when executing the data replication task between storage nodes associated with the data replication task in the system in the next flow control period according to the traffic quota and the priority of the data replication task: predicting the flow demand of a plurality of data replication threads in the next flow control period according to the IO flow consumed by the data replication threads in executing data replication tasks in the previous flow control period; distributing the flow quota to the plurality of data replication threads according to their flow demands in the next flow control period, wherein one data replication thread is responsible for at least one data replication task; and controlling the plurality of data replication threads to execute the data replication tasks between the storage nodes associated with those tasks in the next flow control period according to the allocated flow quotas and the priorities of the responsible data replication tasks.
In an alternative embodiment, the processor 92 is specifically configured to, when determining the priority of the data replication task according to the data loss state corresponding to the data replication task to be performed: receiving a data replication task distributed by a main node in a system, wherein attribute information of the data replication task comprises a data loss state corresponding to the data replication task; and determining the priority of the data replication task according to the data loss state corresponding to the data replication task, and distributing the data replication task to one data replication thread in the plurality of data replication threads.
In an alternative embodiment, the processor 92 is specifically configured to, when controlling the plurality of data replication threads to execute the data replication task between storage nodes associated with the data replication task in the next flow control period according to the allocated traffic quota and the priority of the responsible data replication task: for any data replication thread, when the next flow control period arrives, judging whether a data replication task exists in at least one priority queue corresponding to the data replication thread, wherein the priority queue is used for storing data replication tasks that were not executed because the available amount of the target flow quota of the previous flow control period was insufficient; and if such a task exists, scheduling the data replication tasks in the at least one priority queue according to a set priority scheduling strategy, and controlling the data replication thread to execute the currently scheduled target data replication task between the storage nodes associated with it when the available amount of the target flow quota of the current flow control period is larger than the IO flow required by that task.
In an alternative embodiment, processor 92 is further configured to perform at least one of the following:
under the condition of executing the target data replication task, updating the available amount of the target flow quota of the current flow control period according to the IO flow required by the target data replication task;
under the condition that a data replication task exists in at least one priority queue, if a new data replication task is received during scheduling, adding the new data replication task to the corresponding priority queue according to its priority to wait for scheduling;
and under the condition that no data replication task exists in at least one priority queue, controlling the data replication thread, in the order in which data replication tasks are received, to execute the most recently received data replication task when the available amount of the target flow quota of the current flow control period is larger than the IO flow that task requires.
Further, as shown in fig. 9, the working node further includes: a communication component 93, a display 94, a power component 95, an audio component 96, and other components. Only some components are shown schematically in fig. 9, which does not mean that the working node comprises only those components. In addition, the components within the dashed box in fig. 9 are optional rather than mandatory, depending on the product form of the working node. The working node of this embodiment may be implemented as a terminal device such as a desktop computer, a notebook computer, a smartphone, or an IoT device, or as a server-side device such as a conventional server, a cloud server, or a server array. If implemented as a terminal device, the working node may include the components within the dashed box in fig. 9; if implemented as a server-side device, it may omit them.
Accordingly, embodiments of the present application also provide a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method embodiments shown in fig. 7a, fig. 7b or fig. 8.
Accordingly, embodiments of the present application also provide a computer program product comprising computer programs/instructions which, when executed by a processor, cause the processor to carry out the steps of the method embodiments shown in fig. 7a, 7b or 8.
The memory may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as SRAM, EEPROM, EPROM, PROM, ROM, magnetic memory, flash memory, magnetic or optical disk.
The communication component is configured to facilitate wired or wireless communication between the device in which it is located and other devices. That device can access a wireless network based on a communication standard, such as WiFi, a mobile communication network of 2G, 3G, 4G/LTE, or 5G, or a combination thereof. In one exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component further includes a near field communication (Near Field Communication, NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on radio frequency identification (Radio Frequency Identification, RFID) technology, infrared data association (Infrared Data Association, IrDA) technology, ultra wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The display includes a screen, which may include a liquid crystal display (Liquid Crystal Display, LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or sliding action, but also the duration and pressure associated with the touch or sliding operation.
The power supply component provides power for various components of equipment where the power supply component is located. The power components may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the devices in which the power components are located.
The audio component described above may be configured to output and/or input an audio signal. For example, the audio component includes a Microphone (MIC) configured to receive external audio signals when the device in which the audio component is located is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may be further stored in a memory or transmitted via a communication component. In some embodiments, the audio assembly further comprises a speaker for outputting audio signals.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, magnetic disk storage, CD-ROM (Compact Disc Read-Only Memory), optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (Central Processing Unit, CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (Random Access Memory, RAM) and/or nonvolatile memory, etc., such as Read Only Memory (ROM) or flash RAM. Memory is an example of computer-readable media.
Computer readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (Phase-change Random Access Memory, PRAM), static random access memory (SRAM), dynamic random access memory (Dynamic Random Access Memory, DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (Digital Video Disc, DVD) or other optical storage, magnetic cassettes, magnetic tape disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by the computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (16)

1. A flow control method for a flow control node in a storage system, the method comprising:
predicting a first flow demand corresponding to a data read-write task in a next flow control period according to IO flow consumed by at least one client for executing the data read-write task on a storage node in at least one previous flow control period, wherein the previous flow control period is a flow control period before the next flow control period;
predicting a second flow demand corresponding to the data replication task in the next flow control period according to IO flow consumed by at least one working node for executing the data replication task between storage nodes in at least one previous flow control period;
distributing a target flow quota of each working node in a next flow control period from the IO flow of the storage system according to the first flow demand and the second flow demand;
and providing the respective target flow quotas to the at least one working node, so that the at least one working node can execute the data replication task in the next flow control period according to the respective target flow quota.
2. The method of claim 1, wherein distributing the target flow quota of each working node in the next flow control period from the IO flow of the storage system according to the first flow demand and the second flow demand comprises:
according to the first flow demand and the second flow demand, determining a global flow quota for a data replication task in a next flow control period from IO (input/output) flow of the storage system;
and distributing the global flow quota to the at least one working node to obtain a target flow quota of each working node in a next flow control period.
3. The method of claim 2, wherein determining the global flow quota for the data replication task in the next flow control period from the IO flow of the storage system according to the first flow demand and the second flow demand comprises:
distributing the IO flow of the storage system between the data read-write task and the data replication task according to the weight ratio between the data read-write task and the data replication task in the next flow control period so as to obtain a first initial flow quota and a second initial flow quota;
if the first initial flow quota is greater than the first flow demand and the second initial flow quota is less than the second flow demand, continuing to distribute the flow quota for the data replication task from the flow difference between the first initial flow quota and the first flow demand so as to obtain the global flow quota for the data replication task in the next flow control period.
4. The method of claim 2, wherein distributing the global flow quota to the at least one working node to obtain the target flow quota of each working node in the next flow control period comprises:
predicting the flow demand of at least one working node in the next flow control period according to IO flow consumed by the at least one working node in the at least one previous flow control period by executing a data replication task between storage nodes;
and distributing the global flow quota to the at least one working node according to the flow demand of the at least one working node in the next flow control period so as to obtain the target flow quota of each working node in the next flow control period.
5. The method of claim 4, wherein distributing the global flow quota to the at least one working node according to the flow demand of the at least one working node in the next flow control period to obtain the target flow quota of each working node in the next flow control period comprises:
determining a weight ratio between the at least one working node according to the flow demand of the at least one working node in the next flow control period;
distributing the global flow quota among the at least one working node according to the weight ratio among the at least one working node so as to obtain the flow quota of the at least one working node;
if a first working node with a flow quota larger than the corresponding flow requirement and a second working node with a flow quota smaller than the corresponding flow requirement exist, continuing to distribute the portion of the first working node's flow quota that exceeds the corresponding flow requirement among the second working nodes according to the weight ratio among the second working nodes, until the flow quota distributed to each working node is larger than the corresponding flow requirement or no working node with a flow quota larger than the corresponding flow requirement exists.
6. The method according to any one of claims 1-5, wherein predicting the first traffic demand corresponding to the data read-write task in the next streaming control period according to the IO traffic consumed by the at least one client to perform the data read-write task on the storage node in the at least one previous streaming control period comprises: predicting the flow demand of at least one client in the next flow control period according to IO flow consumed by the at least one client in the at least one previous flow control period by executing a data read-write task on a storage node; generating the first flow demand according to the flow demand of the at least one client in the next flow control period;
accordingly, predicting, according to the IO traffic consumed by at least one working node to perform the data replication task between storage nodes in at least one previous flow control period, a second traffic demand corresponding to the data replication task in a next flow control period, including: predicting the flow demand of at least one working node in the next flow control period according to IO flow consumed by the at least one working node in the at least one previous flow control period by executing a data replication task between storage nodes; and generating the second flow demand according to the flow demand of the at least one working node in the next flow control period.
7. A method of data replication for any working node in a storage system, the method comprising:
determining the priority of a data copying task according to the data loss state corresponding to the data copying task to be executed;
receiving a target flow quota of the working node in a next flow control period, which is provided by a flow control node in the system;
executing the data replication task between storage nodes associated with the data replication task in the system in the next flow control period according to the target flow quota and the priority of the data replication task;
the target flow quota is allocated from the IO flow of the storage system by the flow control node according to a first flow requirement corresponding to a data read-write task in a predicted next flow control period and a second flow requirement corresponding to a data replication task in the next flow control period.
8. The method of claim 7, wherein executing the data replication task between storage nodes associated with the data replication task in the system in the next flow control period according to the traffic quota and the priority of the data replication task comprises:
predicting the flow demand of the data replication threads in the next flow control period according to the IO flow consumed by the data replication threads for executing the data replication tasks in the previous flow control period;
distributing the flow quota to a plurality of data replication threads according to the flow demand of the data replication threads in the next flow control period, wherein one data replication thread is responsible for at least one data replication task;
and controlling the plurality of data replication threads to execute the data replication tasks among storage nodes associated with the data replication tasks in the next flow control period according to the allocated flow quota and the priority of the responsible data replication tasks.
9. The method of claim 8, wherein determining the priority of the data replication task based on the data loss status corresponding to the data replication task to be performed comprises:
receiving a data replication task distributed by a main node in the system, wherein attribute information of the data replication task comprises a data loss state corresponding to the data replication task;
and determining the priority of the data replication task according to the data loss state corresponding to the data replication task, and distributing the data replication task to one data replication thread in the plurality of data replication threads.
10. The method of claim 8 or 9, wherein controlling the plurality of data replication threads to execute the data replication task between storage nodes associated with the data replication task in the next flow control period according to the allocated traffic quota and the priority of the responsible data replication task comprises:
for any data replication thread, when the next flow control period is reached, judging whether a data replication task exists in at least one priority queue corresponding to the data replication thread, wherein the priority queue is used for storing the data replication task which is not executed due to insufficient available amount of a target flow quota of the previous flow control period;
if the target data replication task exists, scheduling the data replication task in the at least one priority queue according to a set priority scheduling strategy, and controlling the data replication thread to execute the target data replication task between storage nodes associated with the target data replication task when the available amount of the target flow quota of the current flow control period is larger than the IO flow required by the target data replication task which is currently scheduled.
11. The method of claim 10, further comprising at least one of:
under the condition of executing the target data replication task, updating the available amount of the target flow quota of the current flow control period according to the IO flow required by the target data replication task;
if a new data replication task is received in the scheduling process under the condition that the data replication task exists in the at least one priority queue, adding the new data replication task into the corresponding priority queue according to the priority of the new data replication task to wait for scheduling;
and under the condition that no data replication task exists in the at least one priority queue, controlling the data replication thread to execute the latest received data replication task according to the sequence of the received data replication tasks when the available amount of the target flow quota of the current flow control period is larger than the IO flow required by the latest received data replication task.
12. A flow control method, applied to a flow control node in a service cluster, the method comprising:
according to IO flow consumed by at least one client-side executing a first task on a service node in at least one prior flow control period, predicting a first flow demand corresponding to the first task in a next flow control period, wherein the prior flow control period is a flow control period before the next flow control period;
predicting a second flow demand corresponding to a second task in a next flow control period according to IO flow consumed by at least one working node for executing the second task between service nodes in at least one previous flow control period;
distributing target flow quota of each working node in a next flow control period from IO flow of the service cluster according to the first flow requirement and the second flow requirement;
and providing the respective target flow quotas to the at least one working node, so that the at least one working node can execute the second task in the next flow control period according to the respective target flow quota.
13. A flow control node, comprising: a memory and a processor;
the memory is used for storing a computer program;
the processor is coupled to the memory for executing the computer program for performing the steps of the method of any of claims 1-6 and 12.
14. A working node, comprising: a memory and a processor;
the memory is used for storing a computer program;
the processor is coupled to the memory for executing the computer program for performing the steps in the method of any of claims 7-11.
15. A storage system, comprising: a plurality of storage nodes providing data storage services for at least one client, at least one working node as claimed in claim 14 for performing data replication tasks between the storage nodes, and a flow control node as claimed in claim 13.
16. A computer readable storage medium storing a computer program, which when executed by a processor causes the processor to carry out the steps of any one of the methods of claims 1-6, claims 7-11 and claim 12.
CN202310164756.9A 2023-02-10 2023-02-10 Flow control and data replication method, node, system and storage medium Pending CN116302406A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310164756.9A CN116302406A (en) 2023-02-10 2023-02-10 Flow control and data replication method, node, system and storage medium
PCT/CN2024/074833 WO2024164894A1 (en) 2023-02-10 2024-01-31 Method for traffic control and data replication, node, system, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310164756.9A CN116302406A (en) 2023-02-10 2023-02-10 Flow control and data replication method, node, system and storage medium

Publications (1)

Publication Number Publication Date
CN116302406A true CN116302406A (en) 2023-06-23

Family

ID=86816141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310164756.9A Pending CN116302406A (en) 2023-02-10 2023-02-10 Flow control and data replication method, node, system and storage medium

Country Status (2)

Country Link
CN (1) CN116302406A (en)
WO (1) WO2024164894A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024164894A1 (en) * 2023-02-10 2024-08-15 杭州阿里云飞天信息技术有限公司 Method for traffic control and data replication, node, system, and storage medium
CN119065849A (en) * 2024-08-26 2024-12-03 北京火山引擎科技有限公司 Traffic scheduling method and device for distributed storage cluster

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119201441B (en) * 2024-09-04 2025-07-29 无锡众星微系统技术有限公司 Back-end flow control method, device, computer equipment and readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9983958B2 (en) * 2014-07-25 2018-05-29 Netapp, Inc. Techniques for dynamically controlling resources based on service level objectives
CN108241535B (en) * 2016-12-27 2022-02-22 阿里巴巴集团控股有限公司 Resource management method and device and server equipment
CN113301075B (en) * 2020-05-18 2022-09-13 阿里巴巴集团控股有限公司 Flow control method, distributed system, device and storage medium
CN114666284B (en) * 2022-05-23 2022-11-15 阿里巴巴(中国)有限公司 Flow control method and device, electronic equipment and readable storage medium
CN116302406A (en) * 2023-02-10 2023-06-23 阿里巴巴(中国)有限公司 Flow control and data replication method, node, system and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024164894A1 (en) * 2023-02-10 2024-08-15 杭州阿里云飞天信息技术有限公司 Method for traffic control and data replication, node, system, and storage medium
CN119065849A (en) * 2024-08-26 2024-12-03 北京火山引擎科技有限公司 Traffic scheduling method and device for distributed storage cluster
CN119065849B (en) * 2024-08-26 2025-03-14 北京火山引擎科技有限公司 Traffic scheduling method and device for distributed storage cluster

Also Published As

Publication number Publication date
WO2024164894A1 (en) 2024-08-15

Similar Documents

Publication Publication Date Title
CN110858161B (en) Resource allocation method, device, system, equipment and medium
CN116302406A (en) Flow control and data replication method, node, system and storage medium
CN116170316B (en) Network system, instance management method, device and storage medium
CN116170317B (en) Network system, service provision and resource scheduling method, device and storage medium
CN107580023B (en) A stream processing job scheduling method and system for dynamically adjusting task allocation
US11140096B2 (en) Optimizing fog orchestration through edge compute resource reservation
US20120198200A1 (en) Method and apparatus of memory overload control
WO2022161430A1 (en) Edge cloud system, edge management and control method, management and control node, and storage medium
KR102180451B1 (en) Adaptive data synchronization
CN107104992A (en) The storage resource distribution method and device of a kind of video cloud storage
CN111800285B (en) Instance migration method and device and electronic equipment
CN110912972B (en) Service processing method, system, electronic equipment and readable storage medium
CN108829352A (en) A kind of user's quota method and system of distributed memory system
CN110908774B (en) Resource scheduling method, equipment, system and storage medium
WO2022057001A1 (en) Device management method and system, and management cluster
CN112199193A (en) Resource scheduling method and device, electronic equipment and storage medium
CN110008050B (en) Method and device for processing information
EP4557095A1 (en) Internet of vehicles platform expansion and contraction method and system, and storage medium
CN111694517B (en) Distributed data migration method, system and electronic equipment
CN114296891A (en) Task scheduling method, system, computing device, storage medium and program product
CN103561092B (en) Method and device for managing resources under private cloud environment
CN114003377A (en) A memory fusing method, device, device and readable medium based on ES service
US20140141793A1 (en) System and method to manage qos in a bandwidth-constrained network (cellular) by sending tower-initiated policy changes to individual phones
CN113301076B (en) Flow control method, distributed system, device and storage medium
CN117499490A (en) Multi-cluster-based network scheduling method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination