WO2024239992A1 - Data synchronization method for distributed cluster and related device therefor - Google Patents

Data synchronization method for distributed cluster and related device therefor

Info

Publication number
WO2024239992A1
WO2024239992A1 PCT/CN2024/092720 CN2024092720W WO2024239992A1 WO 2024239992 A1 WO2024239992 A1 WO 2024239992A1 CN 2024092720 W CN2024092720 W CN 2024092720W WO 2024239992 A1 WO2024239992 A1 WO 2024239992A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage management
management unit
data
data operation
slave
Prior art date
Application number
PCT/CN2024/092720
Other languages
French (fr)
Chinese (zh)
Inventor
秦世成
胡怡
崔力强
Original Assignee
阿里云计算有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里云计算有限公司
Publication of WO2024239992A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • the present application relates to the field of computer technology, and in particular to a data synchronization method for a distributed cluster and related equipment.
  • the product warehouse needs to be redundantly backed up at different sites and synchronized between different sites, so that when regional problems occur, master-slave switching can be performed to achieve high availability.
  • network delays caused by regional factors can also be solved through multi-site deployment.
  • product synchronization between different deployment environments: unverified products need to be verified in the development environment first, and are then synchronized to the stable production environment through the site synchronization function after passing verification.
  • the distributed cluster data synchronization method and related devices provided by the embodiments of the present invention at least solve the problem of service unavailability caused by shutdown migration between isolated different sites.
  • a distributed cluster data synchronization method comprising:
  • the data operation log is synchronized to a slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization based on the data operation log.
  • the method further comprises:
  • the metadata is updated based on the synchronization result of the data operation log.
  • generating a data operation log of a primary storage management unit in the storage management unit group within a synchronization time slice includes:
  • persisting data operation information of the cluster in the distributed cluster according to the execution order of the data operations, wherein the data operation information includes the operation object, the operation content, and the information of the storage management unit that performs the data operation;
  • a data operation log of the primary storage management unit in the synchronization time slice is obtained, wherein the data operation log is sorted according to the execution order of the data operation.
  • determining the synchronization time slice based on the master-slave state switching of the storage management unit group in the distributed cluster includes:
  • the master-slave state switching logic clock of the storage management unit group is obtained, and the synchronization time slice and the sequence number of the synchronization time slice are determined according to the master-slave state switching logic clock.
  • the method further comprises:
  • when the master-slave state of a storage management unit of a cluster in the distributed cluster switches, the cluster reports the master-slave state switching logical clock to the storage system, and the storage system sequentially increases the sequence number of the synchronization time slice according to the master-slave state switching logical clock and sends it to the distributed cluster.
  • the method further comprises:
  • the distributed cluster periodically obtains the sequence number of the synchronization time slice from the storage system.
  • synchronizing the data operation log to a slave storage management unit in the storage management unit group includes:
  • according to the index number of the data operation log, the index number of the submitted data operation log recorded in the metadata, and the sequence number of the synchronization time slice, the data operation log to be synchronized in the main storage management unit is determined, and the data operation log to be synchronized is synchronized to the slave storage management unit.
  • the data operation logs to be synchronized are determined and synchronized.
  • the storage management units in the master cluster in the distributed cluster are all master storage management units
  • the storage management units in the slave clusters in the distributed cluster are all slave storage management units
  • the master-slave switching of the storage management unit group is managed by a preset controller.
  • the preset controller sets the storage management unit with the identifier of the storage management unit as the main storage management unit by adding the identifier of the storage management unit to the gray list, so as to realize the master-slave state switching of the storage management units in the storage management unit group in the distributed cluster.
  • the primary storage management unit in the storage management unit group is determined by querying the preset controller, and the data is written into the primary storage management unit.
  • the method further comprises:
  • the data storage state of the storage management unit is used as a finite state machine, and a leader node and at least one follower node are respectively elected in the main storage management unit and the slave storage management unit.
  • the data operation log performs copy synchronization of the finite state machine within the storage management unit group.
  • a distributed cluster system comprises a plurality of clusters and a storage system, wherein the plurality of clusters comprises a master cluster and at least one slave cluster, and the storage system is connected to the plurality of clusters respectively;
  • the storage system is used to store the serial number of the synchronization time slice; the multiple clusters all perform data synchronization through the above-mentioned data synchronization method.
  • the distributed cluster system further includes a preset controller, which is used to control the master-slave state switching of the storage management unit and record the master-slave state of the storage management unit.
  • An electronic device comprises: a processor and a memory storing a program, wherein the program comprises instructions, and when the instructions are executed by the processor, the processor executes the above-mentioned data synchronization method.
  • a non-transitory machine-readable medium storing computer instructions, wherein the computer instructions are used to enable the computer to execute the above-mentioned data synchronization method.
  • the data synchronization method and related equipment of a distributed cluster determine the synchronization time slice by switching the master-slave state of a storage management unit group in the distributed cluster; generate a data operation log of the master storage management unit in the storage management unit group within the synchronization time slice; synchronize the data operation log to the slave storage management unit in the storage management unit group, so that the slave storage management unit synchronizes data based on the data operation log, thereby solving the problems of service unavailability and incremental data synchronization consistency caused by downtime migration between different isolated sites, realizing non-stop data migration between isolated sites, and realizing incremental data synchronization between isolated sites.
  • FIG. 1 is a flow chart of a distributed cluster data synchronization method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of generating a data operation log and metadata of a storage management unit according to an embodiment of the present invention.
  • FIG. 3 is a flow chart of a storage management unit sending a data operation log request according to an embodiment of the present invention.
  • FIG. 4 is a flow chart of a storage management unit receiving a data operation log request according to an embodiment of the present invention.
  • FIG. 5 is a flow chart of a storage management unit serving as a Master sending a data snapshot according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of synchronous control of multiple storage management units according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of synchronous task processing based on a distributed consistency algorithm according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of the structure of a distributed cluster according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of an electronic device according to this embodiment.
  • the artifact library is a repository for artifacts. Artifacts are the deliverable outputs of software delivery, usually in executable binary form, so the artifact library is often also called a binary artifact repository.
  • the artifact library provides a unique entry point for dependency resolution for microservice developers using various development languages.
  • in the build phase, it provides build tools in various languages with a single dependency resolution source and a unified artifact management library.
  • all test environment deployment tools pull artifacts that meet the test conditions from the artifact library for deployment.
  • the test result data is fed back to the artifact library and associated with the artifact.
  • the artifact is checked based on the quality level to see if it meets the deployment conditions. If it does, the deployment tool pulls the artifact from the artifact library to the environment for deployment.
  • the real-time data synchronization solution requires a unified synchronized clock between different devices to avoid data synchronization errors and ensure data rollback.
  • the product warehouse uses multiple regions for isolation. Since there is no synchronized clock between the isolated regions, the conventional real-time data synchronization method in related technologies cannot be used, and only the shutdown migration method can be used.
  • JFrog's repository replication data migration solution uploads products through HTTP calls: the system that already holds the products calls the interface of the system that does not yet hold them.
  • this method cannot guarantee the safety and stability of data synchronization under non-Byzantine faults. It only supports product migration, not migration of data in other formats, and it cannot isolate data across multiple regions.
  • the open source data migration solution of Rsync establishes an SSH connection with the target system for data transmission and calls the data interface on the remote system for data transmission.
  • This method requires manual migration and cannot automatically synchronize incremental data. It is only a simple file synchronization and cannot establish a connection between files and systems.
  • the distributed cluster system of this embodiment includes a master cluster and at least one slave cluster.
  • Each cluster includes multiple nodes, which are divided into multiple storage management units isolated from each other according to regions.
  • the storage management unit is also called a region.
  • the master storage management unit and the slave storage management units that synchronize data with each other are regarded as a storage management unit group. A storage management unit group contains exactly one master storage management unit and at least one slave storage management unit.
  • the master-slave status of the storage management units in the storage management unit group can be switched.
  • the regions in the master cluster are all in the master control role, that is, in the Master state
  • the regions in the slave cluster are all in the slave control role, that is, in the Slaver state.
  • the master cluster acts as the cluster that initiates data synchronization during the data synchronization process; the slave cluster acts as the cluster that receives data synchronization during the data synchronization process.
  • data synchronization may also be initiated by the slave cluster and received by the master cluster.
  • data is written to the region that serves as the master in the master cluster and then synchronized to the region that serves as the slave in the slave cluster.
  • data is written to the region that serves as the master in the slave cluster and then synchronized to the region that serves as the slave in the master cluster.
  • FIG. 1 is a flow chart of a distributed cluster data synchronization method according to an embodiment of the present invention. As shown in FIG. 1, the process includes the following steps:
  • Step S101 determining a synchronization time slice based on the master-slave state switching of a storage management unit group in a distributed cluster.
  • Step S102 Generate a data operation log of the primary storage management unit in the storage management unit group within the synchronization time slice.
  • Step S103 synchronize the data operation log to the slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization based on the data operation log.
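Steps S101 to S103 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Unit` class, the three function names, and the `("put" | "delete", key, value)` operation format are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical storage management unit holding key/value data."""
    role: str                                  # "Master" or "Slaver"
    data: dict = field(default_factory=dict)


def determine_time_slice(current_term: int, switched: bool) -> int:
    # S101: a master-slave state switch opens a new synchronization time slice.
    return current_term + 1 if switched else current_term


def generate_log(term: int, operations: list) -> list:
    # S102: tag each operation of the master with the time slice and an index,
    # preserving the execution order.
    return [{"term": term, "logIndex": i, "op": op}
            for i, op in enumerate(operations, start=1)]


def synchronize(log: list, slave: Unit) -> None:
    # S103: the slave replays the log in order to reach the master's state.
    for entry in log:
        kind, key, value = entry["op"]
        if kind == "put":
            slave.data[key] = value
        elif kind == "delete":
            slave.data.pop(key, None)


term = determine_time_slice(current_term=0, switched=True)
ops = [("put", "a.jar", "v1"), ("put", "b.jar", "v1"), ("delete", "a.jar", None)]
log = generate_log(term, ops)
slave = Unit(role="Slaver")
synchronize(log, slave)
```

After the replay, the slave holds only `b.jar`, which is the state the master reached by executing the same three operations.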
  • the synchronization time slice is determined, and the data storage state of the storage management unit within the synchronization time slice is synchronized based on the data operation log, which solves the problem of real-time data synchronization between isolated multiple storage management units.
  • This data synchronization method can not only realize real-time data synchronization of existing data, but also realize real-time data synchronization of incremental data, without downtime migration, and is also applicable to any type of data synchronization.
  • the method further includes: generating metadata of the master storage management unit so that when synchronizing the data operation log to the slave storage management unit in the storage management unit group, the data operation log to be synchronized in the data operation log is determined based on the metadata, wherein the metadata includes the synchronization status of the data operation log; based on the synchronization result of the data operation log, the metadata is updated.
  • the data storage state of the storage management unit is used as a finite state machine, so that data synchronization of the storage management unit can be achieved based on the replica synchronization mechanism of the finite state machine.
  • replica synchronization is performed based on a synchronization log.
  • the data operation log is used as a synchronization log for replica synchronization of the finite state machine.
  • the predefined data objects include cluster objects, node objects and finite state machines.
  • the cluster object includes the status of each cluster node.
  • the node object includes the metadata of the node, which is used to record the status information of the cluster during data synchronization, including the sequence number (term) of the synchronization time slice of the storage management unit, the index (logIndex) of the data operation log for marking the location information of the data operation log, the index number (snapshotIndex) of the last data operation log contained in the snapshot, the sequence number (snapshotTerm) of the synchronization time slice of the last data operation log contained in the snapshot, and the index number (preLogIndex) of the maximum log that has been applied to the slave.
  • the node object also includes node location information (endpoint), which is used to represent the coordinates of the node, which is generally the IP address of the terminal in a network environment.
  • the node object can also include the status of the master node that initiates the data synchronization cluster during data synchronization.
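The node metadata described above can be pictured as a small record type. The field names follow the identifiers given in the text (term, logIndex, snapshotIndex, snapshotTerm, preLogIndex, endpoint); the Python types and the sample values are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class NodeMetadata:
    term: int           # sequence number of the synchronization time slice
    logIndex: int       # index marking the position of the data operation log
    snapshotIndex: int  # index of the last data operation log contained in the snapshot
    snapshotTerm: int   # time slice of the last data operation log contained in the snapshot
    preLogIndex: int    # largest log index that has been applied to the slave
    endpoint: str       # node location, typically an IP address


# Sample values are invented for the example.
meta = NodeMetadata(term=3, logIndex=42, snapshotIndex=40,
                    snapshotTerm=2, preLogIndex=41, endpoint="10.0.0.5")
```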
  • the finite state machine includes the executed data object and the executed action (i.e., data operation).
  • the storage management unit with a master-slave relationship is regarded as a finite state machine.
  • the data storage status includes the existing (stock) data, as well as the add, delete, and modify operations performed on the existing data.
  • the stock data is usually stored in the form of data snapshots, and the data operations are recorded in the operation log. If the stock data of the storage management units in a master-slave relationship has already been synchronized, that is, the stock data of each storage management unit is the same, then the data operations on the stock data alone are sufficient to represent the data storage status of the storage management unit.
  • a finite state machine has a finite number of states.
  • the state machine starts at a given start state (such as an initial state, or a state with the same stock data).
  • Each input received by the state machine generates a new state and corresponding output through transition equations and output equations. This new state will remain until the next input arrives, and the generated output will be delivered to the corresponding recipient.
  • Multiple copies of the same state machine can be synchronized between multiple storage management units based on a replicated state machine. Multiple copies of the same state machine start with a start state, and receiving the same input in the same order will reach the same state that has generated the same output, i.e., a replicated state machine.
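The replicated state machine property can be demonstrated with a toy example. The `ArtifactFSM` class and its put/delete operations are hypothetical stand-ins for the storage state of a storage management unit; the point is only that identical inputs in identical order give identical states, while a different order may not.

```python
class ArtifactFSM:
    """Toy finite state machine whose state is a key/value mapping."""

    def __init__(self, snapshot=None):
        # All replicas start from the same start state (e.g. the same stock data).
        self.state = dict(snapshot or {})

    def apply(self, op):
        kind, key, value = op
        if kind == "put":
            self.state[key] = value
        elif kind == "delete":
            self.state.pop(key, None)


ops = [("put", "x", 1), ("put", "x", 2), ("delete", "y", None)]

master, slave, shuffled = ArtifactFSM(), ArtifactFSM(), ArtifactFSM()
for op in ops:
    master.apply(op)
    slave.apply(op)            # same inputs, same order
for op in reversed(ops):
    shuffled.apply(op)         # same inputs, different order

same_order_matches = master.state == slave.state
other_order_matches = master.state == shuffled.state
```

Here `same_order_matches` is true while `other_order_matches` is false, which is why the method persists data operations in execution order.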
  • the term of the synchronization time slice is used as the synchronization clock of all clusters in the distributed cluster.
  • the term of the synchronization time slice is strictly and sequentially incremented from a set value (e.g., 0).
  • the term of the synchronization time slice will be cached in a storage system with a high-performance storage medium.
  • the synchronization time slice of the finite state machine is updated by the master-slave state switching of any group of storage management units. Since the update of the sequence number of the synchronization time slice is marked by the master-slave state switching of the storage management unit, the state of the finite state machine of any storage management unit after the master-slave state switching falls into at least one synchronization time slice, that is, a corresponding relationship can be established between the finite state machine of the storage management unit and the sequence number of the synchronization time slice.
  • the cluster can determine, from the master-slave state switching logic clock, the sequence number of the synchronization time slice that the storage management unit enters. Since the sequence number of the synchronization time slice is also incremented by master-slave state switches of other storage management unit groups in the distributed cluster, the state change of the finite state machine of a given storage management unit may span multiple synchronization time slices and therefore correspond to multiple sequence numbers. In this embodiment, the data operation log and metadata generated in each synchronization time slice are marked with the corresponding synchronization time slice sequence number.
  • any cluster reports the master-slave state switching logic clock to the storage system, and the sequence number of the synchronization time slice in the storage system is incremented by one.
  • the storage system will sequentially increment the sequence number of the synchronization time slice according to the master-slave state switching logic clock and then send it to the distributed cluster.
  • the clusters in the distributed cluster will periodically pull the sequence number of the synchronization time slice to prevent the failure of sending the sequence number of the synchronization time slice.
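The push-plus-periodic-pull handling of the time slice sequence number can be sketched as below. `TermStore`, `Cluster`, and the `reachable` flag that simulates a lost push are hypothetical names introduced for this example; a real deployment would use a shared cache or storage service rather than in-process objects.

```python
import threading


class TermStore:
    """Stand-in for the storage system that holds the time slice sequence number."""

    def __init__(self, initial=0):
        self._term = initial
        self._lock = threading.Lock()
        self._clusters = []

    def subscribe(self, cluster):
        self._clusters.append(cluster)

    def report_switch(self):
        # A cluster reports a master-slave state switch: increment by one, then push.
        with self._lock:
            self._term += 1
            term = self._term
        for cluster in self._clusters:
            cluster.on_term_pushed(term)
        return term

    def current_term(self):
        with self._lock:
            return self._term


class Cluster:
    def __init__(self, store, reachable=True):
        self.store = store
        self.reachable = reachable   # False simulates a push that never arrives
        self.term = 0
        store.subscribe(self)

    def on_term_pushed(self, term):
        if self.reachable:
            self.term = max(self.term, term)

    def poll(self):
        # Periodic pull that compensates for a failed push.
        self.term = max(self.term, self.store.current_term())


store = TermStore()
a = Cluster(store)
b = Cluster(store, reachable=False)
store.report_switch()
store.report_switch()
missed = b.term      # b never received the pushes
b.poll()             # the periodic pull catches up
```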
  • the input data of the finite state machine is data operation information.
  • the data operation information not only needs to have the same operation content, but also needs to ensure the same execution order. Therefore, in any cluster of the distributed cluster, the data operation information of the cluster in the distributed cluster is persisted according to the execution order of the data operation, wherein the data operation information includes the operation object, the operation content, and the information of the storage management unit that executes the data operation.
  • in the cluster that receives data writes (usually the Master cluster), the data operation information is treated as the incremental operation log (pendingLog) generated by the data synchronization initiator, and the incremental operation log is then persisted.
  • the incremental operation logs are stored in the database in the order of execution.
  • the incremental operation logs executed by all storage management units are stored together in the order of execution.
  • the incremental operation logs can be further grouped and processed according to the storage management units. For example, as shown in Figure 2, a node for processing synchronization tasks is selected in the cluster that receives data writes to start a thread to batch read incremental operation logs from the database, and then divide them according to the storage management units.
  • each log is marked with the serial number of the synchronization time slice and the index number (logIndex) of the data operation log to generate a data operation log for data synchronization.
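The batch step that turns incremental operation logs into per-unit data operation logs can be sketched as follows. The function name, the dictionary layout of a pendingLog record, and the `last_index` bookkeeping (including starting from 0 for a unit with no history) are assumptions for illustration, not details given in the text.

```python
from collections import defaultdict


def build_data_operation_logs(pending_logs, term, last_index):
    """Group pending logs by storage management unit and tag each entry.

    pending_logs: records in execution order, each with 'unit', 'object', 'content'.
    last_index:   hypothetical map of unit -> last logIndex already assigned.
    """
    grouped = defaultdict(list)
    for op in pending_logs:
        unit = op["unit"]
        next_index = last_index.get(unit, 0) + 1   # increase by one per entry
        last_index[unit] = next_index
        grouped[unit].append({
            "term": term,              # sequence number of the synchronization time slice
            "logIndex": next_index,    # position of this entry within the unit's log
            "object": op["object"],
            "content": op["content"],
        })
    return dict(grouped)


pending = [
    {"unit": "region-a", "object": "a.jar", "content": "upload"},
    {"unit": "region-b", "object": "b.jar", "content": "upload"},
    {"unit": "region-a", "object": "a.jar", "content": "delete"},
]
last_index = {"region-a": 5}
logs = build_data_operation_logs(pending, term=3, last_index=last_index)
```

Because the input is read in execution order and each unit's entries are appended in that order, the relative order of operations within a unit is preserved.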
  • when the storage management unit generates a synchronization log for the first time, the data snapshot index number (snapshotIndex) in the metadata is a random value n (for example, it can be any integer between 1 and 100).
  • the metadata of the storage management unit also includes data used for data synchronization management, for example, the index number of the last data operation log of the storage management unit that has been applied from the node (lastAppliedIndex), and the index number of the data operation log that has been submitted by the storage management unit (committedIndex).
  • lastAppliedIndex and committedIndex are set to the same value n as snapshotIndex.
  • the serial number of the synchronization time slice in the metadata is set to the serial number of the current synchronization time slice.
  • the data operation log and metadata are persisted, and the data preparation work for data synchronization is completed.
  • if the current storage management unit is not generating a synchronization log for the first time, that is, historical data operation logs already exist (including data operation logs that have been synchronized and data operation logs that have not been synchronized), the index number of the newly generated data operation log is increased by one from the index number of the last data operation log generated previously.
  • the snapshotIndex, lastAppliedIndex, and committedIndex in the metadata are likewise increased by one from the previously generated snapshotIndex, lastAppliedIndex, and committedIndex.
  • the sequence number of the synchronization time slice is also set to the sequence number of the current synchronization time slice. Persisting the data operation log and metadata completes the data preparation work for data synchronization.
  • data synchronization is implemented based on replica synchronization of the finite state machine.
  • the cluster receiving the data write will determine the data operation log to be synchronized in the primary storage management unit according to the index number of the data operation log, the index number of the submitted data operation log recorded in the metadata, and the sequence number of the synchronization time slice, and synchronize the data operation log to be synchronized to the secondary storage management unit.
  • the main storage management unit selects a node for processing synchronization tasks, and the node processes the synchronization tasks and selects the storage management units that need to be synchronized.
  • the next log index number that needs to be synchronized is first obtained in the cache. If it does not exist, the index number of the last submitted log in the metadata is queried, and then the index number preLogIndex of the previous data operation log is set to the submitted index number minus one.
  • based on preLogIndex and snapshotIndex, it is determined whether to send a snapshot (the current data snapshot of the finite state machine). If preLogIndex is less than snapshotIndex, the stock data has not yet been migrated and a snapshot is sent. If preLogIndex is not less than snapshotIndex, the stock data has already been migrated, and a batch of data operation logs that need to be synchronized is taken out of the database and sent to the slave storage management unit for processing.
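The sender-side decision between a snapshot and a batch of logs can be sketched as below. The function name, the dictionary shapes, and the rule that the batch contains entries with logIndex greater than preLogIndex are assumptions; the text only states that preLogIndex is compared with snapshotIndex and that it falls back to the committed index minus one when the cache is empty.

```python
def plan_sync_request(next_index_cache, metadata, log_store, unit_id, batch_size=100):
    """Decide what the master sends next for one storage management unit.

    next_index_cache: hypothetical map unit -> next log index to synchronize.
    metadata:         dict with 'committedIndex' and 'snapshotIndex'.
    log_store:        data operation logs sorted by 'logIndex'.
    """
    next_log_index = next_index_cache.get(unit_id)
    if next_log_index is None:
        # Cache miss: fall back to the committed index recorded in the metadata.
        next_log_index = metadata["committedIndex"]
    pre_log_index = next_log_index - 1

    if pre_log_index < metadata["snapshotIndex"]:
        # Stock data has not been migrated yet: send the FSM snapshot first.
        return {"type": "snapshot", "unit": unit_id,
                "snapshotIndex": metadata["snapshotIndex"]}

    # Stock data already migrated: send the next batch of incremental logs.
    entries = [e for e in log_store if e["logIndex"] > pre_log_index][:batch_size]
    return {"type": "logs", "unit": unit_id,
            "preLogIndex": pre_log_index, "entries": entries}


metadata = {"committedIndex": 7, "snapshotIndex": 7}
log_store = [{"logIndex": i, "op": f"op-{i}"} for i in range(8, 12)]
first = plan_sync_request({}, metadata, log_store, "region-a")
later = plan_sync_request({"region-a": 9}, metadata, log_store, "region-a", batch_size=2)
```

With committedIndex equal to snapshotIndex, the first call always yields a snapshot request, since preLogIndex is one less than snapshotIndex.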
  • the process by which the slave storage management unit handles the data operation log is shown in FIG. 4; a node for processing data synchronization is likewise selected in the slave storage management unit.
  • the node is responsible for processing the data operation log.
  • when the sequence numbers of the synchronization time slices of the master storage management unit and the slave storage management unit are the same, the data operation logs to be synchronized are determined and synchronized. If the sequence numbers of the synchronization time slices are different, a retry is performed.
  • if the sequence number of the synchronization time slice of the received data operation log is inconsistent with the sequence number of the synchronization time slice of the current node, either the sequence number of the current node has not yet been updated or the synchronization request of the data operation log has expired. In this case, the node returns the sequence number of its current synchronization time slice to the main storage management unit. If the sequence numbers are consistent, the synchronization request of the data operation log is valid, and the current storage management unit is then checked to see whether it is in the slave role (Slaver) state in the cluster. If it is in the Slaver state, log verification is performed.
  • the state of the current FSM of the slave storage management unit is determined by the committedIndex of the committed data operation log and the sequence number of the synchronization time slice. If it can match the synchronization request, it only needs to directly apply the log sent by the main storage management unit, update the current FSM state and metadata information, and return the result to the main storage management unit. Otherwise, it is necessary to return the current FSM state to the main storage management unit in the hope that the main storage management unit will send a matching log in the next synchronization request.
  • the main storage management unit may not receive the return value from the slave storage management unit due to a response timeout. In this case, it will retry. If it receives the return value from the slave storage management unit, it will update the next synchronization request sent according to the state returned by the slave storage management unit.
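The receiver-side checks can be sketched as follows. The request and reply shapes, the matching rule (the request's preLogIndex must equal the slave's committedIndex), and the rejection returned when the unit is not in the Slaver state are assumptions made for the example; the text does not fix these details.

```python
def handle_sync_request(slave: dict, request: dict) -> dict:
    """Process one data operation log request on the slave side."""
    # 1. Time slice check: reply with the local sequence number on mismatch.
    if request["term"] != slave["term"]:
        return {"ok": False, "reason": "term_mismatch", "term": slave["term"]}

    # 2. Role check: only a unit in the Slaver state accepts synchronized logs.
    if slave["role"] != "Slaver":
        return {"ok": False, "reason": "not_slaver", "term": slave["term"]}

    # 3. Log verification: the request must continue from the slave's FSM state.
    if request["preLogIndex"] != slave["committedIndex"]:
        return {"ok": False, "reason": "log_mismatch", "term": slave["term"],
                "committedIndex": slave["committedIndex"]}

    # 4. Apply the entries in order, then update the metadata.
    for entry in request["entries"]:
        kind, key, value = entry["op"]
        if kind == "put":
            slave["fsm"][key] = value
        elif kind == "delete":
            slave["fsm"].pop(key, None)
        slave["committedIndex"] = entry["logIndex"]
    return {"ok": True, "term": slave["term"],
            "committedIndex": slave["committedIndex"]}


slave = {"term": 2, "role": "Slaver", "committedIndex": 10, "fsm": {}}
stale = handle_sync_request(slave, {"term": 1, "preLogIndex": 10, "entries": []})
gap = handle_sync_request(slave, {"term": 2, "preLogIndex": 12, "entries": []})
applied = handle_sync_request(slave, {
    "term": 2, "preLogIndex": 10,
    "entries": [{"logIndex": 11, "op": ("put", "a.jar", "v1")}],
})
```

On a rejection the master can use the returned `term` or `committedIndex` to build its next request; on a timeout it simply retries.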
  • the above synchronization method can be used for the synchronization of incremental data, and can also be used for the synchronization of stock data.
  • the main storage management unit continues to send the data operation log after updating the nextLogIndex.
  • the preLogIndex of the data operation log must be less than the randomly generated snapshotIndex, so the main storage management unit will send the snapshot to migrate the stock data.
  • Snapshot is a snapshot of the FSM of the current storage management unit generated by the main storage management unit. The snapshot stores the indexes of all data (such as products) and related data.
  • the snapshot sent by the main storage management unit can be the URL of the snapshot, and the actual generated snapshot will be uploaded to the file storage system, which can reduce the data size of the snapshot request.
  • the process of sending snapshots by the main storage management unit is shown in Figure 5.
  • the storage management unit whose stock data needs to be migrated is determined from the set of snapshots to be sent, and the FSM snapshot of that storage management unit is generated.
  • the installation status of the slave storage management unit is determined. If the synchronization time slice sequence number in the request does not match that of the slave storage management unit, or the slave storage management unit fails to install the snapshot, the install-snapshot request is added back to the set of snapshots to be sent so that it can be retried.
  • the storage management units in the master cluster in the distributed cluster are all master storage management units
  • the storage management units in the slave clusters in the distributed cluster are all slave storage management units
  • the master-slave switching of the storage management units is managed by a preset controller.
  • the preset controller sets the storage management unit with the identifier as the master storage management unit by adding the identifier of the storage management unit to the gray list, so as to realize the master-slave state switching of the storage management units in the distributed cluster.
  • the storage management units of cluster 1 (for example, the master cluster) are all in the Master state
  • the storage management units of cluster 2 (for example, the slave cluster) are all in the Slaver state, and the data flow direction is from cluster 1 to cluster 2. If a storage management unit is in the gray list, that storage management unit of cluster 2 is in the Master state, and its data flows from cluster 2 to cluster 1.
  • the finite state machine (FSM) of the master storage management unit and the slave storage management unit in the same storage management unit group always maintains a consistent state in the safe state.
  • the roles of different storage management units in different clusters are controlled to control the flow of data synchronization.
  • the data added by the user in any cluster will not be lost, and the new data will be synchronized in cluster 1 and cluster 2 in the form of synchronization logs.
  • the above-mentioned safe state means that the FSM states of the master storage management unit and the slave storage management unit in the same storage management unit group are consistent, and there is no data operation log that has been applied to the master storage management unit but not yet applied to the slave storage management unit.
  • the master storage management unit is determined by querying the preset controller, and the new data is written to the master storage management unit. For example, when a user needs to perform a data operation, the user can directly access cluster 1 or cluster 2; the user queries the identification of the storage management unit stored in the master-slave controller to confirm whether the access is the master node or the slave node.
  • a leader node and at least one follower node are respectively elected in the main storage management unit and the slave storage management unit, and the leader node in each storage management unit performs the copy synchronization of the finite state machine of the storage management unit with a master-slave relationship.
  • a leader node can be elected in each cluster or storage management unit to process the synchronization task; at the same time, multiple follower nodes are configured to be able to take over the leader node to continue to process the synchronization task after the leader node fails, ensuring that the data will not be lost and can be synchronized to the slave storage management unit.
  • if the leader node of the main storage management unit crashes or another abnormal situation occurs, the data in the main storage management unit has already been persisted and will not be lost.
  • After a new leader is elected from among the follower nodes, it can continue to synchronize data to the slave storage management unit according to the persisted log and metadata, so the slave storage management unit can still remain consistent with the main storage management unit.
  • the embodiment of the present invention, based on Raft log synchronization technology, can realize stock data migration and incremental data synchronization across multiple storage management units, is applicable to files of any format, and is not limited to Jforg artifact files.
  • Raft is used to ensure the security and reliability of data synchronization in the absence of Byzantine faults. All synchronized data is applied to the system through the FSM, so the association relationships between data can also be synchronized between different systems. Isolating data across multiple storage management units makes it possible to control the synchronization direction of each storage management unit between sites.
  • FIG8 is a schematic diagram of the distributed cluster system of this embodiment.
  • the system includes a cluster 81, at least one cluster 82, and a storage system 83.
  • the cluster 81 serves as a master control role
  • the cluster 82 serves as a slave control role
  • the storage system 83 is connected to the above-mentioned multiple clusters respectively.
  • the storage system 83 is used to store the serial numbers of the synchronization time slices; the clusters all perform data synchronization through the above-mentioned distributed cluster data synchronization method.
  • the cluster determines a synchronization time slice based on the master-slave state switching of the storage management unit group in the distributed cluster; generates a data operation log of the master storage management unit in the storage management unit group within the synchronization time slice; and synchronizes the data operation log to the slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization based on the data operation log.
  • the cluster generates metadata of the master storage management unit so that when synchronizing the data operation log to the slave storage management unit in the storage management unit group, the data operation log to be synchronized in the data operation log is determined based on the metadata, wherein the metadata includes the synchronization status of the data operation log; based on the synchronization result of the data operation log, the metadata is updated
  • the cluster persists data operation information of the cluster in the distributed cluster according to the execution order of the data operations, wherein the data operation information includes information of the operation object, operation content, and the storage management unit that performs the data operation; based on the data operation information of the cluster, the data operation log of the main storage management unit within the synchronization time slice is obtained, wherein the data operation log is sorted according to the execution order of the data operations.
  • the cluster obtains the master-slave state switching logic clock of the storage management unit group, and determines the synchronization time slice and the sequence number of the synchronization time slice according to the master-slave state switching logic clock.
  • the cluster when the master-slave state of the storage management unit of the cluster in the distributed cluster is switched, the cluster reports the master-slave state switching logical clock to the storage system, and the storage system sequentially increments the sequence number of the synchronization time slice according to the master-slave state switching logical clock and sends it to the distributed cluster.
  • the distributed cluster periodically obtains the sequence number of the synchronization time slice from the storage system.
  • the cluster determines the data operation log to be synchronized in the primary storage management unit based on the index number of the data operation log, the index number of the submitted data operation log recorded in the metadata, and the sequence number of the synchronization time slice, and synchronizes the data operation log to be synchronized to the secondary storage management unit.
  • the cluster determines the data operation logs to be synchronized and synchronizes the data operation logs to be synchronized when the sequence numbers of the synchronization time slices of the primary storage management unit and the secondary storage management unit are the same.
  • the distributed cluster system further includes a preset controller, which is used to control the master-slave state switching of the storage management unit and record the master-slave state of the storage management unit.
  • the storage management units in the master cluster in the distributed cluster are all master storage management units, and the storage management units in the slave clusters in the distributed cluster are all slave storage management units, and the master-slave switching of the storage management unit group is managed by a preset controller.
  • the preset controller sets the storage management unit with the identifier of the storage management unit as the main storage management unit by adding the identifier of the storage management unit to the gray list, so as to realize the master-slave state switching of the storage management units in the storage management unit group in the distributed cluster.
  • a primary storage management unit in the storage management unit group is determined by querying a preset controller, and the data is written into the primary storage management unit.
  • the data storage state of the storage management unit is used as a finite state machine.
  • a leader node and at least one follower node are respectively elected in the main storage management unit and the slave storage management unit.
  • the leader node synchronizes the copies of the finite state machine within the storage management unit group based on the data operation log.
  • An embodiment of the present invention further provides an electronic device, comprising: at least one processor; and a memory in communication with the at least one processor.
  • the memory stores a computer program executable by the at least one processor, and when the computer program is executed by the at least one processor, the electronic device executes the method of the embodiment of the present invention.
  • An embodiment of the present invention further provides a non-transitory machine-readable medium storing a computer program, wherein the computer program, when executed by a processor of a computer, is used to cause the computer to execute the method of the embodiment of the present invention.
  • An embodiment of the present invention further provides a computer program product, including a computer program, wherein the computer program, when executed by a processor of a computer, is used to enable the computer to execute the method of the embodiment of the present invention.
  • the electronic device is intended to represent various forms of digital electronic computer equipment, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • the electronic device can also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are merely examples, and are not intended to limit the implementations of the present invention described and/or claimed herein.
  • the electronic device includes a computing unit 901, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 902 or a computer program loaded from a storage unit 908 into a random access memory (RAM) 903.
  • various programs and data required for the operation of the electronic device can also be stored.
  • the computing unit 901, ROM 902, and RAM 903 are connected to each other via a bus 904.
  • An input/output (I/O) interface 905 is also connected to the bus 904.
  • the input unit 906 can be any type of device that can input information to the electronic device, and the input unit 906 can receive input digital or character information, and generate key signal input related to user settings and/or function control of the electronic device.
  • the output unit 907 can be any type of device that can present information, and can include but is not limited to a display, a speaker, a video/audio output terminal, a vibrator, and/or a printer.
  • the storage unit 908 can include but is not limited to a disk, an optical disk.
  • the communication unit 909 allows the electronic device to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks, and can include but is not limited to a modem, a network card, an infrared communication device, a wireless communication transceiver, and/or a chipset, such as a Bluetooth device, a WiFi device, a WiMax device, a cellular communication device, and/or the like.
  • the computing unit 901 may be a variety of general and/or special processing components with processing and computing capabilities. Some examples of the computing unit 901 include, but are not limited to, a CPU, a graphics processing unit (GPU), various special artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, digital signal processors (DSPs), and any appropriate processors, controllers, microcontrollers, etc.
  • the computing unit 901 performs the various methods and processes described above.
  • the method embodiments of the present invention may be implemented as a computer program, which is tangibly contained in a machine-readable medium, such as a storage unit 908.
  • part or all of the computer program may be loaded and/or installed on an electronic device via the ROM 902 and/or the communication unit 909.
  • the computing unit 901 may be configured to execute the above method in any other appropriate manner (e.g., by means of firmware).
  • the computer programs for implementing the methods of the embodiments of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or other programmable data processing device, so that when the computer programs are executed by the processor or controller, the functions/operations specified in the flow chart and/or block diagram are implemented.
  • the computer programs may be executed entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, device, or equipment.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable signal medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or equipment, or any suitable combination of the foregoing.
  • a more specific example of a machine-readable storage medium may include an electrical connection based on one or more lines, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • the user information including but not limited to user device information, user personal information, etc.
  • data including but not limited to data used for analysis, stored data, displayed data, etc.
  • the user information and data involved in the embodiments of the present invention are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of relevant data must comply with relevant laws, regulations and standards of relevant countries and regions, and provide corresponding operation entrances for users to choose to authorize or refuse.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present application relates to a data synchronization method for a distributed cluster and a related device therefor. The method comprises: determining a synchronization time slice on the basis of the master-slave state switching of a storage management unit group in a distributed cluster; generating a data operation log of a master storage management unit in the storage management unit group within the synchronization time slice; and synchronizing the data operation log to a slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization on the basis of the data operation log. The present invention solves the problems of service unavailability and incremental data synchronization consistency caused by downtime migration between isolated sites, allows zero-downtime data migration between the isolated sites, and implements incremental data synchronization between them.

Description

Distributed cluster data synchronization method and related equipment

This application claims priority to the Chinese patent application filed with the China Patent Office on May 19, 2023, with application number 202310576529.7 and titled “Data synchronization method and related equipment for distributed clusters”, the entire contents of which are incorporated herein by reference.

Technical Field

The present application relates to the field of computer technology, and in particular to a data synchronization method for a distributed cluster and related equipment.

Background Art

This section is intended to provide background or context for the embodiments of the invention recited in the claims. The description herein is not admitted to be prior art by virtue of its inclusion in this section.

An artifact repository needs redundant backups at different sites, with artifacts synchronized between the sites, so that when a regional problem occurs, a master-slave switch can be performed to achieve high availability. Network latency caused by regional factors can likewise be addressed through multi-site deployment. Artifact synchronization is also needed between different deployment environments: unverified artifacts must first be verified in the development environment and, once verified, are synchronized to the stable production environment through the site synchronization function.

The different sites of an artifact repository are isolated from one another, so conventional online data synchronization methods cannot synchronize data between them. In the related art, data is usually migrated between isolated sites by downtime migration. However, downtime migration makes the service temporarily unavailable, and it cannot synchronize incremental data in real time.

Summary of the Invention

The distributed cluster data synchronization method and related devices provided by the embodiments of the present invention at least solve the problem of service unavailability caused by downtime migration between isolated sites.

A data synchronization method for a distributed cluster, comprising:

determining a synchronization time slice based on the master-slave state switching of a storage management unit group in the distributed cluster;

generating a data operation log of the master storage management unit in the storage management unit group within the synchronization time slice;

synchronizing the data operation log to a slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization based on the data operation log.
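
The three steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `LogEntry` and `StorageUnit` classes, the `put`/`delete` operation kinds, and the dictionary used as the state of the storage management unit are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    index: int                    # position in the master's data operation log
    slice_seq: int                # sequence number of the synchronization time slice
    op: str                       # "put" or "delete" (hypothetical operation kinds)
    key: str
    value: Optional[str] = None


@dataclass
class StorageUnit:
    state: dict = field(default_factory=dict)  # stored data (hypothetical layout)
    log: list = field(default_factory=list)    # data operation log, in execution order

    def execute(self, slice_seq, op, key, value=None):
        """Master side: record an operation in the log, then apply it."""
        entry = LogEntry(len(self.log) + 1, slice_seq, op, key, value)
        self.log.append(entry)
        self.apply(entry)
        return entry

    def apply(self, entry):
        """Apply one log entry to the local state."""
        if entry.op == "put":
            self.state[entry.key] = entry.value
        elif entry.op == "delete":
            self.state.pop(entry.key, None)


def synchronize(master, slave, slice_seq):
    """Replay the master's log entries of one time slice on the slave, in order."""
    for entry in master.log:
        if entry.slice_seq == slice_seq:
            slave.apply(entry)
            slave.log.append(entry)
```

Because the slave replays the same operations in the same order, its state ends up equal to the master's for that time slice without the master ever being taken offline.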

In some embodiments, the method further comprises:

generating metadata of the master storage management unit, so that when the data operation log is synchronized to the slave storage management unit in the storage management unit group, the data operation logs to be synchronized are determined based on the metadata, wherein the metadata includes the synchronization status of the data operation log;

updating the metadata based on the synchronization result of the data operation log.
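
As a rough illustration of how such metadata might be updated after each synchronization attempt, consider the sketch below. The field names echo the committedIndex and nextLogIndex mentioned elsewhere in this document, but the `SyncMetadata` class, the `record_sync_result` function, and its parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SyncMetadata:
    committed_index: int = 0  # highest log index the slave has confirmed
    next_log_index: int = 1   # index of the first log entry to send next


def record_sync_result(meta, ok, last_sent_index, slave_committed_index):
    """Update the master's metadata after one synchronization attempt.

    ok: True if the slave applied the batch that ended at last_sent_index.
    slave_committed_index: the position the slave reported on rejection.
    """
    if ok:
        meta.committed_index = max(meta.committed_index, last_sent_index)
    else:
        # Assumption: on rejection the master resumes from the slave's position.
        meta.committed_index = slave_committed_index
    meta.next_log_index = meta.committed_index + 1
    return meta
```

Keeping this position in persisted metadata is what would let a restarted or newly elected sender continue from the last confirmed entry instead of resending the whole log.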

In some embodiments, generating the data operation log of the master storage management unit in the storage management unit group within the synchronization time slice includes:

persisting the data operation information of the clusters in the distributed cluster according to the execution order of the data operations, wherein the data operation information includes the operation object, the operation content, and information on the storage management unit that performs the data operation;

obtaining, from the data operation information of the cluster, the data operation log of the master storage management unit within the synchronization time slice, wherein the data operation log is sorted according to the execution order of the data operations.
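
The derivation of a per-unit log from the cluster's persisted operation records could look like the following sketch. The `OperationRecord` fields and the `operation_log` function are assumptions made for illustration; the source specifies only that each record carries the operation object, the operation content, and the storage management unit that performed it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationRecord:
    order: int       # execution order within the cluster
    unit_id: str     # storage management unit that performed the operation
    target: str      # operation object
    content: str     # operation content
    slice_seq: int   # synchronization time slice in which it was executed


def operation_log(records, unit_id, slice_seq):
    """Return one unit's records for one time slice, sorted by execution order."""
    selected = [
        r for r in records
        if r.unit_id == unit_id and r.slice_seq == slice_seq
    ]
    return sorted(selected, key=lambda r: r.order)
```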

In some embodiments, determining the synchronization time slice based on the master-slave state switching of the storage management unit group in the distributed cluster includes:

obtaining the master-slave state switching logical clock of the storage management unit group, and determining the synchronization time slice and the sequence number of the synchronization time slice according to the master-slave state switching logical clock.

In some embodiments, the method further comprises:

when the master-slave state of a storage management unit of a cluster in the distributed cluster is switched, the cluster reports the master-slave state switching logical clock to the storage system, and the storage system increments the sequence number of the synchronization time slice in order according to the master-slave state switching logical clock and then delivers it to the distributed cluster.
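
One way to picture the storage system's role is as a small counter service, sketched below. The `SliceSequenceStore` class and its method names are hypothetical, and ignoring stale or duplicate clock reports is an assumption of this sketch rather than something the source states.

```python
import threading


class SliceSequenceStore:
    """Hands out monotonically increasing synchronization time slice numbers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last_clock = 0  # newest switching logical clock seen so far
        self._seq = 0         # current time slice sequence number

    def report_switch(self, logical_clock):
        """Record a master-slave switch and return the current sequence number.

        Assumption: only a clock newer than any seen before starts a new slice.
        """
        with self._lock:
            if logical_clock > self._last_clock:
                self._last_clock = logical_clock
                self._seq += 1
            return self._seq

    def current(self):
        """Return the sequence number; clusters may poll this periodically."""
        with self._lock:
            return self._seq
```

Serializing the increment behind a lock keeps the sequence numbers ordered even when several clusters report switches concurrently.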

In some embodiments, the method further comprises:

the distributed cluster periodically obtains the sequence number of the synchronization time slice from the storage system.

In some embodiments, synchronizing the data operation log to the slave storage management unit in the storage management unit group includes:

determining the data operation logs to be synchronized in the master storage management unit according to the index number of the data operation log, the index number of the committed data operation log recorded in the metadata, and the sequence number of the synchronization time slice, and synchronizing the data operation logs to be synchronized to the slave storage management unit.
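
The selection described above can be sketched as a simple filter over the master's log. The `LogEntry` tuple, the `select_pending` function, and the `batch_size` limit are assumptions introduced for illustration only.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    index: int       # index number of the data operation log entry
    slice_seq: int   # synchronization time slice the entry belongs to
    payload: str     # serialized operation (hypothetical format)


def select_pending(log, committed_index, slice_seq, batch_size=100):
    """Pick entries after the committed index that belong to the given slice."""
    pending = [
        e for e in log
        if e.index > committed_index and e.slice_seq == slice_seq
    ]
    pending.sort(key=lambda e: e.index)  # preserve execution order
    return pending[:batch_size]
```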

In some embodiments, the data operation logs to be synchronized are determined and synchronized only when the sequence numbers of the synchronization time slices of the master storage management unit and the slave storage management unit are the same.
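
On the receiving side, this check might be combined with a log-position check, as in the following sketch. The `SlaveUnit` class, the `handle_sync` method, the tuple layout of the entries, and the dictionary response are all hypothetical; only the idea of matching the time slice sequence number and the committed index before applying the log comes from the document.

```python
from dataclasses import dataclass, field


@dataclass
class SlaveUnit:
    slice_seq: int                              # slave's current time slice number
    committed_index: int = 0                    # last log index applied locally
    state: dict = field(default_factory=dict)   # stored data (hypothetical layout)

    def _status(self, ok):
        return {
            "ok": ok,
            "slice_seq": self.slice_seq,
            "committed_index": self.committed_index,
        }

    def handle_sync(self, req_slice_seq, prev_log_index, entries):
        """Apply a batch only if it matches the slave's current position.

        entries: list of (index, key, value) tuples in index order.
        """
        if req_slice_seq != self.slice_seq or prev_log_index != self.committed_index:
            # Mismatch: report the current state so the master can adjust.
            return self._status(False)
        for index, key, value in entries:
            self.state[key] = value
            self.committed_index = index
        return self._status(True)
```

Rejecting a request from a different time slice prevents logs produced before a master-slave switch from being applied after it.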

In some embodiments, in the initial state, the storage management units in the master cluster of the distributed cluster are all master storage management units, the storage management units in the slave clusters of the distributed cluster are all slave storage management units, and the master-slave switching of the storage management unit group is managed by a preset controller.

In some embodiments, the preset controller adds the identifier of a storage management unit to a gray list and thereby sets the storage management unit with that identifier as the master storage management unit, so as to switch the master-slave state of the storage management units in the storage management unit group in the distributed cluster.

In some embodiments, when there is data to be written into the storage management unit group, the master storage management unit in the storage management unit group is determined by querying the preset controller, and the data is written into the master storage management unit.
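
The gray-list mechanism and the write routing it enables can be sketched as follows. The `RoleController` class, the `write` helper, and the cluster names are hypothetical, and the sketch assumes exactly one master cluster and one slave cluster.

```python
class RoleController:
    """Tracks which cluster holds the master copy of each storage management unit."""

    def __init__(self, master_cluster, slave_cluster):
        self.master_cluster = master_cluster
        self.slave_cluster = slave_cluster
        self.gray_list = set()  # identifiers of units whose roles are swapped

    def add_to_gray_list(self, unit_id):
        """Switch the unit so that the slave cluster's copy becomes the master."""
        self.gray_list.add(unit_id)

    def master_cluster_for(self, unit_id):
        """Return the cluster whose copy of the unit currently accepts writes."""
        if unit_id in self.gray_list:
            return self.slave_cluster
        return self.master_cluster


def write(controller, stores, unit_id, key, value):
    """Route a write to the current master copy; return the cluster written to."""
    cluster = controller.master_cluster_for(unit_id)
    stores[cluster][(unit_id, key)] = value
    return cluster
```

Because roles are decided per storage management unit, units can be switched one at a time, which allows a gradual cutover instead of a single switch of the whole cluster.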

In some embodiments, the method further comprises:

the data storage state of the storage management unit is treated as a finite state machine; a leader node and at least one follower node are elected in each of the master storage management unit and the slave storage management unit, and the leader node synchronizes the replicas of the finite state machine within the storage management unit group based on the data operation log.

A distributed cluster system comprises a plurality of clusters and a storage system, wherein the plurality of clusters comprises a master cluster and at least one slave cluster, and the storage system is connected to each of the plurality of clusters;

the storage system is used to store the sequence number of the synchronization time slice; the plurality of clusters all perform data synchronization through the above-mentioned data synchronization method.

In some embodiments, the distributed cluster system further includes a preset controller, which is used to control the master-slave state switching of the storage management units and to record the master-slave state of the storage management units.

An electronic device comprises a processor and a memory storing a program, wherein the program comprises instructions that, when executed by the processor, cause the processor to execute the above-mentioned data synchronization method.

A non-transitory machine-readable medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute the above-mentioned data synchronization method.

The data synchronization method for a distributed cluster and the related equipment provided by the embodiments of the present invention determine a synchronization time slice based on the master-slave state switching of a storage management unit group in the distributed cluster; generate a data operation log of the master storage management unit in the storage management unit group within the synchronization time slice; and synchronize the data operation log to the slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization based on the data operation log. This solves the problems of service unavailability and incremental data synchronization consistency caused by downtime migration between isolated sites, and achieves both migration of data without downtime and incremental data synchronization between isolated sites.

The details of one or more embodiments of the invention are set forth in the following drawings and description, so that other features, objects, and advantages of the invention become more readily apparent.

Brief Description of the Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other embodiments from these drawings without creative effort.

FIG. 1 is a flow chart of a distributed cluster data synchronization method according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of generating a data operation log and metadata of a storage management unit according to an embodiment of the present invention.

FIG. 3 is a flow chart of a storage management unit sending a data operation log request according to an embodiment of the present invention.

FIG. 4 is a flow chart of a storage management unit receiving a data operation log request according to an embodiment of the present invention.

FIG. 5 is a flow chart of a storage management unit serving as a Master sending a data snapshot according to an embodiment of the present invention.

FIG. 6 is a schematic diagram of synchronous control of multiple storage management units according to an embodiment of the present invention.

FIG. 7 is a schematic diagram of synchronous task processing based on a distributed consistency algorithm according to an embodiment of the present invention.

FIG. 8 is a schematic diagram of the structure of a distributed cluster according to an embodiment of the present invention.

FIG. 9 is a schematic structural diagram of an electronic device according to this embodiment.

DETAILED DESCRIPTION

Embodiments are described in more detail below with reference to the accompanying drawings. Although certain embodiments are shown in the drawings, it should be understood that they can be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding. The drawings and embodiments are for illustrative purposes only and are not intended to limit the scope of protection.

An artifact repository, as the name implies, is a repository for artifacts. An artifact is the deliverable output of software delivery, usually in runnable binary form, so an artifact repository is often also called a binary artifact repository. In the development phase, the artifact repository gives microservice developers working in various languages a single entry point for dependency resolution. In the build phase, it gives the build tools of each language a single dependency resolution source and a unified repository for managing artifacts of all languages. In the test phase, all test environment deployment tools pull artifacts that meet the test conditions from the artifact repository for deployment; after testing, the test result data is fed back to the artifact repository and associated with the artifact. In the deployment phase, quality gates check whether an artifact meets the deployment conditions; if it does, the deployment tool pulls the artifact from the artifact repository and deploys it to the target environment.

To guarantee the safety of data synchronization, a real-time data synchronization scheme requires a unified synchronization clock across devices, which avoids synchronization errors and keeps data rollback possible. An artifact repository, however, uses multi-region isolation. Because the mutually isolated regions share no synchronization clock, the conventional real-time data synchronization methods of the related art cannot be used, and only downtime migration is available.

For example, the repository replication data migration scheme of JFrog in the related art works through HTTP calls: the system that already holds the artifacts calls the interface of the system that does not yet hold them in order to upload the artifacts. This approach cannot guarantee the safety and stability of data synchronization under non-Byzantine faults, supports only artifact migration and not the migration of data in other formats, and cannot perform isolated operations on multi-region data.

The open-source Rsync data migration scheme in the related art establishes an SSH connection to the target system and calls the data interface on the remote system to transfer data. This approach requires manual migration and cannot synchronize incremental data automatically. It is also plain file synchronization and cannot establish any relationship between the files and the system.

The distributed cluster system of this embodiment includes one master cluster (Master cluster) and at least one slave cluster (Slaver cluster). Each cluster includes multiple nodes, which are divided by region into multiple mutually isolated storage management units; a storage management unit is also called a region. A master storage management unit and the slave storage management units that synchronize data with it form a storage management unit group. Each group has exactly one master storage management unit and one or more slave storage management units, and the master-slave states of the storage management units within a group can be switched. In the initial state, all regions in the master cluster hold the master role, that is, they are in the Master state, and all regions in the slave cluster hold the slave role, that is, they are in the Slaver state. During data synchronization the master cluster initiates synchronization and the slave cluster receives it. After the master-slave states of regions in the master and slave clusters are switched, however, synchronization may also be initiated by the slave cluster and received by the master cluster.

For example, in the initial state, data is written to the region holding the master role in the master cluster and then synchronized to the region holding the slave role in the slave cluster. After the master-slave states of two or more regions that are master and slave to each other are switched, data is written to the region holding the master role in the slave cluster and then synchronized to the region holding the slave role in the master cluster.

To synchronize data between isolated sites without making the service unavailable through downtime, an embodiment of the present invention provides a data synchronization method for a distributed cluster. FIG. 1 is a flowchart of this method. As shown in FIG. 1, the flow includes the following steps:

Step S101: determine a synchronization time slice based on master-slave state switching of a storage management unit group in the distributed cluster.

Step S102: generate a data operation log of the master storage management unit in the storage management unit group within the synchronization time slice.

Step S103: synchronize the data operation log to the slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization based on the data operation log.

Through the above steps, a synchronization time slice is determined based on master-slave state switching of a storage management unit group in the distributed cluster, and the data storage state of the storage management units within that time slice is synchronized based on the data operation log. This solves the problem of real-time data synchronization between isolated storage management units. The method supports real-time synchronization of both existing data and incremental data, requires no downtime migration, and applies to data of any type.

The synchronization status of the data operation log, for example which log entries have been synchronized and which have been committed to the slave storage management unit, can be stored in metadata and updated according to the synchronization result, so that the data storage state of each storage management unit is known when data is synchronized. In some embodiments, the method further includes: generating metadata of the master storage management unit, so that when the data operation log is synchronized to the slave storage management unit in the storage management unit group, the entries of the data operation log that are to be synchronized are determined based on the metadata, where the metadata includes the synchronization status of the data operation log; and updating the metadata based on the synchronization result of the data operation log.

In some embodiments, the data storage state of a storage management unit is treated as a finite state machine, so that data synchronization between storage management units can be implemented with the replica synchronization mechanism of finite state machines. That mechanism synchronizes replicas based on a synchronization log (log); in this embodiment, the data operation log serves as the synchronization log.

In this embodiment, the predefined data objects include a cluster object, a node object, and a finite state machine. The cluster object includes the state of each cluster node. The node object includes the node's metadata (metadata), which records the state information of the cluster during data synchronization: the sequence number of the synchronization time slice of the storage management unit (term), the index of the data operation log, which marks the position of a log entry (logIndex), the index number of the last data operation log entry contained in the snapshot (snapshotIndex), the sequence number of the synchronization time slice of the last data operation log entry contained in the snapshot (snapshotTerm), and the index number of the highest log entry already applied to the slave (preLogIndex). The node object also includes node location information (endpoint), which represents the coordinates of the node and, in a network environment, is generally the IP address of the terminal. The node object may further include the state of the master node of the cluster that initiates data synchronization. The finite state machine includes the data objects operated on and the actions performed (that is, data operations). In this embodiment, storage management units in a master-slave relationship are regarded as one finite state machine.
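The node object described above can be pictured as a small record type. The following is a minimal sketch, not the patented implementation: the class names `NodeMetadata` and `Node`, the snake_case field names, and the sample values are assumptions chosen for illustration; only the meaning of each field comes from the description.

```python
from dataclasses import dataclass


@dataclass
class NodeMetadata:
    """Synchronization metadata of one storage management unit (region)."""
    term: int            # sequence number of the current synchronization time slice
    log_index: int       # index marking the position of a data operation log entry
    snapshot_index: int  # index of the last log entry contained in the snapshot
    snapshot_term: int   # term of the last log entry contained in the snapshot
    pre_log_index: int   # highest log index already applied to the slave


@dataclass
class Node:
    """Hypothetical node object: a location plus its metadata."""
    endpoint: str        # node location, typically an IP address
    metadata: NodeMetadata


# Illustrative values only.
node = Node(
    endpoint="10.0.0.1",
    metadata=NodeMetadata(term=0, log_index=0, snapshot_index=7,
                          snapshot_term=0, pre_log_index=0),
)
```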

The data storage state includes the existing data as well as the data operations, such as additions, deletions, and modifications, performed on top of the existing data. The existing data is usually stored as a data snapshot (snapshot), and the data operations are recorded in the operation log. Between storage management units in a master-slave relationship, if the existing data has already been synchronized, that is, the existing data of each storage management unit is the same, the data storage state of a storage management unit can also be represented solely by the data operations on the existing data.

A finite state machine has a finite number of states. It starts from a given start state (for example, the initial state, or a state with the same existing data), and for each input it receives, a transition function and an output function produce a new state and a corresponding output. The new state is held until the next input arrives, and the output is passed to the corresponding receiver. Multiple replicas of the same state machine can be synchronized across multiple storage management units in the manner of a replicated state machine: replicas that begin in the same start state and receive the same inputs in the same order reach the same state, having generated the same outputs.
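The replicated state machine property can be checked with a toy example. In this sketch the state is a dictionary mapping artifact names to versions, and the operation format `(kind, key, value)` is an assumption made for illustration. Replaying the same operations in the same order yields the same state, while reordering them does not.

```python
def apply_op(state: dict, op: tuple) -> dict:
    """Transition function: return the new state after applying one operation."""
    kind, key, value = op
    new_state = dict(state)
    if kind == "put":
        new_state[key] = value
    elif kind == "delete":
        new_state.pop(key, None)
    else:
        raise ValueError(f"unknown operation kind: {kind}")
    return new_state


def replay(start: dict, ops: list) -> dict:
    """Apply the operations to the start state in the given order."""
    state = start
    for op in ops:
        state = apply_op(state, op)
    return state


start_state = {"b.jar": "v0"}  # shared existing data
ops = [("put", "a.jar", "v1"), ("put", "a.jar", "v2"), ("delete", "b.jar", None)]

master_state = replay(start_state, ops)
slave_state = replay(start_state, ops)                            # same inputs, same order
reordered_state = replay(start_state, [ops[1], ops[0], ops[2]])   # same inputs, different order
```

Here `master_state` and `slave_state` are equal, whereas `reordered_state` ends with a different version of `a.jar`, which is why the log must preserve execution order.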

The sequence number of the synchronization time slice (term) serves as the synchronization clock for all clusters in the distributed cluster. It starts from a set value (for example, 0) and increments in strict order. The sequence number is cached in a storage system backed by a high-performance storage medium.

In the above steps, the synchronization time slice of the finite state machine is updated whenever the master-slave state of any storage management unit group switches. Because the sequence number of the synchronization time slice is updated on each master-slave state switch, the finite state machine state of any storage management unit after a switch falls within at least one synchronization time slice; that is, a correspondence can be established between the finite state machine of a storage management unit and the sequence numbers of synchronization time slices. Therefore, by obtaining the master-slave state switching logical clock of any storage management unit group in the cluster, a cluster can determine the sequence number of the synchronization time slice that the storage management unit enters starting from that logical clock. Because the sequence number also increments when storage management units of other groups in the distributed cluster switch master-slave state, the state changes of the finite state machine of a given storage management unit may span multiple synchronization time slices and thus correspond to multiple sequence numbers. In this embodiment, the data operation logs and metadata generated within each synchronization time slice are labeled with the sequence number of that time slice.

When the master-slave state of any storage management unit group in any cluster of the distributed cluster switches, that cluster reports the master-slave state switching logical clock to the storage system, and the sequence number of the synchronization time slice in the storage system is incremented by one. The storage system increments the sequence number in order according to the reported logical clocks and then delivers it to the distributed cluster. In addition, every cluster in the distributed cluster periodically pulls the sequence number, in case delivery fails.
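The interaction between clusters and the shared storage system can be sketched as a small in-memory model. This is an illustration under assumptions: `TermStore`, `Cluster`, and their method names are hypothetical, a lock stands in for whatever ordering guarantee the real storage system provides, and the push to other clusters is deliberately omitted so that the periodic pull is what brings a lagging cluster up to date.

```python
import threading


class TermStore:
    """Stand-in for the storage system that holds the synchronization term."""

    def __init__(self, initial: int = 0) -> None:
        self._term = initial
        self._lock = threading.Lock()  # serializes increments so terms stay ordered

    def report_switch(self) -> int:
        """Called when a master-slave switch is reported; bumps the term by one."""
        with self._lock:
            self._term += 1
            return self._term

    def current(self) -> int:
        with self._lock:
            return self._term


class Cluster:
    def __init__(self, name: str, store: TermStore) -> None:
        self.name = name
        self.store = store
        self.term = store.current()

    def switch_master_slave(self) -> None:
        # Report the switch and adopt the incremented term.
        self.term = self.store.report_switch()

    def pull_term(self) -> None:
        # Periodic pull that covers a failed push from the storage system.
        self.term = self.store.current()


store = TermStore()
cluster1 = Cluster("cluster1", store)
cluster2 = Cluster("cluster2", store)

cluster1.switch_master_slave()   # term becomes 1 in the store
stale_term = cluster2.term       # cluster2 has not yet seen the new term
cluster2.pull_term()             # the periodic pull catches up
```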

In this embodiment, the input to the finite state machine is data operation information. To satisfy the identical-input requirement of a replicated state machine, the data operation information must have not only the same operation content but also the same execution order. Therefore, every cluster in the distributed cluster persists its data operation information in the order in which the data operations are executed, where the data operation information includes the operation object, the operation content, and information about the storage management unit that executes the data operation.

Specifically, in the cluster that receives data writes (generally the Master cluster), an instrumentation record can be attached to every write operation to capture the operation object, the operation content, and information about the storage management unit that executes the data operation. The record is encapsulated as an incremental operation log (pendingLog) produced by the initiator of data synchronization and then persisted to a database, which guarantees that the incremental operation logs are stored in execution order.

In the cluster that receives data writes, the incremental operation logs of all storage management units are stored together in execution order. To synchronize data per storage management unit, the incremental operation logs can be further grouped by storage management unit. For example, as shown in FIG. 2, the node selected to process synchronization tasks in the cluster that receives data writes starts a thread that reads incremental operation logs from the database in batches and partitions them by storage management unit. Within each storage management unit, every log entry is labeled with the sequence number of the synchronization time slice and the index number of the data operation log (logIndex), producing the data operation log used for data synchronization.
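The grouping and labeling step can be sketched as follows. The function name `build_sync_logs`, the dictionary layout of a pending log entry, and the `last_index_by_region` argument are assumptions for illustration; the rule that a region's first entry gets index 0 and later entries continue from the previous index plus one follows the description in this section.

```python
from collections import defaultdict


def build_sync_logs(pending_logs, current_term, last_index_by_region):
    """Group pending logs by region and label each with term and logIndex.

    pending_logs: entries in execution order, each a dict with a 'region' key.
    last_index_by_region: last logIndex already generated per region, if any.
    """
    grouped = defaultdict(list)
    last_index = dict(last_index_by_region)
    for entry in pending_logs:
        region = entry["region"]
        # A region with no history starts at 0; otherwise continue at last + 1.
        index = last_index.get(region, -1) + 1
        last_index[region] = index
        grouped[region].append({**entry, "term": current_term, "logIndex": index})
    return dict(grouped)


pending = [
    {"region": "r1", "target": "a.jar", "op": "put"},
    {"region": "r2", "target": "b.jar", "op": "put"},
    {"region": "r1", "target": "a.jar", "op": "delete"},
]
sync_logs = build_sync_logs(pending, current_term=3, last_index_by_region={"r2": 4})
```

In this run, region `r1` has no history and receives indexes 0 and 1, while region `r2` continues from its previous index 4 and receives index 5. Execution order within each region is preserved because the input list is processed in order.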

If the current storage management unit is generating a synchronization log (log) for the first time, the index numbers of the data operation log start from 0 and increment in order, and the metadata (metadata) of the storage management unit is initialized at the same time. The data snapshot index number (snapshotIndex) in the metadata is a random value n (for example, any integer between 1 and 100). The metadata also includes data used to manage synchronization, such as the index number of the last data operation log entry of the storage management unit that has been applied by the slave node (lastAppliedIndex) and the index number of the data operation log entry of the storage management unit that has been committed (committedIndex). At initialization, lastAppliedIndex and committedIndex are set to the same value n as snapshotIndex, and the sequence number of the synchronization time slice in the metadata is set to the current sequence number. Once the data operation log and the metadata are persisted, data preparation for synchronization is complete.

If the current storage management unit is not generating a synchronization log (log) for the first time, that is, historical data operation logs already exist (both synchronized and not yet synchronized), the index number of a newly generated data operation log entry is the index number of the last previously generated entry plus one. The snapshotIndex, lastAppliedIndex, and committedIndex in the metadata are likewise each the previously generated value plus one, and the sequence number of the synchronization time slice is again set to the current sequence number. Once the data operation log and the metadata are persisted, data preparation for synchronization is complete.

Data synchronization itself is implemented through replica synchronization of the finite state machine. For a storage management unit, the cluster that receives data writes determines the data operation log entries to be synchronized in the master storage management unit according to the index numbers of the data operation log, the index number of the committed data operation log entry recorded in the metadata, and the sequence number of the synchronization time slice, and then synchronizes those entries to the slave storage management unit.

Referring to FIG. 3, the master storage management unit selects a node to process synchronization tasks, and that node filters out the storage management units that need to be synchronized. The synchronization task first looks up the index number of the next log entry to be synchronized in the cache; if it is not there, it queries the index number of the last committed log entry in the metadata. It then sets preLogIndex, the index number of the previous data operation log entry, to that index number minus one and compares preLogIndex with snapshotIndex to decide whether to send a snapshot (the current data snapshot of the finite state machine). If preLogIndex is less than snapshotIndex, the existing data has not yet been migrated and a snapshot is sent. If preLogIndex is not less than snapshotIndex, the existing data has already been migrated, and a batch of data operation log entries to be synchronized is fetched from the database and sent to the slave storage management unit for processing.
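The decision between sending a snapshot and sending log entries can be sketched as below. This is one reading of the flow, under stated assumptions: `plan_sync` and its parameters are hypothetical names, the cached next index and the committed index from metadata are treated as interchangeable starting points, the snapshot branch is taken when preLogIndex is less than snapshotIndex (consistent with the later description of the first synchronization request), and the batch size is arbitrary.

```python
def plan_sync(next_index_cache, committed_index, snapshot_index, logs, batch_size=2):
    """Decide what the master sends next for one storage management unit.

    next_index_cache: cached index of the next log to sync, or None if absent.
    committed_index:  index of the last committed log entry, from metadata.
    logs:             persisted log entries in order, each with a 'logIndex' key.
    """
    # Prefer the cached next index; fall back to the committed index in metadata.
    next_log_index = next_index_cache if next_index_cache is not None else committed_index
    pre_log_index = next_log_index - 1

    if pre_log_index < snapshot_index:
        # Existing data has not been migrated yet: send a snapshot first.
        return ("snapshot", [])

    # Existing data already migrated: send the next batch of log entries.
    batch = [e for e in logs if e["logIndex"] >= next_log_index][:batch_size]
    return ("logs", batch)


logs = [{"logIndex": i} for i in range(8, 12)]
first_plan = plan_sync(None, committed_index=5, snapshot_index=7, logs=logs)
later_plan = plan_sync(9, committed_index=5, snapshot_index=7, logs=logs)
```

With a snapshot index of 7, the first call computes a preLogIndex of 4 and chooses the snapshot path; the second call computes 8 and returns the entries with indexes 9 and 10.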

FIG. 4 shows how the slave storage management unit processes the data operation log. The slave storage management unit likewise selects a node to handle data synchronization, and that node is responsible for processing the data operation log.

In some embodiments, within the same storage management unit group, the data operation log entries to be synchronized are determined and synchronized only when the master storage management unit and the slave storage management unit have the same synchronization time slice sequence number; if the sequence numbers differ, the operation is retried.

If the synchronization time slice sequence number of a received data operation log differs from that of the current node, either the current node's sequence number has not yet been updated or the synchronization request has expired. In this case the node returns its own sequence number to the master storage management unit. If the sequence numbers match, the synchronization request is valid, and the node then checks whether the current storage management unit holds the slave role (Slaver state) in its cluster. If it is in the Slaver state, log verification is performed.

The current FSM state of the slave storage management unit is determined by the committedIndex of the committed data operation log and the sequence number of the synchronization time slice. If this state matches the synchronization request, the slave simply applies the log sent by the master storage management unit, updates its current FSM state and metadata, and returns the result to the master storage management unit. Otherwise it returns its current FSM state to the master storage management unit, so that the master can send a matching log in the next synchronization request. The master storage management unit may fail to receive the slave's return value because of a response timeout, in which case it retries. If it does receive the return value, it updates the next synchronization request according to the state returned by the slave storage management unit.
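The checks performed by the slave can be sketched as a single handler. This is an illustrative model, not the patented logic: the request and reply field names are hypothetical, and the matching rule (the request's preLogIndex must equal the slave's committedIndex) is an assumption borrowed from Raft-style log matching, since the description does not spell out the exact condition.

```python
def handle_sync_request(slave: dict, request: dict) -> dict:
    """Process one data operation log request on the slave side."""
    # 1. Term check: a stale or not-yet-updated term is reported back to the master.
    if request["term"] != slave["term"]:
        return {"ok": False, "reason": "term_mismatch", "term": slave["term"]}

    # 2. Role check: only a unit in the Slaver state accepts synchronized logs.
    if slave["role"] != "Slaver":
        return {"ok": False, "reason": "not_slave"}

    # 3. Log check (assumed rule): the request must continue from committedIndex.
    if request["preLogIndex"] != slave["committedIndex"]:
        return {"ok": False, "reason": "log_mismatch",
                "committedIndex": slave["committedIndex"], "term": slave["term"]}

    # 4. Apply the entries in order and advance the committed index.
    for entry in request["entries"]:
        slave["applied"].append(entry)
        slave["committedIndex"] = entry["logIndex"]
    return {"ok": True, "committedIndex": slave["committedIndex"]}


slave = {"term": 2, "role": "Slaver", "committedIndex": 4, "applied": []}

stale_reply = handle_sync_request(slave, {"term": 1, "preLogIndex": 4, "entries": []})
mismatch_reply = handle_sync_request(
    slave, {"term": 2, "preLogIndex": 6, "entries": [{"logIndex": 7, "op": "put"}]})
ok_reply = handle_sync_request(
    slave, {"term": 2, "preLogIndex": 4,
            "entries": [{"logIndex": 5, "op": "put"}, {"logIndex": 6, "op": "delete"}]})
```

On a mismatch the reply carries the slave's current state, which the master can use to build a request that matches on the next attempt.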

The synchronization method above can be used for incremental data and also for existing data. Continuing with FIG. 3, when a data operation log request is sent for the first time, the slave storage management unit has not yet written any data operation log for that storage management unit, so the conflict position returned to the master storage management unit is 0. The master storage management unit updates nextLogIndex and continues sending the data operation log. At this point the preLogIndex of the data operation log is necessarily smaller than the randomly generated snapshotIndex, so the master storage management unit sends a snapshot to migrate the existing data. The snapshot is a snapshot of the FSM of the current storage management unit generated by the master storage management unit; it stores the indexes of all data (for example, artifacts) and the related data. In some embodiments, what the master storage management unit sends may be the URL of the snapshot, while the generated snapshot itself is uploaded to a file storage system, which reduces the size of the snapshot request. FIG. 5 shows the flow by which the master storage management unit sends a snapshot. First, the storage management unit whose existing data needs to be migrated is identified from the set of snapshots to be sent, and an FSM snapshot of that storage management unit is generated. After an install-snapshot request is sent to the slave storage management unit, the installation status of the slave is checked. If the synchronization time slice sequence number of the request does not match that of the slave storage management unit, or the slave fails to install the snapshot, the install-snapshot request is added back to the set of snapshots to be sent and retried.

In some embodiments, in the initial state, all storage management units in the master cluster of the distributed cluster are master storage management units and all storage management units in the slave clusters are slave storage management units, and master-slave switching of storage management units is managed by a preset controller. The preset controller adds the identifier of a storage management unit to a grayscale list and thereby sets the storage management unit with that identifier as the master storage management unit, which implements master-slave state switching of storage management units in the distributed cluster.

For example, referring to FIG. 6, the master-slave relationships of the storage management units across the clusters are configured and saved in a queryable master-slave controller, which serves as a unified access point. When a storage management unit is not in the gray list, the storage management units of cluster 1 (for example, the master cluster) are all in the Master state and the storage management units of cluster 2 (for example, the slave cluster) are all in the Slaver state; data then flows from cluster 1 to cluster 2. If a storage management unit is in the gray list, the storage management unit of cluster 2 is in the Master state, and data flows from cluster 2 to cluster 1. Because data can be written in both directions, in the safe state the finite state machines (FSMs) of the master storage management unit and the slave storage management unit in the same storage management unit group always remain consistent. In that state, controlling whether a storage management unit is in the gray list controls the role of each storage management unit in each cluster, and therefore the direction of data synchronization. Data newly added by a user in either cluster is not lost; it is synchronized between cluster 1 and cluster 2 in the form of synchronization logs. Here, the safe state means that the FSM states of the master storage management unit and the slave storage management unit in the same storage management unit group are consistent, and that no data operation log has been applied to the master storage management unit but not yet applied to the slave storage management unit.
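The gray-list rule described above can be illustrated with a minimal sketch. The class name `MasterSlaveController`, its methods, and the cluster labels `"cluster1"` and `"cluster2"` are hypothetical names chosen for illustration; the source does not specify an interface.

```python
from dataclasses import dataclass, field


@dataclass
class MasterSlaveController:
    """Hypothetical controller: a unit on the gray list is mastered by cluster 2."""
    gray_list: set = field(default_factory=set)

    def master_cluster(self, unit_id: str) -> str:
        # Default: cluster 1 holds the master copy; a gray-listed unit flips to cluster 2.
        return "cluster2" if unit_id in self.gray_list else "cluster1"

    def sync_direction(self, unit_id: str) -> tuple:
        # Returns (source cluster, destination cluster) for log synchronization.
        master = self.master_cluster(unit_id)
        slave = "cluster1" if master == "cluster2" else "cluster2"
        return (master, slave)

    def switch_to_cluster2(self, unit_id: str) -> None:
        # Adding the identifier to the gray list performs the master-slave switch.
        self.gray_list.add(unit_id)


controller = MasterSlaveController()
controller.switch_to_cluster2("unit-b")
direction_a = controller.sync_direction("unit-a")
direction_b = controller.sync_direction("unit-b")
```

In this sketch each storage management unit is switched independently, so one unit can synchronize from cluster 1 to cluster 2 while another synchronizes in the opposite direction.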

Where a preset controller is provided for querying the master-slave state of the storage management units, in some embodiments, when new data is to be written to a storage management unit, the master storage management unit is determined by querying the preset controller, and the new data is written to the master storage management unit. For example, a user who needs to perform a data operation may access cluster 1 or cluster 2 directly, and queries the storage management unit identifiers saved in the master-slave controller to confirm whether the node being accessed is a master node or a slave node.
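As a rough sketch of the write path just described, the function below looks up the master copy of a unit before writing. The function `route_write`, the in-memory `clusters` dictionary, and the representation of the controller as a plain set of gray-listed identifiers are all assumptions made for illustration.

```python
def route_write(gray_list, unit_id, record, clusters):
    """Append a record to the master copy of unit_id and return the cluster written to.

    gray_list: set of unit identifiers whose master copy lives in cluster 2 (assumed).
    clusters:  dict mapping cluster name -> {unit_id: [records]} (hypothetical store).
    """
    # Query step: the controller decides which cluster currently holds the master.
    master = "cluster2" if unit_id in gray_list else "cluster1"
    # Write step: new data always goes to the master storage management unit.
    clusters[master].setdefault(unit_id, []).append(record)
    return master


clusters = {"cluster1": {}, "cluster2": {}}
gray_list = {"unit-b"}
first = route_write(gray_list, "unit-a", "put x", clusters)
second = route_write(gray_list, "unit-b", "put y", clusters)
```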

To ensure reliable data synchronization and to avoid synchronization failures caused by the failure of a single node, in some embodiments one leader node and at least one follower node are elected in each of the master storage management unit and the slave storage management unit, and the leader node of each storage management unit synchronizes the finite state machine replicas of the storage management units that are in a master-slave relationship. Referring to FIG. 7, this embodiment may use Raft (a distributed consensus algorithm) to elect one leader node in each cluster or storage management unit to handle synchronization tasks. Multiple follower nodes are also configured so that, if the leader node fails, a follower can take over and continue the synchronization tasks, ensuring that data is not lost and is eventually synchronized to the slave storage management unit. When the leader node of the master storage management unit goes down or otherwise fails, the data in the master storage management unit has already been persisted and is therefore not lost. After a new leader is selected from the follower nodes, it continues synchronizing data to the slave storage management unit based on the persisted log and metadata, so the slave storage management unit remains consistent with the master storage management unit.
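The recovery behavior described above, in which a new leader resumes from the persisted log and metadata, can be sketched as follows. This is not an implementation of Raft: leader election is omitted, and `PersistedState`, `last_synced_index`, and the simulated crash are hypothetical stand-ins for the durable log and metadata mentioned in the text.

```python
class PersistedState:
    """Stand-in for durable storage shared by a unit's replicas (assumption)."""

    def __init__(self, log):
        self.log = list(log)        # ordered data operation log entries
        self.last_synced_index = 0  # metadata: count of entries already synchronized


def sync_as_leader(state, slave_log, fail_after=None):
    """Push unsynchronized entries to the slave; optionally simulate a leader crash."""
    sent = 0
    while state.last_synced_index < len(state.log):
        if fail_after is not None and sent == fail_after:
            raise RuntimeError("leader crashed")
        slave_log.append(state.log[state.last_synced_index])
        state.last_synced_index += 1  # progress is recorded after each entry
        sent += 1


state = PersistedState(["put a", "put b", "del a", "put c"])
slave_log = []
try:
    sync_as_leader(state, slave_log, fail_after=2)  # the first leader fails midway
except RuntimeError:
    pass
progress_at_crash = state.last_synced_index
sync_as_leader(state, slave_log)  # a newly elected leader resumes from the metadata
```

Because progress is read from the persisted metadata rather than from the failed leader's memory, the second call continues at the third entry and the slave ends with the same ordered log as the master.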

Based on Raft log synchronization, the embodiments of the present invention enable migration of existing data and synchronization of incremental data across multiple storage management units. The approach applies to files of any format and is not limited to Jforg artifact files, and Raft ensures the safety and reliability of data synchronization in the absence of Byzantine faults. All synchronized data is applied to the system through the FSM, so the relationships between data items can also be synchronized between different systems. Isolating data in multiple storage management units makes it possible to control the synchronization direction of each storage management unit between different sites.

This embodiment further provides a distributed cluster system. FIG. 8 is a schematic diagram of the distributed cluster system of this embodiment. As shown in FIG. 8, the system includes a cluster 81, at least one cluster 82, and a storage system 83. Cluster 81 acts as the master, cluster 82 acts as the slave, and the storage system 83 is connected to each of these clusters.

The storage system 83 stores the sequence numbers of the synchronization time slices. All clusters perform data synchronization using the data synchronization method for a distributed cluster described above.

In some embodiments, the cluster determines a synchronization time slice based on the master-slave state switching of a storage management unit group in the distributed cluster; generates a data operation log of the master storage management unit in the storage management unit group within the synchronization time slice; and synchronizes the data operation log to the slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization based on the data operation log.

In some embodiments, the cluster generates metadata of the master storage management unit, so that when the data operation log is synchronized to the slave storage management unit in the storage management unit group, the entries of the data operation log that remain to be synchronized are determined based on the metadata, where the metadata includes the synchronization status of the data operation log. The cluster updates the metadata based on the result of synchronizing the data operation log.

In some embodiments, the cluster persists the data operation information of the clusters in the distributed cluster in the order in which the data operations are executed, where the data operation information includes the operation object, the operation content, and information about the storage management unit that performs the data operation. Based on the data operation information of the cluster, the cluster obtains the data operation log of the master storage management unit within the synchronization time slice, where the data operation log is sorted in the execution order of the data operations.
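A minimal sketch of this step, assuming each persisted record carries an execution sequence number and the time slice in which it ran. The `DataOperation` fields `seq` and `time_slice` and the function `operation_log` are hypothetical; the source names only the operation object, the operation content, and the storage management unit information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataOperation:
    seq: int         # execution order within the cluster (assumed field)
    unit_id: str     # storage management unit that performed the operation
    target: str      # operation object
    content: str     # operation content
    time_slice: int  # synchronization time slice in which it ran (assumed field)


def operation_log(cluster_ops, unit_id, time_slice):
    """Select one unit's operations within one time slice, in execution order."""
    selected = [
        op for op in cluster_ops
        if op.unit_id == unit_id and op.time_slice == time_slice
    ]
    return sorted(selected, key=lambda op: op.seq)


cluster_ops = [
    DataOperation(3, "unit-a", "repo/x", "delete", 1),
    DataOperation(1, "unit-a", "repo/x", "create", 1),
    DataOperation(2, "unit-b", "repo/y", "create", 1),
    DataOperation(4, "unit-a", "repo/z", "create", 2),
]
log_a_slice1 = operation_log(cluster_ops, "unit-a", 1)
```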

In some embodiments, the cluster obtains the master-slave state switching logical clock of the storage management unit group, and determines the synchronization time slice and its sequence number according to the master-slave state switching logical clock.

In some embodiments, when the master-slave state of a storage management unit of a cluster in the distributed cluster is switched, the cluster reports the master-slave state switching logical clock to the storage system. The storage system increments the sequence number of the synchronization time slice in order according to the master-slave state switching logical clock and then delivers it to the distributed cluster.
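One way the storage system's role could look is sketched below. The class `SliceSequenceStore` and its methods are hypothetical, and the rule that a report whose clock is not newer than the last one leaves the sequence number unchanged is an assumption; the source states only that the sequence number is incremented in order according to the reported logical clock.

```python
class SliceSequenceStore:
    """Hypothetical storage-system component that owns the time slice sequence number."""

    def __init__(self):
        self.last_clock = 0  # highest switching logical clock seen so far
        self.slice_seq = 0   # current synchronization time slice sequence number

    def report_switch(self, logical_clock: int) -> int:
        # Assumption: a clock that is not newer than the last one is a repeated report.
        if logical_clock > self.last_clock:
            self.last_clock = logical_clock
            self.slice_seq += 1
        return self.slice_seq

    def current(self) -> int:
        # Clusters may also read this value periodically instead of waiting for delivery.
        return self.slice_seq


store = SliceSequenceStore()
after_first = store.report_switch(1)
after_repeat = store.report_switch(1)
after_second = store.report_switch(2)
```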

In some embodiments, the distributed cluster periodically obtains the sequence number of the synchronization time slice from the storage system.

In some embodiments, the cluster determines the data operation log entries to be synchronized in the master storage management unit based on the index numbers of the data operation log, the index number of the committed data operation log recorded in the metadata, and the sequence number of the synchronization time slice, and synchronizes those entries to the slave storage management unit.

In some embodiments, the cluster determines the data operation log entries to be synchronized, and synchronizes them, only when the synchronization time slice sequence numbers of the master storage management unit and the slave storage management unit are the same.
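The selection of entries to synchronize can be sketched as a filter over the log. The tuple layout `(index, slice_seq, payload)`, the function `pending_entries`, and the specific rule used here (index greater than the committed index, and entry slice equal to the current slice) are one plausible reading of the text, offered as an assumption rather than the claimed procedure.

```python
def pending_entries(log, committed_index, master_slice, slave_slice):
    """Return log entries that still need to be sent to the slave.

    log: list of (index, slice_seq, payload) tuples in index order (assumed layout).
    committed_index: highest index recorded in the metadata as already committed.
    """
    # Synchronization proceeds only when both sides agree on the time slice number.
    if master_slice != slave_slice:
        return []
    return [
        entry for entry in log
        if entry[0] > committed_index and entry[1] == master_slice
    ]


log = [
    (1, 1, "put a"),
    (2, 1, "put b"),
    (3, 2, "del a"),
    (4, 2, "put c"),
]
to_send = pending_entries(log, committed_index=3, master_slice=2, slave_slice=2)
blocked = pending_entries(log, committed_index=3, master_slice=2, slave_slice=1)
```

After a successful send, the committed index in the metadata would be advanced to the last index sent, so the next call returns only newer entries.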

In some embodiments, the distributed cluster system further includes a preset controller, which controls the master-slave state switching of the storage management units and records their master-slave states.

In some embodiments, in the initial state, all storage management units in the master cluster of the distributed cluster are master storage management units, all storage management units in the slave cluster are slave storage management units, and master-slave switching of the storage management unit group is managed by the preset controller.

In some embodiments, the preset controller adds the identifier of a storage management unit to a gray list, thereby setting the storage management unit with that identifier as the master storage management unit and switching the master-slave state of the storage management units in the storage management unit group of the distributed cluster.

In some embodiments, when there is data to be written to the storage management unit group, the master storage management unit in the group is determined by querying the preset controller, and the data is written to the master storage management unit.

In some embodiments, the data storage state of a storage management unit is treated as a finite state machine. One leader node and at least one follower node are elected in each of the master storage management unit and the slave storage management unit, and the leader node synchronizes the finite state machine replicas within the storage management unit group based on the data operation log.

An embodiment of the present invention further provides an electronic device, including at least one processor and a memory communicatively connected to the at least one processor. The memory stores a computer program executable by the at least one processor, and the computer program, when executed by the at least one processor, causes the electronic device to perform the method of the embodiments of the present invention.

An embodiment of the present invention further provides a non-transitory machine-readable medium storing a computer program, where the computer program, when executed by a processor of a computer, causes the computer to perform the method of the embodiments of the present invention.

An embodiment of the present invention further provides a computer program product including a computer program, where the computer program, when executed by a processor of a computer, causes the computer to perform the method of the embodiments of the present invention.

Referring to FIG. 9, a structural block diagram of an electronic device that can serve as a server or client of the embodiments of the present invention is now described; it is an example of a hardware device to which aspects of the present invention can be applied. The electronic device is intended to represent various forms of digital electronic computing equipment, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present invention described and/or claimed herein.

As shown in FIG. 9, the electronic device includes a computing unit 901, which can perform various appropriate actions and processing according to a computer program stored in a read-only memory (ROM) 902 or a computer program loaded from a storage unit 908 into a random access memory (RAM) 903. The RAM 903 can also store various programs and data required for the operation of the electronic device. The computing unit 901, the ROM 902, and the RAM 903 are connected to one another via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.

Multiple components of the electronic device are connected to the I/O interface 905, including an input unit 906, an output unit 907, a storage unit 908, and a communication unit 909. The input unit 906 may be any type of device capable of inputting information to the electronic device; it can receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device. The output unit 907 may be any type of device capable of presenting information and may include, but is not limited to, a display, a speaker, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 908 may include, but is not limited to, a magnetic disk and an optical disc. The communication unit 909 allows the electronic device to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication transceiver, and/or a chipset, such as a Bluetooth device, a WiFi device, a WiMax device, a cellular communication device, and/or the like.

The computing unit 901 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Examples of the computing unit 901 include, but are not limited to, a CPU, a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The computing unit 901 performs the methods and processing described above. For example, in some embodiments, the method embodiments of the present invention may be implemented as a computer program tangibly embodied in a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device via the ROM 902 and/or the communication unit 909. In some embodiments, the computing unit 901 may be configured to perform the above method in any other appropriate manner (for example, by means of firmware).

Computer programs for implementing the methods of the embodiments of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that the computer programs, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be carried out. A computer program may execute entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a stand-alone software package, or entirely on a remote machine or server.

In the context of the embodiments of the present invention, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable signal medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

It should be noted that the term "include" and its variants as used in the embodiments of the present invention are open-ended, meaning "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one other embodiment"; the term "some embodiments" means "at least some embodiments". The modifiers "one" and "multiple" in the embodiments of the present invention are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be read as "one or more".

The user information (including but not limited to user device information and user personal information) and data (including but not limited to data used for analysis, stored data, and displayed data) involved in the embodiments of the present invention are information and data authorized by the user or fully authorized by all parties. The collection, use, and processing of such data must comply with the relevant laws, regulations, and standards of the relevant countries and regions, and a corresponding operation entry is provided for users to grant or refuse authorization.

The steps described in the method implementations provided by the embodiments of the present invention may be performed in a different order and/or in parallel. In addition, the method implementations may include additional steps and/or omit some of the steps shown. The scope of protection of the present invention is not limited in this respect.

The term "embodiment" in this specification means that a specific feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. Appearances of this term in various places in the specification do not necessarily refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. The embodiments in this specification are described in a related manner, and the same or similar parts of the embodiments may be referred to across embodiments. In particular, the apparatus, device, and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant details, refer to the corresponding description of the method embodiments.

The embodiments described above represent only several implementations of the present invention. Although they are described specifically and in detail, they should not be construed as limiting the scope of patent protection. It should be noted that those of ordinary skill in the art may make various modifications and improvements without departing from the concept of the present invention, and all of these fall within the scope of protection of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.

Claims (16)

1. A data synchronization method for a distributed cluster, comprising:
determining a synchronization time slice based on master-slave state switching of a storage management unit group in the distributed cluster;
generating a data operation log of a master storage management unit in the storage management unit group within the synchronization time slice; and
synchronizing the data operation log to a slave storage management unit in the storage management unit group, so that the slave storage management unit performs data synchronization based on the data operation log.

2. The method according to claim 1, wherein the method further comprises:
generating metadata of the master storage management unit, so that when the data operation log is synchronized to the slave storage management unit in the storage management unit group, the data operation log entries to be synchronized are determined based on the metadata, wherein the metadata includes a synchronization status of the data operation log; and
updating the metadata based on a synchronization result of the data operation log.

3. The method according to claim 1, wherein generating the data operation log of the master storage management unit in the storage management unit group within the synchronization time slice comprises:
persisting data operation information of clusters in the distributed cluster in the execution order of data operations, wherein the data operation information includes an operation object, operation content, and information about the storage management unit that performs the data operation; and
obtaining, according to the data operation information of the cluster, the data operation log of the master storage management unit within the synchronization time slice, wherein the data operation log is sorted in the execution order of the data operations.

4. The method according to claim 1, wherein determining the synchronization time slice based on the master-slave state switching of the storage management unit group in the distributed cluster comprises:
obtaining a master-slave state switching logical clock of the storage management unit group, and determining the synchronization time slice and a sequence number of the synchronization time slice according to the master-slave state switching logical clock.

5. The method according to claim 1, wherein the method further comprises:
when the master-slave state of a storage management unit of a cluster in the distributed cluster is switched, reporting, by the cluster, a master-slave state switching logical clock to a storage system, wherein the storage system increments the sequence number of the synchronization time slice in order according to the master-slave state switching logical clock and then delivers it to the distributed cluster.

6. The method according to claim 5, wherein the method further comprises:
periodically obtaining, by the distributed cluster, the sequence number of the synchronization time slice from the storage system.

7. The method according to claim 2, wherein synchronizing the data operation log to the slave storage management unit in the storage management unit group comprises:
determining the data operation log entries to be synchronized in the master storage management unit according to index numbers of the data operation log, an index number of a committed data operation log recorded in the metadata, and the sequence number of the synchronization time slice, and synchronizing the data operation log entries to be synchronized to the slave storage management unit.

8. The method according to claim 7, wherein the data operation log entries to be synchronized are determined and synchronized when the sequence numbers of the synchronization time slices of the master storage management unit and the slave storage management unit are the same.

9. The method according to claim 1, wherein, in an initial state, all storage management units in a master cluster of the distributed cluster are master storage management units, all storage management units in a slave cluster of the distributed cluster are slave storage management units, and master-slave switching of the storage management unit group is managed by a preset controller.

10. The method according to claim 9, wherein the preset controller adds an identifier of a storage management unit to a gray list, thereby setting the storage management unit with that identifier as the master storage management unit, so as to switch the master-slave state of the storage management units in the storage management unit group of the distributed cluster.

11. The method according to claim 9, wherein, when there is data to be written to the storage management unit group, the master storage management unit in the storage management unit group is determined by querying the preset controller, and the data is written to the master storage management unit.

12. The method according to claim 1, wherein the method further comprises:
treating a data storage state of the storage management unit as a finite state machine, electing one leader node and at least one follower node in each of the master storage management unit and the slave storage management unit, and synchronizing, by the leader node, replicas of the finite state machine within the storage management unit group based on the data operation log.

13. A distributed cluster system, comprising a plurality of clusters and a storage system, wherein the plurality of clusters includes a master cluster and at least one slave cluster, and the storage system is connected to each of the plurality of clusters;
the storage system is configured to store the sequence number of the synchronization time slice; and the plurality of clusters each perform data synchronization using the method according to any one of claims 1 to 12.

14. The distributed cluster system according to claim 13, wherein the distributed cluster system further comprises a preset controller configured to control master-slave state switching of the storage management units and to record the master-slave states of the storage management units.

15. An electronic device, comprising a processor and a memory storing a program, wherein the program includes instructions that, when executed by the processor, cause the processor to perform the method according to any one of claims 1 to 12.

16. A non-transitory machine-readable medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1 to 12.
PCT/CN2024/092720 2023-05-19 2024-05-11 Data synchronization method for distributed cluster and related device therefor WO2024239992A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202310576529.7A CN117112690A (en) 2023-05-19 2023-05-19 Data synchronization method of distributed cluster and related equipment thereof
CN202310576529.7 2023-05-19

Publications (1)

Publication Number Publication Date
WO2024239992A1 (en) 2024-11-28

Family

ID=88795436

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2024/092720 WO2024239992A1 (en) 2023-05-19 2024-05-11 Data synchronization method for distributed cluster and related device therefor

Country Status (2)

Country Link
CN (1) CN117112690A (en)
WO (1) WO2024239992A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117112690A (en) * 2023-05-19 2023-11-24 阿里云计算有限公司 Data synchronization method of distributed cluster and related equipment thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104320401A (en) * 2014-10-31 2015-01-28 北京思特奇信息技术股份有限公司 Big data storage and access system and method based on distributed file system
US20190095293A1 (en) * 2016-07-27 2019-03-28 Tencent Technology (Shenzhen) Company Limited Data disaster recovery method, device and system
CN112612853A (en) * 2020-12-28 2021-04-06 深圳壹账通智能科技有限公司 Data processing method and device based on database cluster and electronic equipment
CN115599747A (en) * 2022-04-22 2023-01-13 北京志凌海纳科技有限公司(Cn) Metadata synchronization method, system and equipment of distributed storage system
CN117112690A (en) * 2023-05-19 2023-11-24 阿里云计算有限公司 Data synchronization method of distributed cluster and related equipment thereof

Also Published As

Publication number Publication date
CN117112690A (en) 2023-11-24

Similar Documents

Publication Publication Date Title
AU2018395919B2 (en) Efficiently propagating diff values
US20200356448A1 (en) Manifest-based snapshots in distributed computing environments
US10621049B1 (en) Consistent backups based on local node clock
US12210419B2 (en) Continuous data protection
US9747582B2 (en) Implementing a consistent ordering of operations in collaborative editing of shared content items
US20190129976A1 (en) Apparatus for controlling synchronization of metadata on network and method for the same
US10489378B2 (en) Detection and resolution of conflicts in data synchronization
CN107644030A (en) Data synchronization method for distributed database, relevant apparatus and system
CN114490677A (en) Data synchronization in a data analysis system
US20200104404A1 (en) Seamless migration of distributed systems
CN111666134B (en) A distributed task scheduling method and system
CN113032477B (en) GTID-based long-distance data synchronization method, device and computing equipment
CN117931531B (en) Data backup system, method, device, equipment, storage medium and program product
CN115185966A (en) Method and device for processing data consistency in distributed cluster
US11042454B1 (en) Restoration of a data source
WO2024239992A1 (en) Data synchronization method for distributed cluster and related device therefor
CN118796932A (en) Data synchronization method, device, equipment and storage medium
WO2022121387A1 (en) Data storage method and apparatus, server, and medium
CN113515574A (en) Data synchronization method and device
WO2024040902A1 (en) Data access method, distributed database system and computing device cluster
CN111522688B (en) Data backup method and device for distributed system
WO2024103898A1 (en) Database cluster management method and apparatus
US12174855B2 (en) Interrupted synchronization detection and recovery
US10185759B2 (en) Distinguishing event type
CN114968656A (en) Data rollback method, device, equipment and medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24810239

Country of ref document: EP

Kind code of ref document: A1