CN102394914A - Cluster brain-split processing method and device - Google Patents
Cluster brain-split processing method and device
- Publication number
- CN102394914A
- Authority
- CN
- China
- Prior art keywords
- cluster
- node
- heartbeat
- nodes
- business
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Hardware Redundancy (AREA)
Abstract
The invention provides a cluster split-brain processing method and device, relating to the field of computer technology applications, which solve the problem that split-brain handling in the prior art takes a single form and thereby reduces cluster working efficiency. The method comprises the following steps: each node in a cluster detects the heartbeat lines between itself and the other nodes in the cluster; when a node in the cluster cannot detect any heartbeat line, that node suspends the services running on it. The technical scheme provided by the invention is suitable for high-availability clusters and achieves flexible and effective split-brain handling.
Description
Technical field
The present invention relates to the field of computer technology applications, and in particular to a cluster split-brain processing method and device.
Background art
High-availability clustering is widely used in the storage field. To keep a high-availability cluster working normally, every node in the cluster must remain active while serving external requests, so that the cluster provides a stable service. While the cluster is providing service, changes in the environment may cause a node to fail in one way or another and become disconnected from the cluster; this is the split-brain phenomenon. When split-brain occurs, the services previously provided by the disconnected node may no longer behave normally, so the cluster cannot work properly. Detecting and responding to split-brain quickly and accurately therefore improves cluster performance.
The existing way of responding to split-brain and recovering a node is mainly to shut down the disconnected node and restart its computer system, restoring the node to its initial environment. Once restoration is complete, the node rejoins the cluster and provides service again, which ensures that the services it subsequently provides are stable. This approach guarantees stable service from the node, but in many cases, for example when a network cable is unplugged, restarting the computer system is unnecessary. After the restart, the system must also reinitialize its information, which is a relatively time-consuming process and reduces efficiency. In summary, the prior art handles split-brain in only one way, which reduces cluster working efficiency.
Summary of the invention
The invention provides a cluster split-brain processing method and device that solve the problem of split-brain being handled in only one way, which reduces cluster working efficiency.
A cluster split-brain processing method comprises:
each node in the cluster detecting the heartbeat lines between itself and the other nodes in the cluster;
when a node in the cluster cannot detect any heartbeat line, the node suspending the services running on it.
Preferably, after the step in which the node suspends the services running on it when it cannot detect any heartbeat line, the method further comprises:
after the node detects that the heartbeat on the heartbeat lines between itself and the other nodes in the cluster has recovered, restarting the services on the node.
Preferably, the step in which the node suspends the services running on it when it cannot detect any heartbeat line is specifically:
when a node in the cluster cannot detect any heartbeat line within a preset detection period, the node suspending the services running on it.
Preferably, the above cluster split-brain processing method further comprises:
when a node in the cluster can detect the heartbeat lines to only some of the other nodes in the cluster, determining that the heartbeat lines it cannot detect have failed.
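The three outcomes described above (all lines detected, only some detected, none detected) can be illustrated with a minimal Python sketch. The names `LinkVerdict`, `classify`, and `failed_lines` are hypothetical and do not come from the patent; the sketch only assumes that detection yields one boolean per heartbeat line, keyed by peer name.

```python
from enum import Enum
from typing import Dict, List


class LinkVerdict(Enum):
    HEALTHY = "healthy"        # every heartbeat line is detected
    LINE_FAULT = "line_fault"  # some lines are silent; the node is still in the cluster
    ISOLATED = "isolated"      # no line is detected; the node suspends its services


def classify(detected: Dict[str, bool]) -> LinkVerdict:
    """Map per-peer heartbeat detection results to a verdict for this node."""
    if not detected:
        raise ValueError("a cluster node needs at least one heartbeat line")
    if all(detected.values()):
        return LinkVerdict.HEALTHY
    if any(detected.values()):
        # Partial loss is treated as a fault of the silent lines only.
        return LinkVerdict.LINE_FAULT
    return LinkVerdict.ISOLATED


def failed_lines(detected: Dict[str, bool]) -> List[str]:
    """Return the peers whose heartbeat line was not detected, sorted by name."""
    return sorted(peer for peer, ok in detected.items() if not ok)
```

Only the `ISOLATED` verdict corresponds to suspending services; `LINE_FAULT` marks individual heartbeat lines as failed without treating the node as split from the cluster.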
The present invention also provides a cluster split-brain processing device, comprising:
a heartbeat management module, configured to detect the heartbeat lines between a node in the cluster and the other nodes in the cluster;
a cluster management module, configured to suspend the services on the node when no heartbeat line between the node and the other nodes in the cluster can be detected.
Preferably, the cluster management module is further configured to restart the services on the node after detecting that the heartbeat on the heartbeat lines between the node and the other nodes in the cluster has recovered.
Preferably, the heartbeat management module is further configured to determine, when the heartbeat lines to only some of the other nodes in the cluster can be detected, that the heartbeat lines that cannot be detected have failed.
The invention provides a cluster split-brain processing method and device. Each node in the cluster detects the heartbeat lines between itself and the other nodes in the cluster, and when a node cannot detect any heartbeat line, it suspends the services running on it. Suspending services replaces the direct system restart of the prior art, which shortens recovery time, improves the accuracy of split-brain handling, and preserves system working efficiency.
Description of drawings
Fig. 1 is a flow chart of the response to split-brain in the cluster split-brain processing method provided by Embodiment 1 of the invention;
Fig. 2 is a flow chart of the response to split-brain recovery in the cluster split-brain processing method provided by Embodiment 1 of the invention;
Fig. 3 is a flow chart of a cluster split-brain processing method provided by Embodiment 2 of the invention;
Fig. 4 is a structural diagram of a cluster split-brain processing device provided by Embodiment 2 of the invention.
Embodiment
In many cases, for example when a network cable is unplugged, restarting the computer system is unnecessary. After the restart, the system must also reinitialize its information, which is a relatively time-consuming process and reduces efficiency.
To solve this problem, embodiments of the invention provide a cluster split-brain processing method and device that detect and respond to split-brain quickly, stop the shared resources on the affected node, and stop the services that node provides, thereby protecting the shared resources. After the node's heartbeat recovers, its services can be restored directly, quickly, and efficiently. This protects the resources while also speeding up cluster recovery and improving the performance of the high-availability system.
Embodiments of the invention are described in detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another in any way.
Embodiment 1 of the invention is described first, with reference to the accompanying drawings.
This embodiment of the invention provides a cluster split-brain processing method and device. In a high-availability cluster, a node that finds its heartbeat disconnected does not shut down the operating system directly; it only stops the shared resources on the node and stops the services the node provides. After the node's heartbeat recovers, its services can be restored directly, quickly, and efficiently. The method protects the resources while also speeding up cluster recovery and improving the performance of the high-availability system. The cluster split-brain processing device provided by this embodiment comprises a heartbeat management module, a cluster management module, and a local resource management module.
Using this device and the cluster split-brain processing method provided by this embodiment, a node on which split-brain occurs is handled as follows:
1) In the heartbeat management module, the heartbeat module periodically checks the information on every heartbeat line of all nodes in the cluster. If no information is detected on a heartbeat line for the whole of a preconfigured period, that heartbeat line is judged to have failed. If all heartbeat lines of a node have failed, the node is judged to be disconnected from the other nodes in the cluster.
2) In the cluster management module, when the module receives a heartbeat-disconnected command from the heartbeat module, it evaluates a series of node information and then determines how the node is to be handled. If this node is the one disconnected from the cluster, it does not shut down the operating system directly; instead it starts the local resource management module, which stops the shared resources on the node and stops the services the node provides. The other, healthy nodes in the cluster take over the services of the disconnected node and continue to provide service externally.
3) After a heartbeat failure, the heartbeat management module keeps checking the information on every heartbeat line of each node. When it again detects heartbeat information on a failed heartbeat line, it sends a heartbeat-recovered command to the cluster management module.
4) On receiving the heartbeat-recovered command, the cluster management module acts according to the current state of the cluster. If the cluster is operating normally, the node's services are restored directly, quickly, and efficiently; if the cluster is already in the split-brain state, the services of the whole cluster are restored quickly.
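The four steps above can be sketched as a small Python model of the interaction between the heartbeat management module and the cluster management module. This is a simplified illustration under stated assumptions, not the patent's implementation: the class names `HeartbeatMonitor` and `ClusterManager`, the command strings `"disconnect"` and `"recover"`, and the injectable `clock` are all hypothetical, and the takeover of services by the other nodes is omitted.

```python
import time
from typing import Callable, Iterable, List


class HeartbeatMonitor:
    """Tracks when each heartbeat line was last heard (hypothetical helper)."""

    def __init__(self, peers: Iterable[str], timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        now = clock()
        self.last_seen = {peer: now for peer in peers}
        if not self.last_seen:
            raise ValueError("at least one heartbeat line is required")
        self.failed = set()  # lines currently judged to have failed

    def beat(self, peer: str) -> None:
        """Record heartbeat information received on the line to `peer`."""
        self.last_seen[peer] = self.clock()

    def poll(self) -> List[str]:
        """Re-evaluate every line and return commands for the cluster manager."""
        now = self.clock()
        was_isolated = len(self.failed) == len(self.last_seen)
        for peer, seen in self.last_seen.items():
            if now - seen >= self.timeout_s:
                self.failed.add(peer)      # silent for the whole preset period
            else:
                self.failed.discard(peer)  # heartbeat detected again
        is_isolated = len(self.failed) == len(self.last_seen)
        if is_isolated and not was_isolated:
            return ["disconnect"]
        if was_isolated and not is_isolated:
            return ["recover"]
        return []


class ClusterManager:
    """Suspends and resumes local services without rebooting the node."""

    def __init__(self) -> None:
        self.services_running = True

    def handle(self, command: str) -> None:
        if command == "disconnect":
            self.services_running = False  # stop shared resources and services
        elif command == "recover":
            self.services_running = True   # restore services directly
```

Injecting the clock keeps the sketch deterministic in tests; a real deployment would call `poll()` on a timer and feed `beat()` from the heartbeat transport.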
After a node is disconnected from the cluster, it does not shut down the operating system directly; it only stops the shared resources on the node and stops the services the node provides, which protects the shared resources. The invention also adds a heartbeat recovery detection mechanism: after the node's heartbeat recovers, its services can be restored directly, quickly, and efficiently, which speeds up cluster recovery and improves the performance of the high-availability system.
The invention is described in more detail below with reference to the accompanying drawings:
The master server for cluster management is itself a node in the cluster. It actively allocates the cluster's resources, assigning the cluster's services to different servers, which then provide them externally. The master server also deals directly with users: user operations on the cluster are dispatched by this node to the designated nodes.
Fig. 1 is the flow chart of the split-brain response described in this embodiment. When the heartbeat management module detects that the heartbeat of a node has stopped, it sends a node-disconnected command to the cluster management module. The cluster management module first removes the node from the cluster node information list and updates the list, determines whether the node is the master node, and then determines whether the node is this node. If this node is the one disconnected from the cluster, the local resource management module stops the shared resources on this node, stops the services this node provides, and waits for the heartbeat to recover.
If the disconnected node is not this node, the module calculates the initial number of nodes in the cluster and determines whether the cluster is a two-node 1+1 high-availability cluster. In a two-node high-availability cluster, this node actively pings a third-party IP address to determine whether it has itself been cut off from the network. If it has, the local resource management module stops the shared resources on this node, stops the services this node provides, and waits for the heartbeat to recover; if it has not, this node takes over as the master server for cluster management.
In a multi-node cluster, the number of currently alive nodes is compared with half of the initial number of nodes. If the number of alive nodes is less than half, the local resource management module stops the shared resources on this node, stops the services this node provides, and waits until the number of nodes with a recovered heartbeat exceeds one half. If the number of alive nodes equals one half, the module determines whether a master server exists among the alive nodes. If the number of alive nodes is greater than one half, the module determines whether the disconnected node is the master server. If it is, this node evaluates its own information and decides whether to take over as master server. If the disconnected node is not the master server, the module determines whether this node is the master server; if so, it migrates the services of the disconnected node to the other active nodes.
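The decision flow of Fig. 1 can be summarized as a pure function. The following Python sketch is one possible reading of the text, under stated assumptions: the `Action` names and the parameters of `on_node_disconnected` are hypothetical, `third_party_reachable` stands in for the result of pinging the third-party IP address, and the branch where exactly half of the nodes are alive only reports that a master check is needed, because the text does not say what follows that check.

```python
from enum import Enum


class Action(Enum):
    SUSPEND_LOCAL = "stop local resources and services, wait for heartbeat"
    TAKE_OVER_MASTER = "take over as master server"
    CHECK_MASTER_PRESENT = "check whether a master exists among alive nodes"
    DECIDE_MASTER_TAKEOVER = "evaluate own information, decide on takeover"
    MIGRATE_SERVICES = "migrate the lost node's services to active nodes"
    NONE = "no action on this node"


def on_node_disconnected(*, lost_is_self: bool, initial_nodes: int,
                         alive_nodes: int, lost_is_master: bool,
                         self_is_master: bool,
                         third_party_reachable: bool) -> Action:
    """Choose this node's reaction to a node-disconnected command."""
    if lost_is_self:
        return Action.SUSPEND_LOCAL
    if initial_nodes == 2:
        # Two-node 1+1 cluster: the ping result tells the node whether
        # it is the one that was cut off from the network.
        if third_party_reachable:
            return Action.TAKE_OVER_MASTER
        return Action.SUSPEND_LOCAL
    # Multi-node cluster: compare alive nodes with half of the initial count,
    # using integer arithmetic to avoid fractional halves.
    if 2 * alive_nodes < initial_nodes:
        return Action.SUSPEND_LOCAL
    if 2 * alive_nodes == initial_nodes:
        return Action.CHECK_MASTER_PRESENT
    if lost_is_master:
        return Action.DECIDE_MASTER_TAKEOVER
    if self_is_master:
        return Action.MIGRATE_SERVICES
    return Action.NONE
```

Keeping the decision separate from the side effects (stopping resources, pinging, migrating) makes each branch of the flow chart easy to check in isolation.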
Fig. 2 is the flow chart of the response to heartbeat recovery. When the heartbeat management module detects that a node's heartbeat has recovered, it sends a node-recovered command to the cluster management module, which first sends several join-request messages to all nodes in the cluster. Each node in the cluster, on receiving the join request, adds the node's information to its own node list, so that every node in the cluster knows the node exists. Each node then determines whether it is the master server; if it is, it replies to the recovering node to announce that a master server exists. The recovering node, after sending the join request, waits for a certain time for the master server's reply. If it receives the reply, it joins the cluster and can start services in the cluster. If it receives no reply, the master server does not exist; the recovering node then sends a command to all nodes in the cluster to decide a new master server. On receiving this command, each node evaluates its node information, a new master server is decided for the cluster, and the cluster's services are restarted.
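The recovery flow of Fig. 2 can be sketched in Python as a synchronous, in-memory simulation. Several parts are assumptions made for illustration only: the `Node` class, the `rejoin` function, and in particular the `priority` field used to pick a new master, since the text says only that each node evaluates its node information. Network messaging, repeated join requests, and the reply timeout are replaced by direct method calls.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    name: str
    priority: int                 # hypothetical election metric
    is_master: bool = False
    known: Set[str] = field(default_factory=set)

    def on_join_request(self, sender: str) -> Optional[str]:
        """Record the sender; only the master replies, with its own name."""
        self.known.add(sender)
        return self.name if self.is_master else None


def rejoin(recovered: Node, members: List[Node]) -> str:
    """Return the name of the master once `recovered` has rejoined."""
    # The recovering node sends its join request to every cluster member.
    replies = [m.on_join_request(recovered.name) for m in members]
    masters = [r for r in replies if r is not None]
    if masters:
        return masters[0]         # a master answered: simply join
    # No reply from a master: all nodes take part in deciding a new one.
    candidates = members + [recovered]
    winner = max(candidates, key=lambda n: (n.priority, n.name))
    winner.is_master = True
    return winner.name
```

For example, with members `a` (master) and `b`, a recovering node `c` learns that `a` is the master; if neither member is a master, the node with the highest `priority` is chosen, with the name as a tie-breaker.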
The cluster split-brain processing method and device provided by this embodiment respond quickly to a heartbeat-disconnected command by stopping local services and shared resources, while the master server reassigns the services of the disconnected node to other healthy nodes. This protects the resources and preserves service continuity. In addition, when the node's heartbeat recovers, its services can be restored directly, quickly, and efficiently, which speeds up cluster recovery and improves the performance of the high-availability system.
Embodiment 2 of the invention is described below with reference to the accompanying drawings.
This embodiment of the invention provides a cluster split-brain processing method. The flow for handling a split-brain node in the cluster with this method is shown in Fig. 3 and comprises:
Step 301: each node in the cluster detects the heartbeat lines between itself and the other nodes in the cluster;
When a node in the cluster cannot detect any heartbeat line within the preset detection period, the node suspends the services running on it.
After step 301, if a node in the cluster can detect one or more heartbeat lines but cannot detect all of them, split-brain has not occurred on the node; in this case, the heartbeat lines that cannot be detected may be judged to have failed.
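This embodiment of the invention also provides a cluster split-brain processing device, whose structure is shown in Fig. 4, comprising: a heartbeat management module 401, configured to detect the heartbeat lines between a node in the cluster and the other nodes in the cluster; and a cluster management module 402, configured to suspend the services on the node when no heartbeat line between the node and the other nodes in the cluster can be detected.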
Preferably, the cluster management module 402 is further configured to restart the services on the node after detecting that the heartbeat on the heartbeat lines between the node and the other nodes in the cluster has recovered.
Preferably, the heartbeat management module 401 is further configured to determine, when the heartbeat lines to only some of the other nodes in the cluster can be detected, that the heartbeat lines that cannot be detected have failed.
The above cluster split-brain processing device may be integrated on every node in the cluster to monitor each node and handle split-brain on it.
The cluster split-brain processing device provided by this embodiment may be combined with the cluster split-brain processing method provided by the embodiments of the invention. Each node in the cluster detects the heartbeat lines between itself and the other nodes in the cluster, and when a node cannot detect any heartbeat line, it suspends the services running on it. Suspending services replaces the direct system restart of the prior art, which shortens recovery time, improves the accuracy of split-brain handling, and preserves system working efficiency.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented by a computer program flow. The computer program may be stored in a computer-readable storage medium and executed on a corresponding hardware platform (such as a system, apparatus, or device); when executed, it performs one of the steps of the method embodiments or a combination of them.
Alternatively, all or some of the steps of the above embodiments may be implemented with integrated circuits: the steps may each be made into a separate integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module. The invention is thus not limited to any specific combination of hardware and software.
Each device, functional module, or functional unit in the above embodiments may be implemented with general-purpose computing devices; they may be concentrated on a single computing device or distributed over a network formed by multiple computing devices.
When a device, functional module, or functional unit in the above embodiments is implemented as a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. The computer-readable storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the protection scope of the invention. The protection scope of the invention shall therefore be determined by the protection scope of the claims.
Claims (7)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2011102825734A CN102394914A (en) | 2011-09-22 | 2011-09-22 | Cluster brain-split processing method and device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN102394914A true CN102394914A (en) | 2012-03-28 |
Family
ID=45862118
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN2011102825734A Pending CN102394914A (en) | 2011-09-22 | 2011-09-22 | Cluster brain-split processing method and device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN102394914A (en) |
Cited By (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102769673A (en) * | 2012-07-25 | 2012-11-07 | 楚云汉智武汉网络存储系统有限公司 | Failure detection method suitable to large-scale storage cluster |
| CN102904946A (en) * | 2012-09-29 | 2013-01-30 | 浪潮(北京)电子信息产业有限公司 | Intra-cluster node management method and device |
| CN103607310A (en) * | 2013-11-29 | 2014-02-26 | 华为技术有限公司 | Method for arbitration of remote disaster recovery |
| CN103684941A (en) * | 2013-11-23 | 2014-03-26 | 广东新支点技术服务有限公司 | Arbitration server based cluster split-brain prevent method and device |
| CN104094577A (en) * | 2012-08-13 | 2014-10-08 | 统一有限责任两合公司 | Method and apparatus for indirectly assessing a status of an active entity |
| CN104239182A (en) * | 2014-09-03 | 2014-12-24 | 北京鲸鲨软件科技有限公司 | Cluster file system split-brain processing method and device |
| CN104378232A (en) * | 2014-11-10 | 2015-02-25 | 东软集团股份有限公司 | Schizencephaly finding and recovering method and device under main joint and auxiliary joint cluster networking mode |
| CN104579765A (en) * | 2014-12-27 | 2015-04-29 | 北京奇虎科技有限公司 | Disaster tolerance method and device for cluster system |
| CN104994173A (en) * | 2015-07-16 | 2015-10-21 | 浪潮(北京)电子信息产业有限公司 | Message processing method and system |
| WO2016050074A1 (en) * | 2014-09-29 | 2016-04-07 | 中兴通讯股份有限公司 | Cluster split brain processing method and apparatus |
| CN105515838A (en) * | 2015-11-26 | 2016-04-20 | 青岛海信传媒网络技术有限公司 | Service configuration method and HA (High Available) cluster system |
| CN105704187A (en) * | 2014-11-27 | 2016-06-22 | 华为技术有限公司 | Processing method and apparatus of cluster split brain |
| CN105849702A (en) * | 2013-12-25 | 2016-08-10 | 日本电气方案创新株式会社 | Cluster system, server device, cluster system management method, and computer-readable recording medium |
| CN105933407A (en) * | 2016-04-20 | 2016-09-07 | 中国银联股份有限公司 | Method and system for achieving high availability of Redis cluster |
| CN107528724A (en) * | 2017-07-20 | 2017-12-29 | 北京奇安信科技有限公司 | A kind of optimized treatment method and device of node cluster |
| CN109088794A (en) * | 2018-08-20 | 2018-12-25 | 郑州云海信息技术有限公司 | A kind of fault monitoring method and device of node |
| CN110377487A (en) * | 2019-07-11 | 2019-10-25 | 无锡华云数据技术服务有限公司 | A kind of method and device handling high-availability cluster fissure |
| US12045667B2 (en) | 2021-08-02 | 2024-07-23 | International Business Machines Corporation | Auto-split and auto-merge clusters |
| US12333343B2 (en) | 2021-11-23 | 2025-06-17 | International Business Machines Corporation | Avoidance of workload duplication among split-clusters |
-
2011
- 2011-09-22 CN CN2011102825734A patent/CN102394914A/en active Pending
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101291243A (en) * | 2007-04-16 | 2008-10-22 | 广东省新支点技术服务有限公司 | Split-brain prevention method for high-availability cluster system |
| CN101179432A (en) * | 2007-12-13 | 2008-05-14 | 浪潮电子信息产业股份有限公司 | A Method of Realizing System High Availability in Multi-machine Environment |
| CN101651680A (en) * | 2009-09-14 | 2010-02-17 | 杭州华三通信技术有限公司 | Network safety allocating method and network safety device |
Cited By (29)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102769673B (en) * | 2012-07-25 | 2015-03-25 | 深圳市中博科创信息技术有限公司 | Failure detection method suitable to large-scale storage cluster |
| CN102769673A (en) * | 2012-07-25 | 2012-11-07 | 楚云汉智武汉网络存储系统有限公司 | Failure detection method suitable to large-scale storage cluster |
| CN104094577A (en) * | 2012-08-13 | 2014-10-08 | 统一有限责任两合公司 | Method and apparatus for indirectly assessing a status of an active entity |
| CN104094577B (en) * | 2012-08-13 | 2017-07-04 | 统一有限责任两合公司 | Method and apparatus for indirectly assessing the state of an active entity |
| CN102904946B (en) * | 2012-09-29 | 2015-06-10 | 浪潮(北京)电子信息产业有限公司 | Method and device for managing nodes in cluster |
| CN102904946A (en) * | 2012-09-29 | 2013-01-30 | 浪潮(北京)电子信息产业有限公司 | Intra-cluster node management method and device |
| CN103684941A (en) * | 2013-11-23 | 2014-03-26 | 广东新支点技术服务有限公司 | Arbitration server based cluster split-brain prevent method and device |
| CN103684941B (en) * | 2013-11-23 | 2018-01-16 | 广东中兴新支点技术有限公司 | Cluster based on arbitrating server splits brain preventing method and device |
| CN103607310A (en) * | 2013-11-29 | 2014-02-26 | 华为技术有限公司 | Method for arbitration of remote disaster recovery |
| CN105849702A (en) * | 2013-12-25 | 2016-08-10 | 日本电气方案创新株式会社 | Cluster system, server device, cluster system management method, and computer-readable recording medium |
| US10102088B2 (en) | 2013-12-25 | 2018-10-16 | Nec Solution Innovators, Ltd. | Cluster system, server device, cluster system management method, and computer-readable recording medium |
| CN104239182A (en) * | 2014-09-03 | 2014-12-24 | 北京鲸鲨软件科技有限公司 | Cluster file system split-brain processing method and device |
| CN104239182B (en) * | 2014-09-03 | 2017-05-03 | 北京鲸鲨软件科技有限公司 | Cluster file system split-brain processing method and device |
| WO2016050074A1 (en) * | 2014-09-29 | 2016-04-07 | 中兴通讯股份有限公司 | Cluster split brain processing method and apparatus |
| CN104378232A (en) * | 2014-11-10 | 2015-02-25 | 东软集团股份有限公司 | Schizencephaly finding and recovering method and device under main joint and auxiliary joint cluster networking mode |
| CN104378232B (en) * | 2014-11-10 | 2018-01-19 | 东软集团股份有限公司 | Fissure discovery, restoration methods and device under active and standby cluster networking pattern |
| CN105704187A (en) * | 2014-11-27 | 2016-06-22 | 华为技术有限公司 | Processing method and apparatus of cluster split brain |
| CN105704187B (en) * | 2014-11-27 | 2019-03-05 | 华为技术有限公司 | A method and device for treating cluster split-brain |
| CN104579765A (en) * | 2014-12-27 | 2015-04-29 | 北京奇虎科技有限公司 | Disaster tolerance method and device for cluster system |
| CN104994173A (en) * | 2015-07-16 | 2015-10-21 | 浪潮(北京)电子信息产业有限公司 | Message processing method and system |
| CN105515838A (en) * | 2015-11-26 | 2016-04-20 | 青岛海信传媒网络技术有限公司 | Service configuration method and HA (High Available) cluster system |
| CN105933407A (en) * | 2016-04-20 | 2016-09-07 | 中国银联股份有限公司 | Method and system for achieving high availability of Redis cluster |
| CN105933407B (en) * | 2016-04-20 | 2019-12-06 | 中国银联股份有限公司 | method and system for realizing high availability of Redis cluster |
| CN107528724A (en) * | 2017-07-20 | 2017-12-29 | 北京奇安信科技有限公司 | A kind of optimized treatment method and device of node cluster |
| CN107528724B (en) * | 2017-07-20 | 2020-09-29 | 奇安信科技集团股份有限公司 | Optimization processing method and device for node cluster |
| CN109088794A (en) * | 2018-08-20 | 2018-12-25 | 郑州云海信息技术有限公司 | A kind of fault monitoring method and device of node |
| CN110377487A (en) * | 2019-07-11 | 2019-10-25 | 无锡华云数据技术服务有限公司 | A kind of method and device handling high-availability cluster fissure |
| US12045667B2 (en) | 2021-08-02 | 2024-07-23 | International Business Machines Corporation | Auto-split and auto-merge clusters |
| US12333343B2 (en) | 2021-11-23 | 2025-06-17 | International Business Machines Corporation | Avoidance of workload duplication among split-clusters |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN102394914A (en) | Cluster brain-split processing method and device | |
| CN108847982B (en) | Distributed storage cluster and node fault switching method and device thereof | |
| CN102355369B (en) | Virtual clustered system as well as processing method and processing device thereof | |
| CN106330475B (en) | A method and device for managing active and standby nodes in a communication system and a high-availability cluster | |
| US9348706B2 (en) | Maintaining a cluster of virtual machines | |
| CN112887367B (en) | Method, system and computer readable medium for realizing high availability of distributed cluster | |
| CN105933407B (en) | method and system for realizing high availability of Redis cluster | |
| CN102360324B (en) | Failure recovery method and equipment for failure recovery | |
| CN103559108A (en) | Method and system for carrying out automatic master and slave failure recovery on the basis of virtualization | |
| CN105095001A (en) | Virtual machine exception recovery method under distributed environment | |
| WO2013102812A1 (en) | A fault tolerant system in a loosely-coupled cluster environment | |
| CN110134518A (en) | A method and system for improving the high availability of multi-node applications in a big data cluster | |
| CN113055203B (en) | Method and device for recovering exception of SDN control plane | |
| CN112380062A (en) | Method and system for rapidly recovering system for multiple times based on system backup point | |
| CN114116912A (en) | Method for realizing high availability of database based on Keepalived | |
| CN107276839A (en) | A kind of cloud platform from monitoring method and system | |
| CN116668269A (en) | Arbitration method, device and system for dual-activity data center | |
| CN110971662A (en) | Two-node high-availability implementation method and device based on Ceph | |
| CN100426751C (en) | Method for ensuring accordant configuration information in cluster system | |
| CN102457400B (en) | Method for preventing split brain of disk mirror image resource | |
| CN118277488A (en) | A hyper-converged system distributed storage cluster management method, device and medium | |
| CN112612653B (en) | A business recovery method, device, arbitration server and storage system | |
| CN111309515B (en) | A disaster recovery control method, device and system | |
| CN109002478A (en) | The fault handling method and relevant device of distributed file system | |
| CN101686261A (en) | RAC-based redundant server system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C12 | Rejection of a patent application after its publication | ||
| RJ01 | Rejection of invention patent application after publication |
Application publication date: 20120328 |