
CN100426751C - Method for ensuring accordant configuration information in cluster system - Google Patents

Info

Publication number
CN100426751C
Authority
CN
China
Prior art keywords
information
token
management module
cluster resource
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2006100651544A
Other languages
Chinese (zh)
Other versions
CN1874267A (en
Inventor
黄西华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CNB2006100651544A
Publication of CN1874267A
Application granted
Publication of CN100426751C
Expired - Fee Related
Anticipated expiration

Landscapes

  • Hardware Redundancy (AREA)

Abstract

The invention provides a method for ensuring consistent configuration information in a cluster system, comprising: a cluster resource backup management module monitors the working condition of each node server in the cluster system; when application information on a node server changes, the configuration information stored in the cluster resource backup management module is updated, and the token information in that module is updated, the token information being used to identify the update status of the configuration information; after the cluster resource management module restarts, the configuration information in the cluster resource management module or the cluster resource backup management module is updated according to the token information. The method effectively ensures the consistency of configuration information on the node servers of the cluster system, improves the survivability and recoverability of configuration information in disaster situations, and enhances the reliability of the cluster system.

Figure 200610065154

Description

A method for ensuring consistent configuration information in a cluster system

Technical field

The invention relates to computer cluster technology, and in particular to a method for ensuring consistent configuration information in a cluster system.

Background art

Cluster technology manages a group of mutually independent servers on a network as a single system in order to achieve high availability, manageability, and scalability, providing parallel processing of services, load balancing, fault tolerance, and disaster recovery. A cluster is a loosely coupled collection of computer nodes. The cluster entities (such as nodes, resources, and resource groups) and the cluster itself are managed through the Cluster Administrator, a graphical administration tool for maintenance, monitoring, and failover management. A cluster contains at least two servers with shared data storage. When any server runs an application, the application data is stored in the shared data space, while each server's operating system and application files are stored on its own local storage. The node servers in the cluster communicate with each other over an internal local area network. When a node server fails, the applications running on it are automatically taken over by another node server. When an application service fails, it is restarted or taken over by another server.

FIG. 1 is a schematic diagram of the network structure of a cluster system containing two servers with shared data storage. The primary machine and the standby machine back each other up. In active/standby mode, the primary normally provides the application service, and the standby takes over when the primary fails and can no longer provide the service. In active/active mode, some of the application's processes run on the primary and the rest run on the standby; when either machine fails and can no longer provide the service, the other takes over its work.

In the prior art, in both active/standby and active/active mode, the primary and the standby are independent machines, and all configuration information is stored separately on each machine. This has the following drawback:

When the configuration information on node server A is updated, the configuration information on node server B, which backs up node server A, is not updated in time, so the configuration information cannot be kept synchronized and consistent. As a result, when node server A, which provides the application service, fails, node server B cannot correctly take over the service because its configuration information is stale. Reliability is low, and users suffer losses.

Summary of the invention

The technical problem to be solved by the invention is to address the defects of existing cluster systems by providing a cluster system and a method for ensuring consistent configuration information in the cluster system, so that the configuration information on the node servers stays synchronized and consistent and the reliability of the cluster system improves.

A method of the invention for ensuring consistent configuration information in a cluster system performs the following steps:

Step 1: the cluster resource backup management module (Watchman Cluster Server Back, WMCSB) monitors the working condition of each node server in the cluster system;

Step 2: when application information on a node server changes, the configuration information stored in the cluster resource backup management module is updated, and the token information in that module is updated, the token information being used to identify the update status of the configuration information;

Step 3: after the cluster resource management module (Watchman Cluster Server, WMCS) restarts, the configuration information in the cluster resource management module or the cluster resource backup management module is updated according to the token information.

In the above technical solution, the following operation is performed before step 1: token information with identical initial values is set in both the cluster resource management module and the cluster resource backup management module. The token information includes token group identification information and token update information.

Updating the token update information in step 2 means updating the token number, whose magnitude identifies the update status of the configuration information.

Step 3 includes:

Step 301: after the cluster resource management module restarts, it establishes a connection with the cluster resource backup management module, and the two modules obtain each other's token information;

Step 302: each module compares the token number in the received token information with the token number in its own stored token information; if they are equal, step 303 is executed; otherwise, step 305 is executed;

Step 303: alarm information is sent to the management tool;

Step 304: the user specifies a data source, which is used to update the configuration information in both the cluster resource management module and the cluster resource backup management module, and then step 306 is executed;

Step 305: according to a preset parameter, whichever of the two modules has the larger (or the smaller) token number updates the other's configuration information;

Step 306: end.

Between step 301 and step 302, the cluster resource management module and the cluster resource backup management module each analyze the token group identification information in the received token information and determine whether the token group identifier equals the one in their own stored token information. If so, step 302 is executed; otherwise, step 306 is executed. Updating the other party's configuration information in step 305 includes: the updating module (the cluster resource management module or the cluster resource backup management module) sends its configuration information and an update request to the other party, and the receiving module uses the received configuration information to update its own stored configuration information.
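The token-number flow of steps 301 to 306, including the token group check, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Module` class, the `reconcile_by_number` function, the `larger_wins` flag (standing in for the "preset parameter"), and the returned status strings are all hypothetical names, and the network exchange of step 301 is assumed to have already happened.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical stand-in for a WMCS or WMCSB instance."""
    name: str
    group_id: str
    token_number: int
    config: dict = field(default_factory=dict)


def reconcile_by_number(a: Module, b: Module, larger_wins: bool = True) -> str:
    """Walk steps 301-306 for two modules that have exchanged tokens."""
    # Check between steps 301 and 302: a different token group ID
    # sends the flow straight to the end (step 306).
    if a.group_id != b.group_id:
        return "end: token group mismatch"
    # Step 302: equal token numbers cannot name a source.
    if a.token_number == b.token_number:
        # Steps 303-304: raise an alarm; the user specifies the data source.
        return "alarm: user must specify data source"
    # Step 305: the preset parameter picks the larger or smaller number.
    a_is_source = (a.token_number > b.token_number) == larger_wins
    source, target = (a, b) if a_is_source else (b, a)
    # The source sends its configuration; the target overwrites its own copy.
    target.config = dict(source.config)
    return f"{source.name} updated {target.name}"
```

With `larger_wins=True`, a WMCSB that kept counting changes while the WMCS was down ends up as the source, which matches the case where the larger token number marks the newer configuration.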

In addition, updating the token update information in step 2 may instead mean updating the token update time. Correspondingly, step 3 includes:

Step 311: after the cluster resource management module restarts, it establishes a connection with the cluster resource backup management module, and the two modules obtain each other's token information;

Step 312: each module compares the token update time in the received token information with the token update time in its own stored token information, determining whether the two are equal or differ by less than a predetermined value; if so, step 313 is executed; otherwise, step 315 is executed;

Step 313: alarm information is sent to the management tool;

Step 314: the user specifies a data source, which is used to update the configuration information in both the cluster resource management module and the cluster resource backup management module, and then step 316 is executed;

Step 315: whichever of the two modules has the later token update time updates the other's configuration information;

Step 316: end.

Between step 311 and step 312, the cluster resource management module and the cluster resource backup management module each analyze the token group identification information in the received token information and determine whether the token group identifier equals the one in their own stored token information. If so, step 312 is executed; otherwise, step 316 is executed.
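The time-based variant (steps 311 to 316) differs from the token-number variant mainly in its tie rule: update times that are equal, or closer together than a predetermined value, trigger the alarm. A minimal sketch, assuming timestamps are plain seconds and using the hypothetical names `pick_source_by_time` and `min_gap`:

```python
def pick_source_by_time(local_group: str, local_time: float,
                        peer_group: str, peer_time: float,
                        min_gap: float = 1.0) -> str:
    """Return "local", "peer", "alarm", or "end" for steps 311-316."""
    # Check between steps 311 and 312: different token groups end the flow.
    if local_group != peer_group:
        return "end"
    # Step 312: equal times, or a gap below the predetermined value,
    # are treated as undecidable (steps 313-314: alarm, user decides).
    gap = abs(local_time - peer_time)
    if gap == 0 or gap < min_gap:
        return "alarm"
    # Step 315: the module with the later update time is the source.
    return "local" if local_time > peer_time else "peer"
```

The value of `min_gap` is an assumption; the source only says the difference is compared against a predetermined value.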

The token update information updated in step 2 may also be both the token number and the token update time. Correspondingly, step 3 includes:

Step 321: after the cluster resource management module restarts, it establishes a connection with the cluster resource backup management module, and the two modules obtain each other's token information;

Step 322: each module analyzes the weight information in the received token information to determine whether the token number decides the update control right; if so, step 323 is executed; otherwise, step 325 is executed;

Step 323: each module compares the token number in the received token information with the token number in its own stored token information; if they are not equal, step 324 is executed; otherwise, step 327 is executed;

Step 324: according to a preset parameter, whichever of the two modules has the larger (or the smaller) token number updates the other's configuration information, and then step 329 is executed;

Step 325: each module compares the token update time in the received token information with the token update time in its own stored token information, determining whether the two are equal or differ by less than a predetermined value; if not, step 326 is executed; if so, step 327 is executed;

Step 326: whichever of the two modules has the later token update time updates the other's configuration information, and then step 329 is executed;

Step 327: alarm information is sent to the management tool;

Step 328: the user specifies a data source, which is used to update the configuration information in both the cluster resource management module and the cluster resource backup management module;

Between step 321 and step 322, the cluster resource management module and the cluster resource backup management module each analyze the token group identification information in the received token information and determine whether the token group identifier equals the one in their own stored token information. If so, step 322 is executed; otherwise, step 329 is executed.
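When the token carries both a number and an update time, the weight information selects which comparison applies. The sketch below follows steps 321 to 329 under several assumptions: the `Token` tuple layout, the `decide` function, and the `larger_wins` and `min_gap` parameters are hypothetical, and both modules are assumed to carry the same weight value, so only the received one is inspected.

```python
from typing import NamedTuple


class Token(NamedTuple):
    group_id: str
    number: int
    updated_at: float
    weight: int  # 0: token number decides; 1: token update time decides


def decide(local: Token, peer: Token,
           larger_wins: bool = True, min_gap: float = 1.0) -> str:
    """Return "local", "peer", "alarm", or "end" for steps 321-329."""
    # Check between steps 321 and 322: token groups must match.
    if local.group_id != peer.group_id:
        return "end"
    # Step 322: read the weight from the received token.
    if peer.weight == 0:
        # Step 323: equal numbers lead to the alarm path (steps 327-328).
        if local.number == peer.number:
            return "alarm"
        # Step 324: the preset parameter picks larger or smaller.
        local_is_source = (local.number > peer.number) == larger_wins
    else:
        # Step 325: equal or near-equal times lead to the alarm path.
        gap = abs(local.updated_at - peer.updated_at)
        if gap == 0 or gap < min_gap:
            return "alarm"
        # Step 326: the later update time is the source.
        local_is_source = local.updated_at > peer.updated_at
    return "local" if local_is_source else "peer"
```

Keeping the two comparisons in separate branches mirrors the source, where the weight chooses one criterion outright rather than blending the two.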

In any of the above methods, updating the configuration information stored in the cluster resource backup management module in step 2 specifically means that the cluster resource backup management module uses the changed application update information to update the configuration information it stores.

Any of the above methods may further include updating the configuration information in the cluster management agent module (Watchman Cluster Agent, WMCA), as follows:

When the cluster resource backup management module receives connection request information from a cluster management agent module, it obtains the agent module's address information from the request, extracts the information corresponding to that address from its stored configuration information, and uses that information to update the configuration information in the cluster management agent module.
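The agent update described above amounts to a lookup keyed by the agent's address. A minimal sketch, assuming the stored configuration is a dictionary keyed by node address and the connection request is a dictionary with an `"address"` field; both layouts, and the function names, are illustrative assumptions rather than anything the source specifies:

```python
def config_for_agent(stored_config: dict, agent_address: str) -> dict:
    """Extract the part of the stored configuration for one agent address."""
    # An unknown address yields an empty configuration (an assumption).
    return dict(stored_config.get(agent_address, {}))


def on_agent_connect(stored_config: dict, request: dict,
                     agent_configs: dict) -> dict:
    """Handle a WMCA connection request by pushing its configuration."""
    address = request["address"]
    # Overwrite whatever the agent held before it went offline.
    agent_configs[address] = config_for_agent(stored_config, address)
    return agent_configs[address]
```

Here `agent_configs` stands in for the configuration held by the agents themselves; in a real system the extracted configuration would be sent over the connection instead.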

Based on the above technical solution, the invention uses token information to keep the configuration information in WMCS and WMCSB consistent. This keeps the configuration information in all WMCA, WMCS, and WMCSB instances in the cluster system consistent and automatically synchronized, improves the survivability and recoverability of configuration information in disaster situations, and enhances the reliability of the system.

The technical solution of the invention is described in further detail below with reference to the drawings and embodiments.

Brief description of the drawings

FIG. 1 is a schematic diagram of the network structure of a prior-art cluster system;

FIG. 2 is a schematic diagram of the physical distribution of a cluster system implementing the method of the invention for ensuring consistent configuration information in a cluster system;

FIG. 3 is a schematic diagram of the physical distribution of another cluster system implementing the method of the invention;

FIG. 4 is a flowchart of the method of the invention;

FIG. 5 is another flowchart of the method of the invention;

FIG. 6 is a further flowchart of the method of the invention;

FIG. 7 is yet another flowchart of the method of the invention.

Detailed description

In the cluster system, each node server hosts a WMCA that manages the local resources of that node server; one node server hosts the WMCS, which manages the resources of the cluster system; and another node server hosts the WMCSB, which provides backup management of the cluster's resources. Under normal circumstances, the WMCS manages the cluster's resources. When application information on a node server changes, for example when a user adds a process, deletes a process, or modifies process attributes on a node server through the WMCS, the WMCS updates the configuration information in the WMCA on that node server and in the WMCSB. When the WMCS fails, the WMCSB takes its place and manages the resources of all node servers in the cluster system. When a WMCA restarts after a failure and reconnects to the WMCS, the WMCS reconfigures the WMCA's configuration information, eliminating any differences that arose while the WMCA was offline.

FIG. 2 shows the physical distribution of a cluster system implementing the method of the invention. The cluster system includes n node servers (not shown in the figure), where n is an integer greater than 1. Each node server has a WMCA (WMCA10, WMCA20, ... WMCAn0) that manages the resources on that node server. Two of the node servers, for example node server 1 and node server 2, respectively host WMCS11, which manages the resources on all node servers in the cluster, and WMCSB21, which provides backup management of those resources when the cluster resource management module fails. WMCS11 and WMCSB21 are communicatively connected to each other and to WMCA10, WMCA20, ... WMCAn0. Under normal circumstances, the cluster's configuration information is stored in WMCS11, and WMCSB21 keeps a real-time synchronized backup of it. Specifically, the connections between WMCS11 and WMCSB21, and between each of them and WMCA10, WMCA20, ... WMCAn0, use TCP/IP.

WMCA10, WMCA20, ... WMCAn0 all maintain connections with both WMCS11 and WMCSB21, and WMCSB21 keeps a real-time synchronized backup of the configuration information in WMCS11, so the two hold the same configuration information. If either WMCSB21 or WMCS11 fails, for example through a communication failure caused by a physical fault such as a bad network card, the cluster's resources can still be managed through whichever of the two has not failed. If any WMCA fails, it is connected to both WMCSB21 and WMCS11, and the probability of both failing at the same time is very low, so once the WMCA recovers, its configuration information can be promptly updated by WMCSB21 or WMCS11. This keeps the configuration information in WMCA10, WMCA20, ... WMCAn0, WMCS11, and WMCSB21 consistent and automatically synchronized, improves the security and reliability of the cluster system, and avoids configuration differences among WMCS, WMCSB, and WMCA caused by node server failures.

FIG. 3 shows the physical distribution of another cluster system implementing the method of the invention. Building on the cluster system of FIG. 2, it adds a visual management device, the Cluster Administrator (CA) 0. CA0 is connected to both WMCS11 and WMCSB21 and is used to manage them. In particular, when the configuration information of WMCS11 and WMCSB21 conflicts, CA0 can forcibly manage the two, which improves the security and reliability of the cluster system and the efficiency of maintaining it.

To determine whether WMCS11 or WMCSB21 holds the newer configuration information, the invention sets token information with identical initial values in WMCS11 and WMCSB21. The token information identifies how each of WMCS11 and WMCSB21 has updated its configuration information in the subsequent flow, and includes a token group ID and token update information. Because the configuration information in WMCS11 and WMCSB21 is identical at the moment the token information is set, their token information is identical too. The token group ID identifies the cluster system. Because WMCSB21 backs up WMCS11, the two belong to the same cluster system and have the same token group ID, which, once set, does not change when configuration information is updated.

The token update information may be a token number, a token update time, or a combination of the two. When it is a token number, the magnitude of the token number identifies the update status of the configuration information. For example, the token number is initialized to 0, and each time the configuration information in WMCS11 or WMCSB21 changes, the token number in that module's token information is incremented by 1. The token numbers therefore show whether WMCS11 or WMCSB21 was updated last, that is, which holds the newest configuration information, and the module with the larger token number has the update control right over the other's configuration information. Similarly, when the token update information is a token update time, the later (numerically larger) update time indicates the newer configuration information, and the corresponding WMCS11 or WMCSB21 has the update control right. If the token update information includes both a token number and a token update time, the token information must also carry weight information stating which of the two decides the update control right. For example, the weight information may be 0 or 1: 0 means the token number decides, and 1 means the token update time decides.
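The token layout described above can be pictured as a small record. The field names and the `TokenInfo` class below are assumptions chosen for illustration; the source specifies only the contents (token group ID, token number, token update time, weight) and the rule that the number starts at 0 and grows by 1 per configuration change.

```python
from dataclasses import dataclass


@dataclass
class TokenInfo:
    group_id: str            # identifies the cluster; never changes once set
    number: int = 0          # starts at 0, +1 per configuration change
    updated_at: float = 0.0  # time of the last configuration change
    weight: int = 0          # 0: number decides; 1: update time decides

    def record_update(self, now: float) -> None:
        """Mark one configuration change on the owning module."""
        self.number += 1
        self.updated_at = now

    def decided_by(self) -> str:
        """Name the field that decides the update control right."""
        return "token number" if self.weight == 0 else "token update time"
```

Passing the timestamp in as `now` rather than reading the clock inside the method keeps the record easy to test and sidesteps the question of which clock the modules share, a point the source leaves open.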

FIG. 4 shows one specific flow of the method of the invention, which performs the following steps:

Step 1: when WMCS11 fails, WMCSB21 monitors the working condition of each node server in the cluster system.

Under normal circumstances, WMCS11 manages the cluster's resources. WMCSB21 holds a socket connection to WMCS11, keeps a real-time synchronized backup of the configuration information in WMCS11, and monitors the working condition of WMCS11.

Step 2: when application information on a node server in the cluster system changes, WMCSB21 uses the changed application update information to update the configuration information it stores and, at the same time, updates the token update information in its token information, for example by incrementing the token number by 1 or by setting the token update time to the time at which the stored configuration information was updated. A change in application information on a node server may be a change in process information caused by adding or deleting a process on that node server, or a change in process attribute information.

Step 3: after the fault in WMCS11 is cleared and WMCS11 restarts, it establishes a socket connection with WMCSB21. WMCS11 and WMCSB21 obtain each other's token information, analyze the token update information in it, and decide from it which of them holds the update control right for the configuration information. The one that obtains the update control right uses its own current configuration information to update the other's, synchronizing the configuration information.
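Step 2 pairs every configuration change with a token update. A minimal sketch of that pairing for the three kinds of change the text names (adding a process, deleting a process, modifying process attributes), assuming a hypothetical `BackupManager` class whose configuration maps node names to processes:

```python
class BackupManager:
    """Hypothetical stand-in for WMCSB21 while WMCS11 is down."""

    def __init__(self) -> None:
        self.config: dict = {}   # node -> {process name -> attributes}
        self.token_number = 0

    def apply_change(self, node: str, process: str, attributes) -> None:
        """Apply one application change and bump the token number.

        attributes=None deletes the process; any other value adds the
        process or replaces its attributes.
        """
        processes = self.config.setdefault(node, {})
        if attributes is None:
            processes.pop(process, None)
        else:
            processes[process] = dict(attributes)
        # Every configuration change increments the token number by 1.
        self.token_number += 1
```

Because the increment lives in the same method as the configuration write, the token number cannot drift from the number of applied changes, which is what lets step 3 treat it as a measure of freshness.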

Determining the update control right from the token update information, and replacing the old configuration information with the new configuration information, ensures real-time synchronization of the configuration information in WMCSB21 and WMCS11. This improves the security and reliability of the cluster system and effectively avoids configuration differences between WMCS and WMCSB caused by node server failures.

In addition, in step 2 of the above embodiment, if the WMCA on the node server whose application information has changed is WMCAn0, WMCSB21 also uses the application update information to update the configuration information of WMCAn0. Specifically, WMCSB21 sends the application update information to WMCAn0, and WMCAn0 uses it to update its own configuration information.

When a WMCA restarts after a fault, it establishes a socket connection with WMCSB21. When WMCSB21 detects that the WMCA has connected, it obtains the WMCA's address information, analyzes its own stored configuration information, extracts the configuration information corresponding to that address, and sends it to the WMCA, which is then reconfigured accordingly. This effectively avoids configuration differences between the WMCA and WMCSB21 caused by the fault.
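The address-based lookup described above can be sketched as follows, assuming (hypothetically) that the backup module keeps its configuration in a dictionary keyed by agent address. The function name `config_for_agent` and the data layout are illustrative assumptions, not taken from the patent.

```python
def config_for_agent(stored_config: dict, agent_address: str) -> dict:
    """Return a copy of the configuration stored for one agent (WMCA).

    stored_config maps an agent address to that agent's configuration.
    An unknown address yields an empty configuration.
    """
    return dict(stored_config.get(agent_address, {}))
```

Returning a copy keeps the backup module's stored configuration from being modified by whatever code later sends the result to the agent.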

The specific method for updating the configuration information in WMCS11 and WMCSB21 depends on the content of the token update information in the token information. Fig. 5 is a flowchart of another embodiment of the method of the present invention for ensuring consistent configuration information in a cluster system, for the case where the token update information is token number information. Steps 1 and 2 of this embodiment are the same as in the embodiment shown in Fig. 4, and step 3 is performed as follows:

Step 301. After the fault of WMCS11 has been cleared, WMCS11 establishes a socket connection with WMCSB21; WMCS11 and WMCSB21 then send request information to each other to obtain each other's token information;

Step 302. WMCS11 and WMCSB21 each analyze the received token number information and check whether the token number in it equals the token number in their own stored token information. If so, step 303 is performed; otherwise, step 305 is performed;

Step 303. WMCS11 and WMCSB21 send alarm information to CA0;

Step 304. The user specifies a data source through CA0, and that data source is used to update the configuration information in WMCS11 and WMCSB21; step 306 is then performed;

When the token information cannot determine whether WMCS11 or WMCSB21 holds the update control right for the configuration information, alarm information can be sent to the user. The user then specifies the data source and forces an update of the configuration information in WMCS11 and WMCSB21 through an intuitive, graphical management tool, which improves the maintenance efficiency of the cluster system.

Step 305. According to a preset parameter, whichever of WMCS11 and WMCSB21 has the larger (or the smaller) token number holds the update control right for the configuration information. It sends the other party its configuration information together with update instruction information directing the receiver to update its configuration information with it, and the receiving WMCSB21 or WMCS11 uses the received configuration information to update its own stored configuration information;

Step 306. End.
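The decision made in steps 302 to 305 can be sketched as follows. This is an illustration under stated assumptions: `Decision`, `decide_by_number` and the `larger_wins` flag (standing in for the "preset parameter") are hypothetical names, and equal token numbers are routed to the alarm path of steps 303 and 304.

```python
from enum import Enum


class Decision(Enum):
    ALARM = "alarm"                            # steps 303-304: user picks the data source
    LOCAL_UPDATES_PEER = "local_updates_peer"  # step 305: local side holds control
    PEER_UPDATES_LOCAL = "peer_updates_local"  # step 305: peer side holds control


def decide_by_number(local_number: int, peer_number: int,
                     larger_wins: bool = True) -> Decision:
    """Pick the side that holds the update control right by token number."""
    if local_number == peer_number:
        # The token numbers cannot distinguish the two sides.
        return Decision.ALARM
    local_is_larger = local_number > peer_number
    if local_is_larger == larger_wins:
        return Decision.LOCAL_UPDATES_PEER
    return Decision.PEER_UPDATES_LOCAL
```

Because both modules run the same comparison on mirrored inputs, they reach complementary conclusions: when one side computes `LOCAL_UPDATES_PEER`, the other computes `PEER_UPDATES_LOCAL`.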

Fig. 6 is a flowchart of yet another embodiment of the method of the present invention for ensuring consistent configuration information in a cluster system, for the case where the token update information is token update time information. Steps 1 and 2 of this embodiment are the same as in the embodiment shown in Fig. 4, and step 3 is performed as follows:

Step 311. After the fault of WMCS11 has been cleared, WMCS11 establishes a socket connection with WMCSB21; WMCS11 and WMCSB21 then send request information to each other to obtain each other's token information;

Step 312. WMCS11 and WMCSB21 each analyze the received token update time information and check whether the token update time in it equals the token update time in their own stored token information, or whether the difference between the two is less than a predetermined value, for example 1 minute. If so, step 313 is performed; otherwise, step 315 is performed;

Step 313. WMCS11 and WMCSB21 send alarm information to CA0;

Step 314. The user specifies a data source through CA0, and that data source is used to update the configuration information in WMCS11 and WMCSB21; step 316 is then performed;

Step 315. According to a preset parameter, whichever of WMCS11 and WMCSB21 has the later token update time, that is, the larger time value, holds the update control right for the configuration information. It sends the other party its configuration information together with update instruction information directing the receiver to update its configuration information with it, and the receiving WMCSB21 or WMCS11 uses the received configuration information to update its own stored configuration information; step 316 is then performed;

Step 316. End.
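The time-based decision of steps 312 to 315 can be sketched in the same way. The function name `decide_by_time`, the string results and the `threshold_s` parameter are hypothetical; the 60-second default mirrors the "1 minute" example in step 312, and timestamps are assumed to be comparable numbers such as seconds since the epoch.

```python
def decide_by_time(local_time: float, peer_time: float,
                   threshold_s: float = 60.0) -> str:
    """Pick the side that holds the update control right by token update time.

    Returns "alarm" when the two times are equal or closer together than
    threshold_s (steps 313-314), otherwise names the side with the later
    update time (step 315).
    """
    if local_time == peer_time or abs(local_time - peer_time) < threshold_s:
        return "alarm"
    if local_time > peer_time:
        return "local_updates_peer"
    return "peer_updates_local"
```

Comparing against a threshold rather than testing only for equality guards against small clock differences between the two servers, which is presumably why the description allows a tolerance.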

Fig. 7 is a flowchart of a further embodiment of the method of the present invention for ensuring consistent configuration information in a cluster system, for the case where the token update information includes both token number information and token update time information. Steps 1 and 2 of this embodiment are the same as in the embodiment shown in Fig. 4, and step 3 is performed as follows:

Step 321. After the fault of WMCS11 has been cleared, WMCS11 establishes a socket connection with WMCSB21; WMCS11 and WMCSB21 then send request information to each other to obtain each other's token information;

Step 322. WMCS11 and WMCSB21 each analyze the weight information in the received token update information and determine whether the update control right is decided by the token number. If so, step 323 is performed; otherwise, step 325 is performed. For example, if it is agreed in advance that a weight of 1 means the token number decides the update control right and a weight of 0 means the token update time decides it, WMCS11 and WMCSB21 need to check whether the weight information is 1;

Step 323. WMCS11 and WMCSB21 further analyze the received token number information and check whether the token number in it equals the token number in their own stored token information. If not, step 324 is performed; if so, step 327 is performed;

Step 324. According to a preset parameter, whichever of WMCS11 and WMCSB21 has the larger (or the smaller) token number holds the update control right for the configuration information. It sends the other party its configuration information together with update instruction information directing the receiver to update its configuration information with it, and the receiving WMCSB21 or WMCS11 uses the received configuration information to update its own stored configuration information; step 329 is then performed;

Step 325. WMCS11 and WMCSB21 further analyze the received token update time information and check whether the token update time in it equals the token update time in their own stored token information, or whether the difference between the two is less than a predetermined value, for example 1 minute. If not, step 326 is performed; if so, step 327 is performed;

Step 326. According to a preset parameter, whichever of WMCS11 and WMCSB21 has the later token update time, that is, the larger time value, holds the update control right for the configuration information. It sends the other party its configuration information together with update instruction information directing the receiver to update its configuration information with it, and the receiving WMCSB21 or WMCS11 uses the received configuration information to update its own stored configuration information; step 329 is then performed;

Step 327. WMCS11 and WMCSB21 send alarm information to CA0;

Step 328. The user specifies a data source through CA0, and that data source is used to update the configuration information in WMCS11 and WMCSB21;

Step 329. End.
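The combined flow of steps 322 to 328 can be sketched as one function. This is a sketch under assumptions: the `Token` fields, the `decide` function and its parameters are hypothetical names, the weight convention (1 for token number, 0 for token update time) follows the example in step 322, and the weight is read from the received (peer) token as that step describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token update information exchanged in step 321."""
    number: int        # token number
    updated_at: float  # token update time, in seconds
    weight: int        # 1: token number decides, 0: update time decides


def decide(local: Token, peer: Token, larger_wins: bool = True,
           threshold_s: float = 60.0) -> str:
    """Return "alarm", "local_updates_peer" or "peer_updates_local"."""
    if peer.weight == 1:
        # Steps 323-324: compare token numbers.
        if local.number == peer.number:
            return "alarm"  # steps 327-328
        local_wins = (local.number > peer.number) == larger_wins
    else:
        # Steps 325-326: compare token update times.
        delta = abs(local.updated_at - peer.updated_at)
        if local.updated_at == peer.updated_at or delta < threshold_s:
            return "alarm"  # steps 327-328
        local_wins = local.updated_at > peer.updated_at
    return "local_updates_peer" if local_wins else "peer_updates_local"
```

The sketch assumes both sides carry the same weight value, so the two modules select the same comparison rule and arrive at consistent results.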

In each of the above embodiments, the token information effectively determines which party holds the update control right for the configuration information, and that party updates the other party's configuration information, which further ensures that the configuration information is updated accurately.

In addition, to ensure that configuration information in the cluster system is updated accurately, in each of the embodiments shown in Figs. 5 to 7, before WMCS11 and WMCSB21 analyze the received token update information, that is, between steps 301 and 302, between steps 311 and 312, or between steps 321 and 322, they may first check whether the token group ID in the received token information is the same as the token group ID in their own stored token information. Between steps 301 and 302, this operation is specifically:

WMCS11 and WMCSB21 analyze the received token information and compare the token group ID in it with the token group ID in their own stored token information. If they are the same, step 302 is performed; otherwise, step 306 is performed. The operations between steps 311 and 312 and between steps 321 and 322 are the same and are not repeated here.
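The token group ID check can be sketched as a guard that runs before any token comparison. The function name `precheck_group` and its return values are hypothetical labels for "continue to the comparison step" and "go to the end step".

```python
def precheck_group(local_group_id: str, peer_group_id: str) -> str:
    """Gate the token comparison on matching token group IDs.

    Returns "compare_tokens" when the IDs match (continue with step 302,
    312 or 322) and "end" otherwise (step 306, 316 or 329).
    """
    if local_group_id == peer_group_id:
        return "compare_tokens"
    return "end"
```

With this guard in place, a module never overwrites its configuration with data from a peer that belongs to a different token group.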

As the above embodiments show, the present invention uses token information to keep the configuration information in WMCS and WMCSB consistent, thereby ensuring the consistency and automatic synchronization of the configuration information in all WMCAs, the WMCS and the WMCSB in the cluster system. This improves the survivability and recoverability of the configuration information in disaster situations and enhances the reliability of the system.

Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present invention may be modified or equivalently replaced without departing from its spirit and scope.

Claims (15)

1. A method for ensuring consistent configuration information in a cluster system, characterized in that the following steps are performed:
Step 1. A cluster resource backup management module monitors the working condition of each node server in the cluster system;
Step 2. When application information in a node server changes, the configuration information stored in the cluster resource backup management module is updated, and the token information in the cluster resource backup management module is updated, the token information being used to identify the update status of the configuration information;
Step 3. After a cluster resource management module restarts, the configuration information in the cluster resource management module or the cluster resource backup management module is updated according to the token information.

2. The method according to claim 1, characterized in that the following operation is performed before step 1: token information with identical initial information is set in the cluster resource management module and in the cluster resource backup management module respectively, the token information including token group identification information and token update information.

3. The method according to claim 2, characterized in that updating the token update information in step 2 is updating token number information, the magnitude of the token number identifying the update status of the configuration information.

4. The method according to claim 3, characterized in that step 3 comprises:
Step 301. After the cluster resource management module restarts, it establishes a connection with the cluster resource backup management module, and the two modules obtain each other's token information;
Step 302. The cluster resource management module and the cluster resource backup management module compare whether the token number in the received token information equals the token number in their own stored token information; if equal, step 303 is performed; otherwise, step 305 is performed;
Step 303. Alarm information is sent to a management tool;
Step 304. The user specifies a data source and uses it to update the configuration information in the cluster resource management module and the cluster resource backup management module; step 306 is then performed;
Step 305. According to a preset parameter, whichever of the cluster resource management module and the cluster resource backup management module has the larger or the smaller token number updates the other party's configuration information;
Step 306. End.

5. The method according to claim 4, characterized in that it further comprises, between step 301 and step 302:
the cluster resource management module and the cluster resource backup management module analyze the token group identification information in the received token information and determine whether the token group identifier in it equals the token group identifier in their own stored token information; if so, step 302 is performed; otherwise, step 306 is performed.

6. The method according to claim 4, characterized in that updating the other party's configuration information in step 305 comprises:
the cluster resource management module or the cluster resource backup management module sends configuration information and update request information to the other party;
the cluster resource backup management module or the cluster resource management module uses the received configuration information to update its own stored configuration information.

7. The method according to claim 1, characterized in that updating the token update information in step 2 is updating token update time information.

8. The method according to claim 7, characterized in that step 3 comprises:
Step 311. After the cluster resource management module restarts, it establishes a connection with the cluster resource backup management module, and the two modules obtain each other's token information;
Step 312. The cluster resource management module and the cluster resource backup management module compare whether the token update time in the received token information equals the token update time in their own stored token information, or whether the difference between the two is less than a predetermined value; if so, step 313 is performed; otherwise, step 315 is performed;
Step 313. Alarm information is sent to a management tool;
Step 314. The user specifies a data source and uses it to update the configuration information in the cluster resource management module and the cluster resource backup management module; step 316 is then performed;
Step 315. Whichever of the cluster resource management module and the cluster resource backup management module has the later token update time updates the other party's configuration information;
Step 316. End.

9. The method according to claim 8, characterized in that it further comprises, between step 311 and step 312:
the cluster resource management module and the cluster resource backup management module analyze the token group identification information in the received token information and determine whether the token group identifier in it equals the token group identifier in their own stored token information; if so, step 312 is performed; otherwise, step 316 is performed.

10. The method according to claim 1, characterized in that updating the token update information in step 2 is updating token number information and token update time information at the same time.

11. The method according to claim 10, characterized in that step 3 comprises:
Step 321. After the cluster resource management module restarts, it establishes a connection with the cluster resource backup management module, and the two modules obtain each other's token information;
Step 322. The cluster resource management module and the cluster resource backup management module analyze the weight information in the received token information and determine whether the update control right is decided by the token number; if so, step 323 is performed; otherwise, step 325 is performed;
Step 323. The cluster resource management module and the cluster resource backup management module compare whether the token number in the received token information equals the token number in their own stored token information; if not equal, step 324 is performed; otherwise, step 327 is performed;
Step 324. According to a preset parameter, whichever of the cluster resource management module and the cluster resource backup management module has the larger or the smaller token number updates the other party's configuration information; step 329 is then performed;
Step 325. The cluster resource management module and the cluster resource backup management module compare whether the token update time in the received token information equals the token update time in their own stored token information, or whether the difference between the two is less than a predetermined value; if not, step 326 is performed; if so, step 327 is performed;
Step 326. Whichever of the cluster resource management module and the cluster resource backup management module has the later token update time updates the other party's configuration information; step 329 is then performed;
Step 327. Alarm information is sent to a management tool;
Step 328. The user specifies a data source and uses it to update the configuration information in the cluster resource management module and the cluster resource backup management module;
Step 329. End.

12. The method according to claim 11, characterized in that it further comprises, between step 321 and step 322:
the cluster resource management module and the cluster resource backup management module analyze the token group identification information in the received token information and determine whether the token group identifier in it equals the token group identifier in their own stored token information; if so, step 322 is performed; otherwise, step 329 is performed.

13. The method according to any one of claims 1 to 12, characterized in that updating the configuration information stored in the cluster resource backup management module in step 2 is specifically: the cluster resource backup management module uses the application update information resulting from the change to update the configuration information stored in the cluster resource backup management module.

14. The method according to any one of claims 1 to 12, characterized in that it further comprises an operation of updating the configuration information in a cluster management agent module.

15. The method according to claim 14, characterized in that the operation of updating the configuration information in the cluster management agent module is specifically:
when the cluster resource backup management module receives connection request information sent by the cluster management agent module, it obtains the address information of the cluster management agent module from the connection request information, extracts the information corresponding to that address information from the configuration information stored in the cluster resource backup management module, and uses the corresponding information to update the configuration information in the cluster management agent module.
CNB2006100651544A 2006-03-21 2006-03-21 Method for ensuring accordant configuration information in cluster system Expired - Fee Related CN100426751C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100651544A CN100426751C (en) 2006-03-21 2006-03-21 Method for ensuring accordant configuration information in cluster system

Publications (2)

Publication Number Publication Date
CN1874267A CN1874267A (en) 2006-12-06
CN100426751C true CN100426751C (en) 2008-10-15

Family

ID=37484546

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100651544A Expired - Fee Related CN100426751C (en) 2006-03-21 2006-03-21 Method for ensuring accordant configuration information in cluster system

Country Status (1)

Country Link
CN (1) CN100426751C (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101888392A (en) * 2009-05-13 2010-11-17 上海即略网络信息科技有限公司 Trunking method
CN102339283A (en) * 2010-07-20 2012-02-01 中兴通讯股份有限公司 Access control method for cluster file system and cluster node
CN102681911B (en) * 2011-03-09 2016-03-02 腾讯科技(深圳)有限公司 A kind of disaster tolerance system of configuration center and method
CN103095759B (en) * 2011-11-02 2017-09-19 华为技术有限公司 Method and device for restoring resource environment
CN102591750A (en) * 2011-12-31 2012-07-18 曙光信息产业股份有限公司 Recovery method of cluster system
CN102761448A (en) * 2012-08-07 2012-10-31 中国石油大学(华东) Cluster monitoring and early warning method
CN102857371B (en) * 2012-08-21 2016-04-20 曙光信息产业(北京)有限公司 A kind of dynamic allocation management method towards group system
CN106155920B (en) * 2015-03-30 2019-05-07 阿里巴巴集团控股有限公司 Data managing method and device
CN106547861A (en) * 2016-10-21 2017-03-29 天脉聚源(北京)科技有限公司 A kind of method and device of the data base of intelligent management machine node
CN106685713A (en) * 2016-12-26 2017-05-17 努比亚技术有限公司 Method and apparatus for processing configuration parameters
CN106844681A (en) * 2017-01-25 2017-06-13 郑州云海信息技术有限公司 Configuration file synchronous method and managing main frame based on cluster file system
CN109240608B (en) * 2018-08-22 2021-08-31 郑州云海信息技术有限公司 A configuration information synchronization method and device
CN110635953A (en) * 2019-10-17 2019-12-31 厦门网宿有限公司 Configuration information management method and device
CN112087343B (en) * 2020-09-22 2022-07-08 广州英码信息科技有限公司 Networking and communication method of seat management system
CN112328445B (en) * 2020-10-27 2023-11-14 许继集团有限公司 A multi-node management system based on consul
CN113609211B (en) * 2021-06-20 2023-07-14 苏州浪潮智能科技有限公司 A cluster information synchronization method, device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050015471A1 (en) * 2003-07-18 2005-01-20 Zhang Pu Paul Secure cluster configuration data set transfer protocol
US6918013B2 (en) * 2001-07-16 2005-07-12 Bea Systems, Inc. System and method for flushing bean cache

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
扶庆华, 郑炜, 邓正宏. Application of dynamic recovery management in a cluster job management system (动态恢复管理在集群作业管理系统中的应用). 计算机工程 (Computer Engineering), Vol. 30, No. 21, 2004 *

Also Published As

Publication number Publication date
CN1874267A (en) 2006-12-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20081015

Termination date: 20160321