CN111711964A - System disaster tolerance capability test method - Google Patents

Info

Publication number
CN111711964A
CN111711964A CN202010366940.8A CN202010366940A CN111711964A CN 111711964 A CN111711964 A CN 111711964A CN 202010366940 A CN202010366940 A CN 202010366940A CN 111711964 A CN111711964 A CN 111711964A
Authority
CN
China
Prior art keywords
area network
test
disaster tolerance
regions
wide area
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010366940.8A
Other languages
Chinese (zh)
Other versions
CN111711964B (en)
Inventor
武义涵
燕敬博
赵丽
王石
张久岭
杜鹏
王帅
沈时军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-04-30
Filing date
2020-04-30
Publication date
2020-09-25
Application filed by National Computer Network and Information Security Management Center
Priority to CN202010366940.8A
Publication of CN111711964A
Application granted
Publication of CN111711964B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/04 Arrangements for maintaining operational condition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

From the perspective of system reliability evaluation, the invention provides a system disaster tolerance capability test method. Based on network equipment operations and analysis of network traffic components, the method quantitatively evaluates the disaster tolerance capability of a system, making the evaluation both accurate and convenient.

Description

System disaster tolerance capability test method
Technical Field
The invention relates to the technical field of software, in particular to a system disaster tolerance capability testing method.
Background
With the spread of mobile communication networks, service quality has become one of the most important core competitive strengths of internet enterprises, and users' expectations of mobile communication service quality keep rising. Network security is one of the key factors in that service quality, and system reliability is an important guarantee of it. To raise their level of operation, mobile operators must provide users with high-quality, uninterrupted service. However, because of human error, equipment failure, natural disasters and the like, failures of communication network nodes are inevitable.
Once a system fails, the enterprise suffers heavy losses in many respects, including financial and reputational ones. Operators therefore require that the network continue to provide service after a network node fails. To improve reliability, important systems need disaster tolerance capability, and core systems need remote disaster tolerance capability. A test method that helps construction or operations and maintenance personnel evaluate a system's disaster tolerance capability is therefore urgently needed.
Disclosure of Invention
Based on the above, the invention provides a quantitative test method for system disaster tolerance capability which, from the viewpoint of system reliability evaluation, helps construction or operations and maintenance personnel evaluate the disaster tolerance capability of a system.
The system disaster tolerance capability test method is applied to a disaster tolerance test platform and performs a quantitative disaster tolerance test on a test system.
The disaster tolerance test platform comprises:
an sFlow traffic monitoring module, used to monitor network traffic based on the sFlow protocol and to analyze in real time the traffic components from each system to the wide area network;
a link interruption and recovery module, used to take the system in each region offline or bring it back online by closing or opening ports on the wide area network router;
a disaster backup traffic analysis module, used to analyze the traffic component five-tuples collected by the sFlow traffic monitoring module after the system in a designated region goes offline, and to determine whether the service traffic has migrated to other regions and what proportion has migrated normally;
an evaluation module, used to quantitatively test the disaster tolerance capability of the system according to the proportion of service traffic that migrated normally.
The test system comprises at least: a system S1 deployed in a plurality of regions D1, D2, …, Dn, wherein the local area networks of the regions D1, D2, …, Dn are connected to the wide area network through wide area network routers L1, L2, …, Ln, respectively, and when the system S1 in region D1 fails or goes offline, the services of system S1 can continue to be provided by the systems S1 in the other regions D2, …, Dn.
The system disaster tolerance capability test method comprises the following steps:
Step 1: perform a port closing operation on the wide area network router L1 so that the system S1 in region D1 goes offline from the wide area network router L1;
Step 2: the service requests are transferred to the local area network of the system S1 in one or more of the other regions D2, …, Dn;
Step 3: based on the sFlow protocol, monitor the network traffic of the local area network of the system S1 in one or more of the other regions D2, …, Dn, and collect in real time the traffic components from that system S1 to the wide area network, each traffic component being a five-tuple <source IP, destination IP, source MAC, destination MAC, port number>;
Step 4: analyze the traffic component five-tuples and determine whether the service traffic has migrated to one or more of the other regions D2, …, Dn;
Step 5: if the service traffic is determined not to have migrated to other regions, record the test result of the disaster tolerance capability $R_{S1\_D1}$ of system S1 after the system S1 in region D1 goes offline as 0 and go to step 7; otherwise, go to step 6;
Step 6: calculate $R_{S1\_D1}$, the capability of system S1 to transfer its services to other regions after the system S1 in region D1 goes offline;
Step 7: perform a port opening operation on the wide area network router L1 so that the system S1 in region D1 comes back online through the wide area network router L1;
Step 8: repeat steps 1 to 7, performing the port closing operation on the wide area network routers L2, …, Ln in turn, and likewise calculate $R_{S1\_D2}$, …, $R_{S1\_Dn}$, the capability of system S1 to transfer its services to other regions after the system S1 in each of the regions D2, …, Dn goes offline;
Step 9: calculate the disaster tolerance capability $R_{S1}$ of system S1, thereby completing the quantitative disaster tolerance test of the test system.
From the perspective of system reliability evaluation, the invention provides a system disaster tolerance capability test method. Based on network equipment operations and analysis of network traffic components, the method quantitatively evaluates the disaster tolerance capability of a system, making the evaluation both accurate and convenient.
Drawings
FIG. 1 is a system architecture diagram of the disaster tolerance capability test method of the present invention.
Detailed Description
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. The following detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show illustrations in accordance with exemplary embodiments. These exemplary embodiments, which are also referred to herein as "examples," are described in sufficient detail to enable those skilled in the art to practice the present subject matter. The embodiments may be combined, other embodiments may be utilized, or structural and logical changes may be made without departing from the scope of the claims. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope is defined by the appended claims and their equivalents.
As shown in FIG. 1, the present invention provides a system disaster tolerance capability test method, which is applied to a disaster tolerance test platform and performs a quantitative disaster tolerance test on a test system.
The disaster tolerance test platform comprises:
an sFlow traffic monitoring module, used to monitor network traffic based on the sFlow protocol and to analyze in real time the traffic components from each system to the wide area network;
a link interruption and recovery module, used to take the system in each region offline or bring it back online by closing or opening ports on the wide area network router;
a disaster backup traffic analysis module, used to analyze the traffic component five-tuples collected by the sFlow traffic monitoring module after the system in a designated region goes offline, and to determine whether the service traffic has migrated to other regions and what proportion has migrated normally;
an evaluation module, used to quantitatively test the disaster tolerance capability of the system according to the proportion of service traffic that migrated normally.
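To make the term "traffic component" concrete, the following is a minimal sketch of how a monitoring module might aggregate sampled traffic into per-five-tuple rates. It assumes the flow samples have already been decoded from sFlow datagrams by some collector (decoding is not shown); `FlowSample`, `traffic_components` and the field names are hypothetical and do not come from the source. The rate estimate scales each sampled frame by the reported sampling rate, which is the usual way sampled counts are extrapolated.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# <source IP, destination IP, source MAC, destination MAC, port number>
FiveTuple = Tuple[str, str, str, str, int]


@dataclass(frozen=True)
class FlowSample:
    """Hypothetical, already-decoded sFlow flow sample (decoding not shown)."""
    src_ip: str
    dst_ip: str
    src_mac: str
    dst_mac: str
    port: int
    frame_bytes: int      # length of the sampled frame
    sampling_rate: int    # 1-in-N packet sampling rate reported by the agent


def traffic_components(samples: Iterable[FlowSample], interval_s: float) -> Dict[FiveTuple, float]:
    """Estimate bits per second per five-tuple over one collection interval."""
    est_bytes: Dict[FiveTuple, float] = defaultdict(float)
    for s in samples:
        key = (s.src_ip, s.dst_ip, s.src_mac, s.dst_mac, s.port)
        # Each sample stands for roughly `sampling_rate` packets of similar size.
        est_bytes[key] += s.frame_bytes * s.sampling_rate
    return {key: total * 8 / interval_s for key, total in est_bytes.items()}
```

The resulting dictionary maps each five-tuple to an estimated rate and can be handed to whatever analysis decides whether traffic has migrated.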
The test system comprises at least: a system S1 deployed in a plurality of regions D1, D2, …, Dn, wherein the local area networks of the regions D1, D2, …, Dn are connected to the wide area network through wide area network routers L1, L2, …, Ln, respectively, and when the system S1 in region D1 fails or goes offline, the services of system S1 can continue to be provided by the systems S1 in the other regions D2, …, Dn.
The system disaster tolerance capability test method comprises the following steps:
Step 1: perform a port closing operation on the wide area network router L1 so that the system S1 in region D1 goes offline from the wide area network router L1;
Step 2: the service requests are transferred to the local area network of the system S1 in one or more of the other regions D2, …, Dn;
Step 3: based on the sFlow protocol, monitor the network traffic of the local area network of the system S1 in one or more of the other regions D2, …, Dn, and collect in real time the traffic components from that system S1 to the wide area network, each traffic component being a five-tuple <source IP, destination IP, source MAC, destination MAC, port number>;
Step 4: analyze the traffic component five-tuples and determine whether the service traffic has migrated to one or more of the other regions D2, …, Dn;
Step 5: if the service traffic is determined not to have migrated to other regions, record the test result of the disaster tolerance capability $R_{S1\_D1}$ of system S1 after the system S1 in region D1 goes offline as 0 and go to step 7; otherwise, go to step 6;
Step 6: calculate $R_{S1\_D1}$, the capability of system S1 to transfer its services to other regions after the system S1 in region D1 goes offline;
Step 7: perform a port opening operation on the wide area network router L1 so that the system S1 in region D1 comes back online through the wide area network router L1;
Step 8: repeat steps 1 to 7, performing the port closing operation on the wide area network routers L2, …, Ln in turn, and likewise calculate $R_{S1\_D2}$, …, $R_{S1\_Dn}$, the capability of system S1 to transfer its services to other regions after the system S1 in each of the regions D2, …, Dn goes offline;
Step 9: calculate the disaster tolerance capability $R_{S1}$ of system S1, thereby completing the quantitative disaster tolerance test of the test system.
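The nine steps can be read as a small driver loop. The following is a minimal sketch under stated assumptions, not the patented implementation: `RouterControl`, `measure_migration`, `run_disaster_test` and `overall_capability` are hypothetical names; the per-region score is supplied by a caller-provided function, since this text does not spell out how it is computed; and the `try`/`finally` that guarantees the port is reopened is an added design choice. The final product follows the form the description states for step 9.

```python
from math import prod
from typing import Callable, Dict, List, Protocol


class RouterControl(Protocol):
    """Hypothetical interface to the link interruption and recovery module."""

    def close_port(self, router: str) -> None: ...
    def open_port(self, router: str) -> None: ...


def run_disaster_test(
    routers: Dict[str, str],                    # region -> WAN router, e.g. {"D1": "L1"}
    control: RouterControl,
    measure_migration: Callable[[str, List[str]], float],
) -> Dict[str, float]:
    """Run steps 1-8 for every region and return the per-region results.

    measure_migration(offline_region, other_regions) stands in for steps 3-6:
    it must return 0.0 when no traffic migrated, else the migration capability.
    """
    results: Dict[str, float] = {}
    for region, router in routers.items():
        others = [r for r in routers if r != region]
        control.close_port(router)              # step 1: take the region offline
        try:
            results[region] = measure_migration(region, others)  # steps 2-6
        finally:
            control.open_port(router)           # step 7: always bring it back online
    return results


def overall_capability(results: Dict[str, float]) -> float:
    """Step 9: combine the per-region results as a product."""
    return prod(results.values())
```

Reopening the port in a `finally` clause means a failed measurement does not leave a production region disconnected, which matters because the test operates on live routers.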
Preferably, in step 6, the capability $R_{S1\_D1}$ of system S1 to transfer its services to other regions after the system S1 in region D1 goes offline is calculated with the following formula:
[Formula image not reproduced in this text; source placeholder: Figure BDA0002476791260000041]
where x is the triple <source IP, source MAC, port number> taken from the five-tuple collected via sFlow; counter is the number of all triples; bps is the real-time traffic rate of each triple; and S1_Dn denotes the system S1 in region Dn.
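The formula itself survives here only as an image reference, so the sketch below does not attempt to compute the per-region capability. It only illustrates the inputs the definitions name: projecting each five-tuple onto the triple <source IP, source MAC, port number>, summing bps per triple, and counting the distinct triples. The function name, type aliases and sample addresses are hypothetical and purely illustrative.

```python
from collections import defaultdict
from typing import Dict, Tuple

FiveTuple = Tuple[str, str, str, str, int]   # <src IP, dst IP, src MAC, dst MAC, port>
Triple = Tuple[str, str, int]                # <src IP, src MAC, port>


def triples_with_bps(components: Dict[FiveTuple, float]) -> Dict[Triple, float]:
    """Collapse five-tuple traffic components into per-triple bps totals."""
    per_triple: Dict[Triple, float] = defaultdict(float)
    for (src_ip, _dst_ip, src_mac, _dst_mac, port), bps in components.items():
        per_triple[(src_ip, src_mac, port)] += bps
    return dict(per_triple)


# Illustrative values only; they are not measurements from the source.
components = {
    ("10.0.2.5", "198.51.100.7", "aa:bb:cc:00:00:01", "aa:bb:cc:00:00:ff", 443): 4000.0,
    ("10.0.2.5", "198.51.100.9", "aa:bb:cc:00:00:01", "aa:bb:cc:00:00:ff", 443): 1000.0,
    ("10.0.2.6", "198.51.100.7", "aa:bb:cc:00:00:02", "aa:bb:cc:00:00:ff", 80): 2500.0,
}
per_triple = triples_with_bps(components)
counter = len(per_triple)   # number of distinct triples
```

Here the first two five-tuples share a source IP, source MAC and port, so they fold into one triple.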
Preferably, in step 9, the disaster tolerance capability $R_{S1}$ of system S1 is calculated, completing the quantitative disaster tolerance test of the test system, with the following formula:
$$R_{S1} = R_{S1\_D1} \times R_{S1\_D2} \times \cdots \times R_{S1\_Dn}$$
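As a quick illustration of the product form, the sketch below multiplies per-region results together. The function name and the sample values are hypothetical; the values are not measurements from the source.

```python
from math import prod
from typing import Mapping


def system_capability(per_region: Mapping[str, float]) -> float:
    """R_S1 as the product of the per-region results R_S1_D1 ... R_S1_Dn."""
    if not per_region:
        raise ValueError("at least one region result is required")
    return prod(per_region.values())


# Illustrative values only.
full = system_capability({"D1": 1.0, "D2": 1.0, "D3": 1.0})
one_failed = system_capability({"D1": 1.0, "D2": 0.0, "D3": 1.0})
partial = system_capability({"D1": 0.5, "D2": 0.5})
```

Because the overall value is a product, a single region whose result was recorded as 0 in step 5 drives the overall result to 0, however well the other regions perform.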
Preferably, the method further comprises analyzing the network topology of each local area network and of the wide area network to determine how a fault at each geographic or topological location affects the network, assigning corresponding weight parameters, and taking these weights into account when calculating the disaster tolerance capability.
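The description does not say how the weight parameters enter the calculation. Purely as an assumption for illustration, the sketch below raises each per-region result to its weight before multiplying, so that a weight of 1 for every region reduces to an unweighted product. The function name, the exponent form and the default weight are all hypothetical.

```python
from typing import Mapping


def weighted_capability(per_region: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted product: each per-region result is raised to its weight.

    This exponent form is an assumption made for illustration; with all
    weights equal to 1 it reduces to the plain product of the results.
    """
    total = 1.0
    for region, r in per_region.items():
        w = weights.get(region, 1.0)    # regions without a weight count once
        if w < 0:
            raise ValueError(f"negative weight for region {region}")
        total *= r ** w
    return total
```

Under this assumed form, a weight below 1 softens the penalty for a weak region and a weight above 1 sharpens it; other schemes, such as a weighted average, would be equally compatible with the text.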
Preferably, the method further comprises analyzing the quantitative test result of the disaster tolerance capability of the test system, and generating a route disaster tolerance strategy of the test system.
From the perspective of system reliability evaluation, the invention provides a system disaster tolerance capability test method. Based on network equipment operations and analysis of network traffic components, the method quantitatively evaluates the disaster tolerance capability of a system, making the evaluation both accurate and convenient.
While embodiments have been described with reference to specific exemplary embodiments thereof, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the inventive subject matter. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims (5)

1. A system disaster tolerance capability test method, applied to a disaster tolerance test platform to perform a quantitative disaster tolerance test on a test system, characterized in that:
the disaster tolerance test platform comprises:
an sFlow traffic monitoring module, used to monitor network traffic based on the sFlow protocol and to analyze in real time the traffic components from each system to the wide area network;
a link interruption and recovery module, used to take the system in each region offline or bring it back online by closing or opening ports on the wide area network router;
a disaster backup traffic analysis module, used to analyze the traffic component five-tuples collected by the sFlow traffic monitoring module after the system in a designated region goes offline, and to determine whether the service traffic has migrated to other regions and what proportion has migrated normally;
an evaluation module, used to quantitatively test the disaster tolerance capability of the system according to the proportion of service traffic that migrated normally;
the test system comprises at least: a system S1 deployed in a plurality of regions D1, D2, …, Dn, wherein the local area networks of the regions D1, D2, …, Dn are connected to a wide area network through wide area network routers L1, L2, …, Ln, respectively, and when the system S1 in region D1 fails or goes offline, the services of the system S1 in region D1 can continue to be provided by the systems S1 in the other regions D2, …, Dn;
the system disaster tolerance capability test method comprises the following steps:
Step 1: perform a port closing operation on the wide area network router L1 so that the system S1 in region D1 goes offline from the wide area network router L1;
Step 2: the service requests are transferred to the local area network of the system S1 in one or more of the other regions D2, …, Dn;
Step 3: based on the sFlow protocol, monitor the network traffic of the local area network of the system S1 in one or more of the other regions D2, …, Dn, and collect in real time the traffic components from that system S1 to the wide area network, each traffic component being a five-tuple <source IP, destination IP, source MAC, destination MAC, port number>;
Step 4: analyze the traffic component five-tuples and determine whether the service traffic has migrated to one or more of the other regions D2, …, Dn;
Step 5: if the service traffic is determined not to have migrated to other regions, record the test result of the disaster tolerance capability $R_{S1\_D1}$ of system S1 after the system S1 in region D1 goes offline as 0 and go to step 7; otherwise, go to step 6;
Step 6: calculate $R_{S1\_D1}$, the capability of system S1 to transfer its services to other regions after the system S1 in region D1 goes offline;
Step 7: perform a port opening operation on the wide area network router L1 so that the system S1 in region D1 comes back online through the wide area network router L1;
Step 8: repeat steps 1 to 7, performing the port closing operation on the wide area network routers L2, …, Ln in turn, and likewise calculate $R_{S1\_D2}$, …, $R_{S1\_Dn}$, the capability of system S1 to transfer its services to other regions after the system S1 in each of the regions D2, …, Dn goes offline;
Step 9: calculate the disaster tolerance capability $R_{S1}$ of system S1, thereby completing the quantitative disaster tolerance test of the test system.
2. The method of claim 1, wherein, in step 6, the capability $R_{S1\_D1}$ of system S1 to transfer its services to other regions after the system S1 in region D1 goes offline is calculated with the following formula:
[Formula image not reproduced in this text; source placeholder: Figure FDA0002476791250000021]
where x is the triple <source IP, source MAC, port number> taken from the five-tuple collected via sFlow; counter is the number of all triples; bps is the real-time traffic rate of each triple; and S1_Dn denotes the system S1 in region Dn.
3. The method of claim 1, wherein, in step 9, the disaster tolerance capability $R_{S1}$ of system S1 is calculated, completing the quantitative disaster tolerance test of the test system, with the following formula:
$$R_{S1} = R_{S1\_D1} \times R_{S1\_D2} \times \cdots \times R_{S1\_Dn}$$
4. The method of claim 1, further comprising analyzing the network topology of each local area network and of the wide area network to determine how a fault at each geographic or topological location affects the network, assigning corresponding weight parameters, and taking these weights into account when calculating the disaster tolerance capability.
5. The method of claim 1, further comprising analyzing the quantitative test results of the disaster tolerance capability of the test system to generate a routing disaster tolerance strategy for the test system.
CN202010366940.8A 2020-04-30 2020-04-30 System disaster recovery capability test method Active CN111711964B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010366940.8A CN111711964B (en) 2020-04-30 2020-04-30 System disaster recovery capability test method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010366940.8A CN111711964B (en) 2020-04-30 2020-04-30 System disaster recovery capability test method

Publications (2)

Publication Number Publication Date
CN111711964A (en) 2020-09-25
CN111711964B (en) 2024-02-02

Family

ID=72536841

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010366940.8A Active CN111711964B (en) 2020-04-30 2020-04-30 System disaster recovery capability test method

Country Status (1)

Country Link
CN (1) CN111711964B (en)

Also Published As

Publication number Publication date
CN111711964B (en) 2024-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant