CN111711964A - System disaster tolerance capability test method - Google Patents
- Publication number
- CN111711964A (application CN202010366940.8A)
- Authority
- CN
- China
- Prior art keywords
- area network
- test
- disaster tolerance
- regions
- wide area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/04—Arrangements for maintaining operational condition
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention provides a system disaster tolerance capability test method from the perspective of system reliability evaluation. By operating network equipment and analyzing network traffic components, the method quantitatively evaluates a system's disaster tolerance capability, making the evaluation test accurate and convenient.
Description
Technical Field
The invention relates to the technical field of software, in particular to a system disaster tolerance capability testing method.
Background
With the spread of mobile communication networks, service quality has become one of the most important core competitive strengths of internet enterprises, and users demand ever higher quality from mobile communication services. Network security is a key factor in that quality, and system reliability is an important guarantee of it. To raise their level of operation, mobile operators must provide users with high-quality, uninterrupted service. Yet failures of communication network nodes are inevitable, whether caused by human error, equipment faults, or natural disasters.
Once a system fails, the enterprise suffers heavy losses, both economic and reputational. Operators therefore require that the network continue to provide service after a network node fails. To improve reliability, important systems need disaster tolerance capability, and core systems need remote (off-site) disaster recovery capability. A test method that helps construction or operation and maintenance personnel evaluate a system's disaster tolerance capability is therefore urgently needed.
Disclosure of Invention
Based on the above, the invention provides a quantitative test method for system disaster tolerance capability that, from the perspective of system reliability evaluation, helps construction or operation and maintenance personnel evaluate the disaster tolerance capability of a system.
The system disaster tolerance capability test method is applied to a disaster tolerance test platform and performs a quantitative disaster tolerance test on a system under test;
the disaster tolerance test platform comprises:
an sFlow traffic monitoring module, configured to monitor network traffic based on the sFlow protocol and to analyze in real time the traffic components from each system to the wide area network;
a link interruption and recovery module, configured to take the systems in each region offline or online by closing or opening ports on the wide area network routers;
a disaster recovery traffic analysis module, configured to analyze, after the system of a designated region goes offline, the traffic component quintuples collected by the sFlow traffic monitoring module, and to determine whether the service traffic has migrated to other regions and what proportion has migrated normally;
and an evaluation module, configured to quantitatively test the disaster tolerance capability of the system according to the proportion of service traffic that migrated normally.
The system under test comprises at least: a system S1 deployed in a plurality of regions D1, D2, …, Dn, wherein the local area networks of regions D1, D2, …, Dn are connected to the wide area network through wide area network routers L1, L2, …, Ln, respectively, and when the system S1 of region D1 fails or goes offline, its services can continue to be provided by the systems S1 of the other regions D2, …, Dn;
the system disaster tolerance capability test method comprises the following steps:
step 1, closing the port on wide area network router L1, so that the system S1 of region D1 is taken offline at router L1;
step 2, transferring service requests to the local area networks of the systems S1 in one or more of the other regions D2, …, Dn;
step 3, monitoring, based on the sFlow protocol, the network traffic of the local area networks of the systems S1 in one or more of the other regions D2, …, Dn, and counting in real time the traffic components from those systems to the wide area network, wherein each traffic component is a quintuple <source IP, destination IP, source MAC, destination MAC, port number>;
step 4, analyzing the traffic component quintuples and determining whether the service traffic has migrated to one or more of the other regions D2, …, Dn;
step 5, if the service traffic is determined not to have migrated to other regions, recording the disaster tolerance capability $R_{S1\_D1}$ of system S1 after the system S1 of region D1 goes offline as 0, and going to step 7; otherwise, going to step 6;
step 6, calculating the capability $R_{S1\_D1}$ of system S1 to migrate its services to other regions after the system S1 of region D1 goes offline;
step 7, opening the port on wide area network router L1, so that the system S1 of region D1 is brought back online at router L1;
step 8, repeating steps 1 to 7 for wide area network routers L2, …, Ln in turn, and likewise calculating the capabilities $R_{S1\_D2}$, …, $R_{S1\_Dn}$ of system S1 to migrate its services to other regions after the system S1 of each of regions D2, …, Dn goes offline;
step 9, calculating the disaster tolerance capability $R_{S1}$ of system S1, thereby completing the quantitative test of the disaster tolerance capability of the system under test.
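Steps 1 to 8 amount to a loop over the regions. The following is a minimal Python sketch of that loop under stated assumptions, not the patent's implementation: `close_port`, `open_port`, and `measure_migration` are hypothetical callbacks standing in for the link interruption and recovery module and for the traffic analysis of steps 2 to 6.

```python
from typing import Callable, Dict, List


def run_region_tests(
    regions: List[str],
    close_port: Callable[[str], None],
    open_port: Callable[[str], None],
    measure_migration: Callable[[str], float],
) -> Dict[str, float]:
    """Take each region offline in turn and record its per-region result."""
    results: Dict[str, float] = {}
    for region in regions:
        close_port(region)  # step 1: close the WAN router port for this region
        try:
            # Steps 2 to 6 (hypothetical callback): returns the migrated share,
            # or 0.0 when no service traffic migrated (step 5).
            results[region] = measure_migration(region)
        finally:
            open_port(region)  # step 7: always bring the region back online
    return results
```

The `try`/`finally` is a design choice of this sketch: it reopens the port even if the measurement fails, so a broken test run does not leave a region offline.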
The invention provides a system disaster tolerance capability test method from the perspective of system reliability evaluation. By operating network equipment and analyzing network traffic components, the method quantitatively evaluates a system's disaster tolerance capability, making the evaluation test accurate and convenient.
Drawings
FIG. 1 is a system architecture diagram of the disaster tolerance capability test method of the invention.
Detailed Description
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. The following detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show illustrations in accordance with exemplary embodiments. These exemplary embodiments, which are also referred to herein as "examples," are described in sufficient detail to enable those skilled in the art to practice the present subject matter. The embodiments may be combined, other embodiments may be utilized, or structural and logical changes may be made without departing from the scope of the claims. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope is defined by the appended claims and their equivalents.
As shown in FIG. 1, the invention provides a system disaster tolerance capability test method, which is applied to a disaster tolerance test platform and performs a quantitative disaster tolerance test on a system under test;
the disaster tolerance test platform comprises:
an sFlow traffic monitoring module, configured to monitor network traffic based on the sFlow protocol and to analyze in real time the traffic components from each system to the wide area network;
a link interruption and recovery module, configured to take the systems in each region offline or online by closing or opening ports on the wide area network routers;
a disaster recovery traffic analysis module, configured to analyze, after the system of a designated region goes offline, the traffic component quintuples collected by the sFlow traffic monitoring module, and to determine whether the service traffic has migrated to other regions and what proportion has migrated normally;
and an evaluation module, configured to quantitatively test the disaster tolerance capability of the system according to the proportion of service traffic that migrated normally.
The system under test comprises at least: a system S1 deployed in a plurality of regions D1, D2, …, Dn, wherein the local area networks of regions D1, D2, …, Dn are connected to the wide area network through wide area network routers L1, L2, …, Ln, respectively, and when the system S1 of region D1 fails or goes offline, its services can continue to be provided by the systems S1 of the other regions D2, …, Dn;
the system disaster tolerance capability test method comprises the following steps:
step 1, closing the port on wide area network router L1, so that the system S1 of region D1 is taken offline at router L1;
step 2, transferring service requests to the local area networks of the systems S1 in one or more of the other regions D2, …, Dn;
step 3, monitoring, based on the sFlow protocol, the network traffic of the local area networks of the systems S1 in one or more of the other regions D2, …, Dn, and counting in real time the traffic components from those systems to the wide area network, wherein each traffic component is a quintuple <source IP, destination IP, source MAC, destination MAC, port number>;
step 4, analyzing the traffic component quintuples and determining whether the service traffic has migrated to one or more of the other regions D2, …, Dn;
step 5, if the service traffic is determined not to have migrated to other regions, recording the disaster tolerance capability $R_{S1\_D1}$ of system S1 after the system S1 of region D1 goes offline as 0, and going to step 7; otherwise, going to step 6;
step 6, calculating the capability $R_{S1\_D1}$ of system S1 to migrate its services to other regions after the system S1 of region D1 goes offline;
step 7, opening the port on wide area network router L1, so that the system S1 of region D1 is brought back online at router L1;
step 8, repeating steps 1 to 7 for wide area network routers L2, …, Ln in turn, and likewise calculating the capabilities $R_{S1\_D2}$, …, $R_{S1\_Dn}$ of system S1 to migrate its services to other regions after the system S1 of each of regions D2, …, Dn goes offline;
step 9, calculating the disaster tolerance capability $R_{S1}$ of system S1, thereby completing the quantitative test of the disaster tolerance capability of the system under test.
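Steps 3 to 5 can be illustrated with a small Python sketch. It assumes the sFlow samples have already been decoded into records, that each record carries the region whose local area network it was observed in, and that a service is identified by its port number; the text does not state how migration is judged, so the `FlowSample` type, its `region` and `bps` fields, and the "any traffic on the service port in another region" rule are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class FlowSample:
    """One traffic component: the quintuple plus two assumed bookkeeping fields."""
    src_ip: str
    dst_ip: str
    src_mac: str
    dst_mac: str
    port: int
    bps: float   # assumed: measured rate of this component
    region: str  # assumed: region whose LAN the sample was observed in


def bps_by_region(samples: Iterable[FlowSample], service_port: int) -> Dict[str, float]:
    """Step 3: total the traffic rate of one service per observing region."""
    totals: Dict[str, float] = defaultdict(float)
    for sample in samples:
        if sample.port == service_port:
            totals[sample.region] += sample.bps
    return dict(totals)


def traffic_migrated(
    samples: Iterable[FlowSample], service_port: int, offline_region: str
) -> bool:
    """Step 4 (assumed rule): the service shows traffic in some other region."""
    totals = bps_by_region(samples, service_port)
    return any(bps > 0 for region, bps in totals.items() if region != offline_region)
```

When `traffic_migrated` returns `False`, step 5 records the per-region result as 0 and the method skips to step 7.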
Preferably, in step 6, the capability $R_{S1\_D1}$ of system S1 to migrate its services to other regions after the system S1 of region D1 goes offline is calculated by a formula over the following quantities: x is a triple <source IP, source MAC, port number> taken from a quintuple collected by sFlow; counter is the total number of triples; bps is the real-time traffic rate of each triple; and S1_Dn denotes the system S1 in region Dn.
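The quantities named here (triples x, their count, and a per-triple bps) suggest a traffic-weighted ratio. The Python sketch below shows one plausible reading and is explicitly an assumption rather than the patent's formula: it takes the share of baseline traffic, summed over triples, that is observed in other regions after the region goes offline. The function name, both input dictionaries, and the per-triple capping are hypothetical.

```python
from typing import Dict, Tuple

# A triple x: (source IP, source MAC, port number)
Triple = Tuple[str, str, int]


def migration_ratio(
    baseline_bps: Dict[Triple, float],
    migrated_bps: Dict[Triple, float],
) -> float:
    """Assumed reading: migrated traffic divided by baseline traffic.

    baseline_bps: rate of each triple served by the region before it went offline.
    migrated_bps: rate of each triple observed in other regions afterwards.
    """
    total = sum(baseline_bps.values())
    if total <= 0:
        return 0.0
    # Cap each triple at its baseline so unrelated extra traffic cannot
    # push the ratio above 1.
    moved = sum(
        min(migrated_bps.get(x, 0.0), bps) for x, bps in baseline_bps.items()
    )
    return moved / total
```

For example, with two triples at 100 bps each before the outage, and 100 bps and 50 bps observed elsewhere afterwards, this reading gives 150 / 200 = 0.75.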
Preferably, in step 9, the disaster tolerance capability $R_{S1}$ of system S1 is calculated as follows, completing the quantitative disaster tolerance test of the system under test:
$$R_{S1} = R_{S1\_D1} \times R_{S1\_D2} \times \cdots \times R_{S1\_Dn}$$
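The product can be computed directly. The short Python sketch below uses made-up per-region values purely for illustration; the function name and the choice to reject an empty input are assumptions of this sketch.

```python
import math
from typing import Dict


def overall_capability(per_region: Dict[str, float]) -> float:
    """Multiply the per-region results R_S1_D1 ... R_S1_Dn into R_S1."""
    if not per_region:
        raise ValueError("at least one per-region result is required")
    return math.prod(per_region.values())


# Illustrative values only: full migration for D1, half for D2 and D3.
example = {"D1": 1.0, "D2": 0.5, "D3": 0.5}
print(overall_capability(example))  # prints 0.25
```

Because the results are multiplied, a single region whose traffic does not migrate at all (a result of 0 from step 5) drives the overall figure to 0.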
Preferably, the method further comprises analyzing the network topology of each local area network and of the wide area network to determine how strongly a fault at each topological location affects the network, assigning weight parameters accordingly, and taking those weights into account when calculating the disaster tolerance capability.
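The text does not say how the weights enter the calculation. As one hedged possibility, the Python sketch below applies each weight as a normalized exponent on the corresponding per-region result (a weighted geometric mean); this combination rule, the function name, and the weight values are assumptions, not the patent's method. With equal weights it yields the geometric mean of the per-region results rather than their plain product.

```python
import math
from typing import Dict


def weighted_capability(
    per_region: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Assumed rule: product of R_i raised to its normalized weight."""
    total_weight = sum(weights[region] for region in per_region)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return math.prod(
        per_region[region] ** (weights[region] / total_weight)
        for region in per_region
    )
```

Under this rule a region with a larger weight pulls the overall figure more strongly toward its own result, and a region with result 0 and a positive weight still forces the overall figure to 0.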
Preferably, the method further comprises analyzing the quantitative disaster tolerance test results of the system under test and generating a routing disaster tolerance strategy for it.
The invention provides a system disaster tolerance capability test method from the perspective of system reliability evaluation. By operating network equipment and analyzing network traffic components, the method quantitatively evaluates a system's disaster tolerance capability, making the evaluation test accurate and convenient.
While embodiments have been described with reference to specific exemplary embodiments thereof, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the inventive subject matter. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Claims (5)
1. A system disaster tolerance capability test method, applied to a disaster tolerance test platform to perform a quantitative disaster tolerance test on a system under test, characterized in that:
the disaster tolerance test platform comprises:
an sFlow traffic monitoring module, configured to monitor network traffic based on the sFlow protocol and to analyze in real time the traffic components from each system to the wide area network;
a link interruption and recovery module, configured to take the systems in each region offline or online by closing or opening ports on the wide area network routers;
a disaster recovery traffic analysis module, configured to analyze, after the system of a designated region goes offline, the traffic component quintuples collected by the sFlow traffic monitoring module, and to determine whether the service traffic has migrated to other regions and what proportion has migrated normally;
and an evaluation module, configured to quantitatively test the disaster tolerance capability of the system according to the proportion of service traffic that migrated normally.
The system under test comprises at least: a system S1 deployed in a plurality of regions D1, D2, …, Dn, wherein the local area networks of regions D1, D2, …, Dn are connected to a wide area network through wide area network routers L1, L2, …, Ln, respectively, and when the system S1 of region D1 fails or goes offline, the services of the system S1 of region D1 can continue to be provided by the systems S1 of the other regions D2, …, Dn;
the system disaster tolerance capability test method comprises the following steps:
step 1, closing the port on wide area network router L1, so that the system S1 of region D1 is taken offline at router L1;
step 2, transferring service requests to the local area networks of the systems S1 in one or more of the other regions D2, …, Dn;
step 3, monitoring, based on the sFlow protocol, the network traffic of the local area networks of the systems S1 in one or more of the other regions D2, …, Dn, and counting in real time the traffic components from those systems to the wide area network, wherein each traffic component is a quintuple <source IP, destination IP, source MAC, destination MAC, port number>;
step 4, analyzing the traffic component quintuples and determining whether the service traffic has migrated to one or more of the other regions D2, …, Dn;
step 5, if the service traffic is determined not to have migrated to other regions, recording the disaster tolerance capability $R_{S1\_D1}$ of system S1 after the system S1 of region D1 goes offline as 0, and going to step 7; otherwise, going to step 6;
step 6, calculating the capability $R_{S1\_D1}$ of system S1 to migrate its services to other regions after the system S1 of region D1 goes offline;
step 7, opening the port on wide area network router L1, so that the system S1 of region D1 is brought back online at router L1;
step 8, repeating steps 1 to 7 for wide area network routers L2, …, Ln in turn, and likewise calculating the capabilities $R_{S1\_D2}$, …, $R_{S1\_Dn}$ of system S1 to migrate its services to other regions after the system S1 of each of regions D2, …, Dn goes offline;
step 9, calculating the disaster tolerance capability $R_{S1}$ of system S1, thereby completing the quantitative test of the disaster tolerance capability of the system under test.
2. The method of claim 1, wherein in step 6 the capability $R_{S1\_D1}$ of system S1 to migrate its services to other regions after the system S1 of region D1 goes offline is calculated by a formula over the following quantities: x is a triple <source IP, source MAC, port number> taken from a quintuple collected by sFlow; counter is the total number of triples; bps is the real-time traffic rate of each triple; and S1_Dn denotes the system S1 in region Dn.
3. The method of claim 1, wherein in step 9 the disaster tolerance capability $R_{S1}$ of system S1 is calculated as follows, completing the quantitative disaster tolerance test of the system under test:
$$R_{S1} = R_{S1\_D1} \times R_{S1\_D2} \times \cdots \times R_{S1\_Dn}$$
4. The method of claim 1, further comprising analyzing the network topology of each local area network and of the wide area network to determine how strongly a fault at each topological location affects the network, assigning weight parameters accordingly, and taking those weights into account when calculating the disaster tolerance capability.
5. The method of claim 1, further comprising analyzing the quantitative disaster tolerance test results of the system under test and generating a routing disaster tolerance strategy for the system under test.
Priority Applications (1)
| Application Number | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202010366940.8A | CN111711964B (en) | 2020-04-30 | 2020-04-30 | System disaster recovery capability test method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111711964A | 2020-09-25 |
| CN111711964B | 2024-02-02 |
Family
ID=72536841
Family Applications (1)
| Application Number | Status | Publication | Priority Date | Filing Date | Title |
|---|---|---|---|---|---|
| CN202010366940.8A | Active | CN111711964B (en) | 2020-04-30 | 2020-04-30 | System disaster recovery capability test method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111711964B (en) |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111711964B (en) | 2024-02-02 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |