GB2350035A - Management system and method for monitoring stress in a network - Google Patents
- Publication number
- GB2350035A (application GB9917993A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- value
- network
- stress
- raw
- monitored
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/10—Active monitoring, e.g. heartbeat, ping or trace-route
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/16—Threshold monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0823—Errors, e.g. transmission errors
- H04L43/0847—Transmission error
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0852—Delays
- H04L43/0864—Round trip delays
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
- H04L43/0882—Utilisation of link capacity
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Cardiology (AREA)
- General Health & Medical Sciences (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
A raw data value for a monitored characteristic of a network device or link is obtained by the network management system and compared with a predetermined threshold value for the monitored characteristic. If the raw data value is greater than or equal to the threshold value, the management system provides a stress value equal to a default value which is a maximum or minimum of a predefined bounded range; if the raw data value is less than the threshold value the management system calculates a stress value within the bounded range using an appropriate algorithm. Thus raw data is normalised to a predetermined range for example zero to 100 for ease of interpretation by the network manager. Stress values for a plurality of monitored characteristics may be combined to form a single aggregated stress value for the network.
Description
MANAGEMENT SYSTEM AND METHOD FOR MONITORING STRESS IN A NETWORK
The present invention relates to the management of a communications system or network, and more particularly to the monitoring of "stress" in a network.
The following description is concerned with a data communications system such as a local area network (LAN), that is an Ethernet network. However, the skilled person will appreciate that the present invention will have more general applicability to other types of managed networks including wireless networks.
A local area network (LAN) typically comprises a plurality of computers, computer systems, workstations and other electronic devices connected together by a common medium such as twisted pair, coaxial cable or fibre optic cable. Data can be communicated between devices on the network by means of data packets (or frames) in accordance with a predefined protocol.
Computers and other devices connected to a network can be managed or unmanaged devices. A managed device has processing capability which enables it inter alia to monitor data traffic sent from, received at, and passing through the ports of the device. Monitored data associated with the ports of the network device is stored in memory on the network device. Unmanaged devices do not have this processing capability.
It is becoming increasingly common and necessary for an individual to be appointed to manage a network. The appointed network manager (or administrator) utilises a network management station which includes network management hardware and software. In particular, the network management station is able to access management data from managed network devices using an appropriate management protocol (e.g. the SNMP protocol) and display this data for use by the network manager.
Known network management systems simply read the management data from the managed network devices and present this data to the network manager, typically in the form of numeric text, substantially unchanged. The network manager is expected to interpret the data in managing the network.
One of the important tasks of a network manager is to assess the operational performance of the various network devices and links as well as the network as a whole.
The network manager needs to know when problems are arising within the network and which particular network devices are responsible for such problems etc.
Typical problems which may affect the performance of a network include:
1. slow operating speed of the network, and individual network devices, leading to slow movement of data traffic across the network, indicated by e.g. slow response time for a given network device;
2. high volumes of data traffic on the network due to e.g. over-utilisation of the network links, network devices and the network as a whole; and
3. high error rates in the transmission of data packets across the network, indicated by e.g. the loss of data packets in a network device and errors in received data packets.
The aforementioned problems which occur in the operation of a network contribute to the poor "health" of a network. The concept of the health of a network is well known in the field of network management. In the present description, however, it is more convenient to refer to the "stress" of a network (or network device or link) rather than its "health". It will be understood that a high level of stress equates to a low level of health and vice versa. Although the present invention is described as monitoring stress in a network, it will be appreciated that the present invention is equally applicable to monitoring health in a network.
Problems which affect the performance of a network, and therefore contribute to the level of "stress", may be of greater or lesser significance. For instance, a problem with the operating speed of an end station may be less significant than a problem with the operating speed of a central core device such as a switch. Thus, the level of stress of a network depends upon a number of factors including device type, network media type and the type of problem occurring. The network manager must take all such factors into account when interpreting the management data received from the managed network devices to establish whether the performance of the network is satisfactory.
It would be desirable for a network management system to monitor characteristics of network devices and links which are indicative of problems occurring in a network which contribute to "stress", (such characteristics are referred to herein as "metrics") and provide data to the network manager indicative of the level of stress which is easy to interpret. It would further be desirable to determine a value indicating the overall performance of each network object (a "network object" is defined herein as a network device or link) on a network and/or the overall performance of parts of, or all of, the network, so that the network manager would not be required to interpret the monitored data, but could immediately ascertain the performance of the network (or part of the network) from the value provided by the network management system.
In accordance with a first aspect, the present invention provides a method for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network or a part thereof, the method comprising:
obtaining a raw value of data for a monitored characteristic of a network device or link;
comparing said raw value with a predetermined threshold value for said monitored characteristic;
if said raw value is greater than or equal to said threshold value, providing a stress value equal to a default value, said default value being a maximum or minimum value of a predefined bounded range; and
if said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic.
In accordance with a second aspect, the present invention provides a computer readable medium having a computer program for carrying out the method of the first aspect of the present invention.
In accordance with a third aspect the present invention provides network management apparatus for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network, the apparatus comprising:
a network connection or port for receiving a raw value of data for a monitored characteristic of a network device or link;
a processor for comparing said raw value with a predetermined threshold value for said monitored characteristic obtained from memory; and
if the comparison determines that said raw value is greater than or equal to said threshold value, providing a stress value equal to a default value, said default value being a maximum or minimum value of a predefined bounded range; and
if the comparison determines that said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic stored in memory.
The present invention thus obtains a value for the stress of a network, network object or part thereof, within a predefined, bounded range, that is to say a "normalised" stress value. Since the stress value is a normalised value, it is easy to understand and requires no interpretation by the network manager.
Embodiments of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
Figure 1 is a block diagram of a typical network having a network management system according to a preferred embodiment of the present invention;
Figure 2 is a stress mapping graph for a first stress metric which may be monitored by the network management system in accordance with the present invention;
Figure 3 is a stress mapping graph for a second stress metric which may be monitored by the network management system in accordance with the present invention; and
Figure 4 is a flow chart showing the steps carried out by a computer program in accordance with a preferred embodiment of the present invention.
Figure 1 shows a typical network 1 incorporating a network management system according to a preferred embodiment of the present invention. The network 1 includes a network management station 3A which incorporates the necessary hardware and software for network management. In particular, the network management station includes a processor, a memory and a disk drive, and preferably also a modem for internet access, as well as user interfaces such as a keyboard and mouse, and a visual display unit.
Network management application software in accordance with the present invention is loaded into the memory of management station 3A for processing data as described in detail below. The network management station 3A is connected by network media links to a plurality of managed network devices including core devices such as network switch 7, hubs 11, a router (not shown) and end stations, which may be managed or unmanaged, including personal computers 3 and workstations. The network may also include unmanaged devices, for example peripheral devices such as printers etc.
The network management station 3A is capable of communicating with the managed network devices such as network switch 7 and hubs 11 by means of a network management protocol (e.g. the SNMP protocol) in order to obtain network management data. Each managed device includes a processor which monitors and stores data in memory on the device, and such data may be represented to an external management station by a MIB (management information base), as is well known in the art, including data relating to inter alia data traffic at the device. A typical managed device monitors data contained in a number of MIBs (of which one or more contains data used in the present invention). An example of a MIB containing network management data is MIB-II (formerly MIB-I) as specified by the IETF (Internet Engineering Task Force) in specification RFC1213. MIB-II is common to most vendors' core devices and any network management system should preferably be capable of reading and utilising management data from MIB-II. Furthermore, the network management system of the preferred embodiment of the present invention is additionally capable of reading and utilising more complex management data contained in such MIBs as RMON (Remote Monitoring MIB, RFC1271), RMON2 (Remote Monitoring MIB 2, RFC2021), the standard bridge MIB (RFC1493), the standard repeater MIB (RFC1516), or any proprietary MIBs produced by original equipment manufacturers (e.g. the 3Com Remote Poll MIB).
In accordance with the preferred embodiment of the present invention, the network management station 3A obtains data about a certain metric for a network object which is indicative of the level of "stress" of the network object. For example, the station 3A may request from a managed device, such as switch 7, a single piece of information (called an "object instance" or SNMP variable) from a MIB about a particular state of the device such as the current error rate in data packets sent from and received at one of its ports. In accordance with the present invention, the data obtained from the MIB is processed using a predetermined "mapping algorithm" for error rate in a port of the switch to obtain a "normalised" stress value (as explained in more detail below). The normalised stress value may be displayed on the visual display unit of the network management station or otherwise communicated to the network manager, e.g. via a printer or through another application such as a word processing application. The normalised stress value represents the perceived level of stress of the network device (i.e. a level which takes into account all relevant factors including the type and location of the network device) which can be readily understood by the network manager as indicating whether the level of stress is acceptable.
The network management system may also obtain metric data for a network object without accessing a MIB in a managed device. For example, the network management station 3A may send a signal to a device over a link which prompts a response from the device (e.g. by IP ICMP echo or IP Ping, as is known in the art). The network management station will itself then monitor the time taken to receive a response from the device. The normalised stress value is then determined using a predetermined "mapping algorithm" for the monitored metric.
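The timing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: `measure_response_ms` and the `probe` callable are hypothetical names, and `probe` stands in for whatever request the station sends. Reporting a failed probe as the timeout value is likewise an assumption of this sketch.

```python
import time
from typing import Callable


def measure_response_ms(probe: Callable[[], None],
                        timeout_ms: float = 1000.0) -> float:
    """Time one request/response exchange and return elapsed milliseconds.

    `probe` is a hypothetical callable that sends a request and blocks
    until the response arrives, raising TimeoutError or OSError on failure.
    Assumption: a failed probe is reported as `timeout_ms`, so it is
    treated as being at least as slow as the threshold.
    """
    start = time.perf_counter()
    try:
        probe()
    except (TimeoutError, OSError):
        return timeout_ms
    return (time.perf_counter() - start) * 1000.0
```

The returned value is a raw value in milliseconds, which would then be compared with the response-time threshold for the metric.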
The following tables represent examples of the information/characteristics which the network management system, according to the preferred embodiment of the present invention, monitors. In the tables, the monitored information/characteristic is called a "metric" and the typical level of the raw measured value which represents an unacceptably high level of stress (i.e. poor performance) is called the "default threshold". The default threshold is set by the vendor of the network management system but may be adjustable by the user, if required.
Examples of metrics obtained by the network management station from a network device are shown in Table 1. These stress metrics are examples of characteristics monitored by the network management system to ensure that parts of the network devices (i.e. discrete hardware or software components) are operating in an acceptable fashion.
Table 1
metric                     typical default threshold
IP Ping response time      1000 milliseconds
DNS response time          1000 milliseconds
FTP response time          1000 milliseconds
HTTP response time         1000 milliseconds
POP3 response time         1000 milliseconds
SMTP response time         1000 milliseconds
NFS response time          1000 milliseconds
(Note that not all devices will have all of the functions indicated by the metric. The network management station of the preferred embodiment only sends signals appropriate for each network device to obtain the metrics for that device).
Examples of metrics obtained by the network management station requesting information contained in a MIB in a managed network device such as switch 7 are shown in Table 2. These stress metrics are examples of the characteristics monitored in a core device to determine whether errors and bottlenecks of data traffic (due to excessive traffic) are occurring in the device.
Table 2
metric                                           typical default threshold
frames discarded due to excessive delay          0 frames per second
frames discarded due to MTU exceeded             0 frames per second
frames discarded because filtering table full    0 frames per second
rate of topology changes                         0 per second
(Note that the metrics listed in Table 2 are monitored and stored in most layer 2 switching devices such as bridges or switches.)
Examples of metrics obtained by the network management station from a network device such as a half duplex Ethernet interface or port on a network device are shown in Table 3. These stress metrics are examples of characteristics monitored to identify problems occurring at the ports of network devices such as over-utilisation of the link connected to the port.
Table 3
metric               typical default threshold
link utilisation     35 % bandwidth
link error rate      5 % all frames
collisions           20 per second
broadcast frames     3000 per second
(Note that the metrics listed in Table 3 are monitored and stored for each port in most managed network devices which support a MIB capable of monitoring traffic flow on a port (e.g. MIB-II, RMON, repeater MIB, or a similar operating MIB).)
Finally, an example of a metric obtained by the network management station requesting data from a managed network device which supports a proprietary MIB, for example a MIB of 3Com Corporation, is response time from far end of link, which has a typical default threshold of 1000 milliseconds. The network management station requests from the network device this data, and the processor in the network device sends a signal to the far end of the link and times the period for a response to be received. The monitored response time is then provided to the management station as a raw value indicative of problems relating to the speed of transmission of data across the network.
A normalised stress value is determined by the network management system of the preferred embodiment using an appropriate algorithm as discussed in more detail below, based on a) the characteristics of the information being requested; b) the type of managed device being monitored, and c) the type of media to which the device is attached. In the embodiment, the stress value is "normalised" for all stress metrics within a predefined, bounded range. In the preferred embodiment the range is 0 - 100 where 100 is an unacceptable level of stress. As mentioned previously, the stress value for each metric is determined using an appropriate "stress mapping" algorithm which considers the aforementioned factors a) to c) to provide an appropriate level of perceived stress in the range 0 to 100 to the user. In particular, the stress mapping applied to any monitored value will always return a stress value in the same range i.e. 0 to 100. The stress mapping algorithms for all monitored values (i.e. all metrics for all network objects) are designed such that a given reported stress value always has the same meaning for the user, no matter which raw monitored value (i.e. metric) was used to generate the stress value. A better understanding of the manner of implementation of the stress mapping algorithms, which are predetermined by the vendor of the network management system, will be appreciated from Examples 1 and 2.
Example 1: First stress metric - Link utilisation
The network management station 3A requests from the network switch 7 a numerical value (or object instance) in MIB-II (and also in RMON) for determining the link utilisation at a particular port of the switch 7. As is well known in the art, the processor in switch 7 will monitor the numbers of data packets and bytes which are transmitted from and received at each of its ports and store this data in memory as an appropriate MIB. The network management station 3A receives the MIB data for a particular port and therefore the link connected to the port, and knowing: the type of link and its state of operation, and therefore its bandwidth; and, the elapsed time since the last request and the last requested value, and hence the traffic rate since the last request; determines the % utilisation of the link. The % utilisation of bandwidth of the link is referred to below as the "utilisation value" and is a "raw value" (i.e. before stress mapping).
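The utilisation calculation described above can be sketched as follows. This is an illustrative reading, not the embodiment's code: the function name and parameters are hypothetical, the counter is assumed to be a byte (octet) count covering both transmitted and received traffic on the half duplex link, and the 32-bit wrap handling is an assumption about the counter width.

```python
def utilisation_percent(prev_octets: int, curr_octets: int,
                        elapsed_s: float, bandwidth_bps: float,
                        counter_bits: int = 32) -> float:
    """Percentage of link bandwidth used between two counter readings.

    prev_octets / curr_octets: byte counter at the last and current request.
    elapsed_s: seconds between the two requests.
    bandwidth_bps: link bandwidth in bits per second, known from the
    link type and its state of operation.
    """
    if elapsed_s <= 0 or bandwidth_bps <= 0:
        raise ValueError("elapsed_s and bandwidth_bps must be positive")
    # Modular subtraction copes with one counter wrap between requests
    # (assumption: the counter wraps at 2**counter_bits).
    delta_octets = (curr_octets - prev_octets) % (1 << counter_bits)
    bits_per_second = delta_octets * 8 / elapsed_s
    return 100.0 * bits_per_second / bandwidth_bps
```

For instance, 4,375,000 bytes transferred in 10 seconds on a 10 Mbit/s link corresponds to 3.5 Mbit/s, i.e. a utilisation value of 35%.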
The network management station then compares the utilisation value with the default threshold for the link stored in its memory (e.g. 35% bandwidth in accordance with Table 3). If the utilisation value exceeds or equals the default threshold then the stress value is determined to be the maximum stress value i.e. 100, indicating an unacceptably high level of stress. If the utilisation value is below the default threshold then the processor utilises an appropriate stress mapping algorithm to determine the stress value within the range 0 to 100 corresponding to the utilisation value.
The relationship between the utilisation value and the perceived level of stress is non-linear. The stress mapping for link utilisation may be represented as shown in Figure 2. As can be seen from this graph, the perceived stress on the network link increases dramatically from about 30% to an unacceptable level at 35% bandwidth utilised (the default threshold in Table 3). Thus 35% bandwidth utilisation or greater equates to the maximum stress value 100 as explained above.
At levels below 35% bandwidth utilisation, the non-linear relationship between link utilisation and perceived stress may be represented by key points as shown in Table 4.
Table 4
raw value    stress (0 to 100)
0 0 1 8 2 5 10 20 30 32 50 100
The implementation of the stress mapping algorithm extrapolates these key points to determine the appropriate stress value between 0 and 100 for any raw value (utilisation value).
Example 2: Second stress metric - Link error
The network management station 3A requests from the network switch 7 the number of error frames received at a certain port and the number of data packets received at that port. This data is contained in a MIB of the switch 7 such as MIB-II, RMON, or a similar standard or proprietary MIB. As is well known in the art, the processor in switch 7 keeps a count of the number of data packets which are received at each of its ports, and the number of data packets received in error at each of its ports, and stores this data as an appropriate MIB. The network management station 3A receives the aforementioned MIB data for a particular port and knowing the elapsed time since the last request and the last requested values, determines the % data packets in error on the link.
The % data packets in error on the link, i.e. the error rate, is referred to below as the "error value" and is a "raw value" (i.e. before stress mapping).
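A minimal sketch of how the error value might be derived from two successive counter readings is given below. The names are hypothetical, and two points are assumptions of the sketch rather than statements of the embodiment: the received-frame counter is taken to include the errored frames, and both counters are taken to wrap at 32 bits.

```python
def error_percent(prev_errors: int, curr_errors: int,
                  prev_frames: int, curr_frames: int,
                  counter_bits: int = 32) -> float:
    """Percentage of received frames that were in error since the last request."""
    wrap = 1 << counter_bits
    # Modular subtraction tolerates a single counter wrap between requests.
    error_delta = (curr_errors - prev_errors) % wrap
    frame_delta = (curr_frames - prev_frames) % wrap
    if frame_delta == 0:
        # No traffic in the interval: report no errors rather than divide by zero.
        return 0.0
    return 100.0 * error_delta / frame_delta
```

With 5 new error frames out of 100 newly received frames, the error value is 5%, which equals the default threshold in Table 3.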
The network management station then compares the error value with the default threshold for errors on the particular link stored in its memory (e.g. 5% frames in error in accordance with Table 3). If the error value exceeds or equals the default threshold then the stress value is determined to be the maximum stress value i.e. 100, indicating an unacceptably high level of stress. If the error value is below the default threshold then the processor utilises an appropriate stress mapping algorithm to determine the stress value within the range 0 to 100 corresponding to the error value.
The relationship between the error value and the perceived level of stress is non-linear. The stress mapping for link error may be represented as shown in Figure 3. As can be seen from this graph, the perceived stress on the network link increases dramatically from about 4% error rate to an unacceptable level at 5% frames in error (the default threshold in Table 3). Thus 5% error or greater equates to the maximum stress value of 100.
At levels below 5% error, the non-linear relationship between % frames in error and perceived stress may be represented by key points as shown in Table 5.
Table 5
raw value    stress (0 to 100)
0            0
1            1
2            2
3            5
4            30
5            100
The implementation of the stress mapping algorithm in the network management station extrapolates the key points in the stress mapping shown in Table 5 to obtain a stress value within the range 0 to 100 corresponding to any raw value (i.e. error value).
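One way to turn key points into a stress value for any raw value is piecewise-linear interpolation, sketched below. The text does not specify the curve between key points, so the straight-line segments are an assumption, as are the names `map_stress` and `LINK_ERROR_KEY_POINTS`. The final key point (5, 100) is taken from the 5% default threshold stated for link error.

```python
from bisect import bisect_right
from typing import List, Tuple

# Key points for link error: (raw % frames in error, stress 0 to 100).
# Assumption: the last point is the 5% threshold mapped to stress 100.
LINK_ERROR_KEY_POINTS: List[Tuple[float, float]] = [
    (0, 0), (1, 1), (2, 2), (3, 5), (4, 30), (5, 100),
]


def map_stress(raw: float, key_points: List[Tuple[float, float]]) -> float:
    """Map a raw value to a stress value by joining key points with
    straight lines (an assumed interpolation scheme).

    `key_points` must be sorted by raw value; raw values outside the
    covered range take the stress of the nearest end point.
    """
    xs = [x for x, _ in key_points]
    if raw <= xs[0]:
        return float(key_points[0][1])
    if raw >= xs[-1]:
        return float(key_points[-1][1])
    i = bisect_right(xs, raw)  # index of the first key point above raw
    (x0, y0), (x1, y1) = key_points[i - 1], key_points[i]
    return y0 + (y1 - y0) * (raw - x0) / (x1 - x0)
```

Under these assumptions an error value of 3.5% maps to a stress of 17.5, halfway between the key points for 3% and 4%, which reflects the steep rise in perceived stress as the threshold is approached.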
In accordance with the preferred embodiment of the present invention, the network management station 3A stores in memory the stress mapping algorithms and default thresholds for all stress metrics monitored in the management system. The processor of the network management station running the application software carries out the steps shown in Figure 4. At step 101 the program obtains for a particular network object a raw data value (or object instance) for a given metric and in step 102 compares it to the default threshold. If in step 103 it is found that the raw data value is greater than or equal to the default threshold, a default condition arises whereby the processor determines that the stress value is a default value in step 104, which in the preferred embodiment is a maximum (i.e. 100). Otherwise, the processor retrieves the appropriate algorithm for the metric in step 105 and calculates the normalised stress value (i.e. between 0 and 100) in step 106.
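Steps 102 to 106 can be sketched as a single function. The names below are hypothetical, the `mapping` argument stands in for whichever stress mapping algorithm is stored for the metric, and the final clamp to the 0 to 100 range is a defensive assumption of this sketch rather than a step shown in Figure 4.

```python
from typing import Callable

MAX_STRESS = 100.0  # top of the bounded range in the preferred embodiment


def normalised_stress(raw: float, threshold: float,
                      mapping: Callable[[float], float],
                      default: float = MAX_STRESS) -> float:
    """Return a stress value in the range 0 to 100 for one raw metric value."""
    # Steps 102-104: at or above the threshold, use the default value.
    if raw >= threshold:
        return default
    # Steps 105-106: otherwise apply the metric's own mapping algorithm.
    stress = mapping(raw)
    # Assumption: clamp so a faulty mapping cannot leave the bounded range.
    return min(max(stress, 0.0), MAX_STRESS)
```

As a usage illustration with a purely hypothetical linear mapping, `normalised_stress(500, 1000, lambda ms: ms / 10)` gives 50.0, while any response time of 1000 milliseconds or more gives 100.0 without the mapping being consulted.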
The thus determined stress value represents meaningful information to the network manager since it is normalised within a predetermined range. If the normalised stress value is then sent to a display unit in step 107 to be displayed or to a printer to be printed e.g. as a number on the visual display unit of the network management station, together with a large number of other normalised stress values, the network manager can simply look for high numbers (say above 50) to detect potential problems. Accordingly, by utilising "stress-mapping" to determine a normalised stress value for each metric of a network object, the network manager can compare the stress of different areas of a device or different parts of the network and be able to judge which areas need the most urgent attention, without the need to analyse the data for each device to determine whether it has a relatively high stress level for the type of device, the type of media link and the operating state of the link.
The network management system in accordance with the preferred embodiment of the present invention is designed to monitor a plurality of different stress metrics for each network object. The system retrieves the monitored data for some or all of the metrics appropriate to a given network object and aggregates the data to form an overall object stress value as explained below. This enables the user to view the data for an individual network object and ascertain its overall performance.
For instance, the stresses of a plurality of individual stress metrics of a network device are monitored and the data obtained is aggregated to form an overall device stress value.
A network device such as switch 7 is typically composed of the following components:
- a stackable backplane of some kind to which individual units (i.e. switches, routers etc) may be attached;
- a bus within each unit into which blades can be inserted; and
- ports or interfaces on each blade.
The stress of these components may be aggregated together to provide an overall stress value for the network device. Similarly, the stress of multiple network devices may be aggregated together to provide a single metric for overall network stress.
Example 3: Aggregation strategy for stress values for a plurality of metrics in a network device
In this example, the aggregation strategy for a given network device such as switch 7 (which may be a stacked device formed from several units) is as follows:
Step 1: The stress of each blade is determined to be the worst of:
- the blade's own monitored stress (i.e. the stress derived using measurable metrics on the blade itself); and
- the worst stress of any of the ports (or interfaces) on that blade
Step 2: The stress of each unit is determined to be the worst of:
- the unit's own monitored stress; and
- the worst stress of any of the blades inserted in that unit (as determined in Step 1)
Step 3: The stress of each network device is determined to be the worst of:
- the device's own monitored stress; and
- the worst stress of any of the units which are part of the device (as determined in Step 2)
The overall stress of the network can then be determined, for example to be the worst of the stresses of all of the network objects within the network.
The overall stress of the network is thus the worst stress of any individual monitored component within the network.
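The worst-of strategy of Example 3 can be sketched as below. This is a minimal illustration rather than the implementation of the described system: it assumes stress is expressed on a scale where a higher number is worse (for instance 0 to 100), and the `Component` class and the function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """A monitored component: a port, blade, unit or network device."""
    name: str
    own_stress: float = 0.0  # stress derived from metrics on the component itself
    children: List["Component"] = field(default_factory=list)


def aggregate_stress(component: Component) -> float:
    """Return the worst (highest) of the component's own stress and the
    aggregated stress of each of its constituent components."""
    child_stresses = [aggregate_stress(child) for child in component.children]
    return max([component.own_stress] + child_stresses)


def network_stress(objects: List[Component]) -> float:
    """Return the worst aggregated stress over all network objects
    (0.0 for an empty network, an assumption of this sketch)."""
    return max((aggregate_stress(obj) for obj in objects), default=0.0)


# Steps 1 to 3 for a small, made-up device hierarchy.
ports = [Component("port 1", 20.0), Component("port 2", 75.0)]
blade = Component("blade 1", 30.0, ports)
unit = Component("unit 1", 10.0, [blade])
switch = Component("switch 7", 5.0, [unit])
```

Because each level takes the maximum over its own stress and that of its constituents, a single highly stressed port propagates unchanged up to the device and then to the network.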
Example 4: Alternative aggregation strategies
Other more advanced aggregation strategies may follow the strategy of Example 3, but with the following additions:
1. For a unit or blade, also consider the effect that the bad performance of that component has on the parent component's ability to support other constituents. For example, a blade which is under high stress may affect the performance of other blades if it congests the common, shared backplane. A unit which is under high stress may affect the performance of a network device if it congests the device's single interface onto the network.
2. For the network device, unit or blade, the stress of the component may be determined to be worse or better than just the worst of the constituent components.
For example, the stresses of the interfaces on a blade may be 60, 70 and 80. But the network management system may intelligently decide that this combination of very high stresses means that the blade itself is stressed to, say, 90. If the stresses were 10, 10 and 80, the management application may decide that this single high stress does not warrant a blade stress of 80, but perhaps only 50.
As described above, in accordance with a preferred embodiment of the present invention, the network management system monitors a plurality of stress metrics for all of the managed devices on the network, and in addition to aggregating the data for each object (e.g. device) to form an overall stress value for that object (e.g. device), the network management system may additionally aggregate together the overall object stress values of all of the network objects to provide an overall stress value for the network. Alternatively, the network management system may aggregate the worst of the underlying stresses of each of the devices, or at each network level, to provide the overall network stress values. In the latter case, the overall network stress value will represent the stress value of the least healthy device or component across the whole network.
The network management station in accordance with the invention may monitor stress periodically or in response to commands from the network manager.
As will be appreciated from the foregoing, in accordance with a preferred embodiment, the present invention is implemented in the form of a software application which may be provided in the form of a computer program on a computer readable medium. The computer readable medium may be a disk which can be loaded in the disk drive of network management station 3A, or the computer system carrying the website or other form of file server (e.g. FTP) of, for example, the supplier of network devices, which permits downloading of the program by a management station over the internet.
The program steps 101 to 107 are illustrated in Figure 4 and have been described above.
As the skilled person will appreciate, various modifications may be made to the described embodiments and examples. For instance, as previously mentioned, if the invention is applied to determine a health value in a network, it will be appreciated that the same metrics, raw values and default thresholds will be applicable as for stress, as described above. However, the mapping algorithms will differ, and the default threshold will return the minimum value within the normalised range indicating poor health. The network manager will then need to look for low values of health to detect potential problems in the network.
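The contrast between a stress mapping and a health mapping can be sketched as follows. The sketch assumes a normalised range of 0 to 100 and, purely for illustration, a linear mapping below the threshold; the description states only that the mapping algorithms differ, so the linear form, the function name `normalised_value` and its parameters are assumptions.

```python
def normalised_value(raw: float, threshold: float, default_is_max: bool = True,
                     low: float = 0.0, high: float = 100.0) -> float:
    """Map a raw metric value onto the bounded range [low, high].

    default_is_max=True  gives a stress value: the threshold maps to `high`.
    default_is_max=False gives a health value: the threshold maps to `low`.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if raw >= threshold:
        # At or above the threshold, return the default value.
        return high if default_is_max else low
    # Below the threshold: linear interpolation (an illustrative assumption).
    fraction = max(raw, 0.0) / threshold
    if default_is_max:
        return low + fraction * (high - low)
    return high - fraction * (high - low)


stress = normalised_value(50.0, threshold=100.0)                        # stress scale
health = normalised_value(50.0, threshold=100.0, default_is_max=False)  # health scale
```

With the same raw value and threshold, the two scales move in opposite directions as the raw value approaches the threshold, which is why low health values, rather than high ones, indicate potential problems.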
The present invention is intended to include all such modifications and equivalents which fall within the scope of the present invention as defined in the accompanying claims.
Claims (37)
1. A method for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network or a part thereof, the method comprising:
obtaining a raw value of data for a monitored characteristic of a network device or link; comparing said raw value with a predetermined threshold value for said monitored characteristic; if said raw data value is greater than or equal to said threshold value, providing a stress value equal to a default value, said default value being a maximum or minimum value of a predefined bounded range; and if said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic.
2. A method as claimed in claim 1, wherein said monitored characteristic relates to the port of a network device and concerns the utilisation of the link connected to the device.
3. A method as claimed in claim 2, wherein said monitored characteristic includes one of link utilisation in frames per second; link error rate in frames per second; collisions per second, and broadcast frames per second.
4. A method as claimed in claim 1, 2 or 3, wherein said step of obtaining a raw value comprises receiving said raw value from memory in a network device.
5. A method as claimed in claim 1, 2 or 3, wherein said step of obtaining a raw value comprises receiving data values from memory in a network device, and calculating said raw value for said monitored characteristic using said data values.
6. A method as claimed in claim 1, wherein said monitored characteristic relates to the operating speed of hardware within a network device.
7. A method as claimed in claim 6, wherein said monitored characteristic includes one of IP Ping response time; DNS response time; FTP response time; HTTP response time; POP3 response time; SMTP response time, and NFS response time.
8. A method as claimed in claim 6 or claim 7, wherein said step of obtaining a raw value comprises:
sending a signal to said hardware in said network device, which signal prompts a response from said hardware, and timing the period for the response to be received; wherein said timed time period is the raw value.
9. A method as claimed in claim 1, wherein said monitored characteristic is a characteristic relating to errors occurring in a core network device.
10. A method as claimed in claim 9, wherein said monitored characteristic includes one of frames per second discarded due to excessive delay; frames per second discarded due to MTU exceeded; frames per second discarded because filtering table is full, and rate of topology changes.
11. A method as claimed in claim 9 or claim 10, wherein said step of obtaining a raw value comprises receiving said raw value from memory in said core network device.
12. A method as claimed in any preceding claim, wherein said predefined bounded range is 0 to 100, where the default value is 100 if it is the maximum value and 0 if it is the minimum value.
13. A method as claimed in any preceding claim, further comprising displaying said stress value as alphanumeric text or in graphical form on a display screen.
14. A method as claimed in any preceding claim, wherein the threshold value for the monitored characteristic is adjustable by the user.
15. A method for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network, the method comprising:
obtaining a plurality of raw values of data for a corresponding plurality of monitored characteristics of a network device or link; comparing each raw value with a predetermined threshold value for said corresponding monitored characteristic; for each raw value determining a stress value within a predetermined bounded range of values, wherein if said raw data value is greater than or equal to said corresponding threshold value, determining a stress value equal to a default value, said default value being either the maximum or minimum value of said predefined bounded range; and if said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic.
16. A method as claimed in claim 15, further comprising aggregating the stress values obtained for each of said plurality of monitored characteristics to provide an aggregated stress value.
17. A method for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network substantially as hereinbefore described, with reference to the accompanying drawings.
18. A computer readable medium containing a computer program for carrying out the method as claimed in any preceding claim.
19. Apparatus for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network, the apparatus comprising:
a network connection or port for receiving a raw value of data for a monitored characteristic of a network device or link, a processor for comparing said raw value with a predetermined threshold value for said monitored characteristic obtained from memory; and if the comparison determines that said raw data value is greater than or equal to said threshold value, providing a stress value equal to a default value, said default value being a maximum or minimum value of a predefined bounded range; and if the comparison determines that said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic stored in memory.
20. Apparatus as claimed in claim 19, wherein said monitored characteristic relates to the port of a network device and concerns the utilisation of the link connected to the device.
21. Apparatus as claimed in claim 20, wherein said monitored characteristic includes one of link utilisation in frames per second; link error rate in frames per second; collisions per second, and broadcast frames per second.
22. Apparatus as claimed in claim 19, 20 or 21, wherein said apparatus obtains said raw value by receiving said raw value at said port from memory in a network device.
23. Apparatus as claimed in claim 19, 20 or 21, wherein said apparatus obtains said raw value by receiving data values at said port from memory in a network device, and said processor calculating said raw value for said monitored characteristic using said received data values.
24. Apparatus as claimed in claim 19, wherein said monitored characteristic relates to the operating speed of hardware within a network device.
25. Apparatus as claimed in claim 24, wherein said monitored characteristic includes one of: IP Ping response time; DNS response time; FTP response time; HTTP response time; POP3 response time; SMTP response time, and NFS response time.
26. Apparatus as claimed in claim 25 or claim 26, wherein said apparatus obtains said raw value by sending a signal to said hardware in said network device, which signal prompts a response from said hardware, and timing the period for the response to be received; wherein said timed time period is the raw value used by said processor.
27. Apparatus as claimed in claim 19, wherein said monitored characteristic is a characteristic relating to errors occurring in a core network device.
28. Apparatus as claimed in claim 27, wherein said monitored characteristic includes one of: frames per second discarded due to excessive delay; frames per second discarded due to MTU exceeded; frames per second discarded because filtering table is full, and rate of topology changes.
29. Apparatus as claimed in claim 27 or claim 28, wherein said apparatus obtains said raw value by receiving said raw value at said port from memory in said core network device.
30. Apparatus as claimed in claim 27 or claim 28, wherein said apparatus obtains said raw value by receiving data values at said port from memory in a network device, and said processor calculating said raw value for said monitored characteristic using said received data values.
31. Apparatus as claimed in any one of claims 19 to 30, wherein said predefined bounded range is 0 to 100, where the default value is 100 if it is the maximum value and 0 if it is the minimum value.
32. Apparatus as claimed in any one of claims 19 to 31, further comprising a display for receiving said stress value from said processor and displaying said stress value as alphanumeric text or in graphical form.
33. Apparatus as claimed in any one of claims 19 to 32, further comprising memory to store said predetermined threshold value.
34. Apparatus as claimed in claim 33, further comprising means for writing a threshold value for a monitored characteristic to said memory.
35. Apparatus as claimed in any one of claims 19 to 34, wherein said apparatus is capable of obtaining a plurality of raw values of data for a corresponding plurality of monitored characteristics of a network device or link at said port, and said processor is adapted to compare each raw value with a predetermined threshold value for said corresponding monitored characteristic retrieved from memory, and for each raw value determining a stress value within a predetermined bounded range of values, wherein if said raw data value is greater than or equal to said corresponding threshold value, determining a stress value equal to a default value, the default value being either a maximum or minimum value of said predefined bounded range; and if said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic, the processor thereby providing a plurality of stress values corresponding to said plurality of raw values.
36. Apparatus as claimed in claim 35, wherein said processor is further adapted to aggregate the plurality of stress values for each of said plurality of monitored characteristics to provide an aggregated stress value.
37. Apparatus substantially as hereinbefore described with reference to, and as shown in, the accompanying drawings.
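The response-time measurement recited in claims 8 and 26 (sending a signal that prompts a response and timing the period until the response is received) can be sketched as below. The helper `timed_response` and the idea of passing the request as a callable are assumptions of this sketch; the claims do not prescribe how the signal is sent.

```python
import time
from typing import Callable


def timed_response(probe: Callable[[], object]) -> float:
    """Return the time in seconds taken by `probe`, a callable that sends a
    request to the monitored hardware and returns once the response arrives."""
    start = time.perf_counter()
    probe()  # blocks until the response has been received
    return time.perf_counter() - start


def fake_probe() -> None:
    """Stand-in for a real request (e.g. a DNS lookup); it simply waits."""
    time.sleep(0.02)


raw_value = timed_response(fake_probe)  # the timed period is the raw value
```

The timed period would then be compared with the threshold for that monitored characteristic in the same way as any other raw value.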
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/479,501 US6704284B1 (en) | 1999-05-10 | 2000-01-07 | Management system and method for monitoring stress in a network |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| GBGB9910838.3A GB9910838D0 (en) | 1999-05-10 | 1999-05-10 | Management system and method for monitoring stress in a network |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| GB9917993D0 GB9917993D0 (en) | 1999-09-29 |
| GB2350035A GB2350035A (en) | 2000-11-15 |
| GB2350035B GB2350035B (en) | 2001-12-05 |
Family
ID=10853179
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| GBGB9910838.3A Ceased GB9910838D0 (en) | 1999-05-10 | 1999-05-10 | Management system and method for monitoring stress in a network |
| GB9917993A Expired - Fee Related GB2350035B (en) | 1999-05-10 | 1999-07-30 | Management system and method for monitoring stress in a network |
Country Status (1)
| Country | Link |
|---|---|
| GB (2) | GB9910838D0 (en) |
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5819028A (en) * | 1992-06-10 | 1998-10-06 | Bay Networks, Inc. | Method and apparatus for determining the health of a network |
| GB2271918A (en) * | 1992-10-22 | 1994-04-27 | Hewlett Packard Co | Monitoring system status |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB2374247B (en) * | 1999-03-17 | 2004-06-30 | Ericsson Telefon Ab L M | Method and arrangement for performance analysis of data networks |
| GB2372667A (en) * | 2001-02-21 | 2002-08-28 | 3Com Corp | Providing improved stress thresholds in network management systems |
| GB2372667B (en) * | 2001-02-21 | 2003-05-07 | 3Com Corp | Apparatus and method for providing improved stress thresholds in network management systems |
| US6633230B2 (en) | 2001-02-21 | 2003-10-14 | 3Com Corporation | Apparatus and method for providing improved stress thresholds in network management systems |
| US7010588B2 (en) | 2001-02-27 | 2006-03-07 | 3Com Corporation | System using a series of event processors for processing network events to reduce number of events to be displayed |
| US7016955B2 (en) | 2001-02-27 | 2006-03-21 | 3Com Corporation | Network management apparatus and method for processing events associated with device reboot |
| US7673035B2 (en) | 2001-02-27 | 2010-03-02 | 3Com Corporation | Apparatus and method for processing data relating to events on a network |
Also Published As
| Publication number | Publication date |
|---|---|
| GB2350035B (en) | 2001-12-05 |
| GB9910838D0 (en) | 1999-07-07 |
| GB9917993D0 (en) | 1999-09-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US6704284B1 (en) | Management system and method for monitoring stress in a network | |
| US20210119890A1 (en) | Visualization of network health information | |
| US6269401B1 (en) | Integrated computer system and network performance monitoring | |
| US6469986B1 (en) | Method and system for configuring a network management network | |
| US6633230B2 (en) | Apparatus and method for providing improved stress thresholds in network management systems | |
| US6108800A (en) | Method and apparatus for analyzing the performance of an information system | |
| US6966015B2 (en) | Method and system for reducing false alarms in network fault management systems | |
| US10911263B2 (en) | Programmatic interfaces for network health information | |
| US7283555B2 (en) | Method and apparatus for determining a polling interval in a network management system | |
| WO2019133763A1 (en) | System and method of application discovery | |
| EP0996254A2 (en) | A method for quantifying communication performance | |
| US20030088529A1 (en) | Data network controller | |
| US20020165934A1 (en) | Displaying a subset of network nodes based on discovered attributes | |
| US7895333B2 (en) | Estimating network management bandwidth | |
| US7746801B2 (en) | Method of monitoring a network | |
| US20060168263A1 (en) | Monitoring telecommunication network elements | |
| US6954785B1 (en) | System for identifying servers on network by determining devices that have the highest total volume data transfer and communication with at least a threshold number of client devices | |
| GB2350035A (en) | Management system and method for monitoring stress in a network | |
| US7673035B2 (en) | Apparatus and method for processing data relating to events on a network | |
| US7742423B2 (en) | Method of heuristic determination of network interface transmission mode and apparatus implementing such method | |
| GB2362062A (en) | Network management apparatus with graphical representation of monitored values | |
| US6928394B2 (en) | Method for dynamically adjusting performance measurements according to provided service level | |
| Cheikhrouhou et al. | An efficient polling layer for SNMP | |
| CN117215781A (en) | Equipment scheduling system and method | |
| GB2362288A (en) | Generating events in network management systems using filters |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PCNP | Patent ceased through non-payment of renewal fee | Effective date: 20060730 |