GB2350035A - Management system and method for monitoring stress in a network - Google Patents

Info

Publication number
GB2350035A
Authority
GB
United Kingdom
Prior art keywords
value
network
stress
raw
monitored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB9917993A
Other versions
GB2350035B (en
GB9917993D0 (en
Inventor
David James Stevenson
Andrew Hunter Gray
Robert James Duncan
Alastair Hugh Chisholm
Vanessa Serra
Colin Tinto
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
3Com Corp
Original Assignee
3Com Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 3Com Corp filed Critical 3Com Corp
Publication of GB9917993D0 publication Critical patent/GB9917993D0/en
Priority to US09/479,501 priority Critical patent/US6704284B1/en
Publication of GB2350035A publication Critical patent/GB2350035A/en
Application granted granted Critical
Publication of GB2350035B publication Critical patent/GB2350035B/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H04L43/0847 Transmission error
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays
    • H04L43/0864 Round trip delays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • H04L43/0882 Utilisation of link capacity

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A raw data value for a monitored characteristic of a network device or link is obtained by the network management system and compared with a predetermined threshold value for the monitored characteristic. If the raw data value is greater than or equal to the threshold value, the management system provides a stress value equal to a default value which is a maximum or minimum of a predefined bounded range; if the raw data value is less than the threshold value the management system calculates a stress value within the bounded range using an appropriate algorithm. Thus raw data is normalised to a predetermined range for example zero to 100 for ease of interpretation by the network manager. Stress values for a plurality of monitored characteristics may be combined to form a single aggregated stress value for the network.

Description

MANAGEMENT SYSTEM AND METHOD FOR MONITORING STRESS IN A NETWORK
The present invention relates to the management of a communications system or network, and more particularly to the monitoring of "stress" in a network.
The following description is concerned with a data communications system such as a local area network (LAN), that is an Ethernet network. However, the skilled person will appreciate that the present invention will have more general applicability to other types of managed networks including wireless networks.
A local area network (LAN) typically comprises a plurality of computers, computer systems, workstations and other electronic devices connected together by a common medium such as twisted pair, coaxial cable or fibre optic cable. Data can be communicated between devices on the network by means of data packets (or frames) in accordance with a predefined protocol.
Computers and other devices connected to a network can be managed or unmanaged devices. A managed device has processing capability which enables it inter alia to monitor data traffic sent from, received at, and passing through the ports of the device. Monitored data associated with the ports of the network device is stored in memory on the network device. Unmanaged devices do not have this processing capability.
It is becoming increasingly common and necessary for an individual to be appointed to manage a network. The appointed network manager (or administrator) utilises a network management station which includes network management hardware and software. In particular, the network management station is able to access management data from managed network devices using an appropriate management protocol (e.g. the SNMP protocol) and display this data for use by the network manager.
Known network management systems simply read the management data from the managed network devices and present this data to the network manager, typically in the form of numeric text, substantially unchanged. The network manager is expected to interpret the data in managing the network.
One of the important tasks of a network manager is to assess the operational performance of the various network devices and links as well as the network as a whole.
The network manager needs to know when problems are arising within the network and which particular network devices are responsible for such problems etc.
Typical problems which may affect the performance of a network include:
1. slow operating speed of the network, and individual network devices, leading to slow movement of data traffic across the network, indicated by e.g. slow response time for a given network device;
2. high volumes of data traffic on the network due to e.g. over-utilisation of the network links, network devices and the network as a whole; and
3. high error rates in the transmission of data packets across the network, indicated by e.g. the loss of data packets in a network device and errors in received data packets.
The aforementioned problems which occur in the operation of a network contribute to the poor "health" of a network. The concept of the health of a network is well known in the field of network management. In the present description, however, it is more convenient to refer to the "stress" of a network (or network device or link) rather than its "health". It will be understood that a high level of stress equates to a low level of health and vice versa. Although the present invention is described as monitoring stress in a network, it will be appreciated that the present invention is equally applicable to monitoring health in a network.
Problems which affect the performance of a network, and therefore contribute to the level of "stress", may be of greater or lesser significance. For instance, a problem with the operating speed of an end station may be less significant than a problem with the operating speed of a central core device such as a switch. Thus, the level of stress of a network depends upon a number of factors including device type, network media type and the type of problem occurring. The network manager must take all such factors into account when interpreting the management data received from the managed network devices to establish whether the performance of the network is satisfactory.
It would be desirable for a network management system to monitor characteristics of network devices and links which are indicative of problems occurring in a network which contribute to "stress", (such characteristics are referred to herein as "metrics") and provide data to the network manager indicative of the level of stress which is easy to interpret. It would further be desirable to determine a value indicating the overall performance of each network object (a "network object" is defined herein as a network device or link) on a network and/or the overall performance of parts of, or all of, the network, so that the network manager would not be required to interpret the monitored data, but could immediately ascertain the performance of the network (or part of the network) from the value provided by the network management system.
In accordance with a first aspect, the present invention provides a method for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network or a part thereof, the method comprising:
obtaining a raw value of data for a monitored characteristic of a network device or link, comparing said raw value with a predetermined threshold value for said monitored characteristic; if said raw data value is greater than or equal to said threshold value, providing a stress value equal to a default value, said default value being a maximum or minimum value of a predefined bounded range; and if said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic.
In accordance with a second aspect, the present invention provides a computer readable medium having a computer program for carrying out the method of the first aspect of the present invention.
In accordance with a third aspect the present invention provides network management apparatus for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network, the apparatus comprising:
a network connection or port for receiving a raw value of data for a monitored characteristic of a network device or link; a processor for comparing said raw value with a predetermined threshold value for said monitored characteristic obtained from memory; and if the comparison determines that said raw data value is greater than or equal to said threshold value, providing a stress value equal to a default value, said default value being a maximum or minimum value of a predefined bounded range; and if the comparison determines that said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic stored in memory.
The present invention thus obtains a value for the stress of a network, network object or part thereof, within a predefined, bounded range, that is to say a "normalised" stress value. Since the stress value is a normalised value, it is easy to understand and requires no interpretation by the network manager.
Embodiments of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
Figure 1 is a block diagram of a typical network having a network management system according to a preferred embodiment of the present invention;
Figure 2 is a stress mapping graph for a first stress metric which may be monitored by the network management system in accordance with the present invention;
Figure 3 is a stress mapping graph for a second stress metric which may be monitored by the network management system in accordance with the present invention; and
Figure 4 is a flow chart showing the steps carried out by a computer program in accordance with a preferred embodiment of the present invention.
Figure 1 shows a typical network 1 incorporating a network management system according to a preferred embodiment of the present invention. The network 1 includes a network management station 3A which incorporates the necessary hardware and software for network management. In particular, the network management station includes a processor, a memory and a disk drive, and preferably also a modem for internet access, as well as user interfaces such as a keyboard and mouse, and a visual display unit.
Network management application software in accordance with the present invention is loaded into the memory of management station 3A for processing data as described in detail below. The network management station 3A is connected by network media links to a plurality of managed network devices including core devices such as network switch 7, hubs 11, a router (not shown) and end stations, which may be managed or unmanaged, including personal computers 3 and workstations. The network may also include unmanaged devices, for example peripheral devices such as printers etc.
The network management station 3A is capable of communicating with the managed network devices such as network switch 7 and hubs 11 by means of a network management protocol (e.g. the SNMP protocol) in order to obtain network management data. Each managed device includes a processor which monitors and stores data in memory on the device, and such data may be represented to an external management station by a MIB (management information base), as is well known in the art, including data relating to inter alia data traffic at the device. A typical managed device monitors data contained in a number of MIBs (of which one or more contains data used in the present invention). An example of a MIB containing network management data is MIB-II (formerly MIB-I) as specified by the IETF (Internet Engineering Task Force) in specification RFC1213. MIB-II is common to most vendors' core devices and any network management system should preferably be capable of reading and utilising management data from MIB-II. Furthermore, the network management system of the preferred embodiment of the present invention is additionally capable of reading and utilising more complex management data contained in such MIBs as RMON (Remote Monitoring MIB, RFC1271), RMON2 (Remote Monitoring MIB 2, RFC2021), the standard bridge MIB (RFC1493), the standard repeater MIB (RFC1516), or any proprietary MIBs produced by original equipment manufacturers (e.g. the 3Com Remote Poll MIB).
In accordance with the preferred embodiment of the present invention, the network management station 3A obtains data about a certain metric for a network object which is indicative of the level of "stress" of the network object. For example, the station 3A may request from a managed device, such as switch 7, a single piece of information (called an "object instance" or SNMP variable) from a MIB about a particular state of the device such as the current error rate in data packets sent from and received at one of its ports. In accordance with the present invention, the data obtained from the MIB is processed using a predetermined "mapping algorithm" for error rate in a port of the switch to obtain a "normalised" stress value (as explained in more detail below). The normalised stress value may be displayed on the visual display unit of the network management station or otherwise communicated to the network manager, e.g. via a printer or through another application such as a word processing application. The normalised stress value represents the perceived level of stress of the network device (i.e. a level which takes into account all relevant factors including the type and location of the network device) which can be readily understood by the network manager as indicating whether the level of stress is acceptable.
The network management system may also obtain metric data for a network object without accessing a MIB in a managed device. For example, the network management station 3A may send a signal to a device over a link which prompts a response from the device (e.g. by IP ICMP echo or IP Ping, as is known in the art). The network management station will itself then monitor the time taken to receive a response from the device. The normalised stress value is then determined using a predetermined "mapping algorithm" for the monitored metric.
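The timing step described above may be illustrated by the following minimal sketch, which is not taken from the specification. It assumes the probe itself (an ICMP echo, a DNS query, etc.) is supplied by the caller as a hypothetical callable named `probe`; the function merely measures how long that callable takes to return and reports the result in milliseconds, the unit used for the response-time thresholds.

```python
import time
from typing import Callable


def response_time_ms(probe: Callable[[], None]) -> float:
    """Return the elapsed time, in milliseconds, taken by `probe` to complete.

    `probe` is a hypothetical caller-supplied function that sends a request
    to the device and returns once the response has been received.
    """
    start = time.perf_counter()
    probe()
    return (time.perf_counter() - start) * 1000.0
```

The returned value would serve as the raw value for a response-time metric, to be compared with the relevant default threshold before any stress mapping is applied.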
The following tables represent examples of the information/characteristics which the network management system, according to the preferred embodiment of the present invention, monitors. In the tables, the monitored information/characteristic is called a "metric" and the typical level of the raw measured value which represents an unacceptably high level of stress (i.e. poor performance) is called the "default threshold". The default threshold is set by the vendor of the network management system but may be adjustable by the user, if required.
Examples of metrics obtained by the network management station from a network device are shown in Table 1. These stress metrics are examples of characteristics monitored by the network management system to ensure that parts of the network devices (i.e. discrete hardware or software components) are operating in an acceptable fashion.

Table 1

metric                     typical default threshold
IP Ping response time      1000 milliseconds
DNS response time          1000 milliseconds
FTP response time          1000 milliseconds
HTTP response time         1000 milliseconds
POP3 response time         1000 milliseconds
SMTP response time         1000 milliseconds
NFS response time          1000 milliseconds

(Note that not all devices will have all of the functions indicated by the metric. The network management station of the preferred embodiment only sends signals appropriate for each network device to obtain the metrics for that device).
Examples of metrics obtained by the network management station requesting information contained in a MIB in a managed network device such as switch 7 are shown in Table 2. These stress metrics are examples of the characteristics monitored in a core device to determine whether errors and bottlenecks of data traffic (due to excessive traffic) are occurring in the device.
Table 2

metric                                            typical default threshold
frames discarded due to excessive delay           0 frames per second
frames discarded due to MTU exceeded              0 frames per second
frames discarded because filtering table full     0 frames per second
rate of topology changes                          0 per second

(Note that the metrics listed in Table 2 are monitored and stored in most layer 2 switching devices such as bridges or switches.)

Examples of metrics obtained by the network management station from a network device such as a half duplex Ethernet interface or port on a network device are shown in Table 3. These stress metrics are examples of characteristics monitored to identify problems occurring at the ports of network devices such as over-utilisation of the link connected to the port.

Table 3

metric                typical default threshold
link utilisation      35 % bandwidth
link error rate       5 % all frames
collisions            20 per second
broadcast frames      3000 per second

(Note that the metrics listed in Table 3 are monitored and stored for each port in most managed network devices which support a MIB capable of monitoring traffic flow on a port (e.g. MIB-II, RMON, repeater MIB, or a similar operating MIB).)

Finally, an example of a metric obtained by the network management station requesting data from a managed network device which supports a proprietary MIB, for example a MIB of 3Com Corporation, is response time from far end of link, which has a typical default threshold of 1000 milliseconds. The network management station requests this data from the network device, and the processor in the network device sends a signal to the far end of the link and times the period for a response to be received. The monitored response time is then provided to the management station as a raw value indicative of problems relating to the speed of transmission of data across the network.
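By way of illustration only, the default thresholds of Tables 1 to 3 could be held in a simple lookup structure with optional user overrides, reflecting the statement that the vendor-set thresholds may be adjustable. The key names and the `threshold_for` helper below are hypothetical; only the numeric values come from the tables.

```python
from typing import Dict, Optional

# Vendor defaults taken from Tables 1 to 3 (key names are illustrative only).
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "ip_ping_response_ms": 1000,
    "dns_response_ms": 1000,
    "ftp_response_ms": 1000,
    "http_response_ms": 1000,
    "pop3_response_ms": 1000,
    "smtp_response_ms": 1000,
    "nfs_response_ms": 1000,
    "frames_discarded_delay_per_s": 0,
    "frames_discarded_mtu_per_s": 0,
    "frames_discarded_filter_full_per_s": 0,
    "topology_changes_per_s": 0,
    "link_utilisation_pct": 35,
    "link_error_rate_pct": 5,
    "collisions_per_s": 20,
    "broadcast_frames_per_s": 3000,
}


def threshold_for(metric: str, overrides: Optional[Dict[str, float]] = None) -> float:
    """Return the user override for `metric` if one exists, else the default."""
    if overrides and metric in overrides:
        return overrides[metric]
    return DEFAULT_THRESHOLDS[metric]
```

For example, `threshold_for("link_utilisation_pct")` yields 35, while passing `{"link_utilisation_pct": 40}` as `overrides` yields 40.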
A normalised stress value is determined by the network management system of the preferred embodiment using an appropriate algorithm as discussed in more detail below, based on a) the characteristics of the information being requested; b) the type of managed device being monitored, and c) the type of media to which the device is attached. In the embodiment, the stress value is "normalised" for all stress metrics within a predefined, bounded range. In the preferred embodiment the range is 0 - 100 where 100 is an unacceptable level of stress. As mentioned previously, the stress value for each metric is determined using an appropriate "stress mapping" algorithm which considers the aforementioned factors a) to c) to provide an appropriate level of perceived stress in the range 0 to 100 to the user. In particular, the stress mapping applied to any monitored value will always return a stress value in the same range i.e. 0 to 100. The stress mapping algorithms for all monitored values (i.e. all metrics for all network objects) are designed such that a given reported stress value always has the same meaning for the user, no matter which raw monitored value (i.e. metric) was used to generate the stress value. A better understanding of the manner of implementation of the stress mapping algorithms, which are predetermined by the vendor of the network management system, will be appreciated from Examples 1 and 2.
Example 1: First stress metric - Link utilisation
The network management station 3A requests from the network switch 7 a numerical value (or object instance) in MIB-II (and also in RMON) for determining the link utilisation at a particular port of the switch 7. As is well known in the art, the processor in switch 7 will monitor the numbers of data packets and bytes which are transmitted from and received at each of its ports and store this data in memory as an appropriate MIB. The network management station 3A receives the MIB data for a particular port, and therefore the link connected to the port, and, knowing the type of link and its state of operation (and therefore its bandwidth), and the elapsed time since the last request and the last requested value (and hence the traffic rate since the last request), determines the % utilisation of the link. The % utilisation of bandwidth of the link is referred to below as the "utilisation value" and is a "raw value" (i.e. before stress mapping).
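The derivation of the utilisation value from two successive counter readings may be sketched as follows. This is an illustrative assumption rather than the implementation of the embodiment: the function name, the use of a single octet counter (a caller monitoring a shared half duplex link might first sum the received and transmitted counts), and the handling of a 32-bit counter wrap are all hypothetical.

```python
def link_utilisation_percent(prev_octets: int, curr_octets: int,
                             elapsed_seconds: float, bandwidth_bps: float,
                             counter_bits: int = 32) -> float:
    """Return % of link bandwidth used between two octet-counter readings."""
    if elapsed_seconds <= 0 or bandwidth_bps <= 0:
        raise ValueError("elapsed time and bandwidth must be positive")
    # Modular subtraction tolerates a single wrap of the counter (assumption).
    delta_octets = (curr_octets - prev_octets) % (2 ** counter_bits)
    bits_per_second = delta_octets * 8 / elapsed_seconds
    return 100.0 * bits_per_second / bandwidth_bps
```

For instance, 7,500,000 octets counted over 60 seconds on a 10 Mbit/s link corresponds to 1 Mbit/s of traffic, i.e. a utilisation value of 10%.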
The network management station then compares the utilisation value with the default threshold for the link stored in its memory (e.g. 35% bandwidth in accordance with Table 3). If the utilisation value exceeds or equals the default threshold then the stress value is determined to be the maximum stress value i.e. 100, indicating an unacceptably high level of stress. If the utilisation value is below the default threshold then the processor utilises an appropriate stress mapping algorithm to determine the stress value within the range 0 to 100 corresponding to the utilisation value.
The relationship between the utilisation value and the perceived level of stress is non-linear. The stress mapping for link utilisation may be represented as shown in Figure 2. As can be seen from this graph, the perceived stress on the network link increases dramatically from about 30% to an unacceptable level at 35% bandwidth utilised (the default threshold in Table 3). Thus 35% bandwidth utilisation or greater equates to the maximum stress value 100 as explained above.
At levels below 35% bandwidth utilisation, the non-linear relationship between link utilisation and perceived stress may be represented by key points as shown in Table 4.

Table 4

raw value    stress (0 to 100)
0 0 1 8 2 5 10 20 30 32 50 100

The implementation of the stress mapping algorithm extrapolates these key points to determine the appropriate stress value between 0 and 100 for any raw value (utilisation value).
Example 2: Second stress metric - Link error
The network management station 3A requests from the network switch 7 the number of error frames received at a certain port and the number of data packets received at that port. This data is contained in a MIB of the switch 7 such as MIB-II, RMON, or a similar standard or proprietary MIB. As is well known in the art, the processor in switch 7 keeps a count of the number of data packets which are received at each of its ports, and the number of data packets received in error at each of its ports, and stores this data as an appropriate MIB. The network management station 3A receives the aforementioned MIB data for a particular port and, knowing the elapsed time since the last request and the last requested values, determines the % data packets in error on the link.
The % data packets in error on the link, i.e. the error rate, is referred to below as the "error value" and is a "raw value" (i.e. before stress mapping).
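A minimal sketch of how the error value might be derived from two successive pairs of counter readings is given below. The function and parameter names are hypothetical, and the sketch assumes that the frame counter includes the frames received in error and that both counters are 32 bits wide; neither point is stated in the description.

```python
def link_error_percent(prev_errors: int, curr_errors: int,
                       prev_frames: int, curr_frames: int,
                       counter_bits: int = 32) -> float:
    """Return % of frames received in error between two polls of a port."""
    modulus = 2 ** counter_bits
    # Modular subtraction tolerates a single counter wrap (assumption).
    delta_errors = (curr_errors - prev_errors) % modulus
    delta_frames = (curr_frames - prev_frames) % modulus
    if delta_frames == 0:
        return 0.0  # no traffic since the last poll, so no measurable errors
    return 100.0 * delta_errors / delta_frames
```

For instance, 5 new error frames among 100 newly received frames gives an error value of 5%, which equals the default threshold of Table 3.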
The network management station then compares the error value with the default threshold for errors on the particular link stored in its memory (e.g. 5% frames in error in accordance with Table 3). If the error value exceeds or equals the default threshold then the stress value is determined to be the maximum stress value i.e. 100, indicating an unacceptably high level of stress. If the error value is below the default threshold then the processor utilises an appropriate stress mapping algorithm to determine the stress value within the range 0 to 100 corresponding to the error value.
The relationship between the error value and the perceived level of stress is non-linear. The stress mapping for link error may be represented as shown in Figure 3. As can be seen from this graph, the perceived stress on the network link increases dramatically from about 4% error rate to an unacceptable level at 5% frames in error (the default threshold in Table 3). Thus 5% error or greater equates to the maximum stress value of 100.
At levels below 5% error, the non-linear relationship between % frames in error and perceived stress may be represented by key points as shown in Table 5.
Table 5

raw value    stress (0 to 100)
0            0
1            1
2            2
3            5
4            30
5            100

The implementation of the stress mapping algorithm in the network management station extrapolates the key points in the stress mapping shown in Table 5 to obtain a stress value within the range 0 to 100 corresponding to any raw value (i.e. error value).
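The description does not state how intermediate values are derived from the key points. One plausible reading, offered here purely as an assumption, is piecewise-linear interpolation between adjacent key points. The sketch below uses the link error key points, taking the final point as (5, 100) in line with the 5% default threshold; the names `LINK_ERROR_KEY_POINTS` and `map_stress` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# (raw value in % frames in error, stress) pairs; the last point reflects
# the 5 % default threshold mapping to the maximum stress of 100.
LINK_ERROR_KEY_POINTS: List[Tuple[float, float]] = [
    (0, 0), (1, 1), (2, 2), (3, 5), (4, 30), (5, 100),
]


def map_stress(raw: float, key_points: List[Tuple[float, float]]) -> float:
    """Linearly interpolate a stress value from sorted (raw, stress) points."""
    xs = [x for x, _ in key_points]
    if raw <= xs[0]:
        return float(key_points[0][1])
    if raw >= xs[-1]:
        return float(key_points[-1][1])
    i = bisect_right(xs, raw)  # index of the first key point above `raw`
    (x0, y0), (x1, y1) = key_points[i - 1], key_points[i]
    return y0 + (y1 - y0) * (raw - x0) / (x1 - x0)
```

Under this assumption an error value of 3.5% maps to a stress of 17.5, midway between the key points for 3% and 4%, and the steep rise between 4% and 5% is reproduced by the final segment.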
In accordance with the preferred embodiment of the present invention, the network management station 3A stores in memory the stress mapping algorithms and default thresholds for all stress metrics monitored in the management system. The processor of the network management station running the application software carries out the steps shown in Figure 4. At step 101 the program obtains for a particular network object a raw data value (or object instance) for a given metric and in step 102 compares it to the default threshold. If in step 103 it is found that the raw data value is greater than or equal to the default threshold, a default condition arises whereby the processor determines that the stress value is a default value in step 104, which in the preferred embodiment is a maximum (i.e. 100). Otherwise, the processor retrieves the appropriate algorithm for the metric in step 105 and calculates the normalised stress value (i.e. between 0 and 100) in step 106.
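Steps 102 to 106 may be sketched as a single function, given here as an illustration under stated assumptions rather than as the program of the embodiment. The mapping algorithm is passed in as a hypothetical callable, and the final clamping of its result to the 0 to 100 range is an added safeguard that the description does not mention.

```python
from typing import Callable

MAX_STRESS = 100.0  # upper bound of the normalised range in the preferred embodiment
MIN_STRESS = 0.0


def normalised_stress(raw: float, threshold: float,
                      mapping: Callable[[float], float]) -> float:
    """Return a stress value in [0, 100] for one raw metric reading."""
    # Steps 102 to 104: at or above the threshold, use the default (maximum).
    if raw >= threshold:
        return MAX_STRESS
    # Steps 105 and 106: apply the metric's own mapping algorithm.
    stress = mapping(raw)
    # Clamp so that any mapping stays within the bounded range (assumption).
    return min(max(stress, MIN_STRESS), MAX_STRESS)
```

With a threshold of 35 and a purely illustrative mapping `lambda r: 2 * r`, a raw value of 40 returns 100 while a raw value of 10 returns 20.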
The thus determined stress value represents meaningful information to the network manager since it is normalised within a predetermined range. If the normalised stress value is then sent to a display unit in step 107 to be displayed or to a printer to be printed e.g. as a number on the visual display unit of the network management station, together with a large number of other normalised stress values, the network manager can simply look for high numbers (say above 50) to detect potential problems. Accordingly, by utilising "stress-mapping" to determine a normalised stress value for each metric of a network object, the network manager can compare the stress of different areas of a device or different parts of the network and be able to judge which areas need the most urgent attention, without the need to analyse the data for each device to determine whether it has a relatively high stress level for the type of device, the type of media link and the operating state of the link.
The network management system in accordance with the preferred embodiment of the present invention is designed to monitor a plurality of different stress metrics for each network object. The system retrieves the monitored data for some or all of the metrics appropriate to a given network object and aggregates the data to form an overall object stress value as explained below. This enables the user to view the data for an individual network object and ascertain its overall performance.
For instance, the stresses of a plurality of individual stress metrics of a network device are monitored and the data obtained is aggregated to form an overall device stress value.
A network device such as switch 7 is typically composed of the following components:
- a stackable backplane of some kind to which individual units (i.e. switches, routers etc) may be attached;
- a bus within each unit into which blades can be inserted; and
- ports or interfaces on each blade.
The stress of these components may be aggregated together to provide an overall stress value for the network device. Similarly, the stress of multiple network devices may be aggregated together to provide a single metric for overall network stress.
Example 3: Aggregation strategy for stress values for a plurality of metrics in a network device
In this example, the aggregation strategy for a given network device such as switch 7 (which may be a stacked device formed from several units) is as follows:
Step 1: The stress of each blade is determined to be the worst of:
- the blade's own monitored stress (i.e. the stress derived using measurable metrics on the blade itself); and
- the worst stress of any of the ports (or interfaces) on that blade.
Step 2: The stress of each unit is determined to be the worst of:
- the unit's own monitored stress; and
- the worst stress of any of the blades inserted in that unit (as determined in Step 1).
Step 3: The stress of each network device is determined to be the worst of:
- the device's own monitored stress; and
- the worst stress of any of the units which are part of the device (as determined in Step 2).
The overall stress of the network can then be determined, for example to be the worst of the stresses of all of the network objects within the network.
The overall stress of the network is thus the worst stress of any individual monitored component within the network.
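The worst-of strategy of Steps 1 to 3 amounts to taking a maximum over a component hierarchy, since a higher stress value is worse. The following sketch illustrates this under the assumption that ports, blades, units and devices can all be modelled by one hypothetical `Component` type holding its own monitored stress and its constituent components.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """A port, blade, unit or device (hypothetical model for illustration)."""
    name: str
    own_stress: float = 0.0
    children: List["Component"] = field(default_factory=list)


def aggregate_stress(component: Component) -> float:
    """Return the worst (highest) stress of a component and its constituents."""
    child_stresses = [aggregate_stress(child) for child in component.children]
    return max([component.own_stress] + child_stresses)


# Example hierarchy: device -> unit -> blade -> ports.
ports = [Component("port 1", 10), Component("port 2", 72)]
blade = Component("blade 1", 40, ports)
unit = Component("unit 1", 5, [blade])
device = Component("switch 7", 20, [unit])
```

Here the blade, unit and device all report 72, the stress of the most stressed port, because no component higher in the hierarchy has a worse monitored stress of its own.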
Example 4: Alternative aggregation strategies

Other more advanced aggregation strategies may follow the strategy of Example 3, but with the following additions:

1. For a unit or blade, also consider the effect that the bad performance of that component has on the parent component's ability to support other constituents. For example, a blade which is under high stress may affect the performance of other blades if it congests the common, shared backplane. A unit which is under high stress may affect the performance of a network device if it congests the device's single interface onto the network.

2. For the network device, unit or blade, the stress of the component may be determined to be worse or better than just the worst of the constituent components. For example, the stresses of the interfaces on a blade may be 60, 70 and 80. But the network management system may intelligently decide that this combination of very high stresses means that the blade itself is stressed to, say, 90. If the stresses were 10, 10 and 80, the management application may decide that this single high stress does not warrant a blade stress of 80, but perhaps only 50.

As described above, in accordance with a preferred embodiment of the present invention, the network management system monitors a plurality of stress metrics for all of the managed devices on the network, and in addition to aggregating the data for each object (e.g. device) to form an overall stress value for that object (e.g. device), the network management system may additionally aggregate together the overall object stress values of all of the network objects to provide an overall stress value for the network. Alternatively, the network management system may aggregate the worst of the underlying stresses of each of the devices, or at each network level, to provide the overall network stress values. In the latter case, the overall network stress value will represent the stress value of the least healthy device or component across the whole network.
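Returning to the blade example in point 2 of Example 4, the following Python sketch shows one hypothetical way a management application might adjust a parent's stress above or below the worst constituent value. The text does not specify such a rule: the function name, the `weight` and `midpoint` parameters and the formula itself are assumptions chosen purely for illustration, and the results do not reproduce the figures of 90 and 50 quoted in the text.

```python
from typing import Sequence


def adjusted_parent_stress(child_stresses: Sequence[float],
                           weight: float = 0.5,
                           midpoint: float = 50.0) -> float:
    """Hypothetical heuristic: start from the worst child stress and shift it
    up or down according to how stressed the remaining children are."""
    if not child_stresses:
        return 0.0
    ordered = sorted(child_stresses, reverse=True)
    worst, rest = ordered[0], ordered[1:]
    if not rest:
        return float(worst)
    mean_rest = sum(rest) / len(rest)
    adjusted = worst + weight * (mean_rest - midpoint)
    return max(0.0, min(100.0, adjusted))  # stay within the bounded range 0-100


print(adjusted_parent_stress([60, 70, 80]))  # 87.5: several high stresses push the blade above 80
print(adjusted_parent_stress([10, 10, 80]))  # 60.0: a single high stress is discounted below 80
```

With these assumed parameters the heuristic moves in the same direction as the example in the text: a blade whose interfaces are all heavily stressed is rated worse than its worst interface, while a blade with one isolated hot interface is rated better.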
The network management station in accordance with the invention may monitor stress periodically or in response to commands from the network manager.
As will be appreciated from the foregoing, in accordance with a preferred embodiment, the present invention is implemented in the form of a software application which may be provided in the form of a computer program on a computer readable medium. The computer readable medium may be a disk which can be loaded in the disk drive of network management station 3A or the computer system carrying the website or other form of file server (e.g. FTP) of, for example, the supplier of network devices, which permits downloading of the program by a management station over the internet.
The program steps 101 to 107 are illustrated in Figure 4 and have been described above.
As the skilled person will appreciate, various modifications may be made to the described embodiments and examples. For instance, as previously mentioned, if the invention is applied to determine a health value in a network, it will be appreciated that the same metrics, raw values and default thresholds will be applicable as for stress, as described above. However, the mapping algorithms will differ, and the default threshold will return the minimum value within the normalised range indicating poor health. The network manager will then need to look for low values of health to detect potential problems in the network.
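The difference between the stress and health mappings can be illustrated with a short Python sketch. It assumes the 0 to 100 bounded range and, purely for illustration, a linear mapping below the threshold; the actual mapping algorithm for each metric may differ, and the function names and sample figures are hypothetical.

```python
def stress_value(raw: float, threshold: float) -> float:
    """Map a raw metric to stress in 0-100, where 100 is worst.
    Assumes threshold > 0 and a linear mapping below the threshold."""
    if raw >= threshold:
        return 100.0  # default value: maximum of the bounded range
    return max(0.0, 100.0 * raw / threshold)


def health_value(raw: float, threshold: float) -> float:
    """Map the same raw metric to health in 0-100, where 0 is worst.
    Assumes threshold > 0 and a linear mapping below the threshold."""
    if raw >= threshold:
        return 0.0  # default value: minimum of the bounded range
    return min(100.0, 100.0 - 100.0 * raw / threshold)


# Hypothetical metric: error frames per second with a threshold of 200.
for raw in (50, 250):
    print(raw, stress_value(raw, 200), health_value(raw, 200))
```

The same raw value and threshold feed both functions; only the direction of the scale changes, so a manager watches for high stress values or, equivalently, low health values.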
The present invention is intended to include all such modifications and equivalents which fall within the scope of the present invention as defined in the accompanying claims.

Claims (37)

CLAIMS:
1. A method for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network or a part thereof, the method comprising:
obtaining a raw value of data for a monitored characteristic of a network device or link; comparing said raw value with a predetermined threshold value for said monitored characteristic; if said raw data value is greater than or equal to said threshold value, providing a stress value equal to a default value, said default value being a maximum or minimum value of a predefined bounded range; and if said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic.
2. A method as claimed in claim 1, wherein said monitored characteristic relates to the port of a network device and concerns the utilisation of the link connected to the device.
3. A method as claimed in claim 2, wherein said monitored characteristic includes one of link utilisation in frames per second; link error rate in frames per second; collisions per second, and broadcast frames per second.
4. A method as claimed in claim 1, 2 or 3, wherein said step of obtaining a raw value comprises receiving said raw value from memory in a network device.
5. A method as claimed in claim 1, 2 or 3, wherein said step of obtaining a raw value comprises receiving data values from memory in a network device, and calculating said raw value for said monitored characteristic using said data values.
6. A method as claimed in claim 1, wherein said monitored characteristic relates to the operating speed of hardware within a network device.
7. A method as claimed in claim 6, wherein said monitored characteristic includes one of IP Ping response time; DNS response time; FTP response time; HTTP response time; POP3 response time; SMTP response time, and NFS response time.
8. A method as claimed in claim 6 or claim 7, wherein said step of obtaining a raw value comprises:
sending a signal to said hardware in said network device, which signal prompts a response from said hardware, and timing the period for the response to be received; wherein said timed time period is the raw value.
9. A method as claimed in claim 1, wherein said monitored characteristic is a characteristic relating to errors occurring in a core network device.
10. A method as claimed in claim 9, wherein said monitored characteristic includes one of frames per second discarded due to excessive delay; frames per second discarded due to MTU exceeded; frames per second discarded because filtering table is full, and rate of topology changes.
11. A method as claimed in claim 9 or claim 10, wherein said step of obtaining a raw value comprises receiving said raw value from memory in said core network device.
12. A method as claimed in any preceding claim, wherein said predefined bounded range is 0 to 100, where the default value is 100 if it is the maximum value and 0 if it is the minimum value.
13. A method as claimed in any preceding claim, further comprising displaying said stress value as alphanumeric text or in graphical form on a display screen.
14. A method as claimed in any preceding claim, wherein the threshold value for the monitored characteristic is adjustable by the user.
15. A method for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network, the method comprising:
obtaining a plurality of raw values of data for a corresponding plurality of monitored characteristics of a network device or link; comparing each raw value with a predetermined threshold value for said corresponding monitored characteristic; for each raw value determining a stress value within a predetermined bounded range of values, wherein if said raw data value is greater than or equal to said corresponding threshold value, determining a stress value equal to a default value, said default value being either the maximum or minimum value of said predefined bounded range; and if said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic.
16. A method as claimed in claim 15, further comprising aggregating the stress values obtained for each of said plurality of monitored characteristics to provide an aggregated stress value.
17. A method for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network substantially as hereinbefore described, with reference to the accompanying drawings.
18. A computer readable medium containing a computer program for carrying out the method as claimed in any preceding claim.
19. Apparatus for processing data representing monitored characteristics in a network comprising network devices and links to provide a stress value representing the performance of the network, the apparatus comprising:
a network connection or port for receiving a raw value of data for a monitored characteristic of a network device or link, a processor for comparing said raw value with a predetermined threshold value for said monitored characteristic obtained from memory; and if the comparison determines that raw data value is greater than or equal to said threshold value, providing a stress value equal to a default value, said default value being a maximum or minimum value of a predefined bounded range; and if the comparison determines that said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic stored in memory.
20. Apparatus as claimed in claim 19, wherein said monitored characteristic relates to the port of a network device and concerns the utilisation of the link connected to the device.
21. Apparatus as claimed in claim 20, wherein said monitored characteristic includes one of link utilisation in frames per second; link error rate in frames per second; collisions per second, and broadcast frames per second.
22. Apparatus as claimed in claim 19, 20 or 21, wherein said apparatus obtains said raw value by receiving said raw value at said port from memory in a network device.
23. Apparatus as claimed in claim 19, 20 or 21, wherein said apparatus obtains said raw value by receiving data values at said port from memory in a network device, and said processor calculating said raw value for said monitored characteristic using said received data values.
24. Apparatus as claimed in claim 19, wherein said monitored characteristic relates to the operating speed of hardware within a network device.
25. Apparatus as claimed in claim 24, wherein said monitored characteristic includes one of IP Ping response time; DNS response time; FTP response time; HTTP response time; POP3 response time; SMTP response time, and NFS response time.
26. Apparatus as claimed in claim 25 or claim 26, wherein said apparatus obtains said raw value by sending a signal to said hardware in said network device, which signal prompts a response from said hardware, and timing the period for the response to be received; wherein said timed time period is the raw value used by said processor.
27. Apparatus as claimed in claim 19, wherein said monitored characteristic is a characteristic relating to errors occurring in a core network device.
28. Apparatus as claimed in claim 27, wherein said monitored characteristic includes one of frames per second discarded due to excessive delay; frames per second discarded due to MTU exceeded; frames per second discarded because filtering table is full, and rate of topology changes.
29. Apparatus as claimed in claim 27 or claim 28, wherein said apparatus obtains said raw value by receiving said raw value at said port from memory in said core network device.
30. Apparatus as claimed in claim 27 or claim 28, wherein said apparatus obtains said raw value by receiving data values at said port from memory in a network device, and said processor calculating said raw value for said monitored characteristic using said received data values.
31. Apparatus as claimed in any one of claims 19 to 30, wherein said predefined bounded range is 0 to 100, where the default value is 100 if it is the maximum value and 0 if it is the minimum value.
32. Apparatus as claimed in any one of claims 19 to 31, further comprising a display for receiving said stress value from said processor and displaying said stress value as alphanumeric text or in graphical form.
33. Apparatus as claimed in any one of claims 19 to 32, further comprising memory to store said predetermined threshold value.
34. Apparatus as claimed in claim 33, further comprising means for writing a threshold value for a monitored characteristic to said memory.
35. Apparatus as claimed in any one of claims 19 to 34, wherein said apparatus is capable of obtaining a plurality of raw values of data for a corresponding plurality of monitored characteristics of a network device or link at said port, and said processor is adapted to compare each raw value with a predetermined threshold value for said corresponding monitored characteristic retrieved from memory, and for each raw value determining a stress value within a predetermined bounded range of values, wherein if said raw data value is greater than or equal to said corresponding threshold value, determining a stress value equal to a default value, the default value being either a maximum or minimum value of said predefined bounded range; and if said raw value is less than said threshold value, calculating a stress value within said predefined bounded range using an appropriate algorithm for said monitored characteristic, the processor thereby providing a plurality of stress values corresponding to said plurality of raw values.
36. Apparatus as claimed in claim 35, wherein said processor is further adapted to aggregate the plurality of stress values for each of said plurality of monitored characteristics to provide an aggregated stress value.
37. Apparatus substantially as hereinbefore described with reference to, and as shown in, the accompanying drawings.
GB9917993A 1999-05-10 1999-07-30 Management system and method for monitoring stress in a network Expired - Fee Related GB2350035B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/479,501 US6704284B1 (en) 1999-05-10 2000-01-07 Management system and method for monitoring stress in a network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GBGB9910838.3A GB9910838D0 (en) 1999-05-10 1999-05-10 Management system and method for monitoring stress in a network

Publications (3)

Publication Number Publication Date
GB9917993D0 GB9917993D0 (en) 1999-09-29
GB2350035A true GB2350035A (en) 2000-11-15
GB2350035B GB2350035B (en) 2001-12-05

Family

ID=10853179

Family Applications (2)

Application Number Title Priority Date Filing Date
GBGB9910838.3A Ceased GB9910838D0 (en) 1999-05-10 1999-05-10 Management system and method for monitoring stress in a network
GB9917993A Expired - Fee Related GB2350035B (en) 1999-05-10 1999-07-30 Management system and method for monitoring stress in a network

Family Applications Before (1)

Application Number Title Priority Date Filing Date
GBGB9910838.3A Ceased GB9910838D0 (en) 1999-05-10 1999-05-10 Management system and method for monitoring stress in a network

Country Status (1)

Country Link
GB (2) GB9910838D0 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2271918A (en) * 1992-10-22 1994-04-27 Hewlett Packard Co Monitoring system status
US5819028A (en) * 1992-06-10 1998-10-06 Bay Networks, Inc. Method and apparatus for determining the health of a network

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2374247B (en) * 1999-03-17 2004-06-30 Ericsson Telefon Ab L M Method and arrangement for performance analysis of data networks
GB2372667A (en) * 2001-02-21 2002-08-28 3Com Corp Providing improved stress thresholds in network management systems
GB2372667B (en) * 2001-02-21 2003-05-07 3Com Corp Apparatus and method for providing improved stress thresholds in network management systems
US6633230B2 (en) 2001-02-21 2003-10-14 3Com Corporation Apparatus and method for providing improved stress thresholds in network management systems
US7010588B2 (en) 2001-02-27 2006-03-07 3Com Corporation System using a series of event processors for processing network events to reduce number of events to be displayed
US7016955B2 (en) 2001-02-27 2006-03-21 3Com Corporation Network management apparatus and method for processing events associated with device reboot
US7673035B2 (en) 2001-02-27 2010-03-02 3Com Corporation Apparatus and method for processing data relating to events on a network

Also Published As

Publication number Publication date
GB2350035B (en) 2001-12-05
GB9910838D0 (en) 1999-07-07
GB9917993D0 (en) 1999-09-29

Similar Documents

Publication Publication Date Title
US6704284B1 (en) Management system and method for monitoring stress in a network
US20210119890A1 (en) Visualization of network health information
US6269401B1 (en) Integrated computer system and network performance monitoring
US6469986B1 (en) Method and system for configuring a network management network
US6633230B2 (en) Apparatus and method for providing improved stress thresholds in network management systems
US6108800A (en) Method and apparatus for analyzing the performance of an information system
US6966015B2 (en) Method and system for reducing false alarms in network fault management systems
US10911263B2 (en) Programmatic interfaces for network health information
US7283555B2 (en) Method and apparatus for determining a polling interval in a network management system
WO2019133763A1 (en) System and method of application discovery
EP0996254A2 (en) A method for quantifying communication performance
US20030088529A1 (en) Data network controller
US20020165934A1 (en) Displaying a subset of network nodes based on discovered attributes
US7895333B2 (en) Estimating network management bandwidth
US7746801B2 (en) Method of monitoring a network
US20060168263A1 (en) Monitoring telecommunication network elements
US6954785B1 (en) System for identifying servers on network by determining devices that have the highest total volume data transfer and communication with at least a threshold number of client devices
GB2350035A (en) Management system and method for monitoring stress in a network
US7673035B2 (en) Apparatus and method for processing data relating to events on a network
US7742423B2 (en) Method of heuristic determination of network interface transmission mode and apparatus implementing such method
GB2362062A (en) Network management apparatus with graphical representation of monitored values
US6928394B2 (en) Method for dynamically adjusting performance measurements according to provided service level
Cheikhrouhou et al. An efficient polling layer for SNMP
CN117215781A (en) Equipment scheduling system and method
GB2362288A (en) Generating events in network management systems using filters

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20060730