Appleby et al., 2002 - Google Patents
Yemanja—A layered fault localization system for multi-domain computing utilitiesAppleby et al., 2002
- Document ID
- 8432724804270305818
- Author
- Appleby K
- Goldszmidt G
- Steinder M
- Publication year
- Publication venue
- Journal of Network and Systems Management
External Links
Snippet
Yemanja is a model-based event correlation engine for multi-layer fault diagnosis. It targets complex propagating fault scenarios, and can smoothly correlate low-level network events with high-level application performance alerts related to quality-of-service violations. Entity …
- 230000004807 localization 0 title description 16
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing packet switching networks
- H04L43/08—Monitoring based on specific metrics
- H04L43/0805—Availability
- H04L43/0817—Availability functioning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/06—Arrangements for maintenance or administration or management of packet switching networks involving management of faults or events or alarms
- H04L41/0654—Network fault recovery
- H04L41/0659—Network fault recovery by isolating the faulty entity
- H04L41/0663—Network fault recovery by isolating the faulty entity involving offline failover planning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/04—Architectural aspects of network management arrangements
- H04L41/046—Aspects of network management agents
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/50—Network service management, i.e. ensuring proper service fulfillment according to an agreement or contract between two parties, e.g. between an IT-provider and a customer
- H04L41/5003—Managing service level agreement [SLA] or interaction between SLA and quality of service [QoS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0709—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/02—Arrangements for maintenance or administration or management of packet switching networks involving integration or standardization
- H04L41/0213—Arrangements for maintenance or administration or management of packet switching networks involving integration or standardization using standardized network management protocols, e.g. simple network management protocol [SNMP] or common management interface protocol [CMIP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/12—Arrangements for maintenance or administration or management of packet switching networks network topology discovery or management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/22—Arrangements for maintenance or administration or management of packet switching networks using GUI [Graphical User Interface]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing packet switching networks
- H04L43/10—Arrangements for monitoring or testing packet switching networks using active monitoring, e.g. heartbeat protocols, polling, ping, trace-route
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/14—Arrangements for maintenance or administration or management of packet switching networks involving network analysis or design, e.g. simulation, network model or planning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/02—Details
- H04L12/26—Monitoring arrangements; Testing arrangements
- H04L12/2602—Monitoring arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/08—Configuration management of network or network elements
- H04L41/0803—Configuration setting of network or network elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/16—Network management using artificial intelligence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3633511B1 (en) | Method and system for automatic real-time causality analysis of end user impacting system anomalies using causality rules and topological understanding of the system to effectively filter relevant monitoring data | |
US6856942B2 (en) | System, method and model for autonomic management of enterprise applications | |
US10924329B2 (en) | Self-healing Telco network function virtualization cloud | |
Meyer et al. | Decentralizing control and intelligence in network management | |
US6393386B1 (en) | Dynamic modeling of complex networks and prediction of impacts of faults therein | |
US7065566B2 (en) | System and method for business systems transactions and infrastructure management | |
US9509524B2 (en) | System and method for service level management | |
Yan et al. | G-rca: a generic root cause analysis platform for service quality management in large ip networks | |
Appleby et al. | Yemanja-a layered event correlation engine for multi-domain server farms | |
US11916721B2 (en) | Self-healing telco network function virtualization cloud | |
Sandur et al. | Jarvis: Large-scale server monitoring with adaptive near-data processing | |
Appleby et al. | Yemanja—A layered fault localization system for multi-domain computing utilities | |
Wang et al. | Spatio-temporal patterns in network events | |
Cho et al. | CBR-based network performance management with multi-agent approach | |
Liu et al. | Causal network telemetry | |
Lutfiyya et al. | Fault management in distributed systems: A policy-driven approach | |
Katchabaw et al. | Policy-driven fault management in distributed systems | |
Barone et al. | A management architecture for active networks | |
Kim et al. | ExNet: An Intelligent Network Management System. | |
Jailani et al. | FMS: A computer network fault management system based on the OSI standards | |
Olawuyi et al. | Computer network fault management system using the OSI standards | |
Kätker¹ et al. | FAULT MANAGEMENT IN QOS-ENABLED DISTRIBUTED | |
Lin | Check for Microscope: Pinpoint Performance Issues with Causal Graphs in Micro-service Environments Jinjin Lin, Pengfei Chen (), and Zibin Zheng | |
Kätker et al. | Fault management in QoS-enabled distributed systems | |
Popescu et al. | A semantic-oriented framework for system diagnosis |