Toslali et al., 2024 - Google Patents
An Online Probabilistic Distributed Tracing SystemToslali et al., 2024
View PDF- Document ID
- 13930558259601147045
- Author
- Toslali M
- Qasim S
- Parthasarathy S
- Oliveira F
- Huang H
- Stringhini G
- Liu Z
- Coskun A
- Publication year
- Publication venue
- arXiv preprint arXiv:2405.15645
External Links
Snippet
Distributed tracing has become a fundamental tool for diagnosing performance issues in the cloud by recording causally ordered, end-to-end workflows of request executions. However, tracing in production workloads can introduce significant overheads due to the extensive …
- 241000055285 Astraea <gastropod> 0 abstract description 140
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0766—Error or fault reporting or storing
- G06F11/0775—Content or structure details of the error report, e.g. specific table structure, specific error fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3636—Software debugging by tracing the execution of the program
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
- G06F11/3072—Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/008—Reliability or availability analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/86—Event-based monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance or administration or management of packet switching networks
- H04L41/14—Arrangements for maintenance or administration or management of packet switching networks involving network analysis or design, e.g. simulation, network model or planning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing packet switching networks
- H04L43/08—Monitoring based on specific metrics
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lee et al. | Eadro: An end-to-end troubleshooting framework for microservices on multi-source data | |
Böhme | STADS: Software testing as species discovery | |
Chen et al. | CauseInfer: Automated end-to-end performance diagnosis with hierarchical causality graph in cloud environment | |
Jiang et al. | Discovering likely invariants of distributed transaction systems for autonomic system management | |
US10303539B2 (en) | Automatic troubleshooting from computer system monitoring data based on analyzing sequences of changes | |
Aggarwal et al. | Localization of operational faults in cloud applications by mining causal dependencies in logs using golden signals | |
US20130097109A1 (en) | Method for determining a preferred node in a classification and regression tree for use in a predictive analysis | |
Reidemeister et al. | Mining unstructured log files for recurrent fault diagnosis | |
Yu et al. | TraceRank: Abnormal service localization with dis‐aggregated end‐to‐end tracing data in cloud native systems | |
US8321362B2 (en) | Methods and apparatus to dynamically optimize platforms | |
Oliner et al. | Online detection of multi-component interactions in production systems | |
Wu et al. | Causal inference techniques for microservice performance diagnosis: Evaluation and guiding recommendations | |
Ji et al. | Perfce: Performance debugging on databases with chaos engineering-enhanced causality analysis | |
Chen et al. | ARF-predictor: Effective prediction of aging-related failure using entropy | |
US11438239B2 (en) | Tail-based span data sampling | |
Agarwal et al. | Diagnosing mobile applications in the wild | |
US20200042418A1 (en) | Real time telemetry monitoring tool | |
Duplyakin et al. | In datacenter performance, the only constant is change | |
Yang et al. | Microres: Versatile resilience profiling in microservices via degradation dissemination indexing | |
Iqbal et al. | CADET: Debugging and fixing misconfigurations using counterfactual reasoning | |
Chow et al. | {DQBarge}: Improving {Data-Quality} Tradeoffs in {Large-Scale} Internet Services | |
Toslali et al. | Unleashing performance insights with online probabilistic tracing | |
Oliner et al. | Advances and Challenges in Log Analysis: Logs contain a wealth of information for help in managing systems. | |
WO2016085443A1 (en) | Application management based on data correlations | |
Toslali et al. | An Online Probabilistic Distributed Tracing System |