inferno/lib.rs
//! Inferno is a set of tools that let you produce [flame graphs] from performance profiles of
//! your application. It's a port of parts of Brendan Gregg's original [flamegraph toolkit] that
//! aims to improve the performance of the original flamegraph tools and provide programmatic
//! access to them to facilitate integration with _other_ tools (like [not-perf]).
//!
//! Inferno, like the original flame graph toolkit, consists of two "stages": stack collapsing and
//! plotting. In the original Perl implementations, these were represented by the `stackcollapse-*`
//! binaries and `flamegraph.pl` respectively. In Inferno, collapsing is available through the
//! [`collapse`] module and the `inferno-collapse-*` binaries, and plotting can be found in the
//! [`flamegraph`] module and the `inferno-flamegraph` binary.
//!
//! # Command-line use
//!
//! ## Collapsing stacks
//!
//! Most sampling profilers (as opposed to [tracing profilers]) work by repeatedly recording the
//! state of the [call stack]. The stack can be sampled based on a fixed sampling interval, based
//! on [hardware or software events], or some combination of the two. In the end, you get a series
//! of [stack traces], each of which is a snapshot of where the program was at a particular
//! point in time.
//!
//! Given enough of these snapshots, you can get a pretty good idea of where your program is
//! spending its time by looking at which functions appear in many of the traces. To ease this
//! analysis, we want to "collapse" the stack traces so that if a particular trace occurs more than
//! once, we keep it just _once_ along with a count of how many times we've seen it. This is what
//! the various collapsing tools do! You'll sometimes see the resulting tuples of stack + count
//! called a "folded stack trace".
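//!
//! As a rough illustration of what "collapsing" means, here is a minimal sketch that counts
//! identical stack traces and prints them in the semicolon-separated `frame;frame;frame count`
//! folded form. The `fold_stacks` helper and the sample frame names are hypothetical and exist
//! only for this example; the real collapsers additionally have to parse each profiler's output.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Collapses sampled stack traces into folded lines of the form
//! /// `outermost;...;innermost count`. Hypothetical helper, not part of Inferno's API.
//! fn fold_stacks(samples: &[Vec<&str>]) -> Vec<String> {
//!     let mut counts: HashMap<String, usize> = HashMap::new();
//!     for frames in samples {
//!         // Join the frames from the outermost caller to the innermost callee.
//!         *counts.entry(frames.join(";")).or_insert(0) += 1;
//!     }
//!     let mut folded: Vec<String> = counts
//!         .into_iter()
//!         .map(|(stack, count)| format!("{stack} {count}"))
//!         .collect();
//!     // Sort so that the output order is deterministic.
//!     folded.sort();
//!     folded
//! }
//!
//! fn main() {
//!     // Three samples, two of which captured the same call stack.
//!     let samples = vec![
//!         vec!["main", "parse", "read_line"],
//!         vec!["main", "render"],
//!         vec!["main", "parse", "read_line"],
//!     ];
//!     for line in fold_stacks(&samples) {
//!         println!("{line}");
//!     }
//! }
//! ```
//!
//! Running this prints `main;parse;read_line 2` followed by `main;render 1`.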
//!
//! Since profiling tools produce stack traces in a myriad of different formats, and the flame
//! graph plotter expects input in a particular folded stack trace format, each profiler needs a
//! separate collapse implementation. While the original Perl implementation supports _lots_ of
//! profilers, Inferno currently supports only a few: the widely used [`perf`] tool (specifically
//! the output from `perf script`), [DTrace], [xctrace], [sample], and [VTune]. Support for xdebug
//! is [hopefully coming soon], and [`bpftrace`] should get [native support] before too long.
//!
//! Inferno supports profiles from applications written in any language, but we'll walk through an
//! example with a Rust program. To profile a Rust application, you would first set
//!
//! ```toml
//! [profile.release]
//! debug = true
//! ```
//!
//! in your `Cargo.toml` so that your profile will have useful function names and such included.
//! Then, compile with `--release`, and then run your favorite performance profiler:
//!
//! ### perf (Linux)
//!
//! ```console
//! # perf record --call-graph dwarf target/release/mybin
//! $ perf script | inferno-collapse-perf > stacks.folded
//! ```
//!
//! For more advanced uses, see Brendan Gregg's excellent [perf examples] page.
//!
//! Note: For larger binaries (like Firefox), `perf script` can be significantly slowed down by
//! the poor performance of the `addr2line` tool. Starting with perf version 6.12, you can use an
//! alternative `addr2line` tool (via `perf script --addr2line=/path/to/addr2line`); the
//! recommended one is the Rust implementation from the [Gimli project].
//!
//! ### DTrace (macOS)
//!
//! ```console
//! $ target/release/mybin &
//! $ pid=$!
//! # dtrace -x ustackframes=100 -n "profile-97 /pid == $pid/ { @[ustack()] = count(); } tick-60s { exit(0); }" -o out.user_stacks
//! $ cat out.user_stacks | inferno-collapse-dtrace > stacks.folded
//! ```
//!
//! For more advanced uses, see also upstream FlameGraph's [DTrace examples].
//! You may also be interested in something like [NodeJS's ustack helper].
//!
//! ### xctrace (macOS)
//!
//! ```console
//! $ target/release/mybin &
//! $ pid=$!
//! # xctrace record --template 'Time Profiler' --attach $pid --output out.trace
//! $ xctrace export --input out.trace --xpath '/trace-toc/*/data/table[@schema="time-profile"]' | inferno-collapse-xctrace | inferno-flamegraph > flamegraph.svg
//! ```
//!
//! If you want demangled output (xctrace won't demangle Rust symbols for you), you can use
//! [rustfilt]:
//!
//! ```console
//! $ xctrace export --input out.trace --xpath '/trace-toc/*/data/table[@schema="time-profile"]' | inferno-collapse-xctrace | rustfilt | inferno-flamegraph > flamegraph.svg
//! ```
//!
//! For more advanced uses, see the [xctrace] man page.
//!
//! ### sample (macOS)
//!
//! ```console
//! $ target/release/mybin &
//! $ pid=$!
//! $ sample $pid 30 -file sample.txt
//! $ inferno-collapse-sample sample.txt > stacks.folded
//! ```
//!
//! ### VTune (Windows and Linux)
//!
//! ```console
//! $ amplxe-cl -collect hotspots -r resultdir -- target/release/mybin
//! $ amplxe-cl -R top-down -call-stack-mode all -column=\"CPU Time:Self\",\"Module\" -report-out result.csv -filter \"Function Stack\" -format csv -csv-delimiter comma -r resultdir
//! $ inferno-collapse-vtune result.csv > stacks.folded
//! ```
//!
//! ## Producing a flame graph
//!
//! Once you have a folded stack file, you're ready to produce the flame graph SVG image. To do so,
//! simply provide the folded stack file to `inferno-flamegraph`, and it will print the resulting
//! SVG. Following on from the example above:
//!
//! ```console
//! $ cat stacks.folded | inferno-flamegraph > profile.svg
//! ```
//!
//! And then open `profile.svg` in your viewer of choice.
//!
//! ## Differential flame graphs
//!
//! You can debug CPU performance regressions with the help of differential flame graphs.
//! They let you easily visualize the differences between two profiles taken before and
//! after a code change. See Brendan Gregg's [differential flame graphs] blog post for a great
//! writeup. To create one, first pass the two folded stack files to `inferno-diff-folded`,
//! then send the output to `inferno-flamegraph`. Example:
//!
//! ```console
//! $ inferno-diff-folded folded1 folded2 | inferno-flamegraph > diff2.svg
//! ```
//!
//! Frames are colored red where the sample count grew between the two profiles and blue where
//! it shrank. The frame widths are based on the 2nd folded profile. This can be confusing if
//! stack frames disappear entirely, so it makes the most sense to ALSO create a differential
//! based on the 1st profile's widths, with the hues switched. To do this, reverse the order of
//! the input files and pass the `--negate` flag to `inferno-flamegraph` like this:
//!
//! ```console
//! $ inferno-diff-folded folded2 folded1 | inferno-flamegraph --negate > diff1.svg
//! ```
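//!
//! To make the idea of diffing folded stacks more concrete, the sketch below pairs up the
//! sample counts of two folded profiles, stack by stack. It only illustrates the concept and is
//! not how `inferno-diff-folded` is implemented; the `parse_folded` and `diff_folded` helpers
//! are hypothetical, and the three-column `stack before after` output layout is an assumption
//! made for this example.
//!
//! ```rust
//! use std::collections::BTreeMap;
//!
//! /// Parses folded lines (`stack count`) into a map from stack to count.
//! fn parse_folded(input: &str) -> BTreeMap<&str, u64> {
//!     let mut counts = BTreeMap::new();
//!     for line in input.lines() {
//!         // The count follows the last space; the stack itself may contain spaces.
//!         if let Some((stack, count)) = line.rsplit_once(' ') {
//!             if let Ok(count) = count.parse::<u64>() {
//!                 *counts.entry(stack).or_insert(0) += count;
//!             }
//!         }
//!     }
//!     counts
//! }
//!
//! /// Pairs up the (before, after) counts for every stack seen in either profile.
//! fn diff_folded<'a>(before: &'a str, after: &'a str) -> BTreeMap<&'a str, (u64, u64)> {
//!     let mut diff: BTreeMap<&str, (u64, u64)> = BTreeMap::new();
//!     for (stack, count) in parse_folded(before) {
//!         diff.entry(stack).or_default().0 += count;
//!     }
//!     for (stack, count) in parse_folded(after) {
//!         diff.entry(stack).or_default().1 += count;
//!     }
//!     diff
//! }
//!
//! fn main() {
//!     let before = "main;parse 10\nmain;render 5\n";
//!     let after = "main;parse 4\nmain;render 9\nmain;log 2\n";
//!     for (stack, (old, new)) in diff_folded(before, after) {
//!         println!("{stack} {old} {new}");
//!     }
//! }
//! ```
//!
//! In this example `main;render` grew from 5 to 9 samples, so it would be drawn in red, while
//! `main;parse` shrank from 10 to 4 and would be drawn in blue. `main;log` appears only in the
//! second profile, which is why its "before" count is 0.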
//!
//! # Feature flags
//!
//! All features below are enabled by default.
//!
//! - `cli`: Also builds the `inferno` command-line tools.
//! - `multithreaded`: Enables multithreaded stack collapsing.
//! - `nameattr`: Allows customizing and adding attributes to the SVG produced by [`flamegraph`].
//!   See the `--nameattr` option of the flamegraph CLI.
//!
//! # Development
//!
//! This crate was initially developed through [a series of live coding sessions]. If you want to
//! contribute to the code, that may be a good way to learn why it's all designed the way it is!
//!
//! [flame graphs]: http://www.brendangregg.com/flamegraphs.html
//! [flamegraph toolkit]: https://github.com/brendangregg/FlameGraph
//! [not-perf]: https://github.com/nokia/not-perf
//! [tracing profilers]: https://danluu.com/perf-tracing/
//! [call stack]: https://en.wikipedia.org/wiki/Call_stack
//! [hardware or software events]: https://perf.wiki.kernel.org/index.php/Tutorial#Events
//! [stack traces]: https://en.wikipedia.org/wiki/Stack_trace
//! [`perf`]: https://perf.wiki.kernel.org/index.php/Main_Page
//! [DTrace]: https://www.joyent.com/dtrace
//! [xctrace]: https://keith.github.io/xcode-man-pages/xctrace.1.html
//! [hopefully coming soon]: https://twitter.com/DanielLockyer/status/1094605231155900416
//! [native support]: https://github.com/jonhoo/inferno/issues/51#issuecomment-466732304
//! [`bpftrace`]: https://github.com/iovisor/bpftrace
//! [perf examples]: http://www.brendangregg.com/perf.html
//! [DTrace examples]: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#DTrace
//! [NodeJS's ustack helper]: http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/
//! [rustfilt]: https://github.com/luser/rustfilt
//! [a series of live coding sessions]: https://www.youtube.com/watch?v=jTpK-bNZiA4&list=PLqbS7AVVErFimAvMW-kIJUwxpPvcPBCsz
//! [differential flame graphs]: http://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html
//! [sample]: https://gist.github.com/loderunner/36724cc9ee8db66db305#profiling-with-sample
//! [VTune]: https://software.intel.com/en-us/vtune-amplifier-help-command-line-interface
//! [gimli project]: https://github.com/gimli-rs/addr2line

#![cfg_attr(doc, warn(rustdoc::all))]
#![cfg_attr(doc, allow(rustdoc::missing_doc_code_examples))]
#![deny(missing_docs)]
#![warn(unreachable_pub)]
#![allow(clippy::disallowed_names)]

/// Stack collapsing for various input formats.
///
/// See the [crate-level documentation] for details.
///
/// [crate-level documentation]: ../index.html
pub mod collapse;

/// Tool for creating the output required to generate differential flame graphs.
///
/// See the [crate-level documentation] for details.
///
/// [crate-level documentation]: ../index.html
pub mod differential;

/// Tools for producing flame graphs from folded stack traces.
///
/// See the [crate-level documentation] for details.
///
/// [crate-level documentation]: ../index.html
pub mod flamegraph;