Function black_box

1.66.0 (const: unstable)

pub fn black_box<T>(dummy: T) -> T

Prevents compiler optimizations on a value.

black_box should only be used on inputs and outputs of benchmarks. Newcomers to benchmarking may be tempted to also use black_box within the implementation, but doing so will overly pessimize the measured code without any benefit.

§Benchmark Inputs

When benchmarking, it’s good practice to ensure measurements are accurate by preventing the compiler from optimizing based on assumptions about benchmark inputs.

The compiler can optimize code for indices it knows about, such as by removing bounds checks or unrolling loops. If real-world use of your code would not know indices up front, consider preventing optimizations on them in benchmarks:

use divan::black_box;

const INDEX: usize = // ...
const SLICE: &[u8] = // ...

#[divan::bench]
fn bench() {
    work(&SLICE[black_box(INDEX)..]);
}

The compiler may also optimize for the data itself. Wrapping the data in black_box prevents this as well:

#[divan::bench]
fn bench() {
    work(black_box(&SLICE[black_box(INDEX)..]));
}
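
The two snippets above elide INDEX, SLICE, and work. As a minimal, self-contained sketch, the version below fills them in with hypothetical values and calls std::hint::black_box directly, on the assumption (suggested by the version annotation and the standard library section on this page) that divan::black_box is the same function. The #[divan::bench] attribute is left out so that the code compiles without the divan crate:

```rust
use std::hint::black_box;

// Hypothetical stand-ins for the constants elided in the snippets above.
const INDEX: usize = 2;
const SLICE: &[u8] = &[1, 2, 3, 4, 5];

// Hypothetical workload: sums the bytes of a slice.
fn work(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}

// Only the index is hidden from the optimizer, so it should not be able
// to assume where the subslice starts.
fn sum_with_opaque_index() -> u32 {
    work(&SLICE[black_box(INDEX)..])
}

// The index and the resulting subslice are both hidden, so `work` should
// not be specialized for the known contents of SLICE either.
fn sum_with_opaque_data() -> u32 {
    work(black_box(&SLICE[black_box(INDEX)..]))
}
```

Both functions return the same value. black_box does not change what the code computes, only what the optimizer is encouraged to assume about it.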

§Benchmark Outputs

When benchmarking, it’s best to ensure that all of the code is actually being run. If the compiler knows an output is unused, it may remove the code that generated the output. This optimization can make benchmarks appear much faster than they really are.

At the end of a benchmark, we can force the compiler to treat outputs as if they were actually used:

#[divan::bench]
fn bench() {
    black_box(value.to_string());
}

To make it clearer to readers that the output is discarded, this code could instead call black_box_drop.

Alternatively, the output can be returned from the benchmark:

#[divan::bench]
fn bench() -> String {
    value.to_string()
}

Returning the output will black_box it and also avoid measuring the time to drop the output, which in this case is the time to deallocate a String. Read more about this in the #[divan::bench] docs.
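
The difference between the two forms can be sketched with the standard library alone. The helpers below are hypothetical and are not how divan measures time. They only illustrate why handing the output back to the caller keeps its drop out of the measured region:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Hypothetical helper mirroring `black_box(value.to_string());`:
// the output is dropped before the clock is read, so the drop is timed.
fn time_with_drop<T>(f: impl FnOnce() -> T) -> Duration {
    let start = Instant::now();
    drop(black_box(f()));
    start.elapsed()
}

// Hypothetical helper mirroring a benchmark that returns its output:
// the clock is read first and the output is handed back to the caller,
// so whatever the caller does with it (including dropping it) is not timed.
fn time_without_drop<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let output = black_box(f());
    let elapsed = start.elapsed();
    (output, elapsed)
}
```

A single Instant measurement like this is far too noisy for real benchmarking. The sketch is only meant to show where the drop happens relative to the timed region.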

Standard Library Documentation

An identity function that hints to the compiler to be maximally pessimistic about what black_box could do.

Unlike std::convert::identity, a Rust compiler is encouraged to assume that black_box can use dummy in any possible valid way that Rust code is allowed to without introducing undefined behavior in the calling code. This property makes black_box useful for writing code in which certain optimizations are not desired, such as benchmarks.

Note, however, that black_box is only (and can only be) provided on a “best-effort” basis. The extent to which it can block optimizations may vary depending upon the platform and code-gen backend used. Programs cannot rely on black_box for correctness, beyond it behaving as the identity function. As such, it must not be relied upon to control critical program behavior. This also means that this function does not offer any guarantees for cryptographic or security purposes.
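
Because the only guaranteed behavior is that of the identity function, wrapping a value in black_box never changes what a program computes. A small sketch, in which square_opaque is a hypothetical name chosen for illustration:

```rust
use std::hint::black_box;

// Hypothetical example: the result is always x * x. black_box only
// discourages the optimizer from assuming anything about `x`; whether
// it succeeds is best-effort and must not be relied on for correctness.
fn square_opaque(x: u64) -> u64 {
    let x = black_box(x);
    x * x
}
```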

§When is this useful?

While not suitable in those mission-critical cases, black_box’s functionality can generally be relied upon for benchmarking, and should be used there. It will try to ensure that the compiler doesn’t optimize away part of the intended test code based on context. For example:

fn contains(haystack: &[&str], needle: &str) -> bool {
    haystack.iter().any(|x| x == &needle)
}

pub fn benchmark() {
    let haystack = vec!["abc", "def", "ghi", "jkl", "mno"];
    let needle = "ghi";
    for _ in 0..10 {
        contains(&haystack, needle);
    }
}

The compiler could theoretically make optimizations like the following:

The needle and haystack do not change: move the call to contains outside the loop and delete the loop
Inline contains
needle and haystack have values known at compile time, so contains is always true: remove the call and replace it with true
Nothing is done with the result of contains: delete this function call entirely
benchmark now has no purpose: delete this function

It is not likely that all of the above happens, but the compiler is definitely able to make some optimizations that could result in a very inaccurate benchmark. This is where black_box comes in:

use std::hint::black_box;

// Same `contains` function
fn contains(haystack: &[&str], needle: &str) -> bool {
    haystack.iter().any(|x| x == &needle)
}

pub fn benchmark() {
    let haystack = vec!["abc", "def", "ghi", "jkl", "mno"];
    let needle = "ghi";
    for _ in 0..10 {
        // Adjust our benchmark loop contents
        black_box(contains(black_box(&haystack), black_box(needle)));
    }
}

This essentially tells the compiler to block optimizations across any calls to black_box. So, it now:

Treats both arguments to contains as unpredictable: the body of contains can no longer be optimized based on argument values
Treats the call to contains and its result as volatile: the body of benchmark cannot optimize this away

This makes our benchmark much more representative of how the function would actually be used, where arguments are usually not known at compile time and the result is used in some way.
