Item 11: Prefer generics to trait objects

Item 2 described the use of traits to encapsulate behaviour in the type system, as a collection of related methods, and observed that there are two ways to make use of traits: as trait bounds for generics, or in trait objects. This Item explores the trade-offs between these two possibilities.

Rust's generics are roughly equivalent to C++'s templates: they allow the programmer to write code that works for some arbitrary type T, and specific uses of the generic code are generated at compile time – a process known as monomorphization in Rust, and template instantiation in C++. Unlike C++, Rust explicitly encodes the expectations for the type T in the type system, in the form of trait bounds for the generic.

In comparison, trait objects are fat pointers (Item 8) that combine a pointer to the underlying concrete item with a pointer to a vtable that in turn holds function pointers for all of the trait implementation's methods.

    let square = Square::new(1, 2, 2);
    let draw: &dyn Drawable = &square;
Figure: Trait object
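The two-pointer layout can be observed by comparing pointer sizes. The following is a minimal sketch rather than the book's own code: the `Square` fields, the tuple returned by `bounds()` and the `pointer_sizes` helper are hypothetical stand-ins, and the "twice as large" result reflects how current compilers lay out trait objects rather than a documented guarantee.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for the chapter's types.
trait Drawable {
    // Returns (left, top, right, bottom); a simplification of `Bounds`.
    fn bounds(&self) -> (i64, i64, i64, i64);
}

struct Square {
    x: i64,
    y: i64,
    side: i64,
}

impl Drawable for Square {
    fn bounds(&self) -> (i64, i64, i64, i64) {
        (self.x, self.y, self.x + self.side, self.y + self.side)
    }
}

/// Size in bytes of a plain reference and of a trait-object reference.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&Square>(), size_of::<&dyn Drawable>())
}

fn main() {
    let square = Square { x: 1, y: 2, side: 2 };
    // Converting `&Square` to `&dyn Drawable` attaches the vtable pointer.
    let draw: &dyn Drawable = &square;
    let (thin, fat) = pointer_sizes();
    println!("&Square is {thin} bytes, &dyn Drawable is {fat} bytes");
    println!("bounds via trait object: {:?}", draw.bounds());
}
```

The extra pointer-sized word in `&dyn Drawable` is the vtable pointer; the concrete type of the item is no longer part of the reference's type.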

These basic facts already allow some immediate comparisons between the two possibilities:

  • Generics are likely to lead to bigger code sizes, because the compiler generates a fresh copy of the code generic::<T>(t: &T) for every type T that gets used; a traitobj(t: &dyn T) method only needs a single instance.
  • Invoking a trait method from a generic will generally be slightly faster than from code that uses a trait object, because the latter needs to perform two dereferences to find the location of the code (trait object to vtable, vtable to implementation location).
  • Compile times for generics may be longer, as the compiler is building more code and the linker has more work to do to fold duplicates.
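The two function shapes being compared might look like the sketch below. The `name()` method and the unit structs are assumptions made for illustration; only the `generic` and `traitobj` names come from the text above.

```rust
// Hypothetical trait with a single method, used only for this comparison.
trait Drawable {
    fn name(&self) -> String;
}

struct Square;
struct Circle;

impl Drawable for Square {
    fn name(&self) -> String {
        "square".to_string()
    }
}

impl Drawable for Circle {
    fn name(&self) -> String {
        "circle".to_string()
    }
}

// Generic version: the compiler may emit a separate copy of this function
// for each concrete `T` it is called with (static dispatch).
fn generic<T: Drawable>(t: &T) -> String {
    format!("generic saw a {}", t.name())
}

// Trait-object version: a single function whose call to `name()` goes
// through the vtable (dynamic dispatch).
fn traitobj(t: &dyn Drawable) -> String {
    format!("traitobj saw a {}", t.name())
}

fn main() {
    println!("{}", generic(&Square));
    println!("{}", generic(&Circle));
    println!("{}", traitobj(&Square));
    println!("{}", traitobj(&Circle));
}
```

Both versions are called in the same way; the difference lies in what the compiler generates, not in how the caller writes the call.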

In most situations, these aren't significant differences; you should only use optimization-related concerns as a primary decision driver if you've measured the impact and found that it has a genuine effect (a speed bottleneck or a problematic increase in code size).

A more significant difference is that generic trait bounds can be used to conditionally make methods available, depending on whether the type parameter implements multiple traits.

    use std::fmt::Debug;

    trait Drawable {
        fn bounds(&self) -> Bounds;
    }

    struct Container<T>(T);

    impl<T: Drawable> Container<T> {
        // The `area` method is available for all `Drawable` containers.
        fn area(&self) -> i64 {
            let bounds = self.0.bounds();
            (bounds.bottom_right.x - bounds.top_left.x)
                * (bounds.bottom_right.y - bounds.top_left.y)
        }
    }

    impl<T: Drawable + Debug> Container<T> {
        // The `show` method is only available if `Debug` is also implemented.
        fn show(&self) {
            println!("{:?} has bounds {:?}", self.0, self.0.bounds());
        }
    }

    let square = Container(Square::new(1, 2, 2)); // Square is not Debug
    let circle = Container(Circle::new(3, 4, 1)); // Circle is Debug

    println!("area(square) = {}", square.area());
    println!("area(circle) = {}", circle.area());
    circle.show();
    // The following line would not compile.
    // square.show();

A trait object only encodes the implementation vtable for a single trait, so doing something equivalent is much more awkward. For example, a combination DebugDrawable trait could be defined for the show() case, together with some conversion operations (Item 6) to make life easier. However, if there are multiple different combinations of distinct traits, it's clear that the combinatorics of this approach rapidly become unwieldy.
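One way the combination trait might be written is sketched here. The simplified `Bounds` struct, the `Circle` fields and the free function `show` are hypothetical, and the blanket implementation is an assumption standing in for the "conversion operations" mentioned above.

```rust
use std::fmt::Debug;

// Simplified stand-in for the chapter's `Bounds` type.
#[derive(Debug)]
struct Bounds {
    width: i64,
    height: i64,
}

trait Drawable {
    fn bounds(&self) -> Bounds;
}

// Combination trait: anything that is both `Drawable` and `Debug`.
trait DebugDrawable: Drawable + Debug {}

// Blanket implementation, so that every suitable type gets the
// combination trait without further effort.
impl<T: Drawable + Debug> DebugDrawable for T {}

#[derive(Debug)]
struct Circle {
    radius: i64,
}

impl Drawable for Circle {
    fn bounds(&self) -> Bounds {
        Bounds {
            width: 2 * self.radius,
            height: 2 * self.radius,
        }
    }
}

// Trait-object equivalent of `Container::show`: a single vtable gives
// access to both the `Debug` and the `Drawable` methods.
fn show(item: &dyn DebugDrawable) -> String {
    format!("{:?} has bounds {:?}", item, item.bounds())
}

fn main() {
    let circle = Circle { radius: 1 };
    println!("{}", show(&circle));
}
```

Each additional pairing (say, `Drawable` with `Clone`, or `Drawable` with `Debug` and `PartialEq`) would need its own combination trait, which is where the approach becomes unwieldy.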

Item 2 described the use of trait bounds to restrict what type parameters are acceptable for a generic function. Trait bounds can also be applied to trait definitions themselves:

    trait Shape: Drawable {
        fn render_in(&self, bounds: Bounds);
        fn render(&self) {
            self.render_in(overlap(SCREEN_BOUNDS, self.bounds()));
        }
    }

In this example, the render() method's default implementation (Item 12) makes use of the trait bound, relying on the availability of the bounds() method from Drawable.

Programmers coming from object-oriented languages often confuse trait bounds with inheritance, under the mistaken impression that a trait bound like this means that a Shape is-a Drawable. That's not the case: the relationship between the two traits is better expressed as Shape also-implements Drawable.

Under the covers, trait objects for traits that have trait bounds

    let square = Square::new(1, 2, 2);
    let draw: &dyn Drawable = &square;
    let shape: &dyn Shape = &square;

have a single combined vtable that includes the methods of the top-level trait, plus the methods of all of the trait bounds:

Figure: Trait objects for trait bounds

This means that, before trait object upcasting was stabilized in Rust 1.86, there was no way to "upcast" from Shape to Drawable, because the (pure) Drawable vtable couldn't be recovered at runtime (see Item 18 for more on this). Without that feature, there is no way to convert between related trait objects, which in turn means there is no Liskov substitution.

Repeating the same point in different words, a method that accepts a Shape trait object

  • can make use of methods from Drawable (because Shape also-implements Drawable, and because the relevant function pointers are present in the Shape vtable)
  • cannot pass the trait object on to another method that expects a Drawable trait object (because Shape is-not Drawable, and because the Drawable vtable isn't available).

In contrast, a generic method that accepts items that implement Shape

  • can use methods from Drawable
  • can pass the item on to another generic method that has a Drawable trait bound, because the trait bound is monomorphized at compile time to use the Drawable methods of the concrete type.
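The generic case can be sketched as follows. The simplified `Bounds` struct, the `describe()` default method and the `area` and `describe_with_area` functions are hypothetical names introduced for this illustration.

```rust
// Simplified stand-in for the chapter's `Bounds` type.
struct Bounds {
    width: i64,
    height: i64,
}

trait Drawable {
    fn bounds(&self) -> Bounds;
}

trait Shape: Drawable {
    // The default implementation relies on `bounds()` from `Drawable`.
    fn describe(&self) -> String {
        let b = self.bounds();
        format!("shape {}x{}", b.width, b.height)
    }
}

struct Square {
    side: i64,
}

impl Drawable for Square {
    fn bounds(&self) -> Bounds {
        Bounds {
            width: self.side,
            height: self.side,
        }
    }
}

impl Shape for Square {}

// Generic function that only needs `Drawable`.
fn area<T: Drawable>(item: &T) -> i64 {
    let b = item.bounds();
    b.width * b.height
}

// Generic function with a `Shape` bound: because `T: Shape` implies
// `T: Drawable`, the item can be handed on to `area`.
fn describe_with_area<T: Shape>(item: &T) -> String {
    format!("{} with area {}", item.describe(), area(item))
}

fn main() {
    let square = Square { side: 2 };
    println!("{}", describe_with_area(&square));
}
```

The concrete type is still known inside `describe_with_area`, so the compiler can pick the right `Drawable` implementation for the call to `area` at compile time.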

Another restriction on trait objects is the requirement for object safety: only traits that comply with the following two rules can be used as trait objects.

  • The trait's methods must not be generic.
  • The trait's methods must not return a type that includes Self.

The first restriction is easy to understand; a generic method f is really an infinite set of methods, potentially encompassing f::<i16>, f::<i32>, f::<i64>, f::<u8>, … The trait object's vtable, on the other hand, is very much a finite collection of function pointers, and so it's not possible to fit an infinite quart into a finite pint pot.

The second restriction is a little bit more subtle, but tends to be the restriction that's hit more often in practice – traits that impose Copy or Clone trait bounds (Item 5) immediately fall under this rule. To see why it's disallowed, consider code that has a trait object in its hands; what happens if that code calls (say) let y = x.clone()? The calling code needs to reserve enough space for y on the stack, but it has no idea of the size of y because Self is an arbitrary type. As a result, return types that mention[1] Self lead to a trait that is not object safe.
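The sketch below contrasts a method that returns Self with one possible object-safe alternative, which returns a boxed trait object of known size. This workaround is an assumption of the sketch, not something the text prescribes, and the `Stamp`, `Ticket`, `boxed_duplicate` and `duplicate_label` names are all hypothetical.

```rust
// A trait like the following cannot be used as a trait object, because
// `duplicate` returns `Self`:
//
//     trait NotObjectSafe {
//         fn duplicate(&self) -> Self;
//     }
//     fn use_it(x: &dyn NotObjectSafe) {} // would not compile

// Object-safe alternative: return a boxed trait object, whose size
// (a fat pointer) is known to the caller.
trait Stamp {
    fn label(&self) -> String;
    fn boxed_duplicate(&self) -> Box<dyn Stamp>;
}

#[derive(Clone)]
struct Ticket {
    id: u32,
}

impl Stamp for Ticket {
    fn label(&self) -> String {
        format!("ticket #{}", self.id)
    }

    fn boxed_duplicate(&self) -> Box<dyn Stamp> {
        // The concrete type is known here, so `clone()` is fine.
        Box::new(self.clone())
    }
}

// Code holding only a trait object can still obtain a copy, without
// needing to know the size of the underlying concrete type.
fn duplicate_label(item: &dyn Stamp) -> String {
    let copy: Box<dyn Stamp> = item.boxed_duplicate();
    copy.label()
}

fn main() {
    let ticket = Ticket { id: 7 };
    println!("{}", duplicate_label(&ticket));
}
```

The cost of this design is a heap allocation per copy, which is one more reason to reach for generics when the concrete type is available.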

The balance of factors so far leads to the advice to prefer generics to trait objects, but there are situations where trait objects are the right tool for the job.

Trait objects fundamentally involve type erasure: information about the concrete type is lost in the conversion to a trait object (see also Item 18). One place where this is useful is collections of heterogeneous objects – code that just relies on the methods of the trait can invoke and combine the methods of differently typed items. The traditional OO example of rendering a list of shapes would be one example of this: the same render() method could be used for squares, circles, ellipses and stars in the same loop.

    let shapes: Vec<&dyn Shape> = vec![&square, &circle];
    for shape in shapes {
        shape.render()
    }
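A self-contained variant of the same idea, using owned `Box<dyn Shape>` items rather than references, might look like the following sketch. The cut-down `Shape` trait (whose `render()` returns a `String`), the struct fields and the `render_all` helper are assumptions made so that the example can run on its own.

```rust
// Cut-down, hypothetical version of the chapter's `Shape` trait.
trait Shape {
    fn render(&self) -> String;
}

struct Square {
    side: i64,
}

struct Circle {
    radius: i64,
}

impl Shape for Square {
    fn render(&self) -> String {
        format!("square of side {}", self.side)
    }
}

impl Shape for Circle {
    fn render(&self) -> String {
        format!("circle of radius {}", self.radius)
    }
}

// The slice holds items of different concrete types; only the `Shape`
// methods are available once the concrete type has been erased.
fn render_all(shapes: &[Box<dyn Shape>]) -> Vec<String> {
    shapes.iter().map(|shape| shape.render()).collect()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Square { side: 2 }),
        Box::new(Circle { radius: 1 }),
    ];
    for line in render_all(&shapes) {
        println!("{line}");
    }
}
```

A `Vec<T>` with a generic `T: Shape` could not hold both a `Square` and a `Circle`, because every element of a `Vec` must have the same type.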

A much more obscure example is when the available types are not known at compile-time; if new code is dynamically loaded at run-time (e.g. via dlopen(3)), then items that implement traits in the new code can only be invoked via a trait object.


1: At present, the restriction on methods that return Self includes types like Box<Self> that could be safely stored on the stack; this restriction might be relaxed in future.