Item 19: Avoid reflection

Programmers coming to Rust from other languages are often used to reaching for reflection as a tool in their toolbox. They can waste a lot of time trying to implement reflection-based designs in Rust, only to discover that what they're attempting can only be done poorly, if at all. This Item aims to save the time wasted exploring dead ends, by describing what Rust does and doesn't have in the way of reflection, and what can be used instead.

Reflection is the ability of a program to examine itself at run-time. Given an item at run-time, it covers:

  • What information can be determined about the item's type?
  • What can be done with that information?

Programming languages with full reflection support have extensive answers to these questions – as well as determining an item's type at run-time, its contents can be explored, its fields modified and its methods invoked. Languages with this level of reflection support tend to be dynamically typed (e.g. Python, Ruby), but some notable statically typed languages support it too, particularly Java and Go.

Rust does not support this type of reflection, which makes the advice to avoid reflection easy to follow at this level – it's just not possible. For programmers coming from languages with support for full reflection, this absence may seem like a significant gap at first, but Rust's other features provide alternative ways of solving many of the same problems.

C++ has a more limited form of reflection, known as run-time type identification (RTTI). The typeid operator returns a unique identifier for every type, and for objects of polymorphic type (roughly: classes with virtual functions) RTTI can also look past a base class reference:

  • typeid can recover the concrete class of an object referred to via a base class reference
  • dynamic_cast<T> allows base class references to be converted to derived classes, when it is safe and correct to do so.

Rust does not support this RTTI style of reflection either, continuing the theme that the advice of this Item is easy to follow.

Rust does support some features that provide similar functionality (in the std::any module), but they're limited (in ways explored below) and so best avoided unless no other alternatives are possible.

The first reflection-like feature looks magic at first – a way of determining the name of an item's type:

    let x = 42u32;
    let y = Square::new(3, 4, 2);
    println!("x: {} = {}", tname(&x), x);
    println!("y: {} = {:?}", tname(&y), y);

x: u32 = 42
y: reflection::Square = Square { top_left: Point { x: 3, y: 4 }, size: 2 }

The implementation of tname() reveals what's up the compiler's sleeve; the function is generic (as per Item 12) and so each invocation of it is actually a different function (tname::<u32> or tname::<Square>):

fn tname<T: ?Sized>(_v: &T) -> &'static str {
    std::any::type_name::<T>()
}

The std::any::type_name<T> library function only has access to compile-time information; nothing clever is happening at run-time.
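
One way to see that only compile-time information is involved is to call the same tname() helper on a trait object. The following is a minimal sketch; the exact strings printed are whatever the current compiler happens to emit, since type_name makes no promise about their form.

```rust
use std::any::type_name;
use std::fmt::Debug;

// Same helper as in the text: reports the type the compiler chose for `T`.
fn tname<T: ?Sized>(_v: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let x = 42u32;

    // Here `T` is `u32`, so the name describes `u32`.
    println!("direct:       {}", tname(&x));

    // After coercion to a trait object, `T` is `dyn Debug`. The concrete
    // type behind the pointer is invisible to `type_name`, so the name
    // describes the trait object type rather than `u32`.
    let d: &dyn Debug = &x;
    println!("trait object: {}", tname(d));
}
```

The second line of output names the trait object type, not u32, even though the underlying item is the same u32 value.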

The string returned by type_name is only suitable for diagnostics – it's explicitly a "best-effort" helper whose contents may change, and may not be unique – so don't attempt to parse type_name results. If you need a globally unique type identifier, use TypeId instead:

use std::any::TypeId;

fn type_id<T: 'static + ?Sized>(_v: &T) -> TypeId {
    TypeId::of::<T>()
}

    println!("x has {:?}", type_id(&x));
    println!("y has {:?}", type_id(&y));

x has TypeId { t: 14816064564273904734 }
y has TypeId { t: 7700407161019666586 }

The output is less helpful for humans, but the guarantee of uniqueness means that the result can be used in code. However, it's usually best not to do so directly, but to use the std::any::Any trait [1] instead.

This trait has a single method type_id(), which returns the TypeId value for the type that implements the trait. You can't implement this trait yourself though, because Any already comes with a blanket implementation for every type T that satisfies the 'static bound:

impl<T: 'static + ?Sized> Any for T {
    fn type_id(&self) -> TypeId {
        TypeId::of::<T>()
    }
}
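
As a small sketch of what the blanket implementation means in practice, the code below calls type_id() through a trait object. The concrete_id helper is a hypothetical name introduced only for this illustration. The second half shows a common trap: Box<dyn Any> is itself a 'static type, so it also picks up the blanket implementation.

```rust
use std::any::{Any, TypeId};

// Hypothetical helper: report the TypeId of whatever concrete item sits
// behind the trait object. The call is dispatched through the vtable.
fn concrete_id(item: &dyn Any) -> TypeId {
    item.type_id()
}

fn main() {
    let value = 42u64;
    assert_eq!(concrete_id(&value), TypeId::of::<u64>());
    assert_ne!(concrete_id(&value), TypeId::of::<u32>());

    let boxed: Box<dyn Any> = Box::new(42u64);
    // Method resolution finds the blanket impl for the box itself first...
    assert_eq!(boxed.type_id(), TypeId::of::<Box<dyn Any>>());
    // ...so dereference explicitly to ask about the contents.
    assert_eq!((*boxed).type_id(), TypeId::of::<u64>());

    println!("all TypeId checks passed");
}
```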

Recall from Item 9 that a trait object is a fat pointer that holds a pointer to the underlying item, together with a pointer to the trait implementation's vtable. For Any, the vtable has a single entry, for a method that returns the item's type.

    let x_any: Box<dyn Any> = Box::new(42u64);
    let y_any: Box<dyn Any> = Box::new(Square::new(3, 4, 3));
Figure: Any trait objects, each with pointers to concrete items and vtables

Modulo a couple of indirections, a dyn Any trait object is effectively a combination of a raw pointer and a type identifier. This means that Any can offer some additional generic methods:

  • is<T> to indicate whether the trait object's type is equal to some specific other type T.
  • downcast_ref<T> which returns a reference to the concrete type T, provided that the type matches.
  • downcast_mut<T> for the mutable variant of downcast_ref.
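
The three methods listed above can be sketched as follows. The describe and double_if_u64 helpers are hypothetical names invented for this illustration; only is, downcast_ref and downcast_mut come from the standard library.

```rust
use std::any::Any;

// Hypothetical helper: probe an item of unknown type against a fixed list
// of candidate types. Each downcast_ref returns None on a type mismatch.
fn describe(item: &dyn Any) -> String {
    if let Some(n) = item.downcast_ref::<u64>() {
        format!("u64 {}", n)
    } else if let Some(s) = item.downcast_ref::<String>() {
        format!("string {:?}", s)
    } else {
        "something else".to_string()
    }
}

// Hypothetical helper: modify the item in place, but only if it is a u64.
fn double_if_u64(item: &mut dyn Any) -> bool {
    match item.downcast_mut::<u64>() {
        Some(n) => {
            *n *= 2;
            true
        }
        None => false,
    }
}

fn main() {
    let mut x = 21u64;
    let name = String::from("square");

    let any_x: &dyn Any = &x;
    println!("is u64? {}", any_x.is::<u64>());
    println!("is u32? {}", any_x.is::<u32>());

    println!("{}", describe(&x));
    println!("{}", describe(&name));
    println!("{}", describe(&1.5f32));

    double_if_u64(&mut x);
    println!("after doubling: {}", x);
}
```

Note that every candidate type has to be spelled out by the programmer; nothing here discovers types that the code did not already know about at compile-time.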

Observe that the Any trait is only approximating reflection functionality: the programmer chooses (at compile-time) to explicitly build something (&dyn Any) that keeps track of an item's compile-time type as well as its location. The ability to (say) downcast back to the original type is only possible if the overhead of building an Any trait object has happened.

There are comparatively few scenarios where Rust has different compile-time and run-time types associated with an item. Chief among these is trait objects: an item of a concrete type Square can be coerced into a trait object dyn Shape for a trait that the type implements. This coercion builds a fat pointer (object+vtable) from a simple pointer (object/item).
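
The difference between the simple pointer and the fat pointer can be observed by comparing sizes. This is a sketch under assumptions: the Shape trait here is a simplified stand-in with a hypothetical area method, and the two-word layout of a trait object reference is how current compilers behave rather than something this Item relies on.

```rust
use std::mem::size_of;

// Simplified stand-in for a Shape trait; `area` is a hypothetical method.
trait Shape {
    fn area(&self) -> u64;
}

struct Square {
    size: u64,
}

impl Shape for Square {
    fn area(&self) -> u64 {
        self.size * self.size
    }
}

fn main() {
    let square = Square { size: 2 };

    // A simple pointer to the item.
    let thin: &Square = &square;
    // Coercion adds a vtable pointer alongside the item pointer.
    let fat: &dyn Shape = thin;

    println!("area via trait object: {}", fat.area());
    println!("&Square:    {} bytes", size_of::<&Square>());
    println!("&dyn Shape: {} bytes", size_of::<&dyn Shape>());
}
```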

Recall also from Item 12 that Rust's trait objects are not really object-oriented. It's not the case that a Square is-a Shape, it's just that a Square implements Shape's interface. The same is true for trait bounds: a trait bound Shape: Drawable does not mean is-a, it just means also-implements; the vtable for Shape includes the entries for the methods of Drawable.

For some simple trait bounds:

trait Drawable: Debug {
    fn bounds(&self) -> Bounds;
}

trait Shape: Drawable {
    fn render_in(&self, bounds: Bounds);
    fn render(&self) {
        self.render_in(overlap(SCREEN_BOUNDS, self.bounds()));
    }
}

the equivalent trait objects:

    let square = Square::new(1, 2, 2);
    let draw: &dyn Drawable = &square;
    let shape: &dyn Shape = &square;

have a layout whose arrows make the problem clear: given a dyn Shape object, there's no way to build a dyn Drawable trait object, because there's no way to get back to the vtable for impl Drawable for Square – even though the relevant part of its contents (the address of the Square::bounds method) is theoretically recoverable. (Toolchains from Rust 1.86 onwards add trait object upcasting, which lifts this particular restriction for supertraits; earlier versions reject the conversion.)

Figure: Trait objects for trait bounds, with distinct vtables for Drawable and Shape

Comparing with the previous diagram, it's also clear that an explicitly constructed &dyn Any trait object doesn't help. Any allows recovery of the original concrete type of the underlying item, but there is no run-time way to see what traits it implements, nor to get access to the relevant vtable that might allow creation of a trait object.
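
A sketch of this limitation: the hypothetical as_debug helper below tries to turn a &dyn Any into a &dyn Debug. It can only do so by downcasting to concrete types that are named in the code, so a type that does implement Debug but is missing from the list (the Circle type here, also invented for the example) is not found.

```rust
use std::any::Any;
use std::fmt::Debug;

#[derive(Debug)]
struct Square {
    size: u32,
}

#[derive(Debug)]
struct Circle {
    radius: u32,
}

// Hypothetical helper: the only route from `dyn Any` to another trait
// object goes via a concrete type that is known at compile-time.
fn as_debug(item: &dyn Any) -> Option<&dyn Debug> {
    if let Some(sq) = item.downcast_ref::<Square>() {
        Some(sq as &dyn Debug)
    } else if let Some(n) = item.downcast_ref::<u32>() {
        Some(n as &dyn Debug)
    } else {
        // Any cannot answer "does this type implement Debug?".
        None
    }
}

fn main() {
    let square = Square { size: 2 };
    let circle = Circle { radius: 1 };

    println!("{:?}", as_debug(&square));
    // Circle implements Debug, but it is not in the list above.
    println!("{}", as_debug(&circle).is_none());
}
```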

So what's available instead?

The primary tool to reach for is trait definitions, and this is in line with advice for other languages – Effective Java Item 65 recommends "Prefer interfaces to reflection". If code needs to rely on certain behaviour being available for an item, encode that behaviour as a trait (Item 2). Even if the desired behaviour can't be expressed as a set of method signatures, use marker traits to indicate compliance with the desired behaviour – it's safer and more efficient than (say) introspecting the name of a class to check for a particular prefix.
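
As a rough illustration of this advice, the sketch below pairs an ordinary trait with a marker trait. All of the names (Describe, Persistable, Scratch, save) are hypothetical and exist only for this example.

```rust
// Hypothetical trait that encodes required behaviour as a method signature.
trait Describe {
    fn describe(&self) -> String;
}

// Hypothetical marker trait: no methods, just a promise by the implementor
// that the compiler can check wherever the bound appears.
trait Persistable {}

struct Square {
    size: u32,
}

impl Describe for Square {
    fn describe(&self) -> String {
        format!("square of size {}", self.size)
    }
}

impl Persistable for Square {}

struct Scratch;

impl Describe for Scratch {
    fn describe(&self) -> String {
        "scratch data".to_string()
    }
}

// Only accepts types that have opted in to both traits.
fn save<T: Describe + Persistable>(item: &T) -> String {
    format!("saved {}", item.describe())
}

fn main() {
    println!("{}", save(&Square { size: 2 }));
    println!("{}", Scratch.describe());
    // save(&Scratch); // rejected at compile-time: Scratch is not Persistable
}
```

The check that would be a run-time name lookup in a reflection-based design becomes a trait bound that fails at compile-time.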

Code that expects trait objects can also be used with objects whose backing code was not available at program link time, because it has been dynamically loaded at run-time (via dlopen(3) or equivalent) – which means that monomorphization of a generic (Item 12) isn't possible.

Relatedly, reflection is sometimes also used in other languages to allow multiple incompatible versions of the same dependency library to be loaded into the program at once, bypassing linkage constraints that There Can Be Only One. This is not needed in Rust, where Cargo already copes with multiple versions of the same library (Item 25).

Finally, macros – especially derive macros – can be used to auto-generate ancillary code that understands an item's type at compile-time, as a more efficient and more type-safe equivalent to code that parses an item's contents at run-time.
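
The standard derive macros give a small taste of this. In the sketch below, the field names follow the output shown earlier in this Item, but the field types are an assumption. The derived Debug implementation walks the fields of each struct, yet that code is generated at compile-time rather than discovered at run-time.

```rust
// Field types are assumed for this illustration.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i64,
    y: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct Square {
    top_left: Point,
    size: i64,
}

fn main() {
    let sq = Square {
        top_left: Point { x: 3, y: 4 },
        size: 2,
    };

    // The generated Debug code knows every field name and type.
    println!("{:?}", sq);

    // The generated Clone and PartialEq code compares field by field.
    assert_eq!(sq.clone(), sq);
}
```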


1: The C++ equivalent of Any is std::any, and the advice is to avoid it too.