Item 7: Embrace the newtype pattern

Item 1 described tuple structs, where the fields of a struct have no names and are instead referred to by number (self.0). This Item focuses on tuple structs that have a single entry, which is a pattern that's sufficiently pervasive in Rust that it deserves its own Item and has its own name: the newtype pattern.

The simplest use of the newtype pattern is to indicate additional semantics for a type, over and above its normal behaviour. To illustrate this, imagine a project that's going to send a satellite to Mars[1]. It's a big project, so different groups have built different parts of the project. One group has handled the code for the rocket engines:

    /// Fire the thrusters. Returns generated force in Newton seconds.
    pub fn thruster_impulse(direction: Direction) -> f64 {
        // ...
        return 42.0;
    }

while a different group handles the inertial guidance system:

    /// Update trajectory model for impulse, provided in pound force seconds.
    pub fn update_trajectory(force: f64) {
        // ...
    }

Eventually these different parts need to be joined together:

        let thruster_force: f64 = thruster_impulse(direction);
        let new_direction = update_trajectory(thruster_force);

Ruh-roh.

Rust includes a type alias feature, which allows the different groups to make their intentions clearer:

    /// Units for force.
    pub type NewtonSeconds = f64;

    /// Fire the thrusters. Returns generated force.
    pub fn thruster_impulse(direction: Direction) -> NewtonSeconds {
        // ...
        return 42.0;
    }

    /// Units for force.
    pub type PoundForceSeconds = f64;

    /// Update trajectory model for impulse.
    pub fn update_trajectory(force: PoundForceSeconds) {
        // ...
    }

However, the type aliases are effectively just documentation; they're a stronger hint than the doc comments of the previous version, but nothing stops a NewtonSeconds value being used where a PoundForceSeconds value is expected:

        let thruster_force: NewtonSeconds = thruster_impulse(direction);
        let new_direction = update_trajectory(thruster_force);

Ruh-roh once more.

This is the point where the newtype pattern helps.

/// Units for force.
pub struct NewtonSeconds(pub f64);

/// Fire the thrusters. Returns generated force.
pub fn thruster_impulse(direction: Direction) -> NewtonSeconds {
    // ...
    return NewtonSeconds(42.0);
}

/// Units for force.
pub struct PoundForceSeconds(pub f64);

/// Update trajectory model for impulse.
pub fn update_trajectory(force: PoundForceSeconds) {
    // ...
}

As the name implies, a newtype is a new type, and as such the compiler objects to type conversions (cf. Item 6):

    let thruster_force: NewtonSeconds = thruster_impulse(direction);
    let new_direction = update_trajectory(thruster_force);

error[E0308]: mismatched types
  --> newtype/src/main.rs:76:43
   |
76 |     let new_direction = update_trajectory(thruster_force);
   |                                           ^^^^^^^^^^^^^^ expected struct `PoundForceSeconds`, found struct `NewtonSeconds`
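Once the two unit types are distinct, moving a value from one to the other has to be spelled out. The following is a minimal sketch of such a conversion, not part of the project code above: the `From` implementations, the `NEWTONS_PER_POUND_FORCE` constant and the derives are illustrative assumptions, and the two types are redeclared so that the example is self-contained.

```rust
/// Impulse in newton-seconds (redeclared here for a self-contained sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NewtonSeconds(pub f64);

/// Impulse in pound-force seconds (redeclared here for a self-contained sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PoundForceSeconds(pub f64);

/// Hypothetical helper constant: one pound-force expressed in newtons.
const NEWTONS_PER_POUND_FORCE: f64 = 4.448_221_615_260_5;

impl From<PoundForceSeconds> for NewtonSeconds {
    fn from(val: PoundForceSeconds) -> Self {
        // Scale up: one pound-force second is roughly 4.448 newton-seconds.
        NewtonSeconds(val.0 * NEWTONS_PER_POUND_FORCE)
    }
}

impl From<NewtonSeconds> for PoundForceSeconds {
    fn from(val: NewtonSeconds) -> Self {
        // Inverse of the conversion above.
        PoundForceSeconds(val.0 / NEWTONS_PER_POUND_FORCE)
    }
}
```

With conversions like these available, the failing call could be written as update_trajectory(thruster_force.into()), so the change of units is visible at the call site and checked by the compiler.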

The same pattern of using a newtype to mark additional "unit" semantics for a type can also help to make boolean arguments less ambiguous. Revisiting the example from Item 1, using newtypes makes the meaning of arguments clear:

    struct DoubleSided(pub bool);

    struct ColourOutput(pub bool);

    fn print_page(sides: DoubleSided, colour: ColourOutput) {
        // ...
    }

    print_page(DoubleSided(true), ColourOutput(false));

If size efficiency or binary compatibility is a concern, then the #[repr(transparent)] attribute ensures that a newtype has the same representation in memory as the inner type.
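As a small sketch of what that attribute looks like in practice (the `same_layout_as_f64` helper is a hypothetical name used only for illustration), the wrapper below can be checked against its inner type with `std::mem::size_of` and `std::mem::align_of`:

```rust
use std::mem::{align_of, size_of};

/// Newtype whose memory layout is guaranteed to match that of `f64`.
#[repr(transparent)]
pub struct NewtonSeconds(pub f64);

/// Hypothetical helper: reports whether the wrapper and the inner type
/// agree on both size and alignment.
pub fn same_layout_as_f64() -> bool {
    size_of::<NewtonSeconds>() == size_of::<f64>()
        && align_of::<NewtonSeconds>() == align_of::<f64>()
}
```

A single-field struct will usually have the same size as its field even without the attribute; the attribute is what turns that observation into a guarantee.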

That's the simple use of newtype, and it's a specific example of Item 1 – encoding semantics into the type system, so that the compiler takes care of policing those semantics.

Bypassing the Orphan Rule for Traits

The other common, but more subtle, scenario that requires the newtype pattern revolves around Rust's orphan rule. Roughly speaking, this says that you can only implement a trait for a type if:

  • you own the trait, or
  • you own the type.

Attempting to implement a foreign trait for a foreign type:

impl fmt::Display for rand::rngs::StdRng {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> {
        write!(f, "<StdRng instance>")
    }
}

leads to a compiler error (which in turn points the way back to newtypes).

error[E0117]: only traits defined in the current crate can be implemented for arbitrary types
   --> newtype/src/main.rs:125:1
    |
125 | impl fmt::Display for rand::rngs::StdRng {
    | ^^^^^^^^^^^^^^^^^^^^^^------------------
    | |                     |
    | |                     `StdRng` is not defined in the current crate
    | impl doesn't use only types from inside the current crate
    |
    = note: define and implement a trait or new type instead

This restriction exists because of the risk of ambiguity: if two different crates in the dependency graph (Item 25) were both to (say) impl std::fmt::Display for rand::rngs::StdRng, then the compiler/linker would have no way to choose between them.

This can frequently lead to frustration: for example, if you're trying to serialize data that includes a type from another crate, the orphan rule prevents[2] you from writing impl serde::Serialize for somecrate::SomeType.

But the newtype pattern means that you're creating a new type, which you own, and so the second part of the orphan rule applies. Implementing a foreign trait is now possible:

use std::fmt;

struct MyRng(rand::rngs::StdRng);

impl fmt::Display for MyRng {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> {
        write!(f, "<Rng instance>")
    }
}
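The same workaround can be tried without any external crates. The sketch below is an assumption-laden variant of the example above: it swaps the rand type for the standard library's `Vec<String>`, and the `Wrapper` name and the output format are made up for illustration.

```rust
use std::fmt;

/// Newtype owned by the current crate, wrapping a foreign type.
pub struct Wrapper(pub Vec<String>);

// Allowed by the orphan rule: `Display` is foreign, but `Wrapper` is local.
impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Reach the inner vector through `.0` and join its entries.
        write!(f, "[{}]", self.0.join(", "))
    }
}
```

Because `Display` is implemented, the wrapper also picks up `to_string()` through the standard library's blanket `ToString` implementation.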

Newtype Limitations

The newtype pattern solves these two classes of problems – preventing unit conversions and bypassing the orphan rule – but it does come with some awkwardness: every operation that involves the newtype needs to forward to the inner type.

On a trivial level that means that the code has to use thing.0 throughout, rather than just thing, but that's easy and the compiler will tell you where it's needed.

The more significant awkwardness is that any trait implementations on the inner type are lost: because the newtype is a new type, the existing inner implementation doesn't apply.

For derivable traits this just means that the newtype declaration ends up with lots of derives:

    #[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd)]
    pub struct NewType(InnerType);

However, for more sophisticated traits some forwarding boilerplate is needed to recover the inner type's implementation, for example:

    use std::fmt;
    impl fmt::Display for NewType {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> {
            // Fully qualified call: works without importing the `Display` trait.
            fmt::Display::fmt(&self.0, f)
        }
    }
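The same forwarding is needed for operator traits. The following sketch uses a hypothetical `Metres` newtype over `f64` (not a type from the text above) to show the derives together with hand-written forwarding for `Display` and `Add`:

```rust
use std::fmt;
use std::ops::Add;

/// Hypothetical newtype used only for this illustration.
#[derive(Debug, Copy, Clone, PartialEq, PartialOrd)]
pub struct Metres(pub f64);

impl fmt::Display for Metres {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to the inner f64 so that format flags such as `{:.1}`
        // are still honoured.
        fmt::Display::fmt(&self.0, f)
    }
}

impl Add for Metres {
    type Output = Metres;

    fn add(self, rhs: Metres) -> Metres {
        // Unwrap both operands, add the inner values, and re-wrap the result.
        Metres(self.0 + rhs.0)
    }
}
```

Every additional operator (`Sub`, `Mul` and so on) needs a similar block, which is where the boilerplate cost of the pattern comes from.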


1: Specifically, the Mars Climate Orbiter.

2: This is a sufficiently common problem for serde that it includes a mechanism to help.