Item 7: Use builders for complex types

This Item describes the builder pattern, where complex data structures have an associated builder type that makes it easier for users to create instances of the data structure.

Rust insists that all fields in a struct must be filled in when a new instance of that struct is created. This keeps the code safe by ensuring that there are never any uninitialized values but does lead to more verbose boilerplate code than is ideal.

For example, any optional fields have to be explicitly marked as absent with None:

#![allow(unused)]
fn main() {
/// Phone number in E164 format.
#[derive(Debug, Clone)]
pub struct PhoneNumberE164(pub String);

#[derive(Debug, Default)]
pub struct Details {
    pub given_name: String,
    pub preferred_name: Option<String>,
    pub middle_name: Option<String>,
    pub family_name: String,
    pub mobile_phone: Option<PhoneNumberE164>,
}

// ...

let dizzy = Details {
    given_name: "Dizzy".to_owned(),
    preferred_name: None,
    middle_name: None,
    family_name: "Mixer".to_owned(),
    mobile_phone: None,
};
}

This boilerplate code is also brittle, in the sense that a future change that adds a new field to the struct requires an update to every place that builds the structure.

The boilerplate can be significantly reduced by implementing and using the Default trait, as described in Item 10:

let dizzy = Details {
    given_name: "Dizzy".to_owned(),
    family_name: "Mixer".to_owned(),
    ..Default::default()
};

Using Default also helps reduce the changes needed when a new field is added, provided that the new field is itself of a type that implements Default.

That's a more general concern: the automatically derived implementation of Default works only if all of the field types implement the Default trait. If there's a field that doesn't play along, the derive step doesn't work:

#[derive(Debug, Default)]
pub struct Details {
    pub given_name: String,
    pub preferred_name: Option<String>,
    pub middle_name: Option<String>,
    pub family_name: String,
    pub mobile_phone: Option<PhoneNumberE164>,
    pub date_of_birth: time::Date,
    pub last_seen: Option<time::OffsetDateTime>,
}
error[E0277]: the trait bound `Date: Default` is not satisfied
  --> src/main.rs:48:9
   |
41 |     #[derive(Debug, Default)]
   |                     ------- in this derive macro expansion
...
48 |         pub date_of_birth: time::Date,
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `Default` is not
   |                                       implemented for `Date`
   |
   = note: this error originates in the derive macro `Default`

The code can't implement Default for chrono::Utc because of the orphan rule; but even if it could, it wouldn't be helpful—using a default value for date of birth is going to be wrong almost all of the time.

The absence of Default means that all of the fields have to be filled out manually:

let bob = Details {
    given_name: "Robert".to_owned(),
    preferred_name: Some("Bob".to_owned()),
    middle_name: Some("the".to_owned()),
    family_name: "Builder".to_owned(),
    mobile_phone: None,
    date_of_birth: time::Date::from_calendar_date(
        1998,
        time::Month::November,
        28,
    )
    .unwrap(),
    last_seen: None,
};

These ergonomics can be improved if you implement the builder pattern for complex data structures.

The simplest variant of the builder pattern is a separate struct that holds the information needed to construct the item. For simplicity, the example will hold an instance of the item itself:

pub struct DetailsBuilder(Details);

impl DetailsBuilder {
    /// Start building a new [`Details`] object.
    pub fn new(
        given_name: &str,
        family_name: &str,
        date_of_birth: time::Date,
    ) -> Self {
        DetailsBuilder(Details {
            given_name: given_name.to_owned(),
            preferred_name: None,
            middle_name: None,
            family_name: family_name.to_owned(),
            mobile_phone: None,
            date_of_birth,
            last_seen: None,
        })
    }
}

The builder type can then be equipped with helper methods that fill out the nascent item's fields. Each such method consumes self but emits a new Self, allowing different construction methods to be chained:

/// Set the preferred name.
pub fn preferred_name(mut self, preferred_name: &str) -> Self {
    self.0.preferred_name = Some(preferred_name.to_owned());
    self
}

/// Set the middle name.
pub fn middle_name(mut self, middle_name: &str) -> Self {
    self.0.middle_name = Some(middle_name.to_owned());
    self
}

These helper methods can be more helpful than just simple setters:

/// Update the `last_seen` field to the current date/time.
pub fn just_seen(mut self) -> Self {
    self.0.last_seen = Some(time::OffsetDateTime::now_utc());
    self
}

The final method to be invoked for the builder consumes the builder and emits the built item:

/// Consume the builder object and return a fully built [`Details`]
/// object.
pub fn build(self) -> Details {
    self.0
}

Overall, this allows clients of the builder to have a more ergonomic building experience:

let also_bob = DetailsBuilder::new(
    "Robert",
    "Builder",
    time::Date::from_calendar_date(1998, time::Month::November, 28)
        .unwrap(),
)
.middle_name("the")
.preferred_name("Bob")
.just_seen()
.build();

The all-consuming nature of this style of builder leads to a couple of wrinkles. The first is that separating out stages of the build process can't be done on its own:

let builder = DetailsBuilder::new(
    "Robert",
    "Builder",
    time::Date::from_calendar_date(1998, time::Month::November, 28)
        .unwrap(),
);
if informal {
    builder.preferred_name("Bob");
}
let bob = builder.build();
error[E0382]: use of moved value: `builder`
   --> src/main.rs:256:15
    |
247 |     let builder = DetailsBuilder::new(
    |         ------- move occurs because `builder` has type `DetailsBuilder`,
    |                 which does not implement the `Copy` trait
...
254 |         builder.preferred_name("Bob");
    |                 --------------------- `builder` moved due to this method
    |                                       call
255 |     }
256 |     let bob = builder.build();
    |               ^^^^^^^ value used here after move
    |
note: `DetailsBuilder::preferred_name` takes ownership of the receiver `self`,
      which moves `builder`
   --> src/main.rs:60:35
    |
27  |     pub fn preferred_name(mut self, preferred_name: &str) -> Self {
    |                               ^^^^

This can be worked around by assigning the consumed builder back to the same variable:

let mut builder = DetailsBuilder::new(
    "Robert",
    "Builder",
    time::Date::from_calendar_date(1998, time::Month::November, 28)
        .unwrap(),
);
if informal {
    builder = builder.preferred_name("Bob");
}
let bob = builder.build();

The other downside to the all-consuming nature of this builder is that only one item can be built; trying to create multiple instances by repeatedly calling build() on the same builder falls foul of the compiler, as you'd expect:

let smithy = DetailsBuilder::new(
    "Agent",
    "Smith",
    time::Date::from_calendar_date(1999, time::Month::June, 11).unwrap(),
);
let clones = vec![smithy.build(), smithy.build(), smithy.build()];
error[E0382]: use of moved value: `smithy`
   --> src/main.rs:159:39
    |
154 |   let smithy = DetailsBuilder::new(
    |       ------ move occurs because `smithy` has type `base::DetailsBuilder`,
    |              which does not implement the `Copy` trait
...
159 |   let clones = vec![smithy.build(), smithy.build(), smithy.build()];
    |                            -------  ^^^^^^ value used here after move
    |                            |
    |                            `smithy` moved due to this method call

An alternative approach is for the builder's methods to take a &mut self and emit a &mut Self:

/// Update the `last_seen` field to the current date/time.
pub fn just_seen(&mut self) -> &mut Self {
    self.0.last_seen = Some(time::OffsetDateTime::now_utc());
    self
}

This removes the need for self-assignment in separate build stages:

let mut builder = DetailsBuilder::new(
    "Robert",
    "Builder",
    time::Date::from_calendar_date(1998, time::Month::November, 28)
        .unwrap(),
);
if informal {
    builder.preferred_name("Bob"); // no `builder = ...`
}
let bob = builder.build();

However, this version makes it impossible to chain the construction of the builder together with invocation of its setter methods:

let builder = DetailsBuilder::new(
    "Robert",
    "Builder",
    time::Date::from_calendar_date(1998, time::Month::November, 28)
        .unwrap(),
)
.middle_name("the")
.just_seen();
let bob = builder.build();
error[E0716]: temporary value dropped while borrowed
   --> src/main.rs:265:19
    |
265 |       let builder = DetailsBuilder::new(
    |  ___________________^
266 | |         "Robert",
267 | |         "Builder",
268 | |         time::Date::from_calendar_date(1998, time::Month::November, 28)
269 | |             .unwrap(),
270 | |     )
    | |_____^ creates a temporary value which is freed while still in use
271 |       .middle_name("the")
272 |       .just_seen();
    |                   - temporary value is freed at the end of this statement
273 |       let bob = builder.build();
    |                 --------------- borrow later used here
    |
    = note: consider using a `let` binding to create a longer lived value

As indicated by the compiler error, you can work around this by letting the builder item have a name:

let mut builder = DetailsBuilder::new(
    "Robert",
    "Builder",
    time::Date::from_calendar_date(1998, time::Month::November, 28)
        .unwrap(),
);
builder.middle_name("the").just_seen();
if informal {
    builder.preferred_name("Bob");
}
let bob = builder.build();

This mutating builder variant also allows for building multiple items. The signature of the build() method has to not consume self and so must be as follows:

/// Construct a fully built [`Details`] object.
pub fn build(&self) -> Details {
    // ...
}

The implementation of this repeatable build() method then has to construct a fresh item on each invocation. If the underlying item implements Clone, this is easy—the builder can hold a template and clone() it for each build. If the underlying item doesn't implement Clone, then the builder needs to have enough state to be able to manually construct an instance of the underlying item on each call to build().

With any style of builder pattern, the boilerplate code is now confined to one place—the builder—rather than being needed at every place that uses the underlying type.

The boilerplate that remains can potentially be reduced still further by use of a macro (Item 28), but if you go down this road, you should also check whether there's an existing crate (such as the derive_builder crate, in particular) that provides what's needed—assuming that you're happy to take a dependency on it (Item 25).