Item 35: Prefer bindgen to manual FFI mappings

Item 34 discussed the mechanics of invoking C code from a Rust program, describing how declarations of C structures and functions need to have an equivalent Rust declaration to allow them to be used over FFI. The C and Rust declarations need to be kept in sync, and Item 34 also warned that the toolchain wouldn't help with this—mismatches would be silently ignored, hiding problems that would arise later.

Keeping two things perfectly in sync sounds like a good target for automation, and the Rust project provides the right tool for the job: bindgen. The primary function of bindgen is to parse a C header file and emit the corresponding Rust declarations.

Taking some of the example C declarations from Item 34:

/* File lib.h */
#include <stdint.h>

typedef struct {
    uint8_t byte;
    uint32_t integer;
} FfiStruct;

int add(int x, int y);
uint32_t add32(uint32_t x, uint32_t y);

the bindgen tool can be manually invoked (or invoked by a build.rs build script) to create a corresponding Rust file:

% bindgen --no-layout-tests \
          --allowlist-function="add.*" \
          --allowlist-type=FfiStruct \
          -o src/generated.rs \
          lib.h
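For the build-script route, one option is to have build.rs launch the same command line. The following is a minimal sketch using only the standard library, assuming the bindgen executable is installed and on the PATH; the helper names `bindgen_args` and `run_bindgen` are hypothetical and chosen just for this illustration.

```rust
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Build the argument list for the `bindgen` command line shown above.
/// `header` and `output` are hypothetical parameters for this sketch.
fn bindgen_args(header: &Path, output: &Path) -> Vec<String> {
    vec![
        "--no-layout-tests".to_string(),
        "--allowlist-function=add.*".to_string(),
        "--allowlist-type=FfiStruct".to_string(),
        "-o".to_string(),
        output.display().to_string(),
        header.display().to_string(),
    ]
}

/// Run the `bindgen` executable (assumed to be installed and on the PATH)
/// and report its exit status to the caller.
fn run_bindgen(header: &Path, output: &Path) -> io::Result<ExitStatus> {
    Command::new("bindgen")
        .args(bindgen_args(header, output))
        .status()
}
```

A build.rs `main` function could call `run_bindgen(Path::new("lib.h"), Path::new("src/generated.rs"))`, fail the build if the status is not a success, and print `cargo:rerun-if-changed=lib.h` so that Cargo reruns the script whenever the header changes.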

The generated Rust is identical to the handcrafted declarations in Item 34:

/* automatically generated by rust-bindgen 0.59.2 */

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}
extern "C" {
    pub fn add(
        x: ::std::os::raw::c_int,
        y: ::std::os::raw::c_int,
    ) -> ::std::os::raw::c_int;
}
extern "C" {
    pub fn add32(x: u32, y: u32) -> u32;
}

and can be pulled into Rust code with the source-level include! macro:

// Include the auto-generated Rust declarations.
include!("generated.rs");

For anything but the most trivial FFI declarations, use bindgen to generate Rust bindings for C code: this is an area where machine-made, mass-produced code is definitely preferable to artisanal handcrafted declarations. If a C function definition changes, the C compiler will complain if the C declaration no longer matches the C definition, but nothing will complain that a handcrafted Rust declaration no longer matches the C declaration; auto-generating the Rust declaration from the C declaration ensures that the two stay in sync.

This also means that the bindgen step is an ideal candidate to include in a CI system (Item 32); if the generated code is included in source control, the CI system can error out if a freshly generated file doesn't match the checked-in version.
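The comparison step of such a CI check might look like the sketch below. It assumes that the CI job has already regenerated the bindings into a scratch file; the function names and both path parameters are hypothetical, and treating line-ending differences as insignificant is a choice of this sketch, not something bindgen requires.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Compare two versions of the generated bindings line by line.
/// `str::lines` strips `\n` and `\r\n`, so only line-ending differences
/// and a missing final newline are ignored.
fn bindings_match(checked_in: &str, fresh: &str) -> bool {
    checked_in.lines().eq(fresh.lines())
}

/// Read both files and report whether the checked-in bindings are current.
/// Both paths are hypothetical parameters chosen for this sketch.
fn check_bindings(checked_in: &Path, fresh: &Path) -> io::Result<bool> {
    let old = fs::read_to_string(checked_in)?;
    let new = fs::read_to_string(fresh)?;
    Ok(bindings_match(&old, &new))
}
```

A CI job would call `check_bindings` on the checked-in file and the freshly generated one, and exit with a nonzero status when it returns `Ok(false)` or an error.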

The bindgen tool comes into its own when you're dealing with an existing C codebase that has a large API. Creating Rust equivalents to a big lib_api.h header file is manual and tedious, and therefore error-prone—and as noted, many categories of mismatch error will not be detected by the toolchain. bindgen also has a panoply of options that allow specific subsets of an API to be targeted (such as the --allowlist-function and --allowlist-type options previously illustrated).1

This also allows a layered approach for exposing an existing C library in Rust; a common convention for wrapping some xyzzy library is to have the following:

  • An xyzzy-sys crate that holds (just) the bindgen-erated code—use of which is necessarily unsafe
  • An xyzzy crate that encapsulates the unsafe code and provides safe Rust access to the underlying functionality

This concentrates the unsafe code in one layer and allows the rest of the program to follow the advice in Item 16.
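The two layers can be sketched in a single file by using modules in place of crates. In this illustration the `xyzzy_sys` module is a stand-in: a real xyzzy-sys crate would contain the bindgen output and link against the C library, whereas here `add32` is defined in Rust (and assumed to behave like a C function that returns x + y) purely so that the example compiles on its own.

```rust
/// Stand-in for the `xyzzy-sys` layer. In a real crate this module would
/// hold the bindgen output; here `add32` is defined in Rust (an assumption
/// of this sketch) so that the example compiles without a C library.
mod xyzzy_sys {
    pub unsafe extern "C" fn add32(x: u32, y: u32) -> u32 {
        // Unsigned arithmetic in C wraps on overflow, so mirror that here.
        x.wrapping_add(y)
    }
}

/// Stand-in for the `xyzzy` layer: a safe API over the unsafe one.
mod xyzzy {
    use super::xyzzy_sys;

    /// Add two values, wrapping on overflow.
    pub fn add32(x: u32, y: u32) -> u32 {
        // SAFETY: `add32` takes plain integers by value and has no
        // preconditions, so any pair of arguments is valid.
        unsafe { xyzzy_sys::add32(x, y) }
    }
}
```

Callers use `xyzzy::add32(2, 3)` without writing `unsafe` themselves; the single `unsafe` block, together with the comment justifying it, lives in the wrapper layer.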

Beyond C

The bindgen tool can handle some C++ constructs, but only a subset and in a limited fashion. For better (but still somewhat limited) integration, consider using the cxx crate for C++/Rust interoperation. Instead of generating Rust code from C++ declarations, cxx takes the approach of auto-generating both Rust and C++ code from a common schema, allowing for tighter integration.


1. The example also used the --no-layout-tests option to keep the output simple; by default, the generated code will include #[test] code to check that structures are indeed laid out correctly.
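The kind of property that such layout tests check can be approximated by hand. The sketch below is not bindgen's actual generated test code; it is a hypothetical helper, `layout_summary`, that reports the size, alignment, and field offset of FfiStruct, assuming a typical target where uint32_t is 4 bytes wide and 4-byte aligned.

```rust
use std::mem::{align_of, size_of};
use std::ptr::addr_of;

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}

/// Return (size, alignment, offset of `integer`) for `FfiStruct`.
/// This helper is hypothetical and exists only for this sketch.
fn layout_summary() -> (usize, usize, usize) {
    let value = FfiStruct { byte: 0, integer: 0 };
    // Compute the field offset as the distance between the address of
    // the field and the address of the structure that contains it.
    let base = addr_of!(value) as usize;
    let field = addr_of!(value.integer) as usize;
    (size_of::<FfiStruct>(), align_of::<FfiStruct>(), field - base)
}
```

On such a target the C compiler inserts three bytes of padding after `byte`, so the Rust side needs to agree on an 8-byte structure with `integer` at offset 4; a disagreement here is exactly the sort of silent mismatch that layout tests are meant to catch.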