Item 35: Prefer bindgen to manual FFI mappings
Item 34 discussed the mechanics of invoking C code from a Rust program, describing how declarations of C structures and functions need to have an equivalent Rust declaration to allow them to be used over FFI. The C and Rust declarations need to be kept in sync, and Item 34 also warned that the toolchain wouldn't help with this—mismatches would be silently ignored, hiding problems that would arise later.
Keeping two things perfectly in sync sounds like a good target for automation, and the Rust project provides the right tool for the job: bindgen. The primary function of bindgen is to parse a C header file and emit the corresponding Rust declarations.
Taking some of the example C declarations from Item 34:
/* File lib.h */
#include <stdint.h>
typedef struct {
    uint8_t byte;
    uint32_t integer;
} FfiStruct;
int add(int x, int y);
uint32_t add32(uint32_t x, uint32_t y);
the bindgen tool can be manually invoked (or invoked by a build.rs build script) to create a corresponding Rust file:
% bindgen --no-layout-tests \
    --allowlist-function="add.*" \
    --allowlist-type=FfiStruct \
    -o src/generated.rs \
    lib.h
The generated Rust is identical to the handcrafted declarations in Item 34:
/* automatically generated by rust-bindgen 0.59.2 */

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}
extern "C" {
    pub fn add(
        x: ::std::os::raw::c_int,
        y: ::std::os::raw::c_int,
    ) -> ::std::os::raw::c_int;
}
extern "C" {
    pub fn add32(x: u32, y: u32) -> u32;
}
and can be pulled into Rust code with the source-level include! macro:
// Include the auto-generated Rust declarations.
include!("generated.rs");
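The build.rs route mentioned earlier could look roughly like the sketch below. It is only one possible arrangement: it assumes the bindgen command-line tool is installed and on the PATH, that lib.h sits next to build.rs, and it writes the output to Cargo's OUT_DIR rather than to src/. The helper name bindgen_command is made up for this illustration. Many projects instead add the bindgen crate as a build dependency and call its builder API; shelling out is used here only to keep the sketch free of external crates.

```rust
// build.rs: a sketch that shells out to the bindgen CLI (stdlib only).
use std::env;
use std::path::Path;
use std::process::Command;

/// Build the same `bindgen` invocation as the manual command line.
/// (`bindgen_command` is a hypothetical helper, not part of bindgen.)
fn bindgen_command(header: &str, output: &Path) -> Command {
    let mut cmd = Command::new("bindgen");
    cmd.arg("--no-layout-tests")
        .arg("--allowlist-function=add.*")
        .arg("--allowlist-type=FfiStruct")
        .arg("-o")
        .arg(output)
        .arg(header);
    cmd
}

fn main() {
    // Cargo sets OUT_DIR for build scripts; do nothing when run elsewhere.
    let out_dir = match env::var("OUT_DIR") {
        Ok(dir) => dir,
        Err(_) => {
            eprintln!("OUT_DIR is not set; run this as a Cargo build script");
            return;
        }
    };
    let output = Path::new(&out_dir).join("generated.rs");

    // Ask Cargo to re-run this script only when the header changes.
    println!("cargo:rerun-if-changed=lib.h");

    let status = bindgen_command("lib.h", &output)
        .status()
        .expect("failed to run bindgen; is it installed and on PATH?");
    assert!(status.success(), "bindgen exited with {}", status);
}
```

With this layout the generated file lives in the build directory, so the include line would become include!(concat!(env!("OUT_DIR"), "/generated.rs")); instead of naming a file under src/.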
For anything but the most trivial FFI declarations, use bindgen to generate Rust bindings for C code: this is an area where machine-made, mass-produced code is definitely preferable to artisanal handcrafted declarations. If a C function definition changes, the C compiler will complain if the C declaration no longer matches the C definition, but nothing will complain that a handcrafted Rust declaration no longer matches the C declaration; auto-generating the Rust declaration from the C declaration ensures that the two stay in sync.
This also means that the bindgen step is an ideal candidate to include in a CI system (Item 32); if the generated code is included in source control, the CI system can error out if a freshly generated file doesn't match the checked-in version.
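As a rough illustration of such a CI check, the following sketch compares a checked-in bindings file with a freshly generated one and exits with a non-zero status if they differ. The tool name check-bindings and the helper first_difference are invented for this example, and the sketch assumes an earlier CI step has already run bindgen to produce the fresh file.

```rust
use std::env;
use std::fs;
use std::process::ExitCode;

/// Return the 1-based number of the first differing line, or `None`
/// if both texts consist of identical lines.
fn first_difference(checked_in: &str, fresh: &str) -> Option<usize> {
    let mut old_lines = checked_in.lines();
    let mut new_lines = fresh.lines();
    let mut line_no = 1;
    loop {
        match (old_lines.next(), new_lines.next()) {
            (None, None) => return None,
            (old, new) if old == new => line_no += 1,
            _ => return Some(line_no),
        }
    }
}

fn main() -> ExitCode {
    // Hypothetical usage: check-bindings <checked-in.rs> <fresh.rs>
    let args: Vec<String> = env::args().skip(1).collect();
    if args.len() != 2 {
        eprintln!("usage: check-bindings <checked-in.rs> <fresh.rs>");
        return ExitCode::from(2);
    }
    let read = |path: &str| fs::read_to_string(path);
    let (checked_in, fresh) = match (read(&args[0]), read(&args[1])) {
        (Ok(a), Ok(b)) => (a, b),
        _ => {
            eprintln!("could not read one of the input files");
            return ExitCode::from(2);
        }
    };
    match first_difference(&checked_in, &fresh) {
        None => ExitCode::SUCCESS,
        Some(line) => {
            eprintln!("bindings are stale: first difference at line {}", line);
            ExitCode::FAILURE
        }
    }
}
```

Because the generated file starts with a comment naming the bindgen version, a check like this only works reliably if CI and developers use the same bindgen version.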
The bindgen tool comes into its own when you're dealing with an existing C codebase that has a large API. Creating Rust equivalents to a big lib_api.h header file is manual and tedious, and therefore error-prone; as noted, many categories of mismatch error will not be detected by the toolchain. bindgen also has a panoply of options that allow specific subsets of an API to be targeted (such as the --allowlist-function and --allowlist-type options previously illustrated).1
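To make one category of silent mismatch concrete, the sketch below puts the bindgen-generated FfiStruct next to a hypothetical hand-written version, HandwrittenStruct, in which one field has the wrong width. Both compile without complaint, yet their layouts differ. The sizes mentioned in the comments assume a typical target where u32 is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

/// Same shape as the `FfiStruct` declaration that bindgen emits.
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}

/// Hypothetical hand-written declaration with a wrong field type.
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct HandwrittenStruct {
    pub byte: u8,
    pub integer: u16, // should be u32, but this still compiles
}

fn main() {
    // On typical targets: 8 bytes with 4-byte alignment.
    println!(
        "FfiStruct: size {}, align {}",
        size_of::<FfiStruct>(),
        align_of::<FfiStruct>()
    );
    // On typical targets: 4 bytes with 2-byte alignment, so C code
    // reading or writing this memory would see the wrong layout.
    println!(
        "HandwrittenStruct: size {}, align {}",
        size_of::<HandwrittenStruct>(),
        align_of::<HandwrittenStruct>()
    );
}
```

When layout tests are not disabled with --no-layout-tests, bindgen emits size and alignment checks of this kind alongside the declarations.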
Using bindgen also allows a layered approach for exposing an existing C library in Rust; a common convention for wrapping some xyzzy library is to have the following:
- An xyzzy-sys crate that holds (just) the bindgen-erated code, use of which is necessarily unsafe
- An xyzzy crate that encapsulates the unsafe code and provides safe Rust access to the underlying functionality
This concentrates the unsafe code in one layer and allows the rest of the program to follow the advice in Item 16.
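The two layers might be sketched as follows, with modules standing in for the two crates so that the example fits in one file. Several things here are assumptions made for illustration: the fake_c_library module is a Rust stand-in for the compiled C code so that the sketch links on its own, the C add is assumed to compute x + y (where signed overflow would be undefined behavior in C), and the Option-returning signature is just one possible safe API. The unsafe extern and #[unsafe(no_mangle)] spellings need Rust 1.82 or later; older toolchains use the plain extern "C" form shown in the bindgen output.

```rust
use std::os::raw::c_int;

/// Stand-in for the C library so that this sketch links on its own;
/// a real project would link the compiled C code instead.
mod fake_c_library {
    use std::os::raw::c_int;

    #[unsafe(no_mangle)]
    pub extern "C" fn add(x: c_int, y: c_int) -> c_int {
        x.wrapping_add(y)
    }
}

/// Plays the role of the `xyzzy-sys` crate: raw declarations only.
mod xyzzy_sys {
    use std::os::raw::c_int;

    unsafe extern "C" {
        pub fn add(x: c_int, y: c_int) -> c_int;
    }
}

/// Plays the role of the `xyzzy` crate: a safe API over the raw bindings.
mod xyzzy {
    use super::xyzzy_sys;
    use std::os::raw::c_int;

    /// Add two values, returning `None` if the sum would overflow.
    pub fn add(x: c_int, y: c_int) -> Option<c_int> {
        // Reject inputs that would overflow before crossing the FFI
        // boundary (assumes the C side simply computes `x + y`).
        x.checked_add(y)?;
        // SAFETY: the arguments are plain integers and the overflow
        // precondition was checked above.
        Some(unsafe { xyzzy_sys::add(x, y) })
    }
}

fn main() {
    println!("{:?}", xyzzy::add(2, 3)); // prints "Some(5)"
    println!("{:?}", xyzzy::add(c_int::MAX, 1)); // prints "None"
}
```

Callers of the safe layer never write unsafe themselves; the single unsafe block and its justification live in one place.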
Beyond C
The bindgen tool has the ability to handle some C++ constructs, but only a subset and in a limited fashion. For better (but still somewhat limited) integration, consider using the cxx crate for C++/Rust interoperation. Instead of generating Rust code from C++ declarations, cxx takes the approach of auto-generating both Rust and C++ code from a common schema, allowing for tighter integration.