Item 34: Control what crosses FFI boundaries

Even though Rust comes with a comprehensive standard library and a burgeoning crate ecosystem, there is still a lot more non-Rust code available than there is Rust code.

As with other recent languages, Rust helps with this problem by offering a foreign function interface (FFI) mechanism, which allows interoperation with code and data structures written in different languages – despite the name, FFI is not restricted to just functions. This opens up the use of existing libraries in different languages, not just those that have succumbed to the Rust community's efforts to "rewrite it in Rust" (RIIR).

The default target for Rust's interoperability is the C programming language; this is the same interop target that other languages aim at. This is partly driven by the ubiquity of C libraries, but is also driven by simplicity: C acts as a "least common denominator" of interoperability, because it doesn't need toolchain support of any of the more advanced features that would be necessary for compatibility with other languages (e.g. garbage collection for Java or Go, exceptions and templates for C++, function overloads for Java and C++, …).

However, that's not to say that interoperability with plain C is simple. By including code written in a different language, all of the guarantees and protections that Rust offers are up for grabs, particularly those involving memory safety.

As a result, FFI code in Rust is automatically unsafe, and the advice of Item 16 has to be bypassed. This Item explores some replacement advice, and Item 35 will explore some tooling that helps to avoid some (but not all) of the footguns involved in working with FFI. (The FFI chapter of the Rustonomicon also contains helpful advice and information.)

Invoking C Functions from Rust

The simplest FFI interaction is for Rust code to invoke a C function, taking "immediate" arguments that don't involve pointers, references or memory addresses:

/* C function definition. */
int add(int x, int y) {
  return x + y;
}

To use this function in Rust, there needs to be an equivalent declaration:

use std::os::raw::c_int;
extern "C" {
    pub fn add(x: c_int, y: c_int) -> c_int;
}

The declaration is marked as extern "C" to indicate that an external C library will provide the actual code for the function. The extern "C" marker also automatically marks the function as #[no_mangle], which is explored more below.

(Note that if the FFI functionality you want to use is just the standard C library, then you don't need to create these declarations – the libc crate already provides them.)

The build system will typically also need an indication of how/where to find the library holding the C code, either via the link attribute or the links manifest key.

But even this simplest of examples comes with some gotchas. First, use of FFI functions is automatically unsafe:

error[E0133]: call to unsafe function is unsafe and requires unsafe function or block
   --> ffi/src/main.rs:156:13
    |
156 |     let x = add(1, 1);
    |             ^^^^^^^^^ call to unsafe function
    |
    = note: consult the function's documentation for information on how to avoid undefined behavior
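
One way to satisfy the compiler is to confine the unsafe call to a small safe wrapper. The sketch below is a minimal illustration under stated assumptions: it declares abs from the C standard library (already linked into Rust programs on mainstream platforms) rather than the add function above, which would need a separately built C library, and the wrapper name checked_c_abs is hypothetical.

```rust
use std::os::raw::c_int;

// Declaration of `abs` from the C standard library.
// `unsafe extern` is the Rust 1.82+ spelling (mandatory in the 2024 edition);
// earlier toolchains write plain `extern "C"` as in the text above.
unsafe extern "C" {
    fn abs(x: c_int) -> c_int;
}

/// Safe wrapper (hypothetical name) around the C `abs` function.
pub fn checked_c_abs(x: c_int) -> Option<c_int> {
    if x == c_int::MIN {
        // C leaves `abs(INT_MIN)` undefined, so never let it reach the C side.
        return None;
    }
    // Safety: `abs` takes its argument by value, and the one input with
    // undefined behavior has been excluded above.
    Some(unsafe { abs(x) })
}
```

Callers of checked_c_abs need no unsafe block of their own; the justification for the call lives next to the single place that makes it.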

The next thing to watch out for is the use of C's int type, represented as std::os::raw::c_int. How big is an int? It's probably true that

  • the size of an int for the toolchain that compiled the C library, and
  • the size of a std::os::raw::c_int for the Rust toolchain

are the same. But why take the chance? Prefer sized types at FFI boundaries, where possible – which for C means making use of the types defined in <stdint.h>. However, if you're dealing with an existing codebase that already uses int / long / size_t this may be a luxury you don't have.
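
As a rough sketch of that advice, the snippet below pairs a fixed-width signature with a build-time guard on the Rust toolchain's idea of int. The function name add_i32 is hypothetical, and the guard only pins down the Rust side's view; it cannot see how the C library was actually compiled.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

// Build-time guard: compilation fails on any target where the Rust toolchain
// believes C's `int` is not 32 bits wide.
const _: () = assert!(size_of::<c_int>() == size_of::<i32>());

/// Hypothetical function using only fixed-width types, matching the C
/// prototype `int32_t add_i32(int32_t x, int32_t y);`.
pub extern "C" fn add_i32(x: i32, y: i32) -> i32 {
    // Wrapping arithmetic avoids an overflow panic in debug builds.
    x.wrapping_add(y)
}
```

With i32 and int32_t on the two sides, the width is fixed by the type names themselves rather than by whatever each toolchain decides an int should be.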

The final practical concern is that the C code and the equivalent Rust declaration need to exactly match. Worse still, if there's a mismatch, the build tools will not emit a warning – they will just silently emit incorrect code.

Item 35 discusses the use of the bindgen tool to prevent this problem, but it's worth understanding the basics of what's going on under the covers to understand why the build tools can't detect the problem.

Compiled languages generally support separate compilation, where different parts of the program are converted into machine code as separate chunks (object files), which can then be combined into a complete program by the linker. This means that if only one small part of the program's source code changes, only the corresponding object file needs to be regenerated; the link step then rebuilds the program, combining both the changed object and all the other unmodified objects.

The link step is (roughly speaking) a "join the dots" operation: some object files provide definitions of functions and variables, other object files have placeholder markers indicating that they expect to use a definition from some other object, but it wasn't available at compile time. The linker combines the two: it ensures that any placeholder in the compiled code is replaced with a reference to the corresponding concrete definition.

The linker performs this correlation between the placeholders and the definitions by simply checking for a matching name, meaning that there is a single global namespace for all of these correlations.

Historically, this was fine for linking C language programs, where a single name could not be re-used in any way, but the introduction of C++ caused a problem. C++ allows overloaded definitions with the same name:

// C++ code
namespace ns1 {
int32_t add(int32_t a, int32_t b) { return a+b; }
int64_t add(int64_t a, int64_t b) { return a+b; }
}
namespace ns2 {
int32_t add(int32_t a, int32_t b) { return a+b; }
}

The solution for this is name mangling: the compiler encodes the signature and type information for the overloaded functions into the name that's emitted in the object file, and the linker continues to perform its simple-minded 1:1 correlation between placeholders and definitions.

On UNIX-like systems, the nm tool can help show what the linker works with, and the c++filt tool helps translate this back into what would be visible in C++ code:

% nm ffi-cpp-lib.o | grep add  # what the linker sees
0000000000000000 T __ZN3ns13addEii
0000000000000020 T __ZN3ns13addExx
0000000000000040 T __ZN3ns23addEii
% nm ffi-cpp-lib.o | grep add | c++filt  # what the programmer sees
0000000000000000 T ns1::add(int, int)
0000000000000020 T ns1::add(long long, long long)
0000000000000040 T ns2::add(int, int)

Because the mangled name includes type information, the linker can and will complain about any mismatch in the type information between placeholder and definition. This gives some measure of type safety: if the definition changes but the place using it is not updated, the toolchain will complain.

Returning to Rust, extern "C" foreign functions are implicitly marked as #[no_mangle], which means that this level of type safety is lost – the linker only sees the "bare" names for functions and variables, and if there are any differences in type expectations between definition and use, this will only cause problems at runtime.

Accessing C Data from Rust

"I'm playing all the right notes, but not necessarily in the right order." – Eric Morecambe

Even though the example of the previous section passed the simplest possible data – an integer that fits in a machine register – between Rust and C, there were still things to be careful about. It's no surprise then that dealing with more complex data structures also has wrinkles to watch out for.

Both C and Rust use the struct to combine related data into a single data structure. However, when a struct is realised in memory, the two languages may well choose to put different fields in different places or even in different orders (the layout). To prevent mismatches, use #[repr(C)] for Rust types used in FFI; this representation is designed for the purpose of allowing C interoperability.

/* C data structure definition. */
/* Changes here must be reflected in lib.rs. */
typedef struct {
    uint8_t byte;
    uint32_t integer;
} FfiStruct;

// Equivalent Rust data structure.
// Changes here must be reflected in lib.h / lib.c.
#[repr(C)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}

The structure definitions have a comment to remind the humans involved that the two places need to be kept in sync. Relying on the constant vigilance of humans is likely to go wrong in the long term; as for function signatures, it's better to automate this synchronization between the two languages via a tool like bindgen (Item 35).
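
To make the effect of #[repr(C)] concrete, the following sketch measures the layout of the Rust FfiStruct. The helper name layout is hypothetical, and the numbers quoted afterwards assume a mainstream target where u32 has 4-byte alignment.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}

/// Returns (size, alignment, byte offset of `integer`) for `FfiStruct`.
/// The helper name is hypothetical.
pub fn layout() -> (usize, usize, usize) {
    let v = FfiStruct { byte: 0, integer: 0 };
    // Compute the field offset from the addresses of the struct and the field.
    let base = &v as *const FfiStruct as usize;
    let field = &v.integer as *const u32 as usize;
    (size_of::<FfiStruct>(), align_of::<FfiStruct>(), field - base)
}
```

On such a target layout() gives (8, 4, 4): the fields stay in declaration order, with three padding bytes after byte, which is what a C compiler produces for the equivalent struct. Without #[repr(C)], Rust makes no promise about field order.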

One type (pun intended) of data that's worth thinking about carefully for FFI interactions is strings. The default definitions of what makes up a string are somewhat different between C and Rust.

  • A Rust String holds UTF-8 encoded data, possibly including zero bytes, with an explicitly known length.
  • A C string (char *) holds byte values (which may or may not be signed), with its length implicitly determined by the first zero byte (\0) found in the data.

Fortunately, dealing with C-style strings in Rust is comparatively straightforward, because the Rust library designers have already done the heavy lifting by providing a pair of types to encode them: CString for owned values and CStr for borrowed ones. Use the CString type to hold strings that need to be interoperable with C, and then use the as_ptr() method to pass the contents to any FFI function that's expecting a const char* C string. Note that the const is important: this can't be used for an FFI function that needs to modify the contents (char *) of the string that's passed to it.
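
A minimal sketch of that pattern follows, using strlen from the C standard library as the const char* consumer. The wrapper names c_strlen and round_trip are hypothetical, and mapping C's size_t to usize is the conventional assumption rather than something checked here.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// `unsafe extern` is the Rust 1.82+ spelling; earlier toolchains write `extern "C"`.
unsafe extern "C" {
    // C prototype: size_t strlen(const char *s);
    fn strlen(s: *const c_char) -> usize;
}

/// Length in bytes as seen by C, or `None` if `s` contains a zero byte.
pub fn c_strlen(s: &str) -> Option<usize> {
    // `CString::new` appends the terminating NUL and rejects interior zero bytes.
    let owned = CString::new(s).ok()?;
    // Safety: `owned` stays alive for the whole call, so the pointer is valid
    // and NUL-terminated, and `strlen` only reads through it.
    Some(unsafe { strlen(owned.as_ptr()) })
}

/// Views a C string pointer as a borrowed `CStr` and copies it back to Rust.
pub fn round_trip(s: &str) -> Option<String> {
    let owned = CString::new(s).ok()?;
    // Safety: the pointer comes from a live `CString`.
    let borrowed: &CStr = unsafe { CStr::from_ptr(owned.as_ptr()) };
    borrowed.to_str().ok().map(str::to_owned)
}
```

Binding the CString to a local variable matters: calling as_ptr() on a temporary leaves a dangling pointer as soon as the statement ends.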

Lifetimes

Most data structures are too big to fit in a register, and so have to be held in memory instead. That in turn means that access to the data is performed via the location of that memory. In C terms this means a pointer: a number that encodes a memory address – with no other semantics attached.

In Rust, a location in memory is generally represented as a reference, and its numeric value can be extracted as a raw pointer, ready to feed into an FFI boundary.

extern "C" {
    pub fn use_struct(v: *const FfiStruct) -> u32;
}

    let v = FfiStruct {
        byte: 1,
        integer: 42,
    };
    let x = unsafe { use_struct(&v as *const FfiStruct) };
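
The following sketch makes the call above runnable without a C toolchain. It assumes a stand-in for use_struct written in Rust (the real one would live in the C library), and the helper call_with_borrow is hypothetical.

```rust
#[repr(C)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}

/// Stand-in for the C function `use_struct`, implemented in Rust only so that
/// this sketch runs on its own.
pub unsafe extern "C" fn use_struct(v: *const FfiStruct) -> u32 {
    // Safety (assumed contract): the caller passes a valid, non-NULL pointer.
    let s = unsafe { &*v };
    s.integer + u32::from(s.byte)
}

/// Hypothetical caller: lends a stack value to the FFI function for one call.
pub fn call_with_borrow() -> u32 {
    let v = FfiStruct { byte: 1, integer: 42 };
    // Converting a reference to a raw pointer is safe; using the pointer is not.
    let p: *const FfiStruct = &v;
    // Safety: `v` outlives the call, and the callee does not keep the pointer.
    unsafe { use_struct(p) }
}
```

Nothing in the type of p records that it is only valid while v is alive; that knowledge now exists solely in the safety comment.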

However, a Rust reference comes with additional constraints around the lifetime of the associated chunk of memory (as described in Item 14), and these constraints get lost in the conversion to a raw pointer. That makes use of raw pointers inherently unsafe, as a marker that Here Be Dragons: the C code on the other side of the FFI boundary could do any number of things that will destroy Rust's memory safety:

  • The C code could hang on to the value of the pointer, and use it at a later point when the associated memory has either been freed from the heap ("use-after-free"), or re-used on the stack.
  • The C code could decide to cast away the const-ness of a pointer that's passed to it, and modify data that Rust expects to be immutable.
  • The C code is not subject to Rust's Mutex protections, so the spectre of data races (Item 17) rears its ugly head.
  • The C code could mistakenly return associated heap memory to the allocator (by calling C's free() library function), meaning that the Rust code might now be performing use-after-free operations.

All of these dangers form part of the cost-benefit analysis of using an existing library via FFI. On the plus side, you get to re-use existing code that's (presumably) in good working order, with only the need to write (or auto-generate) corresponding declarations; on the minus side, you lose the memory protections that are a big reason to use Rust in the first place.

As a first step to reduce the chances of memory-related problems, allocate and free memory on the same side of the FFI boundary. For example, this might appear as a symmetric pair of functions:

/* C functions. */
FfiStruct* new_struct(uint32_t v);
void free_struct(FfiStruct* s);

with corresponding Rust FFI declarations:

extern "C" {
    pub fn new_struct(v: u32) -> *mut FfiStruct;
    pub fn free_struct(s: *mut FfiStruct);
}

To make sure that allocation and freeing are kept in sync, it can be a good idea to implement an RAII wrapper that automatically prevents C-allocated memory from being leaked. The wrapper structure owns the C-allocated memory:

/// Wrapper structure that owns memory allocated by the C library.
struct FfiWrapper {
    // Invariant: inner is non-NULL.
    inner: *mut FfiStruct,
}

and the Drop implementation returns that memory to the C library, to avoid the potential for leaks:

/// Manual implementation of [`Drop`] which ensures that memory allocated by the
/// C library is freed by it.
impl Drop for FfiWrapper {
    fn drop(&mut self) {
        // Safety: `inner` is non-NULL, and besides `free_struct()` copes with
        // NULL pointers.
        unsafe { free_struct(self.inner) }
    }
}

The same principle applies to more than just heap memory; as described in Item 11, implement Drop to apply RAII to FFI-derived resources – open files, database connections, etc.

Encapsulating the interactions with the C library into a wrapper struct also makes it possible to catch some other potential footguns, transforming an otherwise invisible failure into a Result:

impl FfiWrapper {
    pub fn new(val: u32) -> Result<Self, Error> {
        let p: *mut FfiStruct = unsafe { new_struct(val) };
        // Raw pointers are not guaranteed to be non-NULL.
        if p.is_null() {
            Err("Failed to get inner struct!".into())
        } else {
            Ok(Self { inner: p })
        }
    }
}

The wrapper structure can then offer safe methods that allow use of the C library's functionality:

impl FfiWrapper {
    pub fn set_byte(&mut self, b: u8) {
        // Safety: relies on the invariant that `inner` is non-NULL.
        let r: &mut FfiStruct = unsafe { &mut *self.inner };
        r.byte = b;
    }
}

Alternatively, if the underlying C data structure has an equivalent Rust mapping, and if it's safe to directly manipulate that data structure, then implementations of the AsRef and AsMut traits allow more direct use:

impl AsMut<FfiStruct> for FfiWrapper {
    fn as_mut(&mut self) -> &mut FfiStruct {
        // Safety: `inner` is non-NULL.
        unsafe { &mut *self.inner }
    }
}

        let mut wrapper = FfiWrapper::new(42).expect("real code would check");
        wrapper.as_mut().byte = 12;

This example illustrates a useful principle for dealing with FFI: encapsulate access to an unsafe FFI library inside safe Rust code; this allows the rest of the application to follow the advice of Item 16 and avoid writing unsafe code. It also concentrates all of the dangerous code in one place, which you can then study (and test) carefully to uncover problems – and treat as the most likely suspect when something does go wrong.
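
The pieces of the wrapper can be assembled into one self-contained sketch. The assumptions are loud ones: new_struct and free_struct are Rust stand-ins for the C library so that the example runs without a C toolchain, the LIVE counter and demo function exist only to observe the behavior, and the error type is simplified to String.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

#[repr(C)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}

/// Number of live allocations, so the sketch can observe `Drop` at work.
pub static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the C library's `new_struct`.
pub unsafe extern "C" fn new_struct(v: u32) -> *mut FfiStruct {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(FfiStruct { byte: 0, integer: v }))
}

/// Stand-in for the C library's `free_struct`; copes with NULL.
pub unsafe extern "C" fn free_struct(s: *mut FfiStruct) {
    if s.is_null() {
        return;
    }
    LIVE.fetch_sub(1, Ordering::SeqCst);
    // Safety: non-NULL pointers reaching here came from `new_struct`.
    drop(unsafe { Box::from_raw(s) });
}

/// Wrapper structure that owns memory allocated by the (stand-in) library.
pub struct FfiWrapper {
    // Invariant: inner is non-NULL.
    inner: *mut FfiStruct,
}

impl FfiWrapper {
    pub fn new(val: u32) -> Result<Self, String> {
        let p = unsafe { new_struct(val) };
        if p.is_null() {
            Err("Failed to get inner struct!".to_string())
        } else {
            Ok(Self { inner: p })
        }
    }
    pub fn set_byte(&mut self, b: u8) {
        // Safety: `inner` is non-NULL and exclusively owned by `self`.
        unsafe { (*self.inner).byte = b };
    }
    pub fn byte(&self) -> u8 {
        // Safety: `inner` is non-NULL and owned by `self`.
        unsafe { (*self.inner).byte }
    }
}

impl Drop for FfiWrapper {
    fn drop(&mut self) {
        // Safety: `inner` came from `new_struct` and is freed exactly once.
        unsafe { free_struct(self.inner) }
    }
}

/// Returns (byte read back, live allocations before drop, live after drop).
pub fn demo() -> (u8, usize, usize) {
    let mut w = FfiWrapper::new(42).expect("stand-in never returns NULL");
    w.set_byte(12);
    let byte = w.byte();
    let live_before = LIVE.load(Ordering::SeqCst);
    drop(w); // runs `Drop`, handing the memory back via `free_struct`
    (byte, live_before, LIVE.load(Ordering::SeqCst))
}
```

The body of demo contains no unsafe at all: every unsafe block sits inside FfiWrapper, next to the invariant that justifies it.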

Invoking Rust from C

What counts as "foreign" depends on where you're standing; if you're writing an application in C, then it may be a Rust library that's accessed via a foreign function interface.

The basics of exposing a Rust library to C code are similar to the opposite direction:

  • Rust functions that are exposed to C need an extern "C" marker to ensure they're C-compatible.
  • Rust symbols are name mangled [1] by default (like C++), so function definitions also need a #[no_mangle] attribute to ensure that they're accessible via a simple name. This in turn means that the function name is part of a single global namespace that can clash with any other symbol defined in the program. As such, consider using a prefix for exposed names to avoid ambiguities (mylib_...).
  • Data structure definitions need the #[repr(C)] attribute to ensure that the layout of the contents is compatible with an equivalent C data structure.

Also like the opposite direction, more subtle problems arise when dealing with pointers, references and lifetimes. A C pointer is different from a Rust reference, and you forget that at your peril:

#[no_mangle]
pub extern "C" fn add_contents(p: *const FfiStruct) -> u32 {
    let s: &FfiStruct = unsafe { &*p }; // Ruh-roh
    s.integer + s.byte as u32
}

    /* C code invoking Rust. */
    uint32_t result = add_contents(NULL); // Boom!

When you're dealing with raw pointers, it's your responsibility to ensure that any use of them complies with Rust's assumptions and guarantees around references.

#[no_mangle]
pub extern "C" fn add_contents_safer(p: *const FfiStruct) -> u32 {
    let s = match unsafe { p.as_ref() } {
        Some(r) => r,
        None => return 0, // Pesky C code gave us a NULL.
    };
    s.integer + s.byte as u32
}
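
The safer variant can be exercised from Rust itself, with Rust playing the part of the C caller. In this sketch the #[no_mangle] attribute is left off because nothing links against the symbol by name, and the demo helper is hypothetical.

```rust
use std::ptr;

#[repr(C)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}

// A real export would also carry `#[no_mangle]`; it is omitted here because
// the function is only called from Rust in this sketch.
pub extern "C" fn add_contents_safer(p: *const FfiStruct) -> u32 {
    // `as_ref()` turns NULL into `None` instead of an invalid reference.
    let s = match unsafe { p.as_ref() } {
        Some(r) => r,
        None => return 0, // Pesky caller gave us a NULL.
    };
    s.integer + s.byte as u32
}

/// Calls the function once with a valid pointer and once with NULL.
pub fn demo() -> (u32, u32) {
    let v = FfiStruct { byte: 1, integer: 42 };
    (add_contents_safer(&v), add_contents_safer(ptr::null()))
}
```

The NULL check is the only thing as_ref() can verify. A dangling or misaligned pointer still produces undefined behavior, so the function continues to trust its caller for everything else.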

In the examples above, the C code provides a raw pointer to the Rust code, and the Rust code converts it to a reference in order to operate on the structure. But where did that pointer come from? What does the Rust reference refer to?

The very first example in Item 14 showed how Rust's memory safety prevents references to expired stack objects from being returned; those problems reappear if you try to hand out a raw pointer:

// No compilation errors here.
#[no_mangle]
pub extern "C" fn new_struct(v: u32) -> *mut FfiStruct {
    let mut s = FfiStruct::new(v);
    &mut s // return raw pointer to a stack object that's about to expire!
}

Any pointers passed back from Rust to C should generally refer to heap memory, not stack memory. But naively trying to put the object on the heap via a Box doesn't help:

// No compilation errors here either.
#[no_mangle]
pub extern "C" fn new_struct_heap(v: u32) -> *mut FfiStruct {
    let s = FfiStruct::new(v); // create `FfiStruct` on stack
    let mut b = Box::new(s); // move `FfiStruct` to heap
    &mut *b // return raw pointer to a heap object that's about to expire!
}

because the owning Box is on the stack, so when it goes out of scope it will free the heap object.

The tool for the job here is Box::into_raw, which abnegates responsibility for the heap object, effectively "forgetting" about it:

#[no_mangle]
pub extern "C" fn new_struct_raw(v: u32) -> *mut FfiStruct {
    let s = FfiStruct::new(v); // create `FfiStruct` on stack
    let b = Box::new(s); // move `FfiStruct` to heap

    // Consume the `Box` and take responsibility for the heap memory.
    Box::into_raw(b)
}

This of course raises the question of how the heap object now gets freed. The advice above was that allocating and freeing memory should happen on the same side of the FFI boundary, which means that we need to persuade the Rust side of things to do the freeing. The corresponding tool for the job is Box::from_raw, which builds a Box from a raw pointer:

#[no_mangle]
pub extern "C" fn free_struct_raw(p: *mut FfiStruct) {
    let _b = unsafe { Box::from_raw(p) }; // assumes non-NULL
} // `_b` drops at end of scope, freeing the `FfiStruct`

This does still leave the Rust code at the mercy of the C code; if the C code gets confused and asks Rust to free the same pointer twice, Rust's allocator is likely to become terminally confused.
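
The allocate and free pair can be checked end to end with a small sketch. Several things here are assumptions added for illustration: the Drop implementation and DROPS counter exist only to observe when the object is destroyed, the NULL check in free_struct_raw is an extra guard, the struct is built directly because FfiStruct::new is not shown in the text, and #[no_mangle] is omitted since the functions are only called from Rust.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts destroyed `FfiStruct` values (illustration only).
pub static DROPS: AtomicUsize = AtomicUsize::new(0);

#[repr(C)]
pub struct FfiStruct {
    pub byte: u8,
    pub integer: u32,
}

impl Drop for FfiStruct {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

pub extern "C" fn new_struct_raw(v: u32) -> *mut FfiStruct {
    // The `Box` is consumed, so nothing frees the heap object at end of scope.
    Box::into_raw(Box::new(FfiStruct { byte: 0, integer: v }))
}

pub unsafe extern "C" fn free_struct_raw(p: *mut FfiStruct) {
    if p.is_null() {
        return; // tolerate NULL, like C's `free()`
    }
    // Safety (assumed contract): `p` came from `new_struct_raw` and has not
    // been freed already.
    drop(unsafe { Box::from_raw(p) });
}

/// Returns (value read via the pointer, drops before free, drops after free).
pub fn demo() -> (u32, usize, usize) {
    let p = new_struct_raw(7);
    let before = DROPS.load(Ordering::SeqCst);
    // Safety: `p` is valid until it is handed to `free_struct_raw`.
    let value = unsafe { (*p).integer };
    unsafe { free_struct_raw(p) };
    unsafe { free_struct_raw(std::ptr::null_mut()) }; // harmless no-op
    (value, before, DROPS.load(Ordering::SeqCst))
}
```

The NULL check removes one failure mode but does nothing about a double free: a second call with the same non-NULL pointer would still corrupt the allocator.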

That illustrates the general theme of this Item: using FFI exposes you to risks that aren't present in standard Rust. That may well be worthwhile, as long as you're aware of the dangers and costs involved. Controlling the details of what passes across the FFI boundary helps to reduce that risk, but by no means eliminates it.


1: The Rust equivalent of the c++filt tool for translating mangled names back to programmer-visible names is rustfilt.