Item 25: Manage your dependency graph

Like most modern programming languages, Rust makes it easy to pull in external libraries, in the form of crates. Most nontrivial Rust programs use external crates, and those crates may themselves have additional dependencies, forming a dependency graph for the program as a whole.

By default, Cargo will download any crates named in the [dependencies] section of your Cargo.toml file from crates.io and find versions of those crates that match the requirements configured in Cargo.toml.

A few subtleties lurk underneath this simple statement. The first thing to notice is that crate names from crates.io form a single flat namespace, and this global namespace also overlaps with the names of features in a crate (see Item 26).[1]

If you're planning on publishing a crate on crates.io, be aware that names are generally allocated on a first-come, first-served basis; so you may find that your preferred name for a public crate is already taken. However, name-squatting—reserving a crate name by preregistering an empty crate—is frowned upon, unless you really are going to release code in the near future.

As a minor wrinkle, there's also a slight difference between what's allowed as a crate name in the crates namespace and what's allowed as an identifier in code: a crate can be named some-crate, but it will appear in code as some_crate (with an underscore). To put it another way: if you see some_crate in code, the corresponding crate name may be either some-crate or some_crate.
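The mapping from package name to code identifier is simple enough to sketch. The helper below is purely illustrative (crate_ident is a hypothetical name, not part of Cargo), and it assumes the dependency has not been renamed in Cargo.toml:

```rust
/// Map a crates.io package name to the identifier that appears in Rust code.
/// Hypothetical helper for illustration only; hyphens become underscores.
fn crate_ident(package_name: &str) -> String {
    package_name.replace('-', "_")
}
```

Because both some-crate and some_crate map to the same identifier, the mapping cannot be reversed from the code alone; Cargo.toml is the place to check which spelling is in use.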

The second subtlety to understand is that Cargo allows multiple semver-incompatible versions of the same crate to be present in the build. This can seem surprising to begin with, because each Cargo.toml file can have only a single version of any given dependency, but the situation frequently arises with indirect dependencies: your crate depends on some-crate version 3.x but also depends on older-crate, which in turn depends on some-crate version 1.x.

This can lead to confusion if the dependency is exposed in some way rather than just being used internally (Item 24)—the compiler will treat the two versions as being distinct crates, but its error messages won't necessarily make that clear.
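The sketch below imitates this situation inside a single file, with two modules standing in for the two versions of a hypothetical some-crate. The Token type and every function name here are invented for illustration; in a real build, the two definitions would come from separately compiled copies of the crate.

```rust
// Stand-ins for two semver-incompatible versions of a hypothetical crate.
mod some_crate_v1 {
    #[derive(Debug, PartialEq)]
    pub struct Token(pub u32);
}
mod some_crate_v3 {
    #[derive(Debug, PartialEq)]
    pub struct Token(pub u32);
}

// Stand-in for `older-crate`, which exposes the 1.x type in its public API.
fn older_crate_make_token() -> some_crate_v1::Token {
    some_crate_v1::Token(7)
}

// Our own code, which was written against the 3.x type.
fn consume(token: some_crate_v3::Token) -> u32 {
    token.0
}

// The two `Token` types are unrelated as far as the compiler is concerned,
// so passing one where the other is expected needs an explicit conversion.
fn bridge(old: some_crate_v1::Token) -> some_crate_v3::Token {
    some_crate_v3::Token(old.0)
}
```

Calling consume(older_crate_make_token()) directly would be rejected with a mismatched-types error, even though both types are called Token; going through bridge makes the conversion explicit.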

Allowing multiple versions of a crate can also go wrong if the crate includes C/C++ code accessed via Rust's FFI mechanisms (Item 34). The Rust toolchain can internally disambiguate distinct versions of Rust code, but any included C/C++ code is subject to the one definition rule: there can be only a single version of any function, constant, or global variable.

There are restrictions on Cargo's multiple-version support. Cargo does not allow multiple versions of the same crate within a semver-compatible range (Item 21):

  • some-crate 1.2 and some-crate 3.1 can coexist
  • some-crate 1.2 and some-crate 1.3 cannot

Cargo also extends the semantic versioning rules for pre-1.0 crates so that the leftmost non-zero component of the version counts like a major version, so a similar constraint applies:

  • other-crate 0.1.2 and other-crate 0.2.0 can coexist
  • other-crate 0.1.2 and other-crate 0.1.4 cannot
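The two lists above follow a single rule, which can be sketched as a small function. This is a simplified model rather than Cargo's actual code: versions are plain (major, minor, patch) triples, pre-release tags are ignored, and the names compat_key and can_coexist are hypothetical.

```rust
/// A (major, minor, patch) version triple.
type Version = (u64, u64, u64);

/// The part of a version that decides semver compatibility under Cargo's
/// rules: everything up to and including the leftmost non-zero component.
fn compat_key(v: Version) -> Version {
    match v {
        (0, 0, patch) => (0, 0, patch),
        (0, minor, _) => (0, minor, 0),
        (major, _, _) => (major, 0, 0),
    }
}

/// Two versions of the same crate can both appear in a build only if they
/// are semver-incompatible, that is, if their compatibility keys differ.
fn can_coexist(a: Version, b: Version) -> bool {
    compat_key(a) != compat_key(b)
}
```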

Cargo's version selection algorithm does the job of figuring out what versions to include. Each Cargo.toml dependency line specifies an acceptable range of versions, according to semantic versioning rules, and Cargo takes this into account when the same crate appears in multiple places in the dependency graph. If the acceptable ranges overlap and are semver-compatible, then Cargo will (by default) pick the most recent version of the crate within the overlap. If there is no semver-compatible overlap, then Cargo will build multiple copies of the dependency at different versions.
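A heavily simplified sketch of that decision for two requirements on the same crate might look like the following. It is not Cargo's real resolver, which handles the whole graph and can backtrack. All names here (Range, overlap, newest_in, select) are hypothetical, and the sketch assumes both ranges come from default (caret) requirements, so that non-overlapping ranges are always semver-incompatible.

```rust
/// A (major, minor, patch) version triple.
type Version = (u64, u64, u64);

/// Half-open range `[lo, hi)` of acceptable versions for one dependency line.
#[derive(Clone, Copy)]
struct Range {
    lo: Version,
    hi: Version,
}

/// The intersection of two ranges, if it is non-empty.
fn overlap(a: Range, b: Range) -> Option<Range> {
    let lo = a.lo.max(b.lo);
    let hi = a.hi.min(b.hi);
    if lo < hi { Some(Range { lo, hi }) } else { None }
}

/// The most recent published version that falls inside `range`.
fn newest_in(range: Range, available: &[Version]) -> Option<Version> {
    available
        .iter()
        .copied()
        .filter(|v| *v >= range.lo && *v < range.hi)
        .max()
}

/// Versions to build for two requirements on the same crate.
fn select(a: Range, b: Range, available: &[Version]) -> Vec<Version> {
    match overlap(a, b) {
        // Overlapping ranges: a single copy, the newest version in the overlap
        // (empty if nothing suitable has been published).
        Some(shared) => newest_in(shared, available).into_iter().collect(),
        // No overlap: one copy per requirement.
        None => [a, b]
            .iter()
            .filter_map(|r| newest_in(*r, available))
            .collect(),
    }
}
```

With published versions 1.2.0, 1.4.0, 1.5.1, 3.0.0, and 3.1.0, requirements of "1.2" and "1.4" share a single copy at 1.5.1, whereas requirements of "1.2" and "3" result in two copies, at 1.5.1 and 3.1.0.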

Once Cargo has picked acceptable versions for all dependencies, its choices are recorded in the Cargo.lock file. Subsequent builds will then reuse the choices encoded in Cargo.lock so that the build is stable and no new downloads are needed.

This leaves you with a choice: should you commit your Cargo.lock files into version control or not? The advice from the Cargo developers is as follows:

  • Things that produce a final product, namely applications and binaries, should commit Cargo.lock to ensure a deterministic build.
  • Library crates should not commit a Cargo.lock file, because it's irrelevant to downstream consumers of the library: they will have their own Cargo.lock file, and the Cargo.lock file for a library crate is ignored by library users.

Despite this advice, even for a library crate it can be helpful to have a checked-in Cargo.lock file to ensure that regular builds and CI (Item 32) don't have a moving target. Although the promises of semantic versioning (Item 21) should prevent failures in theory, mistakes happen in practice, and it's frustrating to have builds that fail because someone somewhere recently changed a dependency of a dependency.

However, if you version-control Cargo.lock, set up a process to handle upgrades (such as GitHub's Dependabot). If you don't, your dependencies will stay pinned to versions that get older, outdated, and potentially insecure.

Pinning versions with a checked-in Cargo.lock file doesn't avoid the pain of handling dependency upgrades, but it does mean that you can handle them at a time of your own choosing, rather than immediately when the upstream crate changes. There's also some fraction of dependency-upgrade problems that go away on their own: a crate that's released with a problem often gets a second, fixed, version released in a short space of time, and a batched upgrade process might see only the latter version.

The third subtlety of Cargo's resolution process to be aware of is feature unification: the features that get activated for a dependent crate are the union of the features selected by different places in the dependency graph; see Item 26 for more details.
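As a rough model of what "union" means here, the sketch below merges the feature lists requested by different dependents of one crate. The function name unify_features and the feature names used with it are hypothetical, and the model ignores default features and features that enable other features.

```rust
use std::collections::BTreeSet;

/// Combine the feature lists requested for one crate by each of its
/// dependents into the single set of features the crate is built with.
fn unify_features<'a>(requests: &[&[&'a str]]) -> BTreeSet<&'a str> {
    requests
        .iter()
        .flat_map(|features| features.iter().copied())
        .collect()
}
```

If one dependent asks for ["derive"] and another asks for ["std", "derive"], the crate is built once with both derive and std enabled, and both dependents see that same build.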

Version Specification

The version specification clause for a dependency defines a range of allowed versions, according to the rules explained in the Cargo book. When choosing that range, steer clear of both extremes:

  • Avoid a too-specific version dependency: Pinning to a specific version ("=1.2.3") is usually a bad idea: you don't see newer versions (potentially including security fixes), and you dramatically narrow the potential overlap range with other crates in the graph that rely on the same dependency (recall that Cargo allows only a single version of a crate to be used within a semver-compatible range). If you want to ensure that your builds use a consistent set of dependencies, the Cargo.lock file is the tool for the job.
  • Avoid a too-general version dependency: It's possible to specify a version dependency ("*") that allows any version of the dependency to be used, but it's a bad idea. If the dependency releases a new major version of the crate that completely changes every aspect of its API, it's unlikely that your code will still work after a cargo update pulls in the new version.

The most common Goldilocks specification—not too precise, not too vague—is to allow semver-compatible versions ("1") of a crate, possibly with a specific minimum version that includes a feature or fix that you require ("1.4.23").

Both of these version specifications make use of Cargo's default behavior, which is to allow versions that are semver-compatible with the specified version. You can make this more explicit by adding a caret:

  • A version of "1" is equivalent to "^1", which allows all 1.x versions (and so is also equivalent to "1.*").
  • A version of "1.4.23" is equivalent to "^1.4.23", which allows any 1.x version that is 1.4.23 or later.
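The range that a caret requirement allows can be sketched as a lower bound plus an exclusive upper bound. This is a simplified model, not Cargo's implementation: the names caret_range and caret_allows are hypothetical, the requirement is assumed to have been expanded to three components (so "1" becomes 1.0.0), and pre-release versions are ignored.

```rust
/// A (major, minor, patch) version triple.
type Version = (u64, u64, u64);

/// Half-open range `[lower, upper)` allowed by a caret requirement.
/// The upper bound bumps the leftmost non-zero component of the requirement.
fn caret_range(req: Version) -> (Version, Version) {
    let upper = match req {
        (0, 0, patch) => (0, 0, patch + 1),
        (0, minor, _) => (0, minor + 1, 0),
        (major, _, _) => (major + 1, 0, 0),
    };
    (req, upper)
}

/// Whether `candidate` satisfies the caret requirement `req`.
fn caret_allows(req: Version, candidate: Version) -> bool {
    let (lower, upper) = caret_range(req);
    lower <= candidate && candidate < upper
}
```

Under this model, "^1.4.23" accepts 1.4.23 and 1.5.0 but rejects both 1.4.22 (too old) and 2.0.0 (semver-incompatible).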

Solving Problems with Tooling

Item 31 recommends that you take advantage of the range of tools that are available within the Rust ecosystem. This section describes some dependency graph problems where tools can help.

The compiler will tell you pretty quickly if you use a dependency in your code but don't include that dependency in Cargo.toml. But what about the other way around? If there's a dependency in Cargo.toml that you don't use in your code—or more likely, no longer use in your code—then Cargo will go on with its business. The cargo-udeps tool is designed to solve exactly this problem: it warns you when your Cargo.toml includes an unused dependency ("udep").

A more versatile tool is cargo-deny, which analyzes your dependency graph to detect a variety of potential problems across the full set of transitive dependencies:

  • Dependencies that have known security problems in the included version
  • Dependencies that are covered by an unacceptable license
  • Dependencies that are just unacceptable
  • Dependencies that are included in multiple different versions across the dependency tree

Each of these features can be configured and can have exceptions specified. The exception mechanism is usually needed for larger projects, particularly the multiple-version warning: as the dependency graph grows, so does the chance of transitively depending on different versions of the same crate. It's worth trying to reduce these duplicates where possible—for binary-size and compilation-time reasons if nothing else—but sometimes there is no possible combination of dependency versions that can avoid a duplicate.

These tools can be run as a one-off, but it's better to ensure they're executed regularly and reliably by including them in your CI system (Item 32). This helps to catch newly introduced problems—including problems that may have been introduced outside of your code, in an upstream dependency (for example, a newly reported vulnerability).

If one of these tools does report a problem, it can be difficult to figure out exactly where in the dependency graph the problem arises. The cargo tree command that's included with cargo helps here, as it shows the dependency graph as a tree structure:

dep-graph v0.1.0
├── dep-lib v0.1.0
│   └── rand v0.7.3
│       ├── getrandom v0.1.16
│       │   ├── cfg-if v1.0.0
│       │   └── libc v0.2.94
│       ├── libc v0.2.94
│       ├── rand_chacha v0.2.2
│       │   ├── ppv-lite86 v0.2.10
│       │   └── rand_core v0.5.1
│       │       └── getrandom v0.1.16 (*)
│       └── rand_core v0.5.1 (*)
└── rand v0.8.3
    ├── libc v0.2.94
    ├── rand_chacha v0.3.0
    │   ├── ppv-lite86 v0.2.10
    │   └── rand_core v0.6.2
    │       └── getrandom v0.2.3
    │           ├── cfg-if v1.0.0
    │           └── libc v0.2.94
    └── rand_core v0.6.2 (*)

cargo tree includes a variety of options that can help to solve specific problems, such as these:

  • --invert: Shows what depends on a specific package, helping you to focus on a particular problematic dependency
  • --edges features: Shows what crate features are activated by a dependency link, which helps you figure out what's going on with feature unification (Item 26)
  • --duplicates: Shows crates that have multiple versions present in the dependency graph

What to Depend On

The previous sections have covered the more mechanical aspect of working with dependencies, but there's a more philosophical (and therefore harder-to-answer) question: when should you take on a dependency?

Most of the time, there's not much of a decision involved: if you need the functionality of a crate, you need that function, and the only alternative would be to write it yourself.[2]

But every new dependency has a cost, partly in terms of longer builds and bigger binaries but mostly in terms of the developer effort involved in fixing problems with dependencies when they arise.

The bigger your dependency graph, the more likely you are to be exposed to these kinds of problems. The Rust crate ecosystem is just as vulnerable to accidental dependency problems as other package ecosystems, where history has shown that one developer removing a package or a team fixing the licensing for their package can have widespread knock-on effects.

More worrying still are supply chain attacks, where a malicious actor deliberately tries to subvert commonly used dependencies, whether by typo-squatting, hijacking a maintainer's account, or other more sophisticated attacks.

This kind of attack doesn't just affect your compiled code—be aware that a dependency can run arbitrary code at build time, via build.rs scripts or procedural macros (Item 28). That means that a compromised dependency could end up running a cryptocurrency miner as part of your CI system!

So for dependencies that are more "cosmetic", it's sometimes worth considering whether adding the dependency is worth the cost.

The answer is usually "yes", though; in the end, the amount of time spent dealing with dependency problems ends up being much less than the time it would take to write equivalent functionality from scratch.

Things to Remember

  • Crate names on crates.io form a single flat namespace (which is shared with feature names).
  • Crate names can include a hyphen, but it will appear as an underscore in code.
  • Cargo supports multiple versions of the same crate in the dependency graph, but only if those versions are semver-incompatible. This can go wrong for crates that include FFI code.
  • Prefer to allow semver-compatible versions of dependencies ("1", or "1.4.23" to include a minimum version).
  • Use Cargo.lock files to ensure your builds are repeatable, but remember that the Cargo.lock file does not ship with a published crate.
  • Use tooling (cargo tree, cargo deny, cargo udeps, …) to help find and fix dependency problems.
  • Understand that pulling in dependencies saves you writing code but doesn't come for free.

[1] It's also possible to configure an alternate registry of crates (for example, an internal corporate registry). Each dependency entry in Cargo.toml can then use the registry key to indicate which registry a dependency should be sourced from.

[2] If you are targeting a no_std environment, this choice may be made for you: many crates are not compatible with no_std, particularly if alloc is also unavailable (Item 33).