Moving from the shell to Rust with commandspec
Almost every project I’ve worked on has grown a shell script named “build.sh”, and not much later a “test.sh” and “run.sh”. At that point, you have to decide as a developer whether your goal is to accidentally reinvent make, or whether your codebase’s needs are better met by a single executable that manages your workflow.
My hobby projects are mostly written in Rust, and there was a brilliant post earlier this year about how to write your own cargo tools using cargo’s command aliases. This works well for individual tools (like your own cargo todo), and has the benefit of needing no additional setup to work out of the box. The downside of this approach, besides discoverability, is that you don’t have a common entry point for your commands. In my case, most of my tooling shares a lot of common abstractions, like debug/release mode switches or common environment variables. Rather than manage a constellation of separate binaries, here’s how I structured a unified build tool for a moderate-sized Rust project.
To build a single entry point for these, I picked up cargo-script and created an executable called “x.rs”, a nod to Rust’s own build script “x.py”. The goal is a build process as straightforward as running “./x.rs build” from the root of the project.
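If you haven’t used cargo-script before: it compiles and runs a single .rs file, with dependencies declared in a manifest comment at the top of the file. A minimal sketch of how such an executable script might start (the dependency list and version numbers here are illustrative, not edit-text’s actual manifest):

#!/usr/bin/env run-cargo-script
//! Dependencies for this script are declared inline:
//!
//! ```cargo
//! [dependencies]
//! clap = "2"
//! failure = "0.1"
//! commandspec = "0.12"
//! ```

fn main() {
    // subcommand dispatch goes here
}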
Design
The project this tool is for is edit-text, a collaborative text editor written in Rust which runs in the browser via WebAssembly. (Here is its ./x.rs script.) Because the editor is collaborative, it needs a server to coordinate editing. Because code can be shared between the server and client, common code is broken out into its own crate. The client can be compiled for the web or run as a standalone command-line binary, so we need to support multiple compilation targets; the frontend is an npm module bundled with webpack; and WebAssembly requires an additional wasm-bindgen step when compiling. Long story short, it has accumulated a lot of build tools spread across many sub-projects.
Instead of having a command line interface that focuses on the action being taken, like how cargo build --bin <target> puts “build” first, we can make it centered on the component itself: ./x.rs server-build, ./x.rs server-watch, and ./x.rs server will respectively build the server, rebuild while polling its source code for changes, and run the binary. This makes server the target of our commands, and makes it especially easy to reverse-i-search (Ctrl+R) in the terminal for previous server commands. When needed, ./x.rs can recursively shell out to itself, as with ./x.rs server-test, which builds and spawns ./x.rs server before running a WebDriver process in the foreground.
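As a rough sketch of the shape this takes, component-centric dispatch is easy to express with clap’s builder API. This is only an illustration, not edit-text’s actual x.rs; the subcommand bodies are stubbed out:

use clap::{App, SubCommand};

fn main() {
    let matches = App::new("x.rs")
        .subcommand(SubCommand::with_name("server-build"))
        .subcommand(SubCommand::with_name("server-watch"))
        .subcommand(SubCommand::with_name("server"))
        .get_matches();

    // Each component-verb pair gets its own arm; adding a one-off
    // subcommand is a single new line here.
    match matches.subcommand_name() {
        Some("server-build") => { /* build the server crate */ }
        Some("server-watch") => { /* rebuild on source changes */ }
        Some("server") => { /* run the server binary */ }
        _ => eprintln!("unknown command"),
    }
}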
The build script is designed to have as few dependencies as possible. It depends on the stellar quicli and clap utilities for writing clean command line interfaces, and the failure crate for ergonomic error handling. This ensures we compile fast the first time ./x.rs is run, and again whenever it’s next modified. (Sidenote: I’ll probably take this a step further and drop quicli, since structopt is less suited than clap’s builder API for adding new one-off subcommands.)
The build script doesn’t have any knowledge of the code contained in its subcrates, so it remains small and quick to bootstrap, and build errors from subcrates don’t propagate up to the build tool. Complex functionality can be broken out into crates of its own. For example, ./x.rs logs is a command that shells out to cargo run --bin edit-server-logs -- {args}. In this way the logic for the log parser can live in the project that understands and writes those logs, and we can independently test and use this functionality outside the build tool. Shipping the edit-server and edit-server-logs binaries to the server is more useful than vendoring the entire build script.
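Using the commandspec macro introduced in the next section, that shell-out is nearly a one-liner. A sketch, assuming args arrives as a Vec<String> of trailing arguments collected by the CLI:

// Hypothetical helper: forwards everything after `./x.rs logs`
// to the dedicated log-parsing binary.
fn logs(args: Vec<String>) -> Result<(), failure::Error> {
    execute!(
        r"
        cargo run --bin edit-server-logs -- {args}
        ",
        args = args,
    )?;
    Ok(())
}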
The last dependency is self-written and provides an abstraction for talking to the shell.
Scripting with commandspec
The challenge of moving from a scripting language to a programming language is that shell scripting is very expressive; expressing the same thing through a child-process API takes far more characters. Because the command line is versatile but permissive, the best strategy for adapting it to a modern programming language is to limit it to a clean subset of features, the way XML is a less permissive HTML. But we also need to increase the expressive power of the shell to make it easy to work with Rust-native data structures; so really we want what JSX is to HTML: a clean, restrictive syntax that allows interpolation.
So I wrote commandspec, a macro_rules! macro that inlines shell syntax into your code without requiring it to run in an unsafe shell subprocess. This makes porting from the shell familiar, incremental, and robust. Say we start with this build.sh script:
git submodule update --init --recursive
cd frontend
npm install
npm run build
When we need a more flexible language, we can embed this script directly into a Rust program:
sh_execute!(r"
git submodule update --init --recursive
cd frontend
npm install
npm run build
")?;
This still runs code directly in a shell process, so we can make it safer by invoking these programs directly. What our build script is doing is running three commands and making one environment change (changing directories). If we consider each {directory, environment, command} group separately, we can rewrite each of these to use the execute! macro, a safe wrapper over familiar shell syntax:
execute!(r"
git submodule update --init --recursive
")?;
execute!(r"
cd frontend
npm install
")?;
execute!(r"
cd frontend
npm run build
")?;
Each step breaks the script into its component parts. You can add one cd instruction, multiple export name=value environment variables, and then ultimately your shell command. But the DSL parses this into the std::process::Command API underneath, which is cross-platform and doesn’t require a shell process at all:
// This executes the native rustc command directly, without going
// through cmd.exe or sh; it prints the version and returns Ok(())
execute!(r"
rustc -V
")?;
From here, Rust-native abstractions can be added (like using the std::fs APIs instead of mv or cp commands), or the commands can be left alone in cases where you’d just be replacing them with the equivalent std::process::Command constructor.
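For instance, a one-line mv becomes a one-line (and cross-platform) std::fs call; the paths here are hypothetical:

use std::fs;

// Equivalent to the shell command: mv frontend/dist/app.js static/app.js
fs::rename("frontend/dist/app.js", "static/app.js")?;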
A benefit of this pattern is that our commands can be copied almost verbatim into a new shell session and tested in isolation. The syntax is also similar enough to non-POSIX shells (namely PowerShell and cmd.exe) that you can trivially rewrite it to run in those terminals.
The execute! macro wraps the command! macro, which is identical except that it returns a std::process::Command object directly. It then calls .execute(), a new trait method on Command that returns Result<(), CommandError>, the error being an object that wraps non-zero exit codes and filesystem or interrupt errors. Thus we can use execute!(...)? to return immediately for any command that is not successful (any non-zero exit code), or we can decompose the error as an enum, or we can call .unwrap_err().error_code() to get the exit code directly.
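A sketch of what handling that result might look like, assuming error_code() on CommandError behaves as described above and its return type is printable with {:?}:

match execute!(r"
cargo build
") {
    Ok(()) => println!("build succeeded"),
    // Decompose the CommandError to get at the exit code.
    Err(err) => eprintln!("build failed with code {:?}", err.error_code()),
}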
It’s straightforward to combine the command! macro with the full std::process::Command API (or even something more complex, like os_pipe):
let cmd = command!(r"
tar -xvf -
")?
.stdin(Stdio::piped())
.spawn()?;
cmd.stdin.write(/* my precious, piped data */);
commandspec works on stable Rust. This is not a hard requirement for me (edit-text absolutely requires nightly for now), but it’s a good requirement for writing maintainable code generally. It accomplishes this by internally leveraging format!’s macro syntax. As such, you can use the familiar {} syntax and get compile-time type checking that your code will work. Unlike format!, commandspec escapes every string argument passed to the format function so it’s safe for the shell:
execute!(
r"
export RUST_BACKTRACE=1
cd my-server
cargo run {release} -- {custom_args}
",
// Pass options that will be omitted if None
release = if not_debug { Some("--release") } else { None },
// Pass vectors to expand to multiple arguments
custom_args = vec!["./file1", "./file2"],
)?;
All arguments are embedded as strings. This unfortunately breaks formatting features aside from positional ({}), indexed ({0}), or named ({release}) embedding. Since the command line treats strings and other types indiscriminately, individually format!-ing arguments is an easy workaround.
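For example, a numeric argument can be formatted into a string before being embedded; the jobs count here is just for illustration:

let jobs = 4;
execute!(
    r"
cargo build -j {jobs}
",
    // Pre-format the number into a string, then embed it as usual.
    jobs = format!("{}", jobs),
)?;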
Even with the release of Macros 2.0, I think this is a good balance for embedding shell scripts. If we removed the quotes from the macro, as in execute!(cargo run --release)?, we’d have to distinguish --release from the raw tokens -, -, and release, which is non-trivial.
Takeaways
My approach to build scripts here evolved over about six months on one project, so I assume it’ll change in the future. However, my takeaways are that building should be 1) easy to refactor and 2) easy to interface with. To encourage better abstractions than shell scripting, it should be easy, if not pleasant, to work with executables through a programmatic API. Using make as a script runner and compilation tool is fundamentally a worse interface than splitting your build system into a project-centric interface backed by well-tested, domain-tailored build tools.
This post is also a shoutout to my favorite Rust tool, cargo-script. It’s brilliant for all sorts of one-off scripts, but I found it also holds up well to the task of managing entire workflows. When I started using cargo-script, it had invalidation issues that caused my script to be recompiled each time it was run. This no longer seems to be the case, and running command line scripts (ones small enough to compile quickly) feels similar to the workflow of an interpreted language. If you haven’t legitimately considered Rust for scripting, quicli + cargo-script present a very flattering case.