Exporting Serde types to TypeScript
I built my first web application with Rust and WebAssembly back in 2017. At the
time, support for compiling Rust with the wasm32-unknown-unknown target had
just landed, letting you run Rust code in the browser with few modifications.
The downside was that loading and interacting with WebAssembly might
require you to explicitly allocate and track memory. You
might even need to manually decode UTF-8 strings in JavaScript:
window.Module = {};
fetchAndInstantiate("./factorial.wasm", {})
  .then(mod => {
    Module.fact_str = function(n) {
      let outptr = mod.exports.fact_str(n);
      let result = copyCStr(Module, outptr);
      return result;
    };
  });

// This method reaches into the WebAssembly memory map and copies a
// null-terminated string starting at *ptr. This method also consumes
// the old data, so we tell WebAssembly to deallocate this memory.
function copyCStr(module, ptr) {
  const buffer_as_u8 = new Uint8Array(collectCString(module, ptr));
  const utf8Decoder = new TextDecoder("UTF-8");
  const buffer_as_utf8 = utf8Decoder.decode(buffer_as_u8);
  // Free the string at the address we were handed.
  module.dealloc_str(ptr);
  ...
(Adapted from Hello Rust, a guide for writing Rust and WebAssembly bindings by hand.)
This code looked fragile and prone to copy-and-paste errors. If any of these manual memory allocations or conversions happened at the wrong time, it could even cause Rust code to panic! After I got a demo working, instead of building out a complex API, I decided to abstract as much of this as possible behind a simple message-passing interface that exchanges JSON-encoded strings between WebAssembly and JavaScript.
One upside of this design is that it makes it easier to run and test your
WebAssembly code outside of a web context: rather than boot up a web browser, you can have it receive messages from
a virtual frontend. It also becomes straightforward to delegate WebAssembly code to a
Web Worker and communicate with the main thread using the postMessage
interface.
The tradeoff is that we have to assume all messages sent by
JavaScript are well-formed. Whenever a new message variant is added or changed, the
frontend must manually be kept in sync.
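To make the tradeoff concrete, here is a minimal sketch of such a JSON message channel. The Backend type, the fakeBackend function, and the "Ready" and "Error" reply tags are hypothetical stand-ins invented for illustration; a real backend would call into the WebAssembly module instead.

```typescript
// Hypothetical stand-in for the WebAssembly side: it accepts one
// JSON-encoded message and returns a JSON-encoded reply.
type Backend = (json: string) => string;

// A fake backend that can run outside a browser, in the spirit of the
// "virtual frontend" idea. The reply shapes are invented for this sketch.
const fakeBackend: Backend = (json) => {
  const msg = JSON.parse(json);
  if (msg.tag === "Init") {
    return JSON.stringify({ tag: "Ready", id: msg.id });
  }
  return JSON.stringify({ tag: "Error", reason: "unknown tag: " + msg.tag });
};

// The frontend side. Nothing checks the shape of `msg` at compile time.
function send(backend: Backend, msg: unknown): any {
  return JSON.parse(backend(JSON.stringify(msg)));
}

const okReply = send(fakeBackend, { tag: "Init", id: "editor-1" });
// A misspelled tag compiles without complaint and is only caught
// once the backend tries to interpret it.
const typoReply = send(fakeBackend, { tag: "Innit", id: "editor-1" });
```

Because both sides only ever see strings, the same send function would work whether the backend is a test double, a WebAssembly module, or a Web Worker. The misspelled tag is exactly the kind of mistake the rest of this post tries to catch at compile time.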
What wasm-bindgen exports to TypeScript
It’s now 2019, and the amazing work done by the Rust WebAssembly WG on wasm-bindgen makes it easy to expose functions from Rust, import JavaScript methods, and even manipulate rich data types across the runtime boundary.
Using wasm-bindgen, Rust structs, methods, free functions, and even basic enums
can be exposed to JavaScript. The command-line wasm-bindgen tool can also
generate TypeScript definitions (when you pass the --typescript argument). This
lets you type-check both your Rust code and your frontend components, catching
type errors during compilation rather than at runtime. This is especially
valuable given how hard WebAssembly runtime errors are to debug!
The caveat is that wasm-bindgen only supports a subset of Rust types: those with a corresponding semantic representation in TypeScript. It supports exporting struct fields and impl methods as part of a single class definition, for example, and exporting simple C-like enums as TypeScript enums. If we wanted to export a Rust struct with a single field:
#[wasm_bindgen]
pub struct SimpleMessage {
    pub value: f64,
}
For this type, wasm-bindgen would generate the following TypeScript definition:
export class SimpleMessage {
  free(): void;
  value: number;
}
Note that our one-field struct is exported as a class that is allocated on
WebAssembly’s heap (hence the free() method). Rather than initialize a variable
with an anonymous JavaScript object, we instead have to create an instance of
the SimpleMessage class with new and assign its properties directly.
// expected SimpleMessage, found literal
let message: SimpleMessage = { value: 5 };
// this works though
let message = new SimpleMessage();
message.value = 5;
You also can’t add any fields to the struct definition in Rust whose types do
not implement Copy, which rules out String. The types of objects we can expose
to TypeScript are powerful, but limited in complexity.
wasm-bindgen has these restrictions because it is designed to expose an interface between the two systems, whereas we want to share a data structure. A TypeScript definition that looks exactly like our Rust type when serialized into JSON would enable compile-time type-checking of the messages themselves. With that, we could add new messages in Rust and know that we’ve either handled them explicitly on the frontend or will get a compilation error pointing out the missing type.
#[derive(TypeScriptDefinition)]
Let’s say we want to generate a TypeScript definition for any type which can derive the Serialize trait. The types that can implement this trait include:
- struct types, which can be a unit type (Struct()), newtype Struct(u64), tuple type Struct(u64, String), or struct type Struct { value: u64, name: String }
- C-style enum types, which contain only unit variants with associated discriminant values
- enum types whose variants contain other values, aka sum types. This includes unit, newtype, tuple, or struct variants
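As a rough guide to what a JSON-compatible definition for each of these shapes could look like, here is a sketch based on serde_json's default representations. The Rust types in the comments are hypothetical examples, and the aliases are written by hand for illustration; they are not necessarily what any particular crate emits.

```typescript
// Rust: struct Unit;                        JSON: null
type Unit = null;
// Rust: struct Newtype(u64);                JSON: the inner value
type Newtype = number;
// Rust: struct Tuple(u64, String);          JSON: a fixed-length array
type Tuple = [number, string];
// Rust: struct Named { value: u64, name: String }
type Named = { value: number; name: string };
// Rust: enum Color { Red, Green }           JSON: the variant name
type Color = "Red" | "Green";
// Rust: enum Shape { Empty, Circle(f64), Rect(f64, f64), Label { text: String } }
// serde's default "externally tagged" representation wraps each
// non-unit variant in an object keyed by the variant name.
type Shape =
  | "Empty"
  | { Circle: number }
  | { Rect: [number, number] }
  | { Label: { text: string } };

const unitValue: Unit = null;
const newtypeValue: Newtype = 7;
const tupleValue: Tuple = [7, "seven"];
const namedValue: Named = { value: 7, name: "seven" };
const colorValue: Color = "Red";
const shapes: Shape[] = [
  "Empty",
  { Circle: 1.5 },
  { Rect: [2, 3] },
  { Label: { text: "hi" } },
];
const shapesJson = JSON.stringify(shapes);
```

One caveat with this mapping: a Rust u64 can hold values larger than a JavaScript number can represent exactly, so number is only a faithful stand-in for integers up to 2^53 - 1.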
My first attempt at exporting custom TypeScript definitions was to hack on the
wasm-bindgen crate itself and modify how it generates .d.ts files. It was hard
to distinguish what types should be exported as JSON instead of fully
interactive interfaces. And a Rust type that can implement Serialize isn’t
necessarily supported as a type by wasm-bindgen; for example, support for
things like enums with complex variants (sum types) had to be written from
scratch.
My next idea was to write a stand-alone crate that understood how serde
serialized data and have it print out a JSON-compatible TypeScript definition.
Instead of using constructs like class to model a Rust type, this crate would
export a type alias for an object literal that matched its JSON serialization.
Implementing code in an external crate made it easier to support serde-specific
attributes like #[serde(rename="...")], which changes the structure of the type
(like renaming fields) but only when serialized.
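To illustrate why the serialized form is the one that matters, here is a small sketch. The Rust struct in the comment and its renamed fields lat and lng are hypothetical; the point is only that the TypeScript alias follows the JSON keys, not the Rust field names.

```typescript
// Hypothetical Rust type, shown only for context:
//
//   #[derive(Serialize)]
//   struct Coords {
//       #[serde(rename = "lat")] latitude: f64,
//       #[serde(rename = "lng")] longitude: f64,
//   }
//
// The alias mirrors the JSON that serde would produce, so it uses the
// renamed keys rather than `latitude` and `longitude`.
type Coords = { lat: number; lng: number };

// Parse a JSON string into the alias. The cast trusts the sender; it
// performs no runtime validation.
function parseCoords(json: string): Coords {
  return JSON.parse(json) as Coords;
}

// Unlike a class, a type alias accepts a plain object literal.
const berlin: Coords = { lat: 52.52, lng: 13.405 };
const roundTripped = parseCoords(JSON.stringify(berlin));
```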
All this needed was a way to add strings directly to wasm-bindgen’s TypeScript
pass. Based on Rust’s wasm_custom_section attribute, I submitted a feature to
the wasm-bindgen crate to let you inject strings directly into its generated
TypeScript definition file. You can now use the typescript_custom_section
attribute to inject custom types when running wasm-bindgen:
// This static string will be injected into the TypeScript definition file.
#[wasm_bindgen(typescript_custom_section)]
const TS_APPEND_CONTENT: &'static str = r#"
export type Coords = { "latitude": number, "longitude": number };
"#;
Note that these &'static str values can be generated by a macro, for example a
#[derive()] macro that adds a custom type definition.
Creating such a macro that could parse a Serialize-able type and generate a
type definition was surprisingly easy. Assuming someone had already written a
serde-aware crate that printed a self-describing schema, I went looking and
came across the rust-serde-schema crate (MIT + Apache-2 licensed), which
recursively walks the structure of a Rust type and builds up a context object.
To generate code, it’s even easier to recursively build up a list of tokens
instead. These tokens can then be flattened into a string and inserted by our
macro with the #[wasm_bindgen(typescript_custom_section)] annotation.
Putting these ideas together: you can try my experimental
wasm-typescript-definition crate, which provides a TypeScriptDefinition macro
that exports serde-compatible TypeScript definitions when running wasm-bindgen.
A type only needs to be annotated with the #[derive(TypeScriptDefinition)]
attribute to work, though it makes sense to implement Serialize and Deserialize
as well so it works with serde_json.
A nice demonstration of this macro is with sum types. We can now convert Rust
types that are otherwise inexpressible in JavaScript into data types that
TypeScript understands. If we wanted to define an enum whose variants are
different messages our frontend can receive:
#[derive(TypeScriptDefinition, Serialize, Deserialize)]
#[serde(tag = "tag", content = "fields")]
pub enum FrontendMessage {
    Init { id: String },
    ButtonState { selected: Vec<String>, time: u32 },
    Render { html: String, time: u32 },
}
When we run wasm-bindgen, it generates a type definition that is inlined into the TypeScript definition file:
export type FrontendMessage =
  | { "tag": "Init", "fields": { "id": string, } }
  | { "tag": "ButtonState", "fields": { "selected": Array<string>, "time": number, } }
  | { "tag": "Render", "fields": { "html": string, "time": number, } }
  ;
The pipe operator | in a type definition means that it can be one of a list of different variants, much like enum variants in Rust.
Now when we create a new message in TypeScript like:
let msg: FrontendMessage =
  {"tag": "ButtonState", "fields": {"selected": ["bold", "list"], "time": 0 }};
We can rely on compile-time type-checking to ensure the message is well-formed.
That guarantees it can be serialized into a string with JSON.stringify and
parsed without errors using serde_json::from_str in Rust. It is now possible to
change and introduce new messages to interface between the two runtimes without
the risk of introducing runtime errors.
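Here is a self-contained sketch of that workflow on the TypeScript side. The FrontendMessage alias is copied by hand from the definition above, and encodeMessage is a hypothetical helper name, not part of any library.

```typescript
// Hand-copied from the generated definition, so this block stands alone.
type FrontendMessage =
  | { tag: "Init"; fields: { id: string } }
  | { tag: "ButtonState"; fields: { selected: Array<string>; time: number } }
  | { tag: "Render"; fields: { html: string; time: number } };

// Hypothetical helper: the only way to produce a wire string is to pass
// a value the compiler has already checked against FrontendMessage.
function encodeMessage(msg: FrontendMessage): string {
  return JSON.stringify(msg);
}

const encoded = encodeMessage({
  tag: "ButtonState",
  fields: { selected: ["bold", "list"], time: 0 },
});

// Each of these would be rejected by the compiler if uncommented:
// encodeMessage({ tag: "Button", fields: { selected: [], time: 0 } });          // unknown tag
// encodeMessage({ tag: "ButtonState", fields: { selected: "bold", time: 0 } }); // wrong field type
// encodeMessage({ tag: "Render", fields: { html: "<p>hi</p>" } });             // missing field
```

The check covers the shape of the message, not every constraint of the Rust type. For instance, time is a u32 in Rust, but TypeScript will happily accept -1 or 1.5 for a number, so values like those could still be rejected when Rust deserializes them.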
A primary benefit of wasm-bindgen is to push Rust’s safety guarantees into
frontend code. I think there’s a compelling case for a “serde_typescript” crate
that can generate TypeScript definitions for serializable data types, and has
complete support for all of serde’s custom serialization attributes. For now,
you can check out my GitHub repo for wasm-typescript-definition.
Hope you find it useful!
Bonus: Exhaustive switch statements in TypeScript
Strong typing on the frontend gets us many of the same guarantees we have in
Rust and makes it possible to keep frontend and WebAssembly code in sync. But we
lack one important aspect of Rust’s pattern matching capabilities: its
requirement that all match arms must be “exhaustive” (that is, there are no
unhandled cases).
If we matched against the FrontendMessage enum type in Rust but only handled
the Init and ButtonState variants, our program would fail to compile,
indicating that we are missing a match arm for the third variant, Render. In
TypeScript, we have to work to get the same guarantee. Assuming we choose
serde’s “internally tagged” representation for our enum (for example
#[serde(tag = "tag")], which places each variant’s fields alongside the tag),
we can branch in TypeScript based on which variant it is by doing e.g.:
switch (message.tag) {
  case "Init": ...
  case "ButtonState": ...
  case "Render": ...
}
At build time, the compiler can guarantee that each of the values we use for
“tag” matches the name of an actual variant, which prevents you from
misspelling a message type. But the compiler doesn’t check that your case
statements are exhaustive (that every case is handled) unless you explicitly
guarantee that the “default” code path will never be taken.
For this arcane purpose, TypeScript provides the never type, and its Advanced
Types documentation suggests how you can make a switch statement exhaustively
check all variants of a “discriminated union” (essentially the same as our
“internally tagged” enum representation):
function assertNever(x: never): never {
  throw new Error("Unexpected object: " + x);
}

function sendFrontendMessage(msg: FrontendMessage) {
  // Switch on the same "tag" discriminant used in the example above.
  switch (msg.tag) {
    case "Init": return App.initialize(msg.id);
    case "ButtonState": return App.updateButtons(msg.selected, msg.time);
    case "Render": return App.render(msg.html, msg.time);
    // If we missed an enum variant, TypeScript would now error out here.
    default: return assertNever(msg);
  }
}
The assertNever function accepts an argument with a never type, meaning it
should never be called given a valid FrontendMessage type. As long as this
default: return assertNever(msg); case appears in every switch statement
operating on a “discriminated union”, TypeScript can provide compile-time type
checking that you’ve handled all possible variants. See this TypeScript
playground to try out never types for yourself.
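If you’d rather try this locally, here is a self-contained variant of the switch above. It assumes the internally tagged shape (fields sitting next to "tag"), and it replaces the App calls with a hypothetical summarize function that just returns a string, so it runs without any frontend.

```typescript
// Internally tagged shape, written by hand for this sketch.
type FrontendMessage =
  | { tag: "Init"; id: string }
  | { tag: "ButtonState"; selected: Array<string>; time: number }
  | { tag: "Render"; html: string; time: number };

function assertNever(x: never): never {
  // JSON.stringify gives a readable dump if a malformed value arrives at runtime.
  throw new Error("Unexpected object: " + JSON.stringify(x));
}

// Hypothetical stand-in for the App calls: returns a description instead.
function summarize(msg: FrontendMessage): string {
  switch (msg.tag) {
    case "Init":
      return "init " + msg.id;
    case "ButtonState":
      return "buttons " + msg.selected.join(",") + " at " + msg.time;
    case "Render":
      return "render " + msg.html.length + " chars at " + msg.time;
    // Deleting any case above turns this line into a compile error,
    // because `msg` would no longer narrow to `never` here.
    default:
      return assertNever(msg);
  }
}
```

The default branch also earns its keep at runtime: a value that slipped past the type checker, for example the any returned by JSON.parse, ends up in assertNever and throws instead of being silently ignored.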