Exporting Serde types to TypeScript

I built my first web application with Rust and WebAssembly back in 2017. At the time, support for compiling Rust with the wasm32-unknown-unknown target had just landed, letting you run Rust code in the browser with few modifications. The downside was that loading and interacting with WebAssembly might require you to explicitly allocate and track memory. You might even need to manually decode UTF-8 strings in JavaScript:

window.Module = {};
fetchAndInstantiate("./factorial.wasm", {})
.then(mod => {
  Module.fact_str = function(n) {
    let outptr = mod.exports.fact_str(n);
    let result = copyCStr(Module, outptr);
    return result;
  };
})

// This method reaches into the WebAssembly memory map and copies a
// null-terminated string starting at *ptr. This method also consumes
// the old data, so we tell WebAssembly to deallocate this memory.
function copyCStr(module, ptr) {
  const buffer_as_u8 = new Uint8Array(collectCString(module, ptr))
  const utf8Decoder = new TextDecoder("UTF-8");
  const buffer_as_utf8 = utf8Decoder.decode(buffer_as_u8);
  module.dealloc_str(ptr);
  ...

(Adapted from Hello Rust, a guide for writing Rust and WebAssembly bindings by hand.)

This code looked fragile and prone to copy-and-paste errors. If any of these manual memory allocations or conversions happened at the wrong time, it could even cause Rust code to panic! After I got a demo working, instead of building out a complex API, I decided to abstract as much of this as possible behind a simple message-passing interface that exchanges JSON-encoded strings between WebAssembly and JavaScript.

One upside of this design is that it makes it easier to run and test your WebAssembly code outside of a web context: rather than boot up a web browser, you can have it receive messages from a virtual frontend. It also becomes straightforward to delegate WebAssembly code to a Web Worker and communicate with the main thread using the postMessage interface. The tradeoff is that we have to assume all messages sent by JavaScript are well-formed. Whenever a new message variant is added or changed, the frontend must manually be kept in sync.
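As a rough sketch of this design, the frontend side of the boundary can be reduced to a single function that exchanges JSON strings. The names BackendRequest, BackendReply, Backend, fakeBackend, and sendMessage are hypothetical stand-ins made up for this example; in a real app the backend would be the WebAssembly module, or a Web Worker reached through postMessage.

```typescript
// Hypothetical message shapes; nothing here checks that they match the Rust side.
type BackendRequest = { kind: "Factorial"; n: number };
type BackendReply = { kind: "Result"; value: string };

// The whole boundary is one function from a JSON string to a JSON string.
type Backend = (json: string) => string;

// A stand-in for the WebAssembly module, useful for testing without a browser.
const fakeBackend: Backend = (json) => {
  const req = JSON.parse(json) as BackendRequest;
  let acc = 1;
  for (let i = 2; i <= req.n; i++) acc *= i;
  const res: BackendReply = { kind: "Result", value: String(acc) };
  return JSON.stringify(res);
};

// Encode the request, pass it across the boundary, decode the reply.
function sendMessage(backend: Backend, req: BackendRequest): BackendReply {
  // The cast is unchecked: this is the "assume messages are well-formed" tradeoff.
  return JSON.parse(backend(JSON.stringify(req))) as BackendReply;
}

const reply = sendMessage(fakeBackend, { kind: "Factorial", n: 5 });
console.log(reply.value); // prints "120"
```

Because the casts after JSON.parse are never verified, a typo in either message shape would only show up at runtime, which is the weakness the rest of this post tries to remove.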

What wasm-bindgen exports to TypeScript

It’s now 2019, and the amazing work done by the Rust WebAssembly WG on wasm-bindgen makes it easy to expose functions from Rust, import JavaScript methods, and even manipulate rich data types across the runtime boundary.

Using wasm-bindgen, Rust structs, methods, free functions, and even basic enums can be exposed to JavaScript. The command-line wasm-bindgen tool can also generate TypeScript definitions (when you pass the --typescript argument). This lets you type-check both your Rust code and your frontend components at the same time, catching type errors during compilation rather than at runtime. This is especially valuable given how hard WebAssembly runtime errors are to debug!

The caveat is that wasm-bindgen only supports a subset of Rust types: those with a corresponding semantic representation in TypeScript. It supports exporting struct fields and impl methods as part of a single class definition, for example, and exporting simple C-like enums as TypeScript enums. If we wanted to export a Rust struct with a single field:

#[wasm_bindgen]
pub struct SimpleMessage {
    pub value: f64,
}

For this type, wasm-bindgen would generate the following TypeScript definition:

export class SimpleMessage {
  free(): void;
  value: number;
}

Note that our one-field struct is exported as a class that is allocated on WebAssembly’s heap (hence the free() method). Rather than initialize a variable with an anonymous JavaScript object, we instead have to instantiate a member of the SimpleMessage class with new and assign its properties directly.

// expected SimpleMessage, found literal
let message: SimpleMessage = { value: 5 }; 

// this works though
let message = new SimpleMessage();
message.value = 5;

You also can’t add public fields to the struct definition in Rust whose types do not implement Copy, which rules out String. The types of objects we can expose to TypeScript are powerful, but limited in complexity.

wasm-bindgen has these restrictions because it is designed to expose an interface between the two systems, whereas we want to share a data structure. A TypeScript definition that looks exactly like our Rust type when serialized into JSON would enable compile-time type checking. With that, we could add new messages to Rust and know that we have either handled them explicitly on the frontend or will get a compilation error pointing out the missing type.

#[derive(TypeScriptDefinition)]

Let’s say we want to generate a TypeScript definition for any type which can derive the Serialize trait. The types that can implement this trait include:

struct types, which can be a unit type (Struct), a newtype Struct(u64), a tuple type Struct(u64, String), or a struct type Struct { value: u64, name: String }
C-style enum types, which contain only unit variants, optionally with explicit discriminant values
enum types whose variants contain other values, also known as sum types. Their variants can be unit, newtype, tuple, or struct variants
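For reference, here is roughly how each of these shapes looks once serialized with serde_json under serde's default (externally tagged) enum representation, written as hand-made TypeScript aliases. The Rust definitions in the comments and every type name below are illustrative assumptions, not the output of any tool.

```typescript
// struct Unit;                        serializes as null
type UnitStruct = null;
// struct Newtype(u64);                serializes as the inner value
type NewtypeStruct = number;
// struct Pair(u64, String);           serializes as an array
type TupleStruct = [number, string];
// struct Named { value: u64, name: String }   serializes as an object
type NamedStruct = { value: number; name: string };

// enum Color { Red, Green }           unit variants serialize as strings
type CStyleEnum = "Red" | "Green";

// enum Shape { Empty, Circle(f64), Rect(f64, f64), Label { text: String } }
type SumType =
  | "Empty"
  | { Circle: number }
  | { Rect: [number, number] }
  | { Label: { text: string } };

// JSON text as serde_json would produce it for Shape::Rect(2.0, 3.0).
const parsedShape = JSON.parse('{"Rect":[2.0,3.0]}') as SumType;

// Narrow on the single key that names the variant.
function area(shape: SumType): number {
  if (shape === "Empty") return 0;
  if ("Circle" in shape) return Math.PI * shape.Circle * shape.Circle;
  if ("Rect" in shape) return shape.Rect[0] * shape.Rect[1];
  return 0; // Label has no area
}

console.log(area(parsedShape)); // prints 6
```

One caveat with this mapping: a u64 larger than 2^53 cannot be represented exactly as a JavaScript number, so number is only an approximation for wide integer types.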

My first attempt at exporting custom TypeScript definitions was to hack on the wasm-bindgen crate itself and modify how it generates .d.ts files. It was hard to distinguish what types should be exported as JSON instead of fully interactive interfaces. And a Rust type that can implement Serialize isn’t necessarily supported as a type by wasm-bindgen; for example, support for things like enums with complex variants (sum types) had to be written from scratch.

My next idea was to write a stand-alone crate that understood how serde serialized data and have it print out a JSON-compatible TypeScript definition. Instead of using constructs like class to model a Rust type, this crate would export a type alias for an object literal that matched its JSON serialization. Implementing code in an external crate made it easier to support serde-specific attributes like #[serde(rename="...")], which changes the structure of the type (like renaming fields) but only when serialized.
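To illustrate why the serialized form is what matters, consider a hypothetical Rust struct whose fields are renamed by serde. The struct, its attributes, and the RenamedCoords alias below are made up for this example; the point is only that the TypeScript alias has to follow the JSON keys, not the Rust field names.

```typescript
// Hypothetical Rust source:
//
//   #[derive(Serialize)]
//   struct Coords {
//       #[serde(rename = "lat")]
//       latitude: f64,
//       #[serde(rename = "lng")]
//       longitude: f64,
//   }
//
// The alias mirrors the JSON keys ("lat", "lng"), not the Rust field names.
type RenamedCoords = { lat: number; lng: number };

// JSON as serde_json would emit it for Coords { latitude: 52.5, longitude: 13.4 }.
const coords = JSON.parse('{"lat":52.5,"lng":13.4}') as RenamedCoords;

console.log(coords.lat, coords.lng);
console.log("latitude" in coords); // prints false: the Rust name never reaches JSON
```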

All this needed was a way to add strings directly to wasm-bindgen’s TypeScript pass. Based on Rust’s wasm_custom_section attribute, I submitted a feature to the wasm-bindgen crate to let you inject strings directly into its generated TypeScript definition file. You can now use the typescript_custom_section attribute to inject custom types when running wasm-bindgen:

// This static string will be injected into the TypeScript definition file. 
#[wasm_bindgen(typescript_custom_section)]
const TS_APPEND_CONTENT: &'static str = r#"

export type Coords = { "latitude": number, "longitude": number }; 

"#;

Note that these &'static str values can be dynamically generated by a macro, for example a #[derive()] macro that adds a custom type definition.

Creating such a macro that could parse a Serialize-able type and generate a type definition was surprisingly easy. Assuming someone had already written a serde-aware crate that printed a self-describing schema, I came across the rust-serde-schema crate (MIT + Apache-2 licensed) which recursively walks the structure of a Rust type and builds up a context object. To generate code, it’s even easier to recursively build up a list of tokens instead. These tokens can then be flattened into a string and inserted by our macro with the #[wasm_bindgen(typescript_custom_section)] annotation.
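The recursive approach can be sketched in a few lines. The version below is written in TypeScript rather than as a Rust procedural macro, and the Schema type and emit function are invented for this illustration; they are not the API of rust-serde-schema or of my crate. It only shows the idea of walking a type description and collecting tokens that are flattened into a string at the end.

```typescript
// A toy description of a serializable type (hypothetical, for illustration).
type Schema =
  | { kind: "number" }
  | { kind: "string" }
  | { kind: "array"; item: Schema }
  | { kind: "object"; fields: [string, Schema][] };

// Recursively push the tokens for `schema` onto `out`.
function emit(schema: Schema, out: string[]): void {
  switch (schema.kind) {
    case "number":
    case "string":
      out.push(schema.kind);
      break;
    case "array":
      out.push("Array", "<");
      emit(schema.item, out);
      out.push(">");
      break;
    case "object":
      out.push("{");
      for (const [key, value] of schema.fields) {
        out.push(" ", JSON.stringify(key), ": ");
        emit(value, out);
        out.push(",");
      }
      out.push(" }");
      break;
  }
}

const buttonState: Schema = {
  kind: "object",
  fields: [
    ["selected", { kind: "array", item: { kind: "string" } }],
    ["time", { kind: "number" }],
  ],
};

// Flatten the token list into the text of a type definition.
const tokens: string[] = [];
emit(buttonState, tokens);
console.log("export type ButtonState = " + tokens.join("") + ";");
// prints: export type ButtonState = { "selected": Array<string>, "time": number, };
```

Collecting tokens in a flat list keeps each recursive step trivial: a case only appends its own punctuation and delegates nested types to the same function.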

Putting these ideas together: You can try my experimental wasm-typescript-definition crate, which provides a TypeScriptDefinition macro that exports serde-compatible TypeScript definitions when running wasm-bindgen.

A type only needs to be annotated with the #[derive(TypeScriptDefinition)] attribute to work, though it makes sense to implement Serialize and Deserialize as well so it works with serde_json. A nice demonstration of this macro is with sum types. We can now convert Rust types that are otherwise inexpressible in JavaScript into data types that TypeScript understands. If we wanted to define an enum whose variants are different messages our frontend can receive:

#[derive(TypeScriptDefinition, Serialize, Deserialize)]
#[serde(tag = "tag", content = "fields")]
pub enum FrontendMessage {
  Init { id: String, },
  ButtonState { selected: Vec<String>, time: u32, },
  Render { html: String, time: u32, },
}

When we run wasm-bindgen, it generates a type definition that is inlined into the TypeScript definition file:

export type FrontendMessage =
  | { "tag": "Init", "fields": { "id": string, } }
  | { "tag": "ButtonState", "fields": { "selected": Array<string>, "time": number, } }
  | { "tag": "Render", "fields": { "html": string, "time": number, } }
  ;

The pipe operator | in a type definition means that it can be one of a list of different variants, much like enum variants in Rust.

Now when we create a new message in TypeScript like:

let msg: FrontendMessage =
  {"tag": "ButtonState", "fields": {"selected": ["bold", "list"], time: 0 }};

We can rely on compile-time type checking to ensure the message is well-formed. That guarantees it can be serialized into a string with JSON.stringify and parsed without errors using serde_json::from_str in Rust. It is now possible to change and introduce new messages to interface between the two runtimes without the risk of introducing runtime errors.
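Here is a small self-contained check of that claim, with the type repeated by hand so the snippet runs on its own. The wellFormed and malformed values are invented for this example; the malformed one is only there to show what the compiler rejects, and the @ts-expect-error comment marks the line that fails type checking.

```typescript
type FrontendMessage =
  | { tag: "Init"; fields: { id: string } }
  | { tag: "ButtonState"; fields: { selected: Array<string>; time: number } }
  | { tag: "Render"; fields: { html: string; time: number } };

const wellFormed: FrontendMessage = {
  tag: "ButtonState",
  fields: { selected: ["bold", "list"], time: 0 },
};

// @ts-expect-error: "time" must be a number, so this does not type-check.
const malformed: FrontendMessage = { tag: "ButtonState", fields: { selected: [], time: "0" } };

// The string that would be handed to serde_json::from_str on the Rust side.
const wire = JSON.stringify(wellFormed);
console.log(wire);
// prints: {"tag":"ButtonState","fields":{"selected":["bold","list"],"time":0}}
```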

A primary benefit of wasm-bindgen is to push Rust’s safety guarantees into frontend code. I think there’s a compelling case for a “serde_typescript” crate that can generate TypeScript definitions for serializable data types, and has complete support for all of serde’s custom serialization attributes. For now, you can check out my GitHub repo for wasm-typescript-definition. Hope you find it useful!

Bonus: Exhaustive switch statements in TypeScript

Strong typing on the frontend gets us many of the same guarantees we have in Rust and makes it possible to keep frontend and WebAssembly code in sync. But we lack one important aspect of Rust’s pattern matching capabilities: its requirement that all match arms must be “exhaustive” (that is, there are no unhandled cases).

If we matched against the FrontendMessage enum type in Rust, but only handled the Init and ButtonState variants, our program would fail to compile with an error indicating that we are missing a match arm for the third variant, Render. In TypeScript, we have to work to get the same guarantee. Assuming we choose serde’s “internally tagged” representation for our enum (#[serde(tag = "tag")], which places each variant’s fields next to the tag instead of nesting them under "fields"), we can branch in TypeScript based on which variant it is by doing e.g.:

switch (message.tag) {
  case "Init": ... 
  case "ButtonState": ...
  case "Render": ...
}

At build time, the compiler can guarantee that each of the values we use for “tag” matches the name of an actual variant, which prevents you from misspelling a message type. But the compiler doesn’t check that your case statements are exhaustive (that every variant is handled) unless you explicitly guarantee that the “default” code path will never be taken.

For this arcane purpose, TypeScript provides the never type, and its Advanced Types documentation suggests how you can make a switch statement exhaustively check all variants of a “discriminated union” (essentially the same as our “internally tagged” enum representation):

function assertNever(x: never): never {
    throw new Error("Unexpected object: " + x);
}

function sendFrontendMessage(msg: FrontendMessage) {
    switch (msg.tag) {
        case "Init": return App.initialize(msg.id);
        case "ButtonState": return App.updateButtons(msg.selected, msg.time);
        case "Render": return App.render(msg.html, msg.time);

        // If we missed an enum variant, TypeScript would now error out here.
        default: return assertNever(msg); 
    }
}

The assertNever function accepts an argument with a never type, meaning it should never be called given a valid FrontendMessage type. As long as this default: return assertNever(msg); case appears in every switch statement operating on a “discriminated union”, TypeScript can provide compile-time type checking that you’ve handled all possible variants. See this TypeScript playground to try out never types for yourself.
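If you want to try this outside the playground, here is a self-contained variant of the example above that can be pasted into a file and compiled. The type is written by hand under the assumption of serde's internally tagged form (#[serde(tag = "tag")]), and summarize is a made-up function standing in for the App calls.

```typescript
type FrontendMessage =
  | { tag: "Init"; id: string }
  | { tag: "ButtonState"; selected: Array<string>; time: number }
  | { tag: "Render"; html: string; time: number };

function assertNever(x: never): never {
  throw new Error("Unexpected object: " + JSON.stringify(x));
}

function summarize(msg: FrontendMessage): string {
  switch (msg.tag) {
    case "Init": return "init " + msg.id;
    case "ButtonState": return "buttons " + msg.selected.join(",");
    case "Render": return "render " + msg.html.length + " chars";

    // Deleting any case above turns this line into a compile error,
    // because msg would no longer have type never here.
    default: return assertNever(msg);
  }
}

console.log(summarize({ tag: "ButtonState", selected: ["bold", "list"], time: 0 }));
// prints "buttons bold,list"
```

The default branch also helps at runtime: if a message with an unknown tag slips through (for example from an out-of-date backend), assertNever throws instead of silently doing nothing.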