PhaultLines

Validating Markdoc attribute values with Zod

In a React application written in TypeScript, the language’s type system is the first line of defense against type errors in component props. In Markdoc, where content is decoupled from rendering logic and can’t take advantage of TypeScript types, the schema-based validation system plays an important role in bridging the gap between declarative tag definitions and typed React components.

The Markdoc validator ensures that tag attribute values in a document match the types declared in the corresponding schema. Type validation helps improve the general correctness of Markdoc content, but it’s especially important when using a React renderer—client-side JavaScript errors can result when invalid tag attributes pass into a React component as props.

The type system used in Markdoc schemas is relatively simple compared to TypeScript, however. One particularly troublesome shortcoming is the lack of support for complex typing on object and array attribute values. A Markdoc schema can specify that a given attribute expects an object or array value, but it doesn’t provide a way to define the shape of the object or what types of values are accepted in an array. For more granular validation of object and array attribute types, one typically writes custom validation functions.

The following example shows how to write an attribute validation function that checks to make sure that the attribute value is an array that contains only numbers:

const foo = {
  attributes: {
    bar: {
      type: Array,
      validate(value) {
        return Array.isArray(value) && value.every(x => typeof x === 'number') ? [] : [{
          id: 'invalid-array-element',
          level: 'error',
          message: 'Attribute "bar" expects an array of numbers'
        }]
      }
    }
  }
};

Reproducing this sort of logic in custom validation functions for every schema attribute that expects an object or array would be prohibitively cumbersome, especially in cases where you have nested data structures or need even more granular checks on the values.
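
To make that cost concrete, here is a rough sketch of what a hand-written validator might look like for a slightly richer value: an array of objects that each carry a string label and an array of numbers. The chart tag, the series attribute, the seriesError helper, and the error ID are all hypothetical names chosen for illustration.

```javascript
// Hypothetical helper that builds a Markdoc-style validation error.
const seriesError = (message) => ({
  id: 'invalid-series',
  level: 'error',
  message: `Attribute "series": ${message}`
});

// Hypothetical tag schema: "series" expects
// [{ label: string, values: number[] }, ...]
const chart = {
  attributes: {
    series: {
      type: Array,
      validate(value) {
        if (!Array.isArray(value)) return [seriesError('expected an array')];

        const errors = [];
        value.forEach((item, index) => {
          // Each element must be a non-null object before its fields are checked.
          if (typeof item !== 'object' || item === null) {
            errors.push(seriesError(`item ${index} must be an object`));
            return;
          }
          if (typeof item.label !== 'string')
            errors.push(seriesError(`item ${index} needs a string "label"`));
          if (!Array.isArray(item.values) ||
              !item.values.every(x => typeof x === 'number'))
            errors.push(seriesError(`item ${index} needs a numeric "values" array`));
        });
        return errors;
      }
    }
  }
};
```

Even with only two fields and one level of nesting, most of the function is bookkeeping rather than a description of the expected shape.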

A pattern for richer attribute validation

I’ve considered expanding Markdoc’s declarative schema format, addressing these issues by providing more elaborate primitives. But creating a full-blown type system for Markdoc that is rich enough to cover the full spectrum of relevant cases feels a lot like needlessly reinventing the wheel, particularly when there are already a number of powerful validation libraries in the JavaScript ecosystem that can be adopted to serve the purpose.

The solution is to create a custom attribute validation function that accepts a schema from a general-purpose validation system, uses the schema to check the attribute value, and then converts the returned errors to Markdoc’s ValidationError type. Some improvements to Markdoc’s attribute validation functionality—like introducing an extra parameter for attribute validation functions that carries the name of the attribute—make this approach more viable.
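
Stripped of any particular library, the pattern might look roughly like the sketch below. The withCheck adapter and the checkNumberArray function are hypothetical stand-ins: the sketch assumes that a check is any function that returns a list of problem descriptions, which is not how a real validation library necessarily reports errors.

```javascript
// Hypothetical library-agnostic adapter. `check` is any function that returns
// an array of human-readable problem descriptions (empty when the value is valid).
function withCheck(check) {
  return (value, config, attrName) =>
    check(value).map(message => ({
      id: 'attribute-value-invalid',
      level: 'error',
      message: `Invalid value for attribute "${attrName}": ${message}`
    }));
}

// Stand-in for a real validation library: accepts only arrays of numbers.
const checkNumberArray = value =>
  Array.isArray(value) && value.every(x => typeof x === 'number')
    ? []
    : ['expected an array of numbers'];

// The returned function has the shape Markdoc expects for `validate`.
const validateBar = withCheck(checkNumberArray);
```

The adapter only translates between two error formats, so the same wrapper can be reused for every attribute that needs structured validation.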

Here’s an example of how to write a reusable attribute validator using Zod, a popular schema validation library:

function withZod(schema) {
  return (value, config, attrName) => {
    const output = schema.safeParse(value);
    return output.success ? [] : output.error.issues.map(({ message }) => ({
      id: 'attribute-value-invalid',
      level: 'error',
      message: `Invalid value for attribute "${attrName}": ${message}`
    }))
  }
}

And here’s how you would use this function to create an attribute validator equivalent to the one earlier in this post that checks to make sure that the value is an array of numbers:

import { z } from 'zod';

const foo = {
  attributes: {
    bar: {
      type: Array,
      validate: withZod(z.array(z.number()))
    }
  }
};

Of course, you can take advantage of Zod’s expressiveness to perform even more sophisticated validation, including checking the structure of deeply nested objects and arrays. You can refer to Zod’s reference documentation to see all of the checks that are available.

In some cases, users may incorporate variables or function calls inside of Markdoc data structures, which means that your validation schema should explicitly allow those values where applicable. Every Markdoc AST node has an $$mdtype property with a string value that indicates its identity. You can have the validator check for that property to identify variables and functions.

The following example shows how to write the validator from the example above, but with support for allowing variables or function calls inside of the array in addition to numbers:

const MarkdocValue = z.object({
  $$mdtype: z.union([
    z.literal('Variable'),
    z.literal('Function')
  ])
});

withZod(z.array(z.number().or(MarkdocValue)))

Zod expressions are composable, so you can assign a specific subexpression to a variable for later reuse. Assigning the $$mdtype check to a variable makes it easy to incorporate into multiple validators without having to reproduce it exactly each time.
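
For readers who want to see what the $$mdtype check amounts to without any library, the following plain JavaScript sketch performs a comparable test. The isMarkdocValue and numbersOrMarkdocValues names are hypothetical, and the check relies only on the $$mdtype property described above.

```javascript
// Hypothetical helper: true for objects tagged as Markdoc variables or functions.
function isMarkdocValue(value) {
  return typeof value === 'object' && value !== null &&
    (value.$$mdtype === 'Variable' || value.$$mdtype === 'Function');
}

// Hypothetical validator: accepts arrays whose elements are numbers,
// variables, or function calls.
function numbersOrMarkdocValues(value, config, attrName) {
  const valid = Array.isArray(value) &&
    value.every(x => typeof x === 'number' || isMarkdocValue(x));
  return valid ? [] : [{
    id: 'attribute-value-invalid',
    level: 'error',
    message: `Attribute "${attrName}" expects an array of numbers, variables, or function calls`
  }];
}
```

Unlike the Zod version, this sketch reports a single error for the whole array rather than one error per invalid element.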

An added bonus of using Zod is that it can infer TypeScript types from the schema, which can be used in your transform function.

Using other validation libraries

Because supporting third-party validation libraries for attribute validation is fairly trivial, and there is a wide range of validation libraries to choose from, we’ve decided not to build support for any one specific library into Markdoc. You can choose the solution that makes the most sense for your project.

Some alternatives to Zod that you may wish to consider include Ajv and Joi. Ajv uses the ubiquitous JSON Schema format to express validation rules. The following is the Ajv equivalent of the previous examples:

import Ajv from "ajv";

const ajv = new Ajv();

function withAjv(schema) {
  return (value, config, attrName) => {
    const valid = ajv.validate(schema, value);
    return valid ? [] : [{
      id: 'attribute-value-invalid',
      level: 'error',
      message: `Invalid value for "${attrName}": ${ajv.errorsText()}`
    }]
  }
}

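const MarkdocValue = {
  type: 'object',
  additionalProperties: true,
  // Without "required", any object would pass, including one with no $$mdtype
  required: ['$$mdtype'],
  properties: {
    $$mdtype: {
      type: 'string',
      enum: ['Variable', 'Function']
    },
  }
};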

const foo = {
  attributes: {
    bar: {
      type: Array,
      validate: withAjv({
        type: 'array',
        items: {oneOf: [{type: 'number'}, MarkdocValue]}
      })
    }
  }
};

Note the definition of an equivalent MarkdocValue schema that can be used to support variables and functions, just like in the Zod example.