conwy.co

Subformats

21 September 2024History

software-development

TL;DR:

Subformats is a technique by which I extend an established format (such as Markdown), to add my own custom features, while remaining backwards-compatible, so I can re-use the content with third-party tools.

Let's face it: even the most versatile formats can become limiting.

For example, Markdown seems to support every kind of textual element we could ever want. Yet there will always be some users (like me) who want to, say, apply special styling to pull-quotes.

In this article I present the idea of a "subformat": a backwards-compatible variation of an existing format, which adds additional information to that format. That additional information can be used by a parser which understands the subformat.

The problem#

I recently encountered a problem while migrating content to my new homepage.

The articles are in plain Markdown format (.md) but I had to migrate them to MDX format in order to allow certain sections to be rendered as React components.

Example of React-rendered components enabled by using MDX format:

  • Code samples - rendered by my MdxCode component, using PrismJS
  • Diagrams - rendered by my MdxMermaid component, using MermaidJS

However I also wanted the ability to copy and paste the source Markdown code directly into other Markdown-based content platforms such as Github and DEV.to.

To do this, I would have to convert the content from MDX format back into plain Markdown format, then copy and paste it into the platform.

This wouldn't be so easy, with so many Mdx tags all over the place.

There are a couple of options to speed up the conversion:

  • Use a free online converter such as MDX Formatter
  • Integrate an automated converter such as mdx-to-md, perhaps bootstrapped using my builder infrastructure

However both of these options would require some busywork / tedium and might bring up errors and anomalies.

I eventually settled on what seems like a better solution: subformats.

What is a "subformat"?#

I coined this term because there doesn't seem to be any pre-existing term that describes what I'm trying to do.

By "subformat", I basically mean a variation of an existing format, which adds additional information to that format. That additional information can be used by a parser which understands the subformat.

For example: I use the Markdown format, but then add my own small backwards-compatible variations to it, creating my own Markdown subformat.

In this case, I add small pieces of text to Markdown, which do not interfere with the normal rendering of the Markdown, but which can still be picked up and used by my own parser.

This keeps the content backwards-compatible (i.e., the same content can be used by normal Markdown parsers) while allowing a newer / different parser to access the additional information.

Using this additional information, my own parser can add enhancements such as React controls or special styling.

Markdown subformat I'm using#

I'm using a Markdown subformat to provide additional functionality to my Markdown-based articles in a way that maintains compatibility with regular Markdown parsers like Github and DEV.to.

Here are the variations that make up the subformat:

Markdown elementVariationUsed for
>
(block-quote)
Content prefixed with:
Blurb:
Brief summary in purple text, at the top of an article.
>
(block-quote)
Content prefixed with:
Aside:
Aside text in a box, in an article body.
>
(block-quote)
Content prefixed with:
Pull-quote:
Pull-quote in a box, with a quotes icon, in an article body.
```
(code block)
Content prefixed with:
// Lines (x)
Code with lines (x) highlighted.

Example: pull-quote#

You can see an example implementation in my pull-quote variation of the Markdown blockquote element.

For context: I'm using MDX, which allows Markdown to be rendered using React components. The mdx-components.tsx file defines a mapping between DOM nodes and React components, including blockquoteMdxBlockBuote.tsx.

export function useMDXComponents(components: MDXComponents): MDXComponents {
  return {
    a: MdxA,
    blockquote: MdxBlockQuote, // *
    // ... etc
  };
}

In the MdxBlockQuote component which renders the Markdown blockquote, I pass the props through getBlockQuoteSubformatProps:

export function MdxBlockQuote(props: MdxBlockQuoteProps) {
  props = getBlockQuoteSubformatProps(props);

  return (
    <blockquote {...props}>
      {children}
    </blockquote>
  );
}

The getBlockQuoteSubformatProps function in turn passes the props through a set of chained functions, each applying a variation and returning the resulting props.

function getBlockQuoteSubformatProps(props) {
  props = getAsideVariationProps(props);
  props = getPullQuoteVariationProps(props); // *
}

The getPullQuoteVariationProps function first checks if the children prop has the "Pull-quote:" prefix (this is handled by getIsSubformatChildrenPrefixed). If it does, then this is a pull-quote. So then it adds the pullQuote class (imported from CSS modules), which applies pull-quote styling to the element.

const PULL_QUOTE_PREFIX = "Pull-quote:";

export function getBlockQuotePullQuoteSubformatProps(
  props: MdxBlockQuoteProps,
) {
  if (!getIsSubformatChildrenPrefixed(PULL_QUOTE_PREFIX, props.children)) {
    return props;
  }

  // Add pull-quote class
  const className = cn(props.className ?? "", moduleStyles.pullQuote);

  // Remove pull-quote prefix
  const children = removeSubformatChildrenPrefix(
    PULL_QUOTE_PREFIX,
    props.children,
  );

  return {
    ...props,
    className,
    children,
  };
}

Now I can write the following in my Markdown file:

This is a paragraph.

> Aside:
>
> This is an aside

This is another paragraph.

If I copy/paste it into, say, Github, it will render it like this:

Screenshot of aside block-quote in Github
Screenshot of aside block-quote in Github

But on my own website, it will render like this:

Screenshot of aside block-quote in conwy.co
Screenshot of aside block-quote in conwy.co

Use cases#

I think the subformats concept could be useful in the following senario:

  • You want to use a popular/common format or standard (such as Markdown) and
  • You want to add certain custom features not natively supported by that format and
  • You want to maintain compatibility with the format and
  • You don't have good reason to believe that your custom features will make it into the popular format anytime soon

Based on these situations I can think of the following use cases for subformats:

  • Embedding site-specific formatting into Markdown documents (such as the scenario above).
  • Embedding type information into JSON files using structured comments.
  • Demarcating insertion/interpolation points in files using comments, e.g. HTML templates using <!-- / --> comments.

Caveat#

If the underlying format will support your use case in the near future, or there is a simpler alternative technique, then a subformat might be better avoided. This is a very specific tool for a very small subset of use cases.

© 2024 Jonathan Conway