TC39 Proposal to remove common leading indentation from multiline template strings

String dedent

Champions: @jridgewell, @hemanth

Author: @mmkal

Status: Stage 2

Problem

When trying to embed formatted text (for instance, Markdown contents, or the source text of a JS program) in JS code, developers are forced to make awkward concessions for readability of the code or output. For instance, to make the embedded text look consistent with the surrounding code, we'd write:

class MyClass {
  print() {
    console.log(`
      create table student(
        id int primary key,
        name text
      )
    `);
  }
}

This outputs (using ^ to mark the beginning of a line and · to mark a leading space):

^
^······create table student(
^········id int primary key,
^········name text
^······)
^····

In order for the output to look sensible, our code becomes illegible:

class MyClass {
  print() {
    console.log(`create table student(
  id int primary key,
  name text
)`);
  }
}

This outputs the sensible:

create table student(
  id int primary key,
  name text
)

With a library

It's possible to write sensible code and have sensible output with the help of libraries.

import dedent from 'dedent'

class MyClass {
  print() {
    console.log(dedent`
      create table student(
        id int primary key,
        name text
      )
    `);
  }
}

This outputs the sensible:

create table student(
  id int primary key,
  name text
)

However, these libraries incur a runtime cost and are subtly inconsistent in how they perform "dedenting". The most popular package is stagnant, without bug fixes, and interprets the Template Object's .raw array in a problematic way. None of them can pass the dedented text on to tagged template functions.

pythonInterpreter`
  print('Hello Python World')
`; // IndentationError: unexpected indent

const dedented = dedent`
  print('Hello Python World')
`;

pythonInterpreter`${dedented}`; // <- this doesn't work right.

Additionally, even if a userland library were to support passing dedented text to tagged templates, the array would not be a true Template Object under proposals like Array.isTemplateObject. This harms the ability of tagged template functions to differentiate dedented templates that exist in the actual program source text (to which they can ascribe a higher trust level) from dynamically generated strings (which may contain a user-generated exploit string).
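The distinction described above can be made visible with a small tag function. This is only an illustrative sketch: `describe` is a hypothetical name, and the `frozen` and `hasRaw` checks are stand-ins for a real check such as the proposed Array.isTemplateObject, not a replacement for it.

```javascript
// Hypothetical tag that reports what it was called with.
function describe(strings, ...values) {
  return {
    strings: Array.from(strings),        // static text from the program source
    values,                              // dynamic values from ${...} holes
    hasRaw: Array.isArray(strings.raw),  // real template objects carry a .raw array
    frozen: Object.isFrozen(strings),    // real template objects are frozen
  };
}

const dedented = "print('Hello Python World')";

// Text written directly in the template arrives as static strings.
const direct = describe`print('Hello Python World')`;

// Text passed through ${...} arrives as a dynamic value instead;
// the static strings are just the empty pieces around the hole.
const viaValue = describe`${dedented}`;

// A plain array built at runtime and passed as if it were a template.
const forged = describe(["print('Hello Python World')"]);

console.log(direct, viaValue, forged);
```

A library could freeze its own array and attach a `raw` property, so these two checks cannot reliably separate source text from runtime data. That gap is what the paragraph above refers to.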

Proposed solution

Implement a String.dedent tagged template function. A template literal tagged with it behaves almost the same as a regular backtick template literal, with a few key differences:

  • The opening line (everything immediately right of the opening `) must contain only a literal newline char.
  • The opening line's literal newline is removed.
  • The closing line (everything immediately to the left of the closing `) may contain whitespace, but the whitespace is removed.
  • The closing line's preceding literal newline char is removed.
  • Lines which only contain whitespace are emptied.
  • The "common indentation" of all non-empty content lines (lines that are neither the opening nor the closing line) is calculated.
  • That common indentation is removed from the start of every line.
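The rules above can be sketched in plain JavaScript. This is a minimal illustration under stated assumptions, not the proposal's specification: `dedentString` is a hypothetical helper that works on an already-evaluated string (no expressions), treats only spaces and tabs as indentation, assumes `\n` line endings, and treats non-whitespace content on the closing line as an error.

```javascript
// Hypothetical helper: applies the listed rules to a plain string.
function dedentString(text) {
  const lines = text.split("\n");

  // The opening line must contain only a newline.
  if (lines.length < 2 || lines[0] !== "") {
    throw new TypeError("opening line must contain only a newline");
  }
  // Assumption: the closing line may contain whitespace and nothing else.
  if (!/^[ \t]*$/.test(lines[lines.length - 1])) {
    throw new TypeError("closing line must contain only whitespace");
  }

  // Dropping the first and last entries removes the opening newline,
  // the closing line's whitespace, and its preceding newline.
  const content = lines.slice(1, -1);

  // Lines that contain only whitespace are emptied.
  const emptied = content.map((line) => (/^[ \t]*$/.test(line) ? "" : line));

  // Common indentation: the longest leading-whitespace prefix shared
  // by every non-empty content line.
  let common = null;
  for (const line of emptied) {
    if (line === "") continue;
    const indent = line.match(/^[ \t]*/)[0];
    if (common === null) {
      common = indent;
      continue;
    }
    let i = 0;
    while (i < common.length && i < indent.length && common[i] === indent[i]) i++;
    common = common.slice(0, i);
  }
  const width = common === null ? 0 : common.length;

  // Remove the common indentation from the start of every line.
  return emptied.map((line) => line.slice(width)).join("\n");
}

const sql = dedentString(
  "\n      create table student(\n        id int primary key,\n        name text\n      )\n    "
);
console.log(sql);
```

Comparing prefixes character by character, rather than counting characters, keeps the sketch from stripping mismatched whitespace when some lines use tabs and others use spaces.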

Play around with a REPL implementation.

The examples above would be solved like this:

class MyClass {
  print() {
    console.log(String.dedent`
      create table student(
        id int primary key,
        name text
      )
    `);
  }
}

This outputs the sensible:

create table student(
  id int primary key,
  name text
)

Expressions are supported directly, as is composition with another tagged template function:

const message = 'Hello Python World';
String.dedent(pythonInterpreter)`
  print('${message}')
`;
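To show roughly what the composition form has to do, here is a userland approximation. Everything in it is an assumption made for illustration: `dedentWith`, `dedentLines`, `collect`, and the `HOLE` sentinel are hypothetical names, the sentinel is assumed never to occur in the template text, indentation is assumed to use one consistent whitespace character, and the cooked strings are reused as the `raw` strings. It is not the proposal's algorithm.

```javascript
// Hypothetical sentinel marking where each ${...} hole sits in the text.
const HOLE = "\u0000";

// Simplified dedent over text whose holes are marked by HOLE.
function dedentLines(text) {
  const lines = text.split("\n");
  if (lines.length < 2 || lines[0] !== "") {
    throw new TypeError("template must start with a newline");
  }
  if (!/^[ \t]*$/.test(lines[lines.length - 1])) {
    throw new TypeError("closing line must contain only whitespace");
  }
  // A line holding only a hole is not whitespace-only, so it is kept.
  const body = lines.slice(1, -1).map((l) => (/^[ \t]*$/.test(l) ? "" : l));
  const indents = body
    .filter((l) => l !== "")
    .map((l) => l.match(/^[ \t]*/)[0].length);
  const width = indents.length ? Math.min(...indents) : 0;
  return body.map((l) => l.slice(width)).join("\n");
}

// Wraps a tag so it receives dedented static strings and untouched values.
function dedentWith(tag) {
  return (strings, ...values) => {
    const cooked = dedentLines(strings.join(HOLE)).split(HOLE);
    // This array is built at runtime, so it is NOT a true template object.
    const fake = Object.freeze(
      Object.assign(cooked, { raw: Object.freeze([...cooked]) })
    );
    return tag(fake, ...values);
  };
}

// Hypothetical stand-in for pythonInterpreter: just reassembles the text.
const collect = (strings, ...values) => String.raw({ raw: strings }, ...values);

const message = 'Hello Python World';
const program = dedentWith(collect)`
  print('${message}')
`;
console.log(program);
```

Unlike a real template object, the array built here is recreated on every call and would not pass a check such as the proposed Array.isTemplateObject, which is the limitation the problem section describes for userland libraries.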

In other languages

Q&A

Why not use a library?

To summarise the problem section above:

  • avoiding a dependency for the desired behaviour of the vast majority of multiline strings (dedent has millions of downloads per week).
  • avoiding inconsistencies between the multiple current implementations.
  • improving performance.
  • improving discoverability: the feature can be documented publicly and used in code samples which wouldn't otherwise rely on a package like dedent.
  • giving code generators a way to output readable code with correct indentation properties (e.g. jest inline snapshots).
  • supporting "dedenting" of tagged template literal functions with customized expression parameter behaviour (e.g. slonik).
  • allowing formatters/linters to safely enforce code style without needing to be coupled to the runtime behaviour of multiple libraries in combination.

Additional Links