microsoft/TypeScript

Emit does not preserve empty lines

NoelAbrahams opened this issue ยท 45 comments

Hi,

TS Version: 1.1

Given

function foo() {

    var x = 10;

    var y = 11;
}

We used to get

function foo() {
    var x = 10;

    var y = 11;
}

In the new compiler the line break is missing

function foo() {
    var x = 10;
    var y = 11;
}

(Both compilers removed the first empty line, but the new compiler has gone a step further.)

This can affect the experience when debugging JavaScript in the browser.

Adding some background info... The reason the new compiler removes all blank lines is that this is the only thing we can do consistently. Blank line preservation is always going to be a hit or miss thing when it comes to constructs that are subject to rewriting, e.g. class and module declarations. We face exactly the same issues with comment preservation. So, while I'm sympathetic to resolving this, it's not an easy thing to do.

@NoelAbrahams I'm curious what issues you see when debugging?

@ahejlsberg @NoelAbrahams I created a prototype emitted in the original CodePlex project that actually did a terrific job with comment and newline preservation. Even on constructs that got heavily rewritten (like arrow-functions to function expressions) it would almost always do what you intuitively expected it to do.

Line preservation mostly a function of just reusing the line breaks when preserving old code, and using relative spacing when doing transformations. i.e. if you have:

module M {
}
module N {
}

Then when you're emitting the IIFE for 'N' you say "we should keep the trivia between the module we're rewriting and the previous syntactic element".

Here is the before/after for how my emitter prototype worked that preserved comments/newlines with high levels of fidelity:

https://typescript.codeplex.com/SourceControl/latest#tests/Fidelity/emitter2/ecmascript5/Parser.ts
https://typescript.codeplex.com/SourceControl/latest#tests/Fidelity/emitter2/ecmascript5/Parser.ts.expected

You can see a lot of examples here as well:
https://typescript.codeplex.com/SourceControl/latest#tests/Fidelity/emitter/ecmascript5/

The only feature i didn't implement was 'alignment'. i.e. if you had code that was aligned in some way in the original (very common with parameter declarations), we'd want the emitted code to preserve that as well.

But it would have been very trivial to do.

That said, conversion of contructs from TS to JS did try to preserve indentation. You can see that here:
https://typescript.codeplex.com/SourceControl/latest#tests/Fidelity/emitter/ecmascript5/ClassDeclaration/ClassDeclaration2.ts
https://typescript.codeplex.com/SourceControl/latest#tests/Fidelity/emitter/ecmascript5/ClassDeclaration/ClassDeclaration2.ts.expected

Note how the statements are properly indendented (even when they spread over multiple lines) even after we convert the nested modules and class into IIFEs

@ahejlsberg, there are no significant issues when debugging in the browser. It just makes it easer to navigate the JavaScript code and locate lines for setting breakpoints when there is an exact correspondence with the actual TypeScript source code.

I could personally live without the empty lines, but since TS has gone to such lengths to preserve and emit _beautiful_ JavaScript (:smile:) it seems only natural that this is implemented.

@ahejlsberg @NoelAbrahams There is one issue that exists with debugging in the browser that is slightly related to this conversation. When using setter/getter chains (like with jquery) or chaining promises, the new line feeds are lost during translation. That being said, it is a huge pain point when working with Arrow functions.

As an example:

(<any> x).a('#test')
    .b('test')
    .c(() => 'foo')
    .d(() => 'bar')
    .e(() => 5)
    .f(() => 6);

Becomes:

x.a('#test').b('test').c(function () { return 'foo'; }).d(function () { return 'bar'; }).e(function () { return 5; }).f(function () { return 6; });

Using Chrome and sourceMaps, the breakpoints are still skipped.

http://www.typescriptlang.org/Playground#src=(%3Cany%3E%20x).a('%23test')%0A%20%20%20%20.b('test')%0A%09.c(()%20%3D%3E%20'foo')%0A%09.d(()%20%3D%3E%20'bar')%0A%09.e(()%20%3D%3E%205)%0A%09.f(()%20%3D%3E%206)%3B

@mtraynham, actually I think the problem you highlight is a slightly different one.

In previous versions the body of an inline function was always emitted on new lines:

// TS
var x = () => 'foo';

// JS - old
var x = function () { 
             return 'foo'; 
       };

// JS - new
var x = function () { return 'foo'; };

I too have found this to be a problem - having to occasionally go back and create a function so that I can set a breakpoint when debugging in the browser.

@NoelAbrahams Ahh yes, I have been using this exact same temp solution... I wasn't sure if this was an appropriate bug to contribute this problem to (erasure of line feeds), or should I open another?

I created #2259 to the separate issue.

As an engineering director exploring moving our javascript development community to typescript the new line capability would really help. One of the primary appeal of typescript was maintaining the readability and structure of the code created in typescript generatedJavascript.

Great to see comments are kept in the recent update and addition of the โ€œ--removeComments" command line directive.

I would also like this for a similar reason as @timjmartel - when developers see the emitted JS looks nice they are less resistant to adopt. Preserving (at least vertical) whitespace makes the code look less like it was generated by a machine and more like idomatic JS code written by a human.

If our team ever decided to abandon TS and instead continue with the transpiled JS it would be much easier to adopt the emitted JS sources if they had human-friendly whitespace.

In regard to empty lines, would it be possible - for now - to have them emitted, even if they are hit or miss? Such experimental feature could be asked with a โ€œ--keepEmptyLines" option. It would be not so much for having nice JS, but for having more readable JS.

In regard to chained function calls, would it be possible to have one call for line, to let users set breakpoints? Again, this feature could be asked with a "--oneCallForLine" option, if it is another "hit or miss" thing.

Thanks for your attention.

Actually, chained function calls could be split by a source code beautifier, therefore it doesn't make sense to embed such feature in TypeScript.

This shouldn't be that hard, shouldn't there just be an option in tsconfig.json

preserveWhitespace: true/false

?

I just realized. One good reason to not keep whitespace is that it will really help prevent you from accidentally editing the .js instead of the .ts. I guess one thing to do would be to write out the .js files as readonly/execute only. So maybe this is a non-issue.

That being said, I don't think tsc writes out .js files as readonly/execute only, is there a way to configure tsc to do this?

No, not at the moment. Feel free to open up a separate issue for that though.

@DanielRosenwasser you're saying if we want to preserve whitespace that we should open a separate issue? Is this issue not enough? Nevermind LOL more than a month later I realize you were saying to open a separate issue for read/write/execution permissions on transpiled files :)

It would be nice to have this.

The crowd wants preserveWhitespace: true/false
@ORESoftware ++

Reason this is important is that TypeScript is supposed to "degrade gracefully" to JS. Right now it can't preserve new lines making the JS a bit dense to read, especially if you write TypeScript, but you are supposed to deliver JS to some place else.

+1 preserveWhitespace: true/false

Temporary hack

Use esformatter to add linebreaks.

With the following configuration file:

{
  "lineBreak": {
    "before": {
      "FunctionDeclaration": ">=2",
      "FunctionDeclarationOpeningBrace": 0,
      "FunctionDeclarationClosingBrace": 1,
      "MethodDefinition": ">=2",
      "ClassDeclaration": ">=2"
    },
    "after": {
      "FunctionDeclaration": ">=2",
      "FunctionDeclarationOpeningBrace": 1,
      "MethodDefinitionClosingBrace": ">=2",
      "ClassClosingBrace": ">=2"
    }
  }
}

@mtraynham Your example:

(<any> x).a('#test')
    .b('test')
    .c(() => 'foo')
    .d(() => 'bar')
    .e(() => 5)
    .f(() => 6);

with the latest compiler produces this JS:

x.a('#test')
    .b('test')
    .c(function () { return 'foo'; })
    .d(function () { return 'bar'; })
    .e(function () { return 5; })
    .f(function () { return 6; });

(see TS PlayGround https://goo.gl/JViurr )

A lot has changed since TS Version 1.1 (the version of TypeScript for which this issue was created). Maybe this issue can be closed? @ahejlsberg ?

@valera-rozuvan I had opened #2259 instead. My issue was related to line feeds, but not the exact same issue described by this bug. #2259 was closed a while back (May 2015).

This is the bril-andrew esformatter configuration but with a bug fixed, (when the class declaration contained the word import there was no line break):

{
    "lineBreak": {
        "before": {
            "FunctionDeclaration": ">=2",
            "FunctionDeclarationOpeningBrace": 0,
            "FunctionDeclarationClosingBrace": 1,
            "MethodDefinition": ">=2",
            "ClassDeclaration": ">=2",
            "ExportNamedDeclaration": 2
        },
        "after": {
            "FunctionDeclaration": ">=2",
            "FunctionDeclarationOpeningBrace": 1,
            "MethodDefinitionClosingBrace": ">=2",
            "ClassClosingBrace": ">=2"
        }
    }
}

I don't think using esformatter solves the whole problem. Sure, it can automatically insert blank lines around functions etc. But for me, blank lines within functions are even more crucial. We use blank lines like paragraphs in prose: to group individual thoughts.

Those blank lines within functions help to communicate the structure of the function. Without them, I find that readability suffers.

@ahejlsberg I am seeing incorrect line numbers in my unit testing output when using ts-jest, and this issue seems to be caused by the empty lines being removed in the js output. I am curious why it is so difficult to leaves these lines in the final js. Is there any more information out there on this? Can we help somehow to make this happen? :)

Or has it been merged already and not released yet? --> V4 Next Big Version #3143

@JimTheMan If you use source maps then maybe the source-map-support package can help you get the correct stack traces in the output.

I also run into this issue. I came up with a workaround by creating a diff patch and revert whitespace changes in the patch. jsdiff allows you to create a structured patch object and manipulate it as you wish.

import * as diff from 'diff';

const patch =
      diff.parsePatch(diff.createPatch('file', oldText, newText, '', ''));
const hunks = patch[0].hunks;
for (let i = 0; i < hunks.length; ++i) {
  let lineOffset = 0;
  const hunk = hunks[i];
  hunk.lines = hunk.lines.map(line => {
    if (line === '-') {
      lineOffset++;
      return ' ';
    }
    return line;
  });
  hunk.newLines += lineOffset;
  for (let j = i + 1; j < hunks.length; ++j) {
    hunks[j].newStart += lineOffset;
  }
}
return diff.applyPatch(oldText, patch);

With this workaround, you can preserve all the line breaks from the original file.

@zeroliu Does it introduce a noticeable time delay in compile step?

@ahejlsberg Do you think it's worth fixing this issue?

@valera-rozuvan depending on the size of your project. For my use cases where I transpile 10-ish files of 100-1000 LOC, it doesn't introduce any noticeable delay.

Any solutions here yet? I run in this trouble too...

I was part way into trying to fix this in the compiler itself when my teammate @emadum reminded me that tsc can preserve comments. Here's a little gulp pipeline that seems to do a fairly decent job of preserving newlines:

const gulp = require('gulp');
const ts = require('gulp-typescript');
const through = require('through2');

function preserveNewlines() {
  return through.obj(function(file, encoding, callback) {
    const data = file.contents.toString('utf8');
    const fixedUp = data.replace(/\n\n/g, '\n/** THIS_IS_A_NEWLINE **/');
    file.contents = Buffer.from(fixedUp, 'utf8');
    callback(null, file);
  });
}

function restoreNewlines() {
  return through.obj(function(file, encoding, callback) {
    const data = file.contents.toString('utf8');
    const fixedUp = data.replace(/\/\*\* THIS_IS_A_NEWLINE \*\*\//g, '\n');
    file.contents = Buffer.from(fixedUp, 'utf8');
    callback(null, file);
  });
}

gulp.task('default', function () {
  return gulp.src('src/**/*.ts')
    .pipe(preserveNewlines())
    .pipe(ts({
      removeComments: false
    }))
    .pipe(restoreNewlines())
    .pipe(gulp.dest('lib'));
});

I think a more clever comment would prevent some false positives, but it seems to work well for us today. I've only tested this on a relatively small file, but the performance overhead there was minimal.

hth

@mbroadst

I ended up using your idea as a base, and eventually expanded upon it until it became an npm module:
https://www.npmjs.com/package/gulp-preserve-typescript-whitespace

I credited your post in the Readme, hopefully you don't mind :)

Is this still being considered? I see this experiment: #37814 and #42303, but don't know if there is any plan to deliver it some day.

Is there a way I could contribute to fixing this? If I could get some guidance around what an acceptable fix might look like, I'd be willing to take a stab at it.

@CarabusX I took slightly minimalistic approach, but very similar to yours:

        function preserveWhitespace(content) {
          return content.replace(/(\s)/g, (m) => {
            return `/**_P${m.charCodeAt(0)}**/`;
          });
        }
        function restoreWhitespace(content) {
          return content.replace(/\/\*\*_P(\d+)\*\*\//g, (_, m) => {
            return String.fromCharCode(m);
          });
        }
        content = preserveWhitespace(content);
        const output = ts.transpileModule(content, {
          reportDiagnostics: false,
          compilerOptions: {
            target: ts.ScriptTarget.ESNext,
            importsNotUsedAsValues: ts.ImportsNotUsedAsValues.Preserve,
            sourceMap: true,
            removeComments: false,
          },
        });
        output.outputText = restoreWhitespace(output.outputText.replace(/\s+/g, ''));

however it doesn't work well with imports

favna commented

I tried many suggestions from this thread but none worked for me. After a while, however, I managed to come up with my own solution.

Before I share the code, I should note that this code relies on Prettier for formatting to get the JS code consistent with the TS code. I already use Prettier in all my projects, so I just copied the config I am using anyway.

One final note, as my use case for this was the Docusaurus website for https://github.com/sapphiredev, I have also decided to publish this conversion and formatting as an NPM package: https://www.npmjs.com/package/@sapphire/docusaurus-plugin-ts2esm2cjs

Now on to the code. This transpiles TS into ESM and then further converts that ESM into CJS.

const { runTransform } = require('esm-to-cjs');
const ts = require('typescript');
const prettier = require('prettier');
const sapphirePrettierConfig = require('@sapphire/prettier-config');

const documentationPrettierConfig = {
	...sapphirePrettierConfig,
	tabWidth: 2,
	useTabs: false,
	printWidth: 120,
	parser: 'babel'
};

/** @type {import('ts').CompilerOptions} */
const tsCompilerOptions = {
	module: ts.ModuleKind.ESNext,
	newLine: ts.NewLineKind.LineFeed,
	moduleResolution: ts.ModuleResolutionKind.NodeJs,
	target: ts.ScriptTarget.ESNext,
	removeComments: false,
	esModuleInterop: true,
	pretty: true
};

/**
 * Transpiles input TypeScript code to ESM code.
 * @param {string} code The code to transpile
 * @returns {{ outputText: string; diagnostics: unknown[]; sourceMapText?: string }} Input code transpiled to ESM
 */
const tsToEsm = (code) => ts.transpileModule(code, { reportDiagnostics: false, compilerOptions: tsCompilerOptions });

/**
 * Transforms input ESM code to CJS code.
 * @param {string} code The code to transform
 * @returns Input code transformed to CommonJS
 */
const esmToCjs = (code) => runTransform(code, { quote: 'single' });

/**
 * Escaped new lines in code with block comments so they can be restored by {@link restoreNewLines}
 * @param {string} code The code to escape new lines in
 * @returns The same code but with new lines escaped using block comments
 */
const escapeNewLines = (code) => code.replace(/\n\n/g, '\n/* :newline: */');

/**
 * Reverses {@link escapeNewLines} and restores new lines
 * @param {string} code The code with escaped new lines
 * @returns The same code with new lines restored
 */
const restoreNewLines = (code) => code.replace(/\/\* :newline: \*\//g, '\n');

/**
 * Formats the code using Prettier
 * @param {string} code The code to prettier format
 * @returns Prettier formatted code
 */
const prettierFormatCode = (code) => prettier.format(code, documentationPrettierConfig).slice(0, -1);

/**
 * The example code that we'll be transpiling and formatting.
 * Of note here is that it also has a type only import that will be removed during transpilation,
 * and it needs to be accounted for with the restoration of new lines.
 */
const codeStr = `import { Command, Args } from '@sapphire/framework';
import type { Message } from 'discord.js';

export class EchoCommand extends Command {
  public constructor(context: Command.Context, options: Command.Options) {
    super(context, {
      ...options,
      aliases: ['parrot', 'copy'],
      description: 'Replies with the text you provide'
    });
  }

  public async messageRun(message: Message, args: Args) {
    // ...
  }
}`;

const emptyLineEscapedTsCode = escapeNewLines(codeStr);

const esmCode = tsToEsm(emptyLineEscapedTsCode).outputText;
const cjsCode = esmToCjs(esmCode);
wbt commented

I would also like to see this implemented, with a few specific reasons in mind:

  • Consistency with general design philosophy per NoelAbrahams' comment above
  • Ciantic's comment about needing to deliver JS elsewhere
  • Being able to cleanly compare JS generated through TS and JS that existed before, where demonstrating a close match is important to adoption
  • Not having long lines requiring horizontal scrolling in the emitted code, where the source included line breaks to make things neat and beautiful and meeting linter/style specs (also commented on above)
  • Being able to continue working with transpiled JS if all these overhead costs of TypeScript become too much; anticipating difficulty here creates lock-in costs that raise barriers to adoption in the first place.

This PR, discussed in the 3.9 release notes, show code actions preserving newlines; why can't the emitted code do something similar?

devope commented

Are there any updates for almost 10 years?
I would like to be able to configure preserving new lines

I'm changing this to Won't Fix. Reasons:

  • This has obviously not been a key blocker for anyone
  • Of all the "keep trivia" problems we have, this is surely the lowest-priority among them
  • At this point, many alternative transpilers with different configurable behaviors around comments/whitespace exist; if you need truly need newline preservation, you can use one of those
  • Performance remains top-of-mind for us and anything that makes emit slower needs to be carrying a lot more value than newline preservation.

At this point, many alternative transpilers with different configurable behaviors around comments/whitespace exist

Does anyone know of a good option for this?

favna commented

At this point, many alternative transpilers with different configurable behaviors around comments/whitespace exist

Does anyone know of a good option for this?

It depends on your exact needs. There are several suggestions in this issue and a list of other transpilers is as follows. Note that this list hasn't been sanitized for the specific requirement: