rome/tools

☂️ Linting

sebmck opened this issue · 38 comments

The current area of focus for Rome is linting. This is the path of least resistance for finishing a part of Rome that can be used most easily.

With diagnostics #3 the lints that Rome can produce will be more detailed than any other existing JS linter. The most difficult part of this will be editor integration and writing a set of lint rules.

One large question is how we handle configuration. Are we going to take an opinionated stance and have some lints non-configurable? Will we have plugins?

slobo commented

Whether opinionated or fully configurable, it would be great to have ability to disable linter and/or prettifyer on blocks/sections of code - for those situation where a human believes they know better.

Something to this effect (excuse the unrealistic examples).

Sample Syntax:

// @linter-off rule1 [, rule2,...]: Reason why we need to disable listed rules

/* section of code that linter ignores */

// @linter-on

Examples:

  • Turn off a particular linter rule on a block of code
// @linter-off prohibit-vars: Some lame excuse why I really need a var here
var blah;
// @linter-on
  • Turn off a particular linter rule on a single line
var blah; // @linter-off prohibit-vars: because
  • Similar for auto-formatter
// @formatter-off: I want my custom alignments, dammit 

fooBarnicate( 'hi',    'world' );
fooBarnicate( 'quick', 'fox'   );

// @formatter-on

Idea being that escape hatches are provided, but there is a sufficient friction in writing annotations and the extra line-noise they introduce hopefully makes people reach for them only when really justified.

Hopefully not a wrong place to voice this?

sesam commented

IMHO: Have good defaults, and make it slightly painful to disable the linter for a line. Add good linting explanations that link specifically to the linting repo....
This ensures that linter-rage :tableflips: reactions get targeted at the linting repo, and then just have chat-bots attend the issues there. The bots can be trained at calming people down and telling them their way is probably the best, but the linting rules are there to help those less experienced to better understand the code and avoid producing hard to grasp structures and bug magnets. 😄 And that might even be true.
For fixing any early and eventually bad linting decisions: this is a monorepo with no external dependencies so we're always just ONE nicely motivated PR away from fixing that cleanly and nicely. Though it would probably also need to include a codemod-style automatic code upgrader, to not put-off contributors who have unmerged branches.

One large question is how we handle configuration. Are we going to take an opinionated stance...

I think the extendability of formatting linting should be limited. Perhaps a small set of options could be exposed in config (a starting point could be a subset of those Prettier exposes https://prettier.io/docs/en/options.html, like newline and semicolon).

It's just one less thing to have to think about - which complements the goals of this project. We can take a cue from go fmt here, IMO.

Opening up to extendability will also require a ton more design decisions to be made. Perhaps it is something that can be iterated on later (extensibility)?

While I agree with @jloleysens, one thing I think is fairly common in large codebases is introducing lint rules as a means to migrate away from specific APIs in addition to codemods. However, if there is the capability participate in compilation through plugins one could create their own application specific linting tools based on the Rome AST.

I'm torn on escape hatches. They are needed, but also I've seen them abused by devs who don't care. I think the main question for that is what are we trying to achieve with listing rules. Are they style rules for formatting, or are they actual bug prevention rules? Style rules can always be overridden, but some bug prevention stuff I feel pretty strongly should not have an out.

As for plugins, it is a double edge sword because the more plugins you have, the harder it is to update the core utility if you have to wait on the plugin to also update. I think to @chadhietala point, there should be some way to lint and run codemods on larger code bases through special rules. I just don't know if that warrants a "plugin" framework. Not to mention that there can be friction when updating plugins if they change rules that cause your code to format differently.

I'd also like to callout that it is harder to adopt rules in a larger codebase because normally it is "all or nothing". The only thing I have seen that gets around this is https://www.kadamwhite.com/archives/2017/progressive-linting, but it isn't super ideal as it means areas of code will remain ignored until they are touched. It would be interesting to see if there is a way to take this into consideration when expanding the linting tool of Rome.

I think instead of error/warnings like eslint, Rome should categorize code based on levels, example:
1: Errors. Reserved only for invalid JS code, things that will throw exceptions when executed.
2: JS quirks. Things that JS allows, but will most likely produce unexpected result. Examples: duplicate object keys, using variables before defining.
3: Ambiguous code: Things that are possibly wrong, but might also be fine, like parseInt(x). Should allow inline disable.
4: Preferences: Any coding conventions that should be followed by the developers, like disallowing ternary. Should be configurable, all rules disabled by default, and should allow inline disable.
5. Style: Basically what Prettier does, but a bit more extensive. In other words, eslint --autofix-eable rules like prefer-ternary. Should allow configuration. Single/double quotes config should be here.

By default the max level displayed should be 4 - styling issues should not be reported since Rome will take care of them at some point (save or rome --autofix), and they are too much noise that most people usually don't care about.

Having such levels or categories will allow richer editor experience: not just red or yellow underlines, but other colors as well to indicate different type of problems.

4 and 5 are very close to each other, the only difference is 4 rules cannot be autofixed.

Dumping some of my brief thoughts here:

  • No formatting options. Zero. None. Prettier initially tried to do this but caved under demands for things like an option for quote type etc. If you aren't happy with the defaults that Rome provides then you can use prettier or another tool.
  • AST transforms for prettification. This is a hard stance that prettier has taken where they don't want to modify the AST in any way. However there are a variety of transforms you can apply that retain all the semantics but make the code more readable. These can be things like flipping conditionals, pruning dead code/empty blocks etc.
  • Suppression comments must be specific. You must explicitly to all rules you want to disable for a particular line. Right now with other linters it's possible, and common practice, to suppress all lints for a specific line. Any future lint rules wont effect previous suppressions, this is bad!
  • Autofixing AST rules. Rome allows you to write lint autofixes with an AST transform. This is more powerful than ESLint which is just string based. With this capability we should be able to have autofixes for a majority of errors.

Love all of those @sebmck. If we learned one thing from the rails world, it's setting up conventions that are inmovable. Allows us all to have consistent projects and stop any/all arguments. If you want things to be ultra configurable, then the individual tools exist (i.e. prettier, eslint, etc).

PHP's static analysis tools have started to introduce the concept of a baseline file which suppresses errors for existing code but not for new code.

I've found this helps limit the proliferation of suppression comments while at the same time providing just enough friction to discourage overuse.

I'm all for zero formatting options. I can get over anything even if it deeply offends my aesthetic sensibilities, as long as we're saved from styleguide wars.

That said, it might not be obvious where to draw the line between formatting and semantic/logic/error lint rules, such as prefer-const. One would say it's formatting, I'd say it's code-complexity issue.

Oh noes, are we back to styleguide wars?

I'm also on board with no (or very limited) options. One of the things I love most about Go is how strong its opinions are, even about linting/formatting. It might ruffle some feathers, but it's something that people can learn to live with (I learned to live with tabs in Go 😖).

I agree with having a set of configurations that come out of the box. At this initial stage, it is probably a good idea to leverage this feature to show that Rome will be somewhat opinionated. However, as mentioned above by a few, there is reason to have flexibility in such enforced rules. Developers shouldn't have to battle against the default lint rules. They should be guidelines to maintain a standard of code. Rome should define it's base rules which can be overridden by an alternative.

As for things like escape hatches, we should be trying to minimise occurrences of this with a sensible approach that is in-scope of this project. The idea proposed by @slobo is worth exploring but I'm not sure whether the friction it causes will actually resort in escape hatches being used less.

slobo commented

What if instead of offering a wide range of rules that can be configured in .rc and/or disabled via annotations, there were just a few strongly opinionated idioms that can be chosen when needed inside specific sections of code. (Sounds like a distinction without a difference, but bear with me as I think it through)

The example that comes to mind at the moment is a formatting one. I'm a big fan of using whitespace to align columns (I find visually symmetry helps comprehension, eases spotting of errors, looks satisfying etc), so I'll use that for illustration.

So, there I am, editing the code, and notice something that I'd like visually aligned. I add // @fmt-table and automatically get the formatter to do the following:

  • Data definitions
// @fmt-table
const data = [
  [  1,  2,  3,  4 ],
  [ 94, 82, 23, 34 ],
];
// @fmt-table
const data = [
  { name: 'Foo',    type: 'Bar'    },
  { name: 'Baz123', type: 'XyzWQr' },
];
  • Or even code
// @fmt-table
if ( isBaz( var1     ) ) fooBarnicate( 'hi',    'world' );
if ( isBaz( othervar ) ) fooBarnicate( 'quick', 'fox'   );

In other words, annotation would not tell the formatter to ignore bits of code, but would instead in this case instruct rome to do a specific, well tuned, alignment automatically.

But the, arguably any formatting that enhances readability should be always applied, and annotation is just there where rome can't figure out by itself what works best. And annotations are rather noisy, so you always pay the penalty of extra visual clutter, negating readability benefits.

So, instead of annotations, perhaps existing code format could be used as a hint that some legibility rule could be applied.

Ex:

  • This code would be left as is:
if ( baz )
    foo( bar, var2 );
if ( othervar )
    foo( yadda, bar );
  • But, this code:
if (baz) foo( bar, var2 );
if (othervar) foo( yadda, bar );

would end up being transformed into:

if ( baz      ) foo( bar,   var2 );
if ( othervar ) foo( yadda, bar  );

In this case rome_fmt(code) != rome_fmt(strip_whitespace(code)), but perhaps that's not too terrible? It would allow for lead-by-example code formatting, and I can imagine it could be a pleasant experience in a tight edit-save-reformat loop. In ex. above, adding a newline would essentially reformat related subsequent lines, and all that would be needed for anyone to undo it is to remove that said newline.

Anything of substance here?

When should linting happen?
Assuming auto-fixing by modifying the AST, it's possible that these transforms may be destructive. Removing unused imports automatically, for example, would be incredible useful to prevent checked in code containing them, but may get in the way while developing. Mis-type + save and half of the file tree-shakes itself to cmd-z. Other rules, like those to catch bugs early are more valuable at the time they are introduced.

Is formatting effectively creating a "compile target" for checked in source code? Is that fundamentally different to linting for correctness.

Opt-ing out of rules
Very specific opt-out sounds sensible, but it won't discourage the use of them. When someone needs to opt-out, is that a bug in the linting? If the human is right, can it teach the machine to be right. If exclusions that were bugs were marked as such, the linter could remove the exclusions as they were fixed.

// @linter-bug some-rule

That way, if the bug is fixed, and Rome lints this again (and correctly doesn't report the rule) it could remove the comment.

(Could also be used to search Github for bugs – and be able to see them in their full context)

@stephenwf

Assuming auto-fixing by modifying the AST, it's possible that these transforms may be destructive. Removing unused imports automatically, for example, would be incredible useful to prevent checked in code containing them, but may get in the way while developing. Mis-type + save and half of the file tree-shakes itself to cmd-z. Other rules, like those to catch bugs early are more valuable at the time they are introduced.

That's a good point. I think there would be two operations. Autoformatting and autofixing. Autoformatting would be an implicit action, on editor save. To perform autofixing that (potentially) changes the semantics of the code, it would have to be an explicit action, by a button in an editor integration, or from the CLI.

@slobo

What if instead of offering a wide range of rules that can be configured in .rc and/or disabled via annotations, there were just a few strongly opinionated idioms that can be chosen when needed inside specific sections of code.

My concern would be the vast amount of possible forms this could take. If the value is in consistency then that flies against it.

@dwelle

That said, it might not be obvious where to draw the line between formatting and semantic/logic/error lint rules, such as prefer-const. One would say it's formatting, I'd say it's code-complexity issue.

A linter doesn't just deal with formatting issues. A linter is broadly just a tool that finds potential problems in your code. A type checker is also another form of a linter. I think we should push the envelope and get out of the box that linters are currently categorized as in JavaScript.

@sebmck oh, no argument from me there. It's just that for formatting, I'd be fine with enforcing one styleguide for all, without ability to customize. But for rules like prefer-const, which I personally don't consider as a formatting rule, I'm not sure where you stand on it. Enforce or not enforce?

I think that people can get over pure-formatting rules even if they dislike them (hence Prettier), but they'd be much less happy if we enforced rules that go further away on that spectrum of formatting-vs-logic rules.

Better example: no-async-promise-executor. If we decide to enforce it because it may lead to errors, I'm sure there are many people who think they know what they're doing and will be very angry to have this rule on by default (and no way of turning it off --- provided we decide it falls into formatting rules, or generally prevent customization).

So, all I wanted to say is: there are gonna be rules that aren't clear whether they fall into formatting category, or logic category, and it'll be hard to decide whether to allow customization for them, or not.

(Of course we can disable customization for the linter as a whole, but then I bet people would en masse disable Rome's linting completely).

@dwelle

oh, no argument from me there. It's just that for formatting, I'd be fine with enforcing one styleguide for all, without ability to customize. But for rules like prefer-const, which I personally don't consider as a formatting rule, I'm not sure where you stand on it. Enforce or not enforce?

Yeah I'm not really sure about that. I personally don't use that rule and use const as a special signifier for top-level variables. Some of the decisions will be arbitrary but ultimately we'll be able to have autofixes for almost everything so compared to ESLint it really wont be as big of a deal to have.

Better example: no-async-promise-executor. If we decide to enforce it because it may lead to errors, I'm sure there are many people who think they know what they're doing and will be very angry to have this rule on by default (and no way of turning it off --- provided we decide it falls into formatting rules, or generally prevent customization).

Ideally the proper way to signal "I know what I'm doing" is with a specific comment.

// @rome-ignore no-async-promise-executor

If you wanted to suppress a lint you would have to refer to the name, there'd be no way to say "disable all lint rules for this line/block".

I went through the list of ESLint rules and made a note of the ones I think we should have. I omitted all the ones that have to do with code style, and those where a type system could detect more reliably. A vast majority of these can be autofixed.

  • enforce return statements in getters (getter-return)
  • disallow using an async function as a Promise executor (no-async-promise-executor)
  • disallow comparing against -0 (no-compare-neg-zero)
  • disallow assignment operators in conditional expressions (no-cond-assign)
  • disallow the use of debugger (no-debugger)
  • disallow duplicate arguments in function definitions (no-dupe-args)
  • disallow duplicate keys in object literals (no-dupe-keys)
  • disallow a duplicate case label (no-duplicate-case)
  • disallow empty character classes in regular expressions (no-empty-character-class)
  • disallow reassigning exceptions in catch clauses (no-ex-assign)
  • disallow unnecessary boolean casts (no-extra-boolean-cast)
  • disallow reassigning function declarations (no-func-assign)
  • disallow assigning to imported bindings (no-import-assign)
  • disallow variable or function declarations in nested blocks (no-inner-declarations)
  • disallow invalid regular expression strings in RegExp constructors (no-invalid-regexp)
  • disallow multiple spaces in regular expression literals (no-regex-spaces)
  • disallow returning values from setters (no-setter-return)
  • disallow sparse arrays (no-sparse-arrays)
  • disallow unreachable code after return, throw, continue, and break statements (no-unreachable)
  • disallow control flow statements in finally blocks (no-unsafe-finally)
  • disallow negating the left operand of relational operators (no-unsafe-negation)
  • disallow template literal placeholder syntax in regular strings (no-template-curly-in-string)
  • disallow deleting variables (no-delete-var)
  • disallow labels that are variable names (no-label-var)
  • disallow shadowing of restricted names (no-shadow-restricted-names)
  • suggest template literals instead of string concatenation (prefer-template)

And I thought of some others:

  • disallow specific global variables such as event, error etc
  • disallow statements other than blocks in if, while etc
  • disallow duplicate import sources
  • disallow var (the presence of this in source code already makes Rome slower!)
  • disallow arguments
  • disallow wildcard import that only has static members

I also have the following rules already written:

  • disallow empty blocks
  • disallow undeclared variables
  • disallow unused variables

There's also a category of framework-level lints that are specific to things like React etc. I'm not sure how to handle them since we don't want them in Rome. I really really really want to avoid having JavaScript plugins in Rome so I'm wondering what could be achieved with a DSL where you write expression patterns to disallow.

I also merged #72 which adds formatting functionality separate to lint autofixing.

Dumping some of my brief thoughts here:

  • No formatting options. Zero. None. Prettier initially tried to do this but caved under demands for things like an option for quote type etc. If you aren't happy with the defaults that Rome provides then you can use prettier or another tool.
  • AST transforms for prettification. This is a hard stance that prettier has taken where they don't want to modify the AST in any way. However there are a variety of transforms you can apply that retain all the semantics but make the code more readable. These can be things like flipping conditionals, pruning dead code/empty blocks etc.
  • Suppression comments must be specific. You must explicitly to all rules you want to disable for a particular line. Right now with other linters it's possible, and common practice, to suppress all lints for a specific line. Any future lint rules wont effect previous suppressions, this is bad!
  • Autofixing AST rules. Rome allows you to write lint autofixes with an AST transform. This is more powerful than ESLint which is just string based. With this capability we should be able to have autofixes for a majority of errors.

imo there should be a basic linter config for only general style preferences, like:

{
  "useTabs": false,
   "semi": false,
   "singleQuotes":true
}

And only these rules, because being too opinionated as a linter might make some people to have to use Prettier just because they don't like semi or quotes

Rome won’t have any formatting options. If you want to use prettier, then use prettier. It’s honestly not that big of a deal. Rome’s many other advantages should outweigh any minor temporary annoyance.

@sebmck ok, maybe I interpreted "if you aren't happy with the defaults that Rome provides then you can use prettier or another tool." in a wrong way

enforce return statements in getters (getter-return)
disallow using an async function as a Promise executor (no-async-promise-executor)
disallow comparing against -0 (no-compare-neg-zero)
disallow assignment operators in conditional expressions (no-cond-assign)
disallow the use of debugger (no-debugger)
disallow duplicate arguments in function definitions (no-dupe-args)
disallow duplicate keys in object literals (no-dupe-keys)
disallow a duplicate case label (no-duplicate-case)
disallow empty character classes in regular expressions (no-empty-character-class)
disallow reassigning exceptions in catch clauses (no-ex-assign)
disallow unnecessary boolean casts (no-extra-boolean-cast)
disallow reassigning function declarations (no-func-assign)
disallow assigning to imported bindings (no-import-assign)
disallow variable or function declarations in nested blocks (no-inner-declarations)
disallow invalid regular expression strings in RegExp constructors (no-invalid-regexp)
disallow multiple spaces in regular expression literals (no-regex-spaces)
disallow returning values from setters (no-setter-return)
disallow sparse arrays (no-sparse-arrays)
disallow unreachable code after return, throw, continue, and break statements (no-unreachable)
disallow control flow statements in finally blocks (no-unsafe-finally)
disallow negating the left operand of relational operators (no-unsafe-negation)
disallow template literal placeholder syntax in regular strings (no-template-curly-in-string)
disallow deleting variables (no-delete-var)
disallow labels that are variable names (no-label-var)
disallow shadowing of restricted names (no-shadow-restricted-names)
suggest template literals instead of string concatenation (prefer-template)

@sebmck Do you have a way that you prefer we track these? I just submitted a PR for unsafe negation [here] (#87) and am happy to include others in it or open other PRs for other rules.

@macovedj I'm wondering the same thing. For starters, I believe we can mention this issue in the PRs that are implementing items from the lint list @sebmck shared.

Created #94 to track the implementation of the ESLint rules mentioned.

@sebmck What are you thoughts on library specific lint rules? React is a big example that I could understand being built in first class, but smaller libraries can benefit from custom lint rules as well. eslint-plugin-tailwind being a good example.

I can see similar problems coming up once you start working on proper bundling support as well. CSS-in-JS libraries being an obvious example for having opinions in the compiling/bundling layer.

Awesome project btw, keen to try and contribute. 👍

Edit:

There's also a category of framework-level lints that are specific to things like React etc. I'm not sure how to handle them since we don't want them in Rome. I really really really want to avoid having JavaScript plugins in Rome so I'm wondering what could be achieved with a DSL where you write expression patterns to disallow.

Missed the above before posting this.

Sorry to backtrack a little to @slobo 's comment:

// @linter-off rule1 [, rule2,...]: Reason why we need to disable listed rules

/* section of code that linter ignores */

// @linter-on

^ a danger with this pattern is that it doesn't enforce a boundary at the end. For instance with ESLint you can have a typo /* eslint-nable */ and accidentally disable linting for the rest of your file.

So a couple ideas:

  • Only allow unclosed ignore-blocks at top of file
  • Perhaps try to catch common typo's in the enable comment

Also in general I personally think disabling all linting rules with a comment should be impossible. That has trade-offs for certain cases but leaves you with a much higher overall reliability.

I think the best option is to only allow line suppression comments.

  • disallow specific global variables such as event, error etc
  • disallow statements other than blocks in if, while etc
  • disallow duplicate import sources
  • disallow var (the presence of this in source code already makes Rome slower!)
  • disallow arguments
  • disallow wildcard import that only has static members

Have you decided yet if you want these/other things to happen between #94 being completed and your first release? Are there things that would be most helpful for people to pickup still?

Yeah they're still helpful to pick up. I didn't put them in the ESLint rules task since their behaviour is less clear but happy to accept PRs!

Sweet. I'll probably pick up duplicate imports. FYI there are only a couple others that I feel I could pick up without further clarification, which I'm sure is something you would have guessed. This is my first time contributing to open source, so I'm not really sure what the best means is by which to gather clarity around them. If you wanted to do that here or anywhere else though, I'd be excited to do so. I think that disallow-var is probably the only one I'd feel I was able to pick up.

Awesome! There's no formal process around issue creation and discussion yet since there's not a lot of structure, so feel free to post any questions here or create a new issue. We also have a fairly active Discord server: https://discord.gg/rome

Hello, I'm interested in contributing, I've got a question about this lint rule:

  • disallow arguments

Eslint provides multiple rules for this case:

So, would we want to implement these 2 rules? Maybe we just want to create one rule to prohibit the use of the arguments keyword?

In my opinion, It can be better to implement the 2 eslint-like rules, what do you think about this?

I feel like a lot of interesting thoughts and discussions took place in this thread. I am wondering if some sort of 'mission statement' can be generated from this. Basically is there a mission (specific to the linting part of Rome) that will differentiate it from existing solutions? Or will the result more or less be the same?

@Lagily

Basically is there a mission (specific to the linting part of Rome) that will differentiate it from existing solutions? Or will the result more or less be the same?

Rome will be different in a few ways.

Advanced autofixes. All of our lint autofixes are regular compiler transforms. This means we can do a lot more complicated fixes than traditional linters like eslint. In ESLint you insert raw strings, so are severely limited by the sorts of things you can do.

Rich diagnostics. Our diagnostics format allows us to provide diagnostic metadata and point to multiple source locations and produce messages like "Did you mean one of these?" or pointing to the original location where we found a duplicate occurrence etc.

Heavy defaults. Other linters do this but it typically requires configuration.

Ideally for a user of Rome, they should be able to install Rome, run rome lint --fix to perform autofixes, and the remaining errors should be legitimate or potential bugs with hopefully limited noise.

I think this can be closed out in favor of more specific issues. I don't think we should focus on other general JS lint rules at this time. Let's move towards a release! Thanks everyone for your contributions so far. I encourage you to get involved in #341 (React lint rules), #18 (Published release) or #496 (Website for release). Thank you again!