Fix/rewrite Scope.hs

Question

Fix/rewrite Scope.hs

Opened this issue 5 years ago · 10 comments

The way Scope.hs behaves is fairly erratic. For example:

-- (correctly) suggests mapM
import qualified Control.Monad
x = flip Control.Monad.forM

-- no hint
-- Should suggest: Control.Monad.mapM or Prelude.mapM
import qualified Control.Monad
import qualified Prelude
x = flip Control.Monad.forM

-- suggests mapM
-- Should suggest: Control.Monad.mapM or Prelude.mapM
import qualified Control.Monad
import qualified Prelude
import Data.Int
x = flip Control.Monad.forM

I guess a name should be assumed, if there's no other source of information, to come from Prelude. With this assumption, in the second case it should be able to suggest Prelude.mapM, and in the third case it should know that Data.Int is irrelevant.

Answer 1 · 2020-05-16T14:41:43.000Z

Agreed! Scope needs a significant overhaul and performance improvements too. There are increasingly more tickets which are scope bugs. I'll create a label for them.

Answer 2 · 2020-05-16T14:50:20.000Z

Plan of attack is probably first to figure out what the semantics of the "perfect" scope module are, then how to implement that efficiently, then finally implement it.

Answer 3 · 2020-06-06T21:33:29.000Z

Here's an initial outline of the Scope problem. Throughout, I use name to mean things like foo and qname to mean things like Foo.bar.

Hint Scope

A hint scope is responsible for mapping a name or qname to many possible qnames it might represent. For example, fmap in a hint might be Prelude.fmap or Data.Functor.fmap.
For some names, e.g. map, we may choose to map them to Prelude.fmap and Prelude.map, on the basis that any fmap hint also applies to map (not sure yet).
A hint scope is constructed from information in a .yaml file, which can be anything we want, but a sequence of imports plus additional information is likely convenient.

Module Scope

Like a hint scope, a module scope is responsible for mapping a name or qname to many possible qnames it might represent.
A module scope is constructed from information in a .hs file, notably the list of imports and the presence/absence of the ImplicitPrelude extension.

Operations

Given a name/qname in a hint, and a name/qname in a module, we need to know if these two refer to the same identifier. In principle that means computing all possible qnames on both sides and seeing if they intersect.
Given a qname in a hint RHS, e.g. Control.Monad.forM, looking up a suitable name/qname in the module. E.g. if the module imports import qualified Control.Monad as CM then CM.forM would be the result.

Efficiency

Seeing if two names intersect is by far the most common operation.
Given a module scope and a hint scope, many comparisons will be done between them, so partially applying these two values together is likely to offer performance wins.

Answer 4 · 2020-06-07T04:59:37.000Z

Thanks @ndmitchell! One possible way to make hint scopes more accurate and configurable is to put the scope information in a separate .yaml file, organized by names. For example:

# scope.yaml
- mapM: [Prelude, Control.Monad, Data.Traversable]
- forM: [Control.Monad, Data.Traversable]
- forM_: [Control.Monad, Data.Foldable]

Multiple occurrences are allowed, e.g., there may be another forM:

- forM: [Data.Vector]

Such information can already, for the most part, be encoded in the current yaml config. However, as we can see, mapM, forM and forM_ are all exposed from slightly different modules, so to make it accurate, we'd need to create a different package for each of them in the yaml, which is quite verbose. Also things get complicated when a rule mentions both mapM and forM.

Moreover, having a scope.yaml separate from rule config files (hlint.yaml, .hlint.yaml) has a few additional benefits:

It allows scope.yaml to be consumed by builtin hints, which is needed to fix #794.
It allows scope.yaml to be overridden, and also allows using a different scope config file for the same hints. In my work repo we have a custom prelude called P, which is neither a subset nor a superset of Prelude. So it would be useful to tell HLint which names are from P. It is also common to re-export a name from a different module, e.g., one may re-export Control.Monad.forM from Foo.Bar.

A possible syntax for overriding is

Control.Monad.forM: [+, Foo.Bar] # add a module
Control.Monad.forM: [-, Data.Traversable] # remove a module
Control.Monad.forM: [Control.Monad, Data.Traversable, Foo.Bar] # reset the modules

(if there's only one forM then it need not be qualified)

Multiple scope config files can be passed to HLint. The later ones override the earlier ones (i.e., each one is a "delta" wrt the previous one).

Answer 5 · 2020-06-07T05:01:26.000Z

For some names, e.g. map, we may choose to map them to Prelude.fmap and Prelude.map, on the basis that any fmap hint also applies to map (not sure yet).

Interesting idea - it would be very desirable to not have to add a pure version for each rule that mentions return!

Answer 6 · 2020-06-07T08:45:36.000Z

I don't see any reason for a separate file - we can just introduce scope entries in the existing .hlint.yaml file. We have mechanisms for overriding/extending that, so we don't have to reinvent such mechanisms. I think it should be possible to have your own custom hlint.yaml that refines the scopes, and to use that scope information in hints.

The idea of name: [qname, qname, qname] is interesting. I'd been thinking more along the lines of similar to imports, e.g. import Data.List(map) import Prelude as Data.List to say that map comes from Data.List and Prelude just reexports things in Data.List. That worked well when there were huge chunks of overlapping exports in the Prelude and other libraries like List, but maybe it's no longer the right encoding.

Answer 7 · 2020-06-07T23:40:27.000Z

We have mechanisms for overriding/extending that, so we don't have to reinvent such mechanisms.

It seems this is currently possible but not easy. For example, if I want to change the imports of the default group in hlint.yaml, I need to:

define a package in my .hlint.yaml with the imports I want
copy all rules in default group from hlint.yaml to .hlint.yaml, and use the new imports
disable the original default group in hlint.yaml

On the other hand, by separating scope yaml from rule yaml, you can simply swap the original scope.yaml for your own.

import Data.List(map) import Prelude as Data.List to say that map comes from Data.List and Prelude just reexports things in Data.List

Yeah, so to encode

- mapM: [Prelude, Control.Monad, Data.Traversable]
- forM: [Control.Monad, Data.Traversable]
- forM_: [Control.Monad, Data.Foldable]

using imports, it'd probably look like

import Prelude (mapM)
import Control.Monad (mapM, forM, forM_)
import Data.Traversable (mapM, forM)
import Data.Foldable (forM_)

The advantage of the former is that it immediately follows that Control.Monad.forM and Data.Traversable.forM are the same forM. So a rule with lhs: ... Control.Monad.forM ... will also fire on Data.Traversable.forM. In the latter encoding this information is missing.

Answer 8 · 2020-06-13T20:48:15.000Z

If people swap a scope.yaml for their own, then they stop getting bug fixes from HLint. It would be much better for people to specify overrides, and once you're doing that, I don't see the need for a separate file.

Answer 9 · 2020-06-15T06:25:17.000Z

I'd love to learn about how overrides are specified (or will be specified). In my work repo there's a custom prelude P that re-exports for, so I'd like all rules involving for (including, ideally, the builtin rules) to be aware of that fact and act accordingly. If there's an easy and natural way to achieve that, then I don't have much preference on the actual approach or the syntax.

Answer 10 · 2020-09-12T16:07:42.000Z

My guess would be users specify that in their .hlint.yaml, which gets merged with the standard data file we supply. No idea on syntax yet - that's something that would need brainstorming.