Fix/rewrite Scope.hs
Opened this issue · 10 comments
The way Scope.hs behaves is fairly erratic. For example:
-- (correctly) suggests mapM
import qualified Control.Monad
x = flip Control.Monad.forM
-- no hint
-- Should suggest: Control.Monad.mapM or Prelude.mapM
import qualified Control.Monad
import qualified Prelude
x = flip Control.Monad.forM
-- suggests mapM
-- Should suggest: Control.Monad.mapM or Prelude.mapM
import qualified Control.Monad
import qualified Prelude
import Data.Int
x = flip Control.Monad.forM
I guess a name should be assumed, if there's no other source of information, to come from Prelude. With this assumption, in the second case it should be able to suggest Prelude.mapM
, and in the third case it should know that Data.Int
is irrelevant.
Agreed! Scope needs a significant overhaul and performance improvements too. There are increasingly more tickets which are scope bugs. I'll create a label for them.
Plan of attack is probably first to figure out what the semantics of the "perfect" scope module are, then how to implement that efficiently, then finally implement it.
Here's an initial outline of the Scope problem. Throughout, I use name to mean things like foo
and qname to mean things like Foo.bar
.
Hint Scope
- A hint scope is responsible for mapping a name or qname to many possible qnames it might represent. For example,
fmap
in a hint might bePrelude.fmap
orData.Functor.fmap
. - For some names, e.g.
map
, we may choose to map them toPrelude.fmap
andPrelude.map
, on the basis that anyfmap
hint also applies tomap
(not sure yet). - A hint scope is constructed from information in a
.yaml
file, which can be anything we want, but a sequence of imports plus additional information is likely convenient.
Module Scope
- Like a hint scope, a module scope is responsible for mapping a name or qname to many possible qnames it might represent.
- A module scope is constructed from information in a
.hs
file, notably the list of imports and the presence/absence of theImplicitPrelude
extension.
Operations
- Given a name/qname in a hint, and a name/qname in a module, we need to know if these two refer to the same identifier. In principle that means computing all possible qnames on both sides and seeing if they intersect.
- Given a qname in a hint RHS, e.g. Control.Monad.forM, looking up a suitable name/qname in the module. E.g. if the module imports
import qualified Control.Monad as CM
thenCM.forM
would be the result.
Efficiency
- Seeing if two names intersect is by far the most common operation.
- Given a module scope and a hint scope, many comparisons will be done between them, so partially applying these two values together is likely to offer performance wins.
Thanks @ndmitchell! One possible way to make hint scopes more accurate and configurable is to put the scope information in a separate .yaml
file, organized by names. For example:
# scope.yaml
- mapM: [Prelude, Control.Monad, Data.Traversable]
- forM: [Control.Monad, Data.Traversable]
- forM_: [Control.Monad, Data.Foldable]
Multiple occurrences are allowed, e.g., there may be another forM
:
- forM: [Data.Vector]
Such information can already, for the most part, be encoded in the current yaml config. However, as we can see, mapM
, forM
and forM_
are all exposed from slightly different modules, so to make it accurate, we'd need to create a different package
for each of them in the yaml, which is quite verbose. Also things get complicated when a rule mentions both mapM
and forM
.
Moreover, having a scope.yaml
separate from rule config files (hlint.yaml
, .hlint.yaml
) has a few additional benefits:
- It allows
scope.yaml
to be consumed by builtin hints, which is needed to fix #794. - It allows
scope.yaml
to be overridden, and also allows using a different scope config file for the same hints. In my work repo we have a custom prelude calledP
, which is neither a subset nor a superset ofPrelude
. So it would be useful to tell HLint which names are fromP
. It is also common to re-export a name from a different module, e.g., one may re-exportControl.Monad.forM
fromFoo.Bar
.
A possible syntax for overriding is
Control.Monad.forM: [+, Foo.Bar] # add a module
Control.Monad.forM: [-, Data.Traversable] # remove a module
Control.Monad.forM: [Control.Monad, Data.Traversable, Foo.Bar] # reset the modules
(if there's only one forM
then it need not be qualified)
Multiple scope config files can be passed to HLint. The later ones override the earlier ones (i.e., each one is a "delta" wrt the previous one).
For some names, e.g. map, we may choose to map them to Prelude.fmap and Prelude.map, on the basis that any fmap hint also applies to map (not sure yet).
Interesting idea - it would be very desirable to not have to add a pure
version for each rule that mentions return
!
I don't see any reason for a separate file - we can just introduce scope
entries in the existing .hlint.yaml
file. We have mechanisms for overriding/extending that, so we don't have to reinvent such mechanisms. I think it should be possible to have your own custom hlint.yaml that refines the scopes, and to use that scope information in hints.
The idea of name: [qname, qname, qname]
is interesting. I'd been thinking more along the lines of similar to imports, e.g. import Data.List(map)
import Prelude as Data.List
to say that map
comes from Data.List
and Prelude
just reexports things in Data.List
. That worked well when there were huge chunks of overlapping exports in the Prelude and other libraries like List
, but maybe it's no longer the right encoding.
We have mechanisms for overriding/extending that, so we don't have to reinvent such mechanisms.
It seems this is currently possible but not easy. For example, if I want to change the imports
of the default
group in hlint.yaml
, I need to:
- define a
package
in my.hlint.yaml
with the imports I want - copy all rules in
default
group fromhlint.yaml
to.hlint.yaml
, and use the new imports - disable the original
default
group inhlint.yaml
On the other hand, by separating scope yaml from rule yaml, you can simply swap the original scope.yaml for your own.
import Data.List(map) import Prelude as Data.List to say that map comes from Data.List and Prelude just reexports things in Data.List
Yeah, so to encode
- mapM: [Prelude, Control.Monad, Data.Traversable]
- forM: [Control.Monad, Data.Traversable]
- forM_: [Control.Monad, Data.Foldable]
using imports, it'd probably look like
import Prelude (mapM)
import Control.Monad (mapM, forM, forM_)
import Data.Traversable (mapM, forM)
import Data.Foldable (forM_)
The advantage of the former is that it immediately follows that Control.Monad.forM
and Data.Traversable.forM
are the same forM
. So a rule with lhs: ... Control.Monad.forM ...
will also fire on Data.Traversable.forM
. In the latter encoding this information is missing.
If people swap a scope.yaml for their own, then they stop getting bug fixes from HLint. It would be much better for people to specify overrides, and once you're doing that, I don't see the need for a separate file.
I'd love to learn about how overrides are specified (or will be specified). In my work repo there's a custom prelude P
that re-exports for
, so I'd like all rules involving for
(including, ideally, the builtin rules) to be aware of that fact and act accordingly. If there's an easy and natural way to achieve that, then I don't have much preference on the actual approach or the syntax.
My guess would be users specify that in their .hlint.yaml
, which gets merged with the standard data file we supply. No idea on syntax yet - that's something that would need brainstorming.