carymrobbins/intellij-haskforce

Parse OverloadedLabels


{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedLabels #-}
module Label00001 where

import Control.Lens ((^.))
import Data.Generics.Labels ()
import GHC.Generics (Generic)

data Foo = Foo { bar :: Bar } deriving (Generic)

data Bar = Bar { baz :: Char } deriving (Generic)

example :: Char
example = (Foo (Bar 'a')) ^. #bar . #baz

One tricky part is this case -

example = foo & (#bar . #baz) .~ 'z'

Clearly we want to parse both #bar and #baz as labels, but the lexer treats (# as the opening of an unboxed tuple. We may need to make the lexer smarter (or maybe dumber?) to handle cases like this. The smarter approach seems tough: we don't really know whether we're in an unboxed tuple until we hit the close paren and discover that it isn't a #), and even then dealing with all the nested cases could be a nightmare. The dumber approach would be to lex (, #, and bar as separate tokens and make the determination in the parser. The nested cases might still be tricky, but nothing like trying to handle them in the lexer.
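To make the "dumber lexer" idea concrete, here is a rough sketch in plain Haskell rather than the plugin's actual lexer and parser. All names (Raw, Tok, lexRaw, resolve) are hypothetical, and the label-first policy in resolve is an assumption, not what GHC does when UnboxedTuples is enabled. The lexer never fuses ( and # into one token; a later pass decides what each # means based on what sits directly next to it.

```haskell
import Data.Char (isAlpha, isAlphaNum, isSpace)

-- Raw tokens from the deliberately simple lexer (hypothetical names).
-- '(' , ')' and '#' are always separate tokens, so "(#" is never fused.
data Raw
  = ROpen | RClose | RHash
  | RIdent String   -- identifier such as bar
  | RSym String     -- run of other symbol characters, e.g. "."
  | RSpace          -- whitespace, kept so adjacency can be checked later
  deriving (Eq, Show)

lexRaw :: String -> [Raw]
lexRaw [] = []
lexRaw (c : cs)
  | c == '('  = ROpen  : lexRaw cs
  | c == ')'  = RClose : lexRaw cs
  | c == '#'  = RHash  : lexRaw cs
  | isSpace c = RSpace : lexRaw (dropWhile isSpace cs)
  | isAlpha c = let (xs, rest) = span isIdentChar cs
                in RIdent (c : xs) : lexRaw rest
  | otherwise = let (xs, rest) = break isBoundary cs
                in RSym (c : xs) : lexRaw rest
  where
    isIdentChar x = isAlphaNum x || x == '_' || x == '\''
    isBoundary x  = isSpace x || isAlphaNum x || x `elem` "()#"

-- Tokens after the parser-level decision.
data Tok
  = TOpen | TClose | TUnboxedOpen | TUnboxedClose
  | TLabel String | TIdent String | TSym String
  deriving (Eq, Show)

-- Assumed policy: '#' directly followed by an identifier is a label,
-- even right after '('. Only otherwise do "(#" and "#)" form unboxed
-- tuple delimiters. Whitespace is dropped once adjacency is resolved.
resolve :: [Raw] -> [Tok]
resolve (ROpen : rest@(RHash : RIdent _ : _)) = TOpen : resolve rest
resolve (ROpen : RHash : rest)    = TUnboxedOpen  : resolve rest
resolve (RHash : RIdent n : rest) = TLabel n      : resolve rest
resolve (RHash : RClose : rest)   = TUnboxedClose : resolve rest
resolve (ROpen : rest)            = TOpen         : resolve rest
resolve (RClose : rest)           = TClose        : resolve rest
resolve (RHash : rest)            = TSym "#"      : resolve rest
resolve (RIdent n : rest)         = TIdent n      : resolve rest
resolve (RSym s : rest)           = TSym s        : resolve rest
resolve (RSpace : rest)           = resolve rest
resolve []                        = []
```

With this sketch, resolve (lexRaw "(#bar . #baz)") gives [TOpen, TLabel "bar", TSym ".", TLabel "baz", TClose], while resolve (lexRaw "(# x #)") gives [TUnboxedOpen, TIdent "x", TUnboxedClose]. It deliberately ignores operators that contain #, MagicHash names, and an unboxed tuple written without a space after (#, which is exactly where the nested cases get tricky.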

As an aside, the example expression can also be written without the parentheses, which avoids the problem. This is equivalent because . (infixr 9) binds tighter than .~ (infixr 4) -

example = foo & #bar . #baz .~ 'z'

However, this is a workaround rather than a solution, and we still need to think about how to handle the parenthesized form.

tydeu commented

One tricky part is this case -

example = foo & (#bar . #baz) .~ 'z'

This is not valid syntax with UnboxedTuples enabled. The GHC user guide says:

Note that when unboxed tuples are enabled, (# is a single lexeme, so for example when using operators like # and #- you need to write ( # ) and ( #- ) rather than (#) and (#-).

You would need to write the expression like so:

example = foo & ( #bar . #baz ) .~ 'z'

Yes, but if UnboxedTuples is not enabled, it should work. Having the IntelliJ lexer change what is a lexeme based on language extensions will be tricky.
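For illustration only, an extension-sensitive lexer could look roughly like the following sketch. It is plain Haskell, not the plugin's lexer, and languagePragmas, splitCommas, OpenTok and lexOpen are hypothetical names. It assumes the extensions come only from one-line, upper-case LANGUAGE pragmas in the file itself; extensions enabled elsewhere (for example via default-extensions in a cabal file) would be missed, which is part of why this is tricky.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, stripPrefix)
import Data.Maybe (mapMaybe)

-- Collect extension names from lines such as {-# LANGUAGE Foo, Bar #-}.
-- Assumption: each pragma sits on one line and is spelled in upper case.
languagePragmas :: String -> [String]
languagePragmas src = concat (mapMaybe fromLine (lines src))
  where
    fromLine l = do
      body <- stripPrefix "{-# LANGUAGE" (dropWhile isSpace l)
      let inner = takeWhile (/= '#') body
      pure (filter (not . null) (map trim (splitCommas inner)))
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- Split a string on commas (base has no splitOn).
splitCommas :: String -> [String]
splitCommas s = case break (== ',') s of
  (x, [])       -> [x]
  (x, _ : rest) -> x : splitCommas rest

data OpenTok = OpenParen | OpenUnboxed
  deriving (Eq, Show)

-- Lex an opening delimiter. "(#" is a single lexeme only when the flag
-- says UnboxedTuples is on; otherwise just "(" is consumed.
lexOpen :: Bool -> String -> Maybe (OpenTok, String)
lexOpen unboxed ('(' : '#' : rest)
  | unboxed            = Just (OpenUnboxed, rest)
lexOpen _ ('(' : rest) = Just (OpenParen, rest)
lexOpen _ _            = Nothing
```

For the module at the top of this issue, "UnboxedTuples" `elem` languagePragmas src is False, so lexOpen False "(#bar . #baz)" yields Just (OpenParen, "#bar . #baz)") and #bar can then be lexed as a label. With the flag set to True, the same input yields Just (OpenUnboxed, "bar . #baz)"), matching the behaviour the GHC user guide describes.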