the big refactor plan for the refactoring
fommil opened this issue ยท 35 comments
@mlangc @xeno-by @smarter @sschaef @dragos @rorygraves (please add more as I don't know all handles)
Just to summarise what we discussed.
- a lot of refactorings currently don't work. We are happy for the next release to remove all but: "rename", "organize [sic] imports" and "move class"
- the scalac Tree API desugars the Trees such that printing the refactored trees is not an option. We are proposing to use the presentation compiler to produce the Tree diff, but then interpret the diff in a sugar-preserving Concrete Syntax Tree, such as the one produced by
scala.meta(but we could also investigate using scalariform or https://github.com/lihaoyi/fastparse or bribe @sirthias with lots of beer to write it in parboiled2). Aka the "libre monad".
Actually, it is not such a big refactor plan. Moving to scala.meta would be a big refactoring plan but we ended up with the conclusion that we won't do that since it is too unstable. We also came to the conclusion that we don't remove any refactorings but to mention in the README that they are not actively maintained. Matthias wants to try out the Libre Monad idea however. If there is a parser required we need to use scalariform, since we have to build all Scala dependencies against the nightlies of Scala itself.
Well, the first thing I want to try is to implement rename refactorings without any tree transformations at all. The approach we discussed is:
- Use the presentation compiler to find all places where the symbol that should be renamed occurs in the source code.
- Do simple string replacements in exactly these places, instead of tree transformations and tree printing.
- Eventually rename source files.
I think that this should work just fine - but maybe I'm missing something. If this approach should turn out to be successful, we might consider similar strategies for "Move Class".
- "Organize Imports" needs an update from the presentation compiler, that allows us to fix #1001793. This has been discussed with @dragos . A hybrid approach as suggested above could definitely make sense here: Use presentation compiler ASTs to find out which imports can be removed/have to be kept, but perform the actual transformation on a scalameta CST (concrete syntax tree). @wpopielarski what do you think?
- I'd rather not think too much about more advanced refactorings until we get the basics right.
As for unmaintained refactorings: We want to keep them in the library as @sschaef pointed out. They should however be dropped from Scala-IDE (are they included in ENSIME?), out of respect to our users ;-).
cool stuff! In terms of the unmaintained refactorings, I agree with @mlangc that if they aren't stable we'd be happy just removing them. I'll create a ticket for that.
I thought the parser part of scala.meta was stable enough to use? I'd really love to write a parboiled2 parser for scala, but unfortunately it's not in my top 10 "pain" list and they tend to get fixed long before my top 10 "love to do" list ๐ข
On 02/14/2016 12:49 PM, Sam Halliday wrote:
I thought the parser part of |scala.meta| was stable enough to use?
I'd really love to write a parboiled2 parser for scala, but
unfortunately it's not in my top 10 "paint" list and they tend to get
fixed long before my top 10 "love to do" list ๐ขWe would have to compile it against every nightly of scalac, which adds
additional complexity in our build infrastructure. Beside from that a
parboiled2 parser for scala already exists, created by lihaoyi and paulp.
@sschaef The pb2 parser that you're referring to doesn't generate trees - it just verifies the fact that a given string is valid wrt Scala's syntax.
can you please link me to the parser? I thought howey wrote his own parser library for this
It looks like they both removed the parser from their repos but I still have it on my own computer. I can send it to you but as Eugene said, the parser doesn't create a tree. Shouldn't be too difficult to add one though.
might be worth asking them where they moved the parser to
I think this is the most up-to-date parser: https://github.com/lihaoyi/fastparse/tree/master/scalaparse/shared/src/main/scala/scalaparse.
If you plan to create custom trees for scalaparse, you may want to consider integration with scala.meta trees. You could also try out scala.meta trees first to see whether the notion of CSTs will be useful for your purposes.
my preference would be to use scala.meta first, falling back to the parboiled2 or scalaparse implementations.
@lihaoyi can your scala parser return a tree representation? Would it be possible to be able to provide a position in the source code and get back a lens that points at an element in the tree? Consider this use case: we know exactly which position of all variables that need to be renamed, and we would like to be able to provide them to a tree representation of the code, make the change, then write out the refactored version.
Sorry for not following this discussion more closely, but here are my 2c regarding the pb2 Scala parser:
The work is almost entirely @lihaoyi and paulp's and the pb2 version is essentially unmaintained and probably not in very good shape.
If @lihaoyi has continued the work on the Scala parser in the context of his fastparse lib (as I assume) I'd definitely recommend to build on that one rather than pb2 for this project.
Without having any experience with fastparse myself it's my impression is that it is receiving more love from its authors than pb2 at this point, making it a better base for a large language parser like the one for Scala.
I do have plans for pb2 but very limited capacity for it right now, so, if you really want to put large weight on a parsing toolkit, pb2 might not be the best choice, unfortunately...
Yeah the PB2 version is dead. The FastParse one is heavily used in Ammonite and Scalatex and other places and isn't going anywhere.
I'm not currently generating trees because I don't need them. Doing so would be straightforward, if slightly tedious, but it shouldn't require any novel techniques or problem solving. Someone just needs to define the 100 or so nodes that form the Scala Concrete Syntax Tree and put in the 100-200 .map calls to plumb them into Scalaparse.
The runtime perf will be worse than PB2 but the Quality of Life is much better and developer velocity probably higher
ps. to answer the earlier question, no it does not currently track RangePositions, just like it doesn't even build an AST. Making it track 100% accurate RangePositions would be basically a bunch of grunt-work and isn't inherently difficult at all. Someone just needs to put a bunch of Index parsers before and after each node and plumb the resultant integers into where-ever they belong inside the AST
@lihaoyi just when I thought you were going to write your own scala compiler, you go and let me down.
A Google Summer of Code idea, we'll be submitting this via EPFL http://ensime.github.io/contributing/ideas/
It might interest you that rename refactorings no longer do any tree printing at all: #133
Very cool stuff @mlangc
nice!!!!
@pavelfatin where is the IntelliJ CST code? I'd like to link to it from our proposal
@fommil I'm not sure if I understand your SOC proposal: You mean implementing something like
def findInCst(astNode: CompilerAstNode): Option[CstNode]
that would return Some(x) in all but a few corner cases? That could be quite useful for the refactoring library, if we take the hybrid approach discussed above. The reverse mapping
def findClosestMatchInAst(cstNode: CstNode): Option[CompilerAstNode]
might also be of interest, but I don't have a specific use case for it in mind at the moment (although I'm almost sure that there are some).
@xeno-by Did I understand you correctly if I think that you are currently struggling with removing the Option from these transformations, so that you can enrich your CSTs appropriately?
Pretty much, but a lens would allow you to find it within your tree, and swap it out.
Hmm, can you give me an example?
Well basically instead of returning an Option[Cst] it would return Option[CstLens] and CstLens would have get/set on it.
Ahh... makes sense to me now ;-).
the idea would be to come up with another Tree-like API (Cst, a sealed family of case classes) which is preserving of the source code (several exist already) and then provide a Lens something like
// tree is a branch in the full AST
// cst is the full CST for the compilation unit
lens(tree: Tree, cst: Cst): Lens
trait Lens {
// get the source code preserving element referenced by the Tree
def get: Cst
// return a new full Cst for the compilation unit with this one part replaced
def set(cst: Cst): Cst
}Hmm, there are trees in the AST that cannot be easily mapped to corresponding CST constructs. Run
case class Example(i: Int = 0, j: Int = 0) {
def foo = copy(j = 1)
}through scalac -Xprint:4 Example.scala and marvel at all the <synthetic> stuff to see what I mean. Thus I think that we would rather have
lens(tree: Tree, cst: Cst): Option[Lens]With ScalaSphere having just finished up is there now any plan to start using ScalaMeta/Scalafix for refactorings?
there is a GSoC and it's up to @mlangc to select it or not depending on the quality
I've endorsed the proposal, so lets hope that it gets accepted by Google (I can't tell what the chances are though).