go-llvm/llgo

Rewrite llgo to use go.tools/ssa

axw opened this issue · 19 comments

axw commented

We should be using ssa in llgo. We would first build the pure-Go SSA representation via that package, and then translate to LLVM IR.

Pros:

  • Alleviates some burden/complexity from llgo (ssa takes care of identifying captures for closures, synthesising bridge functions, etc.).
  • Will simplify writing pure-Go optimisation passes.
  • Reduces scope of #42 (no need to implement various functionality taken care of by ssa).
  • Good citizenship (exercises ssa, with the view to improve the ecosystem).
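To make the first point concrete, here is a minimal sketch of the two source-level constructs it mentions: a closure with a captured variable, and a method value, which needs a synthesised wrapper with the receiver bound. The names `counter`, `makeAdder` and `boundInc` are hypothetical; this shows only the Go source side, not the output of the ssa package.

```go
package main

import "fmt"

// counter is a hypothetical type used only for illustration.
type counter struct{ n int }

func (c *counter) Inc() int { c.n++; return c.n }

// makeAdder returns a closure. base is a captured (free) variable that a
// compiler has to identify and keep alive in the closure's environment.
func makeAdder(base int) func(int) int {
	return func(x int) int { return base + x }
}

// boundInc returns the method value c.Inc: a func() int with the receiver
// already bound, which needs a small wrapper function behind the scenes.
func boundInc(c *counter) func() int {
	return c.Inc
}

func main() {
	add := makeAdder(10)
	fmt.Println(add(5)) // 15

	c := &counter{}
	inc := boundInc(c)
	inc()
	fmt.Println(inc()) // 2
}
```

A front end that works from the AST has to discover `base` and build the bound-method wrapper itself; the argument above is that ssa already does this bookkeeping.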

Cons:

  • Will take time away from other work.
  • Introduces a new dependency.
  • ?

+1.

IMO the new dependency is not a problem as we already use go/types, which is from the same sub-repository.

axw commented

Yeah, I don't think it's a major concern. Particularly because ssa is in go.tools, which means getting changes made upstream isn't going to be problematic.

IMO trying this out and implementing #42 should take priority and could make a good next milestone. While it's true that it'll take time away from other work, that other work might turn out to be solved already by using ssa or, worse, might not fit the new implementation and need to be rewritten.

I've used the SSA package extensively to create a working prototype translating Go to Haxe (see haxe.org), which then generates working programs in JS, Java, C++, C#, Flash, PHP & Neko. I've just got a prototype working of translated Go code running inside an OpenFL application (see openfl.org). I hope to clean up the project and open-source it later in the year.
I think the SSA package was written with exactly your type of project in mind, by people with a background in using llvm, so I think it would be a very good fit indeed.
Considering I've implemented all except one of the SSA instructions (I've not done Select yet), I've found very few "features" and only one minor bug in the SSA package.
As the Google team have now moved on to writing the optimisation phases (see go.tools/pointer) there are now fewer changes to the SSA code, so it seems pretty stable.
The only problem is that until the optimisation work is complete, the SSA generated code is currently un-optimised and so requires later processing to improve it... but the LLVM optimisation is legendary, so this may not be a problem for llgo.
As you can tell, I'm an enthusiast for the SSA package, which I think will extend the reach of the Go language to many more niches as more code generators are written.
Let me know if I can help...

axw commented

@elliott5 Very cool, I'll look forward to seeing your project.

I've been in contact with Alan Donovan, the author of the package. He originally wrote it for the oracle (static analysis) tool. But yes, it's now very close to complete, so there's nothing holding us back in that regard.

Thanks, I'll keep you in mind if I hit any stumbling blocks. Right now I just need to find some time.

Is there anything I can do here or on some other issue to aid? I've got some time this week, but from next week it's going to become trickier.

axw commented

@quarnster Not that I can think of. If I think of something, I'll let you know. It's not an easily parallelisable task. If you're keen to keep fixing things, then the runtime could do with some love (interface conversions in particular). That's not likely to be affected too much by this work, I think.

I'll be on airplanes for ~24h in a couple of weeks, so I expect I'll get a bunch done on this then, if not before.

Just to clarify: although the algorithm implemented by go.tools/pointer (Andersen's analysis) is used for optimization in many compilers including gcc and LLVM, that is not why we've built it, nor will it be used that way: our implementation is unsound w.r.t. aliases created by (*T)(unsafe.Pointer(x)). Our pointer analysis is intended only for code comprehension tools. Also, we have no specific plans to optimize the SSA code, but as you point out, LLVM can already do this.
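As a small illustration of the kind of alias meant here, the sketch below writes to a `float32` through a `*uint32` obtained via `unsafe.Pointer`. The function name `aliasViaUnsafe` is hypothetical. An analysis that does not track such conversions may fail to see that `p` and `&f` refer to the same memory, which is harmless for a code comprehension tool but not for an optimiser.

```go
package main

import (
	"fmt"
	"unsafe"
)

// aliasViaUnsafe modifies f through a pointer of a different type.
func aliasViaUnsafe() float32 {
	f := float32(1.0)
	p := (*uint32)(unsafe.Pointer(&f)) // p aliases f, but the pointee types differ
	*p = 0x40000000                    // IEEE 754 single-precision bit pattern of 2.0
	return f
}

func main() {
	fmt.Println(aliasViaUnsafe()) // 2
}
```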

@adonovan Many thanks for the clarification, much appreciated.

axw commented

I've pushed some preliminary work on using the ssa package in llgo. This is in a new "ssa" branch, here: https://github.com/axw/llgo/tree/ssa

It's incomplete, and most likely bug-ridden.

axw commented

I've changed the title; I'm committed to doing this.

@axw - fantastic!

axw commented

Brief update: I've got the majority of the easy bits done. Still remaining are:

  • interface support; namely conversions, "invoke" method calls
  • panic/recover
  • handling of recursive pointers (peano)
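For readers unfamiliar with the last item: "peano" presumably refers to the program of that name in the Go test suite, which is built on a pointer type defined in terms of itself. The sketch below is a cut-down version under that assumption; `count` is a hypothetical helper added for illustration. A type translator that naively expands the pointee type would never terminate on `Number`.

```go
package main

import "fmt"

// Number is a pointer type whose element type is itself.
type Number *Number

func zero() *Number { return nil }

// add1 allocates a new cell that points at x, representing x+1.
func add1(x *Number) *Number {
	e := new(Number)
	*e = x
	return e
}

// sub1 follows one level of indirection, representing x-1.
func sub1(x *Number) *Number { return *x }

// count converts a Peano number back to an int by walking to zero.
func count(x *Number) int {
	n := 0
	for x != nil {
		x = sub1(x)
		n++
	}
	return n
}

func main() {
	three := add1(add1(add1(zero())))
	fmt.Println(count(three)) // 3
}
```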

panic/recover are going to be reimplemented to use setjmp/longjmp. LLVM doesn't support non-call exceptions (http://llvm.org/PR1269), which means no segfault-to-exception translation and the like. Instead, we can use setjmp/longjmp with signal handlers to do the job.
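The source-level behaviour that has to be preserved can be sketched as follows, assuming that "segfault-to-exception translation" refers to cases such as a nil pointer dereference becoming a recoverable run-time panic. `point` and `safeDeref` are hypothetical names; the sketch says nothing about how llgo implements the unwinding.

```go
package main

import "fmt"

// point is a hypothetical type used only for illustration.
type point struct{ x int }

// safeDeref reads p.x. If p is nil, the faulting load surfaces as a run-time
// panic, which the deferred function recovers and turns into an error.
func safeDeref(p *point) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return p.x, nil
}

func main() {
	fmt.Println(safeDeref(&point{x: 7}))
	fmt.Println(safeDeref(nil))
}
```

Note that the panic here originates from an ordinary load rather than from a call, which is the case the paragraph above says LLVM's exception support does not cover.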

I'm going to rework interfaces to represent them the same way that gc does. This will make the transition to libgo simpler, as well as being the right thing to do in itself.
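As a rough illustration of the gc representation: with the gc toolchain an interface value occupies two machine words, one for type information and one for the data. The sketch below only measures the size; `emptyIfaceWords` and `stringerWords` are hypothetical helpers, and the result of 2 is an observation about gc, not something the language specification guarantees.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size of a pointer-sized word on the target.
const wordSize = unsafe.Sizeof(uintptr(0))

// emptyIfaceWords reports how many words an interface{} value occupies.
func emptyIfaceWords() int {
	var e interface{}
	return int(unsafe.Sizeof(e) / wordSize)
}

// stringerWords reports the same for an interface that has methods.
func stringerWords() int {
	var s fmt.Stringer
	return int(unsafe.Sizeof(s) / wordSize)
}

func main() {
	fmt.Println(emptyIfaceWords(), stringerWords())
}
```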

I'm delighted that the re-write is going so well and that you have decided to use libgo.
I spoke about this aspect of the llgo project as part of my talk at the Go London User Group last night.
Slides at https://speakerdeck.com/elliott5/ssa
The audience were very interested in what you are doing...

axw commented

@elliott5 Thanks for the shout out! Looks like it would've been an interesting talk (and not just the bits about llgo ;))

axw commented

This is just about done. There are some problems relating to the CFG being modified, such that the information that go.tools/ssa emits is no longer correct (namely the Phi edges). There's only one place now that we generate additional blocks and branches, and that's in Call. I'm working on fixing that. This will be tracked in #40.
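The Phi problem can be sketched with a toy model. In go.tools/ssa, a Phi's i-th edge is the value that arrives from the i-th predecessor of its block, so the two slices must stay aligned. The `block`, `phi` and `split` definitions below are hypothetical stand-ins, not the ssa types or llgo's code; they only show how splitting a predecessor during lowering leaves the original predecessor information pointing at the wrong block.

```go
package main

import "fmt"

// block and phi are toy stand-ins, not the go.tools/ssa types.
type block struct {
	name  string
	preds []*block
}

type phi struct {
	block *block
	edges []string // edges[i] is the value arriving from block.preds[i]
}

// incoming returns the value the phi takes when control arrives from pred.
func (p *phi) incoming(pred *block) (string, bool) {
	for i, b := range p.block.preds {
		if b == pred {
			return p.edges[i], true
		}
	}
	return "", false
}

// split models a backend that splits from in two while lowering one of its
// instructions: the new tail block replaces from as a predecessor of to.
func split(from, to *block) *block {
	tail := &block{name: from.name + ".tail", preds: []*block{from}}
	for i, b := range to.preds {
		if b == from {
			to.preds[i] = tail
		}
	}
	return tail
}

func main() {
	thenB := &block{name: "then"}
	elseB := &block{name: "else"}
	join := &block{name: "join", preds: []*block{thenB, elseB}}
	p := &phi{block: join, edges: []string{"x1", "x2"}}

	v, ok := p.incoming(thenB)
	fmt.Println(v, ok) // x1 true

	tail := split(thenB, join)
	_, ok = p.incoming(thenB)
	fmt.Println(ok) // false: control no longer arrives from "then"
	v, ok = p.incoming(tail)
	fmt.Println(v, ok) // x1 true
}
```

A backend that emits phi nodes from the original predecessor list would name `then` as the incoming block even though the branch now comes from `then.tail`, so it must either avoid such splits or remap the incoming blocks.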

axw commented

Close to merging now. #40 is done in the ssa branch. llgo no longer generates additional basic blocks (an oversimplification: it no longer generates BBs that invalidate the pred/succ information yielded by ssa).

I've been working through some bugs in runtime type descriptor generation, and it's now working well (and better than the master branch; runtime.eqtyp now covers all type kinds, and struct and function runtime types are now generated more completely).

I need to re-enable the export/import code, and debug code. I may do the latter after merging, but export/import should come first. I will probably also update llgo-dist to generate the jmp_buf structure (via cgo). Right now it's only been done, by hand, for linux-x86_64.

Soo... Can this be closed now? ;)
Awesome work!

axw commented

Yep, thanks for reminding me :)