AndrewRadev/splitjoin.vim

Support for Elm

Bastes opened this issue · 22 comments

Hi,

I'm using your plugin a lot in my JavaScript/Ruby life, and would love to be able to use it with the Elm programming language too 🙂 I'd offer a PR right away if I were already adept at Vim plugin programming, but since I'm far from it I wouldn't know where to start 😅 With a few pointers, though, I'd be willing to try.

I'm happy to hear you're interested in contributing :). I have an old blog post that explains some stuff about splitjoin, but it's kind of high-level and not meant for guiding people around the codebase. Still, here it is: https://andrewra.dev/2012/11/27/vimberlin-lessons-learned-from-building-splitjoin/.

There are two components to filetype support in splitjoin -- the filetype file and the actual implementation of the functions. Here's an example, the file ftplugin/javascript/splitjoin.vim:

if !exists('b:splitjoin_split_callbacks')
  let b:splitjoin_split_callbacks = [
        \ 'sj#js#SplitFunction',
        \ 'sj#js#SplitObjectLiteral',
        \ 'sj#js#SplitFatArrowFunction',
        \ 'sj#js#SplitArray',
        \ 'sj#js#SplitOneLineIf',
        \ 'sj#js#SplitArgs',
        \ ]
endif
if !exists('b:splitjoin_join_callbacks')
  let b:splitjoin_join_callbacks = [
        \ 'sj#js#JoinFatArrowFunction',
        \ 'sj#js#JoinArray',
        \ 'sj#js#JoinArgs',
        \ 'sj#js#JoinFunction',
        \ 'sj#js#JoinOneLineIf',
        \ 'sj#js#JoinObjectLiteral',
        \ ]
endif

These are buffer-local variables, each of which is a list of function names to call, in that order, to transform the code.

The actual functions are implemented in autoload/sj/<filetype>.vim. Here's one of them:

function! sj#js#SplitObjectLiteral()
  let [from, to] = sj#LocateBracesAroundCursor('{', '}')
  if from < 0 && to < 0
    return 0
  else
    let pairs = sj#ParseJsonObjectBody(from + 1, to - 1)
    let body = join(pairs, ",\n")
    if sj#settings#Read('trailing_comma')
      let body .= ','
    endif
    let body = "{\n".body."\n}"
    call sj#ReplaceMotion('Va{', body)
    if sj#settings#Read('align')
      let body_start = line('.') + 1
      let body_end = body_start + len(pairs) - 1
      call sj#Align(body_start, body_end, 'json_object')
    endif
    return 1
  endif
endfunction

These functions are supposed to return 1 if they were successful, or 0 if they don't apply. So, the first line in that example tries to locate curly brackets somewhere around the cursor. If it fails, that means the cursor is not within curly brackets, the function returns 0 and the next function in that list is attempted. Otherwise, the code transforms the buffer and returns 1, which indicates to the plugin that its work is done. That should be enough for you to set up elm support.
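To make the return-value contract concrete, here is a minimal sketch of such a dispatch loop. It is not splitjoin's actual dispatcher: DemoRunCallbacks and the two stub callbacks are hypothetical names made up for illustration, standing in for entries like sj#js#SplitArray.

```vim
" Hypothetical stand-ins for real callbacks such as sj#js#SplitArray.
function! DemoNotApplicable() abort
  " Pretend the cursor is not inside the construct this callback handles.
  return 0
endfunction

function! DemoApplies() abort
  " Pretend the buffer was transformed successfully.
  return 1
endfunction

" Call each named function in order and stop at the first one that
" returns 1. Returns the name of that function, or '' if none applied.
function! DemoRunCallbacks(callbacks) abort
  for name in a:callbacks
    if call(name, [])
      return name
    endif
  endfor
  return ''
endfunction

" Echoes DemoApplies
echo DemoRunCallbacks(['DemoNotApplicable', 'DemoApplies'])
```

The order of the list matters: a more specific callback should come before a more general one that would also match at the same cursor position.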

What I can suggest is that you start by thinking about what kinds of operations you'd like to do on Elm code. I'm not familiar with Elm myself, so I don't know what transformations are common. I see in the tutorial something like Browser.sandbox { init = 0, update = update, view = view } and that might just work with the sj#ParseJsonObjectBody as it's being used in the example I gave you. Then again, it might not work, because of wrong assumptions in the code, and maybe you could take a look at constructing your own "parser" (which is a strong word for it) particularly for some elm construct: https://github.com/AndrewRadev/splitjoin.vim/blob/62d42e1ac5dcf8f3c70bd344d31300ee39d0e580/autoload/sj/argparser/js.vim

If you decide this is too much Vimscript and you won't be able to manage it, I can try to implement cases that you come up with. But it'll probably take me some time to get around to it. If you can build a PR, I'll be happy to give you feedback and suggest adjustments. You can also run automated tests, which might make your work easier, but elm is not a built-in filetype, so setting things up might be more effort than it's worth.

Thanks a lot :) I'll be looking into it and see where it goes :)

(oooh, rspec specs; that will be handy 😉)

(and there's one not passing (./spec/plugin/ruby_spec.rb:461) ; I'll try and look into it once I'm comfortable enough with the codebase)

This is one I recently enabled -- it relies on a git submodule being checked out, to test an optional plugin dependency. I feel like I should add an exception during set up, if the plugins are not checked out, to init git submodules. Maybe later.

Ok :) I'll re-test with submodules :) thanks for your help so far, I hope to be soon up-and-running.

Hey, it's been quite some time but I finally got around to some progress :) thanks for your help figuring out how to get started, I'll be sure and post a link to the branch as soon as I have something that works for part of the scope if you want to have a critical look at how it starts.

Well, here's my first kinda successful attempt:
Bastes@137bc0d

It does split an array correctly when I try manually... but the specs fail as if nothing had happened. I don't know where I went wrong there 😅 don't hesitate to tell me if you see what I missed.

I'll try and get further tomorrow, maybe the night will help.

Good work :). The problem you're having is that the tests run Vim without any of your plugins or config and elm support is not built-in. You could fix the issue by adding a set filetype after starting to edit the file:

set_file_contents <<~EOF
  list =
      [1, 2, 3, 4]
EOF
vim.command('set filetype=elm')

Although this will also not indent things correctly. Ideally, you want to vendor elm support:

diff --git .gitmodules .gitmodules
index 335a922..12712f0 100644
--- .gitmodules
+++ .gitmodules
@@ -7,3 +7,6 @@
 [submodule "spec/support/tabular"]
 	path = spec/support/tabular
 	url = https://github.com/godlygeek/tabular
+[submodule "spec/support/vim-elm-syntax"]
+	path = spec/support/vim-elm-syntax
+	url = https://github.com/andys8/vim-elm-syntax
diff --git spec/spec_helper.rb spec/spec_helper.rb
index 9a6646c..a45a962 100644
--- spec/spec_helper.rb
+++ spec/spec_helper.rb
@@ -14,12 +14,14 @@ Vimrunner::RSpec.configure do |config|
     # Up-to-date filetype support:
     vim.prepend_runtimepath(plugin_path.join('spec/support/rust.vim'))
     vim.prepend_runtimepath(plugin_path.join('spec/support/vim-javascript'))
+    vim.prepend_runtimepath(plugin_path.join('spec/support/vim-elm-syntax'))
 
     # Alignment tool for alignment tests:
     vim.add_plugin(plugin_path.join('spec/support/tabular'), 'plugin/Tabular.vim')
 
     # bootstrap filetypes
     vim.command 'autocmd BufNewFile,BufRead *.rs set filetype=rust'
+    vim.command 'autocmd BufNewFile,BufRead *.elm set filetype=elm'
 
     if vim.echo('exists(":packadd")').to_i > 0
       vim.command('packadd matchit')
diff --git spec/support/vim-elm-syntax spec/support/vim-elm-syntax
new file mode 160000
index 0000000..846a592
--- /dev/null
+++ spec/support/vim-elm-syntax
@@ -0,0 +1 @@
+Subproject commit 846a5929bff5795256fbca96707e451dbc755e36

So, use git submodule add to get up-to-date elm support, next to the Rust and Javascript ones, and load it during tests.

In general what I do to debug situations like these is to throw a require 'pry'; binding.pry wherever I need Vim to stop. You should have a gui Vim instance waiting and you can interact with it directly, see what doesn't work.

Awesome, thanks for everything 👍

As far as I can see, I might have to make two parsers at least (one for functions, which are not comma-separated in Elm, along with the one for arrays that should also work with tuples and records with minor tweaking). I'll keep posting updates until it grows up to PR material 😉

As far as I can see, I might have to make two parsers at least (one for functions, which are not comma-separated in Elm, along with the one for arrays that should also work with tuples and records with minor tweaking).

For the arrays at least, I wonder if one of the existing parsers could do the trick. The array in the test you wrote seems very similar to lots of other languages, other than the actual form with commas in the beginning. But the "split into parts" action could re-use something else -- you could take a look at this for inspiration if you haven't already:

function! sj#js#SplitArray()
  return s:SplitList(['[', ']'], 'cursor_inside')
endfunction

Of course, maybe there's particular elements in elm that could confuse the "JSON parser" used over there, so you might really need it. I see that your parser checks for ] to determine the end of a list, but that might be unreliable -- there's nested lists (though I only glanced at it, so I might be missing some context). If you follow that function, you'll see that I first pick out the correct start and end of the function and then run the insides through the JSON parser.

Anyway, it's your call -- as you say, keep posting things, step by step, build up a test suite that you're happy with and I'll help out with reviewing later.

Thanks for your guidance; so far I was tinkering until it clicked more than anything (TDD habits: go green as fast as possible, even if it's not polished 😉 hence my joy when I discovered you were using rspec), and I was planning on writing tests for those kinds of cases anyway (with sub-strings, tuples, arrays and records) to be safe there too.

Stay tuned 🙂

This is going rather well 🙂 I've tackled the first real problem (matching strings, sub-lists, etc.) and here's a new version:
Bastes@fcb1fdc

My strategy evolved rather radically when I realized Vim was doing a better job of capturing text inside quotes and other markers than I would by parsing char by char, so I removed the parser altogether and wrote my own loop from scratch, exploiting the cursor position and normal! calls.

I tried to neatly label those normal! calls so as to keep the thing readable. I'm likely to bundle that as a parser afterwards. There are already quite some tests so I guess refactoring won't be too much trouble (but I'll have to get to bed for now 😉).

Next step (probably): splitting tuples and records.

Again, thanks for your help getting started 🙂 👍

Yes, using normal-mode actions, particularly for text objects can be a really efficient and convenient way to get and replace text :). You could try using this helper function, which also takes care of storing and restoring the cursor position:

" function! sj#GetMotion(motion) {{{2
"
" Execute the normal mode motion "motion" and return the text it marks.
"
" Note that the motion needs to include a visual mode key, like "V", "v" or
" "gv"
function! sj#GetMotion(motion)
  call sj#PushCursor()
  let saved_register_text = getreg('z', 1)
  let saved_register_type = getregtype('z')
  let @z = ''
  exec 'silent normal! '.a:motion.'"zy'
  let text = @z
  if text == ''
    " nothing got selected, so we might still be in visual mode
    exe "normal! \<esc>"
  endif
  call setreg('z', saved_register_text, saved_register_type)
  call sj#PopCursor()
  return text
endfunction

You could use it in the "capture matching character" function:

function sj#elm#CaptureMatching(character)
  return sj#GetMotion('va'.a:character)
endfunction

Thanks again for the tip 😄 that could make things tidier indeed.

Maybe I'll eventually put some of the more generic functions there too, the capturing ones might not be that elm-related after all.

Some news 🙂 this is going rather well, splitting for lists, tuples and records is done and I'm rather happy with the shape of the code (there'll be some documentation to do, I didn't want to be constantly re-writing it).

I ended up relying on searchpair, searchpairpos and normal! %, which allowed me to write only one function for the three cases that can target the outermost pair of (), [] or {} and ignore the rest (no sense in splitting a sub-element if the encompassing element is not split already).
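As an illustration of the "outermost pair" idea, here is a minimal sketch built on searchpairpos(). It is not the code from the branch: DemoOutermostOpening is a hypothetical name, and the sketch assumes brackets are balanced and that none of them sit inside strings or comments (a real implementation would likely pass a skip expression).

```vim
" Hypothetical helper (not part of splitjoin): return [line, col] of the
" outermost opening bracket enclosing the cursor, or [0, 0] if the cursor
" is not inside any (), [] or {} pair.
" Assumption: brackets inside strings or comments are not skipped.
function! DemoOutermostOpening() abort
  let saved_view = winsaveview()
  let outermost = [0, 0]
  while 1
    " 'b' searches backward, 'W' prevents wrapping around the file. Each
    " successful call moves the cursor onto the next enclosing opener.
    let pos = searchpairpos('[([{]', '', '[)\]}]', 'bW')
    if pos == [0, 0]
      break
    endif
    let outermost = pos
  endwhile
  " Put the cursor and scroll position back where they were.
  call winrestview(saved_view)
  return outermost
endfunction
```

With the cursor on the 2 in `list = [1, (2, 3), 4]`, the first search lands on the inner `(` and the second on the `[`, so the function reports the position of the list rather than the tuple.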

I documented the cases where the syntax makes for a poor output, mainly so I can attempt to fix the syntax itself once this PR is done.

Now I think I'm going to focus on joining those structures, I'll let you know when there's something interesting to review there 🙂

Some (good) news: https://github.com/Bastes/splitjoin.vim/tree/elm :)

Now the split/join mechanic is about where I wanted it for the braces cases, I think I'm going to document, cleanup/rename functions for clarity and suggest a PR before attempting to tackle the (more ambitious) function chaining problem.

If you have ideas on how to go about this, here's what I'd like to do:

thingy =
    someValue
        |> someFunction someParam
        |> someOtherFunction (\foo -> (foo, bar))
        |> aThirdFunction (\baz -> 2 * baz) otherParams

-- ^
-- split

-- join
-- v
thingy =
    aThirdFunction (\baz -> 2 * baz) otherParams (someOtherFunction (\foo -> (foo, bar)) (someFunction someParam someValue))

Joining seems rather straightforward (detect a pattern where consecutive lines start with a |> and put them back in reverse order, progressively surrounded by parens) but splitting less so; maybe search at the end of the line for parens that are not tuples, take the last param of each, then split and reverse? Still a bit confused about that. I'll let you comment if you like :)
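The joining direction described above can be sketched as a pure string transformation, leaving aside how the lines are located in the buffer. DemoJoinPipeline is a hypothetical name, and the sketch assumes the pipeline has already been cut into its steps with the leading |> removed.

```vim
" Hypothetical helper: fold pipeline steps into one nested call.
" a:steps[0] is the initial value, the rest are the piped function calls
" with the leading "|>" already stripped.
function! DemoJoinPipeline(steps) abort
  let expr = a:steps[0]
  for step in a:steps[1:]
    " Wrap the accumulated expression in parens unless it is a single word.
    let arg = expr =~ '\s' ? '(' . expr . ')' : expr
    let expr = step . ' ' . arg
  endfor
  return expr
endfunction

echo DemoJoinPipeline([
      \ 'someValue',
      \ 'someFunction someParam',
      \ 'someOtherFunction (\foo -> (foo, bar))',
      \ 'aThirdFunction (\baz -> 2 * baz) otherParams',
      \ ])
```

For the example pipeline this produces the joined expression shown above, with each earlier step ending up as the last argument of the next one.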

Now the split/join mechanic is about where I wanted it for the braces cases, I think I'm going to document, cleanup/rename functions for clarity and suggest a PR

Great, good work 👍 :)

If you have ideas on how to go about this, here's what I'd like to do:

First, I would recommend focusing on a single split instead of looking for a way to affect the entire structure. You can always call the resulting function in a loop until it fails to split anything, but being able to selectively merge two piped functions, or split one expression into pipes is probably going to give you more flexibility when working. The same can be said about joining pipes into function calls.

As for how to do it, the only way I can imagine is by implementing an argument parser to get the pieces of the expression. The biggest challenge might be figuring out where to start. For a language like python, you can safely assume that keyword( is the start of a function. For ruby, keyword<space> can be, because the delimiter is ,. For elm and haskell, I imagine it can be trickier.

You could start with the assumption that a function call you'll be splitting is either:

  • at the start of the line
  • after a =
  • after a |>

And you could possibly enumerate other cases. In the worst case scenario, at least the most common cases you can think of might work, even if you need to manually restructure more complicated ones. And as you go, you can reevaluate what "the start of a function" is and make changes. That's how I've been rolling pretty much everything in splitjoin -- start with some assumptions, see new cases, think of ways to integrate them.
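Those three starting assumptions could be encoded roughly like this. DemoCallStart is a hypothetical helper, not something that exists in splitjoin, and it only looks at a single line; record fields such as `{ init = 0 }` would also trigger the "after a =" case, so treat it as a first approximation to refine as new cases show up.

```vim
" Hypothetical helper: guess the 0-based byte index at which the function
" call on a line starts, using the three assumptions listed above.
function! DemoCallStart(line) abort
  " Case 1: a pipeline step, e.g. "    |> someFunction someParam".
  let after_pipe = matchend(a:line, '^\s*|>\s*')
  if after_pipe >= 0
    return after_pipe
  endif
  " Case 2: a definition, e.g. "thingy = someFunction someParam".
  " The "=" must be surrounded by whitespace, so "==" and "/=" don't count.
  let after_equals = matchend(a:line, '^[^=]*\s=\s\+')
  if after_equals >= 0
    return after_equals
  endif
  " Case 3: fall back to the first non-blank character of the line.
  return matchend(a:line, '^\s*')
endfunction
```

The returned index could then be used as the start_index given to an argument parser, instead of always parsing from the beginning of the line.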

(You could also decide that the start of the function is the word under the cursor, expand('<cword>'), and then find the next space and keep going with arguments from that point on. Not sure how reliably that would work.)

You might consider copying the HTML arg parser, since HTML attributes are also space-delimited: https://github.com/AndrewRadev/splitjoin.vim/blob/20e41455e1155f5989ecac007fc92c9415244822/autoload/sj/argparser/html_args.vim.

You wouldn't need the "end of a tag" logic and the brackets would be different (maybe only parentheses). And the result of that parser would spit out the first argument as the function. Here's what I tried:

function! sj#argparser#elm#Construct(start_index, end_index, line)
  let parser = sj#argparser#common#Construct(a:start_index, a:end_index, a:line)

  call extend(parser, {
        \ 'Process': function('sj#argparser#elm#Process'),
        \ })

  return parser
endfunction

function! sj#argparser#elm#Process() dict
  while !self.Finished()
    if self.body[0] == ' '
      if self.current_arg != ''
        call self.PushArg()
      endif
      call self.Next()
      continue
    elseif self.body[0] =~ '["''(]'
      call self.JumpPair('"''(', '"'')')
    endif

    call self.PushChar()
  endwhile

  if len(self.current_arg) > 0
    call self.PushArg()
  endif
endfunction

With that, I tried running

let parser = sj#argparser#elm#Construct(0, col('$'), getline('.'))
call parser.Process()
echo parser.args

And I got:

['aThirdFunction', '(\baz -> 2 * baz)', 'otherParams', '(someOtherFunction (\foo -> (foo, bar)) (someFunction someParam someValue))']

Which seems pretty reasonable. Of course, instead of 0 and col('$'), you'd need to figure out the start and end of the function.

Once you're done splitting arguments, you take a look at the last one and see if it looks like (funcCall<space>.*) and if so, unpack it and put it on the previous line, leave the current with |>. If you want to do the whole thing in one go, you can keep calling the function, or just let the user continue, if they want to split it.
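A minimal sketch of that single split step, working on an already-parsed argument list like the one echoed above. DemoSplitLastCall is a hypothetical name, and the pattern used to recognise a parenthesised call is a heuristic: it rejects tuples and lambdas that start with a backslash, but would still accept something like an operator expression in parentheses.

```vim
" Hypothetical helper: given the parsed pieces of a call, move the last
" argument onto its own line if it looks like a parenthesised call.
" Returns [previous_line, current_line], or [] if nothing can be split.
function! DemoSplitLastCall(args) abort
  if len(a:args) < 2
    return []
  endif
  let last = a:args[-1]
  " Heuristic: "(name<space>...)" where name is an identifier, optionally
  " qualified with dots. "(foo, bar)" and "(\x -> x)" do not match.
  if last !~# '^(\h[[:alnum:]_.]*\s.*)$'
    return []
  endif
  " Strip the surrounding parens and pipe the result into what remains.
  let inner = last[1:-2]
  let outer = join(a:args[:-2], ' ')
  return [inner, '|> ' . outer]
endfunction
```

Calling it on the argument list above yields the inner `someOtherFunction ...` call as the previous line and `|> aThirdFunction (\baz -> 2 * baz) otherParams` as the current one; running the parser and this step again on the previous line would peel off the next layer.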

I'm sure there will be a lot of details to manage, but this would be what I'd start with, at least.

Side note: here's some more of my thoughts on working with space-delimited stuff: AndrewRadev/sideways.vim#36 (comment). It's for a different plugin, but the problems are the same. If you have foo bar baz and your cursor is on bar, how would the plugin know that foo is the function call and bar is the variable? In splitjoin's case, it might actually not be a problem, because you can tell the user "put your cursor on the function, please", sideways is a different story.

Thanks @AndrewRadev for all those ideas, I'll try and see where this all leads me once the cleanup is done and I've already got PR material ;) stay tuned...

So, here we are, the first PR :) thank you for your patient support, I'll be working on split/joining function chains now.

Well, we can consider this one closed ;) I'll maybe open another one for the pipeline transformation we discussed.