Joining multiple levels at once adds unneeded whitespace

Notes

I'm using Neovim v0.3.2-996-g55c518588.
let g:splitjoin_normalize_whitespace = 0 was tested to no avail.
By joining one level at a time, instead of the whole at once, it appears to work well.

Structure 1

foo(
  \/ ------------------------------------ CURSOR HERE
  bar(
    baz(3,4,5),
    biz(2,3)
  )
)

When typing gJ, with the cursor at bar, I get the following result ✔️

foo(
  bar(baz(3,4,5), biz(2,3))
)

Structure 2

\/ ------------------------------------ CURSOR HERE
foo(
  bar(
    baz(3,4,5),
    biz(2,3)
  )
)

When typing gJ, with the cursor at foo, I get the following result 🔴

foo(bar( baz(3,4,5), biz(2,3) ))

Notice the extra paddings around the inner parentheses.

Expectation

I'd expect to be able to configure this behavior, or at least have it be consistent throughout the whole formatting. Either

foo(bar(baz(3,4,5), biz(2,3)))

or

foo( bar( baz(3,4,5), biz(2,3) ) )

Depending on plug-in preferences in regards to joining parentheses or commas.

I'm wondering if this is due to

Note that |gJ| is a built-in mapping that is used for joining lines while
preserving whitespace. However, if no splitting/joining is possible at this
point, the plugin will fall back to the default mapping.

In which case, it would be nice if we could provide our own fallback callback mechanism like this answer instead of having to be bound to gJ.

No, it's not the gJ fallback. The way the plugin joins function arguments is by going over each line and considering it an individual argument. It joins them up with spaces, then, taking care of a potential trailing comma. Here's the entire code for this particular branch (for ruby, at least):

splitjoin.vim/autoload/sj/ruby.vim

Lines 1049 to 1063 in 315cfcc

    
             normal! $ 
        
             let body = sj#GetMotion('Vi(',) 
        
             if sj#settings#Read('normalize_whitespace') 
        
               let body = substitute(body, '\s*=>\s*', ' => ', 'g') 
        
               let body = substitute(body, '\s\+\k\+\zs:\s\+', ': ', 'g') 
        
             endif 
        
             " remove trailing comma 
        
             let body = substitute(body, ',\ze\_s*$', '', '') 
        
             let body = join(sj#TrimList(split(body, "\n")), ' ') 
        
             call sj#ReplaceMotion('Va(', '('.body.')') 
        
             return 1

Basically, the plugin doesn't "work recursively", in a way. If you were to join the inner "bar" function, and then the outer "foo" function, that'd work just fine, but if you try joining "foo", it'll simply consider the lines as separate arguments.

The only way to do what you'd like is to parse the structure of the code, and I would really rather not attempt to do that. Splitjoin does parse argument lists, but only ones that are contained within a single line. It gets much more complicated when you try to parse multiline stuff (another plugin of mine, sideways, does that, but there's a lot of code there I'd rather not port). Not to mention in this case, it's not just an argument list -- it's any arbitrary expression. Even if I did try parsing multiline argument lists, that wouldn't be enough to figure out the correct structure. I'd need a full expression parser.

What I'd recommend is to join the nested expressions first, and then join the parent ones. That's what I'd do for any other kind of structure as well, hashes and stuff. I know it's not ideal, but I can't really think of a reasonable way to implement the functionality otherwise. If you have any ideas, I'm open to considering them.

I agree that you'd need an AST parser to untangle nested code using the "text-processing" approach. And as this plugin already covers a lot of cases, I also agree that it's not worth to pile this on.

I'll reword the issue to make it easier to find later, and then close it. Thank you!

	normal! $

	let body = sj#GetMotion('Vi(',)
	if sj#settings#Read('normalize_whitespace')
	let body = substitute(body, '\s=>\s', ' => ', 'g')
	let body = substitute(body, '\s\+\k\+\zs:\s\+', ': ', 'g')
	endif

	" remove trailing comma
	let body = substitute(body, ',\ze\_s*$', '', '')

	let body = join(sj#TrimList(split(body, "\n")), ' ')
	call sj#ReplaceMotion('Va(', '('.body.')')

	return 1