Request: strip _common_ leading whitespace from input before passing to kernel
sglyon opened this issue · 18 comments
I'd love it if hydrogen stripped leading whitespace that was common to all lines in the selection before passing to the kernel.
For example, suppose I have this python code:
def build_grids(p: Params):
# build grid over I choices and N states
Imax = 10
expected_life = (1-p.δ) / p.δ
Igrid = np.arange(1, Imax+1)
Ngrid = np.arange(0, round(Imax*expected_life)+1, 2)
zgrid = p.z
# ... more code ....
If I were to highlight from the start of the commented line # build grid...
(including the leading whitespace) all the way to p.z
, and try to evaluate I would get a complaint from the python kernel about unexpected whitespace.
If I were to take that same highlighted region and copy it to my system clipboard and then use the %paste
magic in an ipython terminal session, the leading whitespace would be stripped and evaluation would proceed without errors.
Note that in my example if I were to start my selection at the I
character in Imax
then the behavior should be to still throw an error because the first line in my selection doesn't have leading whitespace in common with all other lines.
I was going to try to take a stab at this, but I don't know where to look inside here. I figured you would know better than me ;)
My first impression is that there isn't a truly language agnostic way to do this from hydrogen's end.
Interesting issue, I think an init script or plugin solution could be fairly straight forward, but I'm not sure this should be built into hydrogen itself.
Is there a precedent for this with the jupyter notebook?
No, the notebook doesn't do this, but the console does... with the %paste
magic:
~|⇒ ipython vpn-192-168-100-7
Python 3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09)
Type "copyright", "credits" or "license" for more information.
IPython 5.3.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]:
...: def foo():
...: return "hello!"
...:
...:
...: foo()
File "<ipython-input-1-ede5e3b3ed2a>", line 2
def foo():
^
IndentationError: unexpected indent
In [2]: paste
def foo():
return "hello!"
foo()
## -- End pasted text --
Out[2]: 'hello!'
It may not be an issue in notebook since it normally runs code in cells instead of selection based
Good point ☝️
I think that is the main reason I've never missed that in the notebook, but am happy to have it in the console and would like to see it here.
If I were to try to put this in my init script is there a recommended way for me to add a processing layer before hydrogen sends code to the kernel?
If I were to try to put this in my init script is there a recommended way for me to add a processing layer before hydrogen sends code to the kernel?
Not exactly, (yet 😄 ) .
Now that I think it through in order to use the init solution you would probably need to change the actual text buffer in selection then change it back after dispatching hydrogen:run
. Could get pretty janky.
I think in this case the best way to enable this would be through enhancing our plugin api (something we have talked about doing for other reasons anyway).
This issue is something I've been running into as well in my usage. I wouldn't be opposed to having hydrogen strip out common leading whitespace, rather than offloading this to a plugin. The criteria for this are pretty simple (detect leading whitespace on first line, and strip it from later lines) so it doesn't actually require any language-specific parsing.
It would be language specific just in the sense that there is no need to do this in most languages (the described workflow works fine in julia, js, R, etc.)
on the other hand, I guess if we are doing any language a favor it should be Python.
My other concern would be: can this have side effects for other kernels? Seems unlikely, but something to think about.
I agree with @BenRussert, but still want this feature supported by hydrogen.
Since it's just simple and easy than ask/wait for each kernel provider to support this.
How about implement this indent removing feature and make it configurable by scope name.
- Add
stripIndentBeforeSendToKernel
config with default['source.python']
. - When editor have matching scope, strip leading whilte-space(=indent) before sending to kernel.
One side effect for this is, code is evaluated in same environment unless restarting kernel.
If user selected some deep indented lines of code, it evaluated as no indented level, this affect scope of vars its effective depending on language.
But this is also true for hydrogen:run
it can change order of code evaluated, so user have to being careful for the effect of these commands anyway.
I think this should really be in its own command (and likely in a language-specific plugin). Otherwise, if trimming was the default behaviour, how would we handle:
- multi-line strings?
- and the pollution of the global scope?
I think it would be OK to strip whitespace on code that is executed on a selection basis.
We could quite easily add this here since we already normalize the line endings before sending code to a kernel.
how would we handle multi-line strings?
Is this really a problem if leading whitespace is stripped?
how would we handle the pollution of the global scope?
I don't think we need to handle this case. Right now one can already execute deeply nested code by selecting it line by line. Stripping the whitespace only makes it convenient to execute multiple nested lines.
pyCharm can handle this issue pretty well, why can't hydrogen do the same?
PyCharm is funded, Hydrogen is not. We operate on contributions to the code for changes. As for why not it's because no one has done it, including yourself @kkonevets.
I thought that updates to the plugin API would make it easy to implement this feature as a plugin, but it turns out that this is not the case.
It so happens that hydrogen always trims the string before sending it to the kernel, so e.g.
x = 5
print(x)
becomes
x = 5
print(x)
and it's rather difficult for a plugin to undo that transformation and detect that actually there was common leading whitespace all along.
(Just wanted to post this thought here in case it sparks any ideas)
Now that the plugin API changes are merged, it's possible to mostly implement this feature in a plugin. I have a working plugin implementation at https://github.com/nikitakit/hydrogen-python
Note that there are some limitations (related to my comment above), but at least in my code those cases occur infrequently compared to the many times I want to run code in an indented block.
I reproduced this in IPython 5.3.0, and it works fine.
The reason why it didn't work in the above comment is because there's a silent \n
on line 1, hence why the IndentationError
is raised on line 2.
Given that IPython is fine with common whitespace, as long as Hydrogen doesn't remove leading whitespace, this issue should be solved. See #1363.