Submersible/node-python-bridge

Optionally importing unicode_literals from __future__

shlomiassaf opened this issue · 4 comments

Hi,

Frist, thank you! Great work!

While working with some legacy libraries I sometimes unicode issues due to the use of unicode_literals

In one of the libraries we use object (dict) keys are used by the library as XPath path.
I did not see the internal implementation but I assume some logic is applied to tokenize, parse and process the "path" and the library does not expect to get a unicode string there and does not handle them.

When node_python_bridge.py does:

from __future__ import unicode_literals

It breaks.

For example, if I send the key default

root.something['defualt']

I will get an error:

badly formatted or nonexistent path (8): Bad key "d e f a u l t"

The library handled the unicode bytes as string without converting first, which somehow ended up with a space between every char pair.

I know that the problem is in the library I use and not in this library, but can it be set to optionally import unicode_literals, based on some option parameter?

Thanks!

For reference: http://python-future.org/unicode_literals.html

munro commented

Hmm yea from __future__ import unicode_literals should not be the default, since it isn't when not using this bridge! I added a broken test :D [1]

Maybe you can help me along, I was thinking perhaps dynamically setting from __future__ import ..., except it only works at the start of the file. So then I was thinking dynamically generate the node_python_bridge.py with the desired future imports... but then I realized that was really hacky.

Soo when I'm in ipython, I can set from __future__ import unicode_literals whenever I want—so ideally there's a way we can reproduce that behavior in node_python_bridge.py. Let me know if you find anything, I'll try to circle back when I have time to dig deeper.

[1] 0bae318

munro commented

Ahh, I figured it out. There's a module called code [1] that let's you spin up a REPL environment, this should be used in place of exec [2].

I gotta blast but I would greatly accept a PR!

[1] https://docs.python.org/2/library/code.html
[2] https://github.com/Submersible/node-python-bridge/blob/master/node_python_bridge.py#L80

munro commented

Same performance! I dove into code and found out how it was working: it uses the codeop.Compile class which keeps track of any compiler flags that change. So when the compiler sees from __future__ import unicode_literals, it updates the compiler flags to parse Python strings as unicode instead of byte strings. yay!