dsagal/pynbox

Archive value out of uid_t range (Windows)

Opened this issue · 3 comments

Rybec commented

On Windows, I get a series of errors from tar saying the uid_t and gid_t values are out of range when extracting the sandbox_outer archive. (I'm reporting this from another system, otherwise I would copy and paste the error. I already researched it for you (see below), so you probably don't need it.)

According to some research I did, this typically means you are trying to extract the archive as root (not sure how this plays out on Windows), and the ID of the user that created the file on the source system is higher than your system allows. According to tar's documentation, the --no-same-owner switch is supposed to avoid this error by extracting the files as owned by the current user. However, when I added it to the pynbox script, it didn't work. With or without the switch, tar did extract the files. The problem is that the tar error terminates the script early, which prevents it from installing the rest of the packages.

I believe this can be easily fixed by repackaging sandbox_outer after changing the owner and group of the files and directory it contains to a user with an ID lower than ~65000. (Evidently this is a common problem for users on Windows domains, because they use absurdly large user IDs.)
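As a rough illustration of the repackaging idea, here is a minimal Python 3 sketch that copies an archive while rewriting every member's owner and group to a small ID. The function name repackage_with_low_ids and the choice of ID 0 are assumptions made for this example, not part of pynbox, and whether the rewritten archive actually avoids the tar error on Windows has not been verified here.

```python
import tarfile


def repackage_with_low_ids(src_path, dst_path, uid=0, gid=0):
    """Copy a tar archive, resetting the owner and group of every member.

    Hypothetical helper: the name and the default ID of 0 are illustrative.
    """
    with tarfile.open(src_path, "r:*") as src, \
            tarfile.open(dst_path, "w:bz2") as dst:
        for member in src.getmembers():
            member.uid = uid
            member.gid = gid
            member.uname = ""
            member.gname = ""
            # Ownership stored in pax extended headers would otherwise
            # override the numeric fields when the archive is read back.
            for key in ("uid", "gid", "uname", "gname"):
                member.pax_headers.pop(key, None)
            # Only regular files carry data; directories and links do not.
            data = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, data)
```

A call such as repackage_with_low_ids("sandbox_outer.tbz2", "sandbox_outer_fixed.tbz2") would write the new archive next to the old one; both file names are placeholders.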

In case anyone else has this problem, this is how I got around it:

  1. I opened the pynbox script, and I commented out the tar line (line 135, on the version I have).
  2. I ran the script as specified in the instructions, which now will download all of the specified components but does not extract them.
  3. I went into the build/packages subdirectory and manually extracted them with something like tar -C ~/destination -xvf python.2.7.11d.tbz2

Note that when I extracted the sandbox_outer package, I still got the tar error, but the files seem to be intact. The other packages extracted without error.
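The manual extraction above could also be scripted so that one failing archive does not stop the rest. The following is a minimal Python 3 sketch under a few assumptions: the helper name extract_packages is made up for this example, the packages are assumed to be .tbz2 files in a single directory, and the "data" extraction filter (which ignores stored ownership) is used only when the running Python provides it.

```python
import tarfile
from pathlib import Path


def extract_packages(packages_dir, dest_dir):
    """Extract every .tbz2 archive in packages_dir into dest_dir.

    Hypothetical helper: errors are collected and returned instead of
    aborting, so the remaining archives are still extracted.
    """
    failures = {}
    # The "data" filter exists only on newer Python versions; it skips
    # owner and group information stored in the archive.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    for archive in sorted(Path(packages_dir).glob("*.tbz2")):
        try:
            with tarfile.open(archive, "r:bz2") as tar:
                tar.extractall(dest_dir, **kwargs)
        except (tarfile.TarError, OSError) as exc:
            failures[archive.name] = str(exc)
    return failures
```

The returned dictionary maps each archive that failed to its error message, so a caller can report problems after all packages have been attempted.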

Rybec commented

Alright, not much luck making it actually work. bin/run python doesn't return any error output, but it also just instantly returns with no output at all, and the sandbox isn't communicating with my application. (Note that my application does work, and I am merely trying to get it to work in Windows. I developed it in Linux, where it works very well!)

I might look at it again later, but for now I'm going to have to do it on a VM or something. Sadly, I can't even do it on a Raspberry Pi, due to the lack of ARM support. I just don't have time (or energy, at this point) to figure out how to compile pynbox for ARM though.

I'm glad it works somewhere though. I've been looking for a good Python sandbox for years, and the few that are actually secure have never worked at all for me. So thanks for your work on this!

dsagal commented

I appreciate the thanks, but as you can maybe tell from the activity on this repo, it's gotten inactive. The main reason is that NativeClient itself -- the sandboxing technology behind it, from Google -- is no longer maintained: https://developer.chrome.com/docs/native-client/.

In practice, this means that security can no longer be assumed (any vulnerabilities in NaCl won't be fixed), and the toolchain -- which is pretty involved -- might already be unusable. As a case in point, after hours of trying, I wasn't able to get Python3 to work with this toolchain (this was some years ago, and I am sure it hasn't gotten easier). So Pynbox is currently limited to Python2, which has officially reached end-of-life.

In short, while Pynbox still sort of works in its current state, I don't think it's worth anyone's effort to keep maintaining it.

As for alternatives, I can suggest two to consider:

  • gvisor (https://github.com/google/gvisor). From non-Linux, it would require a VM as well. But if it's sufficient, you get an isolated Linux environment, and can run things without having to build them specially.
  • Pyodide (https://github.com/pyodide/pyodide). This is Python in the browser (using WebAssembly), and experimentally available in Node.js. It's much slower, but at least it's an active project.

Neither is quite like NativeClient / Pynbox (lightweight, userland, cross-platform), but I am not aware of any closer replacement :(

Rybec commented

First, no worries. I understand how it is developing open source software.

Second, I did notice that NativeClient wasn't seeing any activity for a while. Honestly, I found pynbox around a year and a half ago, but I didn't have time to get my project going then. I couldn't find anything more recent with comparable security though. Thanks for those links. Neither of those came up when I started looking again a few weeks ago (probably because search results were flooded with old and/or broken attempts to sandbox Python). I was going to try PyPy's sandbox again, but they dropped the 2.7 version in 2019 to work on a Python 3 version that doesn't seem to have even been started. (I can't find a download for it anywhere, anyhow.)

I did notice this uses Python 2.7. That doesn't bother me too much, as while it may not be maintained anymore, it's still way more proven than any Python 3. I only fully switched to 3 recently myself, and I still code for backward compatibility. But yeah, it still means that if a bug is found, it won't get fixed.

Anyhow, I figured you might not be maintaining it anymore, which is why I included the "how I worked around it" part. I'll take a look at those suggestions. Honestly, I'm really disappointed that the Python project hasn't taken up the issue of sandboxing, because it's a popular issue and has been for well over a decade. It needs to be a part of the main project. I've been trying to make a scripting video game for almost 10 years that uses Python, which would benefit the project significantly, but every time I try to find a good way of sandboxing Python that is reasonably repeatable, I come up either empty handed or with projects that have long since been abandoned. (To be clear though, I'm not blaming you. If you don't have time to do the free labor, you just don't, and on top of that, the main dependency isn't maintained anymore either, so even if you did have time, it wouldn't matter. I don't mind blaming Google though!)

Anyhow, thanks again, even if you aren't still maintaining it. It's something that works, which is better than anything else I've tried, so even if I can't make the newer stuff you suggested work, I now have something that does.