Path to expand "standard library" or move things to userland?
Qard opened this issue · 5 comments
There are a few things the module directory currently covers that seem to be typical standard library material. However, there are some notable things missing, networking in particular. TCP sockets and HTTP will likely be needed before any sort of package manager can exist, unless it's written in something else. I think that would be a huge missed opportunity to create a sizeable, complex application purely within the confines of the language itself though.
Do you see more of these things living in wren itself, included in the module directory, or do you have something else in mind? It'd be good to get your ideas for this laid out, even if they aren't super fleshed out yet. I'd love to help out however I can. I've been working on node.js core for a long time, so I'm quite familiar with libuv.
Some things I'd like to see:
- Expose native Buffer types similar to typed arrays in JS and the Buffer type in node, allowing binary data to be passed around without needing to serialize unless it is used within wren code. (i.e., File.write('dest', File.read('source')) would never serialize the read data into a wren string)
- Stream base classes for File and, eventually, Socket to inherit from, for consistent read/write interfaces. Stream should probably inherit from Sequence for fancy iteration magic.
- TCP and maybe UDP Socket classes inheriting a DuplexStream class, or something like that
- An HTTP Parser stream
Also, I feel like all of this stuff, including what is already in module, could probably actually live in a separate project which gets pulled in just like libuv currently. It'd be great to have some sort of consistent mechanism to attach native modules, similar to what mruby does. I think that half of the module system could exist before a package manager to make it easier to move as much of what is needed to make a package manager work into userland space.
If we can decide on an interface for native modules, I'd be happy to get started making that work.
> However, there are some notable things missing, networking in particular. TCP sockets and HTTP will likely be needed before any sort of package manager can exist, unless it's written in something else.
Yup, networking is definitely on my radar.
> Do you see more of these things living in wren itself, included in the module directory, or do you have something else in mind?
Yes, basically anything that's implemented in foreign methods that call into libuv would go in modules. If it's written in pure Wren, that would hopefully be in userland, though that will be a chore to consume until we have a package manager.
Anything written in a combination of Wren and C that doesn't have external dependencies can go into optional.
> I'd love to help out however I can. I've been working on node.js core for a long time, so I'm quite familiar with libuv.
Woo! You certainly know it better than I do. :)
> Expose native Buffer types similar to typed arrays in JS and the Buffer type in node, allowing binary data to be passed around without needing to serialize unless it is used within wren code. (i.e., File.write('dest', File.read('source')) would never serialize the read data into a wren string)
I've thought some about having buffers in addition to strings, but I'm still on the fence. The main user-visible difference I'm aware of is that buffers are mutable where strings are not in Wren. Unlike node, which uses UTF-16 for strings, strings in Wren are pure byte arrays, so it's possible to use strings for IO without needing any explicit serialization step. There is one wrinkle around the fact that strings currently eagerly cache their hash code, but that's probably solvable.
What that leaves, then, is mutability. Unless there are other things I'm not aware of. I'm not generally opposed to buffers, though I might want to keep them out of core and have them just live in module. (We try to keep core as small as possible to keep Wren suitable for embedding in memory constrained applications like IoT, mobile games, etc.)
> Stream base classes for File and, eventually, Socket to inherit from, for consistent read/write interfaces. Stream should probably inherit from Sequence for fancy iteration magic.
Yes. I'm still learning my way around that corner of libuv which is why there isn't much there yet, but that's on my radar. I'm still thinking through the best way to expose push-based streams (which I find a chore to use in JS) in a way that feels natural in Wren. With Wren we have fibers, so we should be able to have a nice synchronous-seeming API that suspends fibers under the hood.
The Stdin class does a little bit of this now, but I'm sure it has problems. If you can help out on this, that would be great.
> - TCP and maybe UDP Socket classes inheriting a DuplexStream class, or something like that
> - An HTTP Parser stream
Yup and yup.
> Also, I feel like all of this stuff, including what is already in module, could probably actually live in a separate project which gets pulled in just like libuv currently.
We talked about doing that a while back (see wren-lang/wren#108), but it ended up making sense to roll it into one project. The core Wren repo needs some kind of command line app that can run tests and print to the console. It seemed confusing and redundant to have a minimal one of those (essentially like d8 for V8) and also have a more full-featured one, so we just have the latter.
In practice, I think it's been really handy. I've been writing modules and that gives me a lot of invaluable first-hand experience being a user of Wren's C API, so Wren is overall better because of this.
Even though the module stuff is in the main Wren repo, that doesn't mean users who want to embed Wren have to consume it. If you run `make vm`, it builds only the bare VM and doesn't even download libuv.
> It'd be great to have some sort of consistent mechanism to attach native modules, similar to what mruby does.
That's also important, but I think it's probably a good bit farther down the road. I could be wrong, but I think it would be hard to make all of the libuv-backed modules go through that. I think you kind of need to have the scheduler and other low-level machinery like that baked in.
But, yes, it will also be important to have "third-party" userland modules that can have foreign methods implemented in C eventually.
My general priorities, which I could probably document better, are:
- Make sure the VM is stable and efficient.
- Stabilize the language and make sure it hangs together well and is pleasant to use. (The import syntax and semantics are one open area here.)
- Ditto for the C API.
- Ditto for the core module—String, list, numbers, etc.
- Implement IO modules—files, networking, etc.—so people can write Wren programs that do useful stuff.
- Write Wren programs that do useful stuff.
- Build out the ecosystem—package manager, formatter, testing modules, etc.
There isn't a linear order for doing these, though. Most of the earlier items rely on feedback we get from doing later ones. So, for example, hacking on wrenalyzer this morning (which falls under 8), made me wish for a better method to repeat a string (5). Likewise, hacking on the IO modules (6) has done a ton for improving the C API (4).
> If we can decide on an interface for native modules, I'd be happy to get started making that work.
The C API already defines how you create classes that have C-backed storage and methods implemented in C. What's not there yet—and will be part of the CLI, not the core VM—is the higher level stuff around dynamically loading a shared library and wiring it up.
That's definitely something we'll want at some point, though personally I think there are bigger blocking issues before that.
With node buffers, one of the big and not immediately obvious advantages is being able to pass around data that is not allocated in V8. We don't have to take binary data read from a file by libuv and copy that into the V8 heap somewhere, we just pass around a pointer to it. They also support finalizers to clean up the memory later, since they are managed outside of V8. A big part of the speed of node is avoiding allocations at all costs.
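A rough sketch of that idea in plain C, assuming nothing about Wren's or node's real APIs: `ExternalBuffer`, `bufferWrap`, `bufferRelease`, and `freeAndCount` are hypothetical names used only for illustration. The buffer holds a pointer to memory the VM never allocated, plus a finalizer that hands the memory back to its producer when the buffer is released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical callback type: invoked once when the buffer is released.
typedef void (*BufferFinalizer)(void* data, void* userData);

// Hypothetical read-only view over bytes that live outside the VM heap.
typedef struct {
  const unsigned char* data;  // Owned by the producer (e.g. an IO layer).
  size_t length;
  BufferFinalizer finalize;   // May be NULL if nothing needs cleaning up.
  void* userData;             // Passed through to the finalizer untouched.
} ExternalBuffer;

// Wraps existing memory without copying it.
static ExternalBuffer bufferWrap(const void* data, size_t length,
                                 BufferFinalizer finalize, void* userData) {
  assert(data != NULL || length == 0);
  ExternalBuffer buffer = {
    (const unsigned char*)data, length, finalize, userData
  };
  return buffer;
}

// Runs the finalizer (if any) and clears the view so a second release
// is a harmless no-op.
static void bufferRelease(ExternalBuffer* buffer) {
  if (buffer->finalize != NULL) {
    buffer->finalize((void*)buffer->data, buffer->userData);
  }
  buffer->data = NULL;
  buffer->length = 0;
  buffer->finalize = NULL;
}

// Example finalizer: frees malloc'd bytes and counts calls via userData.
static void freeAndCount(void* data, void* userData) {
  free(data);
  if (userData != NULL) (*(int*)userData)++;
}
```

Passing such a buffer from a read to a write would move only the pointer and length; the bytes themselves are touched once by the producer and once by the consumer.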
Also, I agree that the immutability of strings is nice; I'd actually prefer buffers be immutable. Perhaps the concept of buffers could be a more transparent thing that looks like a string but doesn't actually become a string until something tries to use it as one? If you try to do something like `buffer + "string"`, it'd internally copy to the heap before the + op happens.
As for the native module stuff, I'm more thinking of that higher level stuff to automatically attach the native code of an external module into the VM. Also, for the purpose of supporting embedded environments, it'd probably be better to support compiling native modules directly into the main binary. mruby does this, and it's very nice.
> With node buffers, one of the big and not immediately obvious advantages is being able to pass around data that is not allocated in V8. We don't have to take binary data read from a file by libuv and copy that into the V8 heap somewhere, we just pass around a pointer to it.
That's less of a concern with Wren because Wren doesn't have its own complex allocation magic. You just give it an allocate function (usually a thin shim around `malloc()`) and it uses that.
It should be pretty straightforward to have external code allocate memory for a buffer/string that it can give to Wren without having to copy it.
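As a minimal sketch of what such a shim can look like, here is a realloc-style function in plain C. The name `hostReallocate`, the `AllocStats` struct, and the exact parameter list are assumptions for illustration; the signature Wren's configuration actually expects should be checked against `wren.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical bookkeeping a host might keep; not part of any real API.
typedef struct {
  size_t calls;  // Total number of times the shim was invoked.
  size_t frees;  // Number of non-NULL blocks released.
} AllocStats;

// One function covers allocate, resize, and free:
// - memory == NULL, newSize > 0: allocate a new block
// - memory != NULL, newSize > 0: resize the existing block
// - newSize == 0: free the block and return NULL
static void* hostReallocate(void* memory, size_t newSize, void* userData) {
  AllocStats* stats = (AllocStats*)userData;
  if (stats != NULL) stats->calls++;

  if (newSize == 0) {
    if (stats != NULL && memory != NULL) stats->frees++;
    free(memory);
    return NULL;
  }

  // realloc(NULL, n) behaves like malloc(n), so both remaining cases
  // collapse into a single call.
  return realloc(memory, newSize);
}
```

Because every allocation goes through one host-supplied function like this, memory obtained by external code from the same allocator can in principle be handed over without a copy.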
> Perhaps the concept of buffers could be a more transparent thing that looks like a string but doesn't actually become a string until something tries to use it as one?
My hunch is that that will end up being a rathole of trying to make buffers support everything strings support. If you want one to be usable everywhere a string can be used, it will even have to extend String so that code that does `is String` works with it. And, at that point, it is a string. There's little reason for it to be a separate class.
> As for the native module stuff, I'm more thinking of that higher level stuff to automatically attach the native code of an external module into the VM.
I need to write docs, but I think some of that is there. At least, there is already syntax in the language to say "this method/class is implemented in native code" and the VM will call out to the host app to wire it up. It could definitely be automated more in the CLI by looking up symbols in a DLL to do the final binding.
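To illustrate the "host app wires it up" step, here is a sketch of a lookup table that resolves a method by module, class, and signature. Everything in it is hypothetical: `NativeBinding`, `bindNativeMethod`, the `math_ext` module, and the `Calc` class are made up, and a plain `int (*)(int)` stands in for whatever foreign-method function type the VM really uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Stand-in for a native method. A real host would use the VM's own
// foreign-method function type here.
typedef int (*NativeMethod)(int argument);

// One row per method implemented in native code.
typedef struct {
  const char* module;
  const char* className;
  bool isStatic;
  const char* signature;
  NativeMethod method;
} NativeBinding;

static int doubleIt(int n) { return n * 2; }
static int negate(int n) { return -n; }

// Hypothetical registry compiled directly into the host binary.
static const NativeBinding bindings[] = {
  { "math_ext", "Calc", true, "double(_)", doubleIt },
  { "math_ext", "Calc", true, "negate(_)", negate },
};

// Returns the native function for a method, or NULL if none is registered.
static NativeMethod bindNativeMethod(const char* module, const char* className,
                                     bool isStatic, const char* signature) {
  assert(module != NULL && className != NULL && signature != NULL);

  size_t count = sizeof(bindings) / sizeof(bindings[0]);
  for (size_t i = 0; i < count; i++) {
    const NativeBinding* b = &bindings[i];
    if (b->isStatic == isStatic &&
        strcmp(b->module, module) == 0 &&
        strcmp(b->className, className) == 0 &&
        strcmp(b->signature, signature) == 0) {
      return b->method;
    }
  }
  return NULL;  // Unknown method: the caller decides how to report it.
}
```

A static table like this needs no dynamic loading, which is one way to get native modules into a single binary. A CLI could fall back to symbol lookup in a shared library when the table has no match.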
> Also, for the purpose of supporting embedded environments, it'd probably be better to support compiling native modules directly into the main binary. mruby does this, and it's very nice.
Ah interesting. That sounds cool, though it's probably a ways down the road for us.
Keep in mind that Wren itself is easy to embed in an application. It's more Lua than Ruby. So one easy way to accomplish that is to flip the relationship around. Instead of embedding your native modules in Wren, embed Wren in your native application. :)
Yeah, I think in a lot of cases the native app would handle most things and only expose the higher-level bits into the VM. I can see a lot of benefit to being able to pull in lower-level native modules written by other people though.
For example, playing a sound or managing a sqlite save file are operationally very complex but may have a rather simple surface area to the needed API, so they could be excellent candidates for just pulling in some ready made native module. If you are making a game for desktop only, dynamic libraries are no big deal, but for a mobile game it's different.
(Also, it'd be super cool if I could use wren to make a 3DS game. 😉 )
> I can see a lot of benefit to being able to pull in lower-level native modules written by other people though.
Yeah, totally.
> (Also, it'd be super cool if I could use wren to make a 3DS game. 😉 )
The last game I worked on was an (original) DS game, so this is so close to my heart!