purescript-node/purescript-node-fs

Reorganising node-fs and node-fs-aff

Closed this issue · 5 comments

/cc @felixSchl

I think the current situation with purescript-node-fs and purescript-node-fs-aff is less than ideal. Some thoughts:

  • node-fs, in many ways, is a low-level FFI binding to Node.js' fs module. However, it has some high-level features, e.g. Perms, Encoding.
  • node-fs-aff is a higher-level library than node-fs, but only in one way as far as I can tell (i.e. it uses aff)
  • This is a weird mixture of high-level and low-level things which creates unnecessary work for us.
  • We should have one place for high-level filesystem things, and one place for low-level filesystem things.
  • type FilePath = String has no place in a high-level filesystem library. In a high-level filesystem library, we should be using something like purescript-pathy for paths.

I propose that we merge both of these into one library (purescript-filesystem?) with separate high-level and low-level parts. The low-level parts should closely mirror node's fs module, use String for everything, and return Eff actions, while the high-level parts should be designed first and foremost in a way that makes sense and is easy to use, should use proper types like Perms, Encoding, and Path, and return Aff actions.
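To make the proposed split concrete, here is a minimal TypeScript sketch rather than a design for the actual library. Every name in it (`LowLevelFs`, `mkPath`, `permsToInt`, `readTextFile`, and so on) is hypothetical. The in-memory `lowLevel` object only stands in for node's fs so that the example runs on its own, `Path` is a stand-in for something like purescript-pathy, and `Promise` stands in for Aff.

```typescript
// Low-level layer: mirrors the shape of a callback-based fs API.
// Paths and encodings are plain strings, permissions are plain numbers.
type Callback<A> = (err: Error | null, result?: A) => void;

interface LowLevelFs {
  readFile(path: string, encoding: string, cb: Callback<string>): void;
  writeFile(path: string, data: string, encoding: string, cb: Callback<void>): void;
  chmod(path: string, mode: number, cb: Callback<void>): void;
}

// In-memory stand-in for the real bindings (an assumption, for runnability).
const files: Record<string, string> = {};
const modes: Record<string, number> = {};

const lowLevel: LowLevelFs = {
  readFile(path, _encoding, cb) {
    const contents = files[path];
    if (contents === undefined) cb(new Error("ENOENT: " + path));
    else cb(null, contents);
  },
  writeFile(path, data, _encoding, cb) {
    files[path] = data;
    cb(null);
  },
  chmod(path, mode, cb) {
    modes[path] = mode;
    cb(null);
  },
};

// High-level layer: structured types instead of raw strings and numbers.
type Encoding = "utf8" | "ascii" | "base64";
interface Path { readonly segments: readonly string[] }
interface Perm { read: boolean; write: boolean; execute: boolean }
interface Perms { owner: Perm; group: Perm; other: Perm }

const mkPath = (...segments: string[]): Path => ({ segments });
const renderPath = (p: Path): string => "/" + p.segments.join("/");

// Each Perm becomes one octal digit; owner, group, other give e.g. 755 (octal).
const permToInt = (p: Perm): number =>
  (p.read ? 4 : 0) + (p.write ? 2 : 0) + (p.execute ? 1 : 0);
const permsToInt = (p: Perms): number =>
  permToInt(p.owner) * 64 + permToInt(p.group) * 8 + permToInt(p.other);

// Turn one callback-style call into a Promise (standing in for Aff here).
function lift<A>(run: (cb: Callback<A>) => void): Promise<A> {
  return new Promise<A>((resolve, reject) =>
    run((err, result) => (err ? reject(err) : resolve(result as A))));
}

// The high-level functions only translate types and delegate downwards.
const readTextFile = (enc: Encoding, p: Path): Promise<string> =>
  lift<string>((cb) => lowLevel.readFile(renderPath(p), enc, cb));
const writeTextFile = (enc: Encoding, p: Path, data: string): Promise<void> =>
  lift<void>((cb) => lowLevel.writeFile(renderPath(p), data, enc, cb));
const chmod = (p: Path, perms: Perms): Promise<void> =>
  lift<void>((cb) => lowLevel.chmod(renderPath(p), permsToInt(perms), cb));
```

The point of the sketch is that the high-level layer adds no filesystem behaviour of its own: it converts `Path`, `Perms` and `Encoding` into the strings and numbers the low-level layer expects, and changes how results are delivered.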

If there's a need for it, we can break out the low-level parts into a separate library.

It also might be a good time to think about how to deal with other backends. Preferably the high-level library would be able to support backends other than node.js. (Perhaps free monads would be a good fit for enabling this?)

I agree on type FilePath = String, we should not have to define it.

I am, however, a bit worried that we would end up over-engineering this for what it does. After all, the lowest level we can go to is the FFI. However, having an ffi component is so common in purescript libraries, that I don't really see the need to split out purescript-node-fs in such a way.

I originally talked to @garyb when we made the decision to have purescript-node-fs-aff as its own library. The reason was that aff is just "an" abstraction, among others. Asynchronous programming via callbacks is so deeply rooted in the node space that I think this is the real problem we have to solve in purescript: we need to standardise what a callback is, i.e. its type. I don't know how well this works with different libraries requiring different effects, but I found it pretty annoying having to redefine a type Callback .... Also, I think we simply need to settle on a way to do callbacks: ContT, Aff, Promise? All valid options. If I had to take a pick, I would say either a raw ContT to allow for arbitrary abstractions on top or Aff for convenience. After all, working in node using callbacks is not nice (it is often described as callback hell), yet callbacks are just the "low" level way of doing continuations, and things like promises are draped over the top. Also, do we have to worry about performance when it comes to this? We jump through a lot of functions already, jumping through wrappers / unwrappers of ContT could make things worse? Or is this negligible?

> However, having an ffi component is so common in purescript libraries, that I don't really see the need to split out purescript-node-fs in such a way.

I think the common-ness of FFI code in purescript libraries may just be because of a lack of pre-existing libraries doing the things you want, and the inline string feature making FFI too tempting. I expect (and hope) that the amount of FFI code will decrease over time as people switch to 0.7 and as more libraries emerge.

I also don't understand what the connection is between common-ness of FFI code and a need to separate a high-level and low-level API.

The reason I proposed this separation was that the high-level API will have to make some subjective decisions, which may not be appropriate in all cases. This is what the low-level API is for: if you're in one of those cases.

I suppose, in a perfect world, we would allow users to pick and choose between any of:

  • using String or a real file path type
  • using Aff, Eff, Promise, or ContT
  • using Perms or just Int for permissions
  • using Encoding or just String for encodings.

However this gives us 2 * 4 * 2 * 2 = 32 separate APIs, which is clearly not workable. As a compromise, I think having two options as described earlier works well. Hopefully we can come to an agreement on the 'right' selection of the above for the high-level set, although I expect the only potentially controversial one will be Aff vs Promise etc.

> Also, I think we simply need to settle on a way to do callbacks: ContT, Aff, Promise? All valid options. If I had to take a pick, I would say either a raw ContT to allow for arbitrary abstractions on top or Aff for convenience

All of these allow arbitrary abstractions on top via monad transformers, right? The only requirement for slapping a ReaderT, StateT, whatever on top is that the original thing is a monad.

I don't like promises: if you want to understand them, you need to wade through tons of JavaScript and absurdly complex specifications. Aff is much easier to get your head around, especially since it's largely implemented in PureScript, and based on type classes with laws. It also seems better maintained: Promise doesn't work with 0.7 yet, for example, while Aff has stellar docs and lots of existing libraries. Additionally, Aff has plenty of nice error handling facilities that none of the alternatives have. Anyway, it should be possible to convert between Aff and Promise fairly easily, right? I think makeAff is almost identical to new Promise(...).
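As a rough illustration of why the conversion should be easy, here is a TypeScript sketch. `AffLike` is a hypothetical type loosely modelled on the argument `makeAff` takes (an error callback and a success callback); it is not the real Aff, and `readGreeting` is a made-up Node-style function used only for the demonstration.

```typescript
// Illustrative stand-in: an async action is something that, when run,
// eventually calls exactly one of the two callbacks.
type AffLike<A> = (onError: (e: Error) => void, onSuccess: (a: A) => void) => void;

// AffLike -> Promise: the two callbacks line up with reject and resolve.
function toPromise<A>(aff: AffLike<A>): Promise<A> {
  return new Promise<A>((resolve, reject) => aff(reject, resolve));
}

// Promise -> AffLike: takes a thunk, because a Promise starts running as soon
// as it is created, whereas the action here should only run when invoked.
function fromPromise<A>(makePromise: () => Promise<A>): AffLike<A> {
  return (onError, onSuccess) => {
    makePromise().then(onSuccess, (reason: unknown) =>
      onError(reason instanceof Error ? reason : new Error(String(reason))));
  };
}

// Hypothetical Node-style callback function, for demonstration only.
function readGreeting(
  name: string,
  cb: (err: Error | null, text?: string) => void
): void {
  if (name === "") cb(new Error("empty name"));
  else cb(null, "hello " + name);
}

// Wrapping the callback function as an AffLike action.
const greet = (name: string): AffLike<string> => (onError, onSuccess) =>
  readGreeting(name, (err, text) => (err ? onError(err) : onSuccess(text as string)));
```

With these, `toPromise(greet("world"))` yields a promise for the greeting, and `fromPromise(() => somePromiseReturningCall())` goes the other way. The one real difference the sketch surfaces is laziness: the promise side needs a thunk so that nothing runs before the action is invoked.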

My vote would definitely be for Aff. Even for situations where async isn't important, or backends which are not async by default, I think Aff should be the way to go, because of its error handling.

> We jump through a lot of functions already, jumping through wrappers / unwrappers of ContT could make things worse

Compared to the overhead of accessing the filesystem, I think this will be negligible in the vast majority of cases. This is perhaps a good example of a use case for a low-level API though: if you were worried about the cost of these abstractions, a low-level API would be useful.


I've also just thought of a new potential feature: having the ability to swap out the actual implementation, like HVFS. For example, this is useful for testing programs that use the filesystem, or for security, so that you can block certain operations or log filesystem accesses. Not necessarily via a type class. Perhaps a free monad might work better. Maybe that should go into a separate library, though.
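A small TypeScript sketch of what "swapping out the implementation" could look like: the program is built as plain data describing filesystem steps, and separate interpreters decide what those steps actually do. This is a hand-rolled encoding for two operations only, not a general free monad and not how HVFS works, and all names (`Fs`, `copyFile`, `runInMemory`, `runReadOnly`) are hypothetical.

```typescript
// A program is either a finished value or one filesystem step plus a
// continuation saying what to do with that step's result.
type Fs<A> =
  | { tag: "pure"; value: A }
  | { tag: "readFile"; path: string; next: (contents: string) => Fs<A> }
  | { tag: "writeFile"; path: string; data: string; next: () => Fs<A> };

function pure<A>(value: A): Fs<A> {
  return { tag: "pure", value };
}
function readFile<A>(path: string, next: (contents: string) => Fs<A>): Fs<A> {
  return { tag: "readFile", path, next };
}
function writeFile<A>(path: string, data: string, next: () => Fs<A>): Fs<A> {
  return { tag: "writeFile", path, data, next };
}

// A program that never touches a real filesystem: it only describes steps.
// It copies a file and returns the number of characters copied.
function copyFile(from: string, to: string): Fs<number> {
  return readFile(from, (contents) =>
    writeFile(to, contents, () => pure(contents.length)));
}

function lookup(store: Record<string, string>, path: string): string {
  const contents = store[path];
  if (contents === undefined) throw new Error("no such file: " + path);
  return contents;
}

// Interpreter 1: run against an in-memory store and log every access.
function runInMemory<A>(program: Fs<A>, store: Record<string, string>, log: string[]): A {
  let step = program;
  for (;;) {
    if (step.tag === "pure") return step.value;
    if (step.tag === "readFile") {
      log.push("read " + step.path);
      step = step.next(lookup(store, step.path));
    } else {
      log.push("write " + step.path);
      store[step.path] = step.data;
      step = step.next();
    }
  }
}

// Interpreter 2: same programs, but any write is blocked.
function runReadOnly<A>(program: Fs<A>, store: Record<string, string>): A {
  let step = program;
  for (;;) {
    if (step.tag === "pure") return step.value;
    if (step.tag === "writeFile") throw new Error("write blocked: " + step.path);
    step = step.next(lookup(store, step.path));
  }
}
```

The same `copyFile("a.txt", "b.txt")` value can be handed to either interpreter: `runInMemory` is the kind of thing a test would use, while `runReadOnly` shows how a policy such as blocking writes can live entirely in the interpreter.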

> I think the common-ness of FFI code in purescript libraries may just be because of a lack of pre-existing libraries

Yeah, but I would argue also that it is because so many problems have already been solved and are "battle tested" that it makes sense to just slap an FFI wrapper over it, rather than re-implementing it. Further, in this particular case, we are wrapping wrappers, because there's simply no way to "do it ourselves". I think we may have to define "low level" in order to not talk past each other.

> All of these allow arbitrary abstractions on top via monad transformers, right? The only requirement for slapping a ReaderT, StateT, whatever on top is that the original thing is a monad.

As long as these arbitrary abstractions cannot be expressed in terms of one another, I guess that's fine. I guess I am just worried about the developer experience using the library and using the library in conjunction with other libraries. I pretty much accept that when I pull in a node module, there's a very high chance it has callbacks on the module's interface functions. This way, I can drape a promise over the top, think bluebird's Promise.promisifyAll. Maybe we need more meta programming capabilities to generate these APIs w/o the manual legwork?

> I don't like promises: if you want to understand them, you need to wade through tons of JavaScript and absurdly complex specifications.

Promises are a fairly simple concept, and will be part of ES6. I won't make my case for using it in purescript, I don't know if the paradigms gel well. However, I found working with them in javascript very, very productive and intuitive.

> My vote would definitely be for Aff. Even for situations where async isn't important, or backends which are not async by default, I think Aff should be the way to go, because of its error handling.

Absolutely 👍

> I've also just thought of a new potential feature: having the ability to swap out the actual implementation, like HVFS

That would be cool! I've wondered how to do that as well. Could you explain the "free monad" concept? A quick google search yielded only confusing and overly complex answers... It sounds awesome, though, I want to learn about it.

> I think we may have to define "low level" in order to not talk past each other.

Right, I see what you mean. By "low level" I mean an API as close as possible to the node fs module. So, for example, everything is Eff, and callbacks are of the form a -> Eff e b.

> Maybe we need more meta programming capabilities to generate these APIs w/o the manual legwork?

I think it's acceptably straightforward to turn Eff into Aff or Promise or ContT currently. Although the meta-programming you're talking about does sound nice...

> Promises are a fairly simple concept, and will be part of ES6 [...]

Ok, I guess I was a bit harsh on promises. I agree, I think they're quite nice to use in JS, too. Having googled, the spec isn't nearly as bad as I thought it was. I suppose my problem with them is that I can't ever imagine reading a JS implementation and understanding it, whereas I can imagine understanding, e.g., Aff.

> Could you explain the "free monad" concept?

I think this post is what made it click for me. I don't want to take up too much space talking about them on this issue but if you want to chat about them some more you can find me in IRC or something?

Closed by #75