HigherOrderCO/Bend

Implement file IO in Bend

Closed this issue · 0 comments

File IO has been implemented in hvm-c and now needs to be ported to Bend.

(we need to wait for the implementation to be completed)

The tracking issue for file IO in the HVM is HigherOrderCO/HVM#375

I propose this base interface that mimics the HVM IO function interfaces:

  • IO/Fs/open, receives path, mode, encoding, returns a file.
  • IO/Fs/close, receives file, returns nothing.
  • IO/Fs/write, receives file, text, returns nothing.
  • IO/Fs/read, receives file, amount, returns a string.
  • IO/Fs/seek, receives file, amount, mode, returns nothing.

Here a file is not simply a file descriptor, but a datatype that also holds other data, like the encoding.
Another option is to have files be plain fds and pass the encoding to read and write.
A third option is to read and write only in bytes and have encoding/decoding be separate functions the user must call.
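
To make the trade-off concrete, here is a rough sketch of the first and third options in Python rather than Bend, built on the POSIX-style `os` functions. The names `File`, `open_with_encoding`, `write_text`, `read_text`, `roundtrip_option1` and `roundtrip_option3` are hypothetical and exist only for this illustration; they are not part of HVM or Bend.

```python
import os
import tempfile
from dataclasses import dataclass

# O_BINARY only exists on Windows; fall back to 0 elsewhere.
O_BINARY = getattr(os, "O_BINARY", 0)


@dataclass
class File:
    """Option 1 (hypothetical): a file is an fd plus the encoding chosen at open time."""
    fd: int
    encoding: str


def open_with_encoding(path: str, flags: int, encoding: str) -> File:
    return File(os.open(path, flags | O_BINARY, 0o600), encoding)


def write_text(file: File, text: str) -> None:
    # The file remembers its encoding, so the caller passes plain text.
    os.write(file.fd, text.encode(file.encoding))


def read_text(file: File, amount: int) -> str:
    # `amount` is a byte count here; decoding happens inside the call.
    return os.read(file.fd, amount).decode(file.encoding)


def roundtrip_option1(text: str) -> str:
    """Write and read back through the File datatype."""
    with tempfile.TemporaryDirectory() as tmp:
        flags = os.O_RDWR | os.O_CREAT | os.O_TRUNC
        f = open_with_encoding(os.path.join(tmp, "a.txt"), flags, "utf-8")
        try:
            write_text(f, text)
            os.lseek(f.fd, 0, os.SEEK_SET)
            return read_text(f, 1024)
        finally:
            os.close(f.fd)


def roundtrip_option3(text: str) -> str:
    """Option 3: IO works on bytes only; the caller encodes and decodes."""
    with tempfile.TemporaryDirectory() as tmp:
        flags = os.O_RDWR | os.O_CREAT | os.O_TRUNC | O_BINARY
        fd = os.open(os.path.join(tmp, "a.txt"), flags, 0o600)
        try:
            os.write(fd, text.encode("utf-8"))        # explicit encode
            os.lseek(fd, 0, os.SEEK_SET)
            return os.read(fd, 1024).decode("utf-8")  # explicit decode
        finally:
            os.close(fd)
```

Both round trips return the same text. The difference is only in where the encoding lives: inside the file value, or in the caller's code.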

To use this for other file-like things (the way Unix does), we would probably need to differentiate blocking from non-blocking functions. As a first version, I think we could have only blocking read/write/seek and implement more options later as we need them.

For seek, we could have three modes: from the beginning, from the end and relative to the current position. We could also have just one and not have a mode at all.
We also have the question of whether the mode should be a numeric constant or an enum. In this case I think numbers (hidden behind functions) are better, but we have to think about the overall homogeneity of the builtins.
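
As an illustration of the three seek modes with numeric constants hidden behind functions, here is a small Python sketch that uses the POSIX-style `os.lseek`. The function names `seek_from_start`, `seek_from_current`, `seek_from_end` and `demo_seek` are made up for this example and are not a proposed Bend API.

```python
import os
import tempfile


# Hypothetical accessors that hide the numeric mode constants.
def seek_from_start() -> int:
    return os.SEEK_SET  # 0


def seek_from_current() -> int:
    return os.SEEK_CUR  # 1


def seek_from_end() -> int:
    return os.SEEK_END  # 2


def demo_seek() -> list:
    """Read one byte after seeking with each of the three modes."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        seen = []
        os.lseek(fd, 2, seek_from_start())    # absolute position 2
        seen.append(os.read(fd, 1))           # reads b"2", position is now 3
        os.lseek(fd, 3, seek_from_current())  # 3 + 3 = position 6
        seen.append(os.read(fd, 1))           # reads b"6"
        os.lseek(fd, -1, seek_from_end())     # one byte before the end
        seen.append(os.read(fd, 1))           # reads b"9"
        return seen
    finally:
        os.close(fd)
        os.remove(path)
```

Callers never see the raw numbers, so the representation could later change to an enum without touching call sites.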

Similarly, we should decide whether file mode and encoding should be strings or specialized datatypes. I favor the latter, but I think most users would prefer strings like in Python, at least for file mode.

Lastly, the read and seek amount can be in bytes or in characters. If an encoding is passed, I think it would make sense for the amount to be in characters; otherwise it would be problematic when reading variable-length encodings like UTF-8 and UTF-16.
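
The problem with byte amounts under a variable-length encoding can be shown with a short Python sketch. The helpers `splits_character` and `take_chars` are hypothetical names for this illustration; `take_chars` uses Python's incremental decoder as one possible way to count characters, not as a statement of how HVM or Bend would do it.

```python
import codecs


def splits_character(data: bytes, amount: int, encoding: str = "utf-8") -> bool:
    """Return True if cutting `data` after `amount` bytes lands inside a character."""
    try:
        data[:amount].decode(encoding)
        return False
    except UnicodeDecodeError:
        return True


def take_chars(data: bytes, count: int, encoding: str = "utf-8") -> str:
    """Decode up to `count` characters, consuming as many bytes as that needs."""
    decoder = codecs.getincrementaldecoder(encoding)()
    out = ""
    for i in range(len(data)):
        if len(out) >= count:
            break
        # The decoder buffers incomplete sequences and returns "" until
        # a whole character is available.
        out += decoder.decode(data[i:i + 1])
    return out


data = "héllo".encode("utf-8")  # 5 characters, 6 bytes: "é" takes two bytes
```

Here `splits_character(data, 2)` is true because byte 2 falls in the middle of "é", while `take_chars(data, 2)` returns "hé" after consuming three bytes.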

Personally, I favor working only with bytes at this lower level of functions.
The function signatures would then be:

  • IO/Fs/open: (path: String) -> (mode: String) -> U24
  • IO/Fs/close: (fd: U24) -> *
  • IO/Fs/write: (fd: U24) -> (text: String (of bytes)) -> *
  • IO/Fs/read: (fd: U24) -> (bytes: U24) -> String
  • IO/Fs/seek: (fd: U24) -> (bytes: U24) -> (mode: U24) -> *
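
For comparison, the same byte-only interface can be mocked up in Python on top of the `os` module. This is only a sketch of the shape of the API: the `fs_*` names, the `MODE_FLAGS` table and the set of accepted mode strings are assumptions made for the example, not the HVM or Bend implementation.

```python
import os
import tempfile

# O_BINARY only exists on Windows; fall back to 0 elsewhere.
_BIN = getattr(os, "O_BINARY", 0)

# Assumed mapping from Python-style mode strings to open flags.
MODE_FLAGS = {
    "r": os.O_RDONLY,
    "w": os.O_WRONLY | os.O_CREAT | os.O_TRUNC,
    "a": os.O_WRONLY | os.O_CREAT | os.O_APPEND,
    "r+": os.O_RDWR,
    "w+": os.O_RDWR | os.O_CREAT | os.O_TRUNC,
}


def fs_open(path: str, mode: str) -> int:
    """open: path, mode -> file descriptor."""
    return os.open(path, MODE_FLAGS[mode] | _BIN, 0o600)


def fs_close(fd: int) -> None:
    os.close(fd)


def fs_write(fd: int, data: bytes) -> None:
    os.write(fd, data)


def fs_read(fd: int, amount: int) -> bytes:
    """read: at most `amount` bytes from the current position."""
    return os.read(fd, amount)


def fs_seek(fd: int, amount: int, mode: int) -> None:
    os.lseek(fd, amount, mode)


def demo_bytes_io() -> bytes:
    """Write some bytes, seek back into them, and read a slice."""
    with tempfile.TemporaryDirectory() as tmp:
        fd = fs_open(os.path.join(tmp, "demo.bin"), "w+")
        try:
            fs_write(fd, b"hello world")
            fs_seek(fd, 6, os.SEEK_SET)
            return fs_read(fd, 5)  # b"world"
        finally:
            fs_close(fd)
```

Any encoding or decoding would then be layered on top by the caller, for example `demo_bytes_io().decode("utf-8")`.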

Here I ignored everything related to error checking. I'm not sure what the best way of adding that would be or how to format errors.
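
One possible shape for error reporting, shown purely as an illustration in Python, is to return a tagged result instead of a bare value. The `Ok` and `Err` types and the `try_open` and `demo_missing_file` functions below are hypothetical; nothing here reflects a decision about how HVM or Bend format errors.

```python
import errno
import os
import tempfile
from dataclasses import dataclass
from typing import Union


@dataclass
class Ok:
    value: int  # the file descriptor on success


@dataclass
class Err:
    code: int      # OS error number, e.g. errno.ENOENT
    message: str   # human-readable description of the error


def try_open(path: str, flags: int) -> Union[Ok, Err]:
    """Open a file, turning OS failures into an Err value instead of raising."""
    try:
        return Ok(os.open(path, flags))
    except OSError as e:
        return Err(e.errno, os.strerror(e.errno))


def demo_missing_file() -> Union[Ok, Err]:
    """Try to open a path that cannot exist inside a fresh temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        return try_open(os.path.join(tmp, "missing.txt"), os.O_RDONLY)
```

With this shape every function in the interface would return either its normal result or an error code plus message, and the caller decides how to react.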