Rust code doesn't conform to commonly accepted `unsafe` semantics
ashpil opened this issue · 2 comments
Hi,
So I'm not very familiar with the project, but I've done a fair bit of Rust and one of my friends is currently taking 6120, so I looked a little bit through the code. It seems some of the Brilrs code isn't using the commonly accepted Rust definitions of unsafe, which means that one could accidentally run into undefined behavior. Now these issues aren't actually being realized at the moment, but it might be good to conform to Rust standards to ensure correctness in the future.
The commonly accepted Rust standard is to say that
A function should be marked as
unsafe
if it's possible to cause undefined behavior by calling it from safe code.
If one were to do (in entirely safe code)
let bbprog: BBProgram = prog.try_into()?;
interp::execute_main(&bbprog, out, &input_args, profiling)?;
one could actually cause undefined behavior if the program passed into execute_main
is not a valid program. This is because (if I'm reading correctly) execute_main
assumes the program has been type-checked
Lines 184 to 218 in 30c836c
which is not enforced by the Rust compiler. Thus one can cause undefined behavior in entirely safe code, which is against Rust conventions. As I said, this is of course not a real problem and it doesn't currently ever materialize itself, but it might be nice to write better Rust.
There's a couple possible solutions to this, if it is worth solving:
- Have
type_check
return some sort ofTypedProgram
, andexecute_main
can only take in aTypedProgram
, guaranteeing that the program has been type-checked. This solution makes the most sense to me. - Mark
execute_main
as unsafe. This is probably the least work but is very sad. - An unreasonable solution: rework nodes/type-checking so that type-checking returns a new tree that doesn't even have the nodes that should be unreachable. This is very Haskell-y and a huge pain to do.
Just to add some context to this issue. This specific issue arises from two design decisions of brilirs
: to separate type checking from the execution of instructions(which opens the possibility of type checking to be ignored)(motivated by #122 (review)) and the introduction of unsafe code for performance reasons(motivated by #122 (comment)).
I think this only became an issue with #180 which defined the public library interface of the brilirs
and allows other programmers to bypass type checking which is otherwise currently guaranteed to run in brilirs
.
To elaborate on the suggested solutions:
- To avoid code duplication this should probably be implemented with https://doc.rust-lang.org/rust-by-example/generics/new_types.html where the
TypedBBProgram
is just a wrapper aroundBBProgram
. On would need to make sure that a user couldn't create aTypedBBProgram
themselves which would defeat the purpose. (Low effort) - I may be more sympathetic to just marking
execute_main
as unsafe to avoid forced type checking/validation but I understand it isn't the cleanest solution. (Low effort) - You could take this to different levels. The only node that this applies to in the tree is the
Literal
node which I don't think Rust has the typing support to make any neater. However, (with benchmarking) this could probably be returned to safe code since I don't think it would add much runtime cost. The benefit of unsafe comes from accessing the correct constructor of theValue
enum which is the representation of data in theEnvironment
(aka stack) andHeap
. One could imagine parameterizing the Environment based on the type of the data it holds likeEnvironment<i64>
and maintaining an environment for each type of data. There would probably be a performance hit(more benchmarking) from having to generate multiple environments(Could also do multiple heaps?) on each function call but each of the accesses would avoid any unsafe code. There may be other type wizardry that I'm not aware of that does this better. (This is a comparatively larger amount of work and and could be slower).
Some additional(but not necessarily better) solutions:
- Remove the public api to
brilirs
or replace it with a function that calls type checking +execute_main
. (Low effort) - Do nothing and make a note of the precondition for valid Bril programs in the documentation. (Low effort and non-idiomatic).
- Remove the unsafe code and take the performance hit. (Low effort but against the principle of
brilirs
)
I don't have a strong opinion either way, but I just wanted to point out that the new type pattern actually could (if properly exported) guarantee that TypedBBProgram
is constructed only by the type checker and nobody else has the power to "wrap" stuff themselves.