Problem with value lifetimes when trying to use a symbol table

Question

Problem with value lifetimes when trying to use a symbol table

timsueberkrueb opened this issue 2 years ago · 4 comments

Hey, first of all, thank you for this crate!

I'm trying it out by implementing a small toy project. My initial plan was to use a simple symbol table, e.g. HashMap<String, mlir::Value<'c>> (where 'c is the Context lifetime) to translate variables (e.g. similar to how it is done in the MLIR example project).

However, my naive approach doesn't work because values resulting from operations are bound to the lifetime of the operation which is itself tied to the lifetime of the block.

A simple hack making it borrow check is to extend the following lifetimes to the context lifetime 'c like so:

-    pub fn append_operation(&self, operation: Operation) -> OperationRef {
+    pub fn append_operation(&self, operation: Operation) -> OperationRef<'c> {

-pub struct OperationRef<'a> {
+pub struct OperationRef<'c> {
     raw: MlirOperation,
-    _reference: PhantomData<&'a Operation<'a>>,
+    _reference: PhantomData<&'c Context>,
 }

However, I guess this is not desired. Do you have any advice on how to work around this problem or what changes should perhaps be made to the library, if any? Thanks a lot in advance!

Answer 1 · 2022-09-20T04:25:47.000Z

I think you need to structure your code so that every use of OperationRefs and OperationResults does not escape the scopes of their owner blocks.

I'm not sure how you are using the symbol map but you probably need to overlay local symbols to the symbol map of the parent scope. And I think Rust's lifetime logic should be able to handle this case.

For example, instead of:

fn compile_loop(..., variables: &mut HashMap<&str, OperationRef<'a>>) {
  ...

  variables.insert("foo", compile_opeartion());

  ...
}

fn compile_operation(...) -> Operation { ... }

do something like this using overlayed hash maps (e.g. https://github.com/raviqqe/turtle-build/blob/main/src/compile/chain_map.rs):

fn compile_loop(..., variables: &ChainMap<&str, OperationRef<'a>>) {
  let mut variables = variables.fork();
  ...

  variables.insert("foo", compile_opeartion());

  ...
}

fn compile_operation(...) -> Operation { ... }

If you can give me a detailed example, that would be great!

BTW, please make sure you use the latest one via cargo add melior --git https:://github.comm/raviqqe/melior or something for now as currently we have some CI problems with LLVM 15 and cannot release the master branch onto crates.io yet.

Answer 2 · 2022-09-20T07:47:05.000Z

Thanks a lot for the quick response!

I think this should work. However, wouldn't this lead to a linear lookup time for variables? Each time I pass an operation, I would have to fork my ChainMap essentially ending up with a linked list.

I'm wondering if it would be possible for melior to guarantee in its API that blocks/operations are bound by the lifetime of the enclosing region or context (perhaps modelling a safe subset of what is supported by mlir's ownership model)? Would this work if blocks/operations are only ever added (perhaps too limited?)?

//! - IR object references returned from functions that move ownership of
//!   arguments might get invalidated later.
//!   - This is because we need to borrow `&self` rather than `&mut self` to
//!     return such references.
//!   - e.g. `Region::append_block()`
//!   - Fix plan: Use dynamic check, such as `RefCell`, for the objects.

Would your plan on resorting to dynamic checks also solve this problem?

Answer 3 · 2022-09-20T15:33:12.000Z

wouldn't this lead to a linear lookup time for variables?

You can still put variables in the same scope as many as you want. This is still a linear time but it wouldn't be a bottleneck if the number of variables in scopes are greater enough than the number of scopes (e.g. if for statements.)

let mut variables = variables.fork();

variables.insert("foo", operation1);
variables.insert("bar", operation);
// ...

I'm wondering if it would be possible for melior to guarantee in its API that blocks/operations are bound by the lifetime of the enclosing region or context (perhaps modelling a safe subset of what is supported by mlir's ownership model)? Would this work if blocks/operations are only ever added (perhaps too limited?)?

I'm not sure either if any operations are guaranteed not to be modified until we run passes or translations. But essentially, "operations are owned by blocks or modules" means that the owners can do whatever they want unless they are borrowed. I will take a look.

Would your plan on resorting to dynamic checks also solve this problem?

Not sure either for this one. I'll take a look at how many methods we have for blocks or modules to modify their owned operations during IR building.

Answer 4 · 2023-06-02T11:33:16.000Z

It's been improved especially by #177 and other small changes. At least, I don't have any problems at compile time when building small languages. Closing for now but feel free to re-open this.