HeliosLang/compiler

Optimization: Make use of force/delay

Closed this issue ยท 8 comments

Problem

Currently, Helios doesn't use delay in its compiled UPLC, and force is only used when it's a must (e.g., Plutus builtins). Zero-argument functions & methods in both Helios Lang and Helios IR are implemented as lambdas with "ignored" variables - and those lambdas are later applied with a unit constant. However, it's inefficient compared to using delay and force in terms of both size & execution units.

Zero-argument functions or delayed computations, in general, don't have a native representation in UPLC; so there's no "right" or "wrong" way to do this, we just need to use the more efficient ones, especially when we don't consider Typed PLC a compilation target.

Helios IR UPLC (current) UPLC (proposed)
Define () -> { expr } (lam _ expr)
1 lam
(delay expr)
1 delay
Call f() [f (con unit ()]
1 con, 1 apply
(force f)
1 force

In the Define case, delay and lam have the same serialization size and execution units (with the current protocol parameters). But in the Call one, con unit () is already worse than a single force, let alone an extra apply.

Note: This not only applies to user-defined functions but several Helios constructs and builtins, especially ones using Plutus ifThenElse under the hood, as they're all implemented using zero-argument lambdas in Helios IR. So, such optimization would benefit almost all Helios smart contracts.

Benchmark

Two similar programs, the only difference with program 2 is the use of force & delay.

Program 1

(program
  1.0.0
  (lam
    n
    [
      [
        [
          [
            (force (builtin ifThenElse))
            [ [ (builtin lessThanInteger) n ] (con integer 0) ]
          ]
          (lam _ [ [ (builtin subtractInteger) (con integer 0) ] n ])
        ]
        (lam _ n)
      ]
      (con unit ())
    ]
  )
)

Program 2

(program
  1.0.0
  (lam
    n
    (force
      [
        [
          [
            (force (builtin ifThenElse))
            [ [ (builtin lessThanInteger) n ] (con integer 0) ]
          ]
          (delay [ [ (builtin subtractInteger) (con integer 0) ] n ])
        ]
        (delay n)
      ]
    )
  )
)
Size ExUnits (1000) ExUnits (-1000)
1 24 bytes cpu: 704063
memory: 1902
cpu: 1002540
memory: 2304
2 23 bytes cpu: 681063
memory: 1802
cpu: 979540
memory: 2204

Proposed Solution

Helios Lang

No syntax change would need to be introduced. Although 2 potential improvements can be made later

  • A shortcut for defining zero-argument functions / delayed computations might be a potential improvement, as () -> ReturnType { expr } is quite verbose, () { expr } might be good enough.

  • Revisit zero-argument methods, since with how it's implemented in Helios at the moment, an extra apply/force is always needed for a struct.method(). I understand that it's necessary for letting devs have access to self-bound functions via struct.method but maybe a computed property syntax would be good when devs want to remove this overhead?

Helios IR

Since there's no currying in Helios, it might be possible to change how zero-argument functions are defined/called without affecting the correctness. There's also no ambiguity since zero-argument functions don't really exist in UPLC, so it can be considered a "special" construct in Helios Lang & IR.

  • () -> { expr } is compiled to (delay expr).
  • fn() is compiled to (force fn).

Very nice proposal!

Changes to how Helios IR is turned into UPLC are indeed very easy. It will though change the byte-code, and thus address, of all Helios smart contracts. But that is probably acceptable because we're still not at version 1.0.0.

I think the UPLC evaluator will also need some changes to handle force/delay properly (so it actually throws an error if delay/force are applied wrongly, right now those terms don't do anything except increase the exBudget).

Helios Lang changes

I'm unsure if the extra language features you propose are necessary:

  • giving smart contract developers the option to use delay explicitly might confuse the readers (delay/force is a concept that takes time to understand)
  • I feel like zero-argument methods and anon-funcs aren't very common
  • giving smart contract developers the option to use delay explicitly might confuse the readers (delay/force is a concept that takes time to understand)

Yeah, that's why I think it's important to keep the zero-argument function syntax intact, and only change how it's compiled to UPLC under the hood (sorry I didn't make it clear in the issue description). The proposed change is purely a syntactic sugar for anonymous functions - would be great if the return type of anonymous functions can be inferred.

  • I feel like zero-argument methods and anon-funcs aren't very common

Zero-argument anon-funcs are indeed uncommon, considering how verbose it is currently. For zero-argument methods, not so, because it's the most straightforward way to define some derived/computed properties for a struct/enum. The other 2 ways don't look very natural: top-level functions with 1 argument, and associated functions with 1 argument.

Yeah, that's why I think it's important to keep the zero-argument function syntax intact, and only change how it's compiled to UPLC under the hood (sorry I didn't make it clear in the issue description). The proposed change is purely a syntactic sugar for anonymous functions - would be great if the return type of anonymous functions can be inferred.

I suggest opening a new issue for 'Type inference of anonymous function return types', with some example code samples that clearly demonstrate it would make things more readable.

Zero-argument anon-funcs are indeed uncommon, considering how verbose it is currently. For zero-argument methods, not so, because it's the most straightforward way to define some derived/computed properties for a struct/enum. The other 2 ways don't look very natural: top-level functions with 1 argument, and associated functions with 1 argument.

Not sure what you mean by 'top-level functions with 1 argument, and associated functions with 1 argument'. Many languages have special syntax for getters, perhaps such a thing could be introduced into Helios, but I actually think that the optimization is able to handle the zero-argument case properly, so I don't think this is a performance issue (I need to verify this though).

I suggest opening a new issue for 'Type inference of anonymous function return types', with some example code samples that clearly demonstrate it would make things more readable.

Totally agree!

Not sure what you mean by 'top-level functions with 1 argument, and associated functions with 1 argument'. Many languages have special syntax for getters, perhaps such a thing could be introduced into Helios, but I actually think that the optimization is able to handle the zero-argument case properly, so I don't think this is a performance issue (I need to verify this though).

Here's an example

spending foo

struct Datum {
  foo: Int
  bar: Int

  // Normal method
  func sum1(self) -> Int {
    self.foo + self.bar
  }

  // Associated function
  func sum2(datum: Datum) -> Int {
    datum.foo + datum.bar
  }
}

// Top-level function
func sum3(datum: Datum) -> Int {
  datum.foo + datum.bar
}

func main(datum: Datum) -> Bool {
  datum.sum1() + Datum::sum2(datum) + sum3(datum) > 123
}

Those 3 functions compile to different Helios IR code:

/*__user__Datum__sum1*/
(self) -> {
    () -> {
        __helios__int____add(__user__Datum__foo(self))(__user__Datum__bar(self))
    }
}

/*__user__Datum__sum2*/
(datum) -> {
    __helios__int____add(__user__Datum__foo(datum))(__user__Datum__bar(datum))
}

/*sum3*/
(datum) -> {
    __helios__int____add(__user__Datum__foo(datum))(__user__Datum__bar(datum))
}

sum1 requires an extra apply/force to call in UPLC. The IR of sum2 and sum3 is what I expect from a getter / computed property (in terms of performance). In a broader sense, not just in the case of getters, zero-argument methods ((self) -> { () -> { ... } } in Helios IR) are always less efficient than one-argument top-level or associated functions.

P/S: It feels like micro-optimizations at this point ๐Ÿ˜…. But I think it might be worth it... as zero-argument methods are really common in Helios builtins.

Force/delay optimization now implemented in dev branch.

Savings on the ./test-profile.js benchmark are:

  • 1.5% on cpu
  • 3.5% on mem
  • 4% on size

I'll also have a look at the zero-argument method case. The type-checker could be augmented to detect getter-like methods, and subsequently generate IR getters instead of regular methods.

Latest dev branch optimizes zero-arg functions at the IR level.

I will close this issue once merged into main

Merged as of v0.9.0