typelevel/spire

RFE: inline methods for spire algebra typeclass instances

Opened this issue · 3 comments

I believe it could be an improvement, especially for scala-3, to provide inline definitions of algebra typeclass methods.

To try and illustrate, consider this example. Here I will define a couple of variations on an add function, and also a definition of additive semigroup whose plus method is inline:

object defs:
    import algebra.ring.*
    // an in-line function using semigroup typeclass
    inline def add[V](x: V, y: V)(using alg: AdditiveSemigroup[V]): V =
        alg.plus(x, y)
    // a non-inline version of same function
    def add2[V](x: V, y: V)(using alg: AdditiveSemigroup[V]): V =
        alg.plus(x, y)
    // a typeclass with inline 'plus' method
    given AdditiveSemigroup[Double] with
        inline def plus(x: Double, y: Double): Double =
            x + y
    val x = 1.0

To compare them, here are two equivalent blocks of code, but one uses the standard Spire algebras and the other uses the inline typeclass above:

object useSpire:
    import spire.implicits.*
    import defs.*
    val z = add(x, 1.0)

object useInline:
    import defs.*
    import defs.given
    val z = add(x, 1.0)

If you compare the resulting bytecode, I believe the typeclass with inline methods gives a more efficient result, using the inline add function:

using standard spire algebra:

  public static {};
    Code:
       0: new           #2                  // class coulomb/useSpire$
       3: dup
       4: invokespecial #18                 // Method "<init>":()V
       7: putstatic     #20                 // Field MODULE$:Lcoulomb/useSpire$;
      10: getstatic     #25                 // Field spire/implicits$.MODULE$:Lspire/implicits$;
      13: invokevirtual #29                 // Method spire/implicits$.DoubleAlgebra:()Lalgebra/ring/Field;
      16: getstatic     #34                 // Field coulomb/defs$.MODULE$:Lcoulomb/defs$;
      19: invokevirtual #38                 // Method coulomb/defs$.x:()D
      22: invokestatic  #44                 // Method scala/runtime/BoxesRunTime.boxToDouble:(D)Ljava/lang/Double;
      25: dconst_1
      26: invokestatic  #44                 // Method scala/runtime/BoxesRunTime.boxToDouble:(D)Ljava/lang/Double;
      29: invokeinterface #50,  3           // InterfaceMethod algebra/ring/Field.plus:(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
      34: invokestatic  #54                 // Method scala/runtime/BoxesRunTime.unboxToDouble:(Ljava/lang/Object;)D
      37: putstatic     #56                 // Field z:D
      40: return

Using algebra with inline method:

  public static {};
    Code:
       0: new           #2                  // class coulomb/useInline$
       3: dup
       4: invokespecial #18                 // Method "<init>":()V
       7: putstatic     #20                 // Field MODULE$:Lcoulomb/useInline$;
      10: getstatic     #25                 // Field coulomb/defs$.MODULE$:Lcoulomb/defs$;
      13: invokevirtual #29                 // Method coulomb/defs$.x:()D
      16: dconst_1
      17: dadd
      18: putstatic     #31                 // Field z:D
      21: return

If both arguments are literal constants, the inlined improvements are even more pronounced:

object useInline:
    import defs.*
    import defs.given
    val z = add(1.0, 1.0)  // two constant args

scala-3 compiles it down to a single constant value in the byte-code:

  public static {};
    Code:
       0: new           #2                  // class coulomb/useInline$
       3: dup
       4: invokespecial #18                 // Method "<init>":()V
       7: putstatic     #20                 // Field MODULE$:Lcoulomb/useInline$;
      10: ldc2_w        #21                 // double 2.0d
      13: putstatic     #24                 // Field z:D
      16: return

Lastly, if one uses the non-inline add2 function, both versions of the typeclass yield the same bytecode, which looks like this:

object useInline:
    import defs.*
    import defs.given
    val z = add2(x, 1.0)
  public static {};
    Code:
       0: new           #2                  // class coulomb/useInline$
       3: dup
       4: invokespecial #23                 // Method "<init>":()V
       7: putstatic     #25                 // Field MODULE$:Lcoulomb/useInline$;
      10: getstatic     #30                 // Field coulomb/defs$.MODULE$:Lcoulomb/defs$;
      13: getstatic     #30                 // Field coulomb/defs$.MODULE$:Lcoulomb/defs$;
      16: invokevirtual #34                 // Method coulomb/defs$.x:()D
      19: invokestatic  #40                 // Method scala/runtime/BoxesRunTime.boxToDouble:(D)Ljava/lang/Double;
      22: dconst_1
      23: invokestatic  #40                 // Method scala/runtime/BoxesRunTime.boxToDouble:(D)Ljava/lang/Double;
      26: getstatic     #43                 // Field coulomb/defs$given_AdditiveSemigroup_Double$.MODULE$:Lcoulomb/defs$given_AdditiveSemigroup_Double$;
      29: invokevirtual #47                 // Method coulomb/defs$.add2:(Ljava/lang/Object;Ljava/lang/Object;Lalgebra/ring/AdditiveSemigroup;)Ljava/lang/Object;
      32: invokestatic  #51                 // Method scala/runtime/BoxesRunTime.unboxToDouble:(Ljava/lang/Object;)D
      35: putstatic     #53                 // Field z:D
      38: return

In conclusion, I believe that defining algebra typeclasses with inline methods will never be worse than the current definitions, but in cases where the calling code is inlined, may be substantially better.

Something like this might need to be part of the .../scala-3/spire/... specialized code. It is possible that it's mostly an advantage for the native types Int, Double, etc, less so for the non-atomic types like BigInt, Rational, etc, in which case only a few would need specialized inline method definitions

Yes, this is a good idea, would merge a PR :)