cashapp/paraphrase

Optimize argument generation

Closed this issue · 2 comments

The current code always uses mapOf. For example:

val arguments = mapOf("first" to first, "second" to second, "third" to third, "fourth" to fourth, "fifth" to fifth)

or

val arguments = mapOf("0" to arg0, "1" to arg1, "2" to arg2, "3" to arg3, "4" to arg4)

This is inefficient for a few reasons:

  • It creates a bunch of kotlin.Pair instances and an object array only for mapOf to throw both away and have the changed into java.util.Map.Entry instances in a separate array internally.
  • The map never changes in size and mapOf is backed by LinkedHashMap which over-provisions its backing storage.
  • The keys are sometimes contiguous numbers in which case an array can be used.

For contiguous integers keys a simple Array<Any?> should be constructed (using arrayOf) and passed to the format overload which accepts Object[]. This requires a change to FormattedResource to accept both (but perhaps only guarantee one is set).

data class FormattedResource(
  @StringRes val id: Int,
  val arrayArguments: Array<Any?>? = null,
  val mapArguments: Map<String, Any>? = null,
) {
  init {
    check((arrayArguments == null) != (mapArguments == null)) {
      "One of mapArguments or arrayArguments must be set to non-null."
    }
  }
}

Avoiding the map is a big enough win to add a second codepath. Maps are (relatively) expensive things! It's not worth an interface with two implementations either, no one should be creating this type by hand.

-val arguments = mapOf("0" to arg0, "1" to arg1, "2" to arg2, "3" to arg3, "4" to arg4)
+val arguments = arrayOf(arg0, arg1, arg2, arg3, arg4)

When a map is needed, either for non-contiguous integer keys or arbitrary strings, we should use ArrayMap. We don't need a dependency on androidx.collections, the one baked in the platform will be just fine. With this we can ensure it never over-provisions backing storage and it uses a backing array (hence the name) rather than a complex sparse bucket + linked list structure.

-val arguments = mapOf("first" to first, "second" to second, "third" to third, "fourth" to fourth, "fifth" to fifth)
+val arguments = ArrayMap<String, Any?>(5)
+arguments.put("first", first)
+arguments.put("second", second)
+arguments.put("third", third)
+arguments.put("fourth", fourth)
+arguments.put("fifth", fifth)

core-ktx contains an arrayMapOf which may be tempting to use but we absolutely should not. It suffers the exact same problems as mapOf in allocating Pairs and an array which are both thrown away.

Despite looking like more code, we actually end up generating less bytecode in the end by doing this change! So it's a perf AND size win.

There's another optimization here for maps that can be done on second pass: ensuring that we sort the entries so that they are added as simply as possible into the backing array. The array uses the hashcode of the key and then a binary search to place items, so we basically do that operation in the plugin an emit the put operations in an order that will cause the runtime computation to be as efficient as possible.

If this all seems overkill, it is! But because we're going to generate hundreds or thousands of these, the economics of generated code mean that small optimizations have a huge impact.

Actions:

  • Use ArrayMap for map (#26)
  • Use Array for contiguous numbered args (#42)
  • Optimize ArrayMap order (#185)

Actually I think the FormattedResource type can have arguments just become Any and be documented to be a Map or Array since that's how MessageFormat/Format's APIs work already.

FormattedResource should override the default equals function if it is used to hold arrays.