Kotlin List Comprehensions

One of the nicest things about the Python programming language is the conciseness with which you can write your programs. A big part of that is the Python list comprehension.

With it, you can create and fill lists programmatically in a single line.

Think of a loop where the result of each iteration is appended to a list, which is then returned.

The following is a really simplistic list comprehension in Python, essentially the same as just converting the range object to a list, but you can go a lot further with the construct.

[i for i in range(5)]

List comprehensions are by far my favorite feature of Python, and they were the thing I instantly missed the most when working in other languages.

But did you know that you can actually have list comprehensions in Kotlin, as well as in most modern programming languages? The secret is the map function, found not only in Kotlin but in almost any high-level language, which, combined with range expressions, can achieve the same level of conciseness as a list comprehension.

Let's look at the example from above. We can use Kotlin's nice range expressions and achieve essentially the same thing! (One small caveat: Kotlin ranges are inclusive, so (0..5) yields six elements; to match Python's range(5) exactly, you would write (0 until 5).)

(0..5).map { it }

This, of course, doesn't just work with range expressions; just like in Python, it works with every Iterable.
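
As a small, hedged sketch of going a step further (the word list here is made up purely for illustration), this is what a "comprehension" over a plain list could look like, including a condition, with the rough Python equivalent in a comment:

fun main() {
    // Roughly the Kotlin counterpart of the Python comprehension
    // [len(w) for w in words if w.startswith("k")]
    val words = listOf("kotlin", "java", "korea", "python")

    val lengths = words
        .filter { it.startsWith("k") }  // the "if" clause of the comprehension
        .map { it.length }              // the expression in front of "for"

    println(lengths) // [6, 5]
}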

A major advantage this has over Python is that we have a true lambda here, which allows us to declare local variables, making more complex list comprehensions possible. Not only that, Kotlin list comprehensions are also usually very efficient, because the map function is implemented as an inline function. If we look at the decompiled Java code, we can see that the lambda is completely eliminated after compilation.
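
Before we get to that decompiled output, here is a quick sketch of what a multi-statement lambda buys you (the numbers are made up for illustration): we can introduce locals and use several statements, which a single-expression Python comprehension can't do directly.

fun main() {
    val temperatures = listOf(21.3, 18.9, 25.4)

    val labels = temperatures.map {
        // A real lambda body: locals and multiple statements are fine,
        // and the last expression becomes the mapped value.
        val verdict = if (it > 20.0) "warm" else "cool"
        "$it °C ($verdict)"
    }

    println(labels) // [21.3 °C (warm), 18.9 °C (cool), 25.4 °C (warm)]
}

With that aside out of the way, let's get back to the simple range example and what it compiles down to.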

fun main() {
    val list = (0..5).map { it }

    println(list)
}
And here is the decompiled output:

Iterable $this$map$iv = (Iterable)(new IntRange(var1, 5));
Collection destination$iv$iv = (Collection)(new ArrayList(CollectionsKt.collectionSizeOrDefault($this$map$iv, 10)));
Iterator var6 = $this$map$iv.iterator();

while(var6.hasNext()) {
   int item$iv$iv = ((IntIterator)var6).nextInt();
   Integer var11 = item$iv$iv;
   destination$iv$iv.add(var11);
}

List list = (List)destination$iv$iv;
System.out.println(list);

One notable thing here is that the Kotlin compiler doesn't realize that the range could be replaced with a simple loop, so we actually end up with a lot more object allocations than necessary.
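
For comparison, here is a hedged sketch of what writing that loop by hand could look like. Kotlin does special-case for loops over a range literal, so this version allocates no IntRange object, although the values still get boxed as they go into the List:

// Roughly what the compiler could lower the map call to, written out by hand.
fun rangeToListByHand(): List<Int> {
    val destination = ArrayList<Int>(6)
    for (i in 0..5) {        // a for loop over a range literal compiles to a plain counting loop
        destination.add(i)   // the Ints are still boxed when added to the list
    }
    return destination
}

fun main() {
    println(rangeToListByHand()) // [0, 1, 2, 3, 4, 5]
}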

In this case there is actually a just-as-neat but more efficient way to write this. The Kotlin IntArray constructor takes a size and, optionally, an init lambda, meaning you can write the following instead, which yields very clean bytecode.

fun main() {
    val list = IntArray(5) { it }

    println(list)
}
Decompiled, this becomes:

byte var1 = 5;
int[] var2 = new int[var1];

for(int var3 = 0; var3 < var1; var2[var3] = var3++) { }

System.out.println(var2);

I couldn't have written it better myself, and we now end up with just a single object allocation. This is really only interesting in high-performance environments, but I still included it as a reminder that looking at the generated bytecode can often be worth it, as the benchmarks will show in a moment.

Let's look at the execution times for the two Kotlin versions, Java, and a reference implementation in C.

In order to get meaningful measurements, we'll create 1000 arrays with 1_000_000 members each.
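
The article doesn't show the harness itself, so here is a minimal sketch of how such a measurement could look; the structure and names are my own, and a serious benchmark would use something like JMH with proper warm-up, so treat anything it prints as a rough ballpark only.

import kotlin.system.measureTimeMillis

fun main() {
    val iterations = 1_000
    val size = 1_000_000
    var sink = 0L // consume the results so the JIT cannot optimize the work away

    // The "nice" map-over-a-range version.
    val mapTime = measureTimeMillis {
        repeat(iterations) {
            val list = (0 until size).map { it }
            sink += list[size - 1]
        }
    }

    // The IntArray initializer version.
    val arrayTime = measureTimeMillis {
        repeat(iterations) {
            val array = IntArray(size) { it }
            sink += array[size - 1]
        }
    }

    println("map over range: $mapTime ms")
    println("IntArray init:  $arrayTime ms")
    println(sink) // keep the sink observable
}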

Creating the array the nice way in Kotlin took around 3.5 s to execute. This is not that bad a time, but do keep in mind that what we are actually paying for here is handling the range object. If we mapped over an existing Iterable instead, this would be substantially faster.

If we use the IntArray initializer instead, we can get down to an impressive 800 ms execution time.

As this was more or less just clean Java code, we can assign the same 800 ms to Java. Keep in mind that hand-written, old-school Java will probably give you a slight edge in performance, but if you look at the bytecode Kotlin generates, you can make sure you match it.

The more interesting thing happens with the native C code because, as it turns out, Java outperformed the native implementation by a huge margin. How is that possible? The main issue here is that my C compiler isn't as good at optimizing this code as the JVM is.

As it turns out, the JVM keeps the iteration variables in hardware registers, while the C compiler doesn't.

If we force that behavior in C, we can cut our execution time roughly in half. Java is still twice as fast, presumably because allocations are handled more efficiently by the JVM; only when we move the allocation out of the loop can we finally match the speed of Java.

I was absolutely blown away by those results, I have to say. Optimized C will probably always beat Java, but damn, has the race gotten close.

That's all I have for you today, see you next time around!