TrigerSoft/jaque

Consider using a lambda serialization hack to crack the lambdas

tomwhoiscontrary opened this issue · 4 comments

This is the approach used in Jinq:

https://github.com/my2iu/Jinq/blob/master/analysis/src/com/user00/thunk/SerializedLambda.java

The advantage of this approach over the dumpProxyClasses approach is that it does not require setting a system property or writing to the filesystem. Both of these things may be forbidden, or at least highly undesirable, in some situations.

One disadvantage is that it requires the input lambdas to be serializable, but this does not seem to be a particularly onerous requirement in practice. It may also be fragile in the face of changes in the definition of the JDK's SerializedLambda class, but I'm not sure about that.

Nice catch!

  • It is indeed fragile, since it depends on the internals of the JDK's SerializedLambda class, but in practice that class is unlikely to change. It can be added as a fallback mechanism for when dumpProxyClasses is not set.
  • The serializability requirement is indeed not onerous if you know the technique.
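For readers who do not know the technique referred to above: a lambda is serializable when its target type is, which can be arranged either with an intersection cast or with a functional interface that extends Serializable. A minimal sketch follows; `SerializableLambdas` and `SerializableFunction` are hypothetical names used for illustration and are not part of jaque or the JDK.

```java
import java.io.Serializable;
import java.util.function.Function;

public class SerializableLambdas {

    // Hypothetical helper: any lambda assigned to this type is serializable.
    public interface SerializableFunction<T, R> extends Function<T, R>, Serializable {
    }

    // Option 1: an intersection cast at the point where the lambda is written.
    public static Function<Integer, Boolean> viaCast() {
        return (Function<Integer, Boolean> & Serializable) r -> r < 6;
    }

    // Option 2: target the helper interface, so the caller needs no cast.
    public static SerializableFunction<Integer, Boolean> viaInterface() {
        return r -> r < 6;
    }

    public static void main(String[] args) {
        // Both lambdas implement Serializable and still behave as ordinary functions.
        System.out.println(viaCast() instanceof Serializable);
        System.out.println(viaInterface() instanceof Serializable);
        System.out.println(viaCast().apply(5));
    }
}
```

The second option is probably the more convenient one for a library API, since a method that accepts the serializable interface as a parameter lets callers pass plain lambdas without writing the cast.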

Nice feature and I'll definitely find some time to add it or accept a pull request ;).

BTW, the stream manipulation is not required; it is enough to return the right class from ObjectInputStream's resolveClass:

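    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {

        Class<?> resolvedClass = super.resolveClass(desc);
        // Swap the JDK class for the unqualified SerializedLambda, which here
        // is the library's own look-alike class (as in the Jinq approach).
        if (resolvedClass == java.lang.invoke.SerializedLambda.class)
            return SerializedLambda.class;
        return resolvedClass;
    }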

Added an initial implementation: f8340a0.
It is not final: there is still a problem with reducing expressions such as r -> (r < 6 ? r > 1 : r < 4) to boolean.

I just realised that it's simpler and more robust to do the hack on serialization, rather than on deserialization. Override replaceObject in ObjectOutputStream (and call enableReplaceObject(true)), check if the object is a SerializedLambda, and if it is, serialize a replacement. At that point, you have an instance of the JDK's SerializedLambda, so you don't need to make assumptions about the field layout, you can just call getters as usual.

Even better, you don't need to serialize anything special - you can just capture the SerializedLambda into a variable somewhere, and then use it directly. You can completely discard the serialized bytes, and don't need to do any deserialization!

This works:

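    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.OutputStream;
    import java.io.UncheckedIOException;
    import java.lang.invoke.SerializedLambda;
    import java.util.Objects;

    public static SerializedLambda crack(Object obj) {
        // Remembers the first SerializedLambda that passes through the stream.
        class LambdaCapturingObjectOutputStream extends ObjectOutputStream {
            private SerializedLambda lambda;

            LambdaCapturingObjectOutputStream(OutputStream out) throws IOException {
                super(out);
                // Without this, replaceObject is never called.
                enableReplaceObject(true);
            }

            @Override
            protected Object replaceObject(Object candidate) {
                // The lambda's writeReplace has already run, so a serializable
                // lambda shows up here as the JDK's SerializedLambda.
                if (lambda == null && candidate instanceof SerializedLambda) {
                    lambda = (SerializedLambda) candidate;
                }
                return candidate;
            }
        }

        // The bytes are discarded; OutputStream.nullOutputStream() needs Java 11+.
        try (LambdaCapturingObjectOutputStream oout = new LambdaCapturingObjectOutputStream(OutputStream.nullOutputStream())) {
            oout.writeObject(obj);
            // Fails with a NullPointerException if obj was not a serializable lambda.
            return Objects.requireNonNull(oout.lambda, String.valueOf(obj));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }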
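As a rough illustration of what the captured object gives you, the sketch below wraps the same idea in a self-contained class and reads a few of the JDK's SerializedLambda getters. The class name `LambdaCrackDemo`, the `SerializableFunction` interface, and the hand-written discarding stream (used so the sketch does not depend on `OutputStream.nullOutputStream()`) are assumptions made for illustration only.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.lang.invoke.SerializedLambda;
import java.util.Objects;
import java.util.function.Function;

public class LambdaCrackDemo {

    // Hypothetical helper interface that makes its lambdas serializable.
    public interface SerializableFunction<T, R> extends Function<T, R>, Serializable {
    }

    // Sink that throws away every byte; the serialized form is never used.
    private static final class DiscardingOutputStream extends OutputStream {
        @Override
        public void write(int b) {
            // intentionally empty
        }
    }

    // Stream that remembers the first SerializedLambda passing through it.
    private static final class CapturingStream extends ObjectOutputStream {
        SerializedLambda lambda;

        CapturingStream() throws IOException {
            super(new DiscardingOutputStream());
            enableReplaceObject(true);
        }

        @Override
        protected Object replaceObject(Object candidate) {
            if (lambda == null && candidate instanceof SerializedLambda) {
                lambda = (SerializedLambda) candidate;
            }
            return candidate;
        }
    }

    public static SerializedLambda crack(Object obj) {
        try (CapturingStream out = new CapturingStream()) {
            out.writeObject(obj);
            return Objects.requireNonNull(out.lambda, String.valueOf(obj));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int limit = 6; // captured by the lambda below
        SerializableFunction<Integer, Boolean> f = r -> r < limit;
        SerializedLambda sl = crack(f);

        // Where the lambda body lives; the method name is compiler-generated.
        System.out.println(sl.getImplClass() + "." + sl.getImplMethodName()
                + sl.getImplMethodSignature());
        // Captured variables travel with the lambda (the int is boxed).
        System.out.println(sl.getCapturedArgCount() + " captured: " + sl.getCapturedArg(0));
    }
}
```

The implementation class, method name, and signature are enough to locate the lambda body in the bytecode of the class that declares it, which is the information the dumpProxyClasses route was otherwise needed for.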

I know you're not maintaining this project any more, but I wanted to write this down somewhere.

You are completely right. I realized this a few years ago and implemented it in Jaque's successor project: https://github.com/streamx-co/ExTree/blob/master/src/main/java/co/streamx/fluent/extree/expression/ExpressionClassCracker.java#L143-L162