archimatetool/archi-scripting-plugin

[Bug] GraalVM: NullPointerException accessing VisualObject bounds

rchevallier opened this issue ยท 43 comments

Version of jArchi, Operating System

Windows 10 20H2

Expected Behaviour

let x = vObj.bounds.x;
Return the bounds.x value
Works OK in ES6 dialect

Actual Behaviour

In GraalVM, throws a NullPointerException when accessing a property of a bounds
throws Script Error: javax.script.ScriptException: java.lang.NullPointerException

Steps to Reproduce the Behaviour (with attached test files/script)

run the script:

"use strict";
console.clear();
let aView = model.createArchimateView("Test view");
let aDO = model.createElement("data-object", "Foo");
let vObj = aView.add(aDO, 10, 10, -1, -1);
console.log(vObj.bounds);
let x = vObj.bounds.x;
console.log(x);

The NPE is thrown because a null object is not caught in jArchi's console.log() native code. I can fix that.

However, the cause of the NPE is due to GraalVM. It seems that you have found one of its quirks.

Unlike Nashorn, GraalVM does not, by default, support JS type getters as if the member was a field like this:

var member = object.member;

You have to use the Java type getter:

var member = object.getMember();

But the good news is that, according to the GraalVM documentation, you can set a System property to provide Nashorn compatibility:

System.getProperties().put("polyglot.js.nashorn-compat", "true");

And then the Nashorn type getters work. jArchi sets that system property and it works.

But this seems not to apply to Java's Map type objects.

Here's the Java native code for object.getBounds():

    public Map<String, Integer> getBounds() {
        IBounds b = getEObject().getBounds();
        
        HashMap<String, Integer> map = new HashMap<>();
        map.put("x", b.getX());
        map.put("y", b.getY());
        map.put("width", b.getWidth());
        map.put("height", b.getHeight());
        
        return map;
    }

It returns a HashMap object.

Nashorn does some magic so we can access the map's members as if they were fields like var x = bounds.x. But in GraalVM this does not work for Map objects.

So, you have to access the members like this:

var x = bounds.get("x");
var y = bounds.get("y");
var width = bounds.get("width");
var height = bounds.get("height");

Unless there is some setting that I've missed this is how it is. As I said, GraalVM support is experimental.

I found this issue:

oracle/graaljs#397

Another option is to wrap your Map into a ProxyObject. Then you can limit the changes to the JavaSide and need not change the JavaScript code. For a flat array, use ProxyObject.fromMap(yourJavaMap) on the Java side before passing the result to JavaScript (e.g. via a binding). For complex (nested) data structures, you will need your own implementation of the ProxyObject interface.

This works:

    public Object getBounds() {
        IBounds b = getEObject().getBounds();
        
        HashMap<String, Object> map = new HashMap<>();
        map.put("x", b.getX());
        map.put("y", b.getY());
        map.put("width", b.getWidth());
        map.put("height", b.getHeight());
        
        return ProxyObject.fromMap(map);
    }

However, this won't work if the Nashorn option is set in jArchi. We'd have to somehow figure out if GraalVM is running and return a ProxyObject if it is. Not sure about that at the moment.

I've come up with a workaround that will return a ProxyObject for a Java Map object if the Script Engine is GraalVM .

However, I'm not sure about this...

Looking at the Graal VM source code I found a ProxyArray class:

https://github.com/oracle/graal/blob/master/sdk/src/org.graalvm.polyglot/src/org/graalvm/polyglot/proxy/ProxyArray.java

There is a ProxyArray.fromList() method that returns a ProxyArray object instead of a Java List, similar to the ProxyObject.fromMap() method.

I have a horrible feeling that in order to support certain access methods on a Java List we will have to return a ProxyArray object instead of the Java List. If that is the case this would require a lot of changes to jArchi code.

More info here:

oracle/graaljs#369

But now the problem with adding the ProxyObject is that GraalVM does not support bounds.get("x")

On Nashorn:

var x = bounds.x returns x
var x = bounds["x"] returns x
var x = bounds.get("x") returns x

Without the ProxyObject (current GraalVM implementation):

var x = bounds.x returns null
var x = bounds["x"] returns null
var x = bounds.get("x") returns x

With the ProxyObject:

var x = bounds.x returns x
var x = bounds["x"] returns x
var x = bounds.get("x") throws an Exception[1]

So I don't know which is the best option. I'm thinking that bounds.get("x") should be supported.

[1] org.graalvm.polyglot.PolyglotException: TypeError: invokeMember (get) on {x=8, width=120, y=74, height=60} failed due to: Unknown identifier: get

I reported the latter problem:

oracle/graaljs#402

So I don't know which is the best option. I'm thinking that bounds.get("x") should be supported.

For me this doesn't really make sens in javascript, while bounds.x and bounds["x"] do. So that's not a big loss.

For me this doesn't really make sens in javascript, while bounds.x and bounds["x"] do. So that's not a big loss.

I think you're right. In JS for these type of objects we don't care that it's a HashMap.

For me this doesn't really make sens in javascript, while bounds.x and bounds["x"] do. So that's not a big loss.

Therefore we have to add the ProxyObject.

But now the problem with adding the ProxyObject is that GraalVM does not support bounds.get("x")

It does now. :-)

After the usual hours spent on fruitless Google searches I discovered this:

When you implement ProxyObject there is a method that returns the value of a key

public Object getMember(String key) {
   // return the value in the HashMap
}

When we call bounds.get("x") in JS this method is called with the key of "get".

So what we do is return a ProxyExecutable method to handle this:

private final ProxyExecutable getHandler = args -> {
    String key = args[0].asString(); // Get the key as a string. This will be "x"
    return get(key); // Return the HashMap value of this key
};

@Override
public Object getMember(String key) {
    if("get".equals(key)) {
        return getHandler;
    }
    
    return proxyObject.getMember(key);
}

@jbsarrodie This begs a question:

With the getBounds() method we are returning a Java map like this:

public Map<String, Object> getBounds() {
    IBounds b = getEObject().getBounds();
    
    Map<String, Object> map = ProxyUtil.createMap(); // Creates a HashMap
    map.put("x", b.getX());
    map.put("y", b.getY());
    map.put("width", b.getWidth());
    map.put("height", b.getHeight());
    
    return map;
}

But I wonder if we should instead return the IBounds object? If we do that we can call the following in JS:

var x = bounds.x;
var x = bounds["x"];
var x = bounds.getX();

I can't see any advantages to using a HashMap because if we have to use ProxyObject for the HashMap to support bounds.x and bounds["x"] then we can't access any of HashMap's methods like size() anyway.

But I wonder if we should instead return the IBounds object?

TBH, I don't remember why we used a HashMap in the first place. I guess that was because we don't want to provide access to the real IBounds object and let user change it bypassing undo stack, but we could return a copy.

Is it the only place where we did this ?

I don't remember why we used a HashMap in the first place.

I made that decision so that it matched the setter:

object.bounds =  {x: 10, y: 10, width: 100, height: 100};

When this is translated to Java it is a Map type,

And to make it a "neutral" JS object.

I guess that was because we don't want to provide access to the real IBounds object and let user change it bypassing undo stack, but we could return a copy.

Yes, that's right. We could return either a copy or wrapper object.

Is it the only place where we did this ?

We also do it for getBendpoints and getViewPoint

My feeling is that we return our own objects for everything else in jArchi (EObjectProxy, etc) so why not here too.

And more importantly, with the GraalVM ProxyObject workaround the object in JS is in fact not a Map but a ProxyObject.

Also, if Nashorn had not implemented its trick of allowing bounds.x and bounds["x"] on a HashMap we would have had to use objects anyway. As it turns out GraalVM does not really support these without converting the HashMap to a ProxyObject.

Still not sure of advantages and disadvantages.

If we use a Bounds copy object we can call:

var x = bounds.x;
var x = bounds["x"];
var x = bounds.getX();

If we use a Graal ProxyObject (with a lot of voodoo to get it to work) or a Nashorn HashMap we can call:

var x = bounds.x;
var x = bounds["x"];
var x = bounds.get("x");

In Nashorn it's a true HashMap, in Graal it's a ProxyObject.

I'm thinking that Bounds should be treated like an object.

One thing's for sure, if we return objects for getBounds() and getBendpoints() they should not be copies of IBounds and IDiagramModelBendpoint because the objects will be instances of EObject and we don't want to expose all of its methods in JS.

So the choice is:

  • return a HashMap/ProxyObject
  • return new Proxy wrapper objects (like EObjectProxy)

I guess it comes down to the choice of getters. If it's an object we can do getX()

If it's an object we can do getX()

IMHO, in JS we should only access attributes through object.x or object["x"]. Getters/setters are a Java thing, not a JS thing.

So the choice is:

  • return a HashMap/ProxyObject
  • return new Proxy wrapper objects (like EObjectProxy)

I don't mind, so choose whatever solution you think is the best, knowing that I really don't care about .getX() or .get("x") methods.

I don't mind, so choose whatever solution you think is the best, knowing that I really don't care about .getX() or .get("x") methods.

After trying both options I think I will go with HashMap/ProxyObject for these reasons:

  1. If using Nashorn, nothing is changed since we return a HashMap, so it's backwards compatible
  2. If using Graal, we return a ProxyObject which supports bounds.x, bounds["x"] and bounds.get("x") as in (1)

If users have been calling HashMap methods in Nashorn such as size() then these won't work with the ProxyObject. But then if we returned our own object they wouldn't work either.

We have 3 cases where we return a HashMap:

var bounds = object.bounds;
var bps = connection.bendpoints;
var vp  = view.viewpoint;

D'uh!

Another reason not to use an object is that we'd have to re-write the Java setters as well because they expect to see a Map object.

We can do this in JS:

var bounds = selected.bounds;
bounds.x = 200;
selected.bounds = bounds;

So, no!!!! What was I thinking?!!!!!! ๐Ÿ˜ 

Actually the last comment is not quite true.

If you return an IBounds object to JS, and set it again as in the example above, Graal does some magic and converts it to a Map object in the setter. But Nashorn does not do this.

@jbsarrodie Do you think that anyone is using these objects that are returned from bounds and bendpoints as HashMaps in JS? i.e are they calling their size(), put(), etc methods?

Do you think that anyone is using these objects that are returned from bounds and bendpoints as HashMaps in JS? i.e are they calling their size(), put(), etc methods?

I don't think so. People stick with jArchi API and standard JS, so non documented (by us) java methods are not used.

Knowning the internals of jArchi I might have use some java methods from time to time (not on HashMap), but I won't complain if switching to GraalVM makes some of them unavailable ;-)

OK.

The new ProxyObject can implement a HashMap method by using a ProxyExecutable as mentioned above. But each method has to be hand-implemented as a proxy method.

I think I will simply implement get("x") and size() to maintain some backward compatibility.

But enough of this now. It's taken too long. As long as we have these methods we're good:

var x = bounds.x;
var x = bounds["x"];
var x = bounds.get("x"); // Not so important but it's implemented

@rchevallier Does that sound OK to you?

Seems perfect. For now, I've changed my complex script to use .get("x"). It works then perfectly.
I will revert back to use directly the property (.x) at the next jArchi release, as it is more JS-esque and only way documented so far
(I side with jbsarrodie on this matter)

-- closing then the issue (I hope it is the proper process and not too soon)

I'm still experimenting with objects, hashmaps and all of that. I aim to get jArchi 1.0.1 out with this fix next week.

(This comment is just for my reference, feel free to ignore.)

So why use a Java Map instead of a Java Object?

Let's say we use a Java object of class Bounds instead of a Java Map:

class Bounds {
   public int x, y, width, height;
}

We then return an instance of Bounds when calling JS object.bounds and we can have these getters in JS:

var bounds = object.bounds;
var x = bounds.x;
var x = bounds["x"];

Setting the bounds in JS can be done in two ways:

(1) Modify and re-assign the Bounds object

var bounds = object.bounds;
bounds.x = 20;
bounds.y = 30;
object.bounds = bounds;

(2) Assign a JS type object

object.bounds = {x:20, y:30};

Now in the Java setter we have this:

public DiagramModelObjectProxy setBounds(Map<?, ?> map) {
    // more code
}

In both cases (1) and (2) GraalVM converts the object to a Java Map in the setter and it just works. Great!

But in Nashorn, (1) doesn't work because Nashorn does not convert the Bounds object to a Map. So we would have to re-write the setter to take a Bounds object instead of a Map. But if we did that then (2) wouldn't work.

So we'd have to re-write our setter to something like:

public DiagramModelObjectProxy setBounds(Object o) {
    if(o instanceof Map) {
        // convert map to internal bounds
    }
    else if(o instanceof Bounds) {
        // convert Bounds object to internal bounds
    }
}

So it gets complicated. If we were just using GraalVM it would be OK, but we need to support Nashorn as well (at least for now).

So the answer to this question:

TBH, I don't remember why we used a HashMap in the first place.

is because when we do (1) in Nashorn, the object has to be a Map.

TL;DR - HashMaps Rule.

BTW - next version of jArchi will have a JS function getScriptEngine() which returns the class name of the Script Engine in use.

The JS Script Engines are:

com.oracle.truffle.js.scriptengine.GraalJSScriptEngine
jdk.nashorn.api.scripting.NashornScriptEngine

This might be useful if different JS code is needed for each engine.

BTW - next version of jArchi will have a JS function getScriptEngine() which returns the class name of the Script Engine in use.

That's a good idea.

Thinking out loud: maybe we should group this function and some others under some new umbrella objects. We already do this with $.fs.writeFile() and $.model.xxx() functions. This was created to somewhat mimic the node.js modules and can be seen as a namespace for global functions. This also makes it a bit easier for people used to javascript by making jArchi following jQuery and Node.js design principles. This also makes it future proof as GraalVM is 100% compatible with Node.JS, so at some point it should even be possible to use real Node.js modules in jArchi.

So we could add or refactor:

  • $.process.engine as an attribute. Would replace getScriptEngine() and mimic Node.js process.platform
  • $.process.argv as an attribute (would replace getArgs()) and mimic Node.js process.argv
  • $.child_process.exec() to replace exec() and mimic Node.js child_process.exec

I aim to get jArchi 1.0.1 out with this fix next week.

Do you mind if we also include a new class which makes it possible to launch jArchi scripts from a cheat sheet ? (I already have the code and can push it during the week-end) ?

Edit: I would also make sense to provide my script pack too (obviously not tested with GraalVM) which includes lots of useful things)

$.fs.writeFile() and $.model.xxx() are bound to Java classes. Would your proposal mean that getScriptEngine() and getArgs() should be moved out of init.js and bound to new Java classes like the FS and Model classes?

Do you mind if we also include a new class which makes it possible to launch jArchi scripts from a cheat sheet ? (I already have the code and can push it during the week-end) ?

Edit: I would also make sense to provide my script pack too (obviously not tested with GraalVM) which includes lots of useful things)

OK, but these changes will push things back a bit, and it will be version 1.1.

$.fs.writeFile() and $.model.xxx() are bound to Java classes. Would your proposal mean that getScriptEngine() and getArgs() should be moved out of init.js and bound to new Java classes like the FS and Model classes?

No, we just have to change init.js to rename them. I'm working on it.

I'm working on it.

Thanks! I've been snowed under these last few weeks with fixes, changes, investigations, etc. I've got to stop so I can get started again on the icons/stereotypes work ;-)

BTW - the branch with my init.js change is graal-proxyobject

I would also make sense to provide my script pack too (obviously not tested with GraalVM) which includes lots of useful things)

One problem with shipping scripts is maintaining them. They can get out of date. If they are online they can be maintained.

I would also make sense to provide my script pack too (obviously not tested with GraalVM) which includes lots of useful things)

One problem with shipping scripts is maintaining them. They can get out of date. If they are online they can be maintained.

I maintain them for my own use and usually also share/update them on gist, so its not a big deal for me to test them before each new version of jArchi as I already do it. In addition, its up to the user to unpack/restore the examples folder, so people already know that this might change.

At some point in the future I'd like to find a good way to distribute a script pack *and keep it updated automatically, but that's not that easy to design.

I'm working on it.

Done and pushed ;-)

I've got to stop so I can get started again on the icons/stereotypes work ;-)

I know someone who really wants those exciting new features ;-)

Done and pushed ;-)

OK, I like it.

Why put exec in child_process? Why not as part of process? I can see that you want to differentiate it as a child process but does it need another namespace?

Should we keep getArgs() as well as the new one (and document that it's deprecated) in case people are using it? Or maybe say "screw it!, get with the program!". :-)

BTW - we're really overloading this issue. Sorry @rchevallier ;-)

Why put exec in child_process? Why not as part of process? I can see that you want to differentiate it as a child process but does it need another namespace?

I'm simply following Node.js on this (they are in 2 separate modules so I do the same)

Should we keep getArgs() as well as the new one (and document that it's deprecated) in case people are using it? Or maybe say "screw it!, get with the program!". :-)

I've shipped compatibility function for it too (line 103)

I've shipped compatibility function for it too (line 103)

I should have scrolled down. ;-)

I'm simply following Node.js on this (they are in 2 separate modules so I do the same)

And I predict that in some months, some people will most certainly really use Node.js modules, so let's accept the way it does it.