konveyor/tackle-diva

Performance problem with diva-windup

Closed this issue · 5 comments

Currently diva-windup rule (DivaLauncher.perform) is very slow. We guess it's due to our construction of janus-graph. The arquillian test log (*) repeats the following message while running the diva rule.

Sep 14, 2021 9:33:05 AM org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3 execute
WARNING: Query requires iterating over all vertices [()]. For better performance, use indexes
Sep 14, 2021 9:33:06 AM org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3 execute
WARNING: Query requires iterating over all vertices [()]. For better performance, use indexes
Sep 14, 2021 9:33:06 AM org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3 execute
WARNING: Query requires iterating over all vertices [()]. For better performance, use indexes
...

Diva standalone runs faster in few seconds.

Here are steps to run the arquillian test.

$ git clone https://github.com/WASdev/sample.daytrader7.git
$ cd tackle-diva
$ ./gradlew publishToMavenLocal -x test
$ cd windup
$ JAVA_HOME=<path-to-java8-runtime> mvn verify -Dtest=DaytraderTest

*) https://github.com/konveyor/tackle-diva/runs/3597051370

Yesterday I followed Aki's instructions and executed the tests.
In my machine the part of the log containing the (repeated) warnings below had an elapsed time of under 2 minutes (compared to 2 minutes 46 seconds in Aki's test).

org.janusgraph.graphdb.transaction.StandardJanusGraphTx$3 execute WARNING: Query requires iterating over all vertices [()]. For better performance, use indexes

I did a quick search on the warning message.

From what I can tell the message may be misleading as it depends upon the nature of the graph queries as to whether they would be improved by adding indices (not all queries are).

stackoverflow thread - https://stackoverflow.com/questions/49864903/janusgraph-warning-about-iterating-over-all-vertices-after-schema-and-indexes
janusgraph docs - https://docs.janusgraph.org/schema/index-management/index-performance/

The WINDUP graph model has changed very little in recent years so we have little experience in the team of evaluating, and tuning, the performance of queries.However the janusgraph docs shows how to use the profile() method to review the efficiency of your queries.
That might be a good starting point.

Looking ahead from a strategic perspective, it may be that  using a new version of JanusGraph will impact performance or allow us to leverage new JanusGraph features?
WINDUP is using an old version of JanusGraph (0.3.1). And I suspect that we will have to update WINDUP to be built with JDK11 as a prerequisite to upgrading the latest version of JanusGraph.That work is something we need to schedule in the coming months. 

@YasuKatsuno and @akihikot Hi!
I took the time to have a look and it seems to me the cause of the issue could be another one.
The GraphTraversal created in DivaEntryMethodService executing:

GraphTraversal<?, ?> traversal = getQuery().getRawTraversal().is(methodModel.getElement());

seems to be the root cause of the warning log entries you reported.
I'm not used to the is(final Object value) you decided to adopt but I feel like it could be replaced with something like:

GraphTraversal<?, ?> traversal = getQuery().getRawTraversal().V(methodModel.getElement());

that avoids the warning message.
The DaytraderTest test has no checks on the analysis result so please try it and check the results are the expected one.

@mrizzi @PhilipCattanach I created PR #32 to improve the performance of stacktrace model queries. I guess there's an issue with https://docs.janusgraph.org/v0.3/basics/cache/#transaction-level-caching, namely the cache does not seem to utilize indices very well. This PR inserts periodical tx commits to prevent the cache from growing too large.

event.getGraphContext().getGraph().tx().commit();
The PR also declares indices for the stacktrace model.
@Override
@Indexed(value = IndexType.DEFAULT, dataType = Integer.class)
@Property(LINE_NUMBER)
int getLineNumber();
@Override
@Indexed(value = IndexType.DEFAULT, dataType = Integer.class)
@Property(COLUMN_NUMBER)
int getColumnNumber();
@Override
@Indexed(value = IndexType.DEFAULT, dataType = Integer.class)
@Property(LENGTH)
int getLength();

I measured the effect of two changes as timings of total 6900 queries on stacktrace models. Showing 10x improvement by combining both changes.

No commits, no indices
INFO: count = 6900, time = 83694ms
No commits, with indices
INFO: count = 6900, time = 87191ms
With commits, no indices: 
INFO: count = 6900, time = 44802ms
With commits, and indices:
INFO: count = 6900, time = 8249ms

This PR also includes the change for removing the warning message as suggested by @mrizzi .