afs/rdf-delta

Integration tests fail on Windows

bvosburgh-tq opened this issue · 10 comments

I am trying to build rdf-delta on Windows (e.g. with mvn install), but a number of the tests in rdf-delta-integration-tests fail.

The Maven output is below.

It appears that most of the issues are in setup code that fails because it is encountering directories under target/test/server and target/Zone that have been only partially deleted. These directories are not deleted because some of the files are still open from previous tests. For example, the cleanup code cannot delete the file target/Zone/654321/data/GOSP.dat because another process still has it open. Each of the test suites (e.g. TestRestart) succeeds when run alone, but fails when run within the entire build test execution.

Is there any documentation that can help me figure out what needs to be closed when? I can see the various files being opened and held somewhere underneath a DatasetGraphTransaction, but it is not obvious to me what needs to be closed and cleaned up with each test.

I am guessing this is a Windows-specific issue and the problem is not present on Linux/MacOS?

org.seaborne.delta.TC_DeltaIntegration.txt

Here are a few observations from further investigation:

  • BaseTestDeltaFuseki.fuseki(Start, int, int, String, String) fails because it tries to build an IRI from the entire contents of the config file on line 115:
    String baseIRI = IRILib.filenameToIRI(text);
    This does not seem correct, and it fails on Windows because the text string looks like a file IRI and contains a ':'.
  • The TestRestart tests fail because they are unable to create a LocalServer with the directory target/test/server.
    • LocalServer.createFile(Path) throws an exception because it is trying to build a DataSourceDescription from the file target/test/server/ABC/source.cfg, which does not exist.
    • source.cfg is expected because the ABC directory is still (partially) present, as it was not successfully deleted earlier in the test.
    • The ABC directory was not deleted because the initial test cleanup code was unable to delete the file target/test/server/ABC/rdb/LOCK.
    • The deletion of LOCK fails silently in FileOps.clearAll(File) because clearAll does not check the boolean return value from File.delete(). I'm guessing LOCK cannot be deleted because it is still open.
    • LOCK is still open because it was created, opened, and not closed by the preceding TestZone tests.
  • All of the tests in TestZone seem to leave open the LOCK file mentioned above. If I add the following line of code to each of the TestZone tests the subsequent TestRestart failures are fixed:
    deltaClient.removeDataSource(dsRef);
    But I am in no way certain that is the appropriate way to close the LOCK file.
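
To make the silent failure described for FileOps.clearAll(File) visible, a cleanup helper can use java.nio.file.Files.delete, which throws an exception where File.delete() only returns false. The following is a minimal sketch using only the JDK; the class name CheckedDelete and its clearAll method are hypothetical and are not part of RDF Delta or Jena.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper for illustration; not part of RDF Delta or Jena. */
final class CheckedDelete {
    private CheckedDelete() {}

    /**
     * Deletes everything beneath dir (but keeps dir itself).
     * Throws on the first entry that cannot be deleted instead of ignoring it.
     */
    static void clearAll(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return; // nothing to clear
        }
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                // Files.delete throws an IOException on failure,
                // whereas File.delete() only returns false.
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                if (!d.equals(dir)) {
                    Files.delete(d); // remove subdirectories once they are empty
                }
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

With a helper like this, a file that is still held open on Windows would presumably surface as an exception in the test setup that tried to delete it, rather than as a confusing failure in a later test.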

Fixing these two problems results in this Maven output: org.seaborne.delta.TC_DeltaIntegration.txt

Most of these issues seem to revolve around more open files. For example:

  • In TestRemoteConnection, the test suite setup method AbstractTestDeltaConnection.setupZone() attempts to create a Zone with the directory target/Zone but fails because it cannot find the file target/Zone/6543/state.
  • The state file is expected because the 6543 directory is still (partially) present, as it was not successfully deleted earlier in the test suite setup. Again, a silent failure in FileOps.clearAll(File).
  • The 6543 directory and its files (which appear to be a lot of Jena data and index files) were created by the preceding TestLocalClient suite.

I tried a number of things to close those files in AbstractTestDeltaClient/TestLocalClient but to no avail. Can you give me any hints as to how to tear down what is created in AbstractTestDeltaClient.setupZone() and other similar test setup code?

It also appears AbstractTestDeltaClient.setupZone() should include a line of code to clear out the zone directory:
FileOps.clearAll(DIR_ZONE);
like what is in AbstractTestDeltaConnection.setupZone().

afs commented

Looks like String baseIRI = IRILib.filenameToIRI(text); should be String baseIRI = IRILib.filenameToIRI(config);.

Ask your TQ colleagues about deleting TDB databases on Windows.

There is a long-standing Java issue on MS Windows here (19 years old, from Java 1.4, marked "won't fix"). It is not specific to RDF Delta; it impacts Apache Jena generally. There is code in Jena to mitigate this in tests (it creates a fresh database rather than reusing an old database area).

It does not affect normal RDF Delta operation.

(I don't build the system on Windows - I don't have a Windows machine.)

The latest release in Maven Central is now built with a Jena 4.2.0 dependency, as requested.

Yeah, I could guess you do not have a Windows build. : ) That's fair.

And, yes, I am aware of that longstanding issue with Java on Windows. : )

But: although the Jena tests do not "reuse an old database area", it appears the rdf-delta tests do.

So my question:
Can you tell me how to change the rdf-delta tests to do one of the following?

  1. create a fresh database with every execution; or
  2. cleanly close the database(s) so subsequent tests can perform the necessary cleanup

Thanks

afs commented

Only (1) works. (2) is already being done - it doesn't help. Tests run in the same JVM, and the file only goes away when the JVM exits.
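
A minimal sketch of option (1), assuming the tests can be pointed at an arbitrary directory: each test asks for a new, uniquely named directory under a base such as target/Zone, so it never has to delete files a previous test may still hold open. The class FreshTestDirs and its fresh method are hypothetical names used for illustration; this is not the code used in Jena or RDF Delta.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of RDF Delta or Jena. */
final class FreshTestDirs {
    private FreshTestDirs() {}

    /**
     * Creates and returns a new, uniquely named directory under base.
     * Earlier directories are left alone, so nothing has to be deleted
     * while another part of the same JVM may still hold files open.
     */
    static Path fresh(Path base, String prefix) throws IOException {
        Files.createDirectories(base); // make sure the base directory exists
        return Files.createTempDirectory(base, prefix);
    }
}
```

A test setup method could then call, for example, FreshTestDirs.fresh(Paths.get("target", "Zone"), "zone-") and build its zone in the returned directory. The trade-off is that old directories accumulate under target until the next mvn clean.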

See org.apache.jena.tdb2.TestConfig in Jena.

What are you looking to change in Delta?

Well, more accurately, the databases are being deleted, not closed; and that would be the problem on Windows (i.e. open files cannot be deleted). It would seem reasonable that a test close whatever database(s) it creates and/or opens. But maybe that's just me being overly compulsive. : )

I do not see any class in Jena named org.apache.jena.tdb2.TestConfig. Is it supposed to be a Java class or something else?

I am trying to patch Delta so it does not choke on the Unicode Replacement Character (U+FFFD), which is caused by Jena 4.2.0.

afs commented

Should have been ConfigTest.

https://github.com/apache/jena/blob/main/jena-db/jena-tdb2/src/test/java/org/apache/jena/tdb2/ConfigTest.java

FWIW I haven't heard whether the fixes in 4.3.0 development address that issue.

I will take a look at ConfigTest. Thanks!

Jena 4.3.0 removes a number of deprecated classes that we rely on; so it is not easy to test right now. The Replacement Character changes look reasonable, but logging the warning is problematic because, given how often we can encounter the Replacement Character, it's a bit too much logging. : (

afs commented

Thank you for a visual inspection.

afs commented

Warnings are configured by the ErrorHandler given when the TokenizerText is created.

afs commented

Closing - "won't fix" until there is a solution that does not disadvantage Linux just because of Windows. Linux is the most common deployment platform.

RDF Delta runs on Windows. Only the test harness of the integration tests is affected.