afs/rdf-delta

rdf2patch will flush null characters when character set is larger than 4096

mhoffman-tq opened this issue · 2 comments

The underlying error has already been reported to Jena as JENA-1920, but I thought it best to record here as well given this is where the issue was uncovered.

Attempting to create a patch file where the source .ttl has a character set larger than 4096 but less than 8192 characters in length results in null characters being written to the patch file. This in turn causes a RiotParseException when sending the patch file to the rdf-delta-server.

Using the attached file data.ttl.txt (uploaded as txt)

dcmd r2p data.ttl > data.rdfp

followed by

(assumes the existence of 'examples' log)
dcmd add --server=http://localhost:1066 --log examples data.rdfp

Results in

org.apache.jena.riot.RiotParseException: [line: 6, col: 1 ] Failed to find a prefix name or keyword: (0;0x0000)
	at org.apache.jena.riot.tokens.TokenizerText$ErrorHandlerTokenizer.error(TokenizerText.java:65)
	at org.apache.jena.riot.tokens.TokenizerText.error(TokenizerText.java:1244)
	at org.apache.jena.riot.tokens.TokenizerText.readPrefixedNameOrKeyword(TokenizerText.java:536)
	at org.apache.jena.riot.tokens.TokenizerText.parseToken(TokenizerText.java:445)
	at org.apache.jena.riot.tokens.TokenizerText.hasNext(TokenizerText.java:99)
	at org.seaborne.patch.text.RDFPatchReaderText.apply1(RDFPatchReaderText.java:77)
	at org.seaborne.patch.text.RDFPatchReaderText.read(RDFPatchReaderText.java:57)
	at org.seaborne.patch.text.RDFPatchReaderText.apply(RDFPatchReaderText.java:67)
	at org.seaborne.patch.RDFPatchOps.read(RDFPatchOps.java:182)
	at org.seaborne.delta.cmds.append.toPatch(append.java:106)
	at org.seaborne.delta.cmds.append.exec1(append.java:71)
	at org.seaborne.delta.cmds.append.lambda$execCmd$0(append.java:64)
	at java.util.ArrayList.forEach(ArrayList.java:1257)
	at org.seaborne.delta.cmds.append.execCmd(append.java:64)
	at org.seaborne.delta.cmds.DeltaCmd.exec(DeltaCmd.java:107)
	at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
	at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
	at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
	at org.seaborne.delta.cmds.append.main(append.java:47)
	at org.seaborne.delta.cmds.dcmd.main(dcmd.java:125)
	at dcmd.main(dcmd.java:38)
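
The `(0;0x0000)` in the exception message points at a NUL character in the patch file. One way to confirm this before sending the file to the server is to count the NUL bytes in it. The following is only a minimal sketch using standard POSIX tools; `count_nul_bytes` is a hypothetical helper name and is not part of dcmd or RDF Delta.

```shell
#!/bin/sh
# count_nul_bytes FILE: print the number of NUL (0x00) bytes in FILE.
# Hypothetical helper for illustration; not part of dcmd or RDF Delta.
count_nul_bytes() {
    total=$(wc -c < "$1")
    # tr -d '\000' drops every NUL byte, so the size difference is the NUL count.
    kept=$(tr -d '\000' < "$1" | wc -c)
    echo $((total - kept))
}

# Example: check the generated patch before running `dcmd add`.
# Assumes the patch file is named data.rdfp, as in the steps above.
if [ -f data.rdfp ]; then
    n=$(count_nul_bytes data.rdfp)
    if [ "$n" -gt 0 ]; then
        echo "data.rdfp contains $n NUL byte(s)" >&2
    else
        echo "data.rdfp contains no NUL bytes"
    fi
fi
```

A non-zero count on a patch produced by `dcmd r2p` would be consistent with the behaviour described in this issue.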

afs commented

Thanks for the details - indeed, it is the bug JENA-1920 and is fixed by apache/jena#762. When that rolls through to RDF Delta, it'll be fixed - a source code fix is to lift the BufferingWriter source code out of Jena.

afs commented

This is fixed in the codebase because Apache Jena has been upgraded to v3.16.0.