zazuko/xrm

Files and carml-jar

ktk commented

There is a new cli tool for carml available: https://github.com/carml/carml-jar

I got it to work with XRM's carml output, but it took some figuring out. With the current configuration it only works when I pipe the file in as a stream, and I have to declare the XRM source like this: source "". This generates:

<#MapCanton>
	a rr:TriplesMap;
	
	rml:logicalSource [
		rml:source [
			a carml:Stream;
			carml:streamName "";
			carml:declaresNamespace [
				carml:namespacePrefix "eCH-0071";
				carml:namespaceName "http://www.ech.ch/xmlns/eCH-0071/1"
			]
		];
		rml:referenceFormulation ql:XPath;
		rml:iterator "/eCH-0071:nomenclature/cantons/canton"
	];

I can then run: cat input/eCH0071.xml | java -jar bin/carml-jar-1.0.0-SNAPSHOT-0.4.4.jar map -m src-gen/mapping.carml.ttl

However, I cannot run it directly from a file; I always have to pipe the input. I checked the carml docs, which say the stream name can be left out or empty, so the generated output seems to be correct.

I then looked at the ontology and found carml:url, so I manually changed the generated output to this:

<#MapCanton>
	a rr:TriplesMap;
	
	rml:logicalSource [
		rml:source [
			carml:url "eCH0071.xml"; # cannot have a carml:Stream in this case
			carml:declaresNamespace [
				carml:namespacePrefix "eCH-0071";
				carml:namespaceName "http://www.ech.ch/xmlns/eCH-0071/1"
			]
		];
		rml:referenceFormulation ql:XPath;
		rml:iterator "/eCH-0071:nomenclature/cantons/canton"
	];

With that change I can run it directly from a file:

java -jar bin/carml-jar-1.0.0-SNAPSHOT-0.4.4.jar map -m src-gen -rsl input -of ttl -P

This would probably work in XRM via the rml output, but I use the XML namespace extension, so I have to stick with the carml syntax.

I'm not sure how to handle this properly in XRM, but it would be nice if both options could be generated.

I understand this as follows: carml requires a different rml:logicalSource definition depending on whether a stream source or a file source is used.

For stream sources:

	rml:logicalSource [
		rml:source [
			a carml:Stream;
			carml:streamName "";
			...
		];
		...
	];

For file sources:

	rml:logicalSource [
		rml:source [
			carml:url "eCH0071.xml";
			...
		];
		...
	];

Using output carml in XRM version 1.2.0 generates the "stream sources" definition.

The new requirement is that both variants can be generated from XRM. Is that correct, @ktk?

ktk commented

Correct interpretation.

It would be nice, as people could then use XRM to create carml output without having to deal with streams. For those who do not use a barnard59 pipeline, that's probably a nice thing to have.

ktk commented

Ran into this again, by the way. I think this feature would be useful for pure carml-cli mode: either an anonymous (empty) stream or a filename with carml:url.
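
For illustration only, here is a minimal Python sketch of how a generator could switch between the two variants discussed in this thread. It is not XRM's actual generator. The function names (render_logical_source, turtle_string) and the file_url / stream_name parameters are hypothetical, and the XPath reference formulation is hard-coded to match the snippets above. The only carml terms used are the ones already shown in this issue.

```python
def turtle_string(value: str) -> str:
    """Quote a Python string as a Turtle string literal.

    Only backslashes and double quotes are escaped, which is enough
    for the short single-line values used in this sketch.
    """
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_logical_source(iterator, namespaces, file_url=None, stream_name=""):
    """Render an rml:logicalSource block as Turtle text.

    Hypothetical helper: if file_url is given, the carml:url (file) variant
    is emitted; otherwise the carml:Stream variant with the given stream name.
    """
    # Statements on the rml:source blank node; the variant decides the first ones.
    if file_url is not None:
        props = [f"carml:url {turtle_string(file_url)}"]
    else:
        props = [
            "a carml:Stream",
            f"carml:streamName {turtle_string(stream_name)}",
        ]

    # One carml:declaresNamespace node per XML namespace (prefix -> name).
    for prefix, name in namespaces.items():
        props.append(
            "carml:declaresNamespace [\n"
            f"\t\t\t\tcarml:namespacePrefix {turtle_string(prefix)};\n"
            f"\t\t\t\tcarml:namespaceName {turtle_string(name)}\n"
            "\t\t\t]"
        )

    body = ";\n\t\t\t".join(props)
    return (
        "\trml:logicalSource [\n"
        "\t\trml:source [\n"
        f"\t\t\t{body}\n"
        "\t\t];\n"
        "\t\trml:referenceFormulation ql:XPath;\n"
        f"\t\trml:iterator {turtle_string(iterator)}\n"
        "\t];"
    )


if __name__ == "__main__":
    ns = {"eCH-0071": "http://www.ech.ch/xmlns/eCH-0071/1"}
    iterator = "/eCH-0071:nomenclature/cantons/canton"
    # Stream variant (what XRM 1.2.0 generates, per this thread).
    print(render_logical_source(iterator, ns))
    # File variant (the manual edit described above).
    print(render_logical_source(iterator, ns, file_url="eCH0071.xml"))
```

Keeping the choice in a single optional argument means the namespace declarations and the iterator are rendered identically in both variants, so only the first statements of the rml:source node differ.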