substring
@noahboerger you reported that `substr` is not supported.
I thought it was, for scalars, components and datasets.
Could you report a failed example please?
Substring syntax: in Trevas, param2 is the start_index and param3 the length. The user test seems to assume that param2 is the start_index and param3 the end_index (inclusive).
@noahboerger could you confirm that `substr` doesn't throw an exception but returns a wrong result because of the previous remark?
The first note on `substr` was just that it is currently not supported in the form `DS_r := substr(DS_1, 1, 2)`.
The different syntax also seems to be the correct one according to the reference manual (p. 76 f.).
So I would propose closing the issue, as it only affects the test cases we use.
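The two readings of the third parameter discussed above can be compared with a small sketch. This is not the Trevas implementation: the class `SubstrSemantics` and both helper methods are hypothetical names, and the code assumes a 1-based start index and clamps out-of-range values, which is a simplification.

```java
// Hypothetical illustration only; not taken from the Trevas code base.
class SubstrSemantics {

    // Reading 1: (start, length), with a 1-based start index (assumed).
    static String substrByLength(String s, int start, int length) {
        int from = Math.min(Math.max(start - 1, 0), s.length());
        int to = Math.min(from + Math.max(length, 0), s.length());
        return s.substring(from, to);
    }

    // Reading 2: (start, inclusive end), both 1-based (assumed).
    static String substrByEnd(String s, int start, int end) {
        int from = Math.min(Math.max(start - 1, 0), s.length());
        int to = Math.max(from, Math.min(end, s.length()));
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        String s = "abcdefg";
        // With start = 1 both readings agree, which can hide the difference.
        System.out.println(substrByLength(s, 1, 2)); // ab
        System.out.println(substrByEnd(s, 1, 2));    // ab
        // With start > 1 they diverge.
        System.out.println(substrByLength(s, 3, 5)); // cdefg
        System.out.println(substrByEnd(s, 3, 5));    // cde
    }
}
```

A test written for the (start, end) reading therefore passes by coincidence whenever the start index is 1 and only fails for later start positions.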
This syntax has to work, and is implemented in Trevas I think.
The condition on the handled dataset is: dataset with only string measures.
@noahboerger, can you provide a failed example with the stack trace please?
@NicoLaval you are right: with a dataset that has only string measures, it also works on my side. The dataset I tested it with had the wrong format:
| Id_1 | Id_2 | num_1 | Text_1 |
|---|---|---|---|
| IDENTIFIER | IDENTIFIER | MEASURE | MEASURE |
| STRING | STRING | DOUBLE | STRING |
The concrete test case I executed is the one from BdI, "11. conditional/if-then-else_2", which runs against a dataset that also contains a measure of type double. This test case failed with the following error:
Occured error
Exception
fr.insee.vtl.engine.exceptions.FunctionNotFoundException: function 'substr(Double, Long, Long)' not found
at fr.insee.vtl.engine.visitors.expression.functions.GenericFunctionsVisitor.invokeFunction(GenericFunctionsVisitor.java:146)
at fr.insee.vtl.engine.visitors.expression.functions.StringFunctionsVisitor.visitSubstrAtom(StringFunctionsVisitor.java:183)
at fr.insee.vtl.engine.visitors.expression.functions.StringFunctionsVisitor.visitSubstrAtom(StringFunctionsVisitor.java:20)
at fr.insee.vtl.parser.VtlParser$SubstrAtomContext.accept(VtlParser.java:3074)
at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:18)
at fr.insee.vtl.engine.visitors.expression.ExpressionVisitor.visitStringFunctions(ExpressionVisitor.java:254)
at fr.insee.vtl.engine.visitors.expression.ExpressionVisitor.visitStringFunctions(ExpressionVisitor.java:41)
at fr.insee.vtl.parser.VtlParser$StringFunctionsContext.accept(VtlParser.java:1098)
at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:46)
at fr.insee.vtl.parser.VtlBaseVisitor.visitFunctionsExpression(VtlBaseVisitor.java:90)
at fr.insee.vtl.engine.visitors.expression.ExpressionVisitor.visitFunctionsExpression(ExpressionVisitor.java:491)
at fr.insee.vtl.engine.visitors.expression.ExpressionVisitor.visitFunctionsExpression(ExpressionVisitor.java:41)
at fr.insee.vtl.parser.VtlParser$FunctionsExpressionContext.accept(VtlParser.java:629)
at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:18)
at fr.insee.vtl.engine.visitors.AssignmentVisitor.visitAssignment(AssignmentVisitor.java:51)
at fr.insee.vtl.engine.visitors.AssignmentVisitor.visitTemporaryAssignment(AssignmentVisitor.java:59)
at fr.insee.vtl.parser.VtlParser$TemporaryAssignmentContext.accept(VtlParser.java:372)
at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:18)
at fr.insee.vtl.engine.VtlScriptEngine.evalStream(VtlScriptEngine.java:263)
at fr.insee.vtl.engine.VtlScriptEngine.eval(VtlScriptEngine.java:282)
at java.scripting/javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:262)
at fr.insee.trevas.jupyter.VtlKernel.eval(VtlKernel.java:305)
at io.github.spencerpark.jupyter.kernel.BaseKernel.handleExecuteRequest(BaseKernel.java:334)
at io.github.spencerpark.jupyter.channels.ShellChannel.lambda$bind$0(ShellChannel.java:64)
at io.github.spencerpark.jupyter.channels.Loop.lambda$new$0(Loop.java:21)
at io.github.spencerpark.jupyter.channels.Loop.run(Loop.java:78)
The test from BdI assumes that the additional non-string measure is kept without being manipulated. But I think your implementation is right, as the table of VTL-ML operators in the reference manual (p. 17) requires the input to be in the following form:
op ::
dataset { measure<string> _+ }
| component<string>
| string
pattern1, pattern2 ::
component<string>
| string
So I think it is the correct behaviour to fail here.
I did not dive that deep into this test case when I executed it the first time, but it is then a wrong test case. I am not sure whether the Trevas error message could be a bit clearer, as the engine still tries to execute the replace-operator even with a mismatching input dataset.
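As a rough illustration of what a clearer message could look like, the sketch below checks the dataset structure before a string operator is applied and names the offending measures. Everything here is hypothetical: `StringMeasureCheck`, `Component`, `Role` and the message wording are assumptions made for the example and are not part of the Trevas API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types come from Trevas.
class StringMeasureCheck {

    enum Role { IDENTIFIER, MEASURE, ATTRIBUTE }

    // Minimal stand-in for one column of a dataset structure.
    static final class Component {
        final String name;
        final Role role;
        final Class<?> type;

        Component(String name, Role role, Class<?> type) {
            this.name = name;
            this.role = role;
            this.type = type;
        }
    }

    // Lists every measure whose type is not String.
    static List<String> nonStringMeasures(List<Component> structure) {
        List<String> offending = new ArrayList<>();
        for (Component c : structure) {
            if (c.role == Role.MEASURE && c.type != String.class) {
                offending.add(c.name + " (" + c.type.getSimpleName() + ")");
            }
        }
        return offending;
    }

    // Fails early with a message that names the operator and the measures.
    static void requireOnlyStringMeasures(String operator, List<Component> structure) {
        List<String> offending = nonStringMeasures(structure);
        if (!offending.isEmpty()) {
            throw new IllegalArgumentException(
                "'" + operator + "' expects a dataset with only string measures, "
                    + "but found: " + String.join(", ", offending));
        }
    }

    public static void main(String[] args) {
        // Same structure as the dataset from the failing test case.
        List<Component> ds = List.of(
            new Component("Id_1", Role.IDENTIFIER, String.class),
            new Component("Id_2", Role.IDENTIFIER, String.class),
            new Component("num_1", Role.MEASURE, Double.class),
            new Component("Text_1", Role.MEASURE, String.class));
        try {
            requireOnlyStringMeasures("substr", ds);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the structure from the failing test, such a check would point at `num_1` directly instead of reporting a missing `substr(Double, Long, Long)` function.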