DmitrySoshnikov/syntax

Allow clients to pass additional variables for use in semantic actions

Closed this issue · 6 comments

While writing a translator that converts from Sphinx's expression syntax to ElasticSearch's expression syntax, I found that I needed to pass the document's field as a scope parameter. For example, I needed to instruct ElasticSearch to limit its search to just the account_number field.

{
  "query": { "match": { "account_number": 20 } }
}

Other times, I needed to search the entire document.

{
  "query": { "match": { "_all": 20 } }
}

As a client, I need the parser to accept additional arguments and make them available in the semantic actions. For example:

{"query":{"match":{$context:$1}}}

$context could be a complex object. For example,
{"query":{"match":{$context.searchScope:$1}}}

Can you show a specific grammar and a semantic action example for this please?

Usually this can be handled as data on $1 itself, or via some state tracking in the code included through moduleInclude:

$$ = {"match": {$1.context: $1.value}};

Or:

%{

class Storage {
  public static $context = "foo";
}

// Then update Storage::$context in some semantic action

%}

And use it as follows (if the syntax works in your language):

$$ = { "match": {Storage::$context: $1 }}

Or:

$$ = new MatchNode(Storage::$context, $1);

// Or again just as:
$$ = new MatchNode($1.context, $1.value);

I need to pass some additional information to the parser. E.g.,
$new_search_expression = MyParser::parse($search_scope, $old_search_expression);

Then I need $search_scope to be available to semantic actions.

Here is the affected grammar.
["TERM", "$$ = ['match' => [$search_scope => $1]]"],

@fdutton, OK, so this can be achieved with the yy storage object, which semantic actions have access to (available since 0.0.66).

As described in #36, yy exposes access to the tokenizer instance, so semantic actions can affect tokenizer state. However, yy can also be used to store any extra user-level data.

So it might be:

// Set parser data.
yy::set('search_scope', $search_scope);

$new_search_expression = MyParser::parse($old_search_expression);

Later in semantic action:

["TERM", "$$ = ['match' => [yy::get('search_scope') => $1]]"]

In general, this could also be solved with custom storage in moduleInclude: define a storage class similar to yy in moduleInclude, set the needed data before parsing, and access it from semantic actions.
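The custom-storage idea can be sketched independently of any generated parser. The following is a minimal illustration in Python, not code produced by Syntax: `ParserStorage`, `on_term`, and `parse_terms` are hypothetical names, and the whitespace-splitting "parser" only stands in for a real one.

```python
class ParserStorage:
    """Hypothetical user-level storage, analogous to a class defined in moduleInclude."""

    _data = {}

    @classmethod
    def set(cls, key, value):
        cls._data[key] = value

    @classmethod
    def get(cls, key, default=None):
        return cls._data.get(key, default)


def on_term(term):
    """Stand-in for the TERM semantic action: reads the scope from storage."""
    return {"match": {ParserStorage.get("search_scope", "_all"): term}}


def parse_terms(expression):
    """Toy stand-in for a generated parser: one action call per term."""
    return [on_term(term) for term in expression.split()]


# Set the data before parsing, then parse as usual.
ParserStorage.set("search_scope", "account_number")
result = parse_terms("20 30")
print(result)
```

The storage is process-wide state, so it has the same sharing caveats as any other static data.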

Re: semantic action parameters: they are taken from the parsing stack, which pops and pushes the needed number of arguments, so we can't store user data there.
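To illustrate why the action parameters have no room for user data, here is a simplified sketch of a reduce step in Python. It is an assumption about the general shape of a shift-reduce parser, not Syntax's actual implementation; `reduce` and `add_action` are hypothetical names.

```python
def reduce(stack, rhs_length, action):
    """Pop `rhs_length` values off the parsing stack, call the action, push the result."""
    start = len(stack) - rhs_length
    args = stack[start:]
    del stack[start:]
    # The action receives exactly the popped values ($1, $2, ...), nothing else.
    stack.append(action(*args))


def add_action(left, _op, right):
    # Corresponds to a production such as: E -> E '+' E  { $$ = $1 + $3 }
    return left + right


stack = [2, "+", 3]
reduce(stack, 3, add_action)
print(stack)  # [5]
```

The argument count is fixed by the length of the production's right-hand side, which is why extra data has to come from somewhere outside the argument list.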

Thanks for the solution.

I can see how this would work for single-threaded languages like PHP and Node. I assume that something similar could be done with multi-threaded languages like Java and C# using thread-local variables.
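A thread-local variant might look like the following Python sketch. It is only an illustration of the idea under stated assumptions: `_context`, `on_term`, `parse_with_scope`, and `worker` are hypothetical names, and the toy parser just splits on whitespace.

```python
import threading

# Each thread sees its own attributes on this object.
_context = threading.local()


def on_term(term):
    """Stand-in for a semantic action that reads the per-thread scope."""
    return {"match": {_context.search_scope: term}}


def parse_with_scope(search_scope, expression):
    """Store the scope for the current thread, then run the toy parser."""
    _context.search_scope = search_scope
    return [on_term(term) for term in expression.split()]


def worker(scope, expression, results):
    results[scope] = parse_with_scope(scope, expression)


results = {}
threads = [
    threading.Thread(target=worker, args=("account_number", "20", results)),
    threading.Thread(target=worker, args=("_all", "20", results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

This isolates only the user data; it does not by itself make a parser with shared static state safe to call from several threads.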

Yeah, currently parsers are singletons (with a static parse method), which already makes them thread-unsafe (because they update "global" variables such as $$, @$, yytext, etc.). Issue #17 tracks this.

The static API will still be supported; for other cases, one will be able to create an instance of a parser, and all the data will be stored on that instance.
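The thread does not show what the instance-based API will look like, so the following Python sketch is purely hypothetical. It only illustrates how per-instance data would remove the need for shared storage; `ScopedParser` and its methods are invented names.

```python
class ScopedParser:
    """Hypothetical parser instance that carries its own user data."""

    def __init__(self, search_scope):
        self.search_scope = search_scope

    def _on_term(self, term):
        # Stand-in for a semantic action with access to instance data.
        return {"match": {self.search_scope: term}}

    def parse(self, expression):
        # Toy stand-in for real parsing: one action call per term.
        return [self._on_term(term) for term in expression.split()]


by_account = ScopedParser("account_number")
everywhere = ScopedParser("_all")
print(by_account.parse("20"))
print(everywhere.parse("20"))
```

Two instances with different scopes can coexist because neither writes to shared state.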