open-xml-templating/docxtemplater

Getting all the tags while using Conditions

Kolis121 opened this issue · 10 comments

  • Version of docxtemplater : 3.37.0
  • Used docxtemplater-modules : Free module
  • Runner : TypeScript

How to reproduce my problem :

First off, thank you for your interesting script.
I am testing it and have read the documentation, but it is not clear to me how to get the tags inside a document when they contain conditional expressions. I tried to use angular-expressions to get the tags, but it does not work.
For example, how can I get the tags of the items inside your own Conditions example here:
https://docxtemplater.com/demo/#conditions

Thank you!

Yes, I have seen that, and I can get the tags in the following format as long as I do not use expressions in the document:
{
  "company": {},
  "users": {
    "name": {},
    "age": {}
  }
}

But if I have expressions in the document, I can no longer get the tags and I get errors instead.
Should I use something like this?

const expressions = require("angular-expressions");
const InspectModule = require("docxtemplater/js/inspect-module");
const iModule = InspectModule();
const doc = new Docxtemplater(zip, { modules: [iModule], parser: expressions });
const tags = iModule.getAllTags();
console.log(tags);

Yes. With docxtemplater, if your tag is {users.length > 0}, then the returned object will use the key "users.length > 0".

You can extract all the real variables from this using the following (this is also in the docs).

Note about the angular-expressions feature: for the tag {%img | size:20:20}, the output will be: img | size:20:20

You can extract only the Identifiers from the angular-expressions using this function :

const expressions = require("angular-expressions");
function getIdentifiers(x) {
    const ids = [];
    x.ast.body.forEach(function (body) {
        body.expression.toWatch.forEach(function ({ name }) {
            if (ids.indexOf(name) === -1) {
                ids.push(name);
            }
        });
    });
    return ids;
}
const tag = "img | size:200:300";
const expr = expressions.compile(tag);
const identifiers = getIdentifiers(expr); // this will return ["img"]
// For the tag { firstName + lastName }, it will return ["firstName", "lastName"]

Yes, I had seen that too.
But my point was that if I do not have any conditional expressions, I can get all the tags in one go, including any nested map, using this:

Code (A)
const InspectModule = require("docxtemplater/js/inspect-module");
const iModule = InspectModule();
const doc = new Docxtemplater(zip, { modules: [iModule] });
const tags = iModule.getAllTags();
console.log(tags);

But the following does not work when the document contains conditional expressions:
Code (B)
const expressions = require("angular-expressions");
const InspectModule = require("docxtemplater/js/inspect-module");
const iModule = InspectModule();
const doc = new Docxtemplater(zip, { modules: [iModule], parser: expressions });
const tags = iModule.getAllTags();
console.log(tags);

So apparently the only solution is to first get all the tags using code (A) above, and then remove the conditional expressions from the result. But this method is not clean, and you have to reconstruct the map of tags, because a tag like "{#users<= 3 && users!= 0}" is itself treated as a loop tag when we use code (A).
I thought maybe there is a better way. I hope it is clear what I mean.

Also, in your example above, which you took from the manual, this line of code will not work if the tag is "users<= 3 && users!= 0"; it returns undefined (you can try it):

const expr = expressions.compile(tag);

You can do the following then :

const expressions = require("angular-expressions");

function uniq(arr) {
	const hash = {},
		result = [];
	for (let i = 0, l = arr.length; i < l; ++i) {
		if (!hash[arr[i]]) {
			hash[arr[i]] = true;
			result.push(arr[i]);
		}
	}
	return result;
}

function getIdentifiers(x) {
    if (x.expression) {
        return getIdentifiers(x.expression);
    }
    if (x.body) {
        return x.body.reduce((result, y) => result.concat(getIdentifiers(y)), []);
    }
    if (x.type === "CallExpression") {
        if (x.arguments) {
            return x.arguments.reduce(
                (result, y) => result.concat(getIdentifiers(y)),
                []
            );
        }
    }
    if (x.ast) {
        return getIdentifiers(x.ast);
    }

    if (x.left) {
        return getIdentifiers(x.left).concat(getIdentifiers(x.right));
    }

    if (x.type === "Identifier") {
        return [x.name];
    }
    if (x.type === "MemberExpression") {
        return getIdentifiers(x.object);
    }
    return [];
}

function getUniqueIdentifiers(x) {
	return uniq(getIdentifiers(x));
}

const tag = 'users<= 3 && users!= 0';
const ids = getUniqueIdentifiers(expressions.compile(tag));
console.log(JSON.stringify({"ids": ids}));

I've just released a new version of the docxtemplater module (3.37.2). With this version, you can now do:

const expressionParser = require("docxtemplater/expressions.js");
const identifiers = expressionParser("x+0+users").getIdentifiers();
// identifiers will be : ["x", "users"]

It will also work with 'users<= 3 && users!= 0'
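
As a rough sketch of how this could be applied to a whole document: the keys reported by the inspect module's getAllTags() can be walked recursively, and each key passed to an identifier extractor. In the snippet below, collectIdentifiers and naiveIds are hypothetical names. naiveIds is only a crude regex stand-in so that the example runs without any dependency; in real use one would presumably pass (tag) => expressionParser(tag).getIdentifiers() instead. The sample tags object is hand-written and assumes that a condition section is reported under its raw expression string.

```javascript
// Walk a getAllTags()-style map and collect every identifier used in any tag.
// `getIds` is injected so that the sketch runs without docxtemplater installed.
function collectIdentifiers(tags, getIds, seen = new Set()) {
  for (const [tag, children] of Object.entries(tags)) {
    for (const id of getIds(tag)) {
      seen.add(id);
    }
    collectIdentifiers(children, getIds, seen);
  }
  return seen;
}

// Crude stand-in (assumption, demo only): matches names that are not
// property accesses. It does not understand strings, keywords or filters.
const naiveIds = (tag) => tag.match(/(?<![.\w$])[A-Za-z_$][\w$]*/g) || [];

// Hand-written sample, shaped like the output of code (A).
const sampleTags = {
  "users.length > 3 && stores && cars": { "users[0].name": {}, color: {} },
  "users.length == 0": {},
};

console.log([...collectIdentifiers(sampleTags, naiveIds)]);
// prints [ 'users', 'stores', 'cars', 'color' ]
```

This gives a flat list of the variable names the template refers to. It says nothing about how those names are nested in the data.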

About your question on combining code (A) and code (B), I don't think it is possible in general.

Let's take an example template :

{#users.length > 1}
{#users}
{name}
{/}
{/}

What should be the output here ?

{ users: { users: {name: {}}}}

This seems incorrect since an input of : { users: [ {name: "john"}] } would be correct in this case.

Thank you for your reply again - really appreciate it!
I will try your new release.

And regarding your comment above (which I am showing below), that is exactly what I mean. Yes, "{ users: { users: {name: {}}}}" is incorrect, and for this reason and a few others, I was hoping that Docxtemplater could handle it in one go. That is, the script should understand that "users" in "#users.length > 1" and "users" in "#users" are on the same level since they have the same name, so the map should be "{ users: [ { name: {} } ] }", obtained in one go with something similar to code (A) or (B) that I showed before. This would be really helpful, especially with multiple nested levels and complicated conditions involving several comparisons and tag names.
I hope that is clear.

Let's take an example template :

{#users.length > 1}
{#users}
{name}
{/}
{/}
What should be the output here ?

{ users: { users: {name: {}}}}
This seems incorrect since an input of : { users: [ {name: "john"}] } would be correct in this case.

I think this could be done only for some simple cases, and only with a lot of heuristics.

It is a bit like typings in coding languages.

If you have a function, such as :

function add(a, b, c) {
    return a + b + c;
}

You don't know if a, b and c are all strings, all integers, or all floats.

Let's say I have the following template :

{#companies}
{#users}
{name}
{cname}
{/users}
{/companies}

This template would work well with the following data :

{
   companies: [{ cname: "Acme" }, { cname: "Bar" }],
   users: [{ name: "John" }, { name: "Mary" }],
}

But it also works with the following data :

{
   companies: [ {
        cname: "Acme",
        users: [{name: "John"}]
   }]
}
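
To see why both shapes are accepted, here is a minimal model of the lookup, assuming only the documented behaviour that a tag not found in the current scope is looked up in the parent scopes. The names lookup and render are hypothetical, and render hard-codes the template above instead of parsing it, so this is an illustration and not how docxtemplater is actually implemented.

```javascript
// Look a name up in the innermost scope first, then walk outwards.
function lookup(scopes, name) {
  for (let i = scopes.length - 1; i >= 0; i--) {
    const scope = scopes[i];
    if (scope != null && scope[name] !== undefined) {
      return scope[name];
    }
  }
  return undefined;
}

// Hand-rolled equivalent of:
// {#companies}{#users}{name} {cname}{/users}{/companies}
function render(data) {
  const lines = [];
  for (const company of lookup([data], "companies") || []) {
    const outer = [data, company];
    for (const user of lookup(outer, "users") || []) {
      const inner = [...outer, user];
      lines.push(`${lookup(inner, "name")} ${lookup(inner, "cname")}`);
    }
  }
  return lines;
}

// Shape 1: users live next to companies, at the root.
const flatData = {
  companies: [{ cname: "Acme" }, { cname: "Bar" }],
  users: [{ name: "John" }, { name: "Mary" }],
};

// Shape 2: users live inside each company.
const nestedData = {
  companies: [{ cname: "Acme", users: [{ name: "John" }] }],
};

console.log(render(flatData));
// prints [ 'John Acme', 'Mary Acme', 'John Bar', 'Mary Bar' ]
console.log(render(nestedData));
// prints [ 'John Acme' ]
```

Both inputs render without any undefined value, which is why the template alone does not determine a single data structure.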

My question is : what are you trying to achieve ?

First, just to clarify how I think Docxtemplater interprets nested sections (correct me if I am wrong).
For example, for this template:

{#companies}
{#users}
{name}
{cname}
{/users}
{/companies}

I expect the data to look like this:

{
   companies: [ 
      {
        users: [{name: "John"}, {name: "Mary"} ],
      }
   ]
}

Right?

What I want to achieve is to get a valid tag structure even if I have conditional expressions in my tags. Otherwise I have to check every single tag I get with code (A) to decide whether to treat it as a loop section or as a plain condition.
For example, let's assume we only have the following in the document:

{#users.length > 3}
There are many users
The name of the first user is: {users[0].name}
{/}
{#users.length == 0}
There are no users
{/}

Then I would expect Docxtemplater to give me the following tag structure in one go:
{
"users": [ {name: ""}]
}

But I see that the situation becomes complicated. For example, what if I had only this in the document:

{#users.length > 3 && stores && cars}
There are many users
The name of the first user is: {users[0].name}
car color is {color}
{/}
{#users.length == 0}
There are no users
{/}

In this case, how is Docxtemplater supposed to interpret "stores" and "cars"? As loop sections? As simple tags? Should "color" be considered a simple tag, or a member of "cars", "users" or "stores"?

So I was thinking that maybe Docxtemplater has some rules for such situations and can therefore interpret these cases in one go. Otherwise I have to post-process all the tags I get from code (A) to build a valid global tag map for the document.

Now, I guess that whenever I have conditional expressions in my document and want to get the tags, I should avoid treating the tags that appear in the condition itself as section names, i.e. the tags in something like this:
{#users.length > 3 && stores && cars}

This means that "users", "stores" and "cars" should be defined somewhere else in the document, and in {#users.length > 3 && stores && cars} they are only used to build the condition logic.
It also means that I assume none of the tags inside this section, such as "color" or "users[0].name", are nested in any section created by {#users.length > 3 && stores && cars}.
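
A minimal sketch of that rule, written as a post-processing step on the output of code (A): any key that is not a bare name is treated as a pure condition, and its children are hoisted to the enclosing level. The names isPlainTag, mergeInto and hoistConditions are hypothetical, and the sample input assumes that the inspect module reports a condition section under its raw expression string. This is a heuristic, not something Docxtemplater provides.

```javascript
// Hypothetical helper: true for a bare name such as "users" or "name".
const isPlainTag = (tag) => /^[A-Za-z_$][\w$]*$/.test(tag);

// Recursively merge `source` into `target` (both are nested plain objects).
function mergeInto(target, source) {
  for (const [key, value] of Object.entries(source)) {
    target[key] = mergeInto(target[key] || {}, value);
  }
  return target;
}

// Rebuild the tag map. A plain key keeps its nesting; any other key is
// assumed to be a condition, so its children move up one level.
function hoistConditions(tags) {
  const result = {};
  for (const [key, children] of Object.entries(tags)) {
    const cleaned = hoistConditions(children);
    if (isPlainTag(key)) {
      mergeInto(result, { [key]: cleaned });
    } else {
      mergeInto(result, cleaned);
    }
  }
  return result;
}

// Hand-written sample, shaped like the output of code (A).
const tags = { "users.length > 1": { users: { name: {} } } };
console.log(JSON.stringify(hoistConditions(tags)));
// prints {"users":{"name":{}}}
```

Note that this drops every non-plain key, including value tags such as "users[0].name", so the identifiers inside expressions still have to be extracted separately.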

Sorry for the long text.

Hello @Kolis121 ,

you're right, I think it would be just guessing to try to determine the tag structure from the template alone.

Since one can write arbitrarily long expressions in tags, a template does not map to a single data structure.