GabrielDosReis/ipr

Global_region/Empty_stmt/Primitive/Parameter_list do not have categories?

Closed this issue · 11 comments

Is this intentional? It makes checking for these nodes more difficult, as the user is expected to query the actual type.

Parameter_list is documented with a TODO to add categories

  1. Global_region is a distinguished constant, just like value 0. So, it is not clear to me that we want to have a distinguished category code for it. Where is such a category code needed?

  2. Parameter_list is essentially a declarative region, just like a class scope. Where would we need to compare its category code?

I think this ticket might be the result of interpreting the following comment from interface:

   // A list of numerical codes in one-to-one correspondence with
   // IPR node "interface types".

And Parameter_list even had a FIXME and an entry in the node-category file.

But I do agree, these are not coming from real-world use-cases.

With distinguished constants like Global_region, we can always just compare the memory address (use the object identity) to check if the Region we are working with is the global one.
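To make the identity check concrete, here is a minimal sketch. `Region`, `Lexicon`, `global_region()` and `is_global()` are hypothetical stand-ins written for this comment, not the actual IPR interface; the only point is that a distinguished constant can be recognized by its address.

```cpp
#include <cassert>

// Hypothetical stand-in for a declarative region; NOT the real IPR class.
struct Region {
    virtual ~Region() = default;
};

// Hypothetical owner of the single, distinguished global region.
struct Lexicon {
    const Region& global_region() const { return global_; }
private:
    Region global_;
};

// Identity test: a region is the global one exactly when it is the very
// same object, so comparing addresses is enough. No category code needed.
inline bool is_global(const Region& r, const Lexicon& lexicon) {
    return &r == &lexicon.global_region();
}
```

This assumes there is exactly one global region object per owner and that regions are never copied.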

In case of Parameter_list what is the role of the Region? Do we think of it as the enclosing Region of the body? If we have an analysis that wants to handle Parameter_lists differently than other Regions can we branch on the kind of the Region somehow without a category?

I think this ticket might be the result of interpreting the following comment from interface

Ah! That comment is not precise enough to take literally.

And Parameter_list even had a FIXME and an entry in the node-category file.

Even weirder, :-)

With distinguished constants like Global_region, we can always just compare the memory address (use the object identity) to check if the Region we are working with is the global one.

Yes.

In case of Parameter_list what is the role of the Region? Do we think of it as the enclosing Region of the body? If we have an analysis that wants to handle Parameter_lists differently than other Regions can we branch on the kind of the Region somehow without a category?

Region captures the notion of declarative regions -- which a parameter-list is. In the case of function definition, that region conceptually extends till the end of the function definition, and indeed it is the enclosing Region of the body.

To turn the question around a bit. If we do not need a category number for distinguished constants like Global_region, do we need them to have a separate type in the first place?

Parameter_list is essentially a declarative region, just like a class scope. Where would we need to compare its category code?

In some internal code, there is a dumper that writes IPR nodes to an output format based on category code. If some nodes have no category code, we cannot produce a faithful dump of the IPR. Of course, it is possible to overcome this problem using visitors; it is just a bit more involved. So in this particular case it is more of a nuisance than a functional problem.
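Roughly, the difference looks like the sketch below. Everything in it (`Node`, `Block`, `Parameter_list`, `Category_code`, `Visitor`, and both dump functions) is a simplified, hypothetical model made up for illustration, not the real IPR interface or our internal dumper. It only shows why a category-driven dump loses nodes without a code, while a visitor still reaches them.

```cpp
#include <cassert>
#include <string>

struct Block;
struct Parameter_list;

// Hypothetical visitor with one overload per concrete node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Block&) = 0;
    virtual void visit(const Parameter_list&) = 0;
};

// Hypothetical codes: Parameter_list deliberately has no entry.
enum class Category_code { unknown, block };

struct Node {
    virtual ~Node() = default;
    virtual Category_code category() const { return Category_code::unknown; }
    virtual void accept(Visitor&) const = 0;
};

struct Block : Node {
    Category_code category() const override { return Category_code::block; }
    void accept(Visitor& v) const override { v.visit(*this); }
};

struct Parameter_list : Node {
    void accept(Visitor& v) const override { v.visit(*this); }
};

// Category-driven dump: nodes without a code cannot be told apart.
inline std::string dump_by_category(const Node& n) {
    switch (n.category()) {
    case Category_code::block:   return "block";
    case Category_code::unknown: break;
    }
    return "?";
}

// Visitor-driven dump: double dispatch recovers the concrete type.
struct Dump_visitor : Visitor {
    std::string out;
    void visit(const Block&) override { out = "block"; }
    void visit(const Parameter_list&) override { out = "parameter_list"; }
};

inline std::string dump_by_visitor(const Node& n) {
    Dump_visitor v;
    n.accept(v);
    return v.out;
}
```

The visitor version needs more boilerplate, which is the "bit more involved" part, but it does not depend on every node having a code.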

To turn the question around a bit. If we do not need a category number for distinguished constants like Global_region, do we need them to have a separate type in the first place?

The numerical codes are intended to be in one-to-one correspondence (an injection) with the interface types, but they were never intended to be an onto mapping (a surjection), and therefore not a bijection. Said differently, even though we keep adding or renaming the node numerical codes, I would like to see them reduced in number - if they can't totally disappear.

I don't see, at this point, enough compelling reasons to have a distinguished constant type like Global_region with a dedicated numerical code.

If something warrants being a Node and the user can find utility from knowing that concept, I think it should have a category. I don't think there should be any type in the interface that requires me to do a dynamic_cast to find out its meaning.

At the same time I don't think just lowering the number of nodes inherently makes the IPR simpler. If it is a distinct concept that can add useful context then promote it to a full node rather than remove it.

Global_region certainly doesn't need to be a dedicated type. You will have to solve issue #33 before removing it, as you need a way to walk up the tree without crashing.

If something warrants being a Node and the user can find utility from knowing that concept, I think it should have a category.

Yes, but why?

I don't think there should be any type in the interface that requires me to do a dynamic_cast to find out its meaning.

Yes, but why?

At the same time I don't think just lowering the number of nodes inherently makes the IPR simpler.

Actually, eliminating the category code makes the IPR simpler: there is no longer ambiguity about exactly what to test (the type or the category code?).

Global_region certainly doesn't need to be a dedicated type. You will have to solve issue #33 before removing it, as you need a way to walk up the tree without crashing.

Logically, the global region does not have a parent. And logically, any walk of the parent chain has to test whether it is at the global region. Consequently, a test is logically necessary.
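As a sketch of what such a walk might look like (the `Region` class, `enclosing()` and `depth()` below are hypothetical and simplified, not the IPR interface), the loop condition is exactly that test:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical region with an optional parent; NOT the real IPR class.
struct Region {
    explicit Region(const Region* parent = nullptr) : parent_{parent} {}

    // Precondition: this region has a parent, i.e. it is not the global one.
    const Region& enclosing() const {
        assert(parent_ != nullptr);
        return *parent_;
    }

private:
    const Region* parent_;
};

// Counts the regions between `start` and the global region.
// The walk stops by identity comparison, so enclosing() is never
// called on the global region itself.
// Assumes `start` is `global` or nested somewhere inside it.
inline std::size_t depth(const Region& start, const Region& global) {
    std::size_t n = 0;
    for (const Region* r = &start; r != &global; r = &r->enclosing())
        ++n;
    return n;
}
```

With the test in the loop condition, asking the global region for its parent is a precondition violation rather than something every walk has to recover from.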

Sorry I missed the part where you said make them totally disappear. I'm not against that idea.

I guess I don't understand what Category_code was for. I always assumed it was more of a utility for describing all "instantiable" nodes (not base classes) and maybe a perf thing to avoid RTTI lookup or virtual visitor calls.

These 4 nodes (now 3) are the only nodes that stop this from being a surjection and create what personally felt like an inconsistency. Does category have some purpose/meaning that I am missing?

Actually, eliminating the category code makes the IPR simpler: there is no longer ambiguity about exactly what to test (the type or the category code?).

I was commenting on a reduction in nodes in general (not just categories), but that's just a very broad personal opinion that doesn't have relevance to this discussion, so I will retract it.

One of the issues we have here is designing an interface that does not promote nor recommend:

  • (1) switch statements with (2) static_cast to navigate class hierarchies: we should regard all static_casts as suspect (potentially unsafe) and an indication of weakness in the interface

  • A cascade of dynamic_casts to navigate class hierarchies: the resulting code is not just ugly, but brittle; a series of homegrown dyn_cast is the same
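For readers who have not seen these two patterns side by side, here is a small sketch of both. The hierarchy (`Node`, `Identifier`, `Literal`) and the `Category_code` enum are hypothetical and made up for this comment; they are not the IPR classes.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical codes and nodes, for illustration only.
enum class Category_code { identifier, literal };

struct Node {
    explicit Node(Category_code c) : category{c} {}
    virtual ~Node() = default;
    const Category_code category;
};

struct Identifier : Node {
    explicit Identifier(std::string n)
        : Node{Category_code::identifier}, name{std::move(n)} {}
    std::string name;
};

struct Literal : Node {
    explicit Literal(int v) : Node{Category_code::literal}, value{v} {}
    int value;
};

// Pattern 1: switch on the code, then static_cast.
// The cast is only correct if the code and the dynamic type agree;
// nothing checks that, so a mismatch is undefined behavior.
inline std::string describe_with_switch(const Node& n) {
    switch (n.category) {
    case Category_code::identifier:
        return "id:" + static_cast<const Identifier&>(n).name;
    case Category_code::literal:
        return "lit:" + std::to_string(static_cast<const Literal&>(n).value);
    }
    return "?";
}

// Pattern 2: a cascade of dynamic_casts.
// Safe, but every new node type means another branch in every cascade,
// and a forgotten branch silently falls through to the default.
inline std::string describe_with_casts(const Node& n) {
    if (auto id = dynamic_cast<const Identifier*>(&n))
        return "id:" + id->name;
    if (auto lit = dynamic_cast<const Literal*>(&n))
        return "lit:" + std::to_string(lit->value);
    return "?";
}
```

Both functions give the same answers here; the concern is how each one behaves as the hierarchy grows or when a code and a type drift apart.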

Does category have some purpose/meaning that I am missing?

The numerical category codes were a hack (see also the remark in Bjarne's TC++PL 4th edition on the IPR) designed to quickly match two nodes based on their semantic types, with the hope of finding a better replacement as time goes by and the implementation language improves. I had hoped that we would be able to design a pattern matching system that takes care of this. I don't know if the current pattern matching proposal addresses this -- but it was on my list of problems to solve when I was working on it in the early 2010s.

Luke Wagner designed a very nice system layered on top of the visitors to hide their inconveniences; I don't have his work anymore.

Global_scope was removed by #241.
Empty_stmt and Primitive were removed by #243.
Parameter_list has a numerical code as per #135.

Closing as the core issues have been resolved by those patches.