flaport/sax

Better handling of invalid netlist names

Closed this issue · 10 comments

SAX should better handle 'invalid' netlist names. At the very least a warning should be issued.

Also see gdsfactory/gdsfactory#667 (comment)

Maybe as an initial very simple effort one could modify 02_util.ipynb:

def clean_string(s: str, dot='p', minus='m', other='_') -> str:
    """clean a string such that it is a valid python identifier"""
    original = s
    s = s.strip()
    s = s.replace(".", dot)  # dot
    s = s.replace("-", minus)  # minus
    s = re.sub("[^0-9a-zA-Z]", other, s)
    if s[0] in "0123456789":
        s = "_" + s
    if s != original:
        warnings.warn(f"modified string {original} in an attempt to make valid python identifier: {s}")
    if not s.isidentifier():
        raise ValueError(f"failed to clean string to a valid python identifier: {s}")
    return s

In addition, also portnames and placements should be checked (in 06_netlist.ipynb):

@validator("instances", pre=True)
    def coerce_different_type_instance_into_component_model(cls, instances):
        new_instances = {}
        for k, v in instances.items():
            if isinstance(v, str):
                v = {
                    "component": v,
                    "settings": {},
                }
            new_instances[k] = v

        return new_instances

    @staticmethod
    def clean_instance_string(value):
        if "," in value:
            raise ValueError(
                f"Invalid instance string. Should not contain ','. Got: {value}"
            )
        return clean_string(value)

    @validator("instances")
    def validate_instance_names(cls, instances):
        return {cls.clean_instance_string(k): v for k, v in instances.items()}

    @validator("placements")
    def validate_placement_names(cls, placements):
        return {cls.clean_instance_string(k): v for k, v in placements.items()}

    @classmethod
    def clean_connection_string(cls, value):
        *comp, port = value.split(",")
        comp = cls.clean_instance_string(",".join(comp))
        return f"{comp},{port}"

    @validator("connections")
    def validate_connection_names(cls, connections):
        return {
            cls.clean_connection_string(k): cls.clean_connection_string(v)
            for k, v in connections.items()
        }

    @validator("ports")
    def validate_port_names(cls, ports):
        return {
            cls.clean_instance_string(k): cls.clean_connection_string(v)
            for k, v in ports.items()
        }

this then also resolves gdsfactory/gdsfactory#667
If you want I can open a PR. Is there anything I have to be careful with when contributing? Like generating the .pyfiles from the .ipynb files before commit or anything?
Regards JD

@joamatab Do you thing we should do similar checking during netlist generation in gdsfactory also?

Yes, sounds like a good idea, maybe we can open also. Pull request on gdsfactory side

reference names are already set via a property, and some basic checks are happening there.

it may be a good idea to check when setting a reference name, which will ensure that even in programmatically-generated circuits, users are not giving references bad names

As far as I can see the basic checks in the property setter are only concerning/avoiding duplication of names under the same owner, correct? But not whether it's a valid python identifier. For consistency, I would check all identifiers (instances, ports, connections, placements) on netlist import. Or did I misunderstand you?

this then also resolves gdsfactory/gdsfactory#667 If you want I can open a PR. Is there anything I have to be careful with when contributing? Like generating the .pyfiles from the .ipynb files before commit or anything? Regards JD

A PR would be great :)

Yeah, just make the changes in the notebooks and do a pre-commit install before committing the changes. The pre-commit hooks should generate the python library automatically.

Alternatively you can also run a make lib, which is convenient if you want to build the library but don't want to commit the changes yet.

I might convert the library to a more conventional python library in the future to make contributions like yours easier.

As far as I can see the basic checks in the property setter are only concerning/avoiding duplication of names under the same owner, correct? But not whether it's a valid python identifier. For consistency, I would check all identifiers (instances, ports, connections, placements) on netlist import. Or did I misunderstand you?

Indeed. I think having proper validation on everything in the netlist is pretty necessary

As far as I can see the basic checks in the property setter are only concerning/avoiding duplication of names under the same owner, correct? But not whether it's a valid python identifier. For consistency, I would check all identifiers (instances, ports, connections, placements) on netlist import. Or did I misunderstand you?

correct. i was just saying that it may be good to check on set (in addition to the check we have). then you would not be able to make an instance with an invalid name to begin with.

if you check on netlisting, you will still be able to make the instance with a bad name, but it will complain when you netlist. i think there's arguments both ways to which is a better approach.

Fixed by #12. Thanks Jan David!