Better handling of invalid netlist names
Closed this issue · 10 comments
SAX should better handle 'invalid' netlist names. At the very least a warning should be issued.
Also see gdsfactory/gdsfactory#667 (comment)
Maybe as an initial very simple effort one could modify 02_util.ipynb:
def clean_string(s: str, dot='p', minus='m', other='_') -> str:
"""clean a string such that it is a valid python identifier"""
original = s
s = s.strip()
s = s.replace(".", dot) # dot
s = s.replace("-", minus) # minus
s = re.sub("[^0-9a-zA-Z]", other, s)
if s[0] in "0123456789":
s = "_" + s
if s != original:
warnings.warn(f"modified string {original} in an attempt to make valid python identifier: {s}")
if not s.isidentifier():
raise ValueError(f"failed to clean string to a valid python identifier: {s}")
return s
In addition, also portnames and placements should be checked (in 06_netlist.ipynb
):
@validator("instances", pre=True)
def coerce_different_type_instance_into_component_model(cls, instances):
new_instances = {}
for k, v in instances.items():
if isinstance(v, str):
v = {
"component": v,
"settings": {},
}
new_instances[k] = v
return new_instances
@staticmethod
def clean_instance_string(value):
if "," in value:
raise ValueError(
f"Invalid instance string. Should not contain ','. Got: {value}"
)
return clean_string(value)
@validator("instances")
def validate_instance_names(cls, instances):
return {cls.clean_instance_string(k): v for k, v in instances.items()}
@validator("placements")
def validate_placement_names(cls, placements):
return {cls.clean_instance_string(k): v for k, v in placements.items()}
@classmethod
def clean_connection_string(cls, value):
*comp, port = value.split(",")
comp = cls.clean_instance_string(",".join(comp))
return f"{comp},{port}"
@validator("connections")
def validate_connection_names(cls, connections):
return {
cls.clean_connection_string(k): cls.clean_connection_string(v)
for k, v in connections.items()
}
@validator("ports")
def validate_port_names(cls, ports):
return {
cls.clean_instance_string(k): cls.clean_connection_string(v)
for k, v in ports.items()
}
this then also resolves gdsfactory/gdsfactory#667
If you want I can open a PR. Is there anything I have to be careful with when contributing? Like generating the .py
files from the .ipynb
files before commit or anything?
Regards JD
@joamatab Do you thing we should do similar checking during netlist generation in gdsfactory also?
Yes, sounds like a good idea, maybe we can open also. Pull request on gdsfactory side
reference names are already set via a property, and some basic checks are happening there.
it may be a good idea to check when setting a reference name, which will ensure that even in programmatically-generated circuits, users are not giving references bad names
As far as I can see the basic checks in the property setter are only concerning/avoiding duplication of names under the same owner, correct? But not whether it's a valid python identifier. For consistency, I would check all identifiers (instances, ports, connections, placements) on netlist import. Or did I misunderstand you?
this then also resolves gdsfactory/gdsfactory#667 If you want I can open a PR. Is there anything I have to be careful with when contributing? Like generating the
.py
files from the.ipynb
files before commit or anything? Regards JD
A PR would be great :)
Yeah, just make the changes in the notebooks and do a pre-commit install
before committing the changes. The pre-commit hooks should generate the python library automatically.
Alternatively you can also run a make lib
, which is convenient if you want to build the library but don't want to commit the changes yet.
I might convert the library to a more conventional python library in the future to make contributions like yours easier.
As far as I can see the basic checks in the property setter are only concerning/avoiding duplication of names under the same owner, correct? But not whether it's a valid python identifier. For consistency, I would check all identifiers (instances, ports, connections, placements) on netlist import. Or did I misunderstand you?
Indeed. I think having proper validation on everything in the netlist is pretty necessary
As far as I can see the basic checks in the property setter are only concerning/avoiding duplication of names under the same owner, correct? But not whether it's a valid python identifier. For consistency, I would check all identifiers (instances, ports, connections, placements) on netlist import. Or did I misunderstand you?
correct. i was just saying that it may be good to check on set (in addition to the check we have). then you would not be able to make an instance with an invalid name to begin with.
if you check on netlisting, you will still be able to make the instance with a bad name, but it will complain when you netlist. i think there's arguments both ways to which is a better approach.