scipp/sciline

Alternative types in parameters

Closed this issue · 4 comments

Problem

For some parameters, it makes sense to be flexible about what types are allowed. For example:

  • Numeric parameters are often float. But in Python, passing an int instead would be expected to work the same.
  • File paths are commonly encoded as either str or pathlib.Path.

But Pipeline is strict about the types. E.g.,

params = {
    float: 3
}
sciline.Pipeline((), params=params)

raises TypeError: Key <class 'float'> incompatible to value 3 of type <class 'int'>

Potential solution

Convert to the declared type

Change

raise TypeError(
f'Key {key} incompatible to value {param} of type {type(param)}'
)
to not raise an exception but attempt a conversion instead. I.e., something along the lines of (with better error reporting)

self._set_provider(key, lambda: expected(param))

Or (for earlier error detection but more expensive pipeline construction)

converted = expected(param)
self._set_provider(key, lambda: converted)

This would make the pipeline almost as flexible as Python in terms of typing. It wouldn't quite be duck-typing as it requires types that are convertible to expected but it might be close enough.

Potential downsides are that these conversions can be expensive and that this can hide mistakes. But both are typical behaviour in Python programs.

Do not check the provided type

Simply drop the type-checking code shown above. This would correspond to regular duck-typing. If the provided value is of an unsupported type, this will lead to a runtime error in the providers that consume it.

So this would be maximally flexible but also shift right the error detection.

Workaround for numbers

In the case of int/float, parameters can instead be declared like this:

P1 = NewType('P1', Real)
P2 = NewType('P2', Real)
params = {
    P1: 3,   # works
    P2: 3.0, # works, too
}
sciline.Pipeline((), params=params)

But this only works where a common base type is available. And the pipeline author needs to design the pipeline for it.

Failed workarounds

Use Union

It is tempting to use

P = NewType('P', Union[str, Path])
params = {
    P: 'file'
}
sciline.Pipeline((), params=params)

but Sciline doesn't like unions as provider types. The above raises

  File ".../site-packages/sciline/pipeline.py", line 356, in __setitem__
    elif issubclass(origin, (Scope, ScopeTwoParams)):
TypeError: issubclass() arg 1 must be a class

Doing this with a scoped parameter:

class P(sciline.Scope[RunType, Union[str, Path]], Union[str, Path]): ...

raises

TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

Use TypeVar

Defining the parameter as a type var:

P = TypeVar('P', int, float)
params = {
    P: 3
}
sciline.Pipeline((), params=params)

raises

  File ".../site-packages/sciline/pipeline.py", line 367, in __setitem__
    if not isinstance(param, expected):
TypeError: isinstance() arg 2 must be a type or tuple of types

Are there any other examples apart from numbers and paths?

There is list/tuple which can be done using Sequence or Iterable. I can't think of anything else right now.

Use Union

It is tempting to use

P = NewType('P', Union[str, Path])
params = {
    P: 'file'
}
sciline.Pipeline((), params=params)

If we supported Union then this looks like a reasonable solution.

Todo: Find old discussion in Optional PR to find out why we decided against supporting it.

Todo: Find old discussion in Optional PR to find out why we decided against supporting it.

I couldn't find them in the issues or PRs, but I remember a couple of reasons so I'll just write them down here.

Union is not exactly a type and maybe need to parse type arguments many times for special handling

Union type aliases can't be uses to check an instance type with isinstance.
-> can be done by get_type_args instead, but it's an extra layer.
A type alias can't be type-hinted as a type
-> Maybe one-time problem that should be fixed in the pipeline module...? I don't know...

What Union means as a dependency type.

It could also mean that we only need one of the types, which will make graph building so complicated.
I think we can simply interpret it to find a provider of that Union type.

There was this potentially confusing conflicts I mentioned here: scipp/beamlime#114
but it can be solved by adding new types as a type arguments in Union type aliases.

For example,

from typing import Union, NewType

AInt = NewType("AInt", int)
BInt = NewType("BInt", int)
AStr = NewType("AStr", str)
BStr = NewType("BStr", str)

alias_a = Union[_AInt, AStr] 
alias_b = Union[_BInt, BStr]

If it is not just a parameter and need to be composed by another provider,
we only need one provider that returns the type of this Union alias.
And then we can use override decorator for better documentation for that provider.