New alias system towards a more Pythonic interface
seisman opened this issue · 3 comments
GMT's single-letter options (e.g. -B) are difficult to read/understand, so they are not recommended for use in PyGMT. Instead, PyGMT uses long-form parameters and the PyGMT alias system is responsible for translating PyGMT long-form parameters into the corresponding short-form GMT options. The alias system was originally implemented by @leouieda in aad12e0 (seven years ago!) and hasn't changed much since then. The alias system has some limitations and flaws that prevent us from achieving the project goal: "Build a Pythonic API for GMT". Now it's time to design a new alias system. This issue reviews the current alias system and proposes a new alias system. The initial implementation of the proposed new alias system is available for review in #3238.
The current alias system
Currently, the alias system looks like this:
@fmt_docstrings
@use_alias(
    R="region",
    B="frame",
    J="projection",
)
@kwargs_to_strings(R="sequence")
def func(self, **kwargs):
    with Session() as lib:
        lib.call_module("basemap", args=build_arg_list(kwargs))
The current alias system works in this way:
- The kwargs_to_strings decorator converts an argument to a string. The argument can be a string, a numeric value, or a sequence (e.g., converting region=[10, 20, 30, 40] to region="10/20/30/40").
- The use_alias decorator maps long-form PyGMT parameters (e.g., region) to short-form GMT options (e.g., R). The short-form options are then stored in kwargs (i.e., converting region="10/20/30/40" to kwargs["R"]="10/20/30/40").
- build_arg_list (previously build_arg_string) converts the dictionary kwargs to a list/string that the GMT API can take.
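The three steps above can be sketched with plain-Python stand-ins. This is only an illustration of the decorator-based flow, not PyGMT's actual code: to_strings, alias_keywords and build_args are hypothetical names, and the real decorators handle many more cases.

```python
import functools


def to_strings(**conversions):
    """Hypothetical stand-in for kwargs_to_strings: join sequences with '/'."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for key, kind in conversions.items():
                value = kwargs.get(key)
                if kind == "sequence" and isinstance(value, (list, tuple)):
                    kwargs[key] = "/".join(str(item) for item in value)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def alias_keywords(**aliases):
    """Hypothetical stand-in for use_alias: rename long-form keywords."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for short, long_name in aliases.items():
                if long_name in kwargs:
                    kwargs[short] = kwargs.pop(long_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def build_args(kwdict):
    """Hypothetical stand-in for build_arg_list: {'R': 'x'} -> ['-Rx']."""
    return [f"-{key}{value}" for key, value in sorted(kwdict.items())]


# Same stacking order as in the wrapper above: the outer decorator renames
# region to R first, then the inner one converts kwargs["R"] to a string.
@alias_keywords(R="region", J="projection")
@to_strings(R="sequence")
def basemap_args(**kwargs):
    # Inside the wrapper only the single-letter keys are visible.
    return build_args(kwargs)


args = basemap_args(region=[10, 20, 30, 40], projection="X10c")
print(args)  # ['-JX10c', '-R10/20/30/40']
```

Note that by the time basemap_args runs, the original list passed as region is gone, which is the root of several limitations listed below.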
The current alias system has some known limitations and flaws:
- Long arguments are difficult to read/write.
  Since each GMT option usually has many modifiers, some arguments are very long and no tab autocompletion is possible.
  Here is an example from #1082:
  fig.logo(position="jTR+o0.3c/0.6c+w3c", box="+p1p+glightblue")
  The parameter names position and box are good, but their arguments are difficult to write/read. In #1082, some candidate solutions (dict, class or function) were proposed. Please refer to #1082 for detailed discussions.
- Short arguments are easy to write but difficult to read.
  For some options, GMT uses single-letter arguments. Here are two examples:
  - Figure.coast: resolution="f" is not readable. resolution="full" is more Pythonic.
  - pygmt.binstats: statistic="z" is not readable. statistic="sum" is more Pythonic.
  To support Pythonic long-form arguments, we can use a dictionary which maps long-form arguments to short-form arguments. In the current alias system, it means a lot of coding effort, see #3012 and #3013.
- Abuse of the kwargs parameter.
  Short-form GMT options are stored in the keyword argument kwargs, so it must be the last parameter for all wrappers that use the alias system.
- Can't access the original argument by the long-form parameter name inside the wrappers.
  The alias system is implemented as decorators, so all conversions/mappings are done outside of the wrappers. It means we can't access the original argument by the long-form parameter name in the wrappers.
  For example, in Figure.plot, S is aliased to style. To access the argument of style, we have to use kwargs.get("S").
  Another example is that region=[10, 20, 30, 40] is converted to kwargs["R"]="10/20/30/40". If we want to get the region bounds in the wrapper, we have to do the inverse conversion: w, e, s, n = kwargs["R"].split("/").
- Difficult to implement Pythonic high-level wrappers.
  Due to the design of the GMT modules, each GMT module usually does too many things. For example, basemap/coast provide exactly the same option for adding scale bar, direction rose, and magnetic rose. In #2831, we proposed to provide high-level wrappers that do a single job. These high-level wrappers should have a Pythonic interface with many long-form parameters (see #2831 for the proposed API) but it's unclear how to translate so many parameters into GMT short-form options (we can but it usually means a lot of if-else tests, e.g., #2130).
  Another related issue is #2797 for high-level wrappers of plot and plot3d.
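The dictionary-based mapping mentioned for short arguments could look roughly like the sketch below. It assumes the GMT coastline resolution codes f/h/i/l/c; the names RESOLUTIONS and to_short_form are hypothetical and are not taken from #3012 or #3013.

```python
# Assumed mapping from long-form names to GMT resolution codes.
RESOLUTIONS = {
    "full": "f",
    "high": "h",
    "intermediate": "i",
    "low": "l",
    "crude": "c",
}


def to_short_form(value, mapping):
    """Translate a long-form argument to its GMT code.

    Short codes are passed through unchanged so existing scripts keep
    working; anything else raises an error at the Python level.
    """
    if value in mapping:
        return mapping[value]
    if value in mapping.values():
        return value
    valid = ", ".join(mapping)
    raise ValueError(f"Invalid argument {value!r}. Valid values: {valid}.")


print(to_short_form("full", RESOLUTIONS))  # f
print(to_short_form("f", RESOLUTIONS))     # f
```

Repeating this kind of lookup by hand in every wrapper is the coding effort the list above refers to.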
The new alias system
Here, I propose a new alias system after half a year of design and coding (design takes more time than coding!). The new alias system is implemented in pygmt/alias.py of PR #3238.
The Alias class
The Alias class defines how to convert the argument of a long-form parameter name to a string (or a sequence of strings) that can be passed to the GMT API.
In the example below, we define a parameter offset. Its value can be a number, a string, a sequence, or any object whose string representation (__str__) makes sense to GMT. If a sequence is given, the sequence will be joined into a string by the separator '/'. The prefix +o will also be added at the beginning of the string.
>>> from pygmt.alias import Alias
>>> par = Alias("offset", prefix="+o", separator="/")
>>> par.value = (2.0, 2.0)
>>> par.value
'+o2.0/2.0'
The Alias class has the value property, which is implemented using the setter method. So the argument is converted when Alias.value is assigned.
Here are more examples:
>>> from pygmt.alias import Alias
>>> par = Alias("frame")
>>> par.value = ("xaf", "yaf", "WSen")
>>> par.value
['xaf', 'yaf', 'WSen']
>>> par = Alias("resolution", mapping=True)
>>> par.value = "full"
>>> par.value
'f'
>>> par = Alias("statistic", mapping={"mean": "a", "mad": "d", "rms": "r", "sum": "z"})
>>> par.value = "mean"
>>> par.value
'a'
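To make the setter-based conversion concrete, here is a minimal sketch of how such a class could work. AliasSketch is a hypothetical stand-in written for this illustration; it is not the implementation in PR #3238 and only covers the prefix, separator and dictionary-mapping cases (mapping=True is left out).

```python
class AliasSketch:
    """Illustrative stand-in for the Alias class (not the code in #3238)."""

    def __init__(self, name, prefix="", separator=None, mapping=None):
        self.name = name
        self.prefix = prefix
        self.separator = separator
        self.mapping = mapping
        self._value = None

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, raw):
        # The conversion happens here, at assignment time.
        if raw is None:
            self._value = None
        elif isinstance(self.mapping, dict):
            # Translate a long-form argument; unknown values pass through.
            self._value = f"{self.prefix}{self.mapping.get(raw, raw)}"
        elif isinstance(raw, (list, tuple)):
            items = [str(item) for item in raw]
            if self.separator is None:
                # No separator: keep a list, e.g. for a repeated option.
                self._value = items
            else:
                self._value = self.prefix + self.separator.join(items)
        else:
            self._value = f"{self.prefix}{raw}"


offset = AliasSketch("offset", prefix="+o", separator="/")
offset.value = (2.0, 2.0)
print(offset.value)  # +o2.0/2.0

statistic = AliasSketch("statistic", mapping={"mean": "a", "sum": "z"})
statistic.value = "mean"
print(statistic.value)  # a
```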
The AliasSystem class
The AliasSystem class is similar to the old use_alias decorator: it aliases GMT single-letter options to an Alias object or a list of Alias objects.
Here is an example:
>>> def func(par0, par1=None, par2=None, par3=None, par4=None, frame=False, panel=None, **kwargs):
...     alias = AliasSystem(
...         A=[
...             Alias("par1"),
...             Alias("par2", prefix="+j"),
...             Alias("par3", prefix="+o", separator="/"),
...         ],
...         B=Alias("frame"),
...         c=Alias("panel", separator=","),
...     )
...     return build_arg_list(alias.kwdict)
...
>>> func("infile", par1="mytext", par3=(12, 12), frame=True, panel=(1, 2), J="X10c/10c")
['-Amytext+o12/12', '-B', '-JX10c/10c', '-c1,2']
In this example, A is mapped to a list of Alias objects. So, arguments of par1/par2/par3 will be used to build the -A option (e.g., par1="mytext", par3=(12, 12) is converted to kwdict["A"]="mytext+o12/12"). It means now we can break any complicated GMT option into multiple long-form parameters.
The AliasSystem class provides the property kwdict, which is a dictionary with single-letter options as keys and string/sequence as values. It can be passed directly to the build_arg_list function. The kwdict dictionary is dynamically calculated from the current values of long-form parameters. In this way, we can always access the original values of parameters by long-form parameter names and even make changes to them before accessing the alias.kwdict property.
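The "dynamically calculated" behaviour can be illustrated with a small sketch. It is an assumption-laden simplification: AliasSystemSketch and convert are hypothetical names, the parameter values are passed in explicitly as a dictionary, and each alias is written as a (name, prefix, separator) tuple instead of an Alias object.

```python
def convert(value, prefix="", separator="/"):
    """Convert one argument to a GMT string fragment, or None if unset."""
    if value is None or value is False:
        return None
    if isinstance(value, (list, tuple)):
        value = separator.join(str(item) for item in value)
    return f"{prefix}{value}"


class AliasSystemSketch:
    """Illustrative stand-in for AliasSystem (not the code in #3238)."""

    def __init__(self, params, **options):
        self.params = params    # long-form name -> current value
        self.options = options  # option letter -> list of alias tuples

    @property
    def kwdict(self):
        # Recomputed on every access, so later changes to params show up.
        result = {}
        for letter, specs in self.options.items():
            parts = [
                convert(self.params.get(name), prefix, separator)
                for name, prefix, separator in specs
            ]
            parts = [part for part in parts if part is not None]
            if parts:
                result[letter] = "".join(parts)
        return result


params = {"par1": "mytext", "par2": None, "par3": (12, 12)}
system = AliasSystemSketch(
    params,
    A=[("par1", "", "/"), ("par2", "+j", "/"), ("par3", "+o", "/")],
)
before = system.kwdict
print(before)  # {'A': 'mytext+o12/12'}

# The original values stay accessible and can still be changed.
params["par2"] = "TL"
after = system.kwdict
print(after)  # {'A': 'mytext+jTL+o12/12'}
```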
The BaseParam class for common parameters
As discussed in #1082, for some options, it makes more sense to define a class to avoid having too many (potentially conflicting) parameter names.
With the help of the Alias system, the BaseParam implementation is easy. Users won't use the BaseParam class, but we developers can use it to create new classes in a few lines without much coding effort (so adding new classes can be marked as "good-first-issue"!).
The Box class
In pygmt/params/box.py, I've implemented the Box class as an example. The box parameter is commonly used for plotting scale bar, color bar, gmt logo, images, inset, and more. So it makes sense to have a Box class.
Below is the definition of the Box class. To define a class for a parameter, we just need to define some fields (e.g., clearance/fill), and the special field _aliases, which is a list of Alias objects.
@dataclass(repr=False)
class Box(BaseParam):
    """
    Docstrings.
    """
    clearance: float | str | Sequence[float | str] | None = None
    fill: str | None = None
    innerborder: str | Sequence | None = None
    pen: str | None = None
    radius: float | bool | None = False
    shading: str | Sequence | None = None
    _aliases: ClassVar = [
        Alias("clearance", prefix="+c", separator="/"),
        Alias("fill", prefix="+g"),
        Alias("innerborder", prefix="+i", separator="/"),
        Alias("pen", prefix="+p"),
        Alias("radius", prefix="+r"),
        Alias("shading", prefix="+s", separator="/"),
    ]
Here is an example. Please refer to the docstrings for more examples.
>>> str(Box(clearance=(0.1, 0.2, 0.3, 0.4), pen="blue", radius="10p"))
'+c0.1/0.2/0.3/0.4+pblue+r10p'
It's important to know that the Box class supports autocompletion!
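For readers who want to see how a base class can turn such fields into a GMT string, here is a self-contained sketch. BaseParamSketch and BoxSketch are hypothetical names, the aliases are plain (field, prefix, separator) tuples rather than Alias objects, and the real BaseParam in #3238 may work differently.

```python
from dataclasses import dataclass
from typing import ClassVar, Optional, Sequence, Union


class BaseParamSketch:
    """Illustrative base class: build the GMT string from _aliases."""

    _aliases: ClassVar[list] = []

    def __str__(self):
        parts = []
        for field, prefix, separator in self._aliases:
            value = getattr(self, field)
            if value is None or value is False:
                continue  # unset fields contribute nothing
            if isinstance(value, (list, tuple)):
                value = (separator or "/").join(str(item) for item in value)
            parts.append(f"{prefix}{value}")
        return "".join(parts)


@dataclass(repr=False)
class BoxSketch(BaseParamSketch):
    """A reduced version of the Box class with four fields."""

    clearance: Union[float, str, Sequence, None] = None
    fill: Optional[str] = None
    pen: Optional[str] = None
    radius: Union[float, str, bool, None] = None
    # ClassVar keeps _aliases out of the generated __init__.
    _aliases: ClassVar[list] = [
        ("clearance", "+c", "/"),
        ("fill", "+g", None),
        ("pen", "+p", None),
        ("radius", "+r", None),
    ]


box = BoxSketch(clearance=(0.1, 0.2, 0.3, 0.4), pen="blue", radius="10p")
print(str(box))  # +c0.1/0.2/0.3/0.4+pblue+r10p
```

Because the fields are ordinary dataclass attributes, editors can offer completion for them, and a new parameter class only needs its fields plus the _aliases list.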
The Frame/Axes/Axis classes
The -B option is one of the most complicated GMT options. It can be repeated multiple times on the GMT command line, making it more complicated to support in Python.
In pygmt/params/frame.py, the Frame/Axes/Axis classes are implemented to address one of our oldest issues, #249.
The technical details don't matter much. Here is an example use:
>>> import pygmt
>>> from pygmt.params import Frame, Axes, Axis
>>> fig = pygmt.Figure()
>>> # define a Frame object
>>> frame = Frame(
...     axes=Axes("WSen", title="My Plot Title", fill="lightred"),
...     xaxis=Axis(10, angle=30, label="X axis", unit="km"),
...     yaxis=Axis(20, label="Y axis")
... )
>>> fig.basemap(region=[0, 80, -30, 30], projection="X10c", frame=frame)
>>> fig.show()
Check out PR #3238 and try it yourself! Enjoy autocompletion!
Pros/Cons of the new alias system
Pros:
- The new and old alias systems can co-exist. So we don't have to migrate all wrappers in a single PR.
- Allow building a GMT option argument from multiple PyGMT parameters (More Pythonic)
- No abuse of kwargs anymore
- Define new parameter classes in a simple way
- Access the original argument by parameter name, not by dict lookup like kwargs.get("S") (Maybe faster)
- Autocompletion for parameter classes like Box/Frame
- Autocompletion of all function parameters after #2896.
- Autocompletion for long-form arguments if we add type hints.
Cons:
- Big refactors may introduce new bugs. [We can always fix them if any.]
- The placeholder {aliases} in docstrings is not supported in the new alias system. [The list of aliases is not needed if we write good documentation.]
This is another big refactor towards a Pythonic interface! Ping @GenericMappingTools/pygmt-maintainers for comments.
Ping @GenericMappingTools/pygmt-maintainers for comments and thoughts.
Thanks @seisman for opening up this for discussion. The Alias class you've implemented in #3238 seems to be meant for internal use (as a replacement for @use_alias), rather than something user-facing? I do like point 5 (Access the original argument by parameter name), which would help with simplifying the makeup of internal functions (especially high level functions in the pipeline), and moving away from @ decorators means users will see a cleaner traceback on errors.
I'll need more time to look into your implementation at #3238. My initial impression is that the implementation of Alias could be done as a first step in one PR, followed by the implementation of the Param class. I'm also wondering if this is a good time to bring in Pydantic to help with some validation logic based on type hints, essentially making syntax errors appear on the Python level rather than the GMT level (though that means PyGMT will need to re-implement a lot of GMT's internal validation logic).
The Alias class you've implemented in #3238 seems to be meant for internal use (as a replacement for @use_alias), rather than something user-facing?
Yes.
I'm also wondering if this is a good time to bring in Pydantic to help with some validation logic based on type hints, essentially making syntax errors appear on the Python level rather than the GMT level
It looks worth a try.