Breaking change in how text strings are treated
Closed this issue ยท 8 comments
Description of the problem
I have a code that makes use of pygmt that works fine. After an upgrade to 0.12/0.13 however, many of my titles and axis labels have quotes around them. This is because I am specifying gmt parameters such as the title like '+t"This is my title"'
. This used to work in all previous versions.
It is not clear if this is a bug, or if this is a breaking change. If pygmt changed how they parse strings for gmt options, that is fine, but I would like to make sure before changing my code.
Minimal Complete Verifiable Example
fig = pygmt.Figure()
title = 'my title'
fig.basemap(
region=[0, 10, -1.5, 1.5],
projection="X10c",
frame=['WSne+t"this is {:}"'.format(title), "xa2f1", "ya1f0.1"],
)
fig.show()
Full error message
No response
System information
PyGMT information:
version: v0.13.0
System information:
python: 3.12.5 | packaged by conda-forge | (main, Aug 8 2024, 18:36:51) [GCC 12.4.0]
executable: /home/mrak/miniforge3/envs/py312/bin/python
machine: Linux-6.10.10-200.fc40.x86_64-x86_64-with-glibc2.39
Dependency information:
numpy: 2.1.0
pandas: 2.2.2
xarray: 2024.7.0
netCDF4: 1.7.1
packaging: 24.1
contextily: None
geopandas: None
IPython: 8.26.0
rioxarray: None
gdal: 3.9.2
ghostscript: 10.03.1
GMT library information:
version: 6.5.0
padding: 2
share dir: /home/mrak/miniforge3/envs/py312/share/gmt
plugin dir: /home/mrak/miniforge3/envs/py312/lib/gmt/plugins
library path: /home/mrak/miniforge3/envs/py312/lib/libgmt.so
cores: 20
grid layout: rows
image layout:
binary version: 6.5.0
Hm. You are using both single and double quotation marks ('+t"This is my title"'
) because your title string contains white spaces, correct? This was fixed some time ago and is not needed any more (#247, #655, #2298).
I can confirm that the second quotation marks are now included in the text string. But I do not think that this was comunicated as a breacking change; wondering if it is related to PR #3132.
For me,
import pygmt
size = 1
title = "my title"
fig = pygmt.Figure()
fig.basemap(region=[-size, size] * 2, projection="X5c/1c", frame=["WSne+tthis is {:}".format(title)])
fig.show()
# OR
fig = pygmt.Figure()
fig.basemap(region=[-size, size] * 2, projection="X5c/1c", frame=[f"WSne+tthis is {title}"])
fig.show()
Yes, I originally did this because of whitespace, but also because adding a "+" in the title string might confuse the parser.
Also, is ignoring the quotes backword compatible with previous versions of pygmt?
I can confirm that the second quotation marks are now included in the text string. But I do not think that this was comunicated as a breacking change; wondering if it is related to PR #3132.
Yes, I can confirm that this breaking change was introduced in PR #3132 and we didn't realize this breaking change until now.
As originally shown in issue #247, there are four possible cases when we want to specify an argument string with whitespaces. Below is a summary table for the support of different cases in different PyGMT versions. In this table:
- Y means there are no quotation marks in the figure title [What we are expecting]
- N means there are crashes/errors or there are quotation marks in the figure title [Unexpected]
Case | Before v0.6.0 | v0.6.0 - v0.11.0 | After v0.12.0 |
---|---|---|---|
1. frame="WSen+tThis is my Title" |
N | Y | Y |
2. frame='WSen+tThis is my Title' |
N | Y | Y |
3. frame="WSen+t'This is my title'" |
N | N | N |
4. frame='WSen+t"This is my title"' |
Y | Y | N |
Here are the facts that we should know:
- In Python, single- and double quotations are the same. E.g.,
"Title"
and'title'
are the same. - In some other languages like C and Bash, single- and double quotation marks are not the same.
- In GMT, when it tries to parse the CLI string (
-JX10c -R0/10/0/10 -BWSen+t"This is my Title"
) into multiple options, it handles double quotation marks differently https://github.com/GenericMappingTools/gmt/blob/434ed1848a1e8886931694b2de4c352e4c6cddae/src/gmt_parse.c#L600.
Since single- and double-quotations are the same in Python, it makes sense to expect that cases 1 and 2 work in the same way and cases 3 and 4 also work in the same way. Prior to PyGMT v0.6.0, only case 4 works, as reported in #247 and other similar issues. Then we introduced some hacky workarounds in #1487 (first appeared in v0.6.0) and some subsequent PRs to make cases 1/2/4 work.
In PR #3132 (first appeared in v0.12.0), we realized that all the hacky workarounds in #1487 are not needed as long as we change the way we pass the CLI arguments to GMT API. The new way (passing an argument list) is also more robust than the old way (passing an argument string), but as this issue has reported, it breaks case 4.
As said above, cases 3 and 4 should work in the same way in Python, so I think this breaking change is breaking things in a good way.
Yes, I originally did this because of whitespace, but also because adding a "+" in the title string might confuse the parser.
Yes, +
in title causes troubles since GMT may parse it as an option modifier. Currently, you can use octal codes like frame='WSen+tThis is my title\\053g'
(two backslashs). We may have better solutions after #3238.
Also, is ignoring the quotes backword compatible with previous versions of pygmt?
Yes, as shown in the above table, case 1/2 work since PyGMT v0.6.0.
Thanks. I just modified my code to pass the title (and other arguments) without quotes and with white space. As this approach works back to 0.6, we can probably just close this. Nevertheless, I wonder if this should be mentioned somewhere in the docs. I think that most people would be hesitant to use a title with whitespace and without quotes.
The real solution is probably to move away from the idiosyncratic gmt strings, but that is another issue!
Nevertheless, I wonder if this should be mentioned somewhere in the docs.
Yes, I think we should add an entry to the v0.12.0 changelog and also mention it in the upcoming v0.14.0 release.