r1chardj0n3s/parse

parse.parse("{c:02d}","100") is None

martinResearch opened this issue · 3 comments

"{c:02d}".format(c=100) gives "100", so I would expect parse.parse("{c:02d}", "100") to give <Result () {'c': 100}>,
since parse aims to be the inverse of format. At the moment it gives None.

I don't think this is a desired feature, since the same argument could be made for the following example:

>>> "{c:02d}".format(c=100000)
'100000'

This would remove the ability to specify the number of expected digits. If you expect an integer of unspecified length, you should probably just use parse.parse("{c:d}", "100").

I am currently using "image_{c:02d}.png" when creating images and would like to be able to use the same string when parsing image file names, to avoid having to keep two strings ("image_{c:02d}.png" and "image_{c:d}.png") in sync in my code. Also, the format string might be provided by the user of my code. In that case I have no direct control over the format string, so I would need to write some code that automatically removes any digits between ":" and "d" in the format before passing it to parse.
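For reference, the workaround described above could look roughly like the sketch below. It uses only the standard library. The helper name strip_int_width is made up for illustration and is not part of parse, and it only handles plain integer specs such as "02d" or "3d" (an assumption; other spec forms are left untouched).

```python
import re
from string import Formatter

# Matches integer specs that carry a width, e.g. "02d" or "3d" (assumed scope).
_INT_WIDTH_SPEC = re.compile(r"^0?\d+d$")


def strip_int_width(fmt: str) -> str:
    """Return fmt with the width removed from every integer ('d') field.

    Hypothetical helper: "image_{c:02d}.png" becomes "image_{c:d}.png".
    """
    parts = []
    for literal, field, spec, conv in Formatter().parse(fmt):
        # Literal text comes back unescaped, so re-escape any braces.
        parts.append(literal.replace("{", "{{").replace("}", "}}"))
        if field is None:
            continue  # trailing literal text with no replacement field
        if spec and _INT_WIDTH_SPEC.match(spec):
            spec = "d"
        conv_part = f"!{conv}" if conv else ""
        spec_part = f":{spec}" if spec else ""
        parts.append("{" + field + conv_part + spec_part + "}")
    return "".join(parts)


relaxed = strip_int_width("image_{c:02d}.png")
```

The relaxed string could then be passed to parse.parse while the original string is still used for format, so only one string has to be maintained by hand.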

In general, it seems to me that aiming for parse to be the "inverse" of format, i.e. to have parse.parse(f, f.format(**d)).named == d
and f.format(**parse.parse(f, s).named) == s, is a good thing and is the behavior I would expect from parse. If that is not the case, I believe it becomes less clear what the expected behavior of parse is in general.

The first equality does not hold for f="image_{c:02d}.png" and d={"c": 100}.

The second equality does not hold for f="image_{c:02d}.png" and s="image_9.png".
I would want parse.parse("image_{c:02d}.png", "image_9.png") to give me no match, because "image_{c:02d}.png".format(c=9) gives image_09.png and not image_9.png, while at the moment it gives me <Result () {'c': 9}>.
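Until parse behaves that way, the second equality can be enforced as a check after parsing. The sketch below is only an illustration: round_trips is a made-up name, and the named argument stands in for whatever parse.parse(f, s).named returned (here the dicts are written out by hand so the snippet runs without parse installed).

```python
def round_trips(fmt: str, s: str, named: dict) -> bool:
    """Return True if formatting the parsed values reproduces s exactly.

    `named` is assumed to be the dict of named fields obtained from parsing
    s with fmt, e.g. parse.parse(fmt, s).named.
    """
    return fmt.format(**named) == s


fmt = "image_{c:02d}.png"

# "image_9.png" currently parses to {'c': 9}, but formatting 9 gives "image_09.png".
unpadded_ok = round_trips(fmt, "image_9.png", {"c": 9})

# A correctly padded name survives the round trip.
padded_ok = round_trips(fmt, "image_09.png", {"c": 9})
```

A caller could treat a result that fails this check as "no match", which rejects image_9.png while still accepting image_09.png.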

I like your second point. If the programmer explicitly states that one-digit numbers should be padded with zeros (i.e. 02d format) the string "9" should not match.

However, the first inconsistency you point out seems to stem from Python's format function, and I don't see a way to solve it: parse cannot resolve it without losing the ability to specify the expected padding of a number altogether.

But honestly, I think that if you expect three-digit numbers, you should specify that in your parse string. This is also a lot less surprising for readers of your code. Basically, my argument from above applies here, too: if you as the programmer go out of your way to specify a number that is padded to a width of 2, it should only match strings that contain numbers of that width.

May I ask why you use the string "image_{c:02d}.png" if c can get > 99?