LeoHsiao1/pyexiv2

Xmp LangAlt handling

jim-easterbrook opened this issue · 7 comments

At present, Xmp data of type 'LangAlt' is presented as a single string with the languages embedded, such as lang="x-default" Hello, world, lang="de-DE" Hallo, Welt. This is not very easy to use, and it is easy to get wrong when setting a value. Would it be better to use a dict for this, e.g. {'x-default': 'Hello, world', 'de-DE': 'Hallo, Welt'}?

I'm not familiar with the format of most metadata, so I try to treat everything as a normal string.
You have a lot more experience than I do.
Do all image producers use this LangAlt format? I'm worried that the string won't always convert cleanly to a dict.

The format is set by the Xmp specification, for example Xmp.dc.description has type LangAlt.
https://www.exiv2.org/tags-xmp-dc.html

I think converting everything to a string is oversimplifying the data. Lists of values (such as Iptc.Application2.Keywords) shouldn't be reduced to a single string, as individual values in the list might contain the ", " string you use as a separator. Some other tags, such as Exif.Canon.ModelID, have a value that is just a number, but Exiv2::Metadatum provides methods to convert it to an "interpreted string" such as "EOS Rebel SL1 / 100D / Kiss X7".
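To illustrate the separator problem in plain Python (no pyexiv2 calls involved; the keyword values are made up for the example), joining a list with ", " and splitting it again does not necessarily give back the original list:

```python
# Made-up keyword list; one value happens to contain the separator itself.
keywords = ["Paris, France", "holiday"]

# Flatten to a single string, the way a string-only interface would.
joined = ", ".join(keywords)

# Splitting on the same separator cannot tell the two commas apart.
recovered = joined.split(", ")

print(joined)     # Paris, France, holiday
print(recovered)  # ['Paris', 'France', 'holiday']
```

Two keywords go in and three come out, which is why a list type is safer for multi-valued tags.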

Simply using a string for everything is nice and simple, and for many applications is probably the best solution. A lower level interface would give more control, but would be a lot harder to use.

Well, I'll add a function when reading metadata that tries to convert the metadata from a string type to some other, more convenient Python type. If the conversion fails, the value is returned as a string.
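A minimal sketch of that "try to convert, otherwise keep the string" idea, applied to LangAlt values. This is only an illustration, not the code that ended up in pyexiv2: the name convert_lang_alt and the regular expression are assumptions, and a value that itself contains `, lang="` would still be split incorrectly.

```python
import re

# One lang="..." entry, followed by its text up to the next entry or the end.
_ENTRY = re.compile(r'lang="([^"]*)"\s*(.*?)(?=,\s*lang="|$)', re.DOTALL)


def convert_lang_alt(raw):
    """Hypothetical helper: dict for LangAlt-style strings, else the input unchanged."""
    if not raw.startswith('lang="'):
        return raw  # does not look like LangAlt, so keep the plain string
    entries = _ENTRY.findall(raw)
    if not entries:
        return raw  # conversion failed, fall back to the string
    # Keep the lang="..." form in the keys, as in the dict output shown in this thread.
    return {f'lang="{lang}"': text for lang, text in entries}


print(convert_lang_alt('lang="x-default" Hello, world, lang="de-DE" Hallo, Welt'))
print(convert_lang_alt('just a plain string'))
```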

XMP tags of type LangAlt are now converted to dict. For example:

>>> import pyexiv2
>>> img = pyexiv2.Image(r'./pyexiv2/tests/data/1.jpg')
>>> img.read_xmp()['Xmp.dc.title']
{'lang="x-default"': 'test-中文-', 'lang="de-DE"': 'Hallo, Welt'}

Keys are named like lang="x-default" instead of x-default to highlight their purpose.
This feature will be added to the next release.
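For callers who prefer bare language codes as keys, a small wrapper on the application side could strip the lang="..." part. The function name plain_language_keys is hypothetical and not part of pyexiv2; it only assumes the dict shape shown above.

```python
import re

# Matches a key of the form lang="..." and captures the language code.
_KEY = re.compile(r'^lang="(.*)"$')


def plain_language_keys(lang_alt):
    """Hypothetical helper: {'lang="de-DE"': text} -> {'de-DE': text}."""
    result = {}
    for key, text in lang_alt.items():
        match = _KEY.match(key)
        # Keys that do not follow the lang="..." pattern are kept unchanged.
        result[match.group(1) if match else key] = text
    return result


title = {'lang="x-default"': 'test-中文-', 'lang="de-DE"': 'Hallo, Welt'}
print(plain_language_keys(title))
```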

I like that. If you set a LangAlt tag from a plain string (instead of a dict), is the lang="x-default" part added automatically?

pyexiv2 does not set a default language, but exiv2 seems to do so:

>>> img.modify_xmp({'Xmp.dc.title': ''})
>>> img.read_xmp()['Xmp.dc.title']
{'lang="x-default"': ''}
>>> img.modify_xmp({'Xmp.dc.title': 'Hello'})
>>> img.read_xmp()['Xmp.dc.title']
{'lang="x-default"': 'Hello'}

I have released v2.7.0 to GitHub and pypi.org.