Produce values derived from one or more tags
There are many cases where answering a question about an image may involve reading multiple different tags, possibly from different directories.
Dealing with redundancy
Examples:
- image width (equally height) may be obtained from the JpegDirectory and ExifIFD0Directory
- There are often multiple ways to obtain exposure time
- XMP duplicates a lot of existing tags
Devise a strategy that sits on top of the directories and tags for extracting certain commonly used values according to well tested heuristics. One challenge here is that tags may not agree and it may be unclear which to trust.
(Migrated from Google Code issue 26)
Grouping values
Sometimes multiple tags should be combined to produce one logical 'value':
- GPS lat / lng
- Date & time values (i.e. in IPTC data)
- Aspect ratio (#494)
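A minimal sketch of the GPS case above, assuming latitude and longitude each arrive as a degrees/minutes/seconds triple plus a reference letter (as Exif stores them). `GeoPoint` and `GpsGrouping` are hypothetical names for illustration, not existing library classes:

```java
import java.util.Optional;

/** Hypothetical value type for a combined position; not a library class. */
final class GeoPoint {
    final double latitude;
    final double longitude;

    GeoPoint(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }
}

final class GpsGrouping {
    /** Converts degrees/minutes/seconds plus a reference letter to signed decimal degrees. */
    static double toDecimal(double[] dms, String ref) {
        double value = dms[0] + dms[1] / 60.0 + dms[2] / 3600.0;
        // South and West are expressed as negative decimal degrees.
        boolean negative = "S".equalsIgnoreCase(ref) || "W".equalsIgnoreCase(ref);
        return negative ? -value : value;
    }

    /** Yields a point only when all four parts are present; a lone latitude is not a position. */
    static Optional<GeoPoint> combine(double[] lat, String latRef, double[] lon, String lonRef) {
        if (lat == null || lon == null || latRef == null || lonRef == null
                || lat.length != 3 || lon.length != 3) {
            return Optional.empty();
        }
        return Optional.of(new GeoPoint(toDecimal(lat, latRef), toDecimal(lon, lonRef)));
    }

    public static void main(String[] args) {
        combine(new double[] {51, 30, 0}, "N", new double[] {0, 7, 30}, "W")
                .ifPresent(p -> System.out.println(p.latitude + ", " + p.longitude));
    }
}
```

Returning an empty `Optional` when any part is missing keeps callers from treating half a coordinate as a usable value.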
I think we could either
A) Rank the different formats/directories, run through that order trying to find a matching directory and return the first value we find. Example: Order (XMP, Exif, IPTC), read meta-block, no matches for XMP, read meta-block again (better to read once and remember the directories), 2 matches for Exif, look in first match for creation date, found, return that.
B) Just return the first matching tag of whatever directory we find.
C) Collect matching tags from all directories we find and return the "best" (e. g. for creation date this might be just the oldest).
Another question is whether the output should somehow be canonicalized. E. g. Exif DateTime might return yyyy:mm:dd hh:ii:ss whereas IPTC Date Created might return yyyymmdd without a time (since time is stored in Time Created) and other formats might not even provide a time-fragment (date only).
I'd say we start w/ the creation date/time and just try to figure out a smart solution. Option B) seems too simple to me, option C) too complex/heavy (must always read full metadata).
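To make the format question concrete, here is a rough sketch that maps both shapes onto ISO-8601 strings using `java.time`. It assumes the Exif value looks like `2019:03:04 05:06:07`, the IPTC date like `20190304`, and that the first six characters of the IPTC time are `HHmmss` (any trailing offset is ignored here). `DateCanonicalizer` is a made-up name, not an existing class:

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

final class DateCanonicalizer {
    private static final DateTimeFormatter EXIF =
            DateTimeFormatter.ofPattern("uuuu:MM:dd HH:mm:ss");
    private static final DateTimeFormatter IPTC_TIME =
            DateTimeFormatter.ofPattern("HHmmss");

    /** Exif-style "yyyy:MM:dd HH:mm:ss" to ISO-8601 local date-time. */
    static String fromExif(String dateTime) {
        return LocalDateTime.parse(dateTime, EXIF)
                .format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    /** IPTC-style date plus optional time; stays date-only when no time is given. */
    static String fromIptc(String date, String time) {
        LocalDate d = LocalDate.parse(date, DateTimeFormatter.BASIC_ISO_DATE);
        if (time == null || time.length() < 6) {
            // Do not invent a time of day: keep the result date-only.
            return d.format(DateTimeFormatter.ISO_LOCAL_DATE);
        }
        // Assumption: only the leading HHmmss is used; a trailing offset is dropped.
        LocalTime t = LocalTime.parse(time.substring(0, 6), IPTC_TIME);
        return LocalDateTime.of(d, t).format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    public static void main(String[] args) {
        System.out.println(fromExif("2019:03:04 05:06:07"));
        System.out.println(fromIptc("20190304", null));
    }
}
```

Keeping date-only input as a date-only result avoids pretending that a missing time means midnight.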
I think option A is simplest, best and most transparent: walk through a
list of tags in priority order until one is present, then return that.
The open questions are what kinds of values to do this for, what tags to
consider for each, and how to surface this in the API.
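A sketch of the priority walk described above, to show how little machinery option A needs. Metadata is modelled here as a plain map of directory name to tag name to value, which is a stand-in for illustration and not the library's data model; `TagRef`, `PriorityLookup`, and the directory and tag names used are hypothetical:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical pointer to one tag in one directory. */
final class TagRef {
    final String directory;
    final String tag;

    TagRef(String directory, String tag) {
        this.directory = directory;
        this.tag = tag;
    }
}

final class PriorityLookup {
    /** Walks the candidates in order and returns the first non-blank value found. */
    static Optional<String> firstPresent(Map<String, Map<String, String>> metadata,
                                         List<TagRef> priority) {
        for (TagRef ref : priority) {
            Map<String, String> directory = metadata.get(ref.directory);
            if (directory == null) {
                continue; // directory absent from this file
            }
            String value = directory.get(ref.tag);
            if (value != null && !value.trim().isEmpty()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> exif = new HashMap<>();
        exif.put("DateTimeOriginal", "2019:03:04 05:06:07");
        Map<String, Map<String, String>> metadata = new HashMap<>();
        metadata.put("Exif", exif);

        List<TagRef> creationDate = Arrays.asList(
                new TagRef("XMP", "CreateDate"),
                new TagRef("Exif", "DateTimeOriginal"),
                new TagRef("IPTC", "DateCreated"));
        System.out.println(firstPresent(metadata, creationDate).orElse("(none)"));
    }
}
```

The metadata is read once and the priority list is only data, so the remaining open questions (which values, which tags, what API) do not change this loop.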
Here would be my recommendation for fields (sorry caps, snagged from DB keys):
TIMESTAMP (prefer unix time to avoid locale)
MODEL
APERTURE
EXPOSURE
FLASH
FOCAL_LENGTH
ISO
WHITE_BALANCE
HEIGHT
WIDTH
LATITUDE
LONGITUDE
ALTITUDE
ORIENTATION (or rotation if we settle on a standard)
MAKE
THUMB_HEIGHT
THUMB_WIDTH
LENS_MODEL
DRIVE_MODE
EXPOSURE_MODE
EXPOSURE_PROGRAM
//XMP, but these are considered the most important by many
RATING
SUBJECT
LABEL
The main question is how to wrap these tag preferences. Since this proposal is more of a maker cheat sheet that covers multiple directories, it would make sense to have maker-centric classes process an entire metadata set. For example, in my experience on Sony devices I prefer the Exif lens model (usually populated) to the maker note one (which is a mess); in almost any other case you'd go to the maker note. It'd be best to post-process existing metadata within this "wrapper". This should minimize impact to the existing project as well.
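As a rough illustration of the maker-centric idea, the lens model preference could be expressed as a per-make directory order. Everything here is hypothetical: the class name, the "Exif"/"MakerNote" keys, and matching the make by a "SONY" prefix are assumptions for the sketch only:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class LensModelPreference {
    /** Sony: Exif first, then maker note. Everyone else: maker note first. */
    static List<String> directoryOrder(String make) {
        boolean sony = make != null
                && make.trim().toUpperCase(Locale.ROOT).startsWith("SONY");
        return sony
                ? Arrays.asList("Exif", "MakerNote")
                : Arrays.asList("MakerNote", "Exif");
    }

    /** Picks the lens model from already-extracted values, keyed by directory name. */
    static Optional<String> lensModel(String make, Map<String, String> lensModelByDirectory) {
        for (String directory : directoryOrder(make)) {
            String value = lensModelByDirectory.get(directory);
            if (value != null && !value.trim().isEmpty()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> found = new HashMap<>();
        found.put("Exif", "lens name from Exif");
        found.put("MakerNote", "lens name from maker note");
        System.out.println(lensModel("SONY", found).orElse("(none)"));
        System.out.println(lensModel("Canon", found).orElse("(none)"));
    }
}
```

Because this only post-processes values that were already extracted, it would sit on top of the existing directories without changing them.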
Maybe IPTC<>XMP would be a good candidate as the mappings and reconciliation practices seem to be well described? This schema is also widely used by picture agencies, editing applications and camera makers.
This would also help to avoid writing files that contain non-reconciled/conflicting metadata with drewnoakes/metadata-extractor-dotnet#65 (when only the XMP field is modified but not its legacy sibling(s)).
Different concepts. This is about taking the general concept of those fields I mentioned (or more) and automatically pulling a preferable tag from any of the various fields that can exist in an image that might best represent that field. It's merely a convenience for the undoubtedly hundreds of replicas of "pull tag x from directory y" that everyone has for fields they're looking up.
You’re right, apologies, @rcketscientist. Your list seems much better suited to what you’re describing, then, as these are all (with the exception of width and height?) properties closer to the creation/acquisition phase (makernote, EXIF) than to the editing/manipulation phase, like IPTC. They are also likely to be more “correct” in their original form, whereas IPTC data may be more correct in the higher-level XMP form (e.g. containing the full Unicode Description rather than the truncated ASCII version in the legacy IPTC field).
Additional properties off the top of my head:
COMMENT
COPYRIGHT
AUTHOR
IMAGE_COUNT (for icons, multi-page TIFF, animated GIF)
I believe some cameras will insert author, maybe copyright. But typically these (other than image count) are workflow meta additions, right? So these would differ slightly from the others that might be maker or exif, etc.? Not arguing against, I'm just not familiar with these tags.
I wonder if this could be done with some kind of 'script' engine instead of code? That would keep it open to change or override by end users. It could be something developed just for this, or off-the-shelf but I don't have any concrete suggestions.
That said, I kind of hope this project overall heads in a more scripted direction for processing tags. Explicitly coding tag processing certainly has performance advantages, but the maintenance bar is very high. @drewnoakes has alluded in passing to scripts before in other threads (I think, or my kids have crashed my brain's hard drive). Maybe this gets it off the ground?
How do you envision scripting helping? At some point there still needs to be a map to where Random Joe Inc. wants to put their proprietary data. I'm not sure how it would work, but my scripting experience also consists of more forgotten python than I still know.
As @kwhopper says, we've spoken before of a new API that uses a more suitable data model internally.
There's a branch on the .NET implementation that sketches out some (non-compilable) API ideas, and we're tracking it in this pull request:
drewnoakes/metadata-extractor-dotnet#90
Feel free to chime in there for general ideas. There are a fair few ideas posted in the PR.
We'll keep this issue open to track this specific feature.
An example where a user (of the .NET library) looks in Exif and PNG data to get the date. This will miss cases in these file formats, and doesn't support other file formats, so is a good example of how adding this capability would be generally useful.
JPEG DNL segment is a source of image height.
Image height and width may be affected by image orientation. Such a handler could take this into account (per drewnoakes/metadata-extractor-images#26).
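A small sketch of the orientation point, assuming the standard Exif orientation values 1 to 8, where 5 through 8 involve a 90 degree rotation or transposition and therefore swap the stored dimensions. `OrientedSize` is a hypothetical helper name:

```java
final class OrientedSize {
    /** Returns {width, height} as displayed, given the stored size and Exif orientation. */
    static int[] displaySize(int storedWidth, int storedHeight, int orientation) {
        // Orientations 5..8 transpose the image, so width and height trade places.
        boolean swap = orientation >= 5 && orientation <= 8;
        return swap
                ? new int[] {storedHeight, storedWidth}
                : new int[] {storedWidth, storedHeight};
    }

    public static void main(String[] args) {
        int[] size = displaySize(4000, 3000, 6);
        System.out.println(size[0] + " x " + size[1]);
    }
}
```

Unknown or missing orientation values fall through to the stored size here, which seems the least surprising default.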
"Title" is another example for this feature, per #474.