gpac/ComplianceWarden

Auxiliary tracks should have handler `auxv`

leo-barnes opened this issue · 12 comments

Please see this issue: AOMediaCodec/libavif#1120

In short, libavif was using pict as the handler for auxiliary tracks, even though they shall be auxv. (Although it's not explicitly worded as a shall in the specification, that's how we read it to mean.)
I see that the warden outputs warnings about missing auxi and ccst boxes, so the tracks are identified as auxiliary, but I don't see anything about the handler not being auxv.

CW uses the handler auxv to identify these auxiliary tracks.

@cconcolato and @podborski also asked for a way to know if a rule was actually triggered or not. At this point, when the rule is a success (i.e. returns no error), it can either be because it is successful or because it is not relevant. See https://github.com/gpac/ComplianceWarden/tree/exercized_rules - I just put a flag to true when some parts of the code for a rule is reached.

Maybe we could also see it the other way round: I know that a file is supposed to use certain features and I'd like to make sure the adequate signalling is present.

What do you think?

@leo-barnes @rbouqueau what do you folks think of PR #40 which tries to fix this issue?

The PR looks ok (just one cosmetic comment), but where is the text mentioning they shall have auxv ? I think the file could have additional tracks (and currently we only consider file brands).

but where is the text mentioning they shall have auxv ?

Section 7.5.3.1 in ISO IEC

"""
Any number of tracks with handler type 'auxv' may be included in files containing image
sequence tracks.
"""

Eventhough it doesn't say it specifically, that section goes on to talk about rules to which such auxiliary tracks should conform to (like mandatory auxi box, etc).

Like @leo-barnes mentions in the original bug report, it is not explicitly mentioned in the spec, but that is how we read it to mean. Maybe this should be a warning perhaps? Instead of an error?

I think the file could have additional tracks (and currently we only consider file brands).

Yes, the file could have other tracks, this PR only updates the tracks that are tagged as 'auxl' in the 'tref' box. So if there are other non-auxiliary tracks, this PR won't affect them.

@vigneshvg The spec is written in a way that if you have an auxv track then some normative constraints apply, and if you have pict track some other normative constraints apply. It is written this way to allow AVIF files to contain other things. The AVIF reader should just care about the AV1 parts. I think that you want something to say if X is true, there shall be an auxv track. Usually, that is done with brands but IIRC there is no such brand.

@cconcolato
So there isn't anything explicitly saying that an auxiliary track needs to be auxv? I was hoping that there was some text I had simply not found yet that explicitly said auxl track references could only be used with auxv tracks or something like that. Maybe that should be added to the spec in that case.

@leo-barnes From section 12.1.1 in ISOBMFF:

Auxiliary video media uses the 'auxv' handler type in the HandlerBox of the MediaBox, as defined
in 8.4.3.

It says "uses" and not "shall" because that's a statement of fact. The way to tell a track is an auxiliary video track is to look at the handler. From then, you can verify some constraints, e.g. presence of auxi.

It says "uses" and not "shall" because that's a statement of fact. The way to tell a track is an auxiliary video track is to look at the handler. From then, you can verify some constraints, e.g. presence of auxi.

Right. So based on this reply, the PR is doing the correct thing. It uses the handler_type to determine whether a track is auxiliary or not and then enforces the 'presence of auxi' constraint only on those tracks.

@vigneshvg I think the PR would be ok to only process when handlerType == FOURCC("auxv"). But it is not ok to report an error when handlerType != FOURCC("auxv").

However I understand that if a content uses the wrong handler type by mistake, then it would be reported as ok by cw.exe, which is not useful.

@cconcolato et @podborski suggested to add a report of which test are actually exercised (beta branch) ; this could allow to compare which tests were not early-discarded versus a feature list (e.g. identified by their section number in the spec). Opinion?

@cconcolato

It says "uses" and not "shall" because that's a statement of fact. The way to tell a track is an auxiliary video track is to look at the handler. From then, you can verify some constraints, e.g. presence of auxi.

Makes sense. That's how we read it as well, but I started second guessing myself that maybe a track could be considered auxiliary as long as it had an auxl reference.

@rbouqueau

@vigneshvg I think the PR would be ok to only process when handlerType == FOURCC("auxv"). But it is not ok to report an error when handlerType != FOURCC("auxv").

However I understand that if a content uses the wrong handler type by mistake, then it would be reported as ok by cw.exe, which is not useful.

Could we make it report an error if you have an auxl reference with a track that does not use auxv handler? That is not valid if I understand the spec correctly.

The rest of the checks for auxi and ccst can be tied to the track handler.

Could we make it report an error if you have an auxl reference with a track that does not use auxv handler? That is not valid if I understand the spec correctly.

This is exactly what the PR does right now.

The rest of the checks for auxi and ccst can be tied to the track handler.

The PR also follows this behavior.

@cconcolato and @podborski also asked for a way to know if a rule was actually triggered or not. At this point, when the rule is a success (i.e. returns no error), it can either be because it is successful or because it is not relevant. See https://github.com/gpac/ComplianceWarden/tree/exercized_rules - I just put a flag to true when some parts of the code for a rule is reached.

I think there is news here: CW is now able to tell if an input sample exercized the feature of a rule. I think that solves the core debate here. Let me know if you don't think so.

Now the question is: when someone is interested in a specific feature, how should we display it's been exercized? (the implementation is still partial, but I'd like to use the leverage here to come up with something beneficial to the users).