WGBH-MLA/AAPB2

BUG: Assets with Digital Instantiation of media type "Sound" and Physical Instantiation of media type "Moving Image" use <video> tag instead of <audio>

afred opened this issue · 1 comments

Reproduce

  1. Log into the VPN to simulate presence in the GBH office for access to on-premises records.
  2. Go to https://americanarchive.org/catalog/cpb-aacip-338cf0f565b
  3. View source of the page

Expected behavior

  1. media plays
  2. HTML rendered using <audio> tag, and not a <video> tag

Actual behavior

  1. media does not play
  2. HTML is rendered using <video> tag instead of <audio> tag

Background

The logic that determines the media type will return "Moving Image" before "Sound" if both are found among the media types of the Asset's instantiations. However, it does not filter out the Physical instantiations, so a "Moving Image" Physical Instantiation takes precedence over a "Sound" Digital Instantiation.

It's not that common for an Asset to have both a "Moving Image" instantiation and a "Sound" instantiation. However, there are some cases where there is a "Moving Image" Physical Instantiation and a "Sound" Digital Instantiation. In this case, we want to render an <audio> element to play the sound. The Physical Instantiation in the example record of this ticket (and presumably others) came in from the original metadata submitted by the station.

I believe the right way to address this is to narrow down to just the pbcoreInstantiation elements that include instantiationDigital as a sub-element. In the (very few) cases where there are multiple digital instantiations but they do not have the same media type, it's probably(?) safe to assume that the media type is "Moving Image", if that's an option, else "Sound", if that's an option.

Here's how I'm doing it in Python:

import xml.etree.ElementTree as ET

tree = ET.parse(xml_filepath)  # xml_filepath: path to the Asset's PBCore XML file
root = tree.getroot()
ns = {"pbcore": "http://www.pbcore.org/PBCore/PBCoreNamespace.html"}

insts = root.findall(".//pbcore:pbcoreInstantiation",ns)
dig_mts = []  # create list of media types for digital instantiations
for inst in insts:
    if inst.find("pbcore:instantiationDigital",ns) is not None:
        mte = inst.find("pbcore:instantiationMediaType",ns)
        if mte is not None:
            dig_mts.append(mte.text)

if len(dig_mts) == 0:
    media_type = ''
elif 'Moving Image' in dig_mts:
    media_type = 'Moving Image'
elif 'Sound' in dig_mts:
    media_type = 'Sound'
else:
    media_type = dig_mts[0]
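
To check the approach against the case described in this ticket, the same logic can be wrapped in a function and run on a small inline document. This is only a sketch: the function name `digital_media_type` and the sample XML are made up for illustration (the sample is not the actual record), and it assumes digital instantiations are always marked by an instantiationDigital sub-element.

```python
import xml.etree.ElementTree as ET

PBCORE_NS = {"pbcore": "http://www.pbcore.org/PBCore/PBCoreNamespace.html"}


def digital_media_type(pbcore_xml: str) -> str:
    """Return the media type based on digital instantiations only.

    Hypothetical helper: prefers "Moving Image", then "Sound", then the
    first media type found; returns '' if there are no digital instantiations.
    """
    root = ET.fromstring(pbcore_xml)
    digital_types = []
    for inst in root.findall(".//pbcore:pbcoreInstantiation", PBCORE_NS):
        # Skip physical instantiations (no instantiationDigital sub-element).
        if inst.find("pbcore:instantiationDigital", PBCORE_NS) is None:
            continue
        mte = inst.find("pbcore:instantiationMediaType", PBCORE_NS)
        if mte is not None:
            text = (mte.text or "").strip()
            if text:
                digital_types.append(text)
    for preferred in ("Moving Image", "Sound"):
        if preferred in digital_types:
            return preferred
    return digital_types[0] if digital_types else ""


# Made-up sample: a physical "Moving Image" and a digital "Sound" instantiation.
SAMPLE = """<pbcoreDescriptionDocument xmlns="http://www.pbcore.org/PBCore/PBCoreNamespace.html">
  <pbcoreInstantiation>
    <instantiationPhysical>1 inch videotape</instantiationPhysical>
    <instantiationMediaType>Moving Image</instantiationMediaType>
  </pbcoreInstantiation>
  <pbcoreInstantiation>
    <instantiationDigital>audio/mpeg</instantiationDigital>
    <instantiationMediaType>Sound</instantiationMediaType>
  </pbcoreInstantiation>
</pbcoreDescriptionDocument>"""

print(digital_media_type(SAMPLE))  # prints "Sound"
```

With the physical instantiation filtered out, the sample resolves to "Sound", which is the case where the page should render an <audio> element.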