beardypig/pymp4

Update to construct 2.9

Opened this issue · 4 comments

I'm trying to package a project that use pymp4, but find it challenging as pymp4 is based on construct 2.8, while construct 2.9 is now the default version in repositories..

There are a number of breaking changes, and I'm aware that the project hasn't received commits in a year, but it would be nice to be able to package pymp4 against recent version of its dependencies.

I was considering moving this to Kaitai Struct, but I will see what it will require to move to Construct 2.9.

Thanks for the quick answer !
I have no opinion on such a choice (I have no experience on this project), I was merely suggesting upgrading (or switching) dependencies in order to ease its installing.

Feel free to close this issue if Kaitai Struct is a stronger suit for this :)

Any progress? Tests fail with contruct-2.10

I just ran into this. Debian has construct 2.10 since April 2010 so it is in stable, testing and unstable. Main reason for breakage seems to be that all the string definitions have changed and 'Embedded' has gone away. https://construct.readthedocs.io/en/latest/transition29.html and https://construct.readthedocs.io/en/latest/transition210.html
Despite knowing nothing about construct or MP4 formats, nor much about python, I had a go at updating things.
I think I fixed the strings, assuming that the explicit padding options (no longer supported) didn't actually doing anything too special, but I failed to fix all the Embedded stuff, mostly due to a lack of docs for 2.8 so I'm not quite sure what the now-gone options actually do. It looks like there may be some changes in conditions stuff too.
It does seem that this is a rather unstable library, as I tried going back to build with older versions just to get pymp4 going and it still fails half the tests with 2.8.16 (and the BoundBytesIO feature seems to be missing there).

Even using 2.8.8 itself as specified by pymp4 I still get some test failures:
est_ftyp_build (tests.test_box.BoxTests) ... ERROR
test_mdhd_build (tests.test_box.BoxTests) ... ERROR
test_mdhd_parse (tests.test_box.BoxTests) ... ERROR
test_moov_build (tests.test_box.BoxTests) ... ERROR

ERROR: test_ftyp_build (tests.test_box.BoxTests)
ERROR: test_mdhd_build (tests.test_box.BoxTests)
ERROR: test_moov_build (tests.test_box.BoxTests)
...

  File "/usr/lib/python3/dist-packages/construct/core.py", line 1000, in _build
    if not isinstance(obj, collections.Sequence):
AttributeError: module 'collections' has no attribute 'Sequence'

and:
ERROR: test_mdhd_parse (tests.test_box.BoxTests)

File "/usr/lib/python3/dist-packages/construct/lib/bitstream.py", line 42, in close
    raise ValueError("closing stream but %d unwritten bytes remain, %d is encoded unit" % (len(self.wbuffer), self.encoderunit))
ValueError: closing stream but 1 unwritten bytes remain, 8 is encoded unit

I'm a bit surprised there are test failures even with the explicitly-required version.
Looks like collections.sequence moved to collections.abc.sequence, and fixing that in construct 2.8.8 gets me a pymp4 that builds and passes tests.

 --- construct-2.8.8.orig/construct/core.py                                                                                                                                   
+++ construct-2.8.8/construct/core.py                                                                                                                                        
@@ -997,7 +997,7 @@ class Range(Subconstruct):                                                                                                                               
         max = self.max(context) if callable(self.max) else self.max                                                                                                         
         if not 0 <= min <= max <= sys.maxsize:                                                                                                                              
             raise RangeError("unsane min %s and max %s" % (min, max))                                                                                                       
-        if not isinstance(obj, collections.Sequence):                                                                                                                       
+        if not isinstance(obj, collections.abc.Sequence):                                                                                                                   
             raise RangeError("expected sequence type, found %s" % type(obj))                                                                                                
         if not min <= len(obj) <= max:                                                                                                                                      
             raise RangeError("expected from %d to %d elements, found %d" % (min, max, len(obj)))  

Not that this is much use in practice. I don't suppose they are taking 2.8.8 (from Oct 2016) bugfixes any more.

I'd be a bit miffed if I used this (very handy) library, but then they made massive incompatible changes every couple of years. Who cares about backwards compatibility eh? Still, it's either move to something else, or update pymp4.

attached is my completely half-arsed effort to update things, which doesn't actually work yet, and definitely isn't right for the embedded stuff, which I've not grokked yet. But maybe it's helpful to someone.
(Hmm, not actually attached as github don't allow '.patch' attachments (WTF?) so pasted.)

Description: Use python3-construct-2.10 instead of 2.8
 Construct 2.10 changed in important ways. Especially the Strings class
 
Last-Update: 2021-12-23

Index: python-pymp4-1.2.0/src/pymp4/parser.py
===================================================================
--- python-pymp4-1.2.0.orig/src/pymp4/parser.py
+++ python-pymp4-1.2.0/src/pymp4/parser.py
@@ -83,22 +83,22 @@ class PrefixedIncludingSize(Subconstruct
 
 FileTypeBox = Struct(
     "type" / Const(b"ftyp"),
-    "major_brand" / String(4),
+    "major_brand" / PaddedString(4,"ascii"),
     "minor_version" / Int32ub,
-    "compatible_brands" / GreedyRange(String(4)),
+    "compatible_brands" / GreedyRange(PaddedString(4,"ascii")),
 )
 
 SegmentTypeBox = Struct(
     "type" / Const(b"styp"),
-    "major_brand" / String(4),
+    "major_brand" / PaddedString(4,"ascii"),
     "minor_version" / Int32ub,
-    "compatible_brands" / GreedyRange(String(4)),
+    "compatible_brands" / GreedyRange(PaddedString(4,"ascii")),
 )
 
 # Catch find boxes
 
 RawBox = Struct(
-    "type" / String(4, padchar=b" ", paddir="right"),
+    "type" / PaddedString(4,"ascii"),
     "data" / Default(GreedyBytes, b"")
 )
 
@@ -149,7 +149,7 @@ TrackHeaderBox = Struct(
     "type" / Const(b"tkhd"),
     "version" / Default(Int8ub, 0),
     "flags" / Default(Int24ub, 1),
-    Embedded(Switch(this.version, {
+    Switch(this.version, {
         1: Struct(
             "creation_time" / Default(Int64ub, 0),
             "modification_time" / Default(Int64ub, 0),
@@ -164,7 +164,7 @@ TrackHeaderBox = Struct(
             Padding(4),
             "duration" / Default(Int32ub, 0),
         ),
-    })),
+    }),
     Padding(8),
     "layer" / Default(Int16sb, 0),
     "alternate_group" / Default(Int16sb, 0),
@@ -175,28 +175,26 @@ TrackHeaderBox = Struct(
     "height" / Default(Int32ub, 0),
 )
 
-HDSSegmentBox = Struct(
-    "type" / Const(b"abst"),
-    "version" / Default(Int8ub, 0),
-    "flags" / Default(Int24ub, 0),
-    "info_version" / Int32ub,
-    EmbeddedBitStruct(
+HDSSegmentBox = BitStruct(
+    "type" / Bytewise(Const(b"abst")),
+    "version" / Bytewise(Default(Int8ub, 0)),
+    "flags" / Bytewise(Default(Int24ub, 0)),
+    "info_version" / Bytewise(Int32ub),
         Padding(1),
         "profile" / Flag,
         "live" / Flag,
         "update" / Flag,
-        Padding(4)
-    ),
-    "time_scale" / Int32ub,
-    "current_media_time" / Int64ub,
-    "smpte_time_code_offset" / Int64ub,
-    "movie_identifier" / CString(),
-    "server_entry_table" / PrefixedArray(Int8ub, CString()),
-    "quality_entry_table" / PrefixedArray(Int8ub, CString()),
-    "drm_data" / CString(),
-    "metadata" / CString(),
-    "segment_run_table" / PrefixedArray(Int8ub, LazyBound(lambda x: Box)),
-    "fragment_run_table" / PrefixedArray(Int8ub, LazyBound(lambda x: Box))
+        Padding(4),
+    "time_scale" / Bytewise(Int32ub),
+    "current_media_time" / Bytewise(Int64ub),
+    "smpte_time_code_offset" / Bytewise(Int64ub),
+    "movie_identifier" / Bytewise(CString()),
+    "server_entry_table" / Bytewise(PrefixedArray(Int8ub, CString())),
+    "quality_entry_table" / Bytewise(PrefixedArray(Int8ub, CString())),
+    "drm_data" / Bytewise(CString()),
+    "metadata" / Bytewise(CString()),
+    "segment_run_table" / Bytewise(PrefixedArray(Int8ub, LazyBound(lambda x: Box))),
+    "fragment_run_table" / Bytewise(PrefixedArray(Int8ub, LazyBound(lambda x: Box)))
 )
 
 HDSSegmentRunBox = Struct(
@@ -244,27 +242,25 @@ class ISO6392TLanguageCode(Adapter):
         return [c - 0x60 for c in bytearray(obj.encode("utf8"))]
 
 
-MediaHeaderBox = Struct(
-    "type" / Const(b"mdhd"),
-    "version" / Default(Int8ub, 0),
-    "flags" / Const(Int24ub, 0),
-    "creation_time" / IfThenElse(this.version == 1, Int64ub, Int32ub),
-    "modification_time" / IfThenElse(this.version == 1, Int64ub, Int32ub),
-    "timescale" / Int32ub,
-    "duration" / IfThenElse(this.version == 1, Int64ub, Int32ub),
-    Embedded(BitStruct(
+MediaHeaderBox = Bitwise(Struct(
+    "type" / Bytewise(Const(b"mdhd")),
+    "version" / Bytewise(Default(Int8ub, 0)),
+    "flags" / Bytewise(Const(Int24ub, 0)),
+    "creation_time" / Bytewise(IfThenElse(this.version == 1, Int64ub, Int32ub)),
+    "modification_time" / Bytewise(IfThenElse(this.version == 1, Int64ub, Int32ub)),
+    "timescale" / Bytewise(Int32ub),
+    "duration" / Bytewise(IfThenElse(this.version == 1, Int64ub, Int32ub)),
         Padding(1),
         "language" / ISO6392TLanguageCode(BitsInteger(5)[3]),
-    )),
-    Padding(2, pattern=b"\x00"),
-)
+    Bytewise(Padding(2, pattern=b"\x00")),
+))
 
 HandlerReferenceBox = Struct(
     "type" / Const(b"hdlr"),
     "version" / Const(Int8ub, 0),
     "flags" / Const(Int24ub, 0),
     Padding(4, pattern=b"\x00"),
-    "handler_type" / String(4),
+    "handler_type" / PaddedString(4,"ascii"),
     Padding(12, pattern=b"\x00"),  # Int32ub[3]
     "name" / CString(encoding="utf8")
 )
@@ -335,7 +331,7 @@ class MaskedInteger(Adapter):
 AVC1SampleEntryBox = Struct(
     "version" / Default(Int16ub, 0),
     "revision" / Const(Int16ub, 0),
-    "vendor" / Default(String(4, padchar=b" "), b"brdy"),
+    "vendor" / Default(PaddedString(4,"ascii"), b"brdy"),
     "temporal_quality" / Default(Int32ub, 0),
     "spatial_quality" / Default(Int32ub, 0),
     "width" / Int16ub,
@@ -346,7 +342,7 @@ AVC1SampleEntryBox = Struct(
     Padding(2),
     "data_size" / Const(Int32ub, 0),
     "frame_count" / Default(Int16ub, 1),
-    "compressor_name" / Default(String(32, padchar=b" "), ""),
+    "compressor_name" / Default(PaddedString(32,"ascii"), ""),
     "depth" / Default(Int16ub, 24),
     "color_table_id" / Default(Int16sb, -1),
     "avc_data" / PrefixedIncludingSize(Int32ub, Struct(
@@ -355,7 +351,7 @@ AVC1SampleEntryBox = Struct(
         "profile" / Int8ub,
         "compatibility" / Int8ub,
         "level" / Int8ub,
-        EmbeddedBitStruct(
+        BitStruct(
             Padding(6, pattern=b'\x01'),
             "nal_unit_length_field" / Default(BitsInteger(2), 3),
         ),
@@ -366,16 +362,16 @@ AVC1SampleEntryBox = Struct(
 
 
 SampleEntryBox = PrefixedIncludingSize(Int32ub, Struct(
-    "format" / String(4, padchar=b" ", paddir="right"),
+    "format" / PaddedString(4,"ascii"),
     Padding(6, pattern=b"\x00"),
     "data_reference_index" / Default(Int16ub, 1),
-    Embedded(Switch(this.format, {
+    Switch(this.format, {
         b"ec-3": MP4ASampleEntryBox,
         b"mp4a": MP4ASampleEntryBox,
         b"enca": MP4ASampleEntryBox,
         b"avc1": AVC1SampleEntryBox,
         b"encv": AVC1SampleEntryBox
-    }, Struct("data" / GreedyBytes)))
+    }, Struct("data" / GreedyBytes))
 ))
 
 BitRateBox = Struct(
@@ -684,13 +680,13 @@ SampleEncryptionBox = Struct(
 
 OriginalFormatBox = Struct(
     "type" / Const(b"frma"),
-    "original_format" / Default(String(4), b"avc1")
+    "original_format" / Default(PaddedString(4,"ascii"), b"avc1")
 )
 
 SchemeTypeBox = Struct(
     "type" / Const(b"schm"),
-    "scheme_uri" / Default(String(4) , b""),
-    "scheme_type" / Default(String(4), b"cenc"),
+    "scheme_uri" / Default(PaddedString(4,"ascii") , b""),
+    "scheme_type" / Default(PaddedString(4,"ascii"), b"cenc"),
     "scheme_version" / Int32ub
 )
 
@@ -726,8 +722,8 @@ class TellMinusSizeOf(Subconstruct):
 
 Box = PrefixedIncludingSize(Int32ub, Struct(
     "offset" / TellMinusSizeOf(Int32ub),
-    "type" / Peek(String(4, padchar=b" ", paddir="right")),
-    Embedded(Switch(this.type, {
+    "type" / Peek(PaddedString(4,"ascii")),
+    Switch(this.type, {
         b"ftyp": FileTypeBox,
         b"styp": SegmentTypeBox,
         b"mvhd": MovieHeaderBox,
@@ -781,12 +777,12 @@ Box = PrefixedIncludingSize(Int32ub, Str
         b'abst': HDSSegmentBox,
         b'asrt': HDSSegmentRunBox,
         b'afrt': HDSFragmentRunBox
-    }, default=RawBox)),
+    }, default=RawBox),
     "end" / Tell
 ))
 
 ContainerBox = Struct(
-    "type" / String(4, padchar=b" ", paddir="right"),
+    "type" / PaddedString(4,"ascii"),
     "children" / GreedyRange(Box)
 )