MalformedPDFError Invalid filter algorithm 31

PDF file:
EA9DDBD4F46B6A41F4CFC7FE3A222FAF8013C3CEAC0918D1E2A5.pdf

There seems to be an issue with the png_depredict function when running this code:

PDF::Reader.new(file).pages[0].xobjects[:I3].unfiltered_data

# => PDF::Reader::MalformedPDFError (Invalid filter algorithm 31):

That specific XObject is the QR code we're trying to extract and parse, but we're struggling to get the unfiltered_data needed to do so. We'll keep trying to debug it, but may need someone else's help.

The image xobject looks like this:

<</Type /XObject
/Subtype /Image
/Width 100
/Height 100
/ColorSpace [/Indexed /DeviceRGB 1 23 0 R]
/BitsPerComponent 1
/Filter /FlateDecode
/DecodeParms <</Predictor 15 /Colors 1 /BitsPerComponent 1 /Columns 100>>
/Length 265>>
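One way to narrow this down is to inflate the stream by hand and look at the first byte of every row, which is where PNG-style predictors store the per-row filter type. The sketch below is a debugging aid only, not part of pdf-reader: `png_filter_bytes` is a hypothetical helper name, and it assumes you already have the raw (still deflated) stream bytes. With the values from the dictionary above (Columns 100, Colors 1, BitsPerComponent 1, Height 100), each row should be ceil(100 / 8) = 13 data bytes plus 1 filter byte, so the inflated stream should be 1400 bytes long.

```ruby
require "zlib"

# Hypothetical debugging helper (not part of pdf-reader's API).
# Inflates a FlateDecode stream and returns the PNG filter-type byte
# that precedes each row, honouring BitsPerComponent when sizing rows.
def png_filter_bytes(deflated, columns:, height:, colors: 1, bits_per_component: 8)
  inflated = Zlib::Inflate.inflate(deflated)

  # Row data is padded up to a whole number of bytes.
  row_bytes = (columns * colors * bits_per_component + 7) / 8
  stride = row_bytes + 1 # one leading filter-type byte per row

  unless inflated.bytesize == stride * height
    raise ArgumentError,
          "expected #{stride * height} bytes, got #{inflated.bytesize}"
  end

  (0...height).map { |row| inflated.getbyte(row * stride) }
end
```

If every returned value is in 0..4 when called with `bits_per_component: 1`, the stream itself is well formed and the problem is in how the rows are being sliced.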

I'm fairly sure it's accurate that 31 isn't a valid filter type in the PNG format, but I suspect png_depredict isn't parsing the data correctly and it shouldn't be getting as far as thinking there's a filter type of 31. Maybe because it's a single bit per component? Or maybe because the colour space is indexed 🤔
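To illustrate the single-bit theory: if row length is computed as if every component took a full byte, a 100-column row would be read as 101 bytes instead of 14, and the "filter byte" of later rows would land on image data. The following is a minimal sketch of PNG de-prediction that sizes rows from BitsPerComponent. It is not pdf-reader's implementation and whether this is the actual cause is unconfirmed; `png_depredict_sketch` and `paeth` are illustrative names.

```ruby
# Paeth predictor as defined by the PNG specification.
def paeth(left, up, up_left)
  estimate = left + up - up_left
  d_left = (estimate - left).abs
  d_up = (estimate - up).abs
  d_up_left = (estimate - up_left).abs
  if d_left <= d_up && d_left <= d_up_left
    left
  elsif d_up <= d_up_left
    up
  else
    up_left
  end
end

# Illustrative sketch only (assumed names, not pdf-reader's API).
# Reverses PNG row filters on already-inflated data.
def png_depredict_sketch(data, columns:, colors: 1, bits_per_component: 8)
  bits_per_pixel = colors * bits_per_component
  row_bytes = (columns * bits_per_pixel + 7) / 8 # rows are byte-padded
  bpp = [bits_per_pixel / 8, 1].max             # PNG uses 1 for sub-byte pixels
  stride = row_bytes + 1                         # leading filter-type byte

  bytes = data.bytes
  raise ArgumentError, "data is not a whole number of rows" unless (bytes.length % stride).zero?

  prev = Array.new(row_bytes, 0)
  out = []
  bytes.each_slice(stride) do |row|
    filter_type = row[0]
    cur = row[1..-1]
    cur.each_index do |i|
      left = i >= bpp ? cur[i - bpp] : 0    # already decoded in place
      up = prev[i]
      up_left = i >= bpp ? prev[i - bpp] : 0
      predicted =
        case filter_type
        when 0 then 0
        when 1 then left
        when 2 then up
        when 3 then (left + up) / 2
        when 4 then paeth(left, up, up_left)
        else raise ArgumentError, "Invalid filter algorithm #{filter_type}"
        end
      cur[i] = (cur[i] + predicted) & 0xFF
    end
    out.concat(cur)
    prev = cur
  end
  out.pack("C*")
end
```

For the image in this issue the call would be `png_depredict_sketch(inflated, columns: 100, colors: 1, bits_per_component: 1)`, which reads 14-byte rows.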

Unfortunately I'm fairly swamped at the moment with day job and family life, so I won't be able to take a closer look for a while. Sorry!

Ouch, this has reminded me that there's only a single unit spec for the Flate filter with PNG shaped data 😬

pdf-reader/spec/reader/filter/flate_spec.rb

Lines 54 to 71 in 946559b

    
    context "deflated stream with PNG predictors" do
      let(:deflated_path) {
        File.dirname(__FILE__) + "/../../data/deflated_with_predictors.dat"
      }
      let(:depredicted_path) {
        File.dirname(__FILE__) + "/../../data/deflated_with_predictors_result.dat"
      }
      let(:deflated_data) { binread(deflated_path) }
      let(:depredicted_data) { binread(depredicted_path) }

      it "inflates the data" do
        filter = PDF::Reader::Filter::Flate.new(
          :Columns => 5,
          :Predictor => 12
        )
        expect(filter.filter(deflated_data)).to eql(depredicted_data)
      end
    end

For those also having issues with this, we found HexaPDF was able to export the image correctly:
https://github.com/gettalong/hexapdf
