datacoon/metawarc

Error with pip installed version

gleporeNARA opened this issue · 4 comments

metawarc metadata flashfrozen-jwat-recompressed.warc.gz

2022-04-11 12:25:39,961 - root - DEBUG - Preparing flashfrozen-jwat-recompressed.warc.gz
[warn] Skip parser 'FAT12': stream is smaller than 512.0 bytes
[warn] Skip parser 'FAT16': stream is smaller than 512.0 bytes
[warn] Skip parser 'FAT32': stream is smaller than 512.0 bytes
[warn] Skip parser 'LinuxSwapFile': stream is smaller than 4096.0 bytes
[warn] Skip parser 'MSDos_HardDrive': stream is smaller than 512.0 bytes
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': stream is smaller than 4180.0 bytes
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] Skip parser 'FAT12': stream is smaller than 512.0 bytes
[warn] Skip parser 'FAT16': stream is smaller than 512.0 bytes
[warn] Skip parser 'FAT32': stream is smaller than 512.0 bytes
[warn] Skip parser 'LinuxSwapFile': stream is smaller than 4096.0 bytes
[warn] Skip parser 'MSDos_HardDrive': stream is smaller than 512.0 bytes
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': stream is smaller than 4180.0 bytes
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': Unknown magic string
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': Invalid magic
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': Unknown magic string
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': Invalid magic
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': stream is smaller than 4096.0 bytes
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': stream is smaller than 4180.0 bytes
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': stream is smaller than 512.0 bytes
[warn] Skip parser 'FAT16': stream is smaller than 512.0 bytes
[warn] Skip parser 'FAT32': stream is smaller than 512.0 bytes
[warn] Skip parser 'LinuxSwapFile': stream is smaller than 4096.0 bytes
[warn] Skip parser 'MSDos_HardDrive': stream is smaller than 512.0 bytes
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': stream is smaller than 4180.0 bytes
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': Unknown magic string
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': Invalid magic
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': Unknown magic string
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': Invalid magic
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': Unknown magic string
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': Invalid magic
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': Unknown magic string
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': Invalid magic
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': stream is smaller than 4096.0 bytes
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': stream is smaller than 4180.0 bytes
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': Unknown magic string
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': Invalid magic
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] Skip parser 'FAT12': Invalid FAT12 signature
[warn] Skip parser 'FAT16': Invalid FAT16 signature
[warn] Skip parser 'FAT32': Invalid FAT32 signature
[warn] Skip parser 'LinuxSwapFile': Unknown magic string
[warn] Skip parser 'MSDos_HardDrive': Invalid signature
[warn] Skip parser 'PIFVFile': Invalid magic number
[warn] Skip parser 'ElfFile': Invalid magic
[warn] Skip parser 'MachoFatFile': Invalid magic
[warn] Skip parser 'MachoFile': Invalid magic
[warn] Skip parser 'PRCFile': False
[warn] [] Error when getting size of 'header': delete it
[warn] Skip value width_dpi=0 (filter)
[warn] Skip value height_dpi=0 (filter)
Traceback (most recent call last):
  File "/home/lepore/anaconda3/bin/metawarc", line 8, in <module>
    sys.exit(main())
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/metawarc/main.py", line 11, in main
    exit_status = cli()
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/metawarc/core.py", line 35, in metadata
    acmd.metadata(input, filetypes.split(',') if filetypes else None, fields, output=output)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/metawarc/cmds/extractor.py", line 137, in metadata
    result = processWarcRecord(record, url, filename, mime=h)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/metawarc/cmds/extractor.py", line 87, in processWarcRecord
    parser = createParser(temp.name)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/hachoir/parser/guess.py", line 136, in createParser
    stream = FileInputStream(filename, real_filename, tags=tags)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/hachoir/stream/input_helper.py", line 38, in FileInputStream
    return InputIOStream(inputio, source=source, **args)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/hachoir/stream/input.py", line 412, in __init__
    InputStream.__init__(self, size=size, **args)
  File "/home/lepore/anaconda3/lib/python3.9/site-packages/hachoir/stream/input.py", line 136, in __init__
    raise NullStreamError(source)
hachoir.stream.input.NullStreamError: Input size is nul (source='file:/tmp/tmpqv9quokupng')!
lepore-desktop:~/Downloads/working/warc/webarchive-discovery/warc-indexer/src/test/resources/wikipedia-mona-lisa$

ivbeg commented

@gleporeNARA Could you provide an example WARC file, please?

ivbeg commented

@gleporeNARA It was an error in the handling of zero-length files. I've fixed it and added better error reporting. Please try the latest code from the repository. If it works, I will create a bugfix release.
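For readers hitting the same traceback, here is a rough sketch of what guarding against zero-length payloads might look like. It is not the actual patch from the repository: the function name `extract_metadata_safely` and the `extract` callback are hypothetical stand-ins for whatever wraps hachoir's `createParser`, and the sketch simply assumes that skipping empty record bodies is acceptable.

```python
import os
import tempfile
from typing import Callable, Optional


def extract_metadata_safely(
    payload: bytes,
    extract: Callable[[str], dict],
) -> Optional[dict]:
    """Write a record payload to a temp file and run `extract` on it.

    `extract` is a hypothetical callback that takes a file path and
    returns a metadata dict (for example, a wrapper around a parser).
    Empty payloads are skipped so the parser never sees a 0-byte file.
    """
    # Nothing to parse: skip instead of creating an empty temp file.
    if not payload:
        return None

    with tempfile.NamedTemporaryFile(delete=False) as temp:
        temp.write(payload)
        path = temp.name

    try:
        # Defensive second check on the file actually written to disk.
        if os.path.getsize(path) == 0:
            return None
        return extract(path)
    finally:
        # Always clean up the temp file, even if extraction raised.
        os.unlink(path)
```

With a guard like this, a record whose body is empty yields `None` and processing can continue with the next record rather than stopping at `NullStreamError`.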

Will do, thanks!