tbeu/matio

Library will not read files greater than 2 GB in size

dpzimmer opened this issue · 6 comments

The internal data structure members that store file positions (_mat_t.bof, matvar_internal.datapos) are of type 'long', whose size is inconsistent among platforms. By conditionally making these 64-bit integers we can support files greater than 2 GB on largefile-capable 32-bit systems as well as on 64-bit Windows. This will also necessitate using 64-bit versions of fseek() and ftell() where available.
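To illustrate why 'long' is the limiting factor, here is a minimal sketch. The names mat_off_t and offset_fits_in_long are hypothetical and chosen for illustration only; they are not taken from matio or from the attached patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical name: a file-offset type that is 64 bits wide on every
   platform, unlike 'long' (32 bits on 32-bit systems and on 64-bit Windows). */
typedef int64_t mat_off_t;

static_assert(sizeof(mat_off_t) == 8, "mat_off_t must be 64 bits wide");

/* Hypothetical helper: returns 1 if pos can be represented as a 'long',
   i.e. if it could be passed to fseek() or returned by ftell() unchanged. */
static int offset_fits_in_long(mat_off_t pos)
{
    return pos >= (mat_off_t)LONG_MIN && pos <= (mat_off_t)LONG_MAX;
}
```

Where 'long' is 32 bits wide, offset_fits_in_long((mat_off_t)1 << 31) returns 0, which is exactly the 2 GB boundary described above.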

I have encountered Matlab Level 5 formatted files greater than 2 GB as output from the SWAN numerical model (http://swanmodel.sourceforge.net/). Matlab will read these just fine, and the Matlab file format specification does not seem to prohibit large files.

I have attached a patch against matio version 1.5.19 that I have tested on 32- and 64-bit Windows using Visual Studio. It should also work on most Unix/Linux systems, although I have not tested it there. The approach relies on exposing the 64-bit versions of fseeko() and ftello() in stdio.h, but this should really be implemented as a compiler flag in the build system and/or in the generation of matioConfig.h, so mine is not the most elegant implementation.

Nevertheless, it works and my patch contribution is attached:

largefile.zip
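For readers without the attachment, the general idea of 64-bit seek/tell wrappers can be sketched roughly as follows. This is not the content of the patch. The wrapper names mat_fseek64 and mat_ftell64 are hypothetical, and the sketch assumes a toolchain that provides either _fseeki64()/_ftelli64() (Windows C runtime) or POSIX fseeko()/ftello().

```c
/* Feature-test macros must come before any system header.
   Assumption: the platform honours _FILE_OFFSET_BITS (e.g. glibc). */
#define _FILE_OFFSET_BITS 64
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#if !defined(_WIN32)
#include <sys/types.h> /* off_t */
#endif

/* Hypothetical wrapper: seek using a 64-bit offset on all platforms. */
static int mat_fseek64(FILE *fp, int64_t offset, int whence)
{
    assert(fp != NULL);
#if defined(_WIN32)
    return _fseeki64(fp, offset, whence);     /* Windows C runtime */
#else
    return fseeko(fp, (off_t)offset, whence); /* POSIX */
#endif
}

/* Hypothetical wrapper: report the file position as a 64-bit value. */
static int64_t mat_ftell64(FILE *fp)
{
    assert(fp != NULL);
#if defined(_WIN32)
    return _ftelli64(fp);
#else
    return (int64_t)ftello(fp);
#endif
}
```

Hard-coding the macros at the top of a source file is the inelegant part mentioned above. Setting them from the build system keeps every translation unit consistent.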

Don.

Are there any plans to fix this limitation? Is any work in progress?

tbeu commented

Yes, will do.

Thanks!
Will there be an official release or hotfix that includes this important fix?
(The latest official release, 1.5.21, does not include it.)

Hi, I don't know if this is the correct place to discuss the add-lfs branch, but I couldn't find a better place.

If you need help with the CMake part, let me know: matioConfig.cmake.in should basically be a copy of matioConfig.h.in, and the CMake build is missing the checks for the seek/tell functions.

There are also some other small things I would like to comment on:

  • you added BOM encoding to mat73.c
  • should inflate.c use foff_t instead of off_t?
  • #include <sys/types.h> in matio_private.h is not needed

Thanks

tbeu commented

I have already spent far too many hours getting the configuration done (le sigh). Thanks for your offer to help. For the CMake part, the configuration of _FILE_OFFSET_BITS and _LARGEFILE64_SOURCE is tricky and not yet solved. (I have just pushed, though.)
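As a rough illustration of what such a configuration check has to establish, a build system can try to compile a small probe like the one below (for example via CMake's check_c_source_compiles). This is only a sketch under the assumption of a POSIX-like platform with sys/types.h; it is not the check used on the add-lfs branch, and off_t_bits is a hypothetical helper name.

```c
/* Candidate setting under test; it must precede every system header. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <sys/types.h> /* off_t */

/* The probe compiles only if off_t is at least 64 bits wide with the
   macro above, so a failed compilation tells the build system that this
   setting is not sufficient on the platform. */
static_assert(sizeof(off_t) >= 8, "off_t is narrower than 64 bits");

/* Hypothetical helper: report the width of off_t in bits (assuming 8-bit bytes). */
static int off_t_bits(void)
{
    return (int)(sizeof(off_t) * 8);
}
```

If the probe fails with _FILE_OFFSET_BITS alone, a configuration step could retry with _LARGEFILE64_SOURCE or fall back to platform-specific functions.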

Feel free to push to the add-lfs branch, but be cautious with the Travis CI credits.

There are also 40 tests failing on Cygwin 32-bit (https://ci.appveyor.com/project/tbeu/matio/builds/42952340/job/0ifj46s9tm5mg7bb?fullLog=true). I am still clueless, since I can neither reproduce the failures nor debug them locally (https://ci.appveyor.com/project/tbeu/matio/builds/42952597/job/u92mh70ja74i9ydo).

Another issue is that the Cygwin 64-bit environment stopped working on AppVeyor. I failed to fix it and have disabled it for now.

tbeu commented

@MaartenBent I have now also fixed the findings you mentioned above. Thanks.