How to save mesh to a '.dat' binary file?
WangYun1995 opened this issue · 18 comments
According to the nbodykit documentation, I know how to save a mesh to a bigfile using the following code:
mesh.save('filename.bigfile', mode='real', dataset='Field')
However, what should I do to save the mesh to a '.dat' binary file?
mesh.save('filename.dat', mode='real', dataset='Field')
Does that work?
If my understanding of 'dat' file format is correct (a stream of binary bytes of the dense 3d image stored in C order), then you can read in the bigfile and save it with numpy.
import bigfile
with bigfile.File('filename.bigfile') as bf:
    data = bf['Field'][:]

data.tofile('filename.dat')
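If you need to read that raw stream back with numpy, a minimal sketch looks like this (the mesh size and the float32 dtype below are assumptions; the raw file carries no header, so both must be known in advance):

import numpy

Nmesh = 1024  # assumed mesh size; not stored in the '.dat' file itself
data = numpy.fromfile('filename.dat', dtype='f4').reshape(Nmesh, Nmesh, Nmesh)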
By '.dat' file here, I mean binary data that is stored on disk in column-major format.
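(To make the row-major versus column-major distinction concrete, a toy numpy example:)

import numpy

a = numpy.arange(6, dtype='f4').reshape(2, 3)
a.tobytes(order='C')  # C-order stream of the 2x3 array: 0, 1, 2, 3, 4, 5
a.tobytes(order='F')  # column-major stream:             0, 3, 1, 4, 2, 5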
In that case you probably want to transpose the data to Fortran order first:

import bigfile

with bigfile.File('filename.bigfile') as bf:
    shape = bf['Field'].attrs['ndarray.shape']
    data = bf['Field'][:].reshape(shape)

data.asfortranarray(order='F').tofile('filename.dat')
Do you mean that I still need to save the mesh to a bigfile first, and then transform it into a binary file?
Yes. I have found this to be the easiest and most scalable approach: you can run the expensive part with thousands of MPI ranks and save a bigfile relatively quickly, and after the expensive job has run, use a small 1-rank Python script to do the transformation. If you really want to avoid bigfile and write the binary directly in your script, then something like this:
with open('filename.dat', 'w') as ff:
    pass  # create / truncate the output file on every rank
comm.barrier()  # make sure no rank starts writing before the truncation is done

real = field.compute('real')
data = real.ravel()  # unfortunately there is no order='F' here yet. so this is still C ordered.

# byte offset of this rank's chunk in the shared file
offset = sum(comm.allgather(len(data))[:comm.rank]) * data.itemsize

with open('filename.dat', 'rb+') as ff:
    ff.seek(offset)
    data.tofile(ff)
However, for this to work with F ordering, we'll need a patch to allow real.ravel(order='F').
Thank you, I got it!
I tested your code that converts the bigfile to a Fortran unformatted file, i.e.
import bigfile

with bigfile.File('filename.bigfile') as bf:
    shape = bf['Field'].attrs['ndarray.shape']
    data = bf['Field'][:].reshape(shape)

data.asfortranarray(order='F').tofile('filename.dat')
But an AttributeError occurred, saying that 'numpy.ndarray' object has no attribute 'asfortranarray'. What's going on?
Perhaps use numpy.asfortranarray(data) instead. I doodled the code without running it... How big is your mesh, btw?
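With that fix applied, the conversion sketch becomes (still untested):

import bigfile
import numpy

with bigfile.File('filename.bigfile') as bf:
    shape = bf['Field'].attrs['ndarray.shape']
    data = bf['Field'][:].reshape(shape)

# asfortranarray is a function in the numpy namespace, not an ndarray method
numpy.asfortranarray(data).tofile('filename.dat')

One caveat worth double-checking: ndarray.tofile writes elements in C index order regardless of the array's memory layout, so if the file really must be column-major on disk, writing data.T (or data.ravel(order='F')) may be what you need.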
I have one simulation box, each side of which is 75.0 Mpc/h, and it contains 1820^3 dark matter particles. I want to convert the particles to a 1024^3 mesh.
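(Roughly, the painting step looks like the sketch below in nbodykit; cat and the to_mesh arguments are placeholders that depend on how the catalog is loaded, so treat them as assumptions rather than my exact script:)

# cat stands for the particle catalog, loaded with whatever reader fits the simulation
mesh = cat.to_mesh(Nmesh=1024, BoxSize=75.0, dtype='f4')
mesh.save('filename.bigfile', mode='real', dataset='Field')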
I have succeeded in converting the bigfile to a Fortran unformatted file. And I found that if I save the data as 'float32', the sizes of the bigfile and the Fortran unformatted file are both 4GB.
Yes, you are right. The sizes of these two files are not strictly equal to each other.
Unfortunately, when I tried to read such an unformatted mesh data file using Fortran code, a Fortran runtime error occurred. I submitted this question on Stack Overflow; you can refer to this link for more details: https://stackoverflow.com/questions/63868616/how-to-solve-fortran-runtime-error-i-o-past-end-of-record-on-unformatted-file
Hello,
In the case of a 512 * 512 * 512 mesh, everything about the I/O is OK. However, when I load the 1024^3 mesh unformatted data using Fortran code, a Fortran runtime error occurs. I then modified ACCESS='direct' to ACCESS='stream' in the open statement, and added one read statement before read(10) dens, i.e.
read(10) header
read(10) dens
The Fortran code runs fine now. But I don't understand the reason behind this. Could it be that the 1024^3 mesh is so large?
Good to know! But a Fortran unformatted file has a 4-byte (head) + 4-byte (tail) block length field, so it shouldn't be exactly 4GB after the conversion? Also, if the entire image is saved as a single F77 unformatted record, the block length field would overflow (4 bytes can only represent 4G - 1 bytes).
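To put numbers on that (a quick check, assuming float32 data):

payload_512 = 512**3 * 4    # 536870912 bytes, ~0.5 GiB: fits in a 4-byte length marker
payload_1024 = 1024**3 * 4  # 4294967296 bytes, exactly 4 GiB
# an F77 record is stored as 4 (head) + payload + 4 (tail) bytes, and the 4-byte
# marker can hold at most 2**32 - 1 (unsigned), or 2**31 - 1 if it is signed
print(payload_1024 > 2**32 - 1)  # True: one record holding the whole mesh overflows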
As you said, the block length field of my unformatted file does indeed overflow. So how do I correct it?