ccpem/mrcfile

Proposal: `mrcfile.read()` and `mrcfile.write()`

alisterburt opened this issue ยท 9 comments

Hey Colin,

Merry Christmas! ๐ŸŽ„

In the spirit of 'simple is better than complex', would you be open to a PR adding simple read and write functions to mrcfile?

I know this isn't difficult to achieve with the current API but I find myself reaching for/reimplementing these as convenience functions often enough that I thought I'd see what you think

read would probably look along the lines of...

def read(filename):
    with mrcfile.open(filename) as mrc:
        data = np.copy(mrc.data)
    return data

and write...

def write(data, filename, overwrite=False):
    # set `mode` depending on value of overwrite
    with mrcfile.new(filename, mode) as mrc:
        mrc.set_data(data)

Hi Alister, merry Christmas to you too!

I'm certainly happy with the idea of a read function. I'm a little more hesitant about write, simply because it gives people even more encouragement to just dump their data into a file without any metadata to be able to make sense of it :-) (In particular, I always get annoyed when I find files that don't have a meaningful pixel size!) But I can see the convenience if you're just doing this kind of thing using the existing API anyway. Maybe it'd make sense to include a voxel_size=0.0 parameter for the write function so it reminds people and makes it easy to set? And perhaps also highlight in the function docstring that it's better to use new and consider setting other header fields too in files that are intended to be distributed or used with other software. What do you think? I'd be open to any other suggestions you might have.

Anyway, do go ahead and start on a PR if you like. In the meantime, if I have any more thoughts on this I'll let you know.

P.S. One other thing to point out: the new function already makes it reasonably easy to just dump data to a file: mrcfile.new(filename, data).close().

It probably still makes sense to add a write function, for symmetry with read and to avoid the awkwardness of remembering to include the close() at the end of the new call, but if you weren't aware of that already maybe it's helpful!

Thanks for the fast turnaround on feedback, much appreciated!

Perfecto, I agree with everything you said. Definitely good to flag the pixel size issue for write early, I hadn't really considered it and completely agree, we should probably make it a non-optional argument to avoid nasty files.

You'll likely see two PRs come in the next few days or so - don't feel any pressure to look at them even remotely quickly, enjoy the break and see you soon!

I see where you're coming from but I think I would still have pixel size as an optional argument. My reasoning is that if someone uses mrcfile to make a new file in the normal way, they're free to not set the pixel size and it gets a default value of zero, which is meaningless and at least indicates clearly that it hasn't been set. If the new write function does the same, that's in line with current behaviour. But if you make it a required argument, people will start throwing any old value (probably 1.0, most of the time) in there if they don't actually know or need the real pixel size, and you'll end up with many files with meaningless pixel sizes that are just harder to spot than they are now :-)

Hi Alister, I've just been reminded of your open pull request for read. I hadn't looked at it yet because you'd said you were intending to do one for write too, so I just wanted to check what your plans are. Is a write PR still in the works or should I just go ahead and deal with read for now?

ooh - did read and forgot to follow up, my bad! Will do write in the next few hours, if you prefer to wait to have both and deal with them together that's fine by me

friendly ping :)

I've finally got around to dealing with this. Thanks for the contributions! I'm not sure if it'll recognise that the pull requests have been merged or not, since I rebased them and squashed your commits together. I've also made a few changes of my own afterwards, most significantly to remove the compression option from write() and just set it based on the filename instead. I think this is all done now so I'll close this issue and the PRs, but do reopen it if you see anything that needs more changes. The new functions will be included next time I make a release for PyPI, which will hopefully be sometime in the next couple of weeks.

fantastic, thank you for taking the time Colin! Looking forward to the release ๐Ÿฅณ