project-gemmi/gemmi

Gemmi vs mrcfile - values of some voxels are not equal

aliaksei-chareshneu opened this issue · 2 comments

Dear @wojdyr,

I've noticed that values of some voxels provided by gemmi and mrcfile are not equal while reading the same map.
Could you think of why it can happen?

Map file: EMD-1832.zip

from pathlib import Path
from typing import Tuple
import gemmi
import mrcfile
import numpy as np


def get_np_arr_from_gemmi(map_path: Path) -> np.ndarray:
    path = str(map_path.resolve())
    m = gemmi.read_ccp4_map(path)
    arr = np.array(m.grid)
    print(f'gemmi arr: dtype={arr.dtype}')
    return arr

def get_np_arr_from_mrcfile(map_path: Path) -> np.ndarray:
    path = str(map_path.resolve())
    m = mrcfile.open(path, mode='r')
    arr = m.data
    print(f'mrcfile arr: dtype={arr.dtype}')
    return arr

def compare_two_arrays(arr1, arr2):
    mask = arr1 == arr2
    reversed_mask = np.invert(mask)
    print(f'Not equal indices: {reversed_mask.nonzero()}')

def get_value_from_two_arrs(arr1, arr2, index: Tuple[int,int,int]):
    val_1 = arr1[index[0], index[1], index[2]]
    val_2 = arr2[index[0], index[1], index[2]]
    return val_1, val_2

if __name__ == '__main__':
    MAP_FILEPATH = Path('EMD-1832.map')
    arr1 = get_np_arr_from_gemmi(MAP_FILEPATH)
    arr2 = get_np_arr_from_mrcfile(MAP_FILEPATH)
    compare_two_arrays(arr1, arr2)
    # try to get value of (62, 31, 31) voxel
    indices = (62, 31, 31)
    gemmi_val, mrcfile_val = get_value_from_two_arrs(arr1, arr2, indices)
    print(f'indices: {indices}')
    print(f'gemmi_val {gemmi_val}')
    print(f'mrcfile_val {mrcfile_val}')

    indices = (1, 31, 32)
    gemmi_val, mrcfile_val = get_value_from_two_arrs(arr1, arr2, indices)
    print(f'indices: {indices}')
    print(f'gemmi_val {gemmi_val}')
    print(f'mrcfile_val {mrcfile_val}')

Output:

gemmi arr: dtype=float32
mrcfile arr: dtype=float32
Not equal indices: (array([ 1,  2,  2, ..., 61, 61, 62]), array([31, 24, 24, ..., 38, 38, 31]), array([32, 29, 30, ..., 33, 34, 31]))
indices: (62, 31, 31)
gemmi_val 0.09139659255743027
mrcfile_val 0.0
indices: (1, 31, 32)
gemmi_val 0.0

Best regards,
Aliaksei

As discussed, the difference here arises because Gemmi uses Fortran-contiguous arrays and mrcfile uses C-contiguous arrays. The Gemmi documentation is already clear on this point so I think this issue can be closed.

@wojdyr I am curious about how Gemmi handles map files with swapped axes (i.e. mapc, mapr and maps fields set to something other than 1, 2, 3). Does it swap the axes in memory to ensure the data grid is always Fortran-contiguous and the x, y and z indices refer to the swapped axes?

Hi Colin, thanks for looking into it.
Yes, the axes and data are usually transposed to be in the order 1 2 3 or X Y Z (fast, medium, slow axes). This makes working with the data simpler.
In this case mapdump MAPIN EMD-1832.map shows

       Fast, medium, slow axes .........................    X    Y    Z

so the data in memory is the same as in the file.

Actually the axes are not transposed automatically, the call to read the map would need to be modified to:

m = gemmi.read_ccp4_map(path, setup=True)

or m.setup() would need to be called later on, but in this case it doesn't make a difference.