dfsp-spirit/freesurferformats

freesurfer surface import - row-major/column-major order

Closed this issue · 7 comments

Thanks for sharing this excellent package! These freesurfer import functions are really helpful.

I've spotted what I think is a problem though (disclaimer: not an expert on the freesurfer surface format).

For context I have been working to try and calculate (in R) the average vertex-vertex euclidean distances for different surfaces.

As an early sense check I looked to see if neighbouring vertices (i.e. vertices in the same face) were close to each other. This wasn't the case.

I modified the import function to use the option matrix(..., byrow = TRUE) whenever a matrix is formed from a vector, and after this my check passed with the set of closest euclidean vertices belonging to the same faces as the probe vertex.

I think this is because R works with column major order data, but freesurfer is default row major (C/C++ style).

I've put a reproducible example. custom_read_fs is practically identical to freesurferformats::read.fs.surface, but uses byrow = TRUE in every matrix call.

Interestingly, this doesn't cause problems down the line for fsbrain::vis.data.on.subject which displays fine. I think this is because rgl::tmesh3d force-converts the input vertices to a wide matrix (each column is a vertex) with matrix(vertices, nrow = vrows). This implicit reshaping (without transposing) counteracts the column major import.

I have only investigated the read.fs.surface function, I would imagine this might need checking elsewhere if matrix data is being imported.

The other possibility is I don't know what I'm doing and have managed to mangle something in the code somewhere. I'd be grateful for your thoughts.

All the best,

Andrew


custom_read_fs <- function (filepath)
{
  TRIS_MAGIC_FILE_TYPE_NUMBER = 16777214
  QUAD_MAGIC_FILE_TYPE_NUMBER = 16777215
  if (freesurferformats:::guess.filename.is.gzipped(filepath)) {
    fh = gzfile(filepath, "rb")
  }
  else {
    fh = file(filepath, "rb")
  }
  ret_list = list()
  magic_byte = freesurferformats:::fread3(fh)
  if (magic_byte == QUAD_MAGIC_FILE_TYPE_NUMBER) {
    warning("Reading QUAD files in untested atm. Please use with care. This warning will be removed once the code has unit tests.")
    ret_list$mesh_face_type = "quads"
    num_vertices = freesurferformats:::fread3(fh)
    num_faces = freesurferformats:::fread3(fh)
    cat(sprintf("Reading surface file, expecting %d vertices and %d faces.\n",
                num_vertices, num_faces))
    ret_list$internal = list()
    ret_list$internal$num_vertices_expected = num_vertices
    ret_list$internal$num_faces_expected = num_faces
    num_vertex_coords = num_vertices * 3L
    vertex_coords = readBin(fh, integer(), size = 2L, n = num_vertex_coords,
                            endian = "big")
    vertex_coords = vertex_coords/100
    vertices = matrix(vertex_coords, nrow = num_vertices, byrow = TRUE,
                      ncol = 3L)
    if (length(vertex_coords) != num_vertex_coords) {
      stop(sprintf("Mismatch in read vertex coordinates: expected %d but received %d.\n",
                   num_vertex_coords, length(vertex_coords)))
    }
    num_face_vertex_indices = num_faces * 4L
    face_vertex_indices = rep(0, num_face_vertex_indices)
    faces = matrix(face_vertex_indices, nrow = num_faces, byrow = TRUE,
                   ncol = 4L)
    for (face_idx in 1L:num_faces) {
      for (vertex_idx_in_face in 1L:4L) {
        global_vertex_idx = freesurferformats:::fread3(fh)
        faces[face_idx, vertex_idx_in_face] = global_vertex_idx
      }
    }
  }
  else if (magic_byte == TRIS_MAGIC_FILE_TYPE_NUMBER) {
    ret_list$mesh_face_type = "tris"
    creation_date_text_line = readBin(fh, character(), endian = "big")
    seek(fh, where = 3, origin = "current")
    info_text_line = readBin(fh, character(), endian = "big")
    seek(fh, where = -5, origin = "current")
    ret_list$internal = list()
    ret_list$internal$creation_date_text_line = creation_date_text_line
    ret_list$internal$info_text_line = info_text_line
    num_vertices = readBin(fh, integer(), size = 4, n = 1,
                           endian = "big")
    num_faces = readBin(fh, integer(), size = 4, n = 1, endian = "big")
    ret_list$internal$num_vertices_expected = num_vertices
    ret_list$internal$num_faces_expected = num_faces
    num_vertex_coords = num_vertices * 3L
    vertex_coords = readBin(fh, numeric(), size = 4L, n = num_vertex_coords,
                            endian = "big")
    vertices = matrix(vertex_coords, nrow = num_vertices, byrow = TRUE,
                      ncol = 3L)
    if (length(vertex_coords) != num_vertex_coords) {
      stop(sprintf("Mismatch in read vertex coordinates: expected %d but received %d.\n",
                   num_vertex_coords, length(vertex_coords)))
    }
    num_face_vertex_indices = num_faces * 3L
    face_vertex_indices = readBin(fh, integer(), size = 4L,
                                  n = num_face_vertex_indices, endian = "big")
    faces = matrix(face_vertex_indices, nrow = num_faces, byrow = TRUE,
                   ncol = 3L)
    faces = faces + 1L
    if (length(face_vertex_indices) != num_face_vertex_indices) {
      stop(sprintf("Mismatch in read vertex indices for faces: expected %d but received %d.\n",
                   num_face_vertex_indices, length(face_vertex_indices)))
    }
  }
  else {
    stop(sprintf("Magic number mismatch (%d != (%d || %d)). The given file '%s' is not a valid FreeSurfer surface format file in binary format. (Hint: This function is designed to read files like 'lh.white' in the 'surf' directory of a pre-processed FreeSurfer subject.)\n",
                 magic_byte, TRIS_MAGIC_FILE_TYPE_NUMBER, QUAD_MAGIC_FILE_TYPE_NUMBER,
                 filepath))
  }
  close(fh)
  ret_list$vertices = vertices
  ret_list$vertex_indices_fs = 0L:(nrow(vertices) - 1)
  ret_list$faces = faces
  return(ret_list)
}

# reprex:
library(tidyverse)
library(fsbrain)
library(freesurferformats)

source('custom_read_fs.R')

subjects_dir = fsbrain::get_optional_data_filepath("subjects_dir")
surf <- freesurferformats::read.fs.surface(paste(subjects_dir, "subject1", "surf", "lh.white", sep = "/"))

# Probe vertex 10

# which other vertices share a face with vertex #10:
as.data.frame(surf$faces) %>% filter(V1 == 10 | V2 == 10 | V3 == 10)
#   V1    V2     V3
# 1 10 49506  99277
# 2 10 50521  99278
# 3 10 49506 100371
# 4 10 50520  99284
# 5 10 49507 100382
# 6 10 50528 100372

# save the vertex list:
v10.neighbours <- as.data.frame(surf$faces) %>%
  filter(V1 == 10 | V2 == 10 | V3 == 10) %>%
  unlist() %>% unique() %>% sort()
#  [1]     10  49506  49507  50520  50521  50528  99277  99278  99284 100371
# [11] 100372 100382

# calculate euclidean distances to vertex 10:
edist <- function(x1,x2) { sqrt(sum((x1 - x2)^2)) }
v10.distances <- sapply(1:nrow(surf$vertices), function(i) {
  edist(surf$vertices[i,], surf$vertices[10,])
} )

# print closest neighbour indices:
order(v10.distances)[1:10] %>% sort()
#  [1]     10  41683  41686  51154  51217  51220  68818  75268 137029 137032

# only the trivial node 10 is actually a neigbour:
order(v10.distances)[1:length(v10.neighbours)] %in% v10.neighbours
#  [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

# alternatively, what are the distances (in mm) for the face neighbours:
v10.distances[v10.neighbours]
#  [1]   0.00000 131.86736  29.65419 165.91990  25.04429  87.17706  15.43588
#  [8]  84.75748  85.86355 120.43829  19.10735  85.17581

# Repeat this, but with a modified function:
surf2 <- custom_read_fs(paste(subjects_dir, "subject1", "surf", "lh.white", sep = "/"))
as.data.frame(surf2$faces) %>% filter(V1 == 10 | V2 == 10 | V3 == 10)
#   V1  V2  V3
# 1 10   9   5
# 2  5 106  10
# 3  9  10  14
# 4 15  14  10
# 5 10 106 112
# 6 10 112  15

v10.neighbours2 <- as.data.frame(surf2$faces) %>%
  filter(V1 == 10 | V2 == 10 | V3 == 10) %>%
  unlist() %>% unique() %>% sort()
# [1]   5   9  10  14  15 106 112

# calculate euclidean distances to vertex 10:
v10.distances2 <- sapply(1:nrow(surf2$vertices), function(i) {
  edist(surf2$vertices[i,], surf2$vertices[10,])
} )

# print closest neighbour indices:
order(v10.distances2)[1:10] %>% sort()
#  [1]   4   5   9  10  14  15  22 106 112 119

# clostest vertices correctly identified the neighbours:
order(v10.distances2)[1:length(v10.neighbours2)] %in% v10.neighbours2
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE

# what are the distances (in mm) for the face neighbours:
v10.distances2[v10.neighbours2]
# [1] 0.7853750 0.8502197 0.0000000 1.1290272 0.8335467 0.9591681 0.7781360
# now all ~<1mm

Hi Andrew,

Thanks for pointing this out, it looks like you are right and found a serious bug. I will have a closer look, verify and come up with a unit test that checks for it.

Your tests using the Euclidian distance make sense. In addition, I will use results from loading the mesh with the official Matlab implementation ($FREESURFER_HOME/matlab/read_surf.m) as ground truth to compare the neighborhood data of the meshes. I have done this in the unit tests for some of the functions, but apparently not for read.fs.surface.

Best,

Tim

Thanks Tim, I could have written an issue a fraction of the length by comparing with a reference import from read_surf.m. Apologies for irrelevance and length and hope the fix goes smoothly.

BTW, I think you can avoid the gdata dependency by using c(t(nmat)) instead of gdata::unmatrix(nmat, byrow = TRUE) unless you need elements to have names by rows / columns.

mat <- matrix(letters[1:9], nrow = 3, dimnames = list(c("x1","x2","x3"), c("y1","y2","y3")))
identical(
    unname(gdata::unmatrix(mat, byrow = TRUE)),
    c(t(mat))
)
# TRUE

No worries at all, I'm very grateful for the report and the fact that you supplied an example and such a detailed analysis.

I have followed your suggestion about avoiding the gdata dependency. I'm not really an R programmer, as you may be able to tell from that error, haha.

I cannot ship more data for tests directly with the package (due to the 5 MB limit on CRAN), but I have added a unit test that checks for the issue and downloads optional data to do so when it is run.

You may also like this: 03cb0d5

You may also like this: 03cb0d5

:) that's great - it's useful to see it as a unit test, currently I don't unit test, but I keep meaning to start

The fix is included in release v0.1.4., which is now on CRAN. (As always, it may take some hours until binary releases for Windows and MacOS become available on CRAN.) Closing.

I have a new version (0.0.2) of fsbrain upcoming, and it will require freesurferformats >= 0.1.4 to make sure the fix is used. Upgrading freesurferformats manually will of course fix the issue today.