tuxera/ntfs-3g

NTFS-3G segfaulting when removing a directory

solarisfire opened this issue · 4 comments

Just trying to remove a directory on an SSD and this happens:

[root@solaris-arion compatdata]# rm -rf 627690/
rm: cannot remove '627690/pfx/drive_c/windows/syswow64/xactengine3_4.dll': Software caused connection abort
rm: cannot remove '627690/pfx/drive_c/windows/syswow64/xactengine3_5.dll': Transport endpoint is not connected
..

Journalctl shows:

May 02 12:30:51 solaris-arion lowntfs-3g[41757]: Version 2022.10.3 external FUSE 29
May 02 12:30:51 solaris-arion lowntfs-3g[41757]: Mounted /dev/sda1 (Read-Write, label "", NTFS 3.1)
May 02 12:30:51 solaris-arion lowntfs-3g[41757]: Cmdline options: rw,nosuid,nodev,uid=1000,gid=1000,umask=000,user,exec
May 02 12:30:51 solaris-arion lowntfs-3g[41757]: Mount options: nosuid,nodev,user,exec,allow_other,nonempty,relatime,rw,default_permissions,fsname=/dev/sda1,blkdev,blksize=4096
May 02 12:30:51 solaris-arion lowntfs-3g[41757]: Global ownership and permissions enforced, configuration type 9
May 02 12:31:00 solaris-arion lowntfs-3g[41757]: Failed to read full index block at 806912
May 02 12:31:00 solaris-arion kernel: mount.lowntfs-3[41757]: segfault at 10 ip 00007f40cec4b251 sp 00007ffc5682a5b0 error 4 in libntfs-3g.so.89.0.0[7f40cec2d000+39000] likely on CPU 8 (core 0, socket 0)
May 02 12:31:00 solaris-arion kernel: Code: c3 fe ff ff ff 15 8f d5 02 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 49 89 f0 ba 01 00 00 00 53 48 89 fb 48 83 ec 08 <48> 8b 6e 10 0f b6 8f f0 01 00 00 48 8b 7f 58 48 89 ee 48 d3 e6 8b
May 02 12:31:00 solaris-arion systemd[1]: Started Process Core Dump (PID 41775/UID 0).
May 02 12:31:00 solaris-arion systemd-coredump[41776]: Resource limits disable core dumping for process 41757 (mount.lowntfs-3).
May 02 12:31:00 solaris-arion systemd-coredump[41776]: Process 41757 (mount.lowntfs-3) of user 0 dumped core.
May 02 12:31:00 solaris-arion systemd[1]: systemd-coredump@5-41775-0.service: Deactivated successfully.

I get that the main issue here is the failure to read the full index block, and that I need to get Windows to check and repair the NTFS volume on this drive. But it also probably shouldn't segfault...

Hi, are you able to capture a metadata image of this filesystem, e.g. with ntfsclone --save-image --metadata -o image.cloneimg /dev/sda1? We can't really see from the output what went wrong, so an image of the filesystem would help us debug this issue.
Alternatively, if you know your way around gdb, you could try to get a stack trace of the crash after compiling the project from source (or installing your distribution's debug symbols, if any).

Had to force this due to the volume being flagged for checking:

ntfsclone v2022.10.3 (libntfs-3g)
ERROR: Volume '/dev/sda1' is scheduled for a check or it was shutdown
uncleanly. Please boot Windows or use the --force option to progress.

Zipped metadata image attached (1.2 GB decompressed):
https://drive.google.com/file/d/15iGgh3kXW3IHm6ZhFw22NdKJw8huUD9a/view?usp=share_link

@solarisfire Thank you, this was very helpful. I managed to reproduce the crash, even though the original file had been deleted. Stack trace:

Failed to read full index block at 806912

Program received signal SIGSEGV, Segmentation fault.
0x0000fffff7f5b9c8 in ntfs_ib_write (icx=0xaaaaaaaee0c0, ib=0x0) at index.c:85
85		s64 ret, vcn = sle64_to_cpu(ib->index_block_vcn);
(gdb) bt
#0  0x0000fffff7f5b9c8 in ntfs_ib_write (icx=0xaaaaaaaee0c0, ib=0x0) at index.c:85
#1  0x0000fffff7f5bc24 in ntfs_index_ctx_free (icx=0xaaaaaaaee0c0) at index.c:155
#2  0x0000fffff7f5bc60 in ntfs_index_ctx_put (icx=0xaaaaaaaee0c0) at index.c:171
#3  0x0000fffff7f6fffc in ntfs_delete_reparse_index (ni=0xaaaaaaaed750) at reparse.c:1141
#4  0x0000fffff7f583b8 in ntfs_delete (vol=0xaaaaaaad5880, pathname=0x0, ni=0xaaaaaaaed750, dir_ni=0xaaaaaaafb350, name=0xaaaaaaaf4c70, name_len=7 '\a') at dir.c:2088
#5  0x0000aaaaaaaaabf4 in ntfs_fuse_rm (req=0xaaaaaaaf5950, parent=835559, name=0xfffff7d3e038 "sfc.dll", rm_type=RM_LINK) at lowntfs-3g.c:2868
#6  0x0000aaaaaaaaace4 in ntfs_fuse_unlink (req=0xaaaaaaaf5950, parent=835559, name=0xfffff7d3e038 "sfc.dll") at lowntfs-3g.c:2888
#7  0x0000aaaaaaab26c0 in do_unlink (req=0xaaaaaaaf5950, nodeid=835559, inarg=0xfffff7d3e038) at fuse_lowlevel.c:606
#8  0x0000aaaaaaab47c0 in fuse_ll_process (data=0xaaaaaaadffd0, buf=0xfffff7d3e010 "0", len=48, ch=0xaaaaaaadfdc0) at fuse_lowlevel.c:1337
#9  0x0000aaaaaaab5ec8 in fuse_session_process (se=0xaaaaaaae02e0, buf=0xfffff7d3e010 "0", len=48, ch=0xaaaaaaadfdc0) at fuse_session.c:87
#10 0x0000aaaaaaab0f84 in fuse_session_loop (se=0xaaaaaaae02e0) at fuse_loop.c:34
#11 0x0000aaaaaaaaf164 in main (argc=4, argv=0xfffffffff378) at lowntfs-3g.c:4849

I'm assigning this issue to myself for the moment while investigating. You can remove the image file from Google Drive now.

A fix has been pushed that should resolve this crash: 241ddb3
Thanks a lot for your help in reporting and debugging this issue. Please let us know if you are still having problems after the latest fixes.