nfs4: readdir crashes after NFS is reinitialized following a network disconnect
xiaoyuezhufeng opened this issue · 5 comments
I found this problem while running a network stability test.
To recover the link quickly, I unmount and destroy the NFS context and then reinitialize it as a new connection. The crash occurs after that.
The output looks like this:
nfs_stat64 failed: nfs_service failed. errno 11
nfs_stat64 failed: nfs_service failed. errno 11
ASAN:DEADLYSIGNAL
==7429==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fddbe32eaef bp 0x7ffc18fe6190 sp 0x7ffc18fe6150 T0)
==7429==The signal is caused by a READ memory access.
==7429==Hint: address points to the zero page.
#0 0x7fddbe32eaee in opendir_cb /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/libnfs-sync.c:1263
#1 0x7fddbe371b72 in nfs4_parse_readdir /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/nfs_v4.c:3704
#2 0x7fddbe370367 in nfs4_opendir_2_cb /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/nfs_v4.c:3571
#3 0x7fddbe37e558 in rpc_process_reply /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/pdu.c:368
#4 0x7fddbe3807f0 in rpc_process_pdu /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/pdu.c:643
#5 0x7fddbe381f25 in rpc_read_from_socket /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/socket.c:396
#6 0x7fddbe383302 in rpc_service /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/socket.c:554
#7 0x7fddbe321347 in nfs_service /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/libnfs.c:274
#8 0x7fddbe32ae0a in wait_for_nfs_reply /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/libnfs-sync.c:282
#9 0x7fddbe32ec90 in nfs_opendir /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/libnfs-sync.c:1286
#10 0x4016a2 in main /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/examples/nfs-fh.c:112
#11 0x7fddbd05ef29 in __libc_start_main ../csu/libc-start.c:308
#12 0x4011f9 in _start (/workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/examples/.libs/nfs-fh+0x4011f9)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /workspace/nggap/trunk/blfs81/tmp_x86_64/libnfs-libnfs-5.0.2/lib/libnfs-sync.c:1263 in opendir_cb
==7429==ABORTING
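The trace above shows a read from address zero inside opendir_cb, reached through the reply-processing path. A common defensive pattern for this kind of completion callback is to check the reported status before treating the payload as a valid object. The following is only a minimal sketch of that pattern: `dir_handle`, `cb_result` and `opendir_done` are hypothetical names invented for illustration, not libnfs types or functions, and this is not claimed to be the change that went into master.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT libnfs types. */
struct dir_handle {
    int entries;
};

struct cb_result {
    int is_finished;          /* set once the callback has run */
    int status;               /* 0 on success, negative on failure */
    struct dir_handle *dir;   /* valid only when status == 0 */
};

/*
 * Completion callback in the style of a synchronous wrapper.
 * The async layer passes a status and an opaque payload. On failure the
 * payload may be NULL (or something other than a directory), so it is
 * only stored as a directory handle after the status has been checked.
 */
static void opendir_done(int status, void *data, void *private_data)
{
    struct cb_result *res = private_data;

    assert(res != NULL);

    res->is_finished = 1;
    res->status = status;
    res->dir = NULL;

    if (status < 0) {
        /* Request failed: leave dir as NULL and report the error. */
        return;
    }
    if (data == NULL) {
        /* "Success" without a payload is treated as a failure. */
        res->status = -1;
        return;
    }
    res->dir = data;
}
```

With a callback shaped like this, the caller inspects `status` first and only uses `dir` when it is zero, so a reply that arrives in an error state after a reconnect cannot lead to a NULL dereference in the caller.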
While the program is running, disconnect the NFS server's IP address for 20 s, then reconnect it, and repeat. The crash occurs intermittently.
The test uses nfs-fh.c with the following changes:
int reconnect = 0;
struct nfs_stat_64 nfs_st;
struct nfsdir *fd = NULL;
struct nfsdirent *nfs_entry = NULL;
while (1) {
    nfs = nfs_init_context();
    if (nfs == NULL) {
        fprintf(stderr, "failed to init context\n");
        goto finished;
    }
    (void)nfs_set_version(nfs, 4);
    (void)nfs_set_autoreconnect(nfs, 1);
    (void)nfs_set_dircache(nfs, 0);
    url = nfs_parse_url_full(nfs, argv[1]);
    if (url == NULL) {
        fprintf(stderr, "%s\n", nfs_get_error(nfs));
        ret = 1;
        goto finished;
    }
    if (nfs_mount(nfs, url->server, url->path) != 0) {
        fprintf(stderr, "Failed to mount nfs share : %s\n",
                nfs_get_error(nfs));
        ret = 1;
        goto finished;
    }
    while (1) {
        ret = nfs_opendir(nfs, "/abc", &fd);
        if (ret < 0) {
            printf("nfs_opendir failed: %s. errno %d\n", nfs_get_error(nfs), errno);
            break;
        }
        while (1) {
            char fullname[512];
            nfs_entry = nfs_readdir(nfs, fd);
            if (!nfs_entry) {
                break;
            }
            if (strcmp(nfs_entry->name, ".") == 0 ||
                strcmp(nfs_entry->name, "..") == 0) {
                continue;
            }
            snprintf(fullname, sizeof(fullname), "/abc/%s", nfs_entry->name);
            ret = nfs_stat64(nfs, fullname, &nfs_st);
            if (ret < 0) {
                printf("nfs_stat64 failed: %s. errno %d\n", nfs_get_error(nfs), errno);
                break;
            }
            printf("fullname %s\n", fullname);
        }
        nfs_closedir(nfs, fd);
    }
    nfs_umount(nfs);
    nfs_destroy_context(nfs);
    nfs = NULL;
}
In my other programs, the crash can also occur in the nfs_destroy_context function:
libnfs-libnfs-5.0.2/lib/libnfs-sync.c:186
./configure --prefix=/usr --enable-examples CFLAGS="-g2 -fno-omit-frame-pointer -fsanitize=address -lasan"
Please try current master, I have added a fix that should avoid this crash.
It worked. The problem was solved, thanks!