westerndigitalcorporation/zenfs

Page fault once the GC is triggered

Issue description: After running db_bench with the fillrandom workload, the system always hits a page fault once a few zones have been garbage collected.

Environment
TerarkDB: 1.4
ZenFS Version: latest version
Operating System: Linux debian 5.10
Hardware Configuration: 3.2TB ZNS SSD and 16GB DRAM

Script to reproduce:

  1. zenfs mkfs --zbd=$DEVICE --aux_path=/tmp/zbd_$DEVICE --force=true --enable_gc=true
  2. db_bench \
    --zbd_path=$DEVICE \
    --benchmarks=fillrandom \
    --readwritepercent=99 \
    --histogram=1 \
    --statistics=1 \
    --enable_lazy_compaction=0 \
    --level0_file_num_compaction_trigger=4 \
    --sync=1 \
    --wal_bytes_per_sync=32768 \
    --threads=16 \
    --num_levels=7 \
    --key_size=36 \
    --value_size=20000 \
    --level_compaction_dynamic_level_bytes=true \
    --mmap_read=false \
    --use_terark_table=false \
    --blob_size=1024 \
    --blob_gc_ratio=0.0625 \
    --num=548768010 \
    --max_write_buffer_number=20 \
    --benchmark_write_rate_limit=300000000 \
    --batch_size=100 \
    --disable_auto_compactions \
    --db=test_fill \
    --write_buffer_size=536870912 \
    --zenfs_low_gc_ratio=0.3 \
    --zenfs_high_gc_ratio=0.6 \
    --zenfs_force_gc_ratio=0.9

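Here is the call trace from the kernel log. It passes through the page-fault handler (exc_page_fault, filemap_fault) and ends in the OOM killer (out_of_memory, oom_kill_process).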

debian kernel: [26158.433496][T19093] Call Trace:
debian kernel: [26158.433502][T19093] dump_stack+0x6d/0x88
debian kernel: [26158.433503][T19093] dump_header+0x4a/0x1e4
debian kernel: [26158.433504][T19093] oom_kill_process.cold.37+0xb/0x10
debian kernel: [26158.433505][T19093] out_of_memory+0x1a8/0x4e0
debian kernel: [26158.433507][T19093] __alloc_pages_slowpath.constprop.106+0xb29/0xc10
debian kernel: [26158.433508][T19093] __alloc_pages_nodemask+0x2cc/0x300
debian kernel: [26158.433509][T19093] pagecache_get_page.part.62+0xe2/0x440
debian kernel: [26158.433510][T19093] filemap_fault+0x6a7/0xa10
debian kernel: [26158.433512][T19093] ? xas_load+0x8/0x80
debian kernel: [26158.433524][T19093] ext4_filemap_fault+0x2c/0x40 [ext4]
debian kernel: [26158.433526][T19093] __do_fault+0x30/0x100
debian kernel: [26158.433527][T19093] handle_mm_fault+0x1201/0x1600
debian kernel: [26158.433529][T19093] do_user_addr_fault+0x1b0/0x5c0
debian kernel: [26158.433530][T19093] exc_page_fault+0x7f/0x170
debian kernel: [26158.433532][T19093] ? asm_exc_page_fault+0x8/0x30
debian kernel: [26158.433532][T19093] asm_exc_page_fault+0x1e/0x30
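
To help triage similar reports, here is a minimal sketch that reads kernel log lines like the ones above and reports whether the trace contains both page-fault frames and OOM-killer frames. The helper names (`frame_symbols`, `classify`) and the result labels are hypothetical, and the frame lists are taken only from this trace, so treat it as a rough heuristic rather than a general kernel log parser.

```python
import re

# Frame names copied from the trace in this issue (assumption: other
# kernels or configurations may use different symbols).
OOM_FRAMES = ("out_of_memory", "oom_kill_process")
FAULT_FRAMES = ("exc_page_fault", "handle_mm_fault", "filemap_fault")

# Matches the symbol of a frame such as "dump_stack+0x6d/0x88" or
# "? xas_load+0x8/0x80" that follows the "[time][Tpid]" prefix.
FRAME_RE = re.compile(
    r"\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frame_symbols(lines):
    """Return the function names found in the call-trace lines, in order."""
    symbols = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            # Drop compiler suffixes such as ".cold.37" or ".constprop.106".
            symbols.append(match.group(1).split(".")[0])
    return symbols


def classify(lines):
    """Label a trace by which of the known frame groups it contains."""
    symbols = set(frame_symbols(lines))
    has_oom = any(name in symbols for name in OOM_FRAMES)
    has_fault = any(name in symbols for name in FAULT_FRAMES)
    if has_oom and has_fault:
        return "oom-during-page-fault"
    if has_oom:
        return "oom"
    if has_fault:
        return "page-fault"
    return "unknown"


# A few lines copied from the trace above.
SAMPLE = [
    "debian kernel: [26158.433496][T19093] Call Trace:",
    "debian kernel: [26158.433504][T19093] oom_kill_process.cold.37+0xb/0x10",
    "debian kernel: [26158.433505][T19093] out_of_memory+0x1a8/0x4e0",
    "debian kernel: [26158.433510][T19093] filemap_fault+0x6a7/0xa10",
    "debian kernel: [26158.433530][T19093] exc_page_fault+0x7f/0x170",
]

print(classify(SAMPLE))  # prints "oom-during-page-fault"
```

For this trace the label is "oom-during-page-fault": the fault handler could not allocate a page, so the kernel invoked the OOM killer. That points at memory exhaustion on the 16GB host during GC rather than an invalid memory access.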