dm-vdo/kvdo

VDO error: assertion "metadata bio has no next bio" failed.

thememika opened this issue · 2 comments

Environment:

  • Ubuntu 22.04.3 LTS (GNU/Linux 6.1.68 x86_64)
  • KVDO version: 35, release version: 0

When using a VDO device that was set up manually with vdoformat and dmsetup, messages like the following occasionally appear in the kernel ring buffer at err priority. The error and the call trace printed with it:

[1612941.063945] kvdo20:reqQ: assertion "metadata bio has no next bio" (vio->bio->bi_next == ((void *)0)) failed at /home/*****/mymodules/vdo/io-submitter.c:396
[1612941.063959] CPU: 12 PID: 3202791 Comm: kvdo20:reqQ Tainted: P        W  OE      6.1.68 #1                                            
[1612941.063963] Hardware name: OEM X79G/X79G, BIOS 4.6.5 08/02/2022        
[1612941.063965] Call Trace:                                                
[1612941.063968]  <TASK>                                                    
[1612941.063973]  dump_stack_lvl+0x4d/0x67                                  
[1612941.063983]  ? enter_zone_read_only_mode+0x50/0x50 [kvdo]             
[1612941.064016]  dump_stack+0x14/0x1a                                      
[1612941.064020]  uds_log_backtrace.cold+0x5/0xa [kvdo]                     
[1612941.064050]  uds_assertion_failed+0x72/0xa0 [kvdo]                     
[1612941.064085]  ? psi_group_change+0x20a/0x460                            
[1612941.064089]  ? sched_clock+0xd/0x20                                    
[1612941.064094]  vdo_submit_metadata_io+0x142/0x180 [kvdo]                 
[1612941.064127]  write_initialized_page+0x75/0x90 [kvdo]                   
[1612941.064149]  write_page+0x1cf/0x210 [kvdo]                             
[1612941.064171]  ? vdo_invoke_completion_callback_with_priority+0x69/0x90 [kvdo]                                                                       
[1612941.064197]  write_page_callback+0x12/0x20 [kvdo]                      
[1612941.064219]  notify_next_waiter+0x67/0x80 [kvdo]                       
[1612941.064248]  return_vio_to_pool+0x81/0xc0 [kvdo]                       
[1612941.064277]  finish_page_write+0x92/0x170 [kvdo]                       
[1612941.064300]  work_queue_runner+0xf5/0x260 [kvdo]                       
[1612941.064328]  ? var_wake_function+0x60/0x60                             
[1612941.064331]  ? get_current_thread_work_queue+0x60/0x60 [kvdo]         
[1612941.064358]  kthread+0xf4/0x120                                        
[1612941.064362]  ? kthread_complete_and_exit+0x30/0x30                     
[1612941.064365]  ret_from_fork+0x22/0x30                                   
[1612941.064371]  </TASK>
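
To track how often the assertion fires, the first line of the report can be picked out of the kernel log with a regular expression. This is only a sketch: the pattern is inferred from the single message quoted above, and the names `ASSERTION_RE` and `parse_vdo_assertion` are illustrative, not part of kvdo.

```python
import re

# Pattern inferred from the one log line quoted in this report; other kvdo
# versions may format the message differently (assumption).
ASSERTION_RE = re.compile(
    r'\[\s*(?P<ts>\d+\.\d+)\]\s+'
    r'(?P<thread>\S+): assertion "(?P<message>[^"]+)" '
    r'\((?P<expr>.*)\) failed at (?P<file>\S+):(?P<line>\d+)'
)

def parse_vdo_assertion(log_line):
    """Return the fields of a kvdo assertion message, or None if no match."""
    m = ASSERTION_RE.search(log_line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["ts"] = float(fields["ts"])    # seconds since boot
    fields["line"] = int(fields["line"])  # source line of the assertion
    return fields

sample = (
    '[1612941.063945] kvdo20:reqQ: assertion "metadata bio has no next bio" '
    '(vio->bio->bi_next == ((void *)0)) failed at '
    '/home/*****/mymodules/vdo/io-submitter.c:396'
)
info = parse_vdo_assertion(sample)
print(info["thread"], info["message"], info["line"])
```

Feeding each line of the kernel log through `parse_vdo_assertion` and counting the non-None results per thread would show whether the failures are tied to one device.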

The error doesn't seem to affect I/O: everything continues as if no error had occurred, and there is no drop in performance.
But I want to know whether this points to a misconfiguration, a bug in VDO, or some other issue that could affect the correctness of the device.
The kernel is Linux 6.1.68.
VDO was created with:

vdoformat --logical-size 1500G --uds-memory-size 0.25 --slab-bits=18 /dev/sdab1

DM table:

0 3145728000 vdo V4 /dev/sdab1 20916337 4096 32768 16380

vdodumpconfig /dev/sdab1

VDOConfig:                                                                                        
blockSize: 4096                                                                                 
logicalBlocks: 393216000                                                                        
physicalBlocks: 20916337                                                                        
slabSize: 262144                                                                                
recoveryJournalSize: 32768                                                                      
slabJournalBlocks: 224                                                                        
UUID: ***                                                     
ReleaseVersion: 0                                                                               
Nonce: ***                                                                        
IndexRegion: 1                                                                                  
DataRegion: 679128                                                                              
IndexConfig:                                                                                      
memory: 4294967040                                                                              
sparse: false
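
As a sanity check on the configuration, the sizes in the vdoformat command, the DM table, and the vdodumpconfig output can be compared arithmetically. The sketch below assumes that the `G` suffix in `--logical-size 1500G` means GiB (2^30 bytes) and that the second field of the DM table counts 512-byte sectors; the function name `logical_sizes` is made up for illustration.

```python
SECTOR_SIZE = 512   # device-mapper tables count 512-byte sectors
GIB = 1024 ** 3     # assumption: vdoformat's "G" suffix means GiB

def logical_sizes(logical_gib, block_size):
    """Return (sectors, blocks) for a logical size given in GiB."""
    total_bytes = logical_gib * GIB
    return total_bytes // SECTOR_SIZE, total_bytes // block_size

sectors, blocks = logical_sizes(1500, 4096)
slab_size = 1 << 18  # --slab-bits=18

print(sectors)    # 3145728000, the length field in the DM table
print(blocks)     # 393216000, logicalBlocks in vdodumpconfig
print(slab_size)  # 262144, slabSize in vdodumpconfig
```

Under those assumptions the three values agree with the DM table and the dumped configuration, and the physical block count in the DM table (20916337) also equals physicalBlocks, so the sizes themselves do not indicate a misconfiguration.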

There are other VDO devices in the system, created using LVM tools. They operate correctly.
Any explanation of what can cause the error is appreciated.

I have removed the code that triggers these messages and recompiled the module.
For now, my devices are working fine. I will post an update if anything happens.

Thanks for the report. It is probably safe for you to ignore these assertions. Nonetheless, this is not a condition that should occur, so we'll dig into it and try to figure out how it can happen.