[Question]: Why is every head config saved with "vertical_and_slash"?

Describe the issue

Regardless of which pattern scores best during the search, the search_pattern function saves every head's config as "vertical_and_slash".

MInference/minference/modules/minference_forward.py

Lines 198 to 216 in b5b8745

    
    for ty, fc in [("stream_llm", stream_llm), ("vertical_and_slash", vertical_and_slash), ("block_sparse", block_sparse)]:
        if ty == "stream_llm":
            vs_list = [(100, 800)]
        elif ty == "vertical_and_slash":
            vs_list = [(30, 800), (100, 750), (500, 700), (3500, 100)]
        else:
            vs_list = [(8, 1)]
        for v_size, s_size in vs_list:
            score = fc(v_size, s_size)
            score = score.item()
            all_info.append([ty, v_size, s_size, score])
            if score > best_score:
                best_score = score
                best_s, best_v = s_size, v_size
                best_ty = ty
    if best_ty == "stream_llm":
        best_ty = "vertical_and_slash"
    if best_ty == "block_sparse":
        best_ty, best_v, best_s = "vertical_and_slash", 1000, 6096

The configs saved in the repo appear to contain only this pattern type.

Specifically those lines:

 if best_ty == "stream_llm": 
     best_ty = "vertical_and_slash" 
 if best_ty == "block_sparse": 
     best_ty, best_v, best_s = "vertical_and_slash", 1000, 6096
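To make the effect of those two overrides concrete, here is a minimal, torch-free sketch of the same selection logic. It is not the MInference code: select_pattern, make_scores, and CANDIDATES are hypothetical names, the scoring callables return made-up floats instead of tensor scores, and the starting value of best_score is an assumption because the quoted snippet does not show it.

```python
# Simplified stand-in for the selection loop quoted in the issue.
# Candidate (v_size, s_size) pairs are copied from the quoted snippet.
CANDIDATES = {
    "stream_llm": [(100, 800)],
    "vertical_and_slash": [(30, 800), (100, 750), (500, 700), (3500, 100)],
    "block_sparse": [(8, 1)],
}


def select_pattern(score_fns):
    """Return (searched_type, saved_entry) for one head.

    `score_fns` is a hypothetical mapping: pattern name -> callable(v, s) -> float.
    """
    # Assumption: the initial best_score is not shown in the issue; -inf is used here.
    best_score, best_ty, best_v, best_s = float("-inf"), None, None, None
    for ty, sizes in CANDIDATES.items():
        for v_size, s_size in sizes:
            score = score_fns[ty](v_size, s_size)
            if score > best_score:
                best_score, best_ty, best_v, best_s = score, ty, v_size, s_size
    searched_ty = best_ty  # what the search actually preferred
    # The two overrides from the issue: both alternatives are relabelled.
    if best_ty == "stream_llm":
        best_ty = "vertical_and_slash"
    if best_ty == "block_sparse":
        best_ty, best_v, best_s = "vertical_and_slash", 1000, 6096
    return searched_ty, (best_ty, best_v, best_s, best_score)


def make_scores(winner):
    """Hypothetical scorers in which `winner` always scores highest."""
    return {ty: (lambda v, s, ty=ty: 1.0 if ty == winner else 0.5) for ty in CANDIDATES}


results = {w: select_pattern(make_scores(w)) for w in CANDIDATES}
for winner, (searched, saved) in results.items():
    print(f"search winner={searched:<18} saved={saved}")
```

In this sketch a stream_llm winner keeps its searched sizes (100, 800) under the new label, while a block_sparse winner is replaced by the fixed pair (1000, 6096), so the saved type is "vertical_and_slash" in all three cases.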

When doing the forward pass, I think this means that we never route to anything other than the vertical_and_slash impl / kernels.
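One way to check which pattern types a saved config could ever route to is to tally the type field across all heads. The sketch below assumes a layout of one dict per layer, mapping a head index to an entry shaped like the [ty, v_size, s_size, score] rows appended to all_info in the quoted snippet. That layout, the count_pattern_types helper, and the example values (including the scores) are illustrative assumptions, not taken from the repo's files.

```python
from collections import Counter

# Hypothetical in-memory config: one dict per layer, head index -> entry.
# Entries mimic the [ty, v_size, s_size, score] rows from the quoted snippet;
# the scores are made-up numbers for illustration only.
example_config = [
    {"0": ["vertical_and_slash", 1000, 6096, 0.91], "1": ["vertical_and_slash", 30, 800, 0.99]},
    {"0": ["vertical_and_slash", 100, 800, 0.97], "1": ["vertical_and_slash", 3500, 100, 0.95]},
]


def count_pattern_types(config):
    """Tally the pattern type (first element) of every head entry."""
    return Counter(entry[0] for layer in config for entry in layer.values())


counts = count_pattern_types(example_config)
print(counts)
```

If a real config is stored as JSON in this shape, loading it with json.load and passing the result to the same helper would show whether any type other than vertical_and_slash appears.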

Is this a bug or intended? The experiment docs cite the use of this search_pattern function.

On the other hand:

Search pattern v2

MInference/minference/modules/minference_forward.py

Line 220 in b5b8745

def search_pattern_v2(q, k, v, head):

search_pattern_v2 does appear to save the pattern type under specific names, which would route heads to different pattern implementations.

Do we need to use search_pattern_v2 to replicate the results of the paper? Or are the vertical_and_slash settings actually enough to pass needle-in-a-haystack for long sequences?

Hi @fmmoret, thanks for your feedback.

The configs provided in the repo can reproduce the results from the paper. This means that the vertical_and_slash settings are sufficient to pass the Needle In A Haystack test for long sequences.

The search_pattern function reroutes to vertical_and_slash because our tests have shown that this setting offers better generalization and efficiency across different context windows and tasks.
