Issue with data type of the import model

I can see from this line of code

Line 119 in 54cf7c2

d["operand_precision"] = {"O": 16, "O_final": 8, "W": 8, "I": 8}

that the precisions of the I, W, and O operands are hard-coded during conv parsing (8 bits for I, W, and the final output, 16 bits for the partial-sum output). Also, from the code segment

Lines 164 to 167 in 54cf7c2

    
    ia_data_type, oa_data_type, w_data_type = get_input_output_weight_data_type(
        self.node, self.onnx_model
    )

it seems that the data types of the layer's I/W/O operands are acquired but not used later in the code. Is there an assumption behind this (e.g. that the model has 8-bit precision)? Would that affect the cost estimation? Any help would be much appreciated!

Hi, yes, you are correct that these precisions are assumed. Accelerators traditionally use 8 bits for the operands and a higher precision for the partial output sums. Here, 16 bits is assumed, but 24 bits is also used frequently.
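To give some intuition for why the partial sums need more bits than the operands, here is a minimal sketch of the worst-case arithmetic. It assumes signed two's-complement operands and no saturation or truncation; `accumulator_bits` is a hypothetical helper written for this illustration, not something taken from the parser code, and a real design may well choose a narrower accumulator than this bound.

```python
def accumulator_bits(num_products: int, operand_bits: int = 8) -> int:
    """Worst-case signed width needed to sum `num_products` products
    of two signed `operand_bits`-wide integers without overflow."""
    # Most negative operand value, e.g. -128 for 8 bits.
    most_negative = -(1 << (operand_bits - 1))
    # The largest possible product is (-128) * (-128) = 16384;
    # the worst case is every product hitting that value.
    worst_case_sum = num_products * most_negative * most_negative
    # bit_length() gives the magnitude bits; add one for the sign bit.
    return worst_case_sum.bit_length() + 1


print(accumulator_bits(1))    # a single 8-bit x 8-bit product
print(accumulator_bits(9))    # e.g. one 3x3 kernel window, one channel
print(accumulator_bits(256))  # 256 accumulated products
```

Under these assumptions a single product already needs 16 bits, and accumulating 256 of them needs 24 bits, which lines up with the two widths mentioned above.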

If your accelerator uses a larger precision, this will affect the cost estimation, because data fetches to/from the memories become more expensive and fewer operands fit in the lower memory levels. You can modify the operand_precision entry accordingly.
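As a rough illustration of both effects, the sketch below compares the hard-coded precisions with a wider setting. The buffer size, the wider precision values, and the helper names `elements_that_fit` and `traffic_bits` are assumptions made for this example only; the tool's actual cost model is more detailed than this.

```python
# Precisions as hard-coded in the parser line quoted in the question.
default_precision = {"O": 16, "O_final": 8, "W": 8, "I": 8}
# Hypothetical wider setting (values chosen only for illustration).
wide_precision = {"O": 32, "O_final": 16, "W": 16, "I": 16}

BUFFER_BITS = 64 * 1024 * 8  # hypothetical 64 KiB local buffer


def elements_that_fit(capacity_bits: int, precision_bits: int) -> int:
    """Number of operands of the given width that a memory can hold."""
    return capacity_bits // precision_bits


def traffic_bits(num_elements: int, precision_bits: int) -> int:
    """Bits moved when fetching `num_elements` operands once."""
    return num_elements * precision_bits


for name, prec in (("default", default_precision), ("wide", wide_precision)):
    fit = elements_that_fit(BUFFER_BITS, prec["W"])
    moved = traffic_bits(1_000_000, prec["W"])
    print(f"{name}: {fit} weights fit, {moved} bits moved per 1M weights")
```

Doubling the weight precision halves the number of weights the buffer can hold and doubles the bits moved per fetch, which is why the estimated cost changes when operand_precision is edited.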
