openvinotoolkit/nncf

[Good First Issue][NNCF]: Add tests for torch device utils

daniil-lyakhov opened this issue · 5 comments

Dear good first issue solvers, greetings! 🐱
Don't miss the opportunity to contribute to our beloved project!

Context

The functions get_model_device, is_multidevice, and get_model_dtype in https://github.com/openvinotoolkit/nncf/blob/develop/nncf/torch/utils.py are not covered by tests and lack proper docstrings.

What needs to be done?

The tasks are:
*) Extend the file https://github.com/openvinotoolkit/nncf/blob/develop/tests/torch/test_utils.py with tests for the functions mentioned above. Please cover all 3 possible scenarios:

  1. Model has no parameters
  2. Model has parameters and all of them are placed on the same device
  3. Model has parameters placed on different devices

*) Add proper docstrings for the functions mentioned above
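For anyone picking this up, the three scenarios can be exercised without a GPU by using the "meta" device for the multi-device case. The helpers below are a hedged sketch: hypothetical re-implementations mirroring the presumed behavior of the nncf.torch.utils functions (first-parameter device, CPU default), not the actual NNCF code.

```python
import torch
import torch.nn as nn


def get_model_device(model: nn.Module) -> torch.device:
    # Presumed behavior: device of the first parameter, CPU if there are none.
    try:
        return next(model.parameters()).device
    except StopIteration:
        return torch.device("cpu")


def is_multidevice(model: nn.Module) -> bool:
    # Presumed behavior: True if parameters live on more than one device.
    devices = {p.device for p in model.parameters()}
    return len(devices) > 1


# Scenario 1: model has no parameters
no_params = nn.ReLU()
assert get_model_device(no_params).type == "cpu"
assert not is_multidevice(no_params)

# Scenario 2: all parameters are placed on the same device
single = nn.Linear(4, 2)
assert get_model_device(single).type == "cpu"
assert not is_multidevice(single)

# Scenario 3: parameters on different devices
# (the "meta" device avoids needing a second physical device)
multi = nn.Sequential(nn.Linear(4, 2), nn.Linear(2, 1, device="meta"))
assert get_model_device(multi).type == "cpu"
assert is_multidevice(multi)
```

In the real tests these helpers would of course be imported from nncf.torch.utils instead of re-implemented.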

Example Pull Requests

#2526

Resources

Contact points

@daniil-lyakhov

Ticket

No response

.take

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

Hi @daniil-lyakhov, thank you for posting a GFI, I was looking for one :)

I was wondering whether it's safe to consider only the first network layer in the function get_model_dtype.

As far as I know, there are some edge cases in which torch allows you to use different dtypes within the same model as long as you keep the two workflows separate.

Consider this example:

import torch
import torch.nn as nn

class MixedTypeNet(nn.Module):
    def __init__(self):
        super(MixedTypeNet, self).__init__()
        self.fc_f16 = nn.Linear(10, 1).to(torch.float16)
        self.fc_f32 = nn.Linear(10, 1).to(torch.float32)

    def forward(self, f16, f32):
        x_f16 = torch.relu(self.fc_f16(f16))
        x_f32 = torch.relu(self.fc_f32(f32))
        return x_f16, x_f32

# fall back to CPU when CUDA is not available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = MixedTypeNet().to(device)

input_data_f16 = torch.randn(1, 10, dtype=torch.float16).to(device)
input_data_f32 = torch.randn(1, 10, dtype=torch.float32).to(device)

x_f16, x_f32 = net(input_data_f16, input_data_f32)

print("Output (f16):", x_f16)
print("Output (f32):", x_f32)
print(next(net.parameters()).dtype)

In this case, the network contains 2 different dtypes, but next(net.parameters()).dtype would return torch.float16. Is this situation handled by the current get_model_dtype implementation? If not, is it relevant?
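The first-parameter behavior can be checked without a GPU at all, since no forward pass is needed. This is a hedged CPU-only sketch of the same observation: the first-parameter heuristic sees only float16, while iterating all parameters reveals the mixed dtypes.

```python
import torch
import torch.nn as nn


class MixedTypeNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Two submodules deliberately kept in different dtypes
        self.fc_f16 = nn.Linear(10, 1).to(torch.float16)
        self.fc_f32 = nn.Linear(10, 1).to(torch.float32)


net = MixedTypeNet()

# The first registered parameter (fc_f16's weight) is float16 ...
assert next(net.parameters()).dtype is torch.float16

# ... but the model as a whole contains two dtypes.
assert {p.dtype for p in net.parameters()} == {torch.float16, torch.float32}
```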

Hi @DaniAffCH, thank you for the collaboration, this is a nice catch! Since get_model_device returns only the device of the first parameter, I think it is valid for get_model_dtype to return the dtype of the first parameter as well. As multi-dtype models are not really common, I suggest documenting that the functions return the first parameter's device/dtype if the model has parameters and a default value otherwise, and leaving the functions as is.
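The suggested documentation could look like the sketch below. The wording and the default value are assumptions for illustration, not the merged NNCF text:

```python
import torch


def get_model_dtype(model: torch.nn.Module) -> torch.dtype:
    """
    Returns the dtype of the model's first parameter.

    :param model: The target model.
    :return: The dtype of the first parameter if the model has any
        parameters, torch.float32 otherwise.
    """
    try:
        return next(model.parameters()).dtype
    except StopIteration:
        # Default for a parameterless model (assumed here)
        return torch.float32


# A parameterless module falls back to the default dtype
assert get_model_dtype(torch.nn.ReLU()) is torch.float32
# A regular module reports its first parameter's dtype
assert get_model_dtype(torch.nn.Linear(4, 2)) is torch.float32
```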

CC: @alexsu52

@DaniAffCH, thank you for the contribution!