[Good First Issue][NNCF]: Add tests for torch device utils
daniil-lyakhov opened this issue · 5 comments
Dear good first issue solvers, greetings!
Don't miss the opportunity to contribute to our beloved project!
Context
The functions get_model_device, is_multidevice and get_model_dtype from the file https://github.com/openvinotoolkit/nncf/blob/develop/nncf/torch/utils.py are not covered by tests and lack proper docstrings.
What needs to be done?
The tasks are:
- To extend the file https://github.com/openvinotoolkit/nncf/blob/develop/tests/torch/test_utils.py with tests for the functions mentioned above. Please cover all three possible scenarios:
  - The model has no parameters
  - The model has parameters, and all parameters are placed on the same device
  - The model has parameters placed on different devices
- To add proper docstrings for the functions mentioned above
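The three scenarios above could be sketched as follows. Note that get_model_device and is_multidevice here are stand-in reimplementations approximating the expected behavior of the nncf helpers (first-parameter convention, CPU fallback for parameterless models) so the sketch is self-contained; the real tests would import them from nncf.torch.utils instead. Moving one submodule to the "meta" device lets the multi-device case run without a GPU.

```python
import torch
import torch.nn as nn


# Stand-ins for the nncf/torch/utils.py helpers (assumed behavior,
# not the real implementations).
def get_model_device(model: nn.Module) -> torch.device:
    try:
        return next(model.parameters()).device
    except StopIteration:
        # Assumed default for models without parameters.
        return torch.device("cpu")


def is_multidevice(model: nn.Module) -> bool:
    devices = {p.device for p in model.parameters()}
    return len(devices) > 1


# Scenario 1: model with no parameters.
empty = nn.ReLU()
assert get_model_device(empty) == torch.device("cpu")
assert not is_multidevice(empty)

# Scenario 2: all parameters on the same device.
single = nn.Linear(4, 4)
assert get_model_device(single).type == "cpu"
assert not is_multidevice(single)

# Scenario 3: parameters spread across different devices.
multi = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 4).to("meta"))
assert is_multidevice(multi)
```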
Example Pull Requests
Resources
- Contribution guide - start here!
- Intel DevHub Discord channel - engage in discussions, ask questions and talk to OpenVINO developers
- How to link your Pull Request to an issue
Contact points
Ticket
No response
.take
Thank you for looking into this issue! Please let us know if you have any questions or require any help.
Hi @daniil-lyakhov, thank you for posting a GFI, I was looking for one :)
I was wondering whether it is safe to consider only the first network layer in the function get_model_dtype.
As far as I know, there are some edge cases in which torch allows you to use different dtypes within the same model, as long as you keep the two workflows separate.
Consider this example:
import torch
import torch.nn as nn

class MixedTypeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc_f16 = nn.Linear(10, 1).to(torch.float16)
        self.fc_f32 = nn.Linear(10, 1).to(torch.float32)

    def forward(self, f16, f32):
        x_f16 = torch.relu(self.fc_f16(f16))
        x_f32 = torch.relu(self.fc_f32(f32))
        return x_f16, x_f32

device = torch.device("cuda")  # requires a CUDA device
net = MixedTypeNet().to(device)
input_data_f16 = torch.randn(1, 10, dtype=torch.float16).to(device)
input_data_f32 = torch.randn(1, 10, dtype=torch.float32).to(device)
x_f16, x_f32 = net(input_data_f16, input_data_f32)
print("Output (f16):", x_f16)
print("Output (f32):", x_f32)
print(next(net.parameters()).dtype)
In this case, the network contains two different dtypes, but next(net.parameters()).dtype would return torch.float16. Is this situation considered in the current get_model_dtype implementation? If not, is it relevant?
Hi @DaniAffCH, thank you for the collaboration! This is a nice catch! Since the function get_model_device returns only the device of the first parameter, I think it is valid for get_model_dtype to return the dtype of the first parameter as well. As cases like multi-dtype models are not very common, I suggest documenting the behavior as "the device/dtype of the model's first parameter if one exists, and a default value otherwise" and leaving the functions as is.
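The convention described above could look like this. This is a minimal sketch of the assumed behavior, not the actual nncf implementation; the default dtype for parameterless models is an assumption here.

```python
import torch
import torch.nn as nn


# Sketch mirroring the "first parameter, default otherwise" convention
# discussed above (assumed behavior, not the real nncf code).
def get_model_dtype(model: nn.Module) -> torch.dtype:
    try:
        return next(model.parameters()).dtype
    except StopIteration:
        # Assumed default when the model has no parameters.
        return torch.float32


mixed = nn.Sequential(
    nn.Linear(10, 1).to(torch.float16),
    nn.Linear(10, 1).to(torch.float32),
)
# Only the first parameter's dtype is reported for a mixed-dtype model.
assert get_model_dtype(mixed) == torch.float16
# A parameterless model falls back to the default.
assert get_model_dtype(nn.ReLU()) == torch.float32
```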
CC: @alexsu52
@DaniAffCH, thank you for the contribution!