IBM/aihwkit

Model Initialized outside [w_min, w_max]

Zhaoxian-Wu opened this issue · 7 comments

Description

When training an analog component, I find that the weight of the analog layer can fall outside the range [w_min, w_max], where w_min and w_max are parameters of the PulsedDevice.

How to reproduce

from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.configs.devices import SoftBoundsReferenceDevice
    
device = SoftBoundsReferenceDevice(
    construction_seed=10,
    w_max = 0.1,
    w_min = -0.1
)
rpu_config = SingleRPUConfig(device=device)

model = AnalogLinear(1, 1, False, rpu_config=rpu_config)
print(f'w_max: {rpu_config.device.w_max}, w_min: {rpu_config.device.w_min}, weight: {model.get_weights()[0].item()}')

The output is

(analog) zhaoxian@server:~/Desktop/$ python main.py 
w_max: 0.1, w_min: -0.1, weight: 0.1632751077413559

Here the weight 0.16 is larger than w_max, which really confuses me. Am I missing something like a mapping operator? And what is the exact meaning of the parameters w_min and w_max?

Expected behavior

The weight should lie within [w_min, w_max] = [-0.1, 0.1].

Other information

  • Pytorch version: 2.1.2+cu121
  • Package version: 0.8.0
  • OS: Ubuntu 20.04.2
  • Python version: Python 3.10
  • Conda version (or N/A): conda 23.10.0

You need to use get_weights(apply_weight_scaling=False) to get the restricted weights.
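For example, with the model constructed above:

# Read back the analog weights without the digital output scaling applied:
weights, bias = model.get_weights(apply_weight_scaling=False)
print(weights)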

Actually, apart from the weight scaling, which might change the effective maximal weight, even when no weight scaling is used the w_max parameter is only the mean maximal weight across devices. There is device-to-device variation, too, controlled by w_max_dtod. The actual w_max per synapse can be read out with analog_tile.get_hidden_parameters(). See also the API documentation for these parameters, where this is described in more detail.
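A minimal sketch of reading these out, assuming a recent aihwkit version where analog layers expose an analog_tiles() generator (the key names in the returned dict depend on the device model, so print them first):

for tile in model.analog_tiles():
    hidden = tile.get_hidden_parameters()  # OrderedDict of per-synapse tensors
    for name, values in hidden.items():
        print(name, values)  # look for the per-synapse bound entries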

@kaoutar55 this issue should be converted to a discussion as it is not a bug.


I set both dtod parameters to zero and used get_weights(apply_weight_scaling=False), but the same thing happens:

from aihwkit.nn import AnalogLinear
from aihwkit.simulator.configs import SingleRPUConfig
from aihwkit.simulator.configs.devices import SoftBoundsReferenceDevice
    
device = SoftBoundsReferenceDevice(
    construction_seed=10,
    w_max = 0.1,
    w_min = -0.1,
    w_max_dtod=0,
    w_min_dtod=0,
)
rpu_config = SingleRPUConfig(device=device)

model = AnalogLinear(1, 1, False, rpu_config=rpu_config)
weight = model.get_weights(apply_weight_scaling=False)[0].item()
print(f'w_max: {rpu_config.device.w_max}, w_min: {rpu_config.device.w_min}, weight: {weight}')

I got the result

(analog) zhaoxian@server:~/Desktop/$ python tmain.py 
w_max: 0.1, w_min: -0.1, weight: -0.10000000149011612

The weight still appears to lie slightly outside the bound.
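This particular value, however, is just the nearest float32 number to -0.1, so here the readout sits exactly at w_min rather than beyond it, as a quick check confirms:

import torch

# -0.1 is not exactly representable in binary floating point; the nearest
# float32 value is -0.10000000149011612, matching the readout above.
print(torch.tensor(-0.1, dtype=torch.float32).item())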

Actually, I set every parameter I knew of to its ideal value, but I still got the same thing. The following is a more detailed version:

import math

import torch
import torch.nn as nn
from aihwkit.nn import AnalogLinear
from aihwkit.optim import AnalogSGD

from aihwkit.simulator.configs import (
    SingleRPUConfig,
    WeightNoiseType,
    NoiseManagementType,
    BoundManagementType,
)
from aihwkit.simulator.configs.devices import SoftBoundsReferenceDevice
from aihwkit.simulator.parameters import IOParameters


# ==================
INPUT_SIZE = 1
def get_loss(model, for_training=True, analog_exact=True):
    criterion = nn.MSELoss()
    outputs = model(torch.ones(INPUT_SIZE))
    loss = criterion(outputs.view(-1), torch.tensor([0.5]))
    return loss
    
def get_IO():
    io_param = IOParameters(
        is_perfect  = True,
        inp_bound  = 10,
        out_bound  = 10,
        w_noise    = 0,
        w_noise_type = WeightNoiseType.NONE,
        inp_noise  = 0.,
        out_noise  = 0.,
        inp_res    = 0,
        out_res    = 0,
        ir_drop    = 0,
        ir_drop_g_ratio=0,
        noise_management = NoiseManagementType.NONE,
        bound_management = BoundManagementType.NONE,
        v_offset_w_min = 0,
    )
    return io_param

device = SoftBoundsReferenceDevice(
    construction_seed=10,
    dw_min = 1e-4,
    dw_min_dtod = 0,
    dw_min_std = 0,
    dw_min_dtod_log_normal=False,
    write_noise_std = 0,
    corrupt_devices_prob = 0,
    corrupt_devices_range = 0,
    up_down = 0,
    up_down_dtod = 0,
    slope_up_dtod = 0,
    slope_down_dtod = 0,
    reference_mean = 0,
    reference_std = 0,
    w_max = 0.1,
    w_min = -0.1,
    w_min_dtod = 0,
    w_max_dtod = 0,
    reset_std = 0,
    subtract_symmetry_point = True,
    perfect_bias = True,
)
rpu_config = SingleRPUConfig(device=device)
rpu_config.forward = get_IO()
rpu_config.backward = get_IO()
rpu_config.update.desired_bl = 5000
rpu_config.update.sto_round = True

model = AnalogLinear(INPUT_SIZE, 1, False, rpu_config=rpu_config)
torch.manual_seed(618)
optimizer = AnalogSGD(model.parameters(), lr=0.1)

print(f'w_max: {rpu_config.device.w_max}, w_min: {rpu_config.device.w_min}')
for iter_idx in range(10):
    weight = model.get_weights(apply_weight_scaling=False)[0].item()
    print(f'iteration {iter_idx}, weight: {weight}')
    
    model.eval()
    loss = get_loss(model)
    
    optimizer.zero_grad()
    loss.backward()

    optimizer.step()
    optimizer.zero_grad()

Running it, I got the following output:

(analog) zhaoxian@server:~/Desktop/$ python main.py 
w_max: 0.1, w_min: -0.1
iteration 0, weight: 0.2317507266998291
iteration 1, weight: 0.28406578302383423
iteration 2, weight: 0.3259482979774475
iteration 3, weight: 0.3595556616783142
iteration 4, weight: 0.38660740852355957
iteration 5, weight: 0.4084051549434662
iteration 6, weight: 0.42603760957717896
iteration 7, weight: 0.44019657373428345
iteration 8, weight: 0.45166146755218506
iteration 9, weight: 0.46091893315315247

The weight still falls outside the range [w_min, w_max].

Turn off subtract_symmetry_point, is_perfect, and perfect_bias, and also show the results of get_hidden_parameters().
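A sketch of those changes, reusing the names from the script above (the model has to be rebuilt after modifying the config for the changes to take effect):

device.subtract_symmetry_point = False
device.perfect_bias = False
rpu_config.forward.is_perfect = False
rpu_config.backward.is_perfect = False

model = AnalogLinear(INPUT_SIZE, 1, False, rpu_config=rpu_config)
for tile in model.analog_tiles():
    print(tile.get_hidden_parameters())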

Why are you setting the model to eval? Also, you should make sure that you are not initially setting the weights too high: the soft-bounds synapse might not clip weights that start outside the bounds.
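One way to rule that out is to write weights that are known to lie inside the bounds before training; a minimal sketch, assuming set_weights on the analog layer accepts a dense weight tensor:

import torch

# Draw initial weights uniformly inside [w_min, w_max] and write them to the
# analog tile, so the soft-bounds device starts from a valid state.
w_min, w_max = rpu_config.device.w_min, rpu_config.device.w_max
init = torch.empty(1, INPUT_SIZE).uniform_(w_min, w_max)
model.set_weights(init)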

Also turn off weight scaling, that is, in the mapping settings set the weight scaling omega to 0 and disable out scaling.
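A sketch of the corresponding settings, using the MappingParameter field names from recent aihwkit versions (verify them against your version's API docs):

rpu_config.mapping.weight_scaling_omega = 0.0   # no initial weight-to-conductance scaling
rpu_config.mapping.learn_out_scaling = False    # no learned output scales
rpu_config.mapping.out_scaling_columnwise = False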