Issue on page /tutorials/W1D3_MultiLayerPerceptrons/student/W1D3_Tutorial1.html
William-Gong opened this issue · 1 comment
In Coding Exercise 1: Function approximation with ReLU, the plot shows only the basis ReLU functions and the approximated function. This is very confusing because the basis functions are non-negative ramps that keep growing, yet the approximated function is a sine wave. I recommend adding a subplot of the weighted ReLU activations after the subplot of basis ReLU functions.
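As a sketch of why the weighted curves are the missing link: assume the training points $x_0 < x_1 < \dots < x_n$ act as knots, with one ReLU per segment and bias $-x_i$. The symbols $s_i$ (segment slope) and $w_i$ (combination weight) are notation introduced here for illustration, not names from the tutorial.

$$
\hat{y}(x) = \sum_{i=0}^{n-1} w_i \,\mathrm{ReLU}(x - x_i),
\qquad w_i = s_i - s_{i-1},
\qquad s_i = \frac{y_{i+1} - y_i}{x_{i+1} - x_i},
\qquad s_{-1} = 0
$$

On a segment $x_k \le x \le x_{k+1}$ only the ramps $0, \dots, k$ are active, so the slope of $\hat{y}$ there is $\sum_{i \le k} w_i = s_k$. Ramps with negative $w_i$ bend the curve downward, which is how unbounded basis functions can sum to a bounded, sine-like curve. Note that $\hat{y}(x_0) = 0$, so this form matches the data only when $y_0 = 0$, as is the case for $\sin(0)$.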
The plotting code will be updated to:
def plot_function_approximation(x, combination_weights, relu_acts, y_hat):
    """
    Helper function to plot ReLU activations and
    function approximations

    Args:
      x: torch.tensor
        Incoming Data
      combination_weights: torch.tensor
        Weight applied to each ReLU in the weighted sum
      relu_acts: torch.tensor
        Computed ReLU activations for each point along the x axis (x)
      y_hat: torch.tensor
        Estimated labels/class predictions
        Weighted sum of ReLU activations for every point along x axis
    """
    fig, axes = plt.subplots(3, 1)

    # Plot ReLU Activations
    axes[0].plot(x, relu_acts.T)
    axes[0].set(title='ReLU Activations - Basis Functions')
    labels = [f"ReLU {i + 1}" for i in range(relu_acts.shape[0])]
    axes[0].legend(labels, ncol=2)

    # Plot Weighted ReLU Activations (each basis function scaled by its weight)
    weighted_relu = relu_acts * combination_weights[:, None]
    axes[1].plot(x, weighted_relu.T)
    axes[1].set_ylim([-2, 2])
    axes[1].set(title='Weighted ReLU Activations')

    # Plot Function Approximation
    axes[2].plot(x, torch.sin(x), label='truth')
    axes[2].plot(x, y_hat, label='estimated')
    axes[2].legend()
    axes[2].set(title='Function Approximation')
**And the coding exercise will be updated to:**
def approximate_function(x_train, y_train):
    """
    Function to compute and combine ReLU activations

    Args:
      x_train: torch.tensor
        Training data
      y_train: torch.tensor
        Ground truth labels corresponding to training data

    Returns:
      combination_weights: torch.tensor
        Weight on each ReLU in the weighted sum
      y_hat: torch.tensor
        Estimated labels/class predictions
        Weighted sum of ReLU activations for every point along x axis
      relu_acts: torch.tensor
        Computed ReLU activations for each point along the x axis (x)
      x: torch.tensor
        x-axis points
    """
    # Number of relus
    n_relus = x_train.shape[0] - 1

    # x axis points (more than x train)
    x = torch.linspace(torch.min(x_train), torch.max(x_train), 1000)

    # First determine what bias terms should be for each of `n_relus` ReLUs
    b = -x_train[:-1]

    # Compute ReLU activations for each point along the x axis (x)
    relu_acts = torch.zeros((n_relus, x.shape[0]))
    for i_relu in range(n_relus):
        relu_acts[i_relu, :] = torch.relu(x + b[i_relu])

    # Set up weights for weighted sum of ReLUs
    combination_weights = torch.zeros((n_relus, ))

    # Figure out weights on each ReLU: each weight is the change in slope
    # between consecutive segments of the piecewise-linear fit
    prev_slope = 0
    for i in range(n_relus):
        delta_x = x_train[i+1] - x_train[i]
        slope = (y_train[i+1] - y_train[i]) / delta_x
        combination_weights[i] = slope - prev_slope
        prev_slope = slope

    # Get output of weighted sum of ReLU activations for every point along x axis
    y_hat = combination_weights @ relu_acts

    return combination_weights, y_hat, relu_acts, x
# Add event to airtable
atform.add_event('Coding Exercise 1: Function approximation with ReLU')
# Make training data from sine function
N_train = 10
x_train = torch.linspace(0, 2*np.pi, N_train).view(-1, 1)
y_train = torch.sin(x_train)
## Uncomment the lines below to test your function approximation
combination_weights, y_hat, relu_acts, x = approximate_function(x_train, y_train)
with plt.xkcd():
    plot_function_approximation(x, combination_weights, relu_acts, y_hat)
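The weight rule used above (each weight is the change in slope between consecutive segments) can be checked without plotting. The following is a minimal, torch-free sketch using only the standard library; the helper names `fit_relu_weights` and `predict` are hypothetical and introduced here for illustration, not part of the tutorial. It assumes the same setup as the exercise: 10 evenly spaced training points on [0, 2π] sampled from a sine.

```python
import math


def relu(z):
    """Rectified linear unit for a single float."""
    return max(0.0, z)


def fit_relu_weights(x_train, y_train):
    """Return (biases, weights) for one ReLU per segment.

    Bias i is -x_train[i]; weight i is the change in slope between
    segment i-1 and segment i (the slope before the first segment is 0).
    """
    biases = [-xk for xk in x_train[:-1]]
    weights = []
    prev_slope = 0.0
    for i in range(len(x_train) - 1):
        slope = (y_train[i + 1] - y_train[i]) / (x_train[i + 1] - x_train[i])
        weights.append(slope - prev_slope)
        prev_slope = slope
    return biases, weights


def predict(x, biases, weights):
    """Weighted sum of ReLU activations at a single point x."""
    return sum(w * relu(x + b) for w, b in zip(weights, biases))


# Same training setup as the exercise, written with plain floats
n_train = 10
x_train = [2 * math.pi * i / (n_train - 1) for i in range(n_train)]
y_train = [math.sin(xk) for xk in x_train]

biases, weights = fit_relu_weights(x_train, y_train)

# The weighted sum should pass through every training point
max_knot_error = max(
    abs(predict(xk, biases, weights) - yk) for xk, yk in zip(x_train, y_train)
)
print(f"max error at training points: {max_knot_error:.2e}")

# The running sum of the weights recovers the slope of each segment
last_slope = (y_train[-1] - y_train[-2]) / (x_train[-1] - x_train[-2])
print(f"sum of weights: {sum(weights):.4f}, last segment slope: {last_slope:.4f}")
```

Because the weights are slope differences, their signs flip wherever the sine changes from steepening to flattening, which is what the weighted-activation subplot is meant to make visible.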
Hi @William-Gong,
Thanks so much for your contribution. We really appreciate it!
I ran your code and got the following output.
It seems to me that the graph could raise additional questions about interpretability. For instance: "Why does the ReLU behave differently in later layers than in earlier ones?" or "Why is the slope of certain layers steeper than that of others?" While these are great questions that delve into the intrinsic behavior of weighted ReLU activations, we currently think this added complexity is not feasible to introduce at Neuromatch scale across all pod levels.
Having said that, your contributions to improving our content are highly valuable. Thank you!
Kind Regards,