ML lesson 1 recap


Summary:

  1. Learned how a linear model is a function of features multiplied by weights: $$f(x_1, x_2, \ldots) = w_1 x_1 + w_2 x_2 + \cdots + b.$$
  2. Saw how normalization (scaling the data so that every feature has mean 0 and standard deviation 1) can speed up training by making gradient descent more efficient (see the first sketch after this list).
  3. Saw how we can create sequential neural nets using nn.Sequential(nn.Linear(...), ...). Showed that without nonlinearities, multiple linear layers "collapse" into a single linear function (see the second sketch after this list).
  4. Showed the ReLU function, which zeroes out negative values, and how this lets neurons selectively activate. Saw that with just two hidden neurons we could get significantly better performance.
  5. Experimented with the learning rate to see that it needs to be tuned to the specific data and model.
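
Two of those points are worth a quick code refresher. First, a minimal sketch of normalization, assuming the features live in a tensor named X (a hypothetical name, not the lesson's exact variable):

import torch

# Hypothetical feature matrix: one row per house, one column per feature.
X = torch.tensor([[120.0, 3.0],
                  [ 85.0, 2.0],
                  [200.0, 4.0]])

# Standardize each feature (column) to mean 0 and standard deviation 1.
mean = X.mean(dim=0)
std = X.std(dim=0)
X_normalized = (X - mean) / std

print(X_normalized.mean(dim=0))  # approximately 0 for every feature
print(X_normalized.std(dim=0))   # approximately 1 for every feature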

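Second, a sketch demonstrating the "collapse" from point 3: two stacked nn.Linear layers with nothing between them compute the same function as a single nn.Linear whose weight and bias are composed from the two, and inserting a ReLU breaks that equivalence (the layer sizes here are illustrative):

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(5, 2)

# Two linear layers with no nonlinearity between them...
lin1 = nn.Linear(2, 3)
lin2 = nn.Linear(3, 1)

# ...are equivalent to one linear layer whose weight and bias are
# composed from the two: W = W2 @ W1, b = W2 @ b1 + b2.
combined = nn.Linear(2, 1)
with torch.no_grad():
    combined.weight.copy_(lin2.weight @ lin1.weight)
    combined.bias.copy_(lin2.weight @ lin1.bias + lin2.bias)

print(torch.allclose(lin2(lin1(x)), combined(x), atol=1e-6))  # True

# A ReLU in between breaks the collapse: the result is no longer linear.
print(torch.allclose(lin2(torch.relu(lin1(x))), combined(x)))  # False in general
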
We ended up with this model architecture:

In [ ]:
import torch.nn as nn

num_hidden = 4

# num_features (the number of input columns) is defined earlier in the lesson
model = nn.Sequential(
    nn.Linear(num_features, num_hidden),
    nn.ReLU(),
    nn.Linear(num_hidden, num_hidden),
    nn.ReLU(),
    nn.Linear(num_hidden, 1)
)

See if you can improve on this model. The best loss we got was:

Step [30000/30000], Train Loss: $232345.39, Test Loss: $255485.52
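
For reference, here is a minimal training-loop sketch in the shape that produces output like the line above. The tensor names X_train and y_train, the optimizer, and the loss function are assumptions (the dollar-denominated loss suggests an absolute-error loss, but check your notebook), so adapt it to the lesson's actual code:

import torch
import torch.nn as nn

# Assumed tensors: X_train is (num_examples, num_features), y_train is (num_examples, 1).

learning_rate = 0.01   # hint 2: try tuning this
num_steps = 30_000     # and this

loss_fn = nn.L1Loss()  # assumption: absolute error, since the lesson reports loss in dollars
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for step in range(1, num_steps + 1):
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    loss.backward()
    optimizer.step()
    if step % 5000 == 0:
        print(f"Step [{step}/{num_steps}], Train Loss: ${loss.item():.2f}")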

I will give a prize to whoever has the best model on Monday

Hints:

  • Try adding more layers of different shapes
  • Change the learning rate, number of training steps, etc. (see the training-loop sketch above)
  • Harder: our data has Suburb and Type columns, which tell us what neighborhood the house is in and what kind of house it is. These will tell us a lot if you can figure out how to use them!
    • "1-hot" encoding may help you (see the sketch below)
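
For that last hint, a minimal sketch of 1-hot encoding with pandas, using a hypothetical toy DataFrame in place of the real data:

import pandas as pd

# Hypothetical toy data; the real Suburb and Type columns work the same way.
df = pd.DataFrame({
    "Suburb": ["Richmond", "Carlton", "Richmond"],
    "Type":   ["house", "unit", "house"],
    "Rooms":  [3, 2, 4],
})

# One-hot encoding turns each category into its own 0/1 column,
# so the model can learn a separate weight per suburb and per house type.
encoded = pd.get_dummies(df, columns=["Suburb", "Type"], dtype=float)
print(list(encoded.columns))
# ['Rooms', 'Suburb_Carlton', 'Suburb_Richmond', 'Type_house', 'Type_unit']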

Use AI if you want, but I'm going to ask you about the model, and you'll be embarrassed if you can't answer.