The vanilla ice-cream of neural networks.
<aside> 💡 Check out the Google Colab implementation: a multi-class classifier written in raw NumPy.
</aside>
I trained a few different neural networks and found that a (4-4-4) architecture, with two hidden ReLU layers and a Softmax output layer, performed best. I used $\text{lr} = 3 \times 10^{-5}$ and $\lambda = 0.2$ to produce these results.
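For orientation, here is a minimal NumPy sketch of what such a network's forward pass could look like, assuming 2-D inputs and four classes (consistent with the plots below); `init_params`, `forward`, and the He-style initialisation are illustrative choices, not necessarily the notebook's.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(sizes=(2, 4, 4, 4)):
    """Weights and biases for two hidden layers of 4 units and a 4-way output."""
    return [(rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """ReLU hidden layers, softmax output; keeps activations for backprop."""
    acts = [X]
    for W, b in params[:-1]:
        acts.append(np.maximum(0.0, acts[-1] @ W + b))   # ReLU
    W, b = params[-1]
    logits = acts[-1] @ W + b
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return acts, probs
```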
Success! Quite a pretty loss curve.
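A curve like that falls out of plain full-batch gradient descent. Below is a hedged sketch of such a loop with the stated $\text{lr}$ and $\lambda$, assuming integer class labels and an L2 penalty of the form $\lambda \sum_W \lVert W \rVert^2$; the exact regulariser form and the epoch count are assumptions, not lifted from the notebook.

```python
def loss(params, probs, y, lam=0.2):
    """Mean cross-entropy plus an L2 penalty on the weights (form assumed)."""
    ce = -np.log(probs[np.arange(len(y)), y] + 1e-12).mean()
    return ce + lam * sum((W ** 2).sum() for W, _ in params)

def backward(params, acts, probs, y, lam=0.2):
    """Gradients of the loss above, layer by layer."""
    n = len(y)
    delta = probs.copy()
    delta[np.arange(n), y] -= 1.0        # d(cross-entropy)/d(logits)
    delta /= n
    grads = []
    for i in range(len(params) - 1, -1, -1):
        W, _ = params[i]
        grads.append((acts[i].T @ delta + 2 * lam * W, delta.sum(axis=0)))
        if i > 0:
            delta = (delta @ W.T) * (acts[i] > 0)   # back through the ReLU
    return grads[::-1]

def train(params, X, y, lr=3e-5, lam=0.2, epochs=20_000):
    """Full-batch gradient descent; epoch count here is illustrative."""
    history = []
    for _ in range(epochs):
        acts, probs = forward(params, X)
        history.append(loss(params, probs, y, lam))
        grads = backward(params, acts, probs, y, lam)
        params = [(W - lr * gW, b - lr * gb)
                  for (W, b), (gW, gb) in zip(params, grads)]
    return params, history
```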
Effective as a classifier.
We can also look at each specific class and understand how its decision boundary is set (a plotting sketch follows these class-by-class notes):
Class 0: Given the outlier blue point, the probability gradient is shallower in the top-right corner.
Class 1: Fits the data well; the impact of the outlier blue point is visible in the top-right.
Class 2: Fits the data very well; the non-linearities are clearly at play.
Class 3: Fits the data well. Interestingly, where there is more overlap with the purple points, the decision boundary is less steep than the split between yellow and green, which has more margin.
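For completeness, per-class probability surfaces like the ones above can be drawn by evaluating the network on a grid of points. The following matplotlib sketch (reusing `forward` from earlier) is illustrative, not the notebook's actual plotting code.

```python
import matplotlib.pyplot as plt

def plot_class_probability(params, X, y, cls, res=200):
    """Shade P(class = cls) over the input plane for one class."""
    x0 = np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, res)
    x1 = np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, res)
    xx, yy = np.meshgrid(x0, x1)
    _, probs = forward(params, np.c_[xx.ravel(), yy.ravel()])
    plt.contourf(xx, yy, probs[:, cls].reshape(xx.shape), levels=20)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    plt.title(f"P(class = {cls})")
    plt.show()
```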
Overall, we can see the effect of the non-linearities that a neural net allows for, compared to a linear classifier like the logistic regression model that was trained on the same data.
Further detail on the implementation can be found in specific write-ups for the following: