4. Multilayer Perceptrons
In this chapter, we will introduce your first truly deep network. The simplest deep networks are called multilayer perceptrons, and they consist of multiple layers of neurons, each fully connected to those in the layer below (from which they receive input) and those above (which they, in turn, influence). When we train high-capacity models, we run the risk of overfitting. Thus, we will need to provide you with your first rigorous introduction to the notions of overfitting, underfitting, and model selection. To help you combat these problems, we will introduce regularization techniques such as weight decay and dropout. We will also discuss issues relating to numerical stability and parameter initialization that are key to successfully training deep networks. Throughout, we aim to give you a firm grasp not just of the concepts but also of the practice of using deep networks. At the end of this chapter, we apply what we have introduced so far to a real case: house price prediction. We punt matters relating to the computational performance, scalability, and efficiency of our models to subsequent chapters.
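To make the architecture concrete before the sections below develop it properly, here is a minimal sketch of a multilayer perceptron with dropout, written with PyTorch's `nn` module. It is not the chapter's reference implementation; the layer widths (784, 256, 10) and the dropout probability are illustrative assumptions.

```python
import torch
from torch import nn

# A minimal multilayer perceptron: each linear layer is fully connected
# to the layer below it, with a nonlinear activation in between.
# The sizes (784 inputs, 256 hidden units, 10 outputs) and the dropout
# probability are illustrative choices, not values fixed by this chapter.
net = nn.Sequential(
    nn.Flatten(),          # flatten, e.g., a 28x28 image into a 784-vector
    nn.Linear(784, 256),   # fully connected hidden layer
    nn.ReLU(),             # nonlinear activation
    nn.Dropout(p=0.5),     # dropout regularization (see Section 4.6)
    nn.Linear(256, 10),    # fully connected output layer
)

X = torch.randn(2, 1, 28, 28)  # a toy minibatch of two "images"
print(net(X).shape)            # torch.Size([2, 10])
```

Weight decay (Section 4.5) would typically enter such a sketch through the optimizer, e.g. `torch.optim.SGD(net.parameters(), lr=0.1, weight_decay=1e-4)`, rather than through the network definition itself.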
- 4.1. Multilayer Perceptrons
- 4.2. Implementation of Multilayer Perceptrons from Scratch
- 4.3. Concise Implementation of Multilayer Perceptrons
- 4.4. Model Selection, Underfitting, and Overfitting
- 4.5. Weight Decay
- 4.6. Dropout
- 4.7. Forward Propagation, Backward Propagation, and Computational Graphs
- 4.8. Numerical Stability and Initialization
- 4.9. Environment and Distribution Shift
- 4.10. Predicting House Prices on Kaggle