What is overfitting?
Overfitting occurs when a model fits the training data too closely. The model learns the noise and incidental detail in the training data as if they were real concepts. Because these random fluctuations do not recur in new data, they hurt the model’s ability to generalize: performance on the training data looks excellent while performance on new data degrades.
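As a rough illustration of the gap between training and new-data performance, the sketch below compares a model that memorizes its training set (1-nearest-neighbor) with a very simple one (a single learned threshold). Everything here is an assumption made for the example: the synthetic data, the 20% label noise, and the helper names `make_data`, `predict_1nn`, and `fit_stump`.

```python
import random

def make_data(n, rng, noise=0.2):
    """Points x in [0, 1); the true label is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # inject label noise
        data.append((x, y))
    return data

def predict_1nn(train, x):
    """Memorizing model: copy the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def fit_stump(train):
    """Simple model: pick the single threshold with the fewest training errors."""
    candidates = [i / 20 for i in range(21)]
    return min(candidates, key=lambda t: sum(int(x > t) != y for x, y in train))

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

rng = random.Random(0)
train = make_data(200, rng)
test = make_data(2000, rng)

threshold = fit_stump(train)
knn_train = accuracy(lambda x: predict_1nn(train, x), train)
knn_test = accuracy(lambda x: predict_1nn(train, x), test)
stump_train = accuracy(lambda x: int(x > threshold), train)
stump_test = accuracy(lambda x: int(x > threshold), test)

print(f"1-NN : train={knn_train:.2f} test={knn_test:.2f}")
print(f"stump: train={stump_train:.2f} test={stump_test:.2f}")
```

The memorizing model scores perfectly on the data it has seen because it reproduces every noisy label, and that same noise is what costs it accuracy on the fresh sample. The threshold model cannot fit the noise, so its training and test scores stay close to each other.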
How can overfitting be reduced?
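- Use a resampling technique such as k-fold cross-validation to estimate model accuracy on unseen data. This does not remove overfitting by itself, but it makes it visible.
- Rather than training on the entire dataset, split it into two sets: training and testing.
- Hold back a validation dataset for tuning the model, so that the test set is used only for the final evaluation.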
- A larger dataset can reduce overfitting. If more data can’t be gathered, we can apply data augmentation to increase the size of the dataset artificially.
- If we only have a limited number of training samples, each with many features, we should train on only the most important features. With fewer features to learn from, the model has less opportunity to fit noise and overfit.
- Regularization is a technique that constrains the model so it cannot become too complex and overfit. L1 and L2 regularization add a penalty term to the cost function that pushes the estimated coefficients towards 0. L1 regularization can drive weights to exactly 0. L2 regularization shrinks weights towards 0 but generally not to exactly 0.
- An overly complex model is more likely to overfit. For a neural network, we can reduce the model’s complexity, and hence its size, by removing layers or reducing the number of units per layer.
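The resampling and hold-out ideas in the list above can be sketched as a small k-fold cross-validation loop. This is a minimal pure-Python sketch, not a library API: `k_fold_indices`, `cross_val_scores`, and the toy mean-predictor model are hypothetical names chosen for the example.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_scores(fit, score, X, y, k=5):
    """Train on k-1 folds, score on the held-out fold, repeat for every fold."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, held_out in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        scores.append(score(model,
                            [X[j] for j in held_out],
                            [y[j] for j in held_out]))
    return scores

# Toy model (an assumption for illustration): always predict the training mean.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)

def mse(model, X_val, y_val):
    return sum((model - t) ** 2 for t in y_val) / len(y_val)

X = list(range(20))
y = [2.0 * x for x in X]
scores = cross_val_scores(fit_mean, mse, X, y, k=5)
print("per-fold MSE:", [round(s, 1) for s in scores])
```

Each sample is held out exactly once, so the average of the per-fold scores estimates performance on data the model did not train on.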
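The difference between L1 and L2 penalties described in the list above can be seen in the simplest possible setting: a single coefficient whose unpenalized estimate is `b`. Under that assumption (which holds per coefficient only when features are orthonormal; real models generally need an iterative solver), both penalized problems have closed-form minimizers. The function names `l1_shrink` and `l2_shrink` and the sample values are made up for the sketch.

```python
def l1_shrink(b, lam):
    """Minimizer of 0.5*(w - b)**2 + lam*abs(w): soft-thresholding."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0  # small coefficients are set exactly to zero

def l2_shrink(b, lam):
    """Minimizer of 0.5*(w - b)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return b / (1.0 + lam)

unpenalized = [3.0, -1.5, 0.4, -0.2]  # hypothetical unpenalized estimates
lam = 0.5                             # hypothetical penalty strength

l1 = [l1_shrink(b, lam) for b in unpenalized]
l2 = [l2_shrink(b, lam) for b in unpenalized]

print("L1:", l1)  # → [2.5, -1.0, 0.0, 0.0]
print("L2:", [round(w, 3) for w in l2])
```

With the L1 penalty, any coefficient whose magnitude is at most `lam` becomes exactly 0, which is why L1 doubles as a form of feature selection. The L2 penalty scales every coefficient by the same factor, so nonzero coefficients get smaller but stay nonzero.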