Detailed Explanation
Underfitting in Machine Learning
Overview
Underfitting occurs when a machine learning model is too simple to capture the underlying structure of the data it is trained on. The model performs poorly on both the training data and unseen test data, resulting in inaccurate predictions.
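As a concrete illustration, here is a minimal sketch (assuming NumPy and scikit-learn are available; the synthetic quadratic dataset is purely illustrative, not taken from this text) in which a straight-line model is fit to curved data and scores poorly on both the training split and the held-out test split:

```python
# Illustrative sketch of underfitting: a straight line is too simple for
# quadratic data, so it scores poorly on both training and test data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=200)       # curved (quadratic) target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
line = LinearRegression().fit(X_train, y_train)           # a straight line: too simple

print("train R^2:", round(line.score(X_train, y_train), 2))  # low
print("test  R^2:", round(line.score(X_test, y_test), 2))    # also low -> underfitting
```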
Causes of Underfitting
Common causes of underfitting include using a model that is too simple for the complexity of the data, input features that carry too little information about the target, and overly aggressive regularization. Models such as linear regression with too few features or decision trees with very shallow depth often underfit complex data.
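The sketch below (scikit-learn assumed; both models and the synthetic wavy dataset are illustrative choices, not prescribed by this text) shows two of these causes side by side: a depth-1 decision tree with too little capacity, and a ridge model whose penalty is so strong that its fit is pushed toward a flat line:

```python
# Two illustrative causes of underfitting on the same nonlinear data:
# a too-shallow tree and an over-regularized linear model.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.2, size=300)   # wavy, nonlinear target

stump = DecisionTreeRegressor(max_depth=1).fit(X, y)        # too shallow to follow the wave
over_regularized = make_pipeline(                           # penalty so strong the fit is nearly flat
    PolynomialFeatures(degree=10), StandardScaler(), Ridge(alpha=1e6)
).fit(X, y)

print("depth-1 tree R^2:   ", round(stump.score(X, y), 2))             # low
print("over-penalized R^2: ", round(over_regularized.score(X, y), 2))  # low
```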
Identifying Underfitting
Identifying underfitting involves evaluating how the model performs on both the training and validation datasets. If the model performs poorly on both, and the two scores are close together, it is likely underfitting. This is the signature of high bias: the model makes overly strong assumptions about the form of the mapping function, so even the training error stays high.
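One way to see this, sketched below with scikit-learn's learning_curve (an illustrative choice of tool and data, not required by this text), is that an underfit model's training and validation scores plateau at a similarly low level no matter how much data it is given:

```python
# Learning-curve diagnostic: for an underfit (high-bias) model, training and
# validation scores stay low and close together as the training set grows.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.2, size=300)   # nonlinear target

sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y, cv=5, train_sizes=np.linspace(0.2, 1.0, 5)
)
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    # Both scores stay low and converge: more data does not help a biased model.
    print(f"{int(n):3d} samples -> train R^2 {tr:.2f}, validation R^2 {va:.2f}")
```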
Addressing Underfitting
To address underfitting, one can use a more complex model that better captures the patterns in the data. Adding more informative features (for example, polynomial or interaction terms) or reducing the regularization strength also increases the model's effective capacity. Note that simply collecting more training data does not cure underfitting, because the model's bias, not a lack of examples, is the limiting factor.
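A rough sketch of these fixes (scikit-learn assumed; the synthetic curved dataset and the specific degrees and penalties are illustrative) compares a degree-1, heavily penalized model against one with added polynomial features and a lighter penalty:

```python
# Fixing underfitting by adding features and weakening regularization.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(300, 1))
y = X[:, 0] ** 2 + rng.normal(scale=0.5, size=300)          # curved target

underfit = make_pipeline(PolynomialFeatures(degree=1), Ridge(alpha=100.0))  # too simple, heavy penalty
better   = make_pipeline(PolynomialFeatures(degree=3), Ridge(alpha=0.1))    # more features, lighter penalty

for name, model in [("degree 1, alpha 100", underfit), ("degree 3, alpha 0.1", better)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean validation R^2 = {score:.2f}")      # the second model scores much higher
```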
Underfitting vs. Overfitting
It is important to distinguish underfitting from overfitting. While underfitting results from a model that is too simple, overfitting occurs when a model is too complex and learns the training data too well, including its noise, so it scores highly on the training set but generalizes poorly to new data. Achieving the right model complexity is key to effective machine learning.
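The illustrative sweep below (scikit-learn assumed; the polynomial degrees and dataset are arbitrary choices) makes the contrast visible: a very low degree underfits, with low training and validation scores; a very high degree typically overfits, with a high training score but a poor validation score; and an intermediate degree does best:

```python
# Complexity sweep: underfitting at low degree, overfitting at high degree.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(4)
X = rng.uniform(-3, 3, size=(40, 1))                        # small sample, so overfitting can show up
y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.3, size=40)

for degree in (1, 9, 16):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    scores = cross_validate(model, X, y, cv=5, return_train_score=True)
    # degree 1: both scores low (underfit); degree 16: train high, validation poor (overfit).
    print(f"degree {degree:2d}: train R^2 {scores['train_score'].mean():.2f}, "
          f"validation R^2 {scores['test_score'].mean():.2f}")
```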