What is the best way to manage overfitting and underfitting in statistical validation for ML?
Machine learning (ML) is a powerful technique for finding patterns and making predictions from data. However, it also comes with some challenges, such as overfitting and underfitting. These are common problems that affect the performance and generalization of ML models. In this article, you will learn what overfitting and underfitting are, how to detect them, and how to manage them using statistical validation methods.
Overfitting occurs when an ML model learns too much from the training data and fails to generalize to new or unseen data: the model captures the noise and idiosyncrasies of the training data rather than the underlying relationship between input and output. Underfitting, on the other hand, occurs when an ML model learns too little from the training data and fails to capture its complexity and variability: the model is too simple or too rigid to fit the data well and make accurate predictions.
-
Kevin Sebineza
Student Guild President at CMU Africa | MS in Engineering AI | Data Scientist
We can tell that our model is overfitting if the training error is decreasing monotonically but the test error increases. This means that the trained model is no longer generalizing from the data but is also learning the noise. We can address overfitting either by increasing the size of the dataset (since the model is too complex for the data) or by adding a regularization parameter to penalize the weights and make the model less complex. On the other hand, we can recognize underfitting when the model is too simple and unable to learn the training data. One way to address this is to decrease the value of the regularization parameter (if one is already in place) to allow the model more complexity.
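To make this concrete, here is a minimal sketch (synthetic data, scikit-learn assumed) of how sweeping a regularization strength moves a model from overfitting towards underfitting; the specific alpha values are illustrative only.

```python
# Illustrative only: a very small alpha leaves the high-degree polynomial free to
# memorize noise (low train error, high test error), while a very large alpha
# over-simplifies it (both errors high).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for alpha in [0.0001, 0.1, 100]:  # small alpha -> overfit, large alpha -> underfit
    model = make_pipeline(PolynomialFeatures(degree=12), Ridge(alpha=alpha))
    model.fit(X_train, y_train)
    print(f"alpha={alpha}: "
          f"train MSE={mean_squared_error(y_train, model.predict(X_train)):.2f}, "
          f"test MSE={mean_squared_error(y_test, model.predict(X_test)):.2f}")
```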
-
Sanjay Kumar MBA,MS,PhD
Overfitting in machine learning happens when a model learns too much from its training data, including noise and specific details, but struggles to generalize to new data. It doesn't capture the underlying patterns effectively. Underfitting, on the other hand, occurs when a model learns too little from the training data, missing the complexity and variability of the data. It's too simplistic and doesn't make accurate predictions. Balancing between these two extremes is essential for building effective machine learning models.
One way to detect overfitting and underfitting is to compare the training and testing errors of the ML model. The training error is the error that the model makes on the training data, while the testing error is the error that the model makes on the testing data. The testing data is a subset of the data that is not used for training, but for evaluating the model's performance. Ideally, the training and testing errors should be low and close to each other. However, if the training error is much lower than the testing error, it indicates overfitting. If both the training and testing errors are high, it indicates underfitting.
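As an illustration, here is a minimal sketch (scikit-learn assumed, toy data) of this train-versus-test comparison: a large gap between the two errors points to overfitting, while two similarly high errors point to underfitting.

```python
# Compare training and testing error for a model that is likely too complex
# (an unpruned tree) and one that is likely too simple (a one-split stump).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

deep_tree = DecisionTreeClassifier(max_depth=None, random_state=0).fit(X_train, y_train)
stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X_train, y_train)

for name, model in [("deep tree (overfit?)", deep_tree), ("stump (underfit?)", stump)]:
    train_err = 1 - model.score(X_train, y_train)
    test_err = 1 - model.score(X_test, y_test)
    print(f"{name}: train error {train_err:.2f}, test error {test_err:.2f}")
```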
-
Khouloud El Alami
Data Scientist at Spotify | Top Data Science Writer on Medium & TDS 💌 Follow my journey as a Data Scientist in Tech, I also write about career advice
When doing feature engineering, always validate the performance of your model with the new features on a separate validation set to ensure that the improvements are not due to overfitting. Having too many features means some of them may introduce noise, which can lead to overfitting as the model learns from the noise rather than the true relationships.
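A hypothetical sketch of this check (scikit-learn assumed; the "baseline" and "engineered" feature sets below are stand-ins): score both feature sets on the same held-out validation set before trusting the improvement.

```python
# Stand-in example: the first three columns play the role of the original features,
# the full matrix plays the role of "original + engineered" features.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=5, noise=10.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

baseline = LinearRegression().fit(X_train[:, :3], y_train)  # original features only
with_new = LinearRegression().fit(X_train, y_train)         # plus engineered features

print("baseline R^2 on validation     :", baseline.score(X_val[:, :3], y_val))
print("with new features R^2 on valid.:", with_new.score(X_val, y_val))
```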
-
Sergio Calderón Pérez-Lozao
Senior Data Scientist @ Cabify
When looking for overfitting, it's crucial to consider how you create your training and testing datasets. If your testing data is too similar to the training data, but the real-world scenario won't provide such similar data at prediction time (for example, the model will be applied to future data rather than a random sample), this is risky: you might not notice overfitting just by comparing error metrics, yet it becomes apparent once you evaluate on a truly representative test set.
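One hedged way to express this, assuming time-ordered data and scikit-learn, is to validate on later observations rather than on a random sample, for example with TimeSeriesSplit:

```python
# Each split trains only on earlier observations and tests on later ones,
# mimicking the "predict the future" setting described above.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(100).reshape(-1, 1)  # stand-in for time-ordered observations
y = np.arange(100)

tscv = TimeSeriesSplit(n_splits=4)
for train_idx, test_idx in tscv.split(X):
    print(f"train up to index {train_idx[-1]}, "
          f"test on indices {test_idx[0]}-{test_idx[-1]}")
```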
One way to manage overfitting and underfitting is to use statistical validation methods, such as cross-validation and regularization. Cross-validation is a technique that splits the data into multiple folds and uses some of them for training and some for testing. This way, the model is trained and tested on different subsets of the data, and the average testing error serves as a measure of the model's performance. Cross-validation helps reduce overfitting by never evaluating the model on the same data it was trained on, and helps detect underfitting by showing how well the model fits different parts of the data. Regularization is a technique that adds a penalty term to the ML model's objective function, which discourages overly complex solutions by shrinking the model's parameters. Regularization helps reduce overfitting by shrinking or eliminating weights that are not relevant or useful for the prediction and, when the penalty is tuned appropriately, avoids underfitting by leaving the model enough flexibility to fit the data.
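A minimal sketch of the cross-validation loop described here, assuming scikit-learn and its built-in diabetes toy dataset:

```python
# The data is split into 5 folds; each fold serves as the test set exactly once,
# and the average test error summarizes how well the model generalizes.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

X, y = load_diabetes(return_X_y=True)
errors = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    errors.append(mean_squared_error(y[test_idx], model.predict(X[test_idx])))
print("average cross-validation MSE:", np.mean(errors))
```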
-
Renato Boemer
Machine Learning Engineer
One way to address overfitting in CNNs is by using techniques such as dropout and data augmentation. Dropout randomly deactivates some neurones during training, preventing the network from relying too much on specific features. Data augmentation applies random (but realistic) transformations to the images in your training set. For example, you can apply:
- Geometric transformations (e.g. rotations, flips, or crops)
- Colour space transformations (e.g. changing RGB colour channels or intensifying colours)
- Kernel filters (e.g. sharpening or blurring an image)
As a result, you effectively increase the diversity of the training data and help the model generalise better on unseen data. Try using OpenCV and let me know!
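A hedged sketch of these two ideas expressed as model layers, assuming a recent TensorFlow/Keras install (Renato suggests OpenCV for the augmentation itself; the layer-based version below is just one alternative way to illustrate it):

```python
# Dropout randomly deactivates units during training; the augmentation layers apply
# random flips and rotations so the network sees slightly different images each epoch.
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(64, 64, 3))
x = layers.RandomFlip("horizontal")(inputs)   # geometric augmentation: flips
x = layers.RandomRotation(0.1)(x)             # geometric augmentation: small rotations
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
x = layers.Dropout(0.5)(x)                    # dropout regularization
outputs = layers.Dense(10, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```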
-
Enoch N. Appiah
Data Scientist | Data Analyst at Acquirente Unico (AU), Italy
In a recent project involving medical handwriting recognition, one effective way I dealt with overfitting and underfitting was the use of data augmentation (because my dataset was small) and dropout regularisation. My model was a CNN-LSTM; I preprocessed the word images and performed data augmentation using OpenCV to create diverse forms of the images. The augmentation steps included geometric transformations (rotation, flipping, and rescaling) and kernel filters such as masking, blurring, and contrast adjustment. Dropout regularisation was implemented at the model construction stage for both the CNN and the LSTM parts. I tried different dropout parameters and the model performed very well on both the training and test data.
There are different types of cross-validation and regularization methods that can be applied to different ML models. For example, for linear regression models, one can use k-fold cross-validation, where the data is divided into k equal folds, and each fold is used as the testing data once while the rest are used as the training data. The testing errors from each fold are then averaged to get the cross-validation error. For regularization, one can use Lasso or Ridge regression, where the penalty term is the absolute value or the square of the model's weights, respectively. These methods typically trade a small increase in bias for a larger reduction in variance, which often improves prediction accuracy on new data.
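A short sketch combining these pieces, assuming scikit-learn and its diabetes toy dataset: the same k-fold procedure scores ordinary least squares, Lasso (L1), and Ridge (L2) side by side, with illustrative penalty values.

```python
# 5-fold cross-validated MSE for an unregularized and two regularized linear models.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)
for name, model in [("OLS", LinearRegression()),
                    ("Lasso (L1)", Lasso(alpha=0.1)),
                    ("Ridge (L2)", Ridge(alpha=1.0))]:
    mse = -cross_val_score(model, X, y, cv=5,
                           scoring="neg_mean_squared_error").mean()
    print(f"{name}: cross-validated MSE = {mse:.1f}")
```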
-
Bruno Miguel L Silva
AI & ML LinkedIn Top Voice | Head of R&D | Professor | PhD Candidate in AI | Co-Founder @Geekering | PSPO | Podcast Host 🎙️
In addition to traditional cross-validation and regularization methods, consider integrating Bayesian optimization for hyperparameter tuning. This approach can significantly enhance model performance by systematically and efficiently searching for the optimal set of hyperparameters. Unlike grid or random search, Bayesian optimization uses prior results to inform future trials, making the search process more targeted. It's particularly effective in finding the right balance between model complexity and prediction accuracy, addressing both overfitting and underfitting!
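As one possible illustration, here is a hedged sketch using Optuna, whose default sampler performs a Bayesian-style (TPE) search; Optuna and scikit-learn are assumed installed, and the Ridge alpha range is illustrative only.

```python
# Each trial proposes a regularization strength informed by the results of earlier
# trials, steering the search toward promising regions of the hyperparameter space.
import optuna
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

def objective(trial):
    alpha = trial.suggest_float("alpha", 1e-4, 100.0, log=True)
    score = cross_val_score(Ridge(alpha=alpha), X, y, cv=5,
                            scoring="neg_mean_squared_error").mean()
    return -score  # minimize cross-validated MSE

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=30)
print("best alpha:", study.best_params["alpha"])
```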
-
Nitesh Tiwari
Data Science | Analytics Enabler | PSPO | PSM
k-fold cross-validation is a practical example of a diagnostic tool, and regularization of a remedy! While building a predictive model with k-fold cross-validation, we divide the dataset into 'k' subsets or folds. Then we train our model on 'k-1' folds & validate it on the remaining one. By repeating this process 'k' times, we get a set of 'k' performance scores. If our model's performance varies significantly between these folds, it could indicate overfitting. On the other hand, regularization such as L1 (Lasso) might force some regression coefficients to be exactly zero, effectively eliminating them from the model and simplifying it, while L2 (Ridge) regularization reduces the magnitude of all coefficients, preventing them from becoming too large & dominating the model.
There is no definitive answer to how to choose the best validation method for a ML model, as it depends on the type, size, and distribution of the data, as well as the complexity and flexibility of the model. However, some general guidelines are to use cross-validation when the data is limited or imbalanced, and to use regularization when the model is overparameterized or prone to overfitting. Moreover, one can use different cross-validation and regularization methods and compare their results, such as using different values of k for k-fold cross-validation, or different values of the penalty parameter for regularization. The best validation method is the one that minimizes the testing error and maximizes the generalization of the model.
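A small sketch of this comparison, assuming scikit-learn: loop over a few values of k and of the penalty strength and keep the combination with the lowest estimated test error (the values tried are illustrative only).

```python
# Grid over fold counts and Ridge penalties; the pair with the lowest averaged
# cross-validation error is kept.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)
results = {}
for k in [5, 10]:
    for alpha in [0.01, 0.1, 1.0, 10.0]:
        mse = -cross_val_score(Ridge(alpha=alpha), X, y, cv=k,
                               scoring="neg_mean_squared_error").mean()
        results[(k, alpha)] = mse

best = min(results, key=results.get)
print("best (k, alpha):", best, "with MSE:", round(results[best], 1))
```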
-
Christy Rajan
Choosing the best validation method depends on the dataset size, the available computational resources, and the specific characteristics of the machine learning task. Common validation methods include holdout validation, cross-validation, and stratified cross-validation; the choice comes down to balancing computational efficiency with reliable performance estimates.
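For instance, a minimal sketch (scikit-learn assumed) of stratified cross-validation on an imbalanced toy problem, where each fold preserves the overall class proportions:

```python
# With roughly 10% positives overall, each stratified test fold also contains
# roughly 10% positives, giving more reliable estimates on imbalanced data.
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):
    print("positive rate in test fold:", round(y[test_idx].mean(), 3))
```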
-
Nitesh Tiwari
Data Science | Analytics Enabler | PSPO | PSM
First and foremost, I think an understanding of the dataset's characteristics, size, and structure is crucial. Additionally, defining the problem type, whether it's classification, regression, or clustering, is fundamental to selecting an appropriate validation method. The dataset's size plays a pivotal role in the decision-making process: for large datasets, straightforward approaches like hold-out validation or k-fold cross-validation are often sufficient, whereas smaller datasets require more specialized techniques like leave-one-out cross-validation. The data distribution also matters, particularly in classification tasks with imbalanced classes. For hyperparameter tuning, nested cross-validation is a great option!
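A compact sketch of nested cross-validation, assuming scikit-learn: the inner GridSearchCV tunes the penalty, while the outer loop estimates how well the whole tuning procedure generalizes.

```python
# Inner loop: 3-fold grid search over the Ridge penalty.
# Outer loop: 5-fold estimate of the tuned model's generalization error.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = load_diabetes(return_X_y=True)
inner = GridSearchCV(Ridge(), param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]}, cv=3)
outer_scores = cross_val_score(inner, X, y, cv=5, scoring="neg_mean_squared_error")
print("nested cross-validation MSE:", -outer_scores.mean())
```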
-
Maikel Groenewoud
It is indeed very important to be mindful of the risks of overfitting and underfitting when developing and testing ML/AI models. It is just as important to keep monitoring the models after they have been developed and tested, though. Deployed models need to be monitored for phenomena such as 'model drift': model degradation due to changes in external factors that affect the relationships between model inputs and outputs. Given the dynamic nature of ML/AI models and the environment they operate in, it is crucial to also have dynamic governance and oversight in place once models are deployed.
-
Yashwant (Sai) R.
Director - Machine Learning @ Fidelity Investments | AI Product Leader | Generative AI | High ROI AI
Overfitting is a significant issue that requires ongoing attention in model development and maintenance, especially since the distribution of production data often changes. It's wise to remain pessimistic, assume that models will eventually fail on unseen data, and stay prepared to adapt them. On another note, overfitting isn't always bad: in fields like physics, healthcare, and drug discovery, there can be a need to learn very specific details, even from noise.