What is a hyperparameter?

A model hyperparameter is a configuration that is external to the model and whose value cannot be estimated from data. Hyperparameters are specified by the practitioner and are often used in the processes that estimate model parameters. They can often be set using heuristics, and they are often tuned for a given predictive modeling problem.

We cannot know the best value for a model hyperparameter on a given problem in advance. We may use rules of thumb, copy values used on other problems, or search for the best value by trial and error. Tuning a machine learning algorithm for a specific problem means tuning the hyperparameters of the model to discover the model parameters that result in the most skillful predictions.
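
To make the distinction concrete, here is a minimal sketch in Python. The function fit_slope, the toy data, and the specific values are illustrative assumptions rather than part of any particular library. In it, learning_rate and n_epochs are hyperparameters chosen by the practitioner, while the slope w is a model parameter estimated from the data.

```python
def fit_slope(xs, ys, learning_rate, n_epochs):
    """Fit y = w * x by gradient descent on mean squared error.

    learning_rate and n_epochs are hyperparameters (set by hand).
    w is a model parameter (estimated from the data).
    """
    w = 0.0
    n = len(xs)
    for _ in range(n_epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


# Toy data generated with a true slope of 2 (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Same data, same algorithm, different hyperparameter values.
w_good = fit_slope(xs, ys, learning_rate=0.05, n_epochs=200)
w_slow = fit_slope(xs, ys, learning_rate=0.0001, n_epochs=200)

print(w_good)  # very close to the true slope
print(w_slow)  # still far from it: the learning rate is too small
```

The data never tells us which learning rate to use. We only find out by training with a candidate value and checking how good the resulting parameter is, which is why hyperparameters have to be tuned rather than estimated.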


Why are hyperparameters important?

Hyperparameters are essential because they directly control the behavior of the training algorithm and have a significant impact on the performance of the model being trained. Approaching hyperparameter selection systematically gives two benefits:

  • The space of possible hyperparameters is searched efficiently.
  • A large set of hyperparameter tuning experiments is easier to manage.


What are the different hyperparameter optimization techniques?

The different hyperparameter optimization techniques are as follows:

  • Grid Search: This is the traditional technique for tuning hyperparameters. Grid search requires a set of candidate values for each hyperparameter, for example one set for the learning rate and one for the number of layers. It trains the algorithm on every combination of values from the two sets and measures performance using cross-validation. Cross-validation checks that the trained model has captured patterns that generalize beyond the data it was trained on. One of the most widely used validation methods is K-fold cross-validation, which splits the data into K folds so that every observation is used for both training and validation.
  • Random Search: This technique randomly samples the search space, evaluating hyperparameter sets drawn from a specified probability distribution. For instance, instead of checking all 10,000 possible combinations, we can check 1,000 randomly chosen ones.
  • Bayesian Optimization: The goal of hyperparameter tuning is to find the setting that maximizes the model’s performance on a validation set. Machine learning algorithms frequently require this kind of fine-tuning. Unfortunately, the function that maps hyperparameters to validation performance is a black-box function: it cannot be written as a formula, and its derivatives are unknown. A more appealing way to tune hyperparameters is an automated approach based on the Bayesian optimization algorithm, which builds a model that approximates the objective function, called a surrogate model. A popular surrogate model for Bayesian optimization is the Gaussian process (GP). Bayesian optimization typically works by assuming that the unknown function was sampled from a Gaussian process and maintaining a posterior distribution for this function as observations are made.
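
Grid search and random search can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not a production implementation: the model (a one-parameter linear fit trained by gradient descent), the helper names fit_slope, cv_score, grid_search, and random_search, the candidate values, and the sampling ranges are all hypothetical. The folds are interleaved rather than shuffled, to keep the example short.

```python
import itertools
import random


def fit_slope(xs, ys, learning_rate, n_epochs):
    """Toy model: fit y = w * x by gradient descent."""
    w = 0.0
    for _ in range(n_epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return w


def mse(w, xs, ys):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cv_score(xs, ys, params, k=4):
    """Mean validation MSE over k folds (lower is better)."""
    n = len(xs)
    scores = []
    for fold in range(k):
        val_idx = set(range(fold, n, k))  # every k-th point is held out
        tr_x = [xs[i] for i in range(n) if i not in val_idx]
        tr_y = [ys[i] for i in range(n) if i not in val_idx]
        va_x = [xs[i] for i in sorted(val_idx)]
        va_y = [ys[i] for i in sorted(val_idx)]
        w = fit_slope(tr_x, tr_y, **params)
        scores.append(mse(w, va_x, va_y))
    return sum(scores) / k


def grid_search(xs, ys, grid):
    """Evaluate every combination of the candidate values."""
    names = list(grid)
    candidates = [dict(zip(names, combo))
                  for combo in itertools.product(*grid.values())]
    best = min(candidates, key=lambda p: cv_score(xs, ys, p))
    return best, len(candidates)


def random_search(xs, ys, n_trials, rng):
    """Evaluate n_trials settings drawn from chosen distributions."""
    candidates = [
        {"learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform
         "n_epochs": rng.randint(10, 200)}
        for _ in range(n_trials)
    ]
    best = min(candidates, key=lambda p: cv_score(xs, ys, p))
    return best, len(candidates)


# Toy data: true slope 2 plus a little Gaussian noise.
rng = random.Random(0)
xs = [0.25 * i for i in range(1, 13)]
ys = [2 * x + rng.gauss(0, 0.1) for x in xs]

grid = {"learning_rate": [0.001, 0.01, 0.1], "n_epochs": [10, 50, 200]}
best_grid, n_grid = grid_search(xs, ys, grid)
best_random, n_random = random_search(xs, ys, n_trials=20, rng=rng)

print("grid search:  ", n_grid, "settings tried, best:", best_grid)
print("random search:", n_random, "settings tried, best:", best_random)
```

The cost of grid search is the product of the sizes of the candidate sets, so it grows quickly as hyperparameters are added. Random search fixes the number of trials up front and can try values that fall between the grid points.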
