# Points to consider while building a Machine Learning model

19 Feb 2017
The objective of this post is to list some pointers to keep in mind while building a Machine Learning model.

- Always start with the simplest of models. You can increase the complexity if the performance of a simple model is inadequate.
- Understand your dataset first.
- Build a baseline model before building any prediction model. I will expand on this further in another post.
- Complex models tend to overfit and simpler models tend to underfit. It is your job to find a balance between the two.
**High bias, low variance** - a property of simpler models. Suggests underfitting.
**High variance, low bias** - a property of complex models. Suggests overfitting.
- In any Machine Learning model, beware if the number of parameters is greater than the number of training examples: the model can memorize the training data and overfit. Consider a simpler model with fewer parameters, reduce the number of hidden layers, or do anything else that shrinks the parameter count.
- Always normalize the inputs. Gradient-based training of Neural Networks works best when features are on a similar scale, e.g. between 0 and 1. Inputs with large magnitudes can cause exploding gradients, where weights are updated by very large amounts and training diverges.
- Regularization is very important. If you are choosing a tree ensemble, consider XGBoost, which has built-in L1/L2 regularization, over a plain Random Forest.
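The bias-variance points above can be sketched with NumPy: fit polynomials of different degrees to noisy data and compare training errors. The degrees and sample size here are arbitrary choices for illustration; a high-degree fit with nearly as many parameters as data points chases the noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples from a quadratic: y = x^2 + noise.
x = np.linspace(-1, 1, 20)
y = x**2 + rng.normal(scale=0.1, size=x.size)

def train_error(degree):
    """Mean squared training error of a polynomial fit of the given degree."""
    coeffs = np.polyfit(x, y, degree)
    pred = np.polyval(coeffs, x)
    return np.mean((pred - y) ** 2)

# A degree-1 line underfits (high bias); a degree-15 polynomial has almost as
# many parameters as training points and overfits (high variance).
err_simple = train_error(1)
err_complex = train_error(15)
```

The complex model's training error is lower, but that is exactly the trap: low training error with many parameters says nothing about held-out performance.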
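Min-max normalization, mentioned above, is a few lines of NumPy. This is a sketch; in practice you would compute the min and max on the training split only and reuse them at inference time:

```python
import numpy as np

def min_max_normalize(X, eps=1e-12):
    """Scale each feature (column) of X into the range [0, 1]."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # eps guards against division by zero for constant columns.
    return (X - x_min) / (x_max - x_min + eps)

X = np.array([[10.0, 200.0],
              [20.0, 400.0],
              [30.0, 600.0]])
X_norm = min_max_normalize(X)
# Every value in X_norm now lies in [0, 1].
```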
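To make the regularization point concrete, here is L2 (ridge) regression in closed form with NumPy. The penalty `alpha` shrinks the weights toward zero; it is the same kind of knob that XGBoost exposes through its `lambda` and `alpha` parameters. The data below is made up purely for illustration:

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha * I)^{-1} X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=50)

w_weak = ridge_fit(X, y, alpha=0.01)   # close to ordinary least squares
w_strong = ridge_fit(X, y, alpha=100)  # heavily shrunk weights
# Stronger regularization produces a smaller weight norm, trading a little
# bias for lower variance.
```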

Some terms to keep in mind:

**Stratified Sampling** - when the training data is heavily skewed, the practice of picking samples from each class such that the final training data has the class distribution you need.
**Bootstrapping** - repeatedly resampling the dataset with replacement and re-evaluating the same model on each resample, to estimate the variability of its performance.
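A minimal sketch of stratified sampling in plain Python (the function name and `fraction` parameter are my own): taking the same share of every class preserves the label distribution in the sample.

```python
import random
from collections import defaultdict

def stratified_sample(examples, labels, fraction, seed=0):
    """Sample `fraction` of the examples from each class, preserving class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for example, label in zip(examples, labels):
        by_class[label].append(example)
    sample = []
    for label, items in by_class.items():
        k = max(1, round(fraction * len(items)))
        sample.extend((x, label) for x in rng.sample(items, k))
    return sample

# 90 negatives and 10 positives: a 20% stratified sample keeps the 9:1 ratio.
examples = list(range(100))
labels = [0] * 90 + [1] * 10
sample = stratified_sample(examples, labels, fraction=0.2)
```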
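Bootstrapping can likewise be sketched in a few lines: resample the data with replacement many times and look at the spread of a statistic across the resamples. Here the statistic is the mean of a toy sample; with a model you would resample the evaluation set and recompute a metric instead.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=1000, seed=0):
    """Estimate the sampling distribution of the mean by resampling with replacement."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))  # sample WITH replacement
        means.append(statistics.mean(resample))
    return means

data = [2.1, 2.5, 2.2, 2.8, 2.4, 2.6, 2.3, 2.7]
means = bootstrap_means(data)
# The standard deviation of the bootstrap means estimates the uncertainty
# of the original sample mean.
spread = statistics.stdev(means)
```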