Too Long; Didn't Read
Machine learning tools are only as good as the quality of your data. This blog covers the main steps of cleaning data. The data we get is rarely homogeneous. Sometimes values are missing, and they need to be handled so that they do not reduce the performance of our machine learning model. Encoding categorical data transforms categorical features into a format that works better with classification and regression algorithms. Finally, we split the data set into training and test sets.
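
To make the three steps concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy rows, the helper names (impute_mean, one_hot, train_test_split), the choice of mean imputation and one-hot encoding, the 20% test fraction, and the X_train / X_test / y_train / y_test naming. In practice a library such as scikit-learn would likely do this work.

```python
import random
from statistics import mean

# Hypothetical toy rows: (country, age, purchased); None marks a missing age.
rows = [
    ("France", 44.0, "No"),
    ("Spain", 27.0, "Yes"),
    ("Germany", None, "No"),
    ("Spain", 38.0, "No"),
    ("France", 35.0, "Yes"),
]

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def one_hot(values):
    """Map each category to a 0/1 vector, one position per sorted category."""
    categories = sorted(set(values))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return vectors, categories

def train_test_split(features, labels, test_fraction=0.2, seed=0):
    """Shuffle row indices, then hold out test_fraction of them for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(data, idx):
        return [data[i] for i in idx]

    return (pick(features, train_idx), pick(features, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))

# Step 1: fill in the missing age.
ages = impute_mean([r[1] for r in rows])
# Step 2: encode the categorical country column and the Yes/No label.
encoded, categories = one_hot([r[0] for r in rows])
features = [vec + [age] for vec, age in zip(encoded, ages)]
labels = [1 if r[2] == "Yes" else 0 for r in rows]
# Step 3: split into training and test sets (assumed 80/20 split).
X_train, X_test, y_train, y_test = train_test_split(features, labels)
```

The fixed seed keeps the split reproducible between runs. Mean imputation is only one option; dropping incomplete rows or using the median may suit other data sets better.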