ML Exam: 6 - Feature Engineering
Feature Engineering - Basic Concepts
  - Applying domain knowledge (your knowledge of the data and of the model you’re using) to create better features to train your model
  - The art of ML!
  - The most critical part of a good ML implementation
  - Talented/expert ML specialists are good at feature engineering

  Curse of dimensionality
    - More features are not necessarily better!
    - Every feature is a new dimension
    - Much of feature engineering is selecting the most relevant features; this is where domain knowledge comes into play
    - Unsupervised dimensionality reduction techniques can help (PCA, K-Means)

  Common problems are below:

Missing Data
  - Impute missing data = fill missing data with something

  Impute: Mean Replacement
    - Replace missing values with the mean value of the column
    - A column represents a single feature
    - The median value of the column can be more useful if outliers distort the mean
      e.g. outlier billionaires distorting the income data of average citizens
    - Pros
      - Fast & ea...
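
The mean and median replacement described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a prescribed implementation: the `impute` helper, its `strategy` parameter, and the `incomes` values are all hypothetical, chosen to mirror the billionaire-outlier example in the notes.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Fill None entries in a single feature column with the column mean or median.

    Hypothetical helper for illustration; None marks a missing value.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


# Made-up income column (in thousands): one missing value, one extreme outlier.
incomes = [40, 50, None, 60, 1_000_000]

mean_filled = impute(incomes, "mean")      # missing value replaced by the mean
median_filled = impute(incomes, "median")  # missing value replaced by the median

print(mean_filled[2], median_filled[2])
```

With this made-up column the single outlier drags the mean far above what most rows look like, while the median stays close to the typical incomes, which is why the median is often the safer fill value for skewed features.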