Classic Problems in Machine Learning
- Classification: Given a fixed set of labels and examples that already carry those labels, can I assign the correct label to a new data point?
- Ex: figuring out if an email is spam or not. Here, the labels are "spam" and "not spam".
- Regression: Given a set of independent variables, can I predict a numeric dependent variable, and how does it change when I vary one independent variable?
- Ex: predicting height given weight
- Clustering: Given a set of data, can I figure out what natural groups it falls into?
- Ex: If I represent my friends as a list of courses they are taking, can I find natural social groups?
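To make the regression example concrete, here is a minimal sketch of fitting a straight line that predicts height from weight using ordinary least squares. The function name `fit_line` and the five data points are made up for illustration; they are not real measurements, and a straight line is only one possible model.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one independent variable: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented toy data (assumption): weights in kg, heights in cm.
weights = [50.0, 60.0, 70.0, 80.0, 90.0]
heights = [155.0, 163.0, 170.0, 178.0, 184.0]

slope, intercept = fit_line(weights, heights)

# The slope answers "how does height change when weight changes by one unit?"
predicted_height = slope * 75.0 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.1f}, prediction at 75 kg={predicted_height:.2f}")
```

The slope is the quantity the regression question asks about: the expected change in the dependent variable (height) per unit change in the independent variable (weight).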
Note: The difference between clustering and classification is in the training data. For classification, you have preexisting labels and examples that are already labeled. With clustering, you just have a dataset with no labels, so the groups have to be discovered from the data itself.
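The contrast in the note can be sketched with the same six points used two ways. In the classification half, each point comes with a label and a new point gets the label of its nearest labeled neighbor. In the clustering half, the labels are thrown away and the points are split into two groups by a small k-means loop. The function names, the two numeric features, the naive choice of starting centers, and the data are all assumptions made for illustration; nearest neighbor and k-means are just one possible choice of method for each problem.

```python
import math


def nearest_neighbor_label(labeled_examples, point):
    """Classification: copy the label of the closest already-labeled example."""
    _, closest_label = min(labeled_examples, key=lambda ex: math.dist(ex[0], point))
    return closest_label


def two_means(points, iterations=10):
    """Clustering: split unlabeled points into two groups (k-means with k=2)."""
    # Naive initialization (assumption of this sketch): first and last point.
    centers = [points[0], points[-1]]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            # Assign each point to whichever center is closer.
            idx = min((0, 1), key=lambda i: math.dist(p, centers[i]))
            groups[idx].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return groups


# Invented toy data (assumption): two hypothetical numeric features per email.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.0)]

# Classification: the training data includes labels.
labels = ["not spam", "not spam", "not spam", "spam", "spam", "spam"]
labeled_examples = list(zip(points, labels))
new_label = nearest_neighbor_label(labeled_examples, (7.5, 8.5))

# Clustering: only the points are given; the groups come out unnamed.
groups = two_means(points)

print(new_label)
print(groups)
```

Note that the clustering output is just two unnamed groups of points. Nothing in the data says which group, if any, corresponds to "spam"; that meaning only exists when labels are supplied, as in the classification half.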