The midterm

The idea is for you to explore using logistic regression, support vector machines, decision trees, and random forests.

I want you to explore at least 4 ways of running each one. As examples:

With logistic regression, there are quite a few parameters:

l2 penalties (or none). When using l2 penalties, which value of the C coefficient works best?
type of algorithm/solver to use
how multi-class problems (number of classes > 2) are handled
preprocessing: do you need to center/scale the data beforehand?
- My guess is no, since all the features are on the same scale, but it should be verified
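
As a starting point, here is a minimal sketch of how these logistic regression variations could be compared with cross-validation. It is not the required approach. The iris data is only a stand-in (an assumption; swap in the midterm data), and the names `logreg_candidates` and `logreg_scores` are made up for illustration. Multi-class handling is varied with `OneVsRestClassifier` rather than a constructor argument, and a very large C is used to approximate "no penalty", since C is the inverse of the regularization strength.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): replace with the midterm data set.
X, y = load_iris(return_X_y=True)

# Each entry is one "way of running" logistic regression.
logreg_candidates = {
    # Strength of the l2 penalty: small C = strong penalty, huge C = almost none.
    "l2, C=0.01": LogisticRegression(C=0.01, max_iter=2000),
    "l2, C=1": LogisticRegression(C=1.0, max_iter=2000),
    "l2, C=1e6 (nearly unpenalized)": LogisticRegression(C=1e6, max_iter=2000),
    # A different solver with otherwise default settings.
    "newton-cg solver": LogisticRegression(solver="newton-cg", max_iter=2000),
    # One-vs-rest instead of the default multinomial treatment.
    "one-vs-rest": OneVsRestClassifier(LogisticRegression(max_iter=2000)),
    # Center/scale the features before fitting.
    "scaled inputs": make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000)),
}

logreg_scores = {}
for name, model in logreg_candidates.items():
    # Mean accuracy over 5 stratified folds.
    logreg_scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:32s} accuracy = {logreg_scores[name]:.3f}")
```

Comparing the "scaled inputs" row with the "l2, C=1" row is one way to check the guess about preprocessing.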

And with Support Vector Machines (read: https://scikit-learn.org/stable/modules/svm.html), explore:

Different multi-class parameters
LinearSVC vs SVC
For SVC, different kernels
Different margins (the C parameter controls how soft the margin is)
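
The SVM options above could be explored along the same lines. This is only a sketch under assumptions: iris is a stand-in for the midterm data, and `svm_candidates` and `svm_scores` are illustrative names. `LinearSVC` exposes the multi-class strategy through its `multi_class` parameter, while `SVC` is varied here by kernel and by C.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC, LinearSVC

# Stand-in data (assumption): replace with the midterm data set.
X, y = load_iris(return_X_y=True)

svm_candidates = {
    # LinearSVC with two multi-class strategies.
    "LinearSVC, ovr": LinearSVC(multi_class="ovr", max_iter=10000),
    "LinearSVC, crammer_singer": LinearSVC(multi_class="crammer_singer", max_iter=10000),
    # SVC with different kernels.
    "SVC, linear kernel": SVC(kernel="linear"),
    "SVC, rbf kernel": SVC(kernel="rbf"),
    "SVC, poly kernel (degree 3)": SVC(kernel="poly", degree=3),
    # Softer vs. harder margin: small C tolerates more margin violations.
    "SVC, rbf, C=0.1": SVC(kernel="rbf", C=0.1),
    "SVC, rbf, C=10": SVC(kernel="rbf", C=10.0),
}

svm_scores = {}
for name, model in svm_candidates.items():
    # Mean accuracy over 5 stratified folds.
    svm_scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:28s} accuracy = {svm_scores[name]:.3f}")
```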

With classification trees (https://scikit-learn.org/stable/modules/tree.html#tree), sklearn has two types:

Decision Trees (https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier)
Extra Tree Classifier (https://scikit-learn.org/stable/modules/generated/sklearn.tree.ExtraTreeClassifier.html#sklearn.tree.ExtraTreeClassifier). I have never used this one.
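
For the two tree types, a comparison might look like the sketch below. The choice of which parameters to vary (split criterion and maximum depth) is just one possibility, iris is a stand-in data set, and `tree_candidates` and `tree_scores` are illustrative names. `random_state` is fixed so that repeated runs give the same trees.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier

# Stand-in data (assumption): replace with the midterm data set.
X, y = load_iris(return_X_y=True)

tree_candidates = {
    # Ordinary decision tree with two split criteria.
    "DecisionTree, gini": DecisionTreeClassifier(criterion="gini", random_state=0),
    "DecisionTree, entropy": DecisionTreeClassifier(criterion="entropy", random_state=0),
    # Limiting depth is a simple way to control overfitting.
    "DecisionTree, max_depth=2": DecisionTreeClassifier(max_depth=2, random_state=0),
    # Extremely randomized tree: split thresholds are drawn at random.
    "ExtraTree, default": ExtraTreeClassifier(random_state=0),
    "ExtraTree, max_depth=3": ExtraTreeClassifier(max_depth=3, random_state=0),
}

tree_scores = {}
for name, model in tree_candidates.items():
    # Mean accuracy over 5 stratified folds.
    tree_scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:28s} accuracy = {tree_scores[name]:.3f}")
```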

And there are random forests (https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier)
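
A random forest could be explored in the same way. The sketch below varies the number of trees, the tree depth, the number of features considered per split, and whether bootstrap samples are used. These four knobs are only a suggestion, iris is a stand-in data set, and `forest_candidates` and `forest_scores` are illustrative names.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in data (assumption): replace with the midterm data set.
X, y = load_iris(return_X_y=True)

forest_candidates = {
    # Number of trees in the forest.
    "10 trees": RandomForestClassifier(n_estimators=10, random_state=0),
    "100 trees": RandomForestClassifier(n_estimators=100, random_state=0),
    # Shallow trees.
    "100 trees, max_depth=2": RandomForestClassifier(
        n_estimators=100, max_depth=2, random_state=0
    ),
    # Consider every feature at each split instead of a random subset.
    "100 trees, max_features=None": RandomForestClassifier(
        n_estimators=100, max_features=None, random_state=0
    ),
    # Fit every tree on the full training set rather than a bootstrap sample.
    "100 trees, bootstrap=False": RandomForestClassifier(
        n_estimators=100, bootstrap=False, random_state=0
    ),
}

forest_scores = {}
for name, model in forest_candidates.items():
    # Mean accuracy over 5 stratified folds.
    forest_scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:30s} accuracy = {forest_scores[name]:.3f}")
```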

Read the documentation, select the 4+ ways you want to explore each of these 4 classifiers, AND WRITE UP NOTES IN MARKDOWN CELLS. Your interpretation and conclusions are really important.