Supervised Machine Learning

  • Label: the variable we are predicting

  • Feature: an input variable that describes our data

  • Example: a particular instance of the data

  • Labeled example: features together with a label

  • Unlabeled example: features but no label

  • Model: maps examples to predicted labels
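
A minimal sketch of how these terms fit together. The data values, the label names, and the threshold rule inside `model` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: each example is a tuple of feature values.
labeled_examples = [
    ((5.1, 3.5), "small"),   # (features, label)
    ((6.7, 3.0), "large"),
]
unlabeled_example = (5.0, 3.4)  # features only, no label


def model(features):
    """Toy model: map an example's features to a predicted label."""
    # Assumed rule for illustration: threshold on the first feature.
    return "large" if features[0] > 6.0 else "small"


predicted_label = model(unlabeled_example)
print(predicted_label)
```

In practice the rule inside `model` would be learned from the labeled examples rather than written by hand.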

Modeling Process

  1. Train the model on the training set

    • Fit the model

  2. Evaluate the model on the validation set

    • Estimate prediction error

  3. Repeat steps 1 and 2, adjusting the model, until the validation error is acceptable

  4. Record the final result on the test set

    • Final assessment of the prediction error
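
The four steps can be sketched in plain Python as follows. Everything here is an assumption made for illustration: the synthetic data (y = 2x plus noise), the 60/20/20 split, the one-parameter ridge model, and the candidate values of `lam`.

```python
import random


def fit(train, lam):
    """Fit y ~ w * x by ridge-regularised least squares (closed form)."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + lam)


def mean_squared_error(w, data):
    """Average squared difference between actual y and predicted w * x."""
    return sum((y - w * x) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
# Hypothetical synthetic data: y = 2x + small Gaussian noise.
xs = [rng.uniform(-1.0, 1.0) for _ in range(100)]
data = [(x, 2.0 * x + rng.gauss(0.0, 0.1)) for x in xs]
train, validation, test = data[:60], data[60:80], data[80:]

best_lam, best_w, best_val_error = None, None, float("inf")
for lam in [0.0, 0.1, 1.0, 10.0]:                  # step 3: repeat
    w = fit(train, lam)                            # step 1: fit on training set
    val_error = mean_squared_error(w, validation)  # step 2: estimate error
    if val_error < best_val_error:
        best_lam, best_w, best_val_error = lam, w, val_error

# Step 4: touch the test set once, only after the model has been chosen.
test_error = mean_squared_error(best_w, test)
print(best_lam, best_w, test_error)
```

The test set is kept out of the loop so that the recorded error is not biased by the model selection done on the validation set.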

Metrics

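  • True positive (TP): correctly identified as positive

  • True negative (TN): correctly identified as negative

  • False positive (FP): identified as positive, but it is actually negative

  • False negative (FN): identified as negative, but it is actually positive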

  • Accuracy: (TP+TN)/(TP+TN+FP+FN) correct classifications / total classifications

  • Precision: TP/(TP+FP) correct positives / everything classified as positive

  • Recall: TP/(TP+FN) correct positives / all actual positives

  • False positive rate: FP/(FP+TN)

  • F1: 2 * precision * recall / (precision + recall)
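
A small worked example of these metrics, assuming binary labels encoded as 1 (positive) and 0 (negative). The `actual` and `predicted` lists are made up, and the sketch does not guard against zero denominators.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


# Hypothetical labels: 4 actual positives, 6 actual negatives.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)  # 3, 5, 1, 1

accuracy = (tp + tn) / (tp + tn + fp + fn)            # 8 / 10
precision = tp / (tp + fp)                            # 3 / 4
recall = tp / (tp + fn)                               # 3 / 4
false_positive_rate = fp / (fp + tn)                  # 1 / 6
f1 = 2 * precision * recall / (precision + recall)    # 0.75

print(accuracy, precision, recall, false_positive_rate, f1)
```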

Loss function

  • A loss function quantifies the difference between the actual response variable and the response variable predicted by your model
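
As one possible illustration, squared error is a common loss for numeric responses. The notes do not name a specific loss, so the choice of squared error and the sample values below are assumptions.

```python
def squared_error(actual, predicted):
    """Loss for a single example: squared difference."""
    return (actual - predicted) ** 2


def mean_squared_error(actuals, predictions):
    """Average the per-example squared-error loss over a data set."""
    total = sum(squared_error(a, p) for a, p in zip(actuals, predictions))
    return total / len(actuals)


# Hypothetical actual and predicted response values.
actuals = [3.0, 5.0, 2.0]
predictions = [2.5, 5.0, 4.0]

loss = mean_squared_error(actuals, predictions)  # (0.25 + 0 + 4) / 3
print(loss)
```

A perfect prediction contributes zero loss, and larger errors are penalised more than proportionally because the difference is squared.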