Cyber Crime with Confusion Matrix and Types of Errors
What is a Confusion Matrix?
A confusion matrix is a technique for measuring the performance of a machine learning classifier. It is a table that summarizes how a classification model performs on a set of test data for which the true values are known. The idea itself is simple, but the related terminology can be a little confusing.
Structure of the Confusion Matrix
The size of the matrix depends on the number of output classes: a problem with n classes gives an n × n matrix. It is a square matrix in which we take the column headers as the actual values and the row headers as the model's predictions. Positive values that the model predicts as positive are True Positives (TP), negative values correctly predicted as negative are True Negatives (TN), negative values predicted as positive are False Positives (FP), and positive values predicted as negative are False Negatives (FN).
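As a rough illustration of this layout, the sketch below builds an n × n matrix in plain Python, with rows as model predictions and columns as actual values. The function name `build_confusion_matrix` and the traffic labels are hypothetical, invented only for this example.

```python
def build_confusion_matrix(actual, predicted, labels):
    """Return an n x n matrix: rows are predictions, columns are actual values."""
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for a, p in zip(actual, predicted):
        # Row = what the model predicted, column = what was really there.
        matrix[index[p]][index[a]] += 1
    return matrix


# Hypothetical traffic labels, invented for illustration.
labels = ["normal", "dos", "probe"]
actual = ["normal", "dos", "probe", "normal", "dos", "normal"]
predicted = ["normal", "dos", "normal", "normal", "probe", "normal"]

matrix = build_confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"predicted {label:>6}: {row}")
```

Correct predictions land on the diagonal, and every off-diagonal cell counts one specific kind of confusion between two classes.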
- TP (True Positive): We predict positive, and the prediction turns out to be true. For example, we predicted that France would win the World Cup, and it won.
- TN (True Negative): We predict negative, and the prediction is true. We predicted that England would not win, and it lost.
- FP (False Positive): We predict positive, but the prediction is false. We predicted that England would win, but it lost. This is also called a Type I error.
- FN (False Negative): We predict negative, but the prediction is false. We predicted that France would not win, but it won. This is also called a Type II error.
- The accuracy score can be calculated from the confusion matrix: Accuracy = (TP + TN) / (TP + TN + FP + FN)
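The four counts and the accuracy score can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper names `confusion_counts` and `accuracy` and the sample labels are assumptions made for this example, with 1 standing for the positive class.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p != positive and a != positive:
            tn += 1
        elif p == positive and a != positive:
            fp += 1  # predicted positive, actually negative (Type I)
        else:
            fn += 1  # predicted negative, actually positive (Type II)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Invented labels: 1 = positive, 0 = negative.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn)             # 3 3 1 1
print(accuracy(tp, tn, fp, fn))   # 0.75
```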
Two Types of Error in a Confusion Matrix
Type I error:
A Type I error is a false positive: the system is safe in reality, but the model predicts an attack. The security team is notified and checks for malicious activity. This does not cause direct harm, so this type of error is less dangerous. Such cases can be termed a false alarm.
Type II error:
A Type II error is a false negative, and it can prove to be very dangerous: the model predicts no attack, but an attack actually takes place. In that case no notification reaches the security team and no action can be taken to prevent it. The False Negative cases above fall into this category, so one aim of the model is to minimize this value.
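Taking an attack as the positive class and using the definitions from the list above (a false positive is a Type I error, a false negative is a Type II error), the two error types can be expressed as rates. The sketch below is only an illustration: the function name `error_rates` and all counts are hypothetical and do not come from a real detection system.

```python
def error_rates(tp, tn, fp, fn):
    """Return (false positive rate, false negative rate)."""
    # Type I: share of harmless events that raised a false alarm.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Type II: share of real attacks that the model missed.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Hypothetical counts for an intrusion detection model.
tp, tn, fp, fn = 40, 900, 50, 10

fpr, fnr = error_rates(tp, tn, fp, fn)
print(f"false alarm rate (Type I):    {fpr:.3f}")
print(f"missed attack rate (Type II): {fnr:.3f}")
```

With these made-up numbers the model raises a false alarm on about 5% of harmless events but misses 20% of real attacks, which is the more serious problem for a security team.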
Need for Confusion Matrix in Machine learning
Here are the advantages of using a confusion matrix:
- It shows not only how many errors a classifier makes but also which kinds of errors they are.
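- With the layout used in this article, each row of the confusion matrix holds the instances of a predicted class.
- Each column of the confusion matrix holds the instances of an actual class. (Some libraries, such as scikit-learn, use the opposite orientation, with actual classes in rows.)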
- This breakdown helps you overcome the limitation of relying on classification accuracy alone.
- It shows how a classification model gets confused when it makes predictions.
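The limitation of accuracy alone is easiest to see on imbalanced data. In the sketch below, which uses invented data (95 harmless events and 5 attacks), a model that never raises an alert still scores high accuracy, while the confusion matrix counts show that it misses every attack. The variable names are assumptions made for this example.

```python
# Hypothetical data: 95 harmless events (0) and 5 attacks (1).
actual = [0] * 95 + [1] * 5
# A useless model that never raises an alert.
predicted = [0] * 100

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)
tn = sum(1 for a, p in pairs if a == 0 and p == 0)
fp = sum(1 for a, p in pairs if a == 0 and p == 1)
fn = sum(1 for a, p in pairs if a == 1 and p == 0)

accuracy = (tp + tn) / len(actual)
# Share of real attacks that were detected.
detected_share = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy: {accuracy:.2f}")              # accuracy: 0.95
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")       # TP=0 TN=95 FP=0 FN=5
print(f"attacks detected: {detected_share:.2f}")  # attacks detected: 0.00
```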