How to build a function that will choose an optimal threshold for class probabilities? (in Python)
The output of my neural network is a table of predicted class probabilities:
print(probabilities)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|--------------|--------------|-----|--------------|--------------|--------------|
| 0 | 2.442745e-05 | 5.952136e-06 | ... | 4.254002e-06 | 1.894523e-05 | 1.033957e-05 |
| 1 | 7.685694e-05 | 3.252202e-06 | ... | 3.617730e-06 | 1.613792e-05 | 7.356643e-06 |
| 2 | 2.296657e-06 | 4.859554e-06 | ... | 9.934525e-06 | 9.244772e-06 | 1.377618e-05 |
| 3 | 5.163169e-04 | 1.044035e-04 | ... | 1.435158e-04 | 2.807420e-04 | 2.346930e-04 |
| 4 | 2.484626e-06 | 2.074290e-06 | ... | 9.958628e-06 | 6.002510e-06 | 8.434519e-06 |
| 5 | 1.297477e-03 | 2.211737e-04 | ... | 1.881772e-04 | 3.171079e-04 | 3.228884e-04 |
I converted it to class labels using a threshold (0.2) to measure the accuracy of my prediction:
predictions = (probabilities > 0.2).astype(int)
print(predictions)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|---|---|-----|------|------|------|
| 0 | 0 | 0 | ... | 0 | 0 | 0 |
| 1 | 0 | 0 | ... | 0 | 0 | 0 |
| 2 | 0 | 0 | ... | 0 | 0 | 0 |
| 3 | 0 | 0 | ... | 0 | 0 | 0 |
| 4 | 0 | 0 | ... | 0 | 0 | 0 |
| 5 | 0 | 0 | ... | 0 | 0 | 0 |
Also I have a test set:
print(Y_test)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|---|---|-----|------|------|------|
| 0 | 0 | 0 | ... | 0 | 0 | 0 |
| 1 | 0 | 0 | ... | 0 | 0 | 0 |
| 2 | 0 | 0 | ... | 0 | 0 | 0 |
| 3 | 0 | 0 | ... | 0 | 0 | 0 |
| 4 | 0 | 0 | ... | 0 | 0 | 0 |
| 5 | 0 | 0 | ... | 0 | 0 | 0 |
Question: How to build an algorithm in Python that will choose the optimal threshold, i.e. the one that maximizes roc_auc_score(average = 'micro') or another metric?
Maybe it is possible to build a manual function in Python that optimizes the threshold, depending on the accuracy metric.
Might want to take a look at roc_curve. This will help you adjust your threshold. There's no right/wrong threshold. It depends on your business's tolerance for false positives.
– Scratch'N'Purr
52 mins ago
@Scratch'N'Purr, ok, but I also want to be able to manually change the accuracy metric (e.g. accuracy_score, f1_score). So maybe it is possible to build a manual function in Python that optimizes the threshold.
– lemon
50 mins ago
Gotcha, in that case, my best answer for you is to build a function that takes a threshold argument, uses your NN to generate the probabilities instead of the class values, and then determines the class using the threshold. Then, run a grid search over your threshold array to find the best threshold.
– Scratch'N'Purr
31 mins ago
@Scratch'N'Purr, ok. However, how do I implement a grid search to find the best threshold in Python?
– lemon
28 mins ago
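For illustration, a minimal sketch of such a grid search, assuming f1_score with average='micro' as the metric to tune and reusing probabilities and Y_test from the question (the grid resolution is an arbitrary choice):
import numpy as np
from sklearn.metrics import f1_score

# Try a grid of candidate thresholds and keep the best-scoring one
thresholds = np.linspace(0.01, 0.99, 99)
scores = [f1_score(Y_test, (probabilities > t).astype(int), average='micro')
          for t in thresholds]
best_thr = thresholds[int(np.argmax(scores))]
print(best_thr, max(scores))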
What are the columns you are showing? Do you have 8357 classes? And is the class membership unique (each sample can belong only to one class) or you are in a multi-label context (samples can belong to more than one class)?
– desertnaut
13 mins ago
2 Answers
The best way to do so is to put a logistic regression on top of your new dataset. It will multiply every probability by a certain constant and thus provide an automatic threshold on the output (with the LR you just need to predict the class, not the probabilities).
You need to train this by subdividing the test set in two and using one part to train the LR after predicting the output with the NN.
This is not the only way to do it, but it works fine for me every time.
We have X_train_NN, X_valid_NN, X_test_NN, and we subdivide X_test_NN into X_train_LR and X_test_LR (or do a stratified k-fold as you wish); a sketch of this subdivision follows the code sample below.
Here is a sample of the code:
from sklearn import linear_model

# Use the NN's predicted probabilities as input features for the LR
X_train = NN.predict_proba(X_train_LR)
X_test = NN.predict_proba(X_test_LR)

logistic = linear_model.LogisticRegression(C=1.0, penalty='l2')
logistic.fit(X_train, Y_train_LR)
logistic.score(X_test, Y_test_LR)
You consider your output as a new dataset and train a LR on this new dataset.
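For the subdivision itself, here is a minimal sketch, assuming X_test_NN and Y_test_NN hold the data not used to train the NN (the 50/50 split and random_state are arbitrary choices):
from sklearn.model_selection import train_test_split

# Split the NN's held-out data so the LR is trained and evaluated on disjoint samples
X_train_LR, X_test_LR, Y_train_LR, Y_test_LR = train_test_split(
    X_test_NN, Y_test_NN, test_size=0.5, random_state=0)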
I have tried to use a logistic regression, but I have a dataset with approximately 8k classes, so logistic regression is too slow in these conditions. A neural network is one of the best solutions for huge multi-label classification.
– lemon
54 mins ago
don't replace the NN, put a LR on top of it
– Alexis
47 mins ago
Clear, sorry for the misunderstanding. But unfortunately I do not quite understand how to implement it. Can you provide a sample of code?
– lemon
30 mins ago
I assume your ground-truth labels are Y_test and your predictions are predictions.
Optimizing roc_auc_score(average = 'micro') according to a prediction threshold does not seem to make sense, as AUCs are computed based on how predictions are ranked, and therefore need predictions as float values in [0, 1].
Therefore, I will discuss accuracy_score.
You could use scipy.optimize.fmin:
import numpy as np
import scipy.optimize
from sklearn.metrics import accuracy_score

def thr_to_accuracy(thr, Y_test, predictions):
    # fmin minimizes, so return the negative accuracy
    return -accuracy_score(Y_test, (predictions > thr).astype(int))

best_thr = scipy.optimize.fmin(thr_to_accuracy, x0=0.5, args=(Y_test, predictions))
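Note that fmin returns an array, so the scalar threshold is best_thr[0]. One caveat: accuracy is a piecewise-constant function of the threshold, so a derivative-free local search like fmin can stall on a flat region; a grid sweep over candidate thresholds, like the one sketched in the comments above, is a robust fallback.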