How to build a function that will choose an optimal threshold for class probabilities? (in Python)
The output of my neural network is a table of predicted class probabilities:
print(probabilities)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|--------------|--------------|-----|--------------|--------------|--------------|
| 0 | 2.442745e-05 | 5.952136e-06 | ... | 4.254002e-06 | 1.894523e-05 | 1.033957e-05 |
| 1 | 7.685694e-05 | 3.252202e-06 | ... | 3.617730e-06 | 1.613792e-05 | 7.356643e-06 |
| 2 | 2.296657e-06 | 4.859554e-06 | ... | 9.934525e-06 | 9.244772e-06 | 1.377618e-05 |
| 3 | 5.163169e-04 | 1.044035e-04 | ... | 1.435158e-04 | 2.807420e-04 | 2.346930e-04 |
| 4 | 2.484626e-06 | 2.074290e-06 | ... | 9.958628e-06 | 6.002510e-06 | 8.434519e-06 |
| 5 | 1.297477e-03 | 2.211737e-04 | ... | 1.881772e-04 | 3.171079e-04 | 3.228884e-04 |
I converted it to class labels using a threshold (0.2) to measure the accuracy of my prediction:
predictions = (probabilities > 0.2).astype(int)
print(predictions)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|---|---|-----|------|------|------|
| 0 | 0 | 0 | ... | 0 | 0 | 0 |
| 1 | 0 | 0 | ... | 0 | 0 | 0 |
| 2 | 0 | 0 | ... | 0 | 0 | 0 |
| 3 | 0 | 0 | ... | 0 | 0 | 0 |
| 4 | 0 | 0 | ... | 0 | 0 | 0 |
| 5 | 0 | 0 | ... | 0 | 0 | 0 |
Also I have a test set:
print(Y_test)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|---|---|-----|------|------|------|
| 0 | 0 | 0 | ... | 0 | 0 | 0 |
| 1 | 0 | 0 | ... | 0 | 0 | 0 |
| 2 | 0 | 0 | ... | 0 | 0 | 0 |
| 3 | 0 | 0 | ... | 0 | 0 | 0 |
| 4 | 0 | 0 | ... | 0 | 0 | 0 |
| 5 | 0 | 0 | ... | 0 | 0 | 0 |
Question: How to build an algorithm in Python that will choose the optimal threshold, i.e. the one that maximizes roc_auc_score(average = 'micro') or another metric?
Maybe it is possible to build a manual function in Python that optimizes the threshold, depending on the accuracy metric.
Might want to take a look at roc_curve. This will help you adjust your threshold. There's no right/wrong threshold. It depends on your business's tolerance for false positives.
– Scratch'N'Purr
52 mins ago
@Scratch'N'Purr, ok, but I also want to be able to manually change the accuracy metric (e.g. accuracy_score, f1_score). So maybe it is possible to build a manual function in Python that optimizes the threshold.
– lemon
50 mins ago
Gotcha, in that case, my best answer for you is to build a function that takes a threshold argument, uses your NN to generate the probabilities instead of the class values, and then determines the class using the threshold. Then, run a grid search over your threshold array to find the best threshold.
– Scratch'N'Purr
31 mins ago
@Scratch'N'Purr, ok. However, how do I implement a grid search to find the best threshold in Python?
– lemon
28 mins ago
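For illustration, a minimal sketch of such a grid search, assuming f1_score with average='micro' as the metric to tune and reusing probabilities and Y_test from the question (the grid resolution is an arbitrary choice):
import numpy as np
from sklearn.metrics import f1_score

# Try a grid of candidate thresholds and keep the best-scoring one
thresholds = np.linspace(0.01, 0.99, 99)
scores = [f1_score(Y_test, (probabilities > t).astype(int), average='micro')
          for t in thresholds]
best_thr = thresholds[int(np.argmax(scores))]
print(best_thr, max(scores))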
What are the columns you are showing? Do you have 8357 classes? And is the class membership unique (each sample can belong only to one class) or you are in a multi-label context (samples can belong to more than one class)?
– desertnaut
13 mins ago
2 Answers
The best way to do so is to put a logistic regression on top of your new dataset. It will multiply every probability by a certain constant and thus provide an automatic threshold on the output (with the LR you just need to predict the class, not the probabilities).
You need to train this by subdividing the test set in two and using one part to train the LR after predicting the output with the NN.
This is not the only way to do it, but it works fine for me every time.
We have X_train_NN, X_valid_NN, X_test_NN, and we subdivide X_test_NN into X_train_LR and X_test_LR (or do a stratified k-fold as you wish); a sketch of this subdivision follows the code sample below.
Here is a sample of the code:
from sklearn import linear_model

# Use the NN's predicted probabilities as input features for the LR
X_train = NN.predict_proba(X_train_LR)
X_test = NN.predict_proba(X_test_LR)

logistic = linear_model.LogisticRegression(C=1.0, penalty='l2')
logistic.fit(X_train, Y_train_LR)
logistic.score(X_test, Y_test_LR)
You consider your output as a new dataset and train a LR on this new dataset.
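For the subdivision itself, here is a minimal sketch, assuming X_test_NN and Y_test_NN hold the data not used to train the NN (the 50/50 split and random_state are arbitrary choices):
from sklearn.model_selection import train_test_split

# Split the NN's held-out data so the LR is trained and evaluated on disjoint samples
X_train_LR, X_test_LR, Y_train_LR, Y_test_LR = train_test_split(
    X_test_NN, Y_test_NN, test_size=0.5, random_state=0)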
I have tried to use a logistic regression, but I have a dataset with approximately 8k classes, so logistic regression is too slow in these conditions. A neural network is one of the best solutions for huge multi-label classification.
– lemon
54 mins ago
don't replace the NN, put a LR on top of it
– Alexis
47 mins ago
Clear, sorry for the misunderstanding. But unfortunately I do not quite understand how to implement it. Can you provide a sample of code?
– lemon
30 mins ago
I assume your ground-truth labels are Y_test and your predictions are predictions.
Optimizing roc_auc_score(average = 'micro') according to a prediction threshold does not seem to make sense, as AUCs are computed based on how predictions are ranked, and therefore need predictions as float values in [0, 1].
Therefore, I will discuss accuracy_score.
You could use scipy.optimize.fmin:
import numpy as np
import scipy.optimize
from sklearn.metrics import accuracy_score

def thr_to_accuracy(thr, Y_test, predictions):
    # fmin minimizes, so return the negative accuracy
    return -accuracy_score(Y_test, (predictions > thr).astype(int))

best_thr = scipy.optimize.fmin(thr_to_accuracy, x0=0.5, args=(Y_test, predictions))
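Note that fmin returns an array, so the scalar threshold is best_thr[0]. One caveat: accuracy is a piecewise-constant function of the threshold, so a derivative-free local search like fmin can stall on a flat region; a grid sweep over candidate thresholds, like the one sketched in the comments above, is a robust fallback.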