
I'm using the GridSearch function from the hypopt package to do hyperparameter searching with a specified validation set. The default metric for classification seems to be accuracy (I'm not sure). I want to use the F1 score as the metric instead, but I don't know where to specify it. I looked at the documentation but found it confusing.

Does anyone familiar with the hypopt package know how I can do this? Thanks a lot in advance.

from hypopt import GridSearch
from sklearn.linear_model import LogisticRegression

log_reg_params = {"penalty": ['l1'], 'C': [0.001, 0.01]}
opt = GridSearch(model=LogisticRegression())
opt.fit(X_train, y_train, log_reg_params, X_val, y_val)

The default metric of the hypopt package is the score() function of whatever model you use, so in your case it is LogisticRegression().score(), which defaults to accuracy.
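For illustration, here is a minimal sketch of what that default amounts to, using scikit-learn's standard API (a fitted classifier's score() returns mean accuracy on the given data):

from sklearn.linear_model import LogisticRegression

# score() on a fitted sklearn classifier returns mean accuracy,
# which is the metric hypopt falls back to when none is specified.
clf = LogisticRegression().fit(X_train, y_train)
print(clf.score(X_val, y_val))  # fraction of correct predictions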

If you upgrade the hypopt package to version 1.0.8 via pip install hypopt --upgrade, you can specify any metric of your choosing in the scoring parameter of GridSearch.fit(), for example, fit(scoring='f1'). Here is a simple working example based on your code that uses the F1 metric:

from hypopt import GridSearch
from sklearn.linear_model import LogisticRegression

param_grid = {"penalty": ['l1'], 'C': [0.001, 0.01]}
opt = GridSearch(model=LogisticRegression(), param_grid=param_grid)
# This will use F1 score as the scoring metric that you optimize.
opt.fit(X_train, y_train, X_val, y_val, scoring='f1')

hypopt supports nearly every scoring function that sklearn supports.

  • For classification, hypopt supports these metrics (as strings): 'accuracy', 'brier_score_loss', 'average_precision', 'f1', 'f1_micro', 'f1_macro', 'f1_weighted', 'neg_log_loss', 'precision', 'recall', or 'roc_auc'.
  • For regression, hypopt supports: "explained_variance", "neg_mean_absolute_error", "neg_mean_squared_error", "neg_mean_squared_log_error", "neg_median_absolute_error", "r2".
  • You can also create your own metric your_custom_score_func(y_true, y_pred) by wrapping it into a scorer object like this (a fuller sketch follows below):

    from sklearn.metrics import make_scorer
    scorer = make_scorer(your_custom_score_func)
    opt.fit(X_train, y_train, X_val, y_val, scoring=scorer)
    
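    To make that pattern concrete, here is a self-contained sketch; macro_f1 is a hypothetical stand-in for your_custom_score_func, and any function taking (y_true, y_pred) and returning a float works the same way:

    from sklearn.metrics import f1_score, make_scorer

    def macro_f1(y_true, y_pred):
        # Hypothetical example metric: unweighted mean of per-class F1 scores.
        return f1_score(y_true, y_pred, average='macro')

    scorer = make_scorer(macro_f1)
    opt.fit(X_train, y_train, X_val, y_val, scoring=scorer)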

    You can learn more in the hypopt.GridSearch.fit() docstring here:

  • https://github.com/cgnorthcutt/hypopt/blob/master/hypopt/model_selection.py#L240

    You can learn more about creating your own custom scoring metrics here:

  • Example: https://github.com/cgnorthcutt/hypopt/blob/master/tests/test_core.py#L371
  • Documentation: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html
    Comments:

  • Thank you so much for your detailed answer! One more question: is hypopt the same as the hyperopt package? – zesla Oct 23, 2018 at 13:22
  • Great, mark and upvote it if it's the right answer. To answer your question: not at all. hypopt is a general package for finding the best parameter settings for any model, and it is the only Python package for the simple case of optimizing parameters with a validation set. hyperopt, by contrast, implements a tree-search algorithm for a specific case of Bayesian Gaussian processes. There are a number of packages for Bayesian hyperparameter optimization; hyperopt may be a good one, but I haven't used it. @zesla – cgnorthcutt Oct 24, 2018 at 15:01
  • @zesla Note you'll want to upgrade to hypopt version 1.0.7; there was a minus-sign error in 1.0.6. Fixed now :) – cgnorthcutt Oct 25, 2018 at 17:31
