Hyperparameter tuning for TensorFlow

I am searching for a hyperparameter tuning package for code written directly in TensorFlow (not Keras or TFLearn). Could you make some suggestions?

Sicanian answered 25/5, 2017 at 13:13 Comment(1)
github.com/cerlymarco/keras-hypetune - Bonniebonns

Usually you don't need to have your hyperparameter optimisation logic coupled with the optimised model (unless your hyperparameter optimisation logic is specific to the kind of model that you are training, in which case you would need to tell us a bit more). There are several tools and packages available for the task. Here is a good paper on the topic, and here is a more practical blog post with examples.

  • hyperopt implements random search and Tree of Parzen Estimators optimization.
  • Scikit-Optimize implements a few others, including Gaussian process Bayesian optimization.
  • SigOpt is a convenient service (paid, although with a free tier and extra allowance for students and researchers) for hyperparameter optimization. It is based upon Yelp's MOE, which is open source (although the published version doesn't seem to be updated much) and can, in theory, be used on its own, although it would take some additional effort.
  • Spearmint is a commonly referenced package too, also open source but not free for commercial purposes (although you can fall back to a less restrictive older version). It looks good, but it's not very active, and the available version is not compatible with Python 3 (even though pull requests have been submitted to fix that).
  • BayesOpt seems to be the gold standard in Bayesian optimization, but it's mainly C++, and the Python interface doesn't look well documented.

Out of these, I have only really (that is, with a real problem) used hyperopt with TensorFlow, and it didn't take too much effort. The API is a bit weird at some points and the documentation is not terribly thorough, but it does work and seems to be under active development, with more optimization algorithms and adaptations (e.g. specifically for neural networks) possibly coming. However, as suggested in the previously linked blog post, Scikit-Optimize is probably just as good, and SigOpt looks quite easy to use if it fits you.
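
For illustration, a minimal sketch of what tuning with hyperopt can look like (train_and_evaluate is a hypothetical placeholder for your own TensorFlow training and validation code, and the search space is just an example):

from hyperopt import fmin, tpe, hp, STATUS_OK, Trials

# example search space: log-uniform learning rate and a choice of layer sizes
space = {
    "learning_rate": hp.loguniform("learning_rate", -9, -2),
    "hidden_units": hp.choice("hidden_units", [64, 128, 256]),
}

def objective(params):
    # train_and_evaluate is a placeholder for your own TensorFlow code;
    # it should return the validation loss for the given hyperparameters
    val_loss = train_and_evaluate(params)
    return {"loss": val_loss, "status": STATUS_OK}

trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=trials)
print(best)  # the best hyperparameter values found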

Amal answered 25/5, 2017 at 13:50 Comment(2)
I'm new to DNNs, but I did some parameter grid search with scikit-learn (traditional ML). My question is: grid search for DNNs requires too much computation power, so is it practical? - Hightower
@scotthuang Take a look at this paper. Besides describing several other methods, one of the conclusions is that even doing random search may be more efficient, since frequently only a small subset of hyperparameters plays a significant role in the model's performance. - Amal

I'd like to add one more library to @jdehesa's list, which I have applied in my research, particularly with TensorFlow: hyper-engine, Apache 2.0 licensed.

It also implements Gaussian Process Bayesian optimization and some other techniques, like learning curve prediction, which save a lot of time.

Enthetic answered 17/9, 2017 at 7:18 Comment(0)

You can try out Ray Tune, a simple library for scaling hyperparameter search. I mainly use it for Tensorflow model training, but it's agnostic to the framework - works seamlessly with PyTorch, Keras, etc. Here's the docs page - ray.readthedocs.io/en/latest/tune.html

You can use it to run distributed versions of state-of-the-art algorithms such as HyperBand or Bayesian Optimization in about 10 lines of code.

As an example, to run 4 parallel evaluations at a time:

import ray
import ray.tune as tune
from ray.tune.schedulers import HyperBandScheduler
import tensorflow as tf


def train_model(config, reporter):  # the extra reporter argument reports metrics back to Tune
    # build_tf_model, some_loss_function and get_statistics are placeholders
    # for your own model-building, loss and evaluation code
    model = build_tf_model(config["alpha"], config["beta"])
    loss = some_loss_function(model)
    train_op = tf.train.AdamOptimizer().minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(20):
            sess.run(train_op)
            stats = get_statistics()
            reporter(timesteps_total=i,
                     mean_accuracy=stats["accuracy"])


ray.init(num_cpus=4)
tune.run(train_model,
         name="my_experiment",
         stop={"mean_accuracy": 100},
         config={
             "alpha": tune.grid_search([0.2, 0.4, 0.6]),
             "beta": tune.grid_search([1, 2])
         },
         # newer Ray versions use metric="mean_accuracy", mode="max" instead of reward_attr
         scheduler=HyperBandScheduler(reward_attr="mean_accuracy"))

You also don't need to change your code if you want to run this script on a cluster.

Disclaimer: I work on this project - let me know if you have any feedback!

Fettling answered 4/4, 2018 at 7:47 Comment(2)
One thing I haven't been able to figure out from looking at the Ray Tune examples: how do I get the trained model object after tune.run_experiments(...) was called? - Schistosome
Use analysis = tune.run(...), and then analysis.get_best_config(). - Fettling
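
A minimal sketch of that retrieval, assuming a recent Ray version and the mean_accuracy metric reported in the example above:

# run the same experiment as above, but keep the returned analysis object
analysis = tune.run(train_model,
                    name="my_experiment",
                    stop={"mean_accuracy": 100},
                    config={"alpha": tune.grid_search([0.2, 0.4, 0.6]),
                            "beta": tune.grid_search([1, 2])})
# look up the hyperparameter configuration of the best trial
best_config = analysis.get_best_config(metric="mean_accuracy", mode="max")
print(best_config)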

I'm not sure if these are the parameters you want, but since you mentioned TensorFlow hyperparameters I can suggest some.

Clone this repository to get the needed scripts:

git clone https://github.com/googlecodelabs/tensorflow-for-poets-2

and in the master folder, open your command prompt and run this line:

python -m scripts.retrain -h

to get the list of optional arguments.
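
For example, a typical invocation overriding a couple of hyperparameters might look like this (a sketch only: the flag names are taken from that script's help output, and the image directory is the one used in the codelab):

# assumes you have already downloaded the codelab's flower_photos data set
python -m scripts.retrain \
  --image_dir=tf_files/flower_photos \
  --learning_rate=0.01 \
  --how_many_training_steps=500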

Source: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#6

Cyanotype answered 23/8, 2018 at 9:58 Comment(0)

I found scikit-optimize very simple to use for Bayesian optimization of hyperparameters, and it works with any TensorFlow API (estimator, custom estimator, core, Keras, etc.).

https://mcmap.net/q/1172316/-hyperparameter-tuning-locally-tensorflow-google-cloud-ml-engine
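
As a rough sketch (train_and_score is a hypothetical placeholder for your own TensorFlow training and evaluation code, and the search space is just an example):

from skopt import gp_minimize
from skopt.space import Real, Integer

# example search space: log-uniform learning rate and an integer layer size
space = [
    Real(1e-4, 1e-1, prior="log-uniform", name="learning_rate"),
    Integer(32, 256, name="hidden_units"),
]

def objective(params):
    learning_rate, hidden_units = params
    # train_and_score is a placeholder; it should return the value to minimize,
    # e.g. the validation loss of the trained TensorFlow model
    return train_and_score(learning_rate, hidden_units)

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print(result.x, result.fun)  # best hyperparameters and best objective value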

Feodora answered 2/12, 2018 at 16:54 Comment(0)

You could use variational inference (Bayesian) to place a point cloud over the optimization space; hyperparameter tuning would be much better that way. TensorFlow Probability would be one approach.

Coheman answered 14/3, 2019 at 3:46 Comment(0)
