Google Cloud - Compute Engine VS Machine Learning

Does anyone know the difference between using Google Cloud Machine Learning compared to a virtual machine instance on Google Compute Engine?

I am using Keras with Python 3 and feel like Cloud ML is more restrictive (Python 2.7 only, an older version of TensorFlow, a required project structure...). I assume there are benefits to using Cloud ML over a VM in GCE, but I would like to know what they are.

Discriminate answered 1/6, 2017 at 14:49 Comment(8)
I run TF on simple Ubuntu VMs in Compute Engine, and there you have a lot of flexibility in what libraries to use, etc. From what I understand, in Cloud ML a lot of stuff is done for you behind the scenes, so it's more convenient but you have less flexibility. I thought one big thing regarding Cloud ML is that they actually use TPUs? I haven't seen TPUs available in Compute Engine, so there it's just regular CPUs and now GPUs (although I still haven't managed to get one working!). Also, in terms of pricing, with VMs you just pay for usage time, but with Cloud ML it's a bit trickierPritchett
It seems that for my needs (training faster and not tying up my personal computer), there is no real benefit to using Cloud ML. Regarding TPUs: they are not available now, but they will come to Compute Engine as well; you can connect to Cloud TPUs from custom VM types. I guess my only remaining question now is whether I could/should use the hyperparameter optimisation tool (from Cloud ML) or another tool in the VM (e.g. HyperOpt).Discriminate
For hyperparameter optimisation, use VM tools rather than Cloud MLReinert
@VikasGupta That is indeed what I am planning to do after all (mainly for simplicity). I am still curious why you would use a VM tool instead of the Cloud ML one?Discriminate
I have used VMs, Google Cloud ML and Azure ML. In terms of ready-made features, the managed ML platforms are good, but for flexibility they are not.Reinert
Thanks for the link re TPUs. I've been waiting for that; need to check this out...Pritchett
For me personally, I always need to integrate TF with other, non-TF stuff, so I find it hard to imagine running things in Cloud ML: you have to think about how to connect what you've done in TF with the rest of your logic/tasks. In this sense, using VMs is much easier. Regarding hyperparameter tuning, TensorBoard is very helpful, but of course that's basically a manual process. I normally run a handful of models simultaneously on one VM using tmux and compare how accuracy/cost/etc. progress for all of them in TensorBoard.Pritchett
@VikasGupta Do you know of any such tool, or prior work done with TensorFlow? The hyperparameter tuning in Cloud ML is rather convenient, so I am curious whether there is anything similar. I am looking for something I can run on my local computer first, to get a sense of the boundaries to set in the 'proper' tuning, and maybe even of which parameters are worth tuning (or which combinations, for that matter).Townley
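
The kind of search Cloud ML's tuner (or a library like HyperOpt) performs can be approximated locally with plain random search; the sketch below uses a hypothetical stand-in objective in place of a real training run, so the names and the objective itself are illustrative only:

```python
import random

def train_and_evaluate(learning_rate, hidden_units):
    # Stand-in for a real training run; returns a loss to minimise.
    # (Hypothetical objective whose optimum is lr=0.01, 64 units.)
    return (learning_rate - 0.01) ** 2 + (hidden_units - 64) ** 2 / 1e4

def random_search(trials=20, seed=0):
    rng = random.Random(seed)
    best_loss, best_params = float("inf"), None
    for _ in range(trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-4, -1),   # log-uniform sample
            "hidden_units": rng.choice([16, 32, 64, 128]),
        }
        loss = train_and_evaluate(**params)
        if loss < best_loss:
            best_loss, best_params = loss, params
    return best_loss, best_params

best_loss, best_params = random_search()
print(best_loss, best_params)
```

Running a loop like this locally with a cheap proxy objective is one way to narrow down the parameter ranges before handing the expensive search to a proper tuner.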

Google Cloud ML is a fully managed service whereas Google Compute Engine is not (the latter is IaaS).

Assuming that you just want to know some differences for the case when you have your own model, here you have some:

  • The most noticeable feature of Google Cloud ML is the deployment itself. You don't have to take care of setting up your cluster (that is, scaling), launching it, installing the packages and deploying your model for training. This is all done automatically; in Compute Engine you would have to do it yourself, although you would be unrestricted in what you can install.

    Although you can automate all of that deployment more or less, there is no magic to it. In fact, the logs of a Cloud ML training job show it is quite rudimentary: a cluster of instances is launched, TensorFlow is installed on it, and your model is run with the options you set. This is because TensorFlow is a framework decoupled from Google's internal systems.

  • However, there is a substantial difference between Cloud ML and Compute Engine when it comes to prediction, and I would say that is mostly what you pay for with Cloud ML. You can deploy a model in Cloud ML for online and batch prediction pretty much out of the box. In Compute Engine, you would have to handle all the quirks of TensorFlow Serving yourself, which is not trivial (compared to training your model).

  • Another advantage of Cloud ML is hyperparameter tuning. It is no more than a somewhat smart brute-forcing tool for finding the best combination of hyperparameters for your model. You could probably automate this in Compute Engine, but then you would have to work out the optimisation algorithm yourself to find the combinations of parameter values that improve the objective function (usually maximising your accuracy or reducing your loss).

  • Finally, pricing is slightly different in either service. Until recently, Cloud ML pricing was on a par with its competitors (you paid for computing time in both training and prediction, plus a per-prediction fee, which you could compare against computing time in Compute Engine). Now, however, you only pay for computing time (and it is even cheaper than before), which probably makes managing and scaling your own TensorFlow cluster in Compute Engine pointless in most scenarios.
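
For reference, the managed workflow described in the first two points can be sketched with the gcloud CLI of that era; all job, model, bucket and file names below are hypothetical placeholders, and the exact flags may vary between SDK versions:

```shell
# Submit a training job: Cloud ML provisions the cluster, installs
# TensorFlow and runs trainer.task for you (names are placeholders).
gcloud ml-engine jobs submit training my_training_job \
    --module-name trainer.task \
    --package-path ./trainer \
    --region us-central1 \
    --staging-bucket gs://my-bucket

# Deploy the exported SavedModel for serving.
gcloud ml-engine models create my_model --regions us-central1
gcloud ml-engine versions create v1 \
    --model my_model \
    --origin gs://my-bucket/export/

# Request an online prediction; on Compute Engine you would have to
# run TensorFlow Serving yourself to get an equivalent endpoint.
gcloud ml-engine predict --model my_model --json-instances instances.json
```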

Townley answered 25/12, 2017 at 14:41 Comment(0)
