I have a trained model that I want to use in a Python application, but I can't find any examples of deploying to a production environment without installing TensorFlow or creating a gRPC service. Is that possible at all? What is the right approach in this situation?
How do you want to serve it if not with TensorFlow itself or TensorFlow Serving? Do you plan on reimplementing the TensorFlow operations to get the same semantics?
That said, with XLA there is now a way to compile a TensorFlow model into a binary which can be called from C++. See the documentation on tfcompile for an example.
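To give a rough sense of what tfcompile produces: the tf_library Bazel macro ahead-of-time compiles a frozen graph into an object file plus a generated C++ header, and the generated class is used roughly as below. This is a minimal sketch adapted from the tfcompile tutorial; the class name foo::bar::MatMulComp and the 2x3 / 3x2 matmul shapes come from that tutorial's example graph, not from your model:

```cpp
#define EIGEN_USE_THREADS
#define EIGEN_USE_CUSTOM_THREAD_POOL

#include <algorithm>
#include <iostream>
#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
#include "tensorflow/compiler/aot/tests/test_graph_tfmatmul.h"  // generated by tf_library

int main(int argc, char** argv) {
  // The compiled computation uses Eigen for any runtime thread parallelism.
  Eigen::ThreadPool tp(2);
  Eigen::ThreadPoolDevice device(&tp, tp.NumThreads());

  // foo::bar::MatMulComp is the cpp_class named in the tf_library BUILD rule.
  foo::bar::MatMulComp matmul;
  matmul.set_thread_pool(&device);

  // Fill the argument buffers (a 2x3 and a 3x2 matrix in the tutorial graph),
  // run the compiled graph, and read the result buffer.
  const float args[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
  std::copy(args + 0, args + 6, matmul.arg0_data());
  std::copy(args + 6, args + 12, matmul.arg1_data());
  matmul.Run();

  std::cout << "result: " << matmul.result0(0, 0) << std::endl;
  return 0;
}
```

The resulting binary links against the compiled graph and a small runtime rather than against full TensorFlow, which is what makes this route relevant to the no-TensorFlow deployment the question asks about.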
You can deploy a TensorFlow model without TensorFlow by using NVIDIA's TensorRT deep learning inference library, which has supported TensorFlow models since version 3 of the library. It is tailored for inference, so it is a very good choice if you meet its requirements (a rough sketch of the workflow follows below).
However, it won't work for you if you plan to do inference on CPU, or on a platform that TensorRT does not support (e.g. Windows).
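To make the workflow concrete: you first convert a frozen TensorFlow graph to UFF with NVIDIA's converter tool, then build an inference engine from it with the TensorRT C++ API. The sketch below follows the TensorRT 3-era UFF samples; the file name model.uff, the tensor names "input"/"output", and the 3x224x224 input shape are placeholders for your own model, and the exact parser signatures differ slightly across TensorRT versions:

```cpp
#include <iostream>
#include "NvInfer.h"
#include "NvUffParser.h"

using namespace nvinfer1;
using namespace nvuffparser;

// TensorRT reports build and runtime messages through a user-supplied logger.
class Logger : public ILogger {
  void log(Severity severity, const char* msg) override {
    if (severity != Severity::kINFO) std::cerr << msg << std::endl;
  }
} gLogger;

int main() {
  // Parse the UFF file (converted from a frozen TensorFlow graph)
  // into a TensorRT network definition.
  IBuilder* builder = createInferBuilder(gLogger);
  INetworkDefinition* network = builder->createNetwork();
  IUffParser* parser = createUffParser();
  parser->registerInput("input", DimsCHW(3, 224, 224));  // placeholder name/shape
  parser->registerOutput("output");                      // placeholder name
  if (!parser->parse("model.uff", *network, DataType::kFLOAT)) {
    std::cerr << "failed to parse model.uff" << std::endl;
    return 1;
  }

  // Build an engine optimized for the current GPU; the engine can also be
  // serialized to disk and reloaded later without reparsing the model.
  builder->setMaxBatchSize(1);
  builder->setMaxWorkspaceSize(1 << 24);
  ICudaEngine* engine = builder->buildCudaEngine(*network);

  // Inference then runs through an execution context on device buffers:
  // IExecutionContext* context = engine->createExecutionContext();
  // context->execute(batchSize, deviceBuffers);

  engine->destroy();
  network->destroy();
  parser->destroy();
  builder->destroy();
  return 0;
}
```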