I have a Keras model doing inference on a Raspberry Pi (with a camera). The Raspberry Pi has a really slow CPU (1.2 GHz) and no CUDA GPU, so the model.predict()
stage is taking a long time (~20 seconds). I'm looking for ways to reduce that as much as possible. I've tried:
- Overclocking the CPU (+200 MHz), which shaved off a few seconds.
- Using float16 instead of float32 (a sketch of this attempt is below, after the list).
- Reducing the image input size as much as possible.
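Roughly, the float16 and reduced-input attempts look like the sketch below. The file name `model.h5`, the `capture_frame()` helper, and the 96x96 size are placeholders for my setup, and I'm not certain the loaded weights actually end up in float16 (that seems to depend on how the model was saved), so treat this as illustrative only:

```python
import numpy as np
import cv2
from tensorflow import keras

keras.backend.set_floatx('float16')            # ask Keras to default new tensors/layers to float16
model = keras.models.load_model('model.h5')   # placeholder path to the existing model

frame = capture_frame()                        # hypothetical camera-capture helper
frame = cv2.resize(frame, (96, 96))            # shrink the input as far as accuracy allows (placeholder size)
x = frame.astype(np.float16)[None, ...]        # cast input to float16 and add a batch dimension
pred = model.predict(x)
```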
Is there anything else I can do to speed up inference? Is there a way to simplify an existing model (.h5 file) and accept a drop in accuracy? I've had success with simpler models, but for this project I need to rely on an existing model, so I can't train from scratch.