Based on overlay specification:
https://ffmpeg.org/ffmpeg-filters.html#overlay-1
when you specify time interval it will happen only at that time interval:
For example, to enable a blur filter (smartblur) from 10 seconds to 3 minutes:
smartblur = enable='between(t,10,3*60)'
What you need to do is to overlay an image at specific coordinates, for example the following at fixed x and y:
ffmpeg -i rtsp://[host]:[port] -i x.png -filter_complex 'overlay=10:main_h-overlay_h-10' http://[host]:[post]/output.ogg
Now the idea is to calculate those coordinates based on the current frame of the video and force filter to use changed coordinates on every frame.
For example based on time:
FFmpeg move overlay from one pixel coordinate to another
ffmpeg -i bg.mp4 -i fg.mkv -filter_complex \
"[0:v][1:v]overlay=enable='between=(t,10,20)':x=720+t*28:y=t*10[out]" \
-map "[out]" output.mkv
Or using some other expressions:
http://ffmpeg.org/ffmpeg-utils.html#Expression-Evaluation
Unfortunately this will require to find a formula before using those limited expressions of cat moving his head or drawing a pen for x and y. It can be linear, trigonometric or other dependency from time:
x=sin(t)
With the free move it is not always possible.
To be more precise of finding an object coordinates to overlay something it should be possible to provide your own filter(ffmpeg is open sourced) similar to overlay:
https://github.com/FFmpeg/FFmpeg/blob/master/libavfilter/vf_overlay.c
Calculating x and y either based on external file(where you can dump all x and y for all times if it is a static video) or do some image processing to find specific region.
Hopefully it will give you an idea and direction to move to.
It's very interesting feature.