You’re on the right track, but what you’re doing now is creating new lines of text. Those lines will indeed be shown together, but stacked vertically rather than side by side on one line, which is not quite what you want.
Multiple events with repeated text
To add words to the same line, you need to stop the previous Dialogue event when the next word must appear and repeat the entire visible text in the next Dialogue event:
Dialogue: 0,0:00:00.00,0:00:01.00,Default,,0,0,0,,I'm
Dialogue: 0,0:00:01.00,0:00:01.50,Default,,0,0,0,,I'm a
Dialogue: 0,0:00:01.50,0:00:03.46,Default,,0,0,0,,I'm a subtitle
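If you have many words, writing these events by hand gets tedious. Here is a minimal Python sketch that generates them, assuming you already have each word's start time in centiseconds; the helper names ass_time and cumulative_events are made up for this illustration and are not part of any ASS tool.

```python
def ass_time(cs):
    """Format a time given in centiseconds as an ASS timestamp (H:MM:SS.cc)."""
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"


def cumulative_events(words, end_cs, style="Default"):
    """Build one Dialogue line per word, each repeating all text visible so far.

    words  -- list of (start_cs, word) pairs in display order
    end_cs -- time at which the complete text disappears
    """
    lines = []
    for i, (start, _) in enumerate(words):
        # Each event ends exactly when the next word appears.
        stop = words[i + 1][0] if i + 1 < len(words) else end_cs
        text = " ".join(word for _, word in words[: i + 1])
        lines.append(
            f"Dialogue: 0,{ass_time(start)},{ass_time(stop)},{style},,0,0,0,,{text}"
        )
    return lines


for line in cumulative_events([(0, "I'm"), (100, "a"), (150, "subtitle")], 346):
    print(line)
```

With the timings used above, this prints the same three Dialogue lines.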
With centre- or (for languages written left-to-right) right-aligned text, or with text long enough to wrap automatically onto several lines, you’ll want to specify the entire text in every event so that the alignment and wrapping calculations give the same result each time, and hide the not-yet-visible part of the text by making it fully transparent:
Dialogue: 0,0:00:00.00,0:00:01.00,Default,,0,0,0,,I'm {\alpha&HFF}a subtitle
Dialogue: 0,0:00:01.00,0:00:01.50,Default,,0,0,0,,I'm a {\alpha&HFF}subtitle
Dialogue: 0,0:00:01.50,0:00:03.46,Default,,0,0,0,,I'm a subtitle
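The text fields for this variant can be generated in the same spirit. The following is only a sketch: hidden_tail_texts is a hypothetical helper name, and it assumes the words are joined by single spaces.

```python
def hidden_tail_texts(words):
    """Return one text field per word: the full sentence each time, with
    everything after the newest visible word made fully transparent."""
    texts = []
    for i in range(len(words)):
        visible = " ".join(words[: i + 1])
        hidden = " ".join(words[i + 1:])
        if hidden:
            # \alpha&HFF sets all alpha channels to fully transparent.
            texts.append(visible + " {\\alpha&HFF}" + hidden)
        else:
            # Last event: the whole text is visible, no tag needed.
            texts.append(visible)
    return texts


for text in hidden_tail_texts(["I'm", "a", "subtitle"]):
    print(text)
```

Each returned string goes into the Text field of its own Dialogue event, with the same start and end times as in the first approach.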
This is the way it is usually done.
Single event with override tags
Alternatively, if your text has no border/outline and no shadow (not the case in your sample video), you can use the ASS karaoke tags by setting the yet-to-be-sung text colour (SecondaryColour) to fully transparent:
Dialogue: 0,0:00:00.00,0:00:03.46,Default,,0,0,0,,{\2a&HFF\k100}I'm {\k50}a {\k196}subtitle
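Each \k duration is given in centiseconds and is simply the gap between one word's appearance and the next. A small sketch of that arithmetic follows; karaoke_text is a hypothetical helper, and it assumes word times are relative to the event start with the first word appearing at 0.

```python
def karaoke_text(words, end_cs):
    """Build the text field of a single karaoke event.

    words  -- list of (start_cs, word) pairs, relative to the event start;
              the first word is assumed to start at 0
    end_cs -- duration of the whole event in centiseconds
    """
    starts = [start for start, _ in words] + [end_cs]
    parts = []
    for i, (start, word) in enumerate(words):
        # \k takes the time until the next word appears, in centiseconds.
        tag = f"\\k{starts[i + 1] - start}"
        if i == 0:
            # Make the not-yet-sung (secondary) colour fully transparent.
            tag = "\\2a&HFF" + tag
        parts.append("{" + tag + "}" + word)
    return " ".join(parts)


print(karaoke_text([(0, "I'm"), (100, "a"), (150, "subtitle")], 346))
```

For the example timings this yields durations of 100, 50 and 196 centiseconds, matching the line above.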
If you do have a border or a shadow (as in the sample video), plain karaoke is not enough as you also want to modify the border/shadow colour. You can do this using ASS’s generic animation tag \t:
Dialogue: 0,0:00:00.00,0:00:03.46,Default,,0,0,0,,I'm {\alpha&HFF\t(1000,1000,\alpha0)}a {\alpha&HFF\t(1500,1500,\alpha0)}subtitle
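Here, for each word except the first, I set all alpha values to fully transparent when the event starts, and then I instantly switch them to fully opaque when the word is to appear. The two times in \t are given in milliseconds relative to the start of the event; making them equal turns the animation into an instant switch.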
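If you want to generate such a line from a list of timings, something along these lines should work. It is a sketch only: reveal_text is a made-up name, and the start times are assumed to be in milliseconds relative to the event start.

```python
def reveal_text(words):
    """Build the text field of a single event in which every word after the
    first starts out transparent and becomes opaque at its start time.

    words -- list of (start_ms, word) pairs, relative to the event start
    """
    parts = []
    for i, (start_ms, word) in enumerate(words):
        if i == 0:
            # The first word is visible from the beginning of the event.
            parts.append(word)
        else:
            # Equal start and end times make \t switch instantly.
            tag = "{\\alpha&HFF\\t(" + f"{start_ms},{start_ms}" + ",\\alpha0)}"
            parts.append(tag + word)
    return " ".join(parts)


print(reveal_text([(0, "I'm"), (1000, "a"), (1500, "subtitle")]))
```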
Caution about right-to-left text
There is no indication in the question that you intend to use this with anything but left-to-right English text, but this deserves a word of caution nevertheless. (You may even think it is natural to assume the distinction doesn’t matter nowadays. Unfortunately, due to some bad decisions and habits from the past, it still does.)
If you try any of these approaches with right-to-left text (e.g. Arabic or Hebrew), other than the first, simplest approach that has no override tags, you’ll find that the text gets split at override tags and the sections get reordered so that earlier sections are always to the left of later sections.
Even in the simplest case without any override tags, if your text starts or ends with some punctuation (or other direction-neutral characters), those characters will not match the text’s natural direction and will appear reversed. You can fix this by inserting the Unicode bidirectional control character U+200F RIGHT-TO-LEFT MARK before any leading punctuation and after any trailing punctuation.
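As a rough illustration of that fix, the sketch below adds the mark only where a line begins or ends with punctuation. It is a simplification: add_rlm is a hypothetical helper, and it only checks Unicode punctuation categories, not every direction-neutral character.

```python
import unicodedata

RLM = "\u200f"  # U+200F RIGHT-TO-LEFT MARK


def is_punctuation(ch):
    """True for characters in any Unicode punctuation category (P*)."""
    return unicodedata.category(ch).startswith("P")


def add_rlm(text):
    """Put an RLM before leading and after trailing punctuation."""
    if not text:
        return text
    if is_punctuation(text[0]):
        text = RLM + text
    if is_punctuation(text[-1]):
        text = text + RLM
    return text


# Hebrew "shalom" followed by an exclamation mark, written with escapes
# so the source itself is not affected by bidirectional display.
line = "\u05e9\u05dc\u05d5\u05dd!"
print(repr(add_rlm(line)))
```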
Furthermore, old versions of libass (one of the two major ASS renderers), namely older than the 0.16.0 release of May 2022, used the same word & glyph order as if you had a single line without any tags. If you expect that your subtitle file might sometimes be rendered using old builds of libass and you want it to look the same in all renderers, you can work around the differences by inserting Unicode bidirectional control characters around override tags to make old libass match the ASS-standard (albeit unnatural) text order.
Of course, if you’re only burning in or visualizing subtitles without intending to distribute the subtitle file, then you don’t have to worry about how they look in other renderers. FFmpeg uses libass, and all versions of libass support a special bidirectional mode if you put -1 (minus one) in a Style definition’s Encoding field. This triggers autodetection of base direction for each line and disables the forced left-to-right ordering of tag-delimited sections, so right-to-left lines work intuitively correctly (except where autodetection fails).
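If you prefer to patch the Style line with a script rather than by hand, a sketch like the following could do it. It assumes your file's styles section has a Format line that names the fields, as files written by common editors do; set_style_encoding is a hypothetical helper, and the sample Format and Style lines are illustrative values, not taken from your file.

```python
def set_style_encoding(format_line, style_line, value=-1):
    """Return style_line with its Encoding field replaced by value.

    The position of Encoding is looked up in the Format line instead of
    being assumed, so a different field order is handled too.
    """
    names = [name.strip() for name in format_line.split(":", 1)[1].split(",")]
    values = style_line.split(":", 1)[1].strip().split(",")
    values[names.index("Encoding")] = str(value)
    return "Style: " + ",".join(values)


fmt = ("Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, "
       "OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, "
       "ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, "
       "Alignment, MarginL, MarginR, MarginV, Encoding")
style = ("Style: Default,Arial,20,&H00FFFFFF,&H000000FF,&H00000000,"
         "&H00000000,0,0,0,0,100,100,0,0,1,2,2,2,10,10,10,1")

print(set_style_encoding(fmt, style))
```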