How can I give some hint phrases to OpenAI's Whisper ASR?

```
conda create -y --name whisperpy39 python==3.9
conda activate whisperpy39
pip install git+https://github.com/openai/whisper.git
sudo apt update && sudo apt install ffmpeg
whisper recording.wav
whisper recording.wav --model large
```

There are two potential places to add hint phrases or boosting:

https://github.com/openai/whisper/blob/15ab54826343c27cfaf44ce31e9c8fb63d0aa775/whisper/decoding.py#L87-L88: add hint phrases in the prompt (and not in the prefix; see this discussion on prompt vs. prefix). There's a new --initial_prompt option since commit 2037b65:
```
whisper audio.mp3 --initial_prompt "So we were just talking about DALL·E"
```
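The same hint can also be passed from Python through the `initial_prompt` argument of `model.transcribe`. Below is a minimal sketch, assuming the package is installed as above. `build_initial_prompt` and `transcribe_with_hints` are hypothetical helper names, and joining the phrases with commas is only one plausible way to form the prompt, not something the Whisper project prescribes.

```python
def build_initial_prompt(hint_phrases):
    """Join hint phrases into a single prompt string (hypothetical helper)."""
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [p.strip() for p in hint_phrases if p and p.strip()]
    return ", ".join(cleaned)


def transcribe_with_hints(audio_path, hint_phrases, model_name="large"):
    """Transcribe audio_path, passing the hint phrases as the initial prompt."""
    # Imported lazily so build_initial_prompt works without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(
        audio_path,
        initial_prompt=build_initial_prompt(hint_phrases),
    )
    return result["text"]


# Example call (needs Whisper, ffmpeg and an audio file):
# print(transcribe_with_hints("recording.wav", ["DALL·E", "Whisper"]))
```

How much the prompt actually helps probably depends on the audio and the phrases, so it is worth comparing transcripts with and without it.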
https://github.com/openai/whisper/blob/15ab54826343c27cfaf44ce31e9c8fb63d0aa775/whisper/decoding.py#L302: change the code to increase the likelihood of the sequences containing the hint phrases, e.g.:

Currently there's no interface for this other than passing initial_prompt as above. You could hack something together with logit biasing, which effectively boosts the predicted probability of certain tokens. The LogitFilter class is designed to support this.
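To make the logit-biasing idea concrete, here is a toy sketch in plain Python. It is not Whisper code: `HintTokenBias` is a hypothetical class that only imitates the in-place `apply(logits, tokens)` shape of a logit filter, it works on lists rather than tensors, and the token ids and the `bonus` value are made up. Wiring something like this into the decoder would mean editing decoding.py, which is not shown here.

```python
import math


class HintTokenBias:
    """Toy logit filter: adds a fixed bonus to chosen token ids, in place."""

    def __init__(self, hint_token_ids, bonus=2.0):
        self.hint_token_ids = set(hint_token_ids)  # assumed ids of hint tokens
        self.bonus = bonus                         # assumed boost, untuned

    def apply(self, logits, tokens=None):
        # logits: one row of vocabulary scores per hypothesis (list of lists).
        # tokens is accepted only to mirror the filter-style signature.
        for row in logits:
            for token_id in self.hint_token_ids:
                if 0 <= token_id < len(row):
                    row[token_id] += self.bonus


def softmax(row):
    """Turn one row of logits into probabilities."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


logits = [[0.0, 0.0, 0.0, 0.0]]   # tiny 4-token vocabulary, one hypothesis
before = softmax(logits[0])[2]
HintTokenBias([2], bonus=2.0).apply(logits)
after = softmax(logits[0])[2]
print(before, after)              # probability of token 2 rises after the boost
```

A hint phrase usually spans several tokens, so a real version would probably need to boost each next token conditionally on the tokens already decoded rather than boosting all of them at every step.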

I don't know how efficient this would be. Another potential issue arises when a hint word is not in the model's dictionary: it would have to be added, which may be difficult.
