I'm using the OpenAI GPT-2 model from github
I think that the top_k parameter dictates how many tokens are sampled. Is this also the parameter that dictates how large of a prompt can be given?
If top_k = 40, how large can the prompt be?
GPT-2 does not work on the character level but on the subword level. The maximum length of the text segments it was trained on was 1,024 subwords.
It uses a vocabulary based on byte-pair encoding. Under such an encoding, frequent words remain intact, while infrequent words get split into several units, eventually down to the byte level. In practice, the segmentation looks like this (69 characters, 17 subwords):
Hello , ▁Stack Over flow ! ▁This ▁is ▁an ▁example ▁how ▁a ▁string ▁gets ▁segment ed .
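If you want to check how many subwords your own prompt uses, you can run it through a GPT-2 tokenizer. A minimal sketch, assuming the Hugging Face `transformers` package is installed (the original openai/gpt-2 repo ships an equivalent BPE encoder in `src/encoder.py`):

```python
# Minimal sketch, assuming the Hugging Face `transformers` package is available.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

prompt = "Hello, StackOverflow! This is an example how a string gets segmented."
token_ids = tokenizer.encode(prompt)

print(len(prompt), "characters")          # character count
print(len(token_ids), "subword tokens")   # subword count: this is what the 1,024 limit applies to
print(tokenizer.convert_ids_to_tokens(token_ids))
```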
At training time, there is no difference between the prompt and the answer, so the only limitation is that the prompt and the answer together cannot be longer than 1,024 subwords. In theory you can continue generating beyond this point, but the history the model considers can never be longer than 1,024 subwords.
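In practice this means the encoded prompt plus however many tokens you want to generate must fit into the 1,024-token window. A rough sketch of that bookkeeping, again assuming the `transformers` API rather than the original TensorFlow scripts:

```python
# Sketch of the token budget: prompt tokens + generated tokens <= 1,024 (GPT-2's context size).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

max_context = model.config.n_positions   # 1024 for GPT-2
max_new_tokens = 100                     # how much we want the model to generate

prompt_text = "Hello, StackOverflow! This is an example how a string gets segmented."
prompt_ids = tokenizer.encode(prompt_text, return_tensors="pt")

# If the prompt is too long, truncate from the left so prompt + continuation fit in the window.
prompt_ids = prompt_ids[:, -(max_context - max_new_tokens):]

output_ids = model.generate(
    prompt_ids,
    max_length=prompt_ids.shape[1] + max_new_tokens,
    do_sample=True,
    top_k=40,
)
print(tokenizer.decode(output_ids[0]))
```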
The selection of top_k has nothing to do with the prompt length: it only determines how many candidate tokens are sampled from at each generation step (and marginally affects memory requirements). A long prompt also needs more memory, but that is probably not the main limitation.
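For comparison, this is all that top_k does at each step: it keeps the k most probable next tokens, renormalizes, and samples from them. A rough numpy sketch of the idea (not the repo's actual sampling code):

```python
import numpy as np

def top_k_sample(logits, k=40):
    """Keep the k highest logits, renormalize, and sample one token id."""
    top_indices = np.argsort(logits)[-k:]            # indices of the k most probable tokens
    top_logits = logits[top_indices]
    probs = np.exp(top_logits - top_logits.max())    # softmax over the surviving candidates
    probs /= probs.sum()
    return np.random.choice(top_indices, p=probs)

# Example: fake logits for GPT-2's vocabulary of 50,257 tokens at one generation step.
logits = np.random.randn(50257)
next_token_id = top_k_sample(logits, k=40)
```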