ChatGPT Token Limit
I want ChatGPT to remember past conversations and have a consistent (stateful) conversation.

I have seen several code examples of ChatGPT prompt engineering.

There are two ways to design the prompt, shown below (pseudocode):

  1. Use a single input (cheap) <- better if possible

  2. Stack all of the previous history (expensive, runs into the token limit)

import openai

def openai_chat(prompt):
    completions = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=1024,
        n=1,
        temperature=0.8,
    )
    response = completions.choices[0].text.strip()
    return response

# 1. Use a single input
while True:
    prompt = input("User: ")
    completion = openai_chat(prompt)

# 2. Stack all of the previous history (prompt + completion)
prompt = ""
while True:
    cur_prompt = input("User: ")
    prompt += "User: " + cur_prompt + "\n"
    completion = openai_chat(prompt)
    prompt += completion + "\n"

Is it possible to choose the first way (the cheap one) to have a consistent conversation?

In other words, does ChatGPT remember past history even if the prompt only has the current input?

Beanfeast answered 28/2, 2023 at 0:6 Comment(1)
Does this answer your question? OpenAI API: How to make a GPT-3 model remember past conversations? – Chef

A small but important point: ChatGPT is a very specific version of the GPT model, used for conversations via the ChatGPT web app. The code in your question uses GPT-3 (text-davinci-003).

As for remembering a past conversation: no, GPT-3 does not do this automatically. You will need to send that data in via the prompt.

There are several workarounds that can be used, though none of them is perfect.

  1. Summarize the previous conversation.

    Ask GPT-3 to summarize the previous conversations so that the summary can be provided in the next prompt. You will lose some meaning, but it will reduce your total token count.

  2. Save previous conversations as vector embeddings, and use vector search to find the most relevant parts of the previous conversation and send those via the prompt. This is much more complex and will require an understanding of the GPT-3 embeddings endpoint, but it might solve the problem of losing meaning from the previous prompts.
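The retrieval step in the second workaround can be sketched as follows. This is a minimal illustration: in practice the embedding vectors would come from the embeddings endpoint (e.g. the text-embedding-ada-002 model), but they are stubbed here as plain lists so the retrieval logic stands on its own.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_relevant(query_vec, history):
    # history: list of (text, embedding) pairs from earlier turns.
    # Returns the stored text whose embedding is closest to the query.
    return max(history, key=lambda item: cosine_similarity(query_vec, item[1]))[0]
```

At query time you would embed the new user input, call `most_relevant` (or take the top-k matches), and prepend only those snippets to the prompt instead of the full history.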

Aborigine answered 28/2, 2023 at 3:11 Comment(1)
In the second workaround, if there's enough matching text (over the token limit) then is it not the same problem? Or you summarise the matching text and then use it, I guess? – Dehiscence

TL;DR: OpenAI API GPT-3 models don't use previously sent input data to generate a new response, but with some base models you can use fine-tuning, which among other things allows you to send shorter prompts.


OpenAI announced today (as I read in their email newsletter) the launch of the official ChatGPT API. In the OpenAI API documentation it appears as Chat Completions, with a guide section and a reference section.

It looks like the code you are using is obsolete: it uses "engine", and the Engines API was deprecated (ref. Engines | OpenAI API Reference). Nowadays OpenAI APIs use model instead of engine; according to https://platform.openai.com/docs/guides/chat, ChatGPT (the web app, https://chat.openai.com/chat) is powered by gpt-3.5-turbo, which is a model, not an engine.

The Chat Completions guide explains that the main input is an array of message objects. Each object specifies a role (system, user, or assistant) and its content, and the guide notes that how much of the conversation history to include is up to the developer.

Long story short: on each call to the brand-new official ChatGPT API you should send an array of message objects containing all the data the model needs to build the response. It does not use information from previous calls.
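To make this concrete, here is a minimal sketch of how each call carries the full history, using the pre-1.0 openai Python client to match the question's code. The helper names (`build_messages`, `chat`) are illustrative, not part of the API:

```python
def build_messages(history, user_input, system_prompt="You are a helpful assistant."):
    # Assemble the complete messages array the API expects on every call:
    # a system message, the prior turns, then the new user input.
    return ([{"role": "system", "content": system_prompt}]
            + history
            + [{"role": "user", "content": user_input}])

def chat(history, user_input):
    import openai  # pre-1.0 client, matching the question's code
    # The API is stateless, so the whole conversation is sent each time.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=build_messages(history, user_input),
    )
    reply = response.choices[0].message.content
    # Persist both turns so the next call can include them.
    history.append({"role": "user", "content": user_input})
    history.append({"role": "assistant", "content": reply})
    return reply
```

Note that `history` grows with every exchange, which is exactly why the token limit below matters.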

Regarding the token limit, from https://platform.openai.com/docs/guides/chat/managing-tokens

... as total tokens must be below the model’s maximum limit (4096 tokens for gpt-3.5-turbo-0301)

Both input and output tokens count toward these quantities.

Each model has its own capacity, and each has its own price per token. OpenAI says (taken from the Chat Completions guide):

Because gpt-3.5-turbo performs at a similar capability to text-davinci-003 but at 10% the price per token, we recommend gpt-3.5-turbo for most use cases.

Fine-tuning

According to the OpenAI API Fine Tuning guide

Fine-tuning is currently only available for the following base models: davinci, curie, babbage, and ada.


Salpinx answered 1/3, 2023 at 22:6 Comment(2)
Fine-tuning is certainly not appropriate for conversations. That might make sense if you need to incorporate a lot of new information permanently into a tuned ChatGPT instance - e.g. to answer questions about your (large) site to users. That takes time and costs a lot more than just accessing the normal ChatGPT. – Pustulant
Just as a counterpoint, I've found fine-tuning gpt-3.5-turbo to be very helpful for eliciting consistent output. Of course, this is only possible if your fine-tuning can be safely stuffed into the token limit. Also, re-framing most problems to fit into the conversation format is sort of a necessary evil right now since this is the most inexpensive model by far. – Hydrotropism

No, the first approach (using a single input) does not allow ChatGPT to remember past conversations explicitly. Each time the model receives a new prompt, it doesn't have access to the previous conversation history. Therefore, it cannot maintain context or continuity between interactions.

The second approach (stacking all previous history) is necessary to have a consistent, stateful conversation, as it allows the model to build upon past interactions and maintain context throughout the conversation. Without stacking the history, the model will treat each prompt in isolation, resulting in disjointed and inconsistent responses.

You could use the gpt-3.5-turbo-16k model to increase the token limit to 16,384 tokens.
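Even with a larger limit, a long conversation eventually overflows, so a common pattern when stacking history is a sliding window that drops the oldest turns first. The sketch below uses a rough characters-per-token estimate as an assumption; a real implementation would count tokens exactly with a tokenizer such as tiktoken:

```python
def approx_tokens(text):
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_history(messages, max_tokens=4096, reserved_for_reply=1024):
    """Drop the oldest non-system messages until the rest fits the budget."""
    budget = max_tokens - reserved_for_reply
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(approx_tokens(m["content"]) for m in system + rest) > budget:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```

Keeping the system message pinned while trimming only user/assistant turns preserves the bot's instructions even in very long sessions.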

Resolvable answered 27/7, 2023 at 7:38 Comment(0)

You cannot use only the current prompt to enable context memory in a chatbot using the OpenAI GPT API; you need to create a memory array that is initialized on every page load and updated with every request/answer.

You should use this procedure:

// initialize the context memory array
function createMemory(messages) {
    const memory = [];
    for (const msg of messages) {
        memory.push({ role: msg.role, content: msg.content });
    }
    return memory;
}

// send the memory contents with every message to the API
async function sendMessage() {
    const inputElement = document.getElementById('user-input');
    const userInput = inputElement.value.trim();

    if (userInput !== '') {
        showMessage("Guest", userInput);
        chatMemory = await getChatGPTResponse(userInput, chatMemory);
        inputElement.value = '';
    }
}

// modify the JSON structure you send to the API:
async function getChatGPTResponse(userInput, chatMemory) {
    try {
        const response = await fetch('https://api.openai.com/v1/chat/completions', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': 'Bearer yourapikey'
            },
            body: JSON.stringify({
                "model": "gpt-3.5-turbo-16k",
                "messages": [
                    ...chatMemory,
                    {"role": "user", "content": userInput}
                ]
            })
        });

        if (!response.ok) {
            throw new Error('Error');
        }

        const data = await response.json();
        const reply = data.choices[0].message.content;
        showMessage("Bot", reply);

        // return chat memory with the updated content from the API response
        chatMemory.push({ role: "user", content: userInput });
        chatMemory.push({ role: "assistant", content: reply });
        return chatMemory;
    } catch (error) {
        console.error(error);
        return chatMemory;
    }
}

If you want, you can set an initial or conditional personality by using custom prompts

chatMemory = createMemory([
    {
        role: "system",
        content: "You are Papa Smurf, leader of the Smurfs. You are 546 years old and you have broad knowledge on all human sciences and literatures. You will talk like a smurf instead of like a human."
    }
]);

If you want, please check out my GitHub repository.

Liveryman answered 19/11, 2023 at 17:38 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.