How do I change the default 4 documents that LangChain returns?

I have the following code that implements LangChain + ChatGPT to answer questions from given data:

import { OpenAI } from 'langchain/llms/openai';
import { PineconeStore } from 'langchain/vectorstores/pinecone';
import { ConversationalRetrievalQAChain } from 'langchain/chains';

const CONDENSE_PROMPT = `Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:`;

const QA_PROMPT = `You are a helpful AI assistant. Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say you don't know. DO NOT try to make up an answer.
If the question is not related to the context, politely respond that you are tuned to only answer questions that are related to the context. Always answer in spanish.

{context}

Question: {question}
Helpful answer in markdown:`;

export const makeChain = (vectorstore: PineconeStore) => {
  const model = new OpenAI({
    temperature: 0.9, // increase temperature to get more creative answers
    modelName: 'gpt-4', // change this to another model if you don't have gpt-4 access
  });

  const chain = ConversationalRetrievalQAChain.fromLLM(
    model,
    vectorstore.asRetriever(),
    {
      qaTemplate: QA_PROMPT,
      questionGeneratorTemplate: CONDENSE_PROMPT,
      returnSourceDocuments: false, //The number of source documents returned is 4 by default
    },
  );
  return chain;
};

The issue I'm dealing with is that it always returns only 4 documents (in my case these are JSON files, and I have stored 30). I can see that even on the LangChain documentation page they use a similar bot, and it also returns only 4 sources.

The comment in my code that says "the number... is 4 by default" makes me think that there's a way to increase this value.

I've tried Bard and ChatGPT to find a solution but the code they suggest doesn't work.

Brantley answered 6/7, 2023 at 1:11 Comment(1)
In Python you can pass the search_kwargs param to as_retriever(), like: as_retriever(search_kwargs={"k": 3}). There is probably a similar interface in the JS binding. Referent

Here is some pseudo code:

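from langchain.chains import RetrievalQA
from langchain.vectorstores import Chroma

# PERSIST_DIRECTORY, EMBEDDINGS, CHROMA_SETTINGS, LLM_LOCAL and SHOW_SOURCES
# are placeholders that you define elsewhere in your own project.
DB = Chroma(
    persist_directory=PERSIST_DIRECTORY,
    embedding_function=EMBEDDINGS,
    client_settings=CHROMA_SETTINGS,
)

# k is the number of documents the retriever returns (the default is 4)
RETRIEVER = DB.as_retriever(search_kwargs={"k": 10})

QA = RetrievalQA.from_chain_type(
    llm=LLM_LOCAL,
    chain_type="stuff",
    retriever=RETRIEVER,
    return_source_documents=SHOW_SOURCES,
)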

The important line is the one that sets k, the number of documents the retriever returns:

RETRIEVER = DB.as_retriever(search_kwargs={"k": 10})

Sliwa answered 16/8, 2023 at 10:10 Comment(0)

I know I am answering late, but someone might benefit from it. Your code calls a method named asRetriever(). You can pass that method the number of documents you want returned, e.g. asRetriever(30), and it will return up to 30 documents.
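
To show what that argument changes, here is a minimal self-contained sketch. It is a toy model, not LangChain's implementation: ToyVectorStore, Doc, cosine and the 2-D embeddings are all hypothetical names and data made up for illustration. The only things taken from this thread are the asRetriever(k) call shape and the default of 4.

```typescript
// Toy stand-in for a vector store. This is NOT LangChain code; it only
// illustrates how a top-k limit caps the number of documents returned.
type Doc = { id: string; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

class ToyVectorStore {
  private docs: Doc[];

  constructor(docs: Doc[]) {
    this.docs = docs;
  }

  // k is the maximum number of documents to return. The default of 4
  // mirrors the behaviour described in the question.
  asRetriever(k: number = 4): (query: number[]) => Doc[] {
    return (query: number[]) =>
      [...this.docs]
        .sort((x, y) => cosine(y.embedding, query) - cosine(x.embedding, query))
        .slice(0, k);
  }
}

// 30 hypothetical documents with made-up 2-D embeddings.
const docs: Doc[] = Array.from({ length: 30 }, (_, i) => ({
  id: `doc-${i}`,
  embedding: [Math.cos(i / 30), Math.sin(i / 30)],
}));

const store = new ToyVectorStore(docs);
const query = [1, 0];

const defaultHits = store.asRetriever()(query); // no argument: k falls back to 4
const allHits = store.asRetriever(30)(query); // explicit k of 30

console.log(defaultHits.length, allHits.length); // prints "4 30"
```

In the question's makeChain, the equivalent change would be replacing vectorstore.asRetriever() with vectorstore.asRetriever(30). Keep in mind that every retrieved document is placed into the prompt, so a large k can run into the model's context limit.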

Wondering answered 25/7, 2023 at 17:35 Comment(0)
