How to specify a similarity threshold in a LangChain FAISS retriever?

I would like to pass a similarity threshold to the retriever. So far I have only figured out how to pass a k value, which is not what I want. How can I pass a threshold instead?

from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

def get_conversation_chain(vectorstore):
    llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')
    # k=2 always returns the two nearest chunks, however similar they are
    qa = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(search_kwargs={'k': 2}),
        return_source_documents=True,
        verbose=True,
    )
    return qa

loader = PyPDFLoader("sample.pdf")
# load the PDF and split it into documents
pages = loader.load_and_split()
# build the FAISS index from the loaded documents
faiss_index = FAISS.from_documents(pages, OpenAIEmbeddings())
# create conversation chain
chat_history = []
qa = get_conversation_chain(faiss_index)
query = "What is a sunflower?"
result = qa({"question": query, "chat_history": chat_history})
Kneeland answered 20/7, 2023 at 12:18

You can still get a VectorStoreRetriever from as_retriever, as you do now, but set the search_type parameter to "similarity_score_threshold" and pass the threshold in search_kwargs:

# dbFAISS is the FAISS vector store; top_k is the maximum number of documents to return
retriever = dbFAISS.as_retriever(search_type="similarity_score_threshold",
                                 search_kwargs={"score_threshold": 0.5,
                                                "k": top_k})
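To make the interplay between score_threshold and k concrete, here is a minimal pure-Python sketch of what a score-threshold search does conceptually: take the k best-scoring candidates, then drop any that fall below the threshold. This is not LangChain's actual implementation; the function name filter_by_score_threshold and the example scores are made up for illustration, and it assumes scores where higher means more similar.

```python
from typing import List, Tuple


def filter_by_score_threshold(
    scored_docs: List[Tuple[str, float]],
    score_threshold: float,
    k: int,
) -> List[str]:
    """Keep at most k documents whose score is at least score_threshold.

    Hypothetical helper for illustration only; assumes higher score = more similar.
    """
    # Rank by score, most similar first.
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    # Take the k best candidates, then drop those below the threshold.
    return [doc for doc, score in ranked[:k] if score >= score_threshold]


# Made-up scores for the query "What is a sunflower?"
scored = [
    ("Invoices are due within 30 days.", 0.12),
    ("Sunflowers are tall annual plants.", 0.82),
    ("Sunflower seeds are pressed for oil.", 0.55),
]

results = filter_by_score_threshold(scored, score_threshold=0.5, k=3)
print(results)
```

So k acts as an upper bound while the threshold decides how many of those candidates survive, which means the retriever can return fewer than k documents, or none at all if the threshold is too strict.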
Possess answered 31/8, 2023 at 16:18

This was the answer, from the API docs: search_kwargs={'score_threshold': 0.3}

Kneeland answered 21/7, 2023 at 11:47
