I am currently learning ChromaDB, a vector database.
I can't understand how the querying process works.
When I try to query using text, it's returning all documents.
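```python
import chromadb

# In-memory client; the collection name is arbitrary (chosen here for illustration)
client = chromadb.Client()
collection = client.create_collection(name="my_collection")

collection.add(
    documents=["This is a document about cat", "This is a document about car"],
    metadatas=[{"category": "animal"}, {"category": "vehicle"}],
    ids=["id1", "id2"],
)

# Ask for the 2 nearest documents to the query text
results = collection.query(
    query_texts=["vehicle"],
    n_results=2,
)
results
```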
The output is:
```
{'ids': [['id2', 'id1']],
 'distances': [[0.8069301247596741, 1.648103952407837]],
 'metadatas': [[{'category': 'vehicle'}, {'category': 'animal'}]],
 'embeddings': None,
 'documents': [['This is a document about car',
   'This is a document about cat']]}
```
Even when I enter a word that is not present anywhere, it still returns all documents.
Why does this happen?
[...] `where={'category': 'vehicle'}`? A simple query like what you did is always going to return the whole collection, and the `'distances'` tells you how close the document was to your query text. `query_texts` doesn't look at the metadata. – Outmarch

[...] `n_results` results. – Outmarch
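
The behaviour described in the comments can be illustrated with a small toy model. This is only a sketch and not Chroma's implementation: the 2-D vectors are made up (real embeddings come from a model), squared Euclidean distance is used for simplicity, and `toy_query`, `DOCS` and `max_distance` are hypothetical names that are not part of the Chroma API. It shows that a nearest-neighbour query ranks every document and returns the top `n_results`, however far away they are, unless a metadata filter (the role `where` plays in `collection.query`) or a distance cut-off removes some of them.

```python
# Toy documents with made-up 2-D "embeddings" (illustrative values only).
DOCS = [
    {"id": "id1", "vector": [0.9, 0.1], "metadata": {"category": "animal"}},
    {"id": "id2", "vector": [0.1, 0.9], "metadata": {"category": "vehicle"}},
]


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_query(query_vector, n_results, where=None, max_distance=None):
    """Return up to n_results (distance, id) pairs, nearest first.

    where:        optional dict of metadata key/value pairs that must all match.
    max_distance: optional cut-off; hits farther away than this are dropped.
    """
    # 1. Metadata filter: the query vector is not involved at this step.
    candidates = [
        d for d in DOCS
        if where is None
        or all(d["metadata"].get(k) == v for k, v in where.items())
    ]
    # 2. Rank every remaining document by distance to the query.
    ranked = sorted((squared_l2(query_vector, d["vector"]), d["id"]) for d in candidates)
    # 3. Optional cut-off applied by the caller, not by the ranking itself.
    if max_distance is not None:
        ranked = [(dist, doc_id) for dist, doc_id in ranked if dist <= max_distance]
    return ranked[:n_results]


query = [0.2, 0.8]  # pretend this is the embedding of the query text

unfiltered = toy_query(query, n_results=2)
filtered = toy_query(query, n_results=2, where={"category": "vehicle"})
thresholded = toy_query(query, n_results=2, max_distance=0.5)

print([doc_id for _, doc_id in unfiltered])
print([doc_id for _, doc_id in filtered])
print([doc_id for _, doc_id in thresholded])
```

With only two documents in the collection and `n_results=2`, the unfiltered call returns both of them, ordered by distance. To narrow the results, either filter on metadata or discard hits whose distance exceeds a threshold you choose for your data.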