TL;DR: Use LlamaIndex or LangChain to get an exact answer (i.e., a fact) to a specific question from existing data sources.
Why choose LlamaIndex or LangChain over fine-tuning a model?
The answer is simple: cost is not the only factor. You couldn't answer it yourself because you were only looking at the costs; take a look at the usability side of the question as well.
Fine-tuning gives a model additional general knowledge, but a fine-tuned model will not give you an exact answer (i.e., a fact) to a specific question.
People fine-tune an OpenAI model on some data, but when they ask it something related to that data, they are surprised that the model doesn't answer with the knowledge gained through fine-tuning. See an example explanation on the official OpenAI forum by @juan_olano:
I fine-tuned a 70K-word book. My initial expectation was to have the
desired QA, and at that point I didn’t know any better. But this
fine-tuning showed me the limits of this approach. It just learned the
style and stayed more or less within the corpus, but hallucinated a
lot.
Then I split the book into sentences and worked my way through
embeddings, and now I have a very decent QA system for the book, but
for narrow questions. It is not as good for questions that need the
context of the entire book.
Also, see the official OpenAI documentation:
Some common use cases where fine-tuning can improve results:
- Setting the style, tone, format, or other qualitative aspects
- Improving reliability at producing a desired output
- Correcting failures to follow complex prompts
- Handling many edge cases in specific ways
- Performing a new skill or task that’s hard to articulate in a prompt
LlamaIndex and LangChain enable you to connect OpenAI models with your existing data sources. For example, a company has a bunch of internal documents with various instructions, guidelines, rules, etc. LlamaIndex or LangChain can be used to query all those documents and give an exact answer to an employee who needs one.
OpenAI models (GPT-3, GPT-3.5, GPT-4, etc.) can't query their own knowledge. Querying requires calculating an embedding vector for each resource and then calculating the cosine similarity between the query's embedding and those resource embeddings, which OpenAI models don't do on their own. An OpenAI model simply generates an answer based on the statistical probability of which token should follow the previous ones.
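To make the retrieval step concrete, here is a minimal sketch of embedding-based querying. The document texts and the 3-dimensional vectors are hand-made stand-ins (a real system would get high-dimensional embeddings from an embedding model, such as OpenAI's embeddings endpoint), but the cosine-similarity ranking is exactly what tools like LlamaIndex and LangChain do under the hood:

```python
import math

# Toy "embeddings": hand-made 3-dimensional vectors standing in for
# real model-generated embeddings, just to illustrate the mechanics.
documents = {
    "Employees get 25 vacation days per year.": [0.9, 0.1, 0.0],
    "The office dress code is business casual.": [0.1, 0.8, 0.2],
    "Expense reports are due by the 5th of each month.": [0.0, 0.2, 0.9],
}

def cosine_similarity(a, b):
    # cos(a, b) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector, docs):
    # Rank documents by cosine similarity to the query embedding and
    # return the best match; a real system would pass the top matches
    # to the LLM as context for answering the question.
    return max(docs, key=lambda text: cosine_similarity(query_vector, docs[text]))

# A query like "How many vacation days do I have?" would embed close
# to the first document's vector.
query = [0.85, 0.15, 0.05]
print(retrieve(query, documents))
```

The model itself never performs this lookup; the framework retrieves the most similar document chunks and injects them into the prompt, which is why the answer comes out as an exact fact rather than a statistical paraphrase.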
I strongly suggest reading my previous answer regarding semantic search; it will make this answer easier to follow.