Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is an architecture that enhances AI model responses by retrieving relevant information from external knowledge sources before generating an answer, reducing hallucinations and enabling access to current or proprietary data.
What is Retrieval-Augmented Generation?
Why RAG Matters for Business
Frequently asked questions
RAG retrieves external information at query time without modifying the model. Fine-tuning changes the model's weights through additional training. RAG is better for factual, frequently updated knowledge; fine-tuning is better for teaching new behaviours, styles, or domain-specific reasoning patterns.
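To make the "retrieval at query time" idea concrete, here is a minimal sketch of the retrieve-then-prompt step. It assumes a toy in-memory document list and a simple bag-of-words cosine similarity in place of a real vector index. The names `DOCUMENTS`, `retrieve`, and `build_prompt` are hypothetical and not taken from any particular library, and the call to the language model itself is left out.

```python
import math
import re
from collections import Counter

# Toy knowledge base (hypothetical). In practice this would be an external
# document store or vector index that can be updated independently of the model.
DOCUMENTS = [
    "The refund window is 30 days from the date of purchase.",
    "Support is available on weekdays from 9am to 5pm.",
    "Shipping to Europe takes five to seven business days.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Assemble a prompt that places the retrieved passages before the question."""
    passages = retrieve(query, documents, k)
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(passages))
    return f"Answer using only the sources below.\n{context}\n\nQuestion: {query}"
```

Adding a new entry to the document list changes what `retrieve` returns on the next query, with no retraining involved. That is the practical difference from fine-tuning, where new knowledge has to be trained into the model's weights.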
Key metrics include retrieval relevance (are the right documents being found?), answer accuracy (is the generated response correct?), answer faithfulness (is the response grounded in the retrieved documents?), and latency (how fast is the end-to-end response?).
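The metrics above can be approximated with a few small functions. This is a sketch under simplifying assumptions: retrieval relevance is measured as precision and recall at k against hand-labelled relevant document IDs, accuracy as a normalized exact match, and faithfulness as a crude token-overlap proxy (production systems often use human review or a model-based judge instead). All function names are hypothetical.

```python
import time


def precision_at_k(retrieved_ids, relevant_ids, k):
    """Share of the top-k retrieved documents that are labelled relevant."""
    top = retrieved_ids[:k]
    if not top:
        return 0.0
    return sum(1 for doc_id in top if doc_id in relevant_ids) / len(top)


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Share of the labelled relevant documents that appear in the top k."""
    if not relevant_ids:
        return 0.0
    return len(set(retrieved_ids[:k]) & set(relevant_ids)) / len(relevant_ids)


def exact_match(answer, reference):
    """Answer accuracy as a case- and whitespace-insensitive exact match."""
    return " ".join(answer.lower().split()) == " ".join(reference.lower().split())


def token_support(answer, sources):
    """Crude faithfulness proxy: share of answer tokens found in the sources."""
    answer_tokens = answer.lower().split()
    if not answer_tokens:
        return 0.0
    source_tokens = set(" ".join(sources).lower().split())
    return sum(1 for t in answer_tokens if t in source_tokens) / len(answer_tokens)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds) for latency tracking."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Wrapping the full retrieve-and-generate call in `timed` gives the end-to-end latency figure, while the other functions are typically averaged over a fixed set of test questions.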
RAG significantly reduces hallucinations by grounding responses in retrieved documents, but it cannot eliminate them entirely. The model may still misinterpret retrieved information or generate content not supported by the sources. Citation and source linking help users verify responses.
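One simple way to support verification is to check the citations in a generated answer before showing it to the user. The sketch below assumes the model was asked to cite sources with bracketed numbers such as [1], which is an illustrative convention rather than a standard. The hypothetical `check_citations` function flags sentences with no citation and citation numbers that do not match any retrieved source. It does not check whether a cited source actually supports the sentence.

```python
import re


def check_citations(answer, sources):
    """Return (uncited_sentences, invalid_citation_numbers) for an answer.

    Assumes citations are written as [n], where n is a 1-based index
    into the list of retrieved sources.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    uncited = []
    invalid = []
    for sentence in sentences:
        cited = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        if not cited:
            uncited.append(sentence)
        invalid.extend(n for n in cited if not 1 <= n <= len(sources))
    return uncited, invalid
```

For example, given two retrieved sources and the answer "Refunds are accepted within 30 days [1]. Shipping is always free. Returns need a receipt [3].", the function reports "Shipping is always free." as uncited and 3 as an invalid citation number.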