Retrieval-Augmented Generation (RAG) is a way to improve how well a Large Language Model (LLM) understands a specific context. The approach combines information retrieval from a database with the LLM's reasoning capabilities.

RAG Pipeline

flowchart TD
A[Query] -->B(Embeddings)
C[Documents] -->D(Split into chunks)
E[Web Pages] -->D(Split into chunks)
D -->F(Embeddings)
F -->|Store| G(Vector DB)
B -->|Search| G
G --> |Top-K| H(Context)
A --> I(Prompt Template)
H --> I(Prompt Template)
I --> J(LLM)
J --> K[Answer]
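
The flow in the diagram above can be sketched in plain Python. This is a minimal illustration, not a production pipeline: `embed` is a toy word-count embedding standing in for a real embedding model, `VectorStore` is a hypothetical in-memory list standing in for a vector database, and the final LLM call is left out, so the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): sparse word counts instead of a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def split_into_chunks(text, max_words=20):
    """Split a document into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


class VectorStore:
    """Hypothetical stand-in for a vector DB: stores (embedding, chunk) pairs."""

    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((embed(chunk), chunk))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query (exhaustive scan)."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


PROMPT_TEMPLATE = (
    "Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(store, question, k=2):
    """Retrieve the top-k chunks and fill the prompt template."""
    context = "\n".join(store.search(question, k))
    return PROMPT_TEMPLATE.format(context=context, question=question)


documents = [
    "HNSW is a graph index used for approximate nearest neighbor search.",
    "Cosine similarity compares the direction of two embedding vectors.",
    "Bread is baked from flour, water, salt and yeast.",
]

store = VectorStore()
for doc in documents:
    for chunk in split_into_chunks(doc):
        store.add(chunk)

prompt = build_prompt(store, "What does cosine similarity compare?", k=1)
print(prompt)  # this string would then be sent to an LLM of your choice
```

Each function corresponds to one node in the flowchart: chunking, embedding, storing, top-K search, and prompt templating.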

Embedding Vectors

Cosine Similarity Score
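
As a minimal sketch, the cosine similarity of two vectors is their dot product divided by the product of their lengths, which gives a score between -1 and 1. The function name `cosine_similarity` and the choice to return 0.0 for a zero vector are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # orthogonal
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))           # opposite direction
```

Because the score depends only on direction, scaling a vector does not change its similarity to another vector.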

Vector Databases

Hierarchical Navigable Small Worlds (HNSW)

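HNSW is an approximate nearest-neighbor index: it trades some search precision (recall) for speed, because a query visits only part of the graph instead of being compared against every stored vector.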
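
The trade-off can be illustrated with a small experiment. The sketch below is not HNSW itself: it uses a single-layer nearest-neighbor graph with one-directional links, built exhaustively, and runs the kind of best-first search that HNSW applies within a layer. All names (`build_graph`, `graph_search`, the `m` and `ef` parameters as used here) are assumptions for illustration. It compares the result with an exact brute-force scan and reports recall and the number of distance computations.

```python
import heapq
import math
import random


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force(points, query, k):
    """Exact top-k: compares the query with every stored vector."""
    return sorted(range(len(points)), key=lambda i: dist(points[i], query))[:k]


def build_graph(points, m=8):
    """Link each point to its m nearest neighbours (built exhaustively for clarity)."""
    graph = {}
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        graph[i] = sorted(others, key=lambda j: dist(p, points[j]))[:m]
    return graph


def graph_search(points, graph, query, k, ef=16, entry=0):
    """Best-first graph search keeping at most ef results; returns (top-k, visited count)."""
    visited = {entry}
    d0 = dist(points[entry], query)
    candidates = [(d0, entry)]  # min-heap: closest unexplored node first
    best = [(-d0, entry)]       # max-heap via negation: the ef closest seen so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(best) >= ef and d > -best[0][0]:
            break  # the closest candidate is worse than the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = dist(points[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the farthest kept result
    top = sorted((-nd, i) for nd, i in best)[:k]
    return [i for _, i in top], len(visited)


rng = random.Random(0)
points = [[rng.random() for _ in range(8)] for _ in range(300)]
graph = build_graph(points)
query = [rng.random() for _ in range(8)]

exact = brute_force(points, query, k=5)
approx, n_visited = graph_search(points, graph, query, k=5)
recall = len(set(exact) & set(approx)) / 5
print(f"recall={recall:.2f}, distance computations: {n_visited} vs {len(points)}")
```

Raising `ef` makes the search visit more nodes, which tends to raise recall at the cost of more distance computations; lowering it does the opposite.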