1) End Goal (What should the agent do?)
Agent job:
- Answer support questions
- Summarize documents/videos
- Find specific facts inside text
- Calculate totals/averages/top items
- Give timelines / step-by-step breakdown
- Search internal knowledge base
- Mix of above (hybrid)
2) What questions will users ask?
Write 5–10 real user questions:
- List
- List
- List
- List
- List
Now label each question type:
- Lookup (simple fact)
- Filter-based query (rows match criteria)
- Aggregation (sum/avg/max/min)
- Ranking (top 3, best performing)
- Timeline / chronological explanation
- Full summary of a whole document
- Semantic search (find relevant parts)
3) What data does the agent need to see?
Pick the data type:
A) Structured (tabular)
- rows + columns
- spreadsheet / database table
Examples: sales, CRM, orders, inventory
B) Unstructured (text)
- PDFs, docs, transcripts, notes
Examples: SOPs, policies, meeting notes
C) Mixed
- both structured + unstructured
4) Choose the Retrieval Method (the correct RAG)
Use this rule:
✅ If a human would use spreadsheet filters → Filters
✅ If a human would use pivot tables / formulas → SQL
✅ If a human would read the whole doc before answering → Full Context
✅ If a human would search inside many docs → Vector Search
5) Final RAG Choice (pick one)
- Filters RAG
- SQL RAG
- Full Context RAG
- Vector Search RAG
- Hybrid (Filters/SQL + Vector)
🔥 Detailed Guide (4 Methods Explained Properly)
1) Filters RAG
(Fastest + Cheapest)
When to use
Use Filters when:
- your data is tabular (rows/columns)
- you already know the fields to filter by
- question needs a small subset of rows
Best for questions like
- “How many Bluetooth speakers did we sell on Sept 16?”
- “Show orders where product = X”
- “Give customers from Delhi only”
Why it works (from transcript)
- Fast
- Cheap
- Accurate
- Lower hallucination risk
Because the agent pulls less data into context.
What you MUST do (important)
Filters are NOT semantic.
They do exact matching only:
If the real product name is “Bluetooth speaker” and the agent filters on “Bluetooth speakers”, the filter may return zero rows.
So you must define:
- valid product names
- correct date format
- allowed fields
Setup checklist
✅ Define all allowed filter keys
✅ Define exact allowed values (if needed)
✅ Validate formatting (date, casing)
✅ Agent returns:
- filter used
- rows returned
- calculation logic
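The checklist above can be sketched as a small tool the agent calls. All field names, allowed values, and rows here are made up for illustration — the point is the allowlist validation and that the tool reports the filter it used:

```python
# Minimal Filters RAG sketch (hypothetical schema): the agent emits a filter
# dict, we validate it against an allowlist, then return the matching rows
# plus the filter that was actually applied.

ALLOWED_FILTERS = {
    "product": {"Bluetooth speaker", "Wired headphones"},  # exact allowed values
    "city": None,   # None = any string accepted
    "date": None,   # expected format: YYYY-MM-DD
}

ROWS = [
    {"product": "Bluetooth speaker", "city": "Delhi", "date": "2024-09-16", "units": 12},
    {"product": "Wired headphones", "city": "Mumbai", "date": "2024-09-16", "units": 5},
    {"product": "Bluetooth speaker", "city": "Delhi", "date": "2024-09-17", "units": 7},
]

def run_filter(filters: dict) -> dict:
    # Reject keys/values the schema does not allow (filters are exact-match, not semantic)
    for key, value in filters.items():
        if key not in ALLOWED_FILTERS:
            raise ValueError(f"Unknown filter key: {key}")
        allowed = ALLOWED_FILTERS[key]
        if allowed is not None and value not in allowed:
            raise ValueError(f"Invalid value for {key}: {value}")
    matched = [r for r in ROWS if all(r[k] == v for k, v in filters.items())]
    # Return the filter used + rows, so the agent can show its work
    return {"filter_used": filters, "rows": matched, "row_count": len(matched)}

result = run_filter({"product": "Bluetooth speaker", "date": "2024-09-16"})
print(result["row_count"])  # 1
```

Because values are validated up front, the “Bluetooth speakers” vs “Bluetooth speaker” mismatch fails loudly instead of silently returning nothing.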
2) SQL RAG
(Best for totals + rankings + trends)
When to use
Use SQL when:
- you need math across many rows
- you need ranking
- you need grouping
- you need comparisons
Best for questions like
- “Top 3 highest earning products?”
- “Average order value?”
- “Revenue trend by month?”
- “Which product has highest profit?”
Why SQL is better than vector for structured data
Because databases are built for:
- SUM / AVG / MAX
- GROUP BY
- ORDER BY
- LIMIT
SQL gives a complete answer because it processes the full dataset correctly.
Example SQL pattern (like transcript, assuming a table named `orders`):

```sql
SELECT product, SUM(revenue) AS total_revenue
FROM orders
GROUP BY product
ORDER BY total_revenue DESC
LIMIT 3;
```
Common mistake to avoid
If you don’t give the agent the table + column names clearly, it will:
- guess wrong column names
- break queries
- return wrong answers
Setup checklist
✅ Provide schema (table names + columns)
✅ Provide example queries
✅ Add safety constraints (no deletes, no updates)
✅ Optionally add “schema lookup tool” for dynamic tables
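A minimal sketch of the SQL path, using Python's built-in sqlite3 with an in-memory database. The `orders` table, its columns, and the sample rows are all illustrative:

```python
import sqlite3

# Hypothetical orders table; in a real setup the agent receives this schema
# (table + column names) in its context so it never guesses them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("speaker", 500.0), ("headphones", 300.0),
     ("speaker", 700.0), ("cable", 50.0), ("headphones", 100.0)],
)

# The full-dataset aggregation the agent should run for "top 3 by revenue":
# unlike chunk retrieval, this processes every row.
top3 = conn.execute(
    """
    SELECT product, SUM(revenue) AS total
    FROM orders
    GROUP BY product
    ORDER BY total DESC
    LIMIT 3
    """
).fetchall()
print(top3)  # [('speaker', 1200.0), ('headphones', 400.0), ('cable', 50.0)]
```

For the safety constraints in the checklist, you would also reject any statement that isn't a `SELECT` before executing it.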
3) Full Context RAG
(Most accurate for order + meaning)
What it is
Instead of retrieving chunks, you give the agent the entire document.
When to use
Use Full Context when:
- order matters (chronological breakdown)
- you need a proper summary of the full doc/video
- step-by-step explanation is required
- dataset is small enough to fit in context
Best for questions like
- “Give a chronological breakdown of this transcript”
- “Summarize this entire SOP”
- “Explain the full onboarding process”
Pros
✅ Full context
✅ Best accuracy
✅ Best for timelines
Cons
⚠️ More tokens (more expensive)
⚠️ Slower
⚠️ Context window limits (but improving fast)
3 ways to do full context (from transcript)
Option A: Tool-based full context
Agent chooses which doc to load
(cheaper than always loading everything)
Option B: Put full docs in the prompt
Always included → always expensive
Option C: Dynamic variables (same cost, more flexible)
Docs injected each time as variables
Setup checklist
✅ Keep docs organized (names, sources)
✅ Give agent “choose doc A/B” tool if possible
✅ Use when order matters more than cost
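Option A (tool-based full context) can be sketched like this. Doc names and contents are placeholders; the idea is that the agent picks one doc by name and only that doc's full text enters the prompt:

```python
# Sketch of tool-based full context: no chunking, the whole chosen
# document is loaded, preserving its original order.

DOCS = {
    "onboarding_sop": "Step 1: create account. Step 2: verify email. Step 3: first login.",
    "refund_policy": "Refunds are accepted within 30 days of purchase.",
}

def load_doc(name: str) -> str:
    # The "choose doc A/B" tool: returns one entire document or fails clearly
    if name not in DOCS:
        raise KeyError(f"Unknown doc: {name}. Available: {sorted(DOCS)}")
    return DOCS[name]

prompt = (
    "Summarize the following document step by step, in order:\n\n"
    + load_doc("onboarding_sop")
)
```

Only the selected document is paid for in tokens, which is why Option A is cheaper than always stuffing every doc into the prompt (Option B).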
4) Vector Search RAG (Chunk-Based Retrieval)
What it is
Documents are split into chunks → embedded into vectors → semantic search retrieves chunks.
When to use
Use Vector Search when:
- knowledge base is huge
- user asks “where is this mentioned?”
- you need semantic matching across many documents
Best for questions like
- “What does our policy say about refunds?”
- “Find where the transcript mentions pricing”
- “What did we say about agent memory?”
Why people misuse it
People realize:
“Agent needs external info” → instantly add vector DB.
But chunk retrieval can lose:
- full document context
- order
- complete dataset visibility
- “holistic truth”
Biggest failure cases (from transcript)
❌ Summaries of full docs/videos
It summarizes only the chunks it retrieved, not the entire doc.
❌ Tabular math questions
Example: “What week had highest sales?”
Vector retrieval may only return 1 chunk → agent answers based on partial rows.
How to improve vector retrieval (if you must use it)
✅ Increase chunk limit (retrieve more chunks)
✅ Use metadata tagging (source, URL, timestamp)
✅ Add hybrid approach:
- vector search → retrieve text
- then full context on the retrieved source
✅ Use it only when semantic search is actually needed
Setup checklist
✅ Chunk size strategy
✅ Metadata fields (doc name, timestamp, source)
✅ Retrieval limit tuning
✅ Evaluation set (test questions)
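The chunk → score → top-k flow looks like this. A real system uses an embedding model; this toy stands in cosine similarity over word counts for embeddings, and the chunks and metadata are invented, but the shape of the pipeline (and attaching metadata so answers can cite sources) is the same:

```python
import math
from collections import Counter

# Toy chunk store with metadata (source + position), as in the checklist
CHUNKS = [
    {"text": "refunds are accepted within 30 days", "source": "policy.pdf", "pos": 4},
    {"text": "pricing is discussed in the second call", "source": "transcript.txt", "pos": 12},
    {"text": "agent memory persists across sessions", "source": "notes.md", "pos": 2},
]

def vec(text: str) -> Counter:
    # Stand-in for an embedding model: bag-of-words counts
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query: str, k: int = 2):
    # Score every chunk against the query, return the top k with metadata
    q = vec(query)
    scored = sorted(CHUNKS, key=lambda c: cosine(q, vec(c["text"])), reverse=True)
    return [(c["source"], c["text"]) for c in scored[:k]]

hits = search("what does the policy say about refunds")
```

Note the failure mode from above is visible here: `search` only ever returns k chunks, so a question needing the whole dataset (totals, full summaries) gets a partial view.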
⚡ The Ultimate RAG Decision Flow (Super Simple)
Step 1: Is the data structured (rows/columns)?
✅ YES → use Filters or SQL
- If simple subset → Filters
- If math/aggregation → SQL
❌ NO → go to step 2
Step 2: Does the answer require reading the whole doc in order?
✅ YES → Full Context
❌ NO → Vector Search
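The two-step flow above fits in one function; the inputs are the answers you'd give about your own use case:

```python
def choose_rag(structured: bool, needs_math: bool, needs_full_doc_order: bool) -> str:
    # Step 1: structured data (rows/columns) goes to Filters or SQL
    if structured:
        return "SQL RAG" if needs_math else "Filters RAG"
    # Step 2: unstructured -- does the answer need the whole doc, in order?
    return "Full Context RAG" if needs_full_doc_order else "Vector Search RAG"

print(choose_rag(structured=True, needs_math=True, needs_full_doc_order=False))   # SQL RAG
print(choose_rag(structured=False, needs_math=False, needs_full_doc_order=True))  # Full Context RAG
```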
🧠 Context Engineering Checklist (From Transcript)
To make agents accurate long-term, you need:
1) Begin with the end in mind
What does “correct answer” look like?
2) Design your data pipeline
Where does the truth live?
- DB? Docs? Sheets? CRM?
3) Ensure data accuracy
Bad input → bad output
Always.
4) Optimize context windows
Don’t overload the agent.
Pull only what’s needed.
5) Embrace specialization
One agent doesn’t have to use a single method for everything.
Use the right tool per job.
🧪 Quick Testing Framework (Use this before deploying)
Test your agent with these 10 questions:
Filters test
- “How many X sold on Y date?”
- “Show all orders where product = X”
SQL test
- “Top 3 products by revenue”
- “Average order value”
- “Revenue trend month-wise”
Full context test
- “Give chronological breakdown”
- “Summarize entire transcript”
Vector test
- “Where does it mention X?”
- “What does it say about Y?”
If your agent fails:
- you didn’t choose the right RAG method OR
- your context is wrong/insufficient
✅ Final Cheat Sheet (1 line)
Filters = spreadsheet filters
SQL = pivot tables
Full Context = read everything
Vector Search = find relevant parts
Follow Vikash Kumar on LinkedIn for more.

