Session 11: RAG — Give AI a Cheat Sheet (And Watch It Stop Making Stuff Up)

Retrieval-Augmented Generation — the technique that makes AI actually useful for real work.

📖 Reading time: ~9 minutes
📚 Prerequisites: Session 9 — AI Risks, Bias, and Ethics and Session 5 — Large Language Models
📄 Keywords: RAG explained simply, retrieval-augmented generation, AI with your documents, how to reduce AI hallucinations, NotebookLM

Remember Hallucinations? Here's the Fix.

In Session 9, we talked about one of AI's most frustrating problems: hallucinations — when AI confidently tells you something that is completely, utterly made up. Fake citations. Invented statistics. Nonexistent court cases. All delivered with the confidence of someone who definitely did not do the reading.

It's a real problem. But here's what we didn't tell you in Session 9: there's a practical fix. And you've probably already used it without knowing what it was called.

Ever uploaded a PDF to ChatGPT and asked questions about it? Used Perplexity and noticed those little numbered citations next to every claim? Dropped a textbook into Google's NotebookLM and had a conversation about it?

That's RAG — Retrieval-Augmented Generation. It's the single most important technique making AI actually useful for real work in 2026. And the concept is dead simple.

Key Concepts

The Open-Book Exam Analogy

Here's the easiest way to understand RAG:

A regular chatbot is like a student taking a closed-book exam. It answers from memory — which might be impressive, but might also be wrong. RAG is like giving that same student an open-book exam — they can look up answers in the provided materials before responding.

That's it. That's RAG.

When you ask ChatGPT a question normally, it answers from what it "memorized" during training — patterns from trillions of words it saw months or years ago. Sometimes it nails it. Sometimes it invents a Supreme Court case that never happened.

When you upload your documents first and then ask the question, the AI retrieves the relevant information from your files before generating its answer. It's reading, not guessing. And that makes all the difference.

What RAG Actually Stands For

Retrieval-Augmented Generation — let's break that down in plain English:

Word | What It Means
Retrieval | The AI searches through your documents to find the relevant pieces
Augmented | Those pieces are added to the AI's context, so it is augmented with real information
Generation | The AI generates its response based on what it actually found

So: find the relevant stuff → feed it to the AI → get a grounded answer. That's the whole pattern.
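For readers curious what "feed it to the AI" might look like in practice, here is a minimal Python sketch of the augmentation step. The function name build_prompt, the instruction wording, and the sample passages are all assumptions made for illustration; real tools use their own prompts, which this session does not describe.

```python
def build_prompt(question, passages):
    """Augment a question with retrieved passages (the "A" in RAG)."""
    # Number each passage so the answer can point back to it, e.g. [1], [2].
    numbered = [f"[{i}] {text}" for i, text in enumerate(passages, start=1)]
    sources = "\n".join(numbered)
    return (
        "Answer using only the sources below. "
        "If the answer is not there, say you cannot find it.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


# Hypothetical passages that a retrieval step might have found.
passages = [
    "Domestic orders can be returned within 30 days.",
    "International orders can be returned within 45 days.",
]
print(build_prompt("How long is the return window for international orders?", passages))
```

The text this prints is what would be sent to the AI in place of the bare question, which is why the answer can lean on the passages instead of memory.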

How RAG Works (The Simple Version)

Here's what happens behind the scenes when you upload a document and ask a question:

  1. You provide documents — PDFs, notes, web pages, spreadsheets, whatever you've got
  2. The system breaks them into chunks — AI can't swallow a 500-page manual whole, so it splits documents into smaller, searchable pieces (think of it like creating an incredibly detailed index)
  3. You ask a question — "What does our refund policy say about international orders?"
  4. The system retrieves the relevant chunks — it finds the specific passages most likely to contain your answer
  5. The AI reads those passages and generates a response, grounded in your actual documents rather than in memory alone
  6. You get an answer you can check: good RAG tools add citations pointing to exactly where the information came from

The beauty is: you don't need to understand any of the technical machinery. You just upload, ask, and verify.
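If you do want a peek at the machinery, steps 1 to 4 can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any particular product works: chunks here are single sentences, and "relevance" is just the number of words a chunk shares with the question. The names split_into_chunks, words_in, and retrieve, and the sample document, are hypothetical.

```python
import re


def split_into_chunks(text):
    """Step 2: break a document into small pieces (here, one sentence each)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [sentence for sentence in sentences if sentence]


def words_in(text):
    """Return the set of lowercase words in a text, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, chunks, top_k=1):
    """Step 4: return the chunks that share the most words with the question."""
    question_words = words_in(question)
    scored = [(len(question_words & words_in(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no words at all with the question.
    return [chunk for score, chunk in scored[:top_k] if score > 0]


# Step 1: a made-up, three-sentence "document".
document = (
    "Domestic orders can be returned within 30 days. "
    "International orders can be returned within 45 days. "
    "Standard shipping takes five business days."
)
chunks = split_into_chunks(document)

# Step 3: the question from the walkthrough above.
question = "What does our refund policy say about international orders?"

print(retrieve(question, chunks))
# prints ['International orders can be returned within 45 days.']
```

Word counting is the simplest possible stand-in for the retrieval step. It is enough to show the shape of the idea: score every chunk against the question, then keep the best ones.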

Grounding — The Fancy Word for "Citing Your Sources"

In AI-speak, grounding means connecting the AI's response to specific, verifiable source material. It's exactly what your English teacher meant when they said "cite your sources" — except now the AI is the one doing the citing.

Ungrounded: "The company's return window is 30 days" (says who?)

Grounded: "According to page 14 of the refund policy, the return window is 30 days for domestic orders and 45 days for international orders" (checkable!)

Grounding is why tools like Perplexity feel more trustworthy than a plain chatbot — every claim comes with a receipt.
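"Checkable" can be made literal. The sketch below, in Python, tests whether a quoted passage really appears on the page a citation points to. The data layout (a dictionary of page numbers to page text, and a citation with "page" and "quote" fields) and the function name citation_checks_out are assumptions for illustration, not the format of any real tool.

```python
def citation_checks_out(citation, pages):
    """Return True if the quoted text really appears on the cited page."""
    page_text = pages.get(citation["page"], "")
    quote = citation["quote"].strip().lower()
    # An empty quote proves nothing, so treat it as a failed check.
    return bool(quote) and quote in page_text.lower()


# Hypothetical refund policy, stored as page number -> text of that page.
pages = {
    14: "The return window is 30 days for domestic orders "
        "and 45 days for international orders.",
}

good = {"page": 14, "quote": "45 days for international orders"}
bad = {"page": 14, "quote": "90 days for international orders"}

print(citation_checks_out(good, pages))  # prints True
print(citation_checks_out(bad, pages))   # prints False
```

This is the same check you do by hand when you click a citation and read the passage for yourself.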

Context Window vs. Knowledge Base

This is a distinction worth understanding (and it builds on what you learned about context windows in Session 5):

Aspect | Context Window | Knowledge Base (RAG)
What it is | The text the AI can "see" in one conversation | A library of documents the AI can search through
Size | Limited (even 200K tokens has a ceiling) | Can be massive (thousands of documents)
Persistence | Gone when the conversation ends | Stays available across conversations
Analogy | The AI's short-term memory | The AI's reference library

When you paste text directly into a chat, you're using the context window. When you upload documents to a system like NotebookLM or a company's AI tool, you're creating a knowledge base that the AI searches through — that's RAG.
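The two ideas meet at question time: the knowledge base can be huge, but only the retrieved pieces that fit in the context window are handed to the AI. Here is a rough Python sketch of that squeeze. It assumes the chunks arrive already ranked from most to least relevant, and it counts words as a stand-in for tokens; the function name fit_into_context and the sample chunks are hypothetical.

```python
def fit_into_context(ranked_chunks, budget_words):
    """Keep the best-ranked chunks that fit within a word budget."""
    selected = []
    used = 0
    for chunk in ranked_chunks:
        size = len(chunk.split())
        if used + size > budget_words:
            break  # The next chunk would overflow the budget, so stop here.
        selected.append(chunk)
        used += size
    return selected


# Hypothetical retrieval results, most relevant first.
ranked_chunks = [
    "International orders can be returned within 45 days.",
    "Domestic orders can be returned within 30 days.",
    "Standard shipping takes five business days.",
]

print(len(fit_into_context(ranked_chunks, budget_words=16)))  # prints 2
```

With a budget of 16 words, the first two chunks (8 words each) fit and the third is left behind in the knowledge base until a question calls for it.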

You're Already Using RAG (You Just Didn't Know It)

Here's the thing: RAG isn't some futuristic concept. It's baked into the AI tools you already use. You've been doing it.

ChatGPT file uploads, Perplexity, NotebookLM, Claude, and Microsoft Copilot are all doing the same thing: retrieve first, then generate. Now you know the name for it.

Real-Life Examples

Try It Yourself 🧪

Activity: See RAG in action — before and after

This activity takes about 5 minutes and will give you the "aha" moment.

Part 1 — The "closed-book" answer

  1. Open ChatGPT or Claude (free accounts work)
  2. Ask: "What are the main points of the Krebs cycle?" (or any specific topic from a class, report, or article you care about)
  3. Note the response — it's probably decent but generic, with no sources

Part 2 — The "open-book" answer

  1. Open NotebookLM (free, just needs a Google account)
  2. Click "New Notebook" and upload a source — try a PDF lecture note, a Wikipedia article you've saved, or any document you'd like to ask questions about
  3. Once uploaded, ask the same question you asked in Part 1
  4. Compare the two answers. Notice:
    • NotebookLM cites specific passages from your document
    • The answer is grounded in your material, not generic AI knowledge
    • You can click the citations to verify every claim

Part 3 — The verification test

  1. Still in NotebookLM, ask something that is NOT in your uploaded document
  2. Watch how the AI handles it — a well-designed RAG system will tell you it can't find that information in your sources, instead of making something up
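One plausible way a system could produce the behavior in Part 3 is a relevance threshold: if no passage matches the question well enough, decline instead of answering. The Python sketch below illustrates that idea only; it is not a description of how NotebookLM works. The function names, the threshold of two shared words, and the sample notes are assumptions.

```python
import re


def words_in(text):
    """Return the set of lowercase words in a text, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def answer_or_decline(question, chunks, min_shared_words=2):
    """Quote the best-matching chunk, or decline if nothing matches well."""
    question_words = words_in(question)
    best_chunk, best_score = None, 0
    for chunk in chunks:
        score = len(question_words & words_in(chunk))
        if score > best_score:
            best_chunk, best_score = chunk, score
    if best_score < min_shared_words:
        return "I can't find that in your sources."
    return f"According to your sources: {best_chunk}"


# Hypothetical lecture notes, already split into chunks.
notes = [
    "The Krebs cycle takes place in the mitochondrial matrix.",
    "Each turn of the Krebs cycle releases two molecules of carbon dioxide.",
]

print(answer_or_decline("Where does the Krebs cycle take place?", notes))
print(answer_or_decline("Who painted the Mona Lisa?", notes))
```

The first question shares several words with the notes and gets a sourced answer. The second shares only "the", falls under the threshold, and gets a polite refusal.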

What you learned: The same kind of AI technology gives dramatically different results when it has your documents to work from. That's the power of RAG, and now you know to reach for it whenever accuracy matters.

When Should You Use RAG vs. Regular AI?

Not every question needs RAG. Here's a simple decision framework:

Situation | Use Regular AI | Use RAG
Brainstorming ideas | ✓ |
Creative writing | ✓ |
General explanations | ✓ |
Questions about YOUR documents | | ✓
Anything where accuracy is critical | | ✓
Research with citations needed | | ✓
Summarizing specific reports or articles | | ✓
Company/organizational knowledge | | ✓

Rule of thumb: If you'd want a human to check the sources before answering, use RAG. If you're just thinking out loud, regular AI is fine.
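The rule of thumb is simple enough to write down as a tiny Python function. The three yes/no questions below are one reading of the framework above (documents, accuracy, citations), and the function name recommend is hypothetical.

```python
def recommend(about_your_documents=False, accuracy_critical=False, needs_citations=False):
    """Suggest "RAG" when sources matter, otherwise "Regular AI"."""
    if about_your_documents or accuracy_critical or needs_citations:
        return "RAG"
    return "Regular AI"


print(recommend())                           # brainstorming: prints Regular AI
print(recommend(about_your_documents=True))  # your own report: prints RAG
print(recommend(needs_citations=True))       # research: prints RAG
```

Answering "yes" to any one of the three questions is enough to tip the choice toward RAG.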

RAG Isn't Perfect (A Quick Reality Check)

RAG dramatically reduces hallucinations, but it's not a magic wand:

  • Garbage in, garbage out — If your source documents are wrong, the AI will confidently serve you wrong answers with citations. RAG grounds AI in your sources, not truth.
  • It can miss relevant passages — The retrieval step isn't always perfect. If your question is vague or the relevant information is buried, the system might not find it.
  • Context window limits still apply — Even with RAG, there's a limit to how much retrieved text the AI can process at once (Session 5 covers this).
  • Not all RAG implementations are equal — NotebookLM's citations are excellent. Some tools just vaguely reference "your documents" without showing you where. Prefer tools that show their work.
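The "missed passage" problem is easy to reproduce with a toy retriever that matches on shared words. In the Python sketch below, the same passage is invisible to one phrasing of a question and visible to another. Real tools generally use more sophisticated matching than word overlap, but retrieval can still miss; the passage, the questions, and the function names here are assumptions for illustration.

```python
import re


def words_in(text):
    """Return the set of lowercase words in a text, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def shared_words(question, passage):
    """Return the words a question and a passage have in common."""
    return words_in(question) & words_in(passage)


# A hypothetical passage from a policy document.
passage = "A refund is issued within 30 days of purchase."

# A vague phrasing shares no words with the passage, so it would be missed.
print(sorted(shared_words("Can I get my money back?", passage)))  # prints []

# A phrasing that uses the document's own vocabulary finds it.
print(sorted(shared_words("How do I get a refund?", passage)))    # prints ['a', 'refund']
```

The practical takeaway matches the bullet above: if a tool says it cannot find something, try asking again using the words the document itself is likely to use.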

The important thing: RAG makes AI much more reliable for factual, document-based work. It's a seatbelt, not a force field.

Why This Matters 🌍

  • RAG is the bridge between "AI is fun" and "AI is useful." General chatting is nice; getting accurate answers from your own documents is a game-changer for real work.
  • Many of the major AI products of 2026 (Google AI Mode in Chrome, NotebookLM in Gemini, ChatGPT Workspace Agents, Claude Connectors) lean on retrieval in some form. Understanding RAG means understanding where much of the industry is heading.
  • It directly tackles the hallucination problem from Session 9. Now when someone says "but AI makes stuff up," you can say: "Far less often when you give it the right sources."
  • Knowing when to use RAG vs. regular AI is becoming a core part of AI literacy — the kind of skill that separates someone who uses AI casually from someone who uses it effectively.

Quick Recap 📝

  • RAG (Retrieval-Augmented Generation) = giving AI your documents to read before it answers — open-book exam instead of closed-book
  • It works by retrieving relevant chunks from your files, then generating a response grounded in what it found
  • Grounding means connecting AI's answers to specific, verifiable sources — with citations you can check
  • You're already using RAG in tools like Perplexity, NotebookLM, ChatGPT file uploads, Claude, and Microsoft Copilot
  • RAG dramatically reduces hallucinations but isn't perfect — your sources still need to be accurate
  • When to use it: any time accuracy matters, you're working with specific documents, or you need citations

Fun Analogy 🎯

RAG is like the difference between asking a friend and asking a librarian. Your friend (regular AI) is smart, well-read, and will always give you an answer — but they're working from memory, and sometimes they're confidently wrong. A librarian (RAG) does something different: they go pull the relevant books off the shelf, read the specific pages, and then give you an answer — with a note telling you exactly where to look if you want to check. Same question, same intelligence. But one of them did their homework first.