Session 5: Large Language Models — How ChatGPT Actually Works
It's the world's most sophisticated game of "guess the next word" — and it's simpler and stranger than you think.
What Is a Large Language Model (LLM)?
- Deep learning model trained specifically on text data — enormous quantities of it
- "Large" = hundreds of billions to trillions of parameters
- Core function: given some text, predict what comes next
- Examples: GPT-4o (OpenAI), Claude (Anthropic), Gemini (Google), Llama (Meta), Mistral
Analogy: Think of an LLM as the world's best autocomplete. Your phone's autocomplete predicts the next word from a few words of context. An LLM does the same thing — but considers thousands of words of context, learned from trillions of words of training data.
How Next-Word Prediction Creates Intelligence
- The model chooses one token at a time
- Each output word becomes part of the context for predicting the next word
- When trained on enough text, this simple process produces coherent writing, reasoning, translation, coding, and conversation
- Emergent behavior: complex capabilities that weren't explicitly programmed but emerged from training at scale
- No beliefs, no understanding, no memory between conversations — it's a pattern-completion engine
Key LLM Vocabulary
| Term | What It Means |
|---|---|
| Token | A chunk of text (~3/4 of a word). "ChatGPT is amazing" ≈ 4 tokens |
| Context window | How much text the model can "see" at once (input + output). GPT-4o: ~128K tokens. Claude: ~200K tokens. Think of it as the model's short-term memory for this conversation. |
| Parameters | Internal settings learned during training. More parameters = more knowledge capacity |
| Training data | Massive text corpus (books, websites, code, conversations) |
| Inference | When the model generates a response — the "thinking" phase |
| Temperature | Controls randomness. Low = predictable. High = creative |
| Fine-tuning | Additional training on a specific dataset to specialize the model |
What LLMs Can and Can't Do
Can Do Well
- Write, summarize, translate, and rewrite text
- Answer questions across a wide range of topics
- Generate and debug code
- Brainstorm, outline, and draft
- Follow complex, multi-step instructions
Can't Do
- Truly understand or have consciousness
- Guarantee factual accuracy (they hallucinate)
- Access real-time information without tools
- Learn from a conversation after it ends
- Perform reliable math without tools
Why Different LLMs Exist
- Different companies, different training data, different design choices
- Trade-offs: speed vs quality, cost vs capability, safety vs freedom
- Open-source (Llama, Mistral) vs proprietary (GPT-4o, Claude)
- No single "best" model — depends on the task
Real-Life Examples
- Customer support: LLM chatbots handle 80%+ of routine questions
- Legal research: summarize case law, draft contracts in seconds
- Content creation: outline articles, write drafts, generate headlines
- Education: personalized tutors adapting to student questions
- Software development: Copilot/Cursor for code, tests, and explaining codebases
- Research: summarize papers, generate literature reviews
🎯 Try It Yourself
Activity: Explore How Context and Wording Change LLM Output
- Open ChatGPT or Claude
- Send: "Explain quantum computing."
- Note the response style and length
- Send: "Explain quantum computing to a skeptical 8-year-old using only food analogies. Keep it under 100 words."
- Compare: same topic, wildly different output — because you changed the context and constraints
- Try: "You are a stand-up comedian. Explain quantum computing as part of your comedy set."
- Notice how a role instruction changes the tone entirely
What you learned: The way you ask dramatically shapes the response. This is the foundation of prompt engineering (covered in Session 6).
💡 Why This Matters
- LLMs are the most widely used AI technology on the planet right now
- Understanding how they work helps you:
- Use them more effectively — better prompts → better results
- Spot limitations — hallucinations, bias, lack of real-time knowledge
- Make informed decisions about when to trust AI output
- LLMs are the engine behind AI assistants, coding copilots, search tools, and autonomous agents (Session 10)
📋 Quick Recap
- LLMs = deep learning models trained on massive text to predict the next word/token
- Core mechanism: next-word prediction — a simple concept that's astonishing at scale
- Key terms: tokens (text chunks), context window (how much it "sees"), parameters (learned knowledge), temperature (creativity dial)
- Capabilities: can write, summarize, translate, code, and converse — but also hallucinate, make errors, and lack understanding
- Different LLMs exist with different strengths; there is no single "best" model
🎭 Fun Analogy
An LLM is like a world-class improv actor who's memorized every script ever written. Give them a scene setup (your prompt), and they'll improvise something that sounds brilliant and perfectly in character. But they're not actually thinking about the scene — they're drawing on patterns from every performance they've ever absorbed. And occasionally, they'll throw in a completely fabricated "fact" with total confidence, because in improv, the show must go on.