Reading time: ~9 min Prerequisites: Session 4 Keywords: how ChatGPT works, large language models explained, LLMs for beginners, tokens and context window, next-word prediction

Session 5: Large Language Models — How ChatGPT Actually Works

It's the world's most sophisticated game of "guess the next word" — and it's simpler and stranger than you think.

What Is a Large Language Model (LLM)?

Deep learning model trained specifically on text data — enormous quantities of it
"Large" = hundreds of billions to trillions of parameters
Core function: given some text, predict what comes next
Examples: GPT-4o (OpenAI), Claude (Anthropic), Gemini (Google), Llama (Meta), Mistral

Analogy: Think of an LLM as the world's best autocomplete. Your phone's autocomplete predicts the next word from a few words of context. An LLM does the same thing — but considers thousands of words of context, learned from trillions of words of training data.

How Next-Word Prediction Creates Intelligence

The model chooses one token at a time
Each output word becomes part of the context for predicting the next word
When trained on enough text, this simple process produces coherent writing, reasoning, translation, coding, and conversation
Emergent behavior: complex capabilities that weren't explicitly programmed but emerged from training at scale
No beliefs, no understanding, no memory between conversations — it's a pattern-completion engine

Key LLM Vocabulary

Term	What It Means
Token	A chunk of text (~3/4 of a word). "ChatGPT is amazing" ≈ 4 tokens
Context window	How much text the model can "see" at once (input + output). GPT-4o: ~128K tokens. Claude: ~200K tokens. Think of it as the model's short-term memory for this conversation.
Parameters	Internal settings learned during training. More parameters = more knowledge capacity
Training data	Massive text corpus (books, websites, code, conversations)
Inference	When the model generates a response — the "thinking" phase
Temperature	Controls randomness. Low = predictable. High = creative
Fine-tuning	Additional training on a specific dataset to specialize the model

What LLMs Can and Can't Do

Can Do Well

Write, summarize, translate, and rewrite text
Answer questions across a wide range of topics
Generate and debug code
Brainstorm, outline, and draft
Follow complex, multi-step instructions

Can't Do

Truly understand or have consciousness
Guarantee factual accuracy (they hallucinate)
Access real-time information without tools
Learn from a conversation after it ends
Perform reliable math without tools

Why Different LLMs Exist

Different companies, different training data, different design choices
Trade-offs: speed vs quality, cost vs capability, safety vs freedom
Open-source (Llama, Mistral) vs proprietary (GPT-4o, Claude)
No single "best" model — depends on the task

Real-Life Examples

Customer support: LLM chatbots handle 80%+ of routine questions
Legal research: summarize case law, draft contracts in seconds
Content creation: outline articles, write drafts, generate headlines
Education: personalized tutors adapting to student questions
Software development: Copilot/Cursor for code, tests, and explaining codebases
Research: summarize papers, generate literature reviews

🎯 Try It Yourself

Activity: Explore How Context and Wording Change LLM Output

Open ChatGPT or Claude
Send: "Explain quantum computing."
Note the response style and length
Send: "Explain quantum computing to a skeptical 8-year-old using only food analogies. Keep it under 100 words."
Compare: same topic, wildly different output — because you changed the context and constraints
Try: "You are a stand-up comedian. Explain quantum computing as part of your comedy set."
Notice how a role instruction changes the tone entirely

What you learned: The way you ask dramatically shapes the response. This is the foundation of prompt engineering (covered in Session 6).

💡 Why This Matters

LLMs are the most widely used AI technology on the planet right now
Understanding how they work helps you:
- Use them more effectively — better prompts → better results
- Spot limitations — hallucinations, bias, lack of real-time knowledge
- Make informed decisions about when to trust AI output
LLMs are the engine behind AI assistants, coding copilots, search tools, and autonomous agents (Session 10)

📋 Quick Recap

LLMs = deep learning models trained on massive text to predict the next word/token
Core mechanism: next-word prediction — a simple concept that's astonishing at scale
Key terms: tokens (text chunks), context window (how much it "sees"), parameters (learned knowledge), temperature (creativity dial)
Capabilities: can write, summarize, translate, code, and converse — but also hallucinate, make errors, and lack understanding
Different LLMs exist with different strengths; there is no single "best" model

🎭 Fun Analogy

An LLM is like a world-class improv actor who's memorized every script ever written. Give them a scene setup (your prompt), and they'll improvise something that sounds brilliant and perfectly in character. But they're not actually thinking about the scene — they're drawing on patterns from every performance they've ever absorbed. And occasionally, they'll throw in a completely fabricated "fact" with total confidence, because in improv, the show must go on.