A large language model (LLM) is an AI system trained on massive datasets (hundreds of billions of words) to predict and generate human-like language. The "large" refers to the scale of both the training data and the number of parameters, the internal settings the model adjusts as it learns.
The tech behind the bot
The "magic" behind LLMs is a neural network architecture called a Transformer. Introduced in 2017, Transformers are basically the world's best context-readers. They can weigh the relationship between words over long stretches of text, which is why AI-generated writing feels so spooky-accurate and coherent.
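That "weighing the relationship between words" is the attention mechanism at the heart of a Transformer. Here's a minimal sketch of how attention weights are computed, using tiny made-up vectors (the function name, the two-dimensional vectors, and the scores are all invented for illustration, not taken from any real model):

```python
import math

# Toy scaled dot-product attention weights: one word's "query" vector is
# compared against every word's "key" vector. Similar vectors get high
# scores, and a softmax turns the scores into weights that sum to 1.
def attention_weights(query, keys):
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(len(query))
        for key in keys
    ]
    # Softmax (shifted by the max score for numerical stability)
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# A key pointing the same way as the query gets the most attention.
weights = attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Real Transformers do this across hundreds of dimensions and many attention "heads" at once, for every word against every other word, which is how they keep track of context over long stretches of text.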
How an LLM generates text
When you hit send, the model breaks your input into tokens: bite-sized chunks of text. It then calculates a probability distribution over every token in its vocabulary, often tens of thousands of options. It's essentially playing a high-stakes game of "predict the next word" based on billions of examples it's seen before.
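The core loop can be sketched in a few lines. Everything here is invented for illustration (the five-word vocabulary, the raw scores, the function names); a real model produces tens of thousands of scores from billions of parameters, but the final step is the same:

```python
import math
import random

# A toy vocabulary and made-up raw scores ("logits") for the next token.
vocab = ["cat", "dog", "the", "ran", "."]
logits = [2.0, 1.0, 0.1, 3.0, -1.0]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)

def sample_next_token(vocab, probs):
    """Pick the next token at random, weighted by its probability."""
    return random.choices(vocab, weights=probs, k=1)[0]
```

Generating a whole reply is just this step repeated: sample a token, append it to the input, and predict again.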
Crucially, this process has no awareness of truth or accuracy. The model is optimising for what is linguistically plausible, not what is factually correct. A well-trained LLM can produce a confidently worded, grammatically perfect sentence that is entirely false.
The truth about fine-tuning
Most models go through a "finishing school" called Reinforcement Learning from Human Feedback (RLHF). This is where humans grade the AI's homework to make it more helpful and less likely to say something unhinged. This shapes the model's "personality," but it doesn't change the engine: it's still just a very smart prediction machine.
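The "grading homework" step is often modelled with pairwise preferences: a human picks which of two replies is better, and a reward model is trained so its scores match those picks. A common way to turn two scores into a preference probability is the Bradley-Terry model; the sketch below uses made-up reward values purely for illustration:

```python
import math

# Bradley-Terry preference model: given reward scores for two candidate
# replies, return the probability that a human prefers reply A over B.
# The bigger the score gap, the closer the probability gets to 1.
def preference_prob(reward_a, reward_b):
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))
```

Training nudges the reward model's scores until its predicted preferences agree with the human labels, and the LLM is then tuned to produce replies that score highly. None of this changes the underlying next-token engine.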