What is a Neural Network?
The phrase gets thrown around constantly. Here's what's actually happening inside — told through the one analogy that gets closest to the truth, and where that analogy falls apart.
Before you can understand transformers, attention, RLHF, or any of the machinery that makes modern AI work, you need one thing: an honest picture of what a neural network actually is. Not what it feels like to use one — what it literally does.
The short answer: a neural network is a function that maps inputs to outputs by passing numbers through layers of simple mathematical operations, where the behaviour of those operations is shaped by billions of learned parameters called weights.
That sentence is precise but not yet useful. Let's build up to it.
The analogy: a panel of judges
Imagine you're trying to decide whether a photo contains a cat. You assemble a panel of judges — say, five of them. Each judge looks at the photo and gives a score between 0 and 1, representing their confidence. But here's the twist: each judge only pays attention to a small patch of the image, and each has a different sensitivity. One notices whiskers. One notices pointed ears. One notices the general shape of a sitting animal.
You then combine their scores — weighting some judges more heavily than others based on how reliable they've proven to be — and produce a final verdict.
Now imagine instead of five judges, you have millions. And instead of one panel, you have dozens of panels stacked on top of each other — the first panel looks at raw pixels, the second looks at the outputs of the first (edges, shapes), the third looks at the outputs of the second (textures, parts), and so on. By the time you reach the final layer, the network has built a rich, abstract representation of the image from the ground up.
That's a neural network. The "judges" are neurons. Their individual sensitivities are weights. The stacked panels are layers.
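The panel analogy translates almost directly into code. Here's a minimal sketch of one panel as a single artificial neuron: each judge's confidence is multiplied by a learned reliability weight, the results are summed, and a sigmoid squashes the total back into a 0-to-1 verdict. All the specific numbers below are invented for illustration; the sigmoid squashing step is also an assumption (the article introduces activation functions properly later on).

```python
import math

def panel_verdict(scores, weights, bias=0.0):
    """Combine judge scores by reliability weight, squash to (0, 1)."""
    total = sum(s * w for s, w in zip(scores, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid: maps any number into (0, 1)

# Five judges' confidences (whiskers, ears, shape, ...) -- made-up values.
scores = [0.9, 0.8, 0.6, 0.2, 0.7]

# How reliable each judge has proven to be -- also made-up values.
weights = [2.0, 1.5, 1.0, -0.5, 0.8]

print(panel_verdict(scores, weights))  # close to 1.0: confident "cat"
```

Stack millions of these per layer, and dozens of layers deep, and you have the real thing.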
*Figure: highlighted neurons fired strongly for this input. Output: 91% cat, 2% not-cat.*
What a weight actually is
Every connection between neurons has a weight — a single number that determines how much one neuron's output influences the next neuron's input. A high positive weight means "if this fires, strongly push the next one to fire too." A negative weight means "if this fires, suppress the next one." A weight near zero means "this connection barely matters."
A large modern neural network might have tens of billions of these weights. The entire character of the network — what it knows, what it's good at, how it behaves — lives in those numbers.
| Connection | Weight | Meaning |
|---|---|---|
| whisker-detector → cat-output | +2.8 | Strong positive signal |
| round-ear-detector → cat-output | +1.9 | Positive signal |
| beak-detector → cat-output | −3.1 | Strong suppressor |
| fur-texture → cat-output | +0.4 | Weak — not decisive |
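The table above can be turned directly into a toy computation. The weights come from the table; the detector activations for each photo are invented inputs, purely for illustration.

```python
# Toy "cat" output neuron using the weights from the table above.
weights = {
    "whisker_detector":   2.8,
    "round_ear_detector": 1.9,
    "beak_detector":     -3.1,
    "fur_texture":        0.4,
}

def cat_score(activations):
    """Raw evidence for 'cat': activation x weight, summed over connections."""
    return sum(activations[name] * w for name, w in weights.items())

# Invented activations: 0 = feature absent, 1 = strongly present.
cat_photo  = {"whisker_detector": 1.0, "round_ear_detector": 0.9,
              "beak_detector": 0.0, "fur_texture": 0.8}
bird_photo = {"whisker_detector": 0.0, "round_ear_detector": 0.1,
              "beak_detector": 1.0, "fur_texture": 0.1}

print(cat_score(cat_photo))   # strongly positive: evidence for "cat"
print(cat_score(bird_photo))  # negative: the beak detector suppresses the verdict
```

Notice how the negative weight does its job: a strong beak signal single-handedly drags the score below zero, no matter what the fur detector says.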
These weights aren't hand-coded. Nobody sat down and decided that whiskers should score +2.8. The network learned those values by being shown millions of labelled examples and gradually adjusting every weight to reduce its mistakes. That process is called training — and it's the subject of the next article.
The activation function: giving neurons opinions
There's one more piece. Each neuron doesn't just pass its input straight through — it applies a small function called an activation function that introduces non-linearity. The most common one used today is called ReLU (Rectified Linear Unit): if the input is negative, output zero. If it's positive, pass it through unchanged.
Why does this matter? Without non-linearity, stacking dozens of layers would be mathematically equivalent to having just one layer. Non-linear activations are what let deep networks learn curves, corners, and complexity.
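Both claims can be checked in a few lines. The sketch below uses 1-D "layers" (a single scalar multiply-and-add each) with made-up weight values, which is enough to show the algebra: two purely linear layers collapse into one, and inserting a ReLU between them breaks that collapse.

```python
def relu(x):
    """ReLU: zero for negative inputs, identity for positive inputs."""
    return max(0.0, x)

# Two stacked *linear* 1-D layers: y = w2 * (w1 * x + b1) + b2.
# Algebra collapses them to a single layer: y = (w2*w1)*x + (w2*b1 + b2).
w1, b1 = 3.0, 1.0   # made-up weights and biases
w2, b2 = -2.0, 0.5

def two_linear_layers(x):
    return w2 * (w1 * x + b1) + b2

def one_linear_layer(x):
    return (w2 * w1) * x + (w2 * b1 + b2)

for x in (-1.0, 0.0, 2.5):
    # Depth bought us nothing: both compute the same function.
    assert two_linear_layers(x) == one_linear_layer(x)

# With a ReLU between the layers, the collapse no longer holds:
def two_layers_with_relu(x):
    return w2 * relu(w1 * x + b1) + b2

print(two_layers_with_relu(-1.0))  # differs from the collapsed linear layer
print(one_linear_layer(-1.0))
```

The ReLU version bends the function at the point where the first layer's output crosses zero — that kink is exactly the kind of corner a single linear layer can never produce.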
Where the analogy breaks down
Neurons don't "pay attention" to specific things by design. Real neurons in a trained network activate in response to patterns that often don't have clean human-interpretable labels like "whiskers." The features a network learns are frequently distributed, abstract, and hard to describe.
Neurons don't work independently. The analogy implies discrete, separable opinions. In reality, meaning is distributed across many neurons simultaneously — no single neuron reliably represents a single concept.
Biological neurons are nothing like this. The name "neural network" is a loose analogy to the brain. These are not simulations of the brain — they're mathematical functions inspired, very loosely, by its architecture.
What this means in practice
When you interact with GPT-4, Claude, or any modern language model, you're talking to a neural network with hundreds of billions of weights arranged in a specific architecture called a transformer. All of its knowledge — about history, code, language, reasoning — lives in those numbers. There's no database being queried. No rules being followed. Just a very large, very well-trained function mapping your input to a probable output.
That's the foundation. Everything else — attention, training, alignment, agents — is built on top of this.