If your email inbox is mostly free of dodgy pharmacy adverts, fake delivery notices and "you have won a prize" messages, you have spam filtering to thank. Quietly, behind the scenes, it sifts through an astonishing volume of junk so that you rarely have to.
But how does a piece of software decide that one email is a welcome message from a friend and another is a scam? The answer is a surprisingly clever mix of detective work, reputation and statistics.
What spam filtering is
Spam filtering is the automated process that examines incoming email and decides which messages are wanted and which are unwanted or dangerous, diverting the junk away from your main inbox — usually into a spam or junk folder.
The word "spam" itself means unsolicited bulk email: messages sent in huge quantities to people who never asked for them. Some is merely annoying advertising; some is outright criminal, carrying phishing attempts or malware. A filter's job is to catch as much of it as possible without ever throwing away a real message you needed to see.
The basic flow
Every email you receive runs a gauntlet before it reaches you:
- An incoming message arrives at your provider's mail servers.
- The filter examines it against many checks at once.
- Each check contributes to an overall spam score.
- If the score crosses a threshold, the message is sent to junk — or blocked entirely.
- If not, it lands in your inbox.
The crucial point is that modern filtering is not a single test but dozens of signals combined. No one factor decides the outcome; it is the weight of evidence.
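The weighing of evidence described above can be sketched in a few lines of Python. This is only an illustration: the signal names, the weights and the threshold are invented for the example, and real filters use far more signals with carefully tuned values.

```python
# Hypothetical signal names and weights, chosen only for illustration.
WEIGHTS = {
    "poor_sender_reputation": 2.5,
    "failed_authentication": 3.0,
    "spammy_phrases": 1.5,
    "known_bad_link": 4.0,
}
THRESHOLD = 5.0  # assumed cut-off, not a value any real provider publishes


def spam_score(signals: dict[str, bool]) -> float:
    """Add up the weights of every signal that fired."""
    return sum(WEIGHTS[name] for name, fired in signals.items() if fired)


def classify(signals: dict[str, bool]) -> str:
    """Return 'junk' if the combined score reaches the threshold, else 'inbox'."""
    return "junk" if spam_score(signals) >= THRESHOLD else "inbox"


message = {
    "poor_sender_reputation": True,
    "failed_authentication": True,
    "spammy_phrases": False,
    "known_bad_link": False,
}
print(spam_score(message), classify(message))
```

In this made-up case neither signal would reach the threshold alone, but together they do, which is the point of combining checks rather than trusting any single one.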
Sender reputation
One of the strongest signals is who the message is from — not just the name, but the reputation of the server and domain that sent it.
Email providers track the behaviour of sending sources over time. A server that suddenly fires out millions of identical messages, or that many recipients keep reporting as spam, develops a poor reputation and finds its mail increasingly blocked. A long-established, well-behaved sender builds trust. This is why a brand-new, unknown sender can sometimes be treated with suspicion even when the message is perfectly genuine.
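One simple way to picture reputation is as a complaint rate that is softened for senders with little history. The sketch below is an assumption made for illustration, not how any particular provider calculates it; the class name, the function and the prior values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SenderStats:
    delivered: int = 0  # messages accepted from this sender
    reported: int = 0   # how many of those were reported as spam


def reputation(stats: SenderStats, prior_messages: int = 20,
               prior_complaint_rate: float = 0.1) -> float:
    """Return a trust score between 0 (bad) and 1 (good).

    The prior behaves like a handful of imaginary earlier messages, so a
    sender with little history sits near the prior instead of being fully
    trusted straight away.
    """
    complaints = stats.reported + prior_messages * prior_complaint_rate
    total = stats.delivered + prior_messages
    return 1.0 - complaints / total


newcomer = SenderStats(delivered=5, reported=0)       # clean but unknown
veteran = SenderStats(delivered=100_000, reported=50)  # long, good record
blaster = SenderStats(delivered=1_000, reported=400)   # heavily reported

for name, stats in [("newcomer", newcomer), ("veteran", veteran), ("blaster", blaster)]:
    print(name, round(reputation(stats), 3))
```

The newcomer has no complaints at all, yet still scores below the veteran because there is so little evidence to go on, which mirrors why a genuine new sender can be treated with suspicion.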
Authentication: proving an email is genuine
Scammers love to forge the "From" address so a message appears to come from your bank or a delivery company. To fight this, the email industry uses three authentication standards that let a filter verify a sender's identity:
- SPF (Sender Policy Framework): lets a domain publish a list of the servers allowed to send email on its behalf, so the filter can check the message came from an approved one.
- DKIM (DomainKeys Identified Mail): adds a cryptographic signature that lets the filter confirm the message was authorised by the signing domain and that the signed parts were not altered in transit.
- DMARC (Domain-based Message Authentication, Reporting and Conformance): builds on the other two by checking that the domain they verified matches the visible "From" address, and tells receiving servers what to do with mail that fails, such as reject it or send it to junk.
When an email claims to be from a major organisation but fails these checks, that is a powerful clue it is forged — and a well-tuned filter acts on it.
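Receiving servers commonly record the outcome of these checks in an Authentication-Results header on the message. The sketch below reads a simplified version of such a header and turns it into a rough verdict. The parsing is deliberately naive and the verdict labels are invented for this example; real headers can carry comments and several results per method, which this does not handle.

```python
import re


def parse_auth_results(header: str) -> dict[str, str]:
    """Pull out the spf, dkim and dmarc results from a simplified header."""
    pattern = re.compile(r"\b(spf|dkim|dmarc)=([a-z]+)", re.IGNORECASE)
    return {m.group(1).lower(): m.group(2).lower() for m in pattern.finditer(header)}


def authentication_verdict(header: str) -> str:
    """Map the DMARC result to a hypothetical label a filter might use."""
    dmarc = parse_auth_results(header).get("dmarc")
    if dmarc == "pass":
        return "authenticated"
    if dmarc == "fail":
        return "likely forged"
    return "unverified"


# Example header using reserved example domains.
header = (
    "mx.example.net; spf=pass smtp.mailfrom=mailer.example; "
    "dkim=none; dmarc=fail header.from=bank.example"
)
print(parse_auth_results(header))
print(authentication_verdict(header))
```

Note that SPF passes in this example while DMARC fails: the sending server was approved for its own domain, but that domain does not match the one shown in the "From" address.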
Reading the content
Filters also look inside the message itself. They have learned, from analysing vast quantities of real spam, the patterns that tend to give it away:
- Classic spam phrases and urgent, pressuring language.
- Mismatched or disguised links, where the visible text and the real destination differ.
- Suspicious attachments.
- Messages that are almost all image with little text, a trick used to dodge word-based checks.
- Links to websites already known to be malicious.
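As one concrete example from the list above, a disguised link can be spotted by comparing the address shown to the reader with the address the link really points to. This is a minimal sketch using Python's standard library; the class and function names are made up for the example, and it only handles the simple case where the visible text is itself a full web address.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []
        self._href = None  # href of the link currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def mismatched_links(html: str) -> list[tuple[str, str]]:
    """Return (visible text, real destination) pairs whose hosts differ."""
    collector = LinkCollector()
    collector.feed(html)
    suspicious = []
    for href, text in collector.links:
        if "://" not in text:
            continue  # visible text is not an address, e.g. "our offers"
        shown = urlparse(text).hostname
        actual = urlparse(href).hostname
        if shown and actual and shown != actual:
            suspicious.append((text, href))
    return suspicious


sample = (
    '<p><a href="http://evil.example/login">https://www.mybank.example</a> '
    'and <a href="https://shop.example/offers">our offers</a></p>'
)
print(mismatched_links(sample))
```

Only the first link is flagged: it displays a bank's address while pointing somewhere else entirely.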
Much of this is powered by machine learning. Rather than relying only on fixed rules a person wrote, the system is trained on millions of examples of spam and legitimate mail, learning the subtle combinations of features that distinguish them. Spammers constantly change tactics, so the filters keep learning in response — an endless cat-and-mouse game.
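To give a feel for learning from examples, here is a toy version of naive Bayes, a classic statistical technique for text classification. It is used here purely as an illustration; the article does not say which models any provider uses, and modern systems are far more sophisticated. The six training messages are invented, where real systems learn from millions.

```python
import math
from collections import Counter

# Tiny invented training set: (message text, label).
TRAINING = [
    ("win a free prize now", "spam"),
    ("cheap pills free delivery", "spam"),
    ("claim your prize today", "spam"),
    ("lunch on friday", "legit"),
    ("minutes from the project meeting", "legit"),
    ("are you free for lunch today", "legit"),
]
LABELS = ("spam", "legit")


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label whose words best explain the message."""
    vocabulary = set().union(*(word_counts[label] for label in LABELS))
    scores = {}
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        # Start from how common the label is overall.
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(TRAINING)
print(predict("free prize", word_counts, label_counts))
print(predict("lunch meeting", word_counts, label_counts))
```

Notice that "free" appears in both spam and genuine mail in the training set. No single word settles the matter; it is the combination that tips the balance, and retraining on fresh examples is how such a model keeps up as spammers change tactics.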
The wisdom of the crowd
On large email platforms, your filter benefits from everyone else's behaviour. When thousands of people receive a similar message and click "report spam", the system rapidly recognises that campaign and starts catching it for everyone. This collective feedback is one reason the big providers' filters are so effective — and why your own clicks genuinely matter.
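The idea of recognising a campaign from many reports can be sketched as follows. The approach here, hashing a normalised copy of the message so that near-identical copies match, is an assumption for illustration, and the threshold of three reporters is far lower than anything a real platform would use.

```python
import hashlib
import re
from collections import defaultdict

REPORT_THRESHOLD = 3  # assumed; real platforms would need many more reports


def fingerprint(body: str) -> str:
    """Hash a normalised body so near-identical copies share a fingerprint."""
    normalised = re.sub(r"\d+", "#", body.lower())        # mask numbers such as order IDs
    normalised = re.sub(r"\s+", " ", normalised).strip()  # collapse whitespace
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class ReportTracker:
    """Remember which users reported which message fingerprint."""

    def __init__(self) -> None:
        self._reporters = defaultdict(set)

    def report(self, user: str, body: str) -> None:
        # A set means repeat reports from one user count only once.
        self._reporters[fingerprint(body)].add(user)

    def is_known_campaign(self, body: str) -> bool:
        reporters = self._reporters.get(fingerprint(body), set())
        return len(reporters) >= REPORT_THRESHOLD


tracker = ReportTracker()
tracker.report("alice", "Your parcel 123 is waiting")
tracker.report("bob", "Your parcel 456 is waiting")
print(tracker.is_known_campaign("Your parcel 999 is waiting"))
tracker.report("carol", "your parcel   789 is waiting")
print(tracker.is_known_campaign("Your parcel 999 is waiting"))
```

Once the third person reports a copy, a fourth recipient's copy is recognised even though its parcel number is different from every copy reported so far.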
When filters get it wrong
No filter is perfect, and the errors come in two flavours:
- A false positive is a legitimate email wrongly sent to junk. This is usually the more harmful mistake, because you might miss something important.
- A false negative is spam that slips through into your inbox.
Filter designers must balance the two. Make the filter too aggressive and real mail gets lost; make it too lax and junk floods in. That trade-off is why the occasional genuine email turns up in your spam folder, and why it is worth glancing in there now and then.
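The trade-off is easy to see with numbers. In the sketch below, ten messages have already been given a spam score, and the code counts both kinds of error at three different thresholds. The scores and labels are made up for the example.

```python
# Invented data: (spam score, whether the message really is spam).
SCORED_MAIL = [
    (0.5, False), (1.0, False), (2.0, False), (3.5, False), (6.0, False),
    (2.5, True), (4.0, True), (5.5, True), (7.0, True), (9.0, True),
]


def error_counts(threshold: float) -> tuple[int, int]:
    """Return (false positives, false negatives) at a given threshold."""
    false_positives = sum(
        1 for score, is_spam in SCORED_MAIL if score >= threshold and not is_spam
    )
    false_negatives = sum(
        1 for score, is_spam in SCORED_MAIL if score < threshold and is_spam
    )
    return false_positives, false_negatives


for threshold in (2.0, 5.0, 8.0):
    fp, fn = error_counts(threshold)
    print(f"threshold {threshold}: {fp} real messages junked, {fn} spam let through")
```

With this data the aggressive threshold of 2.0 junks three real messages and misses no spam, while the lax threshold of 8.0 junks nothing real but lets four spam messages through. The middle setting makes some of each mistake, and no threshold gets both counts to zero.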
What you can do to help
You have more influence over your own filtering than you might think:
- Report spam rather than just deleting it — this trains the system.
- Mark genuine messages as "not spam" if they are misfiled, and add trusted senders to your contacts.
- Never unsubscribe from obvious spam, as doing so can simply confirm to a scammer that your address is active. Unsubscribe only from legitimate senders.
- Treat unexpected messages with care, because filtering is a safety net, not a guarantee — the same caution that protects you from impersonation scams applies here.
If you receive unwanted marketing that breaks the rules, you can complain to the Information Commissioner's Office (ICO), which oversees electronic marketing law in the UK. The National Cyber Security Centre (NCSC) also runs a service for reporting suspicious emails.
The bottom line
Spam filtering is the quiet workhorse of email, weighing the sender's reputation, authentication checks, the content of the message and the reactions of millions of other users to decide what reaches you. It is remarkably good, but never flawless: real mail occasionally lands in junk, and clever scams occasionally land in your inbox.
Understanding how it works helps on both fronts. Reporting spam sharpens the filter, checking your junk folder rescues the odd misfiled message, and staying sceptical of anything unexpected covers the gap that no automated system can fully close.