Three weeks ago, the FCA was being hauled over the coals for handing a chunk of its most sensitive customer data to Palantir. The Guardian reported the regulator had agreed to pay the controversial US firm more than £30,000 a week to train an AI system on case intelligence files, fraud reports, consumer complaints, phone call recordings, email trawls. Privacy lawyers called it a disaster waiting to happen.
This week, the FCA announced something rather different. A fake bank.
The headline, in more measured language, is that the FCA has partnered with the Alan Turing Institute, Plenitude Consulting and Napier AI to build a fully synthetic dataset. It’s derived from real UK retail banking patterns but contains no real customer information. The money laundering scenarios inside it are fabricated. Fake people, fake accounts, fake transactions, fake dirty money moving through fake networks. Available from this month to any firm that applies to the FCA’s Digital Sandbox and wins a slot on the Synthetic Data AML Solution Sprint. Applications close on the 26th.
Set the two stories side by side and they start to look like a regulator working out, in public, what it thinks artificial intelligence should cost.
The tension is genuine. Anti-money-laundering work has always been hobbled by the same problem. You can’t build good detection models without access to the patterns hidden in real financial data, and you can’t get access to that data without tripping over GDPR, customer confidentiality and common sense. The FCA’s own research note on the project puts it plainly: progress in combating money laundering depends on access to detailed financial data, yet legal and privacy constraints often restrict sharing of such information. Every compliance professional reading this knows that sentence by heart. It’s the reason AML has barely shifted the dial in two decades. UNODC estimates the global detection rate for laundered money at around one per cent.
Synthetic data offers a way out of that bind. If you can generate a dataset that behaves statistically like the real thing, without ever exposing a real customer, you can train models on patterns that previously sat inside a handful of tier-one banks with the scale and legal teams to handle the originals. That matters because most of the innovation in AML detection, and most of the smaller firms that could bring it, have been priced out of the game by data access alone.
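To make the idea concrete, here is a minimal sketch of the principle, not the FCA's actual method: fit simple summary statistics to a (here, fabricated) set of real transaction amounts, then sample entirely fake records that track the same distribution. All names, fields and figures are invented for illustration.

```python
import random
import statistics

# Stand-in for "real" transaction amounts -- fabricated for this example.
real_amounts = [12.5, 89.99, 450.0, 23.1, 1200.0, 67.3, 310.0, 45.0]

# Fit crude summary statistics to the real data.
mu = statistics.mean(real_amounts)
sigma = statistics.stdev(real_amounts)

random.seed(42)  # deterministic output for the sketch

def synthetic_transaction(txn_id):
    """Generate one fake transaction with no link to any real customer."""
    return {
        "id": f"SYN-{txn_id:06d}",
        "account": f"ACC-{random.randint(0, 9999):04d}",  # fake account
        "amount": round(max(0.01, random.gauss(mu, sigma)), 2),
        "channel": random.choice(["faster_payment", "card", "cash_deposit"]),
    }

dataset = [synthetic_transaction(i) for i in range(1000)]
# The synthetic amounts track the real distribution's mean and spread,
# but no row corresponds to a real person or payment.
print(len(dataset))
```

Real synthetic-data pipelines use far richer generative models than a single Gaussian, and preserving correlations between fields is where the hard work lies, but the privacy logic is the same: the model, not the records, crosses the boundary.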
Consider what that changes. A challenger bank, a specialist mortgage lender, a regtech startup in Shoreditch building graph-based transaction monitoring. Today, none of them can easily test their ideas against anything that resembles reality. In six months, if this works, they can. They can iterate in the Sandbox, see how their model performs against synthetic laundering scenarios curated by some of the sharpest minds at the Turing Institute, and come out the other side with something worth selling to lenders.
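For readers unfamiliar with what "graph-based transaction monitoring" means in practice, here is a hypothetical sketch of one classic typology it catches: a fan-in pattern, where many distinct senders funnel sub-threshold payments into one collecting account. Every name and threshold below is invented for illustration.

```python
from collections import defaultdict

# Fabricated transactions: twelve "mules" each send a sub-threshold
# payment to one collector, which then moves the pooled funds onward.
transactions = [
    *[(f"mule_{i}", "collector", 900.0) for i in range(12)],
    ("collector", "offshore_co", 10500.0),
    ("alice", "bob", 45.0),  # ordinary, innocent payment
]

# Build a simple directed graph: who pays whom, and how much.
incoming = defaultdict(list)
for sender, receiver, amount in transactions:
    incoming[receiver].append((sender, amount))

def flag_fan_in(account, min_senders=10, max_each=1000.0):
    """Flag accounts pooling many sub-threshold payments from distinct senders."""
    deposits = incoming[account]
    senders = {s for s, _ in deposits}
    small = [a for _, a in deposits if a < max_each]
    return len(senders) >= min_senders and len(small) >= min_senders

flagged = [acct for acct in incoming if flag_fan_in(acct)]
print(flagged)  # prints ['collector']
```

Production systems layer on scoring, time windows and far subtler typologies, but the point stands: these patterns live in the network structure, which is exactly what a synthetic dataset with realistic laundering scenarios lets smaller firms test against.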
That’s a very different posture from the Palantir approach. In one case, the FCA hands real data to a single large contractor and hopes the privacy containment holds. In the other, it builds shared infrastructure that anyone with a good idea can plug into. One is command and control. The other is closer to a public utility.
Synthetic data isn’t a magic solvent. The scenarios in the dataset are only as good as the laundering typologies the FCA, Turing, Plenitude and Napier chose to model. Real criminals don’t oblige by sticking to the script. Any firm that trains purely on synthetic data is going to miss things a bank with fifteen years of its own real transaction history would not. There’s also a harder critique. If the synthetic data is generated from real UK retail banking inputs, how synthetic is it really? The FCA has argued that the process strips identifying signals, but the debate about whether synthetic data can ever be fully divorced from its source is live in academic circles, and it’ll follow this project.
Call it infrastructure rather than a fix. An attempt to build a common ground on which regulators and firms can test new AML ideas without either side having to stare down the privacy lawyers every time.
For mortgage brokers and lenders, the relevance is less theoretical than it might look. Property has always been one of the most reliable laundering vehicles in the UK. The NCA has been flagging this for years. The rise in specialist lender volumes, the fragmentation of broker networks, the sheer quantity of mortgage money that now flows through channels outside the high-street mainstream, all of it creates cracks for dirty money to slip through. The firms most likely to catch it will be the ones with the best pattern detection, not the ones with the loudest tickbox compliance. And better pattern detection, if this project does what it says on the tin, is about to get meaningfully cheaper to build.
There’s a bigger story here too, about how the FCA is learning to regulate AI. The Palantir trial is the old way. Buy capability, hope for the best, apologise afterwards. The synthetic data sprint is an attempt at something more considered. Build the tools, set the boundaries, let the market innovate inside them. If the FCA gets this right, it becomes a model for how a regulator can be pro-innovation without being reckless with the data of people who have no idea their records are even being used.
Whether it actually gets this right is another matter. The applications deadline is on the 26th. The project will be judged on what firms come out of the sprint with, not on the press release that kicked it off. Most people in the industry can already see that synthetic data has the potential to shift the AML dial. The harder question is whether the FCA has the patience to give this the long runway it needs, or whether the next political wind blows it back toward the cheaper, messier option of handing everything to Palantir and hoping no one notices.
Source: FCA and Turing Institute develop synthetic dataset to combat money laundering, Finextra, 17 April 2026. FCA research note: Synthetic data anti-money laundering project report.