
AI social engineering in 2026: why phishing simulations built on last year's templates are the wrong defense

Targeted social engineering used to require hours of manual reconnaissance. AI removed that ceiling. Personalized, multi-channel attacks now take seconds to build — and most simulation programs still test only email.
Written By:
Natalia Bochan

TL;DR

AI has made social engineering adaptive, personalized, and multi-channel. Most phishing simulation programs were designed for a different threat. This post explains what has changed, what a modern AI social engineering simulation needs to test, and what metrics give a more accurate picture of actual exposure.

Introduction

Most phishing simulation programs run on a library of email templates. A vendor creates scenarios based on attacks that worked months ago, you pick a few, send them on schedule, and measure who clicked. The click rate goes into the report. The report goes to the board. The cycle repeats.

That model worked when social engineering was mostly email-based and attacker resources were scarce. In 2026, it measures the wrong thing.

AI has changed the economics of social engineering. Creating a believable, personalized attack no longer requires a skilled attacker to spend hours on reconnaissance. It requires the right prompt and a few minutes. The result is a different category of threat: adaptive, multi-channel, and calibrated to the individual target in real time.

A training program built on last year's templates isn't wrong because the templates are old. It's wrong because the threat it models no longer exists at scale.

How AI has changed social engineering in 2026

For most of the past decade, social engineering at scale meant phishing: mass-distributed email campaigns using templates designed to catch as many people as possible. Targeted attacks existed but required significant attacker effort. The resource constraint created a natural ceiling.

AI has removed that ceiling.

Modern social engineering attacks use AI to personalize at scale. A system pulls publicly available data from LinkedIn, company websites, and other sources to build a detailed target profile, then generates a message that matches that person's context, communication style, and recent activity. What used to require hours of manual work now takes seconds.

The attack surface has also expanded beyond email. Voice cloning can replicate a colleague's or executive's voice convincingly enough to pass casual scrutiny. Deepfake video is increasingly accessible. SMS, Teams, Slack, and WhatsApp are attack channels alongside email. Sophisticated attackers sequence these: an email establishes a false context, a follow-up call from a cloned voice creates urgency, a Teams message closes the loop.

This multi-channel, multi-step sequencing is how attackers operate in 2026. A simulation program that tests only email clicks isn't testing this threat.

What traditional phishing simulations actually measure

Traditional phishing simulations answer one question: did this employee click a link in a simulated phishing email?

That's a useful data point. But it's a narrow slice of behavior, captured in a narrow context, using scenarios that may not reflect what attackers are currently doing.

The template lag problem

Simulation vendors build scenario libraries based on observed attack patterns. By the time a template is created, reviewed, approved, and deployed, the attack it models may be six to twelve months old. AI-powered attackers aren't operating on that timeline. They generate new variants continuously, adapting to what's working now.

Training employees to recognize yesterday's attack patterns builds a specific, brittle form of recognition. It doesn't build the broader behavioral resilience that a continuously adapting threat requires.

Traditional simulations also tend to measure platform metrics rather than risk. A 72% training completion rate and a 15% click-through rate on simulations are real numbers. But they measure performance within the training environment, not behavior change in actual risk situations. The two are related, but they are not the same thing.

According to the 2024 Verizon Data Breach Investigations Report, 68% of breaches involved the human element (Verizon, 2024, verizon.com/dbir). That figure has not improved substantially despite widespread adoption of phishing simulation programs. That gap deserves attention.

What an AI social engineering simulation needs to test

An AI social engineering simulation designed for the current threat environment differs from template-based phishing in three important ways.

First, realism. Scenarios should reflect how attacks are actually constructed, which increasingly means personalized content rather than generic templates. A simulation that generates scenarios using the target's role, context, and communication patterns tests a different behavioral response than one that sends the same email to the entire department.

Second, multi-vector coverage.

Multi-channel exposure

Real attacks don't arrive through a single channel. An employee might receive a spoofed email, a follow-up SMS, and a voice message from a cloned executive within the same hour. Each touchpoint is designed to reinforce the others. The threat isn't the email or the call in isolation. It's the combination.

A simulation that tests only email gives employees no exposure to this. Multi-vector social engineering simulation places employees inside scenarios that combine channels: email alongside voice, SMS alongside a Teams message, a deepfake call reinforced by a calendar invite. The question it answers isn't "did they click the phishing link?" It's "did they recognize the pattern when the pressure came from multiple directions at once?"

Third, behavioral signals rather than click metrics. A simulation should capture what happens before and after the click, not just whether the click occurred. Did the employee report the suspicious message? How quickly? Did they verify through a secondary channel before acting? These behavioral indicators are more predictive of actual risk than a binary clicked/didn't click outcome.
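
To make this concrete, here is a minimal sketch of what capturing those signals could look like, assuming a simple per-touchpoint record. The structure, field names, and channels are illustrative assumptions, not a description of any particular platform's data model.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

@dataclass
class TouchpointOutcome:
    """What one employee did with one touchpoint of a simulated scenario."""
    channel: str                                 # e.g. "email", "sms", "voice", "teams"
    engaged: bool                                # clicked the link, stayed on the call, replied
    reported: bool                               # flagged the message or call as suspicious
    report_latency: Optional[timedelta] = None   # time from delivery to report, if reported
    verified_out_of_band: bool = False           # checked through a second, trusted channel

def scenario_signals(outcomes: list[TouchpointOutcome]) -> dict:
    """Summarize behavior across all touchpoints of one multi-channel scenario."""
    return {
        "engaged_any_channel": any(o.engaged for o in outcomes),
        "reported_any_channel": any(o.reported for o in outcomes),
        "verified_before_acting": any(o.verified_out_of_band for o in outcomes),
        "fastest_report": min(
            (o.report_latency for o in outcomes if o.report_latency is not None),
            default=None,
        ),
        "channels_tested": sorted({o.channel for o in outcomes}),
    }
```

The unit of measurement here is the scenario, not the individual email, and the outcome is a set of behaviors rather than a single clicked/didn't click flag.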

The metrics that matter

The question a CISO needs to answer isn't "what percentage of our employees clicked last quarter?" It's "how exposed is this organization to social engineering right now?"

Those are different questions, and they require different data.

A social engineering susceptibility score aggregates behavioral signals across multiple channels and scenarios to give a more complete picture of organizational exposure. It accounts for how employees respond to different attack types, how consistent their judgment is across channels, and whether training interventions produce durable behavior change or just temporary awareness.
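
As a rough illustration of the difference, a score like this might be computed as a weighted aggregation of per-scenario behavioral signals, along the lines of the sketch below (building on the scenario summaries sketched earlier). The weights and scaling are placeholder assumptions, not a calibrated model.

```python
def susceptibility_score(scenario_summaries: list[dict]) -> float:
    """Aggregate per-scenario behavioral signals into a 0-100 exposure score.

    Takes summaries shaped like the output of scenario_signals() above.
    Higher means more exposed. Weights are illustrative placeholders.
    """
    if not scenario_summaries:
        return 0.0

    total_risk = 0.0
    for s in scenario_summaries:
        risk = 0.0
        if s["engaged_any_channel"]:
            risk += 0.6          # acted on the lure in at least one channel
        if not s["reported_any_channel"]:
            risk += 0.3          # never reported the scenario
        if not s["verified_before_acting"]:
            risk += 0.1          # skipped out-of-band verification
        total_risk += risk

    return round(100 * total_risk / len(scenario_summaries), 1)
```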

Completion rates and click rates are platform metrics. A susceptibility score is a risk metric. The distinction matters because it changes what you report to the board, what you prioritize in your program, and what you can honestly claim about your actual risk posture.

What this means for your program in 2026

If your current simulation program is email-only and template-based, that doesn't mean it has no value. It means you have an accurate measure of one narrow behavior in one narrow context. You're missing the broader picture.

Here are the questions to ask when reviewing your program:

➡️ Does your simulation include voice, SMS, and messaging channels alongside email? If not, you're not testing the full attack surface your employees actually face.

➡️ Does your scenario content reflect current attack sophistication, or is it built on templates from last year? The gap between simulation realism and actual attacker capability is itself a risk.

➡️ Are you measuring behavioral change over time, or click rates in discrete campaigns? A program designed to reduce risk should show measurable behavioral improvement, not just fluctuating click-through numbers.

➡️ Do you have AI vishing simulation in scope? Voice-based attacks represent one of the fastest-growing vectors and a significant gap in most simulation programs.

These aren't questions that demand an immediate vendor change. They're questions that help identify where your current program has blind spots.

The mismatch is the risk

The core problem isn't that phishing simulations are useless. It's that they were designed for a threat environment that no longer exists at scale. When the gap between simulation sophistication and actual attack sophistication widens, employees encounter attacks they've never been trained to recognize.

That gap is the risk. Closing it requires updating both what you simulate and how you measure.

The human layer has never been more important to defend, and it has never been harder to defend well. The tools that measure it need to reflect that.

Conclusion

Phishing simulations remain a useful tool. But the organizations managing human risk most effectively in 2026 aren't those with the highest completion rates. They're the ones that have updated their simulation approach to match how attacks actually work today.

That means multi-channel. That means behavioral signals, not just click metrics. And it means understanding what you're measuring well enough to explain the gap to the people who need to close it.

We built Zepo's simulation approach around these principles.
