
AI social engineering in 2026: why phishing simulations built on last year's templates are the wrong defense

Targeted social engineering used to require hours of manual reconnaissance. AI removed that ceiling. Personalized, multi-channel attacks now take seconds to build — and most simulation programs still test only email.
Written By:
Natalia Bochan

TL;DR

AI has made social engineering adaptive, personalized, and multi-channel. Most phishing simulation programs were designed for a different threat. This post explains what has changed, what a modern AI social engineering simulation needs to test, and what metrics give a more accurate picture of actual exposure.

Introduction

Most phishing simulation programs run on a library of email templates. A vendor creates scenarios based on attacks that worked months ago, you pick a few, send them on schedule, and measure who clicked. The click rate goes into the report. The report goes to the board. The cycle repeats.

That model worked when social engineering was mostly email-based and attacker resources were scarce. In 2026, it measures the wrong thing.

AI has changed the economics of social engineering. Creating a believable, personalized attack no longer requires a skilled attacker to spend hours on reconnaissance. It requires the right prompt and a few minutes. The result is a different category of threat: adaptive, multi-channel, and calibrated to the individual target in real time.

A training program built on last year's templates isn't wrong because the templates are old. It's wrong because the threat it models no longer exists at scale.

How AI has changed social engineering in 2026

For most of the past decade, social engineering at scale meant phishing: mass-distributed email campaigns using templates designed to catch as many people as possible. Targeted attacks existed but required significant attacker effort. The resource constraint created a natural ceiling.

AI has removed that ceiling.

Modern social engineering attacks use AI to personalize at scale. A system pulls publicly available data from LinkedIn, company websites, and other sources to build a detailed target profile, then generates a message that matches that person's context, communication style, and recent activity. What used to require hours of manual work now takes seconds.

The attack surface has also expanded beyond email. Voice cloning can replicate a colleague's or executive's voice convincingly enough to pass casual scrutiny. Deepfake video is increasingly accessible. SMS, Teams, Slack, and WhatsApp are attack channels alongside email. Sophisticated attackers sequence these: an email establishes a false context, a follow-up call from a cloned voice creates urgency, a Teams message closes the loop.

This multi-channel, multi-step sequencing is how attackers operate in 2026. A simulation program that tests only email clicks isn't testing this threat.

What traditional phishing simulations actually measure

Traditional phishing simulations answer one question: did this employee click a link in a simulated phishing email?

That's a useful data point. But it's a narrow slice of behavior, captured in a narrow context, using scenarios that may not reflect what attackers are currently doing.

The template lag problem

Simulation vendors build scenario libraries based on observed attack patterns. By the time a template is created, reviewed, approved, and deployed, the attack it models may be six to twelve months old. AI-powered attackers aren't operating on that timeline. They generate new variants continuously, adapting to what's working now.

Training employees to recognize yesterday's attack patterns builds a specific, brittle form of recognition. It doesn't build the broader behavioral resilience that a continuously adapting threat requires.

Traditional simulations also tend to measure platform metrics rather than risk. A 72% training completion rate and a 15% click-through rate on simulations are real numbers. But they measure performance within the training environment, not behavior change in actual risk situations. The two are related, but they are not the same thing.

According to the 2024 Verizon Data Breach Investigations Report, 68% of breaches involved the human element (Verizon, 2024, verizon.com/dbir). That figure has not improved substantially despite widespread adoption of phishing simulation programs. That gap deserves attention.

What an AI social engineering simulation needs to test

An AI social engineering simulation designed for the current threat environment differs from template-based phishing in three important ways.

First, realism. Scenarios should reflect how attacks are actually constructed, which increasingly means personalized content rather than generic templates. A simulation that generates scenarios using the target's role, context, and communication patterns tests a different behavioral response than one that sends the same email to the entire department.

Second, multi-vector coverage.

Multi-channel exposure

Real attacks don't arrive through a single channel. An employee might receive a spoofed email, a follow-up SMS, and a voice message from a cloned executive within the same hour. Each touchpoint is designed to reinforce the others. The threat isn't the email or the call in isolation. It's the combination.

A simulation that tests only email gives employees no exposure to this. Multi-vector social engineering simulation places employees inside scenarios that combine channels: email alongside voice, SMS alongside a Teams message, a deepfake call reinforced by a calendar invite. The question it answers isn't "did they click the phishing link?" It's "did they recognize the pattern when the pressure came from multiple directions at once?"

Third, behavioral signals rather than click metrics. A simulation should capture what happens before and after the click, not just whether the click occurred. Did the employee report the suspicious message? How quickly? Did they verify through a secondary channel before acting? These behavioral indicators are more predictive of actual risk than a binary clicked/didn't click outcome.
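
To make this concrete, here is a minimal sketch of what capturing those signals could look like, assuming a simple per-touchpoint record. The structure, field names, and channels are illustrative assumptions, not a description of any particular platform's data model.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

@dataclass
class TouchpointOutcome:
    """What one employee did with one touchpoint of a simulated scenario."""
    channel: str                                 # e.g. "email", "sms", "voice", "teams"
    engaged: bool                                # clicked the link, stayed on the call, replied
    reported: bool                               # flagged the message or call as suspicious
    report_latency: Optional[timedelta] = None   # time from delivery to report, if reported
    verified_out_of_band: bool = False           # checked through a second, trusted channel

def scenario_signals(outcomes: list[TouchpointOutcome]) -> dict:
    """Summarize behavior across all touchpoints of one multi-channel scenario."""
    return {
        "engaged_any_channel": any(o.engaged for o in outcomes),
        "reported_any_channel": any(o.reported for o in outcomes),
        "verified_before_acting": any(o.verified_out_of_band for o in outcomes),
        "fastest_report": min(
            (o.report_latency for o in outcomes if o.report_latency is not None),
            default=None,
        ),
        "channels_tested": sorted({o.channel for o in outcomes}),
    }
```

The unit of measurement here is the scenario, not the individual email, and the outcome is a set of behaviors rather than a single clicked/didn't click flag.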

The metrics that matter

The question a CISO needs to answer isn't "what percentage of our employees clicked last quarter?" It's "how exposed is this organization to social engineering right now?"

Those are different questions, and they require different data.

A social engineering susceptibility score aggregates behavioral signals across multiple channels and scenarios to give a more complete picture of organizational exposure. It accounts for how employees respond to different attack types, how consistent their judgment is across channels, and whether training interventions produce durable behavior change or just temporary awareness.
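
As a rough illustration of the difference, a score like this might be computed as a weighted aggregation of per-scenario behavioral signals, along the lines of the sketch below (building on the scenario summaries sketched earlier). The weights and scaling are placeholder assumptions, not a calibrated model.

```python
def susceptibility_score(scenario_summaries: list[dict]) -> float:
    """Aggregate per-scenario behavioral signals into a 0-100 exposure score.

    Takes summaries shaped like the output of scenario_signals() above.
    Higher means more exposed. Weights are illustrative placeholders.
    """
    if not scenario_summaries:
        return 0.0

    total_risk = 0.0
    for s in scenario_summaries:
        risk = 0.0
        if s["engaged_any_channel"]:
            risk += 0.6          # acted on the lure in at least one channel
        if not s["reported_any_channel"]:
            risk += 0.3          # never reported the scenario
        if not s["verified_before_acting"]:
            risk += 0.1          # skipped out-of-band verification
        total_risk += risk

    return round(100 * total_risk / len(scenario_summaries), 1)
```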

Completion rates and click rates are platform metrics. A susceptibility score is a risk metric. The distinction matters because it changes what you report to the board, what you prioritize in your program, and what you can honestly claim about your actual risk posture.

What this means for your program in 2026

If your current simulation program is email-only and template-based, that doesn't mean it has no value. It means you have an accurate measure of one narrow behavior in one narrow context. You're missing the broader picture.

Here are the questions to ask when reviewing your program:

➡️ Does your simulation include voice, SMS, and messaging channels alongside email? If not, you're not testing the full attack surface your employees actually face.

➡️ Does your scenario content reflect current attack sophistication, or is it built on templates from last year? The gap between simulation realism and actual attacker capability is itself a risk.

➡️ Are you measuring behavioral change over time, or click rates in discrete campaigns? A program designed to reduce risk should show measurable behavioral improvement, not just fluctuating click-through numbers.

➡️ Do you have AI vishing simulation in scope? Voice-based attacks represent one of the fastest-growing vectors and a significant gap in most simulation programs.

These aren't questions that demand an immediate vendor change. They're questions that help identify where your current program has blind spots.

The mismatch is the risk

The core problem isn't that phishing simulations are useless. It's that they were designed for a threat environment that no longer exists at scale. When the gap between simulation sophistication and actual attack sophistication widens, employees encounter attacks they've never been trained to recognize.

That gap is the risk. Closing it requires updating both what you simulate and how you measure.

The human layer has never been more important to defend, and it has never been harder to defend well. The tools that measure it need to reflect that.

Conclusion

Phishing simulations remain a useful tool. But the organizations managing human risk most effectively in 2026 aren't those with the highest completion rates. They're the ones that have updated their simulation approach to match how attacks actually work today.

That means multi-channel. That means behavioral signals, not just click metrics. And it means understanding what you're measuring well enough to explain the gap to the people who need to close it.

We built Zepo's simulation approach around these principles.
