William Jones·May 5, 2026·6 min read

Synthetic Users vs. Digital Twins: Why Product Teams Don't Need Clones

digital twins, synthetic users, product research, ethics

The digital twin pitch sounds compelling: take a real person, build an AI replica of them, and interview the replica instead. Perfect fidelity. No scheduling conflicts. An exact copy of your customer, available on demand.

It's also the wrong goal for product research. And pursuing it creates problems that are harder to solve than the ones it claims to fix.

What digital twins actually require

Building a faithful digital twin of a specific person requires an enormous amount of personal data. You need their communication style, their decision-making patterns, their preferences, their biases, their history. You need enough data to distinguish this person from every other person with similar demographics.

That means collecting, storing, and processing deeply personal information at a granularity that makes most privacy frameworks nervous. GDPR's right to erasure (Article 17, the "right to be forgotten") becomes an engineering nightmare when a person's identity is distributed across embedding vectors in a high-dimensional space.

And even if you solve the data collection and privacy problems, the result is brittle. People change. They have bad days. They change jobs, move cities, discover new preferences. A digital twin is a snapshot that starts degrading the moment you build it.

The entire enterprise is expensive, ethically fraught, and fragile. But the deeper problem isn't practical — it's conceptual.

Research needs patterns, not individuals

When a product team interviews users, they're not trying to understand one person. They're trying to understand a type of person. What do price-sensitive early adopters think about our freemium model? How do risk-averse enterprise buyers evaluate our security posture? What frustrates power users about our onboarding?

These are questions about archetypes, not individuals. The insight from a user interview isn't "Jessica from Acme Corp thinks the pricing is too high." It's "users with Jessica's profile — detail-oriented, budget-conscious, comparison shoppers — tend to stall at the pricing page."

Cooper's personas, Revella's buyer personas, Rogers' adopter categories from Diffusion of Innovations, and Jobs-to-be-Done frameworks all work with archetypes. None of them requires, or even benefits from, an exact replica of a specific person.

This is the fundamental mismatch. Digital twins solve for individual fidelity. Product research needs archetype fidelity.

The OCEAN model as archetype engine

Synthicant builds personas on the Big Five personality model (OCEAN: Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism). This is a deliberate choice.

The Big Five doesn't describe individuals. It describes personality space — the five dimensions along which all human personality varies. When you set a persona's Conscientiousness to 4.5 and Agreeableness to 1.5, you're not cloning a person. You're defining a region of personality space: detail-oriented people who push back on things they disagree with.
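As a rough illustration of "a region of personality space", here is a minimal sketch of a persona definition. The class name, the 1.0 to 5.0 scale, the neutral 3.0 values for the three traits the example above leaves unspecified, and the tolerance-based `matches` check are all assumptions made for illustration; none of them is taken from Synthicant's actual data model.

```python
from dataclasses import dataclass, fields

# Assumption: traits are scored from 1.0 to 5.0. The article gives example
# values (4.5 and 1.5) but does not state the scale.
SCALE_MIN, SCALE_MAX = 1.0, 5.0


@dataclass(frozen=True)
class OceanProfile:
    """Hypothetical persona definition: one score per Big Five dimension."""

    openness: float
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float

    def __post_init__(self):
        # Reject scores outside the assumed scale.
        for f in fields(self):
            value = getattr(self, f.name)
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError(f"{f.name}={value} is outside {SCALE_MIN}-{SCALE_MAX}")

    def matches(self, other: "OceanProfile", tolerance: float = 0.5) -> bool:
        """Return True if `other` lies within `tolerance` of this archetype on every trait."""
        return all(
            abs(getattr(self, f.name) - getattr(other, f.name)) <= tolerance
            for f in fields(self)
        )


# High Conscientiousness, low Agreeableness; the other three values are
# placeholders chosen for this sketch.
skeptic = OceanProfile(
    openness=3.0,
    conscientiousness=4.5,
    extraversion=3.0,
    agreeableness=1.5,
    neuroticism=3.0,
)
```

The point of `matches` is that many different score combinations count as the same archetype: the persona stands for a neighborhood of profiles, not a single individual's measurements.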

This persona represents a type, and that type is grounded in decades of validated psychological research. Costa and McCrae's NEO-PI-R inventory has been used in thousands of studies linking Big Five scores to real-world outcomes: job performance, purchasing behavior, risk tolerance, communication style.

A digital twin gives you one data point with uncertain accuracy. An OCEAN-based persona gives you a behavioral archetype with documented predictive validity.

The ethical line is clear

There's a straightforward ethical distinction between these two approaches.

An OCEAN-based persona is a fictional character informed by personality science. It doesn't represent a real person. It doesn't contain anyone's personal data (and when real data is uploaded, PII is stripped before processing). No one's identity is captured, stored, or simulated.

A digital twin is, by definition, an attempt to replicate a specific person's behavior without that person being present. Even with consent, this raises questions that personality-based archetypes simply don't trigger. Can someone consent to being replicated in ways they can't predict? What happens when the digital twin says something the real person wouldn't? Who is liable?

Goffman's work on impression management showed that people actively control how they present themselves in different contexts. A digital twin strips away that agency. It lets someone else put words in your mouth.

Product research doesn't need to go there. The questions that matter for product decisions are archetype questions, and they can be answered with archetype tools.

Six frameworks, all archetype-based

Synthicant includes six research frameworks as built-in templates, and every one of them operates at the archetype level:

Cooper Personas define design targets based on goals, behaviors, and pain points — not individual identities.

Revella Buyer Personas focus on the buying decision: what triggers it, what criteria matter, what objections arise. These patterns repeat across buyers; they're not unique to one person.

Rogers' Diffusion of Innovations segments users by adoption speed: innovators, early adopters, early majority, late majority, laggards. These are population-level archetypes.

Jobs-to-be-Done frames users by the outcome they're trying to achieve, deliberately abstracting away individual identity.

Gartner Personas model enterprise decision-makers by role, authority level, and evaluation criteria — archetypes of organizational behavior.

Lifecycle Personas represent users at different stages of the customer journey: prospects, new users, power users, at-risk users.

Every framework asks: what type of person are we designing for? None of them ask: which specific person should we clone?

When individual data helps (without cloning)

There's a middle ground between pure archetype and digital twin, and it's where Synthicant's dynamic personas live.

You can upload real customer data — support tickets, interview transcripts, survey responses — to ground a persona in evidence. The system analyzes this data to extract personality traits, speaking patterns, beliefs, and concerns. But it aggregates across documents. It builds a representative archetype from the data, not a replica of any single person in it.
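One way to picture the aggregation step, as a sketch only: assume some upstream analysis (not shown here) has already produced a Big Five estimate for each uploaded document. The function name, the dictionary layout, and the choice of a simple mean are illustrative assumptions, not a description of how Synthicant actually combines documents.

```python
from statistics import mean

TRAITS = ("openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism")


def aggregate_profiles(per_document: list[dict[str, float]]) -> dict[str, float]:
    """Average per-document trait estimates into one archetype-level profile.

    Each input dict is a hypothetical trait estimate for a single document.
    The output describes the group as a whole, not any one author.
    """
    if not per_document:
        raise ValueError("need at least one document-level estimate")
    return {
        trait: round(mean(doc[trait] for doc in per_document), 2)
        for trait in TRAITS
    }
```

Because only the averaged profile is kept, no single ticket or transcript determines the persona, which is what separates a representative archetype from a replica.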

And every piece of uploaded text passes through PII redaction before processing. Names, emails, phone numbers — all stripped. The persona is informed by your customer base without being a copy of any customer in it.
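To make the redaction step concrete, here is a deliberately simple sketch that masks emails and phone numbers with regular expressions before any text is analyzed. The patterns and the `redact` function are assumptions for illustration. A production redactor would need more than this, since names in particular cannot be caught reliably with regular expressions and usually call for a named-entity recognition step.

```python
import re

# Simplified patterns, assumed for illustration. The phone pattern will also
# catch some other long digit runs, erring on the side of over-redaction.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace emails and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


ticket = "Contact jane.doe@example.com or call +1 (555) 010-9999 about the invoice."
clean = redact(ticket)
```

Running redaction first means everything downstream, including trait extraction and aggregation, only ever sees the masked text.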

This gives you the best of both approaches: the specificity that comes from real data, with the generalizability that comes from archetype-level abstraction.

Practical implications

If someone pitches you a digital twin product for user research, ask two questions:

First, what data do they need to build it, and where does that data go? If the answer involves collecting and storing individual behavioral profiles, you're taking on privacy risk for capability you don't need.

Second, what research question requires individual-level fidelity that archetype-level fidelity can't answer? In most product research contexts, the answer is none.

Build personas that represent the types of people who use your product. Ground them in real data. Give them documented personality traits. Then interview them the way you'd interview a real user — with open questions, follow-ups, and healthy skepticism about the answers.

That's synthetic user research. It doesn't require cloning anyone.

References

Costa, P.T. & McCrae, R.R. (1992). "Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI) Professional Manual." Psychological Assessment Resources. — The foundational Big Five personality instrument, establishing the trait dimensions that Synthicant uses to define personality archetypes.

Goffman, E. (1959). "The Presentation of Self in Everyday Life." Anchor Books. — Seminal work on impression management, demonstrating that people actively control self-presentation across contexts — a form of agency that digital twins strip away.

Jiang, H., Zhang, X., Cao, X., et al. (2024). "PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits." Findings of the Association for Computational Linguistics: NAACL 2024. — Showed that Big Five-assigned personas produce consistent, identifiable behavior in LLMs, supporting the archetype-based approach to synthetic persona construction.

John, O.P. & Srivastava, S. (1999). "The Big Five Trait Taxonomy: History, Measurement, and Theoretical Perspectives." Handbook of Personality: Theory and Research. — The most-cited overview of Big Five science, establishing that personality traits describe dimensions of variation, not individual identities.

Park, J.S., O'Brien, J.C., Cai, C.J., et al. (2023). "Generative Agents: Interactive Simulacra of Human Behavior." Proceedings of ACM UIST 2023. — Demonstrated that personality-driven AI agents sustain coherent, archetype-consistent behavior over extended interactions.