Synthetic Data, Human Judgment, and the Future of Market Research

28 May 2026 | 12 min read | Written by Kelvin Claveria

A conversation with Dale Evernden on synthetic data and personas in the market research space

Synthetic data has quickly become one of the hottest market research trends — but also one of the most debated. For some, it represents a powerful new way to accelerate learning, simulate customer reactions, and reduce the cost of early-stage exploration. For others, it raises uncomfortable questions about evidence, representation, and whether the industry risks replacing real people with polished approximations of them.

To explore where synthetic data fits today, and where it may be headed, we sat down with Dale Evernden, EVP Design and Innovation at Rival Group, to discuss synthetic respondents, AI-generated personas, validation, and the evolving role of human insight in an AI-accelerated world.

Dale’s perspective is optimistic but measured. He sees synthetic data not as a replacement for research, but as a new layer in the research system — one that can help teams move faster, ask better questions, and make better use of real human feedback.

“The future is not synthetic versus human,” he says. “It is synthetic plus human, grounded in a living context layer.”

Check out our full Q&A.


Let’s start with the basics. When people say “synthetic data,” what do they actually mean?

Dale Evernden: Part of the challenge is that we use one phrase — synthetic data — to describe a few different things. At one end, synthetic data can mean statistically generated or augmented datasets. That might involve filling gaps, protecting privacy, modeling scenarios, or creating representative-looking data where real data is limited or sensitive.

At the other end, especially in market research, people are often talking about synthetic respondents or synthetic personas — AI-generated agents designed to simulate how certain types of people might respond to a product idea, message, concept, or experience.

So in the context of market research, I would define synthetic data as artificially generated research input that approximates how real people, segments, or markets might respond under certain conditions.

The word “approximates” is important.

Synthetic data can simulate patterns, reactions, and possibilities. It can help us explore. It can help us pressure-test. But it is not the same thing as lived experience. It is not emotional truth. It is not a direct substitute for real customer contact.

For me, synthetics are best understood as a simulation layer, not a replacement layer.


When you think about the future of synthetic data, do you see it as a danger, a passing trend, a specific-use-case tool, or a true game changer?


Dale: If I had to choose one, I’d say it is a game changer, but for specific parts of the workflow.

I don’t think synthetic data replaces market research. I don’t think it replaces human respondents. But I do think it changes the shape, speed, and economics of the research process.

The most immediate value is upstream. Synthetic approaches can help insight teams explore ideas, pressure-test concepts, surface edge cases, improve stimuli, test language, refine questionnaires, and decide what deserves real human validation.

That is where I think the value is: not in avoiding research, but in getting to better research faster.

The least interesting version of synthetic data is simply “let’s do the same thing cheaper.” There may be value in that, but I don’t think that is the frontier. The more interesting version is that synthetic data helps us build more adaptive learning systems — systems where AI-generated exploration, synthetic simulation, and real human feedback work together.

So yes, it is a game changer. But not because it eliminates the need for human insight. Synthetic data is a game changer because it raises the value of good human insight.


Where do you think the market research industry actually is today? Is synthetic data still experimental, or is it becoming standard practice?

Dale: I think we are somewhere between experimentation and emerging practice. There is a lot of enthusiasm right now. There are impressive demos. Some teams are doing thoughtful pilots. There are also a lot of claims moving faster than the standards around them.

That is not unusual. Every major methodological shift goes through a phase where the technology moves faster than the language, norms, buyer confidence, and validation practices.

Right now, I think many organizations are asking the right questions: Where can this be useful? What types of decisions can it support? What should it not be used for? How do we validate it? How transparent do we need to be with clients and stakeholders?

Those are healthy questions.

The risk would be jumping too quickly from “this is useful” to “this is equivalent to human research.” Those are very different claims.

My view is that synthetic data is already useful in specific parts of the workflow, but the industry is still developing the confidence, governance, and shared standards required for broader adoption.

👉 RELATED ARTICLE: Synthetic Personas: Insights From Tropicana, Newell Brands and Warner Bros. Discovery


That brings us to quality. How do we know if synthetic data is any good?

Dale: We need to stop asking “Is the synthetic data good?” in the abstract.

The better question is: Good enough for what decision?

That is the key. If I’m using synthetic respondents to generate possible objections to a product concept, I don’t need the same evidentiary standard as I would if I were using synthetic data to estimate market demand, make a launch decision, or shape a high-investment strategy.

Validation has to be use-case specific. For me, there are a few layers of validation. The first is face validity. Does the output make sense to people who understand the customer, the category, and the market?

The second is comparison to known human data. If we have prior studies, behavioral data, community feedback, support tickets, reviews, or other real-world signals, do the synthetic outputs resemble known patterns?

The third is usefulness. Did the synthetic layer actually improve the next human research cycle? Did it help us ask better questions? Did it help us identify weaker concepts earlier? Did it surface tensions we might otherwise have missed?

And then there is transparency. Can the research supplier or insight team explain what the synthetic data was grounded in, what it is appropriate for, what it is not appropriate for, and where uncertainty remains?

My biggest concern is plausibility. AI is very good at producing outputs that sound reasonable. Synthetic respondents can give you polished, confident answers. But fluency is not evidence.

That is why validation matters so much.
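
The second validation layer described above, comparison to known human data, can be made concrete with a small check. The sketch below is an illustration only, not Dale's or Rival's method. It assumes you have answers to the same closed-ended question from real respondents and from synthetic ones, and it uses total variation distance, a standard statistical measure chosen here for simplicity, to summarize how far apart the two answer distributions are. The answer options, counts, and function names are all hypothetical.

```python
from collections import Counter


def answer_shares(responses, options):
    """Return the share of responses falling on each answer option."""
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    return {opt: counts[opt] / len(responses) for opt in options}


def total_variation_distance(human, synthetic, options):
    """Half the sum of absolute share differences across the options.

    0.0 means the two distributions are identical; 1.0 means they do not overlap.
    """
    h = answer_shares(human, options)
    s = answer_shares(synthetic, options)
    return 0.5 * sum(abs(h[opt] - s[opt]) for opt in options)


# Hypothetical purchase-intent answers, invented for illustration.
OPTIONS = ["would buy", "unsure", "would not buy"]
human = ["would buy"] * 30 + ["unsure"] * 45 + ["would not buy"] * 25
synthetic = ["would buy"] * 55 + ["unsure"] * 30 + ["would not buy"] * 15

distance = total_variation_distance(human, synthetic, OPTIONS)
print(f"Total variation distance: {distance:.2f}")
```

The measure only quantifies the gap between the two sets of answers. How small that gap needs to be is a judgment call that depends on the decision the synthetic output is meant to support.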


Are current validation methods sufficient?

Dale: In some places, yes. In many places, not yet. There are certainly teams doing serious work around validation. But across the industry, a lot of synthetic output is still being evaluated by whether it feels plausible. That is not enough.

Plausibility is a dangerous standard because the entire technology is optimized to produce plausible output.

That does not mean we should avoid synthetic data. It means we should be disciplined about how we use it.

Again, it comes back to decision risk. If the synthetic layer is being used for ideation or early pressure-testing, the validation burden is lower. If it is being used to support a major business decision, the validation burden should be much higher.

I think this is where insights teams have an important leadership role to play. They understand evidence. They understand uncertainty. They understand the difference between signal and story. As synthetic tools become more common, that judgment becomes more important, not less.


Where would you draw the line on appropriate use cases?

Dale: I draw the line based on decision risk and human consequence. I’m comfortable using synthetic data for exploration, pressure-testing, stimulus development, edge-case discovery, questionnaire refinement, and early-stage prioritization.

Those are places where synthetic data can help teams think better. It can expand the range of possibilities. It can help identify what needs to be tested with real people. It can make the next research cycle sharper.

Where I get much more cautious is when synthetic data is being used as the sole basis for claims about real people — especially in sensitive areas.

That might include health, financial stress, vulnerability, cultural identity, emotional well-being, accessibility, or decisions that materially affect customers, employees, patients, or citizens.

My rule of thumb is simple: Use synthetics to expand thinking. Do not use synthetics to replace accountability.

If a decision has meaningful consequences, real human evidence needs to be in the loop.


Who gets to decide where synthetic data should and shouldn’t be used?

Dale: I think it has to be a shared governance model. Researchers should play a central role because they understand methodology, evidence quality, bias, and interpretation. Product and business leaders also need to be clear about the decision being made and the risk attached to that decision. Depending on the context, legal, privacy, data science, and ethics teams may need to be involved as well.

But I would argue that insights teams should be one of the most important voices in that governance conversation.

As AI becomes more embedded in the business, the role of insights is not just to run studies. It is to help the organization understand what kind of evidence it is using, where that evidence came from, what it can support, and where it is being overextended.

That is a much bigger strategic role.


What would a buyer need to know before trusting a deliverable that incorporates synthetic data?

Dale: I would want three things: disclosure, grounding, and validation.

First, disclosure. Tell me where synthetics were used and where real humans were used. Don’t blur that line.

Second, grounding. Tell me what the synthetic layer was grounded in. Was it generic model knowledge? Prior research? Community data? Behavioral data? Category expertise? A specific customer segment? The answer matters.

Third, validation. Show me why this method is appropriate for the decision being made. If synthetic data helped generate hypotheses, that is one thing. If it helped shortlist concepts before human validation, that is another. If it is being presented as the voice of the customer, that requires a much higher standard.

Trust comes from clarity.

Research buyers do not need every technical detail, but they do need to understand the role synthetic data played in the workflow and how much weight they should put on it.


What unresolved methodological questions concern you most when it comes to the use of synthetic data in market research?

Dale: One of the big questions for me is whether synthetic data helps us get closer to reality or simply gives us a more polished version of our assumptions.

That is especially important with personas. Traditional personas have always had a risk of becoming oversimplified artifacts. They can flatten complexity. They can become a convenient story about a customer rather than a living representation of real people.

Synthetic personas can intensify that risk because they can talk back. They feel alive. They can answer questions. They can express preferences. They can produce emotionally convincing responses. That makes them useful, but it also makes them potentially dangerous.

A synthetic persona can make an assumption feel real. So the methodological question is not only “Is this accurate?” It is also “What kind of evidence are we actually looking at?” Are we treating synthetic output as a hypothesis? A simulation? A proxy? A directional input? Or evidence?

Those are different things, and we need to be honest about the distinction.


Rival Technologies has been pushing at the edges of conversational and qualitative research. Where do you see synthetic data unlocking genuinely new types of research?

Dale: This is the part I’m most excited about. The least interesting version of synthetic data is using it to do the same research faster or cheaper. Again, that has value, but it is not the real frontier.

The more interesting opportunity is building adaptive research systems. Imagine a team has an always-on insight community, a rich knowledge base, and a synthetic layer grounded in real human context. That team can generate product ideas, simulate possible reactions, identify objections, refine stimuli, and then go back to the community for real validation. The human feedback then refreshes the knowledge base, which makes the next synthetic cycle better.

That is not just a faster survey. That is a different operating model.

I think synthetic data unlocks at least three new possibilities.

  • The first is continuous pre-learning. Before you spend real respondent time, you can simulate possible reactions and improve the quality of what you take to humans.

  • The second is dynamic personas grounded in real feedback. Not static personas in a PDF, but living representations of customer segments that can be queried, challenged, and updated as new human data comes in.

  • The third is scenario-based research. You can explore a much wider range of futures, product variants, messages, journeys, and tradeoffs before deciding where to invest real research depth.

But the key is grounding. Without real human context, synthetic systems drift toward generic averages. With a living community feeding them, they become much more useful.

So for me, the unlock is not synthetic respondents in isolation. The unlock is the combination of AI generation, synthetic simulation, human validation, and a continuously refreshed knowledge base.

That is the research flywheel.


You’ve used the phrase “human context layer.” What does that mean in this conversation?

Dale: The human context layer is the living source of customer understanding that sits between AI systems and business decisions.

As AI gives teams the ability to produce more ideas, more concepts, more variants, and more scenarios, the bottleneck moves. The constraint is no longer just production capacity. It is decision quality.

The question becomes: which ideas are worth believing in?

That is a human question. It requires understanding what people need, what they feel, what they are trying to accomplish, what frustrates them, what motivates them, and what is changing in their lives.

Insight communities are uniquely valuable here because they provide recurring, profiled, longitudinal, fresh human feedback. They are not just a place to ask questions. They become a living system for grounding decisions.

That is why I think communities become more strategically important in the AI era, not less.

AI can help us generate possibilities. Synthetic data can help us simulate reactions. But real human context is what keeps the system connected to reality.


How should insights teams think about their role as synthetic data becomes more common?

Dale: I think the role of insights teams becomes more strategic.

The old perception was that insights acted as a quality gate. The business had an idea, research tested it, and that process sometimes felt like it slowed things down.

In the AI era, the role shifts from gatekeeper to accelerator. Insights teams can help organizations move faster by reducing uncertainty earlier. They can help teams avoid investing too much in weak ideas. They can help AI systems stay connected to real customer context. They can help the business understand not just what people said, but what it means and what to do next.

I see two roles becoming especially important.

  • The first is stewardship. Insights teams will increasingly own the quality, freshness, and interpretation of the human context layer that AI and business teams rely on.

  • The second is translation. AI can summarize patterns, but insights teams help organizations understand meaning. They help the business feel the customer clearly enough to act.

That human translation layer is not going away.


What is the biggest risk you see with synthetic data in market research?

Dale: The biggest risk is not that synthetic data is fake. The biggest risk is that it can be plausibly wrong. It can give teams polished, confident answers that feel like customer truth but are really just a simulation of assumptions, training data, or incomplete context.

That matters because AI changes the speed of decision-making. If teams are moving faster, a weak signal can travel further before anyone challenges it.

So the danger is not just bad data. It is faster overconfidence.

That is why I keep coming back to human grounding. The future should not be about removing people from the research process. It should be about using AI to make every human interaction more valuable.

Synthetic data should help us get to better questions, better conversations, and better decisions. It should not become a shortcut around the people we are trying to understand.


Looking ahead to 2027, what do you think will change most about how the industry uses synthetic data?

Dale: I think the conversation will become much more practical.

Right now, we are still debating whether synthetic data is real research, fake research, dangerous, transformative, or overhyped. That binary debate will fade.

By 2027, the better organizations will be talking about use cases, validation standards, workflow integration, governance, and decision risk.

Synthetic tools will also be much more embedded into everyday research workflows. They will help write screeners, test surveys, generate hypotheses, model possible segment reactions, summarize prior learning, and pressure-test concepts before fieldwork.

A lot of that will feel normal. What will remain the same is the need for human judgment.

Research has never only been about collecting responses. It is about understanding people, interpreting context, making tradeoffs, and helping organizations act wisely.

Synthetic data can help with parts of that system, but it does not remove the need for accountability.

So my prediction is: The tools will become much more synthetic. The responsibility will remain deeply human.


Final thought: synthetic plus human

For Dale, the most productive way to think about synthetic data is not as a replacement for traditional research, but as a new capability inside a broader learning system.

Used well, synthetic data can help teams explore more possibilities, sharpen concepts earlier, and make better use of real respondent time. Used poorly, it can create a false sense of confidence and distance organizations from the very people they need to understand.

The opportunity is not to choose between synthetic data and the human insight that comes from tools like insight communities. It is to connect them.

As AI accelerates the pace of creation, companies will need stronger ways to stay grounded in reality. That is where insight communities, real customer feedback, and human judgment become more valuable.

Or, as Dale puts it: “Synthetic data should accelerate the path to human validation, not bypass it.”

Kelvin Claveria is Senior Director of Demand Generation and Content Marketing at Rival Technologies and Reach3 Insights
