61,597 real people · 1,472 data points each

Survey data grounded in
the truth.

SynthSurvey gives you real survey distributions fast — without making things up. Every answer is labelled: real data, grounded estimate, or synthetic. You always know what you’re looking at.

Request Access · See how it works

app.synthsurvey.com

Audience Response

+ New question

• From real data

Do you trust AI-generated content in news?

Strongly disagree

24%

Somewhat disagree

15%

Neutral

35%

Somewhat agree

18%

Strongly agree

• Grounded estimate

How often do you use TikTok per day?

Never

44%

< 30 minutes

29%

30min – 2hrs

18%

The fused dataset

The ground truth at the centre of everything

At the heart of SynthSurvey is a fused dataset of 61,597 real people — drawn from governments, major universities, and our own proprietary panel. We know 1,472 things about each of them. We weight newer data more heavily and decay older readings over time, so the answer reflects what is true now, not what was true a decade ago.
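One way to picture the recency weighting described above is an exponential decay, where a reading loses half its weight after a fixed period. This is a minimal sketch, not SynthSurvey's actual method: the function names, the `half_life_years` parameter, and its default of three years are all assumptions made for illustration.

```python
from datetime import date


def decay_weight(observed: date, today: date, half_life_years: float = 3.0) -> float:
    """Weight that halves every `half_life_years` (hypothetical parameter)."""
    age_years = max((today - observed).days / 365.25, 0.0)
    return 0.5 ** (age_years / half_life_years)


def weighted_share(readings, today: date, half_life_years: float = 3.0) -> float:
    """Blend several readings of one answer option into a single share.

    `readings` is a list of (observed_date, share) pairs; newer readings
    count for more than older ones.
    """
    weights = [decay_weight(d, today, half_life_years) for d, _ in readings]
    total = sum(weights)
    if total == 0:
        raise ValueError("no usable readings")
    return sum(w * share for w, (_, share) in zip(weights, readings)) / total
```

With a reading of 0.40 from this year and 0.20 from a decade ago, the blended share lands much closer to 0.40 than to the simple average of 0.30.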

National Probability Samples · Validated Political Instruments · Government Trackers · University Surveys · Social Media Signals · Proprietary Panel

61,597

Real respondents in the fused dataset — not simulated, not guessed.

1,472

Data points known about each person across surveys, demographics, and social signals.

5-tier

Resolution cascade — every question follows a disciplined path to the most grounded answer possible.

The resolution cascade

Five tiers to the truth

When you ask SynthSurvey a question, it follows a disciplined path — not just a prompt fired at a machine. Every result is labelled so you always know exactly what you’re looking at.

Tier 1

Exact Match

The system looks for the same question in our index of real surveys. If real people have already answered it, we return their distribution directly — no guessing, no machine inference. Just the data.

From real data

Tier 2

Close Match

No exact match found — but a question asking the same thing in different words exists. We return the real distribution and flag that the wording was not identical. Still real people. Still real data.

From real data

Tier 3

Validated Lookup

No survey question matches — so we pull from government data and independent trackers that measure facts directly. If you ask who uses TikTok, the data tells you the facts.

From real data

Tier 4

Grounded Estimate

A close survey match exists, but the sample for your specific audience is too small to be reliable. We broaden to all respondents in the same geography, then adjust: we check how your audience differs from the general population and correct the distribution to reflect their specific tendencies.

Grounded estimate

Tier 5

Fallback

The last resort. The system uses the attitudinal fingerprint of the audience and social media engagement signals to generate an answer — constrained by everything we know about that group. Clearly labelled so you always know.

Synthetic estimate
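The five tiers above can be read as an ordered chain of lookups, where the first tier that returns an answer wins and its label travels with the result. The sketch below shows only that control flow. The `Result` class, the `resolve` function, and the idea of passing each tier in as a callable are assumptions for illustration, not SynthSurvey's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Distribution = dict[str, float]
Resolver = Callable[[str, str], Optional[Distribution]]

# Label shown to the user for each tier, as named on this page.
TIER_LABELS = {
    1: "From real data",      # Exact Match
    2: "From real data",      # Close Match
    3: "From real data",      # Validated Lookup
    4: "Grounded estimate",   # Grounded Estimate
    5: "Synthetic estimate",  # Fallback
}


@dataclass
class Result:
    tier: int
    label: str
    distribution: Distribution


def resolve(question: str, audience: str, resolvers: list[Resolver]) -> Result:
    """Try each tier in order; return the first answer with its tier label."""
    for tier, resolver in enumerate(resolvers, start=1):
        distribution = resolver(question, audience)
        if distribution is not None:
            return Result(tier, TIER_LABELS[tier], distribution)
    raise LookupError("no tier produced an answer")
```

Each resolver returns `None` when it has nothing to offer, so a question only reaches the fallback tier after every more grounded option has been tried.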

Attitudinal Fingerprint

Young women · London · Gaming audience

Tech optimism

+0.78

Trust in science

+0.82

Media scepticism

+0.61

Brand loyalty

+0.44

Social identity

+0.71

sha256: a3f9c2e1b7d4…8e12f0c3a91b

Attitudinal fingerprinting

Voices that are not just noise

Every audience has a fingerprint — a unique hash describing what makes them different from the average person. We use it to ensure that when we generate synthetic voices, they are the right voices: shifted to the right positions in the data, not random noise.

If you are looking at young women in London who play video games, their fingerprint will show they are more optimistic about technology and more likely to trust scientists. The machine is constrained by what the data knows to be true of that group.
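As a rough illustration of the idea, a fingerprint can be sketched as the gap between an audience's average score and the population's average score on each attitude, plus a hash of those gaps for identification. Everything here is assumed for the example: the `fingerprint` function, the row format, the two-decimal rounding, and hashing a canonical JSON string with SHA-256.

```python
import hashlib
import json
from statistics import mean


def fingerprint(audience_rows, population_rows, dimensions):
    """Return (shifts, digest) for an audience.

    `shifts` maps each dimension to audience mean minus population mean.
    `digest` is a SHA-256 hex string of the shifts in a canonical form,
    so the same shifts always produce the same hash.
    """
    shifts = {}
    for dim in dimensions:
        audience_mean = mean(row[dim] for row in audience_rows)
        population_mean = mean(row[dim] for row in population_rows)
        shifts[dim] = round(audience_mean - population_mean, 2)
    canonical = json.dumps(shifts, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return shifts, digest
```

A positive shift means the audience scores higher than the average person on that attitude. A synthetic answer can then be nudged in that direction rather than drawn at random.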

We show our errors

Most people hide their accuracy numbers.
We put ours on the page.

Across a range of surveys, including our Cint panel, we apply an 80/20 hold-out split. 80% of responses are fused into the dataset. The remaining 20% go into our validation index and calibration model, where they are used to calculate Mean Absolute Error against our synthetic outputs. Our Calibration Index gets smarter every time a new hold-out set is checked against real data.
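The split and the error measure can be sketched in a few lines. This is an assumed, simplified version: the function names, the fixed random seed, and the choice to average the absolute gap across answer options are illustrative, not a description of the production pipeline.

```python
import random


def hold_out_split(responses, holdout_fraction=0.2, seed=0):
    """Shuffle responses and split them into (fused, held_out) lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(responses)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - holdout_fraction)))
    return shuffled[:cut], shuffled[cut:]


def mean_absolute_error(predicted, actual):
    """Average absolute gap between two answer distributions.

    Both arguments map answer option -> share; an option missing from
    one side is treated as a share of 0.
    """
    options = set(predicted) | set(actual)
    if not options:
        raise ValueError("both distributions are empty")
    gaps = (abs(predicted.get(o, 0.0) - actual.get(o, 0.0)) for o in options)
    return sum(gaps) / len(options)
```

For example, a synthetic output of 50% / 50% checked against a held-out result of 40% / 60% gives an MAE of 0.10, or ten percentage points per option.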

80/20

Hold-out split across multiple surveys: 80% fused into the dataset, 20% held out for validation

MAE

Mean Absolute Error calculated openly from hold-out data — not hidden in a footnote

↑

Calibration Index improves with every new real-data comparison — a moat that grows deeper every day

The team

Built by people who have spent careers working with real data

Three co-founders with complementary expertise across consumer intelligence, communications measurement, and machine learning.

Jessica Thomas

Co-Founder

Consumer Intelligence · Social Listening · AI Strategy

“Listening to the voice of the customer at scale — and translating that into actionable intelligence.”

Award-winning insights strategist with over a decade turning social, behavioural and consumer data into commercial strategy. Founded Ten Bear Group and AI Strategy Hub. Previously Brandwatch. Co-founder of OceanDeep AI.

Social Intelligence Insider 50 — The SI Lab, 2024

50 Digital Women to Watch — Digital Women, 2022

Best Female-Led Startup — Stevie® Awards, 2019

Rayna Grudova-de Lange

Co-Founder

Comms Measurement · Research Design · Media Intelligence

“Passionate about innovation in measurement and evaluation — constantly looking for new ways to approach data problems.”

Globally recognised authority in communications measurement and media intelligence. CEO of InsightHQ. Elected Board Director of AMEC. Named contributor to the AMEC Integrated Evaluation Framework — the definitive global standard for communications measurement.

Elected Board Director, AMEC (2022–present)

Chair, AMEC Education Commission

Guest Lecturer, multiple universities

Dr. Paul Siegel

Co-Founder

ML & AI · NLP · Network Science

“The majority of the work — 60% — is finding the right question. Then 40% is the analysis.”

Mathematician turned ML engineer. Former Columbia mathematics professor. Eight years as Principal Data Scientist at Brandwatch. Now Senior ML Engineer at Turing Labs (Y Combinator), working with Kraft and Unilever to simulate product outcomes using machine learning.

PhD Mathematics — Penn State, 2012

Assistant Professor, Columbia University (2012–2015)

Senior ML Engineer, Turing Labs (YC-backed)

You do not have to choose between fast and right.
