61,597 real people · 1,472 data points each

Survey data grounded in the truth.

SynthSurvey gives you real survey distributions fast — without making things up. Every answer is labelled: real data, grounded estimate, or synthetic. You always know what you’re looking at.

app.synthsurvey.com
Audience Response
• From real data
Do you trust AI-generated content in news?
Strongly disagree: 24%
Somewhat disagree: 15%
Neutral: 35%
Somewhat agree: 18%
Strongly agree: 8%

• Grounded estimate
How often do you use TikTok per day?
Never: 44%
< 30 minutes: 29%
30 min – 2 hrs: 18%
Data sources include UK Government Trackers, Ofcom, NTIA, Cint Panel, and Proprietary Social Signals.

The ground truth at the centre of everything

At the heart of SynthSurvey is a fused dataset of 61,597 real people — drawn from governments, major universities, and our own proprietary panel. We know 1,472 things about each of them. We weight newer data more heavily and decay older readings over time, so the answer reflects what is true now, not what was true a decade ago.
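A minimal sketch of how recency weighting of this kind can work, assuming exponential decay with a fixed half-life. The half-life, the function names, and the sample readings are assumptions made for illustration; the page does not say which decay function SynthSurvey uses.

```python
from typing import Iterable, Tuple


def decay_weight(age_years: float, half_life_years: float = 3.0) -> float:
    """Weight of a reading that is `age_years` old; it halves every half-life."""
    return 0.5 ** (age_years / half_life_years)


def recency_weighted_mean(readings: Iterable[Tuple[float, float]],
                          half_life_years: float = 3.0) -> float:
    """Weighted mean of (value, age_years) pairs; newer readings count more."""
    pairs = list(readings)
    weights = [decay_weight(age, half_life_years) for _, age in pairs]
    total = sum(weights)
    if total == 0:
        raise ValueError("no readings to average")
    return sum(value * w for (value, _), w in zip(pairs, weights)) / total


# Illustrative numbers only: share agreeing with a statement, by age of reading.
readings = [(0.40, 0.0), (0.30, 3.0), (0.20, 6.0)]
estimate = recency_weighted_mean(readings)
```

With these made-up readings the plain average is 0.30, while the weighted estimate sits closer to the newest reading of 0.40.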

National Probability Samples · Validated Political Instruments · Government Trackers · University Surveys · Social Media Signals · Proprietary Panel
61,597
Real respondents in the fused dataset — not simulated, not guessed.
1,472
Data points known about each person across surveys, demographics, and social signals.
5-tier
Resolution cascade — every question follows a disciplined path to the most grounded answer possible.

Five tiers to the truth

When you ask SynthSurvey a question, it follows a disciplined path — not just a prompt fired at a machine. Every result is labelled so you always know exactly what you’re looking at.

Tier 1
Exact Match
The system looks for the same question in our index of real surveys. If real people have already answered it, we return their distribution directly — no guessing, no machine inference. Just the data.
From real data
Tier 2
Close Match
No exact match found — but a question asking the same thing in different words exists. We return the real distribution and flag that the wording was not identical. Still real people. Still real data.
From real data
Tier 3
Validated Lookup
No survey question matches, so we pull from government data and independent trackers that measure facts directly. If you ask who uses TikTok, the answer comes from measured usage figures, not from opinion.
From real data
Tier 4
Grounded Estimate
A close survey match exists but your specific audience is too small to be reliable. We broaden to all respondents in the same geography, then adjust — checking how your audience differs from the general population and correcting the distribution to reflect their specific tendencies.
Grounded estimate
Tier 5
Fallback
The last resort. The system uses the attitudinal fingerprint of the audience and social media engagement signals to generate an answer — constrained by everything we know about that group. Clearly labelled so you always know.
Synthetic estimate
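The five tiers above can be read as an ordered cascade: try the most grounded source first and fall through only when it returns nothing. The sketch below shows that control flow under stated assumptions. The resolver functions, the `Answer` type, and the in-memory index are hypothetical stand-ins, not SynthSurvey's API; the one stored distribution is the example shown earlier on this page.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Distribution = Dict[str, float]
Resolver = Callable[[str, str], Optional[Distribution]]


@dataclass
class Answer:
    tier: int
    label: str
    distribution: Distribution


def resolve(question: str, audience: str,
            tiers: List[Tuple[int, str, Resolver]]) -> Answer:
    """Try each tier in order and return the first one that yields a result."""
    for tier, label, resolver in tiers:
        distribution = resolver(question, audience)
        if distribution is not None:
            return Answer(tier, label, distribution)
    raise LookupError("no tier could answer the question")


def exact_match(question: str, audience: str) -> Optional[Distribution]:
    # Hypothetical index of real survey questions (one entry for illustration).
    index = {
        "Do you trust AI-generated content in news?": {
            "Strongly disagree": 0.24, "Somewhat disagree": 0.15,
            "Neutral": 0.35, "Somewhat agree": 0.18, "Strongly agree": 0.08,
        }
    }
    return index.get(question)


def no_result(question: str, audience: str) -> Optional[Distribution]:
    # Stub for tiers 2 to 4: always reports that nothing was found.
    return None


def fallback(question: str, audience: str) -> Optional[Distribution]:
    # Placeholder only: an even split stands in for a synthetic estimate.
    return {"Yes": 0.5, "No": 0.5}


TIERS = [
    (1, "From real data", exact_match),
    (2, "From real data", no_result),      # close match
    (3, "From real data", no_result),      # validated lookup
    (4, "Grounded estimate", no_result),
    (5, "Synthetic estimate", fallback),
]

known = resolve("Do you trust AI-generated content in news?", "UK adults", TIERS)
unknown = resolve("A question no survey has asked", "UK adults", TIERS)
```

Because the label travels with the result, a caller can always tell whether a distribution came from real data or from the fallback.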
Attitudinal Fingerprint
Young women · London · Gaming audience
Tech optimism
+0.78
Trust in science
+0.82
Media scepticism
+0.61
Brand loyalty
+0.44
Social identity
+0.71
sha256: a3f9c2e1b7d4…8e12f0c3a91b

Voices that are not just noise

Every audience has a fingerprint — a unique hash describing what makes them different from the average person. We use it to ensure that when we generate synthetic voices, they are the right voices: shifted to the right positions in the data, not random noise.

If you are looking at young women in London who play video games, their fingerprint will show they are more optimistic about technology and more likely to trust scientists. The machine is constrained by what the data knows to be true of that group.
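One way to picture a fingerprint is as a set of per-trait shifts from the population average, plus a stable hash of those shifts. The sketch below assumes standardised mean differences and a SHA-256 digest of a canonical JSON form. The trait name, the scores, and both function names are invented for illustration and are not taken from SynthSurvey's implementation.

```python
import hashlib
import json
from statistics import mean, pstdev
from typing import Dict, List


def fingerprint(audience: Dict[str, List[float]],
                population: Dict[str, List[float]]) -> Dict[str, float]:
    """Shift of the audience mean from the population mean, in population SDs."""
    shifts = {}
    for trait, pop_scores in population.items():
        spread = pstdev(pop_scores)
        diff = mean(audience[trait]) - mean(pop_scores)
        shifts[trait] = round(diff / spread, 2) if spread else 0.0
    return shifts


def fingerprint_hash(shifts: Dict[str, float]) -> str:
    """SHA-256 digest of the fingerprint; sorted keys keep the hash stable."""
    canonical = json.dumps(shifts, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Made-up scores on a 1 to 5 scale.
population = {"tech_optimism": [1.0, 2.0, 3.0, 4.0, 5.0]}
audience = {"tech_optimism": [4.0, 5.0]}

shifts = fingerprint(audience, population)
digest = fingerprint_hash(shifts)
```

A positive shift means the audience scores above the general population on that trait, which is the direction a generated answer for that group would be moved.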

Most people hide their accuracy numbers.
We put ours on the page.

Across a range of surveys, including our Cint panel, we apply an 80/20 hold-out split. 80% of responses are fused into the dataset. The remaining 20% go into our validation index and calibration model, where we compare them with our synthetic outputs and calculate the Mean Absolute Error. Our Calibration Index gets smarter every time synthetic outputs are checked against a new hold-out set of real data.
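As a rough sketch of that validation loop, the code below splits responses 80/20, builds the observed distribution from the held-out 20%, and scores a synthetic distribution against it. It assumes MAE is averaged over answer options and expressed as a proportion. The function names, the responses, and the synthetic figures are made up for illustration.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def hold_out_split(responses: Sequence[str], holdout_share: float = 0.2,
                   seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle responses and return (fused, held_out) lists."""
    shuffled = list(responses)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * (1 - holdout_share))
    return shuffled[:cut], shuffled[cut:]


def distribution(responses: Sequence[str],
                 options: Sequence[str]) -> Dict[str, float]:
    """Share of responses falling on each answer option."""
    if not responses:
        raise ValueError("no responses to summarise")
    counts = Counter(responses)
    return {option: counts[option] / len(responses) for option in options}


def mean_absolute_error(predicted: Dict[str, float],
                        observed: Dict[str, float]) -> float:
    """Average absolute gap between predicted and observed shares."""
    return sum(abs(predicted[o] - observed[o]) for o in observed) / len(observed)


# Made-up data: 100 responses to a two-option question.
options = ["Agree", "Disagree"]
responses = ["Agree"] * 60 + ["Disagree"] * 40

fused, held_out = hold_out_split(responses)
observed = distribution(held_out, options)

synthetic = {"Agree": 0.55, "Disagree": 0.45}  # illustrative synthetic output
mae = mean_absolute_error(synthetic, observed)
```

A lower `mae` means the synthetic distribution sits closer to what the held-out respondents actually said.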

80/20
Hold-out split across multiple surveys: 80% fused into the dataset, 20% held out for validation
MAE
Mean Absolute Error calculated openly from hold-out data — not hidden in a footnote
Calibration Index improves with every new real-data comparison — a moat that grows deeper every day

Built by people who have spent careers working with real data

Three co-founders with complementary expertise across consumer intelligence, communications measurement, and machine learning.

JT
Jessica Thomas
Co-Founder
Consumer Intelligence · Social Listening · AI Strategy
“Listening to the voice of the customer at scale — and translating that into actionable intelligence.”
Award-winning insights strategist with over a decade turning social, behavioural and consumer data into commercial strategy. Founded Ten Bear Group and AI Strategy Hub. Previously Brandwatch. Co-founder of OceanDeep AI.
Social Intelligence Insider 50 — The SI Lab, 2024
50 Digital Women to Watch — Digital Women, 2022
Best Female-Led Startup — Stevie® Awards, 2019
RG
Rayna Grudova-de Lange
Co-Founder
Comms Measurement · Research Design · Media Intelligence
“Passionate about innovation in measurement and evaluation — constantly looking for new ways to approach data problems.”
Globally recognised authority in communications measurement and media intelligence. CEO of InsightHQ. Elected Board Director of AMEC. Named contributor to the AMEC Integrated Evaluation Framework — the definitive global standard for communications measurement.
Elected Board Director, AMEC (2022–present)
Chair, AMEC Education Commission
Guest Lecturer, multiple universities
PS
Dr. Paul Siegel
Co-Founder
ML & AI · NLP · Network Science
“The majority of the work — 60% — is finding the right question. Then 40% is the analysis.”
Mathematician turned ML engineer. Former Columbia mathematics professor. Eight years as Principal Data Scientist at Brandwatch. Now Senior ML Engineer at Turing Labs (Y Combinator), working with Kraft and Unilever to simulate product outcomes using machine learning.
PhD Mathematics — Penn State, 2012
Assistant Professor, Columbia University (2012–2015)
Senior ML Engineer, Turing Labs (YC-backed)

You do not have to choose between fast and right.

SynthSurvey gives you real survey distributions — grounded in 61,597 real people — in the time it takes to type a question. Follow the data.