If you're testing Demand Gen campaigns in Google Ads right now, there’s a good chance you’re doing it in a way that gives you rubbish data and no clear answers.
I know that sounds dramatic, but hear me out…
Most people make one of the following mistakes:
They lump all their video creatives into one campaign.
They run multiple campaigns but forget about audience overlap.
They rely on Google to “optimise” without really testing anything in a controlled way.
All of which leaves you with no idea what actually worked and what didn't.
But there is a better way to test Demand Gen campaigns. One that gives you clean data with no audience overlap, and even tells you when your results are statistically significant.
Let’s walk through exactly how to do it.
Set Up Demand Gen Experiments the Right Way
Google has a powerful (but underused) feature inside your account called Experiments.
This is how I recommend all advertisers test creatives in Demand Gen campaigns. Here's the process:
Head into your Google Ads account, and in the sidebar, click into Campaigns → Experiments.
Create a new Demand Gen experiment by clicking the plus icon.
Choose your success metric. I usually go with cost per conversion, but even if you want to judge based on ROAS, you’ll still get all the other metrics in the results dashboard later.
Set up each arm of the experiment to test one single variable. For example, test one creative per arm — let’s say “Video 1”, “Video 2”, “Video 3” and “Video 4”.
Ensure each campaign is already created and only differs in that one thing you’re testing. In this case, the only difference between the campaigns should be the creative. Everything else — targeting, bid strategy, placements — should remain the same.
Assign traffic splits evenly (e.g. 25% each for 4 arms) unless you’ve got a specific reason to weight it otherwise.
Name your test clearly so you don’t forget what you were doing in 3 months’ time!
Choose a long enough date range. I recommend 60–90 days. You can always stop early if you hit statistical significance.
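By the way, if you want a rough feel for how long "long enough" actually is before you launch, you can run a standard two-proportion sample-size calculation. Here's a minimal sketch: this is the textbook formula, not how Google's dashboard works under the hood, and the baseline rate, expected lift, and daily traffic are hypothetical placeholders.

```python
from scipy.stats import norm

def sample_size_per_arm(p_base, lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per arm to detect a relative lift in
    conversion rate with a two-sided two-proportion z-test.
    Textbook formula, not a reconstruction of Google's methodology."""
    p_variant = p_base * (1 + lift)
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for the significance level
    z_power = norm.ppf(power)          # critical value for the desired power
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    n = ((z_alpha + z_power) ** 2 * variance) / (p_base - p_variant) ** 2
    return int(n) + 1

# Hypothetical inputs: 2% baseline conversion rate, hoping to detect a 15% lift,
# with each arm getting ~400 visitors a day on a 25% traffic split.
n = sample_size_per_arm(p_base=0.02, lift=0.15)
print(f"~{n:,} visitors per arm, roughly {n / 400:.0f} days of traffic")
```

With those made-up numbers you land at roughly 90 days per arm, which is exactly why a 60–90 day window is a sensible default rather than an arbitrary one.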
Why This Method Crushes Traditional Testing
Most people just throw all their creatives into a single campaign and hope Google figures it out.
But here’s why that fails:
Audience overlap means the same person could be seeing multiple test variants. That completely ruins your data because you don’t know which creative actually influenced them.
Budget bias: Google often picks a “winner” too early and doesn’t give the other creatives a fair chance.
No clarity on what worked: Without proper split testing, your results are anecdotal at best.
By using the Experiments feature, you solve all of this. You split the audience cleanly between versions. You control how much spend goes to each creative. And you get clear, side-by-side performance results.
Don’t Miss the Statistical Significance Dashboard
One of my favourite things about running proper experiments is that you actually get a statistical significance dashboard.
Yes, Google will tell you when your test has reached a level of confidence where you can trust the results. You’ll see this on the bottom row of the results table.
And that’s super helpful. Because let’s face it: without statistical significance, you might just be looking at noise or lucky streaks.
And up top, you get side-by-side comparisons of key metrics like:
Cost per conversion (CPA)
Conversion value
ROAS
So now, instead of guessing what worked, you have data-backed confidence.
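And if you ever want to sanity-check the dashboard's verdict yourself (or your test ends before Google calls it), a textbook two-proportion z-test gives you a rough read. A minimal sketch with made-up arm numbers: this is the standard test, not a reconstruction of Google's own confidence calculation.

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test comparing the conversion rates of two experiment arms.
    Returns the p-value; smaller means the difference is less likely noise."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * (1 - norm.cdf(abs(z)))

# Hypothetical arm totals: Video 1 vs Video 2
p = two_proportion_z_test(conv_a=210, n_a=9800, conv_b=168, n_b=9750)
print(f"p-value: {p:.3f}")  # below 0.05 is the usual bar for significance
```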
What to Actually Test in Demand Gen
In my experience, this setup works best for testing different creatives.
Here’s what I like to test:
Entirely different video ads
Alternate hooks and intros
Different creative themes or angles (e.g. UGC vs pro polish)
One of my go-tos is testing raw, UGC-style videos against slicker, professionally shot product demos. You'd be surprised how often the raw stuff wins, especially in today's ad environment.
But here’s the key:
Only test ONE thing at a time.
If you’re testing different videos, don’t also test a new bid strategy or targeting tweak. If you change too many variables, you’ll never know what actually made the difference.
Also, let your test run long enough: aim for the 60–90 day window recommended above, or stop when you hit statistical significance.
Focus on Conversion-Based Metrics
It sounds obvious, but I see people get tripped up by this all the time.
Don't make decisions based on CTR or impressions. Those metrics don't tell you anything about actual business results.
Instead, look at:
Conversions
Cost per conversion
Revenue
ROAS
This is especially true if you’re running ecommerce campaigns. We’re here to make money, not just get clicks.
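For the avoidance of doubt, these are all simple ratios once you pull spend, conversions, and revenue per arm out of the results table. A quick sketch with hypothetical export numbers, just to make the definitions concrete:

```python
# Hypothetical per-arm totals taken from an experiment results export.
arms = {
    "Video 1": {"cost": 4200.0, "conversions": 96, "revenue": 11800.0},
    "Video 2": {"cost": 4150.0, "conversions": 71, "revenue": 8300.0},
}

for name, a in arms.items():
    cpa = a["cost"] / a["conversions"]  # cost per conversion (CPA)
    roas = a["revenue"] / a["cost"]     # return on ad spend
    print(f"{name}: CPA £{cpa:.2f}, ROAS {roas:.2f}x")
```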
Final Thoughts
If you’re testing Demand Gen by stuffing creatives into a campaign and “letting it run”, stop.
Google Ads has given us proper testing tools, and when you use them correctly, you get clean, reliable results that actually tell you what's working.
That means better optimisation, more profitable campaigns, and a whole lot less guesswork.
Conclusion
To test Demand Gen campaigns properly, you need to use Google’s Experiments feature. This allows for clean audience splits, even budget distribution, and statistically valid results. Focus on testing one variable at a time — ideally creatives — and let the test run long enough to gather meaningful data. Most importantly, judge your results based on real business metrics like conversions and ROAS, not vanity metrics like CTR.
And hey, if all this sounds like a headache and you'd rather have a team do it for you, well, that's what we do at Big Flare. We manage Google Ads for ecommerce brands doing 6 to 8 figures, and we love running smart, statistically valid experiments like this.
Book a call if you want to talk.
Until next time,
Daryl