Pulling UGC case studies from both markets to build actual cross-market benchmarks—where do you start?

I hit a wall last month when I realized we had UGC campaigns scattered across two regions with no way to compare performance or identify patterns. We had successful UGC videos from Russian creators, separate wins from US creators, but no system to learn from them side-by-side.

The challenge: we needed benchmarks, but we were starting from scratch. No historical data compiled. No consistent metrics across campaigns. Just a ton of individual wins that felt isolated.

I started by collecting every single UGC campaign we’d run in the past 18 months—docs, spreadsheets, Slack conversations, creator reports. Then I categorized them: what the brief was, who the creator was, what the audience size was, and, most importantly, what actually made each one work or fall flat.

What surprised me: the successful UGC patterns weren’t the same across regions. A UGC video that crushed it with Russian audiences often had different storytelling rhythms, creative focus, and engagement hooks than one that worked in the US. But there were some universal principles—authenticity, clear product positioning, and audience relevance worked everywhere.

Now I’m building a shared case study library where each case includes: the original brief, what the creator actually delivered, the performance metrics, and my analysis of what made it work. The goal is to turn anecdotes into a real playbook.

Here’s my question: when you’re pulling case studies to benchmark performance, how do you actually decide what’s a “valid” comparison? Are you looking for identical brief structures, or do you compare across different campaign types?

The comparison validity question is critical. I’d argue you need two comparison layers: structural and thematic.

Structural: same brief type (e.g., unboxing videos, product-in-life shots, testimonials). These need identical success metrics because you’re testing the same execution variables.

Thematic: similar campaign objectives (e.g., “boost awareness for new SKU”) but different creative approaches. These you compare differently—you’re not asking whether they hit the same KPIs, you’re asking whether the approach led to different outcomes.

When I built our benchmark library, I color-coded by brief type. Green for direct comparisons (same structure, different regions). Blue for thematic comparisons (similar goal, different creative approach). The library became useful once I stopped trying to compare everything to everything and got intentional about comparison groupings.
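If it helps to see those groupings as data rather than colors, here’s a minimal sketch in Python. Every field name and value below is my own placeholder, not something pulled from an actual library:

```python
from collections import defaultdict

# Hypothetical case records. "comparison_group" encodes the color-coding:
# "structural" = same brief type across regions, "thematic" = same objective
# pursued with different creative approaches.
cases = [
    {"id": "RU-014", "region": "RU", "brief_type": "unboxing",        "objective": "awareness", "comparison_group": "structural"},
    {"id": "US-007", "region": "US", "brief_type": "unboxing",        "objective": "awareness", "comparison_group": "structural"},
    {"id": "RU-021", "region": "RU", "brief_type": "testimonial",     "objective": "awareness", "comparison_group": "thematic"},
    {"id": "US-019", "region": "US", "brief_type": "product-in-life", "objective": "awareness", "comparison_group": "thematic"},
]

# Structural cases are grouped by brief type, thematic cases by objective,
# and nothing is ever compared across groupings.
groups = defaultdict(list)
for case in cases:
    axis = case["brief_type"] if case["comparison_group"] == "structural" else case["objective"]
    groups[(case["comparison_group"], axis)].append(case["id"])

for key, ids in sorted(groups.items()):
    print(key, ids)
```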

How are you categorizing your cases right now?

Also—be ruthless about whether a case is actually benchmark-worthy or just an outlier. I used to include every win. Then I realized that one $50M influencer’s campaign performance wasn’t a benchmark for micro-creator UGC work—it was a distraction.

I now keep separate libraries: micro-creator UGC benchmarks, mid-tier benchmarks, and high-reach benchmarks. Each has different success thresholds. That segmentation made the data actually actionable instead of just a collection of loose examples.
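One way to make those separate thresholds concrete is a small per-tier config that every new case gets checked against before it enters a library. The tier names mirror the ones above; the numbers are placeholders, not real benchmarks:

```python
# Placeholder success thresholds per creator tier; real values would come from
# your own historical data, not from this sketch.
TIER_THRESHOLDS = {
    "micro":      {"min_engagement_rate": 0.04},
    "mid":        {"min_engagement_rate": 0.02},
    "high_reach": {"min_engagement_rate": 0.01},
}

def is_benchmark_worthy(case: dict) -> bool:
    """Admit a case into its tier's library only if it clears that tier's bar."""
    bar = TIER_THRESHOLDS[case["tier"]]["min_engagement_rate"]
    return case["engagement_rate"] >= bar

print(is_benchmark_worthy({"tier": "micro", "engagement_rate": 0.025}))       # False
print(is_benchmark_worthy({"tier": "high_reach", "engagement_rate": 0.025}))  # True
```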

I love this approach because I see it from the creator side too. When I’m onboarding new UGC creators to clients, I always share 2-3 “reference cases” so they understand what success looks like for that brand and audience. But I’ve noticed I was picking cases inconsistently—sometimes the most impressive video, sometimes the most recent, sometimes just whatever was easiest to find.

What changed everything for me was asking creators: “If you were studying 3 great UGC examples from this brand, which ones would teach you the most about their audience and style?” Creators pick way more strategically than I was. They look for variety—different creators, different pacing, different angles—so they can see the full range of what works.

Maybe your benchmark library should be built with creator input? They understand what’s transferable knowledge versus what’s a one-off win.

When you share these case studies with new creators, are you planning to share them differently for Russian creators versus US creators, or are you framing them as universal?

We started building a UGC playbook earlier this year and faced this exact problem. We had 30+ videos from Russian creators and 20+ from US creators, and initially tried to average them all together. Big mistake.

What worked: we bucketed by both region and creator tier (micro, mid, macro) because performance expectations are genuinely different. A 2% engagement rate from a micro-creator’s UGC is a red flag. From a macro-influencer? It’s solid.

Then within each bucket, we looked for patterns, not individual benchmarks. Across the 8 highest-performing Russian micro-creator videos, was there a common structure? (Usually yes—they were more narrative, less product-focused upfront.) Same question for US. Different patterns emerged.

The real insight came from comparing the patterns between regions, not comparing individual videos. That’s when we stopped saying “this video is good” and started saying “this type of video works better in this market.”
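For what it’s worth, here’s a rough pandas sketch of that bucketing-then-patterns flow, using a tiny made-up sample in place of the real export (all column names and numbers are assumptions):

```python
import pandas as pd

# Tiny made-up sample standing in for the real campaign export.
df = pd.DataFrame([
    {"video_id": "RU-001", "region": "RU", "tier": "micro", "engagement_rate": 0.051},
    {"video_id": "RU-002", "region": "RU", "tier": "micro", "engagement_rate": 0.037},
    {"video_id": "RU-003", "region": "RU", "tier": "macro", "engagement_rate": 0.014},
    {"video_id": "US-001", "region": "US", "tier": "micro", "engagement_rate": 0.046},
    {"video_id": "US-002", "region": "US", "tier": "macro", "engagement_rate": 0.019},
])

# Benchmark per (region, tier) bucket instead of one global average.
buckets = df.groupby(["region", "tier"])["engagement_rate"].agg(["median", "count"])
print(buckets)

# The top performers *within* a bucket are what you review for shared structure.
ru_micro = df[(df["region"] == "RU") & (df["tier"] == "micro")]
print(ru_micro.nlargest(8, "engagement_rate")["video_id"].tolist())
```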

This is the right problem to solve, but I’d push you to think bigger about the benchmark structure. Don’t just collect cases—collect them with explicit metadata so you can actually query them later.

Every case should have: creator tier, audience size, video format, product category, campaign objective, performance metrics (engagement, CTR, conversion if you have it), audience demographics, and regional context. Once you have 20-30 cases with consistent metadata, you can start running actual analysis.

Example query: “Show me all mid-tier creator UGC for beauty products in the US market with engagement > 3%.” That’s when a case library becomes a benchmark system instead of just a folder of examples.
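As a sketch of what that could look like, here’s a minimal case record plus that exact query written as a plain filter. The field list is trimmed and every name and number is illustrative, not a real schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UGCCase:
    case_id: str
    creator_tier: str          # "micro" | "mid" | "macro"
    audience_size: int
    video_format: str          # e.g. "unboxing", "testimonial"
    product_category: str
    campaign_objective: str
    engagement_rate: float
    ctr: Optional[float]       # None when the metric wasn't tracked
    region: str                # "RU" | "US"

library = [
    UGCCase("US-031", "mid", 85_000, "testimonial", "beauty", "awareness", 0.034, 0.012, "US"),
    UGCCase("US-040", "mid", 120_000, "unboxing", "beauty", "conversion", 0.021, None, "US"),
    UGCCase("RU-017", "micro", 14_000, "testimonial", "beauty", "awareness", 0.058, 0.019, "RU"),
]

# "Show me all mid-tier creator UGC for beauty products in the US market
# with engagement > 3%."
matches = [
    c for c in library
    if c.creator_tier == "mid"
    and c.product_category == "beauty"
    and c.region == "US"
    and c.engagement_rate > 0.03
]
print([c.case_id for c in matches])  # ['US-031']
```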

Are you planning to build this systematically, or is it more of a manual reference collection?

This is gold. We’ve been telling clients for years, “You need a playbook,” but nobody wants to do this work. You’re actually doing it.

From my agency perspective, I’d tell you: once you have this library, share it. Not with competitors obviously, but with your internal stakeholders and your creator partners. When a creator sees the top 5 UGC examples that performed best, they understand your brand’s creative DNA instantly. Onboarding time drops by weeks.

We actually build micro-playbooks for each client now—not just benchmarks, but a curated set of “here’s what worked” examples. It’s become a core deliverable. Clients love it because they finally have something concrete to show their team instead of just scatter-plot data.

Honestly, from a creator standpoint, I’d love to see the benchmarks that brands are building. Right now, we’re always guessing what will work. We shoot 5-10 variations and hope something sticks.

If you’re building this library, one suggestion: give creators access to part of it. Not the full competitive analysis, but maybe a simplified version that shows “here’s the creative style this brand responds to.” It would help us deliver better UGC from the start instead of iterating endlessly.

Also—definitely include case studies with different creator personalities and styles. Some brands respond to scripted, polished videos. Others want raw, authentic morning voice content. Your benchmarks should show both so creators know which lane fits this particular brand.