Stripe Radar false positives — how to tune the rules
- Radar defaults are tuned for card testing, not for your specific business — most operators over-block.
- Tuning requires data: false positive rate, actual chargeback rate, decline-to-block ratio.
- Custom rules beat tuning thresholds — specific merchant patterns outperform generic Radar sensitivity knobs.
Stripe Radar is a good fraud tool, but its defaults are too aggressive for most multi-brand operators. If you have never touched the rules, you are probably over-blocking good customers and leaving 2-4% of authorizations on the floor with no chargeback benefit to show for it.
Here is the tuning playbook that actually works, based on Radar configurations we have run across 80+ merchant accounts.
1. Baseline your current Radar performance
Pull 30 days of Radar data. Calculate: block rate (% of charges blocked), chargeback rate on unblocked charges, dispute rate on blocked charges that eventually got through (via retry, manual review, or customer contact). If block rate > 3% and chargeback rate < 0.5%, you are over-blocking.
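A minimal sketch of this baseline calculation. The `Charge` record and the 3% / 0.5% heuristic thresholds come straight from the text; the function and field names are ours, not Stripe's:

```python
from dataclasses import dataclass

@dataclass
class Charge:
    blocked: bool
    charged_back: bool  # only meaningful for unblocked charges

def baseline_metrics(charges):
    """Compute the 30-day baseline: block rate and chargeback
    rate on the charges Radar let through."""
    total = len(charges)
    blocked = sum(1 for c in charges if c.blocked)
    unblocked = [c for c in charges if not c.blocked]
    chargebacks = sum(1 for c in unblocked if c.charged_back)
    block_rate = blocked / total
    chargeback_rate = chargebacks / len(unblocked) if unblocked else 0.0
    # Heuristic from the article: block rate > 3% with chargeback
    # rate < 0.5% suggests over-blocking.
    over_blocking = block_rate > 0.03 and chargeback_rate < 0.005
    return block_rate, chargeback_rate, over_blocking
```

Run it against an export of 30 days of charges; the third return value is the over-blocking flag.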
2. Customer evidence of over-blocking
Search support tickets for "declined," "blocked," "rejected." Count false positive complaints. If you have more than one per 500 orders, Radar is over-blocking your customer base.
3. Identify the over-blocking rule
Radar has built-in rules plus custom rules. Go to Dashboard → Radar → Rules. Sort by block count. The top 3-5 rules account for 80%+ of blocks. These are your candidates for tuning.
4. Default rules that commonly over-block
(a) "Block if CVC check fails" — overly strict; many legitimate customers mis-type CVC or use Apple Pay tokens that confuse matching.
(b) "Block if high risk" — fires when Radar's risk score exceeds 75; that threshold is often too aggressive for high-value customers.
(c) "Review if IP country does not match card country" — standard travel triggers this; ~30% false positive rate.
(d) "Block if multiple distinct emails used the same card in the past 7 days" — shared family or office cards trigger this frequently.
5. Replacing default rules with custom
Instead of "block if risk score > 75," try "review if risk score > 75 AND amount > $X" where X is your 90th percentile order value. High-risk score + typical amount = usually fine. High-risk score + unusually large amount = actually risky.
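The replacement rule's logic, sketched as a plain function (the 75 threshold and 90th-percentile condition are the article's; the names are ours, and this simulates the decision locally rather than using Radar's rule syntax):

```python
def amount_conditioned_action(risk_score, amount, p90_amount):
    """Replacement for a flat 'block if risk score > 75' rule:
    a high score alone is tolerated; only a high score on an
    unusually large order (above the 90th-percentile order
    value) goes to review."""
    if risk_score > 75 and amount > p90_amount:
        return "review"
    return "allow"
```

High score + typical amount stays allowed; high score + outsized amount gets reviewed instead of hard-blocked.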
6. Velocity rules that work
(a) "Block if same card used 5+ times in 1 hour across multiple emails" — catches card testing.
(b) "Review if same email attempted 5+ distinct cards in 30 minutes" — catches card stuffing.
(c) "Block if same IP address attempted 10+ charges in 5 minutes" — catches botting.
These are surgical. They catch actual fraud patterns without hitting legitimate customers.
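A sketch of rule (a) as a sliding-window counter, to make the logic concrete. The 5-use / 1-hour numbers are the article's; the class and its field names are ours (real Radar velocity rules are written in the dashboard, not in application code):

```python
from collections import defaultdict, deque

class VelocityGuard:
    """Card-testing rule (a): block when the same card is used
    5+ times within one hour across multiple distinct emails."""
    def __init__(self, max_uses=5, window_seconds=3600):
        self.max_uses = max_uses
        self.window = window_seconds
        self.events = defaultdict(deque)  # card -> deque of (ts, email)

    def check(self, card, email, ts):
        q = self.events[card]
        # Drop events outside the sliding window.
        while q and ts - q[0][0] > self.window:
            q.popleft()
        q.append((ts, email))
        emails = {e for _, e in q}
        # Multiple emails on one card is the card-testing signature;
        # one customer reusing their own card never trips this.
        if len(q) >= self.max_uses and len(emails) > 1:
            return "block"
        return "allow"
```

Note the `len(emails) > 1` condition: that is what keeps a single legitimate customer retrying their own card out of the blast radius.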
7. Descriptor-based rules
If you run multiple brands, rules can vary per brand: "Review if amount > $1500 on brand A" vs "Review if amount > $250 on brand B," with thresholds set from each brand's historical fraud rate. Radar has no native per-brand concept — you tag the brand on the charge (e.g. via metadata) and build the conditions yourself in the rule editor.
8. Allow-listing
Known good customers (100+ orders, 0 disputes) should be on an allow-list that bypasses Radar evaluation. Stripe lets you create customer allow-lists via metadata. Reduces false positives on your best customers.
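A local sketch of the allow-list bypass. The 100-order / zero-dispute criterion is the article's; the dict shape, function names, and the injected `radar_decision` callback are hypothetical stand-ins, not Stripe API objects:

```python
def qualifies_for_allow_list(customer):
    """Article's criterion: 100+ orders and zero disputes."""
    return customer["orders"] >= 100 and customer["disputes"] == 0

def evaluate(charge, allow_list, radar_decision):
    """Bypass Radar evaluation entirely for allow-listed
    customers; everyone else goes through the normal rules."""
    if charge["customer_id"] in allow_list:
        return "allow"
    return radar_decision(charge)
```

The point is ordering: the allow-list check runs before any scoring, so your best customers never see a false positive from a mistuned rule.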
9. 3DS step-up as a tuning tool
Instead of blocking high-risk charges, step them up to 3DS. Liability shift on fraud, customer completes if legitimate. "Require 3DS if risk score > 65" is often better than "Block if risk score > 65."
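The step-up decision in miniature, using the article's threshold of 65 (function name is ours; in practice this lives in a Radar "Request 3DS" rule, not application code):

```python
def radar_action(risk_score, threshold=65):
    """Step-up instead of block: above the threshold, require
    3DS rather than declining outright. Fraud liability shifts
    to the issuer; a legitimate customer simply completes the
    challenge and the sale survives."""
    return "require_3ds" if risk_score > threshold else "allow"
```

Compared to a hard block at the same threshold, the failure mode for a false positive becomes "extra friction" instead of "lost sale."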
10. Test changes safely
Radar has a "preview" mode for new rules — runs them in shadow without actually blocking. Run 7-14 days in preview, compare to production, promote rules that reduce false positives without increasing chargebacks.
11. BIN-level tuning
Certain BINs have structurally higher Radar false positive rates (corporate cards, prepaid, some international). Add BIN-specific rules to relax scoring for BINs that historically approved despite high Radar score.
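One way to sketch BIN-level relaxation: a per-BIN score offset applied before thresholds. The BIN prefixes and offsets below are made up for illustration — derive real ones from your own approval history:

```python
# Hypothetical offsets for BIN prefixes that historically approved
# cleanly despite high Radar scores (corporate, prepaid, etc.).
BIN_SCORE_OFFSET = {
    "552433": -15,  # illustrative corporate-card BIN, not real data
    "400022": -10,  # illustrative prepaid BIN, not real data
}

def adjusted_score(risk_score, card_number):
    """Relax the effective score for trusted BINs; all other
    BINs pass through unchanged."""
    bin6 = card_number[:6]
    return risk_score + BIN_SCORE_OFFSET.get(bin6, 0)
```

A charge scoring 80 on a trusted corporate BIN drops below a 75 block threshold; the same score on an unknown BIN still trips it.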
12. Multi-brand Radar deduplication
If you run 8 brands on 8 Stripe accounts, each account has its own Radar instance. Fraud patterns on brand A do not inform brand B. This is a structural limit that orchestration layers solve by aggregating fraud intelligence across rails. See Radar vs Signifyd vs Kount.
Tuning workflow
- Baseline 30-day Radar metrics.
- Identify top 3-5 blocking rules.
- Write custom replacements; run in preview.
- Measure false positive reduction vs chargeback impact.
- Promote rules, remove old defaults.
- Review monthly; fraud patterns shift.
Common mistakes
(a) Turning off all defaults at once — exposes you to card testing.
(b) Not measuring false positives — assuming every block was fraud.
(c) Not reviewing monthly — fraud tactics evolve and rules go stale.
(d) Building rules emotionally after one bad chargeback.
Radar Premium
Radar for Fraud Teams ($0.07/charge vs $0.05 standard) unlocks the custom rules this playbook depends on. It is required for the tuning described here at any serious scale, and worth the cost for operators doing $100k+/month.
The orchestration alternative
At multi-brand scale, operators layer Signifyd, Kount, or Sift over Radar for cross-rail intelligence. Or use orchestration to route by rail-specific fraud profile. See pricing for the orchestrated fraud stack or apply for a Radar tuning audit on your current volume.
13. The shadow testing discipline
Preview (shadow) mode on new rules is underused. Ship the new rule in preview, monitor for 7-14 days, and compare preview blocks against production blocks on the same charges. If preview would have blocked 100 where your current production rule blocked 150 — and the chargeback rate on the 50 charges it would have let through is low — the new rule is better. Promote it. This is how you tune without breaking things.
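The promote/reject arithmetic from the worked example, as a function. The 0.5% chargeback tolerance is our assumption, echoing the baseline heuristic earlier in the article; tune it to your own risk appetite:

```python
def promote_preview_rule(prod_blocks, preview_blocks,
                         chargebacks_on_difference,
                         max_chargeback_rate=0.005):
    """Promote the preview rule only if it blocks fewer charges
    than production AND the charges it would newly approve have
    an acceptably low chargeback rate."""
    extra_approved = prod_blocks - preview_blocks
    if extra_approved <= 0:
        return False  # new rule blocks as much or more: no FP win
    return chargebacks_on_difference / extra_approved <= max_chargeback_rate
```

In the article's example (150 production blocks, 100 preview blocks, clean chargeback record on the 50-charge difference) this returns a promote decision.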
14. Radar Lists as segmentation tool
Radar Lists let you create reusable collections — allow-list of known customers, block-list of known-bad IPs, country groups, etc. Instead of hardcoding values in rules, reference lists. Easier to maintain, easier to audit. Most operators never use Lists and end up with fragmented rule logic.
15. Network Tokens and fraud intersection
Network tokenized cards have lower fraud rates than PAN-stored cards because the token is device-bound. Your Radar rules should be less aggressive on network-token transactions. Add a condition: "Do not apply rule X if network_token is true."
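What that condition looks like as logic, sketched locally (the `network_token` field and function name are our stand-ins; in Radar itself this is a condition added to the rule, not application code):

```python
def cvc_rule(charge):
    """Gate an aggressive rule behind a network-token check:
    tokenized charges skip it, everything else is evaluated
    normally (here, the CVC-failure block from section 4)."""
    if charge.get("network_token"):
        return "allow"  # token is device-bound; relax this rule
    return "block" if charge["cvc_check"] == "fail" else "allow"
```

The same guard pattern applies to any rule you want to relax for tokenized traffic, not just CVC checks.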