A/B Test Significance Calculator
Two-proportion z-test for conversion experiments — p-value, confidence, relative lift and whether it's significant.
The standard test for conversion A/B experiments. A p-value below 0.05 (95% confidence) is the common bar, but beware of peeking (checking repeatedly inflates false positives) and always compute the required sample size before you start. The p-value is NOT the probability that the variant is better; it is the chance of seeing a gap at least this large if there were no real difference.
Formula
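Written out in full, the test statistic quoted elsewhere on this page can be read as follows. The conversion counts x_A and x_B are labels introduced here for illustration. The p-value and lift lines are the standard definitions for a two-sided pooled z-test, assumed rather than taken from the page.

```latex
\[
p_A = \frac{x_A}{n_A}, \qquad
p_B = \frac{x_B}{n_B}, \qquad
\bar{p} = \frac{x_A + x_B}{n_A + n_B}
\]
\[
z = \frac{p_B - p_A}{\sqrt{\bar{p}\,(1 - \bar{p})\left(\frac{1}{n_A} + \frac{1}{n_B}\right)}}
\]
\[
\text{p-value} = 2\,\bigl(1 - \Phi(|z|)\bigr), \qquad
\text{relative lift} = \frac{p_B - p_A}{p_A}
\]
```

Here \(\Phi\) is the standard normal cumulative distribution function, and \(\bar{p}\) is the pooled conversion rate under the null hypothesis that both variants convert at the same rate.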
About A/B Test Significance Calculator
The A/B test significance calculator answers the question every experiment ends with: is variant B's higher conversion rate real, or just noise? It runs the standard two-proportion z-test, returning each variant's rate, the relative lift, the two-sided p-value, and the corresponding confidence level (1 minus the p-value). A p-value below 0.05 (95% confidence) is the conventional bar for calling a winner, but the tool also flags marginal results that need more data, and the FAQ explains the peeking and interpretation traps that invalidate more experiments than bad math ever does.
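The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the function name `ab_significance`, its parameters, and the example counts are hypothetical, and "confidence" is assumed to mean 1 minus the p-value.

```python
from math import sqrt
from statistics import NormalDist


def ab_significance(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided, pooled two-proportion z-test for an A/B experiment.

    Assumes at least one conversion and one non-conversion overall,
    otherwise the pooled standard error is zero.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "rate_a": p_a,
        "rate_b": p_b,
        "relative_lift": (p_b - p_a) / p_a,
        "z": z,
        "p_value": p_value,
        "confidence": 1 - p_value,  # assumed definition: 1 - p
        "significant": p_value < alpha,
    }


# Hypothetical example: 200/10,000 conversions for A, 250/10,000 for B.
result = ab_significance(conv_a=200, n_a=10_000, conv_b=250, n_b=10_000)
print(result)
```

With these example counts, B converts at 2.5% against 2.0% for A, a 25% relative lift, and the p-value falls below 0.05.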
How to use A/B Test Significance Calculator
1. Enter your values into A/B Test Significance Calculator. Sensible, domain-typical defaults are pre-filled, so you see a real result immediately.
2. The result recomputes live using the formula shown on the page; there is no button to press.
3. Adjust any input to compare scenarios, then read the worked example to see the substituted numbers.
Why use A/B Test Significance Calculator?
- ✓ Computes A/B test significance instantly in your browser: no sign-up, no upload, no server round-trip.
- ✓ 100% free and unlimited, with the exact formula shown: z = (p_B - p_A) / √(p̄(1-p̄)(1/n_A + 1/n_B)).
- ✓ Runs entirely client-side, so every value you enter stays private on your device.
- ✓ Live recompute as you type, with a worked example and authoritative references for trust.
Frequently asked questions
What does the p-value actually mean?
It's the probability of observing a difference at least this large IF the two variants were truly identical (the null hypothesis). A p-value of 0.03 means 'if there were no real difference, you'd see a gap this big only 3% of the time' — so you reject 'no difference'. It is NOT the probability that B is better, a near-universal misinterpretation.
Why is 'peeking' at results dangerous?
Checking significance repeatedly as data arrives and stopping when it crosses 0.05 dramatically inflates false positives — you'll eventually cross the line by chance even with no real effect. Either fix the sample size in advance (see our sample-size calculator) and check once at the end, or use sequential testing methods designed for continuous monitoring.
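A small simulation makes the inflation visible. The sketch below runs A/A tests (both arms share the same true rate, so every "significant" result is a false positive) and compares stopping at the first look that crosses 0.05 against checking once at the end. All names, the 10% rate, the 10 looks, and the seed are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist


def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled two-proportion z-test p-value."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions): nothing to test
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa(rate=0.10, n_per_arm=2000, looks=10, runs=300,
                alpha=0.05, seed=1):
    """Run A/A tests and return (peeking, final-only) false positive rates."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    peeking_hits = 0
    final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        crossed = False
        for look in range(1, looks + 1):
            # Add one batch of visitors to each arm at the same true rate.
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            p = p_value(conv_a, look * step, conv_b, look * step)
            if p < alpha:
                crossed = True  # a peeker would have stopped and declared a winner
        peeking_hits += crossed
        # After the loop, p is the p-value at the final look only.
        final_hits += p < alpha
    return peeking_hits / runs, final_hits / runs


peeking_rate, final_rate = simulate_aa()
print(f"false positives with peeking: {peeking_rate:.1%}")
print(f"false positives, single final check: {final_rate:.1%}")
```

Because the final look is one of the peeks, the peeking rate can never be lower than the final-only rate. In runs like this it is typically several times higher, even though no real effect exists.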
How big a sample do I need?
Compute it before you start, based on your baseline rate, the minimum lift worth detecting, and your desired power (usually 80%). Underpowered tests miss real effects; running until 'it looks significant' is the peeking trap. Our sample-size calculator does this — decide the duration up front and commit to it.
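As a rough illustration of that calculation, the sketch below uses a common normal-approximation formula for a two-sided two-proportion test. It is not necessarily the formula the sample-size calculator uses, and the function name and example inputs are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors per variant for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)  # rate if the minimum lift is real
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical inputs: 10% baseline rate, 10% minimum relative lift.
n_needed = sample_size_per_variant(baseline=0.10, relative_lift=0.10)
print(n_needed)
```

Halving the minimum lift roughly quadruples the required sample, which is why the smallest lift worth detecting should be chosen deliberately before launch.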
Statistical vs practical significance: what's the difference?
With enough traffic, a trivial 0.1% lift becomes statistically significant yet may not be worth shipping. Always pair the p-value with the effect size (the lift) and a confidence interval on the difference. A significant result that's smaller than your minimum-worthwhile effect is a 'real but useless' finding — don't let p < 0.05 alone drive the decision.
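One way to pair the p-value with an effect size is a Wald confidence interval on the absolute difference in rates, sketched below. The function name, the traffic numbers, and the 0.5-point minimum worthwhile effect are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Wald confidence interval for p_B - p_A (absolute difference)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Unpooled standard error: the interval does not assume equal rates.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se


# Hypothetical high-traffic test: 10.00% vs 10.07% on 2,000,000 visitors each.
low, high = diff_confidence_interval(200_000, 2_000_000, 201_400, 2_000_000)
min_worthwhile = 0.005  # assumed: 0.5 percentage points is the smallest lift worth shipping

statistically_significant = low > 0 or high < 0  # interval excludes zero
practically_significant = low >= min_worthwhile  # whole interval clears the bar
print(low, high, statistically_significant, practically_significant)
```

In this example the interval excludes zero, so the difference is statistically significant, yet its upper end sits far below the assumed minimum worthwhile effect: a real but too-small result.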
Related Statistics tools
A/B Test Sample Size Calculator
Required visitors per variant to detect a target lift at your power and significance — before you launch.
Conversion Rate Confidence Interval Calculator
Wald and Wilson confidence intervals for a proportion — the honest error bars around your conversion rate.
Permutation & Combination Calculator
nPr, nCr and n! for counting arrangements and selections — the foundation of combinatorics and probability.