
T-Test Calculator

Perform one-sample, two-sample, and paired t-tests online. Calculate t-statistic, p-value, and confidence intervals with step-by-step statistical analysis.

One-Sample t-test: Compare sample mean to known population mean

About This Calculator

The t-test is one of the most widely used statistical tests for comparing means. It helps determine whether there's a statistically significant difference between groups or whether a sample mean differs from a hypothesized value. This calculator performs one-sample, two-sample, and paired t-tests with complete statistical output.

What is a t-test? The t-test uses the t-distribution to test hypotheses about population means when the population standard deviation is unknown and the sample size is relatively small. It was developed by William Sealy Gosset (publishing under the pseudonym "Student") while working at Guinness Brewery.

Types of t-tests:

  • One-sample t-test: Compare a sample mean to a known or hypothesized population mean
  • Two-sample t-test: Compare means of two independent groups
  • Paired t-test: Compare means of matched pairs (before/after, matched subjects)

When to use the t-test:

  • Data is approximately normally distributed
  • Data is continuous (interval/ratio scale)
  • Sample size is sufficient (generally n ≥ 15-30)
  • Observations are independent (except for paired test)

Key outputs:

  • t-statistic: Measures how many standard errors the sample mean is from the hypothesized mean
  • p-value: Probability of observing this result if null hypothesis is true
  • Confidence interval: Range likely containing the true parameter

For comparing more than two groups, see our ANOVA Calculator. For correlation analysis, try our Correlation Calculator.

How to Use the T-Test Calculator

  1. Select the appropriate t-test type for your data.
  2. Choose the tail type (two-tailed for ≠, one-tailed for < or >).
  3. Set your significance level (α), typically 0.05.
  4. Enter the required statistics for your chosen test.
  5. Review the t-statistic and p-value.
  6. Check if the result is statistically significant.
  7. Examine the confidence interval.
  8. Consider the effect size (Cohen's d) for practical significance.
  9. Interpret results in context of your research question.
  10. Report all relevant statistics in your conclusions.

One-Sample t-test

Compare a sample mean to a known or hypothesized population value.

When to Use

Use one-sample t-test when you want to determine if a sample mean differs significantly from a specific value.

Examples:

  • Is the average height of students different from the national average?
  • Does the mean test score differ from 100?
  • Is the average processing time different from the target?

The Formula

t = (x̄ - μ₀) / (s / √n)

Where:

  • x̄ = sample mean
  • μ₀ = hypothesized population mean
  • s = sample standard deviation
  • n = sample size
  • df = n - 1

Hypotheses

Two-tailed:

  • H₀: μ = μ₀
  • H₁: μ ≠ μ₀

One-tailed (right):

  • H₀: μ ≤ μ₀
  • H₁: μ > μ₀

Example

Research question: Is the average IQ of a group different from 100?

Data:

  • Sample mean: 105.3
  • Sample SD: 12.4
  • Sample size: 36

Calculation:

  • SE = 12.4 / √36 = 2.067
  • t = (105.3 - 100) / 2.067 = 2.565
  • df = 35
  • p-value ≈ 0.015

Conclusion: At α = 0.05, we reject H₀ and conclude the mean IQ is significantly different from 100.
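The calculation above can be sketched in a few lines of Python using only the standard library (values are from this example; the p-value requires a t-distribution CDF, such as scipy.stats.t, so it is left as a comment):

```python
import math

# One-sample t-test from summary statistics (IQ example above).
x_bar, mu0, s, n = 105.3, 100.0, 12.4, 36

se = s / math.sqrt(n)    # standard error of the mean
t = (x_bar - mu0) / se   # t-statistic
df = n - 1               # degrees of freedom

print(f"SE = {se:.3f}, t = {t:.3f}, df = {df}")
# Two-tailed p-value needs a t-distribution CDF, e.g.:
#   from scipy.stats import t as t_dist
#   p = 2 * t_dist.sf(abs(t), df)
```

The same three summary numbers the calculator asks for (mean, SD, n) are all the formula needs.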

Two-Sample t-test

Compare means of two independent groups.

When to Use

Use two-sample t-test when comparing means of two separate, unrelated groups.

Examples:

  • Compare test scores between two teaching methods
  • Compare recovery times between two treatments
  • Compare salaries between two departments

Two Versions

Pooled (Equal Variance)

Assumes both groups have equal population variances.

Formula: t = (x̄₁ - x̄₂) / (sp × √(1/n₁ + 1/n₂))

Where sp = pooled standard deviation: sp = √(((n₁ - 1)s₁² + (n₂ - 1)s₂²) / (n₁ + n₂ - 2))

Welch's (Unequal Variance)

Does not assume equal variances; generally more robust.

Formula: t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)

Choosing Between Them

Use Welch's test when:

  • Sample sizes are unequal
  • Variances appear different
  • You're uncertain about equal variance assumption

Rule of thumb: If larger variance / smaller variance > 2, use Welch's

Example

Research question: Do two medications have different effects on blood pressure?

Drug A: n = 25, mean = -8.2 mmHg, SD = 4.5
Drug B: n = 28, mean = -5.1 mmHg, SD = 5.2

Using Welch's test:

  • SE = √(4.5²/25 + 5.2²/28) = 1.33
  • t = (-8.2 - (-5.1)) / 1.33 = -2.33
  • df ≈ 51 (Welch-Satterthwaite approximation)
  • p-value ≈ 0.024

Conclusion: Drug A shows significantly greater blood pressure reduction.
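This Welch example, including the Welch-Satterthwaite degrees of freedom, can be reproduced from the summary statistics alone (with raw data you would instead call scipy.stats.ttest_ind with equal_var=False):

```python
import math

# Welch's two-sample t-test from summary statistics
# (blood pressure example above: change in mmHg).
n1, m1, s1 = 25, -8.2, 4.5   # Drug A
n2, m2, s2 = 28, -5.1, 5.2   # Drug B

v1, v2 = s1**2 / n1, s2**2 / n2   # per-group variance contributions
se = math.sqrt(v1 + v2)           # standard error of the difference
t = (m1 - m2) / se                # t-statistic

# Welch-Satterthwaite approximation for degrees of freedom
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"SE = {se:.3f}, t = {t:.3f}, df = {df:.1f}")
```

Note that df comes out fractional (about 51 here); that is expected with Welch's test.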

Paired t-test

Compare means of matched or paired observations.

When to Use

Use paired t-test when observations come in pairs:

  • Before/after measurements on same subjects
  • Matched pairs (twins, matched controls)
  • Two measurements on each subject

The Concept

Instead of comparing two groups, analyze the differences within each pair.

Why it's more powerful:

  • Controls for individual variation
  • Each subject serves as their own control
  • Reduces variability in the comparison

The Formula

t = d̄ / (sd / √n)

Where:

  • d̄ = mean of differences
  • sd = standard deviation of differences
  • n = number of pairs
  • df = n - 1

Calculating Differences

Subject | Before | After | Difference (d)
1       | 150    | 142   | -8
2       | 165    | 158   | -7
3       | 145    | 140   | -5
...     | ...    | ...   | ...

Mean difference: d̄
SD of differences: sd

Example

Research question: Does a weight loss program reduce weight?

Data: 20 participants, before and after weights

  • Mean difference: -3.5 kg
  • SD of differences: 2.8 kg

Calculation:

  • SE = 2.8 / √20 = 0.626
  • t = -3.5 / 0.626 = -5.59
  • df = 19
  • p-value < 0.001

Conclusion: The program produces significant weight loss.
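Because the paired test is just a one-sample test on the differences, the sketch is nearly identical to the one-sample case (values from the weight-loss example; the p-value again needs a t-distribution CDF):

```python
import math

# Paired t-test on the within-pair differences
# (weight-loss example above: after minus before, in kg).
d_bar, s_d, n = -3.5, 2.8, 20

se = s_d / math.sqrt(n)   # standard error of the mean difference
t = d_bar / se            # t-statistic
df = n - 1                # degrees of freedom (number of pairs minus 1)

print(f"SE = {se:.3f}, t = {t:.3f}, df = {df}")
```

With raw before/after arrays, subtracting them element-wise and running a one-sample test against 0 gives the same result.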

Understanding p-values

Interpreting the probability value from t-tests.

What p-value Means

The p-value is the probability of obtaining results at least as extreme as observed, assuming the null hypothesis is true.

NOT:

  • The probability H₀ is true
  • The probability the result is due to chance
  • The effect size

Interpreting p-values

p-value   | Interpretation
p < 0.001 | Very strong evidence against H₀
p < 0.01  | Strong evidence against H₀
p < 0.05  | Moderate evidence against H₀
p < 0.10  | Weak evidence against H₀
p ≥ 0.10  | Little evidence against H₀

Common Significance Levels (α)

  • α = 0.05: Most common, good balance
  • α = 0.01: More conservative, fewer false positives
  • α = 0.10: More liberal, fewer false negatives

One-tailed vs. Two-tailed

Two-tailed (default): Tests for any difference (≠)

  • Use when direction of difference is unknown

One-tailed: Tests for specific direction (< or >)

  • Use only when direction is predicted in advance
  • p-value is half the two-tailed value (when the effect is in the predicted direction)

Statistical vs. Practical Significance

Important: A statistically significant result may not be practically meaningful!

Example:

  • Treatment reduces pain by 0.5 points (p = 0.01)
  • Statistically significant? Yes
  • Practically meaningful? Maybe not (small effect)

Always report effect size alongside p-value!

Effect Size: Cohen's d

Measuring the magnitude of the difference.

What is Effect Size?

Effect size quantifies the magnitude of a result independent of sample size. Cohen's d is the most common measure for mean differences.

Formula

Cohen's d = (Mean₁ - Mean₂) / Pooled SD

For one-sample: d = (x̄ - μ₀) / s
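Both formulas are one-liners; a minimal sketch (the helper names are mine, and the inputs come from the one-sample IQ example and the two-group example in the reporting section below):

```python
import math

def cohens_d_one_sample(x_bar, mu0, s):
    """d = (sample mean - hypothesized mean) / sample SD."""
    return (x_bar - mu0) / s

def cohens_d_two_sample(m1, s1, n1, m2, s2, n2):
    """d = mean difference / pooled SD."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / sp

# IQ example: (105.3 - 100) / 12.4, a small-to-medium effect
print(cohens_d_one_sample(105.3, 100.0, 12.4))
# Treatment vs. control groups from the reporting example
print(cohens_d_two_sample(45.2, 8.5, 25, 38.7, 9.1, 28))
```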

Interpretation

Cohen's d | Interpretation | Example
0.2       | Small          | Barely noticeable
0.5       | Medium         | Noticeable
0.8       | Large          | Obvious
1.2       | Very large     | Substantial
2.0       | Huge           | Massive

Why Effect Size Matters

Scenario 1: Large sample, small effect

  • n = 10,000
  • Difference = 1 point
  • p < 0.001 (significant!)
  • d = 0.1 (trivial effect)

Scenario 2: Small sample, large effect

  • n = 20
  • Difference = 15 points
  • p = 0.06 (not significant)
  • d = 1.5 (large effect)

Reporting Standards

Always report:

  1. Descriptive statistics (means, SDs, n)
  2. Test statistic (t)
  3. Degrees of freedom
  4. p-value
  5. Effect size (Cohen's d)
  6. Confidence interval

Example report: "The treatment group (M = 45.2, SD = 8.5, n = 25) scored significantly higher than the control group (M = 38.7, SD = 9.1, n = 28), t(51) = 2.78, p = .008, d = 0.74, 95% CI [1.77, 11.23]."

Assumptions and Violations

Understanding when t-tests are valid.

Key Assumptions

1. Normality

Data should be approximately normally distributed.

Checking:

  • Histograms/Q-Q plots
  • Shapiro-Wilk test

Violation handling:

  • t-tests are robust to mild violations
  • Use non-parametric tests (Mann-Whitney, Wilcoxon) for severe violations
  • Large samples (n > 30) are generally fine

2. Independence

Observations should be independent.

Exceptions:

  • Paired t-test handles dependence within pairs
  • Two-sample: groups must be independent

Violation handling:

  • Use paired test for matched data
  • Consider mixed models for complex dependencies

3. Equal Variance (Two-sample)

Groups should have similar variances.

Checking:

  • Levene's test
  • Compare SD ratio

Violation handling:

  • Use Welch's t-test
  • Generally recommended as default

Sample Size Considerations

Situation            | Minimum n per group
Normal data, equal n | 10-15
Slightly non-normal  | 20-25
Unequal groups       | 15-20 in smaller group
Very non-normal      | Use non-parametric

Power Analysis

Before conducting study, determine needed sample size:

For α = 0.05, power = 0.80:

Effect Size (d) | n per group
Small (0.2)     | ~400
Medium (0.5)    | ~65
Large (0.8)     | ~25
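The ballpark figures in the table can be reproduced with the standard normal-approximation formula for a two-sample test, n per group ≈ 2((z_{α/2} + z_β)/d)². This is a sketch, not an exact t-based power calculation, which comes out slightly larger for medium and large effects:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sample t-test
    (normal approximation; the exact t-based answer is a bit larger)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_b = NormalDist().inv_cdf(power)          # power quantile
    return math.ceil(2 * ((z_a + z_b) / d) ** 2)

print(n_per_group(0.2), n_per_group(0.5), n_per_group(0.8))
```

For a finished study, a dedicated power tool (e.g. G*Power or statsmodels' power module) is still the better choice.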

Pro Tips

  • 💡 Always visualize your data before running statistical tests.
  • 💡 Report effect size (Cohen's d) alongside p-values for complete interpretation.
  • 💡 Use Welch's t-test as default for two-sample comparisons.
  • 💡 Check assumptions: normality, independence, and equal variance when required.
  • 💡 One-tailed tests require a priori justification - don't choose based on data.
  • 💡 Large sample sizes can make trivial differences statistically significant.
  • 💡 Confidence intervals provide more information than p-values alone.
  • 💡 Use paired t-test when you have matched or repeated measurements.
  • 💡 Statistical significance doesn't imply practical importance.
  • 💡 Plan sample size before collecting data using power analysis.
  • 💡 Report complete statistics: means, SDs, n, t, df, p, d, and CI.
  • 💡 Consider multiple testing corrections when running many t-tests.

Frequently Asked Questions

What is the difference between a one-tailed and a two-tailed test?

A two-tailed test checks if the mean differs in either direction (≠), while a one-tailed test checks only one direction (< or >). Two-tailed is the default and more conservative. Use one-tailed only when you have a strong theoretical reason to predict the direction of the effect before seeing the data.

Written by Nina Bao • Content Writer
Updated January 17, 2026
