Medical Statistics Assistant

Statistical Test Selector

Answer the following questions to determine the appropriate statistical test for your study.

What is the main purpose of your analysis?

Explanation:

The purpose of your analysis determines the broad category of statistical tests that would be appropriate for your study.

How many groups are you comparing?

Explanation:

The number of groups in your study affects which statistical test is appropriate. Some tests are designed specifically for two-group comparisons, while others can handle multiple groups.

What type of data are you analyzing?

Explanation:

The type of data you're analyzing is crucial for selecting the appropriate test. Continuous data are measured on a scale, while categorical data fall into distinct categories.

Are the data normally distributed?

Explanation:

Normality is a key assumption of many parametric tests, such as the t-test and ANOVA. You can check it using histograms, Q-Q plots, or statistical tests like Shapiro-Wilk.

What type of categorical data?

Explanation:

The specific type of categorical data affects which test is most appropriate. Binary data have only two possible values, while ordinal data have a natural order.

Are the samples paired or independent?

Explanation:

Paired samples involve the same subjects measured twice (e.g., before and after treatment). Independent samples involve different subjects in each group.

Are the samples paired or independent?

Explanation:

For non-normally distributed data, we use non-parametric tests that don't assume normality.

How to check for normality

Methods to check normality:

1. Visual methods: Histogram, Q-Q plot

2. Statistical tests: Shapiro-Wilk test, Kolmogorov-Smirnov test

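If the p-value is > 0.05 in these tests, there is no significant evidence against normality and the data can be treated as approximately normally distributed. A non-significant result does not prove normality, especially in small samples, so combine the test with a visual check.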
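As a rough illustration of the visual method, the coordinates of a Q-Q plot can be computed with the Python standard library alone. This is a minimal sketch, not the page's own implementation: the function name `qq_points` and the plotting positions (i + 0.5)/n are assumptions, and the sample is the Treatment A column from Example 1.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Pair each sorted observation with the quantile expected under a
    normal distribution fitted to the sample (hypothetical helper)."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    # Plotting positions (i + 0.5) / n are one common convention (an assumption here).
    theoretical = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(sample)))

reductions = [15, 12, 17, 14, 18, 20, 13, 16, 19, 14]
for expected, observed in qq_points(reductions):
    print(f"expected {expected:6.2f}   observed {observed:3d}")
```

If the data are close to normal, the observed values track the expected quantiles roughly along a straight line; strong curvature suggests a non-normal distribution.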

[Figure panels: Normal Distribution, Non-Normal Distribution]

What type of data are you analyzing?

Explanation:

For multiple group comparisons, the data type determines whether to use ANOVA (for continuous data) or chi-square (for categorical data).

Are the data normally distributed?

Explanation:

For multiple group comparisons with continuous data, we use ANOVA if data are normally distributed, or Kruskal-Wallis if not.

Are the groups related or independent?

Explanation:

For normally distributed data with multiple groups, we use repeated measures ANOVA for related samples and one-way ANOVA for independent samples.

What types of variables are you analyzing?

Explanation:

The types of variables you're analyzing determine which correlation or association test is appropriate.

Are both variables normally distributed?

Explanation:

For two continuous variables, we use Pearson correlation if both are normally distributed, or Spearman correlation if not.

How many categories in the categorical variable?

Explanation:

When analyzing a categorical and a continuous variable, the number of categories determines whether to use t-tests (for two categories) or ANOVA (for multiple categories).

What type of outcome are you predicting?

Explanation:

The type of outcome you're predicting determines whether to use linear regression (for continuous outcomes) or logistic regression (for categorical outcomes).

How many categories in the outcome variable?

Explanation:

For categorical outcomes, we use binary logistic regression for two categories or multinomial logistic regression for multiple categories.


Sample Size Calculator

Calculate the required sample size for your study based on statistical parameters.

Select Study Type

Parameters for Comparing Means

Significance level (α):

Power (1-β):

Effect size (Cohen's d):

Number of groups:
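One common way to turn these parameters into a sample size is the normal-approximation formula n = 2(z₁₋α/₂ + z₁₋β)² / d² per group. The sketch below assumes exactly two equally sized groups and a two-sided test; the function name is hypothetical, and the calculator on this page may use a different (for example t-based) method that gives slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group_two_means(alpha: float, power: float, d: float) -> int:
    """Sample size per group for comparing two means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

# Medium effect (Cohen's d = 0.5), alpha = 0.05, power = 0.80
print(n_per_group_two_means(alpha=0.05, power=0.80, d=0.5))  # 63
```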

Parameters for Comparing Proportions

Significance level (α):

Power (1-β):

Proportion in group 1:

Proportion in group 2:
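For two proportions, a widely used approximation is n = (z₁₋α/₂ + z₁₋β)² [p₁(1 − p₁) + p₂(1 − p₂)] / (p₁ − p₂)² per group. The following sketch assumes equal group sizes, a two-sided test, and no continuity correction; the function name is hypothetical and the page's calculator may differ.

```python
import math
from statistics import NormalDist

def n_per_group_two_proportions(alpha: float, power: float, p1: float, p2: float) -> int:
    """Sample size per group for comparing two independent proportions."""
    if p1 == p2:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)

# Detecting 50% vs 30% with alpha = 0.05 and power = 0.80
print(n_per_group_two_proportions(0.05, 0.80, p1=0.5, p2=0.3))  # 91
```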

Parameters for Correlation

Significance level (α):

Power (1-β):

Expected correlation coefficient (r):
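For a correlation, the usual approximation relies on Fisher's z transformation: n = ((z₁₋α/₂ + z₁₋β) / atanh(r))² + 3. The sketch below assumes a two-sided test of r against zero; the function name is hypothetical and is not taken from the page's calculator.

```python
import math
from statistics import NormalDist

def n_for_correlation(alpha: float, power: float, r: float) -> int:
    """Total sample size needed to detect a correlation r that differs from 0."""
    if not 0 < abs(r) < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    fisher_z = math.atanh(abs(r))  # Fisher's z transformation of r
    return math.ceil(((z_alpha + z_beta) / fisher_z) ** 2 + 3)

# Expected r = 0.3, alpha = 0.05, power = 0.80
print(n_for_correlation(0.05, 0.80, r=0.3))  # 85
```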


Quick Reference Guide

Common statistical tests used in medical research and when to use them.

Comparing Groups

Test	When to Use	Example in Medical Research
Independent t-test	Compare means between two independent groups with normally distributed data	Comparing mean blood pressure between treatment and control groups
Paired t-test	Compare means between two related groups with normally distributed data	Comparing blood pressure before and after treatment in the same patients
Mann-Whitney U test	Compare two independent groups with non-normally distributed data	Comparing pain scores between two different treatment groups
Wilcoxon signed-rank test	Compare two related groups with non-normally distributed data	Comparing pain scores before and after treatment in the same patients
One-way ANOVA	Compare means among three or more independent groups with normally distributed data	Comparing mean blood glucose levels among three different drug treatments
Repeated measures ANOVA	Compare means among three or more related groups with normally distributed data	Comparing blood glucose levels at multiple time points after treatment
Kruskal-Wallis test	Compare three or more independent groups with non-normally distributed data	Comparing pain scores among three different treatment groups
Friedman test	Compare three or more related groups with non-normally distributed data	Comparing pain scores at multiple time points after treatment
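The table above can be read as a three-question lookup: two groups or more, normally distributed or not, related or independent. A minimal sketch of that lookup follows; the names `GROUP_TESTS` and `select_group_test` are illustrative and are not the selector's actual code.

```python
# Keys: (exactly two groups?, normally distributed?, related samples?)
GROUP_TESTS = {
    (True, True, False): "Independent t-test",
    (True, True, True): "Paired t-test",
    (True, False, False): "Mann-Whitney U test",
    (True, False, True): "Wilcoxon signed-rank test",
    (False, True, False): "One-way ANOVA",
    (False, True, True): "Repeated measures ANOVA",
    (False, False, False): "Kruskal-Wallis test",
    (False, False, True): "Friedman test",
}

def select_group_test(n_groups: int, normal: bool, related: bool) -> str:
    """Return the test from the table for a continuous or ordinal outcome."""
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    return GROUP_TESTS[(n_groups == 2, normal, related)]

print(select_group_test(3, normal=False, related=True))  # Friedman test
```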

Categorical Data Analysis

Test	When to Use	Example in Medical Research
Chi-square test	Compare proportions between independent groups	Comparing the proportion of patients with side effects between two treatments
Fisher's exact test	Compare proportions between independent groups with small sample sizes	Comparing rare adverse events between two treatments
McNemar's test	Compare proportions between related groups	Comparing the presence of a symptom before and after treatment

Correlation and Regression

Test	When to Use	Example in Medical Research
Pearson correlation	Assess linear relationship between two normally distributed continuous variables	Assessing the relationship between BMI and blood pressure
Spearman correlation	Assess monotonic relationship between two variables when at least one is not normally distributed	Assessing the relationship between disease severity score and quality of life score
Linear regression	Predict a continuous outcome based on one or more predictor variables	Predicting blood pressure based on age, BMI, and sodium intake
Logistic regression	Predict a binary outcome based on one or more predictor variables	Predicting the likelihood of heart attack based on risk factors
Cox proportional hazards	Analyze time-to-event data with censoring	Analyzing survival time after cancer diagnosis based on treatment type

Common Statistical Terms

Term	Definition
p-value	The probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true
Confidence interval	A range of values, computed from the sample, that would contain the true population parameter in a stated percentage (e.g., 95%) of repeated samples
Effect size	A quantitative measure of the magnitude of a phenomenon, such as the difference between groups or the strength of a relationship
Power	The probability of correctly rejecting the null hypothesis when it is false
Type I error	Rejecting the null hypothesis when it is true (false positive)
Type II error	Failing to reject the null hypothesis when it is false (false negative)

Interactive Examples

Explore these examples to better understand statistical concepts and tests.

Example 1: Comparing Two Treatment Groups

A randomized controlled trial compared a new antihypertensive medication (Treatment A) with a standard medication (Treatment B) in 50 patients with hypertension. The primary outcome was reduction in systolic blood pressure after 8 weeks of treatment.

Sample Data:

Treatment A (mmHg reduction)	Treatment B (mmHg reduction)
15, 12, 17, 14, 18, 20, 13, 16, 19, 14	10, 8, 12, 9, 11, 14, 7, 10, 13, 9

Statistical Analysis:

Since we are comparing two independent groups on a continuous outcome, we first need to check whether the data are normally distributed. Assuming normality, an independent t-test is appropriate.

Results: Mean reduction in Treatment A = 15.8 mmHg, Treatment B = 10.3 mmHg

t-statistic: 5.03 (df = 18, pooled variance, computed from the sample data shown), p-value: < 0.001

Interpretation: There is a statistically significant difference in blood pressure reduction between the two treatments, with Treatment A showing a greater reduction.
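The calculation can be retraced by hand with the Python standard library. This sketch assumes a pooled-variance (Student) t-test and uses only the ten values per group listed in the sample data; the function name is illustrative. It reports the statistic and degrees of freedom only, since the p-value needs the t distribution, which the standard library does not provide.

```python
import math
from statistics import mean, variance

treatment_a = [15, 12, 17, 14, 18, 20, 13, 16, 19, 14]
treatment_b = [10, 8, 12, 9, 11, 14, 7, 10, 13, 9]

def pooled_t_statistic(x, y):
    """Student's t for two independent samples with a pooled variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    standard_error = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / standard_error

t_stat = pooled_t_statistic(treatment_a, treatment_b)
df = len(treatment_a) + len(treatment_b) - 2
print(f"mean A = {mean(treatment_a)}, mean B = {mean(treatment_b)}")
print(f"t = {t_stat:.2f}, df = {df}")
```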

Example 2: Before-After Intervention Study

A study evaluated the effect of a 12-week exercise program on HbA1c levels in 15 patients with type 2 diabetes.

Sample Data:

Patient	HbA1c Before (%)	HbA1c After (%)
1-5	8.2, 7.9, 8.5, 7.6, 8.0	7.5, 7.3, 7.8, 7.1, 7.4
6-10	8.3, 7.8, 8.1, 7.7, 8.4	7.6, 7.2, 7.5, 7.0, 7.7
11-15	7.5, 8.2, 7.9, 8.3, 8.0	7.0, 7.6, 7.3, 7.7, 7.4

Statistical Analysis:

Since we are comparing measurements from the same patients before and after an intervention, a paired t-test would be appropriate (assuming the differences are normally distributed).

Results: Mean HbA1c before = 8.03%, after = 7.41%

Mean difference: 0.62% (95% CI: 0.58 to 0.66)

t-statistic: 35.5 (df = 14), p-value: < 0.001

Interpretation: There is a statistically significant reduction in HbA1c levels after the exercise program.
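A paired t-test is a one-sample t-test on the within-patient differences, which the following sketch makes explicit for the 15 patients listed. Variable names are illustrative, and only the statistic is computed because the standard library has no t distribution for the p-value.

```python
import math
from statistics import mean, stdev

before = [8.2, 7.9, 8.5, 7.6, 8.0, 8.3, 7.8, 8.1, 7.7, 8.4, 7.5, 8.2, 7.9, 8.3, 8.0]
after = [7.5, 7.3, 7.8, 7.1, 7.4, 7.6, 7.2, 7.5, 7.0, 7.7, 7.0, 7.6, 7.3, 7.7, 7.4]

# Within-patient reductions in HbA1c (percentage points)
differences = [b - a for b, a in zip(before, after)]

mean_diff = mean(differences)
standard_error = stdev(differences) / math.sqrt(len(differences))
t_stat = mean_diff / standard_error
df = len(differences) - 1

print(f"mean difference = {mean_diff:.2f}")
print(f"t = {t_stat:.1f}, df = {df}")
```

Because every patient improved by a similar amount (0.5 to 0.7 percentage points), the standard error of the differences is small and the t-statistic is large.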

Example 3: Association Between Risk Factor and Disease

A case-control study examined the association between smoking status and lung cancer in 200 participants (100 cases with lung cancer, 100 controls without lung cancer).

Sample Data:

	Smokers	Non-smokers	Total
Lung Cancer	75	25	100
No Lung Cancer	40	60	100
Total	115	85	200

Statistical Analysis:

Since we are examining the association between two categorical variables, a chi-square test would be appropriate.

Results: Chi-square statistic = 25.06 (df = 1, without continuity correction), p-value: < 0.001

Odds Ratio: 4.5 (95% CI: 2.5 to 8.2)

Interpretation: There is a statistically significant association between smoking and lung cancer. The odds of lung cancer among smokers are 4.5 times the odds among non-smokers.
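For a 2×2 table the chi-square statistic and odds ratio have closed forms, sketched below for the counts in the table. The assumptions are a Pearson chi-square without continuity correction and a Woolf (log-based) 95% confidence interval; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# 2x2 table: rows = lung cancer / no lung cancer, columns = smokers / non-smokers
a, b = 75, 25   # cases: smokers, non-smokers
c, d = 40, 60   # controls: smokers, non-smokers
n = a + b + c + d

# Pearson chi-square for a 2x2 table (no continuity correction)
chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Odds ratio with a Woolf (log) confidence interval
odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z = NormalDist().inv_cdf(0.975)
ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)

print(f"chi-square = {chi_square:.2f}")
print(f"OR = {odds_ratio:.1f} (95% CI: {ci_low:.2f} to {ci_high:.2f})")
```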

Example 4: Correlation Between Clinical Variables

A study examined the correlation between body mass index (BMI) and systolic blood pressure (SBP) in 50 adults.

Sample Data (excerpt):

Patient	BMI (kg/m²)	SBP (mmHg)
1-5	22.5, 27.8, 31.2, 24.6, 29.3	118, 132, 145, 125, 138
6-10	26.1, 33.5, 25.2, 30.7, 28.4	128, 150, 122, 142, 135

Statistical Analysis:

Since we are examining the relationship between two continuous variables, a Pearson correlation would be appropriate (assuming both variables are normally distributed).

Results: Correlation coefficient (r) = 0.72, p-value: < 0.001

Interpretation: There is a strong positive correlation between BMI and systolic blood pressure, indicating that as BMI increases, systolic blood pressure tends to increase as well.
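The Pearson coefficient itself is the covariance of the two variables divided by the product of their standard deviations. The sketch below applies that definition to the ten-patient excerpt only, so it will not reproduce the value reported for the full sample of 50; the function name is illustrative.

```python
import math
from statistics import mean

bmi = [22.5, 27.8, 31.2, 24.6, 29.3, 26.1, 33.5, 25.2, 30.7, 28.4]
sbp = [118, 132, 145, 125, 138, 128, 150, 122, 142, 135]

def pearson_r(x, y):
    """Pearson correlation from sums of squares and cross-products."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

r_excerpt = pearson_r(bmi, sbp)
print(f"r (10-patient excerpt only) = {r_excerpt:.2f}")
```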
