📋 Formulas
The formula sheet I typically provide for my students can be found below. You can also click on the tab below to find the relevant formulas for each section of the lecture notes.
\bar{x} = \frac{1}{n}\sum x_i mean
s = \sqrt{\frac{\sum(x_i-\bar{x})^2}{n-1}} sample standard deviation
s^2 variance
\widehat{y}\:=\:b_0+b_1x estimated regression line
b_1=r\:\frac{s_y}{s_x} slope of regression line
b_0=\overline{y}-\:b_1\overline{x} y-intercept of regression line
R^2 =\frac{\text{explained variation}}{\text{total variation}} = (r)^2\cdot 100\% coefficient of determination
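To see the descriptive and regression formulas above in action, here is a minimal Python sketch. The function names and the small data set are made up for illustration, and r is computed with the usual Pearson formula, which this sheet does not list.

```python
from math import sqrt

def mean(xs):
    """x-bar = (1/n) * sum of x_i"""
    return sum(xs) / len(xs)

def sample_sd(xs):
    """s = sqrt( sum (x_i - x-bar)^2 / (n - 1) )"""
    xbar = mean(xs)
    return sqrt(sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1))

def correlation(xs, ys):
    """Pearson r (assumed formula, not listed on the sheet)."""
    xbar, ybar = mean(xs), mean(ys)
    cross = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return cross / ((len(xs) - 1) * sample_sd(xs) * sample_sd(ys))

def regression_line(xs, ys):
    """Return (b0, b1, r_squared) for y-hat = b0 + b1 * x."""
    r = correlation(xs, ys)
    b1 = r * sample_sd(ys) / sample_sd(xs)   # slope
    b0 = mean(ys) - b1 * mean(xs)            # y-intercept
    return b0, b1, r ** 2

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1, r_squared = regression_line(xs, ys)
print(f"y-hat = {b0:.2f} + {b1:.2f}x, R^2 = {r_squared:.0%}")
```

The variance s^2 is just `sample_sd(xs) ** 2`.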
P\left(A\:\cap\:B\right)=P\left(A\:\text{and}\:B\right)=\:P\left(A\mid B\right)\cdot P\left(B\right)
P\left(A\:\cup\:B\right)=P\left(A\text{ or}\:B\right)=\:P\left(A\right)+P\left(B\right)\:-\:P\left(A\text{ and }B\right)
P\left(A\:\mid B\right)=\frac{\:P\left(A\:\text{and }B\right)}{P\left(B\right)}
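The three probability rules above can be checked against each other with a few lines of Python. This is only a sketch: the helper names and the probabilities are hypothetical.

```python
def p_and(p_a_given_b, p_b):
    """Multiplication rule: P(A and B) = P(A | B) * P(B)"""
    return p_a_given_b * p_b

def p_or(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)"""
    return p_a + p_b - p_a_and_b

def p_given(p_a_and_b, p_b):
    """Conditional probability: P(A | B) = P(A and B) / P(B)"""
    return p_a_and_b / p_b

# Hypothetical values, for illustration only
p_a, p_b, p_a_and_b = 0.5, 0.4, 0.2
union = p_or(p_a, p_b, p_a_and_b)      # P(A or B)
cond = p_given(p_a_and_b, p_b)         # P(A | B)
back = p_and(cond, p_b)                # recovers P(A and B)
print(union, cond, back)
```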
Discrete Random Variables
\mu_X=E\left(X\right)=\sum x_i\cdot P\left(x_i\right) mean of a discrete random variable
\sigma_X=\sqrt{\sum\left(x_i-\mu_X\right)^2\cdot P\left(x_i\right)}=\sqrt{\sum\left(x_i^2\cdot P\left(x_i\right)\right)-\mu_X^2} standard deviation of a discrete random variable
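Both forms of the standard deviation formula should give the same number. A small Python sketch, with a made-up probability table and hypothetical function names, shows this:

```python
from math import sqrt

def discrete_mean(values, probs):
    """mu_X = sum of x_i * P(x_i)"""
    return sum(x * p for x, p in zip(values, probs))

def discrete_sd(values, probs):
    """Definition form: sqrt( sum (x_i - mu)^2 * P(x_i) )"""
    mu = discrete_mean(values, probs)
    return sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))

def discrete_sd_shortcut(values, probs):
    """Shortcut form: sqrt( sum x_i^2 * P(x_i) - mu^2 )"""
    mu = discrete_mean(values, probs)
    return sqrt(sum(x * x * p for x, p in zip(values, probs)) - mu ** 2)

# Hypothetical distribution: number of heads in two fair coin flips
values = [0, 1, 2]
probs = [0.25, 0.5, 0.25]
mu = discrete_mean(values, probs)
sd = discrete_sd(values, probs)
sd_short = discrete_sd_shortcut(values, probs)
print(mu, sd, sd_short)
```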
Binomial Random Variables
If X has a binomial distribution with parameters n and p, then
P\left(X\:=\:k\right)=\:{}_nC_k\cdot\:p^k\cdot\left(1-p\right)^{n-k}
\mu_X=E\left(X\right)=np mean for a binomial random variable
\sigma_X=\sqrt{np\left(1-p\right)} standard deviation for a binomial random variable
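The binomial formulas translate almost directly into Python. The sketch below uses `math.comb` (Python 3.8 or later) for nCk; the function names and the values of n, p, and k are chosen only for illustration.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) = nCk * p^k * (1 - p)^(n - k)"""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_mean(n, p):
    """mu_X = n * p"""
    return n * p

def binomial_sd(n, p):
    """sigma_X = sqrt(n * p * (1 - p))"""
    return sqrt(n * p * (1 - p))

# Hypothetical example: 10 fair coin flips, exactly 5 heads
n, p = 10, 0.5
prob_5 = binomial_pmf(5, n, p)
print(prob_5, binomial_mean(n, p), binomial_sd(n, p))
```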
Normal Random Variables
z\:=\frac{\:x-\mu}{\sigma} direct calculation
x\:=\:z\left(\sigma\right)+\mu inverse calculation
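The direct and inverse calculations undo each other, which a short Python sketch makes easy to verify. The helper names and numbers are hypothetical; `statistics.NormalDist` from the standard library is used here in place of a z-table.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Direct calculation: z = (x - mu) / sigma"""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Inverse calculation: x = z * sigma + mu"""
    return z * sigma + mu

# Hypothetical example: exam scores with mu = 70 and sigma = 10
z = z_score(85, 70, 10)
x = from_z(z, 70, 10)              # back to the original score
area_below = NormalDist().cdf(z)   # area to the left of z under N(0, 1)
print(z, x, round(area_below, 4))
```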
\widehat{p} \sim \text{AN}(p,\sqrt\frac{pq}{n}) if
np \text{ and } nq \geq 10
\mu_{\widehat{p}} = p mean of sample proportion
\sigma_{\widehat{p}}=\sqrt\frac{pq}{n} standard deviation of sample proportion
Inferential Statistics
\text{statistic}\pm \text{margin of error} =\text{statistic}\pm \text{critical value}\cdot\text{standard error}= \widehat{p} \pm z^*\sqrt\frac{\widehat{p}\widehat{q}}{n} confidence interval for one proportion
n\:=\frac{\left(z^*\right)^2\widehat{p}\widehat{q}}{(ME)^2 } sample size required for given CL and ME
z=\frac{\widehat{p}-p_0}{\sqrt\frac{p_0q_0}{n}} test statistic for one proportion
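Here is a rough Python sketch of the one-proportion formulas above. The function names and sample numbers are hypothetical, z* comes from `statistics.NormalDist` rather than a table, and rounding the sample size up to the next whole number is the usual convention, assumed here.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_star(cl):
    """Critical value z* for a two-sided confidence level cl (e.g. 0.95)."""
    return NormalDist().inv_cdf(1 - (1 - cl) / 2)

def one_prop_ci(p_hat, n, cl=0.95):
    """p-hat +/- z* * sqrt(p-hat * q-hat / n)"""
    me = z_star(cl) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me

def required_n(p_hat, me, cl=0.95):
    """n = (z*)^2 * p-hat * q-hat / ME^2, rounded up (assumed convention)."""
    return ceil(z_star(cl) ** 2 * p_hat * (1 - p_hat) / me ** 2)

def one_prop_z(p_hat, p0, n):
    """z = (p-hat - p0) / sqrt(p0 * q0 / n)"""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Hypothetical sample: 56 successes out of 100
lo, hi = one_prop_ci(0.56, 100)
n_needed = required_n(0.5, 0.03)
z = one_prop_z(0.56, 0.5, 100)
print(round(lo, 4), round(hi, 4), n_needed, round(z, 2))
```

Note that the confidence interval uses p-hat in the standard error, while the test statistic uses the null value p_0.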
\bar{x} \sim \text{AN}(\mu,\frac{\sigma}{\sqrt{n}}) if
X \sim N \text{ or } n \geq 30
\mu_{\bar{x}} = \mu mean of sample mean
\sigma_{\bar{x}}=\frac{\sigma}{\sqrt{n}} standard deviation of sample mean
Inferential Statistics
\text{statistic}\pm \text{margin of error} =\text{statistic}\pm \text{critical value}\cdot\text{standard error}= \bar{x} \pm t^*\frac{s}{\sqrt{n}} confidence interval for one mean
t=\frac{\bar{x}-\mu_0}{\frac{s}{\sqrt{n}}} test statistic for one mean
degrees of freedom for t are given by \text{df} = n-1
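A short Python sketch of the one-mean formulas follows. The standard library has no t distribution, so t* is passed in as a number read from a t-table; the function names, the sample summary, and the table value 2.030 (df = 35, 95% confidence) are all assumptions for illustration.

```python
from math import sqrt

def one_mean_t(xbar, mu0, s, n):
    """t = (x-bar - mu0) / (s / sqrt(n))"""
    return (xbar - mu0) / (s / sqrt(n))

def one_mean_ci(xbar, s, n, t_star):
    """x-bar +/- t* * s / sqrt(n), with t* looked up for df = n - 1"""
    me = t_star * s / sqrt(n)
    return xbar - me, xbar + me

# Hypothetical sample summary
n, xbar, s = 36, 52, 6
df = n - 1
t = one_mean_t(xbar, 50, s, n)
t_star = 2.030   # assumed t-table value for df = 35 at 95% confidence
lo, hi = one_mean_ci(xbar, s, n, t_star)
print(df, t, round(lo, 2), round(hi, 2))
```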
\widehat{p_1}-\widehat{p_2} \sim \text{AN}(p_1 - p_2,\sqrt{\frac{p_1q_1}{n_1}+\frac{p_2q_2}{n_2}}) if
n_1p_1,\ n_1q_1,\ n_2p_2 \text{ and } n_2q_2 \geq 10
\mu_{\widehat{p_1}-\widehat{p_2}} = p_1-p_2 mean of difference in sample proportions
\sigma_{\widehat{p_1}-\widehat{p_2}} =\sqrt{\frac{p_1q_1}{n_1}+\frac{p_2q_2}{n_2}} standard deviation of difference in sample proportions
Inferential Statistics
\text{statistic}\pm \text{critical value}\cdot\text{standard error}= \widehat{p_1}-\widehat{p_2} \pm z^*\sqrt{\frac{\widehat{p_1}\widehat{q_1}}{n_1}+ \frac{\widehat{p_2}\widehat{q_2}}{n_2}} confidence interval for difference in proportions
z=\frac{\widehat{p_1}-\widehat{p_2}}{\sqrt{\widehat{p_c}\widehat{q_c}(\frac{1}{n_1}+\frac{1}{n_2})}} test statistic for difference in proportions where
p_1 = p_2 is the null hypothesis and
\widehat{p_c} = \frac{X_1+X_2}{n_1+n_2}
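The two-proportion formulas can be sketched in Python as below. X_1 and X_2 are taken to be the success counts in each sample; the function names and counts are hypothetical. The test uses the pooled proportion, while the confidence interval uses the two sample proportions separately.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z(x1, n1, x2, n2):
    """z = (p1-hat - p2-hat) / sqrt(pc-hat * qc-hat * (1/n1 + 1/n2))"""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)              # pooled (combined) proportion
    se = sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def two_prop_ci(x1, n1, x2, n2, cl=0.95):
    """(p1-hat - p2-hat) +/- z* * sqrt(p1q1/n1 + p2q2/n2)"""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - cl) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Hypothetical counts: 45 of 100 in group 1, 30 of 100 in group 2
z = two_prop_z(45, 100, 30, 100)
lo, hi = two_prop_ci(45, 100, 30, 100)
print(round(z, 3), round(lo, 4), round(hi, 4))
```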
Sampling Distribution for Difference of Means
\bar{x_1}-\bar{x_2} \sim \text{AN}(\mu_1 - \mu_2,\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}) if
X_1, X_2 \sim N \text{ or } n_1, n_2 \geq 30
\mu_{\bar{x_1}-\bar{x_2}} = \mu_1-\mu_2 mean of difference in sample means
\sigma_{\bar{x_1}-\bar{x_2}} =\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}} standard deviation of difference in sample means
Inferential Statistics
\text{statistic}\pm \text{critical value}\cdot\text{standard error}= \bar{x_1}-\bar{x_2} \pm t^*\sqrt{\frac{s_1^2}{n_1}+ \frac{s_2^2}{n_2}} confidence interval for difference in means
t=\frac{\bar{x_1}-\bar{x_2}}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}} test statistic for difference in means where
\mu_1 = \mu_2 is the null hypothesis
degrees of freedom for t are given by a super complicated formula
or approximated by \text{df} = \min(n_1-1, n_2-1)
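A minimal Python sketch of the two-sample t statistic, using only the simple approximation df = min(n_1 - 1, n_2 - 1) and not the complicated formula. The function names and sample summaries are hypothetical.

```python
from math import sqrt

def two_mean_t(xbar1, xbar2, s1, s2, n1, n2):
    """t = (x-bar1 - x-bar2) / sqrt(s1^2/n1 + s2^2/n2)"""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (xbar1 - xbar2) / se

def conservative_df(n1, n2):
    """Approximate df = min(n1 - 1, n2 - 1)"""
    return min(n1 - 1, n2 - 1)

# Hypothetical sample summaries
t = two_mean_t(20, 17, 4, 3, 16, 9)
df = conservative_df(16, 9)
print(round(t, 3), df)
```

For a confidence interval, look up t* with this df and use the same standard error.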
\chi^2=\:\sum\frac{\left(O-E\right)^2}{E} test statistic for GoF, where O = observed and E = expected
degrees of freedom for \chi^2 are given by
\text{df} = k-1 where
k is the number of levels of the categorical variable
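The goodness-of-fit statistic is a one-line sum once the expected counts are known. In this Python sketch the expected counts come from hypothesized proportions times the sample size; the function name and the counts are made up for illustration.

```python
def chi_square_gof(observed, expected_props):
    """Return (chi-square, df) with E = n * hypothesized proportion."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1          # k - 1 levels
    return chi2, df

# Hypothetical counts for a 4-level variable, all levels equally likely under H0
observed = [18, 22, 20, 40]
chi2, df = chi_square_gof(observed, [0.25, 0.25, 0.25, 0.25])
print(chi2, df)
```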
Chi-Square Test for Independence / Homogeneity
\chi^2=\:\sum\frac{\left(O-E\right)^2}{E} test statistic for independence / homogeneity, where O = observed and E = expected
E = \frac{ \text{(row total)(col total)} }{\text{grand total}}
degrees of freedom for \chi^2 are given by
\text{df} = (r-1)(c-1) where
r is the number of rows and
c is the number of columns
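For a two-way table, the expected counts and the statistic can be computed together. The Python sketch below takes the table as a list of rows; the function name and the 2x2 counts are hypothetical.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # E = (row total)(col total) / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)   # (r - 1)(c - 1)
    return chi2, df

# Hypothetical 2x2 table of counts
table = [[20, 30],
         [30, 20]]
chi2, df = chi_square_independence(table)
print(chi2, df)
```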
ANOVA
The formulas are gross... I will add them later.
The formulas for inference on regression are coming soon!