Simple Linear Regression
- explores a possible linear relationship between two quantitative variables
Model
In R: lm(Response ~ Explanatory)
We assume
$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$
Where:
- Y = response variable
- $\beta_0$ = true intercept
- $\beta_1$ = true slope
- X = explanatory variable
- $\epsilon$ = random error term (measure of the variability in the relationship)
- $E(Y \mid X) = \beta_0 + \beta_1 X$ = expectation of Y at a value of X
  - the theoretical mean of Y at X
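A minimal R sketch of fitting this model; the data, the variable names x and y, and the true values 2 and 0.5 are invented here purely for illustration:

    # hypothetical simulated data: true intercept 2, true slope 0.5
    set.seed(1)
    x <- runif(50, 0, 10)                  # explanatory variable
    y <- 2 + 0.5 * x + rnorm(50, sd = 1)   # response = beta0 + beta1*x + random error
    fit <- lm(y ~ x)                       # lm(Response ~ Explanatory)
    coef(fit)                              # estimated intercept and slope (near 2 and 0.5)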
Fitting the model estimates the parameters and reports 3 tests:
- Intercept
  - tests $H_0: \beta_0 = 0$ vs $H_a: \beta_0 \neq 0$ with a t-test
- Slope
  - tests $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$ with a t-test (does the line have a slope?)
  - indicates whether there is evidence of a linear relationship between the two variables
- ANOVA F-test
  - also tests $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$ - same hypothesis as the slope t-test, and the p-value is the same (see the R sketch after this list)
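Continuing the hypothetical fit above, a sketch of where the three tests show up in R output:

    summary(fit)  # coefficient table: t-tests of H0: beta0 = 0 and H0: beta1 = 0
                  # columns: Estimate, Std. Error, t value, Pr(>|t|)
    anova(fit)    # F-test of H0: beta1 = 0; with one predictor its p-value equals
                  # the slope t-test's p-value (and F = t^2)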
Estimating $\beta_0$ and $\beta_1$
- least squares
- goal is to find the estimators $\hat{\beta}_0$, $\hat{\beta}_1$ that minimize the sum of squared residuals $\sum_i e_i^2 = \sum_i (y_i - \hat{y}_i)^2$
- $e_i = y_i - \hat{y}_i$ = residual (worked by hand in the sketch below)
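A sketch of the least-squares estimates computed with the textbook closed-form formulas, reusing the hypothetical x and y from above:

    # closed-form least-squares estimates
    b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope = Sxy / Sxx
    b0 <- mean(y) - b1 * mean(x)                                     # intercept
    c(b0, b1)               # should match coef(lm(y ~ x))
    e <- y - (b0 + b1 * x)  # residuals e_i = y_i - yhat_i
    sum(e^2)                # the minimized sum of squared residuals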
Confidence intervals for the two parameters: $\hat{\beta}_j \pm t^*_{n-2} \cdot SE(\hat{\beta}_j)$, for $j = 0, 1$
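In R these intervals can be read off directly; a manual version for the slope is sketched below (95% level and the hypothetical fit from above assumed):

    confint(fit, level = 0.95)   # confidence intervals for both parameters
    # manual version for the slope: estimate +/- t* x SE, on n - 2 degrees of freedom
    cf <- summary(fit)$coefficients
    cf["x", "Estimate"] + c(-1, 1) * qt(0.975, df = df.residual(fit)) * cf["x", "Std. Error"]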
Inference
- we assume that the errors are independent with $\epsilon_i \sim N(0, \sigma^2)$
Testing $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$
- i.e. there is no linear relationship between X and Y
- true slope = 0 means Y is not related to X
- if we assume $H_0$ is true, then $t = \dfrac{\hat{\beta}_1}{SE(\hat{\beta}_1)} \sim t_{n-2}$ (t statistic)
- n-2 because two parameters are being estimated ($\beta_0$ and $\beta_1$)
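A sketch of the slope test done by hand on the hypothetical fit from above; it should match the t value and Pr(>|t|) that summary() reports:

    cf <- summary(fit)$coefficients
    tstat <- cf["x", "Estimate"] / cf["x", "Std. Error"]        # t = beta1_hat / SE(beta1_hat)
    2 * pt(abs(tstat), df = length(x) - 2, lower.tail = FALSE)  # two-sided p-value on n - 2 df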
Measuring the correlation
- measure of the strength of the linear relationship
- SLR is not made to measure the strength of non-linear relationships
- non-linear relationships call for different techniques
- (always plot your data beforehand)
$r$ = Pearson Correlation Coefficient
$r^2$ = Coefficient of Determination
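A short R check of the link between r, r^2, and the model's reported R-squared (same hypothetical data as above):

    r <- cor(x, y)           # Pearson correlation coefficient
    r^2                      # coefficient of determination
    summary(fit)$r.squared   # in simple linear regression this equals r^2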