Introduction
For two variables $X$ and $Y$ there are two regression lines: the best-fit line of $Y$ on $X$ and the best-fit line of $X$ on $Y$. In this post, we explore how to derive key statistical measures (the means, the regression coefficients, the correlation coefficient, and the ratio of the standard deviations) directly from the equations of the two regression lines.
Given Equations
- $3X + 4Y = 65$
- $3X + Y = 32$
Step 1: Finding the Means ($\bar{x}, \bar{y}$)
Both regression lines pass through the point of means, so their point of intersection is $(\bar{x}, \bar{y})$.
Subtract the second equation from the first: $(3X + 4Y) - (3X + Y) = 65 - 32$
$3Y = 33 \Rightarrow Y = 11$
Substitute $Y = 11$ into the second equation: $3X + 11 = 32 \Rightarrow 3X = 21 \Rightarrow X = 7$
Thus, $\bar{x} = 7$ and $\bar{y} = 11$.
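The elimination above can be cross-checked with a few lines of Python. This is only a minimal sketch: the helper name `intersect` and the `(a, b, c)` tuple encoding of a line $aX + bY = c$ are assumptions made for illustration, and Cramer's rule is used here in place of the manual subtraction.

```python
from fractions import Fraction

def intersect(line1, line2):
    """Solve a1*X + b1*Y = c1 and a2*X + b2*Y = c2 using Cramer's rule.

    Each line is a hypothetical (a, b, c) tuple; Fractions keep the result exact.
    """
    a1, b1, c1 = (Fraction(v) for v in line1)
    a2, b2, c2 = (Fraction(v) for v in line2)
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or identical; no unique intersection")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# 3X + 4Y = 65 and 3X + Y = 32
x_bar, y_bar = intersect((3, 4, 65), (3, 1, 32))
print(x_bar, y_bar)  # 7 11
```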
Step 2: Finding Regression Coefficients ($b_{yx}$ and $b_{xy}$)
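We must determine which line is the regression of $Y$ on $X$ and which is the regression of $X$ on $Y$. Because $b_{yx} \cdot b_{xy} = r^2$, a valid assignment must satisfy $0 \le b_{yx} \cdot b_{xy} \le 1$.
Suppose $3X + 4Y = 65$ is $Y$ on $X$ ($Y = -\frac{3}{4}X + \dots$) $\Rightarrow b_{yx} = -\frac{3}{4} = -0.75$
Then $3X + Y = 32$ is $X$ on $Y$ ($X = -\frac{1}{3}Y + \dots$) $\Rightarrow b_{xy} = -\frac{1}{3} \approx -0.33$
Check: $b_{yx} \cdot b_{xy} = \left(-\frac{3}{4}\right)\left(-\frac{1}{3}\right) = 0.25 \le 1$. This assignment is valid.
The reverse assignment would give $b_{xy} = -\frac{4}{3}$ and $b_{yx} = -3$, whose product is $4 > 1$, so it is rejected.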
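The role check can also be automated by trying both assignments and keeping the one whose product lies in $[0, 1]$. The sketch below is illustrative only: the function names `coefficients` and `assign_roles` and the `(a, b, c)` encoding of $aX + bY = c$ are assumptions, and it presumes both $a$ and $b$ are nonzero.

```python
from fractions import Fraction

def coefficients(line):
    """For a*X + b*Y = c, return (slope read as Y on X, slope read as X on Y)."""
    a, b, _c = (Fraction(v) for v in line)
    # Y = -(a/b) X + ...   and   X = -(b/a) Y + ...
    return -a / b, -b / a

def assign_roles(line1, line2):
    """Return (b_yx, b_xy) for the assignment whose product lies in [0, 1]."""
    byx1, bxy1 = coefficients(line1)
    byx2, bxy2 = coefficients(line2)
    candidates = [
        (byx1, bxy2),  # line1 is Y on X, line2 is X on Y
        (byx2, bxy1),  # line2 is Y on X, line1 is X on Y
    ]
    valid = [(byx, bxy) for byx, bxy in candidates if 0 <= byx * bxy <= 1]
    if not valid:
        raise ValueError("no assignment gives 0 <= b_yx * b_xy <= 1")
    return valid[0]

b_yx, b_xy = assign_roles((3, 4, 65), (3, 1, 32))
print(b_yx, b_xy)  # -3/4 -1/3
```

The two candidate products are reciprocals of each other, which is why at most one of them can be strictly below 1.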
Step 3: Correlation Coefficient ($r$)
$r = \pm \sqrt{b_{yx} \cdot b_{xy}} = \pm \sqrt{0.25} = \pm 0.5$. The correlation coefficient always has the same sign as the two regression coefficients; both are negative here, so $r = -0.5$.
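The sign rule is easy to get wrong by hand, so here is a small sketch that applies it mechanically. The function name `correlation` is hypothetical, and the inputs are the exact coefficients $-\frac{3}{4}$ and $-\frac{1}{3}$ from Step 2.

```python
import math
from fractions import Fraction

def correlation(b_yx, b_xy):
    """Return r = sign(b_yx) * sqrt(b_yx * b_xy)."""
    product = b_yx * b_xy
    if product < 0 or product > 1:
        raise ValueError("b_yx * b_xy must lie in [0, 1]")
    # r carries the common sign of the two regression coefficients
    return math.copysign(math.sqrt(product), b_yx)

r = correlation(Fraction(-3, 4), Fraction(-1, 3))
print(r)  # -0.5
```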
Step 4: Ratio of Standard Deviations
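We know $b_{yx} = r \cdot \frac{s_y}{s_x}$ and $b_{xy} = r \cdot \frac{s_x}{s_y}$. Thus, $\frac{s_y}{s_x} = \frac{b_{yx}}{r} = \frac{-0.75}{-0.5} = 1.5$. The ratio of $s_x$ to $s_y$ is the inverse: $\frac{s_x}{s_y} = \frac{1}{1.5} = \frac{2}{3}$, which agrees with $\frac{b_{xy}}{r} = \frac{-1/3}{-0.5}$.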
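As a final numerical check, the ratio can be computed from the two regression coefficients alone. This is a sketch under the assumption that $r \neq 0$; the function name `sd_ratio` is made up for illustration.

```python
import math

def sd_ratio(b_yx, b_xy):
    """Return s_y / s_x = b_yx / r, assuming r is nonzero."""
    b_yx, b_xy = float(b_yx), float(b_xy)
    # r takes the sign of the regression coefficients
    r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
    if r == 0:
        raise ValueError("r is zero; the ratio cannot be recovered this way")
    return b_yx / r

ratio_y_over_x = sd_ratio(-0.75, -1 / 3)
ratio_x_over_y = 1 / ratio_y_over_x
print(round(ratio_y_over_x, 4), round(ratio_x_over_y, 4))
```

Only the ratio is recoverable from the two lines; the individual values of $s_x$ and $s_y$ would need one of them to be given separately.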