How to Find the Equation of a Regression Line and Interpret Regression Lines
TL;DR: Take a cloud of scattered data points on a graph, then draw the single straight line that best splits the difference — that’s a regression line. We write it as y-hat equals a plus b times x. The slope b tells you the predicted change in y for every one-unit jump in x, and the intercept a tells you what y looks like when x is zero. Two coefficients, one line — and suddenly you can predict future values from the pattern.
Key takeaways:
- Regression line equation: \(\hat{y} = a + bx\), where \(\hat{y}\) is the predicted value.
- Slope \(b = r \cdot \dfrac{s_y}{s_x}\); intercept \(a = \bar{y} - b\bar{x}\).
- The line passes through \((\bar{x}, \bar{y})\) – the point of means.
- Slope = predicted change in \(y\) for a one-unit increase in \(x\).
- Don’t extrapolate far beyond your data – the relationship may not hold there.
The steps below show how to find the equation of a regression line and how to interpret it.
Step \(1\): Collect data on the two desired variables and organize them in a scatter plot.
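Step \(2\): The equation of the line of best fit is found using the least squares method, which picks the line that minimizes the sum of squared vertical distances from the data points to the line. To find the line, first calculate the slope (\(m\)) and \(y\)-intercept (\(b\)) from the \(n\) data points. Note that these steps use the \(y=mx+b\) convention: \(m\) here is the slope written as \(b\) in \(\hat{y}=a+bx\), and \(b\) here is the intercept written as \(a\). Slope and \(y\)-intercept are calculated using the following formulas: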
\(m=\frac{n\sum xy-(\sum x)(\sum y)}{n\sum x^2-(\sum x)^2}\)
\(b=\frac{\sum y-m\sum x}{n}\)
Step \(3\): Substitute the calculated slope and \(y\)-intercept into \(y=mx+b\) to get the equation of the regression line.
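As a minimal sketch of Steps 2 and 3, the sums in the two formulas can be computed directly. The function name `fit_line` and the five data points are made up for illustration; they are not from a real data set.

```python
def fit_line(xs, ys):
    """Return (slope m, intercept b) of the least-squares line y = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Denominator of the slope formula: n*sum(x^2) - (sum x)^2
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - m * sum_x) / n
    return m, b


# Illustrative data (assumed, not from the article)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = fit_line(xs, ys)
print(f"y = {m:.1f}x + {b:.1f}")  # prints "y = 0.6x + 2.2"
```

For this sample, \(n=5\), \(\sum x=15\), \(\sum y=20\), \(\sum xy=66\), and \(\sum x^2=55\), so \(m=30/50=0.6\) and \(b=(20-9)/5=2.2\).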
Step \(4\): Use the slope and \(y\)-intercept to interpret the line. The slope is the predicted change in \(y\) for a one-unit increase in \(x\), and its sign gives the direction of the relationship: a positive slope indicates a positive relationship, while a negative slope indicates a negative relationship. The size of the slope depends on the units of \(x\) and \(y\), so it does not measure the strength of the relationship; the correlation coefficient does that. The \(y\)-intercept is the predicted value of \(y\) when \(x=0\), the point where the line crosses the \(y\)-axis.
Step \(5\): The correlation coefficient (\(r\)) measures the strength and direction of the linear relationship between two variables. A correlation coefficient of \(+1\) indicates a perfect positive linear relationship, a coefficient of \(-1\) indicates a perfect negative linear relationship, and a coefficient of \(0\) indicates the absence of a linear relationship. Report \(r\) alongside the regression line to show how closely the points follow it.
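The article does not give a formula for \(r\) at this step. One common computational form of the Pearson correlation uses the same sums as the slope formula; the sketch below assumes that form, and the function name `pearson_r` and the data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("x or y has no variation; r is undefined")
    return numerator / denominator


# Illustrative data (assumed, not from the article)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 4))  # prints 0.7746
```

A value near \(0.77\) points to a fairly strong positive linear relationship in this small sample.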
Step \(6\): To interpret the regression line, you can also use the coefficient of determination (\(R^2\)), which is the proportion of the variation in the dependent variable that is explained by the independent variable. \(R^2\) is always between \(0\) and \(1\), and the closer \(R^2\) is to \(1\), the better the model explains the data.
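One way to compute \(R^2\) is to compare the squared errors left over after fitting the line with the total squared variation of \(y\) around its mean. The sketch below assumes that definition; the name `r_squared`, the data, and the line \(y = 0.6x + 2.2\) (the least-squares line for this made-up data) are illustrative.

```python
def r_squared(xs, ys, m, b):
    """Proportion of the variation in y explained by the line y = m*x + b."""
    y_bar = sum(ys) / len(ys)

    # Unexplained variation: squared vertical distances from points to the line
    ss_residual = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared distances from points to the mean of y
    ss_total = sum((y - y_bar) ** 2 for y in ys)

    if ss_total == 0:
        raise ValueError("y has no variation; R^2 is undefined")
    return 1 - ss_residual / ss_total


# Illustrative data and its least-squares line (assumed, not from the article)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys, m=0.6, b=2.2)
print(round(r2, 2))  # prints 0.6
```

Here the unexplained variation is \(2.4\) out of a total of \(6\), so the line accounts for \(60\%\) of the variation in \(y\).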
A regression line is only an approximation of the relationship between variables and may not accurately describe the relationship for values of \(x\) that are outside the range of the data.
Frequently Asked Questions
What is a regression line?
A regression line is the straight line that best fits a set of paired numerical data \((x, y)\). “Best fit” usually means it minimizes the sum of squared vertical distances from each point to the line (least squares). The line gives a simple linear model of how \(y\) tends to change with \(x\).
What’s the formula for a regression line?
The equation is \(\hat{y} = a + bx\), where \(\hat{y}\) is the predicted \(y\), \(a\) is the intercept, and \(b\) is the slope. The slope is \(b = r \cdot (s_y / s_x)\) and the intercept is \(a = \bar{y} - b\bar{x}\). Most calculators just compute these directly when you enter the data.
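A minimal sketch of these two formulas, using Python's standard `statistics` module for the means and sample standard deviations. The data are made up for illustration, and \(r\) is computed here as the sample covariance divided by \(s_x s_y\), which is one standard way to obtain it.

```python
import statistics

# Illustrative data (assumed, not from the article)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = statistics.mean(xs)
y_bar = statistics.mean(ys)
s_x = statistics.stdev(xs)  # sample standard deviation (divides by n - 1)
s_y = statistics.stdev(ys)

# Correlation r = sample covariance / (s_x * s_y)
covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
r = covariance / (s_x * s_y)

slope = r * s_y / s_x              # b = r * (s_y / s_x)
intercept = y_bar - slope * x_bar  # a = y_bar - b * x_bar

print(f"y-hat = {intercept:.1f} + {slope:.1f}x")  # prints "y-hat = 2.2 + 0.6x"
```

Because the intercept is defined as \(\bar{y} - b\bar{x}\), plugging \(x = \bar{x}\) into the line returns \(\bar{y}\), which is why the line always passes through the point of means.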
How do you interpret the slope?
The slope is the predicted change in \(y\) per one-unit increase in \(x\). If a regression of weight (kg) on height (cm) gives a slope of \(0.5\), then every extra centimeter of height is associated with a predicted half-kilogram increase in weight. The slope is not causation – just an associated change.
How do you interpret the intercept?
The intercept is the predicted \(y\) value when \(x = 0\). Sometimes that’s meaningful (predicted income at 0 years of experience); sometimes it isn’t (predicted weight when height is 0 cm – nonsense). If \(x = 0\) is outside your data range, the intercept is just a mathematical anchor, not a real prediction.
What’s the role of the correlation coefficient \(r\) here?
The correlation \(r\) measures the strength and direction of the linear relationship between \(x\) and \(y\). It plugs directly into the slope formula \(b = r \cdot s_y / s_x\), so the slope always has the same sign as \(r\). When \(r = 0\), the slope is 0 (no linear trend). When \(|r|\) is close to 1, the points lie close to the line; how steep the line is still depends on the ratio \(s_y / s_x\).
What does a residual mean?
A residual is the difference between an observed value and the predicted value from the regression line: \(\text{residual} = y - \hat{y}\). A positive residual means the actual point is above the line; a negative residual means it is below. A pattern in the residuals (like a U-shape) signals that a straight line isn’t capturing the relationship well.
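As a small sketch, residuals can be computed point by point once the line is known. The function name `residuals`, the data, and the coefficients \(a = 2.2\), \(b = 0.6\) (the least-squares line for this made-up data) are illustrative.

```python
def residuals(xs, ys, a, b):
    """Return observed minus predicted y for each point, with y-hat = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Illustrative data and its least-squares line (assumed, not from the article)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

res = residuals(xs, ys, a=2.2, b=0.6)
print([round(e, 1) for e in res])  # prints [-0.8, 0.6, 1.0, -0.6, -0.2]
```

For a least-squares line with an intercept, the residuals sum to zero (up to rounding), so the points above and below the line balance out.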
What’s the coefficient of determination \(r^2\)?
\(r^2\) is the square of the correlation. It tells you what fraction of the variation in \(y\) is explained by the linear relationship with \(x\). If \(r = 0.8\), then \(r^2 = 0.64\) – so 64% of the variation in \(y\) is explained by \(x\), and the remaining 36% comes from other factors or noise.
Can I use a regression line to predict outside my data?
Be careful. Predicting inside the range of your \(x\) values is called interpolation and is usually safe. Predicting far outside that range (extrapolation) is risky – the linear pattern may break down, or the relationship may bend. Always note when a prediction is an extrapolation, and flag it as uncertain.
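One simple way to follow this advice is to have the prediction report whether the input falls outside the observed range of \(x\). The sketch below is only an illustration; the function name `predict`, the data, and the coefficients \(a = 2.2\), \(b = 0.6\) are assumptions.

```python
def predict(x_new, xs, a, b):
    """Return (predicted y, is_extrapolation) for the line y-hat = a + b*x.

    is_extrapolation is True when x_new lies outside the range of the
    x values the line was fitted on.
    """
    y_hat = a + b * x_new
    is_extrapolation = not (min(xs) <= x_new <= max(xs))
    return y_hat, is_extrapolation


# Illustrative x values and fitted coefficients (assumed, not from the article)
xs = [1, 2, 3, 4, 5]

for x_new in (3, 20):
    y_hat, flagged = predict(x_new, xs, a=2.2, b=0.6)
    label = "extrapolation, treat as uncertain" if flagged else "interpolation"
    print(f"x = {x_new}: y-hat = {y_hat:.1f} ({label})")
```

The flag only checks the range of \(x\); it says nothing about how far the relationship might bend outside that range.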
How do I find the regression line on a TI-83/84 calculator?
Enter your \(x\) values into L1 and \(y\) values into L2. Then press STAT \(\rightarrow\) CALC \(\rightarrow\) LinReg(a+bx) and ENTER. The calculator returns \(a\) (intercept), \(b\) (slope), \(r\) (correlation), and \(r^2\). Turn on DiagnosticOn first if you want to see \(r\) and \(r^2\).
Where does linear regression show up on tests?
Algebra II, AP Statistics, the SAT (often as scatter plot questions), the ACT, college intro stats, the GRE, and most data-literacy units on state tests. Common question types: write the regression equation, interpret slope and intercept in context, compute a residual, or assess whether a linear model is appropriate.