Roshan kr Patel

Web Developer



📊 Pure Mathematics Behind Regression

Regression is the backbone of predictive modeling in machine learning. In this blog, we will dive deep into three regression techniques: simple linear, multiple linear, and logistic regression, and learn how to build our own predictive models from scratch. 🧠✨

🚀 What you will learn:

  • How raw data points are combined mathematically into a predictive model
  • How to derive an accurate model from first principles, without libraries
  • A step-by-step walkthrough of each regression technique

๐Ÿ” Let's start with Linear Regression!


📈 Linear Regression

Before we dive into calculations, let's understand what regression really means. In simple terms, regression helps machines learn from given data so that they can predict future values based on past trends. 📊

🔹 What is Linear Regression?

Linear regression is a technique that predicts continuous values using a straight-line equation:

📌 Mathematical equation of a straight line:

y = mx + c

Where:

  • y = dependent variable (e.g., salary)
  • x = independent variable (e.g., years of experience)
  • m = slope of the line
  • c = intercept (where the line crosses the y-axis)
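To make the equation concrete, here is a tiny sketch that evaluates the line for a couple of inputs (the slope and intercept values here are purely illustrative, not fitted to any data):

```python
# Evaluate the straight-line equation y = m*x + c.
def line(x, m, c):
    return m * x + c

# Illustrative values: slope m = 2, intercept c = 1.
print(line(0, 2, 1))  # 1 -> at x = 0 the line crosses the y-axis at c
print(line(3, 2, 1))  # 7 -> each unit increase in x adds m to y
```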

Let's break it down with an example. 👇

📊 Example Dataset

๐Ÿ† Years of Experience (X) ๐Ÿ’ฐ Salary (Y) (in $1000)
1 35
2 40
3 45
4 50
5 55

📌 Our goal: Find a pattern so that given a new input (years of experience), we can predict the salary.


🛠 Finding the Best-Fit Line

๐Ÿ” Step 1: Start with the General Equation

We assume the relationship follows:

y = mx + c

where y represents salary, x represents years of experience, m is the slope, and c is the intercept.

📌 Step 2: Forming Two Main Summation Equations

For a dataset with N points we need two equations to solve for the two unknowns m and c. These are the least-squares normal equations, which we obtain by summing over all data points:

๐Ÿ“ Summing the original equation over all data points:

ΣY = mΣX + Nc

๐Ÿ“ Multiplying each equation by xx and summing:

ΣXY = mΣX² + cΣX

Now, these two equations contain summations that can be computed for any dataset. 🧮
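As a sanity check, the four summations (and the two equations they produce) can be computed for the example dataset above in a few lines of plain Python:

```python
# Example dataset from the table above: years of experience vs. salary (in $1000).
X = [1, 2, 3, 4, 5]
Y = [35, 40, 45, 50, 55]
N = len(X)

sum_x = sum(X)                              # ΣX  = 15
sum_y = sum(Y)                              # ΣY  = 225
sum_xy = sum(x * y for x, y in zip(X, Y))   # ΣXY = 725
sum_x2 = sum(x * x for x in X)              # ΣX² = 55

# Plugging into the two summation equations:
#   ΣY  = mΣX  + Nc   ->  225 = 15m + 5c
#   ΣXY = mΣX² + cΣX  ->  725 = 55m + 15c
print(sum_x, sum_y, sum_xy, sum_x2)  # 15 225 725 55
```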

📌 Step 3: Solving for m and c

Rearranging the equations:

m = (NΣXY − ΣX·ΣY) / (NΣX² − (ΣX)²)

c = (ΣY − mΣX) / N

✅ With these formulas, we can compute m and c for any dataset, whether small or large!

📌 Once we have m and c, we can use the equation y = mx + c to predict the salary for new experience values.
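Putting the two formulas together, here is a minimal from-scratch sketch (plain Python, no libraries) fitted on the example dataset; the `fit_line` name is just illustrative:

```python
def fit_line(X, Y):
    """Least-squares fit of y = m*x + c using the closed-form summation formulas."""
    n = len(X)
    sum_x, sum_y = sum(X), sum(Y)
    sum_xy = sum(x * y for x, y in zip(X, Y))
    sum_x2 = sum(x * x for x in X)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    c = (sum_y - m * sum_x) / n
    return m, c

X = [1, 2, 3, 4, 5]
Y = [35, 40, 45, 50, 55]
m, c = fit_line(X, Y)
print(m, c)       # 5.0 30.0 -> salary = 5*experience + 30 (in $1000)
print(m * 6 + c)  # 60.0 -> predicted salary for 6 years of experience
```

Notice the fitted line passes exactly through every point here because the example data is perfectly linear; real datasets will have residual error around the line.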


🚀 What's Next?

Now that we've built an understanding of simple linear regression, the next sections will explore multiple linear regression and logistic regression. Stay tuned! 🔥📊