This Repo Is About Linear Regression With One Feature

You will need Python 2.

The script is main.ipynb.

This repo codes the linear regression problem from this video. The math example: How to calculate linear regression using least square method

Linear regression finds the straight line, called the least squares regression line, that best fits a set of data points.

We want to find the regression line: the line that best fits through all our points (the least squares regression line).

To find the best fit line, we must minimize the sum of the squared differences between our actual data and our estimated data.
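One common way to make "best fit" concrete is to add up the squared gaps between each actual y and the y the line estimates. The sketch below does that for two candidate lines; the helper name `sum_squared_error` and both candidate lines are assumptions made up for illustration and are not part of the repo.

```python
# Data from this README, as [x, y] pairs.
values = [[1, 2], [2, 4], [3, 5], [4, 4], [5, 5]]


def sum_squared_error(values, b0, b1):
    """Sum of squared gaps between each actual y and the estimate b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in values)


# Two hypothetical candidate lines, chosen only for illustration.
flat_line = sum_squared_error(values, 4.0, 0.0)   # the line y = 4
steep_line = sum_squared_error(values, 1.0, 1.0)  # the line y = 1 + x
print('flat: %.1f, steep: %.1f' % (flat_line, steep_line))
```

The least squares regression line is the line for which this total is as small as possible.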

Let's code!

The data, as [x, y] pairs:

//
values = [[1,2],[2,4],[3,5],[4,4],[5,5]]

![Alt text](rmimg/img1.jpg?raw=true "Title")

Let's plot the data for visualization (actual plotting is not needed in the code at this point).
What we want to find is the mean of x and the mean of y. We will write a function for that.

//
def mean(values):
    return sum(values) / float(len(values))     
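As a quick check, the function can be applied to the x and y columns of the data. This is a minimal sketch; splitting `values` into `x` and `y` with list comprehensions mirrors what the full script does later.

```python
def mean(values):
    # Arithmetic mean: the total divided by the number of values.
    return sum(values) / float(len(values))


values = [[1, 2], [2, 4], [3, 5], [4, 4], [5, 5]]
x = [row[0] for row in values]  # [1, 2, 3, 4, 5]
y = [row[1] for row in values]  # [2, 4, 5, 4, 5]

x_mean, y_mean = mean(x), mean(y)
print('x_mean=%.1f, y_mean=%.1f' % (x_mean, y_mean))  # x_mean=3.0, y_mean=4.0
```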

Our line will pass through the point where the mean of x and the mean of y meet, that is, the point (x_mean, y_mean).
![Alt text](rmimg/img3.jpg?raw=true "Title")
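To see why the line goes through the mean point, note that the intercept is later computed as `b0 = y_mean - b1 * x_mean`. With that choice, any slope gives a line through (x_mean, y_mean). The slopes below are hypothetical values picked for illustration, and the means are hard-coded from the data above.

```python
x_mean, y_mean = 3.0, 4.0  # means of the x and y values in this README

heights = []
for b1 in [0.0, 0.6, 1.0]:  # hypothetical slopes, for illustration only
    b0 = y_mean - b1 * x_mean        # intercept chosen from the slope
    y_at_x_mean = b0 + b1 * x_mean   # height of the line at x = x_mean
    heights.append(y_at_x_mean)
    print('b1=%.1f -> y at x_mean is %.3f' % (b1, y_at_x_mean))
```

Every slope lands on y_mean at x_mean, so the remaining job is only to pick the slope.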

Let's continue finding the best fit line. To do so, we subtract the mean of x from each x value, square each difference, and add them all up. The same thing can be done with the y values. The code calls this sum `variance`, although it is not divided by the number of values.

//
def variance(values, mean):
    return sum([(x-mean)**2 for x in values])   
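Applied to the data in this README, the function gives the totals below. This is a minimal sketch that repeats `mean` and `variance` so it runs on its own.

```python
def mean(values):
    return sum(values) / float(len(values))


def variance(values, mean):
    # Sum of squared deviations from the mean (not divided by n).
    return sum([(x - mean) ** 2 for x in values])


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

var_x = variance(x, mean(x))  # 4 + 1 + 0 + 1 + 4
var_y = variance(y, mean(y))  # 4 + 0 + 1 + 0 + 1
print('var_x=%.1f, var_y=%.1f' % (var_x, var_y))  # var_x=10.0, var_y=6.0
```

Dividing by the number of values would give the usual variance. The slope does not change either way, as long as the covariance is scaled the same way, because the factor cancels in the ratio.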

Solving for b1, the slope: divide the covariance of x and y by the variance of x.

//
def covariance(x, mean_x, y, mean_y):
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mean_x) * (y[i] - mean_y)
    return covar
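Worked through on the README's data, the covariance sum and the resulting slope look like this. It is a sketch with the means and the x variance hard-coded from the earlier steps.

```python
def covariance(x, mean_x, y, mean_y):
    # Sum of the products of the x and y deviations (not divided by n).
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mean_x) * (y[i] - mean_y)
    return covar


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x_mean, y_mean = 3.0, 4.0  # means of x and y
var_x = 10.0               # sum of squared x deviations

covar = covariance(x, x_mean, y, y_mean)  # 4 + 0 + 0 + 0 + 2
b1 = covar / var_x
print('covar=%.1f, b1=%.3f' % (covar, b1))  # covar=6.0, b1=0.600
```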

Solving for b0, the intercept:

b0 = y_mean - b1 * x_mean
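Plugging in the numbers for this data set gives the intercept directly. The values of `x_mean`, `y_mean` and `b1` are hard-coded here from the data in this README.

```python
x_mean, y_mean = 3.0, 4.0  # means of x and y
b1 = 0.6                   # slope: covariance 6.0 divided by variance 10.0

b0 = y_mean - b1 * x_mean  # 4.0 - 0.6 * 3.0
print('b0=%.3f' % b0)      # b0=2.200
```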

# Putting it together
def mean(values):
    # Arithmetic mean: the total divided by the number of values
    return sum(values) / float(len(values))


def variance(values, mean):
    # Sum of squared deviations from the mean (not divided by n)
    return sum([(x-mean)**2 for x in values])


def covariance(x, mean_x, y, mean_y):
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mean_x) * (y[i] - mean_y)
    return covar

def coefficients(values):
    x = [row[0] for row in values]
    y = [row[1] for row in values]
    x_mean, y_mean = mean(x), mean(y)
    # Slope: covariance of x and y divided by the variance of x
    b1 = covariance(x, x_mean, y, y_mean) / variance(x, x_mean)
    # Intercept: chosen so the line passes through (x_mean, y_mean)
    b0 = y_mean - b1 * x_mean
    return [b0, b1]

values = [[1, 2], [2, 4], [3, 5], [4, 4], [5, 5]]
b0,b1 = coefficients(values)
print('Coefficients: b0= %.3f, b1=%.3f' % (b0,b1))
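For this data the script prints `Coefficients: b0= 2.200, b1=0.600`. One way to use those coefficients is to estimate y for each x and measure how far the estimates are from the actual values. The sketch below hard-codes the two coefficients; the `predict` helper and the error total are additions for illustration and are not part of the repo.

```python
values = [[1, 2], [2, 4], [3, 5], [4, 4], [5, 5]]
b0, b1 = 2.2, 0.6  # coefficients for this data set


def predict(x, b0, b1):
    """Estimated y on the fitted line for a given x (hypothetical helper)."""
    return b0 + b1 * x


predictions = [predict(x, b0, b1) for x, _ in values]
# Sum of squared gaps between the actual and the estimated y values.
sse = sum((y - p) ** 2 for (_, y), p in zip(values, predictions))

print(['%.1f' % p for p in predictions])  # ['2.8', '3.4', '4.0', '4.6', '5.2']
print('sum of squared errors: %.2f' % sse)  # sum of squared errors: 2.40
```

No other straight line gives a smaller total for these five points, which is what "least squares" refers to.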