Friday, April 22, 2016

WEEK_10: Linear Regression

Hi there!

Regression analysis is a widely used statistical tool for modelling the relationship between two variables. One of these variables is called the predictor variable, whose value is gathered through experiments. The other is called the response variable, whose value is estimated from the predictor variable.

Ex: y=mx+a

In the above example, y is the response variable and x is the predictor variable.
m is the slope and a is the y-intercept, that is, the value y takes when x=0.
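
As a quick illustration of the equation, the sketch below evaluates y=mx+a in R for a few values of x. The slope and intercept used here are made-up numbers chosen only for the example, and the names x_demo and y_demo are hypothetical.

```r
# Hypothetical slope and intercept, chosen only for illustration
m <- 2
a <- 5

# A few predictor values, including x = 0
x_demo <- c(0, 1, 2, 3)

# Response values computed from y = m*x + a
y_demo <- m * x_demo + a
print(y_demo)
```

The first element corresponds to x=0, so it equals the intercept a, and each further step of 1 in x raises y by the slope m.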

There are many types of regression models, such as linear regression, multiple linear regression and logistic regression.

Today we will discuss linear regression.

In linear regression, the response and predictor variables are related through an equation in which the exponent (power) of both variables is 1. Mathematically, a linear relationship is represented by a straight line when plotted as a graph.

In general, we represent linear regression using the formula y=mx+a.
As an example, let us analyze the relationship between weight and average life span and try to predict the life span.

Create data:

#create weight data
x <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

#create average lifespan data
y <- c(70, 67, 65, 62, 61, 68, 79, 80, 70, 62)



# Apply the lm() function.
relation <- lm(y~x)

print(relation)
 
#Output:
 
#Here the intercept is the value of the response variable when the predictor is zero.
#In other words, it is the y-intercept, the point where the line meets the y axis.
#The other coefficient (labelled x) is the slope of the line.
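
To work with the two coefficients directly rather than reading them off the printed model, a minimal sketch could look like the following. It repeats the data from above so that it runs on its own; the variable names intercept, slope, slope_by_hand and intercept_by_hand are only illustrative. The hand calculation relies on the standard least-squares result for a single predictor: slope = cov(x, y) / var(x).

```r
# Same data as above
x <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
y <- c(70, 67, 65, 62, 61, 68, 79, 80, 70, 62)
relation <- lm(y ~ x)

# coef() returns a named vector with entries "(Intercept)" and "x"
coefs <- coef(relation)
intercept <- coefs[["(Intercept)"]]
slope <- coefs[["x"]]

# For one predictor, the least-squares slope is cov(x, y) / var(x),
# and the line passes through the point (mean(x), mean(y))
slope_by_hand <- cov(x, y) / var(x)
intercept_by_hand <- mean(y) - slope_by_hand * mean(x)

print(c(intercept = intercept, slope = slope))
print(c(intercept = intercept_by_hand, slope = slope_by_hand))
```

Both print() calls should show the same pair of numbers, which is a handy check that lm() is fitting the line y=mx+a described earlier.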
 
#print summary of the relation
print(summary(relation)) 

#Output:

#The difference between the observed value of the dependent variable and the
#predicted value is called the residual.
#R-squared = Explained variation / Total variation.
#It lies between 0 and 1; values closer to 1 mean the line explains more of the variation in y.
#To know more about R-squared, follow this link.
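
The residuals and R-squared reported by summary() can also be recomputed by hand, which may make the definitions above more concrete. This is a sketch under the assumption of an ordinary least-squares fit with an intercept, where Explained variation / Total variation equals 1 - (residual sum of squares / total sum of squares). The names res, ss_res, ss_tot and r_squared are illustrative.

```r
# Same data as above
x <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
y <- c(70, 67, 65, 62, 61, 68, 79, 80, 70, 62)
relation <- lm(y ~ x)

# Residual = observed value minus predicted (fitted) value
res <- y - fitted(relation)

# Unexplained variation: sum of squared residuals
ss_res <- sum(res^2)

# Total variation: sum of squared deviations of y from its mean
ss_tot <- sum((y - mean(y))^2)

# R-squared = explained / total = 1 - unexplained / total
r_squared <- 1 - ss_res / ss_tot

print(r_squared)
print(summary(relation)$r.squared)
```

The two printed values should agree; residuals(relation) returns the same residuals as the manual subtraction.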
 
#let us predict the lifespan of a person who weighs 75 kg
a <- data.frame(x = 75)
result <-  predict(relation,a)
print(result)
#Output:
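
The same prediction can be reproduced from the formula y=mx+a using the fitted coefficients. The sketch below repeats the data so that it is self-contained; new_weight, by_formula and by_predict are illustrative names. Note that the column in the data frame passed to predict() has to be called x, matching the predictor name used in lm(y~x).

```r
# Same data as above
x <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
y <- c(70, 67, 65, 62, 61, 68, 79, 80, 70, 62)
relation <- lm(y ~ x)

new_weight <- 75

# Prediction from the line equation: intercept + slope * weight
by_formula <- coef(relation)[["(Intercept)"]] + coef(relation)[["x"]] * new_weight

# Prediction from predict(); the column name must be x
by_predict <- predict(relation, data.frame(x = new_weight))

print(by_formula)
print(by_predict)
```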
 
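#let us visualize the regression graphically
#plot the data points
plot(x,y,col = "blue",main = "Weight & lifespan Regression",
     cex = 1.3,pch = 16,xlab = "Weight in Kg",
     ylab = "Lifespan in years")

#add the fitted regression line (y on x, the same model as above)
abline(relation)

#close the graphics device; this saves the plot to a file only if a
#file device such as png() was opened before plotting
dev.off()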
 
#Output:
Thanks for visiting my blog. I always love to hear constructive feedback. Please give your feedback in the comment section below or write to me personally here
