# Linear Model Project

When we have a set of data, often we would like to develop a model that fits the data.

First we graph the data points (*x*, *y*) to get a scatterplot. Take the data, determine an appropriate scale on the horizontal axis and the vertical axis, and plot the points, carefully labeling the scale and axes.

Summer Olympics: Men’s 400 Meter Dash Winning Times

| Year (x) | Time (y) (seconds) |
|---|---|
| 1948 | 46.20 |
| 1952 | 45.90 |
| 1956 | 46.70 |
| 1960 | 44.90 |
| 1964 | 45.10 |
| 1968 | 43.80 |
| 1972 | 44.66 |
| 1976 | 44.26 |
| 1980 | 44.60 |
| 1984 | 44.27 |
| 1988 | 43.87 |
| 1992 | 43.50 |
| 1996 | 43.49 |
| 2000 | 43.84 |
| 2004 | 44.00 |
| 2008 | 43.75 |

| Burger | Fat (x) (grams) | Calories (y) |
|---|---|---|
| Wendy’s Single | 20 | 420 |
| BK Whopper Jr. | 24 | 420 |
| McDonald’s Big Mac | 28 | 530 |
| Wendy’s Big Bacon Classic | 30 | 580 |
| Hardee’s The Works | 30 | 530 |
| McDonald’s Arch Deluxe | 34 | 610 |
| BK King Double Cheeseburger | 39 | 640 |
| Jack in the Box Jumbo Jack | 40 | 650 |
| BK Big King | 43 | 660 |
| BK King Whopper | 46 | 730 |

Data from 1997

If the scatterplot shows a relatively linear trend, we try to fit a linear model by finding a line of best fit.

We could pick two arbitrary data points and find the line through them, but that would not necessarily provide a good linear model representative of all the data points.
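To see why two arbitrary points can mislead, here is a minimal sketch that draws the line through the 1948 and 1952 points of the 400 meter dash data and uses it to predict the 2008 time. The helper name `line_through` is made up for this illustration and is not part of the course materials.

```python
# Two of the 400 meter dash data points: (year, winning time in seconds).
p1 = (1948, 46.20)
p2 = (1952, 45.90)


def line_through(p, q):
    """Return slope m and y-intercept b of the line through points p and q."""
    m = (q[1] - p[1]) / (q[0] - p[0])
    b = p[1] - m * p[0]
    return m, b


m, b = line_through(p1, p2)

# Use this two-point line to "predict" the 2008 winning time.
predicted_2008 = m * 2008 + b
actual_2008 = 43.75  # from the data table

print(f"slope = {m:.3f} seconds per year")
print(f"predicted 2008 time = {predicted_2008:.2f} s, actual = {actual_2008:.2f} s")
```

The two-point line has slope −0.075 seconds per year and predicts about 41.70 seconds for 2008, roughly two seconds faster than the actual winning time of 43.75 seconds. A line chosen this way reflects only the two points used, not the overall trend.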

A mathematical procedure that finds a line of “best fit” is called linear regression. This procedure is also called the method of least squares, as it minimizes the sum of the squares of the deviations of the points from the line. In MATH 107, we use software to find the regression line. (We can use Microsoft Excel, or Open Office, or a hand-held calculator or an online calculator — more on this in the Technology Tips topic.)
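For readers curious about what the software does, the following is a minimal sketch of least squares using the standard closed-form formulas for slope and intercept, applied to the 400 meter dash data. The function names `least_squares_line` and `sum_of_squared_deviations` are assumptions made for this illustration; for the project itself, use one of the tools listed in this document.

```python
# Men's 400 meter dash data: year (x) and winning time in seconds (y).
years = [1948, 1952, 1956, 1960, 1964, 1968, 1972, 1976,
         1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008]
times = [46.20, 45.90, 46.70, 44.90, 45.10, 43.80, 44.66, 44.26,
         44.60, 44.27, 43.87, 43.50, 43.49, 43.84, 44.00, 43.75]


def least_squares_line(xs, ys):
    """Return slope m and intercept b of the least squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    b = y_bar - m * x_bar
    return m, b


def sum_of_squared_deviations(xs, ys, m, b):
    """Sum of squared vertical deviations of the points from y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


m, b = least_squares_line(years, times)
print(f"regression line: y = {m:.4f}x + {b:.2f}")
print(f"sum of squared deviations: {sum_of_squared_deviations(years, times, m, b):.4f}")
```

Changing `m` or `b` slightly and recomputing `sum_of_squared_deviations` gives a larger value, which is what "best fit" means in this setting.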

Linear regression software also typically reports parameters denoted by *r* or *r*².

The real number *r* is called the correlation coefficient and provides a measure of the strength of the linear relationship.

*r* is a real number between −1 and 1.

*r* = 1 indicates perfect positive correlation: the regression line has positive slope and all of the data points are on the line.

*r* = −1 indicates perfect negative correlation: the regression line has negative slope and all of the data points are on the line.

The closer |*r*| is to 1, the stronger the linear correlation. If *r* = 0, there is no linear correlation at all. The following examples provide a sense of what an *r* value indicates.

Source: *The Basic Practice of Statistics*, David S. Moore, page 108.

Notice that a positive *r* value is associated with an increasing trend and a negative *r* value is associated with a decreasing trend. The strongest linear models have *r* values close to 1 or close to −1.
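The sign behavior described above can be checked numerically. The sketch below computes *r* from the usual deviation-based formula for both data sets in this document; the function name `correlation` is an assumption made for this illustration, and regression software reports the same quantity directly.

```python
import math

# Burger data: fat in grams (x) and calories (y).
fat = [20, 24, 28, 30, 30, 34, 39, 40, 43, 46]
calories = [420, 420, 530, 580, 530, 610, 640, 650, 660, 730]

# Men's 400 meter dash data: year (x) and winning time in seconds (y).
years = [1948, 1952, 1956, 1960, 1964, 1968, 1972, 1976,
         1980, 1984, 1988, 1992, 1996, 2000, 2004, 2008]
times = [46.20, 45.90, 46.70, 44.90, 45.10, 43.80, 44.66, 44.26,
         44.60, 44.27, 43.87, 43.50, 43.49, 43.84, 44.00, 43.75]


def correlation(xs, ys):
    """Return the correlation coefficient r of the paired data xs, ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r_burgers = correlation(fat, calories)  # increasing trend
r_dash = correlation(years, times)      # decreasing trend
print(f"burgers: r = {r_burgers:.3f}")
print(f"400 meter dash: r = {r_dash:.3f}")
```

Calories rise with fat content, so the burger data gives a positive *r*; winning times fall over the years, so the 400 meter dash data gives a negative *r*.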

The nonnegative real number *r*² is called the coefficient of determination and is the square of the correlation coefficient *r*.

Since −1 ≤ *r* ≤ 1, we have 0 ≤ |*r*| ≤ 1. Multiplying through by |*r*| gives 0 ≤ |*r*|² ≤ |*r*| ≤ 1, and |*r*|² = *r*², so 0 ≤ *r*² ≤ 1. The closer *r*² is to 1, the stronger the indication of a linear relationship.

Some software packages (such as Excel) report *r*², and so to get *r*, take the square root of *r*² and determine the sign of *r* by observing the trend (+ for increasing, − for decreasing).
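As a small illustration of this step, the sketch below recovers *r* from a reported *r*² and the sign of the regression slope. The function name `r_from_r_squared` and the sample values 0.64 and −0.04 are hypothetical, chosen only to show the arithmetic; they are not results for the project data.

```python
import math


def r_from_r_squared(r_squared, slope):
    """Recover r from r^2, taking the sign of r from the regression slope."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r squared must be between 0 and 1")
    magnitude = math.sqrt(r_squared)
    # Increasing trend (positive slope) gives +r; decreasing trend gives -r.
    return magnitude if slope >= 0 else -magnitude


# Hypothetical values: software reports r^2 = 0.64 and a negative slope.
example_r = r_from_r_squared(0.64, slope=-0.04)
print(f"r = {example_r:.2f}")
```

Here the square root of 0.64 is 0.8, and the negative slope indicates a decreasing trend, so *r* = −0.8.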

**RESOURCES: Desmos Graphing Calculator and Linear Regression**

You can use the free online Desmos Graphing Calculator to produce a scatterplot and find the regression line and correlation coefficient.

Go to https://www.desmos.com/calculator and launch the calculator.

Select “table” from the menu at the upper left.


Data for Project Example (Men’s 400 Meter Dash) has been entered. Regression help can be accessed via the “?” icon.

Select “expression” from the menu at the upper left.

Type y1 ~ mx1 + b and the values of *r*², *r*, *m*, and *b* automatically appear.

Selecting the tool at the upper right, you can then adjust the scales on the x and y axes and create labels.

You can give your graph a name. To save your graph, sign in with a free account and click the share button. Anyone who follows the link you are given can open and manipulate the graph. If you click the Image button, you can save the graph as a file.

After clicking the Image button, you can view the graph as a stand-alone image, and select from several options to save.

To complete the Linear Model portion of the project, you will need to use technology (or hand-drawing) to create a scatterplot, find the regression line, plot the regression line, and find *r* and *r*².

**Below are some options, together with some videos.** Each video is limited to 5 minutes or less. It takes a bit of time for the video to initially download. When playing the video, if you want to slow it down to read the text, hit the pause icon. (If you run the mouse over the bottom of the video screen, the video controls will appear.) You may need to adjust the volume.

The basic options are to:

(1) **Generate by hand and scan.**

(2) **Use Microsoft Excel.**

Visit Scatterplot – Start (VIDEO) to see how to create a scatter plot using Microsoft Excel and format the axes.

Visit Scatterplot – Regression Line (VIDEO) to see how to add labels and a title to the scatterplot, how to generate and graph the line of best fit (regression), and how to obtain the value of *r*² in Microsoft Excel.

Using Excel to obtain **precise values of slope m and y-intercept b of the regression line**: Video, Spreadsheet

(3) **Use Open Office.**

(4) **Use a hand-held graphing calculator.** (See section 2.5 in your textbook for help with Texas Instruments hand-held calculators.)

(5) **Use a free online tool.**

Use the free Desmos calculator: See **DesmosLinearRegressionGuide.pdf** to view how to generate a scatterplot and carry out linear regression.