# A tibble: 146 × 3
   film                    critics audience
   <chr>                     <int>    <int>
 1 Avengers: Age of Ultron      74       86
 2 Cinderella                   85       80
 3 Ant-Man                      80       90
 4 Do You Believe?              18       84
 5 Hot Tub Time Machine 2       14       28
 6 The Water Diviner            63       62
 7 Irrational Man               42       53
 8 Top Five                     86       64
 9 Shaun the Sheep Movie        99       82
10 Love & Mercy                 89       87
# ℹ 136 more rows

Modelling film ratings

What is the relationship between the critics and audience scores for films?

What is your best guess for a film’s audience score if the critics rated it a 73?

Predictor (explanatory variable): critics

Outcome (response variable): audience

audience  critics
      86       74
      80       85
      90       80
      84       18
      28       14
      62       63
     ...      ...

Regression line

[Figures not shown: the regression line, its slope, and its intercept.]

Correlation

Ranges between -1 and 1.

Same sign as the slope.
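
As a rough illustration of these two properties, the sketch below computes the correlation for the ten films printed at the top of this section. It rests on a few assumptions: it is written in Python rather than any tooling the course itself uses, the helper name `pearson_r` is made up for this example, and ten rows out of 146 will not reproduce the full-data value.

```python
import math

# (critics, audience) scores for the ten films shown in the tibble preview.
# Assumption: these ten rows only stand in for the full 146-row data set.
critics = [74, 85, 80, 18, 14, 63, 42, 86, 99, 89]
audience = [86, 80, 90, 84, 28, 62, 53, 64, 82, 87]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(critics, audience)
print(f"r = {r:.3f}")
```

For these ten films the result is roughly 0.6, inside the allowed range and positive, so a line fitted to the same ten points slopes upward as well.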

Models with a single predictor

Regression model

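A regression model is a function that describes the relationship between the outcome, \(Y\), and the predictor, \(X\). With a single predictor, the fitted regression line is \(\hat{Y} = b_0 + b_1~X\), where \(b_0\) is the intercept and \(b_1\) is the slope.

The least squares regression line has the following properties: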

The regression line goes through the center of mass point (the coordinates corresponding to average \(X\) and average \(Y\)): \(b_0 = \bar{Y} - b_1~\bar{X}\)

Slope has the same sign as the correlation coefficient: \(b_1 = r \frac{sd_Y}{sd_X}\)

Sum of the residuals is zero: \(\sum_{i = 1}^n \epsilon_i = 0\)

Residuals and \(X\) values are uncorrelated
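
The properties above can be checked numerically. The following is a minimal sketch, assuming Python and only the ten films printed at the top of this section; the names `predict`, `residual_sum`, and `resid_x_cross` are hypothetical, chosen for this example. Because it uses ten rows rather than all 146, its slope will not match the 0.519 quoted for the full data.

```python
import math
import statistics

# (critics, audience) scores for the ten films shown in the tibble preview.
# Assumption: a ten-row subset, used purely for illustration.
critics = [74, 85, 80, 18, 14, 63, 42, 86, 99, 89]
audience = [86, 80, 90, 84, 28, 62, 53, 64, 82, 87]

n = len(critics)
x_bar = statistics.mean(critics)
y_bar = statistics.mean(audience)
sd_x = statistics.stdev(critics)   # sample standard deviation (n - 1 divisor)
sd_y = statistics.stdev(audience)

# Correlation: sum of cross-products divided by (n - 1) * sd_x * sd_y.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(critics, audience))
r = sxy / ((n - 1) * sd_x * sd_y)

# Slope and intercept from the formulas b1 = r * sd_Y / sd_X
# and b0 = mean(Y) - b1 * mean(X).
b1 = r * sd_y / sd_x
b0 = y_bar - b1 * x_bar


def predict(x):
    """Predicted audience score for a given critics score."""
    return b0 + b1 * x


residuals = [y - predict(x) for x, y in zip(critics, audience)]

# Property checks; each holds up to floating-point rounding.
passes_through_means = math.isclose(predict(x_bar), y_bar)
same_sign = (b1 > 0) == (r > 0)
residual_sum = sum(residuals)
# Zero cross-product with centered X means zero correlation with X.
resid_x_cross = sum((x - x_bar) * e for x, e in zip(critics, residuals))

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"line passes through the means: {passes_through_means}")
print(f"slope and r share a sign: {same_sign}")
print(f"sum of residuals: {residual_sum:.2e}")
print(f"residual-X cross-product: {resid_x_cross:.2e}")
```

Calling `predict(73)` then gives a best guess for a film the critics rated 73, but only under this ten-film fit; a model fitted to all 146 films would give a different number.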

Interpreting the slope

The slope of the model for predicting audience score from critics score is 0.519. Which of the following is the best interpretation of this value?

For every one point increase in the critics score, the audience score goes up by 0.519 points, on average.

For every one point increase in the critics score, we expect the audience score to be higher by 0.519 points, on average.

For every one point increase in the critics score, the audience score goes up by 0.519 points.

For every one point increase in the audience score, the critics score goes up by 0.519 points, on average.
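
One way to weigh these options is to compare the model's predictions for two critics scores that are one point apart. This is a sketch under assumptions: \(\hat{y}(x)\) denotes the predicted audience score at critics score \(x\), the intercept is left as the symbol \(b_0\) because its value is not given here, and 73 and 74 are simply example scores.

\[
\hat{y}(74) - \hat{y}(73) = (b_0 + 0.519 \times 74) - (b_0 + 0.519 \times 73) = 0.519
\]

The intercept cancels, so the slope is the difference between two predicted audience scores. A prediction describes the average across films with a given critics score, not the change in any individual film's rating.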