Lesson 6.1: Linear Association Between Two Numerical Variables

Software Lab 6.1 Solutions

  1. Average evaluation scores range from 2.30 to 4.88 with a mean of 4.08 and a standard deviation of 0.48. Average beauty ratings range from 1.67 to 8.17 with a mean of 4.60 and a standard deviation of 1.59. Age ranges from 29 to 73 with a mean of 47.2 and a standard deviation of 10.3.
    Figure 1: Descriptive summary statistics for evals_prof data
  2. Average evaluation scores are slightly left-skewed, with a tail extending toward the lower scores and the majority of average scores between about 3.6 and 4.9.
    Figure 2: Histogram of professor evaluation scores

    Average beauty ratings are reasonably symmetric, with the majority of average ratings between about 3 and 7.

    Figure 3: Histogram of evals_prof data, average beauty ratings

    Ages are mostly symmetric, apart from a couple of professors in their seventies, with the majority of ages between 40 and 60.

    Figure 4: Histogram of evals_prof data, professor ages
  3. There is a slight positive linear trend: the average value of score tends to increase as bty_avg increases. A couple of points stick out from the overall point cloud: one with bty_avg about 5.2 and a very low score of around 2.3, and another with bty_avg about 1.7 and score about 2.7.
    Figure 5: Scatterplot of score vs. bty_avg
  4. There is little evidence of a trend between score and age, and the points appear randomly scattered. A couple of points stick out from the overall point cloud: one with age about 41 and a very low score of around 2.3, and another with age about 60 and score about 2.7. These are the same two points that stick out in question 3.
    Figure 6: Scatterplot of score vs. age
  5. The correlation between score and bty_avg is likely to have a slightly higher absolute value than the correlation between score and age since there is a slightly stronger linear relationship between score and bty_avg than between score and age.
  6. The correlation for score and bty_avg (0.156) has a greater absolute value than the correlation for score and age (–0.080).
  7. The correlation for score and bty_avg decreases to 0.131, while the correlation for score and age is almost unchanged at –0.075.
  8. The linear regression line has a positive slope (correlation is positive), but it is not particularly steep (correlation is close to zero).
    Figure 7: Scatterplot of score vs. bty_avg with a positive-slope linear regression line
  9. The linear regression line has a negative slope (correlation is negative), but it is almost horizontal (correlation is very close to zero).
    Figure 8: Scatterplot of score vs. age with a negative-slope linear regression line
  10. The variable bty_avg would produce slightly more accurate predictions of score on average, since there is a slightly stronger linear relationship between score and bty_avg than there is between score and age.
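
The link between correlation and the steepness of the regression line in answers 8 through 10 can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the lab's own software output. It assumes the standard least-squares identity slope = r × (SD of y / SD of x) and plugs in the rounded summary statistics and correlations reported in answers 1 and 6. The function names are made up for illustration, and because the inputs are rounded, the results only approximate what the software would report.

```python
def slope_from_summary(r, sd_x, sd_y):
    """Least-squares slope implied by a correlation and two standard deviations."""
    return r * sd_y / sd_x


def r_squared(r):
    """Proportion of the variation in y accounted for by a linear fit on x."""
    return r ** 2


# Rounded values copied from answers 1 and 6 (assumed inputs, not raw data).
SD_SCORE = 0.48
SD_BTY_AVG = 1.59
SD_AGE = 10.3
R_SCORE_BTY = 0.156
R_SCORE_AGE = -0.080

# Implied slopes of the two regression lines (score on bty_avg, score on age).
slope_bty = slope_from_summary(R_SCORE_BTY, SD_BTY_AVG, SD_SCORE)
slope_age = slope_from_summary(R_SCORE_AGE, SD_AGE, SD_SCORE)

# Squared correlations, used to compare how well each variable predicts score.
r2_bty = r_squared(R_SCORE_BTY)
r2_age = r_squared(R_SCORE_AGE)

print(f"slope, score on bty_avg: {slope_bty:.4f}")
print(f"slope, score on age:     {slope_age:.4f}")
print(f"r squared, bty_avg:      {r2_bty:.4f}")
print(f"r squared, age:          {r2_age:.4f}")
```

Under these assumptions the implied slopes are about 0.047 score points per beauty-rating point and about –0.004 score points per year of age, and the squared correlations are about 0.024 and 0.006. Both relationships are weak, but bty_avg accounts for slightly more of the variation in score, which is consistent with answer 10.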

License


Introduction to Probability and Statistics Copyright © 2023 by Thompson Rivers University is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
