Lesson 1.3: Summarizing Numerical Data

“Crayon shaving scatterplot” by Richie Girardin is licensed under CC BY 2.0

Lesson Learning Objectives

  • Build a frequency table for numerical data.
  • Construct histograms, stem-and-leaf displays, and dot plots to display the frequency distribution of a numerical variable.
  • Construct a scatterplot to display the relationship between two numerical variables.
  • Describe the distribution of a numerical variable, commenting on its shape, centre, spread, and any unusual features.
  • Calculate the median, range, and interquartile range.
  • Use the outlier rule to identify outliers.
  • Use the 5-number summary to draw a boxplot.
  • Calculate the mean and standard deviation.
  • Compare the properties of the various measures of the centre and the various measures of the spread of a distribution.
  • Choose an appropriate measure of the centre and spread for a variable based on properties of its distribution.
  • Choose the right tool for comparing distributions.
  • Treat outliers with attention and care.
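
To make the numerical objectives above concrete, here is a minimal Python sketch (the course itself uses jamovi, so this is only an illustration). It assumes that "the outlier rule" means the common 1.5 × IQR rule, and it computes quartiles as the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. Other textbooks and software packages use slightly different quartile conventions, so their results may differ a little. The dataset `scores` and the function names are made up for illustration.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for at least two values.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data, excluding the middle value when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # values below the median position
    upper = data[half + n % 2:]   # values above the median position
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])


def outliers_by_iqr_rule(values):
    """Flag values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical data with one unusually large value.
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 30]

summary = five_number_summary(scores)   # (2, 4, 6.5, 9, 30)
data_range = summary[4] - summary[0]    # 30 - 2 = 28
iqr = summary[3] - summary[1]           # 9 - 4 = 5
mean = statistics.mean(scores)          # 8.5
sd = statistics.stdev(scores)           # sample standard deviation (n - 1 divisor)

print("Five-number summary:", summary)
print("Range:", data_range, "IQR:", iqr)
print("Mean:", mean, "SD:", round(sd, 2))
print("Outliers:", outliers_by_iqr_rule(scores))   # [30]
```

In this made-up dataset the single large value pulls the mean (8.5) above the median (6.5), while the median and IQR barely notice it. That contrast is one reason to look at the shape of a distribution before choosing a measure of centre and spread.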

Lesson 1.3 Checklist

Learning activity | Graded? | Estimated time
Read OpenIntro Statistics section 2.1 and supplementary notes | No | 30 mins
Watch instructional videos | No | 15 mins
Answer two lesson check-in questions | Yes | 15 mins
Work through virtual statistical software lab | No | 45 mins
Answer two virtual statistical software lab questions | Yes | 15 mins
Work on practice exercises | No | 1.5 hours
Explore suggested websites | No | 15 mins

Learning Activities

Readings 📖 and Instructional Videos 🎦

Examining Numerical Data

Read Section 2.1: Examining Numerical Data in OpenIntro Statistics (Diez et al., 2019) CC BY-SA 3.0. This section introduces a variety of techniques for summarizing numerical data graphically and numerically. Skim Sections 2.1.7 on Transforming Data and 2.1.8 on Mapping Data, but don’t worry too much about the details.

As you read, look up new terminology in the Glossary and self-assess your understanding by attempting the guided practice exercises.

Watch the video Summarizing and Graphing Numerical Data (Barr & Diez, 2013) on this topic (duration 00:03:10).

Summarizing Numerical Data

Read Supplementary Notes 1.3 for more details on histograms, stem-and-leaf displays, dot plots, 5-number summaries, boxplots, means, medians, standard deviations, and interquartile ranges.
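
As a rough illustration of two of the displays mentioned above, the following Python sketch builds a frequency table (the counts behind a histogram) and a simple stem-and-leaf display. It is not taken from the supplementary notes. The bins are assumed to include their left endpoint and exclude their right endpoint, the stem-and-leaf display assumes non-negative whole numbers with tens as stems and units as leaves, and the dataset `ages` and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def frequency_table(values, bin_width, start):
    """Count values in bins [low, high) of equal width.

    Assumption: every value is at least `start`.
    Returns a list of (low, high, count) tuples, including empty bins.
    """
    counts = Counter((v - start) // bin_width for v in values)
    table = []
    for k in range(max(counts) + 1):
        low = start + k * bin_width
        table.append((low, low + bin_width, counts.get(k, 0)))
    return table


def stem_and_leaf(values):
    """Return display lines with tens as stems and units as leaves.

    Assumption: values are non-negative whole numbers.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {digits}")
    return lines


# Hypothetical data: ages of nine people.
ages = [12, 15, 21, 23, 23, 28, 34, 35, 47]

for low, high, count in frequency_table(ages, bin_width=10, start=10):
    # A row of asterisks gives a crude text histogram.
    print(f"{low} to <{high}: {count} {'*' * count}")

for line in stem_and_leaf(ages):
    print(line)
```

With a bin width of 10 the two displays carry the same counts, but the stem-and-leaf display also keeps the individual values, which is why it suits small datasets.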

Lesson Check-in Questions ✍

Answer the two check-in questions for Lesson 1.3 in your Moodle course. The questions are based on the material covered in the readings and instructional videos. The questions are multiple-choice, fill-in-the-blank, matching, or number questions, and they are auto-graded in Moodle. Once you access the questions, you have 15 minutes to submit your answers. Overall, the Lesson Check-in Questions count for 6% of your total grade.

Virtual Statistical Software Lab 💻

Work through the virtual statistical software lab: Software Lab 1.3: Summarizing Numerical Data. This lab will show you how to generate simple graphical and numerical summaries of numerical data in jamovi. You’ll also learn how to process and subset a large dataset.

As you work through the lab, answer the exercises in the shaded boxes. These exercises are not graded, but the solutions are available: Software Lab 1.3 Solutions. The lab should take you no more than 45 minutes to complete.

Virtual Statistical Software Lab Questions ✍

Answer the two virtual statistical software lab questions for Software Lab 1.3 in your Moodle course. The questions are based on the lab you just completed. The questions are multiple-choice, fill-in-the-blank, matching, or number questions, and they are auto-graded in Moodle. Once you access the questions, you have 15 minutes to submit your answers. Overall, the Software Lab Questions count for 6% of your total grade.

Practice Exercises 🖊

Work on the following exercises in OpenIntro Statistics: Exercises 2.1, 2.5, 2.7, 2.9, 2.11, 2.13, 2.15, and 2.17 and Chapter Exercises 2.27, 2.29, 2.31, and 2.33 (Diez et al., 2019) CC BY-SA 3.0. Check your answers using these solutions (Diez et al., 2019) CC BY-SA 3.0. You’ll deepen your understanding much more effectively if you genuinely attempt the questions by yourself before checking the solutions.

Work on the questions in the WeBWork platform, which are linked from your Moodle course. Check your answers using the solutions provided.

Suggested Websites 🌎

  • The Gapminder [Application] website encourages the use of statistics and related information to learn about social, economic, and environmental development around the world. Play around with their tools [Application] (Gapminder, n.d.) to discover global trends on just about everything you can think of.
  • Data Science for the Liberal Arts by Kevin Lanning (2021) is a work-in-progress that will ultimately serve as a textbook for introductory undergraduate courses in data sciences. It includes a wonderful chapter on Principles of Data Visualization, which is well worth a read.

Media Attributions

Crayon Shaving Scatterplot, by Richie Girardin (2013), via Flickr, CC BY 2.0

References

Barr, C. D., & Diez, D. [OpenIntroOrg]. (2013, May 5). Summarizing and graphing numerical data [Video]. YouTube. https://youtu.be/Xm0PPtci3JE

Diez, D. M., Çetinkaya-Rundel, M., & Barr, C. D. (2019). OpenIntro Statistics (4th ed.). OpenIntro. https://www.openintro.org/book/os/

Gapminder. (n.d.). Animating data [Application]. https://www.gapminder.org/tools/#$chart-type=bubbles&url=v1

Girardin, R. (2013). Crayon shaving scatterplot [Photograph]. Flickr. https://flic.kr/p/faYgZH

Lanning, K. (2021, September 12). Data science for the liberal arts. https://kevinlanning.github.io/DataSciLibArts/

License

Introduction to Probability and Statistics Copyright © 2023 by Thompson Rivers University is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
