Numerical Summaries

Numerical summaries are numbers calculated from the data that tell us about certain aspects of it.

Here are some key concepts:

  1. Center
  2. Spread
  3. Shape
  4. Outliers

When we describe center and spread, there are two separate approaches: order statistics (median and IQR) and moment statistics (mean and standard deviation).


Which statistics should we use?

The shape of the distribution, and whether there are outliers, determines whether we use order statistics (median and IQR) or moment statistics (mean and standard deviation) to describe the center and spread.

In general, we prefer moment statistics (mean and standard deviation) when we can use them, but in certain situations the mean and standard deviation are not good measures of center and spread, because both are pulled toward extreme values.

Here are some situations:

Symmetric shape with no 'extreme' outliers → mean and standard deviation
Skewed shape or outliers (or both) → median and IQR
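
As a rough illustration of the rule above, the sketch below computes both pairs of summaries for a small symmetric data set and for the same data with one extreme value added. The data values and the helper name summarize are made up for this example, and the quartiles use the default method of Python's statistics.quantiles, which is only one of several common conventions.

```python
import statistics

def summarize(values):
    """Return moment statistics and order statistics for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),   # sample standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Hypothetical data: roughly symmetric around 50.
symmetric = [48, 50, 51, 52, 49, 50, 53, 47]
# Same data plus one extreme outlier.
with_outlier = symmetric + [150]

for name, data in [("symmetric", symmetric), ("with outlier", with_outlier)]:
    print(name, summarize(data))
```

With the outlier added, the mean and standard deviation change a lot, while the median stays at 50 and the IQR moves only slightly. That is why the median and IQR are the safer choice when outliers or skew are present.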


Parameters and statistics

Statistics are numerical summaries calculated from the sample.

Parameters are numerical summaries calculated from the population.
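
To make the distinction concrete, here is a minimal sketch that treats a short list as the entire population and takes a few of its values as a sample. The numbers are invented, and the sample is taken by simple slicing only to keep the example reproducible; in practice a sample would usually be drawn at random. The sketch assumes the usual convention that the sample standard deviation divides by n - 1 (statistics.stdev) while the population standard deviation divides by n (statistics.pstdev).

```python
import statistics

# Hypothetical population: pretend these eight values are everyone.
population = [2, 4, 4, 4, 5, 5, 7, 9]
# Illustrative sample: every other value (not a random sample).
sample = population[::2]

# Parameters: calculated from the whole population.
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

# Statistics: calculated from the sample only.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

print("parameters:", pop_mean, pop_sd)
print("statistics:", sample_mean, sample_sd)
```

The sample values are close to the population values but not equal to them, which is the usual situation: statistics are what we can compute, and we use them to estimate the parameters.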


Z-scores

Standardizing our values is a way of adjusting them so that we can directly compare them. These adjusted values are called z-scores.

Standardizing is important because it lets us compare two variables that measure similar things but use different scales.
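
A z-score is computed as z = (value - mean) / standard deviation, so it tells us how many standard deviations a value lies above or below the mean. The sketch below applies this to two scores from tests with different scales. The test names, means, and standard deviations are hypothetical numbers chosen for illustration, and z_score is just a helper defined here.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev

# Hypothetical test A: mean 500, standard deviation 100.
z_a = z_score(650, mean=500, std_dev=100)
# Hypothetical test B: mean 20, standard deviation 5.
z_b = z_score(26, mean=20, std_dev=5)

print(z_a, z_b)
```

The raw scores 650 and 26 cannot be compared directly, but their z-scores can: the first score is 1.5 standard deviations above its mean and the second is 1.2, so the first is the stronger result relative to its own test.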