# Shoe Size and Height


Shoe Size and Heights Class Activity (collect all StatCrunch outputs such as graphs and tables in a separate Word file named “Appendix 1” and make sure to include all group members’ names on the document and email to flekr@newschool.edu)
Names ___________________________       _____________________________
________________________________        _____________________________

1. Create a box plot of Shoe Size, including horizontal gridlines, and detect any outliers. Answer the following questions about Shoe Size from the box plot:

(a) Write down the Five Number Summary:
Min:                Q1:                  Median:          Q3:                  Max:
(b) What is the range?
(c) What is the IQR?
(d) Is the distribution symmetric, negatively skewed or positively skewed? Explain.

(e) Create a relative frequency histogram to confirm your answer to (d).
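The quantities in (a) through (c) can also be checked by hand or in code. The following is a minimal Python sketch, not a StatCrunch procedure. The data in `shoe_sizes` are hypothetical, the helper names are made up for illustration, and the "inclusive" quartile method and the 1.5 × IQR fence rule are assumptions; StatCrunch may use a different quartile convention, so its values can differ slightly.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other software may interpolate differently.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

def iqr_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical shoe sizes, NOT the class data set.
shoe_sizes = [6, 7, 7, 7.5, 8, 8, 8.5, 9, 9.5, 10, 15]

minimum, q1, median, q3, maximum = five_number_summary(shoe_sizes)
data_range = maximum - minimum   # answer to (b)
iqr = q3 - q1                    # answer to (c)

print("Five-number summary:", minimum, q1, median, q3, maximum)
print("Range:", data_range, "IQR:", iqr)
print("Outliers:", iqr_outliers(shoe_sizes))
```

A median that sits closer to Q1 than to Q3, together with a long upper whisker or high outliers, points toward positive skew, which is what the histogram in (e) is meant to confirm.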

2. Create two parallel box plots of Shoe Size comparing results based on (grouped by) Gender. Are there any outliers within each group? If so, what are they? Make sure to highlight the corresponding cases. Describe as many differences between the distributions as you can based on the box plots. Are there any similarities?

(a) Calculate the related summary statistics for each group (n, Mean, SD, Median, IQR, Range, Min, Max, Q1, Q3, and Skewness). Round your answers to two decimal places, where applicable.
Summary Statistics for Shoe Size Grouped by Gender

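| Gender | n | Mean | Std. dev. | Median | IQR | Range | Min | Max | Q1 | Q3 | Skewness |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |  |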

(b) Repeat the above calculation excluding the outlier(s) (use the Where option and exclude the outlier(s) by typing: !Row_Selected). Describe all differences.
Summary Statistics for Shoe Size with Outliers Removed Grouped by Gender

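| Gender | n | Mean | Std. dev. | Median | IQR | Range | Min | Max | Q1 | Q3 | Skewness |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  |  |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |  |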
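To see why removing an outlier changes some statistics more than others, the grouped calculation can be sketched in Python. This is only an illustration under stated assumptions: the `records` are hypothetical, `summarize` is a made-up helper, the value 12 is simply treated as the flagged outlier, and only a few of the requested statistics are computed (sample SD, as `statistics.stdev` defines it).

```python
import statistics
from collections import defaultdict

# Hypothetical (gender, shoe size) records, NOT the class data set.
records = [("F", 6.5), ("F", 7), ("F", 7.5), ("F", 8), ("F", 12),
           ("M", 9), ("M", 10), ("M", 10.5), ("M", 11)]

def summarize(rows):
    """Group shoe sizes by gender and compute n, mean, sample SD, and median."""
    groups = defaultdict(list)
    for gender, size in rows:
        groups[gender].append(size)
    return {
        gender: {
            "n": len(sizes),
            "mean": round(statistics.mean(sizes), 2),
            "sd": round(statistics.stdev(sizes), 2),
            "median": statistics.median(sizes),
        }
        for gender, sizes in groups.items()
    }

# Assumption: ("F", 12) plays the role of the row excluded via the Where option.
with_outlier = summarize(records)
without_outlier = summarize([row for row in records if row != ("F", 12)])

for gender in with_outlier:
    print(gender, with_outlier[gender], "->", without_outlier[gender])
```

In this made-up example the mean and SD of the affected group move noticeably, while the median shifts only a little, which is the kind of difference part (b) asks you to describe.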

3. Using the Frequency Table function in StatCrunch, create two frequency tables for Shoe Size, one for Females and one for Males (excluding the outlier(s) and any missing entries, denoted by “…”; in the Where option, type !Row_Selected and “Shoe Size (US Scale)”!=…). Copy and paste the output into your Appendix file.

(a) Then calculate the quantities below:
Number of Females with Shoe Sizes within 1 SD of the Mean: 5.95 to 8.31 = 119
(Mean – 1SD = 7.13 – 1.18 = 5.95; Mean + 1SD = 7.13 + 1.18 = 8.31)
% of Females with Shoe Sizes within 1 SD of the Mean: 5.95 to 8.31 = ___ = 0.73 = 73%

Number of Females with Shoe Sizes within 2 SD of the Mean: ______ to ________ =
(Mean – 2SD) (Mean + 2SD)
% of Females with Shoe Sizes within 2 SD of the Mean: ______ to ________ = ___ = ___ =     %
Number of Females with Shoe Sizes within 3 SD of the Mean: ______ to ________ =
% of Females with Shoe Sizes within 3 SD of the Mean: ______ to ________ = ___ = ___ =     %
Number of Males with Shoe Sizes within 1 SD of the Mean: ______ to ________ =
% of Males with Shoe Sizes within 1 SD of the Mean: ______ to ________ = ___ = ___ =     %
Number of Males with Shoe Sizes within 2 SD of the Mean: ______ to ________ =
% of Males with Shoe Sizes within 2 SD of the Mean: ______ to ________ = ___ = ___ =     %
Number of Males with Shoe Sizes within 3 SD of the Mean: ______ to ________ =
% of Males with Shoe Sizes within 3 SD of the Mean: ______ to ________ = ___ = ___ =     %
(b) According to the Empirical Rule, assuming that our data are normally distributed, what percentages should we have been expecting? How close were the observed percentages?
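The interval and percentage arithmetic in (a) and the comparison in (b) can be sketched in Python as follows. The mean 7.13 and SD 1.18 are the worked values from the first line of (a); the list `sizes` is hypothetical and the function names are made up for illustration. The benchmark percentages are the usual Empirical Rule values of about 68%, 95%, and 99.7%.

```python
# Approximate shares expected under a normal distribution (Empirical Rule).
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

def sd_interval(mean, sd, k):
    """Return (mean - k*sd, mean + k*sd), rounded to two decimal places."""
    return round(mean - k * sd, 2), round(mean + k * sd, 2)

def share_within(values, mean, sd, k):
    """Return (count, proportion) of values within k SDs of the mean."""
    low, high = mean - k * sd, mean + k * sd
    count = sum(1 for v in values if low <= v <= high)
    return count, count / len(values)

# Mean and SD taken from the worked Female example in the worksheet.
mean_f, sd_f = 7.13, 1.18
print("1 SD interval:", sd_interval(mean_f, sd_f, 1))

# Hypothetical shoe sizes, NOT the class data set.
sizes = [5, 6, 6.5, 7, 7, 7.5, 8, 8, 9, 10]
for k in (1, 2, 3):
    count, share = share_within(sizes, mean_f, sd_f, k)
    print(f"within {k} SD: count={count}, observed={share:.2f}, "
          f"expected about {EMPIRICAL_RULE[k]}")
```

With the real frequency tables, the count comes from adding the frequencies of every shoe size that falls inside the interval, and the percentage is that count divided by the group's n.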

(c) Create two relative frequency histograms, Male and Female, overlaid with the Normal Curve, and describe the fit.

5). (a) Create a scatterplot of Height vs. Shoe Size (include all available data) displaying the regression line, and describe the direction and strength of the association. Are there any “unusual observations”?

(b) Find the correlation coefficient r. What does it tell you about the relationship between the two variables? Hover over the line to see its equation; round all numbers to two decimal places:

Correlation between Height (inches) and Shoe Size is:                      r =

Regression Equation:

(c) What happens if you remove any “unusual points”? How does it affect the correlation coefficient? Is the regression equation affected?

Updated Correlation between Height (inches) and Shoe Size is:      r =

Updated Regression Equation:

(d) Find the mean and SD for each of the two variables (not including the value(s) you removed in the previous step), and verify that (i) the slope of the updated regression line is equal to r*(SDy/SDx) and (ii) the point (Meanx, Meany) is (approximately) on the regression line.
Summary statistics:

| Column | Mean | Std. dev. |
| --- | --- | --- |
| Shoe Size |  |  |
| Height (inches) |  |  |

(i)

(ii)
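The two checks in (d) hold for any least-squares line, which the following Python sketch illustrates. The paired data are hypothetical, the helper names are made up, and treating Shoe Size as x and Height as y is an assumption about how the scatterplot is set up.

```python
import math
import statistics

# Hypothetical paired data, NOT the class data set.
shoe_size = [6, 7, 8, 9, 10, 11]        # x (assumed predictor)
height_in = [62, 64, 66, 67, 70, 72]    # y (assumed response), in inches

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

r = pearson_r(shoe_size, height_in)
slope, intercept = least_squares(shoe_size, height_in)
mean_x, mean_y = statistics.mean(shoe_size), statistics.mean(height_in)
sd_x, sd_y = statistics.stdev(shoe_size), statistics.stdev(height_in)

# (i) slope equals r * (SDy / SDx)
slope_matches = math.isclose(slope, r * sd_y / sd_x)
# (ii) the point (Meanx, Meany) lies on the line
point_on_line = math.isclose(intercept + slope * mean_x, mean_y)

print(f"r = {r:.2f}, line: y = {intercept:.2f} + {slope:.2f} x")
print("slope check:", slope_matches, "| mean-point check:", point_on_line)
```

When you repeat this with StatCrunch output rounded to two decimal places, expect the two sides of each check to agree only approximately.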
