Homework 1: Advanced Data Analysis

See general homework tips and submit your files via the course website.
For all exercises, use the iris data set from the SAS help library (i.e. data=sashelp.iris).
Exercise 1:
a) Obtain box plots of sepallength by species and comment on any differences you notice
between the species' sepal lengths.
b) Obtain basic descriptive statistics for the sepal lengths for all species together. Comment on any
general features of the data (e.g. typical sepal lengths, range of values, etc.). Also visually and
quantitatively check if an assumption of normality would be reasonable for the underlying
population.
c) Obtain basic descriptive statistics and visually and quantitatively check the assumption of
normality for sepallength by species. Comment on how the species-wise statistics differ from
those for all species combined, and comment on the conclusions of the species-wise normality
tests.
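As a starting point for Exercise 1, the three parts could be set up roughly as follows. This is only a sketch of one possible approach, assuming the variable names SepalLength and Species as they appear in sashelp.iris; the choice of procedures and plot options is an assumption, not a required solution.

```sas
/* a) Box plots of sepal length, one box per species */
proc sgplot data=sashelp.iris;
  vbox SepalLength / category=Species;
run;

/* b) Descriptive statistics and normality checks, all species pooled.
      The NORMAL option requests the tests for normality;
      the histogram and Q-Q plot give the visual check. */
proc univariate data=sashelp.iris normal;
  var SepalLength;
  histogram SepalLength / normal;
  qqplot SepalLength / normal(mu=est sigma=est);
run;

/* c) The same analysis repeated separately for each species */
proc univariate data=sashelp.iris normal;
  class Species;
  var SepalLength;
  histogram SepalLength / normal;
  qqplot SepalLength / normal(mu=est sigma=est);
run;
```

The output still has to be interpreted in your own words; the code only produces the tables and plots that the comments should be based on.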
Exercise 2:
a) Test the null hypothesis that the true mean or median sepal length is 60 against the alternative
that it is not 60. Based on the normality tests from Exercise 1, which location test should we use
and what do we conclude about the true mean or median sepal length of the population?
b) Of the three species, Virginica has the highest mean and median sepal length, but is it
significantly greater than that of the general population? Use the sample median of sepal length
for all species as the null value and perform a t, sign, or signed rank test of whether Virginica
has a significantly larger sepal length than the general population. (Note: you should have the
median value in the results from Exercise 1, and the tests from Exercise 1 should also tell you
whether to use a t test or one of the rank-based tests.)
c) Consider the Setosa and Versicolor sepal lengths. Test for differences of the two populations
(e.g. test for a difference of mean or median if appropriate or test if one population is
stochastically greater if testing the difference of means would not be appropriate), and state
your conclusion.
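The tests in Exercise 2 could be requested along the following lines. This is a hedged sketch rather than a model answer: the macro variable m0 is a hypothetical placeholder that you must replace with the pooled sample median from your own Exercise 1 output, and which test you report should follow from your normality results.

```sas
/* a) Tests for location with null value 60, all species pooled.
      The "Tests for Location" table reports the t, sign, and
      signed rank tests with two-sided p-values. */
proc univariate data=sashelp.iris mu0=60;
  var SepalLength;
run;

/* b) Virginica against the pooled sample median.
      PLACEHOLDER: 60 is NOT the answer; substitute the median
      you obtained in Exercise 1. */
%let m0 = 60;

/* Sign and signed rank tests (two-sided p-values are reported,
   so a one-sided conclusion needs an adjustment by hand) */
proc univariate data=sashelp.iris mu0=&m0;
  where Species = 'Virginica';
  var SepalLength;
run;

/* One-sided (upper-tailed) t test, if normality is reasonable */
proc ttest data=sashelp.iris h0=&m0 sides=U;
  where Species = 'Virginica';
  var SepalLength;
run;

/* c) Setosa versus Versicolor: two-sample t test ... */
proc ttest data=sashelp.iris;
  where Species in ('Setosa', 'Versicolor');
  class Species;
  var SepalLength;
run;

/* ... and the Wilcoxon rank-sum test as the rank-based alternative */
proc npar1way data=sashelp.iris wilcoxon;
  where Species in ('Setosa', 'Versicolor');
  class Species;
  var SepalLength;
run;
```

Only one of the t test or the rank-based test needs to be reported for each part; running both here is just a convenience so that the appropriate one is available once the normality question is settled.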
Exercise 3:
a) Obtain the Pearson correlation matrix for the entire data set. Comment on what the results tell
us about significant relationships between the four length and width measurements.
b) Obtain the Pearson correlation matrix by species. Comment on what the results tell us about
significant relationships between the four length and width measurements for each species and
how these relationships compare with those noted for the entire population.
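The two correlation analyses in Exercise 3 could be obtained as sketched below. The output data set name work.iris_sorted is an arbitrary choice made for illustration; the sort step is included because BY-group processing expects the data to be ordered by the BY variable.

```sas
/* a) Pearson correlation matrix (with p-values) for all flowers */
proc corr data=sashelp.iris pearson;
  var SepalLength SepalWidth PetalLength PetalWidth;
run;

/* b) The same matrix, computed separately for each species */
proc sort data=sashelp.iris out=work.iris_sorted;
  by Species;
run;

proc corr data=work.iris_sorted pearson;
  by Species;
  var SepalLength SepalWidth PetalLength PetalWidth;
run;
```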
