Assignment 2: All about data, urban legends and statistics



    For all problems below, always show your work. I am less interested in whether you obtain the "right answer" than in the process you used to try to solve the problem. For questions involving the Z-test, it is best to enter the necessary data into the provided Z-test Calculator.

  1. The current (2015) population of Eugene is approximately 163,500 people. One of your social media circles insists that Eugene's population is growing out of control and that there will be 300,000 people by the year 2035. Everyone agrees with that, so it must be true.

    Okay, get the data by typing "population of eugene oregon" into Google to bring up this interactive graph (a screenshot for the case of 1991 is shown below).



      a) Determine the growth rate (in units of % per year) for the period 2000-2015 from this graph.

      b) From that rate, use the doubling time to estimate the year in which the population of Eugene will double.

      c) Compose a tweet (160 characters max) to your social media circle about why their "truth" is incorrect.

      d) If the Eugene school district is mandated to have 1 elementary school per 10,000 people, in how many more years (from 2015) will 3 new schools have to be built?
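
    The arithmetic behind parts a), b) and d) can be sketched as follows. This is only an illustration: it assumes steady compound growth and the rule-of-70 approximation for the doubling time (the lecture may use a slightly different form), and the year-2000 population used here is a made-up placeholder, not a value read from the graph. The function names are hypothetical.

```python
import math

def annual_growth_rate(pop_start, pop_end, years):
    """Average compound growth rate, in percent per year."""
    return ((pop_end / pop_start) ** (1.0 / years) - 1.0) * 100.0

def doubling_time(rate_percent):
    """Rule-of-70 estimate of the doubling time, in years."""
    return 70.0 / rate_percent

def years_to_reach(pop_now, pop_target, rate_percent):
    """Years of compound growth needed to go from pop_now to pop_target."""
    return math.log(pop_target / pop_now) / math.log(1.0 + rate_percent / 100.0)

POP_2015 = 163_500   # from the problem statement
POP_2000 = 140_000   # HYPOTHETICAL placeholder: read the real value off the graph

rate = annual_growth_rate(POP_2000, POP_2015, 2015 - 2000)
print(f"growth rate: {rate:.2f} % per year")
print(f"doubling time: {doubling_time(rate):.0f} years")
# Three new schools at 1 school per 10,000 people means 30,000 more people.
print(f"years to add 30,000 people: {years_to_reach(POP_2015, POP_2015 + 30_000, rate):.1f}")
```

    Replace the placeholder with the population you actually read from the graph before drawing any conclusions.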


  2. The following is a list of the total points scored in 15 randomly selected NFL football games:

    23
    31
    17
    25
    12
    18
    48
    35
    27
    33
    41
    63
    24
    22
    48
    
    Calculate the average and standard deviation of these scores any way you like. This tool is very easy to use: you can just cut and paste the data into that page. Here is another tool, but it wants data separated by commas.

    Your urban legend friend (someone who believes anecdotal evidence much more than quantitative reasoning) insists that scoring more than 45 points in an NFL game is highly improbable (meaning it happens less than 1% of the time).

      a) Based on this data, what would you tell your friend?

      b) Based on this data, what point total corresponds to the 1% event (the total that is exceeded only 1% of the time)?
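
    One possible way to work through this problem in code is sketched below. It assumes the point totals are roughly normally distributed (an assumption, not something the data guarantee with only 15 games) and uses the sample standard deviation with n - 1 in the denominator; the online tools may use n instead, which gives a slightly smaller value.

```python
from statistics import NormalDist, mean, stdev

# Total points in the 15 games listed above.
points = [23, 31, 17, 25, 12, 18, 48, 35, 27, 33, 41, 63, 24, 22, 48]

avg = mean(points)
sd = stdev(points)  # sample standard deviation (n - 1 in the denominator)

# ASSUMPTION: totals follow a normal distribution with this mean and sd.
model = NormalDist(mu=avg, sigma=sd)

p_over_45 = 1.0 - model.cdf(45)      # chance of a game above 45 points
level_1pct = model.inv_cdf(0.99)     # total exceeded only 1% of the time

# Direct count in the sample, with no distribution assumed.
observed_fraction = sum(p > 45 for p in points) / len(points)

print(f"mean = {avg:.2f}, sd = {sd:.2f}")
print(f"P(total > 45) under the normal model = {p_over_45:.3f}")
print(f"observed fraction above 45 = {observed_fraction:.2f}")
print(f"1% level = {level_1pct:.1f} points")
```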

  3. The following represent a set of hypothetical exam scores. Prior to the exam, the professor said that a failing grade would be given to any student who scored more than 2 standard deviations below the mean of the exam.

    Here are the exam scores: 77 78 81 73 75 82 72 84 86 76 61 66 68 71 74 76 70 87 81 83 84 78 85 79

    The student who scored 61 and received an F argued that his score was only 20% below the average, so how is that a failing grade? From the data, determine whether the student is above or below the 2 standard deviation requirement.
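
    A minimal sketch of the comparison is given below. It uses the sample standard deviation (n - 1); whether the professor intended that or the population form is an assumption, so check which one your tool reports.

```python
from statistics import mean, stdev

scores = [77, 78, 81, 73, 75, 82, 72, 84, 86, 76, 61, 66,
          68, 71, 74, 76, 70, 87, 81, 83, 84, 78, 85, 79]

avg = mean(scores)
sd = stdev(scores)             # sample standard deviation (n - 1)
cutoff = avg - 2 * sd          # failing threshold stated by the professor

student = 61
z_student = (student - avg) / sd                  # distance from the mean in sd units
percent_below = (avg - student) / avg * 100.0     # the student's "percent" argument

print(f"mean = {avg:.2f}, sd = {sd:.2f}, cutoff = {cutoff:.2f}")
print(f"score {student}: z = {z_student:.2f}, {percent_below:.1f}% below the mean")
```

    The point of printing both numbers is that a percentage below the mean and a distance in standard deviations answer different questions.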




  4. Important note: 10 cents = $0.10 - don't mix up your units for this question.

    A survey of 25 gas stations in the Eugene area showed an average price for unleaded gas of $3.50 per gallon with a standard deviation of 10 cents.

    A survey of 25 stations in the Portland area had an average of $3.20 per gallon with a standard deviation of 15 cents.

    A survey of 50 stations in the Seattle area had an average of $3.00 per gallon with a standard deviation of 25 cents.



    a) What is the probability, in each of Eugene, Portland and Seattle, that you will find a gas station that charges $3.60 per gallon or more?

    b) What are the probabilities of paying $3.75 per gallon in Eugene, Portland, and Seattle?

    c) Use the Z-test as described in the lecture material to determine if the difference in mean gas prices between Eugene and Portland is statistically significant.

    d) Suggest a reason that the standard deviation is so much higher in Seattle.
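
    The calculations in parts a) and c) can be sketched as below, with everything in dollars. Two assumptions are made here: prices within each city are treated as normally distributed, and the Z-statistic is taken to be the usual two-sample form, Z = (mean1 - mean2) / sqrt(sd1^2/n1 + sd2^2/n2). If the lecture material defines the Z-test differently, use that definition instead. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Survey results from the problem statement, all in dollars.
surveys = {
    "Eugene":   {"n": 25, "mean": 3.50, "sd": 0.10},
    "Portland": {"n": 25, "mean": 3.20, "sd": 0.15},
    "Seattle":  {"n": 50, "mean": 3.00, "sd": 0.25},
}

def prob_at_least(price, mean, sd):
    """P(station price >= price), ASSUMING normally distributed prices."""
    return 1.0 - NormalDist(mean, sd).cdf(price)

def z_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample Z-statistic for the difference between two means."""
    return (mean1 - mean2) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

for city, s in surveys.items():
    p = prob_at_least(3.60, s["mean"], s["sd"])
    print(f"{city}: P(price >= $3.60) = {p:.4f}")

e, p = surveys["Eugene"], surveys["Portland"]
z = z_statistic(e["mean"], e["sd"], e["n"], p["mean"], p["sd"], p["n"])
print(f"Eugene vs Portland: Z = {z:.2f}")
```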



  5. The following data represent the average GPA of UO Students as a function of year.
    1972	2.43
    1973	2.44
    1974	2.45
    1975	2.44
    1976	2.47
    1977	2.48
    1978	2.5
    1979	2.52
    1980	2.56
    1981	2.58
    1982	2.58
    1983	2.58
    1984	2.57
    1985	2.6
    1986	2.6
    1987	2.59
    1988	2.58
    1989	2.6
    1990	2.64
    1991	2.66
    1992	2.74
    1993	2.76
    1994	2.79
    1995	2.78
    1996	2.8
    1997	2.82
    1998	2.84
    1999	2.79
    2000	2.82
    2001	2.85
    2002	2.87
    2003	2.89
    2004	2.91
    2005	2.9
    2006	2.93
    2007	2.94
    2008    2.96
    2009    2.97
    2010    3.01
    2011    3.03
    2012    3.07
    2013    3.13
    2014    3.09
    2015    3.21
    2016    3.17
    


    a) Divide the data into two halves and use the Z-test, as described in the lecture material, to calculate the Z-statistic.

    b) The standard deviation around the 1972 GPA was 0.7 and the standard deviation around the 2016 GPA was 0.6. Determine the percentage of students whose GPA was above 3.5 in 1972 compared to 2016.

    c) Using these results (and not your personal opinion), argue whether or not grade inflation is significant at the UO. (Note: this is a real-life question, as I was recently on a committee that decided GPA inflation was not "real" or a problem at the UO.)
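
    A rough sketch of parts a) and b) follows. Several choices here are assumptions: with 45 years of data the split cannot be exactly even, so the first 22 years are compared with the last 23; the Z-statistic uses the standard two-sample form with sample standard deviations; and the GPA distribution within a year is treated as normal (it ignores the 4.0 ceiling).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Average UO GPA for 1972 through 2016, in order, from the table above.
gpa = [2.43, 2.44, 2.45, 2.44, 2.47, 2.48, 2.50, 2.52, 2.56, 2.58,
       2.58, 2.58, 2.57, 2.60, 2.60, 2.59, 2.58, 2.60, 2.64, 2.66,
       2.74, 2.76, 2.79, 2.78, 2.80, 2.82, 2.84, 2.79, 2.82, 2.85,
       2.87, 2.89, 2.91, 2.90, 2.93, 2.94, 2.96, 2.97, 3.01, 3.03,
       3.07, 3.13, 3.09, 3.21, 3.17]

# ASSUMPTION: split as 1972-1993 (22 years) versus 1994-2016 (23 years).
half = len(gpa) // 2
early, late = gpa[:half], gpa[half:]

z = (mean(late) - mean(early)) / sqrt(
    stdev(late) ** 2 / len(late) + stdev(early) ** 2 / len(early)
)

# ASSUMPTION: student GPAs in a given year are normally distributed.
frac_1972 = 1.0 - NormalDist(2.43, 0.7).cdf(3.5)
frac_2016 = 1.0 - NormalDist(3.17, 0.6).cdf(3.5)

print(f"Z = {z:.1f}")
print(f"above 3.5 in 1972: {frac_1972:.1%}")
print(f"above 3.5 in 2016: {frac_2016:.1%}")
```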



  6. Another data graphing exercise:

    NASA has recently released the composite land+ocean surface temperature data set that is calibrated all the way back to 1880.



    You can cut and paste from the spreadsheet into the plotting tool, like you did in the first homework assignment. Put the year in column A and the temperature anomaly in column B, produce the plot, and insert the plot here.

    Note that the temperature anomaly refers to a given year's temperature with respect to a temperature baseline that is chosen to be 1960 to 1990 for this data set. If one changes the baseline, the anomaly data will change slightly, but the overall form of the curve will remain unaffected.

    a) Comment on the overall form of this graph and any features that might be present (note this means you have to look at the graph and think about it).

    b) Is the trend significant? Well, let's use the Z-test in the following way:

      1) Determine if the average value of the temperature anomaly during the period 1990-2016 is significantly different from the value determined for the time period 1900-1989.

      2) Based on this determination, compose a Tweet to you know who about whether or not Global Warming is "Real" or just fake news.


    c) Ignoring the last 3 years, indicate on the graph some type of extrapolation (you can type the numbers into column C for some years and they will appear as orange points). For instance, adding the value of 1.0 for 2050 would produce a point that looks like this:



    as a way of representing an extrapolation (obviously not a good extrapolation). So just enter numbers for the appropriate years in column C to produce a set of orange dots that can be your extrapolation from the current data. Submit that graph.

    d) Now include the last three years of data in your extrapolation and show how much larger the 2050 temperature anomaly will be. Submit that graph.

    e) The Paris Accord of November 2015, when translated into units of temperature anomaly, means that the world needs to avoid an anomaly value of 1.75. Based on your extrapolations, argue whether or not this goal is likely to be met.
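
    If you would rather compute the column C values than eyeball them, one option is a straight-line least-squares fit, sketched below. A straight line is only one possible choice of extrapolation, and the series used here is synthetic placeholder data invented for the example, not the NASA data set. The function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def extrapolate(xs, ys, target_year):
    """Value of the fitted line at target_year."""
    slope, intercept = linear_fit(xs, ys)
    return slope * target_year + intercept

# SYNTHETIC placeholder series (a perfect straight line), NOT the NASA data.
fake_years = list(range(1980, 2017))
fake_anomaly = [0.02 * (year - 1980) for year in fake_years]

# Part c): ignore the last 3 years; part d): include them.
without_last3 = extrapolate(fake_years[:-3], fake_anomaly[:-3], 2050)
with_last3 = extrapolate(fake_years, fake_anomaly, 2050)
print(f"2050 value ignoring last 3 years: {without_last3:.2f}")
print(f"2050 value including last 3 years: {with_last3:.2f}")
```

    With the real data pasted in place of the placeholder lists, the two printed values will generally differ, and that difference is what part d) asks about.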