The normal curve shown represents the sampling distribution of a sample mean for samples of size $n$ selected at random from a population with standard deviation $\sigma$.

Estimate the standard deviation of the population, $\sigma$.

Normal distribution with mean at 150, 1SD = 15

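  • Recall that the standard deviation of the sampling distribution of a sample mean is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$, so $\sigma = \sigma_{\bar{x}} \sqrt{n} = 15\sqrt{n}$ for the curve shown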

The histogram shown summarizes the responses of 100 people when asked, “What was the price of the last meal you purchased?”

Based on the histogram, which of the following could be the interquartile range of the prices?

Histogram of price frequency distribution

  • Since there are 100 responses, the IQR should span the middle 50% of all responses ($0.5 \times 100 = 50$ responses)

The manager of a restaurant tracks the types of dinners that customers order from the menu to ensure that the correct amount of food is ordered from the supplier each week. Data from customer orders last year suggest the following weekly distribution.

| Type of Dinner | Beef | Chicken | Fish | Pork | Vegetarian |
|---|---|---|---|---|---|
| Proportion | 0.18 | 0.41 | 0.15 | 0.20 | 0.06 |

The manager believes that there might be a change in the distribution from last year to this year. A random sample of 200 orders was taken from all customer orders placed last week. The following table shows the results of the sample.

| Type of Dinner | Beef | Chicken | Fish | Pork | Vegetarian |
|---|---|---|---|---|---|
| Frequency | 32 | 86 | 34 | 30 | 18 |

Assume each order is independent. For which type of dinner is the value of its contribution to the appropriate test statistic the greatest?

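  • Recall the chi-square formula: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where $O$ is the observed count and $E = 200p$ is the expected count under last year's proportions
  • Since $E$ is in the denominator, a smaller $E$ yields a greater value for the same deviation $O - E$; vegetarian has the smallest expected count ($E = 12$) and a sizable deviation ($18 - 12 = 6$)
  • Therefore, “vegetarian” contributes the most to the test statistic ($6^2 / 12 = 3.0$ in this case, ahead of pork at $(30 - 40)^2 / 40 = 2.5$)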
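The per-category contributions can be checked numerically. The following is a minimal sketch using only the counts and proportions from the two tables; the function and variable names are illustrative, not part of the original problem.

```python
# Last year's proportions (the null distribution) and this year's observed counts.
proportions = {"Beef": 0.18, "Chicken": 0.41, "Fish": 0.15, "Pork": 0.20, "Vegetarian": 0.06}
observed = {"Beef": 32, "Chicken": 86, "Fish": 34, "Pork": 30, "Vegetarian": 18}
n = sum(observed.values())  # 200 orders in the sample


def chi_square_contributions(observed, proportions, n):
    """Return each category's (O - E)^2 / E term of the chi-square statistic."""
    contributions = {}
    for dinner, count in observed.items():
        expected = n * proportions[dinner]
        contributions[dinner] = (count - expected) ** 2 / expected
    return contributions


contributions = chi_square_contributions(observed, proportions, n)
largest = max(contributions, key=contributions.get)

for dinner, value in contributions.items():
    print(f"{dinner:<11}{value:.3f}")
print("Largest contribution:", largest)
```

Computing every term, rather than only inspecting the denominators, guards against the case where a category with a small expected count also has a small deviation.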
A company that makes fleece clothing uses fleece produced from two farms, Northern Farm and Western Farm. Let the random variable $N$ represent the weight of fleece produced by a sheep from Northern Farm. The distribution of $N$ has mean 14.1 pounds and standard deviation 1.3 pounds. Let the random variable $W$ represent the weight of fleece produced by a sheep from Western Farm. The distribution of $W$ has mean 6.7 pounds and standard deviation 0.5 pound. Assume $N$ and $W$ are independent.

Let $T$ equal the total weight of fleece from 10 randomly selected sheep from Northern Farm and 15 randomly selected sheep from Western Farm. What is the standard deviation, in pounds, of $T$?

  • Variances of independent random variables add, so the standard deviation of the total is given by the formula $\sigma_T = \sqrt{10\sigma_N^2 + 15\sigma_W^2} = \sqrt{10(1.3)^2 + 15(0.5)^2} = \sqrt{20.65} \approx 4.54$ pounds
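As a quick numerical check of the total-weight standard deviation, here is a minimal sketch that assumes every sheep's fleece weight is independent, so the variances add. The helper name `sd_of_total` is hypothetical.

```python
import math


def sd_of_total(groups):
    """Standard deviation of a sum of independent draws.

    groups: list of (count, sd) pairs, one pair per farm.
    Variances add for independent variables, so each farm
    contributes count * sd**2 to the variance of the total.
    """
    variance = sum(count * sd ** 2 for count, sd in groups)
    return math.sqrt(variance)


# 10 sheep from Northern Farm (sd 1.3 lb) and 15 from Western Farm (sd 0.5 lb).
sd_total = sd_of_total([(10, 1.3), (15, 0.5)])
print(f"{sd_total:.3f}")
```

Note that the counts multiply the variances, not the standard deviations; writing $10 \cdot 1.3 + 15 \cdot 0.5$ would overstate the spread.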

**A pharmaceutical company manufactures medicine to reduce pain caused by migraine headaches. The company is investigating whether a new medicine is more effective in reducing pain than the current medicine. A random sample of 500 participants who experience migraines was selected, and the participants were randomly assigned to one of two groups of equal size. The first group received the current medicine and the second group received the new medicine. When a participant experienced a migraine, he or she was instructed to take the medicine and, 15 minutes after taking the medicine, to rate the pain relief on a scale from 1 to 10, with 1 being no relief and 10 being complete relief. At the end of six months, the average pain relief for each participant was calculated.**

Describe the study.

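  • A treatment (the type of medicine) is imposed on the participants, so the study is an experiment and can support cause-and-effect conclusions
  • Participants are randomly assigned to the two treatment groups, with no blocking or pairing
  • Therefore, this study is an experiment using a completely randomized design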

To obtain certification for a certain occupation, candidates take a proficiency exam. The exam consists of two sections, and neither section should be more difficult than the other. To investigate whether one section of the exam was more difficult than the other, a random sample of 50 candidates was selected. The candidates took the exam and their scores on each section were recorded. The table shows the summary statistics.

| | Mean Pct. Correct | Stdev. Pct. Correct |
|---|---|---|
| First Section | 75 | 10 |
| Second Section | 65 | 5 |
| Difference | 10 | 8 |

What is the appropriate test statistic to determine if there is a significant mean difference between the percent correct on the two sections (first minus second) for all candidates similar to those in the investigation?

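  • Quantitative mean, two measurements on the same candidates: matched-pairs $t$-test, with test statistic $t = \frac{\bar{x}_d}{s_d / \sqrt{n}} = \frac{10}{8 / \sqrt{50}}$
    • Note: the standard deviation of the differences is given as 8. When calculated as if the sections were independent, $\sqrt{10^2 + 5^2}$, we find that the standard deviation would be about 11.18, which confirms that the scores are paired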
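The matched-pairs statistic can be evaluated directly from the "Difference" row of the table. This is a minimal sketch, assuming the usual paired form (mean difference divided by its standard error); the variable names are illustrative.

```python
import math

mean_diff = 10  # mean of (first - second), from the table
sd_diff = 8     # standard deviation of the differences, from the table
n = 50          # number of candidates

# Paired t statistic: mean difference over its standard error.
standard_error = sd_diff / math.sqrt(n)
t_statistic = mean_diff / standard_error

# For comparison only: the sd the difference would have
# if the two section scores were independent.
independent_sd = math.sqrt(10 ** 2 + 5 ** 2)

print(f"t = {t_statistic:.3f} with {n - 1} degrees of freedom")
print(f"sd under independence = {independent_sd:.2f}")
```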

A medical center conducted a study to investigate cholesterol levels in people who have had heart attacks. A random sample of 16 people was obtained from the names of all patients of the medical center who had a heart attack in the previous year. Of the people in the sample, the mean cholesterol level was 264.70 milligrams per deciliter (mg/dL) with standard deviation 42.12 mg/dL.

Assuming all conditions for inference were met, which of the following is a 90 percent confidence interval for the mean cholesterol level, in mg/dL, of all patients of the medical center who had a heart attack in the previous year?

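  • A $t$-interval will be used since the population standard deviation is unknown and must be estimated from the sample ($n - 1 = 15$ degrees of freedom)
  • The confidence interval is given by the formula $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
  • $t^*$ is found from the $t$-table at cumulative probability 0.95 (5% in each tail) to be 1.753
  • Bounds: $264.70 \pm 1.753 \cdot \frac{42.12}{\sqrt{16}} = 264.70 \pm 18.459$, or (246.241, 283.159)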
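The bounds can be reproduced with a few lines of arithmetic. This sketch takes the critical value 1.753 from the $t$-table as given rather than computing it, since the Python standard library has no $t$ distribution; `t_interval` is a hypothetical helper name.

```python
import math


def t_interval(mean, sd, n, t_star):
    """One-sample t interval: mean +/- t_star * sd / sqrt(n)."""
    margin = t_star * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Sample values from the problem; t_star is the table value for 15 df, 90% confidence.
lower, upper = t_interval(mean=264.70, sd=42.12, n=16, t_star=1.753)
print(f"({lower:.3f}, {upper:.3f})")
```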

The histograms show the results of three simulations of a sampling distribution of a sample mean. For each simulation, 1,500 samples of size n were selected from the same population and the sample mean was recorded. The value of n was different for each of the three simulations.

A is spread from 0 to 200; B is spread from 50 to 150; C is spread from 25 to 175

Which of the following is the correct ordering of the graphs from least value of n to greatest value of n ?

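  • The standard deviation of the sampling distribution of the sample mean is $\sigma / \sqrt{n}$, so the spread of the sample means decreases as $n$ increases
  • Therefore, the order is A, C, B: A has the widest spread (0 to 200) and so the least $n$, and B has the narrowest spread (50 to 150) and so the greatest $n$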
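The link between sample size and spread can be seen in a small simulation. The sketch below is illustrative only: the population (uniform on 0 to 200) and the three sample sizes are assumptions, not values from the problem, which does not state them.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def simulate_sample_means(n, num_samples=1500):
    """Draw num_samples samples of size n and return their means.

    The population here is uniform on [0, 200], chosen only
    for illustration; the problem does not specify it.
    """
    return [
        statistics.fmean(random.uniform(0, 200) for _ in range(n))
        for _ in range(num_samples)
    ]


# Hypothetical sample sizes; larger n should give a narrower histogram.
spreads = {n: statistics.stdev(simulate_sample_means(n)) for n in (2, 8, 32)}

for n, spread in spreads.items():
    print(f"n = {n:>2}: sd of sample means = {spread:.1f}")
```

Each fourfold increase in $n$ should roughly halve the standard deviation of the sample means, matching $\sigma / \sqrt{n}$.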

A random sample of 1,018 city residents were asked to rate their level of support for a proposal being considered by the city council. The table shows the responses by level of support.

| Level of Support | Number of Responses |
|---|---|
| Very supportive | 336 |
| Somewhat supportive | 387 |
| Not supportive | 295 |

Based on the responses, which of the following is a 95 percent confidence interval for the proportion of all city residents who would respond very supportive or somewhat supportive of the proposal?

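  • $z^*$ for 95% is 1.96
  • The sample proportion is $\hat{p} = \frac{336 + 387}{1018} \approx 0.71$, and the interval is $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$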
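A one-proportion $z$-interval, $\hat{p} \pm z^* \sqrt{\hat{p}(1 - \hat{p}) / n}$, can be evaluated from the table as follows. This is a minimal sketch using the critical value 1.96 quoted above; the function name is illustrative.

```python
import math


def proportion_interval(successes, n, z_star):
    """One-proportion z interval: p_hat +/- z_star * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# "Very supportive" and "somewhat supportive" responses combined.
supportive = 336 + 387
lower, upper = proportion_interval(supportive, n=1018, z_star=1.96)
print(f"({lower:.3f}, {upper:.3f})")
```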

A fitness center offers a one-month program designed to reduce body fat through exercise. The table shows the body fat percentage before and after completing the program for 10 randomly selected participants.

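| Participant | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| Before (%) | 10.8 | 21.5 | 18.9 | 17.0 | 20.8 | 24.6 | 15.4 | 18.2 | 19.9 | 21.2 |
| After (%) | 10.7 | 20.4 | 19.1 | 16.1 | 20.6 | 22.3 | 15.5 | 18.1 | 18.5 | 20.0 |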

The director of the program wants to investigate whether knowing the body fat percentage before beginning the program can help to predict body fat percentage for someone who completes the program. Which procedure is the most appropriate for such an investigation?

  • Predicting one quantitative variable from another: linear regression $t$-test for the slope

According to a recent report, customers who shop at a certain online store spend, on average, \$1,500 a year at the store. To investigate whether the mean amount spent was greater than the reported average, an economist obtained the mean and standard deviation of the amount spent in the past year by a random sample of 120 customers who shop at the store. With all conditions for inference met, the economist conducted the appropriate hypothesis test and obtained a $p$-value of 0.25.

Which statement would be an appropriate conclusion for the investigation?

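  • The null hypothesis is that the mean amount spent is \$1,500 ($H_0: \mu = 1500$); the alternative is that it is greater ($H_a: \mu > 1500$)
  • A $p$-value of 0.25 is larger than any common significance level (0.10, 0.05, or 0.01), so the result is not statistically significant
  • Therefore, there is not convincing statistical evidence that the mean amount that customers spent is greater than \$1,500. We fail to reject $H_0$ (which is not the same as accepting it)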

Scientists working for a water district measure the water level in a lake each day. The daily water level in the lake varies due to weather conditions and other factors. The daily water level has a distribution that is approximately normal with mean water level of 84.07 feet. The probability that the daily water level in the lake is at least 100 feet is 0.064.

What is the probability that on a randomly selected day the water level in the lake will be at least 90 feet?

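  • We know that $\mu = 84.07$ and $P(X \ge 100) = 0.064$, so $P(X < 100) = 0.936$ (the standardized value $Z$ also follows a normal distribution)
  • Using the standard normal table, we find that 0.936 occurs at approximately $z = 1.52$
  • Using the formula $z = \frac{x - \mu}{\sigma}$: $\sigma = \frac{100 - 84.07}{1.52} \approx 10.48$
  • We then find the $z$-score of 90 to be $\frac{90 - 84.07}{10.48} \approx 0.57$, so $P(X \ge 90) = P(Z \ge 0.57) \approx 0.28$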
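The same steps can be carried out without a printed table by using `statistics.NormalDist` from the Python standard library (Python 3.8+). This is a sketch of the approach, not the method prescribed by the problem; because it does not round the $z$-scores, its result can differ slightly in the third decimal place from a table-based answer.

```python
from statistics import NormalDist

mean = 84.07
p_at_least_100 = 0.064

# Step 1: z-score with 93.6% of the standard normal distribution below it.
z_100 = NormalDist().inv_cdf(1 - p_at_least_100)

# Step 2: solve z = (x - mean) / sigma for sigma, using x = 100.
sigma = (100 - mean) / z_100

# Step 3: upper-tail probability at 90 feet.
p_at_least_90 = 1 - NormalDist(mean, sigma).cdf(90)

print(f"z = {z_100:.3f}, sigma = {sigma:.2f}, P(X >= 90) = {p_at_least_90:.3f}")
```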

A consumer group wanted to investigate the relationship between the number of items purchased at a single visit to the local grocery store and the total cost of the items purchased. The group obtained a random sample of 11 receipts from the store and recorded the total number of items and the total cost from each receipt. The computer output of an analysis of total cost versus number of items purchased is shown in the table.

| | Estimate | Std Error | t Ratio | Prob > \|t\| |
|---|---|---|---|---|
| Intercept | 1.882 | 6.6852 | 0.28 | 0.7847 |
| Number of items | 2.784 | 0.2265 | 12.29 | < 0.0001 |

Assume all conditions for inference were met. Based on the results shown in the table, what is the 95 percent confidence interval for the average change in total cost for each increase of 1 item purchased?

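  • The formula for slope confidence intervals is $b \pm t^* \cdot SE_b$, where $b = 2.784$ and $SE_b = 0.2265$ come from the "Number of items" row
  • From the $t$-table, we find that $t^* = 2.262$ when the confidence level is 95% and DOF = $n - 2 = 9$
  • The interval is $2.784 \pm 2.262(0.2265)$, or approximately (2.272, 3.296)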
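The slope interval, estimate plus or minus $t^*$ times the standard error, can be evaluated from the computer output. In this sketch the critical value 2.262 (95% confidence, 9 degrees of freedom) is taken from a $t$-table as an input, and `slope_interval` is a hypothetical helper name.

```python
def slope_interval(slope, std_error, t_star):
    """Confidence interval for a regression slope: slope +/- t_star * std_error."""
    margin = t_star * std_error
    return slope - margin, slope + margin


# "Number of items" row of the output; 11 receipts give 11 - 2 = 9 df.
lower, upper = slope_interval(slope=2.784, std_error=0.2265, t_star=2.262)
print(f"({lower:.3f}, {upper:.3f})")
```

The intercept row is not used: the question asks about the change in cost per additional item, which is the slope.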

A doctor uses a new diagnostic test to indicate whether a patient has a certain disease. The doctor will prescribe medication for the patient if the doctor believes the patient has the disease, as indicated by the diagnostic test. The situation is similar to using a null hypothesis and an alternative hypothesis to decide whether to prescribe the medication. The hypotheses can be stated as follows.

$H_0$: The patient does not have the disease; $H_a$: The patient has the disease.

Describe the power of the test.

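  • Recall that power = $1 - \beta$, where $\beta$ is the probability of a Type II error; power is the probability that a false null is rejected
  • The null is false when the patient actually has the disease, and it is rejected when the test indicates the disease
  • Thus, the power of the test is the probability of diagnosing a patient who has the disease as having the disease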

To investigate the relationship between age and preference for two mayoral candidates in an upcoming election, a random sample of city residents was surveyed. The residents were asked which candidate they preferred, and each resident was classified into one of three age-groups. The test statistic for the appropriate hypothesis test was 3.7408.

Approximately what is the probability that the observed responses would be as far or farther from the expected responses if there is no association between age-group and preference?

  • Categorical one sample 2+ groups independence: $\chi^2$ test for independence
  • DOF is calculated as (Rows - 1) x (Columns - 1)
    • 3 age groups and 2 candidates = (3 - 1) x (2 - 1) = 2
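  • Using the cumulative distribution function, we find that the cumulative probability at $\chi^2 = 3.7408$ with 2 degrees of freedom is 0.8459
  • Since this is $P(\chi^2 \le 3.7408)$, we take the complement to get the $p$-value: $1 - 0.8459 = 0.1541$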
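For 2 degrees of freedom the chi-square distribution has a simple closed form, $P(\chi^2 > x) = e^{-x/2}$, so the tail probability can be checked with the standard library alone. This shortcut holds only for 2 degrees of freedom; other values would need a statistics library or a table.

```python
import math


def chi_square_upper_tail_df2(x):
    """P(chi-square > x) for exactly 2 degrees of freedom.

    With df = 2 the chi-square distribution is exponential with mean 2,
    so the upper-tail probability is exp(-x / 2). Not valid for other df.
    """
    return math.exp(-x / 2)


test_statistic = 3.7408
p_value = chi_square_upper_tail_df2(test_statistic)
cumulative = 1 - p_value  # the CDF value a calculator would report

print(f"CDF = {cumulative:.4f}, p-value = {p_value:.4f}")
```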