QCAA Specialist Mathematics Statistical inference
15 sample questions with marking guides and sample answers
$X$ is a random variable with mean $\mu$ and standard deviation $\sigma$.
From random samples of values of $X$, each of size $n$, the sample mean $\bar{x}$ is calculated. This sampling and calculation is repeated a large number of times.
The mean of the distribution of the sample means would be approximately
Answer
This formula does not represent the mean of the sampling distribution. The mean of the sample means is equal to the population mean $\mu$ and is not divided by the sample size $n$.
This incorrectly combines the population mean with the denominator for the standard error. The standard deviation of the sample means is $\frac{\sigma}{\sqrt{n}}$, but the mean remains $\mu$.
$\bar{x}$ represents the mean of a single specific sample. The question asks for the mean of the distribution of all possible sample means, which is a population parameter.
According to the properties of sampling distributions, the expected value (or mean) of the distribution of sample means is always exactly equal to the population mean $\mu$.
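The property above can be checked with a small simulation. This is only a sketch: the population parameters, sample size and number of repetitions are hypothetical values chosen for illustration, not taken from the question.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0  # hypothetical population mean and standard deviation
N = 25                  # hypothetical sample size
REPEATS = 2000          # number of repeated samples

# Draw REPEATS samples of size N and record each sample mean.
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(REPEATS)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of sample means: {mean_of_means:.2f} (population mean {MU})")
print(f"sd of sample means:   {sd_of_means:.2f} (sigma / sqrt(n) = {SIGMA / N ** 0.5:.2f})")
```

The mean of the simulated sample means should sit close to the population mean, while their spread should be close to $\frac{\sigma}{\sqrt{n}}$ rather than $\sigma$.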
The time taken to complete orders at a pizza store is normally distributed with a mean time ($\mu$) of 10 minutes.
The owner of the pizza store records the time taken to complete orders for a random sample of 20 pizzas each day over a 30-day period. From this data, an approximate 90% confidence interval for $\mu$ is calculated at the end of each day.
How many of these confidence intervals would be expected to contain $\mu$?
3
18
27
30
Answer
3
This represents 10% of the 30 days ($0.10 \times 30 = 3$). This is the expected number of intervals that would fail to contain the mean, not the number that would contain it.
18
This represents only 60% of the 30 days ($0.60 \times 30 = 18$). Given a 90% confidence level, the expected number of successful intervals should be higher.
27
By definition, a 90% confidence interval is expected to contain the true population parameter 90% of the time in repeated sampling. Therefore, the expected number is $0.90 \times 30 = 27$.
30
This assumes that every single interval will contain the mean (100%). While possible, the expected value is determined by the specific confidence level of 90%, not 100%.
In a town, the mean number of residents per household is 3.79 people with a standard deviation of 1.47 people.
Using a random sample of 45 households from the town, determine the probability that the mean number of residents per household will be more than 4.
0.17
0.33
0.83
0.96
Answer
0.17
First, calculate the z-score: $z = \frac{4 - 3.79}{1.47 / \sqrt{45}} \approx 0.96$. The probability is $P(\bar{X} > 4) = P(Z > 0.96) \approx 0.17$.
0.33
This value does not result from the standard normal distribution calculation using the Central Limit Theorem parameters provided.
0.83
This represents the probability that the sample mean is less than 4 ($P(\bar{X} < 4) \approx 0.83$). You must subtract this from 1 to find the probability of being more than 4.
0.96
This value is the calculated z-score ($z \approx 0.96$), not the probability associated with that z-score.
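The calculation for this question can be sketched with Python's standard library, as an alternative to a graphics calculator. The variable names are illustrative; the numbers are those given in the question.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 3.79, 1.47, 45      # population mean, population sd, sample size
standard_error = sigma / sqrt(n)   # sd of the distribution of sample means

# By the Central Limit Theorem the sample mean is approximately normal.
sample_mean_dist = NormalDist(mu, standard_error)

z = (4 - mu) / standard_error
p_more_than_4 = 1 - sample_mean_dist.cdf(4)

print(f"z = {z:.2f}, P(sample mean > 4) = {p_more_than_4:.2f}")
```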
Rounded to two decimal places, the z-value used in the calculation of an approximate 95% confidence interval for $\mu$ is
0.95
1.64
1.96
2.58
Answer
0.95
This value represents the confidence level itself (0.95), not the critical z-score derived from the standard normal distribution.
1.64
This z-value (approximately 1.645) is typically used for a 90% confidence interval, corresponding to a tail area of 0.05.
1.96
For a 95% confidence interval, the significance level is $\alpha = 0.05$. The critical value leaves $\frac{\alpha}{2} = 0.025$ in the upper tail, which corresponds to $z \approx 1.96$.
2.58
This z-value is typically used for a 99% confidence interval, corresponding to a tail area of 0.005.
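The critical values quoted in these options can be reproduced from the inverse normal function. The sketch below uses a helper named `critical_z`, which is a hypothetical name introduced here for illustration.

```python
from statistics import NormalDist


def critical_z(confidence_level: float) -> float:
    """Two-sided critical value: leaves (1 - level) / 2 in each tail."""
    upper_tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {critical_z(level):.2f}")
```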
The masses of avocados in a crop may be assumed to be normally distributed, with a mean of grams and a standard deviation of grams.
After an avocado of mass grams is peeled and the stone is removed, the mass of edible flesh grams is given by . Four avocados are randomly selected from the crop.
What is the probability, correct to four decimal places, that a total of more than grams of edible flesh is obtained?
Answer
This answer is incorrect and likely results from an error in calculating the standard deviation of the combined mass of the four avocados.
The total edible mass of 4 avocados has a mean of g and a variance of . Calculating gives .
This is incorrect. It stems from misapplying the properties of variance when combining independent normally distributed variables.
This is the probability that 4 times the mass of a single avocado is greater than grams. It incorrectly uses instead of the correct sum of independent variances .
A company claims that the mean battery life of their latest model of smartphone is 9.5 hours.
To test this claim, the battery lives of a random sample of 40 of the smartphones were measured.
A sample mean of 9.31 hours and a standard deviation of 0.52 hours were calculated from this data.
Determine an approximate 95% confidence interval for $\mu$. Give your answer to at least two decimal places.
Answer
Given $\bar{x} = 9.31$, $s = 0.52$ and $n = 40$
Using GDC
$(9.15, 9.47)$ hours
| Descriptor | Marks |
|---|---|
correctly calculates 95% confidence interval to at least two decimal places | 1 |
Determine an approximate 99% confidence interval for $\mu$. Give your answer to at least two decimal places.
Answer
Using GDC
$(9.10, 9.52)$ hours
| Descriptor | Marks |
|---|---|
correctly calculates 99% confidence interval to at least two decimal places | 1 |
A manager comments that either confidence interval could be used to support the company’s claim.
Use your results from Questions 11a) and 11b) to evaluate the reasonableness of the manager’s comment. Justify your decision using mathematical reasoning.
Answer
The 95% confidence interval does not include the claimed mean battery life of 9.5 hours, although the 99% confidence interval does.
So the comment is not reasonable, because only the 99% confidence interval supports the company's claim.
| Descriptor | Marks |
|---|---|
justifies decision using mathematical reasoning | 1 |
provides appropriate statement of reasonableness | 1 |
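Both intervals in this question, and the check against the claimed mean, can be sketched as follows. The function `approx_ci` is a hypothetical helper written for this illustration; it uses the normal critical value with the sample standard deviation, which is the approximate method the question asks for.

```python
from math import sqrt
from statistics import NormalDist


def approx_ci(sample_mean, sample_sd, n, confidence_level):
    """Approximate z-based confidence interval for the population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


claimed_mean = 9.5  # the company's claim, in hours
for level in (0.95, 0.99):
    lower, upper = approx_ci(9.31, 0.52, 40, level)
    contains = lower <= claimed_mean <= upper
    print(f"{level:.0%} CI: ({lower:.2f}, {upper:.2f}), contains 9.5: {contains}")
```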
Consider the following information.
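| | mean | variance |
|---|---|---|
| Continuous random variable | $E(X) = \mu = \int_{-\infty}^{\infty} x\,p(x)\,dx$ | $\operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2\,p(x)\,dx$ |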
The waiting time (minutes) until workers at a certain call centre receive their th phone call, where , is a random variable with probability density function
where is a positive constant.
The waiting time until workers receive their 5th call is collected from a random sample of 80 workers.
Determine the probability that the mean waiting time from this sample is more than 16 minutes.
Answer
Using the property of a PDF
Using in the given PDF
Solving the equation:
Mean of distribution for waiting time until 5th call,
Variance of distribution for 5th call
Consider the distribution of the sample mean of the waiting time until the 5th phone call is received, .
As the sample size is large, the distribution of can be considered normal.
and
Using normal cdf on GDC:
| Descriptor | Marks |
|---|---|
correctly determines equation in terms of k | 1 |
solves equation to determine k | 1 |
determines population mean | 1 |
determines population variance | 1 |
justifies that the distribution of the sample mean can be considered normal | 1 |
determines mean and standard deviation of the sample mean | 1 |
determines required probability | 1 |
A scientist investigates the distribution of the masses of fish in a particular river. A 95% confidence interval for the mean mass of a fish, in grams, calculated from a random sample of 100 fish is (70.2, 75.8).
The sample mean divided by the population standard deviation is closest to
1.3
2.6
5.1
10.2
13.0
Answer
1.3
This is incorrect. The sample mean is 73 and the population standard deviation is approximately 14.29, which does not yield a ratio of 1.3.
2.6
This is incorrect. This value is half of the correct ratio, which might result from incorrectly using the full interval width (5.6) instead of the margin of error (2.8) to calculate the standard deviation.
5.1
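This is correct. The sample mean is the midpoint of the interval, $\bar{x} = \frac{70.2 + 75.8}{2} = 73$. The margin of error is 2.8, so $1.96 \times \frac{\sigma}{\sqrt{100}} = 2.8$, giving $\sigma \approx 14.29$. The ratio is $\frac{73}{14.29} \approx 5.1$.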
10.2
This is incorrect. This is double the correct ratio, which corresponds to a standard deviation of about 7.14, half the correct value. This might result from halving the margin of error a second time (using 1.4 instead of 2.8).
13.0
This is incorrect. This value does not represent the ratio of the sample mean (73) to the population standard deviation ($\sigma \approx 14.29$).
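Working backwards from the interval to the ratio can be sketched in a few lines. The variable names are illustrative, and the sketch assumes the interval was built with the normal critical value for 95% confidence.

```python
from math import sqrt
from statistics import NormalDist

lower, upper, n = 70.2, 75.8, 100  # interval and sample size from the question
z = NormalDist().inv_cdf(0.975)    # critical value for 95% confidence

sample_mean = (lower + upper) / 2  # midpoint of the interval
margin = (upper - lower) / 2       # half the interval width
sigma = margin * sqrt(n) / z       # rearranged from margin = z * sigma / sqrt(n)
ratio = sample_mean / sigma

print(f"sample mean = {sample_mean:.1f}, sigma = {sigma:.2f}, ratio = {ratio:.1f}")
```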
The travel time for students attending a certain university is assumed to be normally distributed, with a population mean of 25.2 minutes and standard deviation of 4.7 minutes.
Travel times are collected from a random sample of 120 of these students and used to calculate a sample mean, , in minutes.
Determine .
Answer
Given
Using GDC
| Descriptor | Marks |
|---|---|
correctly calculates for the first sample | 1 |
calculates required probability | 1 |
Given , determine the value of .
Answer
Using GDC
minutes
| Descriptor | Marks |
|---|---|
calculates | 1 |
Travel times are collected from a second random sample of the university's students and used to calculate a second sample mean, , in minutes.
Given , determine the number of students in the second sample.
Answer
Using GDC
The sample size is 35.
| Descriptor | Marks |
|---|---|
correctly calculates the z-value based on given probability | 1 |
determines an equation in terms of the sample size (n) | 1 |
determines an approximate value of n | 1 |
evaluates the reasonableness of the solution by rounding n to an integer value | 1 |
The scores on a test are assumed to be normally distributed.
Researchers use the results from a random sample of scores to calculate a confidence interval for the population mean. However, a shorter confidence interval width is required so the researchers decide to use a second sample for their calculations.
Assuming that the standard deviations for both samples are the same, the researchers can ensure that a shorter confidence interval width is produced by
decreasing the sample size and decreasing the confidence level.
decreasing the sample size and increasing the confidence level.
increasing the sample size and decreasing the confidence level.
increasing the sample size and increasing the confidence level.
Answer
decreasing the sample size and decreasing the confidence level.
Decreasing the sample size increases the standard error ($\frac{s}{\sqrt{n}}$), which widens the interval and counteracts the narrowing effect of a lower confidence level, so a shorter interval is not ensured.
decreasing the sample size and increasing the confidence level.
Both decreasing the sample size and increasing the confidence level contribute to a wider confidence interval, not a shorter one.
increasing the sample size and decreasing the confidence level.
A confidence interval width is determined by $2z\frac{s}{\sqrt{n}}$. Increasing the sample size ($n$) reduces the standard error, and decreasing the confidence level reduces the critical value ($z$), both of which shorten the interval.
increasing the sample size and increasing the confidence level.
Increasing the confidence level requires a larger critical value, which widens the interval and opposes the narrowing effect of the increased sample size.
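The effect of each option on the interval width can be explored numerically. In this sketch the sample standard deviation, the first sample's size and confidence level, and the alternative values are all hypothetical, chosen only to illustrate the direction of each change.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sample_sd: float, n: int, confidence_level: float) -> float:
    """Width of an approximate z-based confidence interval: 2 * z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sample_sd / sqrt(n)


S = 10.0  # hypothetical sample standard deviation, the same for both samples
baseline = ci_width(S, n=50, confidence_level=0.95)  # hypothetical first sample

for n, level in [(25, 0.90), (25, 0.99), (100, 0.90), (100, 0.99)]:
    width = ci_width(S, n, level)
    change = "shorter" if width < baseline else "longer"
    print(f"n={n:3d}, level={level:.0%}: width {width:.2f} ({change} than {baseline:.2f})")
```

For the two mixed options the outcome depends on the particular numbers chosen, which is why only increasing the sample size together with decreasing the confidence level ensures a shorter interval.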
The mass of a certain species of kangaroo is known to be normally distributed with a mean mass of $\mu$ kg and standard deviation of $\sigma$ kg.
When one of the kangaroos is randomly selected, the probability that its mass is greater than 83.2 kg is 0.145.
When a sample of 12 kangaroos is randomly selected, the probability that the sample mean mass is less than 74.1 kg is 0.079.
A 90% approximate confidence interval for $\mu$ is calculated using a random sample of $n$ of the kangaroos that has a sample mean mass of 79.1 kg and a sample standard deviation equal to $\sigma$.
Determine the possible range of values that $n$ could have been, given that the confidence interval did not contain $\mu$.
Answer
Sample 1:
Sample 2:
Using graph facility of GDC to solve (1) and (2)
Sample 3: Consider the 90% CI
Since can only lie in an interval below the lower bound of CI.
Determining where the lower bound of CI
Using solve facility of GDC,
As must lie in an interval below the lower bound of CI, the range of values is where .
| Descriptor | Marks |
|---|---|
correctly uses the sample of 1 to determine an equation in terms of μ and σ | 1 |
correctly uses the sample of 12 to determine an equation in terms of μ and σ | 1 |
solves simultaneous equations to determine the values of μ and σ | 1 |
determines solution of n | 1 |
evaluates the reasonableness of the solution to the equation to determine suitable integer values of n | 1 |
shows logical organisation communicating key steps | 1 |
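The steps of this solution can be sketched in Python. This is not the GDC method used in the sample answer: the two equations are solved by algebraic elimination instead of the graph facility, and all variable names are illustrative.

```python
from math import floor, sqrt
from statistics import NormalDist

std_normal = NormalDist()

# (1) P(X > 83.2) = 0.145     ->  83.2 - mu = z1 * sigma
# (2) P(Xbar < 74.1) = 0.079  ->  74.1 - mu = z2 * sigma / sqrt(12)
z1 = std_normal.inv_cdf(1 - 0.145)
z2 = std_normal.inv_cdf(0.079)

# Subtracting (2) from (1) eliminates mu.
sigma = (83.2 - 74.1) / (z1 - z2 / sqrt(12))
mu = 83.2 - z1 * sigma

# The 90% CI is 79.1 +/- z90 * sigma / sqrt(n). If mu is below 79.1, the CI
# misses mu only when its lower bound is above mu.
assert mu < 79.1
z90 = std_normal.inv_cdf(0.95)
n_boundary = (z90 * sigma / (79.1 - mu)) ** 2  # value of n where lower bound = mu
smallest_n = floor(n_boundary) + 1             # n must be an integer above this

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"CI excludes mu when n > {n_boundary:.2f}, so n >= {smallest_n}")
```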
A random variable is normally distributed with a mean . An approximate 95% confidence interval for from a sample from this distribution is .
An approximate confidence interval for based on the same sample, using a confidence level greater than 95%, could be
Answer
The confidence interval must be centered at the sample mean . The original interval has a mean of , while this option is centered at , implying it is not from the same sample.
This interval is centered at the same sample mean () and is wider than the original interval. Increasing the confidence level increases the critical value, which increases the margin of error and results in a wider interval.
While this interval is centered at the correct mean (), it is narrower than the original 95% interval. A narrower interval corresponds to a lower confidence level (smaller margin of error), not a higher one.
The interval must be centered at the sample mean calculated from the original data (). This option is centered at , so it cannot be based on the same sample.
A random variable $X$ is normally distributed with a mean of 36 and a standard deviation of 4.
The respective mean and standard deviation of the distribution of $\bar{X}$ from repeated random samples of size 9 are
4 and
4 and
36 and
36 and
Answer
4 and
This is incorrect because the mean of the sampling distribution should equal the population mean (36), not the population standard deviation (4). Additionally, the standard deviation is not the correct value of $\frac{\sigma}{\sqrt{n}} = \frac{4}{3}$.
4 and
This is incorrect because the mean of the sampling distribution is equal to the population mean (36), not 4. However, the standard deviation value of $\frac{4}{3}$ is calculated correctly.
36 and
This is incorrect because the standard deviation of the sample mean (standard error) is not calculated using the correct formula $\frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{9}} = \frac{4}{3}$.
36 and
The mean of the sampling distribution equals the population mean (36), and the standard deviation is calculated as $\frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{9}} = \frac{4}{3}$.
The height of Year 9 students at a school is assumed to be normally distributed with a population mean height of $\mu$ cm.
A teacher at the school measured the height of all the students in her Year 9 class. This data was used to calculate an approximate 95% confidence interval for of cm.
The teacher repeated the procedure using data from another Year 9 class. Although this class had the same number of students, its data produced an approximate 95% confidence interval for of cm.
Using the same data, the teacher recalculated the approximate confidence intervals for for each class using a confidence level of . She observed that the upper bound of the confidence interval from her Year 9 class now equalled the lower bound of the confidence interval from the other Year 9 class.
Determine the value of . Give your answer rounded to one decimal place.
Answer
Situation 1
$z$-score for 95% CI = 1.96
Teacher's class:
Other class:
Teacher's class:
Let be the sample standard deviation for Class 1 of sample size .
Other class:
Let be the sample standard deviation for Class 2 of sample size .
Situation 2
Let the -score for the CI using a confidence level of be .
New upper limit of CI for teacher's class equals new lower limit of CI for other class.
Using earlier results
Using GDC
| Descriptor | Marks |
|---|---|
correctly determines the z-score associated with a 95% CI | 1 |
correctly determines the sample means for both classes | 1 |
determines a relationship between the sample standard deviation and the sample size for the teacher's class | 1 |
determines a relationship between the sample standard deviation and the sample size for the other class | 1 |
determines an equation in terms of the z-score associated with the new CIs using the data from the two classes | 1 |
determines the z-score associated with the new CI calculations | 1 |
determines the confidence level for the new CI calculations, rounded to one decimal place | 1 |
The height (cm) of people in a certain population is normally distributed with a standard deviation of 7.42 cm.
A researcher takes repeated random samples of 15 people and calculates the mean height for each sample.
The expected standard deviation (cm) of the distribution of these sample mean heights would be approximately
0.49
1.92
2.02
5.51
Answer
0.49
Incorrect. This value is obtained by dividing the population standard deviation by the sample size ($\frac{7.42}{15} \approx 0.49$), rather than by the square root of the sample size $\sqrt{15}$.
1.92
Correct. The standard deviation of the distribution of sample means (standard error) is calculated using the formula $\frac{\sigma}{\sqrt{n}}$, which gives $\frac{7.42}{\sqrt{15}} \approx 1.92$.
2.02
Incorrect. This value does not correctly apply the standard error formula $\frac{\sigma}{\sqrt{n}}$.
5.51
Incorrect. This value does not correctly apply the standard error formula $\frac{\sigma}{\sqrt{n}}$.
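The correct value and the most common slip for this question can be compared directly. The variable names in this short sketch are illustrative.

```python
from math import sqrt

sigma, n = 7.42, 15                # population sd (cm) and sample size

standard_error = sigma / sqrt(n)   # correct: divide by the square root of n
divided_by_n = sigma / n           # common error: divide by n itself

print(f"sigma / sqrt(n) = {standard_error:.2f} cm")
print(f"sigma / n       = {divided_by_n:.2f} cm (incorrect)")
```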