VCAA Mathematical Methods Data analysis, probability and statistics
15 sample questions with marking guides and sample answers
In a computer fishing game, a player repeatedly casts their hook into either a blue pond or a red pond. Each fish they catch scores points.
In the blue pond, the probability of catching a fish on a cast is and each fish caught scores 10 points.
In the red pond, the probability of catching a fish on a cast is and each fish caught scores 15 points.
A player has three casts left and needs to score at least 30 points to win. All remaining casts must be in the same pond.
It is claimed that the probability of winning if casting in the blue pond is more than the probability of winning if casting in the red pond. Evaluate the reasonableness of the claim.
Reveal Answer
Let $X$ be a binomial random variable representing the number of casts that catch a fish.
$p$ corresponds to the probability of a catch (success).
Blue pond:
Red pond:
In the blue pond, 30 points will require three successes in three casts: $P(X = 3)$
In the red pond, 30 points will require at least two successes in three casts: $P(X \ge 2)$
i.e. $P(X = 2) + P(X = 3)$
In the blue pond, the required probability is
In the red pond, the required probability is
The claim is correct. Probability of winning if casting in the blue pond is more than the probability of winning using the red pond.
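
A minimal Python sketch of the comparison above. The catch probabilities did not survive in this copy of the question, so `p_blue=0.9` and `p_red=0.5` are placeholder values chosen only for illustration, and the function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def win_probabilities(p_blue, p_red):
    """Probability of scoring at least 30 points in three casts, per pond."""
    # Blue pond: 10 points per fish, so 30 points needs 3 catches in 3 casts.
    blue = binom_pmf(3, 3, p_blue)
    # Red pond: 15 points per fish, so 30 points needs at least 2 catches.
    red = binom_pmf(2, 3, p_red) + binom_pmf(3, 3, p_red)
    return blue, red


# Placeholder probabilities, NOT the values from the question.
blue, red = win_probabilities(p_blue=0.9, p_red=0.5)
print(f"blue: {blue:.4f}, red: {red:.4f}")
```

Substituting the actual catch probabilities from the question gives the two values to compare when evaluating the claim.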
| Descriptor | Marks |
|---|---|
| correctly identifies a method to determine at least 30 points in three casts for the blue pond | 1 |
| determines the probability of scoring at least 30 points in the blue pond | 1 |
| correctly identifies a method to determine at least 30 points in three casts for the red pond | 1 |
| substitutes appropriate values into the chosen method for the red pond | 1 |
| determines the probability of scoring at least 30 points in the red pond | 1 |
| evaluates the claim | 1 |
Handspans of teenagers are approximately normally distributed, with a mean of 15 cm and a standard deviation of 2 cm.
Which of the following groups is expected to be the largest?
teenagers with handspans that are between 7 cm and 11 cm
teenagers with handspans that are between 11 cm and 15 cm
teenagers with handspans that are between 13 cm and 17 cm
teenagers with handspans that are between 17 cm and 21 cm
Reveal Answer
teenagers with handspans that are between 7 cm and 11 cm
This range falls between 2 and 4 standard deviations below the mean ($\mu - 4\sigma$ to $\mu - 2\sigma$). This represents the far left tail of the curve, which contains a negligible percentage of the population.
teenagers with handspans that are between 11 cm and 15 cm
This range covers the area from 2 standard deviations below the mean up to the mean ($\mu - 2\sigma$ to $\mu$). While substantial, it captures less area than an interval of the same width centred directly on the peak of the distribution.
teenagers with handspans that are between 13 cm and 17 cm
This range corresponds to exactly one standard deviation below and above the mean ($15 \pm 2$ cm). Because the normal distribution is symmetric and peaks at the mean, the interval centred on the mean contains the largest proportion of data (approximately 68%).
teenagers with handspans that are between 17 cm and 21 cm
This range falls between 1 and 3 standard deviations above the mean ($\mu + \sigma$ to $\mu + 3\sigma$). This represents the tapering right tail of the distribution, which contains significantly fewer teenagers than the central range.
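
As a numerical check of the reasoning above, the following sketch computes the proportion of the distribution in each interval using Python's standard-library `statistics.NormalDist`. The variable names are illustrative only.

```python
from statistics import NormalDist

# Handspans: mean 15 cm, standard deviation 2 cm (from the question).
handspan = NormalDist(mu=15, sigma=2)

intervals = [(7, 11), (11, 15), (13, 17), (17, 21)]

# P(low < X < high) = F(high) - F(low) for each interval.
probs = {(low, high): handspan.cdf(high) - handspan.cdf(low)
         for low, high in intervals}

for (low, high), p in probs.items():
    print(f"{low}-{high} cm: {p:.4f}")

largest = max(probs, key=probs.get)
print("largest group:", largest)
```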
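Assuming the approximate normality of sample proportions ($\hat{p}_1$ and $\hat{p}_2$) and based on two independent samples of sizes $n_1$ and $n_2$, the approximate confidence interval for the difference of two proportions is given by
$$\left(\hat{p}_1 - \hat{p}_2\right) \pm z\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$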
If the approximate confidence interval for the difference between two proportions does not contain 0, this provides evidence that the two proportions are not equal.
The data in the table shows the observed frequencies of two drink preferences for independent samples of people who live in Town A and Town B.
| Town | Tea | Coffee | Total |
|---|---|---|---|
| A | 111 | 105 | 216 |
| B | 150 | 107 | 257 |
Using the approximate 99% confidence interval for the difference of two proportions, determine if there is evidence to conclude that drink preference is associated with the town where the person lives.
Reveal Answer
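$p_1$ = proportion of Town A who prefer to drink tea
$p_2$ = proportion of Town B who prefer to drink tea
The sample proportions are:
$$\hat{p}_1 = \frac{111}{216} \approx 0.5139, \qquad \hat{p}_2 = \frac{150}{257} \approx 0.5837$$
Using the 99% confidence interval for the difference of two proportions ($z \approx 2.576$):
$$\left(\hat{p}_1 - \hat{p}_2\right) \pm 2.576\sqrt{\frac{0.5139 \times 0.4861}{216} + \frac{0.5837 \times 0.4163}{257}} \approx -0.0698 \pm 0.1181$$
so the interval is approximately $(-0.188,\ 0.048)$.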
This interval contains zero; therefore, there is no evidence in the data to say that the two proportions are different, i.e. preference to drink tea does not depend on where the person lives.
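
The interval can be reproduced with a short Python sketch. The helper name `diff_proportion_ci` is hypothetical, and the z-value is taken from the standard normal inverse CDF rather than a rounded table value.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, level):
    """Approximate CI for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    # Two-sided critical value, e.g. about 2.576 for a 99% level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Tea drinkers: 111 of 216 in Town A, 150 of 257 in Town B.
lower, upper = diff_proportion_ci(111, 216, 150, 257, 0.99)
print(f"({lower:.4f}, {upper:.4f})")
print("contains zero:", lower < 0 < upper)
```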
| Descriptor | Marks |
|---|---|
| correctly determines the sample proportions | 1 |
| establishes confidence interval for the difference of two proportions | 1 |
| determines 99% confidence interval | 1 |
| interprets 99% confidence interval to determine equality of proportions | 1 |
| shows logical organisation communicating key steps | 1 |
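The normal distribution probability density function is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
with the parameters mean, $\mu$, and standard deviation, $\sigma$.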
The speeds of electric scooter (e-scooter) riders on a particular section of a bike path are approximately normally distributed with a mean of 18 km/h. It is known that .
The speed limit for e-scooters on this section of bike path is 23 km/h.
A speed camera is set up and records the speeds of 75 e-scooter riders. Every rider travelling faster than the speed limit is given a $143 fine. Before setting up the speed camera, the following suggestion was made.
The total of the fines expected to be issued will be more than $1500.
Evaluate the reasonableness of this suggestion.
Reveal Answer
Given that
Using a GDC:
Two solutions are obtained for the standard deviation: $\sigma \approx 4$ and $\sigma \approx 28.4020$.
Reject 28.4020 as it is not a possible standard deviation because, for example, three standard deviations below the mean would produce a negative speed, and three above would result in an impossible speed on an e-scooter ($18 + 3 \times 28.4020 \approx 103$ km/h).
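Use a GDC to determine $P(X > 23)$ with $\mu = 18$, $\sigma = 4$: $P(X > 23) \approx 0.1056$
The number of riders is: $75 \times 0.1056 \approx 7.92$
The total fines obtained: $7.92 \times 143 \approx 1133$ (dollars)
The expected total fines is about $1133, which is less than the suggested $1500, so the suggestion is not reasonable.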
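
The final steps can be checked with the sketch below. The condition used to solve for the standard deviation is missing from this copy of the question, so `SIGMA = 4` is an assumption: it is the value consistent with the quoted total of about $1133.

```python
from statistics import NormalDist

MEAN = 18      # km/h, from the question
LIMIT = 23     # km/h speed limit
RIDERS = 75    # riders recorded by the camera
FINE = 143     # dollars per fine
SIGMA = 4      # ASSUMED standard deviation (inferred, see note above)

# Proportion of riders expected to exceed the speed limit.
p_over = 1 - NormalDist(MEAN, SIGMA).cdf(LIMIT)

expected_riders = RIDERS * p_over
expected_fines = expected_riders * FINE

print(f"P(X > 23) = {p_over:.4f}")
print(f"expected riders fined = {expected_riders:.2f}")
print(f"expected total fines = ${expected_fines:.0f}")
```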
| Descriptor | Marks |
|---|---|
| Correctly substitutes the known information into the given normal distribution formula | 1 |
| Determines a possible value for the standard deviation | 1 |
| Determines the proportion of riders above 23 km/h | 1 |
| Determines the number of riders above 23 km/h | 1 |
| Determines the expected total fines | 1 |
| Provides appropriate statement of reasonableness | 1 |
The uniformly distributed continuous random variable $X$ has an expected value of 6 and a maximum value of 9. Determine the variance of $X$.
Reveal Answer
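The expected value of a uniformly distributed continuous random variable is midway between the maximum and minimum values, so the minimum value is $2 \times 6 - 9 = 3$ and the probability density function is
$$f(x) = \begin{cases} \dfrac{1}{6}, & 3 \le x \le 9 \\ 0, & \text{otherwise} \end{cases}$$
The variance of $X$ is given by
$$\operatorname{Var}(X) = \int_{3}^{9} (x - 6)^2 \cdot \frac{1}{6}\, dx = \left[\frac{(x-6)^3}{18}\right]_{3}^{9} = \frac{27}{18} + \frac{27}{18} = 3$$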
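
A quick check of this result, assuming the standard closed form $(b - a)^2 / 12$ for the variance of a uniform distribution on $[a, b]$. The function name is illustrative.

```python
from fractions import Fraction


def uniform_variance(mean, maximum):
    """Variance of a continuous uniform variable given its mean and maximum."""
    # The mean is the midpoint, so minimum = 2 * mean - maximum.
    minimum = 2 * mean - maximum
    return Fraction(maximum - minimum) ** 2 / 12


print(uniform_variance(6, 9))
```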
| Descriptor | Marks |
|---|---|
| determines the correct value and domain of the probability density function | 1 |
| writes a correct integral expression for the variance | 1 |
| correctly calculates the variance | 1 |
The binomially distributed discrete random variable has a mean of and a variance of . Evaluate .
Reveal Answer
From the question and . It follows that
And
Hence,
| Descriptor | Marks |
|---|---|
| correctly states two equations relating $n$ and $p$ | 1 |
| correctly solves for $n$ and $p$ | 1 |
| correctly calculates the probability | 1 |
A council wants to survey residents about a new dog park. Which sampling method would best minimise bias in the survey?
Questioning every third resident entering a supermarket near an existing dog park.
Collecting responses from residents who clicked a survey link on the website.
Asking residents visiting a dog park on a randomly selected day.
Selecting residents using a random number generator.
Reveal Answer
Questioning every third resident entering a supermarket near an existing dog park.
Surveying near an existing dog park introduces location bias, as people in that area might have stronger opinions about dog parks than the general population.
Collecting responses from residents who clicked a survey link on the website.
This relies on voluntary response sampling, which introduces self-selection bias because only residents with strong opinions are likely to participate.
Asking residents visiting a dog park on a randomly selected day.
Surveying at a dog park introduces selection bias by overrepresenting dog owners and excluding residents who do not currently use dog parks.
Selecting residents using a random number generator.
Using a random number generator creates a simple random sample, giving every resident an equal chance of being chosen and effectively minimising bias.
A mathematics teacher uses a coin flip activity to demonstrate confidence intervals to their class. They flip a fair coin 50 times in front of the class and observe 30 heads and 20 tails.
Calculate a 90% confidence interval for the proportion of heads obtained when the coin is flipped.
Reveal Answer
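The sample proportion of heads is given by
$$\hat{p} = \frac{30}{50} = 0.6$$
Hence, the 90% confidence interval is
$$0.6 \pm 1.645\sqrt{\frac{0.6 \times 0.4}{50}} \approx 0.6 \pm 0.114, \quad \text{i.e. approximately } (0.486,\ 0.714)$$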
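
The same interval can be computed with a small helper. This is a sketch using the usual normal-approximation interval $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$, and `proportion_ci` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, level):
    """Approximate confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.645 for a 90% level.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 30 heads observed in 50 flips.
lower, upper = proportion_ci(30, 50, 0.90)
print(f"({lower:.3f}, {upper:.3f})")
```

Note that the interval contains 0.5, the true proportion for a fair coin.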
| Descriptor | Marks |
|---|---|
| correctly calculates sample proportion of heads | 1 |
| correctly calculates confidence interval | 1 |
As a homework exercise, the teacher asks all 20 students in the class to repeat the coin activity and calculate their own individual 90% confidence interval for the proportion of heads. Let $X$ be a random variable that denotes the number of students whose confidence interval contains the true proportion of heads.
State the distribution for $X$.
Reveal Answer
$X \sim \operatorname{Bin}(20,\ 0.9)$, since each of the 20 independent 90% confidence intervals contains the true proportion with probability 0.9.
| Descriptor | Marks |
|---|---|
| states that the distribution is binomial | 1 |
| states correct distribution parameters | 1 |
Determine the expected value and variance of $X$.
Reveal Answer
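The expected value of $X$ is given by
$$E(X) = np = 20 \times 0.9 = 18$$
The variance of $X$ is given by
$$\operatorname{Var}(X) = np(1-p) = 20 \times 0.9 \times 0.1 = 1.8$$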
| Descriptor | Marks |
|---|---|
| correctly calculates expected value | 1 |
| correctly calculates variance | 1 |
Calculate the probability that the confidence intervals of three students do not contain the true proportion.
Reveal Answer
If three confidence intervals do not contain the true proportion, then 17 do contain the true proportion.
$$P(X = 17) = \binom{20}{17}(0.9)^{17}(0.1)^{3} \approx 0.1901$$
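
The binomial quantities in this question can be verified with the sketch below, which assumes the model of 20 independent students each with a 0.9 chance of capturing the true proportion.

```python
from math import comb

n, p = 20, 0.9  # 20 students, each 90% CI covers the true value w.p. 0.9

mean = n * p                 # E(X) = np
variance = n * p * (1 - p)   # Var(X) = np(1 - p)

# Three intervals miss the true proportion <=> exactly 17 contain it.
prob_17 = comb(n, 17) * p**17 * (1 - p) ** 3

print(f"E(X) = {mean:.1f}, Var(X) = {variance:.1f}")
print(f"P(X = 17) = {prob_17:.4f}")
```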
| Descriptor | Marks |
|---|---|
| identifies that they are considering 17 confidence intervals containing the true proportion or defines the distribution for the complementary event | 1 |
| calculates the correct probability | 1 |
A percentile is a measure in statistics showing the value below which a given percentage of observations occur.
The continuous random variable has the probability density function
Determine the 36th percentile of .
Reveal Answer
or
Given
| Descriptor | Marks |
|---|---|
| correctly determines the definite integral | 1 |
| determines the quadratic equation | 1 |
| determines values of $a$ | 1 |
| evaluates the reasonableness of solutions | 1 |
A survey was conducted to understand whether people support a new policy.
Using a z-score of 2, the approximate confidence interval for the population proportion of people who support the policy was calculated as .
Determine the margin of error.
Reveal Answer
The confidence interval corresponds to $\hat{p} \pm E$, where $E$ is the margin of error about $\hat{p}$.
subtracting:
| Descriptor | Marks |
|---|---|
| Correctly determines the margin of error | 1 |
Determine the number of people surveyed.
Reveal Answer
Upper CI value =
25 people were surveyed.
| Descriptor | Marks |
|---|---|
| Determines the value of the sample proportion | 1 |
| Substitutes $\hat{p}$ and z-score into the confidence interval formula | 1 |
| Determines the number of people surveyed | 1 |
If the probability of success in a Bernoulli trial is 0.30, the variance is
0.70
0.46
0.30
0.21
Reveal Answer
0.70
This value represents the probability of failure ($1 - p = 0.70$), not the variance.
0.46
This value is approximately the standard deviation ($\sqrt{0.21} \approx 0.46$), rather than the variance.
0.30
This is the probability of success ($p = 0.30$), not the variance.
0.21
The variance of a Bernoulli trial is calculated using the formula $p(1-p)$. With $p = 0.30$, the variance is $0.30 \times 0.70 = 0.21$.
Mrs Euler is having her car serviced at BIMDAS Mechanics. She drops her vehicle off at 8 am and is told that her car will be ready for collection at some time between 1 pm and 5 pm that day.
Let the random variable denote the time after noon (12 pm) at which a vehicle is ready for collection at BIMDAS Mechanics. The probability density function for is shown in the graph below.
The probability of a vehicle being ready for collection between 2 pm and 3 pm is 0.1.
Mr Euler is also having his car serviced, but by Addition Autos. He drops his vehicle off at 8 am and is told that his car will be ready for collection at some time between 1 pm and 5 pm that day.
Let the random variable denote the time after noon (12 pm) that a vehicle is ready for collection at Addition Autos. The cumulative distribution function for is given by
Determine the value of .
Reveal Answer
The area under the curve must be equal to 1. Hence
| Descriptor | Marks |
|---|---|
| states that the area under the curve must equal 1 | 1 |
| obtains correct value of | 1 |
An incomplete expression for the probability density function of is given below. Fill in the boxes to complete the missing parts of the expression.
Reveal Answer
The probability density function for is given by
| Descriptor | Marks |
|---|---|
| correctly completes the interval | 1 |
| correctly completes the linear function | 1 |
Determine the expected time that Mrs Euler's vehicle will be ready for collection at BIMDAS Mechanics.
Reveal Answer
Therefore, the expected pickup time is 3:48 pm.
| Descriptor | Marks |
|---|---|
| states a correct integral expression for the expected value of | 1 |
| determines the correct expected value of | 1 |
| states the expected value as a time | 1 |
Determine the probability that Mr Euler's vehicle will be ready to collect
by 3 pm.
Reveal Answer
| Descriptor | Marks |
|---|---|
| calculates correct probability | 1 |
between 3 pm and 4 pm.
Reveal Answer
| Descriptor | Marks |
|---|---|
| expresses the probability as the difference | 1 |
| calculates correct probability | 1 |
Determine the expected time at which Mr Euler's vehicle will be ready for collection at Addition Autos.
Reveal Answer
The probability density function is given by
for (0 otherwise). Hence the expected value is given by
Therefore, the expected pickup time is 2:20 pm.
| Descriptor | Marks |
|---|---|
| determines correct expression for the probability density function for | 1 |
| determines the correct expected value for | 1 |
| states the expected value as a time | 1 |
Unexplained respiratory symptoms reported by athletes are sometimes incorrectly thought to be exercise-induced asthma. A researcher wants to investigate the proportion of Australian athletes with unexplained respiratory symptoms who do have exercise-induced asthma. Using a nationwide repository of medical records, the researcher collects a random sample of 71 athletes referred by their doctor for unexplained respiratory symptoms.
Identify and explain a possible source of bias in the sampling method.
Reveal Answer
Exclusion bias: Athletes with unexplained respiratory symptoms who do not seek medical intervention, or who do not identify themselves as athletes to their doctor, will not be represented in the sample.
| Descriptor | Marks |
|---|---|
| correctly explains a valid source of bias | 1 |
Ignore the potential bias in the sampling method in the remaining parts of the question.
Suppose that 25 athletes from the sample were found to have exercise-induced asthma.
Calculate a 95% confidence interval for the true proportion of athletes with unexplained respiratory symptoms who do have exercise-induced asthma.
Reveal Answer
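The sample proportion is
$$\hat{p} = \frac{25}{71} \approx 0.3521$$
Hence the 95% confidence interval is given by
$$0.3521 \pm 1.96\sqrt{\frac{0.3521 \times 0.6479}{71}} \approx 0.3521 \pm 0.1111, \quad \text{i.e. approximately } (0.241,\ 0.463)$$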
| Descriptor | Marks |
|---|---|
| calculates the correct sample proportion | 1 |
| calculates the correct 95% confidence interval | 1 |
Determine the margin of error of the 95% confidence interval from part (b).
Reveal Answer
The margin of error is given by
$$E = 1.96\sqrt{\frac{0.3521 \times 0.6479}{71}} \approx 0.111$$
| Descriptor | Marks |
|---|---|
| correctly determines the margin of error | 1 |
All else remaining unchanged, what would you expect to happen to the margin of error if the sample size was increased?
Reveal Answer
The margin of error would decrease if the sample size is increased.
| Descriptor | Marks |
|---|---|
| states that the margin of error would decrease | 1 |
All else remaining unchanged, what would you expect to happen to the margin of error if the confidence level was increased?
Reveal Answer
The margin of error would increase if the confidence level is increased.
| Descriptor | Marks |
|---|---|
| states that the margin of error would increase | 1 |
All else remaining unchanged, what would you expect to happen to the margin of error if the sample proportion of athletes with exercise-induced asthma decreased? Justify your answer.
Reveal Answer
Since the sample proportion is less than 0.5 (0.5 is the sample proportion yielding the largest margin of error), a decrease in sample proportion would move the margin of error away from being maximum. Hence the margin of error would decrease.
| Descriptor | Marks |
|---|---|
| states that the margin of error would decrease | 1 |
| provides a valid justification based on the sample proportion moving away from 0.5 | 1 |
Determine the minimum sample size required to guarantee a margin of error for the 95% confidence interval of at most 0.04.
Reveal Answer
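We require
$$1.96\sqrt{\frac{0.5 \times 0.5}{n}} \le 0.04 \quad\Longrightarrow\quad n \ge \left(\frac{1.96}{0.04}\right)^{2} \times 0.25 = 600.25$$
using $\hat{p} = 0.5$, the sample proportion that yields the largest margin of error.
Since $n$ must be an integer, the minimum value of the sample size is $n = 601$.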
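
A sketch of the sample-size calculation. It assumes the worst case $\hat{p} = 0.5$, which is what "guarantee" is taken to mean here, and `min_sample_size` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(margin, level, p=0.5):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin (worst case p = 0.5)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Rearranged: n >= z^2 * p * (1 - p) / margin^2, rounded up.
    return ceil(z**2 * p * (1 - p) / margin**2)


print(min_sample_size(0.04, 0.95))
```

Rounding up rather than to the nearest integer matters: a smaller sample would leave the margin of error slightly above 0.04.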
| Descriptor | Marks |
|---|---|
| correctly substitutes values into the margin of error equation to obtain an equation for $n$ | 1 |
| correctly solves for $n$, rounding up to an integer value | 1 |
A separate large-scale study from the United States of America claims that 20% of American athletes with unexplained respiratory symptoms do have exercise-induced asthma.
Based on the 95% confidence interval calculated in part (b), is the proportion of Australian athletes with unexplained respiratory symptoms who do have exercise-induced asthma different from the American proportion? Justify your answer.
Reveal Answer
Yes. The claimed American proportion is outside the confidence interval for Australian athletes, so it is possible to conclude that the proportions are different at the 95% level of confidence.
| Descriptor | Marks |
|---|---|
| states that the proportion is different | 1 |
| provides a justification based on the American proportion being outside the Australian confidence interval | 1 |
The probability of hitting a bullseye on a standard dartboard is 1 in 1250. What is the probability of hitting a bullseye exactly once in 10 attempts?
Reveal Answer
This option uses the wrong number of trials in the binomial coefficient. Since there are 10 attempts, the coefficient should be $\binom{10}{1}$.
This option uses the wrong number of trials and swaps the exponents, calculating the probability of 9 successes and 1 failure instead of 1 success and 9 failures.
This correctly applies the binomial probability formula with $n = 10$ trials, $k = 1$ success, and probability of success $p = \frac{1}{1250}$, giving $\binom{10}{1}\left(\frac{1}{1250}\right)^{1}\left(\frac{1249}{1250}\right)^{9}$.
This option swaps the exponents for success and failure. It calculates the probability of hitting the bullseye 9 times ($\left(\frac{1}{1250}\right)^{9}$) and missing once, rather than hitting it exactly once.
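
The correct expression can be evaluated exactly with Python's `fractions` module. This is a sketch for checking the arithmetic, not part of the original answer.

```python
from fractions import Fraction
from math import comb

n, k = 10, 1             # 10 attempts, exactly 1 bullseye
p = Fraction(1, 1250)    # probability of a bullseye on one attempt

# Binomial probability: C(n, k) * p^k * (1 - p)^(n - k)
prob = comb(n, k) * p**k * (1 - p) ** (n - k)

print(f"{float(prob):.6f}")
```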
Bottles of soft drink should contain a volume with a mean of 591 mL, but some variation is expected.
Any bottle at or below the 20th percentile of the volume distribution is rejected. A percentile is a measure in statistics that shows the values below which a given percentage of observations occur.
Thirty-five per cent of the bottles contain 593 mL or more of soft drink.
Assuming the volumes are normally distributed, determine the smallest volume (in mL) that will be accepted.
Reveal Answer
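Given $P(X \ge 593) = 0.35$, so $P(X < 593) = 0.65$
Using GDC
z-score associated with 65th percentile: $z \approx 0.3853$
$$\sigma = \frac{593 - 591}{0.3853} \approx 5.19$$
Using GDC
z-score associated with 20% rejection region: $z \approx -0.8416$
To determine the smallest volume that will be accepted ($x$)
$$x = 591 - 0.8416 \times 5.19 \approx 586.6 \text{ mL}$$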
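
In place of a GDC, the two inverse-normal lookups can be sketched with `statistics.NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

MEAN = 591  # mL, from the question
std_normal = NormalDist()  # standard normal, mean 0 and sd 1

# 35% of bottles hold 593 mL or more, so 593 mL is the 65th percentile.
z65 = std_normal.inv_cdf(0.65)
sigma = (593 - MEAN) / z65

# Bottles at or below the 20th percentile are rejected.
z20 = std_normal.inv_cdf(0.20)
cutoff = MEAN + z20 * sigma

print(f"sigma = {sigma:.2f} mL")
print(f"20th percentile = {cutoff:.2f} mL")
```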
| Descriptor | Marks |
|---|---|
| Correctly determines the z-score associated with the 65th percentile | 1 |
| Determines $\sigma$ | 1 |
| Correctly determines the z-score associated with the 20% rejection region | 1 |
| Determines the smallest volume | 1 |
At council meetings in a particular town, new proposals are only discussed if more than 80% of the community are in favour of the proposal.
To discover community opinion on a new bus route proposal, the council conducted several surveys, each with a sample size of 120. The distribution of the sample proportions from the surveys had a standard deviation of 0.04.
Make a justified decision as to whether the new bus route proposal would be discussed at a council meeting.
Reveal Answer
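Analytical procedure:
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \quad \text{(formula book)}$$
$$0.04 = \sqrt{\frac{p(1-p)}{120}}$$
Use a GDC equation solver, or graph the square root function, to find the intersections:
$$p \approx 0.26 \quad \text{or} \quad p \approx 0.74$$
Both population proportions (26% and 74%) are less than 80%.
Therefore, the bus route proposal would not be discussed at a council meeting.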
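
The two candidate proportions can also be found algebraically. Squaring the formula gives $p(1-p) = n\sigma^2$, i.e. the quadratic $p^2 - p + n\sigma^2 = 0$, which the sketch below solves with the quadratic formula.

```python
from math import sqrt

n = 120     # sample size of each survey
sd = 0.04   # standard deviation of the sample proportions

# sd^2 = p(1 - p)/n  =>  p^2 - p + n * sd^2 = 0
c = n * sd**2
disc = 1 - 4 * c
roots = ((1 - sqrt(disc)) / 2, (1 + sqrt(disc)) / 2)

print(f"p = {roots[0]:.4f} or p = {roots[1]:.4f}")
print("either exceeds 80%:", max(roots) > 0.8)
```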
| Descriptor | Marks |
|---|---|
| Correctly substitutes the given information into the standard deviation formula for sample proportion | 1 |
| Determines a possible value for the population proportion | 1 |
| Determines a second possible population proportion value | 1 |
| Makes a justified decision regarding the proposal | 1 |