VCAA Mathematical Methods Data analysis, probability and statistics
15 sample questions with marking guides and sample answers · Avg. score: 68.8%
Mika is flipping a coin. The unbiased coin has a probability of $\frac{1}{2}$ of landing on heads and $\frac{1}{2}$ of landing on tails.
Let be the binomial random variable representing the number of times that the coin lands on heads.
Mika flips the coin five times.
The height reached by each of Mika's coin flips is given by a continuous random variable, , with the probability density function
where is the vertical height reached by the coin flip, in metres, between the coin and the floor, and , and are real constants.
Mika's sister Bella also has a coin. On each flip, Bella's coin has a probability of of landing on heads and of landing on tails, where is a constant value between 0 and 1.
Bella flips her coin 25 times in order to estimate .
Let be the random variable representing the proportion of times that Bella's coin lands on heads in her sample.
Find .
Reveal Answer
| Descriptor | Marks |
|---|---|
Calculates the correct exact probability: or | 1 |
Find .
Reveal Answer
| Descriptor | Marks |
|---|---|
Calculates the correct exact probability: or | 1 |
Find , correct to three decimal places.
Reveal Answer
Working
| Descriptor | Marks |
|---|---|
Sets up the correct conditional probability expression or substitution, e.g., or | 1 |
Answer
| Descriptor | Marks |
|---|---|
Calculates the correct answer to three decimal places: | 1 |
Find the expected value and the standard deviation for .
Reveal Answer
,
Expected Value
| Descriptor | Marks |
|---|---|
Calculates the correct expected value: or | 1 |
Standard Deviation
| Descriptor | Marks |
|---|---|
Calculates the correct exact standard deviation: | 1 |
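The expected value and standard deviation asked for above follow from the standard binomial formulas $np$ and $\sqrt{np(1-p)}$. The sketch below assumes the variable in question is the number of heads in Mika's five flips of the unbiased coin; the function name `binomial_mean_sd` is illustrative only.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of a Binomial(n, p) random variable."""
    mean = n * p                 # E(X) = np
    sd = sqrt(n * p * (1 - p))   # sd(X) = sqrt(np(1 - p))
    return mean, sd

# Assumption: five flips of a fair coin, so n = 5 and p = 1/2.
mean, sd = binomial_mean_sd(5, 0.5)
print(mean, sd)
```

Under these assumptions the mean is 5/2 and the standard deviation is the square root of 5/4.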
State the value of the definite integral .
Reveal Answer
| Descriptor | Marks |
|---|---|
States the correct value of the definite integral: | 1 |
Given that and , find the values of , and .
Reveal Answer
, ,
Working
| Descriptor | Marks |
|---|---|
Sets up the correct system of definite integrals: , , and | 1 |
Demonstrates a correct method to evaluate the integrals and solve the system of equations for the constants | 1 |
Answer
| Descriptor | Marks |
|---|---|
Calculates all three correct exact values: (or ), (or ), and | 1 |
The ceiling of Mika's room is 3 m above the floor. The minimum distance between the coin and the ceiling is a continuous random variable, , with probability density function .
The function is a transformation of the function given by ,
where is the minimum distance between the coin and the ceiling, and and are real constants.
Find the values of and .
Reveal Answer
| Descriptor | Marks |
|---|---|
Finds both correct values: and | 1 |
Is the random variable discrete or continuous? Justify your answer.
Reveal Answer
Discrete, countable
| Descriptor | Marks |
|---|---|
Identifies the random variable as discrete and provides a valid justification (e.g., the number of heads is countable) | 1 |
If , find an approximate 95% confidence interval for , correct to three decimal places.
Reveal Answer
| Descriptor | Marks |
|---|---|
Calculates the correct 95% confidence interval, correct to three decimal places: | 1 |
Bella knows that she can decrease the width of a 95% confidence interval by using a larger sample of coin flips.
If , how many coin flips would be required to halve the width of the confidence interval found in part c.ii.?
Reveal Answer
| Descriptor | Marks |
|---|---|
Calculates the correct number of coin flips required: | 1 |
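The width of an approximate confidence interval for a proportion scales with $1/\sqrt{n}$, so halving the width needs four times as many flips. The sketch below assumes the sample proportion and confidence level stay unchanged; `P_HAT_EXAMPLE` and `z = 1.96` are placeholder values chosen for illustration, not values taken from the question.

```python
from math import sqrt

def required_sample_size(n_current, width_factor):
    """Sample size needed to scale the interval width by width_factor,
    assuming the sample proportion and confidence level do not change."""
    return n_current / width_factor ** 2

def interval_width(p_hat, n, z=1.96):
    """Width of the approximate interval p_hat +/- z * sqrt(p_hat(1 - p_hat)/n)."""
    return 2 * z * sqrt(p_hat * (1 - p_hat) / n)

n_required = required_sample_size(25, 0.5)   # Bella's 25 flips, width halved

# Check with a hypothetical sample proportion (placeholder, not from the source).
P_HAT_EXAMPLE = 0.4
ratio = interval_width(P_HAT_EXAMPLE, n_required) / interval_width(P_HAT_EXAMPLE, 25)
print(n_required, ratio)
```

The ratio does not depend on the placeholder proportion, which is why the required number of flips can be found without recomputing the interval.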
In a computer fishing game, a player repeatedly casts their hook into either a blue pond or a red pond. Each fish they catch scores points.
In the blue pond, the probability of catching a fish on a cast is and each fish caught scores 10 points.
In the red pond, the probability of catching a fish on a cast is and each fish caught scores 15 points.
A player has three casts left and needs to score at least 30 points to win. All remaining casts must be in the same pond.
It is claimed that the probability of winning if casting in the blue pond is more than the probability of winning if casting in the red pond. Evaluate the reasonableness of the claim.
Reveal Answer
Let be a binomial random variable representing the number of casts that caught a fish.
corresponds to a probability of a catch (success).
Blue pond:
Red pond:
In the blue pond, 30 points will require three successes in three casts:
In the red pond, 30 points will require at least two successes in three casts:
i.e.
In the blue pond, the required probability is
In the red pond, the required probability is
The claim is correct. The probability of winning if casting in the blue pond is greater than the probability of winning if casting in the red pond.
| Descriptor | Marks |
|---|---|
correctly identifies a method to determine at least 30 points in three casts for the blue pond | 1 |
determines the probability of scoring at least 30 points in the blue pond | 1 |
correctly identifies a method to determine at least 30 points in three casts for the red pond | 1 |
substitutes appropriate values into the chosen method for the red pond | 1 |
determines the probability of scoring at least 30 points in the red pond | 1 |
evaluates the claim | 1 |
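The reasoning in this answer can be sketched in two steps: find the minimum number of catches needed to reach 30 points in each pond, then sum the binomial probabilities from that minimum up to three casts. The catch probabilities did not survive in the text above, so `P_BLUE_EXAMPLE` and `P_RED_EXAMPLE` below are hypothetical placeholders and the printed probabilities are not the question's answer.

```python
from math import ceil, comb

def min_catches(target_points, points_per_fish):
    """Fewest catches whose points total at least target_points."""
    return ceil(target_points / points_per_fish)

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

CASTS = 3
TARGET = 30
blue_needed = min_catches(TARGET, 10)   # 10 points per fish in the blue pond
red_needed = min_catches(TARGET, 15)    # 15 points per fish in the red pond

# Hypothetical catch probabilities (placeholders, NOT the values in the question).
P_BLUE_EXAMPLE = 0.8
P_RED_EXAMPLE = 0.5
win_blue = prob_at_least(blue_needed, CASTS, P_BLUE_EXAMPLE)
win_red = prob_at_least(red_needed, CASTS, P_RED_EXAMPLE)
print(blue_needed, red_needed, win_blue, win_red)
```

Substituting the question's actual probabilities for the two placeholders gives the comparison needed to evaluate the claim.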
Handspans of teenagers are approximately normally distributed, with a mean of 15 cm and a standard deviation of 2 cm.
Which of the following groups is expected to be the largest?
teenagers with handspans that are between 7 cm and 11 cm
teenagers with handspans that are between 11 cm and 15 cm
teenagers with handspans that are between 13 cm and 17 cm
teenagers with handspans that are between 17 cm and 21 cm
Reveal Answer
teenagers with handspans that are between 7 cm and 11 cm
This range falls between 2 and 4 standard deviations below the mean ($\mu - 4\sigma$ to $\mu - 2\sigma$). This represents the far left tail of the curve, which contains a negligible percentage of the population.
teenagers with handspans that are between 11 cm and 15 cm
This range covers the area from 2 standard deviations below the mean up to the mean ($\mu - 2\sigma$ to $\mu$). While substantial, it captures less area than an interval of the same width centred directly on the peak of the distribution.
teenagers with handspans that are between 13 cm and 17 cm
This range corresponds to exactly one standard deviation below and above the mean ($15 \pm 2$ cm). Because the normal distribution is symmetric and peaks at the mean, the interval centred on the mean contains the largest proportion of data (approximately 68%).
teenagers with handspans that are between 17 cm and 21 cm
This range falls between 1 and 3 standard deviations above the mean ($\mu + \sigma$ to $\mu + 3\sigma$). This represents the tapering right tail of the distribution, which contains significantly fewer teenagers than the central range.
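As a numerical check on the four options, the proportion in each group can be computed from the normal distribution with mean 15 cm and standard deviation 2 cm. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

handspan = NormalDist(mu=15, sigma=2)   # mean 15 cm, standard deviation 2 cm

groups = [(7, 11), (11, 15), (13, 17), (17, 21)]

# Proportion of teenagers in each group: P(low < X < high)
proportions = {
    (low, high): handspan.cdf(high) - handspan.cdf(low)
    for low, high in groups
}

largest = max(proportions, key=proportions.get)
for (low, high), prop in proportions.items():
    print(f"{low} cm to {high} cm: {prop:.4f}")
print("largest group:", largest)
```

The interval from 13 cm to 17 cm comes out largest, at roughly 68% of the population.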
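Assuming the approximate normality of sample proportions ($\hat{p}_1$ and $\hat{p}_2$) and based on two independent samples, the approximate confidence interval for the difference of two proportions is given by
$(\hat{p}_1 - \hat{p}_2) \pm z\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$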
If the approximate confidence interval for the difference between two proportions does not contain 0, this provides evidence that the two proportions are not equal.
The data in the table shows the observed frequencies of two drink preferences for independent samples of people who live in Town A and Town B.
| Town | Tea | Coffee | Total |
|---|---|---|---|
| A | 111 | 105 | 216 |
| B | 150 | 107 | 257 |
Using the approximate 99% confidence interval for the difference of two proportions, determine if there is evidence to conclude that drink preference is associated with the town where the person lives.
Reveal Answer
proportion of Town A who prefer to drink tea
proportion of Town B who prefer to drink tea
The sample proportions are:
Using the 99% confidence interval for the difference of two proportions
This interval contains zero; therefore, there is no evidence in the data to say that the two proportions are different, i.e. preference to drink tea does not depend on where the person lives.
| Descriptor | Marks |
|---|---|
correctly determines the sample proportions | 1 |
establishes confidence interval for the difference of two proportions | 1 |
determines 99% confidence interval | 1 |
interprets 99% confidence interval to determine equality of proportions | 1 |
shows logical organisation communicating key steps | 1 |
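The calculation in this answer can be reproduced from the table. The sketch below assumes the standard two-proportion interval (difference of sample proportions plus or minus $z$ times the combined standard error) and takes the difference as Town A minus Town B; the function name `diff_proportion_ci` is illustrative, and the source's own working may order the towns differently.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(x1, n1, x2, n2, level):
    """Approximate confidence interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 2.576 for 99%
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Tea drinkers: 111 of 216 in Town A, 150 of 257 in Town B (from the table).
lower, upper = diff_proportion_ci(111, 216, 150, 257, 0.99)
contains_zero = lower < 0 < upper
print(round(lower, 3), round(upper, 3), contains_zero)
```

With these figures the interval runs from about -0.188 to about 0.048, which contains zero, in line with the conclusion stated in the answer.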
The normal distribution probability density function is
, with the parameters mean, , and standard deviation, .
The speeds of electric scooter (e-scooter) riders on a particular section of a bike path are approximately normally distributed with a mean of 18 km/h. It is known that .
The speed limit for e-scooters on this section of bike path is 23 km/h.
A speed camera is set up and records the speeds of 75 e-scooter riders. Every rider travelling faster than the speed limit is given a $143 fine. Before setting up the speed camera, the following suggestion was made.
The total of the fines expected to be issued will be more than $1500.
Evaluate the reasonableness of this suggestion.
Reveal Answer
Given that
Using a GDC:
Two solutions are obtained for the standard deviation
and
Reject 28.4020 as it is not a possible standard deviation: for example, three standard deviations below the mean would give a negative speed, and three above would give an impossible speed for an e-scooter ($18 + 3 \times 28.4020 \approx 103.2$ km/h).
Use a GDC to determine with ,
The number of riders is:
The total fines obtained:
The expected total of the fines is about $1133, which is less than the suggested $1500, so the suggestion is not reasonable.
| Descriptor | Marks |
|---|---|
Correctly substitutes the known information, and , into the given normal distribution formula | 1 |
Determines a possible value for the standard deviation | 1 |
Determines the proportion of riders above 23 km/h | 1 |
Determines the number of riders above 23 km/h | 1 |
Determines the expected total fines | 1 |
Provides appropriate statement of reasonableness | 1 |
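The last three steps of this answer (proportion over the limit, number of riders, total fines) can be sketched as follows. The accepted standard deviation is not legible in the text above, so `SIGMA_ASSUMED = 4.0` is an assumption, chosen because it reproduces the stated total of about $1133; it should be replaced by the value obtained from the GDC.

```python
from statistics import NormalDist

MEAN_SPEED = 18.0     # km/h, from the question
SPEED_LIMIT = 23.0    # km/h, from the question
RIDERS = 75           # riders recorded by the camera
FINE = 143            # dollars per fine
SIGMA_ASSUMED = 4.0   # ASSUMPTION: accepted standard deviation, not shown in the source

def expected_total_fines(sigma):
    """Expected total fines when speeds are Normal(MEAN_SPEED, sigma)."""
    p_over = 1 - NormalDist(MEAN_SPEED, sigma).cdf(SPEED_LIMIT)
    expected_riders = RIDERS * p_over
    return expected_riders * FINE

total = expected_total_fines(SIGMA_ASSUMED)
print(round(total, 2), total > 1500)
```

Under that assumption about 10.6% of riders exceed the limit, giving roughly 7.9 riders and a total well below $1500.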
Suppose that the queuing time, (in minutes), at a customer service desk has a probability density function given by
for some .
Show that .
Reveal Answer
| Descriptor | Marks |
|---|---|
Correctly forms an integral equation equal to 1, antidifferentiates, and solves to show | 1 |
Find .
Reveal Answer
| Descriptor | Marks |
|---|---|
Sets up the correct integral for the expected value, e.g., | 1 |
Evaluates the integral correctly to find the final answer of or | 1 |
What is the probability that a person has to queue for more than two minutes, given that they have already queued for one minute?
Reveal Answer
| Descriptor | Marks |
|---|---|
Correctly formulates the conditional probability expression, such as | 1 |
Correctly substitutes the integrals into the numerator and denominator as a single fraction | 1 |
Correctly evaluates the expression to find the final probability of or | 1 |
A council wants to survey residents about a new dog park. Which sampling method would best minimise bias in the survey?
Questioning every third resident entering a supermarket near an existing dog park.
Collecting responses from residents who clicked a survey link on the website.
Asking residents visiting a dog park on a randomly selected day.
Selecting residents using a random number generator.
Reveal Answer
Questioning every third resident entering a supermarket near an existing dog park.
Surveying near an existing dog park introduces location bias, as people in that area might have stronger opinions about dog parks than the general population.
Collecting responses from residents who clicked a survey link on the website.
This relies on voluntary response sampling, which introduces self-selection bias because only residents with strong opinions are likely to participate.
Asking residents visiting a dog park on a randomly selected day.
Surveying at a dog park introduces selection bias by overrepresenting dog owners and excluding residents who do not currently use dog parks.
Selecting residents using a random number generator.
Using a random number generator creates a simple random sample, giving every resident an equal chance of being chosen and effectively minimising bias.
A mathematics teacher uses a coin flip activity to demonstrate confidence intervals to their class. They flip a fair coin 50 times in front of the class and observe 30 heads and 20 tails.
Calculate a 90% confidence interval for the proportion of heads obtained when the coin is flipped.
Reveal Answer
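The sample proportion of heads is given by $\hat{p} = \frac{30}{50} = 0.6$.
Hence, the 90% confidence interval is $0.6 \pm 1.645\sqrt{\frac{0.6 \times 0.4}{50}} \approx (0.486, 0.714)$.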
| Descriptor | Marks |
|---|---|
correctly calculates sample proportion of heads | 1 |
correctly calculates confidence interval | 1 |
As a homework exercise, the teacher asks all 20 students in the class to repeat the coin activity and calculate their own individual 90% confidence interval for the proportion of heads. Let be a random variable that denotes the number of students whose confidence interval contains the true proportion of heads.
State the distribution for .
Reveal Answer
| Descriptor | Marks |
|---|---|
states that the distribution is binomial | 1 |
states correct distribution parameters | 1 |
Determine the expected value and variance of .
Reveal Answer
The expected value of is given by
The variance of is given by
| Descriptor | Marks |
|---|---|
correctly calculates expected value | 1 |
correctly calculates variance | 1 |
Calculate the probability that the confidence intervals of three students do not contain the true proportion.
Reveal Answer
If three confidence intervals did not contain the true proportion, then 17 did contain the true proportion.
| Descriptor | Marks |
|---|---|
identifies that they are considering 17 confidence intervals containing the true proportion or defines the distribution for the complementary event | 1 |
calculates the correct probability | 1 |
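The three parts about the class of 20 students can be checked numerically. The sketch below assumes the number of intervals that contain the true proportion is binomial with $n = 20$ and success probability $0.9$ (each 90% confidence interval is treated as covering the true proportion with probability 0.9, independently of the others); the names are illustrative.

```python
from math import comb

N_STUDENTS = 20
P_COVER = 0.9   # assumption: each 90% interval covers the true proportion with probability 0.9

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

expected_value = N_STUDENTS * P_COVER                # np
variance = N_STUDENTS * P_COVER * (1 - P_COVER)      # np(1 - p)

# Three intervals missing the true proportion means 17 of the 20 contain it.
p_three_miss = binomial_pmf(17, N_STUDENTS, P_COVER)
print(expected_value, variance, round(p_three_miss, 4))
```

Under these assumptions the expected value is 18, the variance is 1.8, and the probability that exactly three intervals miss is about 0.19.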
Let be the probability density function for a continuous random variable , where
and is a positive real number.
The value of is
Reveal Answer
This is the value of the constant , not the probability . The constant is found by setting the total area under the PDF to 1.
First, find by setting the integral of from to equal to . Then, calculate by evaluating .
This is only the probability . It incorrectly omits the probability which must be included since the lower bound of the domain is .
This is an incorrect calculation. A probability value cannot exceed 1, so is impossible for any valid probability density function.
Let be the random variable that represents the sample proportion of households in a given suburb that have solar panels installed.
From a sample of randomly selected households in a given suburb, an approximate 95% confidence interval for the proportion of households having solar panels installed was determined to be .
Use to approximate the 95% confidence interval.
Find the value of that was used to obtain this approximate 95% confidence interval.
Reveal Answer
| Descriptor | Marks |
|---|---|
Calculates the correct value of | 1 |
Find the size of the sample from which this 95% confidence interval was obtained.
Reveal Answer
| Descriptor | Marks |
|---|---|
Sets up a correct equation involving using the margin of error or confidence interval bounds | 1 |
Calculates the correct sample size, | 1 |
A larger sample of households is selected, with a sample size four times the original sample.
The sample proportion of households having solar panels installed is found to be the same.
By what factor will the increased sample size affect the width of the confidence interval?
Reveal Answer
Confidence interval width is halved (reduced or decreased by a factor of 2; altered by a factor of $\frac{1}{2}$).
| Descriptor | Marks |
|---|---|
States the correct factor by which the width is affected (e.g., $\frac{1}{2}$, halved, or decreased by a factor of 2) | 1 |
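A short derivation shows why the width is halved. It assumes the usual approximate interval $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$, with $W$ denoting the width for the original sample of size $n$ and $W'$ the width for the sample of size $4n$ (these two symbols are introduced here for illustration).

```latex
\begin{align*}
W  &= 2z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \\
W' &= 2z\sqrt{\frac{\hat{p}(1-\hat{p})}{4n}}
    = \frac{1}{\sqrt{4}} \cdot 2z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
    = \frac{1}{2}W
\end{align*}
```

Because $\hat{p}$ and $z$ are unchanged, only the factor $1/\sqrt{4}$ survives in the ratio.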
The uniformly distributed continuous random variable has an expected value of 6 and a maximum value of 9. Determine the variance of .
Reveal Answer
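The expected value of a uniformly distributed continuous random variable is midway between the maximum and minimum values, so the minimum value is $2 \times 6 - 9 = 3$ and the probability density function is $f(x) = \frac{1}{6}$ for $3 \le x \le 9$ (0 otherwise).
The variance is given by $\int_3^9 (x - 6)^2 \cdot \frac{1}{6}\,dx = 3$.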
| Descriptor | Marks |
|---|---|
determines the correct value and domain of the probability density function | 1 |
writes a correct integral expression for the variance | 1 |
correctly calculates the variance | 1 |
The binomially distributed discrete random variable has a mean of and a variance of . Evaluate .
Reveal Answer
From the question and . It follows that
And
Hence,
| Descriptor | Marks |
|---|---|
correctly states two equations relating and | 1 |
correctly solves for and | 1 |
correctly calculates the probability | 1 |
A percentile is a measure in statistics showing the value below which a given percentage of observations occur.
The continuous random variable has the probability density function
Determine the 36th percentile of .
Reveal Answer
or
Given
| Descriptor | Marks |
|---|---|
correctly determines the definite integral | 1 |
determines the quadratic equation | 1 |
determines values of a | 1 |
evaluates the reasonableness of solutions | 1 |
A survey was conducted to understand whether people support a new policy.
Using a z-score of 2, the approximate confidence interval for the population proportion of people who support the policy was calculated as .
Determine the margin of error.
Reveal Answer
The confidence interval corresponds to , where E is the margin of error about .
subtracting:
| Descriptor | Marks |
|---|---|
Correctly determines the margin of error | 1 |
Determine the number of people surveyed.
Reveal Answer
Upper CI value =
25 people were surveyed.
| Descriptor | Marks |
|---|---|
Determines the value of the sample proportion | 1 |
Substitutes and z-score into the confidence interval formula | 1 |
Determines the number of people surveyed | 1 |
For a certain population the probability of a person being born with the specific gene SPGE1 is .
The probability of a person having this gene is independent of any other person in the population having this gene.
In a randomly selected group of four people, what is the probability that three or more people have the SPGE1 gene?
Reveal Answer
| Descriptor | Marks |
|---|---|
Formulates the correct probability expression for by summing the probabilities of exactly 3 and exactly 4 successes using the binomial distribution | 1 |
Calculates the correct final probability of | 1 |
In a randomly selected group of four people, what is the probability that exactly two people have the SPGE1 gene, given that at least one of those people has the SPGE1 gene? Express your answer in the form , where .
Reveal Answer
| Descriptor | Marks |
|---|---|
Evaluates and or correctly substitutes into the conditional probability formula | 1 |
Provides the correct final answer in the required form, | 1 |
Mrs Euler is having her car serviced at BIMDAS Mechanics. She drops her vehicle off at 8 am and is told that her car will be ready for collection at some time between 1 pm and 5 pm that day.
Let the random variable denote the time after noon (12 pm) at which a vehicle is ready for collection at BIMDAS Mechanics. The probability density function for is shown in the graph below.
The probability of a vehicle being ready for collection between 2 pm and 3 pm is 0.1.
Mr Euler is also having his car serviced, but by Addition Autos. He drops his vehicle off at 8 am and is told that his car will be ready for collection at some time between 1 pm and 5 pm that day.
Let the random variable denote the time after noon (12 pm) that a vehicle is ready for collection at Addition Autos. The cumulative distribution function for is given by
Determine the value of .
Reveal Answer
The area under the curve must be equal to 1. Hence
| Descriptor | Marks |
|---|---|
states that the area under the curve must equal 1 | 1 |
obtains correct value of | 1 |
An incomplete expression for the probability density function of is given below. Fill in the boxes to complete the missing parts of the expression.
Reveal Answer
The probability density function for is given by
| Descriptor | Marks |
|---|---|
correctly completes the interval | 1 |
correctly completes the linear function | 1 |
Determine the expected time that Mrs Euler's vehicle will be ready for collection at BIMDAS Mechanics.
Reveal Answer
Therefore, the expected pickup time is 3:48 pm.
| Descriptor | Marks |
|---|---|
states a correct integral expression for the expected value of | 1 |
determines the correct expected value of | 1 |
states the expected value as a time | 1 |
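The final mark in this guide is for converting an expected value, measured in hours after noon, into a clock time. A minimal sketch of that conversion follows; the helper name `hours_after_noon_to_clock` is hypothetical, and it is only intended for values from 1 up to (but not including) 12 hours.

```python
def hours_after_noon_to_clock(hours):
    """Convert hours after 12 pm (1 <= hours < 12) to an 'h:mm pm' string."""
    total_minutes = round(hours * 60)          # nearest whole minute
    hour, minute = divmod(total_minutes, 60)
    return f"{hour}:{minute:02d} pm"

print(hours_after_noon_to_clock(3.8))
```

An expected value of 3.8 hours after noon corresponds to 3:48 pm, since 0.8 of an hour is 48 minutes.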
Determine the probability that Mr Euler's vehicle will be ready to collect
by 3 pm.
Reveal Answer
| Descriptor | Marks |
|---|---|
calculates correct probability | 1 |
between 3 pm and 4 pm.
Reveal Answer
| Descriptor | Marks |
|---|---|
expresses the probability as the difference | 1 |
calculates correct probability | 1 |
Determine the expected time at which Mr Euler's vehicle will be ready for collection at Addition Autos.
Reveal Answer
The probability density function is given by
for (0 otherwise). Hence the expected value is given by
Therefore, the expected pickup time is 2:20 pm.
| Descriptor | Marks |
|---|---|
determines correct expression for the probability density function for | 1 |
determines the correct expected value for | 1 |
states the expected value as a time | 1 |