QCAA General Mathematics Bivariate data analysis 1

15 sample questions with marking guides and sample answers

Q22 (2020 QCAA Paper 1, 4 marks)

A store asked its junior and senior staff whether or not they would like to change the store uniform.
The results are in the frequency table.

| | Change uniform | Do not change uniform |
| --- | --- | --- |
| Junior staff | 92 | 28 |
| Senior staff | 23 | 67 |

Q22a (2 marks)

Convert the two-way table into a percentaged two-way frequency table using column totals.

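Answer

Column totals:
Total who want to change uniform = 92 + 23 = 115
Total who do not want to change uniform = 28 + 67 = 95

| | Change uniform | Do not change uniform |
| --- | --- | --- |
| Junior staff | 80% | 29.5% |
| Senior staff | 20% | 70.5% |
| Total | 100% | 100% |
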
Marking Criteria

| Descriptor | Marks |
| --- | --- |
| Correctly determines column totals | 1 |
| Correctly represents the data in a percentaged two-way table | 1 |
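
The column-percentage calculation in Q22a can also be checked with a short script. This is a minimal sketch, assuming Python; the names `counts`, `column_totals` and `column_percentages` are illustrative only and do not come from the QCAA material.

```python
# Counts from the Q22 frequency table (rows: staff group, columns: response).
counts = {
    "Junior staff": {"Change uniform": 92, "Do not change uniform": 28},
    "Senior staff": {"Change uniform": 23, "Do not change uniform": 67},
}
columns = ["Change uniform", "Do not change uniform"]

# Column totals are the denominators for a column-percentaged table.
column_totals = {c: sum(row[c] for row in counts.values()) for c in columns}

# Each cell becomes its share of its own column, rounded to 1 decimal place.
column_percentages = {
    group: {c: round(100 * row[c] / column_totals[c], 1) for c in columns}
    for group, row in counts.items()
}

print(column_totals)
for group, row in column_percentages.items():
    print(group, row)
```

Because each cell is divided by its own column total, every column of the resulting table sums to 100%.
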

Q22b (2 marks)

Explain whether there is an association between staff groups and a desire to change the store uniform.

Answer

There does appear to be an association between the staff groups and wanting to change the uniform.
The data suggests that junior staff want to change the uniform (80% as opposed to 20% of senior staff) and senior staff do not want to change (70.5% compared with 29.5% of junior staff).

Marking Criteria

| Descriptor | Marks |
| --- | --- |
| Suggests the presence of an association | 1 |
| Provides reasons to support conclusion | 1 |

Q1 (2023 QCAA Paper 1, 1 mark)

A linear association with a correlation coefficient of 0.23 is best described as

A

weak positive.

B

weak negative.

C

strong positive.

D

strong negative.

Answer
A

weak positive.

Correct Answer

The correlation coefficient is positive ($r > 0$) and closer to 0 than to 1, which indicates a weak positive linear association.

B

weak negative.

A negative association requires a correlation coefficient less than zero ($r < 0$), but the given value is 0.23.

C

strong positive.

A strong positive association typically corresponds to an $r$ value closer to 1 (e.g., $r > 0.7$), whereas 0.23 represents a weak relationship.

D

strong negative.

This option describes a correlation coefficient close to $-1$, but the given value is positive and indicates a weak relationship.
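
The reasoning in these explanations (sign gives direction, size gives strength) can be sketched as a small function. This is only an illustration, assuming Python: the function name `describe_correlation` is hypothetical, and the cut-offs of 0.5 and 0.7 are assumptions chosen to match the "closer to 0 than to 1" and "$r > 0.7$" wording above, not official QCAA boundaries.

```python
def describe_correlation(r: float) -> str:
    """Label a correlation coefficient by strength and direction.

    The cut-offs (0.5 and 0.7) are illustrative assumptions, not an
    official scale; check the boundaries used in your course materials.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear association"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"


print(describe_correlation(0.23))
```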

Q2 (2024 VCAA Paper 1, 1 mark)

Freddie organised a function at work. He surveyed the staff about their preferences.

He asked them about their payment preference (cash or electronic payment) and their budget preference (less than $50 or more than $50).

The variables in this survey, payment preference and budget preference, are

A

both categorical variables.

B

both numerical variables.

C

categorical and numerical variables, respectively.

D

numerical and categorical variables, respectively.

Answer
A

both categorical variables.

Correct Answer

Both variables group responses into distinct categories ('cash' vs. 'electronic' and 'less than $50' vs. 'more than $50') rather than measuring specific numerical quantities.

B

both numerical variables.

Neither variable asks for a specific numerical measurement, such as an exact dollar amount. Since the responses are groups or labels, they are not numerical variables.

C

categorical and numerical variables, respectively.

While payment preference is categorical, budget preference is also categorical because it groups responses into ranges ('less than $50' or 'more than $50') rather than asking for an exact numerical value.

D

numerical and categorical variables, respectively.

Payment preference ('cash' or 'electronic') is clearly a categorical variable, not numerical. Budget preference is also categorical, making this option entirely incorrect.

Q7 (2025 QCAA Paper 1, 1 mark)

The association between two numerical variables is modelled by the equation $y = 4.6x - 35$, with a correlation coefficient of 0.92.

The association is best described as

A

weak and linear.

B

strong and linear.

C

weak and non-linear.

D

strong and non-linear.

Answer
A

weak and linear.

While the equation represents a linear relationship, a correlation coefficient of 0.92 indicates a strong association, not a weak one.

B

strong and linear.

Correct Answer

The equation $y = 4.6x - 35$ is a linear equation, and a correlation coefficient of 0.92 is close to 1, indicating a strong positive linear association.

C

weak and non-linear.

The equation $y = 4.6x - 35$ represents a linear relationship, and a correlation coefficient of 0.92 indicates a strong association, making both parts of this description incorrect.

D

strong and non-linear.

Although the association is strong, the equation $y = 4.6x - 35$ is in the form $y = mx + c$, which models a linear relationship, not a non-linear one.

Q11 (2021 QCAA Paper 1, 1 mark)

Which option is an example of bivariate data?

A

The rating given to a brand of meat pies as poor, fair or good.

B

The number of people in a household and amount of water used.

C

The number of cars passing through a particular set of traffic lights.

D

The time a person spends using a mobile phone on a Friday evening.

Answer
A

The rating given to a brand of meat pies as poor, fair or good.

This is an example of univariate data because it involves only one variable (the rating) for each observation.

B

The number of people in a household and amount of water used.

Correct Answer

This is bivariate data because it involves two distinct variables (household size and water usage) collected for each household to analyze the relationship between them.

C

The number of cars passing through a particular set of traffic lights.

This is univariate data because it records only a single variable (the count of cars) at a specific location.

D

The time a person spends using a mobile phone on a Friday evening.

This is univariate data because it measures only one variable (time spent) for each person observed.

Q3 (2024 QCAA Paper 1, 1 mark)

The coefficient of determination, $R^2$, is equal to 0.36 for the linear association between $x$ (explanatory variable) and $y$ (response variable).

Which statement is correct?

A

36% of the variation in $x$ can be explained by the variation in $y$.

B

36% of the total variation can be explained by the linear association.

C

36% of the predicted outcomes can be explained by the variation in $x$.

D

36% of the variation in $x$ can be predicted by the linear association.

Answer
A

36% of the variation in $x$ can be explained by the variation in $y$.

This reverses the variables; $R^2$ measures the proportion of variation in the response variable ($y$) explained by the explanatory variable ($x$), not the variation in $x$ explained by $y$.

B

36% of the total variation can be explained by the linear association.

Correct Answer

The coefficient of determination, $R^2$, is defined as the proportion of the total variation in the response variable ($y$) that is explained by the linear relationship with the explanatory variable ($x$).

C

36% of the predicted outcomes can be explained by the variation in $x$.

$R^2$ measures the proportion of the variation in the observed response values ($y$), not the predicted outcomes, that is explained by the model.

D

36% of the variation in $x$ can be predicted by the linear association.

This refers to the variation in the explanatory variable ($x$), whereas $R^2$ specifically measures the explained variation in the response variable ($y$).
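
As a supplementary illustration, the link between $R^2$ and the correlation coefficient can be sketched numerically. This is a minimal sketch, assuming Python and a least-squares line with a single explanatory variable (in which case $R^2 = r^2$); the variable names are illustrative only.

```python
import math

r_squared = 0.36  # coefficient of determination given in the question

# With one explanatory variable, R^2 is the square of r, so R^2 fixes
# the size of r but not its sign (r could be 0.6 or -0.6 here).
r_magnitude = math.sqrt(r_squared)

# Percentage of the variation in y explained (and not explained) by the line.
explained_percent = 100 * r_squared
unexplained_percent = 100 - explained_percent

print(f"|r| = {r_magnitude:.1f}")
print(f"explained: {explained_percent:.0f}%, unexplained: {unexplained_percent:.0f}%")
```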

Q1 (2024 QCAA Paper 2, 5 marks)

Each of the 60 performers in a music and dance concert is either a Year 11 or Year 12 student and either a musician or a dancer.

There are four more Year 11 students than Year 12 students. One quarter of the Year 11 students are dancers and half of the Year 12 students are dancers.

Complete the two-way frequency table to calculate the percentage of students who are musicians.

| | Year 11 | Year 12 | Total |
| --- | --- | --- | --- |
| Musician | | | |
| Dancer | | | |
| Total | | | 60 |

Answer

| | Year 11 | Year 12 | Total |
| --- | --- | --- | --- |
| Musician | $32 - 8 = 24$ | half of $28 = 14$ | $24 + 14 = 38$ |
| Dancer | one-quarter of $32 = 8$ | half of $28 = 14$ | $8 + 14 = 22$ |
| Total | 32 | 28 | 60 |

Percentage of students who are musicians:
$\frac{38}{60} \times 100\% = 63.\dot{3}\%$

Marking Criteria

| Descriptor | Marks |
| --- | --- |
| correctly calculates the frequencies for total Year 11 students and total Year 12 students | 1 |
| calculates frequencies for dancers in Year 11 and dancers in Year 12 | 1 |
| calculates frequencies for musicians in Year 11 and musicians in Year 12 | 1 |
| calculates frequencies for total musicians and total dancers | 1 |
| calculates percentage of students who are musicians | 1 |
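
The sample answer starts from the year-level totals of 32 and 28 without showing where they come from. The steps can be sketched as follows, assuming Python; all variable names are illustrative and not part of the original marking guide.

```python
total = 60       # all performers
difference = 4   # Year 11 has four more students than Year 12

# year11 + year12 = 60 and year11 - year12 = 4, so year12 = (60 - 4) / 2.
year12 = (total - difference) // 2
year11 = year12 + difference

dancers11 = year11 // 4   # one quarter of Year 11 students are dancers
dancers12 = year12 // 2   # half of Year 12 students are dancers
musicians11 = year11 - dancers11
musicians12 = year12 - dancers12

musicians = musicians11 + musicians12
percent_musicians = 100 * musicians / total

print(year11, year12, musicians, round(percent_musicians, 1))
```
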

Q10 (2022 QCAA Paper 1, 1 mark)

Which example states an explanatory variable followed by a response variable?

A

car manufacturers and car colours

B

dog breeds and frequency of names

C

plant growth and amount of fertiliser used

D

daily temperatures and daily ice cream sales

Answer
A

car manufacturers and car colours

These are typically treated as two categorical variables associated with a car, rather than a clear explanatory variable driving a response variable.

B

dog breeds and frequency of names

This example describes an association between a category and a summary statistic, rather than a direct explanatory-response relationship between variables.

C

plant growth and amount of fertiliser used

This option lists the response variable (plant growth) first and the explanatory variable (amount of fertiliser) second, which is the reverse of the order requested.

D

daily temperatures and daily ice cream sales

Correct Answer

Daily temperature is the explanatory variable because it influences or causes changes in the response variable, daily ice cream sales.

Q13 (2022 QCAA Paper 1, 1 mark)

The two-way table summarises the semester 1 results for students enrolled in two courses, Machinery and Electrical. Students achieved either satisfactory (S) or unsatisfactory (U).

| | Machinery S | Machinery U |
| --- | --- | --- |
| Electrical S | 80% | 10% |
| Electrical U | 20% | 90% |

The 10% cell in the table indicates that

A

10% of all students achieved satisfactory in Electrical.

B

10% of all students achieved unsatisfactory in Machinery.

C

10% of the students who achieved satisfactory in Electrical achieved unsatisfactory in Machinery.

D

10% of the students who achieved unsatisfactory in Machinery achieved satisfactory in Electrical.

Answer
A

10% of all students achieved satisfactory in Electrical.

This option describes the marginal percentage of all students who passed Electrical. The table provides conditional percentages based on Machinery results, not the total population distribution.

B

10% of all students achieved unsatisfactory in Machinery.

This option describes the marginal percentage of all students who failed Machinery. The value 10% represents a specific intersection of results relative to a subgroup, not the total proportion of students failing Machinery.

C

10% of the students who achieved satisfactory in Electrical achieved unsatisfactory in Machinery.

This interprets the condition in reverse (conditioning on the row). Since the rows do not sum to 100% ($80\% + 10\% \neq 100\%$), the percentages are not based on the group of students who achieved satisfactory in Electrical.

D

10% of the students who achieved unsatisfactory in Machinery achieved satisfactory in Electrical.

Correct Answer

The columns in the table sum to 100% ($10\% + 90\% = 100\%$), indicating that the percentages are conditional on the column variable. Therefore, the 10% represents the portion of students within the 'Machinery U' group who achieved 'Electrical S'.
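
The check used in these explanations (which direction sums to 100%) can be sketched in a few lines. This is a minimal sketch, assuming Python; the names `table`, `row_sums` and `column_sums` are illustrative only.

```python
# Percentages from the Q13 table (rows: Electrical result, columns: Machinery result).
table = {
    "Electrical S": {"Machinery S": 80, "Machinery U": 10},
    "Electrical U": {"Machinery S": 20, "Machinery U": 90},
}

row_sums = {row: sum(cells.values()) for row, cells in table.items()}
column_sums = {
    col: sum(cells[col] for cells in table.values())
    for col in ["Machinery S", "Machinery U"]
}

# Whichever direction sums to 100 is the group each percentage is "out of".
percentaged_by_column = all(total == 100 for total in column_sums.values())
percentaged_by_row = all(total == 100 for total in row_sums.values())

print(row_sums)
print(column_sums)
print(percentaged_by_column, percentaged_by_row)
```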

Q6 (2023 QCAA Paper 2, 7 marks)

The table shows the average superannuation account balance for workers of various ages in two different industries. The coefficient of determination, $R^2$, for age versus account balance is 0.95 for industry A and 0.96 for industry B. 40-year-old Leigh works in the industry for which age explains a higher percentage of the account balance variation. Tony is 10 years older than Leigh and works in the other industry.

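| Age (years) | Account balance ($), Industry A | Account balance ($), Industry B |
| --- | --- | --- |
| 22 | 7500 | 8100 |
| 32 | 42 000 | 60 000 |
| 42 | 98 000 | 120 000 |
| 52 | 160 000 | 210 000 |
| 62 | 290 000 | 360 000 |
| 72 | 400 000 | 480 000 |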
72400 000480 000

Use linear models to predict the difference in current superannuation account balances for Leigh and Tony.

Answer

Compare $R^2$ values: $0.95 < 0.96$.
So, age explains a higher percentage of the account balance variation for the industry B dataset.

Linear model for industry A:
Let $x = \text{age}$, $y = \text{account balance}$
$y = bx + a$
Using calculator, $b = 7910$ and $a = -205\,520$
$y = 7910x - 205\,520$

Linear model for industry B:
Let $x = \text{age}$, $y = \text{account balance}$
$y = bx + a$
Using calculator, $b = 9570$ and $a = -243\,440$
$y = 9570x - 243\,440$

40-year-old Leigh works in industry B; substitute $x = 40$
$y = 9570 \times 40 - 243\,440$
$= 139\,360$

Tony's age $= 40 + 10 = 50$
Tony works in industry A; substitute $x = 50$
$y = 7910 \times 50 - 205\,520$
$= 189\,980$

Difference $= 189\,980 - 139\,360$
$= 50\,620$
The difference in account balances for Leigh and Tony is predicted to be $50 620.

Marking Criteria

Response

| Descriptor | Marks |
| --- | --- |
| correctly identifies dataset for which age explains a higher percentage of the account balance variation | 1 |
| correctly determines linear model for age vs account balance for industry A data | 1 |
| correctly determines linear model for age vs account balance for industry B data | 1 |
| substitutes x = 40 into appropriate equation and calculates Leigh's current account balance | 1 |
| substitutes x = 50 into appropriate equation and calculates Tony's current account balance | 1 |
| calculates difference in current account balances for Leigh and Tony | 1 |

Communication

| Descriptor | Marks |
| --- | --- |
| shows logical organisation communicating key steps | 1 |
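
The "using calculator" step in the sample answer can be reproduced with a short script. This is a minimal sketch, assuming Python and the standard least-squares formulas $b = S_{xy} / S_{xx}$ and $a = \bar{y} - b\bar{x}$; the function name `least_squares` and the variable names are illustrative and not part of the QCAA material.

```python
ages = [22, 32, 42, 52, 62, 72]
industry_a = [7500, 42_000, 98_000, 160_000, 290_000, 400_000]
industry_b = [8100, 60_000, 120_000, 210_000, 360_000, 480_000]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line y = bx + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


b_a, a_a = least_squares(ages, industry_a)
b_b, a_b = least_squares(ages, industry_b)

leigh = b_b * 40 + a_b   # Leigh: age 40, industry B (higher R^2)
tony = b_a * 50 + a_a    # Tony: age 50, industry A
difference = tony - leigh

print(round(b_a), round(a_a), round(b_b), round(a_b))
print(round(leigh), round(tony), round(difference))
```

The rounded values agree with the coefficients and predictions in the sample answer.
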

Q4 (2023 QCAA Paper 1, 1 mark)

Athletes were surveyed about their preferred shoe brand: X, Y or Z. The results are shown in the frequency table.

| | X | Y | Z | Total |
| --- | --- | --- | --- | --- |
| Field athletes | 26 | 12 | 2 | 40 |
| Track athletes | 14 | 18 | 8 | 40 |
| Total | 40 | 30 | 10 | 80 |

The percentage of field athletes who prefer brand Y is

A

12%

B

15%

C

30%

D

40%

Answer
A

12%

This is simply the raw count of field athletes who prefer brand Y (12). To find the percentage, you must divide this count by the total number of field athletes.

B

15%

This represents the percentage of the total population (80 athletes) who are field athletes preferring brand Y ($\frac{12}{80} = 15\%$). The question asks specifically for the percentage of field athletes.

C

30%

Correct Answer

To find the percentage of field athletes who prefer brand Y, divide the number of field athletes preferring Y (12) by the total number of field athletes (40): $\frac{12}{40} = 0.30$ or $30\%$.

D

40%

This calculates the percentage of athletes preferring brand Y who are field athletes ($\frac{12}{30} = 40\%$). It uses the column total (all athletes preferring Y) as the denominator instead of the row total (all field athletes).
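
The three percentages discussed in these explanations differ only in the denominator. A minimal sketch, assuming Python, with illustrative variable names that are not part of the original material:

```python
# Counts from the shoe-brand table (rows: athlete type, columns: brand).
counts = {
    "Field": {"X": 26, "Y": 12, "Z": 2},
    "Track": {"X": 14, "Y": 18, "Z": 8},
}

cell = counts["Field"]["Y"]
row_total = sum(counts["Field"].values())                  # all field athletes
column_total = sum(row["Y"] for row in counts.values())    # all who prefer Y
grand_total = sum(sum(row.values()) for row in counts.values())

percent_of_field = 100 * cell / row_total        # what the question asks for
percent_of_brand_y = 100 * cell / column_total   # the 40% distractor
percent_of_all = 100 * cell / grand_total        # the 15% distractor

print(percent_of_field, percent_of_brand_y, percent_of_all)
```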

Q10 (2023 VCAA Paper 1, 1 mark)

A study of Year 10 students shows that there is a negative association between the scores of topic tests and the time spent on social media. The coefficient of determination is 0.72.

From this information it can be concluded that

A

a decreased time spent on social media is associated with an increased topic test score.

B

less time spent on social media causes an increase in topic test performance.

C

an increased time spent on social media is associated with an increased topic test score.

D

too much time spent on social media causes a reduction in topic test performance.

E

a decreased time spent on social media is associated with a decreased topic test score.

Answer
A

a decreased time spent on social media is associated with an increased topic test score.

Correct Answer

A negative association means that as one variable decreases, the other increases. Therefore, a decrease in time spent on social media is associated with an increase in topic test scores.

B

less time spent on social media causes an increase in topic test performance.

Correlation does not imply causation. While there is a statistical association, we cannot conclude that spending less time on social media directly causes an increase in test performance.

C

an increased time spent on social media is associated with an increased topic test score.

This describes a positive association, where both variables increase together. The problem explicitly states there is a negative association.

D

too much time spent on social media causes a reduction in topic test performance.

This option incorrectly assumes causation. A negative association only shows a relationship between the variables, not that one causes a change in the other.

E

a decreased time spent on social media is associated with a decreased topic test score.

This describes a positive association, where both variables decrease together. A negative association means they move in opposite directions.

Q3 (2021 QCAA Paper 1, 1 mark)

The table shows the results of a student survey about their preferred movie genre.

| | Genre | | |
| --- | --- | --- | --- |
| Year level | Comedy | Action | Science fiction |
| 7–8 | 20 | 25 | 21 |
| 9–10 | 24 | 53 | 21 |
| 11–12 | 36 | 28 | 12 |

Of the students who preferred comedy, what percentage were in Year 9 or higher?

A

25%

B

30%

C

60%

D

75%

Answer
A

25%

This represents the percentage of students who preferred comedy and were in Year 7–8 ($\frac{20}{80} = 25\%$), rather than those in Year 9 or higher.

B

30%

This calculation only includes students in Year 9–10 ($\frac{24}{80} = 30\%$) and neglects to include the Year 11–12 students who are also part of the "Year 9 or higher" group.

C

60%

This is the raw count of students in Year 9 or higher who preferred comedy ($24 + 36 = 60$), not the percentage relative to the total number of students who preferred comedy.

D

75%

Correct Answer

First, find the total number of students who preferred comedy: $20 + 24 + 36 = 80$. Then, sum those in Year 9 or higher: $24 + 36 = 60$. Finally, calculate the percentage: $\frac{60}{80} \times 100\% = 75\%$.

Q5 (2023 QCAA Paper 1, 1 mark)

A scatterplot is created to identify the nature of the relationship between two variables: vehicle age and distance travelled.
Which statement is correct?

A

The vertical axis should show vehicle age as the response variable.

B

The horizontal axis should show vehicle age as the explanatory variable.

C

The horizontal axis should show distance travelled as the response variable.

D

The vertical axis should show distance travelled as the explanatory variable.

Answer
A

The vertical axis should show vehicle age as the response variable.

Vehicle age is the explanatory variable, not the response variable, because it is used to predict the distance travelled.

B

The horizontal axis should show vehicle age as the explanatory variable.

Correct Answer

Vehicle age is the explanatory (independent) variable, which is conventionally plotted on the horizontal axis ($x$-axis).

C

The horizontal axis should show distance travelled as the response variable.

Response variables are plotted on the vertical axis ($y$-axis), not the horizontal axis.

D

The vertical axis should show distance travelled as the explanatory variable.

Distance travelled is the response (dependent) variable, not the explanatory variable, because it depends on the age of the vehicle.

Q14 (2020 QCAA Paper 1, 1 mark)

A sample of university staff and students was asked whether they preferred catching public transport or driving their own car to university. The data collected is shown in the table.

| | Public transport | Drive own car |
| --- | --- | --- |
| Staff | 2 | 18 |
| Students | 48 | 12 |

What percentage of university students prefer to drive their own car?

A

12%

B

15%

C

20%

D

40%

Answer
A

12%

This is the raw count of students who drive (12), not the percentage. To find the percentage, you must divide this count by the total number of students.

B

15%

This value represents the percentage of the entire sample (staff and students combined) who are students driving their own car ($\frac{12}{80} = 15\%$), rather than the percentage of just the student group.

C

20%

Correct Answer

First, calculate the total number of students: $48 + 12 = 60$. Then, divide the number of students who drive by the total number of students: $\frac{12}{60} = 0.2$, which is $20\%$.

D

40%

This figure represents the percentage of all drivers who are students ($\frac{12}{30} = 40\%$), rather than the percentage of students who drive.
