Lecturer: Kate Saunders
Department of Econometrics and Business Statistics
What we’ll cover:
Unit overview and details
An Introduction to Data Visualisation
Societal importance of data visualisation
Learn about the different types of data
Getting you set up in R (Hands on)
I wish to acknowledge the people of the Kulin Nations, on whose land we are gathered today. I pay my respects to their Elders, past and present.
In this unit, we will learn about data visualisation and communication. This is not a new topic.
First Nations peoples have used symbols to share stories and messages for generations.
Resources: Fact sheet on Indigenous Rock Art

Kate Saunders
Lecturer at Monash University
👩‍🎓 PhD in Statistics
🌐 Home State is Queensland
👨‍💻 Research is in statistics of climate extremes
🧑‍💻 Passionate about open data, data visualisation and data ethics
👩‍💻 Started R coding in 2012 (before tidyverse!)
❤️ Hobby is playing real basketball and fantasy basketball.
Bets Ruscoe
PhD Student at Monash University
🎓 Studied Econometrics at Monash
👨‍🎓 Worked at Deloitte in the AI & Data team, before coming back for PhD
👩‍💻 Lover of R and wanted to introduce R at work over Power BI
🌐 Just returned from a trip to Taiwan
❤️ Misses eating dumplings every day for breakfast!
Break-out activity
Introduce yourself to the people around you. You might like to ask each other:
Importance of Data Visualisation and Communication
Data visualisation is at the core of every Business Analytics Project.
A picture says a thousand words: we can turn data into insights using visualisation
Search for and show patterns, trends, structure, irregularities and relationships in data
Integral for communicating results from an analysis with clarity and efficiency
Summer Semester is an INTENSIVE unit!
What that means:
We will cover 12 weeks of material in 4 weeks
We will cover 3 lectures each week
That’s ~36 hours of study each week! (= Full time job)
You will need to keep up with the material
Coming to classes and consultations will help you
Last minute cramming won’t be possible
Note
Compared to normal semesters the content is condensed.
Lectures and tutorials will run together as a workshop in one class block.
The balance of lecture material to in-class activities will change a little each week.
The unit is running flexibly, so if you want to attend class on Zoom you can.
At the end of this unit you will be able to:
Critically evaluate the quality of a data visualisation using the principles of graphical excellence
Successfully create high-quality data visualisations for a given dataset using software, such as Power BI or R.
Develop the skills to effectively use data visualisations to communicate insights to an audience
Develop the skills to effectively use data visualisations to communicate in the appropriate medium: report writing, presentations and digital storytelling.
Apply advanced topics in data visualisation, such as using interactivity and animation to enhance communication.
Week 1: Introduction to Data Visualisation
Workshop 1: Getting Started with Data Visualisation
Workshop 2: The Good, the Bad and the Ugly of Data Visualisation
Workshop 3: Visualisation in BI
4 quizzes due end of the week
There are group exercises in workshops 2 and 3 that you need to complete to do your first assignment.
Week 2: Visualisation in R
Workshop 4: Visualisation in R: Plots of 1 - 2 Variables
Workshop 5: Data Wrangling in R for Visualisation
Workshop 6: Visualisation in R: Plots of 2 or more Variables
3 quizzes due end of the week.
Assignment 1 will cover the first two weeks of material.
Week 3: Language of Visual Communication
Workshop 7: Iterating your Visualisation
Workshop 8: Effective Communication
Workshop 9: Visual Storytelling
3 quizzes due end of the week.
Week 4: Advanced Topics in Visualisation
Workshop 10: Infographics (Uncertainty and Missing Data)
Workshop 11: Interactivity and Animation
Workshop 12: Revision (Not-in-real-time: Pre-recorded videos)
2 quizzes due end of the week.
Assignment 2 will cover the last two weeks of material.
Quizzes
12 quizzes related to each lecture and tutorial
Count the best 10 out of 12 quizzes
This allows you two “off” quizzes for sudden and short term emergencies.
Quizzes are worth 1% each
10% of your total grade
Quizzes will be due at the end of each week, so 3 quizzes due each Friday.
No special consideration for quizzes - do not leave them until Sunday 11:55 pm
Assignments
2 assignments worth 20% each
40% of your total grade
Assignment 1 will cover content from weeks 1 and 2
Assignment 2 will cover content from weeks 3 and 4
Apply for special consideration centrally. (This includes short extensions of 48 hours)
If you need special consideration, apply ASAP and no later than 11:55 pm on the day your assessment is due.
Do not email me. You must apply centrally using the link above.
Final exam
Exam is in the February Summer Semester Exam Block
50% of your total grade
I will run an additional session in January to help you study. Date: TBC

You can use Generative AI in this unit.
In fact I encourage it!
It’s a great tool for those learning to code
But
You must never copy and paste output from AI that you don’t understand or cannot explain
You must always provide appropriate acknowledgement of your AI use
You need to be careful not to shortcut your learning
What does academic integrity mean to you?
Still not sure - Monash Resources
What is academic integrity? Click here
What does maintaining academic integrity mean? Click here
What happens if I breach academic integrity? Click here
1. Ask your peers using the discussion forum
Suitable for:
General questions about course materials, tutorials, R or assignment clarifications.
Any emails regarding general matters will be redirected to the discussion forums.
Sharing helps you learn from each other!
Also I don’t want to answer the same question twice (three times, four times etc.)
Do not post code from your assignments or any assignment hints to the discussion portal. I may deduct marks
2. Attend Consultation
Suitable for:
When you need more specific one-on-one help
Working through problems with your tutor
Getting a head start on your assignments
Support debugging your code
Asking detailed questions about your assignments
Getting additional feedback on your assessments
Being really nerdy about the unit!
3. Unit Email
Suitable for:
For personal questions or issues email etx2250-etf5922.caulfield-x@monash.edu.
Response times are within 1 - 2 days but may vary during busy periods.
Also email if you notice issues with assessments or Moodle.
For remarking, first get feedback on your assessment from your tutor. If this does not provide clarity, email within 10 days of the due date.
Do not direct emails to my staff account. I receive a high volume of emails and they risk going into a black hole and never being seen again!
Let’s develop some intuition for why data visualisations are important!
Cholera
In 1854 there was an outbreak of cholera in London that killed 616 people.
At the time it was speculated that the disease was carried in the air.
A physician called John Snow was skeptical and began to collect data…
Map of outbreaks
The map showed that cholera cases were more prevalent around a water pump on Broad Street.
The pump was closed down.
Eventually it was established that cholera is a water-borne disease.
Data visualisation saves lives!
Florence Nightingale ❤️
At the same time Great Britain was at war against Russia in the Crimean peninsula.
Florence Nightingale is famous as a nurse who treated the wounded soldiers.
She also advocated to the British Parliament for more sanitary conditions in military hospitals.
She knew the power of using data visualisation.
Rose chart
The improved sanitation at military hospitals was eventually implemented in civilian hospitals.
Data visualisation saves lives.
Florence Nightingale became the first female member of the Royal Statistical Society.
Background
In 1812 Napoleon thought it was a good idea to invade Russia.
This campaign was a disaster for the French.
Engineer Charles Joseph Minard captured the extent of this catastrophe using visualisation.
Infographic
This visualisation provides information on 6 variables in one chart.
Number of troops
Whether troops advance or retreat
Temperature and time
Longitude and latitude
Despite the clear message that invading Russia in winter is a bad idea, some people did not learn this lesson.
::: callout-note
## Break-out discussion
Take some time to discuss in groups where you encounter data visualisation in your everyday life.
:::
Background
In the early stages of the COVID-19 pandemic, hospital capacity was not able to cope with the number of cases.
Collectively as a society we needed to slow the spread of the virus and reduce the load on our healthcare systems.
People needed to understand and listen to the public health advice about hand washing, social distancing and quarantining etc.
Let’s look at some visualisations that helped get the importance of this message across!
A visualisation showing the importance of slowing the spread and reducing the peak in cases (Source).
Background
Climate change is leading to more frequent and more severe extreme weather
There are severe socio-economic consequences as a result of more weather and climate extremes.
It is important to communicate with people that our climate is changing.
It is also important to communicate that if we reduce our emissions and take rapid action we can slow down the impacts of climate change.
Professor Ed Hawkins (University of Reading, UK) created data visualisations to help with this!
A visualisation showing the climate is warming.
This visualisation was shown at the Rio Olympics in 2016!
A visualisation showing that our choices today will impact our rate of future warming.
Why visualise political data?
Visualising election data helps people understand and trust the results
Common election visualisations focus on voter behaviour, such as differences between voters in regional areas and cities, or differences based on voter demographics (e.g. age, gender, income).
Visualisation is also used for polls in the lead-up to the election and for results on election night
Visualisation is also used by politicians to support their policy platforms
Voter Needle: Shows election results in context as they come in and displays the uncertainty in the total as votes are counted.
Tessellated Hexbin Maps: Show election results by district on a map
And here is the chart credited with saving President Trump’s life.
It doesn’t adhere to the principles of graphical excellence. We’ll show you why and provide a better example later!
A world of possibilities
There are lots of different kinds of visualisations.
But before we can create these visualisations …
Need to learn:
Which type of plot works best for what type of data?
What are the different kinds of data variables?
What are different types of data?
What can those different kinds of data represent?
What can you show with that data?
In other words: we need to learn how to go from Data-to-Viz!
Discrete Variables
Discrete variables can take only specific, separate values within a range.
Often represent counts or whole numbers.
Example: Number of students in a class, number of cars sold.
Characteristics:
Finite set of values (e.g. a dice roll takes values 1 to 6)
Cannot take on fractional values between two distinct points (e.g. can’t saw a student in half!)
Categorical Variables
Categorical variables are a type of discrete variable
Categorical variables represent categories or groups.
Often represented by labels or names rather than numbers.
Two types of Categorical Variables:
Nominal: Categories with no logical order (e.g. your names)
Ordinal: Categories with a meaningful order (e.g. low, medium, high)
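These variable types can be sketched in R. This is a minimal illustration in base R only; the variable names and example values (`dice_rolls`, `first_names`, `risk`) are made up for this sketch and are not from the unit's datasets.

```r
# Discrete variable: whole-number counts (hypothetical dice rolls)
dice_rolls <- c(3L, 6L, 1L, 4L)

# Nominal variable: categories with no logical order (hypothetical names)
first_names <- factor(c("Ana", "Ben", "Chen"))

# Ordinal variable: categories with a meaningful order
risk <- factor(
  c("low", "high", "medium", "low"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)

is.integer(dice_rolls)    # discrete counts stored as integers
is.ordered(first_names)   # nominal factors carry no order
is.ordered(risk)          # ordinal factors do
risk < "high"             # so comparisons like this are meaningful
```

Telling R that a factor is ordered matters later on, because plotting functions can use the level order to arrange axes and legends.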
Nominal Variables
Ordinal Variables
Note
Step 1: Understand what your data is.
Step 2: Understand what you can show with your data.
Spatial Data
Temporal Data
Data visualisation is important!
We’ve seen examples of high profile and important data visualisations.
We’ve learnt when data visualisation is done right, it is a powerful tool for communication.
We’ve also seen an example of when data visualisation can be misleading.
An important part of this unit will be teaching you how to be critical of data visualisation.
You will learn what makes a good or bad visualisation, and how to identify when a visualisation is misleading.
We will revisit some of these examples later.
Data to Visualisation
You’ve also learnt there are lots of different kinds of visualisations
We’ve learnt about different types of variables and what these variables can represent
This will be important for when we start building our own data visualisations
Take a closer look at the Data-to-Viz website to see what kind of data matches best with what plot.
This content is all made available in the form of pre-recorded videos.
Getting set up
By the end of this unit you’ll be creating your own amazing data visualisations
Before you can do that we need to get you set up with the software you’ll use in this unit
We’ll now go through the steps to install R and RStudio
Think about R as the paint (raw materials) and RStudio as the paintbrush and canvas (tools we need to bring the artwork together).
We’ll open RStudio to create data visualisations using R.
Step 1: Install R
Step 2: Install RStudio
If you need more help refer to this online module
Why do we need a Programming Language?
It allows us to have reproducible steps, which can be applied to many different data sets
The analysis is not just point and click, so a team can work together on the same code
It also means we can more easily create our own bespoke visualisations
Why do we use R?
It has been around for a while.
It is regularly maintained and is open source.
It is beginner friendly
Even if you use other languages, you might still use R for your data visualisations
Learning a new language is hard!
You need to think about grammar and structure, and how to communicate well in it!
You will make mistakes, lots of them.
Below you see code that plots points showing the GDP per capita against life expectancy. The points are coloured by country and the size of the points shows the population.
It might look impossible now, but by the end of this semester you will be able to write this yourself!
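A minimal sketch of what such a plot could look like in R. It assumes the `ggplot2` and `gapminder` packages are installed; the `gapminder` dataset, the choice of the year 2007 and the log-scaled axis are assumptions made for this sketch, not details confirmed by the slide.

```r
library(ggplot2)
library(gapminder)  # assumed data source for this sketch

# Keep a single year so each country appears once
gap_2007 <- subset(gapminder, year == 2007)

# Points coloured by country, with point size showing population
p <- ggplot(
  gap_2007,
  aes(x = gdpPercap, y = lifeExp, colour = country, size = pop)
) +
  geom_point(alpha = 0.7, show.legend = FALSE) +  # legend hidden: too many countries
  scale_x_log10() +
  labs(x = "GDP per capita (log scale)", y = "Life expectancy (years)")

p
```

Each line adds one layer to the plot: the data and the mappings, then the points, then the axis scale and the labels.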
Important pieces
Make sure you:
Have installed R and RStudio
Are able to set up a Project in R and know why we use them
Can read in data from a file into R
Know how to install packages and load the package libraries
Can assign variables and run basic functions
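The last three items on this checklist can be sketched in a few lines of R. This is an illustration only: the variable names and marks are made up, and the CSV is written to a temporary file so the example runs without any course data.

```r
# Installing a package is done once per machine, loading it once per session
# (commented out here so the script does not reinstall on every run):
# install.packages("ggplot2")
# library(ggplot2)

# Assign variables with <- and run basic functions
quiz_marks <- c(7, 9, 8, 10)        # hypothetical marks
average_mark <- mean(quiz_marks)
highest_mark <- max(quiz_marks)

# Write a small CSV to a temporary file, then read it back into R
example_file <- tempfile(fileext = ".csv")
write.csv(
  data.frame(quiz = 1:4, mark = quiz_marks),
  example_file,
  row.names = FALSE
)
marks_data <- read.csv(example_file)
head(marks_data)
```

In your own work you would replace `example_file` with the path to a data file inside your RStudio Project.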

ETX2250/ETF5922
Social Distancing
A visualisation showing the importance of social distancing (Source).