Note to teacher
There are a lot of percentages used in the data visuals in this activity. Based on the needs of your students and context, consider including a short activity before the Misuse of Data activity. This will give students an opportunity to think about the meaning of percents, percent of, ratios and proportions, and how to calculate percent. Look at the graphs here to consider what types of questions you might include.
Frame the next Data Ethics conversation saying something like, “As we move through the Data Science Process when working with data, there are Data Ethics questions at each stage. When we gather data we consider where the data comes from, who collected it, and how it was collected. When we model and communicate the data we consider what data is being used, how it is being used, and how we critique the use of data as educated consumers of information. This is the aspect of Data Ethics that we will explore in the next activity.”
When launching the activity, say something like, “The goal of this next activity is for you to understand the importance of bringing your critical thinking to how data is used in articles, tweets, Facebook posts, and more. It is essential to ask questions of the story presented, since what is presented may not be the complete ‘truth’. You should always ask, ‘Is there more to the story?’”
Tell students that they will look at four different pairs of data visuals to consider how data can be misrepresented. The first visual in each pair shares data that is true but misleading. The second shows a bigger picture with more data. For each pair, students will describe the story each visual tells, then explain specifically why the first visual is misleading in light of the additional data in the second. Tell students that this part of being a Data Scientist requires them to be skeptics in their approach to interpreting data. Invite students to share ideas about what a skeptic would do or say when interrogating a data visual.
When discussing students’ ideas about being a skeptic, introduce the Skeptic framework by telling students they need to be convincing. A wonderful teacher named Cathy Humphreys uses this framework. In mathematics we must always be convincing. The first level of convincing is when you convince yourself. The second level of convincing is when you convince a friend. The third and most challenging level is when you try to convince the skeptic.
Let your students know that the topics in this activity are sensitive topics. Say something like,
“One of the goals of this course is to become critical consumers of data and data visuals. Keep in mind, however, that these topics hit close to home for many of us in different ways. The topics in this section have to do with polio vaccinations and death rates, DACA recipients and crime rates, and taxes and income. These are all sensitive topics for different reasons. If I am a DACA recipient or have close friends or family who are, this is a very important and personal topic to me. If I have experienced a death related to COVID, then talking about vaccinations and death rates can be an emotional topic. We don’t want to avoid these topics because they are sensitive, but we do want to understand and be supportive of our classroom community as we talk about them.”
Set students up to work in groups to discuss the ethics of a collection of data visuals. Share the activity handout and data visual resources with students: Misuse of Data (Handout A) and Visual Resources. Let them know that all of the data and data graphics for this activity come from Chapter 4 of Alberto Cairo's book, How Charts Lie. Mention that knowing where your information comes from is an important part of being a data scientist.
While students are working, encourage them to be skeptics about the numbers and visuals. Remind them that skeptics ask a lot of questions to make sense of what they see, what it means, and whether it’s valid. Encourage them to point to specific parts of each visual as evidence for what it communicates.
Listen to what students are noticing. Some examples of what they might bring up are below:
- How area is used to compare amounts: the size of the rectangles compared to each other and the size of the circles compared to each other
- There is an issue reporting numbers without including the total number that the data is out of (2,139 out of how many?, 40 out of ____, 99 out of _____)
- Average can be misleading. You lose a lot of data when you report an average because the values can have such a large range that one number doesn’t capture the situation.
- When is it better to use actual numbers of people or rates? When sharing data about hunger, poverty, or other humanitarian topics, you can lose a sense of magnitude when you report percentages. Using the numbers of people humanizes the data and story.
- When and how can rates be used to make comparisons? For example, we could use the rate of smallpox infection among people who are not vaccinated to predict how many people would have been infected if no one had gotten the vaccine: 0.4 x 1,000,000 = 400,000. This would have been a more valid number to compare to the 99 who died after getting the vaccine.
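If your class uses Python, the rate comparison in the last example can be sketched in a few lines. This is only an illustration: the function name `expected_count` and the variable names are made up for this sketch, and the three numbers are the ones quoted in the example above, not new data.

```python
def expected_count(rate, population):
    """Return how many cases we would expect if `rate` applied to everyone in `population`."""
    return rate * population


# Numbers quoted in the example above (hypothetical variable names).
infection_rate_unvaccinated = 0.4  # share of unvaccinated people who get infected
population = 1_000_000             # number of people being considered
deaths_after_vaccine = 99          # count reported in the misleading visual

# Project the rate onto the whole population to get a comparable count.
projected_infections = expected_count(infection_rate_unvaccinated, population)

print(f"Projected infections with no vaccine: {projected_infections:,.0f}")
print(f"Deaths reported after vaccination: {deaths_after_vaccine:,}")
```

Putting the two counts side by side gives students a comparison group for the 99, which is the piece the misleading visual leaves out.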
Note to teacher
Here are additional teacher notes about each visual and how the data is misused.
After students have had an opportunity to work through the three pairs of graphs, debrief each pair as a class. As groups share out around each pair, begin to build a class list of misuses of data that we should be aware of when looking at data and data graphics. For example, reporting values without a comparison group can be very misleading.