Unit 1

Data Tells a Story

In this unit, students will be introduced to data science through reflection on their own experiences using self-generated data, an exploration of a larger dataset of people’s media use, and an analysis of business data. Through these activities, students will learn about the data science process, begin using data to tell stories, and think about the ethics involved in working with data. Students will make sense of the questions: What part of the story is told by data? What is variation? How is data generated? What data is gathered about them? During the unit, students will learn to use CODAP and Google Sheets as they consider the ways data can be used to model the world. As they learn about data, they will be introduced to many different ways to represent it and will explore univariate, bivariate, and multivariate data. From their data visualizations, they will consider what story their data can tell.

View Unit 1

Unit 2

The Data of our Community

In Unit 2, students will explore different ways of modeling data, starting with the basic models of measures of center and spread, as well as considering sampling. Students will likely already be familiar with the calculations needed to find measures of center and spread for small data sets, but this unit takes a deeper dive into the concepts, their meanings and limitations, and the impact of outliers in the context of data modeling. Students will explore distributions and the role of probability in understanding them. Additionally, students will collect their own data and compare it to a larger data set. During the project, students will consider their sampling choices and those behind the larger data set to see how such decisions affect the comparisons drawn between the two data sets.
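As an illustration of how a single outlier affects measures of center and spread, here is a minimal Python sketch. The commute times are made-up values chosen for this example, and the `summarize` helper is a hypothetical name, not part of the unit materials.

```python
import statistics

# Hypothetical sample: minutes each student spent commuting (made-up values).
commute_minutes = [12, 15, 14, 10, 13, 16, 11]
# The same sample with one unusually long commute added as an outlier.
with_outlier = commute_minutes + [95]


def summarize(values):
    """Return the mean, median, and sample standard deviation of values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


base = summarize(commute_minutes)
shifted = summarize(with_outlier)
print("without outlier:", base)
print("with outlier:   ", shifted)
```

In this made-up sample the outlier pulls the mean from 13 to 23.25 while the median only moves from 13 to 13.5, which is one way to see why the choice of measure matters when modeling data.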

Google provides a setting that school IT administrators can toggle to give students access to Colab. Information is available here. Information about an alternative setup using Kaggle can be found here.

View Unit 2

Unit 3

Water in Your Life

In this unit, students will learn about bivariate data through discussions and data explorations around the theme of water usage. Students will explore scatter plots as a visual way to represent the relationship between two variables, draw their own lines of best fit, and learn how data scientists determine and analyze lines of best fit. Throughout the unit, students will use the analytic tools of Google Sheets, CODAP, and Tableau to make and refine claims about water usage based on both self-collected data and large, publicly available data sets.
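One common way to determine a line of best fit is ordinary least squares. The sketch below assumes that is the method meant here; the unit itself works in Google Sheets, CODAP, and Tableau, so this Python version is only an illustration. The household data and the `best_fit_line` helper are hypothetical.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: household size vs. daily water use in gallons (made up).
household_size = [1, 2, 3, 4, 5]
daily_gallons = [62, 108, 161, 209, 260]

slope, intercept = best_fit_line(household_size, daily_gallons)
print(f"gallons = {slope:.1f} * household_size + {intercept:.1f}")
```

The slope can be read as the estimated change in daily water use for each additional person in the household, which is the kind of claim students can then test against larger data sets.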

View Unit 3

Unit 4

Shuffling Songs

In this unit, students will again consider the modeling process and the role played by variation, reflecting on the data collected from simulations and on the ways data can help answer probabilistic questions and inform decision-making. In the process of creating simulations, students will learn the basics of programming, which will continue to be a powerful tool for data analysis. During this unit, students will use Python in EduBlocks and Colab.
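To give a sense of how a simulation can answer a probabilistic question, here is a minimal sketch of one possible question: how often does a shuffled playlist play two songs by the same artist back to back? The question, the six-song playlist, and the function names are assumptions made for this example and are not taken from the unit.

```python
import random


def has_back_to_back(playlist):
    """Return True if any two neighboring songs share an artist."""
    return any(a == b for a, b in zip(playlist, playlist[1:]))


def estimate_back_to_back(artists, trials=10_000, seed=0):
    """Estimate the chance that a random shuffle has a back-to-back repeat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        order = artists[:]  # copy so the original list is not changed
        rng.shuffle(order)
        if has_back_to_back(order):
            hits += 1
    return hits / trials


# Hypothetical playlist: each letter is the artist of one song.
playlist = ["A", "A", "B", "B", "C", "C"]
estimate = estimate_back_to_back(playlist)
print(f"Estimated probability of a back-to-back repeat: {estimate:.3f}")
```

For this small playlist the exact probability is 2/3, so the estimate should land near 0.67. Running the simulation with different numbers of trials shows the variation in the estimates themselves.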

Google provides a setting that school IT administrators can toggle to give students access to Colab. Information is available here. Information about an alternative setup using Kaggle can be found here.

View Unit 4

Unit 5

Skin Tones and Representation

Unit adapted from lessons on the data of representation written by Princewill Okoroafor, Yunhan Huang, and the Young Data Scientists League.

In this unit, students explore the issues around skin tone representation in the media through a data-based exploration of skin tone representation in magazines. Students conduct both a categorical and a numerical analysis and compare the benefits and drawbacks of each. In their categorical analysis, students create two-way tables based on their interpretation of the skin tones of the people pictured, and in the numerical analysis they use the RGB values of the images themselves. After both analyses, students choose an audience for whom the information would be relevant and write a data-supported piece to share their findings with that audience. During the unit, students will work in Google Sheets and Google Colab (Python).

Google provides a setting that school IT administrators can toggle to give students access to Colab. Information is available here. Information about an alternative setup using Kaggle can be found here.

If you are waiting for Google Colab access, after Unit 4 move to Unit 6 and then return to Unit 5.

View Unit 5

Unit 6

What’s the Best Place for Me?

In this unit, students will build a prioritization model to create a ranking. In this process, students will decide what they value, choose variables based on their values, gather and clean data, create functions to combine variables, normalize data, and create a weighting system for prioritizing their data. Students will perform a sensitivity analysis on their weighting system. During this process, students will discuss how bias affects mathematical models. They will use reasoning, justifications, and visualizations to explain their decisions. During this unit, students will use Google Sheets, Google Data Commons, and Tableau.
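The normalize, weight, and rank steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: min-max scaling is one of several possible normalization choices, the unit itself works in Google Sheets and Tableau, and the places, variables, weights, and function names below are all made up.

```python
def min_max_normalize(values):
    """Rescale values to the range 0..1 (all zeros if every value is equal)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_places(places, data, weights, lower_is_better):
    """Return (place, score) pairs sorted from highest to lowest score."""
    scores = [0.0] * len(places)
    for variable, values in data.items():
        normalized = min_max_normalize(values)
        if variable in lower_is_better:
            # Flip the scale so that a higher normalized value is always better.
            normalized = [1 - v for v in normalized]
        for i, v in enumerate(normalized):
            scores[i] += weights[variable] * v
    return sorted(zip(places, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical places and variables (made-up values).
places = ["Town A", "Town B", "Town C"]
data = {
    "park_acres": [120, 40, 80],        # higher is better
    "median_rent": [1500, 900, 1200],   # lower is better
}
lower_is_better = {"median_rent"}

# Two weighting systems, to see how sensitive the ranking is to the weights.
rent_focused = rank_places(
    places, data, {"park_acres": 0.4, "median_rent": 0.6}, lower_is_better
)
park_focused = rank_places(
    places, data, {"park_acres": 0.7, "median_rent": 0.3}, lower_is_better
)
print([place for place, _ in rent_focused])
print([place for place, _ in park_focused])
```

With these made-up numbers, shifting weight from rent to parks reverses which town ranks first, a small example of what a sensitivity analysis is meant to reveal about a weighting system.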

View Unit 6

Unit 7

Predicting My Preferences

In this unit, students will be introduced to the big ideas behind machine learning. They will build two different machine learning algorithms to predict whether they will like a song. In this process, they will learn about using vectors and matrices as data structures, apply conditional probability, and exercise their basic programming abilities. Students will also consider how machine learning affects their lives and others’ lives and will share their newly gained understanding of machine learning with a member of their community. During the unit, students will work in Colab and EduBlocks.

View Unit 7

Unit 8

Being a Data Scientist

This unit will bring together all that the students have been working on. Students will have an opportunity to work through the full cycle of data science: making their own decisions about the questions they are interested in exploring, finding data to answer those questions, cleaning the data, creating and analyzing a model, communicating with the data visually, and reflecting on their process. This will be an iterative process mirroring how data scientists work on a project. Students will gather their own data. They will make decisions about how to work with it and describe the choices they have made, including which technology tools to use, cleaning moves, visualization selection, univariate or bivariate data choices, combining data, and other content relevant to their project of choice.

View Unit 8