**Description
**

This lesson introduces students to working with a spreadsheets in both Microsoft Excel, Google Forms and Google Sheets. The lesson teaches students how to clean up a spreadsheet in Excel, keep a data diary and set up a data dictionary. Students will develop an independent project using Google Forms, which requires them to write and distribute a class-wide survey to gather data and prepare a spreadsheet in Google Sheets for further analysis in the next lesson.

**Objectives**

- Students will begin to familiarize themselves with Microsoft Excel, a spreadsheet software program.
- Students will edit and refine a data set to prepare it for analysis.
- Students will demonstrate best practices in data analysis by creating a data diary and data dictionary that accompanies their spreadsheet.

**Common Core State Standards**

CCSS.ELA-LITERACY.RST.9-10.7 | Translate quantitative or technical information expressed in words in a text into visual form (e.g., a table or chart) and translate information expressed visually or mathematically (e.g., in an equation) into words. |

CCSS.ELA-LITERACY.RST.11-12.9 | Synthesize information from a range of sources (e.g., texts, experiments, simulations) into a coherent understanding of a process, phenomenon or concept, resolving conflicting information when possible. |

CCSS.MATH.CONTENT.8.SP.A.4 | Understand that patterns of association can also be seen in bivariate categorical data by displaying frequencies and relative frequencies in a two-way table. Construct and interpret a two-way table summarizing data on two categorical variables collected from the same subjects. Use relative frequencies calculated for rows or columns to describe possible association between the two variables. |

**Length**

120-150 minutes

**Resources**

Microsoft Excel software access

Sample spreadsheet: Introduction to spreadsheets

Sample spreadsheet: Cleaned spreadsheet

1:1 or 1:2 computer access

Google Drive access, specifically Google Sheets and Google Forms

*Note to teachers:* This lesson uses spreadsheets developed in Microsoft Excel. You may use another spreadsheet program, such as Google Sheets, but you will need to upload these spreadsheets first and check that everything is still formatted correctly when converted to Google Sheets. This lesson uses both Excel and Google spreadsheet programs to help students feel comfortable in both programs. This will require students to have access to Google Drive, so plan accordingly with enough time to set up and prepare, especially if your students have not used Google Drive.

**Lesson step-by-step
**

*1. Introduction to cleaning spreadsheets — 15 minutes*

Ask students what experience they have with spreadsheet programs like Excel or Google Sheets. In what ways and for what purposes have they used these programs? What anxieties do they have about using this kind of software? Students often are afraid of these software programs, so encouraging them to talk about their previous use can help to assess their familiarity more accurately.

Remind students that in the previous lesson, they gathered big data from the web using different web tools, including open records and web scraping. The goal of that lesson was to collect data in a useable format: the spreadsheet.

Explain that even after they have scraped or collected data, spreadsheets still require some attention before we can start to analyze the data. This is called “cleaning” the data.

Ask students: If we were to look at a spreadsheet filled with rows and columns of numbers, what sorts of things would we look for to make sure the data is “clean” and ready for analysis?

Brainstorm and record answers on the board. Hopefully, students will respond with suggestions such as:

- The data is complete (there aren’t any missing or empty cells).
- There are no errors in the data (there aren’t any duplicate entries or jumbled entries).
- The data looks the same or is consistent (use of dollar signs, commas, decimals, upper or lower case usage should be the same across all entries).

Share the spreadsheet “Sample spreadsheet: Introduction to spreadsheets” projected on a whiteboard or screen. Ask what types of inconsistencies students see in the spreadsheet. They should identify areas that need cleaning, including:

- The race column (different entries for like groups, such as white and caucasian)
- Debt years 2 and 4 have inconsistent use of dollar sign, and some are not written out as numbers
- Scholarship, student athlete and loan counseling have inconsistent entries. Some say “yes/no” while others say “Y/N”.

All of this would need to be “cleaned” before we could analyze.

*2. Rules for cleaning — 15 minutes*

Explain to students that working with the data at this point is make or break — do it right, and you’ve made your spreadsheet usable. Do it wrong, and you’ve not only made your spreadsheet less usable, but you may have made your data inaccurate.

Explain that as a class, they must always follow three steps when working with spreadsheets (write these on the board, and explain in turn):

- Make a copy first, and label it appropriately. We never edit the original. If we make a mistake, or Excel fritzes out on us, we always want the original to refer to.
- Create a data dictionary. We will do this by creating a new sheet at the bottom of our COPY and naming that sheet “Dictionary.” Then, on this sheet, we will explain to ourselves what different columns of data mean, or any specifics about the way the data was gathered that might matter. Essentially, the dictionary is reminding us what is there.
- Create a data diary. We will do this by creating another new sheet at the bottom of our COPY and naming that sheet “diary.” Then, on this sheet, we keep track of any and all changes we make while cleaning up the data. This is like a log or running timeline of edits over time.

Take a second to open “Sample spreadsheet: Cleaned spreadsheet” and show them the two new sheets at the bottom, which have both a dictionary and diary. Show how the diary keeps track of what changes were made to clean up the data and how the dictionary explains the data in each column. Explain that if you get a data set from the web, you should always look for a dictionary to come with it, or use the original source website to help build your dictionary. Don’t linger on every line of the dictionary or diary; the students will do their own cleaning and dictionary/diary creating in a moment, and you want them to troubleshoot their own approach without directly mimicking the sample provided.

*3. Practice — 20-30 minutes*

Using individual computers (or in pairs), students will now practice cleaning the sample spreadsheet themselves. Each computer should have a copy of the file “Sample spreadsheet: Introduction to spreadsheets,” or make it accessible via a server system. Ask students to clean up the spreadsheet according to the list of necessary changes made earlier (hopefully still visible on a whiteboard). They also need to follow and execute the three rules we just reviewed (make a copy, make a diary, make a dictionary). They can format their diary and dictionary in many different ways, but it should be easy for an outside person to look at those sheets and tell exactly what they’ve done with the data.

*4. Check for understanding — 15-30 minutes*

Once all groups have completed their own spreadsheet cleaning, use the projector again to display the file “Sample spreadsheet: Cleaned spreadsheet” for them to compare to their own work. As a class, discuss items they may have missed, or compare the different ways each student or pair might have created a dictionary and diary. Remind students that there are many ways to format and use their diary or dictionary, but anything they do should be recorded with enough detail for an outside person to recreate the same steps. If that level of detail is not present in their diary and dictionary, students should go back in and revise.

While students are revising, walk the room and observe their progress, checking especially to see if you could recreate their steps and whether you understand their dictionary.

*5. Application — 30-45 minutes*

To expand their comfort level with spreadsheets, students will use their knowledge to gather data using the online software Google Forms and Google Sheets. At their computers, students should log into Google Drive and create a new form. This allows them to develop a survey of their peers, which will automatically feed the responses into a spreadsheet for later analysis. Students can choose to survey their class on any topic, but they should be sure to include questions that ask demographic data, such as:

- Name (optional)
- Sex
- Grade

Then, students should devise questions that relate to a particular survey topic. Ask students to be creative and survey each other on topics that relate to both the school community and their lives outside school. Instruct students to create a 10-question survey (seven questions in addition to the three demographic questions above).

Each survey MUST include the following types of questions: two with yes/no answer formats, two which provide ranked choices for all answers (i.e. agree, neutral, disagree), and three questions that ask for numerical responses. Having this variety of questions is essential for using this survey in the next lesson of this unit.

Show and discuss these sample questions with the class for a survey on homework loads:

- How many classes are you taking?
- How many release/free/off periods do you have?
- How many hours do you spend on homework each night?
- How many hours (if any) do you work at a job each week?
- Do you participate in extracurricular activities?
- How many hours do you spend on those activities combined each week?
- To what extent do you agree with this statement: “We are always given a reasonable amount of homework” (with answers ranked as “agree, mostly agree, neutral, mostly disagree, disagree)

Now, allow students to develop their own survey using Google Forms. The program is relatively intuitive, but if you feel like students need help using it, you can point them to this tutorial first:

Once student surveys are finished, consider a method for distributing the surveys to the class. You could post each link on a class website, send an email to all students or leave each survey up on a different computer and rotate students through the surveys in class. The idea is to have all surveys completed by the entire class.

Once surveys have been completed, students should click on the “Responses” tab of the survey form to create their spreadsheet of results. On this tab, you should see a “Create spreadsheet” button on the right, next to a green and white icon. Create a new spreadsheet, and take a look at how the results are displayed.

*6. Homework assessment*

Students should use their newly created spreadsheet of survey results and clean up any inconsistencies. If they built the survey using selection, button or drop-down tools, there should be few if any inconsistent or strange responses. At the beginning of the next class period, have all students open their cleaned spreadsheet for a quick formative assessment.

**Differentiation**

Students at a more beginner level would likely benefit from watching YouTube tutorials that introduce them to spreadsheets in a way that would allow the to pause and rewatch certain parts. Recommended tutorial:

Ask students to watch the tutorial, pausing to take notes of each step, tip and recommendation along the way.

Students with more advanced spreadsheet knowledge could begin to run basic calculations (mean, mode, media) on the file “Sample spreadsheet: Cleaned spreadsheet” in preparation for the next lesson in this unit.