In the September issue of the Interchange, a new event, called the Michigan Statistics Poster Competition (MSPC), was announced. As a reminder, a statistics poster is a visual display that uses one or more related graphs to summarize data, discuss different points of view, and answer question(s) about the data. The purpose of the poster competition is to get students involved in statistical activities while exercising essential communication skills. All students in K-12 residing in Michigan are eligible to submit statistics posters to the competition. More information, the registration form, and a complete set of rules are available at the event's web site. Additional information can be obtained by contacting Dan Frobish.
In the November issue of the Interchange, a method for selecting a research topic was presented. The method involves brainstorming ideas as a class or a group, critiquing ideas after all suggested topics are listed, and reaching a consensus on which topic to pursue. As part of the process, students need to discuss ways in which the supporting data can be collected (e.g., by observing the behaviors of others, by conducting a survey, or by conducting an experiment).
In this issue, we offer suggestions for how to display the data graphically. We will build upon the examples provided in the November issue of the Interchange. As you are creating your graphs, keep in mind that the reader of your poster should be able to look at the graphs and understand the story of the data. The reader should be able to immediately ascertain your research question and your conclusions, without having to read the description on the back of the poster. It may be useful to think of graphs as photos that make it easy for your reader to visualize all of the information that you have collected.
In trying to select an appropriate graph or graphs to represent your data, a brief review of the kinds of data may be helpful. Broadly, data fall into two major categories: qualitative data (e.g., words or text) and quantitative data (numbers). A quantitative variable can be further classified as discrete (it takes on only certain values in a range of numbers) or continuous (it can take on any value in a range of numbers). Bar graphs and pie charts are more appropriate for displaying qualitative variables or discrete quantitative variables. Graphs such as histograms and stem-and-leaf plots are more appropriate for displaying continuous quantitative variables or numerical values that have been grouped together.
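For readers who like to see a rule of thumb written out, the guidance above can be sketched as a small Python function. This is only an illustration: the function name suggest_graphs and the labels "qualitative", "discrete", and "continuous" are assumptions made for this sketch, and the suggestions are starting points rather than strict rules.

```python
def suggest_graphs(kind, grouped=False):
    """Return graph types suggested for a single variable.

    kind: "qualitative", "discrete", or "continuous"
          (labels chosen for this sketch).
    grouped: True when numerical values have been grouped into ranges.
    """
    if kind == "qualitative":
        return ["bar graph", "pie chart"]
    if kind == "discrete" and not grouped:
        return ["bar graph", "pie chart"]
    if kind in ("discrete", "continuous"):
        # Continuous values, or numerical values grouped into ranges.
        return ["histogram", "stem-and-leaf plot"]
    raise ValueError(f"unknown kind of data: {kind!r}")


print(suggest_graphs("qualitative"))
print(suggest_graphs("discrete", grouped=True))
```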
For instance, the topic in Example 1 in the November issue focused on the number of home runs hit by the home run champions over the last 40 years in both the American and the National Leagues. The primary variable of interest is the number of home runs hit by the yearly home run champion in each league. This is a discrete quantitative variable (since you cannot have 63.2 or 55.3 home runs). An appropriate graph for displaying this variable would be a bar graph where the home run values are placed along the x-axis and the y-axis scale represents the frequency or percent of champions achieving a particular value. If the home run values are grouped (e.g., 50-54, 55-59, 60-64), then a histogram would be an appropriate graph choice.
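Before drawing either graph, the values have to be tallied. A minimal Python sketch of that tallying step follows. The home run totals are made up for illustration and are not real league data, and the helper name group_label is an assumption of this sketch.

```python
from collections import Counter

# Made-up home run totals, for illustration only (not real league data).
home_runs = [52, 47, 52, 58, 61, 47, 50, 52]

# Ungrouped: one bar per distinct value, bar height = frequency.
frequency = Counter(home_runs)


def group_label(value, width=5):
    """Return the range label, such as '50-54', that contains value."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"


# Grouped: one histogram bar per range of values.
grouped = Counter(group_label(v) for v in home_runs)

# Print a rough text version of each graph, one '#' per champion.
for value in sorted(frequency):
    print(value, "#" * frequency[value])
for label in sorted(grouped):
    print(label, "#" * grouped[label])
```

The same counts could then be used to set the bar heights on a hand-drawn or computer-drawn graph.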
If one wanted to display the distribution of home runs for both the American and the National Leagues on one graph to facilitate comparison, then one might choose a side-by-side bar chart. With this graph, there are two bars placed alongside each other for each value of home runs and the bars are differently marked (perhaps one bar is blue and the other is green) to distinguish between the two leagues. If the data are grouped, then two histograms might be used that share a common vertical center scale. The bars for one league would extend horizontally to the right of the scale (with the lengths of the bars representing the frequency or percent of cases falling within a given range of home runs) and the bars for the other league would extend horizontally to the left of the scale. A back-to-back stem-and-leaf plot would also be an appropriate choice for simultaneously comparing these two distributions.
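To show what a back-to-back stem-and-leaf plot looks like, here is a minimal Python sketch that builds one as lines of text. It assumes whole-number values, uses the tens digit as the stem and the ones digit as the leaf, and the two lists of home run totals are made up for illustration. The function name back_to_back is an assumption of this sketch.

```python
from collections import defaultdict


def back_to_back(left, right):
    """Return the lines of a back-to-back stem-and-leaf plot.

    Stems are tens digits and leaves are ones digits. Leaves for `left`
    read outward from the stem (right to left); leaves for `right`
    read left to right.
    """
    left_leaves = defaultdict(list)
    right_leaves = defaultdict(list)
    for v in left:
        left_leaves[v // 10].append(v % 10)
    for v in right:
        right_leaves[v // 10].append(v % 10)

    everything = list(left) + list(right)
    stems = range(min(everything) // 10, max(everything) // 10 + 1)
    # Pad the left side so that all stems line up in one column.
    width = max(len(left_leaves[s]) for s in stems)
    lines = []
    for s in stems:
        l = "".join(str(d) for d in sorted(left_leaves[s], reverse=True))
        r = "".join(str(d) for d in sorted(right_leaves[s]))
        lines.append(f"{l.rjust(width)} | {s} | {r}".rstrip())
    return lines


# Made-up totals for two leagues, for illustration only.
league_a = [47, 52, 52, 58, 61]
league_b = [45, 49, 50, 54, 65]
for line in back_to_back(league_a, league_b):
    print(line)
```

In the printed plot, the middle column holds the shared stems, with one league's leaves on each side, so the two shapes can be compared at a glance.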
In Example 2 in the November issue, the topic focused on the distribution of colors in a bag of Skittles and a bag of plain M&Ms. Since the variable of interest, color, is qualitative (red, blue, yellow, etc.), a bar chart or pie chart would be an appropriate graph for displaying the color distribution of a particular candy. In a bar chart, the various colors would appear along the x-axis and the y-axis scale would represent the frequency or percent for each color. If one wanted to compare the color distributions for the two candies together on one graph, either a side-by-side bar chart or a stacked bar chart would be an appropriate option. With a stacked bar chart, the two bars associated with each color, as described with the side-by-side bar chart, are placed on top of each other rather than side by side. The two segments are still differently marked (e.g., one is shaded, the other is not) to distinguish between the two candy types.
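When the two bags hold different numbers of candies, percents are usually easier to compare than raw counts. The Python sketch below tallies the percent of each color for two bags. The bag contents are made up for illustration, and the names percent_by_category, candy_a, and candy_b are assumptions of this sketch.

```python
from collections import Counter


def percent_by_category(items):
    """Return {category: percent of all items}, rounded to one decimal."""
    counts = Counter(items)
    total = sum(counts.values())
    return {c: round(100 * n / total, 1) for c, n in counts.items()}


# Made-up bag contents, for illustration only.
candy_a = ["red"] * 5 + ["yellow"] * 3 + ["green"] * 2
candy_b = ["red"] * 4 + ["yellow"] * 4 + ["blue"] * 12

percent_a = percent_by_category(candy_a)
percent_b = percent_by_category(candy_b)

# One row per color; a color missing from a bag counts as 0 percent.
for color in sorted(set(percent_a) | set(percent_b)):
    print(color, percent_a.get(color, 0.0), percent_b.get(color, 0.0))
```

Each printed row gives the heights of the two bars (or the two stacked segments) for one color.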
Suppose you wanted to examine the relationship between the fat grams and the calories in a single serving of popular candy bars. Both of these variables are quantitative and continuous in nature. An appropriate graph choice for displaying the distribution of fat grams or the distribution of calories would be a histogram with the values of fat grams or calories grouped along the x-axis and the frequency or percent of observations in each grouping indicated along the y-axis. A stem-and-leaf plot would also be an appropriate choice for displaying each of these distributions individually. A graph that could be used to display both of these variables together is a scatterplot with fat grams on the x-axis and calories on the y-axis. Each point on the scatterplot represents a particular candy bar.
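As a rough illustration of how each candy bar becomes one point, the Python sketch below places points on a grid of text characters. The (fat grams, calories) pairs are made up for illustration, and the function name text_scatter and the grid size are assumptions of this sketch. A real poster graph would of course be drawn with proper axes, scales, and labels.

```python
def text_scatter(points, width=30, height=10):
    """Return rows of text with '*' marking each (x, y) point.

    x increases to the right and y increases upward, as on a scatterplot.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_span = (max(xs) - min(xs)) or 1  # avoid dividing by zero
    y_span = (max(ys) - min(ys)) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = round((x - min(xs)) / x_span * (width - 1))
        row = round((y - min(ys)) / y_span * (height - 1))
        # Row 0 of the grid is the top line, so flip the y direction.
        grid[height - 1 - row][col] = "*"
    return ["".join(cells) for cells in grid]


# Made-up (fat grams, calories) pairs, for illustration only.
candy_bars = [(8, 190), (11, 220), (12, 240), (14, 270), (16, 280)]
for line in text_scatter(candy_bars):
    print(line)
```

Two candy bars that land in the same grid cell share one mark here, which is one reason a properly scaled graph is preferable for the poster itself.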
Remember that your reader is not as familiar with your project and data as you are. You should strive to make your graphs as "user-friendly" as possible. As a check, have a friend or family member who is unfamiliar with your data look at the graph and describe what it represents to him or her. This will give you an indication of how clearly your graph portrays your data. A graph should contain an accurate and descriptive title as well as descriptive labels for the axes. Since this is a competition, a creative title might also attract a judge's attention.
Be creative (but accurate) when creating your graphs. For instance, with the graphs in Example 1, you could use graphics of baseballs or baseball caps (with team logos) to make the bars. Similarly, with Example 2, graphics of Skittles or M&Ms could be used to create the bars. If your data are time dependent, the bars could be arranged in a circular fashion around a clock graphic. This would immediately allow the reader to see the relationship between the variable and time. While creativity is encouraged, please do not attach any perishable items to your poster. Any non-perishable items attached to your poster should be firmly affixed.
For the Michigan Statistics Poster Competition, posters submitted by students in grades K-3 must contain at least one graph. Posters submitted by students in grades 4-12 must contain at least two graphs. For teams with members from different grade levels, the highest grade level determines the category.
In conclusion, with a little thought and creativity, you can create graphs that are visually interesting, tell a story, and represent your data accurately. Most of all, HAVE FUN!
Note: A reference you might find useful is Graphing Statistics and Data: Creating Better Charts by Anders Wallgren, Britt Wallgren, Rolf Persson, Ulf Jorner, and Jan-Aage Haaland (1996), Newbury Park, CA: Sage Publications.
In the next issue of the Interchange, the last issue you will receive prior to the poster submission deadline, we will present ideas related to poster layout or how to put all of the pieces together.