class: right .left-column[ <img src="img/day1-cover-preview.png" alt="dot chart"> ] # Perception, reasoning, and credibility ## Limitations of common graph types <br> <br> Richard Layton 2022-02-14 <br> <br> <https://graphdr.github.io/data-stories/> <https://github.com/graphdr> <a href="mailto:graphdoctor@gmail.com">graphdoctor@gmail.com</a> --- class: left .left-column[ <img src="img/day1-cover-preview.png" alt="dot chart"> ] <br> #### `\(\quad\)` Effective alternatives to pie charts #### `\(\quad\)` <br> #### `\(\quad\)` Effective alternatives to bar charts #### `\(\quad\)` <br> #### `\(\quad\)` Aligning the design to the story #### `\(\quad\)` <br> #### `\(\quad\)` Advice from experts --- class:inverse, center, middle # Effective alternatives to pie charts --- class: left # Judging pie slices is a low-accuracy task .pull-left[ ![](img/top-5-pop.png) ] .pull-right[ The pie represents 3.6B people in 5 countries. Fill in the blanks with your visual estimate of the percentage (%) of each pie slice. 
<table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Percentage </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> China </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 39 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> India </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 38 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> United States </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 9 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Indonesia </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 8 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Pakistan </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 6 </td> </tr> </tbody> </table> The total should be 100%. ] <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Judging values along a common axis is a high-accuracy task The same data is displayed along a common scale. Fill in the blanks with your visual estimate of the percentage (%) for each country. 
<br> .right-column[ ![](slides-day1-01_files/figure-html/unnamed-chunk-4-1.png)<!-- --> ] .left-column[ <table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Percentage </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> China </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 39 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> India </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 38 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> United States </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 9 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Indonesia </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 8 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Pakistan </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 6 </td> </tr> </tbody> </table> ] <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Judging values along a common axis is a high-accuracy task Compare your visual estimates to the data values. 
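
The idea of judging values by position along a common scale can be sketched in plain text. The Python below is a minimal illustration, assuming the rounded percentages from the exercise table; the function name `dot_row`, the 0 to 50 axis, and the 50-character width are hypothetical choices made for this sketch, not part of the original slides.

```python
# Rounded percentages from the exercise table (shares of the 5-country total).
shares = {
    "China": 39,
    "India": 38,
    "United States": 9,
    "Indonesia": 8,
    "Pakistan": 6,
}


def dot_row(label, value, scale_max=50, width=50):
    """Return one text row: a label, a dotted guide line, and a marker at `value`."""
    # Map the value onto a column index along the shared 0..scale_max axis.
    position = round(value / scale_max * width)
    guide = ["."] * (width + 1)
    guide[position] = "o"
    return f"{label:<14}|{''.join(guide)}| {value:>3}%"


# Order rows by value so that neighbors on the page are neighbors in size.
ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)
rows = [dot_row(name, pct) for name, pct in ordered]
print("\n".join(rows))
```

Every marker sits on the same axis, so comparing two countries means comparing two horizontal positions rather than two angles or areas.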
<br> <br> .right-column[ ![](slides-day1-01_files/figure-html/unnamed-chunk-6-1.png)<!-- --> ] .left-column[ <table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Percentage </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> China </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 39 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> India </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 38 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> United States </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 9 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Indonesia </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 8 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Pakistan </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 6 </td> </tr> </tbody> </table> ] <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # 3D effects distort our judgment even further .pull-left[ ![](img/three-d-pie.png) ] .pull-right[ The pie represents 404M people in 5 countries. Fill in the blanks with your visual estimate of the percentage (%) of each pie slice. 
<table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Percentage </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Japan </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 31 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Germany </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 21 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> UK </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 17 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> France </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 16 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Italy </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 15 </td> </tr> </tbody> </table> The total should be 100%. ] <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Again, a common scale improves our visual judgments The same data is displayed along a common scale. Fill in the blanks with your visual estimate of the percentage (%) for each country. 
<br> .right-column[ ![](slides-day1-01_files/figure-html/unnamed-chunk-10-1.png)<!-- --> ] .left-column[ <table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Percentage </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Japan </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 31 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Germany </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 21 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> UK </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 17 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> France </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 16 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Italy </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 15 </td> </tr> </tbody> </table> ] <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Again, a common scale improves our visual judgments Compare your visual estimates to the data values. 
<br> <br> .right-column[ ![](slides-day1-01_files/figure-html/unnamed-chunk-12-1.png)<!-- --> ] .left-column[ <table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Percentage </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Japan </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 31 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Germany </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 21 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> UK </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 17 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> France </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 16 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Italy </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 15 </td> </tr> </tbody> </table> ] <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Graph design should align with one's rhetorical goals .pull-left[ In going from this design... <img src="img/three-d-pie-preview.png" alt="pie chart preview" style="height:180px;"> ] .pull-right[ ...to this design... 
<img src="img/dot-chart-preview.png" alt="dot chart preview" style="height:180px;"> ] <br> - we improve the accuracy of the audience's **perception** of the data - we make it easier to **reason** quantitatively about the data - which should enhance our **credibility** with the audience --- class:inverse, center, middle # Effective alternatives to bar charts --- class: left # 3D effects always distort our judgment Make visual estimates of each country's 2021 population (fill in the blanks). <br> .left-column[ <table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Millions </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> China </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 1439 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> India </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 1380 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> United States </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 331 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Indonesia </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 274 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Pakistan </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 221 </td> </tr> </tbody> </table> ] .right-column[ ![](img/three-d-bar.png) .center[2021 population (in millions)] ] <div class="my-footer"> <span> 
Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Same data—without 3D effects—along a common scale Make new visual estimates (fill in the blanks). <br> .left-column[ <table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Millions </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> China </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 1439 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> India </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 1380 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> United States </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 331 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Indonesia </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 274 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Pakistan </td> <td style="text-align:right;color: white !important;background-color: white !important;"> 221 </td> </tr> </tbody> </table> ] .right-column[ ![](slides-day1-01_files/figure-html/unnamed-chunk-16-1.png)<!-- --> ] <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Same data—without 3D effects—along a common scale Compare your visual estimates to the data values. 
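
The population values in millions connect back to the percentages used in the pie chart exercise. The short Python sketch below works through that arithmetic, assuming the five values from the table on this slide; the variable names are illustrative only.

```python
# 2021 populations in millions, as listed in the table on this slide.
populations_millions = {
    "China": 1439,
    "India": 1380,
    "United States": 331,
    "Indonesia": 274,
    "Pakistan": 221,
}

# Total across the five countries: 3645 million, i.e. about 3.6B people.
total_millions = sum(populations_millions.values())

# Each country's share of the total, rounded to a whole percent.
percent_share = {
    country: round(100 * millions / total_millions)
    for country, millions in populations_millions.items()
}

for country, pct in percent_share.items():
    print(f"{country:<14} {populations_millions[country]:>5} M  {pct:>3}%")
```

Independently rounded shares do not always sum to exactly 100; for these five values they happen to do so.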
<br> .left-column[ <table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Millions </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> China </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 1439 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> India </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 1380 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> United States </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 331 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Indonesia </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 274 </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> Pakistan </td> <td style="text-align:right;color: black !important;background-color: white !important;"> 221 </td> </tr> </tbody> </table> ] .right-column[ ![](slides-day1-01_files/figure-html/unnamed-chunk-18-1.png)<!-- --> ] <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # With a zero baseline and no 3D effects, bars are OK Bar charts must have a zero baseline to avoid deceiving your audience. 
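
A quick calculation shows why the baseline matters. The Python sketch below compares the ratio of two drawn bar lengths with the ratio of the underlying values, assuming the China and Pakistan populations from the earlier table; the function `bar_length_ratio` and the truncated baseline of 200 million are hypothetical, chosen only to illustrate the distortion.

```python
def bar_length_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for values a and b when the axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must lie below both values")
    # A bar's drawn length is its value minus the axis baseline.
    return (a - baseline) / (b - baseline)


china, pakistan = 1439, 221  # 2021 populations in millions

# With a zero baseline, bar lengths are proportional to the data.
true_ratio = bar_length_ratio(china, pakistan)

# With a hypothetical baseline of 200, the bars are 1239 and 21 units long.
truncated_ratio = bar_length_ratio(china, pakistan, baseline=200)

print(f"zero baseline:   China's bar is {true_ratio:.1f}x Pakistan's")
print(f"baseline at 200: China's bar is {truncated_ratio:.1f}x Pakistan's")
```

Under these assumptions the truncated axis makes China's bar look about 59 times as long as Pakistan's, although the population is only about 6.5 times as large.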
<br> ![](slides-day1-01_files/figure-html/unnamed-chunk-19-1.png)<!-- --> <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Row order affects the audience's ability to make comparisons Ordering rows by the data values is usually superior to alphabetical order for visual comparisons. <br> ![](slides-day1-01_files/figure-html/unnamed-chunk-20-1.png)<!-- --> <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Area is perceived differently than position Area encodes no information. The only information in the bar is the position of its end point. <br> ![](slides-day1-01_files/figure-html/unnamed-chunk-21-1.png)<!-- --> <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class: left # Dot charts allow direct visual comparison of quantities Dot charts are effective replacements for pie charts and bar charts. <br> ![](slides-day1-01_files/figure-html/unnamed-chunk-22-1.png)<!-- --> <div class="my-footer"> <span> Data source: <a href="https://worldpopulationreview.com/"> 2021 World population review</a> </span> </div> --- class:inverse, center, middle # Aligning the design to the story --- class: left # Survey: “What was your reason for taking this postdoc?” <img src="img/stacked-bar-04.png" alt="stacked bar chart" height="275"> First, let's agree on what the chart *shows*. - What does a color represent? - What does a single color-segment of a bar represent? - What does the changing height over time of a segment represent? <div class="my-footer"> <span> Data source: Main, Joyce B., Yanbing Wang, and Li Tan (2021) “The career outlook of engineering PhDs” <em>J of Engineering Education</em> 110 (4): 977–1002. </span> </div> --- class: left # What ideas are conveyed by the chart? 
<img src="img/stacked-bar-04.png" alt="stacked bar chart" height="275"> Now we can consider what the chart says. - Describe a trend for one of the six reasons for obtaining postdoc training. - Compare two of the reasons over time. - Describe the main idea this chart conveys to you. --- class: left # Variables can be classified <table class=" lightable-paper" style='font-size: 24px; font-family: "Arial Narrow", arial, helvetica, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;font-weight: bold;color: #8c510a !important;background-color: #f6e8c3 !important;"> numeric </th> <th style="text-align:left;font-weight: bold;color: #8c510a !important;background-color: #f6e8c3 !important;"> (quantitative) a characteristic whose values are numbers </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> discrete </td> <td style="text-align:left;color: black !important;background-color: white !important;"> Countable and can assume only a finite number of real values on a given interval </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> continuous </td> <td style="text-align:left;color: black !important;background-color: white !important;"> Measurable and can assume an infinite number of real values on a given interval </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;font-weight: bold;color: #8c510a !important;background-color: #f6e8c3 !important;"> categorical </td> <td style="text-align:left;color: black !important;background-color: white !important;font-weight: bold;color: #8c510a !important;background-color: #f6e8c3 !important;"> (qualitative) a characteristic that can’t be quantified </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> ordinal </td> <td style="text-align:left;color: black 
!important;background-color: white !important;"> Levels have a natural order, e.g., days of the week, education level, age group </td> </tr> <tr> <td style="text-align:left;color: black !important;background-color: white !important;"> nominal </td> <td style="text-align:left;color: black !important;background-color: white !important;"> Levels are unordered, e.g., gender, eye color, mode of transportation </td> </tr> </tbody> </table> --- class: left # What can we say about the variables? <img src="img/stacked-bar-04.png" alt="stacked bar chart" height="275"> 1. Time (discrete years) is one categorical variable. 2. The other categorical variable is `___________________________` 3. The quantitative variable is `___________________________` 4. Which is the independent variable? `___________________________` --- class: left # The appropriate design for a time series is a line graph ![](slides-day1-01_files/figure-html/unnamed-chunk-26-1.png)<!-- --> --- class: left # The appropriate design for a time series is a line graph Separating reasons into individual panels clarifies the data ![](slides-day1-01_files/figure-html/unnamed-chunk-27-1.png)<!-- --> - Describe the main idea this chart conveys to you. --- class: left # The problem is time Time—in this case—is not a useful independent variable - Time locks us into the graphical conventions of time series. - What Howard Wainer calls *emphasizing the trivial*. <div class="my-footer"> <span> Howard Wainer (1997) <em>Visual Revelations</em>, Lawrence Erlbaum Assoc., Mahwah, NJ, p. 30. 
</span> </div> -- <br> Suppose we drop the year variable - a single quantity: set of reported percentages - conditioned by a category: reasons for postdoc training -- - boxplots summarize distributions .center[ ![](slides-day1-01_files/figure-html/unnamed-chunk-28-1.png)<!-- --> ] --- class: left # Our final design shows distributions of annual percentages ![](slides-day1-01_files/figure-html/unnamed-chunk-29-1.png)<!-- --> - Describe the main idea this chart conveys to you. --- class: left # Graph design should align with one's rhetorical goals .pull-left[ In going from this design... <img src="img/stacked-bar-03.png" alt="stacked bar preview" style="height:250px;"> ] .pull-right[ ...to this design... <img src="img/boxplot-preview.png" alt="boxplot preview" style="height:250px;"> ] - we improve the audience's **perception** of the stories in the data - we make it easier to **reason** quantitatively about the data - which should enhance our **credibility** with the audience --- class:inverse, center, middle # Advice from experts --- class: left # Jean-Luc Doumont .left-column[ <img src="img/doumont-face.png" alt="Jean-Luc Doumont" style="width:250px;"> ] .right-column[ The optimal graph design depends on two factors, primarily, 1. The message to be conveyed 2. The variables to be shown ] <div class="my-footer"> <span> Jean-Luc Doumont (2009) <em>Trees, Maps and Theorems</em>, Principiae. </span> </div> --- class: left # Edward Tufte .left-column[ <img src="img/tufte-face.png" alt="Edward Tufte" style="width:250px;"> ] .right-column[ ... the task of the designer is to give visual access to the subtle and the difficult— that is, the revelation of the complex. ] <div class="my-footer"> <span> Edward Tufte (1983) <em>The Visual Display of Quantitative Information</em>, Graphics Press. 
</span> </div> --- class: left # Stephanie Evergreen .left-column[ <img src="img/evergreen-face.png" alt="Stephanie Evergreen" style="width:250px;"> ] .right-column[ What’s your point? Seriously, that’s the most important question. ] <div class="my-footer"> <span> Stephanie D. H. Evergreen (2017) <em>Effective Data Visualization</em>, Sage. </span> </div> --- class: left # Alberto Cairo .left-column[ <img src="img/cairo-face.png" alt="Alberto Cairo" style="width:250px;"> ] .right-column[ Charts can lie to us because we read too much into them or see in them what we want to believe. <br> Don't lie to yourself (or others) with charts. ] <div class="my-footer"> <span> Alberto Cairo (2019) <em>How Charts Lie</em>, W.W. Norton. </span> </div> --- class: left # Ideas to consider - **Characterize** the data structure and content - **Explore** a story's context, causality, and complexity - **Align** visual and verbal logic by revising iteratively - **Edit** to suit the rhetorical goals for each audience - **Control** every pixel; avoid thoughtless conformity - **Question**: are you seeing only what you want to believe? --- class:inverse, center, middle