Writing with Statistics

Daniel Emery

Students throughout the university write with statistical information. At the simplest level, they may report findings from research as evidence for a claim. At the most sophisticated level, students may need to both perform and interpret complex multivariate analyses using data they have collected and organized. In both cases, students may undervalue attention to the ways writers use statistical information, presuming that numbers will speak for themselves. However, scholars and professionals know that the written explanation of statistical conclusions can be as significant as the calculation. Here are a few suggestions to consider when teaching students how to incorporate statistical data in their writing. Note: it may be beneficial to review the February 2015 teaching tip on Writing with Numbers for additional information.

Consider an ungraded pretest activity

If your course requires a familiarity with statistics, assess your students’ statistical knowledge early in the semester. A low-stakes quiz on relevant statistical terms and concepts will help you to understand the extent of students’ background knowledge. If you find that students lack understanding of some foundational concepts, you might either consider a brief review in class or recommend supplemental instruction on statistics online. Misunderstandings of statistical models and tests may only appear in final graded work when it is too late to revise.

Emphasize precision when working with correlation and significance

One of the most common errors students make when writing about statistical information is confusing the precise language used to characterize relationships between variables with ordinary uses of terms like “cause” and “significance.” When addressing issues of correlation, students should be clear about relationships of association and typically avoid causal language. Consider this example from the Journal of the American Medical Association:

Strong associations have been shown between retrospective adult reports of increasing numbers of traumatic childhood events with greater prevalence of a wide array of health impairments, including coronary artery disease, chronic pulmonary disease, cancer, alcoholism, depression, and drug abuse (Shonkoff, Boyce, and McEwen, 2009).

Even with an implied time sequence (childhood trauma presumably occurs before lifelong health consequences emerge), the authors avoid errors associated with implied causality. Fields differ on conventions regarding causality, but before accepting a causal claim they typically require a consistent, strong relationship between variables across conditions and a plausible explanatory mechanism.

Similarly, students may confuse statistical significance with meaningful significance. Tyler Vigen demonstrates these errors on his website and in his book Spurious Correlations. Confounding variables and mathematical chance may each suggest relationships where none exists.
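The role of chance can be shown to students with a short simulation. The sketch below is illustrative only: it generates pairs of unrelated random series and reports the strongest correlation it happens to find. The function names, the number of pairs, the series length, and the seed are all assumptions chosen for the demonstration, not values drawn from any study.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(n_pairs=200, n_points=10, seed=0):
    """Compare many pairs of unrelated random series; return the largest |r|."""
    rng = random.Random(seed)  # fixed seed so the demonstration is repeatable
    strongest = 0.0
    for _ in range(n_pairs):
        xs = [rng.gauss(0, 1) for _ in range(n_points)]
        ys = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson_r(xs, ys)
        if abs(r) > abs(strongest):
            strongest = r
    return strongest


if __name__ == "__main__":
    r = strongest_chance_correlation()
    print(f"Strongest correlation among 200 unrelated pairs: r = {r:.2f}")
```

Because the series are generated independently, any strong correlation the script reports arises from chance alone. Short series and many comparisons make such results more likely, which is a useful point of discussion when students search large data sets for relationships.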

Familiarize your students with disciplinary standards of reporting

Despite the mathematical consistency of quantitative data and the ability to use common statistical tests across contexts (whether as simple as means and medians or as complex as ordinary least squares regressions), disciplines will differ when it comes to reporting significant digits, standard units of measure, and appropriate levels of aggregation. Call students’ attention to typical cutoff lines and similar standards when they are reading research in your field and as they draft their own documents.
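One way to make reporting conventions concrete is to have students format the same values under different rules. The following is a minimal sketch, not a statement of any discipline's standard: `round_sig` and `format_p` are hypothetical helper names, and the default floor of 0.001 and three decimal places are assumptions standing in for whatever cutoffs your field uses.

```python
import math


def round_sig(value, digits):
    """Round a value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. -2 for 0.045 and 3 for 1234.5.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def format_p(p, floor=0.001, decimals=3):
    """Report a p-value exactly, or as an inequality below an assumed floor."""
    if p < floor:
        return f"p < {floor:.{decimals}f}"
    return f"p = {p:.{decimals}f}"


if __name__ == "__main__":
    print(round_sig(0.045678, 2))
    print(round_sig(1234.5, 2))
    print(format_p(0.0312))
    print(format_p(0.0004))
```

Asking students to change `digits`, `floor`, and `decimals` to match the conventions they find in published articles in your field can prompt discussion of why those conventions exist.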

Assign statistical writing with attention to purpose and audience

Reporting conventions change depending upon the purpose and audience of a document. While a conservation biologist presenting at a professional conference might not need to explain sampling methods to peers, doing so may be appropriate for an audience of concerned citizens. This is not a question of “dumbing down” your analysis, but rather of selecting the appropriate degree of detail necessary for your statistical data to support your claim and purpose. Expert and lay audiences may be interested in variance and consistency, but Cronbach’s α and intraclass correlation coefficients may be unfamiliar terms for non-experts.

In teaching, it can be very useful for students to display the same statistical data in multiple ways. While professional audiences may prefer conciseness, asking your students to express their findings in prose sentences, in tabular forms, and in figures and other visualizations will benefit student learning. Not only does it give students opportunities to practice writing with statistical software packages, but it also can help students recognize how the meaning and value of statistical evidence may be better illustrated with different displays.
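As an illustration of presenting the same data in more than one form, the sketch below summarizes a small data set and renders the summary both as a prose sentence and as a plain-text table. The scores are invented for the example, and the function names and formatting choices (one decimal place, the column layout) are assumptions rather than a recommended standard.

```python
import statistics

# Hypothetical quiz scores for two course sections (illustrative data only).
scores = {
    "Section A": [72, 85, 90, 66, 78],
    "Section B": [88, 92, 79, 95, 81],
}


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation
    }


def as_sentence(label, s):
    """Express one group's summary as a prose sentence."""
    return (
        f"{label} (n = {s['n']}) had a mean score of {s['mean']:.1f} "
        f"(SD = {s['sd']:.1f}) and a median of {s['median']:.1f}."
    )


def as_table(summaries):
    """Express all groups' summaries as an aligned plain-text table."""
    header = f"{'Group':<10} {'n':>3} {'Mean':>6} {'Median':>7} {'SD':>6}"
    rows = [header, "-" * len(header)]
    for label, s in summaries.items():
        rows.append(
            f"{label:<10} {s['n']:>3} {s['mean']:>6.1f} "
            f"{s['median']:>7.1f} {s['sd']:>6.1f}"
        )
    return "\n".join(rows)


if __name__ == "__main__":
    summaries = {label: summarize(vals) for label, vals in scores.items()}
    for label, s in summaries.items():
        print(as_sentence(label, s))
    print()
    print(as_table(summaries))
```

Students can compare the two outputs and discuss which details each form foregrounds, and which form better suits a given audience.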

Learn More

Further Support 

Visit us online at https://wac.umn.edu/tww-program. To schedule a phone, virtual, or face-to-face teaching consultation, click here.