



SHORT REPORT 

Year : 2020  |  Volume : 3  |  Issue : 2  |  Page : 82-84

What impact does the use of mean versus median statistics have on student grading?
Kenneth D Royal^{1}, Erika Cretton Scott^{2}, Terri M Wensel^{3}
^{1} Department of Clinical Sciences, North Carolina State University, Raleigh, North Carolina, USA ^{2} Department of Pharmaceutical, Social and Administrative Sciences, McWhorter School of Pharmacy, Samford University, Birmingham, Alabama, USA ^{3} Department of Pharmacy Practice, McWhorter School of Pharmacy, Samford University, Birmingham, Alabama, USA
Date of Submission: 10-Jun-2020
Date of Acceptance: 18-Jun-2020
Date of Web Publication: 27-Jul-2020
Correspondence Address: Dr. Kenneth D Royal, Department of Clinical Sciences, North Carolina State University, Raleigh, North Carolina, USA
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/EHP.EHP_20_20
Whether to calculate students' grades using a mean or a median statistic is an important decision for educators, because the choice of statistic can have a significant impact on students' grades. Using real data from a preclinical course at a large pharmacy school in the United States, students' grades on five examinations were calculated using both mean and median statistics. With respect to grade performance, 84 (61.3%) students would have received a score increase if the median statistic were used to calculate grades, whereas 53 (38.7%) students would have received a score decrease. Letter grades for the course would also vary for some students depending on which statistic was used. The choice of a mean or a median statistic can therefore have a considerable effect on student grading. We encourage other educators to examine the influence that mean versus median statistics may have on grade calculations and to select the statistic that will lead to the most valid indicator of student performance.
Keywords: Assessment, evaluation, grading, medical education, preclinical education
How to cite this article: Royal KD, Scott EC, Wensel TM. What impact does the use of mean versus median statistics have on student grading? Educ Health Prof 2020;3:82-4
How to cite this URL: Royal KD, Scott EC, Wensel TM. What impact does the use of mean versus median statistics have on student grading? Educ Health Prof [serial online] 2020 [cited 2020 Aug 14];3:82-4. Available from: http://www.ehpjournal.com/text.asp?2020/3/2/82/290921
Introduction   
Grading is a fundamental aspect of education. The purpose of grading is to describe the level of student performance in achieving the learning goals declared for a given course.^{[1]} If you ask educators how they typically assign grades, most will acknowledge that they turn to an electronic database that houses student data and calculates an average score. The numerical grade is then converted to a letter grade and recorded on students' transcripts. Some educators will go so far as to describe the fairness and objectivity of the process, noting the mathematically precise scores calculated by an electronic gradebook.
However, Royal and Guskey ^{[2]} make clear that mathematical precision does not necessarily ensure valid grades. Using a hypothetical example in which the same students' scores are calculated using a mean, a median, and a mean after deleting the lowest grade, the authors illustrate that students' scores and subsequent letter grades can vary considerably depending on which statistic is used. Because the example presented by Royal and Guskey was purely hypothetical, the purpose of this work was to explore the impact that the selection of either a mean or a median statistic might have on real student data in a health profession program.
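The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration: the scores are invented for one hypothetical student, and the helper name `mean_drop_lowest` is our own, not taken from Royal and Guskey.

```python
from statistics import mean, median

# Hypothetical examination scores for one student (not data from the study).
scores = [92, 88, 95, 60]


def mean_drop_lowest(values):
    """Mean after removing the single lowest score."""
    trimmed = sorted(values)[1:]
    return mean(trimmed)


summary = {
    "mean": mean(scores),
    "median": median(scores),
    "mean_drop_lowest": mean_drop_lowest(scores),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

For these invented scores the mean is 83.75, the median is 90.00, and the mean after dropping the lowest score is 91.67, so a single low examination moves the student across a conventional 90-point cutoff depending on the statistic chosen.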
Methods   
Course setting
PHRX 505 is the third and final course of a three-course sequence in pharmaceutical sciences that occurs in the fall semester of the third professional year of a 4-year Doctor of Pharmacy (PharmD) program located at a large pharmacy school in the United States. The student cohort consists of an average class size of 125. Course assessments consist of four standardized, multiple-choice examinations. Three examinations assess new material, and a fourth examination, given during finals week, contains both new and previously assessed material. All four examinations are weighted equally in the course.
Analysis
Data analysis consisted of calculating descriptive statistics for each examination and constructing histograms to visually display the score distributions. Minitab statistical software (version 17, Minitab Inc., State College, PA, USA) was used to perform the statistical analyses.
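For readers without access to Minitab, a comparable descriptive summary can be approximated in Python. The sketch below is not the authors' analysis. The scores are hypothetical, the function names are our own, and the skewness is the simple moment-based (population) version, which may differ slightly from the adjusted estimate a statistics package reports.

```python
from collections import Counter
from statistics import mean, median, pstdev


def skewness(values):
    """Moment-based (population) skewness: the average cubed z-score."""
    mu = mean(values)
    sigma = pstdev(values)
    return sum(((x - mu) / sigma) ** 3 for x in values) / len(values)


def describe(values):
    """Return basic descriptive statistics for one examination."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": pstdev(values),
        "min": min(values),
        "max": max(values),
        "skew": skewness(values),
    }


def text_histogram(values, width=10):
    """Print a crude text histogram using bins of the given width."""
    bins = Counter(int(v // width) * width for v in values)
    for start in sorted(bins):
        print(f"{start:3d}-{start + width - 1:3d} | {'#' * bins[start]}")


# Hypothetical scores for a single examination (not data from the study).
exam = [95, 92, 90, 88, 88, 85, 84, 80, 72, 55]
stats = describe(exam)

for name, value in stats.items():
    print(f"{name}: {value:.2f}")
text_histogram(exam)
```

In this invented example the mean falls below the median and the skewness is negative, the pattern expected when a few low scores trail a cluster of high ones.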
Results   
The distribution of examination scores is presented in [Figure 1]. A statistical summary of the scores is available in [Table 1]. Results indicate that although all five examinations had negative skew values, three of the five examinations, namely examinations 2, 3, and 4, had a visually discernible negative skew.
With respect to grade performance, 84 (61.3%) students would have received a score increase if the median statistic were used to calculate grades, whereas 53 (38.7%) students would have received a score decrease. When overall scores are broken down using a conventional grading scale of A = 90–100, B = 80–89, C = 70–79, D = 60–69, and E = <60, 32.8% (n = 45) of students would receive an “A,” 45.3% (n = 62) would receive a “B,” 18.2% (n = 25) would receive a “C,” and 3.6% (n = 5) would receive a “D” if the median statistic were used. By comparison, 30.0% (n = 41) would receive an “A,” 48.2% (n = 66) would receive a “B,” 18.2% (n = 25) would receive a “C,” and 3.6% (n = 5) would receive a “D” if the mean statistic were used.
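The comparison reported above can be reproduced for any gradebook with a short script. The following is a minimal sketch under stated assumptions: the gradebook is hypothetical, scores are not rounded before a letter is assigned, and anything below 60 is treated as an “E.” The function and variable names are ours.

```python
from collections import Counter
from statistics import mean, median


def letter_grade(score):
    """Map a numeric score to a letter (assumed cutoffs: 90, 80, 70, 60)."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "E"


# Hypothetical gradebook: four equally weighted examinations per student.
gradebook = {
    "student_1": [93, 91, 78, 90],
    "student_2": [85, 86, 84, 99],
    "student_3": [72, 75, 74, 58],
    "student_4": [95, 96, 94, 97],
}

direction = Counter()   # how the median score compares with the mean score
letter_changes = []     # students whose letter grade differs by statistic

for student, scores in gradebook.items():
    by_mean, by_median = mean(scores), median(scores)
    if by_median > by_mean:
        direction["increase"] += 1
    elif by_median < by_mean:
        direction["decrease"] += 1
    else:
        direction["no change"] += 1
    if letter_grade(by_mean) != letter_grade(by_median):
        letter_changes.append(
            (student, letter_grade(by_mean), letter_grade(by_median))
        )

print(dict(direction))
print(letter_changes)
```

In this invented gradebook, two students score higher with the median, one scores lower, and one is unchanged. Two of the four students also receive a different letter grade depending on the statistic.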
Discussion   
Results from this study confirm that some students' letter grades changed under a traditional grading scale that uses 9–10 point increments (e.g., 90–100 = A, 80–89 = B, etc.). If the grading scale were more nuanced (e.g., 97–100 = A+, 92–96 = A, 88–91 = A-, etc.), as is the case in many medical education programs, the changes in students' grades would be even more pronounced. Thus, the choice of a mean versus a median statistic to calculate grades is a serious consideration for faculty and students alike.
Several interesting questions arise in light of the study's findings. First, which statistic is most appropriate? Although Royal and Guskey ^{[2]} note that there are many factors to consider (e.g., examination weighting schemas, use of cumulative examinations, etc.), for the present study a median might arguably be the more appropriate statistic. The reason is that most grade distributions were negatively skewed, and the median is less sensitive than the mean to extreme values in a skewed distribution. Given that the gradebook examined in this study is not atypical of many, if not most, medical education programs, in which many students perform very well (e.g., a large share of scores exceeding 80%), an argument could be made that a median may be the better statistic in most cases.
Another question is whether some examinations should be calculated with means and others with medians, perhaps depending on the distributional shape of the scores. Much as the decision to use parametric versus nonparametric statistics depends on the distributional shape of the data, a thoughtful educator may be tempted to use one statistic for some examinations and another statistic for others. In practice, however, this would probably be unwise, as inconsistent grading practices that are determined after data have been collected are much more difficult to defend than grading practices that are articulated a priori. For this reason, we encourage educators to choose either the mean or the median statistic (perhaps whichever appears most appropriate based on previous course data and other important considerations) and to apply this grading schema consistently throughout the duration of the course.
Conclusion   
The choice of a mean or a median statistic to calculate grades is an important consideration for all educators charged with the task of grading student performance. This decision can have a considerable bearing on how student performance is reported. In fact, in some instances, students may receive two entirely different grades (e.g., an “A” vs. a “C”) for a course depending on which statistic was used to calculate grades. We encourage other educators to examine the influence that mean versus median statistics may have on grade calculations and to select the statistic that will lead to the most valid indicator of student performance.
Financial support and sponsorship
Nil.
Conflicts of interest
Dr. Royal is the Editor-in-Chief of Education in the Health Professions. All peer-review activities relating to this manuscript were independently performed by other members of the editorial board.
References   
1.  Royal KD, Guskey TR. A case for differentiated grades. Med Sci Educ 2015;25:323-5.
2.  Royal KD, Guskey TR. Does mathematical precision ensure valid grades? What every veterinary medical educator should know. J Vet Med Educ 2015;42:242-4.
[Figure 1]
[Table 1]
