Representing uncertainty on model analysis plots

Model analysis provides a mechanism for representing student learning as measured by standard multiple-choice surveys. The model plot contains information regarding both how likely students in a particular class are to choose the correct answer and how likely they are to choose an answer consistent with a well-documented conceptual model. Unfortunately, Bao's original presentation of the model plot did not include a way to represent uncertainty in these measurements. I present details of a method to add error bars to model plots by expanding the work of Sommer and Lindell. I also provide a template for generating model plots with error bars.


I. INTRODUCTION
Model analysis is a powerful tool for representing student learning in terms of both increases in the use of correct models and decreases in the use of incorrect models. Bao and Redish introduced model analysis as a complement to typical representations of learning gains that focus on student correctness [1]. The model plot simultaneously shows how much a class's use of the correct model increases and how much their use of a well-defined incorrect model decreases (or vice versa). Student use of these models is often measured by a multiple-choice survey, such as the Force Concept Inventory (FCI) [2] or the Force and Motion Conceptual Evaluation (FMCE) [3]. Smith, Wittmann, and Carter have used model analysis in conjunction with statistical analyses of students' normalized gains to compare the effects of various instructional strategies on student learning at several colleges and universities [4]. Rakkapao, Pengpan, Srikeaw, and Prasitpong also report on the benefits of using model analysis to represent the rich variety of data that come from comparing instructional methods, including cases in which student use of both the correct model and the common incorrect model increases [5].
Smith, Wittmann, and Carter introduced a method for adding error bars to a model plot to represent experimental uncertainty [4]. In this paper I refine this process and provide additional details about the methods and assumptions for generating error bars. I also provide templates for generating model plots that include error bars using either Mathematica or the R software environment.

II. DENSITY MATRICES AND THE MODEL PLOT
The main goal of model analysis is to use response frequencies to determine the probabilities of students in a particular class using each well-defined model. One step is to create a density matrix to represent a class's knowledge state at a given time [1]. The class density matrix $D$ is the sum of students' individual density matrices, normalized by the number of students $N$; each individual matrix is determined by the measured frequencies with which that student uses each of the models:

$$D = \frac{1}{N}\sum_{i=1}^{N}
\begin{pmatrix}
p_{1,i} & \sqrt{p_{1,i}\,p_{2,i}} & \cdots & \sqrt{p_{1,i}\,p_{n,i}} \\
\sqrt{p_{2,i}\,p_{1,i}} & p_{2,i} & \cdots & \sqrt{p_{2,i}\,p_{n,i}} \\
\vdots & \vdots & \ddots & \vdots \\
\sqrt{p_{n,i}\,p_{1,i}} & \sqrt{p_{n,i}\,p_{2,i}} & \cdots & p_{n,i}
\end{pmatrix}, \tag{1}$$

where $p_{j,i}$ is the probability that the $i$th student uses the $j$th model to answer a particular question. Typically model 1 is the correct Newtonian model (e.g., net force is proportional to acceleration), models 2 through $(n-1)$ are associated with well-documented incorrect models (e.g., net force is proportional to velocity), and model $n$ is the catch-all for any "other incorrect" responses.
The eigenvalues and eigenvectors of D are used to characterize the class's knowledge state [1]. The primary eigenvalue and the components of its associated eigenvector are used to create a single point on a model plot representing a class's probability of using each model at a given time. Figure 1 shows a sample model plot for a physical situation with two well-defined models [1,4].
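As a rough illustration of the construction described above, the following Python sketch builds a class density matrix and extracts a model-plot point. It assumes the Bao and Redish form of Ref. [1], in which each student's density matrix is the outer product of the square roots of that student's model probabilities and the class matrix is the average over students. The function names and the sample probabilities are hypothetical and are not taken from the supplemental templates.

```python
import numpy as np


def class_density_matrix(probs):
    """Average the per-student density matrices.

    probs: array-like of shape (N students, n models); row i holds the
    probabilities p_{j,i} that student i uses model j (each row sums to 1).
    ASSUMPTION: student i's matrix is the outer product u_i u_i^T with
    u_i = sqrt(p_i), following the form described in Ref. [1].
    """
    roots = np.sqrt(np.asarray(probs, dtype=float))
    # roots.T @ roots is the sum over students of u_i u_i^T.
    return roots.T @ roots / roots.shape[0]


def model_plot_point(density):
    """Return (x, y) from the primary eigenvalue and its eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(density)  # eigenvalues in ascending order
    sigma2 = eigvals[-1]                        # primary (largest) eigenvalue
    v = eigvecs[:, -1]                          # associated unit eigenvector
    x = sigma2 * v[1] ** 2                      # Model 2 (horizontal axis)
    y = sigma2 * v[0] ** 2                      # Model 1 (vertical axis)
    return x, y


# Hypothetical data: four students, three models (correct, incorrect, other).
sample_probs = [
    [0.8, 0.2, 0.0],
    [0.6, 0.2, 0.2],
    [1.0, 0.0, 0.0],
    [0.4, 0.6, 0.0],
]
D = class_density_matrix(sample_probs)
x, y = model_plot_point(D)
print(D)
print("x =", x, "y =", y)
```

Because the eigenvector components enter only as squares, the arbitrary overall sign that an eigensolver assigns to the eigenvector does not affect the plotted point.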

FIG. 1. Sample model plot, originally published in Ref. [4], with labeled Model 2 Region, Mixed Region, and Secondary Region. Here $\sigma_\mu^2$ is the $\mu$th eigenvalue of the class density matrix, and $v_{k,\mu}$ is the $k$th component of the $\mu$th eigenvector.

III. UNCERTAINTY AND THE NEED FOR ERROR BARS
One shortcoming of model analysis and the model plot is that statistical uncertainty is not represented in the results. Is data point "B" in Fig. 1 really in the Mixed Region, or could it be in the Model 2 Region? It is impossible to have confidence in the interpretation of a class's "state" without error bars on the model plot.

A. Uncertainty in eigenvalues
Sommer and Lindell recognized the omission of measures of statistical power and proposed a method for determining the uncertainty in the eigenvalues of the class density matrix [6]. Their method considers that the measured probability that a student uses a particular model has an associated uncertainty $\epsilon_{j,i}$ that may be positive or negative ($-1 \le \epsilon_{j,i} \le 1$). The real probability may be as high (or low) as $p_{j,i} + \epsilon_{j,i}$. This results in a single-student error matrix, $e_i$, given by Eq. (2). Unfortunately, the uncertainty of a single student choosing a particular model is typically not knowable from data sets of pre- and post-test surveys. Therefore, Sommer and Lindell assume that the error matrix for the class, $E$, will have the same form as that of Eq. (2), written in terms of the diagonal elements $D_{kk}$ of the class density matrix [Eq. (3)].
Given that the error in the measured probability could be positive or negative, each term in $E$ could also be either positive or negative. This information is used to generate a set of specific error matrices. Because the $(n \times n)$ density matrix is symmetric, the general error matrix is also symmetric, yielding $2^{n(n+1)/2}$ specific matrices with different combinations of positive and negative terms. By adding each of these specific error matrices to the class density matrix $D$ and computing the eigenvalues of each of the resulting adjusted density matrices, one can determine the upper and lower bounds for each of the eigenvalues [6]. We may now be confident that the actual eigenvalue falls within the range $[\sigma^2_{\mu,\min}, \sigma^2_{\mu,\max}]$.
While this is a step in the right direction, it falls short of providing a mechanism for representing statistical uncertainty within the model plot (the points on which depend on both eigenvalues and the associated eigenvectors). Moreover, this method requires an initial assumption of the values of the uncertainties, $\epsilon_i$, that are used to create the general error matrix.
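The enumeration of signed error matrices described in this subsection can be sketched in Python as follows. This is only an illustration under stated assumptions: the matrix of error magnitudes (`err_mag`) is supplied by the caller rather than computed from Eq. (3), and the function names and numerical values are hypothetical.

```python
import itertools

import numpy as np


def adjusted_matrices(density, err_mag):
    """Yield D + E_s for every sign pattern s on the upper triangle.

    err_mag holds the non-negative magnitude of each cell of the general
    error matrix. ASSUMPTION: it is provided by the caller; this sketch
    does not derive it from the class density matrix.
    """
    density = np.asarray(density, dtype=float)
    err_mag = np.asarray(err_mag, dtype=float)
    n = density.shape[0]
    idx = np.triu_indices(n)  # the n(n+1)/2 independent cells
    for signs in itertools.product((1.0, -1.0), repeat=len(idx[0])):
        err = np.zeros_like(density)
        err[idx] = np.asarray(signs) * err_mag[idx]
        # Mirror the upper triangle so the error matrix stays symmetric.
        err = err + err.T - np.diag(np.diag(err))
        yield density + err


def eigenvalue_bounds(density, err_mag):
    """Return (lower, upper) bounds for each eigenvalue, in ascending order."""
    spectra = np.array(
        [np.linalg.eigvalsh(m) for m in adjusted_matrices(density, err_mag)]
    )
    return spectra.min(axis=0), spectra.max(axis=0)


# Hypothetical 3-model class density matrix and a uniform error magnitude.
D = np.array([[0.70, 0.31, 0.09],
              [0.31, 0.25, 0.05],
              [0.09, 0.05, 0.05]])
E_mag = np.full((3, 3), 0.05)
lower, upper = eigenvalue_bounds(D, E_mag)
print("primary eigenvalue range:", lower[-1], upper[-1])
```

For $n = 3$ the loop visits $2^{6} = 64$ adjusted matrices; for $n = 4$ it visits $2^{10} = 1024$, so brute-force enumeration remains inexpensive for the cluster sizes discussed here.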

B. Creating error bars on the model plot
To create error bars on the model plot one must translate the uncertainty associated with the eigenvalue of the density matrix to an uncertainty in each dimension of the model plot. As shown in Fig. 1, the horizontal coordinate ($x$) corresponds with the probability of choosing Model 2 and is defined as the product of the primary eigenvalue $\sigma_\mu^2$ with the square of the second component of the associated eigenvector, $v_{2,\mu}^2$. Similarly, the vertical coordinate ($y$) is associated with Model 1 and the first component of the eigenvector:

$$x = \sigma_\mu^2\, v_{2,\mu}^2, \qquad y = \sigma_\mu^2\, v_{1,\mu}^2.$$

The uncertainty in each of these coordinates is determined by the uncertainty in the primary eigenvalue and the components of its associated eigenvector, but this relationship is not necessarily straightforward. Smith, Wittmann, and Carter assumed that the fractional uncertainty in the primary eigenvalue will be the same as the fractional uncertainty of each coordinate [4],

$$\frac{\Delta_x}{x} = \frac{\Delta_y}{y} = \frac{\Delta_\mu}{\sigma_\mu^2},$$

where $\Delta_\mu$ is defined by the upper and lower bounds of the eigenvalue, $\sigma^2_{\mu,\min}$ and $\sigma^2_{\mu,\max}$. The actual coordinates will fall within the ranges $x \pm \Delta_x$ and $y \pm \Delta_y$. However, this assumption causes the error bars to be proportional to the value of the coordinate (e.g., $\Delta_x = x\,\Delta_\mu/\sigma_\mu^2$), which may not accurately reflect the uncertainty in the class's model state. A less restrictive method is to determine the coordinates $(x_k, y_k)$ for each of the adjusted density matrices by calculating the eigenvalues and eigenvectors of each. The uncertainty represented by the error bars would then be $[x_{\min}, x_{\max}]$ and $[y_{\min}, y_{\max}]$. I provide an example in Sec. IV that shows the results of each assumption.
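The two ways of obtaining error bars described above can be compared with a short Python sketch. It is not the code from the supplemental templates. It assumes that $\Delta_\mu$ is half the spread between the largest and smallest primary eigenvalue of the adjusted matrices (the defining equation is not reproduced here, so this choice is a guess), and it takes the matrix of error magnitudes as a caller-supplied input. All names and numbers are hypothetical.

```python
import itertools

import numpy as np


def plot_point(matrix):
    """Return (x, y, sigma2) from the primary eigenpair of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    sigma2, v = vals[-1], vecs[:, -1]
    return sigma2 * v[1] ** 2, sigma2 * v[0] ** 2, sigma2


def error_bars(density, err_mag):
    """Compare fractional and coordinate-specific error bars."""
    density = np.asarray(density, dtype=float)
    err_mag = np.asarray(err_mag, dtype=float)
    rows, cols = np.triu_indices(density.shape[0])
    points = []
    for signs in itertools.product((1.0, -1.0), repeat=len(rows)):
        adj = density.copy()
        for s, j, k in zip(signs, rows, cols):
            adj[j, k] += s * err_mag[j, k]
            if j != k:
                adj[k, j] += s * err_mag[j, k]  # keep the matrix symmetric
        points.append(plot_point(adj))
    xs, ys, sig = np.array(points).T

    x, y, sigma2 = plot_point(density)
    # ASSUMPTION: Delta_mu is half the range of the primary eigenvalue.
    delta_mu = (sig.max() - sig.min()) / 2.0
    # Equal fractional uncertainty: Delta_x / x = Delta_y / y = Delta_mu / sigma2.
    fractional = (x * delta_mu / sigma2, y * delta_mu / sigma2)
    # Coordinate-specific: span of (x_k, y_k) over all adjusted matrices.
    specific = ((xs.min(), xs.max()), (ys.min(), ys.max()))
    return (x, y), fractional, specific


D = np.array([[0.70, 0.31, 0.09],
              [0.31, 0.25, 0.05],
              [0.09, 0.05, 0.05]])
E_mag = np.full((3, 3), 0.05)
point, fractional, specific = error_bars(D, E_mag)
print("point:", point)
print("fractional half-widths (dx, dy):", fractional)
print("coordinate-specific ranges:", specific)
```

The fractional half-widths scale with the coordinate values by construction, whereas the coordinate-specific ranges follow from how the eigenvector itself shifts under each perturbation.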

C. Choosing an initial estimate of uncertainty
Sommer and Lindell propose using a single uncertainty for simplicity ($\epsilon = \max\{\epsilon_1, \epsilon_2, \ldots, \epsilon_n\}$) but provide no straightforward method for determining an initial estimate [6]. There are several options for choosing an initial estimate of the uncertainty based on the pre- and post-test data. The choices I present are based on the standard error of a particular data set.
One of the simplest choices is to assume that the uncertainty will be the same for all models and will be the same for both pre- and post-test data. To accomplish this I use the pooled standard error, expressed in terms of the standard deviations of both the pre- and post-test data sets [Eq. (10)], where $N$ is the number of students and $s$ is the standard deviation of the number of correct answers for each data set [7]. For cases in which one uncertainty may not fit the data, one may choose to calculate the standard error for the pretest and post-test separately [Eqs. (11) and (12)]. Additionally, one may choose not to accept the assumption proposed by Sommer and Lindell that the uncertainty is the same for all models. Equations (10)-(12) may all be applied to the data sets of students using each of the models. In the following section I provide examples for a single value of uncertainty for all models pre- and post-instruction (the most restrictive assumption) and different uncertainties for all models (the least restrictive assumption).
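The standard-error estimates discussed in this subsection might be computed as in the Python sketch below. The separate estimates use the usual definition $s/\sqrt{N}$. The pooled estimate is an assumption: it combines the two standard errors in quadrature, which is one common convention and always gives a value at least as large as either individual estimate, but the expression actually used in the paper may differ. The scores are hypothetical per-student fractions of answers consistent with one model.

```python
from math import sqrt
from statistics import stdev


def standard_error(scores):
    """Standard error of the mean: sample standard deviation over sqrt(N)."""
    return stdev(scores) / sqrt(len(scores))


def pooled_standard_error(pre, post):
    """Pooled estimate for the pre- and post-test data together.

    ASSUMPTION: the two standard errors are combined in quadrature,
    sqrt(s_pre^2 / N_pre + s_post^2 / N_post).
    """
    return sqrt(stdev(pre) ** 2 / len(pre) + stdev(post) ** 2 / len(post))


# Hypothetical per-student fractions for one model, before and after instruction.
pre_scores = [0.2, 0.4, 0.1, 0.3, 0.5]
post_scores = [0.6, 0.9, 0.7, 0.8, 0.5]

print("pretest standard error:", standard_error(pre_scores))
print("post-test standard error:", standard_error(post_scores))
print("pooled standard error:", pooled_standard_error(pre_scores, post_scores))
```

To relax the assumption of a single uncertainty for all models, the same functions can be applied separately to the scores for each model.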

IV. EXAMPLE OF GENERATING ERROR BARS
I present an example to illustrate the process of creating error bars on the model plot and to examine the implications of each of the assumptions for choosing an initial error estimate. For the sake of brevity I present only pretest and post-test data for a single class in one year answering questions within a single question cluster [8]. In this question cluster there are two well-defined models (correct and common incorrect) and one "other" model. The pre- and post-test class density matrices each yield a point on the model plot; for the post-test data the coordinates are $x_{\mathrm{post}} = 0.045$ and $y_{\mathrm{post}} = 0.514$.
As seen in Fig. 2(a) this class starts in the Model 2 region and moves to the Model 1 region. In the following sections we examine the implications of each of our assumptions: equal fractional uncertainties vs. coordinate-specific uncertainties, and equal initial error estimates vs. model-specific errors.

A. Assuming equal fractional uncertainty and equal error estimates
As a starting point I assume that the fractional uncertainties for both coordinates are the same as that of the primary eigenvalue and that a single estimate of uncertainty will suffice for all models for both the pre- and post-test data [9]. The statistics for each data set are contained in Table I. The pooled standard error for the correct responses is $\epsilon = 0.0470$, which is higher than either of the individual standard errors for the pre- and post-test data. This provides general error matrices for the pre- and post-test data in which a "±" on each cell indicates that the cell could be positive or negative. As can be seen in Fig. 2(b), the point in the Model 2 region has a much larger uncertainty in the horizontal coordinate, and the point in the Model 1 region has a much larger uncertainty in the vertical coordinate. This is a direct result of the assumption of equal fractional uncertainties, for which the uncertainty in a coordinate is proportional to the value of that coordinate.

B. Assuming coordinate-specific uncertainty and model-specific error estimates
We now present the results of rejecting both the assumption of equal fractional uncertainties and the assumption of equal initial error estimates. Using the standard error for each model in each data set (see Table I), we obtain general error matrices for the pre- and post-test data. We use these to calculate the $x$ and $y$ coordinates for each of the $2^{n(n+1)/2}$ adjusted density matrices.
As shown in Fig. 2(c), this gives a much different representation of the uncertainty in the coordinates on the model plot. This provides a more accurate representation of the span of the model space each point could occupy, given that these uncertainties do not depend directly on the values of the coordinates themselves. These error bars also show that the uncertainty in the post-test data is greater than in the pretest data (see Table I).

V. SUMMARY
The assumptions involved in calculating error bars can have a dramatic effect on the interpretations reflected in the model plot. The sample data above show that using the standard error of the data yields error bars with a non-negligible extent on the model plot. This is most visible in Fig. 2(c), where the model-specific error estimates and coordinate-specific uncertainties provide macroscopic error bars in both coordinates for both points. These results support the notion that it is imperative to represent the uncertainty in the coordinates on the model plot in some fashion. Without error bars, the plot implies that the class model state is known precisely, which it is not.
In an effort to facilitate the use of this method I have included several template files in the supplemental materials that may be used to generate density matrices and model plots with error bars [10]. The Excel file (MA_FMCE_template.xltx) generates class density matrices from student responses to the FMCE [11]. This file may be modified in order to create density matrices for any other multiple-choice data with well-defined models. The text files include the commands for importing data from Excel, performing the necessary matrix calculations, and generating model plots with error bars using the open-source R software environment [12]. The file MA_3Model_templateR.txt assumes a question cluster with three models (two well-defined models and one "other incorrect," as is the case with the above example), and the file MA_4Model_templateR.txt assumes a question cluster with four models [13]. I have also included Mathematica template files (MA_3Model_template.nb and MA_4Model_template.nb) that perform the same functions as the R files but are not as flexible. All files include instructions for creating model plots with error bars starting from a class set of multiple-choice survey data.

ACKNOWLEDGMENTS
I thank Michael C. Wittmann for productive conversations about model analysis, and Ian Griffin, Nick Wright, and Ashley Smith for testing the analysis templates. I am also grateful to an anonymous reviewer on Ref. [4] for bringing my attention to the work of Sommer and Lindell. This research was partially supported by the Rowan University Seed Funding Program.