Initial study of neutral post-instruction responses on the Maryland Physics Expectation Survey

Epistemological studies generally focus on how students think about their construction of knowledge compared to how experts think about the same ideas. Instruments such as the MPEX and CLASS use a Likert scale to gauge whether students agree or disagree with the expert view. During analysis, five-point scale responses are typically reduced to favorable, neutral, and unfavorable, with neutral being treated as a nonresponse. What if students are actively selecting neutral and not treating it as a “does not apply”? To address this question, we chose to analyze the postinstruction neutral responses of students in our Physics I course using data from multiple years, multiple sections, and multiple instructors. We found that classroom average postinstruction neutral responses were consistently within a band of 15%–25% and that this was also consistent with other published results. It is not yet clear what this pattern means. Is this a measure of students receiving mixed messages from instructors or a measure of a transitional stage that students go through when learning how to be a good college physics student? These initial findings are interesting enough that we are presenting them here, with a more detailed question-by-question analysis to be published in the near future. For example, high levels of neutral responses to applied questions (e.g., “All I need to do is...”) may indicate that students are receiving mixed messages from instructors. On the other hand, high levels of neutral responses to conceptual questions (e.g., “Knowledge in physics...”) may indicate that students are in a transitional stage between novice and expert.


I. INTRODUCTION
There is a body of literature dedicated to the study of student beliefs about their role in the physics classroom and how such beliefs affect student engagement and achievement in the physics course [1-16]. These studies are instrumental in our understanding of how student expectations of learning may change during an introductory physics course, and they concentrate on student favorable responses. In general, student favorable responses go down after instruction, even in reformed courses in which students achieve relatively high gains in conceptual understanding. Instructors therefore cannot assume that a course will achieve epistemic gains based solely on the level of content engagement with the students. Yet, when physics courses are specifically designed to emphasize positive epistemological thinking, there are increases in favorable responses [6-8]. Instruction designed for preservice elementary teachers has also shown similar gains in favorable responses [9,10].
These measurements may be across multiple institutions, multiple instructors, multiple instructional pedagogies, multiple years, or multiple studies synthesized into a single source [1,3,4,7,11-15]. Success is measured by an increase in favorable responses, compared to experts, and classrooms are retooled if the favorable responses decrease. In the favorable versus unfavorable plot by Redish, Saul, and Steinberg [4], the preferred direction of pre-to-post instructional changes is up and to the left. Looking only at the favorable and unfavorable responses means that we are leaving out neutral responses, yet the postinstruction neutral responses might yield important information. It is unclear what students think when they give a neutral response: are they being indifferent, are they letting us know about mixed messaging from instructors, or are they unsure about how they feel about a question?
In the case of indifference, one might think that a student would go down a single column answering the same letter regardless of the question, so that there would be either no neutral responses or, conversely, all neutral responses. In the case of mixed messaging, students might see a strong emphasis on problem solving and connection making during classroom interactions with the instructor, but then be assessed using strongly conceptual and back-of-the-book questions that do not match the in-class conversations [16]. Finally, the neutral responses might be an indication of a transition stage in the way that students think about themselves as learners in the physics classroom. Moore suggested that students would transition from a dualist to a more relativist point of view during their time in college [17]. When students enter college, they have a dualist point of view in which there is one right answer and the professor is there to give it to them. By the time many students leave college, they have become more committed to their own role in making sense of the world, in a more relativist point of view. While students are transitioning, they may question their convictions, and the neutral responses might show this. If either mixed messaging or transitional thinking is going on, can the neutral responses indicate this? In this study, we have multiple-instructor data spanning three instructors, seven years, and nine sections of our introductory principles of physics course, as described below.
We have chosen to concentrate on the overall results of the Maryland Physics Expectation Survey (MPEX) [4] to see if there are any emergent phenomena. Our questions are as follows:
• What are the overall results by instructor on the MPEX?
• What ideas begin to emerge from including neutral responses in our analysis of the reduced data?

II. STUDY CONTEXT
This study began as preinstruction and postinstruction measures of student outcomes on the Force Concept Inventory (FCI) and MPEX [4,18] in our first-semester introductory physics course with calculus (Physics I). Initially, the instructor (instructor A) used the FCI and MPEX to gauge students' conceptual and expectation changes while implementing active engagement techniques. For continuity within this study, the MPEX and FCI were used in every year, even after the development of more widely used epistemological instruments [1,5].
Development of the Performance-Based Physics (PbP) classroom, based on the Student Centered Activities for Large Enrollment-University Programs (SCALE-UP) model from North Carolina State University [19], began with a pilot study using round conference tables and studio techniques in an existing classroom. The collection of observational data for a multiyear study was initiated at the beginning of the pilot study. The success of the pilot study led to a summer 2006 remodel of two classrooms and two storerooms into one 99-seat SCALE-UP size classroom, with courses being taught in the PbP classroom beginning Fall 2006. The PbP classroom is composed of eleven 2-meter-diameter round tables that allow up to three groups of three students at each table. Each group has a laptop computer and experimental equipment to perform activities and experiments integrated into three 1 h 50 min meeting sessions per week.
Our Physics I covers Newtonian mechanics for science majors, with typically 10%-15% of students in the course being nondeclared but science-leaning (Exploratory) students. Instructor A taught the course from 2005 to 2008. In Fall 2009 a new instructor (instructor B) took over but maintained the previous instructional materials. Small changes were made to the in-class activities (i.e., some context-rich problems were replaced with experiments), but the overall format of the instruction stayed the same (Table I). Ithaca College had an unexpectedly large freshman class in 2009 that led to even larger enrollment in Physics I. The course was broken into two sections in 2010-2011: one of majors and one of nonmajors. Majors were led by instructor B and nonmajors were led by instructor C in 2010 and 2011. In-class instructional materials and content coverage were the same, but mathematical remediation was higher for nonmajors.

III. INSTRUCTION
Instructional materials for the course were drawn from Peer Instruction (PI) and Cooperative Group Problem Solving [20-24]. Student grouping was done on day one using group and role organization from Cooperative Group Problem Solving. PI, using Mazur's implementation, was used on a daily basis as formative assessment for instructor and students [24]. Each Friday the class time was devoted to a Context Rich Problem (CRP). Implementation of CRPs used problems available from the University of Minnesota PER website [22]. We used these problems verbatim from the website early in the semester, but we found the need to make the questions increasingly challenging as the students became more confident and competent in later weeks. To increase the challenge and enhance an emphasis on assumption identification, selection, and estimation, we removed much of the extra information. Instructors supplemented class time with explicit conversations about how to make good assumptions and estimations. Student scores emphasized expectations of good assumptions and estimations (e.g., assuming that frictional forces can be ignored when the problem clearly states that a bicycle runs off the pavement and into sand is a poorly made assumption, and this was reflected in student scores). The decision to break the course into two sections was based on an overall increase in the enrollment of the course. Instructional materials were the same between both sections. Separating the course into two sections is a constraint on this study. Apart from the distribution of majors, preinstruction MPEX and FCI averages for the two sections were not significantly different at the 95% confidence level (ANOVA).

IV. RESULTS AND DISCUSSION
Data for all years were collected in the same manner: the MPEX and FCI were administered on the first day of class as preinstruction measures of understanding. The FCI and MPEX were given as parts of the final exam as postinstruction measures. Students were scored on the FCI for credit, but were instructed that there were opinion questions on the multiple choice and that participation credit would be given for answering as long as it appeared that the answers were given some thought (i.e., no participation credit was given to students who did not answer all of the questions, or who answered all of the same letter on the postinstruction survey). We chose this option for postinstruction for two reasons: (a) we wished to have as close to 100% participation as possible in the postinstruction measures, and (b) the study by Hake [25] indicated that there was no significant difference between the postinstruction FCI scores for instructors that offered the FCI for credit versus those that did not. We made the assumption that placing the MPEX on the final exam would not lead to a significant difference in responses compared to administration at some other time.
Participants were excluded if they did not attempt both pre- and postinstruction implementations, if they did not fully complete the pre- or postinstruction implementation, or if they answered all of the same letter on the pre- or postinstruction implementation (Table II). We chose to look at the MPEX results based on instructor due to smaller numbers and lack of convincing evidence that the demographics of each section were different.
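The exclusion rules above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used in the study: the record layout (a dictionary of answer lists with `None` for a skipped question), the function names, and the sample data are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical record layout: each student ID maps to a list of answers
# ("A".."E"), with None marking a skipped question.
Responses = Dict[str, List[Optional[str]]]


def is_usable(answers: Optional[List[Optional[str]]]) -> bool:
    """Return True if one response sheet passes the exclusion rules."""
    if not answers:                      # survey was not attempted
        return False
    if any(a is None for a in answers):  # at least one question left blank
        return False
    if len(set(answers)) == 1:           # same letter for every question
        return False
    return True


def matched_students(pre: Responses, post: Responses) -> List[str]:
    """IDs of students with usable pre- AND postinstruction sheets."""
    return sorted(
        s for s in pre
        if is_usable(pre.get(s)) and is_usable(post.get(s))
    )


# Made-up illustrative data, not taken from the study.
pre = {
    "s1": ["A", "B", "C", "D"],
    "s2": ["C", "C", "C", "C"],   # all one letter
    "s3": ["A", None, "B", "E"],  # incomplete
    "s4": ["B", "D", "A", "E"],
}
post = {
    "s1": ["B", "B", "D", "E"],
    "s2": ["A", "B", "C", "D"],
    "s3": ["A", "B", "C", "D"],
    # s4 did not take the postinstruction survey
}
kept = matched_students(pre, post)
```

In this toy data set only student `s1` survives all three rules, which is what a matched (paired) analysis requires.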
Question 1: What are the overall results by instructor on the MPEX?
Preinstruction average overall scores range from 50% to 65% favorable, as shown in Table II, consistent with national values previously reported [4,5]. Applying ANOVA to the preinstruction MPEX results indicated that the favorable results for instructor A were significantly higher. We do not have an explanation for why this is the case. We had assumed that a low participation rate created a possible selection effect, but the participation rates were not different. A chi-squared test for independence indicated that preinstruction favorable scores were not dependent upon the section the students were enrolled in. We thought that choice of major might be an indicator, since all of the physics majors enrolled in the course are freshmen and the lowest preinstruction favorable score was for the majors section. Again, a chi-squared test indicated that preinstruction favorable score and major were independent of each other. This indicates that there was not a fundamental difference in the students coming into the course and that some other hidden factor that we did not measure led to the difference.
Postinstruction favorable responses ranged from 53% to 68% for overall scores (Table II). The postinstruction favorable range reported in the literature is from 45% to 60% [4,5]. Paired t tests on postinstruction favorable responses did not indicate that the increases were significant (Fig. 1). Unfavorable responses for instructors A and B did not change, but the unfavorable responses for instructor C went up significantly, with an effect size of 0.70. ANOVA indicated that there is a difference by instructor on the postinstruction MPEX unfavorable scores. Chi-squared tests did not show any dependence of postinstruction unfavorable scores on major or section, but only on instructor.
For comparison, favorable responses in overall surveys of introductory physics courses have been reported as follows: (a) Perkins et al. ([1], Table I) reported that CLASS showed small percent gains of 1.0, 1.4, 1.5, and 1.5 in four courses, but relatively large losses in two others: 9.8% and 8.2%; (b) Redish, Saul, and Steinberg ([4], Table 4) reported that MPEX showed percentage losses for the six courses surveyed of 5, 2, 8, 1, 8, and 6; and (c) Adams et al. ([5], Table I) reported that CLASS showed a loss of 6% for a reform-oriented class. Our changes were small increases of 3%-6% that were not significant at the 95% confidence level. A detailed analysis of these results is forthcoming.
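For readers who want to reproduce this kind of pre/post comparison, the following is a minimal sketch of a paired t statistic and an effect size using only the Python standard library. It rests on assumptions: the text does not state which effect-size definition was used, so Cohen's d with the root-mean-square of the pre and post standard deviations is shown as one common choice, and the scores are made-up illustrative numbers rather than study data.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(pre: Sequence[float], post: Sequence[float]) -> Tuple[float, int]:
    """Paired t statistic and degrees of freedom for matched scores."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def cohens_d(pre: Sequence[float], post: Sequence[float]) -> float:
    """Cohen's d with an RMS-pooled standard deviation (an assumed choice)."""
    pooled = math.sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2)
    return (mean(post) - mean(pre)) / pooled


# Made-up unfavorable percentages for five matched students.
pre_scores = [20.0, 25.0, 30.0, 15.0, 22.0]
post_scores = [24.0, 31.0, 33.0, 20.0, 29.0]

t_stat, dof = paired_t(pre_scores, post_scores)
d = cohens_d(pre_scores, post_scores)
```

The resulting t statistic would then be compared against the critical value for the given degrees of freedom at the 95% confidence level, using a statistics table or package.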
Question 2: What ideas begin to emerge from including neutral responses in our analysis of the reduced data?
Analysis of postinstruction overall responses in our data led to an emerging pattern that we are currently trying to better understand. Looking at the outcomes of our multiyear study, we noticed that in each section the average favorable plus unfavorable postinstruction responses ranged between 75% and 85% total. On average, in each section we found 15%-25% neutral responses in the postinstruction results. ANOVA on the neutral responses indicated that there was no difference in the postinstruction means. However, there was a wide spread (20%-39%) in the preinstruction neutral responses, and ANOVA indicated that there were significant differences among the means. Searching the literature, we found that in work stating both favorable and unfavorable responses in their postinstruction MPEX scores, the pattern repeated to within 3% to 5% for all introductory physics courses regardless of instructional type. We found similar results for CLASS data with both favorable and unfavorable responses reported [4,7,13-15]. Figure 2 shows a favorable versus unfavorable Redish plot using the sources that we found, with lines indicating the 15% and 25% neutral bounds.
With four exceptions, all of the postinstruction scores were within the boundary created by the 15% and 25% bounds. A 28% neutral bound line would include all of our sources (Fig. 2).
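The band check described above amounts to simple arithmetic, sketched below. It assumes that the three reduced categories sum to 100%, so that the neutral percentage is what remains after the favorable and unfavorable percentages are subtracted, and that each bound appears on the favorable versus unfavorable plot as a straight line of constant neutral percentage. The function names and sample points are hypothetical and are not values from Table II or Fig. 2.

```python
def neutral_percent(favorable: float, unfavorable: float) -> float:
    """Neutral share, assuming the three categories sum to 100%."""
    return 100.0 - favorable - unfavorable


def bound_line(favorable: float, neutral: float) -> float:
    """Unfavorable value on the line of constant neutral percentage."""
    return 100.0 - neutral - favorable


def in_band(favorable: float, unfavorable: float,
            low: float = 15.0, high: float = 25.0) -> bool:
    """True if the point lies between the two neutral bound lines."""
    return low <= neutral_percent(favorable, unfavorable) <= high


# Made-up (favorable, unfavorable) class averages for illustration.
points = [(60.0, 20.0), (53.0, 17.0), (68.0, 17.0)]
flags = [in_band(f, u) for f, u in points]
```

Passing a different `high` value to `in_band` (for example 28.0) shows how widening the band changes which points fall inside it.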
Possible explanations for this pattern are that it is
• a measure of the indifference that students have toward the course by the time that the postinstruction implementation happens.
• a measure of the confusing aspect of messaging from instructors.
• a measure of a transitional stage that students go through when learning how to be a good college physics student.
During initial data reduction, we removed students that showed apparent indifference during instrument implementation. We reduced the data to be used in paired statistical analyses by removing any student that did not complete all questions on both pre- and postimplementations, or that answered only one possible letter (e.g., all A). In our estimation, students that chose only one letter in order to try to get participation credit were showing indifference to the instrument and the implementation. With these reduction choices, it is unlikely that the results are a measure of indifference.
The other two possibilities seem more plausible in our estimation. First, Hammer saw that there was a change in student expectations due to mixed messaging [16]. Many of the questions on the MPEX are behavior questions (e.g., "All I need to do to understand most of the basic ideas in this course is just read the text, work most of the problems, and/or pay close attention in class"). In the case of cooperative group problem solving, the students spend 2 h per week solving a CRP that attempts to connect the physics they are learning to the real world while connecting the individual concepts we have previously discussed. The interactions with the instructors and TAs are deep, and the students are given the impression that the deep connections between all of the concepts and problem-solving techniques are important. During the midterms and the finals, the questions might seem to be more related to the book steps and the exercises done in the classroom and less about the CRP-type conversations. This leads to a mixed message that would confuse the students when thinking about the behavior-type questions on the MPEX. In order to infer that this might be the case, a question-by-question study is needed. That is the larger goal of this work.
Second, the students might be going through a personal epistemological change. The work of Moore might lead us to an explanation of what we are seeing, namely that this is a transition phase that all students may go through [17]. In Moore's scheme, based on the work of Perry [26] and Belenky [27] and reinterpreted by Redish, Saul, and Steinberg [4], students enter with an absolutist or dualist view of knowledge and truth as it extends to physics. There are right answers and the professor is there to give them. Students take down the right answers and forget the wrong answers, regardless of whether the wrong answers make sense. There is a progression through a multiplicity or relativistic stage where all views have merit and thus students can do anything that they want because there really is no right answer. Finally, students enter into a stage where their personal role in constructing understanding is acknowledged and accepted. Both the MPEX and CLASS use high agreement between the experts as part of the choice for favorable responses [4,5]. However, we would expect that there should be a transitional stage where all students of physics or science might be questioning their role. The student will be unsure about views that he or she was confident about previously. This could lead to an increase in neutral responses and a decrease in both favorable and unfavorable responses. What we might be capturing is the beginning of the transitional stages that students undergo to become more like the experts that we would like them to be. This is another area that a detailed question-by-question study of the MPEX might help us to understand, if this is indeed something that can be captured.

V. CONCLUSIONS
Our study has recorded consistent postinstruction neutral MPEX scores regardless of the preinstruction scores of the student populations. Our current interpretation of our MPEX neutral scores is that this is either an effect of the mixed messages that students might be getting from in-class interactions versus assessment questions, or that we are capturing the transitional stage that students go through as they start to think more like expert physicists.
In order to gauge which of these scenarios is more plausible, we are undertaking a question-by-question study of the data. We have two new questions that arise from this new study:
• Are the student neutral responses random across the questions?
• Do the questions that have higher postinstruction neutral responses fall into either a conceptual category or an applied category?
If the neutral responses are random across the questions, then these results are interesting, but might not be a measure of anything. If the questions are in the applied category (e.g., "All I need to do is…"), we feel this would be an indication of mixed messaging from the instructors. If the questions are in the conceptual category (e.g., "Knowledge in physics…" or "Physical Laws…"), we feel that this might be capturing the transitional stage of the students. More than likely the correct answer lies somewhere in between.

TABLE I. Summary of changes made to the Physics I course during the seven years of study. Only changes to the instruction are noted. If no changes are stated, the previous year was followed exactly.

TABLE II. Average Pre-Post MPEX scores for three instructors. MPEX scores are given as FAV/UNFAV.