Preliminary investigation of instructor effects on gender gap in introductory physics

Gender differences in student learning in the introductory, calculus-based electricity and magnetism course were assessed by administering the Conceptual Survey of Electricity and Magnetism pre- and postcourse. As expected, male students outgained females in traditionally taught sections as well as sections that incorporated interactive engagement (IE) techniques. In two of the IE course sections, however, the gains of female students were comparable to those of male students. Classroom observations of the course sections involved were made over an extended period. In this paper, we characterize the observed instructor-student interactions using a framework from educational psychology referred to as wise schooling. Results suggest that instructor practices affect differential learning, and that wise schooling techniques may constitute an effective strategy for promoting gender equity in the physics classroom.


I. INTRODUCTION
Over the past several decades, physics education research (PER) has identified a deficit in the ability of traditional instruction to promote coherent conceptual understanding of topics in introductory physics. Even faculty recognized at their own institutions as outstanding teachers have found only modest gains on validated multiple choice instruments designed to measure conceptual understanding [1]. The PER community has responded vigorously, developing instructional approaches that have been shown to boost performance. These interventions share a common strategy of engaging students actively in their own learning, and have thus been referred to as interactive engagement (IE) techniques. Learning gains for IE pedagogies such as Tutorials in Introductory Physics (TIIP) [2] and Peer Instruction [3] have been replicated at a variety of institutions and by many different instructors [4][5][6][7]. The focus has been on the reproducibility of these gains, rather than possible instructor effects on variations in outcomes.
More recently, PER has studied differences in the learning of men and women in introductory physics courses. In mechanics, male students have repeatedly outperformed females on concept inventories administered at the start of the course [8,9]. Lorenzo and colleagues at Harvard University associated the amount of subsequent reduction in this gender gap with the extent of IE instruction in the course, and found that the posttest gender gap was eliminated in several fully IE courses taught by different instructors [10]. The investigators attributed these results to the use of specific IE teaching strategies, rather than instructor effects. Pollock and colleagues at the University of Colorado, however, were unable to replicate these findings, and instead found that differential achievement persisted through courses that involved IE techniques, and that instructor differences impacted performance, even within the same IE-based pedagogies [11]. The Colorado results, and others like them, suggest that reform pedagogies alone are not enough to reduce the gender gap.
Additional work at the University of Colorado has explored the effectiveness of self-affirmation in mitigating differential learning [12,13]. Students reflected in writing on self-defining values, a process postulated to reduce the identity threat experienced by women in introductory physics courses. Although the results were mixed, evidence does suggest that attending to social-psychological factors, and particularly to factors involving identity threat, may be effective in promoting equity.
The University of Colorado work highlights questions about the sensitivity of student learning to the types of interpersonal interactions that occur within a course. The manner in which an IE pedagogy is implemented may have significant impact. This paper reports on ongoing work at Western Washington University (WWU) and Clemson University to examine the extent to which instructor practices affect differences in performance between male and female students in introductory physics courses. A small, pilot study has been conducted at WWU to explore the following question: Within an IE classroom, what specific instructor practices are effective in reducing the gender gap?
At WWU, the Conceptual Survey of Electricity and Magnetism (CSEM) [14], a validated, 32-item multiple choice inventory, has been used as a measure of conceptual understanding of students taking the calculus-based electricity and magnetism (E&M) course. In multiple course sections, we have found that while IE instruction enhances the learning of all students, male students show higher normalized gains than females. Two sections of the course, however, presented what seemed to be an anomalous finding. In these sections, taught in different academic terms by the same instructor, the performance of males and females was comparable. In seeking an explanation for this result, we examined the classroom practices and IE implementation of this instructor. In this paper, these practices are analyzed using a theoretical framework from educational psychology. An overview of this framework is provided below; more detailed description of the specific elements of the framework follows in Sec. III.
Over the past several decades, educational psychology has examined the classroom experiences of students from underrepresented groups [15][16][17]. Low achievement has been explained in part by barriers related to a lack of identification with the academic domain or subdomain (e.g., physics). A set of recommendations for teacher practices, referred to by Steele as wise schooling, has emerged [18]. Wise schooling practices seek to foster domain belongingness, and are intended to support the learning of all students while mitigating performance gaps between dominant-group students and underrepresented students. The recommendations focus not on pedagogical strategies per se, but rather on ways of interacting with students. While many of the wise schooling recommendations were originally formulated with the experiences of black American students K-20 in mind, they are also relevant to the experiences of women in undergraduate science, technology, engineering, and mathematics education [19,20].
Many established IE practices from physics education research are aligned with the recommendations from educational psychology. For example, one of the IE elements identified in the Harvard University study, the use of interactive environments that enhance cooperation between students, is clearly related to the strategy of valuing multiple perspectives advocated by Steele. However, implementation of the recommendations seems to be independent of use of PER-based IE pedagogy. That is, it is possible to employ IE techniques in physics instruction without implementing wise schooling practices. Results of the present study suggest that implementing IE instruction together with wise schooling promotes the conceptual learning of all physics students, while reducing, or even eliminating, the gap in what male and female students learn.
Section II of this paper presents results from the administration of the CSEM in multiple sections of the introductory physics course. These data are used to establish the existence of a gender gap in learning gains in courses that use IE techniques and to identify selected IE sections in which this gap is absent. Section III describes the methodology used in conducting observations of classroom teaching. A theoretical framework from educational psychology is presented in parallel. This framework is used to link the specific instructor practices that were observed to the measured reduction of the gender gap. The framework includes a set of teacher practice recommendations intended to enhance learning by promoting domain belongingness. We discuss how these recommendations can account for the efficacy of the observed instructor practices.

A. Context
At WWU, the E&M course is the third of a three-quarter introductory physics sequence and is taken by students majoring in physics, chemistry, computer science, geology, and other sciences, engineering technology, and mathematics. The fraction of students who are female is typically about 30%. Lecture sections meet 4 hours per week and enroll about 60 students. Students from different lecture sections mix together in a required, 3-hour lab. Lab sections consist of 27 students and are taught by an undergraduate teaching assistant, usually a junior or senior physics major. Teaching assistants attend a required weekly preparation meeting led by a faculty member.
The lab curriculum consists of an initial guided inquiry portion and a culminating "synthesis challenge." The guided inquiry portions have in most cases been adapted from TIIP. Students work in groups of three, making qualitative predictions and then carrying out investigations with simple apparatus in order to check predictions and develop lines of thinking. Instructional sequences include questions designed to target specific difficulties that have been identified through research. After this guided inquiry, students collaborate on an open-ended task. These tasks, inspired by the context-rich problems developed at the University of Minnesota [21] as well as the challenge lab approach of Greer and Bierman [22], seek to cement students' conceptual understanding while securing student interest and generating excitement. The task is typically a design or measurement challenge, and students are offered a small amount of extra credit if their design is successful or their measured result is within a given precision on the first try.
The CSEM was administered on the first day of class and during the last week of class in 8 different lecture sections. These sections were taught by five different instructors from fall 2008 through spring 2010. The study includes only those students who completed both the pretest and posttest, resulting in an overall sample size of N = 380.
All 8 sections used the labs described above and thus can be characterized as involving partial IE instruction. Three of the sections involved traditional teaching methods in lecture, in which students were generally in a passive learning mode, and are referred to as "partial IE." These sections were taught by two different instructors, referred to below as instructors A and B. The remaining five sections utilized some combination of peer instruction, TIIP, and cooperative group problem solving. (The tutorials were administered as interactive lectures, with small group work punctuated by full class discussions.) These five sections were similar in that every class meeting, or nearly every meeting, required students to work out answers to tasks posed by the instructor, interact with their peers, and explain their thinking. While the fraction of time spent on this type of activity varied between sections, all five displayed elevated CSEM gains and are referred to as "full IE." These sections were taught by three different instructors, referred to below as instructors C, D, and E.
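The analysis in this paper is expressed in terms of normalized CSEM gain for students with matched pretest and posttest scores. As an illustration only, the following minimal Python sketch computes a per-student normalized gain using the standard definition (improvement divided by the maximum possible improvement) on the 32-item CSEM. The function names and the per-student form of the calculation are assumptions made for this sketch; the text does not state whether gains were computed per student or from section averages.

```python
CSEM_ITEMS = 32  # number of items on the CSEM, as stated in the Introduction


def normalized_gain(pre, post, max_score=CSEM_ITEMS):
    """Return (post - pre) / (max_score - pre), the fraction of the
    possible improvement that was realized.

    Returns None when the pretest score is already perfect, because the
    gain is undefined in that case.
    """
    if pre >= max_score:
        return None
    return (post - pre) / (max_score - pre)


def matched_gains(pre_scores, post_scores):
    """Keep only students who took both tests (hypothetical helper).

    pre_scores and post_scores map a student identifier to a raw score.
    Returns a dict mapping student identifier to normalized gain.
    """
    gains = {}
    for student_id, pre in pre_scores.items():
        if student_id in post_scores:
            gain = normalized_gain(pre, post_scores[student_id])
            if gain is not None:
                gains[student_id] = gain
    return gains
```

For example, a student who moves from 8 to 20 correct out of 32 has realized half of the available improvement. A student who took only the pretest is dropped, mirroring the matched-sample restriction described above.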

B. Analysis
An initial linear regression was performed using the data from all class sections (N = 380) to examine the effects of both gender and type of instruction. Letting x represent the type of instruction (x = 0 for partial IE and x = 1 for full IE) and y represent gender (y = 0 for female students and y = 1 for male students), it was found that both variables had a statistically significant relationship with the normalized CSEM gain, with p < 0.001 in each case. The regression equation was as follows: These findings are consistent with results reported in the physics education research literature: students receiving interactive engagement instruction tend to outperform traditionally taught students on measures of conceptual understanding, and men tend to outgain women.
While examining the differences in performance between male and female students broken down by class section, however, we noticed an anomaly. In one of the full IE sections, the gains of female students were not statistically different from those of male students, and in another full IE section female students outgained males. These results are summarized in Fig. 1. The two sections with anomalous results were taught by the same instructor, instructor E.
The linear regression was repeated, expanding the variable representing type of instruction to three values: x = 0 for partial IE instruction, x = 1 for full IE instruction with instructor C or D, and x = 2 for full IE instruction with instructor E. (The small sample size of the full IE group prevented a linear regression analysis of only those students.) Once again, the variables corresponding to type of instruction and gender both had statistically significant relationships with normalized CSEM gain, with p < 0.001 for each variable. The second linear regression analysis yielded $\langle g \rangle = 0.313 + 0.071x + 0.089y$. These results indicate that in addition to a reduced gender gap on the CSEM, students in the full IE course sections taught by instructor E posted somewhat higher normalized learning gains than students in other full IE sections. These findings prompted us to investigate instructor effects as an explanation for differential performance. Below we summarize the background and teaching experience of instructors A-E; Sec. III presents observations of classroom teaching.
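To make the structure of the regression model concrete, the following Python sketch fits normalized gain to the two coded predictors by ordinary least squares. It is an illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, and the toy data are synthetic points generated without noise from the coefficients reported above, so the fit simply recovers those coefficients. The sketch computes no p-values.

```python
def fit_ols(rows):
    """Least-squares fit of g = b0 + bx*x + by*y.

    rows is a list of (x, y, g) tuples, where x codes type of instruction
    and y codes gender as in the text. Returns (b0, bx, by).
    """
    # Accumulate the normal equations (A^T A) beta = A^T g for the
    # design columns [1, x, y].
    ata = [[0.0] * 3 for _ in range(3)]
    atg = [0.0] * 3
    for x, y, g in rows:
        col = (1.0, float(x), float(y))
        for i in range(3):
            atg[i] += col[i] * g
            for j in range(3):
                ata[i][j] += col[i] * col[j]

    # Solve the 3x3 system by Gaussian elimination with partial pivoting.
    m = [ata[i] + [atg[i]] for i in range(3)]
    for k in range(3):
        pivot = max(range(k, 3), key=lambda r: abs(m[r][k]))
        m[k], m[pivot] = m[pivot], m[k]
        for r in range(k + 1, 3):
            factor = m[r][k] / m[k][k]
            for c in range(k, 4):
                m[r][c] -= factor * m[k][c]

    # Back substitution.
    beta = [0.0] * 3
    for k in (2, 1, 0):
        rest = sum(m[k][c] * beta[c] for c in range(k + 1, 3))
        beta[k] = (m[k][3] - rest) / m[k][k]
    return tuple(beta)


def predicted_gain(beta, x, y):
    """Evaluate the fitted model for instruction code x and gender code y."""
    b0, bx, by = beta
    return b0 + bx * x + by * y


# Synthetic, noiseless toy data (NOT the study data): one point for each
# combination of instruction code (0, 1, 2) and gender code (0, 1).
toy_rows = [(x, y, 0.313 + 0.071 * x + 0.089 * y)
            for x in (0, 1, 2) for y in (0, 1)]
toy_beta = fit_ols(toy_rows)
```

In this coding, the coefficient of x is the change in mean normalized gain per step in the instruction variable, and the coefficient of y is the mean difference between male and female students at a fixed type of instruction. Note that treating x as a single numeric variable with values 0, 1, and 2 assumes equal steps between the three instruction categories.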

C. Additional context: Background and experience of instructors
Each of the five instructors holds a Ph.D. in physics, with instructors A-D completing doctoral work in traditional subfields of experimental physics or astronomy, and instructor E in PER. Instructor C, while not active in research in physics education, is an avid PER consumer, with substantial experience with research-based curricula and instructional strategies. Instructor C has, for example, served as a field tester for a nationally disseminated research-based introductory physics curriculum. Instructor D can be characterized by a modest level of PER consumption; for example, instructor D is familiar with implementation strategies for Peer Instruction. Instructors B, D, and E all attended the NSF-supported New Faculty Workshop [23], in which specific active engagement strategies and curricula were discussed extensively. Instructors A, C, D, and E all typically receive good end-of-term evaluations of teaching from students (i.e., ratings typically between the fourth and fifth levels on a 6-level scale of "very poor," "poor," "fair," "good," "excellent," and "outstanding"). Instructor B typically has somewhat higher ratings (between "excellent" and "outstanding"), and has received a college-wide award for excellence in teaching.
Years of teaching experience varied. Instructors A, D, and E each had between 5 and 10 years of experience as instructor of record in university physics courses, while instructor B had between 10 and 15 years, and instructor C more than 20 years. Instructor E, the PER-trained faculty member, had more than 5 additional years of experience using PER-based curricula as a graduate teaching assistant. Instructors A-D had much less teaching experience during their graduate training. Instructor characteristics for each of the 8 course sections involved in the study are summarized in Table I.

III. METHODS AND FINDINGS
The classroom practices of different instructors, including instructor E, were observed using methods from ethnography. A theoretical framework from educational psychology was employed to interpret the observations and account for the reduction of gender disparity in student learning in the anomalous full IE class sections. This section of the paper describes research methods and the theoretical framework and then discusses results.

A. Progression of the research
The ethnography consisted of field notes collected by the observer. (Video data were not collected in this study.) Field notes were initially collected for instructor E only, for a project unrelated to the present study. These original observations were conducted before CSEM posttest data were collected (and thus before student learning gains on the CSEM were analyzed by gender), and before the researchers were aware of the wise schooling framework. The gender gap was then identified in some of course sections 1-6, along with the anomalous lack of a gender gap in course sections 7-8 (those of instructor E), and instructor effects were considered as a possible explanation. This prompted exploration of the educational psychology literature, identification of wise schooling as a relevant theoretical framework, and use of that framework to analyze the field notes from the sections of instructor E. Finally, classroom observations were made and field notes collected for instructors A-D, in order to perform a comparative analysis using the wise schooling framework.
Features of this research progression that are important to note include the following: (i) the field notes of the classroom teaching of instructor E were collected before wise schooling had been identified as an analysis tool, whereas the field notes from instructors A-D were collected after the researchers were familiar with wise schooling, and (ii) in some cases, instructor teaching practices were observed in an academic term different from the term in which the CSEM data for that instructor were collected. (The observations were, however, made in the same course, introductory, calculus-based physics.) For these reasons, the study must be regarded not only as preliminary but also as retrospective in nature.

B. Observation methods and general findings
Classes taught by instructors A-E were observed over an extended period of time. Each instructor was observed in no fewer than 30 class meetings over a period of time not less than 9 months. Detailed written notes about instructor-student interactions were recorded in real time as the class was conducted. During observations of the course sections of instructors A-D, the observer was familiar with the wise schooling perspective and thus actively looked for the use of specific wise schooling practices. A rubric was employed to analyze the written field notes; the rubric elements, shown in the Appendix, were based on the specific wise schooling practices described below. Each class was evaluated on the same scale. While the two authors discussed the rubric together, and reached agreement on the coding of the field notes using the rubric, there was no check for reliability between observers in the collection of the field notes themselves. The field notes were the product of only a single observer. IE instruction inherently embodies some forms of wise schooling, such as the reduction of competition through cooperation. In the classroom observations, this was taken into account. That is, in the IE sections, instructor behavior was examined for additional wise schooling techniques beyond those inherent in IE instruction. In the classes with traditional lecture format, all forms of wise schooling were looked for, including those found in IE classes. If an instructor demonstrated at least three different wise schooling practices over a span of several class meetings, the teaching was deemed to employ wise schooling.
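The decision rule stated above (at least three different wise schooling practices over a span of several class meetings, discounting practices inherent in the IE format) can be sketched in Python as follows. This is only an illustration of the rule's logic, not the rubric in the Appendix. The practice labels, the function name, and the parameters are hypothetical; in particular, min_meetings=2 is an assumed stand-in for "several," which the text does not quantify, and the set of practices treated as inherent in IE is left to the caller.

```python
# Hypothetical labels for the five categories of wise schooling practice
# discussed in this paper.
WISE_PRACTICES = {
    "optimistic_relationships",
    "affirming_belongingness",
    "nonjudgmental_responsiveness",
    "valuing_multiple_perspectives",
    "expandability_of_knowledge",
}


def employs_wise_schooling(coded_meetings, inherent_in_format=frozenset(),
                           min_practices=3, min_meetings=2):
    """Apply the 'at least three practices over several meetings' rule.

    coded_meetings: one set of practice labels per observed class meeting,
        as coded from the field notes.
    inherent_in_format: labels to discount because the course format
        (e.g., IE) already embodies them; empty for traditional lecture.
    min_practices: number of distinct practices required (three in the text).
    min_meetings: assumed threshold standing in for "several" meetings.
    """
    distinct_practices = set()
    meetings_with_evidence = 0
    for practices in coded_meetings:
        counted = (set(practices) & WISE_PRACTICES) - set(inherent_in_format)
        if counted:
            meetings_with_evidence += 1
            distinct_practices |= counted
    return (len(distinct_practices) >= min_practices
            and meetings_with_evidence >= min_meetings)
```

Passing a nonempty inherent_in_format for IE sections reflects the distinction drawn above: in those sections only practices beyond the ones built into IE instruction count toward the threshold, whereas in traditional lecture sections every observed practice counts.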
Within the full IE classrooms (instructors C, D, and E), only one instructor, instructor E, was found to utilize wise schooling practices throughout the course. As described above, instructor E has a background in physics education research, instructor C has extensive familiarity with main results of PER, and instructor D has limited familiarity. Additionally, the number of years of teaching experience varied, with instructor C having the greatest. Instructors C and D were found to use only the wise schooling methods inherent within the IE structure, while instructor E utilized many wise schooling strategies within and outside the classroom.
In each IE classroom students engaged in small group work and had some opportunities for group-to-instructor contact. In addition, nonjudgmental responsiveness, a component of wise schooling, was a feature of each of the IE classrooms, with instructors offering examples and avoiding direct evaluation of student responses. Below we use the framework of wise schooling to describe additional practices that characterized instructor E's classroom in particular.

C. Perspective from educational psychology: Wise schooling
Wise schooling is a group of classroom practices meant to dissipate stereotype threat in the classroom. Stereotype threat is an implicit threat that exists when a negative stereotype about a group has the potential to become relevant to an individual member of that group. Stereotype threat can be a barrier or a threat to domain identification. In stereotype threat, domain-identified students experience risk of confirming the stereotype, and may thus be subject to increased internal pressure to succeed [24]. Educational psychology has studied underrepresented groups extensively and found that wise schooling can dispel stereotype threat in the classroom for both domain-identified and domain-unidentified students [25]. Wise schooling practices do not consist of content oriented teaching strategies, but rather, of intentional ways of interacting with students to foster domain belongingness. While these practices are intended to close performance gaps between nondominant and dominant student groups, the practices are likely to be of benefit for all students. Examples of wise schooling practices are described in some detail below.
The wise schooling recommendations cited in this paper are feasible to implement in introductory physics courses and have substantial overlap with IE techniques advocated by physics education research. We emphasize, however, that while the basic structure of IE pedagogy may be consistent with the recommendations, it is possible to utilize IE strategies without using all of the wise schooling recommendations.

D. Results
The recommendations described in the educational psychology literature and utilized at WWU fall into five basic categories. To support gender equity in the classroom, instructors can cultivate optimistic student-teacher relationships, affirm domain belongingness in women, practice nonjudgmental responsiveness, value multiple perspectives, and emphasize the expandability of knowledge.
Below we describe each recommendation as it is put forth in the educational psychology literature. In parallel, we provide vignettes of the teaching practices of instructor E, the instructor whose sections exhibited a reduced gender gap, to illustrate how the recommendations can be implemented in an IE physics class. The vignettes were taken from classroom observations that occurred in many classes over the span of a year.

Cultivating optimistic student-teacher relationships
Optimistic student-teacher relationships are characterized by efforts to instill confidence in the student that he or she can understand the material no matter what his or her background or grade in the class. Through this relationship, the instructor signals concern to each individual about how she or he is doing in the course. During small group work, and outside of class, instructor E coached students to go through the thinking for themselves, providing adequate "space" for students to do this regardless of whether they were currently low or high achievers in the course. This practice conveys confidence in student ability to master the material in a way that direct instruction (i.e., telling answers) may not.
It has been argued that, above all, students must feel valued by the professor in order to fulfill their potential [26]. Instructor E made learning students' names a priority more so than instructors C and D. Strategies included requiring students to display index card name tags during class and to identify themselves before asking or answering questions in full class discussions. Occasionally, "quizzes" were given in which an extra credit point was awarded to the class if the professor could identify each student by name. Learning names has been shown to signal to students that an instructor is personally invested in how well they do in the course.

Affirming domain belongingness in women
In physics, even women who highly identify with the subject can feel devalued within it. This may contribute to what is referred to as the "leaky pipeline": the progressive loss of women at higher rungs of the educational and professional ladder. (For example, a 2005 report from the American Institute of Physics indicates that women earned 22% of all physics bachelor's degrees but only 18% of doctoral degrees [27].) Affirming domain belongingness based on intellectual potential can aid domain-identified students by signaling to those students that they are a valued part of the physics culture. One way to affirm domain belongingness is to cultivate optimistic student-teacher relationships indiscriminately. This allows women in the course to feel valued and respected [28]. Enhanced domain belongingness combats stereotype threat and the alienation that it can foster. Coupling critical assessments with affirmation of one's intellectual potential motivates students to work harder in the classroom [29].
In our study, instructor E actively encouraged women (and men) with interest in physics to continue on in the subject and to consider declaring a major, regardless of their current performance in the course, while instructors C and D tended to offer less overt encouragement, unless students directly inquired about a physics degree. Affirming belongingness to the domain of physics challenges societal stereotypes by communicating to students that physics is a viable and accessible field for women. Role models are important and need not be gender matched; studies of graduate and undergraduate education have shown that students can feel encouraged and valued by professors and faculty of the opposite gender [30].

Nonjudgmental responsiveness
Responding to students without judgment creates space for them to interact with the material. Note that the importance of critical feedback is not minimized; rather, by focusing on understanding instead of right or wrong answers, the teacher allows students to establish a stronger connection to the material and to develop comfort in raising their own questions. As instructors expect students to explain how they know what they know, so too should the instructor focus on the substance of student thinking rather than the final output of an answer. According to Steele, "high standards, at least in the relative sense, should be an inherent part of teaching, and critical feedback should be given in the belief that the recipient can reach those standards" [31].
During class discussions and small group work, we observed that instructor E avoided immediately labeling student responses as incorrect. Instead, he often asked students to check their answers for consistency with prior knowledge. This strategy provided opportunities for the students to build self-efficacy by reasoning their way out of wrong answers. This approach was coupled with a tendency to refrain from immediate confirmation of correct answers, providing opportunities for students to connect more strongly to the domain by recognizing for themselves that they understand. While instructors C and D provided opportunities for students to offer their own answers to questions posed in class, they employed a "three-turn" pattern of interaction more often than instructor E. (This pattern consists of a teacher question, a student response, and a teacher confirmation, in the case of a correct student response, or correction, in the case of an incorrect response.) The three-turn pattern may reduce space for students to interact with the disciplinary content.

Valuing multiple perspectives
Like nonjudgmental responsiveness, valuing multiple perspectives works through allowing students to reason their own way to a correct answer. Wolfe and Spencer examine how the effects of stereotypes and prejudice can be reduced, concluding that "the classroom atmosphere needs to encourage and support different viewpoints and discussion from all students" [32]. Encouraging collaborative group work of the type supported by many IE pedagogies can increase interest, foster positive attitudes toward the domain, and cause students to value each other [33]. Within the group, a variety of different approaches may be used to reason through a problem. Supporting this variety erodes stereotypes by challenging the strict sense of how physics should work that many students bring to the course. Instructor E asked students to work in groups during nearly every class meeting. Respectful, nonjudgmental responses were modeled as described above. Portable whiteboards were sometimes employed so that student groups could present their ideas during full class discussions. These implicit messages that multiple perspectives were valued worked in concert with frequent, explicit reminders that the course expected students to learn through collaboration and consensus.

Emphasizing the expandability of knowledge
Some students view ability as expandable, leading them to regard challenges and mistakes as opportunities for learning, while others regard ability as inherent, and thus view the same challenges only as opportunities to confirm inherent intellectual capacity [34]. Such views of ability affect how productively students utilize formative feedback; furthermore, this mechanism has been shown to have greater negative impacts on the learning of minority students compared to those in the mainstream [35]. By emphasizing in class that capability is expandable, and connecting new ideas to what students already know, instructor E demonstrated to students that knowledge can be increased indefinitely. By administering ungraded "pretest" tasks at the beginning of a unit of study, and then providing opportunities for reflection on the same tasks at the conclusion of the unit, instructor E helped all students to recognize their own learning.
Instructor E encouraged students to attend office hours, during which he worked collaboratively with groups of students on challenging concepts and problems. Occasionally, instructor E held optional review sessions before exams. Instructors C and D had lower attendance at office hours and fewer review sessions.

IV. CONCLUSIONS
Cooperative group learning has been shown to produce an increase in positive attitudes towards school and identification with the subject matter at hand [36]. Additional research has established stereotype threat as a mechanism through which students in underrepresented groups, including women enrolled in college science courses, experience reduced learning gains compared with dominant-group students. These findings make plausible the use of wise schooling methods to reduce the observed gender gap in student learning in introductory physics courses.
The present study provides some evidence that wise schooling methods can, in fact, address the gender gap. As a retrospective, preliminary study, however, the results must be received with caution. A limitation of the methodology is the absence of checks for interrater reliability. Observation protocols were not strict or formalized; for example, although all class instructors were observed over an extended period, observation time was not controlled and did vary between class sections. Finally, some instructors were observed during the academic term in which the CSEM data were collected from their students, while other instructors were observed in a previous or subsequent term. These concerns notwithstanding, the findings suggest that the practices of individual instructors may contribute in a significant manner to the effectiveness of PER-based interactive engagement teaching strategies, and that the use of wise schooling techniques may result in a reduction of the gender gap in student learning. We feel that these results are promising and warrant further investigation.