Not all interactive engagement is the same: Variations in physics professors' implementation of Peer Instruction

While educational reforms in introductory physics are becoming more widespread, how these reforms are implemented is less well understood. This paper examines the variation in faculty practices surrounding the implementation of educational reform in introductory physics courses. Through observations of classroom practice, we find that professors’ actual practices differ strikingly. We present a framework for describing and capturing instructional choices and resulting variations in enacted practices for faculty who are implementing Peer Instruction. Based on our observations, there are a variety of scientific practices that are supported and modeled in the use of Peer Instruction. In all of the classrooms studied, students were found trying out and applying new physical concepts and discussing physics with their peers. However, there were large discrepancies in students’ opportunities to engage in formulating and asking questions, evaluating the correctness and completeness of problem solutions, interacting with physicists, identifying themselves as sources of solutions, explanations, or answers, and communicating scientific ideas in a public arena. Case studies of six professors demonstrate how these variations in classroom practices, in aggregate, create different classroom norms, such as the relative emphasis on student sense-making vs answer-making during Peer Instruction.


Consider two representative vignettes of interactive engagement in a large-enrollment introductory physics course:
Professor Green's Classroom. Professor Green displays a ConcepTest 1 for his students to answer. The students discuss the question with their peers and enter their answers on individual handheld devices. 75% of the students answered correctly after the peer discussion. The professor requests an explanation from the students. One student contributes a clear and concise explanation. The professor paraphrases the student's explanation, agrees that it is the correct solution, and moves on.
Professor Red's Classroom. Setting up the clicker question in a similar way, Professor Red displays a ConcepTest; the students discuss it with their peers and the professor, and enter their answers individually. 75% of students answered correctly after peer discussion. The professor requests an explanation from the students and many students respond, each giving the reasoning behind their answers. After a student speaks, the professor repeats the student's idea so that the other students can hear it. Most of the contributing students comment or build on previous comments made by fellow students, arguing and debating with each other. After the students seem to have made a good amount of progress on their own, the professor displays his solution on a PowerPoint slide and walks through it quickly.
These two vignettes of classroom practice are easily recognizable as Peer Instruction, 1 one of the more widespread innovations in introductory college physics courses. 2 We note that both professors appear to be actively engaging their students; they employ the same tool (personal response systems), present the ConcepTests similarly, etc. Both professors demonstrate certain hallmarks of interactive engagement (IE), but we also note significant variation in their implementations. In Green's class, one student contributes a correct explanation and the professor moves on, while in Red's course more students are engaged in public discussion and debate about the reasoning process and the correctness and completeness of the ideas presented. Noting these and related differences, we seek to describe how variation in faculty practices impacts student learning. The present paper begins to address this question by developing a system for describing and measuring classroom practices that contribute to the construction of different classroom norms, [3][4][5][6][7] i.e., different roles and rules of using personal response systems and Peer Instruction in these classrooms. Through an investigation of six large-enrollment lecture courses that use personal response systems (or "clickers") as a primary tool for IE, we show how differences in instructors' practices can be delineated and measured. Then, we discuss how these pedagogical differences result in different roles for students and instructor as well as different rules for the use of clickers in the classroom. We find that variation in teacher practice results in disparate opportunities for students to practice conceptual reasoning, 1,8,9 skills at talking physics, 10,11 agency, [12][13][14] and scientific inquiry. 8,15,16

II. BACKGROUND

A. Prior research
8][19] Researchers have documented variations in teacher practice in the context of K-12 classrooms; however, little has been done to document educational practices in the context of higher education. What little work has been done has tended to document broad distinctions such as reformed or not reformed, interactive engagement or not interactive engagement, student centered or not student centered. While useful, these categorizations are too coarse-grained to establish research metrics for distinguishing specific implementations and to inform the specific instructional choices of educators. Research on curriculum and instruction implementation has highlighted the importance of, and the lack of attention to, the roles and role relationships necessitated by curricular change. 20 This research has also noted that these organizational aspects of curricular change are usually not evaluated and are left implicit in discussions of the curricular design, if addressed at all. 20 This paper provides a metric for documenting and discussing these organizational changes associated with the use of Peer Instruction. In this paper, we present a fine-grained set of characteristics of instructional practice that are both research based and theoretically grounded.
Research studies that have examined the adoption and use of pedagogical innovations in undergraduate physics have tended to examine either (1) student learning through pre-/post multiple-choice content assessments or (2) instructors' perspectives on the use of physics education research (PER)-based approaches in teaching through surveys or interviews. A well-cited example of this first type of research is the 1998 paper by Hake, 21 which showed consistently higher learning gains for courses where the professors reported using "interactive engagement methods" versus traditional lecture format classes. However, a closer look at these data reveals that there is still a significant variation in student learning gains within interactive engagement classrooms; the bottom decile of IE courses achieves average normalized learning gains 22 ranging from about 0.16 to 0.24 while the top decile of IE courses achieves average normalized learning gains ranging from about 0.60 to 0.64. 21 Hake hypothesizes that large variations in average student learning gains may be due to "course-to-course variations in the effectiveness of the pedagogy and/or implementation" [Ref. 21, p. 66]. Hake's study establishes that there is large variation in average student learning gains across courses, but leaves unanswered exactly what variations in pedagogical practices may exist between these classrooms that may affect student learning.
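For readers unfamiliar with the metric, the average normalized learning gain discussed above is conventionally computed from class-average pre- and post-test scores as the achieved gain divided by the maximum possible gain, (post - pre)/(100 - pre). The following is a minimal sketch of that calculation, assuming scores are expressed as percentages; the function name and the example numbers are hypothetical and are not data from any study cited here.

```python
def normalized_gain(pre_percent: float, post_percent: float) -> float:
    """Average normalized gain from class-average pre/post percentages.

    Computes (post - pre) / (100 - pre): the fraction of the possible
    improvement that the class actually achieved.
    """
    if not 0.0 <= pre_percent < 100.0:
        raise ValueError("pre_percent must be in [0, 100)")
    if not 0.0 <= post_percent <= 100.0:
        raise ValueError("post_percent must be in [0, 100]")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


# Hypothetical class averages, chosen only to illustrate the arithmetic.
example_gain = normalized_gain(pre_percent=40.0, post_percent=70.0)
print(f"normalized gain = {example_gain:.2f}")
```

Because the gain is scaled by the room left for improvement, two courses with different pretest averages can be compared on the same 0 to 1 scale, which is what makes the decile comparison above meaningful.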
Other researchers in physics education have documented variation among student learning gains for various implementations of the same curriculum. In a study of five university physics courses implementing Workshop Physics 23 across different institutions and different instructors, researchers at the University of Maryland found that average normalized learning gains for these courses ranged from 0.39 to 0.57 (Ref. 24) as measured by the Force Concept Inventory (FCI). 25 Similarly, researchers at the University of Colorado (CU) studied five introductory courses implementing the Tutorials in Introductory Physics 26 in recitations at the same institution and found a range of average normalized learning gains in different courses [from 0.45 to 0.64 (Ref. 27)] as measured by the Force and Motion Conceptual Evaluation (FMCE). 28 Such significant variation in student learning suggests possible differences in how these curricula are implemented, and calls for a characterization of faculty practices and a framework for measuring similarities and differences in instructors' implementation of interactive engagement techniques.
0][31][32][33] These research studies, drawing from evidence gathered through interviews or surveys, develop frameworks for examining professors' broad conceptions about teaching and learning, 30 reported classroom practices, 30 and beliefs and values about teaching and learning problem solving. 32 This work has contributed to a deeper understanding of how professors think about teaching and learning in higher education and has questioned the common assumption that professors' beliefs about teaching and learning sit in conflict with the principles of research-based educational reforms. 30,32 Furthermore, the work of Henderson and Dancy [29][30][31] found that although professors' conceptions about teaching and learning physics align with nontraditional or alternative educational ideas, many professors report that their practices did not always align with these conceptions due to situational constraints. That is, faculty practices are constrained more by structural considerations (such as expectations of content coverage, lack of instructor time, class size, or room layout) than by their beliefs about productive educational practices. 31
Building on and complementing these prior research programs, we investigate actual practices of professors within similar situational constraints who implement the same pedagogical technique. We focus on faculty use of Peer Instruction and develop a framework and measurement tool for describing the differences and similarities in how professors conduct the same instructional activity. This work compares six university faculty members implementing Peer Instruction in six different introductory physics courses, and documents similarity and variation in faculty practices through classroom observations, audio recordings of lecture classes, and interviews with these faculty members. These studies result in an observational protocol for documenting professors' implementation of Peer Instruction, measurements of observable aspects of faculty practice that can vary, often significantly, and a framework for documenting variation along 13 dimensions of practice. We also show how these differences in practices surrounding the implementation of PI impact the opportunities students have to engage in various scientific practices and create different classroom norms (cultures). Subsequent work will examine how these practices and norms are associated with student perceptions in these environments.

B. Studying an intervention: Peer Instruction
One of the primary interactive engagement techniques used in large-scale lectures at the University of Colorado at Boulder is Peer Instruction. 1 According to Mazur, who developed the method, Peer Instruction is a pedagogical approach in which the instructor stops lecture approximately every 10-15 min to pose a question to the students. These questions or ConcepTests are primarily multiple-choice, conceptual questions in which the possible answer options represent common student ideas. Mazur describes the Peer Instruction process as follows: 1,34
General format of ConcepTest
(1) Question posed
(2) Students given time to think
(3) Students record or report individual answers
(4) Neighboring students discuss their answers
(5) Students record or report revised answers
(6) Feedback to teacher: Tally of answers
(7) Explanation of the correct answer
By implementing this questioning process as described, Mazur presents compelling evidence that Peer Instruction methods improved his students' ability to complete both conceptual and traditional computational physics problems. 1 The Mazur research group has also investigated conceptual learning gains across 11 separate higher education institutions including 30 introductory physics courses in which Peer Instruction has been implemented. 35 According to this study, professors who implement Peer Instruction in this format achieve, on average, normalized learning gains of 0.39 ± 0.09 as measured by the FCI. 35 These results provide evidence that implementing this pedagogical technique can result in substantial student conceptual learning.
For the purposes of this paper, we have collapsed the questioning format above into a three-stage process which was found to be common across the implementations of Peer Instruction by the professors observed in this study. The stages we use are: the Clicker Question Set Up Stage (Mazur steps 1-2), the Clicker Question Response Stage (Mazur steps 3-5), and the Clicker Question Solution Discussion Stage (Mazur steps 6-7). These stages will be described in more detail in the data section.
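The collapse of Mazur's seven steps into three stages can be written down as a simple lookup, which may be convenient when coding observation data by stage. This is only an illustrative sketch: the step wording and stage names come from the text above, while the variable and function names are hypothetical.

```python
# Mazur's seven-step ConcepTest format, as listed in the text.
MAZUR_STEPS = {
    1: "Question posed",
    2: "Students given time to think",
    3: "Students record or report individual answers",
    4: "Neighboring students discuss their answers",
    5: "Students record or report revised answers",
    6: "Feedback to teacher: Tally of answers",
    7: "Explanation of the correct answer",
}

# The three stages used in this paper and the Mazur steps each one covers.
STAGES = {
    "Clicker Question Set Up": (1, 2),
    "Clicker Question Response": (3, 4, 5),
    "Clicker Question Solution Discussion": (6, 7),
}


def stage_for_step(step: int) -> str:
    """Return the name of the stage that contains the given Mazur step."""
    for stage, steps in STAGES.items():
        if step in steps:
            return stage
    raise ValueError(f"unknown Mazur step: {step}")


for number, description in MAZUR_STEPS.items():
    print(f"({number}) {description} -> {stage_for_step(number)}")
```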
Furthermore, the present study examines clicker questions (CQs) more broadly, rather than focusing solely on ConcepTests or conceptual questions. Within this terminology, Peer Instruction (PI) as described by Mazur is a particular subset of clicker use, and ConcepTests are a subset of CQs. 7][38] We seek to characterize faculty use of Peer Instruction, which requires a slight broadening of the definition of Peer Instruction based on results of faculty adaptation.

III. BRIDGING CLASSROOM NORMS AND OBSERVABLE PRACTICES
We take classrooms to be cultural systems which are constituted by norms of behavior that arise out of the repeated use of shared practices. 3,7 Instructors and students make choices (implicit or explicit), which, in collection, establish a microculture with specific norms and expectations of the participants. 3,6,7,39 These microcultures are important because they are tightly coupled to the new understandings about physics, the nature of learning, and the nature of physics that participants develop as part of the course. 7,39 In order to connect classroom norms to specific, pedagogically relevant, observable classroom practices, we describe two finer-grained scales of practice which, in combination and over time, make up the classroom microculture: observable characteristics of practice and dimensions of practice (DoP). Collections of observable characteristics of practice (such as the professor leaving the stage) make up DoPs (such as faculty-student collaboration), and combinations of DoPs make up classroom norms (such as a classroom highly valuing faculty-student collaboration). We begin by describing the DoPs. We then present observable characteristics of practice that were documented in six classrooms implementing PI and link these observable characteristics to the associated DoPs. In the data analysis section, we connect the observable characteristics and DoPs to classroom norms through two methods: summaries of stand-out collections of practices in six classrooms and case studies of a single PI episode in two different classrooms.
As we began our observations of several courses implementing Peer Instruction, we initially found it difficult to describe and measure the dramatic differences that we saw. We sought to create a framework to guide the documentation and discussion of these differences. We particularly focused on practices that we hypothesized would influence the following student learning outcomes: content knowledge, epistemological beliefs about science, 40,41 attitudes about learning science, [42][43][44][45] and scientific abilities. 16][52][53][54][55] We then began an iterative process of revising and modifying our DoPs on the basis of additional classroom observations, aspects of practice presented in the literature, and the utility of these dimensions in documenting the practices in the learning environments we were studying. Based on our collection of descriptive field notes [56][57][58] and a synthesis of literature on interactive engagement, we were able to frame and organize pedagogical considerations surrounding the implementation of Peer Instruction along 13 dimensions. This literature is described in more detail in Appendix B, Part 1 along with the detailed descriptions of the DoPs. These dimensions direct our classroom observations by guiding us to look at particular practices that have been demonstrated to impact student learning in other classroom contexts. In this way each DoP is based on broad educational research into student learning and our observations of PI practices in particular. These dimensions help us to focus on certain aspects of the classroom norms, obligations, and expectations about the participant roles and the rules of interaction [3][4][5][6][7] that surround the use of clickers.
We have organized these DoPs into two sets. The first set of dimensions involves how the professor communicates what the classroom participants will and will not be doing during this activity, that is, his or her expectations of students. These decisions are negotiable boundaries placed on the activity by the professor in advance of the students actually attempting to solve the problem. We call this first set of dimensions: defining the academic task. The second set of dimensions describes student-professor interactions during the academic task. The use of clickers is further defined and negotiated based on the nature of these interactions. Another set of dimensions could be developed to describe the details surrounding student-student interactions; [59][60][61][62][63] however, this set was beyond the scope of this study. The dimensions that we have identified are summarized in Table I. The dimensions described below are not completely independent, but rather form a set of overlapping considerations that instructors (and students) manage in the classroom while implementing Peer Instruction. These DoPs are designed to, on the one hand, link to observable choices and actions that faculty make, and, on the other hand, lead collectively to establishing norms and expectations in the classroom. The purpose of the DoPs is to help us link classroom norms to specific, pedagogically relevant, observable classroom practices and choices of faculty.

IV. DESCRIPTION OF SETTING AND METHODS
All courses observed in this study were large-enrollment introductory undergraduate physics courses with average lecture attendance ranging from 130 to 240 students. Pseudonyms have been chosen to assure the professors' anonymity: Yellow, Green, Blue, Purple, Red, and White. Five of the six courses studied were courses required for science-based degree programs. The other, Purple's course, was an elective course for nonscience majors. The lead instructors for these courses varied from tenured professors to temporary instructors. Two of the six instructors observed, Green and White, were novices with respect to the use of Peer Instruction and clickers in large-enrollment courses. Both also happened to be temporary instructors with no experience teaching large-enrollment courses. Three of the instructors observed, Blue, Red, and Purple, were active members of the physics education research group at the University of Colorado at Boulder. It is also important to mention that prior to this research study Blue mentored Yellow as he learned to use clickers and Peer Instruction in his own large-enrollment introductory physics courses. These course and professor attributes are summarized in Appendix A, Table A for reference.
Although all of these educators used the language of Peer Instruction to describe their practices, none of them implemented Peer Instruction exactly as described by Mazur. Each of these professors used an electronic classroom response system to collect and tally the students' votes. These systems do allow students to change their answers while the voting time is still open. The most notable variation between these professors' practices and Mazur's description is that none of the faculty observed in this study had an explicit "silent" phase of the CQ where the students came to an answer individually first. We observed significant student discussion in all classes. For most questions, we found that the noise in the room was very limited at the beginning of the CQ and then quickly rose. We hypothesize that students were spending some fraction of the CQ response stage thinking independently even though they were not asked to commit to an answer individually prior to peer discussion. In this way, the use of the term "Peer Instruction" by physics faculty and our use in this paper should be loosely interpreted.
Ethnographic research methods [56][57][58] were used for this study. Through extensive engagement of the researcher within a community, ethnographic research aims to build models based on both insider and outsider perspectives. The researcher's goal is to make explicit models of the implicit meaning systems of the community. These models are influenced by the insiders' interpretations of events, but because the researcher can bring both an outside perspective and a reflective approach to the system, the researcher is in a unique position to identify overarching patterns that can give sense to the patterns of the community. In this way, the researcher's interpretation of the values, unspoken in the practices of the community, is important and can help to make explicit cultural elements that are largely implicit to the participants. 64 Following from this tradition, the data sources for this study are both qualitative and quantitative, including: interviews with each professor, audio recordings of a subset of lecture periods, daily electronic records of CQs asked along with student responses, broad descriptive field notes, focused observational field notes, and student survey responses surrounding clicker use and classroom dynamics.
For the six courses that constitute the focus of this study, field notes of two distinct types were collected: descriptive narrative field notes and focused observation rubric data. The first type of field notes, collected primarily at the beginning of the semester, were broad narrative descriptions of the instructor's actions, interactions between students, and also interactions between the instructor and the students. 65,66 These preliminary field notes informed the creation of an observation rubric for better collecting aggregate patterns of interaction, with primary emphasis being placed on the instructional choices of the professor. See Appendix B, Part 2 for the observation rubric and Appendix B, Part 3 for a user's guide to accompany this rubric. Multiple researchers, two from the PER group at Colorado and one from another institution, used the rubric and its associated instructions and provided formative feedback on the instruction guide. In subsequent reliability studies conducted with an additional researcher, reliability of 80% or greater was achieved on all items, averaging 96% agreement overall. This observation rubric along with narrative field notes was designed to collect information relevant to the 13 DoPs.
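The text does not state how the reliability figures were computed. One common and simple measure is item-by-item percent agreement between two observers, sketched below under that assumption. The function name and the example codes are hypothetical and do not reproduce the study's rubric data.

```python
def percent_agreement(coder_a, coder_b):
    """Percent of rubric entries on which two observers recorded the same code.

    ASSUMPTION: simple percent agreement; the paper does not specify
    which reliability statistic was actually used.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("both observers must code the same number of entries")
    if not coder_a:
        raise ValueError("at least one coded entry is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes for one rubric item across five clicker questions.
observer_1 = ["yes", "no", "yes", "yes", "no"]
observer_2 = ["yes", "no", "no", "yes", "no"]
agreement = percent_agreement(observer_1, observer_2)
print(f"agreement = {agreement:.0f}%")
```

Percent agreement does not correct for agreement expected by chance, so a chance-corrected statistic may be preferable when a rubric item has only a few possible codes.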
The observation rubric was used as a descriptive and analytical tool in the observations of an additional 6-10 class periods that constituted at least 20% of the class periods for each course. Data from the observation rubrics were used to quantify and compare characteristics of instructor practices across courses. After comparing these aggregate data, the initial descriptive field notes were revisited along with the audiotapes of the class periods to infer how these variations in practices were contributing to the norms and expectations of these classroom communities.
At the end of the semester, semi-structured interviews were conducted with each of the instructors who participated in the study. 56,67 The topics discussed included differences between a traditional lecture and an interactive engagement lecture, descriptions of what an engaged student would be doing in an introductory physics lecture, descriptions of what the professor and their students do during a typical CQ, purposes of clicker use, and the importance of various student activities associated with CQs such as articulating their reasoning in the lecture. Additional information about each instructor's classroom practices was available for all courses through the daily electronic data collected by the clicker software program, and additional artifacts such as course syllabi were collected through the course web pages.

V. DATA
The 13 DoPs, presented in Table I, frame the data collection and case studies of six professors' practices. We begin by presenting these data organized by the chronological stages presented in the background section: Clicker Question Set Up, Clicker Question Response, and Clicker Question Solution Discussion. The Clicker Question Set Up stage includes the professor's introduction and framing of the clicker question for the students, in addition to decisions made prior to class, such as the types of clicker questions to be presented. The Clicker Question Response stage is the time that the students are given to construct their answers to the clicker question and communicate their responses to the professor through an electronic response system. This response stage includes both silent and group response times, if these are offered. The Clicker Question Solution Discussion stage is the whole-class explanation and solution discussion phase. It is the time that the class or professor spends constructing a public solution to the clicker question. We will revisit these data in the analysis section, triangulating multiple observable characteristics, 68 to describe classroom norms for each professor.

A. Clicker question set up stage
Some instructor decisions concerning the set up of the CQ occur prior to any class period. In order to understand on a broad scale the norms and expectations surrounding clicker use in these various classroom communities, we analyzed the explicit expectations laid out in the course syllabi. All courses gave reading assignments to their students, but Just-in-Time Teaching (JiTT) (Ref. 69) was not used in any of the courses; therefore, students were not directly held responsible for completing these assignments. Although professors were occasionally observed using Interactive Lecture Demonstrations (ILDs) 70 and/or Socratic questioning, in all courses studied PI was the vastly dominant interactive engagement technique. 71 The course syllabi outline for the students how clickers will be used and also how students will be evaluated for their participation. Four of the six professors in this study, White, Blue, Purple, and Red, had a mandatory fraction of students' course grade based on clicker points. In Blue's course, clickers accounted for 1% of the students' grade and clicker points were awarded based on correctness of the response. In Red's course, 15% of students' grade was based on clicker participation and CQs were rarely graded based on correctness (only for occasional reading quizzes). In White's and Purple's courses, clicker participation was 5% of the students' grade and the CQs were not graded based on correctness. In Purple's course, students could also be awarded extra credit clicker points which were graded based on correctness, and these extra credit points could replace up to 12% of students' total exam grade. In Yellow's and Green's courses, CQs were graded based on correctness and clicker points were only awarded as extra credit points to replace up to 10% of the student midterm exam grade. These grading policies give one perspective on the relative value of clicker participation and the emphasis on the correctness of the CQs.
Additionally, the role of clickers in the course is framed by the type of lecture resources made available for students. All of the courses involved in this study provided students with copies of CQs and CQ answers after the class period was over. Except for Yellow, all professors provided short explanations of the CQ solutions in the resources made available for the students. Three of the six professors, White, Purple, and Red, provided lecture notes with these CQs placed in sequence with the presentation of other relevant information. These lecture resources provide one piece of evidence about the degree to which CQs were embedded in the course and the degree to which the explanation to the CQ was emphasized.
Other messages concerning the role of clickers within the course are established implicitly through repeated patterns of classroom practice. Many additional patterns of practice will be discussed throughout the next two clicker implementation stages, but we will describe here some broad class-level patterns such as the average number of CQs per hour and the average fraction of students getting the CQs correct after peer discussion.
Using data collected through the clicker software over the entire semester, we were able to calculate the average number of CQs asked per hour of class, shown in Table II. From these data we find that Green asked the fewest questions per hour of class, averaging about 3.2 questions. Yellow, Blue, Purple, and Red asked a moderate number of questions per hour of class, ranging from about 5 to 6.5. White asked the largest number of CQs, averaging about 8.2 questions per hour of class.
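A rate of this kind can be obtained by dividing the total number of CQs logged over the semester by the total class time in hours. The sketch below illustrates the arithmetic only; the layout of the clicker log, the assumption that every class period has the same length, the function name, and the example counts are all hypothetical and not taken from the study.

```python
def questions_per_hour(questions_per_period, minutes_per_period):
    """Average number of clicker questions asked per hour of class.

    ASSUMPTION: every class period has the same length, and the log
    supplies one question count per class period.
    """
    if not questions_per_period:
        raise ValueError("at least one class period is required")
    if minutes_per_period <= 0:
        raise ValueError("minutes_per_period must be positive")
    total_questions = sum(questions_per_period)
    total_hours = len(questions_per_period) * minutes_per_period / 60.0
    return total_questions / total_hours


# Hypothetical log: CQs asked in each of four 50-minute class periods.
rate = questions_per_hour([3, 4, 2, 3], minutes_per_period=50)
print(f"{rate:.1f} CQs per hour of class")
```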
Similarly, we were able to calculate the average fraction of students getting the CQ correct. These data are summarized in Table III. Based on these calculations, we see that Blue has the highest average percent of correct student answers at about 76%, and White has the lowest percent correct at about 64%.
While the average fraction of correct responses does not vary dramatically, the distributions are found to be skewed toward higher percent correct and do vary by professor. This means that, for courses such as Blue's and Red's, half or more of the questions asked received 80% correct or greater. Another way of presenting these data is to look at the fraction of questions where at least 70% of students got the question correct. In Blue's class, at least 70% of students answer correctly most of the time (71% of CQs), while in White's class this does not occur most of the time (41% of CQs). This shows that students are more commonly successful at answering CQs correctly in Blue's class as compared to White's. The degree to which students are correctly responding to the CQs varies from course to course, as evidenced by both the average and median values of the fraction of correct student answers.
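The three summaries used in this comparison (the mean, the median, and the fraction of CQs on which at least 70% of students answered correctly) can all be computed from a list of per-question percent-correct values. The following is a minimal sketch of that computation; the function name, dictionary keys, and example values are hypothetical and are not the study's data.

```python
from statistics import mean, median


def summarize_percent_correct(percent_correct, threshold=70.0):
    """Summarize per-question percent-correct values for one course."""
    if not percent_correct:
        raise ValueError("at least one clicker question is required")
    at_or_above = sum(1 for p in percent_correct if p >= threshold)
    return {
        "mean": mean(percent_correct),
        "median": median(percent_correct),
        "fraction_at_or_above": at_or_above / len(percent_correct),
    }


# Hypothetical percent-correct values for five clicker questions.
hypothetical_course = [90.0, 85.0, 60.0, 75.0, 40.0]
summary = summarize_percent_correct(hypothetical_course)
print(summary)
```

In this made-up example the median lies above the mean, which is the signature of a distribution skewed toward high percent correct with a tail of harder questions, the pattern described in the text.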
During the clicker question set up stage, the instructor must also determine which types of CQs to ask. Through our analysis and observations we broadly distinguish two types of CQs: logistical questions and content questions. As described in more detail in Appendix B, Part 1, logistical questions were questions used to poll students' perspectives or opinions about the course. Green and White almost never (less than 2% of the time) asked logistical questions, while Yellow, Blue, Purple, and Red occasionally asked logistical questions (6-12% of the time). We also found that within the category of content questions, one of the professors observed, Red, was occasionally using CQs to give graded reading quizzes.
][74][75][76] We limit our analysis of question type to three coarse-grained categories: Recall, Algorithmic, and Conceptual Questions. Descriptions of these categories can be found in the descriptions of the DoPs in Appendix B, Part 1 and most closely resemble the categories used by Towns. 73 The results of a categorization of a random subset of CQs are shown in Table IV. These results show that all of the courses are primarily (65-85% of the time) asking conceptual questions. Additionally, Red, White, and Yellow were found to be asking a relatively higher fraction of recall questions compared to the other professors. It is interesting to note, however, that half of the recall questions (2 out of 4) asked by Red were reading quizzes.

Table footnote a: For these courses, the correct answers to the CQs needed to be entered manually by the researcher. Therefore, the correct answers were only entered for a subset of the CQs.
Another classroom practice surrounding clickers that was not observed to vary significantly from question to question was the role of student-student collaboration in answering the CQ. It appears that whether student talk was allowed during the CQs was set up early in the semester and not negotiated by the professor and the students on a question-by-question basis. In all of the classrooms observed, students were most often allowed and encouraged to discuss with their peers in constructing their CQ answer. In most cases, students were observed to spend a small amount of time at the beginning of the CQ quietly thinking before consulting with their peers. However, students were not expected to commit to an answer individually by voting, as described in Mazur's original descriptions of Peer Instruction [1]. Red was the only professor who occasionally asked questions that were intended to be answered individually, and these were reading quiz questions. Out of the 38 CQs observed, Red asked that the CQ be answered individually only twice. The data from this stage are summarized in Appendix A, Table B.
To summarize, all professors are choosing similar types of CQs (Dimension of Practice No. 2), focusing primarily on conceptual knowledge in their question statements. Due to the nature of the questions being posed in class, students in all these courses are given the opportunity to try out and apply new physics concepts on their own. Another important similarity across these courses is that student discussion is allowed, encouraged, and does occur in all of them. In this way, the level of student-student collaboration (DoP 4) is comparable across these courses, and students are given similar opportunities to practice discussing physics with their peers. We also note that although CQs are integrated into the lecture of each of these classes by asking questions throughout the evolution of the lecture, the extent to which CQs are integrated with the rest of the course and the course assessment varies from course to course (DoP 1). For example, CQs have a different role in Yellow's course as compared to Red's because in Yellow's course CQ solution explanations are not provided or embedded into the online lecture notes, and there is no mandatory fraction of the students' grade that is dependent on their CQ participation or responses. In Red's class, however, CQs and CQ solution explanations are embedded into the online resources, and a large mandatory fraction of the students' grade is based on their clicker participation, which helps to place greater emphasis on the clicker activity.

B. Clicker question response stage
During the time interval where students were constructing an answer to a CQ, what did the professors do? Based on our observations, individual physics professors spent this time differently. One significant difference was the extent to which the professor left the stage area of the lecture hall and walked around the classroom among the students. The first column of data in Fig. 1 shows the fraction of observed CQs where the professor left the stage area of the classroom. The professors also varied in how they interacted with the students during this CQ response time. The fraction of the observed CQs where the professor answered students' questions (where the students initiated a question) or discussed with the students (initiated by either instructor or student) varied, as shown in columns two and three of Fig. 1, respectively.
From Fig. 1, we see that Yellow, Green, and Blue almost never leave the stage (less than 15% of the time). Based on our observations, Yellow and Green not only chose to stay in the stage area, they chose to have very limited interactions with students during the response time. Each of these professors would specifically distance himself or herself from the activity the students were engaged in. Green would stand in a far back corner of the stage, removing himself as an object of focus during that time, but also distancing himself from the students. Yellow often spent this time organizing his lecture materials and overhead projector sheets for the explanation period. He would also pace around the podium area as he monitored the clicker votes coming in. Although Blue did not often leave the stage (10% of the time), he occasionally engaged in listening to students in the first few rows of the class.
On the other hand, we see that Purple, Red, and White usually leave the stage (more than 50% of the time). These professors would walk among different areas of the classroom during questions, sometimes as far as the furthest back rows of the lecture hall. These professors would sometimes listen to student conversations without participating, and at other times discuss with students or answer student questions. We also see that Purple and Red are much more likely to be answering student questions during the response time, while Yellow, Green, Blue, and White are engaged in answering student questions during the CQ response time for less than a quarter of all CQs. Similarly, Yellow, Green, and Blue rarely (less than 20% of the time) discuss with students during the CQ response time. We also note that Red discusses with students very often, approximately 80% of the time. Purple and White discuss with students moderately often, approximately 40-50% of the time.
Another important task of the professor during the CQ response time is to determine the length of time that students should be given to complete the CQ. The length of time students are given to respond provides additional evidence concerning the type of tasks the students are engaged in and also the amount of conversation between students that is expected. We first compare the average length of time given for students to respond to CQs, as shown in Fig. 2. From this comparison, we see that in general Green and Red give their students the most time to respond to the CQ, averaging about two and a half minutes. We see that Yellow, Blue, and Purple fall just below this upper group, on average giving their students just over two minutes to complete the CQs. Finally, we see that White's students are given the least amount of time to respond, averaging approximately one and a half minutes.
By looking at the distribution of clicker timing data, we can begin to describe further differences between these professors' practices. The time elapsed for each question, as captured by the clicker software, is binned into 30 s intervals and shown in Fig. 3 for two professors, Red and Yellow. These professors were chosen to represent one professor with a relatively high average time and one professor with a moderate average time.
From the comparison of these distributions, we see that while Yellow's data appear as a Gaussian skewed toward lower time intervals, Red's timing data appear fairly flat over the time range of half a minute to four and a half minutes. These variations in distributions inspired us to compare the standard deviations of the timing data by course. We found that Yellow and Green had the smallest standard deviations, 59 ± 3 and 61 ± 4 s, respectively. White's standard deviation was just greater than those of Yellow and Green at 69 ± 3 s. The three highest standard deviations were those of Blue, Red, and Purple at 89 ± 5, 103 ± 6, and 116 ± 7 s, respectively [77]. These data suggest that the tasks that Blue, Red, and Purple ask their students to complete during a CQ are more varied than in the classrooms of Yellow, Green, and White. The data from this stage are summarized in Appendix A, Table C.
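The timing analysis described above (binning response times into 30 s intervals and comparing standard deviations) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in this study: the function names `bin_times` and `spread` and the example times are hypothetical, and the sketch assumes the sample (n - 1) standard deviation, which the text does not specify. It does not attempt to reproduce the quoted uncertainties.

```python
from collections import Counter
from statistics import stdev


def bin_times(times_s, width_s=30):
    """Count response times per bin of width `width_s` seconds.

    Keys are the lower edge of each bin in seconds, so 95 s and 100 s
    both fall in the bin labeled 90 (covering 90 s up to, not including, 120 s).
    """
    counts = Counter((int(t) // width_s) * width_s for t in times_s)
    return dict(sorted(counts.items()))


def spread(times_s):
    """Sample standard deviation of response times, in seconds (assumed n - 1)."""
    return stdev(times_s)


# Illustrative response times in seconds (hypothetical, not data from the paper).
times = [45, 70, 95, 100, 130, 150, 240]
histogram = bin_times(times)
```

A wider spread for one course than another would then show up both as a flatter histogram and as a larger value returned by `spread`.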
In summary, we see that professors are creating or reducing spatial boundaries (DoP 6) between themselves and students to varying degrees. For example, Blue leaves the stage only 10% of the time, while Purple leaves the stage almost 80% of the time. This simple act can open new possibilities for students to interact with physicists. We can also see that faculty-student collaboration (DoP 8) varies across these courses. In one course, Blue answers student questions during CQs only about 20% of the time and discusses with students only 15% of the time. This contrasts with Red's course, for example, where student questions are answered 60% of the time and faculty-student small group discussions occur approximately 80% of the time. Students in these classrooms are given different opportunities to practice formulating and asking questions. Similarly, opportunities for the instructor to model discussion and justification practices vary depending on the frequency and type of faculty-student collaboration.

C. Clicker question solution discussion stage
After the students had finished responding to the CQ, how did the professor conduct the explanation phase of Peer Instruction? As a preliminary metric, we used data collected from the observation rubric and the audio files to calculate the average amount of class time spent discussing the CQ solution (see Table V).
From these data we can see that White spends the least amount of time discussing the CQ solution, averaging about 1 min and 10 s. Blue spends about two and a half minutes discussing the solution, while Yellow, Green, Purple, and Red all spend over three minutes. Overall, the time spent discussing the CQ solution varies by as much as a factor of 3.
We have also identified two characteristics of this discussion that vary across professors: whether incorrect CQ answers were addressed and whether students actively contributed to the explanation of the CQ solution. The first column of data in Fig. 4 shows the fraction of observed CQs where incorrect answers were discussed during the description of the solution. The second column of data in Fig. 4 shows the fraction of the observed CQs where student explanation(s) of the CQ solution were heard during the whole class solution discussion. From Fig. 4 we can see that Purple and Red usually discuss incorrect CQ answers and that they do so more frequently than the rest of the educators observed.
Although Green and Blue discuss incorrect options about the same fraction of the time, in Green's class these occasional explanations of the incorrect options originated from the students, while in Blue's class they originated from the professor's explanation of common student difficulties. In Purple's and Red's courses, the discussion of incorrect answer options commonly originated from both the students and the professor. The second column of data in Fig. 4 also shows that Green always uses student explanations when explaining the CQ solution. Purple and Red usually use student explanations in the construction of the CQ solution, while Yellow, Blue, and White rarely (less than 20% of the time) use student explanations.
The number of student explanations usually heard in class for a given question also fluctuated from course to course. When students were asked to contribute explanations of the CQ solution, Yellow on average hears from 2.2 ± 0.2 students, Green: 1.4 ± 0.1, Blue: 1.3 ± 0.3, Purple: 2.3 ± 0.5, and Red: 2.4 ± 0.4. White was not included in this analysis since student explanations were only used once. We see that Blue and Green primarily hear from only one student concerning the correct answer when student explanations are used, while Yellow, Purple, and Red usually hear from at least two students. This is characteristic of practice in which Yellow's, Purple's, and Red's classrooms place more emphasis on discussing the incorrect answers and associated student reasoning.
In order to get a further sense of how often students offer their explanations in each course, we calculated the average number of student explanations that are heard in every hour of class. We find that students most frequently speak publicly in the classrooms of Purple, Green, and Red, which heard from an average of 4.2 ± 0.5, 4.6 ± 0.6, and 4.8 ± 1.3 students per hour, respectively. Professor Yellow hears a moderate number of student explanations at 2.4 ± 0.6 students per hour. Professors Blue and White hear from a relatively low number of students per hour at 0.6 ± 0.4 and 0.1 ± 0.1, respectively. The data from this stage are summarized in Appendix A, Table D.
We find that faculty-student collaboration (DoP 8) is even more varied now that we have examined the CQ solution discussion stage. For example, in Green's class students are always contributing their own descriptions of the CQ solution, while in Blue's class student explanations of the solutions are heard only 10% of the time. Similarly, we see that the use of student voice (DoP 10) varies from course to course. Students in these courses are given different opportunities to practice communicating in public. Even when student explanations are heard, the number of different students heard varies. We also see that Purple's, Green's, and Red's classes average the largest number of students sharing (~4.5), while in Blue's and White's classrooms fewer than one student contributes a solution publicly in each class period. Students are given different opportunities to practice identifying themselves as sources of solutions, explanations, or answers.

A. Summaries across Dimensions of Practice
Observable characteristics of practice vary from professor to professor, and we can see how different combinations of these characteristics over time, in aggregate, create different classroom norms and expectations surrounding the implementation of Peer Instruction. We use both the characteristics of practice described in the data section and the DoPs, which frame the observations, to summarize the standout classroom norms surrounding clicker use. These descriptions are a narrative summary of the data presented in the previous section with augmentations from relevant field notes. We describe a few examples of how these collections of DoPs combine to create classroom norms; however, these are not an exhaustive examination of norms.
We are particularly interested to investigate if student-professor interactions are modified from traditional patterns through the use of Peer Instruction. In these traditional interactions between the professor and the students, the educator initiates (I) an exchange, usually by asking the students a question, then the student(s) respond (R), and then the educator gives evaluation (E) or feedback (F) based on the responses. We note that within these traditional patterns of interaction, the educator holds all of the responsibility for deciding which questions will be asked and also for determining the correctness and completeness of the responses provided [81]. Within this pattern of interaction, students are not given opportunities to practice formulating and asking questions, evaluating the correctness and completeness of problem solutions, or identifying themselves as sources of solutions, explanations, or answers. Different educational environments break out of this traditional IRE pattern to varying degrees and therefore provide students with different opportunities to engage in these scientific practices.
While we emphasize characterizations that distinguish faculty practice, it is worth noting that each of these courses is considered a successful educational environment, engaging students and leading to student learning. All professors in this study asked at least a few CQs interspersed throughout the lecture (see Dimension of Practice 1 and data from Table II). All of the professors in this study asked conceptual questions the majority of the time (DoP. 2; Table IV). All of the professors allowed and encouraged student-student collaboration during the CQ response stage, and the students exhibited significant discussion in all courses (DoP. 4). All of these uses of Peer Instruction resulted in new opportunities for student-student interactions and increased emphasis on conceptual understanding when compared to a traditional lecture. We also note that these instructors may be learning about the use of these tools, so these summaries of their practices represent a snapshot in time.
Blue. During both the CQ response time and the CQ solution discussion, there were few interactions between the faculty and the students (DoP. 6, 8, and 10; Figs. 1 and 4). In this way clickers were primarily used in an IRE format, and responsibility for determining the completeness and correctness of the solutions was not shared with the students. It is interesting that although this professor understood common student ideas that may lead to incorrect answer options, these common missteps were not usually discussed explicitly with the students. When the professor discussed the correct solution, the professor did build on students' prior knowledge in his explanation as well as emphasize the concepts and big ideas that were being used in the solution (DoP. 9).
Yellow. There are many similarities between the standout clicker practices of Yellow's and Blue's classrooms, as may be expected because Blue mentored Yellow in teaching a prior version of this course. It was the norm for the professor to have limited interactions with the students, and the students had a limited role in the construction and evaluation of the public solution (DoP. 6, 8, and 10; Figs. 1 and 4). The professor's explanations were usually quite detailed, consisted of asking students questions at key substeps, and clearly illustrated the big conceptual steps on an overhead projector. As in Blue's classroom, clickers were used to foster new student-student interactions, but not new opportunities for student-professor collaboration. Clickers were primarily used in a traditional IRE/IRF interaction sequence.
Green. Although Green's classroom looked similar to those of Yellow and Blue during the CQ voting, Green's classroom used more student voice during the CQ solution discussion. During the CQ solution discussion, the professor usually requested student explanations (DoP. 7, 8, and 10; Fig. 4). However, many times the professor would state that "the majority of people chose 'a'" and then ask if "someone could give a quick motivation for response a" without discussing incorrect answer options (Fig. 4). In this way, the professor usually gave the correct answer away early in the public discussion and only heard from one student, who usually contributed a fairly complete and correct explanation (DoP. 10 and 11). Therefore, clickers were used to somewhat modify faculty-student interactions beyond the traditional IRE format; however, students were not given more responsibility for determining the correctness or completeness of the solution description. In this way, faculty-student collaboration was not significantly modified.
White. The CQ voting time looked very different in this course. White would usually walk around among the students (DoP. 6; Fig. 1), sometimes answering student questions, and occasionally discussing with the students (DoP. 7 and 8; Fig. 1). However, these interactions with students were usually brief (DoP. 8 and 10; Fig. 2). The professor almost never requested student explanations during the solution discussion (DoP. 7 and 8; Fig. 4). Students therefore had very little voice or role in the final construction of the answer solution (DoP. 10). Notably, White had the lowest percent of students answering correctly and the shortest time for response and solution description (Table III, Fig. 2, and Table V). In this way, clickers in this class were primarily used for quick check-for-understanding questions (DoP. 13) in which the professor was attempting to gauge whether students had sufficiently mastered a topic, but clicker use was not treated as a significant opportunity to involve students in sense-making (DoP. 12 and 13).
Red. During the introduction of the CQ, the professor would usually explicitly state to the students, "I am interested in your reasoning" or "I'm going to ask for you to explain your answer" (DoP. 3). While the students were responding to the CQ, the professor would wander around the room, answering questions or discussing with students by asking groups of students, "What do you guys think?" or "How are you all doing here?" (DoP. 6, 7, and 8; Fig. 1). The professor would usually get a chance to interact with two to four different groups of students during the CQ response time (DoP. 7 and 8). During the CQ solution discussion, the professor would then ask students to contribute their explanations publicly in the whole class discussion (DoP. 7 and 8; Fig. 4). The professor would usually hear from multiple students and would usually ask clarifying questions of the students as they described their solution (DoP. 8, 10, and 11). The professor would often follow one student's explanation with a phrase like, "Does anyone want to retort?" In this way, the professor made a space for students to actively disagree with each other in a respectful way. Red's classroom did establish significantly different forms of faculty-student collaboration.
Purple. Purple's classroom looked similar to Red's during both the CQ response stage and the CQ solution discussion stage. When introducing the CQ, Purple would remind the students to "try to convince yourself why the other answer options are wrong" or "what are some easy ways that other students might get the wrong answer" (DoP. 3). During both the CQ response stage and the solution discussion, Purple often collaborated with the students: walking around the room, answering student questions, and discussing with various groups of students (DoP. 6, 7, and 8; Fig. 1). Professor Purple usually asked students to contribute explanations of the CQ solution (DoP. 7 and 8; Fig. 4). As the student contributed an explanation, the professor intermittently interrupted and asked other students if that first idea made sense, or repeated what the student said for the rest of the class (DoP. 8 and 10). In this way, clickers were used to change faculty-student interactions beyond the traditional IRE format. The professor usually heard from multiple students and verbally encouraged students to think about different ways to get to or think about the solution (DoP. 10 and 11).
Based on our observations, there are a variety of scientific practices that we value and that students can gain experience with through the use of Peer Instruction:
(i) To try out and apply new physical concepts
(ii) To discuss physics content with their peers
(iii) To justify their reasoning to their peers
(iv) To debate physical reasoning with their peers
(v) To formulate questions and ask questions
(vi) To evaluate the correctness and completeness of problem solutions
(vii) To interact with physicists
(viii) To begin to identify themselves as sources of solutions, explanations, or answers
(ix) To communicate in a public arena
While not traditionally assessed, there are a variety of practices such as those described above that we value for our students. Our studies demonstrate the potential for Peer Instruction to support the development of these scientific practices; however, it depends upon the specifics of PI implementation. In all of the classrooms studied, students were found practicing the first four items in this list. In contrast, there were large discrepancies in students' opportunities to engage in the remaining five practices. These discrepancies will be further illustrated in the following case studies from Red's and Green's classrooms.

B. Case studies illustrating a classroom norm
Now that we have summarized some differences that exist on the scale of the course, we can demonstrate one utility of the dimensions by identifying key differences between professors' implementation of a single conceptual CQ. Collections of varying DoPs accumulate to create differing norms within the classrooms: differing roles and rules for professor and student participation. We present case studies of a typical conceptual CQ from Red's classroom and another from Green's classroom. The case studies below draw from audio data, observational notes, and clicker software data.

Green CQ Case Study: Calculus-based Introductory Physics 2
This is the second CQ of the class, which begins about 26 min into the 50 min class period. Prior to this CQ the professor had briefly discussed the domain model of magnetism and described permanent magnets. A student asked a question about particular materials and their magnetic properties, which the professor addressed. The professor then said, "It's time to go to the next chapter… electromagnetic induction. [pause]." The professor then begins writing the title of the chapter on the board. The professor continues, "I think that this is something that you can actually understand based on what we have done before. So I will start with asking a question on it before I have really started the chapter." The professor puts up the CQ (shown in Fig. 5) and describes the question to the students, "So here I have a magnetic field going into the board and then a conducting ball, a metal ball, is moving through the magnetic field… moving to the right. And if you remember now that a conductor has lots of valence electrons that can move around inside the conductor then you should be able to determine what will happen with this ball when it moves through this field. And there are options there, that it will be polarized in different directions or that it will not be affected at all." The professor spent about 40 s on his introduction to the CQ. After about 30 s in the voting, the professor asks, "Is it pretty clear what's happening here? If there is anyone that thinks that anything in this question is not clear please raise your hand." The noise level rises as the students begin to discuss. Not one of the students raises a hand. The professor replies, "Okay, good."
During the CQ voting time, the professor stands by a door that is located at the very back left corner of the stage. He paces around the front of this doorway for most of the CQ voting time. Then the professor walks to the podium and checks the incoming clicker votes. Meanwhile there seems to be a significant amount of discussion occurring among the students.
The professor warns the students, "Okay, 20 more seconds." A little bit later the professor says, "Last few votes. Okay, I'll stop it there." The voting time lasted about 2 min and 30 s. The professor displayed the voting results (A: 72%, B: 17%, C: 4%, D: 2%, E: 5%). The professor says, "Most people thought that it would be polarized for sure and that it would be polarized and… that it has a net positive charge on the top and a net negative on bottom. Can somebody explain how they determined that?" Pause. One student raises his hand to offer an explanation. The professor calls on the student by name, "Joe." The student explains, "Well in the ball the positive charges are moving to the right and they are affected by a magnetic field that is going into the board so the force on the positive charges would be up, so they would move up. But for the negative charges in the ball their velocity would be negative so there the force would be pointing down on the negative charges… so those forces would force the positive charges to the top of the ball and the negative charges to the bottom of the ball." The professor responds, "Okay, Joe says that the positive charges in the ball are moving to the right, so it's an effective current to the right. With a B-field into the board, so the positive charges would be deflected by a force trying to push them up. And the negative charges are moving to the right, but it's an effective current for the negative charges to the left and B-field into the board, so the force on the negative charges would be pointing down to the bottom of the ball. Does this make sense? [pause]. Yeah, it does make sense." The students laugh at this comment. The professor continues, "But is it completely true though?… Both of these things? Or is it just one of these things that is true." The students respond in murmurs, "Only one, the second one." The professor continues, "Yeah, we usually think of the nuclei, the positive charges, in a conductor as being fixed and it is electrons that move around, but it is perfectly fine to think of positive charges moving as well. We can't see positive charges are not moving around. But if we measure it, it will look like the positive charges have moved. Since usually… we will now consider current as well, which is like positive charges moving. So, it will be convenient to think of the positive charges moving as a result of force. Excellent. So I guess that I gave away that that was the correct response." The solution discussion period lasted approximately 3 min.
The professor continues into a discussion of how this is an example of electromagnetic induction using examples from demonstration equipment.
The professor starts the time for the clickers and wanders around the front of the room. He talks to a student in the front row. It was not obvious if he had initiated this interaction. Then he moves to a student in the second row on the other side of the room, and he is heard asking a group of students, "What do you guys think?" The professor continues to engage in a discussion with this group of students.
After 2 min and 50 s the professor says, "Everybody in? Okay, three, two, one. [CQ is closed with Student Responses: (A: 0%; B: 17%; C: 74%; D: 8%; E: 0%)] Okay, we might have an all time high in attendance. Okay if we keep doing this do you know how many students we're going to have at the end of the semester? An infinite number. [The students laugh.] That's kinda cool. Students from all other universities are going to be piling into this class. So, I heard a bunch of great reasons. All of the reasoning was in essence correct that I heard; it's just that some of the reasoning was incomplete. So… someone want to give a quick stab at what's up?" The professor points to one of the students and says, "Yep." The student says, "I said that more electrons got kicked out …because the photons have greater energy they are going to knock out more electrons from deeper inside the metal than they would have before." The professor responds, "Okay does everybody agree that the purple or violet has greater energy than the blue? Okay, so then your argument is… if you got more energy then you can scoop down into the metal deeper, because the length of that arrow is longer, right? Okay… Do you want a [indiscernible candy name] or a chocolate?" After a student contributed a response, the student was tossed a piece of candy. The professor asks, "Okay… Does anybody want to retort?" The next student that speaks is inaudible on the recording, but the professor paraphrases the student comment to the rest of the class as follows, "Aaaha, so it could kick off. But wait a sec, there is enough from the blue to dig off from the top. Okay, so it could…" A student interrupts, "But don't all the electrons have an equal probability of getting hit?" The professor says, "Aaaha, But photons aren't very smart. They don't know what ones they're going to go for. So they all have equal probability. It's not like there's this hand guiding it." A student asks another question, "I thought that there was always one photon kicking out one electron." The professor responds, "Yes, one photon always interacts with one electron, but we don't know which electron." A student asks, "Just those top electrons?" The professor responds, "No, it could be any of those electrons." Another few students speak. After the students seem to have made a good amount of progress on their own and have brought forward some of the key ideas, the professor displays his solution on a PowerPoint slide and walks through it fairly quickly. The solution discussion stage lasts about 5 min and 20 s. After an explanation of the question, the professor discussed typical photon energies and typical energy scales for work functions of different metals.
FIG. 6. (Color) Screen shot of a conceptual CQ in Red's classroom (correct answer: C).

Comparative analysis of Red and Green case studies
Both professors ask conceptual CQs, and both introduce their CQs in similar ways (DoP. 2 and 3). They both read the CQ out loud, but rather than reading the question verbatim, they elaborate as they describe it, reminding students of relevant ideas that had been previously discussed (DoP. 3). In both classes there is significant student-student discussion, as evidenced by the noise level on the audio recording during the voting time (DoP. 4). Students in these two classrooms are given similar opportunities to discuss physics content with their peers. The professors spend a similar amount of time introducing the CQ (DoP. 3 and 5). In this way, the professors conduct the moment-to-moment setup of the CQ very similarly.
During the CQ response stage, the professors also give the students a similar amount of time to respond to the CQ (DoP. 5, Fig. 3). However, the professors participate in the CQ response time differently (DoP. 6 and 8, Fig. 1). Green stands at the front of the stage for the entire question, while Red leaves the stage and actively discusses with students (DoP. 6 and 8, Fig. 1). Red inquires about and listens to what the students are thinking during this voting time (DoP. 7 and 8, Fig. 1). We see that a similar fraction of students answer the CQ correctly in each of these cases. Students in these courses are given different opportunities to practice interacting with physicists and to practice formulating and asking questions. Similarly, opportunities for the instructor to model discussion and justification practices vary depending on the prevalence of faculty-student collaboration.
The most significant differences between Green and Red become apparent during the CQ solution discussion stage. Although both professors elicit student responses (DoP. 10), Green and Red spend significantly different amounts of time discussing the solution: about 3 min for Green and about 5.5 min for Red (DoP. 5). In addition to the differences in time spent, the types of participation from the professor and students vary during the solution discussion (DoP. 8, Fig. 4). In Green's case, only a single student explanation was elicited, and this explanation was clear and correct (DoP. 10 and 11, Appendix A, Table D). Following this correct student explanation, the professor communicated the correctness of the explanation and did not elicit additional student comments, although more than 25% of the students had answered the question incorrectly. In Red's case, we see that multiple students contribute explanations and that both correct and incorrect ideas are presented publicly (DoP. 10 and 11, Appendix A, Table D). In this example, the student explanations build on fellow students' answers (DoP. 11). Furthermore, each student contribution includes reasoning for the answer. In Red's class, students are responsible for evaluating the correctness and completeness of the problem solutions proposed by their peers. Students in these classrooms are given different opportunities to practice identifying themselves as sources and evaluators of solutions, explanations, or answers.
These differences result in different kinds of faculty-student collaboration (DoP. 8) and differences in the use of student prior knowledge (DoP. 9). Additionally, these differences in implementation contribute to varying degrees of emphasis on reasoning and sense making. It appears that although students do have a significant amount of voice in Green's class (DoP. 10), the students who contribute are usually offering a clear and correct explanation of the CQ. Flawed student reasoning is not voiced equally in this class, even on questions where a significant fraction of students answer the CQ incorrectly. Since incorrect ideas are not as likely to be shared, this reduces the importance of reasoning and sense making in this class. It is the answer that is predominantly valued.
Red's course, on the other hand, further emphasizes the importance of reasoning through the professor's management of disagreement among his students (DoP. 11). Because it was fairly uncommon for professors in our sample to foster discussion and debate among their students, it is worth describing how this was achieved in this specific case. From this case study we can see how Red encouraged student-to-faculty dialogue by asking clarifying questions (DoP. 8 and 10). Red also structured student-to-student dialogue during the solution discussion, usually by positioning students such that they should respond to or comment on another student's contribution (DoP. 4 and 8). In this way, the professor structured the students' interactions with other students such that they were debating, providing alternative explanations, arguing, defending, challenging, or clarifying each other's ideas. Students in Red's class were given opportunities to practice communicating in public and defending and clarifying their scientific ideas.

VII. CONCLUSIONS
Although many professors talk about Peer Instruction and its implementation similarly in interviews, we have found that there are significant differences in professors' classroom practices that combine over time to have significant pedagogical implications. We have identified observable and quantifiable aspects of practice which vary from classroom to classroom. Prior research has shown that faculty practices are constrained more by structural considerations (such as expectations of content coverage, lack of instructor time, class size, or room layout) than by their beliefs about productive educational practices [31]. In this investigation, we find that instructors within similar structural or situational constraints are making different instructional decisions. These results suggest the need for a more detailed account of how instructors use their knowledge of educational innovations and situational constraints to arrive at practical decisions in the moment-to-moment demands of the classroom.
Differences in observable practices can be grouped along dimensions to illustrate the potential implications of small-scale classroom practices. We find that variation in teacher practice results in disparate opportunities for students to practice conceptual reasoning [1,8,9], skills at talking physics [10,11], agency [12-14], and scientific inquiry [8,15,16]. Based on our observations, there are a variety of scientific practices that students can gain experience with through the use of Peer Instruction. In all of the classrooms studied, students were found trying out and applying new physical concepts and discussing physics with their peers. However, there were large discrepancies in students' opportunities to engage in formulating and asking questions, evaluating the correctness and completeness of problem solutions, interacting with physicists, identifying themselves as sources of solutions, explanations, or answers, and communicating scientific ideas in a public arena. Our investigation has uncovered possible benefits of particular implementations of Peer Instruction that are yet to be explored and assessed. The assessment of students' facility with these scientific practices is a fruitful direction for future research.
Ultimately, these different classroom practices, over time, contribute to the construction of different local classroom norms and communicate different values to students. The case studies of Red's and Green's implementations of PI demonstrate how, in practice, these professors place different degrees of emphasis on sense-making in the classroom. In future work, we will investigate how students from these classrooms perceive the norms of the classroom and the use of PI differently.

ACKNOWLEDGMENTS
We acknowledge the generous contributions of many faculty members in the University of Colorado physics department. Without their willingness to discuss their teaching practices and open their classrooms to researchers, this work could not be done. We would also like to thank the physics education research group at Colorado for their continuing thoughtful feedback. We would particularly like to thank Jessica Watkins and Lauren Kost for their assistance with reliability studies and Margaret Eisenhart, Susan Jurow, Ben Kirshner, and Valerie Otero at the University of Colorado, School of Education for their feedback on preliminary versions of this work. This work has been supported by the National Science Foundation (Grants No. NSF 0410744 and No. NSF 0448176), the AAPT/AIP/APS (Colorado PhysTEC program), the Center for the Integration of Research, Teaching and Learning, CIRTL (Grant No. NSF 0717768), and the University of Colorado at Boulder.

FIG. 1. (Color) The percentage of observed CQs where the professor was observed to participate with the students during the response time by leaving the stage (column 1), answering student questions (column 2), or actively discussing with the students (column 3). The error bars shown are the standard error on the proportion.

FIG. 2. (Color) Average time given for students to respond (seconds). The error bars shown are the standard error on the mean.

FIG. 4. (Color) The percentage of observed CQs where the wrong answers were discussed and student explanations were heard. The error bars shown are the standard error on the proportion.
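
The figure captions describe the error bars as the standard error on the proportion (Figs. 1 and 4) and the standard error on the mean (Fig. 2), but the formulas themselves are not stated. Assuming the standard definitions were used, they would read as follows, where $\hat{p}$ (the observed fraction of CQs), $N$ (the number of observed CQs), and $s$ (the sample standard deviation of the response times) are symbols introduced here for illustration only:

```latex
\[
\mathrm{SE}_{\hat{p}} = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{N}},
\qquad
\mathrm{SE}_{\bar{t}} = \frac{s}{\sqrt{N}} .
\]
```

As a purely hypothetical illustration (not data from this study), a practice observed in $\hat{p} = 0.5$ of $N = 25$ CQs would carry an error bar of $\sqrt{0.25/25} = 0.1$, or 10 percentage points.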

2. Red CQ Case Study: Calculus-based Introductory Physics 3

This is the second CQ of the class, which begins about 10 min into the 50 min class period. The question is preceded by a discussion of what the work function is and the range of values of initial kinetic energy that the electrons could reasonably have. The professor has used a representation of a well with balls stacked in it along levels, and these balls are given kicks by photons. The professor has walked through an energy conservation argument for this exact physical situation when blue light interacts with the metal. The professor puts up a clicker question (see Fig. 6). The professor says, "Enough of me yammering. Electrons can have a large range of energy and equal chances of absorbing a photon. Okay. So umm, if I come in with higher energy light, initially you have blue light shining on a metal and if you change that frequency to violet light, at the same number of photons per second okay… So I've increased the intensity, but I have the same number of photons coming in per second, but the energy in the violet photons is… bigger or smaller?" The students call out answers, mostly saying bigger. The professor continues, "Bigger, okay. What happens to the number of electrons coming out?" He says, "So get into your discussion groups and chit chat." The introduction of the question lasts about 50 s and shortly after, the students begin to discuss with each other.

TABLE I. Summary of dimensions of practice.
As I pose the question, how do I convey to the students what this question is about and what the students should be doing?
(4) Student-Student Collaboration: Do I allow, support, or encourage student discussion during the CQ?
(5) Determining Time Constraints: Given the nature of this question and what I expect students to be doing, how long should this take?
(6) Creating or Reducing Spatial Boundaries between Students and Instructor: Should I walk around the room?
(7) Listening to Student Explanations and Comments: Do I need to listen to students' ideas and reasoning, and what are the benefits of listening to students' ideas?
(8) Faculty-Student Collaboration: What kinds of interactions should I have with the students during the CQ?
(9) Instructor's Use of Student Prior Knowledge: Do I build on students' prior knowledge and in what ways?
(10) Use of Student Voice: Should students voice their understanding and reasoning during class? If so, when and how should this happen?
(11) Management of Disagreement among Students: How do I respond to the CQ results when there is a split response among the students?
(12) Formative Use of Students' Ideas and Aggregate Student Responses: Is the information that I am gathering useful in determining what happens next? If so, how is this information useful?
(13) Summative Use of Aggregate Student Responses: Do I want to know where students got to? Where did students get to?

TABLE II. Average number of clicker questions per hour of class.

TABLE IV. Fraction of clicker questions that were logistical, recall, algorithmic, or conceptual.

TABLE V. Average time spent discussing the CQ solution (minutes:seconds, N = number of questions).

TABLE A. Course and professor attributes.

TABLE B. Characteristics of Practice: CQ Set Up Stage.

TABLE C. Characteristics of Practice: CQ Response Stage.

TABLE D. Characteristics of Practice: CQ Solution Discussion Stage.