Open-ended versus guided laboratory activities: Impact on students' beliefs about experimental physics

Improving students' understanding of the nature of experimental physics is often an explicit or implicit goal of undergraduate laboratory physics courses. However, lab activities in traditional lab courses are typically characterized by highly structured, guided labs that often do not require or encourage students to engage authentically in the process of experimental physics. Alternatively, open-ended laboratory activities can provide a more authentic learning environment by, for example, allowing students to exercise greater autonomy in what and how physical phenomena are investigated. Engaging in authentic practices may be a critical part of improving students' beliefs around the nature of experimental physics. Here, we investigate the impact of open-ended activities in undergraduate lab courses on students' epistemologies and expectations about the nature of experimental physics, as well as their confidence and affect, as measured by the Colorado Learning Attitudes about Science Survey for Experimental Physics (E-CLASS). Using a national data set of student responses to the E-CLASS, we find that the inclusion of some open-ended lab activities in a lab course correlates with more expert-like postinstruction responses relative to courses that include only traditional guided lab activities. This finding holds when examining postinstruction E-CLASS scores while controlling for the variance associated with preinstruction scores, course level, student major, and student gender.

I. INTRODUCTION
Laboratory physics courses represent an important and unique component of the undergraduate physics curriculum [1]. These courses can provide students with opportunities to engage in authentic scientific practices, develop technical lab skills, and engage collaboratively with other students. Moreover, lab courses can be structured in such a way as to allow for considerable student autonomy when selecting interesting phenomena to investigate, designing experimental apparatus, and choosing analysis methods. Consistent with this wide range of potential opportunities that a lab course can offer, the learning goals of lab courses often extend beyond just content delivery [2]. For example, increasing students' appreciation for, and understanding of, the nature of experimental physics has been consistently cited as an important learning outcome for laboratory courses [1][2][3].
Many undergraduate lab courses are currently taught using only traditional guided lab activities, which are typically characterized by students completing an experiment on a predetermined topic using a lab manual that guides them through the required setup, data collection, and analysis. While there is considerable variability in these types of activities, more often than not traditional guided labs are highly structured and cookbook in nature. These cookbook labs have been heavily critiqued as being rote and inauthentic to the process of experimental physics [4,5]. Moreover, such inauthentic lab activities can stand in opposition to the non-content goals of lab courses, such as helping students to appreciate and understand the nature and importance of experimental physics.
In response to these and other critiques of traditional lab courses, members of the PER community have developed several new pedagogical approaches for the introductory level specifically designed to allow students to engage more authentically in the process of experimental physics. Examples of these environments include the Investigative Science Learning Environment (ISLE) [6], Modeling Instruction [7], and integrated lab/lecture environments such as studio physics [8] or SCALE-UP (Student-Centered Activities for Large Enrollment University Physics) [9]. A consistent feature of these pedagogical approaches is the inclusion of open-ended activities in which students have greater autonomy in what and how physical phenomena are investigated, rather than simply following instructions in a lab guide. All of these instructional approaches were either designed with the explicit goal of improving students' epistemologies about the nature of science [6,10], or have resulted in documented improvements in students' epistemologies, attitudes, and beliefs as measured by assessments like the CLASS (Colorado Learning Attitudes about Science Survey [11]) or MPEX (Maryland Physics Expectation Survey [12]) [13,14].
The literature described above suggests that transformed instructional approaches that include open-ended lab activities may help to promote expert-like student epistemologies and expectations about the nature of science in introductory courses. Here, we explore the impact of open-ended activities more generally on students' epistemologies about and appreciation of experimental physics in lab courses both at and beyond the introductory level. To do this, we examine students' responses to the E-CLASS (Colorado Learning Attitudes about Science Survey for Experimental Physics) [15]. E-CLASS is a 30-item, Likert-style survey that targets students' epistemologies and expectations about experimental physics, as well as student affect and confidence when doing physics experiments. The E-CLASS presents students with a set of prompts (e.g., "Calculating uncertainties helps me understand my results better.") and asks them to rate their level of agreement both from their personal perspective when doing experiments in class and from that of a hypothetical experimental physicist. The E-CLASS was developed in conjunction with efforts to transform the upper-division laboratory courses at the University of Colorado Boulder (CU) [2]. The assessment was intended to be used in both introductory and advanced lab courses and, thus, includes items targeting a wide range of learning goals. E-CLASS was validated through student interviews and expert review [16], and was tested for statistical validity and reliability using responses from students at multiple institutions and at multiple course levels [17]. This work is part of ongoing analysis of a growing, national data set of student responses to the E-CLASS.
In this paper, we first describe the data sources (Sec. II A) and analysis methods (Sec. II B) used in this study. We then present our findings with respect to whether the inclusion of open-ended activities was accompanied by improvements in students' postinstruction E-CLASS scores and examine how this varied for different course levels (Sec. III A). In addition to examining raw postinstruction E-CLASS scores, we also determine whether the trends in postinstruction scores for different laboratory activities persisted after controlling for other factors such as preinstruction scores, course level, major, and gender (Sec. III B). To investigate the relative effectiveness of different types of open-ended activities, we compare scores from students in courses using shorter-scale open-ended activities with those using longer-term, multi-week projects (Sec. III C). Finally, we end with a discussion of limitations of the study and future work (Sec. IV).

II. METHODS
In this section, we present the data sources, student and institution demographics, and analysis methods used for this study.

A. Data sources
This study is part of ongoing analysis of data collected using the E-CLASS centralized administration system [18] as part of a broader investigation of students' epistemologies in the context of physics lab courses (e.g., [19,20]). The data set used here includes 3 semesters of E-CLASS responses collected between 01/2015 and 05/2016. Students completed the E-CLASS online twice during the course, typically during the first and last weeks of class. In addition to student responses to the E-CLASS prompts, the postinstruction version of the assessment also collected information on student demographics such as student major and gender.
Only students with matched pre- and postinstruction E-CLASS responses were included in the analysis. Pre to post matching was done based on student ID number or, when ID matching failed, by first and last name. The E-CLASS also includes a filtering question to eliminate responses from students who did not read the item prompts; any student who responded incorrectly to this filtering question was also dropped from the analysis (for more information see Ref. [17]). The final data set included N = 4915 matched responses from 108 distinct courses at 67 institutions. Based on estimates of the total enrollment provided by the instructors at the beginning of the course, this represents a matched response rate of roughly 40%. This response rate is only an approximation of the true response rate as enrollment may have fluctuated over the course of the semester. The institutions in the data set spanned a range of institution types including 2-year (N = 3) and 4-year colleges (N = 35), as well as masters (N = 8) and Ph.D. granting universities (N = 21). Several of these institutions used the E-CLASS in the same course during multiple semesters; thus, the full data set includes student responses from 147 separate instances of the E-CLASS. These courses also span multiple levels including first-year (FY) introductory courses and beyond-first-year (BFY) courses (Table I).
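The matching and filtering procedure described above can be illustrated with a minimal sketch. The record layout (the keys `student_id`, `first`, `last`, and `filter_ok`), the example records, and the choice to drop a response that fails the filtering question at either time point are all assumptions made for illustration; they are not taken from the E-CLASS administration system.

```python
def match_pre_post(pre, post):
    """Pair pre- and postinstruction records belonging to the same student.

    Each record is a dict with the hypothetical keys 'student_id', 'first',
    'last', and 'filter_ok' (True if the filtering question was answered
    correctly). Matching is attempted by ID first, then by name.
    """
    def name_key(record):
        return (record["first"].strip().lower(), record["last"].strip().lower())

    # Drop responses that failed the filtering question (assumed: at either time).
    pre = [r for r in pre if r["filter_ok"]]
    post = [r for r in post if r["filter_ok"]]

    post_by_id = {r["student_id"]: r for r in post if r.get("student_id")}
    post_by_name = {name_key(r): r for r in post}

    matched, used = [], set()
    for r in pre:
        partner = post_by_id.get(r.get("student_id"))
        if partner is None:  # ID matching failed, fall back to first and last name
            partner = post_by_name.get(name_key(r))
        if partner is not None and id(partner) not in used:
            used.add(id(partner))
            matched.append((r, partner))
    return matched


# Hypothetical records, for illustration only.
pre = [
    {"student_id": "101", "first": "Ana", "last": "Ruiz", "filter_ok": True},
    {"student_id": "", "first": "Ben", "last": "Okafor", "filter_ok": True},
    {"student_id": "103", "first": "Cai", "last": "Lin", "filter_ok": False},
    {"student_id": "104", "first": "Dee", "last": "Park", "filter_ok": True},
]
post = [
    {"student_id": "101", "first": "A.", "last": "Ruiz", "filter_ok": True},
    {"student_id": "999", "first": "ben ", "last": "OKAFOR", "filter_ok": True},
    {"student_id": "103", "first": "Cai", "last": "Lin", "filter_ok": True},
]
matched = match_pre_post(pre, post)
print(len(matched))  # number of matched pre/post pairs
```

In this example, one student matches by ID, one matches only by name, one is dropped by the filtering question, and one has no postinstruction response.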
In order to use E-CLASS through its centralized administration system [18], instructors complete a Course Information Survey (CIS) in which they report basic information about their course and institution. The CIS collects both logistical information (e.g., estimated enrollment, course start and end dates, course syllabi, etc.) as well as information on course activities and the instructors' use of various pedagogical techniques. On the CIS, instructors were asked to report how many weeks of the semester were spent on "all guided lab activities" and how many weeks were spent on "all open-ended activities or projects." The terms "guided lab" and "open-ended" activities were not operationalized in the CIS; thus, instructors' responses are self-reported and self-classified. While the relative fraction of the course spent on open-ended activities varied significantly (see Fig. 1), 84 courses reported having at least one week of open-ended activities. To preserve statistical power, the remainder of this analysis will treat courses dichotomously as either having open-ended activities (regardless of the fraction of the course those activities represent) or having only traditional guided lab activities.

TABLE II. Demographic breakdown of the full data set, as well as the subset of courses with open-ended activities and those with only traditional guided lab activities. Number of courses refers to the number of distinct courses, and percentages represent the percentage of students rather than the percentage of courses. For Major and Gender demographics, the totals may not sum to 100% as some students did not complete these questions or selected 'Other' as their gender.
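The dichotomous treatment of courses can be sketched as follows. The field names (`guided_weeks`, `open_ended_weeks`) and the example courses are hypothetical; the only rule taken from the text is that any course reporting at least one week of open-ended activities is placed in the open-ended group.

```python
def classify_course(open_ended_weeks):
    """Label a course by whether the instructor reported any open-ended weeks."""
    return "open-ended" if open_ended_weeks >= 1 else "guided only"


# Hypothetical CIS reports, for illustration only.
cis_reports = {
    "course_a": {"guided_weeks": 12, "open_ended_weeks": 0},
    "course_b": {"guided_weeks": 10, "open_ended_weeks": 2},
    "course_c": {"guided_weeks": 0, "open_ended_weeks": 14},
}

labels = {
    name: classify_course(report["open_ended_weeks"])
    for name, report in cis_reports.items()
}
print(labels)
```

Note that course_b and course_c fall into the same group even though they devote very different fractions of the semester to open-ended work, which is the loss of resolution accepted here in exchange for statistical power.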
The demographic breakdown of the matched E-CLASS data set in terms of course level, major, and gender is given in Table II. Table II also compares the breakdown of these data across courses that were classified as including open-ended activities and those with only traditional guided labs. A larger fraction of the courses that included some open-ended activities were also BFY courses. This trend may be driven in part by the smaller class sizes characteristic of BFY courses, as open-ended activities often require lower student-to-teacher ratios in order to provide sufficient instructor support to the students. Table II does not include racial demographic data because these data were collected only in the final two semesters of data collection. Examination of E-CLASS scores with regard to racial dynamics will be the subject of a future publication. In addition to gender data, the postinstruction E-CLASS also asked students for their primary major. While students were offered 15 distinct major options, we focus here on students' major as the dichotomous distinction between 'physics' or 'non-physics' majors, where physics includes both engineering physics and physics majors, and non-physics includes all other majors (including other science majors, non-science majors, and students who are open-option or undeclared). The variations in the prior and ongoing experiences of students in different non-physics majors are likely significant; however, clearly characterizing the nature of these differences given the large number of courses and institutions in the data set is not possible. Given this, and the physics focus of the E-CLASS, we have chosen to focus our analysis of student major specifically on the difference between physics and non-physics majors.

B. Analysis
For the purposes of scoring the E-CLASS, we collapsed students' responses to each 5-point Likert item into a standardized, 3-point scale in which the responses '(dis)agree' and 'strongly (dis)agree' were collapsed into a single category. Students' responses to individual items were given a numerical score based on their consistency with the accepted, expert-like response [15]: +1 for favorable (i.e., consistent with experts); 0 for neutral; and −1 for unfavorable (i.e., inconsistent with experts). A student's overall E-CLASS score was then given by the sum of their scores on each of the 30 items, resulting in a possible range of scores of [−30, 30]. For more information on the scoring of the E-CLASS see Ref. [17]. In previous work, we have cautioned instructors using the E-CLASS against focusing exclusively on the overall score when interpreting their results [17]. The E-CLASS targets a range of learning goals, some of which may not be relevant to a specific course, and we encourage instructors to focus also on the individual items most relevant to their learning goals. For this reason, we also provide a breakdown of students' scores by item. However, the overall score is still useful in that it provides a continuous variable that offers a holistic view of students' performance on the E-CLASS that can be used to quantitatively examine how that performance varies across subpopulations of students. As the distribution of scores on the E-CLASS is typically skewed towards positive scores [17,19], the following sections report statistical significance based on the non-parametric Mann-Whitney U test [21] unless otherwise stated. Where differences between means are statistically significant, we also report Cohen's d [22] as a measure of effect size and practical significance [23].
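The scoring scheme described above can be written as a short sketch. The numeric coding of the Likert responses (1 = strongly disagree through 5 = strongly agree) and the expert key used in the example are assumptions for illustration; the actual expert-like responses are given in Ref. [15].

```python
# Collapse the 5-point Likert scale to 3 points: disagree (-1), neutral (0), agree (+1).
# The 1..5 coding is an assumed convention, not taken from the instrument itself.
COLLAPSE = {1: -1, 2: -1, 3: 0, 4: 1, 5: 1}


def score_item(response, expert_agrees):
    """Return +1 (favorable), 0 (neutral), or -1 (unfavorable) for one item."""
    direction = COLLAPSE[response]
    return direction if expert_agrees else -direction


def overall_score(responses, expert_key):
    """Sum the item scores; with 30 items the result lies in [-30, 30]."""
    if len(responses) != len(expert_key):
        raise ValueError("responses and expert_key must have the same length")
    return sum(score_item(r, k) for r, k in zip(responses, expert_key))


# Hypothetical expert key: experts agree with the first 20 prompts and
# disagree with the last 10. This is NOT the real E-CLASS key.
expert_key = [True] * 20 + [False] * 10

expert_like = [5] * 20 + [1] * 10   # consistent with experts on every item
all_neutral = [3] * 30
agree_with_everything = [4] * 30

print(overall_score(expert_like, expert_key))            # 30
print(overall_score(all_neutral, expert_key))            # 0
print(overall_score(agree_with_everything, expert_key))  # 10
```

The last case shows why the direction of the expert response matters: a student who agrees with every prompt is favorable on 20 items and unfavorable on 10 under this hypothetical key.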
Table II highlights demographic differences between courses using open-ended activities and those using only traditional guided labs. As has been observed previously in student responses to the E-CLASS [19,20], these demographic differences may confound comparisons of students' E-CLASS scores in these two types of courses. To account for this, we utilize an analysis of covariance (ANCOVA) [26] in addition to examining students' raw pre- and postinstruction E-CLASS scores. ANCOVA is a statistical method for comparing the difference between population means after adjusting them to account for the variance associated with other variables. In order for the results of an ANCOVA to be valid, the data must meet several assumptions. The assumptions of an ANCOVA are discussed in detail in Refs. [26,27]; tests of the E-CLASS matched data showed that they satisfied these assumptions, with one exception. In our data, the covariate (i.e., preinstruction score) is not independent of the other variables (i.e., gender, major, and course level). Shared variance between the covariate and independent variables is to be expected in any observational study in which randomized assignment to experimental groups was not done or not possible [28]. Violation of the assumption of covariate independence implies that our results should be interpreted as a lower bound on the relationship between each independent variable and postinstruction E-CLASS score.

III. RESULTS
This section presents findings with respect to whether the inclusion of open-ended activities was accompanied by improvements in students' postinstruction E-CLASS responses using raw scores and an ANCOVA.

A. Open-ended vs. traditional guided lab activities
To explore general trends in the aggregate data, we first examine differences in raw pre- and postinstruction E-CLASS scores for students in courses using open-ended activities and those using only traditional guided labs (Table III). Table III shows that students in courses using open-ended activities scored significantly higher than students in courses using only traditional guided labs both pre- and postinstruction (p ≪ 0.01). While the difference is statistically significant both before and after instruction, the magnitude of this effect was larger for the postinstruction scores. Moreover, students in courses using open-ended activities showed a small (d = 0.08), but statistically significant, positive shift (p ≪ 0.01) from pre- to postinstruction, while students in courses using only traditional guided labs showed a small (d = −0.2), but statistically significant, negative shift (p ≪ 0.01). We can also examine the difference between courses using open-ended activities and courses using only traditional guided labs on an item-by-item basis. Fig. 2 shows the difference between the average scores of students in these two types of courses for each of the 30 items on the pre- and postinstruction E-CLASS. Consistent with the small difference in the overall preinstruction score (Table III)

While comparisons of E-CLASS scores both overall and by-item in the full, aggregate data set preliminarily suggest that open-ended activities have a positive impact on students' performance, the statistically significant difference in preinstruction scores between students suggests that these two types of courses may have served different student populations. This conclusion is supported by the demographic data presented in Table II, which, for example, shows that courses using open-ended activities are considerably more likely to also be BFY courses.
Previous work has shown that students in BFY courses consistently score higher on E-CLASS than students in FY courses [17], potentially accounting for some of the difference in preinstruction scores between open-ended and guided lab courses.
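The two statistics reported in this section, the Mann-Whitney U test and Cohen's d, can be sketched using only the Python standard library. This is an illustrative implementation under stated assumptions, not the analysis code used for this study: the U test below uses the normal approximation with a tie correction and no continuity correction, which is reasonable for large samples but poor for small ones, and Cohen's d uses the pooled standard deviation. The score lists are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def cohens_d(a, b):
    """Standardized mean difference (a minus b) using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def mann_whitney_u(a, b):
    """Return (U for sample a, two-sided p) via the tie-corrected normal approximation."""
    na, nb = len(a), len(b)
    n = na + nb
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of the 1-indexed ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    u_a = rank_sum_a - na * (na + 1) / 2
    sigma = math.sqrt(na * nb / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        raise ValueError("all observations are tied; the test is undefined")
    z = (u_a - na * nb / 2) / sigma
    return u_a, 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical overall scores for two small groups of students.
open_ended = [18, 21, 15, 24, 19, 22, 17, 20]
guided_only = [14, 19, 12, 17, 16, 20, 13, 15]

u, p = mann_whitney_u(open_ended, guided_only)
d = cohens_d(open_ended, guided_only)
print(f"U = {u}, p = {p:.3f}, d = {d:.2f}")
```

With this sign convention, a positive d means the first group has the higher mean. For real analyses, an established statistics package is preferable to a hand-written test.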
To determine whether the relationship between the type of activities used and postinstruction E-CLASS score varies across course levels, we examined the overall E-CLASS scores for students in FY and BFY courses separately. The trend of higher postinstruction scores in courses using open-ended activities persisted for both FY and BFY students (see Table IV); however, the pattern of shifts varies between the two types of courses. In FY courses, students in courses using open-ended activities did not show a statistically significant shift from pre- to postinstruction (p = 0.4), while students in courses using only guided activities showed a small (d = −0.2), but statistically significant, negative shift (p ≪ 0.01). Alternatively, in BFY courses, courses using open-ended activities showed a small (d = 0.2), but statistically significant, positive shift from pre- to postinstruction (p ≪ 0.01), while courses using only guided activities showed a small (d = −0.1) negative shift (p = 0.03). Table IV summarizes the effect of one potential confounding variable (i.e., course level) on the analysis of the relationship between open-ended activities and E-CLASS scores; however, there may be other variables to take into consideration. For example, student responses to the E-CLASS have been shown to vary based on students' major [17]. Additionally, prior research suggests that some transformed instructional approaches may have a differentially positive impact on the E-CLASS scores of women [20], suggesting that student gender may also be a significant factor. Sec. III B explores these dynamics using an analysis of covariance.

B. Analysis of covariance
To more clearly explore the relationship between open-ended activities and postinstruction E-CLASS scores independent from other factors, we used an ANCOVA. ANCOVA is a statistical method for comparing the difference between population means while adjusting them to account for the variance associated with other variables. In this case, we want to determine if the difference between the E-CLASS scores of students in courses using different types of lab activities (open-ended vs. guided only) remains statistically significant after accounting for differences in preinstruction scores, as well as student major and gender. Only students for whom we have matched E-CLASS scores along with both major and gender data were included in the ANCOVA (N = 4759).
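The idea of adjusting group means for a covariate can be illustrated with a deliberately simplified sketch: a single categorical factor (activity type) and a single covariate (preinstruction score). This is not the multi-factor model used in this study. It applies the textbook one-way ANCOVA adjustment, in which each group's postinstruction mean is shifted along the pooled within-group regression slope to the value expected at the grand mean of the preinstruction scores. The data are hypothetical and constructed so the arithmetic is easy to follow.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """One-factor, one-covariate ANCOVA adjustment.

    `groups` maps a group label to a list of (pre, post) score pairs.
    Returns (pooled within-group slope, {label: adjusted post mean}).
    """
    sxy = sxx = 0.0
    all_pre, group_means = [], {}
    for label, pairs in groups.items():
        pre = [p for p, _ in pairs]
        post = [q for _, q in pairs]
        mean_pre, mean_post = mean(pre), mean(post)
        group_means[label] = (mean_pre, mean_post)
        # Accumulate within-group sums of products and squares.
        sxy += sum((p - mean_pre) * (q - mean_post) for p, q in pairs)
        sxx += sum((p - mean_pre) ** 2 for p in pre)
        all_pre.extend(pre)
    slope = sxy / sxx
    grand_pre = mean(all_pre)
    adjusted = {
        label: mean_post - slope * (mean_pre - grand_pre)
        for label, (mean_pre, mean_post) in group_means.items()
    }
    return slope, adjusted


# Hypothetical (pre, post) pairs: the open-ended group starts higher and gains
# 2 points; the guided-only group starts lower and loses 1 point.
groups = {
    "open-ended": [(10, 12), (14, 16), (18, 20)],
    "guided only": [(6, 5), (10, 9), (14, 13)],
}
slope, adjusted = ancova_adjusted_means(groups)
print(slope, adjusted)
```

In this constructed example, the raw postinstruction means differ by 7 points (16 vs. 9), but 4 of those points reflect the difference in preinstruction means; the adjusted means differ by 3 points (14 vs. 11). Significance testing of the adjusted difference is omitted from the sketch.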
We performed a 5-way ANCOVA to compare postinstruction E-CLASS means for courses using open-ended and guided lab activities while controlling for the three categorical variables: course level, student major, and student gender, as well as preinstruction score as a covariate. To determine how these variables might also be related to one another, we initially included all possible interaction terms. None of the interaction terms were statistically significant predictors of postinstruction scores. This result should be interpreted as evidence that the impact of open-ended activities did not vary significantly depending on the other variables. For example, these data do not suggest that open-ended activities had a more positive impact on women than men. As the interaction terms did not contribute significantly, they were removed from the model.
The results of the 5-way ANCOVA (without interaction terms) are summarized in Table V. All four categorical variables (gender, major, course level, and type of activities) were statistically significant predictors of postinstruction E-CLASS score (F-test, p < 0.01). Type of activities (open-ended vs. guided only) accounted for the largest difference in adjusted postinstruction means with students in courses using open-ended activities scoring higher. Thus, when adjusting for the variance associated with preinstruction score, course level, major, and gender, students in courses using open-ended activities demonstrate more expert-like E-CLASS responses than those in courses using only traditional guided labs.

TABLE V. Comparison of postinstruction means as adjusted by the 5-way ANCOVA for each categorical variable. A difference between group means is indicated only when that difference was statistically significant. Here, O is the predicted postinstruction mean for students in courses using open-ended activities, and similarly G for students in courses using only guided activities, P for physics students, NP for non-physics students, M for men, W for women, BFY for BFY students, and FY for FY students. Variables are listed in descending order by size of the difference in adjusted postinstruction means between groups. Columns: Categorical variable; Postinstruction mean comparison. First row: Activities.

C. Open-ended activities vs. multi-week projects
The results presented in the previous sections support the idea that the use of open-ended activities in undergraduate lab courses improved students' epistemologies, expectations, and confidence with respect to the nature of experimental physics. However, courses in the data set were self-classified by instructors as including open-ended activities; thus, there is likely significant variation in the types of open-ended activities represented in the data. Moreover, it is likely that not all open-ended activities are equally effective at encouraging expert-like epistemologies and expectations. For example, we argue that longer-term, multi-week projects have the potential to provide some of the most authentic experimental physics activities that an undergraduate student might engage in outside of undergraduate research. Because of this, we hypothesized that courses that included multi-week projects would result in higher E-CLASS scores than shorter, week-to-week open-ended activities.
Whether a course included a multi-week project was not specifically asked on the CIS; however, the CIS did collect course syllabi, which generally include a description of the course activities and expectations and/or a grading breakdown showing the fraction of the grade from each of the activities in the course. Courses were coded as having a project component if the syllabus listed a project in either the course description or grading breakdown. In a few cases, we were not able to determine if the course included a project because the syllabus was unclear or missing. In our data set, all of the 22 courses (N = 231) that were identified as including a project component were BFY, and all were classified by the instructor as including open-ended activities. To account for this, the comparison group was the 31 BFY courses (N = 306) that included open-ended activities but whose syllabi clearly indicated they did not include a project component. Table VI shows the pre- and postinstruction scores for courses that include a project and those that do not. There was no statistically significant difference in the average E-CLASS scores either before or after instruction for students in these two sets of courses. This result indicates that, while courses with multi-week projects did score significantly higher than courses using only traditional guided labs, they did not result in a significant increase in E-CLASS scores above and beyond that of shorter-term open-ended activities. However, this finding should not be interpreted as evidence that projects do not contribute significantly to lab courses in ways beyond that of general open-ended activities. As discussed in Sec. I, lab courses have multiple different learning goals, and E-CLASS targets only a subset of these goals. The inclusion of multi-week projects may have a significant impact on these other learning goals (e.g., developing student ownership, or practical and managerial lab skills).

IV. SUMMARY AND CONCLUSIONS
We analyzed a large data set of student responses to the E-CLASS for evidence of the impact of open-ended laboratory activities on students' epistemologies, expectations, and confidence with respect to experimental physics. We found that courses that included open-ended activities during one or more weeks of the laboratory had higher pre- and postinstruction E-CLASS scores as well as more favorable shifts relative to courses using only traditional guided lab activities. This result was reinforced by an analysis of covariance, which showed that the type of activity used (open-ended vs. guided only) was a significant predictor of postinstruction E-CLASS score even after adjusting for the variance associated with preinstruction score, course level, student major, and student gender. We also examined the effectiveness of multi-week projects relative to shorter-term open-ended activities and found no evidence that multi-week projects resulted in more expert-like E-CLASS responses than open-ended activities generally.
Overall, our findings support the claim that the use of open-ended activities may have a positive impact on students' epistemologies about the nature of experimental physics and their affect and confidence when performing physics experiments. We also found that this positive impact does not require implementation of multi-week projects. However, there are several limitations of this work. While our data set is extensive, spanning a large number of institutions, courses, and student populations, it is not comprehensive. For example, there are only a few 2-year colleges in our data. Additionally, we focused here on a specific subset of potential variables that might confound the comparison of courses using different types of activities (i.e., major, course level, and preinstruction scores). These variables were selected based on the findings of previous work [17,19,20], which suggested they were important factors in predicting postinstruction E-CLASS scores. However, there are other factors that might also correlate with the type of activities used by the instructor, including class size, instructor familiarity with PER, and student-to-teacher ratio. Similarly, the instructors for the courses in our data set generally chose to use E-CLASS without external pressure, and thus these courses are not randomly selected. Additionally, to preserve statistical power, all courses using any open-ended activities were aggregated together as a single group. Thus, while it may be that having a greater fraction of the course dedicated to open-ended activities would be more effective at promoting expert-like epistemologies and expectations, the current data set cannot address this dynamic. As data collection with the E-CLASS centralized administration system continues, examination of the impact of more frequent open-ended activities may become possible.
Additionally, the purely quantitative analysis reported here cannot speak to how open-ended activities may have improved students' epistemologies and expectations relative to traditional approaches. We hypothesized that open-ended activities may provide greater opportunities for the students to engage authentically in the process of experimental physics; however, clearly determining the mechanism underlying the findings reported here will likely require additional research with a significant qualitative component. Future work could also include more fine-grained investigation of the relative effectiveness of different types of open-ended activities. Here, we examined the effectiveness of multi-week projects relative to other open-ended activities; however, there are many other types of open-ended activities that may be more or less effective at encouraging expert-like E-CLASS responses. Investigations of this type would require the creation of a robust and valid classification scheme for different open-ended activities. The development and implementation of such a scheme might require collection of course artifacts (e.g., lab manuals) or in-class observations.