Comparison of university students’ understanding of graphs in different contexts

This study investigates university students’ understanding of graphs in three different domains: mathematics, physics (kinematics), and contexts oth


I. INTRODUCTION
Scientific data are very often communicated through graphs, because they allow the skilled user to quickly recognize and extract important features of the data set under analysis, such as trends and rates of change. This is usually done through analyses of graph slopes and areas under the graph. Students in Croatia are introduced to graphs through different school subjects, but mostly through mathematics and physics. However, students also encounter graphs in contexts other than those of mathematics and physics, such as biology, chemistry, everyday life, and economics. This study attempts to investigate and compare student ability to interpret graphs in mathematics, physics, and contexts other than physics. It attempts to answer the following research question: How does student ability to interpret graph slopes and areas under the graph change across three different domains: mathematics without context [1] (M domain), physics or kinematics (P domain), and mathematics in contexts other than physics (C domain)?
The ability to interpret graphs is considered one of the important outcomes of high school mathematics, physics, and other science courses, and is often assumed by university faculty to be fully developed by the time that students enroll at university. It was shown in several physics education research studies (e.g., [2][3][4][5][6]) that this assumption does not hold, and that students still have many difficulties with graph interpretation at university level which are similar to those found at earlier levels [7][8][9][10], as well as to those identified through mathematics education research (e.g., [11][12][13][14][15][16]).
The main student difficulties can be classified as interval-point confusions, where students focus on a single point instead of on an interval; slope-height confusions, where students mistake the height of the graph for the slope; and iconic confusions, where students incorrectly interpret graphs as pictures [12]. Overall, the findings of both physics and mathematics education research are rather similar and point to the presence of similar student difficulties in both domains. However, the prevalence or relative strengths of different difficulties in both domains were rarely investigated or compared. The issue of transfer of knowledge between mathematics and physics (usually from mathematics to physics) is very important for physics education. It was tackled in several studies on graphs [1,17,18,19] with mostly negative results. It was suggested in one of the studies that most secondary students, even those who do well in mathematics and physics, do not make substantial links between the two domains, and that some students even think that it is not appropriate to transfer concepts from mathematics to physics [17]. For transfer to occur it is necessary that students possess the required mathematical knowledge, but this is not always the case, especially when advanced concepts such as derivative or integral are concerned [1,18].
However, students' problems with mathematics may not be the only or even the main reason for students' difficulties with graphs in physics. In our previous study we made an attempt to compare high school students' understanding of the line graph slope in the domains of physics and mathematics [19]. It was found that, contrary to the prevalent belief of physics teachers, the main source of student difficulties with the concept of the line graph slope in physics was not their lack of mathematical knowledge, but rather their lack of ability to interpret the meaning of the line graph slope in physics context. Many students successfully solved the mathematical questions but were unable to solve parallel physics questions, or used different strategies for solving analogous mathematics and physics problems. It was observed that the transfer of knowledge to a different domain, such as physics, did not always occur, even though many students possessed the needed mathematical knowledge. (Interestingly, besides the expected transfer from mathematics to physics, which was relatively weak, some occasional cases of transfer from physics to mathematics were also observed.) Also, the same student difficulty known as slope-height confusion was detected in both domains, but it occurred far more frequently in physics than in mathematics (about twice as often). It was natural to pose the question about the reason for the observed higher difficulty of physics questions relative to parallel mathematics questions: Is the higher difficulty of physics questions the consequence of students' lack of relevant physics knowledge, or would the same effect be observed to the same extent also in parallel questions situated in different contexts, which did not require additional content knowledge? 
This study attempts to investigate this issue further through the analysis and comparison of student answers to parallel questions from mathematics without context, physics (which requires some physics content knowledge), and other contexts which do not require additional content knowledge (this area can be described as mathematics in context). We are not aware of any studies that tried to compare student understanding of graphs across the different domains, and this comparison could be helpful to both physics and mathematics teachers, not only to identify student difficulties, but also to try to understand their origin and their relative importance.

II. THEORETICAL BACKGROUND
Transfer of learning is usually defined as the ability to extend what has been learned in one context to new contexts [20], and is sometimes regarded as one of the ultimate goals of education. Hammer et al. [21] suggest that it would generally be more appropriate to speak of activation of cognitive resources than of transfer, since knowledge and reasoning abilities are composed of many resources that may or may not be activated in a particular context. They oppose the view of knowledge and abilities as objects which are acquired, manipulated, and transferred as intact units, with the exception of locally coherent sets of resources which activate together and possess internal structural stability. Such cognitive units whose mechanism of stability is structural rather than contextual can be viewed as transferable [21]. In our opinion, students' concepts of the graph slope and of the area under the graph can be examples of such transferable units in cases when they are well formed and stable.
Whether or not transfer will happen depends not only on the presence or absence of relevant resources, but also on students' framing of the situation [22]. Framing means that students have to interpret what is going on in a certain situation or in a certain problem and accordingly decide what resources to use, or which epistemic game to play [22]. In physics education we usually expect students to transfer their mathematical knowledge from mathematics to physics. There are several reasons why the expected transfer could fail: either the required resource does not exist, or the resource exists, but is not activated due to the wrong framing of the problem, or the resource is activated, but its mapping to the problem is not appropriate [23]. Research suggests that transfer is more likely to happen when students have seen the given idea in at least two separate contexts or when they receive metacognitive scaffolding [20].

III. DATA COLLECTION AND ANALYSIS
Eight sets of parallel mathematics, physics, and other context questions about graphs were developed. The construction of sets usually started from physics questions. Some of the physics questions in this study had already been used in other studies on graphs (e.g., [3,5,16,19]) to probe important student difficulties related to graphs. After the selection and modification of physics questions, analogous mathematics and other contexts questions were constructed. Each set of questions referred to the same concept and required the same mathematical procedure in different contexts: one question was a direct mathematical question, one was situated in the context of physics (kinematics), and one was in some context other than physics. Physics content knowledge that was required for answering physics questions included definitions of and relationships among basic kinematic concepts (such as distance, velocity, acceleration, uniform and uniformly accelerated linear motion) and the ability to interpret their graphical representation. The other context problems required no specialized content knowledge beyond what is common for university students. For example, it was assumed that students are familiar with terms such as price growth, GDP, stocks, river water level, bus rentals, etc., but knowledge of the definitions or laws concerning those concepts was not required.
Five sets of questions referred to the concept of the graph slope and three to the concept of the area under the graph. Four sets of questions were in a multiple choice format and four sets were open ended. In addition to choosing the correct answer in multiple choice questions or providing the answer in open-ended questions, students were asked to provide explanations for their answers and/or necessary calculations where appropriate, so that insight into the underlying student reasoning could be obtained. In this paper physics slope questions are labeled P-S1 through P-S5, physics area questions P-A1 through P-A3, mathematics slope questions M-S1 through M-S5, mathematics area questions M-A1 through M-A3, other context slope questions C-S1 through C-S5, and other context area questions C-A1 through C-A3. Questions whose labels end in the same two characters are parallel in content (e.g., P-S1, M-S1, and C-S1). An example of one set of slope items is given in Fig. 1, and the whole test is included in the Supplemental Material [24]. A test consisting of these eight sets of questions (24 questions in all) was administered to 385 first year students at the Faculty of Science, University of Zagreb in Zagreb, Croatia. Students were either prospective physics or mathematics teachers or prospective physicists or mathematicians. Students were tested at the beginning of the first semester. They were told that the testing was part of the research on student understanding of graphs and were later informed of their score on the test. No incentives such as grades were offered for taking the test, but students received some credit points for writing explanations (regardless of their correctness) and/or for required calculations.
Students were generally willing to take the test since they were told that the results of the test would be informative both for them and for their physics and mathematics teachers and that they would help them see the areas in which their knowledge of graphs could be improved. Students were not informed that the test contained parallel questions in different contexts. Parallel questions did not follow each other in the test but were separated by other questions. They were also not labeled in the same way as in this paper but were numbered 1-24. [Labels used in this paper (e.g., P-S1) were added for the readers' convenience to the test questions included in the Supplemental Material [24], but they were not present in the original test.] All students had previously studied motion graphs and kinematics in high school physics courses (physics is a compulsory subject in Croatian high schools) and linear functions and linear graphs in high school mathematics. The allocated time for taking the test was 60 minutes, and students were able to finish the test in that period of time.
The tests were scored. On multiple choice questions, if a correct answer was given with a correct explanation, the student was awarded 2 points. In cases where the correct answer was given with an incomplete explanation or with no explanation at all, the student was awarded 1 point. If a correct answer was given for wrong reasons, as could have been deduced from the explanation accompanying the answer, the answer was counted as incorrect and awarded 0 points. For incorrect answers with or without explanation the student was awarded 0 points. On open-ended questions, for the correct answer with the correct work the student was awarded 2 points. For partially correct answers (correct idea with some minor mistake in calculation) students were awarded 1 point, and for incorrect work or explanations, or completely missing presentations of work or explanations, 0 points.
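The scoring rules above can be summarized in a short Python sketch. It is only an illustration: the function names and the category strings ("correct", "incomplete", "none", "wrong", "minor_error", "incorrect", "missing") are hypothetical labels introduced here, and in the study the judgment of explanation quality was made by human scorers.

```python
# Illustrative sketch of the 0/1/2 scoring rubric described in the text.
# All names and category strings are hypothetical, not taken from the study.

def score_multiple_choice(answer_correct: bool, explanation: str) -> int:
    """Score a multiple choice item.

    explanation: 'correct', 'incomplete', 'none', or 'wrong'
    (the scorer's judgment of the accompanying explanation).
    """
    if explanation not in {"correct", "incomplete", "none", "wrong"}:
        raise ValueError(f"unknown explanation category: {explanation!r}")
    if not answer_correct:
        return 0  # incorrect answer, with or without explanation
    if explanation == "wrong":
        return 0  # correct answer given for wrong reasons
    if explanation == "correct":
        return 2  # correct answer with correct explanation
    return 1      # correct answer, incomplete or missing explanation


def score_open_ended(work: str) -> int:
    """Score an open-ended item.

    work: 'correct', 'minor_error', 'incorrect', or 'missing'.
    """
    if work not in {"correct", "minor_error", "incorrect", "missing"}:
        raise ValueError(f"unknown work category: {work!r}")
    return {"correct": 2, "minor_error": 1}.get(work, 0)
```

With 24 items scored this way, a student's raw score ranges from 0 to 48.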
After scoring, data were analyzed with the WINSTEPS [25] software for Rasch analysis [26] to obtain linear measures for item difficulties. Percentages of correct answers can reflect the correct ranking of persons or items, but not the correct intervals between person abilities or between item difficulties, meaning that percentages are not linear in the variable which they represent [26]. The linearity of measures, on the other hand, is very important because meaningful arithmetic operations can only be performed with linear measures, thus enabling comparisons and statistical studies. WINSTEPS performs a logistic transformation on the raw scores of persons and items (p values of students and items), and in this way transforms the raw scores into linear measures of student ability and item difficulty. For a more detailed introduction to the Rasch model see, for example, Ref. [26] or the short introduction to Rasch modeling in our previous publication [27]. The model defines the unit logit (short for "log-odds unit") in which all measures are expressed. Each item and person measure comes with its Rasch standard error, which indicates the uncertainty of the estimate. The estimates are more precise if the number of persons and items is large, and if there is good targeting of the test on the distribution of students [26].
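The nonlinearity of percentages and the idea of the logit can be illustrated with a minimal Python sketch. It assumes the simplest dichotomous form of the Rasch model; the function names are introduced here for illustration. It is not a reproduction of the WINSTEPS estimation, which is iterative and, for items scored 0, 1, or 2 as in this test, would involve a polytomous extension of the model.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a proportion p, with 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def rasch_probability(ability: float, difficulty: float) -> float:
    """Dichotomous Rasch model: probability of a correct response
    for a person of given ability on an item of given difficulty,
    both expressed in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Equal steps in percentage are not equal steps in logits:
low_step = logit(0.60) - logit(0.50)    # 50% -> 60%
high_step = logit(0.95) - logit(0.85)   # 85% -> 95%
print(f"50%->60%: {low_step:.2f} logits, 85%->95%: {high_step:.2f} logits")
```

The same ten percentage points correspond to a considerably larger distance on the logit scale near the top of the range than in the middle, which is why raw percentages preserve ranking but not intervals.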
To evaluate the fit of data to the model, Rasch analysis programs usually report two fit statistics: infit and outfit mean square statistics (MNSQ) [26,28]. Outfit is based on the conventional averaged sum of squared standardized residuals, whereas infit is an information-weighted sum which gives more weight to on-target observations. A large infit value on a particular item indicates that some persons whose ability is close to the difficulty of the item have not responded in a way consistent with the model. A large outfit value of an item indicates that persons who are far in ability from the difficulty of the item have responded in an unexpected way. Large infit values are generally considered more problematic than large outfit values. The expected value of both infit and outfit is 1. Items which are sufficiently in accordance with the Rasch model to be productive for measurement will have infit and outfit values between 0.5 and 1.5 [28].
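The two fit statistics can be sketched in Python for a single item. The sketch assumes the dichotomous Rasch model and takes person abilities and the item difficulty as already estimated; the function names and the example response patterns are hypothetical and are not data from this study.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Dichotomous Rasch probability of a correct response (logit units)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def item_fit(responses, abilities, difficulty):
    """Return (infit_mnsq, outfit_mnsq) for one dichotomous item.

    responses: 0/1 scores of each person on the item
    abilities: person measures in logits, same order as responses
    """
    squared_residuals = 0.0   # sum of (observed - expected)^2
    information = 0.0         # sum of model variances p(1 - p)
    standardized = 0.0        # sum of squared standardized residuals
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, difficulty)
        variance = p * (1.0 - p)
        residual_sq = (x - p) ** 2
        squared_residuals += residual_sq
        information += variance
        standardized += residual_sq / variance
    infit = squared_residuals / information   # information-weighted
    outfit = standardized / len(responses)    # unweighted average
    return infit, outfit

# Hypothetical persons spread around an item of difficulty 0 logits.
abilities = [-2.0, -1.0, 0.0, 1.0, 2.0]
orderly = item_fit([0, 0, 1, 1, 1], abilities, 0.0)     # expected pattern
surprising = item_fit([1, 0, 1, 1, 0], abilities, 0.0)  # off-target surprises
```

In the second pattern the least able person succeeds and the most able person fails. Both statistics rise above 1, and outfit rises more than infit, because the unexpected responses come from persons far from the item difficulty.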

A. Analysis of item difficulties
The functioning of the test as a whole was satisfactory with very high item reliability (0.99), which is of greatest importance for this study, and somewhat lower, but satisfactory, person reliability (0.85) and Cronbach alpha (0.88). Overall, our data seem to fit the Rasch model. The item-person map, which visually summarizes several aspects of Rasch analysis, is shown in Fig. 2.
The distribution of items according to their difficulty and the distribution of persons according to their ability are shown along the same axis, with the scale in logits. The most able students and the most difficult items are at the top of the figure. It can be noted that the targeting of the test on the sample is very good. The width of the test is adequate for most students; only 16 students have abilities outside the range of item difficulties in the test. In the middle of the distribution there are many items which are very close in difficulty; for example, 10 of the total of 15 slope items are found in the interval which is about 0.6 logit wide (M-S5 to M-S2). Area items from P and C domains are found in the upper part of the difficulty distribution, whereas mathematics area items are at the very bottom of the distribution. Interestingly, parallel area questions (e.g., P-A3, C-A3, and M-A3) usually differ quite significantly in difficulty.
The fit of the items with the model can be evaluated from Table I in the Supplemental Material [29]. No items are degrading for measurement (all have infit and outfit MNSQ values within the range of 0.5-1.5). Items M-A3, M-S1, and M-A2 have the largest outfit values. However, infit values of all these items are smaller than their outfit values, and infit is generally regarded as the more important indicator of fit than outfit, since large outfit can be caused by careless mistakes or lucky guessing of a small number of students. From correlations listed in Table I, which are all positive and greater than 0.3, it can be concluded that all items worked together. The analysis of the test as a whole suggests that the test succeeded in defining the underlying variable (student understanding of graphs), and that a reliable scale of item difficulties was obtained for the items in the test, which allowed further analysis of difficulties of different groups of items.
In order to compare the difficulties of items in each investigated context, the average values of item difficulties over three different domains (mathematics without context, physics, mathematics in context) and two investigated concepts (slope, area) were calculated and graphically represented in Figs. 3 and 4. The average difficulty of all items in the test is usually set to zero in Rasch analysis, so positive item difficulties indicate items more difficult than the average, and negative difficulties indicate items easier than the average. From Fig. 3 it is visible that the concept of slope in all three domains is close to the average difficulty, or easier, and that the differences among the domains are not very large. Error bars indicate the level of dispersion of item difficulties from the average difficulty. The results suggest that the concept of the graph slope in the M domain is easier for students than the same concept in the P domain, but difficulties of the concept of slope in P and C domains, as well as of the same concept in M and C domains, cannot be clearly distinguished from one another due to large uncertainty of the average difficulty in the C domain. Altogether, the concept of slope appears rather homogenous in difficulty. This is also visible in Fig. 2, where 10 slope items are found in the 0.6 logit wide interval, indicating small differences in difficulty levels for the majority of slope items in the test. The slope items that are outside this group are C-S1 and P-S1 (the most difficult slope items), which required calculation of the line graph slope in P and C domains, followed by P-S5 and C-S5, which required interpretation of the slope of a curved graph. On the other side of the distribution is item C-S2, which was the easiest slope item (Fig. 1).
The differences among domains are much more pronounced when the concept of the area under the graph is analyzed. The concept of the area under the graph in the M domain appears to be much easier than the same concept in P and C domains, but also much easier than the concept of slope in any of the domains. On the other hand, the concept of the area under the graph appears to be of similar difficulty in P and C domains, and of much higher difficulty than the concept of slope in any of the domains.
When item difficulties are averaged over domains it can be seen (Fig. 4) that mathematics without context (M domain) was the easiest domain for students in this test, whereas physics and other contexts were much more difficult. Because of large uncertainties of average difficulties of physics and other contexts, the difference in the difficulties of these two domains cannot be resolved, and they appear equally difficult.
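The averaging of item difficulties over domains and concepts can be sketched as follows. The item labels follow the scheme used in this paper, but the numerical values are invented placeholders, not the measures obtained in the study. The sketch also assumes that the dispersion is summarized by the standard error of the mean, which the text does not specify.

```python
import math
import statistics
from collections import defaultdict

def summarize_by_group(difficulties):
    """Average item difficulties (in logits) per (domain, concept) group.

    difficulties: dict mapping labels such as 'P-S1' to item measures.
    Returns {(domain, concept): (mean, standard_error_or_None)}.
    """
    groups = defaultdict(list)
    for label, measure in difficulties.items():
        domain, item = label.split("-")        # 'P-S1' -> 'P', 'S1'
        groups[(domain, item[0])].append(measure)  # 'S' = slope, 'A' = area

    summary = {}
    for key, values in groups.items():
        mean = statistics.mean(values)
        if len(values) > 1:
            # Assumption: dispersion reported as standard error of the mean.
            sem = statistics.stdev(values) / math.sqrt(len(values))
        else:
            sem = None
        summary[key] = (mean, sem)
    return summary

# Placeholder values for illustration only (NOT results of this study).
example = {
    "M-S1": -0.4, "M-S2": -0.2,
    "P-S1": 0.6, "P-S2": 0.2,
    "C-S1": 0.5, "C-S2": -0.7,
}
summary = summarize_by_group(example)
```

Two group averages can be regarded as distinguishable only if their difference is large compared with the combined uncertainties, which is the reasoning applied to the P and C domains above.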

B. Analysis of student explanations
In addition to analyzing mean item difficulties over different domains and concepts, it was also important to analyze student written explanations and calculations which accompanied their answers to questions. Students provided many explanations that gave us important insight into their ways of reasoning on different items. Since it is rather extensive, the full report on student explanations and the student difficulties with graphs detected from them will be given in a subsequent paper. Here we will give only an illustration of student reasoning through the example of the set of parallel items M-S2, P-S2, and C-S2, shown in Fig. 1. The following comments of two students illustrate different reasoning strategies used by students in different domains.

Student 1
M domain (correct answer): Line p is steeper, we can also see that as tg = 2/1 = 2 for p, and tg = 1/1 = 1 for q.
C domain (correct answer):

Student 2
M domain (correct answer): Line p has a larger interval on y-axis than line q for the given interval on the x-axis.
C domain (correct answer): In the period of 3 months stock ING increased for 230 €, and stock EXP for 120 €.
P domain (incorrect answer): a = v/t; since body A has larger v at t = 2 s, then its a is also larger.

Both students answered questions M-S2 and C-S2 correctly but failed on physics question P-S2. The difference in reasoning strategies in the three domains is obvious. In the M and C domains student reasoning is based on the concept of slope, either explicitly or implicitly (as rise over run). Both students are obviously able to reason about the graph slope, using it either explicitly or implicitly. However, in the physics context, they both resort to a different strategy (use of formulas, in these cases incorrect ones), and do not activate their knowledge or reasoning strategies about the slope. Not only in these two cases, but generally, students' preferred strategy in the P domain seemed to be the use of formulas, and it often led them to the wrong conclusions. On this set of questions, 44% of students used a formula (either correctly or incorrectly) as the basis for their reasoning in physics, compared to 21% and 8% in mathematics and other contexts, respectively. At the same time, many students displayed the ability to reason on the basis of graph slope in mathematics, and sometimes also in other contexts, but in physics this type of reasoning seems to have been blocked in many cases by students' choice or habit to rely on formulas. Students often used different strategies in different domains, which suggests that the activation of their cognitive resources is largely context dependent. For example, explicit reasoning on the basis of rise over run, expressed in words, was used by almost every fourth student on C-S2 (23%), whereas on P-S2 and M-S2 it was almost negligible (5% and 3%, respectively). In mathematics, one important incorrect strategy was to identify the slope as the angle between the straight line and the x axis, either explicitly (18% of the students) or implicitly (10%), by reasoning on the basis of the steepness of the line.
The same was done explicitly by 3% of the students in the P domain and 7% in the C domain, with the possibility that many more students used it implicitly through the word "slope" in those contexts. Large fractions of students used the word slope in P and C domains (27% and 38%, respectively), but the meaning of that concept may not be clear for many of them. This is suggested by the fact that in the M domain, where students actually had to explain what they meant by slope if they wanted to explain which straight line had the larger slope, about half of them (49%) gave incoherent or irrelevant explanations or no explanations at all. Questions M-S2, P-S2, C-S2 are convenient for illustrating differences in student reasoning strategies in different domains, because C-S2 was the best solved question from the C domain. Many explanations were provided by the students on that question, more than on the other C domain questions, and we were therefore able to get a good insight into students' reasoning. It is possible, although not certain, that C-S2 was solved better than the other questions in C domain because of the context of the question. Comparing prices and their growth (not necessarily of stocks, but also of many other everyday items) may be something that students are accustomed to in everyday life, so this problem may have activated some of the resources that they use in everyday life. If that is really true, then this question might be an example of how context in some cases may help students in solving the problem, although in most cases in this study it seemed to have presented an additional obstacle in that process.

V. CONCLUSIONS
In physics, as well as in other sciences, it is usually expected that students possess the necessary mathematical knowledge of graphs and that they transfer it readily to new contexts which are more or less familiar to them. In this study we have attempted to compare student performance on mathematically similar problems, which were situated in different contexts: each problem was posed once directly mathematically, once in the physics (kinematics) context, which required physics content knowledge, and once in some other context, which did not require additional content knowledge. The physics context (kinematics) should have been rather familiar to students, since they had encountered similar problems in high school. Other context problems presented in the test were expected to be less familiar to students; such problems are far less frequently given in Croatian high schools than kinematics problems. It was expected that the M domain would be the easiest and the C domain the most difficult domain, with physics somewhere in between those two domains. The results confirmed the prediction that mathematics without context is the easiest domain in the test, but contrary to expectations, C and P domains turned out to be approximately equally difficult. The results suggest that students do not solve kinematics problems better than the presumably less familiar other contexts problems and that kinematics is still a difficult context for students, even though it was rather extensively covered in high school. Student understanding of kinematics concepts seems to still not be sufficiently developed. As was shown in our previous study on graphs [19], many physics teachers attribute student difficulties with graphs in physics to student lack of mathematical knowledge.
It was again shown in this study that even if students have the needed mathematical knowledge, which was generally the case in this study (although some problems were noticed in that area too), it does not guarantee their success on parallel physics or other context problems. The added context generally increased the difficulty of parallel problems relative to mathematics, because problems including context required one more step in solving: interpretation and translation of context into mathematical language. However, there was one instance (item C-S2) where the context seemed helpful and decreased the difficulty of the problem even below the difficulty of the parallel mathematical problem.
The results of the study reveal the differences between student understanding of the concept of the graph slope and the concept of the area under the graph. The concept of slope, even though usually mathematically more demanding, seems to be better understood than the concept of the area under the graph. The interpretation of the meaning of the area under the graph seemed to present the biggest problem for students in our study. The reason for this could be found in the fact that during teaching of kinematics the interpretation of the slope is usually emphasized much more than the interpretation of the area under the graph. In this study, student reasoning about the graph slope appears as a rather compact element, not so much influenced by the presence of context or the lack of it, as can be concluded from Figs. 2 and 3. This could suggest the transfer of knowledge between the domains (mostly from mathematics to the other two domains), and seems to be consistent with our initial assumption that the concept of the graph slope can be regarded as a transferable cognitive unit. The analysis of explanations of student answers showed that the student understanding of the concept of the slope is often only partly correct. The slope items that stand out in terms of difficulty are the items that ask for computation of the slope (C-S1, P-S1) and the items that ask for the reasoning about the slope of the curved graph (C-S5, P-S5). Computation of the slope requires procedural knowledge in addition to understanding, which explains the higher difficulty. This agrees with the results of some other studies [16], which found that slope computation seemed to be the most difficult aspect of the concept of slope. Analysis of curved graphs is not so often tackled in the high school curriculum, so this was something new for most of the students.
The difficulty of the concept of the area under the graph differs dramatically between mathematics on the one side and physics and other contexts on the other, as shown in Fig. 3. This is consistent with the findings of Nguyen and Rebello [18] that very few students are able to apply this concept in physics problems. Here it is obvious that the interpretation of the mathematical quantities in physics or in other contexts is a crucial step which most students in our sample were not able to perform. Some cases of transfer of the approach to the problem solving from physics to other contexts were found on the area items (e.g., dimensional analysis).
Regarding the research question of the study, it can be concluded that context generally seems to increase the difficulty of items. Context added to the mathematical slope or area problem will increase the cognitive demand on the students, acting as an additional barrier in the problem, and will therefore also increase the difficulty of the item. The only exception might be very familiar contexts to which students are very much accustomed, in which case context can even reduce the difficulty of the item relative to the parallel mathematical problem.
The analysis of student explanations which accompanied their answers revealed many interesting student difficulties regarding concepts of the graph slope and the area under the graph. They will be presented in a separate paper. The examples of differences in student strategies on parallel questions presented in this paper were chosen to illustrate that students' preferred strategy in solving physics problems seems to be based on the use of formulas (often incorrect ones). This strategy in some cases seems to be blocking the use of potentially more successful reasoning strategies based on slope, which students do possess. This is a difficulty that might be interpreted as a problem of framing. Even though students possess certain resources, they do not activate them because different contexts of problems lead them to choose different solving strategies. Had this study been limited to only kinematics graphs, we might have concluded that students have very limited knowledge of graphs. However, in many instances this would not have been true, since students have often displayed their ability to extract relevant information from graphs in mathematics, and sometimes also in other contexts, but did not use that ability on physics problems, in spite of the received high school instruction on kinematics graphs. Why that happened and what the students' most common reasoning difficulties concerning graphs are remains to be further investigated by a careful analysis of student written explanations and interviews with selected students, which are planned to be carried out soon. The findings will be presented in a separate paper.