Developing the use of visual representations to explain basic astronomy phenomena

[This paper is part of the Focused Collection on Astronomy Education Research.] Several decades of research have contributed to our understanding of students’ reasoning about astronomical phenomena. Some authors have pointed out the difficulty in reading and interpreting images used in school textbooks as a factor that may justify the persistence of misconceptions. However, only a few studies have investigated to what extent usual textbook images influence students’ understanding of such phenomena. This study examines this issue, exploring 13–14 year old students’ explanations, drawings, and conceptions about three familiar phenomena: change of seasons, Moon phases, and solar or lunar eclipses. The research questions that guided the study were (RQ1) How are students’ explanations and visual representations about familiar astronomical phenomena affected by different image-support conditions? (RQ2) How are students’ conceptions about familiar astronomical phenomena affected by different image-support conditions? (RQ3) Which features of the used images most affected the students’ visual representations and explanations of familiar astronomical phenomena? To answer our research questions, we designed three instructional contexts under increasing support conditions: textbook images and text, teaching booklets with specially designed images and text, only text. To analyze students’ drawings, we used exploratory factor analysis to deconstruct drawings into their most salient elements. To analyze students’ explanations, we adopted a constant comparison method identifying different levels of increasing knowledge. To investigate students’ conceptions, we used a mixed multiple-choice and true-false baseline questionnaire.
For RQ1, results show that the specially designed images condition was effective in helping students produce informed drawings in comparison to the text-only condition for all phenomena, and more effective than the textbook images condition when one considers seasonal change drawings. Concerning RQ2, the specially designed images condition was the most effective for all phenomena. Concerning RQ3, prevalent elements of astronomy images that affected students’ explanations and visual representations were Earth’s elliptical orbit; the position of the Sun with respect to the Moon orbit; and Sun, Moon, and Earth alignment. Our findings confirm concerns about textbook astronomy images, whose features may interfere with the identification of the relevant factors underlying the phenomena. Moreover, findings of this study suggest that affordances of the specially designed images may play an essential role in scaffolding meaningful understanding of the targeted phenomena. Implications for teaching through and learning from visual representations in astronomy education are briefly discussed.


I. INTRODUCTION
The use of visual representations in scientific communication and science education nowadays is well established [1][2][3]. However, the debate about their effectiveness in educational contexts has gone on for almost thirty years (for a summary, see Refs. [4,5]). In the pre-Internet era, many research studies found evidence to support the effectiveness of graphics in the learning process [6][7][8][9][10][11][12]. At the same time, other authors have pointed out that illustrations are not always adopted in an effective manner in the teaching practice [13][14][15][16]. In more recent times, the debate has shifted towards a more general analysis of the use and design principles of visual representations [17][18][19][20], including the role of images in science textbooks [21][22][23][24] and the extent to which it is possible to learn science from infographics [25]. However, the interest of educational research in investigating the link between students' interpretation of images and their understanding of the underlying scientific concepts has been quite limited in contrast with such increasing availability of massive iconic resources.
In this paper, we will investigate how high school students read, interpret, and generate visual representations in astronomy. The main reason for choosing this disciplinary area is that astronomy education and popularization are historically based on images. For instance, science and physics textbooks often feature both Sun-Moon-Earth system diagrams and photographs to explain astronomical phenomena. Similarly, planetariums and science centers regularly offer visitors realistic simulations of space travel and exploration based on computer elaboration of real photographs. Consequently, software packages (Celestia, Starry Nights, Stellarium) are now increasingly being included in teaching proposals about astronomical phenomena [26]. Moreover, multimedia visual representations may foster astronomy literacy since they aim at representing large-scale systems and at supporting students in conceptualizing and representing complex phenomena that cannot be experienced first-hand, such as the evolution of the Universe or cosmology [27].
Images are also fundamental tools for astronomers and professional researchers in astrophysics, who extensively use high-definition photographs obtained from Earth telescopes and satellites (e.g., Hubble Space Telescope, Hipparcos, Spitzer) to study the morphology of celestial objects, for instance, galaxies, and to infer their physical properties.
At the same time, however, astronomy is a subject where students frequently retain a variety of alternative conceptions, especially about basic astronomical phenomena such as the day-night cycle [28][29][30], the cause of seasons [31][32][33], and the mechanism of the lunar phases [34][35][36][37]. Some authors [38][39][40] have generically pointed out the difficulty with reading and interpreting the images commonly used in school textbooks to illustrate astronomical phenomena as a possible factor that justifies the persistence of such alternative conceptions. However, previous studies either focused only on a specific phenomenon or did not interpret such difficulty in the light of broader results about visual representations in science education and educational psychology. This study takes such a more general perspective and examines the relationship between astronomy visual representations and students' understanding in the case of change of seasons, Moon phases, and solar or lunar eclipses. We chose such phenomena since they have been thoroughly studied in astronomy education literature and because visual representations of these phenomena are present in most science textbooks at the middle and high school level.

A. Students' interpretation of science visual representations and textbook images
The increasing use of pictures or graphical representations in science teaching raises the issue of knowing the so-called "visual language," which features functions and structures like those of verbal language [41,42]. In particular, knowledge of visual language involves the awareness of semantic and interpretative processes at the basis of communication through visual representations [43]. In other words, to know the visual language adopted in science implies to know how information is encoded in scientific representations through the representational syntax [44,45]. In the teaching of science, visual language necessarily exists alongside more traditional modalities of communication such as verbal or mathematical ones [46][47][48]. The nature of images used in science teaching varies according to the school level. Figurative images that represent ideas or scientific concepts are mostly used at the primary and middle school level, when students still have a limited familiarity with the language of mathematics. As the educational level rises up to university instruction, science communication increasingly relies on more schematic and technical images. According to Lemke [49], scientific texts can be considered in general multimodal; that is, they feature different modes (verbal, mathematical, and visual), which interact with each other with the aim of expressing the concepts to be communicated [50]. For example, a frequently used modality is the verbal-visual one, where a text is accompanied by one or more Cartesian graphs that detail the reported contents [51]. A further example is the illustration-based modality, where the content is sketched to show only its main features and to highlight the relationships among different components of a system [52].
Some authors have suggested that visual representations might foster students' learning processes since they preserve geometric and topological relationships with the objects they represent [43]. Similarly, it has been reported [53] that students who received information about the human circulatory system in diagrammatic form performed better than students who received the same kind of information with text. In particular, diagrams helped students in learning the structure and function of the circulatory system. The authors interpreted such evidence as a beneficial effect of diagrams and visual representations in general on metacognitive strategies of the students such as self-explanation, while other studies [54] suggested that this result could be due to activation of cognitive strategies such as inference or judging.
It should be pointed out, however, that all the above studies emphasize that specific support is needed to take full advantage of visual representations. As evidenced by some studies [55,56], when such support is lacking, or when representations are oversimplified or impoverished, the sole presence of photographs, drawings, and diagrams in school textbooks does not guarantee a greater effectiveness in communicating scientific concepts. This may be due, for instance, to the fact that students need to know how to decode the specific visual language of an image so that they can correctly interpret its content [2,44]. For instance, some studies show that students consider diagrams more useful than photographs for understanding a scientific concept, since photographs could hide implicit messages that are not easily decipherable [19]. Other studies show that images may produce an effect contrary to that intended by the authors themselves, unavoidably altering the traditional function attributed to images of helping the explanation of a concept [57,58]. In this respect, some authors [59] suggested that textbook images do not always provide students with a suitable textual or graphical context for making sense of graphs or diagrams, which are often isolated and not connected to other representations that would convey the same information. Finally, some studies [60] seem to imply that such difficulty is due to students' scarce awareness of the role that visual representations play in communicating science concepts, or to a poor knowledge of the topic represented in the image [54]. We examine the latter issue in the field of astronomy education in Sec. II B.

B. Students' interpretation of astronomy textbook images
While many studies have analyzed students' difficulty in understanding astronomical phenomena [61], only a few studies have focused on students' difficulty with the interpretation of diagrams and iconic representations of these phenomena. Most research studies generically point out that representations of astronomical phenomena can be misleading because they are often complex, ambiguous, and necessarily represent only a specific view (e.g., top or side), since the 3D phenomena are forced into a 2D depiction [39][62][63][64]. On the other hand, Vosniadou [65] suggested that typical textbook representations of astronomical phenomena, such as those representing the motion of Earth around the Sun, are conceptual models and, as such, can be difficult for the students to interpret because they (i) require a domain-specific knowledge; (ii) are often not consistent with the perceptually based models that students have created using their everyday experience. While the above studies have often called for more research in order to better understand the role of visual representations in education, only four studies, to the best of our knowledge, addressed this issue with a systematic research design [38][66][67][68].
As far as the cause of seasons is concerned, in a study with about 100 prospective primary teachers [38], the author inferred that textbook images contain three potentially misleading representations. The first two concern representations of the solar rays hitting Earth's spherical surface either without or with indication of the tilt of Earth's axis (Figs. 1 and 2). In both images, rays are represented as segments of different length at the polar regions and at the equator. According to the author, the different length of the segments indicated as "sunrays" suggests a distance between the equatorial zone and the Sun that is "seemingly" shorter compared to that of the polar regions resulting in a temperature difference between pole and equator (distance misconception). Students, hence, should recognize that the perspective of both images is exaggerated only to highlight the spherical shape of Earth and the tilt of its axis; and the temperature effect due to different distances from the Sun of the two terrestrial regions is negligible.
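The claim that the pole-equator distance effect is negligible can be checked with a short calculation. The sketch below is not part of the cited study: the function names are hypothetical, the constants are standard approximate values, and the atmosphere is ignored. It compares the inverse-square effect of being one Earth radius farther from the Sun with the effect of sunlight striking the ground at a low angle.

```python
import math

AU_KM = 149_597_870.7      # 1 astronomical unit in km
EARTH_RADIUS_KM = 6_371.0  # mean Earth radius in km


def flux_drop_from_extra_distance(extra_km: float) -> float:
    """Fractional drop in sunlight intensity for a point that is
    `extra_km` farther from the Sun than 1 au (inverse-square law)."""
    return 1.0 - (AU_KM / (AU_KM + extra_km)) ** 2


def flux_on_ground(sun_elevation_deg: float) -> float:
    """Sunlight per unit of horizontal ground area, relative to the Sun
    being directly overhead (no atmosphere): sin(elevation)."""
    return math.sin(math.radians(sun_elevation_deg))


# Assumption: a polar region is at most about one Earth radius farther
# from the Sun than the point where the Sun is overhead.
distance_effect = flux_drop_from_extra_distance(EARTH_RADIUS_KM)

# Illustrative elevations: Sun overhead vs. Sun 20 degrees above the horizon.
angle_effect = 1.0 - flux_on_ground(20.0) / flux_on_ground(90.0)

print(f"drop due to extra distance: {distance_effect:.1e}")
print(f"drop due to low Sun angle:  {angle_effect:.2f}")
```

Under these assumptions the distance term is of order one part in ten thousand, while the incidence-angle term is a large fraction of the total, which is the contrast that the "sunrays" drawings tend to obscure.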
FIG. 1. Representation of the nonuniform distribution of solar light on Earth's surface. Adapted from Ref. [38].
The third one concerns representations in which Earth's orbit around the Sun is represented with an emphasized eccentricity (Fig. 3). Such representation could be a "visual pitfall" [40] since it may lead students to think that during its motion around the Sun, Earth significantly changes its distance from the Sun, thus justifying seasonal changes with Earth being closer to (summer) or farther from (winter) the Sun. The distance misconception may be reinforced also if the representation of Fig. 3 is used in combination with those of Figs. 1 and 2 without suitable explanations. The author also points out that students may experience difficulty in interpreting the presence of four Earths in the same diagram, and in correctly inferring the position of the Sun within the orbit.
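To make concrete how much a typical textbook ellipse exaggerates the real orbit, the following sketch computes the shape and the distance variation implied by Earth's actual eccentricity. The value 0.0167 is a standard approximation, and the function names are hypothetical, introduced only for this illustration.

```python
import math

ECCENTRICITY = 0.0167  # approximate eccentricity of Earth's orbit


def axis_ratio(e: float) -> float:
    """Minor-to-major axis ratio of an ellipse with eccentricity e."""
    return math.sqrt(1.0 - e ** 2)


def distance_ratio(e: float) -> float:
    """Aphelion distance divided by perihelion distance."""
    return (1.0 + e) / (1.0 - e)


def flux_ratio(e: float) -> float:
    """Sunlight intensity at perihelion divided by intensity at
    aphelion, from the inverse-square law."""
    return distance_ratio(e) ** 2


print(f"minor/major axis:        {axis_ratio(ECCENTRICITY):.5f}")
print(f"aphelion/perihelion:     {distance_ratio(ECCENTRICITY):.4f}")
print(f"perihelion/aphelion flux: {flux_ratio(ECCENTRICITY):.4f}")
```

Drawn to scale, the orbit is indistinguishable from a circle, and the distance changes by only a few percent over the year. In addition, Earth is closest to the Sun in early January, during the northern winter, which is the opposite of what the distance explanation would predict for the northern hemisphere.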
In another study with 652 9th grade students, no significant influence of the elliptical representation of Earth's orbit on students' explanations about the cause of seasons was found [67]. The author presented six differently shaped diagrams of the orbit of Earth around the Sun. Four out of six diagrams showed an emphasized eccentricity of the orbit, while two represented a circular orbit. In the four diagrams with the elliptical orbit, Earth's axis was reproduced with the correct inclination, while in the two drawings with circular orbits, there was no reference to Earth's axis. In three diagrams, a light shading indicated the portion of the illuminated hemisphere. In all six diagrams, the Sun and Earth had the same size and the Sun was always placed at the center of the orbit. All diagrams featured the presence of text to indicate the beginning of spring and autumn equinoxes and the beginning of winter and summer solstices. Explanations given by the students on the cause of the seasons were collected before and after viewing the diagrams. The results show that the pronounced eccentricity of Earth's orbit did not seem to favor explanations based on the Earth-Sun distance. In particular, the explanation based on Earth's axis tilt was the most frequent amongst the students who received a diagram with an elliptical orbit. We note, however, that the inclination was present only in the four diagrams with the elliptical orbit, and most likely the presence of the tilted axis could have influenced students' explanations. On the other hand, the use of shaded zones seems to have favored an (incorrect) explanation based on the rotation of Earth around its axis: one side of Earth is exposed to the Sun (summer), while the other side is farther from it (winter).
Regarding the phases of the Moon, the results of the studies in Refs. [66,68] concur that common textbook images are misleading because they either wrongly represent the relationship between the planes of the Moon and Earth orbits or show all phases together as if they were caused by Earth's rotation. In particular, results [66] show that in most of the analyzed diagrams (i) it is necessary to identify the exact position of the Sun and of the observer to construct geometrically the image of the Moon as seen by the observer on Earth; (ii) the Moon's orbit is often shown aligned with the ecliptic, so that a confusion between full (new) Moon and lunar (solar) eclipse may arise; (iii) the observer's viewpoint (on Earth) is not explicitly indicated, so that in some cases Moon phases are represented from the space viewpoint but with the shape as seen from Earth; (iv) the relationship between the duration of the phases and the Moon rotational period is not explicit; (v) the relationship between the represented positions of the Moon and the illumination received from the Sun is not explicit. Similar concerns are raised in Ref. [68]. The author suggests that students' difficulty in understanding Moon phases is due to textbook representations, where the Moon orbits around Earth in a counterclockwise direction and four or eight moons are simultaneously shown (Fig. 4). In such a way, it is necessary to reconstruct the conditions for the phase represented in the image and to relate the snapshot of the Moon to the positions of the Moon along its orbit around Earth. Moreover, no indication about the time interval between two consecutive phases is given. Finally, Earth and space viewpoints are confused.
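Points (ii) and (v) above can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than material from the cited studies: it treats the Sun as infinitely distant, uses standard approximate values for the Moon's mean distance and orbital inclination, and the function names are hypothetical.

```python
import math

MOON_DISTANCE_KM = 384_400.0  # mean Earth-Moon distance
EARTH_RADIUS_KM = 6_371.0     # mean Earth radius
INCLINATION_DEG = 5.145       # Moon's orbit relative to the ecliptic


def illuminated_fraction(elongation_deg: float) -> float:
    """Fraction of the lunar disk seen lit from Earth for a given
    Sun-Earth-Moon angle (assumes a very distant Sun)."""
    return (1.0 - math.cos(math.radians(elongation_deg))) / 2.0


def max_offset_from_ecliptic_km() -> float:
    """Largest distance of the Moon above or below the ecliptic plane."""
    return MOON_DISTANCE_KM * math.sin(math.radians(INCLINATION_DEG))


for name, angle in [("new", 0), ("first quarter", 90), ("full", 180)]:
    print(f"{name:>13}: {illuminated_fraction(angle):.2f} of disk lit")

offset = max_offset_from_ecliptic_km()
print(f"max offset: {offset:.0f} km "
      f"(about {offset / EARTH_RADIUS_KM:.1f} Earth radii)")
```

The phase depends only on the Sun-Earth-Moon angle, not on Earth's shadow. Because the Moon can be several Earth radii above or below the ecliptic plane, while Earth's shadow at the Moon's distance is narrower than Earth itself, most full Moons miss the shadow; a diagram that draws the two orbits in the same plane hides this.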
In another study [69], about 100 books for children were analyzed. Analysis focused on three different types of representations of the Moon: (i) the appearance of the phases, (ii) the names of phases, and (iii) the sequence of phases. In particular, reproductions and textual descriptions in the analyzed books were compared with actual photographs of the Moon and with textbooks usually adopted in primary schools. Text and images were coded according to how the Moon was conceptualized and represented, using two categories: (i) the scientific category, which groups the correct representations from the scientific point of view, and (ii) the alternative category, which groups representations with misconceptions and scientific errors. For each illustration of the Moon, the researchers first determined which one of the eight representative phases (for example, full moon, first quarter, etc.) was illustrated. Next, the researchers determined if the representation was scientific or alternative. Results show that more than half of the illustrations featured in the analyzed books represented a full moon.
About a quarter of the illustrations included the image of a crescent moon. Very few representations referred to the new Moon, gibbous Moon, or quarter Moon. In some other cases, the representations included illustrations of Moon phases that were either not observable or not correct. In particular, about one-third of the crescent Moon representations were wrong, as well as about one-quarter of the gibbous Moon representations. All nonscientific representations were consistent with the misconception that the phases of the Moon are caused by the shadow of Earth, a phenomenon that can be observed only during a lunar eclipse. In other cases, the representations "transformed" the lunar phases, showing, for example, that the Moon changes in size in contrast to the observation of everyday life. Other illustrations showed the crescent Moon in wedges, with stars in the area where there should be a part of the Moon not illuminated.
While providing valuable insights, the above studies give only a partial account of the students' difficulty in reading astronomy-related images since they concern specific phenomena and do not address how students' reasoning may vary across different astronomical contexts.

C. Student-generated visual representations and learning in science
According to Van Meter and Garner [70], "drawing involves constructive learning processes that engage nonverbal representational modalities and requires integration." Such modalities encompass first the construction of an internal representation of the concept and then the effort to externalize it in conventional form through referential links [71].
From the cognitive point of view, early research studies found that involving students in the process of generating drawings could promote memory, observation, and imagination [72,73]. More recent findings suggested that drawing might be a beneficial activity for the quality of children's writing concerning explanations and procedures [74]. In this respect, some authors proposed that young students might use drawings as a way to select and capture certain features of the reality around them in order to construct patterns of reasoning about it [75]. It has also been suggested that drawings may promote learning of science contents [76], in particular of models [77]. As such, drawings might be used as formative assessment probes in their own right [78]. Moreover, the role of drawings as an effective learning strategy has been clearly supported by the findings of two studies [79,80]. In the first [79], pupils (fifth and sixth grade) read a two-page text about the central nervous system and were asked to make drawings to represent concepts featured in the text under different support conditions (from providing an illustration to no illustration provided). The results showed that pupils in the most supported condition scored higher marks than the other participants in the drawing task. Such results were also confirmed by the subsequent study [80]. More importantly, the latter study suggested that drawing affects learning more than simply dealing with visual presentations without the request to draw. Finally, using an approach that emphasized the role of visual representations as epistemic practices, some authors have claimed that constructing, explaining, justifying, and refining open-ended representations of scientific processes may help students learn about conceptual knowledge and the nature of scientific inquiry, and communicate scientific evidence [81].
From the research viewpoint, many studies have supported the "traditional" claim that student-generated drawings could help in uncovering reasoning strategies, attitudes, and mental representations [82][83][84][85][86][87][88][89]. However, research findings have often been inconsistent concerning the effectiveness of drawings as a way to uncover students' misconceptions and learning outcomes [70]. Such inconsistency has led to a relative paucity of studies about students' drawings in discipline-based educational areas [90][91][92] and to methodological criticisms of drawings as a research technique (see, for example, the case of the Draw-A-Scientist test [93][94][95]). In the effort to mediate between such positions, it has been theorized [96] that children's drawings are products of pictorial conventions in cultural contexts and, therefore, the conception that is represented in a drawing depends on the convention chosen by the student for the representation. Similarly, other authors [97] found that children's drawings about the human body were influenced by different choices related to sociocultural expectations of their community, artistic aims, and practical reasons. In some cases, relevant features, with which the students were familiar, had been simply overlooked. Such results suggest that context and specific situations may affect drawings.
However, the influence of conventional representations on students' drawings seems to decrease as instructional support increases. For instance, in Ref. [98], drawings about tropical rainforests produced by 9 to 11 year old students before and after a site visit were analyzed. The author used a scoring rubric with three dimensions: breadth, extent, and depth. The breadth dimension referred to the quantity of appropriate themes used in the drawings. The extent dimension referred to the quantity of appropriate images used by the pupils in their drawings. The depth dimension referred to how deeply and richly the children represented the addressed themes. Results showed that before the visit, pupils' drawings mainly reproduced familiar countryside contexts, while after the visit, the pupils improved their drawings concerning the extent dimension of the biodiversity, different species of trees and plants, and rainforest features. Similarly, in a recent study [99], third year undergraduate students' drawings of the leukocyte cascade were examined before and after a teaching intervention consisting of a lecture series and laboratory tasks using research publications on the topic as instructional tools. The results show that most of the generated drawings were original in nature, revealing different features of the dynamics of the leukocyte cascade. Finally, it has been found that a multimodal teaching sequence, which included field observations, viewing of photographs and videos, and drawing as a communication tool, was effective in increasing students' knowledge about carnivorous plant structure and function [100].

D. Students' representations in astronomy
Many studies in astronomy education have used students' drawings to elicit misconceptions within written tasks or interviews [32][101][102][103][104]. However, few studies have analyzed in detail the relationships between students' drawings and the underlying mental models of familiar astronomical phenomena, following the trails of the influential works described in Refs. [30,87].
For instance, to validate her theory of contextualization of students' drawing, Ehrlén interviewed 18 students (age 6-9) when they were engaged in the task of drawing Earth as a planet in the Solar System [96]. While the results are in agreement with previous studies, she also found that students might not differentiate between the concepts of Earth, country, and planet.
A more detailed study was carried out in Ref. [105]. The author reports the case study of a pair of fifth-grade students who were asked to produce a visual representation that could explain the Moon phases. The main purpose was to propose a theoretical model of the process by which students build and elaborate explanations of the phases phenomenon by generating and exploiting visual representations. The hypothesis was that the explanations are manifestations of the students' current state of knowledge, and, therefore, changes in the explanation should provide evidence of changes in the underlying conceptualizations. The proposed model aims to provide a cognitive interpretation of the contingent generation of explanations of scientific phenomena, highlighting the progression of these explanations from initial to most advanced stages, describing at the same time the dynamics behind this process. Results show that the students' mental models of the lunar phases progress towards more sophisticated explanations of the phenomenon after appropriate educational stimuli. The study demonstrates the important role of the visual representations produced by the two students, who were able to make a transition between different stages of conceptual understanding.
In Ref. [66], about 78 third-year undergraduate students were asked to draw a scheme of the Sun-Earth-Moon system and to justify the phases of the Moon. Moreover, they were asked also to draw what would have been seen by an astronaut in a spaceship orbiting around Earth. Results confirm previous research findings. For instance, in some cases, drawings suggest that sunrays "obscure" some portions of the Moon. In other drawings, the Moon is present only at night, so that the phases might be due to Earth's rotation around its axis. Similarly, the role of the relative position between Earth, Sun, and Moon in the phases phenomenon seems difficult to grasp. For instance, when asked to draw how an astronaut would see the Moon in its quarter phase, the appearance of the Moon was the same as from Earth. On the contrary, when asked to draw a full Moon phase as seen from different places on Earth, the appearance of the Moon was different due to Earth's shadow that would cover part of Moon surface.

III. AIMS AND OBJECTIVES
This study aims at exploring two main issues about the role of images in astronomy education emerging from the literature. First, to investigate how students' visual representations and explanations of the mechanisms underlying the targeted phenomena are affected when learning with text or images. Second, to investigate the effects of images purposely designed to address issues of traditional textbook visual representations of astronomical phenomena.
The first two research questions that thus guided the study were:
RQ1: How are students' explanations and visual representations about familiar astronomical phenomena affected by different image-support conditions?
RQ2: How are students' conceptions about familiar astronomical phenomena affected by different image-support conditions?
The image-support conditions will be set out as follows: traditional textbook images + text; specially designed images + text; text only. The main hypothesis is that students who learn in the specially designed images + text support condition will give better explanations, generate better representations, and learn more about the targeted phenomena. Secondarily, we hypothesize that students who learn in the textbook images support condition will give better explanations, generate better representations, and learn more about the targeted phenomena than students of the text-only support condition.
To further inspect the role of the specially designed images, a third research question was also set out: RQ3: Which features of the used images most affected the students' visual representations and explanations of familiar astronomical phenomena?
Given the curricular requirement of the teaching of astronomy in Italy, the study involved students at the beginning of the secondary school cycle (13-14 year old students, grade 9). More details about the sample are given in Sec. IV.

A. Description of specially designed astronomy images
To construct the instructional context for the study, we first developed a set of specially designed, "innovative", astronomy images about the chosen phenomena: change of seasons, Moon phases, solar and lunar eclipses. We call these specially designed images innovative, because their design principles are research-based and theoretically driven. In the development of such images, hence, we started from the available literature in astronomy education about visual representation, and interpreted the results using a sociosemiotic theoretical framework [42].
The adopted framework is based on the assumption that visual representations consist of iconic features related to the content they depict, on a concrete or abstract level, through representational structures. Representational structures are the ways in which symbols and signs are organized with the aim of expressing the concepts to be communicated [43]. According to the framework, the possibility of combining different types of representational structures generates difficulty with the interpretation of the message encoded within the image.
Since the framework categorizes visual representations according to the spatial organization of featured signs and symbols, also the difficulties encountered by students when dealing with visual representations can be categorized according to the same criteria. In previous studies [106][107][108][109][110], such criteria have been organized into a list of iconic features that are significant when reading and interpreting documents containing images. The items of the list that are relevant for this study are reported in Table I.
We remark that the iconic features in Table I do not reflect the accuracy of a visual representation: rather, they represent how depicted signs, symbols, and design choices concur to give meaning to the visual representation. Hence, in contrast to analytical frameworks, the use in the same representation of as many iconic features as possible does not necessarily lead to a "better" visual representation.
The R/S, SEL, SYM, and VER categories represent local iconic features of an image, whose meaning is independent of the specific image in which they are used (for example, the verbal element "m/s" in a Cartesian plot always indicates the unit of measurement of speed, regardless of the appearance of the curve represented in the graph). The INT and CST categories represent global iconic features of an image, whose meaning depends on the image in which they are used [for example, specific s(t) and v(t) graphs that refer to the same motion].
The above list allowed us to explain the well-known difficulty of students with the interpretation of graphs [107,112,113] and of images regarding geometrical optics [114]. A very similar coding scheme was used [115] to analyze graphical representations generated by 14-17 year old students in a science news-reporting task. Moreover, the list is in agreement with results of previous studies that investigated students' reasoning when dealing with specific features of visual representations such as arrows [116] or labels [117]. Finally, in the light of the obtained results, we also adopted the above list for the analysis of students' difficulties when reading images of basic astronomical phenomena [118]. Our analysis confirmed, for instance, the claims that the "in perspective" elliptical shape of Earth's orbit (Fig. 3) may be problematic since it presents a compositional structure (CST) that mainly highlights the variation of the Earth-Sun distance along Earth's orbit. Similarly, the presence of four or eight Moon phases in the same image (Fig. 4) may be difficult to interpret since they have to be meaningfully integrated with Earth's diagram (INT). In the following, we briefly describe the graphical features of five images purposely designed for the study.
TABLE I (fragment). Presence of two or more conceptually related images. CST: Compositional structures that require the interpretation of spatial distributions and of different representational structures.

Seasons images
The images designed to explain seasonal changes are reported in Fig. 5 ("Seasons 1") and Fig. 6 ("Seasons 2"). We began the design of the image "Seasons 1" by choosing a compositional structure (CST) in which Earth's orbit was circular and not elliptical, as suggested by previous studies [38]. Moreover, in contrast to usual textbook images [118], we did not include arrows to indicate the rotation and revolution of Earth (SYM) or other information not relevant for the phenomenon (e.g., segments that connect Earth to the Sun). We also chose to avoid the use of text (VER), for instance, avoiding reference to aphelion and perihelion. While we maintained the conventional representation of four Earths along the orbit, we changed the appearance of Earth's axis using a perspective that emphasizes (SEL) the constant direction of the axis during the motion. To prevent misleading ideas about the tilt of the axis, we presented a different viewpoint at the bottom of the image: in particular, we chose the viewpoint of an observer on the same plane as Earth's orbit. While such a choice may lead to difficulty in integrating the two parts of the representation (INT), it helps clarify that change of seasons can be due to two combined factors: the motion along the orbit and the tilt of the axis.
We began the design of the image "Seasons 2" (Fig. 6) by choosing to focus on two positions (SEL) of Earth along its orbit as represented in Fig. 5. In this way, students can adopt the same iconic codes used to interpret the previous image. Following Ref. [38], the uneven distribution of the Sun's radiation due to axis tilt, as seen from space, is shown by explicitly reporting (VER) the angle between the direction of sunrays and the plane tangent to Earth's surface at two different times of the year, the winter and summer solstices, to highlight the largest difference in illumination at that particular place.
To help students connect the inclination of sunrays along Earth's surface with their experience of different incidence of radiation, the viewpoint of an observer on Earth is also reported at the bottom of the image. With such a choice, we followed the recommendation [38] to represent each season separately. Although the adopted framework predicts that students might find it difficult to relate different images (INT), the vertical arrangement of the two panels (CST) should help relate a specific time of the year (position of the planet along the orbit) and the inclination of Earth's axis to the inclination of sunrays (using the same colors). In this way, the role of the two factors (orbit and axis tilt) in seasonal change may be further reinforced. In addition, as for all designed images, to help students relate the two images in the frame, the same line and color codes were used to show the phenomenon from different viewpoints.

Phases image
For brevity, we describe here only one of the images designed to explain the lunar phases (image "Phases", Fig. 7).
We began the design of the image by unpacking the relevant information about the Sun-Earth-Moon system in three distinct but related panels. First, we highlighted the space perspective by showing two positions of the Moon with respect to Earth and the Sun, and Earth and Moon orbits. In this way, students can be guided to understand that different positions of the Moon along its orbit lead to different appearances of the Moon from Earth. In this regard, while realistic and symbolic representations of the Moon are present (R/S) in the top and bottom panels, we chose to insert the unusual perspective of an observer on the plane of Earth's orbit and the line of nodes (SEL) to relate the two representations. Moreover, this central panel could help students understand that the planes of the Moon's and Earth's orbits are tilted, and such evidence will be recalled in the eclipses' images. To avoid similarity of symbols (SYM) between dashed lines (Earth and Moon orbits), the line of nodes was represented with a different color and dot spacing. The Earth's orbit perspective, in particular, allows relating the space perspective to Earth's perspective by showing how the Moon's surface is illuminated by sunlight as the Moon revolves around Earth, and guides students to understand that the same Moon phase is seen from different places on Earth wherever the Moon is visible. Following Ref. [66], we purposely avoided the presence of four or eight Moons along the orbit around Earth and of any verbal element (VER) or arrows (SYM), so as to avoid confusion between the spatial and temporal sequence of the phases.

Solar eclipse image
The image designed to explain solar eclipses is reported in Fig. 8 (image "S-Eclipse"). In the image, we highlight the condition under which a total solar eclipse may occur, namely, the alignment of Earth, the Moon, and the Sun along the line of nodes. As for the image in Fig. 7, we show the line of nodes as a reference for the reader to identify the alignment condition (SEL). We also chose to show two different positions of Earth on the orbit around the Sun, so as to reinforce the idea that a solar eclipse only depends on the condition of three-body alignment along the line of nodes, which could occur at any time of the year. By recalling the image in Fig. 7, it is possible to help students understand that this is not a frequent case. We chose to highlight that total eclipses can be seen only in a small portion of Earth, due to the different sizes of the Moon and Earth, by shading only a small gray triangular area (SEL).

Lunar eclipse image
The image designed to explain lunar eclipses is reported in Fig. 9 (image "L-Eclipse"). In the image of Fig. 9, we chose to highlight how Earth's shadow, in which the Moon falls, is formed. In addition, in this case, we chose not to represent gradients of grays or penumbral areas, to focus on the main astronomical mechanism underlying eclipses, namely, alignment from a three-dimensional view. Moreover, to relate the shadow formation with the Moon-Earth-Sun alignment, we explicitly highlighted in both panels the line of nodes (dashed lines, SEL).
As for the lunar phases, we did not include in Fig

Teaching booklets
Once the innovative images were ready, we began the design of three instructional booklets, one for each targeted phenomenon. The aim of the booklets was to cover the same contents of a standard school textbook, with additional reference to the iconic features of the newly developed images. Such reference was included in the images' captions, following the recommendations in Ref. [119] about how to integrate text and pictures.
First, in each image caption, we described the main features of the images and of the underlying semiotic rationale. In particular, we emphasized how to decode the symbols used (for instance, the dashed lines to indicate orbits) and the different perspectives adopted (for instance, Earth vs space perspective). Second, to foster a better interaction between students' mental model and the elements to be selected or conceptually highlighted, we overlapped semantic codes of the text and of the images [120]. This was done, for instance, by (i) explicitly reporting the observer's position (lunar phases), (ii) linking 2D representations with 3D perspective (eclipses), and (iii) relating different inclinations of sunrays on the observer's plane to Earth's revolution and axis tilt (seasonal changes). Third, we tried to use verbal and visual information coherently so that it could activate students' previously acquired knowledge (for instance, formation of shadows for the eclipses and propagation of light for seasonal changes). Finally, we slightly shortened the standard text so as to have a comparable length (around 700 words), including captions. The captions for the images in Figs. 5-9 are reported in the Supplemental Material [121].

B. Instructional context and sample
To answer our research questions, we designed three instructional contexts under increasing support conditions, involving an experimental group and a control group. The experimental group was divided into two subgroups: group 1 received instruction with the images and text of the usual schoolbook; group 2 received instruction with the specially designed teaching booklets. The control group (group 3) used no images at all and was provided only with the usual schoolbook text (same as group 1). Students from three complete classes of a secondary school in Southern Italy participated in the research and were randomly assigned to the three groups. The students had not been previously taught about astronomical phenomena. Details are reported in Table II.

C. Instruments
Three probes were used in the study: (i) a drawing task, in which the students were asked to make a drawing that would explain to a reader the change of seasons, Moon phases, and solar or lunar eclipses; (ii) a written task, in which the students were asked to give a written explanation for each of the three phenomena; (iii) a mixed multiple-choice and true-false baseline questionnaire featuring 24 questions, 8 per section devoted to each of the three phenomena, validated in a previous study by our group [122]. The drawing and written tasks were used to investigate the effects of the different instructional contexts on students' visual representations and explanations (RQ1). In the multiple-choice questions, answer choices featured one correct option, one option corresponding to a partial knowledge of the phenomenon, and two incorrect options corresponding to well-known student misconceptions.
Data from the drawing and written tasks were also used to investigate how student-generated visual representations and explanations were related to the iconic features of the images received during instruction (RQ3). Data from the baseline questionnaire were used to investigate the effects of the different instructional contexts on students' conceptions of the targeted phenomena (RQ2).

D. Procedure
Data collection began in mid-December 2016 and finished in late January 2017, and was divided into three consecutive sessions, one per phenomenon. The first period was devoted to seasonal change, the second to Moon phases, the third to solar and lunar eclipses. Each session included the time given to the students to complete the research probes and to read the teaching material, as follows. Two to three days before presenting the materials, the students were given 20-25 minutes to complete the drawing and written tasks related to the phenomenon they were going to be taught about, and a further 15-20 minutes to complete the corresponding section of the baseline questionnaire. Participants also reported their sex and age. Then, two to three days after this first session, the students received the material (in print) and the following instructions: "Read the following text, including figures and captions. You have 1 hour to complete reading. Then, you will be asked to answer some questions about what you have read." Immediately after finishing reading the instructional material of the booklet, the same day, the participants were again given the drawing and written tasks and completed the section of the baseline questionnaire corresponding to the phenomenon addressed. Considering dead times for distributing and collecting the booklets, the baseline questionnaire, and the written task, the sessions lasted on average 4 hours for each phenomenon. The eclipses session was longer since the students were allowed some extra time to produce two different drawings, one for the solar eclipse, another for the lunar eclipse.
FIG. 9. Image "L-Eclipse," designed to explain lunar eclipses.

E. Data analysis
Drawing task.-The reviewed studies suggest that typical textbook images about astronomical phenomena feature iconic difficulties that may lead students to interpret incorrectly the mechanism underlying the represented phenomenon. For this reason, we did not use preexisting scoring schemes focused on accuracy and fidelity in reproducing the provided images to analyze the students' drawings [79,123]. For the same reason, we did not adopt scoring systems based on the number or types of iconic elements that are present in the drawings [98,124]. While effective for giving some sort of quantitative assessment of drawing, these scoring systems tend to adopt highly prescriptive notions of what should be considered as a "successful drawing," and a better fit to the predetermined "expert" instances [99]. Moreover, such techniques do not allow clustering emerging students' models according to visual features, which are instead essential for correctly interpreting the represented phenomenon.
To analyze students' drawings, we used exploratory factor analysis of iconic features [125], which allows us to identify emerging representative student models. In particular, we used results from a parallel study [126], in which we analyzed about 2000 pupils' drawings about seasonal change (494), Moon phases (539), and solar (427) and lunar eclipses (499). We chose factor analysis since it also allows for (i) negative factor loadings, which indicate features highly unlikely to be present in a model, and (ii) calculation of factor scores, which allows assigning each drawing to an emergent model. To perform the exploratory factor analysis, conceptual and semiotic features of students' drawings were first identified and then grouped by emerging factors, which correspond to different students' representations of the phenomena. Details are reported in Ref. [126]. In the Supplemental Material [121], we report the description of the resulting models, which we used to classify drawings of the present study. We note that the emerging models align to a certain extent with previous studies on younger students. For instance, for seasonal change, model Distance 1 aligns with notion 3 in Ref. [32], and with drawings reported in Ref. [103], while Tilt aligns with notion 6 in Ref. [32]. Similarly, for Moon phases, models Sun & Moon and Orbit & Sun align with notions 3 and 5 in Ref. [32], respectively.
To categorize the drawings of the present study, authors S. G. and I. T. coded half of the drawings separately. Then, for each phenomenon, Cohen's kappa was calculated.
When the values of Cohen's kappa were not satisfactory (this was the case for solar and lunar eclipses), the two raters discussed and revised the ratings to reach an agreement. In other cases, new models had to be adopted to categorize students' drawings. After further discussions, values higher than 0.75 for each phenomenon were obtained and a final categorization of students' drawings was agreed upon. Results are reported in Tables III-VI.
Written task.-To analyze students' written explanations, we adopted a constant comparison method identifying different levels of increasing knowledge about the targeted phenomena. Levels align with findings of previous studies [33,34]. A procedure similar to that used for the drawings was adopted to establish the reliability of the scoring of students' explanations. The value of Cohen's kappa obtained for the three phenomena was in this case higher than 0.73. To reach such agreement, in some cases, it was necessary to slightly differentiate students' answers. In particular, for seasonal changes, we indicated as correct-a those answers that referred solely to the tilt of Earth's axis, while we indicated as correct-b those answers that referred to both the axis tilt and orbital motion around the Sun. Moreover, we labeled as incorrect-a those answers that referred to the distance misconception, while we called incorrect-b those incorrect answers whose underlying reasoning is not related to the distance misconception. For solar and lunar eclipses, we indicated as correct-a those answers that referred solely to the alignment of the Sun, Moon, and Earth, while we indicated as correct-b those answers that referred to the relative inclination of the planes of Earth's and the Moon's orbits. The determined categories, with typical examples, are reported in Table VII.
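The inter-rater agreement statistic used for both the drawing and written tasks can be illustrated with a minimal sketch. The function name `cohens_kappa` and the example codes are hypothetical and are not data from the study; the sketch only shows the standard definition of Cohen's kappa for two raters, under the assumption that each rater assigns exactly one category to each item.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one category per item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same category by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical codes for eight drawings (illustrative only, not study data).
coder_1 = ["Distance 1", "Distance 1", "Tilt", "Tilt",
           "Distance 1", "Tilt", "Tilt", "Distance 1"]
coder_2 = ["Distance 1", "Distance 1", "Tilt", "Distance 1",
           "Distance 1", "Tilt", "Tilt", "Tilt"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 3))  # 0.5: 6 of 8 agreements against a chance level of 0.5
```

In this toy case the two coders agree on 75% of the drawings, but kappa is only 0.5 because half of that agreement would be expected by chance. This is why a threshold on kappa is stricter than a threshold on raw agreement.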
To answer RQ1, we investigated (i) whole-sample differences between pre- and postinstruction categories and (ii) across-conditions differences in the pre- and postinstruction drawing and written tasks. To this aim, we adopted a χ² analysis. When necessary, categories were collapsed to meet statistical requirements to apply this analysis method. To gain further insight and clarify emerging trends, we also investigated the relationships between student-generated drawings and explanations (Table VIII).
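As an illustration of the across-conditions comparison, the following sketch computes a Pearson χ² statistic for a groups-by-categories table of counts. Two assumptions are made and should be read as such: the counts are back-calculated from the percentages reported for the preinstruction seasonal-change drawings, assuming group sizes of 24, 23, and 21, and are not copied from Table III; and the p value uses the closed form exp(-χ²/2), which holds only for 2 degrees of freedom (three groups, two collapsed categories). Function names are hypothetical.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and category.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


def p_value_df2(statistic):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Rows: groups 1-3; columns: (distance-based, other) preinstruction drawings.
# ASSUMPTION: counts inferred from 79.2%, 82.6%, and 76.2% with assumed
# group sizes 24, 23, and 21; they are illustrative, not taken from Table III.
pre_seasons = [[19, 5], [19, 4], [16, 5]]
chi2, dof = chi_square_independence(pre_seasons)
p = p_value_df2(chi2)
print(round(chi2, 3), dof, round(p, 3))
```

With these assumed counts the sketch gives χ² of about 0.278 with 2 degrees of freedom and p of about 0.870. For tables with a different number of degrees of freedom, a statistical library should be used for the p value instead of the closed form.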
Evidence from drawing, written tasks, and their combined analysis was used also to answer RQ3.
Baseline questionnaire.-Analysis was carried out as follows. We assigned full credit for each question answered correctly: 1 point for each true-false question and 2 points for each multiple-choice question. Partial credit (1 point) was given if students picked the multiple-choice answer corresponding to a partial knowledge of the phenomenon. No penalty (zero points) was applied for blanks or wrong answers. The maximum score for each of the three sections of the questionnaire was 10.
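The scoring rule can be sketched as a short function. The split of each section into 6 true-false and 2 multiple-choice items is an inference, not a figure stated in the text: it is the only combination of 8 items that yields a maximum of 10 points under the rule above (x + y = 8 and x + 2y = 10 give x = 6, y = 2). Names and outcome labels are hypothetical.

```python
# Points per item type, following the scoring rule described in the text.
TRUE_FALSE_CORRECT = 1
MULTIPLE_CHOICE_CORRECT = 2
MULTIPLE_CHOICE_PARTIAL = 1


def score_section(true_false_outcomes, multiple_choice_outcomes):
    """Score one questionnaire section.

    true_false_outcomes: list of booleans (True = answered correctly).
    multiple_choice_outcomes: list of "correct", "partial", "wrong", or "blank".
    """
    score = TRUE_FALSE_CORRECT * sum(1 for ok in true_false_outcomes if ok)
    for outcome in multiple_choice_outcomes:
        if outcome == "correct":
            score += MULTIPLE_CHOICE_CORRECT
        elif outcome == "partial":
            score += MULTIPLE_CHOICE_PARTIAL
        # "wrong" and "blank" add nothing: there is no penalty.
    return score


# ASSUMPTION: 6 true-false + 2 multiple-choice items per section (inferred).
perfect = score_section([True] * 6, ["correct", "correct"])
mixed = score_section([True, True, False, True, False, True], ["partial", "blank"])
print(perfect, mixed)  # 10 5
```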
To answer RQ2, differences across groups in the pre- and postinstruction baseline questionnaire scores were investigated through analysis of variance (ANOVA).
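A one-way ANOVA compares the variability between group means with the variability within groups. The sketch below computes only the F statistic and its degrees of freedom; it is a generic illustration with hypothetical scores, not a reproduction of the analysis in the study, and the function name is an assumption. Obtaining a p value requires the F distribution from a statistical library.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on lists of scores."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        # Between-group part: distance of the group mean from the grand mean.
        ss_between += len(g) * (mean - grand_mean) ** 2
        # Within-group part: spread of the scores around their own group mean.
        ss_within += sum((x - mean) ** 2 for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical section scores on the 0-10 scale (illustrative only).
group_1 = [5, 6, 7]
group_2 = [8, 9, 10]
group_3 = [6, 7, 8]
f_stat, df_b, df_w = one_way_anova_f([group_1, group_2, group_3])
print(round(f_stat, 3), df_b, df_w)  # 7.0 2 6
```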

V. FINDINGS
A. Drawing task

Seasonal changes
As reported in Table III, the great majority of drawings (about 80%) in the preinstruction task aligned with a "distance"-based model (e.g., Distance 1 in Fig. 10). The remaining 20% of drawings aligned with models based on some sort of "inclination"-based reasoning, either representing inclined sunrays or a tilted Earth's axis. In the postinstruction task, all groups showed some improvements in their drawings. The overall frequency of the inclination-based drawings (e.g., Orbit & Rays in Fig. 11) increased from 20% to 64%. Correspondingly, the frequency of distance-based drawings decreased on average to 36%.
The across-groups analysis of the preinstruction drawings shows no significant differences concerning the distance-based model (frequencies = 79.2%, 82.6%, and 76.2%, respectively; χ² = 0.278, df = 2, p = 0.870). In the postinstruction task, 91% of the students of group 2 generated inclination-based drawings (for examples of pre- and postinstruction drawings, see Figs. 12 and 13, respectively). On the contrary, students of group 1, after the teaching intervention, still produced in most cases (58%) a Distance 2 model, while only 42% of them were able to generate inclination-based drawings. In this respect, group 3 performed better than group 1, since the frequency of inclination-based drawings increased to 58%. This led to a statistically significant difference between the groups in the postinstruction task (χ² = 12.932, df = 2, p = 0.002). Trends are summarized in Fig. 14.

Moon phases
From Table IV, we note that the majority of the drawings in the preinstruction task (on average 75%) featured the simple sequence of the phases with or without the indication of the Moon's orbit around Earth ("sequence"-based models; see, for instance, the model Sequence 2 in Fig. 15). In the postinstruction task, the frequency of these drawings decreased to 40%. All the remaining drawings also featured a representation of the Sun ("Sun"-included drawings; see, for instance, the model Orbit & Sun in Fig. 16). We hypothesize that students who produced these drawings acknowledged in some way the relationship between the phases and the relative position of the Sun and the Moon. The percentage of such drawings increased in the postinstruction task for all groups with respect to the preinstruction task (see Fig. 17).
By comparing the performances of the three groups, we note that in the preinstruction task the percentages of "sequence"-based drawings are very similar (79%, 70%, and 76%, respectively) and differences are not statistically significant (χ² = 0.600, df = 2, p = 0.741).
In the postinstruction task, the distribution of Sun-included drawings among the groups is statistically different (χ² = 9.884, df = 2, p = 0.007). In particular, about 60% of group 1 students produced an Orbit & Sun model, 48% of group 2 generated a Phases & Orbits model, but no student of group 3 produced a drawing that included the Sun and the Moon's orbit. The majority of group 3 students (about 67%) still produced a sequence-based drawing. To exemplify students' progression between the pre- and postinstruction tasks, two examples of pre- and postinstruction drawings are reported in Figs. 18 and 19.

Solar and lunar eclipses
Results are reported in Tables V and VI. Concerning solar eclipses, we note that in the preinstruction task, on average about 70% of all students generated a drawing that featured the Sun, Moon, and Earth aligned, together with sunrays hitting either the Moon or Earth (e.g., Rays 2, see Fig. 20) and, in some cases, the corresponding shaded areas. Since these drawings feature at least the Sun, Moon, and Earth alignment, we called them "alignment" based. In the postinstruction task, the frequency of these models, which also include drawings similar to the images received in the booklet (Orbit 2), increased to about 88%. For these drawings, differences across conditions are not significant in either the pre- or postinstruction task (preinstruction: χ² = 4.856, df = 2, p = 0.088; postinstruction: […]).
To further inspect whether the three groups differed in their drawings, we repeated the analysis excluding drawings featuring only the simple alignment between Sun, Moon, and Earth (model SME, Fig. 21, upper frame). The selected drawings are characterized by Sun, Moon, and Earth alignment as well as rays and shaded areas ("rays and shaded" based). Also in this case, the differences are not significant, as can be inferred from the left plot in Fig. 22 (preinstruction: χ² = 3.226, df = 2, p = 0.199; postinstruction: […]).
For lunar eclipses, in the pretest, on average about 74% of all students produced an alignment drawing in which the Sun, Moon, and Earth are aligned and, in some cases, shaded areas are also present (e.g., Shading, see Fig. 23), thus suggesting the idea that alignment of the three bodies causes projection of sunlight over the Moon. In the post-test, this percentage increased to 91%. Differences across the three groups are not significant in either the pre- or postinstruction task (preinstruction: χ² = 0.922, df = 2, p = 0.631; postinstruction: χ² = 4.074, df = 2, p = 0.130). As an example of improvement, we report in Fig. 24 the postinstruction drawings of the student who produced in the preinstruction task the drawings of Fig. 21.
As for the solar eclipses, we investigated whether there were differences across conditions for rays- and shaded-based drawings, hence excluding the SEM model (Fig. 21, bottom frame). On average, about 57% of the students generated rays- and shaded-based drawings in the pretest, while this percentage increased in the post-test to about 70%. In this case, the differences across the three groups are statistically significant in the post-test (preinstruction: […]). This result is likely due to group 3 performance (see Fig. 22).

B. Written task

Seasonal changes
From Table VII we see that the preinstruction answers were mostly incorrect for the whole sample (overall 70%). About 30% of the sample gave a distance-based answer, while about half of the incorrect answers were related to more sophisticated reasoning, as, for instance: "In Italy and in all parts of the world seasonal changes happen because Earth spins around itself and during the year Italy will be in different positions with respect to the Sun and hence the length of the days and the climate will change and hence seasons" (S8, group 1). "In winter, it is colder because we are nearer the Sun, and Earth will spin faster. Hence, we receive the rays of the Sun with less intensity" (S28, group 2). "The Earth is inclined and some part of it will not face the Sun fully and will be more distant" (S30, group 2).
In the postinstruction task, all groups performed better. In particular, the percentage of partially correct answers increased from 23% to 38% and that of correct answers from 6% to 41%. Performances improved for all groups: the percentage of students of group 2 who gave a correct a/b-type explanation increased from 13% to 52%, while for group 1 the frequency increased from 0% to 33%, and for group 3 from 5% to 38%. We found that correct-b type explanations increased mainly in group 3: from 0% to 24%, compared to 8% for group 1 and 9% for group 2: "Seasonal changes happen because of Earth's revolution motion and inclination. The angle formed by the rays in summer is greater, if it is minimum it is winter" (S52, group 3).
To investigate differences across conditions, we collapsed partial and correct categories and generic and incorrect categories into two different levels. As suggested by the above data, differences are not statistically significant in either the pre- or post-test (preinstruction: […]). The above trends are summarized in Fig. 25.

Moon phases
Giving a correct account of the phases phenomenon proved to be difficult for the students of the whole sample. In the preinstruction task, only 13% of the students were able to give at least a partially correct explanation, while about 87% of the students gave either a generic or incorrect explanation: "The different Moon phases happen because the Sun lights up Earth and the shadow of Earth obscures some parts of the Moon. If it (Earth) does not obscure, we have the full Moon" (S36, group 2).
In the post-test, we detected some slight improvement. In particular, the percentage of students who gave at least a partially correct explanation increased to 50%, but only 7% were fully correct. When looking at differences across the conditions, we note that half of the students of both groups 1 and 2 gave a partially correct answer in the postinstruction task, while students of group 3 still had some difficulty correctly explaining the phenomenon (71% of generic or incorrect answers). Similarly, partial or correct explanations increased significantly after the teaching intervention (from 13% to 50%), but most of the students (about 82%) who gave this kind of explanation belonged to group 1 or 2. In particular, only group 1 and group 2 students were able to give correct-type explanations (8% and 13%, respectively). However, by collapsing partial or correct categories and generic or incorrect categories, we found that differences are not statistically significant in either the pre- or post-test (preinstruction: […]). Trends are summarized in Fig. 26.

Solar and lunar eclipses
The analysis of students' explanations shows that in the preinstruction tasks, only 21% of students gave a correct explanation of the phenomenon in terms of alignment between the Sun, Earth, and Moon. The majority (on average, 35%) gave either an incorrect or partial response, as, for instance: "We have a solar eclipse if the Moon is between the Sun and the Earth…" (S10, group 1). "When Earth, Sun and Moon are aligned and the latter is between the two, we have a solar eclipse" (S53, group 3).
In the postinstruction task, about 70% of the students gave a correct answer. However, we note that the majority of correct-b answers, which give an account of eclipses in terms of line of nodes, increased mainly in group 2 (from 0 to 26%) and group 3 (from 0 to 33%): "The solar eclipse is possible with the new Moon in one of the nodes of the Moon orbit. During the Earth-Moon-Sun alignment, the Earth is in a shadow cone projected by the Moon. Since the shadow is small with respect to the Earth, only in a small portion of the Earth, it will be possible to see the Eclipse" (S29, group 2).

C. Comparison between the drawing and written tasks
To investigate in more detail the trends that emerged in the drawings and written tasks separately, we looked at the distribution of the drawings' models across the categories of answers to the written task. Table VIII summarizes the statistics for the whole sample. Here we focus on differences across groups.

Seasonal change
In the pretest, the percentage of students who produced a distance-based drawing and who gave an incorrect explanation was very similar across groups (67%, 70%, 48%, respectively). In the post-test, this percentage decreased in both group 1 and group 3 to about 20%, while for group 2 it was lower (4.3%). A similar trend can be found looking at the students who produced an inclination-based drawing and gave a partial or correct explanation. In the pretest, this percentage was small for all groups (8%, 17%, 10%, respectively). In the post-test, the percentage increased to about 45% for groups 1 and 3, while it was 83% for group 2. The above trends are summarized in Fig. 28.

Moon phases
For Moon phases, we associated the sequence-based drawing with incorrect accounts, and the Sun-based drawing with partial or correct accounts of the phenomenon. In the pretest, the percentage of students with the first combination of drawings and explanations is high for all groups (76%, 65%, 71%, respectively). In the post-test, this percentage decreases significantly in group 1 and group 2 (to about 15% on average), while it drops only to about 43% in group 3.
The differences are even greater looking at the second combination of drawings and explanations. While the percentage of students with such a combination increased to about 50% for group 1 and group 2, it decreased from 10% to 5% for students in group 3. The above trends are summarized in Fig. 29.

Solar and lunar eclipses
For eclipses, we focus on the combination of rays- and shaded-based drawing and partial or correct explanation. In the pretest, the combination has a low frequency across the three groups and for both the solar and lunar eclipse (on average 13%). The postinstruction frequencies differ across groups and phenomena, as expected from the separate analyses of drawings and explanations.
In particular, for solar eclipses, we observed the most significant increase for group 2 students (from 22% to 74%), while for group 1 the percentage remains the same (12.5%). For group 3 we have an increase (from 5% to 43%) similar to that of group 2.
For lunar eclipses, we again found a significant increase (from 17% to 78%) for group 2, while we have for both group 1 and 3 similar increases (from about 10% to about 35%) The above trends are summarized in Fig. 30.

D. Baseline questionnaire
The whole sample preinstruction average score was 18.8 (st. dev. …). Figure 31 shows the average scores obtained by the whole sample in each section of the questionnaire. In both the preinstruction and postinstruction questionnaires, the students, on average, scored better on items about seasons (average pre = 7.1; average post = 8.4) and eclipses (average pre = 6.4; average post = 7.5), while they had more difficulty with items about the Moon phases (average pre = 5.2; average post = 7.2). However, differences between pre- and postinstruction average scores are statistically significant for all three targeted phenomena (p < 10⁻⁴).
Looking at between-group results in the post-test (Fig. 32), we determined that group 2 outperformed groups 1 and 3 in the whole questionnaire (average score: group 1 = 21.0; group 2 = 25.4; group 3 = 22.9). In seasons' items, group 2 scored significantly higher than group 1 [t(65) = 3.199, p = 0.002], while the difference with group 3 is not statistically significant [t(65) = 0.987, …]. For the phases' items, the difference in the average score between group 1 and groups 2 and 3 is statistically significant [t(65) = 2.804, p = 0.007], while such a difference is not statistically significant between group 1 and group 3 [t(65) = 0.721, p = 0.474]. For eclipses' items, differences between the average scores of group 1 and group 3 are not statistically significant [t(65) = 1.825, p = 0.073], while the difference between the average scores of group 2 and those of groups 1 and 3 is statistically significant [t(65) = 5.106, p < 10⁻⁴].
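As a reading aid, the reported p values can be checked against the reported t statistics. The following is a minimal sketch, assuming two-tailed tests with 65 degrees of freedom as printed in the brackets above; the function names `t_pdf` and `two_tailed_p` are illustrative and not part of the study, and the integration scheme (Simpson's rule on the Student t density) is our choice for a dependency-free check, not the authors' procedure.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p value, from Simpson integration of the density over [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central)


# t statistics as reported in the text (df = 65 for all comparisons)
for t_stat in (3.199, 2.804, 0.721, 1.825, 5.106):
    print(f"t(65) = {t_stat}: p = {two_tailed_p(t_stat, 65):.4f}")
```

The sketch only converts a t statistic into a p value; it says nothing about how the statistics themselves were obtained (for example, which variance estimate was used), since that is not stated in this section.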
Results from the baseline questionnaire are consistent with results of the analysis of drawings and written explanations (Table VIII).
For instance, the average score on seasons' items of students who produced an inclination-based drawing is 9.0 out of 10. Similarly, average scores of students who gave at least a partially correct explanation are consistently higher than average scores of students who gave an incorrect or generic explanation, although the differences are statistically significant only for the eclipses (Table IX).

VI. DISCUSSION
This study aimed at investigating students' visual representations, explanations, and conceptions about astronomical phenomena under three different teaching conditions: traditional textbook images + text; innovative images + text; text only (RQ1 and RQ2). We primarily hypothesized that students who were exposed to the innovative images + text support condition would provide better explanations, generate better representations, and learn more about the targeted phenomena. Second, we hypothesized that students who were exposed to the textbook images support condition would give better explanations, generate better representations, and learn more about the targeted phenomena than students of the text-only support condition. Moreover, we aimed to identify the iconic elements of the innovative images that influenced students' visual representations and explanations (RQ3). In the following, we summarize the obtained results considering our research aims. To help the reader parse the obtained evidence, we summarize the main results in Table X.

RQ1 How are students' explanations and visual representations about familiar astronomical phenomena affected by different image-support conditions?
Concerning the students' drawings, our main hypothesis for RQ1 is confirmed only for seasonal changes. Hence, the innovative images about seasons likely succeeded in helping students grasp the relevance of the orbital motion and of the tilt of Earth's axis to explain the phenomenon. In contrast, no effect of the innovative images condition over the textbook-images condition was detected for Moon phases and solar and lunar eclipses. Some beneficial effect was detected with respect to the no-image condition for phases and lunar eclipses, confirmed also by the combined drawing-explanation analysis. Hence, we infer that for the latter phenomena the text-only condition was not sufficient to help students represent conceptually the 3D relationship among the three celestial bodies: Sun, Moon, and Earth.
Concerning the students' written tasks, results seem to contradict the previous ones, since we determined that, in general, even students who produced drawings with iconic features that could suggest a correct account of the addressed phenomenon were not able to give correct written explanations. Such a result is in line with prior literature that suggests caution in using drawings to elicit misconceptions or as an assessment tool [70]. We further discuss this issue in Sec. VII.
For seasonal change explanations, for instance, the innovative images condition was not significantly more effective than the other two conditions. However, we note from the combined analysis of drawings and explanations that students of group 2 were the most consistent in the two probes, with a lower percentage (4%) of them producing a combined distance-based drawing + incorrect explanation and a higher percentage of them (83%) giving a combination of correct-a/b explanation + an inclination-based drawing. Such evidence suggests that causal reasoning, which is relevant to understanding seasonal changes [53], may be enhanced by specially designed image-based support.
For Moon phases, the specially designed images likely helped the students grasp the underlying mechanism, which is mainly related to spatial reasoning [122], better than the text-only condition. On the contrary, for solar and lunar eclipses, the specially designed images proposed in the booklet did not support students more than the text-only condition in generating a correct explanation of the eclipses, although they were more successful than the textbook images.
As for drawings, such a result is likely due to the mechanism underlying the eclipse phenomenon, which involves reasoning based on geometrical optics rules. Thus, the text-support condition might have been sufficient to help students build a correct explanation of the phenomenon, but not its visual representation, especially in the case of the lunar eclipse, which requires the use of suitable iconic elements to foster correct spatial reasoning. For solar eclipses, in particular, the mechanism can likely be traced back more easily to students' everyday experience of shadows, making the visual representation used in the teaching process a less powerful cognitive support. This evidence, moreover, confirms that students, to make sense of visual representations, activate cognitive resources that are mainly related to their prior and domain-specific knowledge [17,62,65].
Such a result may also be interpreted in the light of the conceptual stages framework reported in Ref. [105]. When constructing explanations of scientific phenomena, students activate isolated knowledge resources and reorganize them in terms of temporarily stable conceptual stages. In the eclipse case, the stable stages build on previously acquired concepts such as distance in the Cartesian plane and formation of shadows in geometrical optics. Our secondary hypothesis for RQ1 was essentially confirmed for Moon phases and lunar eclipse drawings, but it was not supported for seasons and solar eclipse drawings. This result is ultimately consistent with the main hypothesis except for the seasons drawings. It is hence worth discussing such cases in detail. In contrast to what emerged from the study in Ref. [79], the visual model of seasonal change constructed solely from the text was resonant with the usual textbook visual representation of the phenomenon. Our interpretation is that this result is due to the text submitted to students of group 3, which was essentially the same as that of group 1 except for the absence of the images. In agreement with previous studies [54], this text likely activated the same cognitive strategies in the students of both groups. The different captions in the text submitted to students of group 2 may explain the different outcomes between groups 1 and 2.
Overall, from our analysis for RQ1, we infer that the innovative images were mostly effective in helping students produce informed drawings with respect to the text-only condition for all phenomena, and also more effective than textbook images when one considers seasonal change drawings. Moreover, innovative and textbook images were both effective for producing a correct explanation of the Moon phases phenomenon. In this case, our evidence is consistent with prior findings regarding students' engagement in high-level cognitive strategies when learning from diagrams [53,54,79]. In other words, the iconic features of all provided images about Moon phases (specially designed and usual ones) likely activated reasoning patterns at the basis of the formation of spatial cognition and reasoning [127].

RQ2 How are students' conceptions about familiar astronomical phenomena affected by different image-support conditions?
The main hypothesis is essentially confirmed in the baseline questionnaire for all the phenomena, except that the positive difference between the innovative image condition and the no-image condition did not reach significance for seasonal change. We hypothesize that this evidence is due to the design of the innovative images, which were constructed considering well-known student misconceptions and the semiotic frame. Hence, their iconic features likely helped students identify the correct alternative in the items more effectively. On the contrary, the secondary hypothesis is not supported for any phenomenon. Our interpretation of this result is that usual textbook images do not sufficiently address common misconceptions, but rather reinforce some of them (e.g., the distance misconception, through an emphasized elliptical orbit). We discuss the above evidence in more detail in Sec. VII.
RQ3 Which features of the used images most affected the students' visual representations and explanations of familiar astronomical phenomena?
This question aimed at investigating how students exploited iconic elements of the given images in their drawings and explanations.
Let us first discuss the change of seasons images. The prevalent iconic element of the images about change of seasons featured in the booklet of group 1 was the elliptical orbit rather than the tilt of Earth's axis. In terms of our framework, the asymmetrical structure (CST) of the image was likely more attractive from the cognitive viewpoint than the inclination of the axis (SEL), although the latter was emphasized through the verbal indication of the tilt (VER). Our data confirm such theoretically driven analysis since the textbook image did not help group 1 students generate better drawings and explanations with respect to the no-image group, but rather interfered [128] with the identification of the main underlying mechanism of the phenomenon. On the other hand, analysis of the group 2 students' drawings supports the evidence that the choice of including in the images of the booklet a circular Earth orbit (CST) and the two-panel structure (INT) likely helped students to better select relevant factors underlying the phenomenon, such as the relationships between the orbital motion (not the elliptical shape of the orbit), the tilt of the axis, and the inclination of sunrays on Earth's surface (SEL). Our result for seasonal change is apparently in contrast with what is reported in Ref. [67], where it was found that the elliptical shape of Earth's orbit did not significantly influence students' reasoning about the cause of seasons. However, we note that the author does not report an analysis of student-generated visual representations. Moreover, he reported as a tilt-based explanation the following example: "It is warmer in the summer because the Earth is tilted on an axis, so we are getting more direct rays from the sun. It is colder in the winter because the Earth is tilted differently from above, so the rays hit us at a different and less direct angle" [67].
In our study, such explanations have been categorized as partial or correct-a type (see Table VII) since they do refer to Earth's axis, but they lack the reference to Earth's orbital motion. Such explanations were given by about 67% of the students of group 1 in the postinstruction written task, a frequency that is very similar to that found in that study (59%). Hence, we conclude that our results are not in contradiction with those in Ref. [67]. Rather, our findings add to Lee's study suggesting that the distance-based model in visual representations and the inclination-based model in written explanation may coexist.
Our result, on the other hand, confirms the findings reported in a study about Kepler's laws [129], where the authors found that about 70% of the students described planetary orbits as "not circles," producing in many cases highly eccentric depictions even from a top-view perspective. The authors hypothesize that the students likely misinterpreted the side view of the planets' orbits in typical textbook and internet images. We found that in the preinstruction drawing task, the overall percentage of students who produced an elliptical orbit was 56% and it increased to 76% in the post-test, in good agreement with the 70% of elliptical orbits reported in Ref. [129]. Hence, as the authors suggest, our result confirms that textbook images may be the source of such a mental representation of Earth's orbit, as well as of the other planets' orbits. Interestingly, our findings may also shed light on the results of a recent study [130]. In that paper, the authors investigated the effects of instructional materials with refutation text and graphics on students' conceptual knowledge about changes of seasons. A refutation text aims at addressing a given misconception (in the seasons case, the distance-based misconception), first by acknowledging it and then by giving a correct explanation (in the seasons case, Earth's orbital motion and the axis' tilt). For the present study, the refutation graphic is of particular interest. A refutation graphic serves the same aim as a refutation text, but it uses images instead. The authors of that study used as a refutation graphic a visual representation with two-sided images (INT) of Earth orbiting the Sun, both very similar to our Orbit & Tilt model (see also Fig. 3). Both images have highly elliptical orbits (CST), use verbal elements (VER) to indicate the correspondence between the position of Earth and the corresponding season, and place the Sun at one of the foci of the ellipses (SEL).
The image that aims at acknowledging the distance misconception (labeled with a boldface NO at the bottom) places the Sun at the focus near the position of Earth when it is summer. The image that aims at addressing the distance misconception is the same except that the Sun is placed at the focus near the position of Earth when it is winter and is labeled with a boldface YES at the bottom. The authors hypothesized that a combination of refutation text and graphics would result in better conceptual knowledge. They found that the refutation graphic did not significantly increase students' performance. They also report as an "intriguing" result the fact that students with standard text and refutation graphic performed worse than the other groups. They also replicated the study asking the new participants to pay closer attention to the graphics, but they again obtained the same result, namely, that the refutation graphic did not significantly improve the students' performance. Using our semiotic framework, the findings in Ref. [130] could reasonably be explained, since they confirm that the asymmetrical structure (CST) of the refutation image was more attractive from the cognitive viewpoint than the inclination of the axis (SEL). In our study, about 78% of the students who gave an incorrect explanation of the seasonal change produced an elliptical orbit with the Sun in the "correct" position, namely, slightly closer to Earth during winter (see Table VII). Hence, it is likely that it is not the position of the Sun that may trigger a correct explanation of the seasonal change, since the main iconic obstacle (the ellipse) remains the same. For instance, in our study, one student claimed: "…during winter the Earth is closer to the Sun, but it is colder since the Earth accelerates due to Kepler's law… during Summer, we are farther from the Sun but the Earth slows down and so we receive more heat… ."
Our results warrant more research to find out whether a refutation graphic that uses our representations could enhance students' conceptual knowledge about seasonal changes.
Concerning Moon phases, students of groups 1 and 2 showed a tendency to reproduce the image they received in the booklet. Such results confirm the findings reported in Ref. [96], thus supporting the hypothesis that students rely on an existing convention when producing a visual representation of a phenomenon. From the collected drawings and explanations of group 1, for instance, we determined that the position of the Sun with respect to the Moon orbit (SEL) was one of the most relevant iconic elements of the textbook images for these students. Another relevant iconic element seems to be the presence of the 8-moon circle, which, however, increased the difficulty in interpreting the overall structure of the image (CST), made up of two related representations (the 8-moon circle and the appearance of the Moon phases). More specifically, the addition of the 8 moons in the boxes was not very helpful in clarifying the mechanism underlying the changing appearance of the Moon during the month, since there was no reference to the time interval between two phases. Similarly, it was difficult to relate the Sun correctly to the different images and the positions of the Moon along its orbit around Earth (INT). Thus, also for Moon phases, we detected an interference between two iconic features of the textbook image, namely its compositional structure and the relationship between its different parts. A similar trend emerges from the analysis of drawings and explanations of students of group 2. The most relevant iconic element was the orbit of the Moon (SEL), while the relationships between the space, Earth, and Earth's orbit perspectives (INT) seem to have been difficult for students to decode. However, the presence of only one related realistic and symbolic representation of the Moon (R/S) likely helped students to avoid producing a sequence-based drawing.
For solar eclipses, the inclusion of Earth and Moon orbits and of the line of nodes in the images of the booklet seems to have helped group 2 students gain a correct understanding of the phenomenon. The majority of group 2 students reproduced the condition for the eclipse (alignment), adding either a shaded area or a circle representing the Moon's orbit (SEL). We note also that all students who produced the latter drawing gave a correct-b type of explanation. Apparently, the majority of students of group 1 also generated a drawing whose compositional structure (CST) features the Sun, Moon, and Earth alignment as the main cause for solar eclipse. However, the analyzed drawings present a selection of iconic elements (SEL) that suggests an incomplete understanding of the phenomenon: in particular, students who included in their drawings the indication of sunlight as rays more rarely included shaded areas as well. We note that the poor understanding of the phenomenon by group 1 students is confirmed by the very small percentage of students who gave a correct-b type of explanation. Thus, we can infer that the main information conveyed by the textbook image about solar eclipse was the simple 2D alignment between the Sun, Moon, and Earth, as found in our previous study [118]. We remark that interpreting eclipses only through such a 2D alignment model is not sufficient, for instance, to explain the different conditions under which a new Moon and a solar eclipse happen, or to understand why solar eclipses are visible only from small regions of Earth.
We observed this trend also when analyzing drawings of lunar eclipses. The most relevant iconic feature of the textbook image used by group 1 was a geometrical optics-based construction of Earth and Moon shadows using special "rays" in a similar way to image formation by lenses. We expected that the resulting compositional structure (CST) would help students relate the shadow cones to the alignment in space of Earth, Sun, and Moon (SEL). However, this seems not to be the case, since drawings of group 1 that also featured shaded areas were less frequent than simple alignment drawings. Shaded areas were more frequent in the drawings of group 3 students, who received no image in the booklet. Such evidence suggests that, also in this case, the compositional structure of the textbook image interferes with the selection of relevant factors underlying the phenomenon. On the contrary, the images of the booklet used by group 2 seem to have fostered a more significant use of shaded areas, since shading appears in most of the postinstruction drawings. Our interpretation is that the separation of the alignment information from the "shading" information in two separate but related panels (INT) may have highlighted how shadows are produced by projection of sunlight on Earth and the Moon, without the need to emphasize geometrical optics rules, which students may not be familiar with. Such evidence confirms that visual information addressing different factors underlying basic astronomy phenomena is more effective for students' understanding if presented separately with suitable links [38]. Similarly, the collected evidence supports findings in Ref. [131], which suggest separating images and text to reduce cognitive load, presenting first the visual information and then the text, as we have done in the booklet of group 2.
Overall, using a contextually situated perspective to explain the nature of students' generated representations [96], the specially designed images likely helped the students to better link their perceptual framework of the addressed phenomena (e.g., the appearance of the Moon, the apparent motion of the Sun during the year) with the astronomical framework of the Sun-Moon-Earth motion. In this process of differentiating and relating such frameworks, a relevant role was likely played by specific iconic features that helped the students use more meaningfully the conventional iconic features found in textbooks (e.g., the tilt of Earth's axis, the position of the Sun relative to Earth and the Moon, the alignment of the Sun, Earth, and Moon, and the shaded areas in an eclipse).

VII. CONCLUSIONS AND IMPLICATIONS
Images of astronomical phenomena have often been criticized for being ambiguous, not sufficiently explanatory of the represented phenomena, and in some cases misleading [38,66,68,69,132]. Evidence at the basis of such criticisms has been drawn indirectly from the existence of well-studied students' alternative conceptions in this content area [102,104,133], even after specific instruction [134]. The hypothesized role of images in students' misconceptions about astronomical phenomena seems related to a generalized concern about the accuracy of scientific representations in textbooks [24]. Such concern is of particular interest for this study since, unlike simple illustrations that function mainly as representatives of objects [135], visual representations in astronomy also aim to foster understanding of the underlying mechanism of the phenomenon they represent [59].
However, few research studies so far have investigated the possible relationships between students' drawings, explanations, and interpretation of the represented phenomena [67,105]. To the best of our knowledge, this is the first systematic study that addresses this issue triangulating results across three familiar phenomena (change of seasons, Moon phases, and lunar and solar eclipses) using three different probes.
The combined analysis of visual representations, carried out through the adopted sociosemiotic theoretical framework, and of students' explanations about these phenomena supports criticism of textbook images. In particular, the textbook images used in the study featured prevalent iconic elements (e.g., ellipses, rays, and circles) that plausibly interfered with the identification of the relevant factors underlying the phenomena, thus leading to incorrect explanations. In other words, semiotic affordances of textbook images' features likely led the students to interpret the information contained in the image through partial or incorrect mental models [136]. On the other hand, the adopted theoretical framework proved to be useful also in the design of the innovative images. In this regard, the collected data allowed us to identify some specific iconic affordances of our images that may play an essential role in scaffolding meaningful understanding of the targeted phenomena in comparison to usual textbook images.
As a first implication, hence, our results may help improve the design of image-based instructional materials and teaching-learning sequences in astronomy. In particular, findings suggest the need to clarify the convention used in the images [40] and the links between realistic and schematic iconic elements [43], and to specify the correspondence between verbal and iconic elements [137]. Moreover, since students often use similar symbols (e.g., arrows) to indicate processes or different concepts such as space and time transitions [77,115], it would be appropriate to provide within the images suitable decoding keys for the depicted symbols using, for instance, different colors, shapes, shaded areas, or line contours. The effectiveness of the design choices adopted for our images encourages the use of compositional structures that may facilitate students' development of correct modeling, focusing on the relationships between the spatial structure and the mechanism underlying the phenomenon, such as the use of two different perspectives to explain the alignment of Sun, Moon, and Earth in a 3D view. Finally, the old motto "less is more" summarizes well the design choice of avoiding the coexistence of a temporal and a spatial sequence (so as to keep different situations in time and space separate) and the presence of iconic features that may resemble different phenomena (for instance, the arrows representing Earth rotation in a diagram about seasonal changes). By showing the effects of different instructional support, this study represents a first step towards a more complete picture of how the aforementioned suggestions may foster correct interpretations and the generation of visual representations in astronomy.
Our study also has a more general implication for teaching through and learning from visual representations. Science textbooks include an increasing use of multiple visual representations, including drawings, graphs, tables, and weblike features that try to compensate for the stillness of a printed image in the attempt to capture the attention of students while meaningfully explaining the presented concepts [23]. Moreover, modern digital technologies implemented on laptops, tablets, or interactive whiteboards have introduced new communication modalities, providing also "nonexperts" with the possibility to create and manipulate images. Such pervasive presence in curricular materials increasingly requires that teachers acquire skills and abilities regarding visual languages. In particular, teachers must be aware of different semantic levels of iconic features to fully exploit images and visual representations [46,90,138–141]. Our results suggest paying special attention to the relationships between different but related representations of phenomena; possible dissimilar meanings of verbal elements or symbols in the same panel; and emphasized graphical features, which could distract students' attention from the main factors underlying the represented phenomena.
Concerning research about student-generated visual representations, our study adds to previous efforts [142] by showing which mental models of astronomical phenomena students use more often in visual representations and how they correlate with explanation categories of increasing complexity. Understanding how mental models are expressed through different modalities may help gain some insight into the often-unclear relationships between students' drawings and their conceptions of the represented phenomenon [96]. For instance, we have found in the pretest that the most frequent drawings among the students who gave an incorrect explanation about seasonal changes were the distance-based ones (Distance 1, 46%; Distance 2, 42%). Correspondingly, the percentage of partial or correct explanations given by students who made an inclination-based drawing (Orbit & Tilt, Orbit & Rays, Orbit & Tilt & Rays) increased on average from 12% up to about 67%. We found similar patterns also for the other targeted phenomena. For instance, in the pretest, the most frequent drawings among the students who gave an incorrect explanation about the Moon phases were sequence-based ones (Sequence 1, 63%, and Sequence 2, 17%), while among the students who gave a partial or correct explanation of the solar and lunar eclipses, about half produced a Sun-included type drawing (Sun & Moon, Orbit & Sun). On the other hand, we also found, for instance, that among the students who gave an incorrect explanation about the Moon phases, about 40% produced a drawing featuring the Sun, and the Moon and Earth orbits (Orbit and Phases & Orbits models). Such models were also the most frequent ones among students who gave a correct account of the Moon phases (38% and 21%, respectively). We also found intermediate evidence.
For instance, the majority of the students who gave an incorrect or partial answer about solar and lunar eclipses in the postinstruction task (35%) produced a drawing featuring the orbit of the Moon and Earth. Amongst the students who provided a correct explanation of the phenomenon, a common drawing outcome was a simple alignment-based model (SEM, 23%).
Such findings thus support suggestions to use drawings in parallel with written explanations and, if possible, to add a further probe to triangulate data. In such a way, aspects that emerge primarily from either drawings or explanations may be contrasted to obtain a more detailed and nuanced knowledge of the students' conceptions. However drawings are used, in combination or not with written explanations, our study suggests the reliability of factor analysis to elicit students' models of a given phenomenon from their drawings. The main advantage of using factor analysis with respect to typical indexing schemes and self-made rubrics [91,98] is the possibility to identify common patterns, focusing on the conception expressed in the drawing and disregarding the effect of superfluous or difficult-to-represent symbols [7]. As a further implication, the factor analysis of drawings could be extended to other areas in physics in which misconceptions and visual representations both play an important role, as, for instance, that of electric circuits.
Finally, our findings call for further research on learning support conditions in astronomy education. We used drawings as a way to elicit underlying models and to negotiate evidence-based accounts of such familiar phenomena [88]. In this regard, studies on computer-supported drawings [143,144] suggest that engaging students in the generation of realistic drawings may favorably affect modeling skills. Similarly, further research studies are warranted to investigate the effects of simulation environments on students' representational competences in astronomy, in the same way as previous investigations carried out in chemistry [91]. Follow-up studies could investigate support conditions different from those examined in this paper, for instance, drawing vs images vs text, to foster quality learning from drawing [145].
Although it addresses research issues not well explored in the literature, this study has some limitations. First, generalization of the findings is limited by the small number of students involved. It is likely that, with different samples, the distribution of students' answers could have been significantly different. However, what we were interested in was how specific iconic features of textbook images and of innovative images impacted student-generated visual representations and explanations. As such, this is not a major limitation to either our results or the implications. A second limitation is related to the kind of images that we used in the group 2 booklet. It is likely that more effective images could have been designed to inspect whether other iconic features of the framework used (e.g., the gestalt of an image) had an impact on students' reasoning and visual representations. Although this limitation does not affect the validity of our results, addressing such an issue would be a fruitful objective for future research studies in the field of astronomy education.