Figure 8: Student Rating of the Frequency of Instructional Practices

On the surveys, the students rated the teacher-student interaction strategies as the most effective.

Teacher feedback on assignments had the highest overall average (M = 4.11 out of 5, SD = 0.96). This is consistent with other studies of online learning (King, 2014). The next highest-rated strategy was teacher explanations (M = 3.86, SD = 0.89). Table 2 shows the full statistical analysis. It seems that the online platform lends itself quite well to direct teacher instruction (see Figure 9).
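As a minimal sketch of how the summary statistics reported here and in Table 2 could be computed, assuming the ratings sit in a long-format pandas DataFrame; the file and column names (survey_responses.csv, strategy, rating) are hypothetical, since the study does not name its analysis software:

```python
import pandas as pd

# Hypothetical long-format survey data: one row per student response,
# with a 'strategy' label and a 1-5 effectiveness 'rating'.
df = pd.read_csv("survey_responses.csv")

# Mean and standard deviation of the rating for each strategy,
# sorted so the highest-rated strategies come first (cf. Table 2).
summary = (
    df.groupby("strategy")["rating"]
      .agg(M="mean", SD="std")
      .sort_values("M", ascending=False)
      .round(2)
)
print(summary)
```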

The responses to the student-content interaction strategies revealed that experiments at home were not as effective as observing teachers conduct experiments. Students found that performing or mimicking labs at home, ‘Vexperiments’, was the least effective strategy (M = 2.81, SD = 1.35). Students rated observing teachers conducting experiments in the lab, ‘Texperiments’, as 16% more effective (M = 3.61, SD = 0.87), a difference of 0.80 points on the 5-point scale. The context of the school laboratory and the expertise of the teacher made a difference to students’ perceived learning.

Figure 9: Student Perceptions of the Effectiveness of the Instructional Strategies

The open-response questions show that ‘pivot labs’ were the most effective virtual labs. Figure 10 shows the coded responses to the open questions on effective strategies.

The ‘pivot labs’ were overall the most effective instructional strategy for teaching science knowledge and skills. Students commented, “I found that pivot labs in class were the most helpful”, and “Going into breakout rooms and working on PIVOT labs is a great way to develop certain scientific skills in my opinion.” Pivot labs come from an education website whose modules of interactive videos show students phenomena and problems and then take them through a step-by-step process of analysis and inference, with questions, feedback, and scaffolded explanations (Pivot Interactives, n.d.). The science teachers began using these interactive videos as a substitute for in-person labs. The student responses do not give much information about why these ‘labs’ were so effective, but two possible reasons emerge. First, these videos most closely approximate in-person labs and require a fair amount of active analysis, inference, and hypothesis testing. The students worked through them at their own pace, with appropriate scaffolds, which facilitated learning. As one student put it, “Pivot labs, being able to work on lab work somewhat alone.” Second, the interactive platform gives immediate feedback on student answers. As one student explains, “Pivot Labs really help because when you submit an answer it gives you feedback on why it is right or wrong.” This tight feedback loop facilitates learning. For virtual science courses, pivot labs could be the best way for students to conduct ‘labs.’

Figure 10: Student Open Question Responses on Instructional Practices

The open-response answers also highlight the efficacy of group work and teacher experiments. Many students clearly benefited from the student-to-student interaction. As one states, “The most helpful strategy was the group assignments.” Although group work had the third-highest rating in the survey (M = 3.78, SD = 1.04), the open responses show that for 22 students, about 25%, it was the most effective strategy. This is consistent with the research on virtual learning, which has found that students benefit from peer interaction (Nortvig et al., 2018; Martin & Bolliger, 2018; Gray & DiLoreto, 2016). One student explains why this might be the case: “I really enjoyed my partner(s) and I felt very comfortable asking questions that I didn't want to ask the whole class. They made my learning easier.” Students learn from each other. The other strategy students consistently identified was observing teachers conducting experiments. For example, “In class experiments that my teacher conducted [were the most effective]”, and, “The most effective were definitely the labs and presentations that Mrs.--- did.” These answers give depth to the above analysis. Within the overall mean ratings of the practices there are some students for whom group work is the most effective strategy, and others for whom it is observing experiments.

Findings for Question 2: There were no significant differences in the student perceptions of the effectiveness of the assignments and activities engaging the five different scientific and engineering practices (SEPs).

Students perceived only small differences in the frequency of engaging in the SEPs, with negligible differences in the ratings of the different teachers. The most frequent practice, mathematical thinking (M = 3.35, SD = 0.71), was only 0.5 higher on a 4-point scale than the lowest-rated practice, using models (M = 2.85, SD = 0.70), which is only a 12.5% difference overall (see Table 3). Teacher C had the highest frequency score on four of the five SEPs, yet the difference between that and the lowest frequency score was under 0.5. It would seem that the teachers engaged the students in the SEPs to roughly the same extent over the course of the semester, and with a frequency in the same range as the instructional practices (see Figure 11).

Figure 11: Student Rating of the Frequency of the SEP Activities

The students rated the five SEPs almost the same in effectiveness. All of the SEPs had mean scores between 3.5 and 3.7, with a median of 4 and a standard deviation of around 1, except for the practice of using models (see Table 4, Figure 12). This practice was the lowest rated, but not by much (M = 3.37, SD = 1.05, median = 3). In the open-response questions one student spoke of mathematical thinking as the most helpful: “The mathematic problems were nice because they helped me make a good connection to what I was learning.” In the survey that practice received the highest rating (M = 3.78, SD = 1.02), yet overall it does not stand out among the other ratings. On the basis of the survey data, there are no strong conclusions to draw about the relative value of the different SEPs in teaching science in a virtual environment.

Figure 12: Student Perception of the Effectiveness of the SEP Activities

Findings for Question 3: Overall, student academic outcomes during virtual learning did not differ significantly from their ninth-grade outcomes, but the students in the lowest quartile did better relative to the other students and the students in the highest quartile did worse, with students in one teacher’s section showing a small improvement relative to the other three teachers’ students.

The multiple linear regression showed that virtual learning did not have a significant impact on student outcomes. I converted all of the student outcome scores to z-scores and then ran the regression with the ninth-grade z-score as the independent variable and the tenth-grade z-score as the dependent variable, with the teachers as additional predictor variables (see Table 5). The results show that the ninth-grade score predicts the tenth-grade score with a coefficient of +0.66 (p < 0.001, df = 98). Figure 13 shows the strong linear relationship between the two sets of scores.

Figure 13: Student 9th and 10th Grade Science Achievement Scores
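As a sketch of this analysis, assuming the data live in a pandas DataFrame with hypothetical columns score_9, score_10, and teacher, the z-score conversion and regression could be run as follows (statsmodels is one reasonable choice; the study does not name its software):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical columns: score_9 and score_10 are the raw ninth- and
# tenth-grade achievement scores; teacher labels the section (A-D).
df = pd.read_csv("scores.csv")

# Convert each year's scores to z-scores relative to the cohort.
df["z9"] = (df["score_9"] - df["score_9"].mean()) / df["score_9"].std()
df["z10"] = (df["score_10"] - df["score_10"].mean()) / df["score_10"].std()

# Regress the tenth-grade z-score on the ninth-grade z-score, with the
# teachers entered as categorical (dummy-coded) predictors.
model = smf.ols("z10 ~ z9 + C(teacher)", data=df).fit()
print(model.summary())
```

Dummy-coding the teachers via C(teacher) makes each teacher's coefficient a contrast against a baseline section, which is how a single teacher can emerge as a predictor of higher scores in the results below.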

The analysis of the changes in the z-scores by quartile shows that the virtual learning environment positively affected the lowest-achieving students’ scores relative to the mean and negatively affected the highest-achieving students’ scores (see Table 6). A paired t-test showed that the mean z-scores of the lowest quartile increased by 0.43 from their ninth-grade to their tenth-grade science scores (p = 0.028), showing that these students improved significantly relative to the other students. The paired t-test also showed that the students in the highest quartile were adversely affected by the virtual environment relative to the other students, with a mean decrease in z-score of -0.42 (p = 0.0014). The other two quartiles showed very little variation from ninth to tenth grade.
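A sketch of the quartile analysis under the same assumptions, using scipy's paired t-test; the quartile assignment by ninth-grade z-score is an illustrative reconstruction:

```python
import pandas as pd
from scipy import stats

# Hypothetical columns: z9 and z10 are the z-scored ninth- and
# tenth-grade achievement scores (see the regression sketch above).
df = pd.read_csv("z_scores.csv")

# Assign each student to a quartile based on the ninth-grade z-score.
df["quartile"] = pd.qcut(df["z9"], 4, labels=["Q1", "Q2", "Q3", "Q4"])

# Paired t-test of the within-student change in z-score per quartile.
for q, group in df.groupby("quartile", observed=True):
    t_stat, p_val = stats.ttest_rel(group["z10"], group["z9"])
    change = (group["z10"] - group["z9"]).mean()
    print(f"{q}: mean change = {change:+.2f}, t = {t_stat:.2f}, p = {p_val:.4f}")
```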

There are different possible reasons for this variation between the lowest and highest quartiles. One possible reason for the lowest quartile’s improvement could be the lower overall grade average on the tenth-grade assessment, only 78.74%, compared to the ninth-grade average of 84.89%. Overall, the students did worse, on average, on their tenth-grade exam than on their ninth-grade final score, based on raw percentage. This could partly explain the 0.43 increase in z-score in the lowest quartile, since the lower mean raises their z-scores even if they performed at the same level as in ninth grade. Yet, even if this is the case, the increase still shows that the virtual environment did not affect the lower-achieving students’ scores as much as it did the other students’. The highest-achieving students in ninth grade had the greatest decrease in test scores relative to the other students, suggesting that the virtual environment negatively affected their learning more than it did that of other students. The reasons for this are not clear from the data. In an informal interview, the lead teacher told me that all four teachers had to reduce the content of the curriculum as they made their way as best they could in teaching virtually for the first time. The complexity of all of the changes the teachers had to make could explain why the variation occurred.

Turning to the impact the different teachers had on student outcomes, the regression analysis shows that having teacher D predicts a higher tenth-grade score compared to the other teachers. Holding all other variables constant, students of teacher D increased their scores by 0.54 standard deviations relative to the mean from ninth to tenth grade (p = 0.02).

How does one account for this difference? One explanation could have to do with the demographic breakdown of the students in the evaluations. I was not able to control for sex and ethnicity. However, previous information shows that teacher D had a higher proportion of male, white, and Asian students, all of whom typically fare well in science courses.

Another possible explanation could be that teacher D used some instructional practices more or less than the other teachers. In an attempt to determine whether the differences between teachers in the frequencies of the instructional practices are large enough to merit attention, I ran Bartlett’s homogeneity of variance test on the frequency scores of the instructional practices grouped by teacher (see Table 7). The only significant differences were the variances in group work frequency (B-stat = 13.47, p = 0.0037) and using models (B-stat = 8.77, p = 0.032). Teacher D used group work the least of all the teachers, according to the students. But since the students rated group work as one of the most effective strategies, that does not seem to explain the difference. Teacher D engaged students in using models the most of all the teachers, yet the students rated this activity the lowest of all the SEPs. Therefore, that does not seem to explain the difference either. Perhaps teacher D used other instructional strategies, not captured in the survey, that account for the relative increase in student scores.
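For illustration, Bartlett's test on one practice's frequency ratings might be run as below; the file and column names are hypothetical:

```python
import pandas as pd
from scipy import stats

# Hypothetical columns: group_work_freq is a student's frequency rating
# for group work; teacher labels the student's section.
df = pd.read_csv("practice_frequencies.csv")

# Bartlett's test for homogeneity of variance across the four sections.
samples = [g["group_work_freq"].dropna() for _, g in df.groupby("teacher")]
b_stat, p_val = stats.bartlett(*samples)
print(f"B-stat = {b_stat:.2f}, p = {p_val:.4f}")
```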

Findings for Question 4: Student interest in learning more about science increased during the semester, with Asians showing the largest increase, but not their interest in pursuing science and engineering in college and in a career, except for small increases for black and male students.

There are a number of noteworthy results of the paired t-tests on the interest survey responses (see Table 8). Perhaps against what one would expect, the overall student interest in science, a science career, and studying science in college increased slightly during the semester. The largest effect size was general interest in science, with a mean increase of +0.19 on a 5-point scale and a p-value of 0.0277.² Also of interest is that interest in a science career and in studying science in college both increased by less than half as much as general interest in science did. This suggests that students are not connecting their interest in science with their longer-term plans to pursue scientific study and a scientific career. Finally, interest in engineering did not increase noticeably, with interest in an engineering career and studying engineering in college decreasing very slightly overall. The effect sizes are very small, so one cannot draw strong conclusions. However, when compared to the increase in interest in science, it may suggest that online learning is not conducive to increasing interest in engineering.

² In all of my statistical analysis I do not use the phrase ‘statistical significance’ or the p-value thresholds of 0.05 and so on. This is in response to the American Statistical Association recommendation to refrain from using the p < 0.05 threshold as the key value for drawing conclusions from statistical data and to retire the phrase ‘statistical significance’ as misunderstood and misleading (Wasserstein et al., 2019).

There are a number of important differences in the degrees of interest of the females and males. Females and males showed the same size increase in general interest in science, +0.19, but males showed much larger increases on all of the other questions compared to the females, sometimes more than three times as large. Furthermore, females’ interest in engineering declined over the course of the semester, with the largest estimate in the whole dataset on the question of interest in learning more about engineering in college, −0.39 (p = 0.0096). The males increased on that question by an estimate of +0.19 (p = 0.37).

Comparing the results of the ethnic groups also shows some noteworthy differences. All four demographic groups increased their overall interest in science, with the Asian students having the largest increase, +0.36 (p = 0.083). The Black students showed a healthy increase in interest in studying science in college, +0.33 (p = 0.10), larger than any other group. As with the overall student results, changes in interest in engineering careers and in studying engineering in college were slightly negative for all groups, with white students having the largest change, −0.15, for college engineering. Overall, however, the mean differences in interest scores were very small across almost all of the tests, so it is difficult to draw strong conclusions from the results. The incompleteness of the data, with several surveys discarded for incomplete responses, reduced the sample size and thus the statistical power of the tests.
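The subgroup comparisons could be reconstructed along the same lines, pairing each student's start- and end-of-semester ratings; pre, post, and group are hypothetical column names:

```python
import pandas as pd
from scipy import stats

# Hypothetical columns: pre and post are a student's start- and
# end-of-semester ratings on one interest question; group is the
# demographic category (sex or ethnicity).
df = pd.read_csv("interest_survey.csv").dropna(subset=["pre", "post"])

# Paired t-test of the pre/post change within each demographic group.
for name, sub in df.groupby("group"):
    t_stat, p_val = stats.ttest_rel(sub["post"], sub["pre"])
    estimate = (sub["post"] - sub["pre"]).mean()
    print(f"{name}: estimate = {estimate:+.2f}, p = {p_val:.4f}")
```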
