Concurrent with many of the studies cited, I have recently completed a five-and-a-half year longitudinal study concerning the effectiveness of course flipping in a moderately-sized honors general chemistry class. My fundamental research question was whether or not course flipping would provide significant improvements in learning outcomes in a general chemistry classroom setting.
The following will discuss the manner of the study as well as the outcomes. This study was approved by the Texas Tech University Institutional Review Board.
The Course of Instruction
The author began course flipping in the spring semester of 2009 in an on-line graduate conceptual chemistry class taught at Texas Tech University as part of a multidisciplinary master’s science degree. The methodology was born of much the same motivation that Bergmann and Sams had for flipping at the high school level, i.e., to be able to maximize teaching effectiveness during the limited time available with the students. By 2010 it was clear to the author that “course flipping” (as the term had been coined) could potentially provide a more effective means of improving SLOs in the general chemistry classroom than the normal lecture-homework-exam paradigm. A study was begun to evaluate the efficacy of the pedagogy. The first classes involved were the Honors General Chemistry courses for F 2011 - S 2012 at Texas Tech University (CHEM 1307, Principles of Chemistry I (Fall) and CHEM 1308, Principles of Chemistry II (Spring)). The courses contained 75 and 70 students, respectively, and met T Th 9:30 AM – 10:50 AM. In 2015 the class size was increased to 96 students to cover an increasing population of honors students.
Publication Date (Web): December 1, 2016 | doi: 10.1021/bk-2016-1228.ch002
All of the lectures for each course were pre-recorded in the summer of 2011 (for CHEM 1307) and the fall of 2011 (for CHEM 1308) using a Mediasite® recorder, a document camera, and a video camera. The Mediasite® recorder had picture-in-picture capabilities. All of the lectures were composed of class notes with strategic blanks for examples to be worked. The notes were provided to the students, who could then fill in the blanks while watching the lectures. This allowed for a tactile component to the learning process as well as the visual and audio representations in the recordings. The lectures were subsequently re-recorded in the summer of 2015 in high definition. Table 1 shows the characteristics of the lectures. Although the average lecture time was over 30 minutes, in a free-response survey given at the end of the first year of flipped instruction in CHEM 1308, 67% of the students thought that the videos were not too long. The author has polled the students in the CHEM 1308 classes every year from 2012 to 2015 concerning the length of the videos and has found a similar response. The main comment was that the students could pause the videos if they wanted to break up the time spent viewing. There have been limited studies in the STEM disciplines concerning video length as it relates to improved SLOs (42). The author is currently performing a study of video length as it relates to course flipping in general chemistry.
Table 1. Lecture Characteristics

                             Lecture Times (min:s)
            No. of Videos    Shortest    Longest    Average
CHEM 1307   26               3:40        64:35      33:20
CHEM 1308   27               19:27       47:17      31:25
The syllabus carefully listed the lecture number and the topic for each class.
Each lecture was correlated to a folder on the Learning Management System (LMS; here, Blackboard) that contained the videotaped lecture, a set of notes to be filled in, and a link to the Cengage online learning platform OWL for post-video quizzing. After watching each lecture (homework), the students then worked 6-10 homework questions using the OWL format (Mastery question bank). The
scores were then recorded, and counted for 150 points out of a total of 800 points allotted for the course. This allowed for a determination of who had watched the videos each week.
Class time was divided into two parts:
- First half: Review of the lecture material. During this time, the instructor checked (in a discussion format) for main ideas and was able to clear up any misconceptions. In this way the instructor could determine what the class had learned by watching the videos, and could provide additional information and insight, as well as prevent any misconceptions or muddiness from propagating through the curriculum. This review often involved a variety of techniques, including having the students “act out” molecular-level processes.
- Second half: Problem solving, using problems from the textbook (Oxtoby, Gillis, and Campion/Butler, Principles of Modern Chemistry, 7th and 8th Eds., Cengage Learning, 2012 and 2015) (43,44) that had been previously indicated in the syllabus, so that the students could try the problems before coming to class, if so desired. Problems were worked in a variety of formats, depending upon the material and the class, including group work, going to the board, modeling the answers, think-pair-share, etc.
In addition to class time, the class was roughly divided in half, and each half attended one of two zero-credit-hour, 1.5-hour discussion sections. The discussion sections provided additional interaction with the course material as well as preparation for a quiz given during each of the sections (one quiz per week per student). Three exams were given during the semester, as well as a final exam (cumulative). An ACS End-of-Term exam was administered as a pre- and post-test. As a way of incentivizing effort on this exam, students were told that if they scored at or above the 90th percentile in the post-test, they did not have to take the class-based final exam.
An additional item that is often discussed is the number of contact hours in the flipped model compared to course credit hours, so that the students are not engaged for longer than the number of credit hours mandates. This is not really an issue, as in a traditional lecture-homework format, the out-of-class homework can take a variable number of hours, depending on the number and level of difficulty of the questions asked. Care is often taken in the development of flipped classes so that if there is out-of-class assessment, the number of questions is relatively small (here 6-10 low to moderate-level questions, to assess initial understanding of the lecture material only). Given the in-class time constraints, the number of advanced problems worked is usually small (in this study, typically 3-5, after an initial discussion).
Evaluation of the Model: Methods of Assessment
Four methods of assessment were used to determine whether or not course flipping in the manner described above would improve learning outcomes. These were:
• Method I: Class-based Exam Score Comparisons
• Method II: ACS End-of-Term Exam Score Comparisons
• Method III: 40-Question Likert Scale Questionnaire
• Method IV: Free-Response Questionnaire (Spring, 2012)
Evaluation of the Model: Results and Outcomes
The author has been the only teacher of the Honors sections of CHEM 1307 and 1308 since 1998, and, as such, has access to data for relatively homogeneous student populations over time (the average SAT score (verbal + math; pre-2012 scale) over the period of study was 1350 ± 50). Consequently, a historical approach has been used for comparison. The demographics of the study group are shown in Table 2.
Table 2. Demographics of Study Group

                Fall   Fall   Fall   Fall   Fall   Fall   Spring  Spring  Spring  Spring
                2006   2011   2012   2013   2014   2015   2007    2012    2014    2015
Male %          44     58     45     58     49     43     35      53      43      38
Female %        56     42     55     42     51     57     65      47      57      62
White %         93     84     88     80     78     79     88      77      69      72
Hispanic %      4      5      7      7      7      12     7       7       19      7
Asian/Other %   2      11     5      12     14     9      2       14      11      20
Black %         0      0      0      1      1      0      2       1       1       1
Table 3 provides the average scores for the three exams and final exam that were given in the fall of 2006 in CHEM 1307 (pre-flipped) as well as the fall exam periods from 2011-2015, the years in which the study was conducted. For each comparative data set, a 1-tailed heteroscedastic t test was performed to determine the statistical significance.
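A one-tailed heteroscedastic (Welch) t test of this kind can be sketched from summary statistics alone. The snippet below is a minimal illustration and not the author's analysis: the function name `welch_t` is hypothetical, and the inputs are the Average column of Table 3 for F 2006 and F 2011. Because the per-student scores are not given in the text, the sketch should not be expected to reproduce the tabulated p values; a one-tailed p value would be read from the t distribution using the computed degrees of freedom.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch (unequal-variance) t statistic and degrees of freedom
    for H1: mean_b > mean_a, computed from summary statistics."""
    # Squared standard error of each group mean
    se2_a = sd_a ** 2 / n_a
    se2_b = sd_b ** 2 / n_b
    t = (mean_b - mean_a) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df

# Illustrative input only: Average column of Table 3, F 2006 vs F 2011
t_stat, dof = welch_t(78.6, 12.7, 45, 87.8, 11.1, 73)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The unequal-variance form is used here because the two cohorts differ in both size and spread, so a pooled variance would not be appropriate.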
Table 3. Exam Score Comparisons, CHEM 1307, Fall, 2011-2015

                  Exam I (Std Dev)  Exam II      Exam III     Final        Average      Significance
F 2006 (n = 45)   77.8 (14.8)       77.9 (12.8)  79.9 (11.2)  78.8 (12.0)  78.6 (12.7)  -
F 2011 (n = 73)   87.5 (9.2)        85.4 (10.3)  85.8 (10.0)  92.4 (14.9)  87.8 (11.1)  (p = 0.0011)
F 2012 (n = 75)   92.6 (6.3)        85.8 (7.7)   84.5 (11.9)  90.6 (10.3)  87.6 (9.1)   (p = 0.0026)
F 2013 (n = 76)   90.2 (7.1)        87.4 (12.2)  83.2 (12.3)  89.8 (13.4)  87.7 (11.3)  (p = 0.0022)
F 2014 (n = 74)   90.6 (5.8)        90.4 (7.2)   87.2 (8.8)   88.1 (10.1)  89.1 (8.0)   (p = 0.0023)
F 2015 (n = 89)   90.9 (10.1)       86.7 (11.8)  88.8 (7.8)   87.8 (12.9)  88.6 (10.6)  (p = 0.0024)
Average Score     90.4 (8.9)        87.1 (10.3)  85.9 (10.3)  89.7 (12.3)  88.3 (10.5)  (p = 0.00008)
Average Δ         12.6              9.2          6.0          10.9         9.7
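The Average Δ row of Table 3 is the difference between the mean flipped-year score and the F 2006 score in each column. A short sketch of that arithmetic follows; the variable and function names are illustrative, and the relative change on the last line is an added calculation rather than a figure reported in the table.

```python
# Mean scores transcribed from Table 3 (CHEM 1307)
columns = ["Exam I", "Exam II", "Exam III", "Final", "Average"]
pre_flipped = [77.8, 77.9, 79.9, 78.8, 78.6]   # F 2006 row
flipped_avg = [90.4, 87.1, 85.9, 89.7, 88.3]   # "Average Score" row, F 2011-2015

def score_deltas(baseline, flipped):
    """Difference (flipped - baseline) per column, rounded to 0.1 point."""
    return [round(f - b, 1) for b, f in zip(baseline, flipped)]

deltas = score_deltas(pre_flipped, flipped_avg)
for name, delta in zip(columns, deltas):
    print(f"{name}: +{delta} points")

# Change in the overall average relative to the F 2006 baseline, in percent
relative_gain = round(100 * deltas[-1] / pre_flipped[-1], 1)
print(f"Relative gain in overall average: {relative_gain}%")
```

Keeping the absolute difference (in points) separate from the relative change (in percent of the baseline) avoids ambiguity when the gain is quoted in prose.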
It is important to note that the exams given in the flipped classes were the exact exams given in 2006. Consequently, in this part of the study, the same instructor, same content, and same exams were used. As Table 3 demonstrates, the average increase in the exam scores as a result of using the course flipping pedagogy is more than nine percentage points. Each of the increases for each set of exams in the flipped class years relative to the pre-flipped year for CHEM 1307 is statistically significant at the p < 0.01 level. This is especially notable, given that the number of students taking the course increased by nearly 70% from 2006 to 2011. The more effective use of classroom time seems to be one reason for the increase in test scores. A similar effect was seen during the second semester of general chemistry, as noted in Table 4. The author did not teach this course in 2013.
As in the case with CHEM 1307, the 2012 and 2007 exams were the same.
Again, each of the increases for each set of exams in the flipped class years relative to the pre-flipped year for CHEM 1308 is statistically significant at the p < 0.01 level. The largest effect during the three years of the study was observed in the final exam statistics for CHEM 1308. There are several possible reasons why this might be the case. The ability to review the material due to the recorded nature of the lectures has been cited by the students (vide infra). Also, the continual preparation afforded by additional active learning during class time provides for a stronger ability for synthesis (the final exams were, in all cases, cumulative).
Table 4. Exam Score Comparisons, CHEM 1308, Spring, 2012-2015

                  Exam I (Std Dev)  Exam II      Exam III     Final        Average      Significance
S 2007 (n = 43)   84.6 (12.2)       81.4 (9.5)   83.6 (8.0)   72.8 (13.8)  78.6 (10.9)
S 2012 (n = 70)   91.2 (6.5)        84.7 (7.9)   86.2 (10.5)  87.4 (14.1)  87.3 (9.8)   (p = 0.0015)
S 2014 (n = 75)   89.6 (8.3)        84.5 (8.3)   86.4 (12.5)  87.9 (17.5)  87.1 (11.7)  (p = 0.0020)
S 2015 (n = 82)   90 (5.6)          86.7 (6.7)   88.8 (7.2)   87.9 (10.5)  88.4 (7.5)   (p = 0.0016)
Average Score     90.3 (8.2)        85.3 (8.1)   87.1 (9.6)   87.7 (14.0)  87.6 (10.0)  (p = 0.0017)
Average Δ         5.7               3.9          2.5          14.9         9.0
In an attempt to mitigate any instructor bias in the preparation of the assessments, the 2005-2006 American Chemical Society First and Second Term General (EOT I and II) algorithmic exams were administered to the students (except in 2007, when a shorter, conceptual test was given; these results are not reported, due to the differing nature of the test). The numbers of students scoring above the 95th and 80th percentiles on the 2005 ACS First Term General Chemistry exam over a ten-year period are shown in Table 5. It is striking that in the pre-flipped 2009 class only two students scored at or above the 80th percentile. Once flipping began in earnest in 2011, the number of students scoring above this benchmark began to increase (an average of 16.0% ± 6.2 scored above the 80th percentile pre-flipping, while 23.2% ± 4.2 scored above the 80th percentile post-flipping (p = 0.09)). Given that the exam was administered to a relatively homogeneous student population with the same instructor teaching the class each year, the results suggest that the enhanced student attention and time on task provided in the flipped environment are a likely cause of the improvement.
Similar results were observed for the two most recent years in which CHEM 1308 was taught by the author, using the 2006 ACS Second Term General Chemistry Exam as a comparison with the spring semester of 2010 (Table 6).
Spring of 2010 was the last semester that CHEM 1308 was taught in a non-flipped format.
Table 5. 2005 ACS First Term General Chemistry Exam Comparisons for CHEM 1307

                      F 2005     F 2006     F 2008 (a)  F 2009     F 2010
Above 95th %ile       3          3          0           0          2
80-94th %ile          5          9          6           2          6
Total >80 %ile (%)    8 (18.6)   12 (26.1)  6 (13.6)    2 (4.2)    8 (17.4)

                      F 2011     F 2012     F 2013      F 2014     F 2015
Above 95th %ile       5          4          1           4          9
80-94th %ile          10         9          7           11         22
Total >80 %ile (%)    15 (20.0)  13 (18.3)  8 (23.5)    15 (20.3)  31 (33.7)

(a) In 2007 a short form of the ACS End-of-Term Exam was used.
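The pre- and post-flipping averages quoted in the text can be recomputed from the percentage row of Table 5. The sketch below shows only that averaging step; the dictionary names are illustrative, and the grouping into pre-flipping (F 2005 to F 2010) and post-flipping (F 2011 to F 2015) cohorts follows the description in the text.

```python
from statistics import mean

# Percent of each class scoring above the 80th percentile (Table 5)
pre_flipping = {"F 2005": 18.6, "F 2006": 26.1, "F 2008": 13.6,
                "F 2009": 4.2, "F 2010": 17.4}
post_flipping = {"F 2011": 20.0, "F 2012": 18.3, "F 2013": 23.5,
                 "F 2014": 20.3, "F 2015": 33.7}

# Unweighted mean across semesters (each class counts equally)
pre_mean = mean(list(pre_flipping.values()))
post_mean = mean(list(post_flipping.values()))

print(f"Pre-flipping mean:  {pre_mean:.1f}%")
print(f"Post-flipping mean: {post_mean:.1f}%")
```

An unweighted mean treats each semester as one observation regardless of class size, which matches a comparison made across cohorts rather than across individual students.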
Table 6. 2006 ACS End-of-Term II Exam Comparisons for CHEM 1308

                      Sp 2010    Sp 2014    Sp 2015
Above 95th %ile       3          3          10
80-94th %ile          2          9          13
Total >80 %ile (%)    5 (10.4)   12 (16.4)  23 (27.7)
Several other studies have used ACS End-of-Term exam average percentile rankings as indicators of overall improvements in SLOs using the flipped course format compared to a traditional lecture format (12, 14). One criticism that could be raised about using percentile averages is that they do not indicate overall learning gains over the course of the semester, merely the relative content knowledge of the students at the end of the course of instruction. If the students entered the course with significant prior knowledge, then only a marginal increase in percentile score might be expected using either a traditional or a flipped course approach, yet this would still be recorded as a high score.
In an attempt to determine whether the flipped course of instruction resulted in significant increases in content knowledge (at least in an algorithmic sense, as the conceptual ACS exam was not routinely administered), we performed a pre- post differential analysis for CHEM 1307 comparing F 2008 with F 2015. These were the two years with the largest differentials between pre- and post-flipping percentiles within the data set. The results are shown in Figure 1. The histograms are presented as the percent of students within each class that achieved the pre-post differential within the given bin. Percentages are used to normalize to class sizes.
The normal distribution is superimposed for each year.
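The normalization described here, where each histogram bar is the percent of the class falling in a bin, can be sketched as follows. This is an illustration under stated assumptions: the function name `percent_histogram`, the bin width of 10 percentile points, and the sample differentials are all hypothetical, since the text does not give the bin edges or the per-student data.

```python
from collections import Counter

def percent_histogram(differentials, bin_width=10):
    """Map each bin's lower edge to the percent of students whose
    pre-post percentile differential lies in [edge, edge + bin_width)."""
    # Floor division assigns each differential to the lower edge of its bin
    counts = Counter((d // bin_width) * bin_width for d in differentials)
    n = len(differentials)
    return {edge: 100 * counts[edge] / n for edge in sorted(counts)}

# Hypothetical differentials (post-test percentile minus pre-test percentile)
sample = [12, 35, 38, 41, 47, 52, 29, 33, 60, 44]
for edge, pct in percent_histogram(sample).items():
    print(f"{edge:>3} to {edge + 10:<3}: {pct:.0f}% of class")
```

Expressing each bin as a percent of the class rather than a raw count is what allows cohorts of different sizes to be overlaid on the same axes.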
Figure 1. Percentile Differentials for 2008 and 2015 ACS EOT I Exam
The average normal differentials are 35 and 42 for 2008 and 2015, respectively. The results are significant at the p < 0.01 level. A similar and more compelling result is seen for the spring 2010 and spring 2015 pre-post percentile differentials (Figure 2). The average normal differentials are 27 and 34 for 2010 and 2015, respectively. The results are significant at the p < 0.001 level. These data tend to indicate that the use of the flipped class pedagogy significantly increases student learning outcomes compared to the traditional lecture format throughout the course of an entire semester of study for honors students. Since the students in this study were honors students who were pre-selected by the Honors College to be placed in the class, we can add little to the question of whether or not a flipped class environment will be of benefit to lower-level students.
Figure 2. Percentile Differentials for the 2010 and 2015 ACS EOT II Exam
It should be noted that only small variations in the average pre-test percentile scores were noted between the years shown in Figures 1 and 2 (EOT I Exam, 2008, 21.4 ± 8.9; 2015, 23.2 ± 11.5; EOT II Exam, 2010, 16.4 ± 9.2; 2015, 16.8 ± 8.75), further indicating the significance of the pre-post differential.
Likert Questionnaire
A questionnaire was administered to the students in each class over the course of the 2011-2012 academic year. The number of students responding in the fall of 2011 (CHEM 1307) was 63, while 43 provided responses in the spring of 2012 in CHEM 1308. The following are some of the significant responses (score > 3.0) using a 1 (disagree; negative) to 5 (agree; positive) Likert scale for the fall, 2011 CHEM 1307 class. The numbers in parentheses refer to the positive response percentage in the spring 2012 CHEM 1308 class, and the number in brackets refers to the average score for the spring class.
• 52 (70) % [3.50 ± 0.55] of students thought that they spent more time in the flipped course
• 78 (86) % [4.05 ± 0.93] of the class felt that time shifting put more of the responsibility for learning the material on the student
• 57 (77) % [3.65 ± 0.61] of the class agreed that there was increased interaction between the professor and class in the time-shifted format compared to other classes
• 75 (91) % [4.03 ± 0.88] of the class felt that the instructor worked an adequate number of examples in class.
• 78 (90) % [4.34 ± 1.16] of students believed that the instructor was a partner in their learning of chemistry.
• 69 (67) % [4.03 ± 0.91] of students liked the use of OWL to test their understanding after watching the lecture
• 77 (93) % [3.26 ± 0.20] of students, knowing what they know now, would NOT have taken a different section.
• 37 (72) % [3.90 ± 0.92] of students would take another time-shifted course, while 23% felt that this was n/a.
• 55 (84) % [3.57 ± 0.66] of the students felt that the time-shifted lecture/discussion section format was useful, while 22 (7) % of the students did not.
It should be noted that the increase in positive responses during the spring semester is most likely due to the fact that the majority of the students who took CHEM 1308 took CHEM 1307 the previous semester in a flipped format.
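Each item above is reported as a positive-response percentage together with a mean ± standard deviation. A minimal sketch of how such a summary could be computed from raw 1 to 5 responses follows. The function name, the sample responses, and in particular the cutoff that counts a response of 4 or 5 as positive are assumptions made for illustration; the text does not state how a positive response was defined.

```python
from statistics import mean, stdev

def summarize_likert(responses, positive_threshold=4):
    """Return (percent positive, mean, sample SD) for 1-5 Likert responses.

    A response counts as positive when it is >= positive_threshold
    (assumed here to be 4; the source does not specify the cutoff).
    """
    positive = sum(1 for r in responses if r >= positive_threshold)
    pct_positive = 100 * positive / len(responses)
    return pct_positive, mean(responses), stdev(responses)

# Hypothetical responses to a single questionnaire item
item_responses = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]
pct, avg, sd = summarize_likert(item_responses)
print(f"{pct:.0f}% positive [{avg:.2f} ± {sd:.2f}]")
```

Reporting the percentage alongside the mean is useful because two items with the same mean can differ considerably in how many students responded positively.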
Free-Response Questionnaire
In addition to the Likert-scale questionnaire, a free response section on the questionnaire given after the CHEM 1308 course was complete provides some useful information with regard to some of the mechanical aspects of the course.
Some of the relevant data include:
• How many lectures per week did you actually watch? 79% of the students watched all of the lectures.
Commentary: This tends to indicate that the students did not pick and choose which lectures to watch.
• How many classes per week did you actually attend? 86% of the students attended every class.
Commentary: This allayed some of the fear that students would believe that watching the out-of-class videos was sufficient to provide all of the relevant content for the course.
• When watching the lectures did you actually watch it as if you were actually attending class? 74% of the students watched the lectures as if actually attending class.
Commentary: The students tended to treat the videos in an analogous fashion to actually attending the lecture. The use of a notebook with blanks that could be filled in while watching the video probably enhanced this effect.
• Did you watch the video lectures as review for quizzes and exams? 60% of the students did NOT watch the video lectures as review for quizzes and exams.
Commentary: Although the students did not watch the entire video lecture set as a review for exams or quizzes, they reported that they spot-watched the video as a review for topics that were somewhat unclear for them.
• Did you think this method of teaching left more time for procrastination between tests compared to a traditional lecture style? 53% of the students said that this method did not leave more time for procrastination.
Commentary: The pace of the course was designed to relatively closely match the number of contact hours relative to the traditional lecture course.
• How many hours per week on average did you study for this class? 76% of the students studied between 2-4 hours per week. 17% of the students studied more than 4 hours. 7% of the students studied less than 2 hours.
Commentary: The reduction in the number of hours on average that the students “studied” for the course is most likely due to the increased repetition (albeit in different formats) with which they worked with the various topics in the course. They studied less because their study time was used more efficiently.
• How many hours do you prepare for each test? 36% of the students prepared 1-2 hours; 20% of the students prepared 3-4 hours; 27% of the students prepared 5-6 hours; 17% prepared 7 or more hours.
Commentary: The reduction in the number of hours on average that the students “studied” for exams is again most likely due to more efficient time on task.
• When you attended class, on a scale of 1-5 did you actually work the problems with the professor (5) or just simply write the answers down (1)? Those answering 3-5: 81%. Those answering 0-2: 19%.
Commentary: This is almost of necessity, as the problems were worked using active and engaged learning strategies.
• On a scale of 1 to 5 (1 being no, 5 being very important) did you think you actually needed to watch the lectures to earn a good grade in this class? 79% of the students answered 4-5.