How to insert visual information into a whiteboard animation with a human hand? Effects of different insertion styles on learning
Smart Learning Environments volume 10, Article number: 39 (2023)
Whiteboard animations have become very popular in recent years. They are mainly used in distance education, where learners can acquire knowledge individually and without the help of a teacher. However, there is little empirical evidence on how whiteboard animations should be designed to achieve learning-enhancing effects. Since the presentation of whiteboard animations is reminiscent of a teacher drawing or showing content on a whiteboard, the hand has been identified as an essential feature of this learning medium. Therefore, the aim of this experimental study was to investigate whether and how the human hand should be implemented in whiteboard animations for the presentation of visual content. University students (N = 84) watched a whiteboard animation in which the type of information insertion was manipulated (hand drawing content vs. hand pushing content in vs. no hand visible). Results revealed that the drawing hand on a whiteboard led to significantly higher intrinsic motivation than the hand pushing visual content onto the whiteboard. Contrary to assumptions derived from cognitive load theory, the implementation of a human hand did not cause extraneous cognitive load. However, no other effects on the perception of the instructor, cognitive load, and learning performance were found. The results are discussed in terms of both cognitive and social processes in multimedia learning.
Imagine the following situation: You are sitting in a classroom and listening to the teacher as he or she writes down learning content on the whiteboard. What probably reminds most people of (analog) school days gone by can now also be achieved with whiteboard animations. With the growing use of technology-based learning environments, whiteboard animations have become a popular tool used in distance learning and informal instructional contexts, such as YouTube. They are the digital equivalent of analog drawings on a whiteboard. Whiteboard animations show the digital process of drawing or animating pictures on a whiteboard accompanied by a human voice (Türkay, 2016). However, there is a discrepancy between the increasing use of whiteboard animations in educational contexts and the gap in knowledge about how to design them. Perhaps the most striking feature of whiteboard animations is the human hand writing or pushing content on the whiteboard (e.g., Fiorella & Mayer, 2016). Considering both cognitive and social processes during learning, this study aims to examine whether the presence of a human hand in whiteboard animations is associated with learning-enhancing effects. To this end, the present study experimentally examines two types of information insertion that are commonly used in whiteboard animations, often in combination: a human hand drawing visual content on a whiteboard and a human hand pushing already drawn visual content onto the whiteboard. Furthermore, a control group is included in which no hand is shown and the content appears automatically on the whiteboard. On this basis, theoretical and practical implications are derived.
The learning tool whiteboard animation
Animations (or dynamic visualizations) are defined as visual representations in which changes in space and time are explicitly displayed (e.g., Plötzner & Lowe, 2012). Dynamic visualizations consist of a series of static pictures displayed in rapid sequence. This creates the optical illusion of continuous change (Plötzner et al., 2021). For learning, animations are thought to have an advantage over static pictures. A recent meta-analysis by Castro-Alonso et al. (2019) found a small effect (g+ = 0.23) in favor of dynamic visualizations. However, as pointed out by Mayer and Moreno (2002), animations are not per se conducive to learning. From a learner-centered perspective, animations need to be designed and presented according to current learning theories and their recommendations. A rationale for the use of animations in educational settings can be provided by the cognitive theory of multimedia learning (CTML; Mayer, 2021). Based on the assumptions that people process visual and auditory information in separate channels (Clark & Paivio, 1991), have a limited working memory capacity (Baddeley, 1992), and actively engage in information processing (Wittrock, 1989), animations should be designed to support learners in building up a coherent mental model of learning-relevant information.
Whiteboard animations are one sub-type of animation. This learning tool includes “videos that depict the process of drawing a finished picture, usually on a whiteboard or something resembling a whiteboard” (Türkay, 2016, p. 103). Pictures are usually drawn by a human hand or slid onto the whiteboard (e.g., Krieglstein et al., 2023). In this vein, Schneider et al. (2023) showed that the progressive drawing of visual content (i.e., dynamic visualization) within a whiteboard animation is associated with higher retention and transfer performance than the presentation of the finished drawn picture (i.e., static picture). The illustrations are usually accompanied by a narrator’s voice, giving learners the feeling that they are following an instructor as he or she explains the learning content on the whiteboard.
The current state of research on whiteboard animations is mostly limited to media comparisons, for example with brochures and PowerPoint slideshows (e.g., Occa & Morgen, 2022; van der Meij & Draijer, 2021). In a study by Türkay (2016), whiteboard animations were shown to be associated with significantly better retention performance than audio and text presentations. Moreover, whiteboard animations increase perceptions of engagement and enjoyment. However, the novelty effect cannot be completely ruled out as a basis for explanation (e.g., Clark, 1983). Comparing different media in terms of their ability to promote learning is fraught with the danger of confounded comparisons (e.g., Clark, 1985). For example, because whiteboard animations and texts are different media representations, a comparison is not useful as no concrete design recommendations can be derived. Overall, there is still little insight into how whiteboard animations should be designed. This requires controlled experiments in which selected components of the whiteboard animation are manipulated. In this context, the human hand, which animates the content on the white background, e.g. by drawing it or dragging it onto the whiteboard, can be considered as a key component of whiteboard animations.
Different roles of the human hand in whiteboard animations
Although the human hand has been used in a variety of whiteboard animations presented in experimental research (e.g., Türkay, 2016; van der Meij & Draijer, 2021), there is still no empirical evidence as to whether this is conducive to learning at all. However, there is already a considerable body of theory and frameworks that argue for or against the inclusion of the hand in whiteboard animations.
One way to increase the social affordances of computer-based learning environments such as whiteboard animations is the implementation of human instructors (e.g., Pi et al., 2020; Wang & Antonenko, 2017; Wilson et al., 2018). In recent years, several studies have investigated the effects of a human instructor on multimedia learning (e.g., Lawson et al., 2021; Ramlatchan et al., 2020; Wang et al., 2020). Human instructors must be distinguished from pedagogical agents, which are defined as virtual (nonhuman) on-screen characters (Martha & Santoso, 2019). However, Henderson and Schroeder (2021) argue that the use of human instructors and pedagogical agents follow the same logic, as both serve to enhance social interaction in learning and ultimately improve learning performance. Similarly, studies comparing human and virtual instructors have found no significant differences in terms of their effects on learning performance (e.g., Horovitz & Mayer, 2021). Lawson et al. (2021) were also able to show that learners can similarly recognize emotional tones displayed by human or virtual instructors.
It is generally assumed that such social partners serve educational purposes by guiding the learner through a digital learning environment (Heidig & Clarebout, 2011). The instructor or agent can be seen as a knowledgeable mentor who motivates the learner (Baylor & PALS, 2003). In principle, the human instructor does not have to be completely visible, i.e., from head to toe. For example, it is sufficient if a human hand is visible (e.g., Fiorella & Mayer, 2016; Schroeder & Traxler, 2017) or a human voice is audible (e.g., Atkinson et al., 2005; Mayer et al., 2003) to recite the learning content. Studies have shown that an instructor in a multimedia learning environment increases intrinsic motivation (Beege et al., 2022) and mental effort (Lin et al., 2020).
The approach of implementing pedagogical agents or instructors in multimedia learning is closely related to the embodiment principle (e.g., Fiorella, 2021). Based on the assumption that the human motor system is involved in a variety of cognitive tasks (e.g., mathematics; Wakefield et al., 2019), the embodiment principle recommends implementing task-relevant sensorimotor experiences in the learning environment. In this context, studies have shown that a pedagogical agent performing physical movements such as gestures within a learning environment leads to improved learning performance (e.g., Mayer & DaPra, 2012; for a meta-analysis see Davis, 2018). It is not always necessary for the learner to perform such movements; often it is sufficient to observe them (e.g., Cook et al., 2013). This kind of “thinking with the body” extends working memory capacity and cognitively relieves the learner (Sepp et al., 2019).
The justification for including human instructors in multimedia learning environments can be further explained by several other theories. From a human–computer interaction perspective, the beneficial effect of pedagogical agents can be explained by the computers as social actors (CASA) paradigm (Nass et al., 1994). In this context, people tend to interact with a pedagogical agent presented on a computer in a similar way as they would with a real person. When people attribute human-like characteristics to a digitally presented instructor, the persona effect comes into play (e.g., Craig et al., 2002). In this context, instructors need to have a persona—an authentic agent that facilitates learning and appears engaging, human-like, and credible (Baylor & Ryu, 2003). Similarly, the social agency theory, which is anchored in multimedia learning research (Mayer et al., 2003), posits that the presence of a pedagogical agent or human instructor in a multimedia learning environment causes learners to feel that they are engaged in social interaction. When learners perceive such social cues, they become more engaged in learning, which in turn is associated with better learning outcomes. In this context, the cognitive-affective-social theory of learning in digital environments (CASTLE), proposed by Schneider et al. (2022a), argues that social cues resulting from the interaction with pedagogical agents or human instructors activate social schemata that lead to improved learning-relevant, motivational, and metacognitive processes.
It is often argued that a human hand in an instructional video or whiteboard animation is unnecessary or even detrimental to learning, leading to the classification of the human hand as a seductive detail (for a meta-analysis, see Sundararajan & Adesope, 2020). This position is supported by cognitive load theory (CLT; Sweller, 2020). In line with this cognitive-oriented framework, the hand in a whiteboard animation can be defined as interesting but irrelevant information that is not essential for achieving the learning goal (e.g., Harp & Mayer, 1998). According to CLT, the argument against such seductive details is that they increase extraneous cognitive load (ECL). In general, ECL depends on the presentation and design of the learning material (Sweller et al., 2019).
The aim, therefore, is to reduce extraneous processing by providing appropriate learning materials so that unnecessary cognitive resources are not wasted on processes irrelevant to learning. This frees up enough resources to deal with the complexity of the information to be learned. The complexity of the learning material is referred to as intrinsic cognitive load (ICL). It is assumed that the task complexity depends on the element interactivity, which describes the amount of information that must be learned at the same time. Besides, the complexity of a task can be reduced if learners can draw on prior knowledge (Sweller et al., 2019). The third type of cognitive load, germane cognitive load (GCL), refers to learning-relevant activities in which learners actively invest cognitive resources (Kalyuga, 2011). In contrast to ICL and ECL, which are perceived passively, GCL plays an active role in learning. Ideally, the investment of cognitive resources results in knowledge being stored in long-term memory in the form of schemata (Kirschner, 2002).
Returning to the human hand in whiteboard animations as a seductive detail: It is suggested that when ECL is increased, fewer cognitive resources are available to devote to intrinsic load. In this context, CLT recommends a “less is more” approach to the design of learning environments (Mayer, 2014). This means that learning materials should be designed in such a way that available working memory resources are used for germane processing, i.e., the construction and automation of schemata (Kirschner, 2002; Paas & van Merriënboer, 2020). Similarly, the coherence principle, derived from CTML, suggests that non-essential visual information, such as a human hand, should be avoided because processing it unnecessarily consumes cognitive resources (Mayer et al., 2008). In line with this, Schroeder and Traxler (2017) found that the inclusion of a human hand in an instructional video is associated with lower learning performance compared to a condition in which the human hand is absent. The authors suggest that the hand is an extraneous feature that consumes working memory resources that would be needed for learning. This seems to be especially the case when teaching complex topics (causing high ICL) with an instructional video. However, the study by Schroeder and Traxler (2017) was based purely on a comparison of whether the presence of a human hand is conducive to learning or not.
Dynamic drawing principle
A salient feature of whiteboard animations is the drawing of visual content by a human hand, similar to a teacher writing on a whiteboard. In this context, empirical findings support the idea of implementing human-generated drawings in instructional videos. In multimedia learning research, the dynamic drawing principle (Fiorella & Mayer, 2016; Fiorella et al., 2019, 2020; Mayer et al., 2020) describes that people learn better when a video lecture shows the instructor drawing content than when the instructor refers to already drawn content. Accordingly, in a study by Fiorella and Mayer (2016), students watched a video lecture either in an already-drawn format or watched the instructor draw the content by hand. Across four experiments, results revealed that watching an instructor drawing content is beneficial for learning. It also appeared that watching the instructor drawing contents was only beneficial for learners with low prior knowledge. Furthermore, observing a drawing instructor promotes learning when both the instructor’s body and the instructor’s hand are visible. The instructor drawing hypothesis was also confirmed in another study by Fiorella et al. (2019).
As outlined by Fiorella et al. (2020), there are cognitive and motivational benefits to observing dynamic drawings, as is common in whiteboard animations. In this context, basic principles of multimedia learning are considered when content is drawn by an instructor (e.g., Fiorella & Mayer, 2016). Thus, dynamic drawings act as signals that direct learners’ attention to learning-relevant content (signaling principle or cueing principle; Alpizar et al., 2020; Chun, 2000). In addition, the simultaneous presentation of visual drawings and corresponding oral explanations supports temporal contiguity within the learning material (e.g., Ginns, 2006). In this way, ECL can be reduced so that sufficient cognitive resources are available for processing learning-relevant information. Considering the above remarks on social cues, it is assumed that dynamic drawing motivates learners to engage deeply in generative processing (Fiorella et al., 2020). As a result, learners are motivated to invest mental effort in learning so that engaged learning leads to successful learning.
The present study
In summary, the theories discussed above take rather opposing positions on the use of a human hand in whiteboard animations, depending on whether one views the learning process from a cognitive or a social perspective. Experiments conducted by Fiorella and Mayer (2016), as well as Fiorella et al. (2019, 2020), seem to suggest that a human hand drawing content on a whiteboard is more beneficial for learning than a human referring to already drawn content. However, it is still unclear whether a human hand pushing the visual content onto the whiteboard without drawing it is also conducive to learning. Following the assumption derived from CLT and CTML that the human hand is an interesting but learning-irrelevant extraneous detail (e.g., Sundararajan & Adesope, 2020), a control group is added in which the visual content appears automatically on the whiteboard without a human hand being visible. Thus, the objective of the present study is to determine whether and, if so, how the human hand should be implemented in whiteboard animations for the presentation of visual content. In this context, the effects of the experimental manipulation on learning-relevant variables are examined. Due to the rather conflicting theoretical backgrounds, the following research questions (RQ) are formulated:
What is the impact of different information insertion styles (hand drawing content, hand pushing content in, no hand visible) on …
RQ1: The perception of the instructor?
RQ2: Cognitive load?
RQ3: Intrinsic motivation?
RQ4: Learning performance?
Participants and design
This experiment is based on a single-factor design with three levels (independent variable: information insertion). Because previous research on dynamic drawing has mostly shown medium to large effects (e.g., Fiorella et al., 2019), an a-priori power analysis (G*Power version 3.1; Faul et al., 2009) was conducted assuming an effect size of f = 0.35. This analysis recommended a minimum sample size of 84 participants (1 − β = 0.80; α = 0.05). Data were collected from 94 students. Due to technical problems (e.g., screen-sharing did not work, whiteboard animation on the website did not fully load), ten participants had to be excluded. The remaining 84 students (76.2% female, 2.4% did not specify their gender; Mage = 23.2; SDage = 3.1) from Chemnitz University of Technology (Germany) were considered for statistical analyses. Students were enrolled in media communication (56.0%), media and instructional psychology (31.0%), computer science & communication studies (2.4%), and other study programs (10.7%). At the time of the study, the participants were studying between the 1st and the 12th semester (M = 3.2; SD = 2.1). As compensation for participating in the study, students received either 6€ or 0.75 h course credit. Each student was randomly assigned to one condition. Accordingly, participants watched the whiteboard animation in one of three conditions (hand drawing content vs. hand pushing content in vs. no hand visible). All three groups consisted of 28 participants (see Table 1 for detailed demographic characteristics). The prior knowledge of the participants on the learning content can be classified as low (M = 1.3; SD = 1.0; with a maximum of eight points).
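The reported power analysis can be approximated without G*Power. The following minimal sketch assumes the standard noncentral F model for a one-way ANOVA with three groups (numerator df = 2) and a noncentrality parameter of λ = f² · N. The function names are illustrative and not taken from the study. Because the numerator df is 2, both the critical value and the incomplete beta terms have closed forms, so only the Python standard library is needed.

```python
from math import exp


def f_crit_df1_2(df2: float, alpha: float) -> float:
    """Critical F value for numerator df = 2.

    For df1 = 2 the central F survival function is (1 + 2F/df2) ** (-df2/2),
    which can be inverted in closed form.
    """
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


def ibeta_integer_a(a: int, b: float, x: float) -> float:
    """Regularized incomplete beta I_x(a, b) for a positive integer a."""
    term, total = 1.0, 0.0
    for i in range(a):
        total += term
        term *= (b + i) / (i + 1) * x
    return 1.0 - (1.0 - x) ** b * total


def anova_power_three_groups(f: float, n_total: int, alpha: float = 0.05) -> float:
    """Power of a one-way ANOVA with three groups (assumed model, see lead-in)."""
    df1, df2 = 2, n_total - 3
    lam = f * f * n_total                      # noncentrality parameter
    crit = f_crit_df1_2(df2, alpha)
    x = df1 * crit / (df1 * crit + df2)
    # Noncentral F CDF as a Poisson(lam / 2) mixture of central terms.
    weight = exp(-lam / 2.0)
    cdf = 0.0
    for j in range(200):
        cdf += weight * ibeta_integer_a(1 + j, df2 / 2.0, x)
        weight *= (lam / 2.0) / (j + 1)
    return 1.0 - cdf


power_84 = anova_power_three_groups(f=0.35, n_total=84)
print(f"Estimated power at N = 84: {power_84:.3f}")
```

Under these assumptions the estimated power at N = 84 lies slightly above the 0.80 target, which is in line with the reported minimum sample size.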
The material consisted of a whiteboard animation about black holes and Hawking radiation. It explained what exactly black holes are and the theory behind their definition. The physical function of black holes (gravity) was also explained. In this context, it was explained why a black hole generates extremely strong gravity in its immediate vicinity. Regarding Hawking radiation, it was explained that it is said to emanate from the event horizon of a black hole. There, so-called particle-antiparticle pairs are formed. Near the event horizon, however, it is possible that the antiparticle crashes into the black hole, while the matter particle escapes and thus the black hole actually emits radiation. The whiteboard animation lasted 8:44 min and was created using the software Doodly (Voomly LLC., 2021). In detail, the whiteboard animation consisted of 18 slides with a varying number of images per slide. A total of 42 mostly comic-like pictures were used. On some slides, short written words were added, such as labels or important numbers. In all conditions, the whiteboard animation and the content were identical, the only difference being how the visual learning content was animated (see Fig. 1). Hence, the visual content was animated either by drawing the content with a human hand (condition hand drawing content), by pushing the visual content onto the whiteboard (condition hand pushing content in), or without a human hand, appearing virtually out of nowhere (condition no hand visible). In detail, in the drawing condition, a human hand (right-handed) drew all the visual content with a pen (i.e., it wrote the texts and drew the pictures). In the pushing condition, the visual content was pushed onto the whiteboard by a human hand. The content was pushed onto the whiteboard by the shortest route. This means, for example, that content that is visible on the left was pushed onto the whiteboard from the left.
In the condition without a human hand, the visual content appeared as if it had been written by a human hand without the hand being visible. The visible hands were rather neutral, i.e. they could not be clearly assigned to the male or female gender. Following the voice principle (Mayer et al., 2003), the visual content was accompanied by a human voice. Consistent with meta-analytical findings of Castro-Alonso et al. (2021), the human voice was female, as it has been shown that a pedagogical agent with a female voice has a larger effect on learning outcomes. The whiteboard animation was presented system-paced, indicating that the learner had no control over the progress of the whiteboard animation (Biard et al., 2018). Learners were able to view the whiteboard animation once.
For all measures, the coefficient McDonald’s ω (McDonald, 1999) was chosen to calculate internal consistency.
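For readers unfamiliar with the coefficient, the sketch below shows how ω is commonly computed from a one-factor model with standardized loadings, where each item's unique variance is 1 − λ². The loadings are hypothetical and serve only as an illustration. They are not the study's estimates, and the function name is an assumption.

```python
def mcdonald_omega(loadings):
    """Omega for a single-factor model with standardized loadings.

    omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of unique variances)
    """
    common = sum(loadings) ** 2
    unique = sum(1.0 - loading ** 2 for loading in loadings)
    return common / (common + unique)


# Hypothetical loadings for a four-item scale (not taken from the study).
hypothetical_loadings = [0.70, 0.80, 0.75, 0.85]
omega = mcdonald_omega(hypothetical_loadings)
print(f"omega = {omega:.2f}")
```

In practice the loadings would come from a fitted factor model. The sketch only covers the final aggregation step.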
Perception of the instructor
The agent persona instrument (API; Ryu & Baylor, 2005) was used to measure the perception of the instructor. In detail, the subscales facilitating learning (ten items; ω = 0.90; e.g., “The agent kept my attention”), credible (five items; ω = 0.77; e.g., “The agent was helpful”), and engaging (five items; ω = 0.85; e.g., “The agent was motivating”) were used. Participants rated the items on a 5-point Likert scale ranging from (1) “strongly disagree” to (5) “strongly agree”.
Learners’ perceived cognitive load during learning was measured with the questionnaire by Klepsch et al. (2017). A meta-analysis by Krieglstein et al. (2022) has shown that self-rating scales can reliably measure cognitive load. In detail, the German sub-scales of ICL (two items; ω = 0.81; e.g., “This task was very complex”), ECL (three items; ω = 0.82; e.g., “During this task, it was exhausting to find the important information”), and GCL (two items; ω = 0.71; e.g., “My point while dealing with the task was to understand everything correct”) were used. Each item had to be rated on a 7-point Likert scale ranging from (1) “strongly disagree” to (7) “strongly agree”.
The learner’s motivation was measured with the situational motivation scale (SIMS; Guay et al., 2000). For the aim of this study, the sub-scale intrinsic motivation consisting of four items was used (ω = 0.88). Specifically, participants were asked to indicate why they are currently engaged in this activity. For example, the item “Because I think that this activity is pleasant” had to be rated on a 7-point Likert scale ranging from (1) “strongly disagree” to (7) “strongly agree”.
First, prior knowledge was assessed because it affects learning performance as well as cognitive load perception (e.g., Chen et al., 2017). Accordingly, it was measured with three open-answer questions (e.g., “What are black holes?”). Two raters scored the responses with the help of a prepared list of correct answers. The raters’ agreement (i.e., inter-rater reliability; McHugh, 2012) was moderate to strong across all three questions (question 1: κ = 0.73; question 2: κ = 1.00; question 3: κ = 0.86). Overall, participants had low prior knowledge (M = 1.31; SD = 1.01; maximum of eight points).
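The agreement index can be illustrated with a short sketch. It assumes that the reported κ is Cohen's kappa for two raters, i.e., observed agreement corrected for the agreement expected by chance. The ratings below are hypothetical and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same responses."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of responses.")
    n = len(rater_a)
    # Proportion of responses on which the raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the raters' marginal score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[score] * freq_b[score] for score in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores for four responses (not data from the study).
scores_rater_1 = [1, 1, 0, 0]
scores_rater_2 = [1, 0, 0, 0]
kappa = cohens_kappa(scores_rater_1, scores_rater_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on three of four responses (0.75), chance agreement is 0.50, and κ is therefore 0.50.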
Second, learning performance was measured with two types of learning tests – a single-choice and a multiple-choice test. For all eleven single-choice questions, one answer per question was correct. All questions had four possible answers. One example item is the question “What is the physical force of a black hole?”, which was presented with the answer options (a) “gravitational force”, (b) “magnetism”, (c) “electromagnetism”, and (d) “centrifugal force”. In this example, the correct answer was (a). A total of eleven points could be earned on the single-choice test. Moreover, four multiple-choice questions were presented. Three out of four questions had five answer options. Due to a technical error, one question was presented with four answer options (however, the correct answer was included in this question, so the question was included in the analysis). The number of correct answers varied between questions, but at least one answer was correct. Participants received one point for recognizing an item as correct. Furthermore, one point was given if an incorrect item was identified as incorrect. An example from the multiple-choice test is the question “Which statements about black holes are correct?”. This question was presented with the answer options (a) “The surface of a black hole resembles that of a star”, (b) “Black holes are objects whose mass maximizes to an extremely large volume as a result of a supernova”, (c) “In the singularity, space–time no longer exists”, (d) “The escape velocity is higher than the speed of light”, and (e) “Black holes are a special space–time with an extremely compact volume”. For this question, the answer options (d) and (e) were correct. A total of 19 points could be achieved in the multiple-choice test, resulting in a total score of 30 points for the learning test.
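The scoring rule for the multiple-choice test can be sketched as follows: every answer option counts as one point if it is classified correctly, that is, selected when correct or left unselected when incorrect. The function name and the example response are hypothetical. The option labels and the answer key follow the example item above.

```python
def score_multiple_choice(options, correct, selected):
    """One point per option that is classified correctly."""
    correct, selected = set(correct), set(selected)
    return sum((option in correct) == (option in selected) for option in options)


options = ["a", "b", "c", "d", "e"]
answer_key = {"d", "e"}

# Hypothetical response: the participant ticks (c) and (d).
points = score_multiple_choice(options, answer_key, selected={"c", "d"})
print(points)  # (a), (b), and (d) are classified correctly

# Maximum scores implied by the test design described above.
max_multiple_choice = 3 * 5 + 1 * 4    # three questions with five options, one with four
max_total = 11 + max_multiple_choice   # plus eleven single-choice questions
```

With three five-option questions and one four-option question, the rule yields the stated maximum of 19 points, and 30 points together with the single-choice test.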
The experiment was conducted online via the web conferencing platform BigBlueButton (https://bigbluebutton.org/). Participants were recruited with the help of mailing lists. The experiment took place with a maximum of four participants at a time. In the beginning, participants were informed that they were about to watch a whiteboard animation dealing with an astrophysical topic. Participants were also instructed that they would have to answer questions about the learning content after the learning phase. Each participant was assigned to a breakout room where they were asked to share their screen until the end of the reception of the learning material. This was done to ensure that the participants were engaged with the whiteboard animation. After reception, screen sharing could be stopped so that the learners could work on the learning questions and questionnaires without pressure. After receiving the link to the learning environment, participants could start the experiment by themselves. First, participants worked through the questions on prior knowledge. Then they were redirected to a learning website where the whiteboard animation was presented. Participants were not allowed to pause the video or to skip or re-watch parts of it. After viewing the whiteboard animation, the dependent variables were measured in the following order: perception of the pedagogical agent, intrinsic motivation, cognitive load, and learning performance. Finally, participants were asked to provide some demographic information. In total, the experiment lasted about 45 min.
For data analysis, IBM SPSS Statistics 29 (IBM Corp., 2022) was used. One-way analyses of variance (ANOVA) or multivariate analyses of variance (MANOVA) were calculated depending on the research question and the resulting number of dependent variables. For all variance analyses, the group variable information insertion was set as the independent variable. In the case of a significant ANOVA (α = 0.05), post-hoc tests were calculated to identify which group means differ significantly from each other (Kim, 2014). Tukey’s honestly significant difference (HSD) was used as the multiple comparison procedure (Jaccard et al., 1984). In the case of a significant MANOVA, follow-up ANOVAs were calculated separately for each dependent variable. Partial eta-squared (ηp2) was calculated as effect size. For interpretation, the conventions proposed by Cohen (1988) were followed (0.01 = small; 0.06 = moderate; 0.14 = large). Because analysis of variance as a parametric procedure must meet the assumptions of variance homogeneity and normal distribution, appropriate tests were conducted (Lix et al., 1996). Only violations of these assumptions are reported. The descriptive results of all dependent variables are displayed in Table 2.
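The core of the one-way ANOVA and the effect-size classification can be sketched in a few lines. The analyses themselves were run in SPSS, so the code below is only an illustration with hypothetical scores. The function names and the label "negligible" for values below 0.01 are assumptions. In a single-factor design, partial eta-squared equals the between-groups sum of squares divided by the total sum of squares.

```python
def one_way_anova(groups):
    """Return F, its degrees of freedom, and partial eta-squared."""
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((value - m) ** 2 for g, m in zip(groups, means) for value in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_p2 = ss_between / (ss_between + ss_within)
    return f_value, df_between, df_within, eta_p2


def cohen_label(eta_p2):
    """Classify an effect size using the thresholds cited from Cohen (1988)."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "moderate"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"


# Hypothetical scores for three conditions (not data from the study).
scores = [[4, 5, 6], [5, 6, 7], [7, 8, 9]]
f_value, df1, df2, eta_p2 = one_way_anova(scores)
print(f"F({df1}, {df2}) = {f_value:.2f}, partial eta-squared = {eta_p2:.2f}")
```

A p value would additionally require the F distribution, which is omitted here to keep the sketch free of external dependencies.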
Do the three groups differ on control variables?
A preliminary issue is to ensure that the three groups were equivalent on control variables resulting from randomization (e.g., Suresh, 2011). Therefore, one-way ANOVAs and chi-square tests were conducted. No significant differences with regard to age, F(2, 81) = 0.29; p = 0.752; prior knowledge, F(2, 81) = 0.72; p = 0.492; gender, χ2(2, N = 84) = 0.65; p = 0.721; study program, χ2(6, N = 84) = 6.25; p = 0.396; and current semester, F(2, 81) = 2.42; p = 0.095, were found. It can be concluded that the groups are equivalent on control variables and therefore comparable.
RQ1: impact on the perception of the instructor
Following the agent persona instrument, the three facets facilitating learning, credible, and engaging were analyzed simultaneously using a MANOVA. No significant effect was found, Wilks’ Λ = 0.91, F(6, 158) = 1.30, p = 0.261, ηp2 = 0.05. Therefore, no follow-up ANOVAs were calculated.
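The reported multivariate effect size can be related to Λ with a small sketch. It assumes one common definition of the multivariate partial eta-squared, 1 − Λ^(1/s) with s = min(number of dependent variables, hypothesis df). This is an assumption about how the value was obtained, not a statement from the study, and the function name is illustrative.

```python
def eta_p2_from_wilks(wilks_lambda, n_dependent, df_hypothesis):
    """Multivariate partial eta-squared from Wilks' lambda (assumed definition)."""
    s = min(n_dependent, df_hypothesis)
    return 1.0 - wilks_lambda ** (1.0 / s)


# RQ1: three API subscales, three groups (hypothesis df = 2).
eta_rq1 = eta_p2_from_wilks(0.91, n_dependent=3, df_hypothesis=2)
print(f"partial eta-squared = {eta_rq1:.2f}")
```

Under this definition, the rounded result matches the value reported for RQ1.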
RQ2: impact on cognitive load
Since the variables ICL, ECL, and GCL were not normally distributed and the assumption of homogeneous covariance matrices was violated, non-parametric tests were conducted as these have a lower type I error rate. Thus, Kruskal–Wallis tests were calculated. No significant effects on ICL, H(2) = 2.29, p = 0.319; ECL, H(2) = 2.76, p = 0.252; and GCL, H(2) = 3.87, p = 0.145, were found. Consequently, post-hoc tests were omitted.
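The Kruskal–Wallis procedure can be sketched as follows: all observations are ranked jointly, H is computed from the rank sums per group, and ties are corrected for. With three groups, H is compared against a chi-square distribution with 2 degrees of freedom, whose upper tail probability is exp(−H/2). The data are hypothetical and the function names are illustrative.

```python
from collections import Counter
from math import exp


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda index: values[index])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted_rank_sums - 3.0 * (n + 1)
    ties = sum(count ** 3 - count for count in Counter(pooled).values())
    correction = 1.0 - ties / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("All observations are identical; H is undefined.")
    return h / correction


def p_value_df2(h):
    """Upper tail of the chi-square distribution with 2 degrees of freedom."""
    return exp(-h / 2.0)


# Hypothetical ratings for three conditions (not data from the study).
h_example = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H(2) = {h_example:.2f}, p = {p_value_df2(h_example):.3f}")

# Cross-check against a reported statistic (ECL: H(2) = 2.76).
p_ecl = p_value_df2(2.76)
```

Applied to the reported H values, this approximation reproduces the reported p values up to rounding of H.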
RQ3: impact on intrinsic motivation
An ANOVA revealed a significant main effect of the independent variable on intrinsic motivation, F(2, 81) = 3.51, p = 0.034, ηp2 = 0.08. Pairwise comparisons showed that participants in the condition hand drawing content reported significantly higher intrinsic motivation than those in the condition hand pushing content in (p = 0.033). Comparisons between hand drawing content and no hand visible (p = 0.147) and between hand pushing content in and no hand visible (p = 0.788) failed to reach significance.
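The reported effect size follows from the F statistic and its degrees of freedom via the standard identity ηp² = F · df_effect / (F · df_effect + df_error). The helper below is an illustrative consistency check with an assumed function name, not part of the original analysis.

```python
def eta_p2_from_f(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F statistic and its df."""
    return f_value * df_effect / (f_value * df_effect + df_error)


eta_rq3 = eta_p2_from_f(3.51, df_effect=2, df_error=81)
print(f"partial eta-squared = {eta_rq3:.2f}")
```

The rounded value corresponds to a moderate effect under the Cohen (1988) conventions cited in the data analysis section.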
RQ4: impact on learning performance
Because both single-choice and multiple-choice questions were used to measure learning performance, a MANOVA was calculated. This analysis revealed no significant effect, Wilks’ Λ = 0.96, F(4, 160) = 0.82, p = 0.516, ηp2 = 0.02. Therefore, no follow-up ANOVAs were calculated.
The current study aimed to take a closer look at the human hand in whiteboard animations. Based on influential theories in multimedia learning and instructional psychology research, the goal was to determine which type of information insertion within whiteboard animations is most conducive to learning. Because the theoretical assumptions are rather contradictory, the research questions were formulated to determine whether a human hand in whiteboard animations aids learning as a social cue that activates social schemata, or whether the hand can be omitted to direct attention to learning-relevant information. Furthermore, this study did not focus purely on the comparison between a visible hand and no visible hand (as in the study by Schroeder & Traxler, 2017), but manipulated two different information insertion styles commonly used in whiteboard animations. The results show that it is rather irrelevant for learning whether visual content in a whiteboard animation is drawn by a hand, is pushed in by a hand, or appears automatically on the whiteboard. All information insertion options used in this study led to comparable learning outcomes. Moreover, the inclusion of a human hand as a social cue within a whiteboard animation did not make the instructor appear more learning-facilitating, credible, or engaging. But what are the reasons for the lack of effects on learning performance, the perception of the instructor, and cognitive load?
In general, these findings do not fully support the assumptions derived from the embodiment principle (Fiorella, 2021) as well as social agency theory (Mayer et al., 2003). A key assumption of this study was that the drawing hand is most reminiscent of a teacher drawing visual content on a whiteboard while explaining the content. According to the theories, this should increase social interaction and perception of the instructor as a social cue, but this was not reflected in the results. Similarly, the implementation of a human hand in an instructional video did not enhance the instructor’s ability to facilitate learning (Schroeder & Traxler, 2017). It may be that the human voice in a whiteboard animation is sufficient for the instructor to be perceived as engaged, credible, and learning-facilitating (e.g., Mayer & DaPra, 2012; Mayer et al., 2003). The human hand does not seem to play the expected role in learning with whiteboard animations, as learners concentrate on the progress of the animation and pay little attention to the hand. On the other hand, it is encouraging to note that the drawing hand in the whiteboard animation resulted in significantly higher intrinsic motivation than a hand that pushes content onto the whiteboard. This seems to be further evidence for the dynamic drawing principle, noting that Fiorella et al. (2020) argue that observing dynamic drawing provides motivational benefits, but do not substantiate this with empirical evidence. It should be noted, however, that the higher intrinsic motivation did not lead to better learning performance in this study.
In terms of cognitive processes, results seem to refute the assumption derived from CLT that the human hand in a whiteboard animation is a seductive detail with a negative impact on learning (as pointed out by Schroeder & Traxler, 2017). The human hand (whether drawing or pushing in content) did not increase ECL. From an evolutionary educational psychology perspective, it seems reasonable that human movements are processed automatically because they can be framed as biologically primary knowledge (Geary, 2002). This does not cause additional ECL. Similarly, Schneider et al. (2022b) were able to show that gestures and facial expressions performed by a human instructor did not result in a higher ECL. Across all conditions in this study, ICL (i.e., the task complexity) was relatively high. Consistent with CLT, no differences were found here because only the type of presentation was manipulated and not the complexity of the information to be learned (Sweller et al., 2019). The high complexity of the learning environment could also explain the non-significant effects. Accordingly, learners focused primarily on the information to be learned and paid less attention to the human hand. Learners are likely to block out learning-irrelevant elements such as the hand and focus on the information to be learned. Similarly, Rop et al. (2018) could show that as learning time increases, learners can adapt their learning strategy and ignore learning-irrelevant information.
The results of this study suggest that the presence of a human hand in a whiteboard animation does not have a negative effect on learning. However, no general design recommendations for the human hand in whiteboard animations can be derived from a single study. In this context, the results cannot be generalized to other age groups (e.g., children), learning topics (e.g., biology), or educational settings (e.g., elementary school). Thus, further studies are needed that attempt to replicate the results of this study in other contexts to increase the generalizability of the findings (e.g., Plucker & Makel, 2021). However, in light of the results of this study, it can be concluded that when instructional designers create whiteboard animations for educational purposes, they are quite free to decide whether a human hand should be visible and how it should be animated (drawing or pushing in content). Based on the dynamic drawing principle, which can be considered as empirically well proven (e.g., Fiorella & Mayer, 2016; Fiorella et al., 2019, 2020; Mayer et al., 2020), the possibility of having the content drawn by a human hand should be considered. This is also supported by the fact that the human hand (whether drawing the content or pushing it onto the whiteboard) does not cause any additional ECL. In light of social agency theory (Mayer et al., 2003) and the CASTLE framework (Schneider et al., 2022a), a human hand could be implemented as a social cue to prime a social response from the learner.
Limitations and future directions
In addition to the interesting findings of this study, there are also several limitations that should be taken into account when interpreting the results and that should guide future studies. The first limitation relates to the production of the learning material. In this context, the individual conditions were created using software to ensure that only the insertion of information is manipulated. However, the human hand and its movements look somewhat artificial in some parts of the animation. Furthermore, the human hands in the drawing and pushing conditions differed slightly in appearance. For example, the hand in the hand pushing content in condition had a slightly darker skin tone. Second, the experiment was conducted with a student sample. Given that whiteboard animations are presented on online platforms that reach a diverse audience, more research is needed on whether the findings can be generalized to other target groups (e.g., children or seniors). In this context, it would be interesting to see if the results can be replicated with other learning topics. Third, the learning material was presented with system-pacing, meaning that students had no control over the progress of the whiteboard animation. This should ensure internal validity. In line with the interactivity principle, further studies should investigate whether the findings also occur when learners can adapt the progress of the learning material to their own pace (e.g., Evans & Gibbons, 2007). Fourth, one of the explanations supporting the implementation of a human hand in whiteboard animations is that learners pay more attention to the learning content. To measure learners’ attention to the hand, future studies could consider the possibility of collecting eye-tracking data (e.g., van Gog & Scheiter, 2010). This could provide more insight into whether learners direct their visual attention to the hand within the whiteboard animation.
Fifth, future studies should measure learners’ familiarity with whiteboard animations. Familiarity with media technologies and instructional methods can have a crucial impact on how learners use them for learning, which is particularly important in distance learning settings (e.g., Fütterer et al., 2023). Furthermore, the novelty effect (e.g., Clark, 1983) may be reduced if learners have previous experience with whiteboard animations. Sixth, it has been argued that the high complexity of the learning content is likely to cause learners to focus their full attention on the learning content and more or less ignore the hand. To test this assumption with empirical data, future studies should manipulate the complexity within a whiteboard animation. In this context, the human hand may have a negative effect on learning when learners have to process complex information that consumes a lot of cognitive resources. If learners also have to process the human hand shown in the whiteboard animation, they may quickly become cognitively overloaded, as ICL and ECL are additive (e.g., Sweller et al., 2019).
Availability of data and materials
The datasets used and analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
CTML: Cognitive theory of multimedia learning
CASA: Computers as social actors
CASTLE: Cognitive-affective-social theory of learning in digital environments
CLT: Cognitive load theory
ECL: Extraneous cognitive load
ICL: Intrinsic cognitive load
GCL: Germane cognitive load
API: Agent persona instrument
SIMS: Situational motivation scale
ANOVA: Analysis of variance
MANOVA: Multivariate analysis of variance
HSD: Honestly significant difference
Alpizar, D., Adesope, O. O., & Wong, R. M. (2020). A meta-analysis of signaling principle in multimedia learning environments. Educational Technology Research and Development, 68, 2095–2119.
Atkinson, R. K., Mayer, R. E., & Merrill, M. M. (2005). Fostering social agency in multimedia learning: Examining the impact of an animated agent’s voice. Contemporary Educational Psychology, 30, 117–139.
Baddeley, A. (1992). Working memory. Science, 255, 556–559.
Baylor, A. L., & PALS. (2003). The impact of three pedagogical agent roles. In T. Sandholm & M. Yokoo (Eds.), Proceedings of the second international joint conference on autonomous agents and multiagent systems (pp. 928–929). ACM Press.
Baylor, A. L., & Ryu, J. (2003). The effects of image and animation in enhancing pedagogical agent persona. Journal of Educational Computing Research, 28, 373–394.
Beege, M., Krieglstein, F., & Arnold, C. (2022). How instructors influence learning with instructional videos-the importance of professional appearance and communication. Computers & Education, 185, 104531.
Biard, N., Cojean, S., & Jamet, E. (2018). Effects of segmentation and pacing on procedural learning by video. Computers in Human Behavior, 89, 411–417.
Castro-Alonso, J. C., Wong, M., Adesope, O. O., Ayres, P., & Paas, F. (2019). Gender imbalance in instructional dynamic versus static visualizations: A meta-analysis. Educational Psychology Review, 31, 361–387.
Castro-Alonso, J. C., Wong, R. M., Adesope, O. O., & Paas, F. (2021). Effectiveness of multimedia pedagogical agents predicted by diverse theories: A meta-analysis. Educational Psychology Review, 33, 989–1015.
Chen, O., Kalyuga, S., & Sweller, J. (2017). The expertise reversal effect is a variant of the more general element interactivity effect. Educational Psychology Review, 29, 393–405.
Chun, M. M. (2000). Contextual cueing of visual attention. Trends in Cognitive Sciences, 4(5), 170–178.
Clark, R. E. (1983). Reconsidering research on learning from media. Review of Educational Research, 53, 445–459.
Clark, R. E. (1985). Confounding in educational computing research. Journal of Educational Computing Research, 1, 137–148.
Clark, J. M., & Paivio, A. (1991). Dual coding theory and education. Educational Psychology Review, 3, 149–210.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Taylor and Francis.
Craig, S. D., Gholson, B., & Driscoll, D. M. (2002). Animated pedagogical agents in multimedia educational environments: Effects of agent properties, picture features and redundancy. Journal of Educational Psychology, 94, 428–434.
Cook, S. W., Duffy, R. G., & Fenn, K. M. (2013). Consolidation and transfer of learning after observing hand gesture. Child Development, 84, 1863–1871.
Davis, R. O. (2018). The impact of pedagogical agent gesturing in multimedia learning environments: A meta-analysis. Educational Research Review, 24, 193–209.
Evans, C., & Gibbons, N. J. (2007). The interactivity effect in multimedia learning. Computers & Education, 49, 1147–1160.
Faul, F., Erdfelder, E., Buchner, A., & Lang, A. G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41, 1149–1160.
Fiorella, L. (2021). The embodiment principle in multimedia learning. In R. E. Mayer & L. Fiorella (Eds.), The Cambridge handbook of multimedia learning (pp. 286–295). Cambridge University Press.
Fiorella, L., & Mayer, R. E. (2016). Effects of observing the instructor draw diagrams on learning from multimedia messages. Journal of Educational Psychology, 108, 528–546.
Fiorella, L., Stull, A. T., Kuhlmann, S., & Mayer, R. E. (2019). Instructor presence in video lectures: The role of dynamic drawings, eye contact, and instructor visibility. Journal of Educational Psychology, 111, 1162–1171.
Fiorella, L., Stull, A. T., Kuhlmann, S., & Mayer, R. E. (2020). Fostering generative learning from video lessons: Benefits of instructor-generated drawings and learner-generated explanations. Journal of Educational Psychology, 112, 895–906.
Fütterer, T., Hoch, E., Lachner, A., Scheiter, K., & Stürmer, K. (2023). High-quality digital distance teaching during COVID-19 school closures: Does familiarity with technology matter? Computers & Education, 199, 104788.
Geary, D. (2002). Principles of evolutionary educational psychology. Learning and Individual Differences, 12, 317–345.
Ginns, P. (2006). Integrating information: A meta-analysis of the spatial contiguity and temporal contiguity effects. Learning and Instruction, 16, 511–525.
Guay, F., Vallerand, R. J., & Blanchard, C. (2000). On the assessment of situational intrinsic and extrinsic motivation: The Situational Motivation Scale (SIMS). Motivation and Emotion, 24, 175–213.
Harp, S. F., & Mayer, R. E. (1998). How seductive details do their damage: A theory of cognitive interest in science learning. Journal of Educational Psychology, 90, 414–434.
Heidig, S., & Clarebout, G. (2011). Do pedagogical agents make a difference to student motivation and learning? Educational Research Review, 6, 27–54.
Henderson, M. L., & Schroeder, N. L. (2021). A systematic review of instructor presence in instructional videos: Effects on learning and affect. Computers and Education Open, 2, 100059.
Horovitz, T., & Mayer, R. E. (2021). Learning with human and virtual instructors who display happy or bored emotions in video lectures. Computers in Human Behavior, 119, 106724.
IBM Corp. (2022). IBM SPSS statistics for windows (Version 29.0) [Computer software]. IBM Corp.
Jaccard, J., Becker, M. A., & Wood, G. (1984). Pairwise multiple comparison procedures: A review. Psychological Bulletin, 96, 589–596.
Kalyuga, S. (2011). Cognitive load theory: How many types of load does it really need? Educational Psychology Review, 23, 1–19.
Kim, H. Y. (2014). Analysis of variance (ANOVA) comparing means of more than two groups. Restorative Dentistry & Endodontics, 39, 74–77.
Kirschner, P. A. (2002). Cognitive load theory: Implications of cognitive load theory on the design of learning. Learning and Instruction, 12, 1–10.
Klepsch, M., Schmitz, F., & Seufert, T. (2017). Development and validation of two instruments measuring intrinsic, extraneous, and germane cognitive load. Frontiers in Psychology, 8, 1997.
Krieglstein, F., Beege, M., Rey, G. D., Ginns, P., Krell, M., & Schneider, S. (2022). A systematic meta-analysis of the reliability and validity of subjective cognitive load questionnaires in experimental multimedia learning research. Educational Psychology Review, 34, 2485–2541.
Krieglstein, F., Schneider, S., Gröninger, J., Beege, M., Nebel, S., Wesenberg, L., Suren, M., & Rey, G. D. (2023). Exploring the effects of content-related segmentations and metacognitive prompts on learning with whiteboard animations. Computers & Education, 194, 104702.
Lawson, A. P., Mayer, R. E., Adamo-Villani, N., Benes, B., Lei, X., & Cheng, J. (2021). Recognizing the emotional state of human and virtual instructors. Computers in Human Behavior, 114, 106554.
Lin, L., Ginns, P., Wang, T., & Zhang, P. (2020). Using a pedagogical agent to deliver conversational style instruction: What benefits can you obtain? Computers & Education, 143, 103658.
Lix, L. M., Keselman, J. C., & Keselman, H. J. (1996). Consequences of assumption violations revisited: A quantitative review of alternatives to the one-way analysis of variance F test. Review of Educational Research, 66, 579–619.
Martha, A. S. D., & Santoso, H. B. (2019). The design and impact of the pedagogical agent: A systematic literature review. Journal of Educators Online, 16, n1.
Mayer, R. E. (2014). Incorporating motivation into multimedia learning. Learning and Instruction, 29, 171–173.
Mayer, R. E. (2021). Cognitive theory of multimedia learning. In R. E. Mayer & L. Fiorella (Eds.), The Cambridge handbook of multimedia learning (pp. 57–72). Cambridge University Press.
Mayer, R. E., & DaPra, C. S. (2012). An embodiment effect in computer-based learning with animated pedagogical agents. Journal of Experimental Psychology: Applied, 18, 239–252.
Mayer, R. E., Fiorella, L., & Stull, A. (2020). Five ways to increase the effectiveness of instructional video. Educational Technology Research and Development, 68, 837–852.
Mayer, R. E., Griffith, E., Jurkowitz, I. T. N., & Rothman, D. (2008). Increased interestingness of extraneous details in a multimedia science presentation leads to decreased learning. Journal of Experimental Psychology: Applied, 14, 329–339.
Mayer, R. E., & Moreno, R. (2002). Animation as an aid to multimedia learning. Educational Psychology Review, 14, 87–99.
Mayer, R. E., Sobko, K., & Mautone, P. D. (2003). Social cues in multimedia learning: Role of speaker’s voice. Journal of Educational Psychology, 95, 419–425.
McDonald, R. P. (1999). Test theory: A unified treatment. Lawrence Erlbaum.
McHugh, M. L. (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22, 276–282.
Nass, C., Steuer, J., & Tauber, E. R. (1994). Computers are social actors. In B. Adelson, S. Dumais, & J. Olson (Eds.), Proceedings of the SIGCHI conference on human factors in computing systems (pp. 72–78). ACM.
Occa, A., & Morgan, S. E. (2022). The role of cognitive absorption in the persuasiveness of multimedia messages. Computers & Education, 176, 104363.
Paas, F., & van Merriënboer, J. J. G. (2020). Cognitive-load theory: Methods to manage working memory load in the learning of complex tasks. Current Directions in Psychological Science, 29, 394–398.
Pi, Z., Xu, K., Liu, C., & Yang, J. (2020). Instructor presence in video lectures: Eye gaze matters, but not body orientation. Computers & Education, 144, 103713.
Plötzner, R., Berney, S., & Bétrancourt, M. (2021). When learning from animations is more successful than learning from static pictures: Learning the specifics of change. Instructional Science, 49, 497–514.
Plötzner, R., & Lowe, R. (2012). A systematic characterisation of expository animations. Computers in Human Behavior, 28, 781–794.
Plucker, J. A., & Makel, M. C. (2021). Replication is important for educational psychology: Recent developments and key issues. Educational Psychologist, 56(2), 90–100.
Ramlatchan, M., & Watson, G. S. (2020). Enhancing instructor credibility and immediacy in online multimedia designs. Educational Technology Research and Development, 68, 511–528.
Rop, G., van Wermeskerken, M., de Nooijer, J. A., Verkoeijen, P. P., & van Gog, T. (2018). Task experience as a boundary condition for the negative effects of irrelevant information on learning. Educational Psychology Review, 30, 229–253.
Ryu, J., & Baylor, A. L. (2005). The psychometric structure of pedagogical agent persona. Technology, Instruction, Cognition, and Learning, 2, 291–314.
Schneider, S., Beege, M., Nebel, S., Schnaubert, L., & Rey, G. D. (2022a). The cognitive-affective-social theory of learning in digital environments (CASTLE). Educational Psychology Review, 34, 1–38.
Schneider, S., Krieglstein, F., Beege, M., & Rey, G. D. (2022b). The impact of video lecturers’ nonverbal communication on learning–An experiment on gestures and facial expressions of pedagogical agents. Computers & Education, 176, 104350.
Schneider, S., Krieglstein, F., Beege, M., & Rey, G. D. (2023). Successful learning with whiteboard animations—A question of their procedural character or narrative embedding? Heliyon, 9, e13229.
Schroeder, N. L., & Traxler, A. L. (2017). Humanizing instructional videos in physics: When less is more. Journal of Science Education and Technology, 26, 269–278.
Sepp, S., Howard, S. J., Tindall-Ford, S., Agostinho, S., & Paas, F. (2019). Cognitive load theory and human movement: Towards an integrated model of working memory. Educational Psychology Review, 31, 293–317.
Sundararajan, N., & Adesope, O. (2020). Keep it coherent: A meta-analysis of the seductive details effect. Educational Psychology Review, 32, 707–734.
Suresh, K. P. (2011). An overview of randomization techniques: An unbiased assessment of outcome in clinical research. Journal of Human Reproductive Sciences, 4, 8–11.
Sweller, J. (2020). Cognitive load theory and educational technology. Educational Technology Research and Development, 68, 1–16.
Sweller, J., van Merriënboer, J. J., & Paas, F. (2019). Cognitive architecture and instructional design: 20 years later. Educational Psychology Review, 31, 261–292.
Türkay, S. (2016). The effects of whiteboard animations on retention and subjective experiences when learning advanced physics topics. Computers & Education, 98, 102–114.
van der Meij, H., & Draijer, E. (2021). Design principles for multimedia presentations: A comparison between a whiteboard animation and a PowerPoint slideshow presentation. Journal of Educational Multimedia and Hypermedia, 30, 393–418.
van Gog, T., & Scheiter, K. (2010). Eye tracking as a tool to study and enhance multimedia learning. Learning and Instruction, 20, 95–99.
Voomly LLC. (2021). Doodly (Version v2.7.4) [Computer software]. https://www.doodly.com/
Wakefield, E. M., Congdon, E. L., Novack, M. A., Goldin-Meadow, S., & James, K. H. (2019). Learning math by hand: The neural effects of gesture-based instruction in 8-year-old children. Attention, Perception, & Psychophysics, 81, 2343–2353.
Wang, J., & Antonenko, P. D. (2017). Instructor presence in instructional video: Effects on visual attention, recall, and perceived learning. Computers in Human Behavior, 71, 79–89.
Wang, J., Antonenko, P. D., & Dawson, K. (2020). Does visual attention to the instructor in online video affect learning and learner perceptions? An eye-tracking analysis. Computers & Education, 146, 103779.
Wilson, K. E., Martinez, M., Mills, C., D’Mello, S., Smilek, D., & Risko, E. F. (2018). Instructor presence effect: Liking does not always lead to learning. Computers & Education, 122, 205–220.
Wittrock, M. C. (1989). Generative processes of comprehension. Educational Psychologist, 24, 345–376.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Krieglstein, F., Meusel, F., Rothenstein, E. et al. How to insert visual information into a whiteboard animation with a human hand? Effects of different insertion styles on learning. Smart Learn. Environ. 10, 39 (2023). https://doi.org/10.1186/s40561-023-00258-6
- Whiteboard animations
- Social cues
- Human instructor
- Dynamic drawing
- Multimedia learning
- Cognitive load