Gamifying education can raise students’ engagement much as games do, helping them improve particular skills and optimize their learning. On the other hand, scientific studies have shown adverse outcomes depending on users’ preferences. The link among users’ characteristics, the actions they perform, and the game elements is still an open question. To gain insight into this issue, we investigated the effects of gamification on students’ learning, behavior, and engagement based on their personality traits in a web-based programming learning environment. We conducted a four-month experiment with 40 undergraduate students in first-year programming courses. Students were randomly assigned to one of two versions of the programming learning environment: a gamified version with ranking, points, and badges, and the original non-gamified version. We found evidence that gamification affected users in distinct ways depending on their personality traits. Our results indicate that the effect of gamification depends on the specific characteristics of users.
First part title: Studying the impact of gamification on learning and engagement based on the personality traits of students
Stimulated by the effects that game elements can produce, many researchers have looked into the influence of gamification in educational contexts, obtaining favorable results such as increased engagement, user retention, knowledge, and cooperation (Hakulinen and Auvinen 2014; Tvarozek and Brza 2014). Despite that, some studies have shown uncertain or harmful results from gamification. Christy and Fox (2014) found that ranking affects women in various ways and may lead to an unexpected opposite impact. Hanus and Fox (2015) reported that, in addition to not improving results, gamification decreased pleasure and motivation. Haaranen et al. (2014) noticed that some users had adverse emotions about the badges.
The mix of controversial results on the effects of gamification in learning environments raises doubts about the advantages of using it in an educational setting. Moreover, researching the effects of gamification elements on students’ learning, participation, and other outcomes is too broad a goal. The objective should be narrowed to which game elements are effective for a particular type of student involved in a given activity (Dichev and Dicheva 2017). Different arrangements of game elements, used to gamify diverse activities, produce different effects, which makes it hard to determine which elements, or combinations of elements, effectively promote engagement and learning for a given group or type of user performing a specific activity (Dichev and Dicheva 2017). Motivation (Pedro 2016; Hakulinen and Auvinen 2014; Mekler et al. 2017), player profile (Barata et al. 2014; O’Donovan et al. 2013), and personality (Codish and Ravid 2014; Jia et al. 2016) are the user characteristics and preferences that have been most investigated in gamified learning environments.
The user’s personality is the set of characteristics and psychological factors that are used to understand how individuals think and interact (Goldberg 1992). Personality traits refer to an individual’s reactions to different situations, and little is known about how different elements of gamification affect engagement based on the user’s traits (Codish and Ravid 2014). Empirical studies are needed to verify whether the effect of gamification may differ depending on users’ personality traits.
Codish and Ravid (2014) investigated, through preference surveys, how extroverts and introverts perceived gamification. They found an adverse effect of the ranking on extroverted students and a favorable but not substantial effect on introverted students; extroverts preferred the badges. Jia et al. (2016), on the other hand, also using preference surveys, found that extroverted people are motivated by points, levels, and ranking. Jang, Park, and Yi (2015) found that users with low agreeableness who used a non-gamified version of a system had lower learning rates than those who used the gamified one.
These earlier results were crucial for the conceptual understanding of how gamification interacts with personality, but they were based only on users’ opinions, collected through questionnaires over a short period. It is essential to conduct experiments that verify the actual effects of gamification in learning environments over a longer duration.
In this study, we investigated whether gamification affects students differently depending on their personality traits. More specifically, we investigated whether distinct components of gamification affect students’ learning, their programming behavior (trial-and-error behavior when submitting programming tasks for correction), and their engagement depending on their personality traits (extroversion, openness, agreeableness, neuroticism, and conscientiousness) in the context of learning to program. Personality traits were selected because there is a lack of empirical studies in real environments that target this topic, and the results of previous works leave the question open. In our work, the effects of gamification for different personality traits were investigated with an empirical experiment in a real learning environment over a longer interval of time (four months). We examined participants’ engagement and behavior through their activity logs in the educational system, and their learning through knowledge tests.
This work is an extended version of our paper published at ICALT 2019 (Smiderle et al. 2019). Unlike the ICALT paper, which focused only on the extroversion trait, the present article also explores how the other personality traits (openness, agreeableness, neuroticism, and conscientiousness) influence the impact of gamification elements on students’ learning, engagement, and programming behavior.
The participants were university students from two first-semester classes in a Computing course at a private university in the state of Rio Grande do Sul, Brazil. In total, 48 students aged between 17 and 34 years old (M = 21, SD = 3.74), 38 boys and ten girls, were invited at the beginning of the semester to participate. Consent forms were handed out for all students who agreed to join the experiment to sign. Only the data of students who completed the personality trait questionnaire (43 students) and agreed to participate by returning the consent form (41 students) were considered at the end of the experiment, totaling 40 students (7 girls and 33 boys). At the beginning of the experiment, the teachers explained to the students that their participation was voluntary, that they could quit at any moment, and that quitting would not change their final grade in the class.
We measured engagement by the number of logins, badges, and points, and by the number of times students viewed the gamification elements. The grades in the course exams served to evaluate learning. Programming behavior was measured by the accuracy of the solutions students submitted for programming exercises. Accuracy is the total number of correct solutions divided by the total number of solutions submitted. It reflects the care a student takes before submitting a solution, and is the opposite of trial-and-error behavior, in which the student submits different solutions repeatedly until one succeeds, without seriously reflecting on them, only to get the system’s feedback.
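The accuracy measure defined above can be written as a one-line ratio. The following is a minimal sketch, not the code used in the study; the function name and the choice to return 0.0 for a student with no submissions are assumptions made for illustration.

```python
def accuracy(correct_submissions: int, total_submissions: int) -> float:
    """Fraction of a student's submitted solutions that were correct."""
    if total_submissions == 0:
        # Assumption: a student who never submitted gets 0.0 instead of an error.
        return 0.0
    return correct_submissions / total_submissions


# Two hypothetical students who both solved 8 exercises:
careful_student = accuracy(correct_submissions=8, total_submissions=10)
trial_and_error_student = accuracy(correct_submissions=8, total_submissions=40)
```

Both hypothetical students end up with the same number of correct solutions, but the second needed four times as many submissions, which is the trial-and-error pattern that a low accuracy value is meant to capture.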
Personality questionnaire - IGFP-5
To determine students’ personalities, we used the IGFP-5, a self-reported measure composed of 44 items and designed to evaluate the personality dimensions of the Big Five Personality Factors model (Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism) (de Andrade 2008). It was validated for Brazil with a sample of 5,089 respondents from the five Brazilian regions, 66.9% female and 79.0% with higher education. According to de Andrade (2008), individuals with high scores in Openness are generally outspoken, imaginative, witty, original, and artistic. Conscientious individuals are generally cautious, trustworthy, organized, and responsible. Extroverted individuals tend to be active, enthusiastic, sociable, and eloquent or talkative. People with high scores in Agreeableness are pleasant, lovely, cooperative, and affectionate. Neurotic individuals are usually nervous, highly sensitive, tense, and concerned.
BlueJ and Feeper
Feeper is a web-based system designed to assist students and teachers in programming classes. In the environment, the teacher can provide programming exercises, which students solve and which are automatically corrected by an Online Judge integrated into the platform. For a given input, the judge compares the output of the learner’s program with the output of an ideal solution provided by the teacher. It also uses rules previously registered by the teacher to give students feedback based on the output of their code. A significant advantage of this type of environment is that it reduces the teacher’s burden because it corrects the exercises automatically, allowing the teacher to concentrate their efforts on students who are struggling with the tasks.
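The output-matching idea behind an online judge can be illustrated in a few lines. This is only a sketch under stated assumptions: it is not Feeper’s implementation, it works on already-captured output strings rather than running Java programs, and the whitespace normalization, the function names, and the substring-based feedback rules are all hypothetical.

```python
def normalize(output: str) -> list[str]:
    """Split program output into lines, ignoring trailing whitespace."""
    return [line.rstrip() for line in output.strip().splitlines()]


def is_correct(student_output: str, reference_output: str) -> bool:
    """A submission passes when its output matches the reference solution's output."""
    return normalize(student_output) == normalize(reference_output)


def collect_feedback(student_output: str, rules: list[tuple[str, str]]) -> list[str]:
    """Return the hint of every teacher rule whose pattern occurs in the output.

    Assumption: a rule is a (substring, hint) pair; the real rule format is not
    described in the text.
    """
    return [hint for pattern, hint in rules if pattern in student_output]


# Hypothetical data for one exercise and one test input.
reference = "Sum: 7\n"
rules = [("NullPointerException", "Check that every object is created before it is used.")]

passed = is_correct("Sum: 7  \n", reference)
hints = collect_feedback("java.lang.NullPointerException at Main.main", rules)
```

Comparing normalized lines rather than raw strings is a design choice that avoids failing a submission over a trailing space or final newline; a stricter judge could compare the raw output instead.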
BlueJ is a free Java Development Environment designed for beginners to learn the basics of programming (Bluej 2019).
In our work, the teacher recommended that students write their code in BlueJ, which has a more straightforward interface for beginners. After solving the task in BlueJ, students submitted the final solution to Feeper to get the correction and error feedback. Only Feeper was gamified in this study, and students used it to check their progress.
System logs and grades
The information extracted from Feeper’s usage logs consisted of the number of logins, correct and wrong exercises, badges and points obtained, and challenges completed. We also analyzed how many times users viewed the ranking, badges, and points elements. The number of badge views is different from the number of badges obtained. By counting the number of views, we were measuring how interested the student was in this element. A student can get many badges because she completed the activities successfully out of interest in the topic, even if she is not interested in getting badges. The same is true for the points and ranking.
During the semester, students took three exams as part of the formal evaluation process of the class. Grade A was given in the middle of the semester and comprised problems on the topics covered up to that point. Grade B was the last exam, given at the end of the semester. Students who did not achieve the minimum score could improve their grade with Grade C, which was given two weeks after Grade B. In this work, Grade A served to check students’ programming performance before gamification was switched on for the experimental group. The participants completed the IGFP-5 personality questionnaire and were randomly distributed into two groups, the control and the gamified. At the end of the semester, they took the final exam.
Gamification in Feeper
The gamification elements implemented in Feeper for this study are points, badges, and ranking, described below. The only difference between the gamified and non-gamified versions of Feeper is that participants in the non-gamified version cannot see the gamification elements, although internally the system still records their points and badges. This internal scoring allows us to compare whether being able to see the gamification elements makes students more engaged.
Points appear to participants in two different parts of the system. When students are completing a programming task, they can see how many points they could earn if they solve it successfully. When the solution is incorrect, the score is decreased by five points for each submission (the students can lose a maximum of 70 points for each task). Students can also view their score histories for the solved exercises and the points previously earned. Students were warned that the scores obtained in the exercises would not affect their final grade on the course.
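The penalty rule described above (five points per incorrect submission, capped at 70 points per task) can be sketched as follows. The constants come from the text; the function names and the floor at zero points are assumptions, since the text does not say what happens when the penalty exceeds the value of a task.

```python
PENALTY_PER_WRONG_SUBMISSION = 5  # from the text
MAX_PENALTY_PER_TASK = 70         # from the text


def penalty(wrong_submissions: int) -> int:
    """Points lost on one task after a number of incorrect submissions."""
    return min(wrong_submissions * PENALTY_PER_WRONG_SUBMISSION, MAX_PENALTY_PER_TASK)


def points_earned(task_points: int, wrong_submissions: int) -> int:
    """Points awarded once the task is finally solved.

    Assumption: the score never drops below zero.
    """
    return max(task_points - penalty(wrong_submissions), 0)


# A hypothetical 100-point task solved after three wrong attempts.
earned = points_earned(task_points=100, wrong_submissions=3)
```

Under this rule the cap is reached at the fourteenth incorrect submission, after which further wrong attempts cost nothing more.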
Nine distinct badges were granted to users for achieving specific objectives, each with three degrees (gold, silver, bronze), totaling 27 badges. Badges were granted to students who reached a specific number of logins, correct assignments, submitted assignments, assignments submitted with no errors, or days of activity, and to those who completed challenges or reached the top of the class or of the platform.
The ranking is based on the sum of all points each student earned across all solved assignments. There are two distinct rankings available. The class ranking shows the participants with the best scores in the class; its goal is to promote local objectives for students. The second is the general ranking, which compares the scores of all students on the platform who have used Feeper.
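The two rankings can be derived from the same list of solved assignments by summing points per student, either for one class or for the whole platform. The sketch below is illustrative only: the record layout, the names, and the alphabetical tie-break are assumptions, not details of Feeper.

```python
from collections import defaultdict


def build_ranking(solved, class_id=None):
    """Rank students by total points, highest first.

    solved: iterable of (student, class_id, points) records (hypothetical layout).
    class_id: restrict to one class, or None for the general ranking.
    """
    totals = defaultdict(int)
    for student, student_class, points in solved:
        if class_id is None or student_class == class_id:
            totals[student] += points
    # Assumption: ties are broken alphabetically by student name.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Invented records for illustration.
solved = [
    ("ana", "A", 85),
    ("bruno", "A", 100),
    ("ana", "A", 60),
    ("carla", "B", 120),
]

general_ranking = build_ranking(solved)
class_a_ranking = build_ranking(solved, class_id="A")
```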
This experiment followed an experimental design with two groups, a control group (21 participants) and an experimental group (19 participants), to which students were randomly assigned, with the only restriction that both groups initially had the same number of participants. Table 1 shows the number of participants for each personality trait in each group (gamified and non-gamified).
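Random assignment with equal initial group sizes can be done by shuffling the list of students and splitting it in half. This is a sketch of one common way to do it, not necessarily the procedure the authors followed; the function name, the seed parameter, and the anonymous integer IDs are assumptions.

```python
import random


def assign_groups(student_ids, seed=None):
    """Randomly split students into two groups of (near-)equal size.

    Returns (control, experimental). With an odd number of students the
    experimental group receives the extra one (an arbitrary choice).
    """
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# 48 invited students, identified here by hypothetical anonymous IDs.
control, experimental = assign_groups(range(48), seed=2018)
```

Groups that start out equal can still end up uneven, as in the study (21 and 19), once participants who did not return the questionnaire or consent form are excluded.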
Students in the control group used the original non-gamified version of Feeper, while learners in the experimental group used a gamified version of Feeper with points, badges, and ranking. All students started using the non-gamified variant of Feeper, and only in the second half of the semester (after the first exam, grade GA), students in the experimental group began to use the gamified version.
This type of design allows us to examine the effects of gamification on personality traits using two kinds of comparison: each participant with themselves (a within-subject design, comparing the performance and engagement of students in the experimental group before and after Grade GA), and the control group with the experimental group after Grade GA. At the end of the semester, students completed the final exam (Grade B), which covered all the content of the course. Figure 1 illustrates the phases of the experiment.
Some students have reported that they noticed their version of Feeper was different from the one used by a nearby colleague (they were able to notice the presence of points and ranking). When this occurred, teachers only reported that some new features were being tested in Feeper and were only available to some participants.
The experiment was carried out in the second half of 2018 and lasted four months. The participants had class once a week, and each class lasted two hours and 38 minutes. Students used Feeper in all classes except the first class, the three classes in which the teacher gave the exams (Grade A, Grade B, and Grade C), and one topic-review class, totaling 15 classes spent solving programming tasks using BlueJ and Feeper.
In the first week of class, the teacher presented the organization of the course and some introductory notions of computer organization. In the second week, the teacher introduced students to the Feeper and BlueJ environments (“Materials” section). Introductory tasks were given so that students could get used to both learning environments.
From the third week onward, students worked on four exercises in each class: a worked example, two activities that were part of the final grade, and an optional exercise. The teacher began each class by solving a worked example step by step to show students how to solve a programming task involving the concepts to be covered in that class.
Students then solved two other programming tasks with the same difficulty level as the worked example and using the same programming concepts. The students’ grade was composed of the completion of these two tasks (20%) and the score on the exams (80%). In addition to being part of the grade, these tasks served to identify students’ difficulties. The optional task was an extra activity of greater difficulty that could earn additional credit. The goal was to challenge the students and, because it was optional, to gauge their engagement.
Results and analysis
This section presents the results found in this study. We conducted a Shapiro-Wilk test to verify the normality of the data. For data that followed a normal distribution, we conducted a t-test to compare means. We also calculated the effect size for all tests, which is a simple way to quantify the difference between two groups, using Cohen’s d. For data with non-normal distributions, we applied a Wilcoxon rank-sum test to compare the groups. The effect size for non-normal data was calculated by dividing the z value by the square root of the number of participants, as described for the Mann-Whitney test (Pallant 2010).
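The two effect-size measures mentioned above are short formulas. The sketch below uses only the Python standard library and invented numbers; it assumes the pooled-standard-deviation form of Cohen’s d, which the text does not specify, and it takes the z value as given rather than computing the rank-sum test itself.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation.

    Assumption: the pooled-SD variant; the text does not say which was used.
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def rank_sum_effect_size(z: float, n_participants: int) -> float:
    """Effect size r = z / sqrt(N) for a rank-based test, as described in the text."""
    return z / math.sqrt(n_participants)


# Invented values, for illustration only (not data from the study).
d = cohens_d([2, 4, 6], [1, 3, 5])
r = rank_sum_effect_size(z=-2.0, n_participants=16)
```

With these invented inputs the means differ by 1 and the pooled standard deviation is 2, giving d = 0.5, while z = −2.0 over 16 participants gives r = −0.5.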
Table 2 shows the results of the evaluation comparing the experimental and control groups. The significance level was set at α = 0.05 (95% confidence interval). Only statistically significant results are shown.
Participants who used the gamified environment had, on average, more points, badges, and logins than participants in the non-gamified group. However, these differences were not statistically significant, so there is no evidence that the gamified group was more engaged than the non-gamified group. Regarding the grades, both groups had a reduction from grade GA to grade GB, which is usual in this class because GB covers more content and is more difficult than GA.
The gamified system can change student behavior. Participants in the gamified group showed a significant improvement in the quality of their submitted solutions, achieving higher accuracy (the number of correct solutions divided by the total number of solutions submitted) when comparing Grade A and Grade B. In the literature, some papers report improved performance in gamified groups through higher scores (Krause et al. 2015) and fewer unwanted behaviors (Pedro 2016). Another work, which studied the effects of gamification on users’ performance in tagging pictures (Mekler et al. 2017), found an increase in the number of tag annotations but no improvement in tag quality.
Each personality trait was analyzed individually. Introverted participants (in both experimental and control groups) had a higher number of points, badges, and logins than extroverted participants. A statistically significant difference was found in the number of points and ranking views between the introverted and extroverted participants in the gamified group, indicating that users with different personality traits respond differently to gamification. In addition, a statistically significant difference was found in the accuracy gain of the introverted participants in the gamified group.
Regarding neuroticism, no differences were observed between the gamified and non-gamified groups. However, in both groups, people with high neuroticism had a higher number of logins. Regarding conscientiousness, students in the non-gamified group with low conscientiousness obtained the lowest number of points when compared to the other groups.
The correlations between all personality traits and the variables were calculated and are shown in Fig. 2. They were used to identify the relationship between the dependent variables (engagement, learning, programming behavior) and personality traits. A correlation is considered strong when the absolute value of the Pearson coefficient is higher than 0.7, moderate when it is between 0.5 and 0.7, and weak when it is lower than 0.5.
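The thresholds above translate directly into a small classifier applied to the magnitude of the coefficient. The sketch below also includes a plain Pearson computation for completeness; the function names are illustrative, and treating exactly 0.5 and 0.7 as moderate is an assumption consistent with the way r = 0.5 is labeled later in this section.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def correlation_strength(r: float) -> str:
    """Label a coefficient by its magnitude; the sign only gives the direction."""
    magnitude = abs(r)
    if magnitude > 0.7:
        return "strong"
    if magnitude >= 0.5:  # assumption: boundary values count as moderate
        return "moderate"
    return "weak"


# Coefficients reported in this section, used here only as inputs.
labels = [correlation_strength(r) for r in (-0.79, -0.52, 0.41, 0.5)]
```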
Regarding the number of points obtained, a moderate negative correlation (r = −0.52, p = 0.01) was identified with the extroversion personality trait for the gamified group. For the non-gamified group, a weak positive correlation between the points and the conscientiousness personality trait (r = +0.41, p = 0.01) was found.
Regarding the number of badges obtained, a moderate negative correlation was found with the extroversion personality trait (r = −0.52, p = 0.01).
Concerning the accuracy after the grade GA, for the gamified group, a moderate negative correlation was found with the extroversion trait (r = −0.57, p < 0.01) and a weak negative correlation with the openness trait (r = −0.42, p = 0.05). A weak positive correlation was found with the conscientiousness trait (r = 0.3, p = 0.18). For the non-gamified group, a moderate positive correlation was identified between accuracy after GB and the conscientiousness personality trait (r = 0.5, p = 0.02).
We also analyzed the correlation between the engagement of students in the gamified group and the different gamification elements (points, badges, and ranking). For the number of ranking views, we observed a strong negative correlation with the extroversion trait (r = −0.79, p < 0.01), and also a weak negative correlation with the agreeableness trait (r = −0.48, p = 0.02).
Each trait was analyzed individually. Introverted participants in both the control and experimental groups had a higher number of points, badges, and logins. A statistically significant difference was found in the number of points and ranking views between the introverted and extroverted students who used the gamified version, indicating that users with different personality traits respond differently to gamification. In addition, a statistically significant difference was found in the accuracy gain of the introverted participants who used the gamified version.
This result partially matches the results of Codish and Ravid (2014), who detected a negative effect of the ranking on extroverted participants and a positive but not significant effect on introverted participants; extroverts preferred badges. However, unlike our work, Jia et al. (2016) found that extroverts tend to be more motivated by points, levels, and ranking.
Regarding neuroticism, no differences were observed between the gamified and non-gamified groups. Overall, in both groups, neurotic participants had more logins. People with high neuroticism tend to be more worried and insecure, which, possibly, makes them check the platform more often for new exercises.
Regarding the grades, within the gamified group, students with high agreeableness had the lowest grade reduction (the difference between GB and GA), while within the non-gamified group it was students with low agreeableness who had the lowest grade reduction. Related to this personality trait, Jang et al. (2015) found that users with low agreeableness in the non-gamified group had lower learning rates than those in the gamified group, suggesting that gamification is more useful to these users. Unlike that work, we found no evidence that gamification improves learning for students with low agreeableness.
Regarding the conscientiousness trait, in both the gamified and non-gamified groups, highly conscientious participants were more accurate in solving the exercises, and only the students with low conscientiousness who used the non-gamified system showed a decrease in accuracy. This decrease was significant, so the loss of accuracy in the non-gamified group can be attributed to participants with low conscientiousness. A possible explanation is that people low in this trait tend to be more careless and negligent.
Limitations of study
When users fill out the personality trait questionnaire, they receive a score (a value ranging from 5 to 25) for each of the five personality traits. For example, the score on the extroversion trait indicates whether the person tends to be more extroverted or more introverted. An important issue is which cutoff score to use to classify someone as extroverted or introverted. As our sample was relatively small (only 40 participants), we classified the participants using the median. This procedure was a provisional measure, and we intend to use a more reliable method for classifying participants in future work. A more trustworthy possibility, given a larger sample, would be to analyze only the individuals at the extremes of each personality trait.
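The median-based classification described above can be sketched as follows. The scores are invented, the function name is hypothetical, and the handling of scores exactly equal to the median (placed in the low group here) is an assumption, since the text does not say how such ties were treated.

```python
import statistics


def median_split(scores: dict[str, float]) -> dict[str, str]:
    """Label each participant 'high' or 'low' on a trait relative to the sample median.

    Assumption: a score equal to the median is labeled 'low'.
    """
    cutoff = statistics.median(scores.values())
    return {
        participant: "high" if score > cutoff else "low"
        for participant, score in scores.items()
    }


# Invented extroversion scores on the 5 to 25 scale.
extroversion = {"p1": 10, "p2": 15, "p3": 20, "p4": 22}
groups = median_split(extroversion)
```

A median split always produces two groups of similar size, which suits a small sample, but it also places participants with nearly identical scores on opposite sides of the cutoff, which is one reason the text calls it a provisional measure.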
Another limitation to the validity of the results is the representativeness of the sample, since all participants are from the same university and most of them are young males. Therefore, it is not reasonable to generalize the results to the whole student population. This limitation can be addressed by replicating the experiment with distinct samples of undergraduate students.
The results were obtained through an empirical experiment involving programming tasks. It is also necessary to replicate the study for other domains such as mathematics, physics, etc.
This article aimed to contribute to the understanding of how gamification affects participants differently depending on their characteristics. More specifically, we investigated the effects of points, ranking, and badges on the engagement, learning, and programming behavior of students based on their personality traits. To achieve this goal, we conducted an empirical experiment with 40 undergraduate students enrolled in a programming class who used two different versions of a programming learning environment: a gamified one and a non-gamified one.
The results showed a change in the behavior of the gamified group: accuracy improved significantly for introverted students and for students with low agreeableness or low openness who used the gamified version in the second half of the course. A reduction in accuracy over the semester (GA/GB) was also observed for students with low conscientiousness who used the non-gamified system, whereas this reduction did not occur in the gamified group, indicating that gamification may help these groups. Introverted students who used the gamified version were more engaged than extroverted students using the same version. We also found a strong negative correlation between the extroversion trait and the number of ranking views, indicating that gamification in general, and especially the ranking element, is more beneficial to introverts.
This work contributes to the understanding of how gamified environments affect users based on their characteristics. Specifically, it contributes to the comprehension of how gamification affects the engagement and learning behavior of university students based on their personality traits. Future research could study the effect of gamification in various disciplines over a more extended period. This could help to verify whether gamification loses its effectiveness over time and to identify possible saturation points and limitations in its application.
Availability of data and materials
The first and last authors store the datasets. However, the Informed Consent Form that was used does not allow the full distribution of the data.
Barata, G., Gama, S., Jorge, J.A., Gonçalves, D.J. (2014). Relating gaming habits with student performance in a gamified learning experience. In Proceedings of the first ACM SIGCHI annual symposium on Computer-human interaction in play - CHI PLAY ’14. https://doi.org/10.1145/2658537.2658692. ACM, (pp. 17–25).
Borges, S.d.S., Reis, H.M., Durelli, V.H., Bittencourt, I.I., Jaques, P.A., Isotani, S. (2013). Gamificação aplicada à educação: um mapeamento sistemático. In Brazilian Symp. on Computers in Education. https://doi.org/10.5753/cbie.sbie.2013.234, (Vol. 24. Sociedade Brasileira de Computação, p. 234).
Christy, K.R., & Fox, J. (2014). Leaderboards in a virtual classroom: A test of stereotype threat and social comparison explanations for women’s math performance. Computers & Education, 78, 66–77.
Dichev, C., & Dicheva, D. (2017). Gamifying education: what is known, what is believed and what remains uncertain: a critical review. International Journal of Educational Technology in Higher Education, 14(1), 9.
Fardo, M.L. (2014). A gamificação como estratégia pedagógica: estudo de elementos dos games aplicados em processos de ensino e aprendizagem. Master’s thesis, Universidade de Caxias do Sul. https://repositorio.ucs.br/handle/11338/457.
Goldberg, L.R. (1992). The development of markers for the big-five factor structure. Psychological Assessment, 4(1), 26.
Haaranen, L., Ihantola, P., Hakulinen, L., Korhonen, A. (2014). How (not) to introduce badges to online exercises. In ACM Tech. Symp. on Computer Science Education. https://doi.org/10.1145/2538862.2538921. ACM, (pp. 33–38).
Hakulinen, L., & Auvinen, T. (2014). The effect of gamification on students with different achievement goal orientations. In 2014 International Conference on Teaching and Learning in Computing and Engineering (LaTiCE). https://doi.org/10.1109/latice.2014.10. IEEE, (pp. 9–16).
Hanus, M.D., & Fox, J. (2015). Assessing the effects of gamification in the classroom. Computers & Education, 80, 152–161.
Jang, J., Park, J.J.Y., Yi, M.Y. (2015). Gamification of online learning. In Conati, C., Heffernan, N., Mitrovic, A., Verdejo, M.F. (Eds.), Artificial Intelligence in Education. Springer, Cham, (pp. 646–649).
Jia, Y., Xu, B., Karanam, Y., Voida, S. (2016). Personality-targeted gamification: a survey study on personality traits and motivational affordances. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems - CHI ’16. https://doi.org/10.1145/2858036.2858515. ACM.
Knutas, A., Ikonen, J., Nikula, U., Porras, J. (2014). Increasing collaborative communications in a programming course with gamification: a case study. In Proceedings of the 15th International Conference on Computer Systems and Technologies. https://dl.acm.org/citation.cfm?id=2659620. ACM, (pp. 370–377).
Krause, M., Mogalle, M., Pohl, H., Williams, J.J. (2015). A playful game changer: Fostering student retention in online education with social gamification. In ACM Conf. on Learning@ Scale. https://dl.acm.org/citation.cfm?id=2724665. ACM, (pp. 95–102).
Mekler, E.D., Brühlmann, F., Tuch, A.N., Opwis, K. (2017). Towards understanding the effects of individual gamification elements on intrinsic motivation and performance. Computers in Human Behavior, 71, 525–534.
O’Donovan, S., Gain, J., Marais, P. (2013). A case study in the gamification of a university-level games development course. In Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference on - SAICSIT ’13. https://doi.org/10.1145/2513456.2513469. ACM, (pp. 242–251).
Pallant, J. (2010). SPSS survival manual: a step by step guide to data analysis using SPSS. Open University Press.
Smiderle, R., Marques, L., Coelho, J.A.P.d.M., Rigo, S., Jaques, P. (2019). Studying the impact of gamification on learning and engagement of introverted and extroverted students. In 2019 IEEE 19th International Conference on Advanced Learning Technologies (ICALT). https://doi.org/10.1109/icalt.2019.00023. IEEE.
All authors have had a scientific contribution in this manuscript, with the first and last authors contributing the most to the design, implementation, analysis, and write up of the manuscript. All authors read and approved the final version of the manuscript.
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Smiderle, R., Rigo, S.J., Marques, L.B. et al. The impact of gamification on students’ learning, engagement and behavior based on their personality traits. Smart Learn. Environ. 7, 3 (2020). https://doi.org/10.1186/s40561-019-0098-x