Formative Assessment and Learning Analytics

Dirk T. Tempelaar
Maastricht University School of Business & Economics
Tongersestraat 53 - Room A2.20, 6211 LM Maastricht, The Netherlands
[email protected]

Hans Cuypers
Eindhoven University of Technology, Department of Mathematics and Computing Science
5612 AZ Eindhoven, The Netherlands
[email protected]

André Heck
University of Amsterdam, Faculty of Science
Science Park 904, 1098 XH Amsterdam, The Netherlands
[email protected]

Henk van der Kooij
Utrecht University, Freudenthal Institute
P.O. Box 80125, 3508 TC Utrecht, The Netherlands
[email protected]

Evert van de Vrie
Open University Netherlands, Faculty of Informatics
P.O. Box 2960, 6401 DL Heerlen, The Netherlands
[email protected]

ABSTRACT
Learning analytics seeks to enhance the learning process through systematic measurement of learning-related data, and by informing learners and teachers of the results of these measurements, so as to support the control of the learning process. Learning analytics has various sources of information, two main types being intentional data and learner activity related metadata [1]. This contribution aims to provide a practical application of Shum and Crick's theoretical framework [1] of a learning analytics infrastructure that combines learning dispositions data with data extracted from computer-based, formative assessments. The latter data component is derived from one of the educational projects of ONBETWIST, part of the SURF program 'Testing and Test Driven Learning'.
Categories and Subject Descriptors K.3.1 [Computers and Education]: Computer Uses in Education – Computer-assisted instruction (CAI).
General Terms Measurement, Design.
Keywords Blended Learning; Formative Assessment; Learning Analytics; Learning Dispositions; Student Profiles; Test Directed Learning.
1. INTRODUCTION
The prime data source for most learning analytics applications is data generated by learner activities, such as learner participation in continuous, formative assessments. That information is frequently supplemented by background data retrieved from learning management systems and other administrative systems, such as records of prior education. A combination with intentionally collected data, such as self-report data stemming from student responses to surveys, is however the exception rather than the rule. In their theoretical contribution to LAK2012 [1], Shum and Crick propose a learning analytics infrastructure that combines learning activity generated data with learning dispositions, values and attitudes measured through self-report surveys, fed back to students and teachers through visual analytics. Their proposal considers, for example, spider diagrams to provide learners insight into their learning dispositions, values and attitudes. In our empirical contribution, we aim to provide a practical application of such an infrastructure based on combining learning and learner data. In collecting learner data, we opted to use a wide range of well-validated self-report surveys firmly rooted in current educational research, covering learning styles, learning motivation and engagement, and learning emotions. Learner data were reported to both students and teachers using visual analytics similar to those described in [1]; rather than focusing on the technology used to feed back learner data, we therefore focus here on the crucial role of the richness of the profile of learner dispositions, values and attitudes. Our second data source is rooted in the instructional method of test-directed learning, and brings about the second focus of this empirical study: to demonstrate the crucial role of data derived from computer-based formative assessments in designing effective learning analytics infrastructures.
2. TEST-DIRECTED LEARNING
The classic function of testing is that of the proficiency test: after completion of the learning process, we expect students to demonstrate mastery of the subject. In this test tradition, the feedback resulting from such classic tests is no more than a grade, and that feedback becomes available only after all learning has finished. The alternative form of assessment, formative assessment, has an entirely different function: informing student and teacher. This information should help better shape teaching and learning, and it is especially useful when it becomes available during or prior to the learning. Diagnostic testing is an example, as is practice testing. Because here the feedback that tests yield for learning constitutes the main function, it is crucial that this information is readily, preferably even instantly, available. This is where digital testing enters the scene: without computers, it is hardly feasible to deliver feedback from formative assessments in time. In the Netherlands, the development of digital testing has accelerated over the past decade, partly driven by SURF's National Action Plan on e-Learning. This plan included projects such as the National Knowledge-Base Mathematics Skills (NKBW I & II) and Intelligent Feedback, which ran from 2006 to 2010 and delivered a multitude of test materials for digital, formative assessments (www.nkbw.nl) for the mathematics discipline. The projects under the current SURF program Testing and Test Driven Learning [2] aim to complete this development by integrating series of digital tests into a test-directed curriculum. In the context of mathematics and statistics, the project ONBETWIST (www.onbetwist.org) pursues this goal [3].
3. LEARNING ANALYTICS
The broad goal of learning analytics is to apply the outcomes of analyzing data, gathered by monitoring and measuring the learning process, as feedback that helps direct that same learning process. Several alternative operationalizations are possible to support this. In [4], six objectives are distinguished: predicting learner performance and modeling learners, suggesting relevant learning resources, increasing reflection and awareness, enhancing social learning environments, detecting undesirable learner behaviors, and detecting learners' affects. In the following sections describing our approach, we will demonstrate that the combination of self-report learner data with learning data from test-directed instruction contributes to at least five of these objectives. Only the social component is restricted: learners can assess their individual learning profiles by comparing their own strong and weak characteristics with the position of other students. These profiles are based on both learner behavior, including all undesirable aspects of it, and learner characteristics: the dispositions, attitudes and values. Learner profiles are used to model different types of learners, and to predict learner performance for each individual student. Since our instructional format is student-centered, with the student rather than the teacher steering the learning process, it is crucial to feed all this information back to the learners themselves, so as to make them fully aware of how to optimize their individual learning trajectories.
4. CASE STUDY: MATH AND STATS
Our empirical contribution focuses on freshman education in quantitative methods (mathematics and statistics) at the business & economics school of Maastricht University, one of the educational projects within ONBETWIST. All these projects experiment with forms of test-directed learning, using a common database of test materials (the ONBETWIST database). We focus on this specific project since it is unique in its integration of test-directed learning and the application of learning analytics. In addition, it is directed at a large and diverse group of students, which benefits the research design. The population studied here consists of two cohorts of freshmen, 2010/2011 and 2011/2012, containing 1,832 students who in some way participated in school activities (i.e., have been active in the digital learning environment Blackboard). Besides Blackboard, two digital learning environments for test-directed learning were utilized by a large majority of students: MyStatLab (1,743 students) and ONBETWIST (1,682 students).
The diversity of the student population derives mainly from its very international composition: only 34.8% were educated in Dutch high schools, whereas all others come from international high school systems. The largest group, 41.9% of the freshmen, were educated according to the German Abitur system. High school systems in Europe differ strongly, most particularly in the teaching of mathematics and statistics. In that European palette, the Netherlands occupies a rather unique position, both in the choice of subjects (one of the few European systems with a substantial focus on statistics) and in the chosen pedagogical approach. But even beyond the Dutch position, there exist large differences, such as between the Anglo-Saxon and German-oriented high school systems. It is therefore crucial that the first course offered to these students is flexible and allows for individual learning paths. To some extent, this is realized by offering optional, developmental summer courses, but for the main part, this diversity issue needs to be solved within the program itself. The digital environments for test-directed learning play an important role in this.
5. TEST-DIRECTED LEARNING ENVIRONMENTS
For both sub-topics of the course, mathematics and statistics, digital environments for test-directed learning are utilized. In statistics, the sub-topic with the largest diversity in prior proficiency, we used the commercial MyStatLab (MSL) environment. MSL is a generic digital learning environment for learning statistics, developed by the publisher Pearson, that adapts to the specific choice of a Pearson textbook. Although MSL can be used as a learning environment in the broad sense of the word (it contains, among other things, a digital version of the textbook), it is primarily an environment for test-directed learning. Each step in the learning process is initiated by a question, and students are encouraged to (try to) answer it. If they do not (fully) master the problem, they can either ask for help in solving it step by step (Help Me Solve This) or ask for a fully worked example (View an Example). Next, a new, parameter-based version of the problem loads, allowing the student to demonstrate the newly acquired mastery. In the investigated courses, students worked an average of 19.2 hours in MSL, about a quarter of the 80 hours available for learning statistics. In this study, we use two indicators for the intensity of MSL use: Stats#hours, the number of hours a student spent practicing in the MSL environment, and StatsTestScore, the average score on the practice questions, aggregated over all chapters. For the mathematics education, the ONBETWIST database of test items generated by the SURF project ONBETWIST was applied. The ONBETWIST project has its own player that would have made it possible to use these materials via server-based computing. However, we opted to convert part of the content of the ONBETWIST test database into Blackboard item pools and to organize access through the local UM Blackboard environment, in order to accommodate the storage of all user access data. We also opted to use items of multiple-choice type only, to save our students from having to use formula editors; parallel items were stored in item pools from which a random version was drawn in every practice or test attempt. The functionality of the Blackboard system realized this way is narrower than that of the MSL system described above: it offers primarily a practicing and testing functionality, in which students can repeatedly test their mastery of a specific math topic, rather than a true e-tutorial supporting the learning itself. The variables signaling intensity of practicing are Math#Tests, the number of practice tests a student tried out, and MathTestScore, the average score on those tests.
Because Blackboard does not track time, it is not possible to determine what part of the total learning time students spent practicing the Blackboard math tests, but given the more limited functionality, it seems plausible that this proportion is lower than the quarter of available time spent in MSL.
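To make the four intensity indicators concrete, the following minimal sketch shows how they could be derived from raw attempt logs. It is an illustration under assumed names only: the file name and columns (student_id, platform, hours, score, test_id) are hypothetical stand-ins, since actual MSL and Blackboard exports use their own schemas, and Stats#hours / Math#Tests are rendered as valid Python identifiers.

```python
# Sketch: deriving the four intensity-of-use indicators from attempt logs.
# All file and column names are hypothetical.
import pandas as pd

logs = pd.read_csv("practice_logs.csv")  # hypothetical combined export

msl = logs[logs["platform"] == "MSL"]  # MyStatLab attempts
bb = logs[logs["platform"] == "BB"]    # Blackboard math test attempts

indicators = pd.DataFrame({
    # Stats#hours: hours a student spent practicing in MSL
    "StatsHours": msl.groupby("student_id")["hours"].sum(),
    # StatsTestScore: average score on MSL practice questions, all chapters
    "StatsTestScore": msl.groupby("student_id")["score"].mean(),
    # Math#Tests: number of distinct practice tests tried in Blackboard
    "MathTests": bb.groupby("student_id")["test_id"].nunique(),
    # MathTestScore: average score of those practice tests
    "MathTestScore": bb.groupby("student_id")["score"].mean(),
})
print(indicators.describe())
```

The per-student aggregation makes the two platforms comparable despite their different logging: MSL logs time on task, whereas Blackboard logs only test attempts.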
6. EDUCATIONAL PRACTICE
The educational system in which students learn mathematics and statistics is best described as a 'blended system'. The main component is face-to-face: problem-based learning (PBL), in small groups (14 students), coached by a content expert tutor. Participation in these tutor groups is required, as it is for all courses based on the Maastricht PBL system. The online component of the blend, the use of the two test-directed learning environments, is optional. This component is optional on the one hand because that best fits the Maastricht educational model, which is student-directed and places the responsibility for making educational choices primarily with the student, and on the other hand because not all students will benefit equally from using these environments: given the diversity in prior knowledge, they are supposed to have less added value for students at the high end. However, the use of the test-directed environments is stimulated by making bonus points available for good performance in the quizzes. Quizzes are taken every two weeks and consist of items drawn from item pools very similar to those applied in the two digital practice platforms. We chose this particular constellation because it stimulates students with little prior knowledge to make intensive use of the test platforms: they realize that they fall behind other students going into the exam and need to achieve a good bonus score, both to compensate and to support their learning. The most direct way to do so is to practice frequently in the MSL and Blackboard environments. The student-directed character of the instructional model requires first and foremost adequate information for students, so that they are able to monitor their study progress and their topic mastery in both an absolute and a relative sense. That provision of relevant information starts on the first day of the course, when students take two entry tests, for mathematics and statistics, so as to make their positions clear. Feedback from the entry tests provides the first signals of the importance of using the test platforms. Next, the digital MSL and Blackboard environments take over the monitoring function: students can at any time see their progress in preparing for the next quiz, and get feedback on their performance in quizzes already taken and on the conduct of their practice sessions. The same information is also available to the teachers. Although the primary responsibility for directing the learning process lies with the student, the tutor acts complementary to that self-steering, especially in situations where the tutor considers a more intense use of the digital learning environments desirable, given the position of the student concerned. In this way, the application of learning analytics shapes the instructional situation.
7. IMPACT OF TEST-DIRECTED LEARNING
To explore the role of test-directed learning, we investigated the relationship between the intensity of use of the two test-directed platforms and academic performance. Two indicators measure academic performance: the exam, containing a mathematics and a statistics part (MathExam and StatExam), and three quizzes for each sub-topic, summed into MathQuiz and StatQuiz scores. Before examining the relationship between practice and performance, we corrected for differences in prior knowledge in two ways: by the level of prior mathematics education, and by the student's score on the math entry test. As far as prior education is concerned: high school systems distinguish a basic level preparing for the social sciences and an advanced level preparing for the sciences. An indicator variable is used for math at advanced level (MathAdv), which holds for one third of the students, with basic-level prior math schooling as the reference group. Moreover, the level of prior math knowledge is determined by the day-one entry or diagnostic test, whose score is labeled EntryTest; it focuses on the mastery of basic algebraic skills. One of the most straightforward ways to investigate the role of test-directed learning in achievement is to use regression analyses in which performance variables are explained by prior knowledge and by data on the intensity of using the practice tests. These regressions indicate that prior knowledge, both as type of prior schooling and as score on the entry test, explains part of the performance differences. But the most important predictor of course performance is the level that students attain in the test platforms. The number of tests students need to attain that level, or the time they need to practice to attain it, has a corrective effect, which is intuitive: knowledge achieved through testing helps, but if a student needs a lot of time or effort to reach that level, this signals more problematic learning. An alternative demonstration of the impact of using the test environments is obtained by dividing the population into students with high and low mastery in the entry test and high and low intensity of using the test platforms, and comparing exam scores and pass/fail outcomes. The fit resulting from these prediction models is very high. For example, in a median split on performance in the math platform, 92% of the students with the better practice performance pass, against 59% of the students with the lower practice performance.
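As an illustration of the type of analysis meant here, the sketch below runs such a regression and median split with pandas and statsmodels. It is a sketch under assumed names: the data file and the pass/fail column Passed are hypothetical, while the other variable names follow the labels introduced above.

```python
# Sketch: regression of exam performance on prior knowledge and practice
# intensity, followed by a median split on practice performance.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("course_data.csv")  # hypothetical merged dataset

# Performance explained by prior schooling (MathAdv), entry test score,
# attained practice level, and the effort needed to attain it.
model = smf.ols("MathExam ~ MathAdv + EntryTest + MathTestScore + MathTests",
                data=df).fit()
print(model.summary())

# Median split on practice performance in the math platform; the text
# reports pass rates of 92% versus 59% for the two halves.
df["HighPractice"] = df["MathTestScore"] >= df["MathTestScore"].median()
print(df.groupby("HighPractice")["Passed"].mean())
```

Including both the attained level and the effort variables in one model is what produces the corrective effect described above: the effort coefficients turn negative once the attained level is controlled for.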
7.1 LA: demographic characteristics
Having demonstrated that, on average, students benefit from the opportunity of test-directed learning, the question arises whether this is equally true for all students. This question calls for learning analytics applications that use data from sources other than the learning environments to identify the student groups most in need of these practice environments. In this section of our empirical study, we follow [1], [5] in investigating individual differences in the intensity of using the digital learning tools. As a first step, we make use of data from the regular student administration: whether or not a student attended Dutch high school, whether or not they had advanced prior math schooling, gender, nationality, and entry test score. Students with advanced prior schooling are better at math, without needing to practice more. They are not better at statistics, which corresponds to the fact that programs at the advanced level focus not on statistics but on abstract math. Dutch students make considerably less use of both test environments and hence achieve slightly lower scores: they benefit from a smoother transition than international students, but rely somewhat too much on it. Students with a high entry test score do better in mathematics, and a little better in statistics, in the test environments, without needing to practice more. Finally, there are modest gender effects, strongest in the intensity of practicing: female students are more active than male students.
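A minimal sketch of this first step, assuming the administrative variables have been merged into one table under hypothetical column names (DutchHS, Female), could look as follows.

```python
# Sketch: relating administrative background variables to the four
# usage indicators. Column names are assumed, not the actual records.
import pandas as pd

df = pd.read_csv("course_data.csv")  # hypothetical merged dataset

usage = ["StatsHours", "StatsTestScore", "MathTests", "MathTestScore"]

# Mean usage per group for the dichotomous background variables ...
for var in ["DutchHS", "MathAdv", "Female"]:
    print(df.groupby(var)[usage].mean(), "\n")

# ... and correlations with the continuous entry test score.
print(df[usage + ["EntryTest"]].corr().loc["EntryTest", usage])
```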
7.2 LA: cultural differences
The remaining data from the administrative student records concern the nationality of students. Because cultural differences in education have been given an increasingly important role, and because its strongly international composition makes the Maastricht student population very suitable for such analysis, the nationality data are converted into so-called national culture dimensions, based on the framework of Hofstede [6]. In that framework, there are a number of cultural dimensions that refer to values that are strongly nationally determined. In this study we use six of these dimensions: Power Distance, Individualism versus Collectivism, Masculinity versus Femininity, Uncertainty Avoidance, Long-Term versus Short-Term Orientation, and Indulgence versus Restraint. Scores for each of these national dimensions are assigned to the individual students. Correlating these scores with the four indicators of practice test intensity results in several significant effects, all in line with Hofstede's framework. The most significant effects are for students from a masculine culture, where mutual competition is an important driver in education, for students from a culture that values the long term over the short term, and, somewhat related, for cultures that value restraint rather than indulgence. Masculinity and indulgence have a stronger impact on the intensity of practicing than on its proceeds, in contrast to long-term orientation, which has about equal impact on both aspects. Uncertainty avoidance contributes, as expected, to practicing, albeit to a lesser extent and again primarily to the intensity of practicing rather than its outcome. Power distance and individualism play a less salient role in learning, as expected.
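The conversion amounts to a lookup from nationality to dimension scores, followed by a correlation table. The sketch below illustrates this; the score table is a stand-in with approximate values for three countries drawn from the public Hofstede data, not the full set used in the study, and the nationality coding is assumed.

```python
# Sketch: mapping nationality to Hofstede dimension scores and
# correlating them with practice intensity. Scores are illustrative.
import pandas as pd

hofstede = pd.DataFrame({
    "nationality": ["NL", "DE", "BE"],
    "PowerDistance": [38, 35, 65],
    "Individualism": [80, 67, 75],
    "Masculinity": [14, 66, 54],
    "UncertaintyAvoidance": [53, 65, 94],
    "LongTermOrientation": [67, 83, 82],
    "Indulgence": [68, 40, 57],
})

df = pd.read_csv("course_data.csv")  # hypothetical merged dataset
df = df.merge(hofstede, on="nationality", how="left")

usage = ["StatsHours", "StatsTestScore", "MathTests", "MathTestScore"]
dims = list(hofstede.columns.drop("nationality"))
print(df[dims + usage].corr().loc[dims, usage].round(2))
```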
7.3 LA: learning styles
Although the effects are smaller in size, learner data based on the learning style model of Vermunt [7] exhibit a characteristic pattern. Vermunt's model distinguishes learning strategies (deep, stepwise, and concrete ways of processing learning topics) and regulation strategies (self-regulation, external regulation, and lack of regulation of learning). Deep-learning students demonstrate no strong relationship with test-directed learning: they practice slightly less, but achieve slightly better scores. That is certainly not true for the stepwise-learning students. Especially for these students, the availability of practice tests appears meaningful: they practice more often and longer than other students and achieve, especially for statistics, better scores than the other students. These patterns repeat themselves in the regulation variables that characterize the two ways of learning: self-regulation being characteristic of deep learning, external regulation a feature of stepwise learning. Indeed, the students whose learning behavior has to be externally regulated are those who benefit most from the test environments: in both intensity and performance they surpass the other students. A notable (but weak) pattern is finally visible in learning behavior lacking regulation: these students tend to practice more often and longer than the other students, but achieve lower performance levels in both sub-topics. Apparently, even the structure of the two test environments cannot compensate for the lack of regulation of these students.
7.4 LA: (mal)adaptive thoughts & behaviors
Recent Anglo-Saxon literature on academic achievement and dropout assigns an increasingly dominant role to the theoretical model of Andrew Martin: the 'Motivation and Engagement Wheel' [8]. That model includes both behaviors and thoughts, or cognitions, that play a role in learning. Both are divided into adaptive and mal-adaptive or obstructive forms. The resulting four quadrants are: adaptive thoughts and adaptive behaviors (together the 'boosters'), mal-adaptive behaviors (the 'guzzlers'), and obstructive thoughts (the 'mufflers'). In Figure 1, two panels depict the relationships of adaptive and mal-adaptive thoughts and behaviors with the usage data.

Figure 1. Role of (mal)adaptive thoughts and behaviors

The first panel documents the adaptive thoughts Self-belief, Value of school and Learning focus, and the adaptive behaviors Planning, Study management and Perseverance. All adaptive thoughts and all adaptive behaviors have a positive impact on the willingness of students to use the test environments, where the effect of the adaptive behaviors dominates that of the cognitions. The mal-adaptive variables show a less uniform picture. Because gender effects play a prominent role here, the dummy variable female/male is added to the four intensity-of-use variables in the panel. From these additional correlations we conclude that mal-adaptivity manifests itself differently in female and male students: in female students primarily in the form of limiting thoughts, especially fear and uncertainty; in male students primarily as mal-adaptive behaviors: self-handicapping and disengagement. That difference has a significant impact on learning. Mal-adaptive behaviors negatively impact the use of the test environments: all the correlations, both for use intensity and for performance, are negative. The effect of inhibiting thoughts, however, is different: uncertainty and anxiety have a stimulating rather than an inhibitory effect on the use of the test environments. The combination of both effects provides a partial explanation for the observed gender effects in the use of the test environments.
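The correlation panel behind Figure 1 can be reproduced along the lines of the sketch below. The scale column names follow Martin's instrument but are assumed, as is the merged data file; the gender dummy is added to the correlation targets, as described above.

```python
# Sketch: Martin's adaptive and mal-adaptive scales correlated with the
# four usage indicators plus the gender dummy. Column names are assumed.
import pandas as pd

df = pd.read_csv("course_data.csv")  # hypothetical merged dataset

usage = ["StatsHours", "StatsTestScore", "MathTests", "MathTestScore"]
adaptive = ["SelfBelief", "ValueOfSchool", "LearningFocus",
            "Planning", "StudyManagement", "Perseverance"]
maladaptive = ["Anxiety", "UncertainControl", "FailureAvoidance",
               "SelfHandicapping", "Disengagement"]

targets = usage + ["Female"]  # gender dummy added to the panel
panel = (df[adaptive + maladaptive + targets].corr()
         .loc[adaptive + maladaptive, targets])
print(panel.round(2))
```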
7.5 LA: learning emotions
Also of relatively recent date is research into the role of emotions in learning. Leading in this research is Pekrun's control-value theory of learning emotions [9]. That theory indicates that the emotions that arise during learning are influenced by the feeling of being 'in control' and of doing something worthwhile. Pekrun's model distinguishes several emotions; for this study we selected the emotions that contribute most strongly to student success or failure: the negative emotions Anxiety, Boredom and Hopelessness, and the positive emotion Enjoyment. Emotions are measured context-specifically; for example, Anxiety is defined in the context of learning mathematics. Learning emotions are typically measured in the middle of the course, unlike all other instruments, which are administered at the beginning of the course. The correlations can thus not be interpreted within a cause-effect framework, as we can for most other variables. The most obvious interpretation is that of mutual influence: emotions will impact the use of the test environments, but conversely, the experience gained in practicing, and ideally the performance in practicing, will also shape learning emotions. The associations we find all have the predicted directions: negative emotions demonstrate negative relationships with the use of the test environments; the positive emotion, and feeling in control, demonstrate positive relationships. It is striking that performance in the test environment, especially for mathematics, is much more strongly associated with learning emotions than the intensity of practicing in the test environments.
8. CONCLUSIONS
The intensive use of practice test environments makes a major difference for academic performance. But in a student-centered curriculum, it is not sufficient that teachers are convinced of the benefits of test-based learning in digital learning environments. Students regulate their own learning process, making their own choices about how intensively they will practice, and are therefore the ones who need to become convinced of the usefulness of these digital tools. Here learning analytics can play an important role: it provides a multitude of information that students can use to adapt their personal learning environment as much as possible to their own strengths and weaknesses. For example, in our experiment the students were informed about their personal learning dispositions, attitudes and values, together with information on how these interact with the choices they can make in composing their learning blend. At the same time, the multitude of information available from learning analytics is also a problem: that information requires individual processing. Some information is more important for one student than for another, so a personal selection of information needs to take place. Learning analytics deployed within a system of student-centered education thus has its own challenges. The aim of this contribution extends beyond demonstrating the practical importance of Shum and Crick's learning analytics infrastructure. Additionally, this research provides many clues as to what individualized information feedback could look like. In the learning blend described in this case study, the face-to-face component, PBL, constitutes the main instructional method. The digital component is intended as a supplementary learning tool, primarily for students for whom the transition from secondary to university education entails above-average hurdles. Part of these problems are of a cognitive type: for example, international students who never received statistics education as part of their high school mathematics program, or other freshmen who may have been educated in certain topics without achieving the required proficiency levels. For these kinds of cognitive deficiencies, the digital test-directed environments proved to be an effective tool to supplement PBL. But this applies not only to adjustment problems resulting from knowledge gaps. Students encounter several types of adjustment problems for which the digital tools appear to be functional. The learning dispositions addressed above are a good example: student-centered education in fact presupposes deep, self-regulated learning, whereas many students have little experience with this and feel on more familiar ground with stepwise, externally regulated learning. As the analyses demonstrate, the digital test environments help in this transformation. They also make clear that the test environments are instrumental for students with non-adaptive cognitions about learning mathematics and statistics, such as anxiety: an intuitive outcome, since the individual practice sessions with computerized feedback will, for some students, be a safer learning environment than the PBL tutorial group sessions. Finally, the learning analytics outcomes also make clear where the limits of the potential of digital practice lie: with students with non-adaptive behaviors and negative learning emotions. If learning involves boredom and provokes self-handicapping, even the challenges of test-based learning will fall short.
9. ACKNOWLEDGEMENTS The ONBETWIST project has been financed by SURFfoundation as part of the Testing and Test Driven Learning program [2].
10. REFERENCES
[1] Buckingham Shum, S., & Deakin Crick, R. (2012). Learning Dispositions and Transferable Competencies: Pedagogy, Modelling and Learning Analytics. Proceedings of LAK2012: 2nd International Conference on Learning Analytics & Knowledge, pp. 92-101. New York: ACM Press.
[2] SURF (2010). Programma Toetsing en Toetsgestuurd Leren [Program Testing and Test-Driven Learning]. http://www.surf.nl/nl/themas/innovatieinonderwijs/toetsen/Documents/Projectplan%20PROGRAMMA%20TOETSING%20EN%20TOETSGESTUURD%20LEREN.pdf
[3] Tempelaar, D. T., Kuperus, B., Cuypers, H., Van der Kooij, H., Van de Vrie, E. M., & Heck, A. (2012). The Role of Digital, Formative Testing in e-Learning for Mathematics: A Case Study in the Netherlands. In: Mathematical e-learning [online dossier]. Universities and Knowledge Society Journal (RUSC), 9(1). UoC.
[4] Verbert, K., Manouselis, N., Drachsler, H., & Duval, E. (2012). Dataset-Driven Research to Support Learning and Knowledge Analytics. Educational Technology & Society, 15(3), 133-148.
[5] Whitmer, J., Fernandes, K., & Allen, W. R. (2012). Analytics in Progress: Technology Use, Student Characteristics, and Student Achievement. EDUCAUSE Review Online, July.
[6] Hofstede, G., Hofstede, G. J., & Minkov, M. (2010). Cultures and Organizations: Software of the Mind. Revised and expanded third edition. Maidenhead: McGraw-Hill.
[7] Vermunt, J. D. (1996). Leerstijlen en sturen van leerprocessen in het Hoger Onderwijs [Learning styles and the regulation of learning processes in higher education]. Amsterdam/Lisse: Swets & Zeitlinger.
[8] Martin, A. J. (2007). Examining a multidimensional model of student motivation and engagement using a construct validation approach. British Journal of Educational Psychology, 77, 413-440.
[9] Pekrun, R. (2006). The control-value theory of achievement emotions: Assumptions, corollaries, and implications for educational research and practice. Educational Psychology Review, 18, 315-341.