Using Stroop Task to Assess Cognitive Load

Gwizdka, J. (2010b). Using Stroop task to assess cognitive load. Proceedings of the 2010 European Conference on Cognitive Ergonomics (ECCE'2010).

Jacek Gwizdka
School of Communication and Information
Rutgers, The State University of New Jersey
4 Huntington Street, New Brunswick, NJ 08901, USA
[email protected]

ABSTRACT

Motivation – Assessment of cognitive load on user tasks is useful for characterizing user interfaces and tasks with respect to their demands on the user’s mental effort.

Research approach – We conducted a controlled experiment with 48 subjects. The primary task involved information search. A Stroop-like task was used as a secondary task. Reaction time to the secondary task events was used to assess cognitive load.

Findings/Design – Reaction time on the secondary task differentiated between the primary task stages and the user interfaces. The higher cognitive load component of the secondary task performance discriminated primary task stages, while the lower cognitive load component discriminated user interfaces.

Research limitations/Implications – Results presented in this short paper were an unexpected finding. They are thus preliminary and need to be confirmed in further experiments.

Originality/Value – This finding promises a method that separates extraneous cognitive load from intrinsic load.

Take away message – A secondary task can be designed to yield separate assessments of intrinsic and extraneous load.

Keywords

Cognitive load; Cognitive load theory; Dual-task paradigm; Experimentation; Measurement; User models.

INTRODUCTION

User task, system, and individual characteristics influence the level of difficulty experienced by a user. One kind of difficulty is related to mental, or cognitive, requirements that are imposed by the interactive system or the task itself. Understanding what contributes to a user’s cognitive load on tasks is crucial to understanding the interaction process and to identifying task types and system features that impose increased levels of load on users (Back & Oppenheim, 2004). As new user interfaces and interactive features are introduced, we need to understand how the new functionality influences user performance and the system's usability, usefulness, and acceptance.

BACKGROUND

In this paper, we define cognitive load as the mental effort required of an individual to complete their task using a given interactive system. Hence, at any point in task performance, cognitive load is relative to the user, the task being completed, and the system employed to accomplish the task. Cognitive load is of interest to us for two reasons. First, cognitive load can be used to characterize interfaces with respect to cognitive cost. Second, it can be used to characterize tasks with respect to the required mental effort. Using the terminology from Cognitive Load Theory (Chandler & Sweller, 1991), the first case deals mainly with extraneous load, that is, with the demands imposed by the user interface. The second case deals mainly with intrinsic load, that is, with task demands on the user’s cognitive resources.

Measurement of Cognitive Load

A common classification of cognitive load assessment techniques divides them into performance, subjective, and physiological measures (Schnotz & Kürschner, 2007; Cegarra & Chevalier, 2008). Performance-based techniques include assessment of performance on a secondary task (Brünken et al., 2002). Cognitive load measures derived from performance on a secondary task are grounded in the notion of limited cognitive resources. Dual-task techniques are low cost and enable real-time, on-task data collection. The dual-task technique measures instantaneous load (Xie & Salvendy, 2000), which is typically used to calculate average values during a unit of analysis (i.e., during a whole task or a subtask). The experiment presented here used a subtask as the unit of analysis. We refer to subtasks as task stages.

METHODOLOGY

The methodology section presents only those aspects of our experiment that are directly relevant to this paper. For further details, the reader is referred to (Gwizdka & Lopatovska, 2009).

Forty-eight subjects participated in a controlled web-based search. User tasks were designed to differ in terms of difficulty. Easier tasks involved finding one or more facts; more difficult tasks involved information gathering. The tasks were also divided into three categories based on the structure of the underlying information need. Based on these characteristics, the tasks were categorized into three levels of objective difficulty. During the course of each study session, each participant performed a set of six tasks of differing type and structure. The searches were performed on Wikipedia by using one of two search engines: Google (UI1) or Alvis (UI2, Figure 1) (Buntine, 2005). UI1 displayed search results in a list, while UI2 presented search results along with their categories and allowed for category-based browsing of results. The order of tasks was partially balanced with respect to the objective task difficulty to obtain all possible combinations of monotonically increasing and decreasing objective difficulty within the groups of three tasks. This yielded four task rotations that were repeated for two orders of user interfaces.
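The counterbalancing arithmetic above can be made concrete with a small sketch. It assumes, as one reading of the text, that the six tasks form two groups of three and that difficulty either rises or falls monotonically within each group. The names `DIFFICULTY_ORDERS`, `UI_ORDERS`, `task_rotations`, and `conditions` are hypothetical and do not come from the study software.

```python
from itertools import product

# Assumption: difficulty levels 1-3 are ordered either upward or downward
# within each group of three tasks.
DIFFICULTY_ORDERS = {
    "increasing": (1, 2, 3),
    "decreasing": (3, 2, 1),
}

# Assumption: the two user-interface orders are "Google first" and "Alvis first".
UI_ORDERS = [("UI1", "UI2"), ("UI2", "UI1")]


def task_rotations():
    """Return every pairing of difficulty directions for the two task groups."""
    return [
        DIFFICULTY_ORDERS[first] + DIFFICULTY_ORDERS[second]
        for first, second in product(DIFFICULTY_ORDERS, repeat=2)
    ]


def conditions():
    """Cross the task rotations with the two user-interface orders."""
    return list(product(task_rotations(), UI_ORDERS))


if __name__ == "__main__":
    for rotation, ui_order in conditions():
        print(rotation, ui_order)
```

Under these assumptions there are four rotations and eight rotation-by-interface-order conditions. How the 48 participants were allocated across them is not stated here.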

Figure 1. Alvis search interface (Buntine, 2005).

A secondary task (DT) was introduced to obtain indirect objective measures of the user’s cognitive load on the primary task. The secondary task was based on the Stroop effect (Stroop, 1935). In our implementation of the secondary task, a small pop-up window (Figure 2) was displayed at a fixed location on the computer screen at random time intervals (15-29 seconds) and for a random period of time (5-9 seconds). The length of a cycle was thus between 20 and 38 seconds. The times were selected to be short enough to affect performance on the search tasks while still allowing for close-to-normal performance. The pop-up contained a word with the name of a colour. The colour of the word’s font either matched or did not match the name of the colour. Participants were asked to click on the pop-up as soon as they noticed it. The click was performed either with the right mouse button (match between the colour name and the font colour) or with the left mouse button (no-match). The participants were asked about their dominant hand and were allowed to position the mouse and the keyboard according to their preferences. The pop-up window disappeared after a random period of time or as soon as it was clicked. In the case of a colour match, the cognitive load imposed on a person is less related to semantic processing. In the case of a no-match, the cognitive load imposed on a person is higher, since more semantic processing is required. In both cases, the motor effort required to respond to the secondary task is similar. Performance on the two modes of the Stroop task can be considered separately. Our secondary task involved motor action, as well as verbal/semantic processing (phonological loop in working memory). The modalities of the primary task and the secondary task overlapped, and one could reasonably have assumed that different levels of cognitive effort on the primary task would be reflected in differences in performance on the secondary task, and in particular in the reaction time to the secondary task. To our knowledge, the Stroop task has not previously been used as the secondary task in studies of interactive systems.

Figure 2. Secondary task pop-up window.

Data Collection and Measures

User interaction was recorded by Morae screen cam software (http://www.techsmith.com/morae.asp) and by the secondary task software. The interaction logs that were used in the analysis presented in this paper included time-stamped sequences of visited web pages, keyboard use, and mouse clicks. The latter were recorded for the primary and the secondary task. We used a semi-automatic process to segment user interaction data into task stages. The following four task stages were distinguished: query entry (Q), search engine results examination (L), visiting a content web page (C), and saving a result (B). The latter involved bookmarking and tagging a web page. Task stages differed in their demands on cognitive processing; thus, we expected to find differences in the performance on the secondary task. We assessed average cognitive load during each task stage by calculating the average reaction time to the secondary task events (RT). We performed this calculation for all secondary task events (RTall), and separately for colour-match (RTmatch) and no-colour-match (RTnmatch) cases. The three obtained variables were used as three separate dependent variables.

RESULTS

As expected, we found that mental effort varied across task stages. This main, expected result will be reported elsewhere. In this paper, we focus on an unexpected finding that leads to the possibility of separate measurement of intrinsic and extraneous load. We performed a series of univariate analyses of variance with the task stage, user interface, and objective task difficulty as controlled variables and the mean reaction times to the secondary task as the dependent variable(s). We found several significant effects on reaction time (Table 1 and Table 2).
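The three dependent variables used in these analyses (RTall, RTmatch, RTnmatch, averaged per task stage) can be illustrated with a minimal sketch. The record layout, the sample values, and the function name `mean_rt_by_stage` are hypothetical. The sketch also assumes that only answered pop-ups contribute a reaction time, which the text does not specify.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical secondary-task events:
# (task stage, colour name shown, font colour, reaction time in seconds).
EVENTS = [
    ("Q", "red", "red", 1.25),
    ("Q", "red", "blue", 1.75),
    ("L", "green", "green", 1.0),
    ("L", "blue", "green", 2.0),
    ("L", "blue", "blue", 1.5),
]


def mean_rt_by_stage(events):
    """Average reaction times per task stage, overall and split by match type."""
    buckets = defaultdict(lambda: {"RTall": [], "RTmatch": [], "RTnmatch": []})
    for stage, word, font_colour, rt in events:
        # A "match" event is one where the colour name equals the font colour.
        is_match = word == font_colour
        buckets[stage]["RTall"].append(rt)
        buckets[stage]["RTmatch" if is_match else "RTnmatch"].append(rt)
    # Stages with no events of a given kind get None rather than a mean.
    return {
        stage: {name: (mean(values) if values else None)
                for name, values in groups.items()}
        for stage, groups in buckets.items()
    }


if __name__ == "__main__":
    for stage, averages in mean_rt_by_stage(EVENTS).items():
        print(stage, averages)
```

Keeping the match and no-match averages separate is what allows the two Stroop modes to be analysed as distinct dependent variables, as the paper does.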

Table 1. Effects of user interface and task stage on average reaction times per task stage.

RT | Task Stage | UI | Task Stage * UI | Obj. Task difficulty
all | F(3,839)=5.97 p