
Cognitive Demands of Collision Avoidance in Simulated Ship Control

G. Robert J. Hockey, University of Leeds, Leeds, U.K., Alex Healey and Martin Crawshaw, University of Hull, Hull, U.K., David G. Wastell, University of Manchester Institute of Science and Technology, Manchester, U.K., and Jürgen Sauer, Darmstadt University of Technology, Darmstadt, Germany

The study examines the cognitive demands of collision avoidance under a range of maritime scenarios. Operators used a PC-based radar simulator to navigate set courses over 100 6-min trials varying in collision threat and traffic density. Corrective maneuvers were made through the application of standard navigation rules and by using two decision aids (target acquisition and test maneuver). Results showed widespread effects of collision threat in terms of decision aid use, subjective workload, and secondary task performance. Most notably, demand increased markedly over the course of emergency trials, in which collision threat resulted from rule violation by target vessels. The findings are discussed in terms of the comparison between predictable demands (requiring standard course changes) and those involving uncertainty about the other's intentions (involving more intensive monitoring and forced delays in corrective action). The study has relevance for the design of collision avoidance systems, specifically for the use of ecological displays.

INTRODUCTION

There is now a substantial body of research on patterns of workload and decrement in complex real-life tasks, such as driving, aviation, process control, and other advanced industrial systems (see Wickens & Hollands, 1999). However, with a few exceptions (e.g., Schuffel, Boer, & van Breda, 1989), no systematic studies have been carried out within the maritime domain. This is particularly relevant at the present time, in view of both the considerable increase in the size and density of maritime traffic and developments toward the integration of ship control functions into a highly automated ship's operation center. With the growth of traffic congestion, navigational demands have necessarily increased, as has the need for effective navigational decision aids (Grabowski, 1990). Successful implementation of such systems will depend on knowledge of how navigational demands affect workload and performance.

Mariners are exposed to an increasing number and diversity of supervisory and decision tasks (Dyer-Smith, 1992; Grocott, 1992; Lee & Sanquist, 1996), needing to divide attention between primary navigation displays and secondary tasks such as engine and cargo functions. Efficient scheduling of these tasks may be critical for situation appraisal and reducing the potential for accidents (Sanderson, 1989). Accident statistics and the analysis of critical incidents and near misses in the maritime field provide strong evidence for the role of high workload. Although several simulation studies of navigation workload have been carried out, few have considered performance under stressful, uncertain, or emergency conditions. One study using a full mission simulator (Sablowski, 1989) found no effects on navigation of various emergencies (either a collision threat or rudder failure during the final phase of a 2-hr scenario), although ratings of workload were higher for emergency scenarios.

Address correspondence to G. R. J. Hockey, School of Psychology, University of Leeds, LS2 9JT, UK; g.r.j.hockey@leeds.ac.uk. HUMAN FACTORS, Vol. 45, No. 2, Summer 2003, pp. 252–265. Copyright © 2003, Human Factors and Ergonomics Society. All rights reserved.

It has been reported that approximately 90% of all marine accidents occur in confined waters such as channels, fairways, and inshore traffic zones (Cockroft, 1984). These are complex navigational environments in which threats and uncertainty are high and options for action severely constrained. One issue contributing to such problems is likely to be poor technological design. Vicente and Rasmussen (1992) noted that process operators sometimes act as if displays completely describe underlying processes, rather than providing only a partial description, and that this sometimes leads to unanticipated variations in the workload and behavior of human operators. Although periods of low workload may be less demanding, high-workload activities may be more difficult to manage (Raby & Lee, 2001). Lee and Sanquist (1996) observed that although a collision avoidance system was able to monitor increased numbers of vessels and reduce computational load on the operator, it also increased the need for interpretative skills and knowledge of various predictor functions.

To prepare for the study, we conducted a series of interviews with experienced mariners, using the critical incident technique (Kirwan & Ainsworth, 1992) to inform our understanding of the sources of cognitive demand in collision avoidance (CA) situations. These showed that problems in interpreting the intentions, or predicting the actions, of other vessels were among the most common sources of near-miss incidents. This is well established as a problem for mariners in potential collision encounters (Young & Bell, 1993) and is recognized as a major source of demand in driving (Zeitlin, 1995). In both cases, effects of uncertainty are exacerbated by the occurrence of rule violations by others using the shared traffic space. Perrow (1984) reported data showing that 56% of major maritime collisions include violations of rules of the road as a contributory cause.

Access to information about the intentions of other vessels may be the most important requirement for an operator's effective appraisal of situations (Hammer & Hara, 1990) and remains a major obstacle to the introduction of expert systems (Hobday, Rhoden, & Jones, 1993). In nautical environments, uncertainty is affected by both the physical environment (wind, currents, sandbanks) and the navigation behavior of other vessels. Collision encounters are a particularly interesting source of information about maritime demands. Although decisions about what action to take are based on standard, internationally agreed rules of the road (International Maritime Organisation, 1972), the dynamic and interpretative nature of the task means that uncertainty and unpredictability are common features in the erroneous or risky judgments made in many shipping accidents (James, 1994; Perrow, 1984; Wagenaar & Groenweg, 1987).

Our use of uncertainty within CA contexts is closely related to Woods's (1988) analysis. Woods defined uncertainty as an intrinsic feature of complex systems, associated with unavailability or ambiguity of data and reduced predictability of future states. In the application of Woods's analysis to rule violation by a target vessel, we argue that cognitive demands are likely to be increased because of the loss of reliable information for implementing evasive actions as the scenario develops. In Hutchins's (1995) account of navigational computation, uncertainty would therefore be expected to prolong and disturb the familiar "fix cycle" (p. 133). This will have several effects on navigators, given that they (a) need to consider a greater range of potential interpretations of the situation; (b) cannot complete necessary information-processing activities at appropriate times; and (c) are unable to implement routine procedures and familiar responses when actions do need to be carried out. There is also a clear link with Rasmussen's (1983) analysis, because uncertainty can be considered to push the operator toward the use of knowledge-based processing, as opposed to the rule-based behavior associated with standard CA encounters. Tattersall and Hockey (1995) have shown that this mode of problem solving made greater demands on in-air engineers, as indexed by suppression of the 0.10-Hz component of heart rate variability.
In addition, the higher level of mental load must be sustained for prolonged periods because actions are delayed by the inability to resolve the uncertainty.

The present study was designed to assess the cognitive demands of CA at sea using standard workload assessment methodology. A specific aim was to compare the two major sources of CA demand, together with those of normal (noncollision-course) encounters. Routine CA encounters have predictable course-change requirements, following rules-of-the-road procedures. By contrast, emergency encounters, which are brought about by rule violation by the target vessel, increase uncertainty and make course changes unpredictable. We compared demands under uncertainty with those imposed by predictable course-change requirements by examining the time course of demand over trials. Action implications of emergency scenarios were not resolved until late in the trial, so secondary task decrement should increase over the course of the trial, as compared with routine trials. The primary task requirement to make corrective maneuvers under threat conditions should be reflected in the increased use of navigation aiding devices, particularly test maneuvers. Because of the possibility of making early, controlled course corrections, routine scenarios should show sustained use throughout the trial, whereas emergency scenarios, in which reliable information was not available until later, were expected to show increasing use as the trial progressed.

METHOD

Participants and Design

The 12 participants (10 men, 2 women) were selected from a panel recruited for a continuing program on performance in complex systems. Their ages ranged from 20 to 31 (median = 24) years. All were either graduates or students in science with extensive experience with computing and Windows-based applications. None had previous maritime experience of controlling large vessels. They were paid £100 ($150 U.S.) for their participation in the experiment.

Participants were tested under all conditions of a 3 × 2 × 3 × 3 repeated-measures design, in which CA demand was manipulated by three separate factors. Collision threat (three levels) was defined as increasing risk of collision with an approaching (target) vessel if no action were taken. Target behavior (two levels) referred to whether target vessels remained on a fixed course or altered during the trial. The third factor, traffic (three levels), was defined by the number of distracters (vessels other than own ship or target) present on the display (zero, one, or three). A fourth factor, phase (three levels), was defined for analytic purposes by dividing the trial period into three successive phases.

Navigational Scenarios

A set of six generic scenarios was defined by crossing collision threat and target behavior (Table 1). Three levels of collision threat were included: normal encounters, which did not require any action by own ship (OS), and encounters requiring either a routine or an emergency response. Target behavior involved targets either remaining on a fixed course throughout a trial or altering at some point, changing the characteristics of the developing situation. In normal encounters the target was shown to be passing at a safe distance of at least 1 nautical mile (nm), equivalent to 1852 m. Routine encounters displayed the target approaching OS from the right (starboard) on a direct collision course. Because such targets have right of way under maritime regulations, it was the responsibility of OS to take evasive action by implementing appropriate course changes.

TABLE 1: The Six Generic Scenarios

Encounter            Collision Threat   Target Behavior   Target Approach From     No. of Trials (a)
Normal/fixed         Normal             Fixed             Port or starboard        36
Normal/altering      Normal             Alters            Port (gives way)         18
Routine/fixed        Routine            Fixed             Starboard (stands on)    18
Routine/altering     Routine            Alters            Starboard                 6
Emergency/fixed      Emergency          Fixed             Port (stands on)          6
Emergency/altering   Emergency          Alters            Port or starboard        12

(a) Trials add up to 96 rather than 100. The remaining 4 trials involved ultrasafe scenarios, in which the target passed well ahead. These were not included in the analysis.

In routine/altering, targets made an (unnecessary) alteration, as if to give way, although OS was still strictly required to make a routine CA maneuver. Emergency scenarios were expected to increase monitoring demands because of uncertainty about intentions of target vessels. In these, the target violated rules of the road, either failing to give way as required (emergency/fixed) or altering onto a collision course (emergency/altering), in both cases unexpectedly generating a potential collision situation. The change from a seemingly safe situation (equivalent to normal scenarios) to one in which an emergency response was required was initiated during the 3rd min of a 6-min trial and took 30 to 60 s to become evident. The resulting emergency collision threat meant that participants were put under time pressure to take evasive action, but only after determining that the target would not revert to normal rule use.

Each of these six generic scenarios was presented at three levels of traffic (zero, one, or three distracters) to give a set of 18 distinctive scenario types. In order to explore the time course of effects of navigational demands, the 6-min period was analyzed in three successive phases. Transient effects associated with the beginning and end of trials were avoided by omitting the initial and final 30 s (when few relevant actions occurred), giving three 100-s phases: 30 to 130 s, 130 to 230 s, and 230 to 330 s.

Navigational and Collision Avoidance Goals

Participants were required to avoid collisions and, if possible, to stay on a planned course that was always indicated as due north on the radar. The CA goal required operators to pass targets and distracters at a safe distance, which was based on approximations to standards maintained by actual mariners and assessed by measuring the closest point of approach (CPA) during a trial between OS and target or distracter (whichever was closer).
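
The CPA measure follows from simple relative-motion geometry. The sketch below is illustrative only: the article does not give the simulator's own formula, so the function name `cpa_tcpa`, the coordinate convention (east, north in nautical miles; velocities in knots), and the assumption that both vessels hold a constant course and speed are all assumptions made here for illustration.

```python
import math


def cpa_tcpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """Return (cpa_nm, tcpa_hours) for two vessels holding course and speed.

    Hypothetical helper, not taken from the article. Positions are
    (east, north) in nautical miles; velocities are (east, north) in knots.
    """
    # Position and velocity of the target relative to own ship.
    rx, ry = tgt_pos[0] - own_pos[0], tgt_pos[1] - own_pos[1]
    vx, vy = tgt_vel[0] - own_vel[0], tgt_vel[1] - own_vel[1]

    rel_speed_sq = vx * vx + vy * vy
    if rel_speed_sq == 0.0:
        # No relative motion: the separation never changes.
        return math.hypot(rx, ry), 0.0

    # Time at which the relative distance is smallest; clamp to 0 when
    # the closest approach already lies in the past.
    tcpa = max(0.0, -(rx * vx + ry * vy) / rel_speed_sq)

    # Separation at that time.
    cpa = math.hypot(rx + vx * tcpa, ry + vy * tcpa)
    return cpa, tcpa


# Own ship heading due north at 12 kn; target 3 nm east and 4 nm north,
# heading due west at 12 kn (values chosen only for illustration).
cpa, tcpa = cpa_tcpa((0.0, 0.0), (0.0, 12.0), (3.0, 4.0), (-12.0, 0.0))
print(f"CPA = {cpa:.2f} nm, TCPA = {tcpa * 60:.1f} min")
```

In this sketch the relative velocity vector plays the role of the vector drawn on a relative-motion display: when it points straight at own ship, the computed CPA is zero.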
A CPA of 1 nm was defined as a minimum for good practice. A collision was defined operationally as CPA < 0.5 nm, and a near miss was defined as CPA < 0.8 nm. The track-keeping (TK) objective was to remain for as long as possible within 1 nm of the planned track. Avoiding a collision was emphasized as having priority over track keeping in those cases where the two goals conflicted. However, the TK goal meant that operators should avoid making unnecessarily large course alterations to avoid other vessels.

Design of Trials

The duration of each scenario was short (6 min) so as to allow for multiple replications of trial types and to increase measurement reliability for the different encounters. It also allowed us to provide participants with a more representative sampling of the relative incidence of rule-compliance and violation actions. Shorter trials meant that the starting position and movement of both targets and other vessels had to be tightly controlled to provide time for collision situations to develop. To facilitate this, vessel speeds were set at a level about 30% higher than those typically found at sea. (This was established in pilot tests as not changing behavior significantly.)

The number of trials representing each of the 18 scenarios was designed to reflect actual maritime experience, support expectations of event probabilities, and generate sufficient reliable data. For example, to be consistent with the exceptional nature of transgressions of navigation regulations, the probability of a violation by target vessels was kept fairly low (p = .18). This represents a compromise between the even lower level of violations in real-life encounters and the minimum number of observations needed to obtain reliable measurements on the least frequent scenarios (routine/altering and emergency/fixed, both 6 trials out of 100). The probabilities and frequencies of trials representing the six generic encounter types are given in Table 1, with trials in all cases equally divided among the three levels of traffic. In addition to these 18 scenarios, an ultrasafe scenario was included (target passing well ahead, p = .04) in which there was no possibility of collision; this allowed us to check baseline performance in cases in which collision threat was absent.
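
The trial composition described above can be checked with a few lines of arithmetic. The sketch below simply hard-codes the frequencies reported in Table 1; the variable names are hypothetical and the code is a consistency check on the reported design, not part of the original study materials.

```python
from fractions import Fraction

# Trial counts per generic encounter, as listed in Table 1.
TRIALS = {
    ("normal", "fixed"): 36,
    ("normal", "altering"): 18,
    ("routine", "fixed"): 18,
    ("routine", "altering"): 6,
    ("emergency", "fixed"): 6,
    ("emergency", "altering"): 12,
}
ULTRASAFE_TRIALS = 4        # target passes well ahead; excluded from analysis
TRAFFIC_LEVELS = (0, 1, 3)  # number of distracter vessels on the display

total_trials = sum(TRIALS.values()) + ULTRASAFE_TRIALS

# Rule violations by the target occur only in emergency encounters.
violation_trials = sum(
    n for (threat, _behavior), n in TRIALS.items() if threat == "emergency"
)
p_violation = Fraction(violation_trials, total_trials)
p_ultrasafe = Fraction(ULTRASAFE_TRIALS, total_trials)

# Trials of each encounter type are divided equally among traffic levels.
per_traffic_level = {
    encounter: n // len(TRAFFIC_LEVELS) for encounter, n in TRIALS.items()
}
scenario_types = len(TRIALS) * len(TRAFFIC_LEVELS)

print(total_trials, float(p_violation), float(p_ultrasafe), scenario_types)
```

Under these counts the least frequent encounters (routine/altering and emergency/fixed) contribute only two trials per traffic level for each participant, which is the reliability constraint the text refers to.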
Data from these trials showed no errors and very low levels of workload and are not included in the analysis.

Simulation Display

The simulation of a simplified navigation control station, judged to be realistic by qualified navigation officers, was based on the Automatic Radar Plotting Aid (ARPA), which is commonly used on merchant and other vessels. It was programmed in Visual Basic 3.0 and designed to run on a PC with a minimum specification of Windows 3.1, Intel 80486 processor, and 17-inch (43.2-cm) VGA color monitor. Scenarios were presented on a highly simplified relative-motion radarscope, with OS represented at the center of the screen. All encounters took place under open sea conditions, with no navigational constraints apart from other vessels, which were displayed as vectors showing relative direction and speed. On a relative-motion display, any other vessel that displays a vector pointing directly at OS is on a collision course (Figure 1). The planned track was superimposed on the radarscope as a dotted line and was oriented directly north. Range rings were set at 1-nm intervals, providing a rough guide to distance. In the scenario represented in Figure 1, the target is shown approaching from port at a range of 3 nm and on a collision course (as in normal/altering or emergency/fixed). A distracter is also represented, approaching from starboard at 3.6 nm and predicted to pass ahead at a range of approximately 1 nm.

Figure 1. Radar display showing events near the beginning of a trial, with OS at the center of the relative motion display and a target vessel at 3 nm and on a collision course.

Both targets and distracters were programmed to appear on the radarscope and behave as vessels at sea. Their initial position was always ahead (north) and to the side of OS, and they would move across the screen at a preprogrammed speed and heading. Starting parameters for position, speed, and heading depended on encounter type but were otherwise varied randomly across trials. They were calculated to put the target either on a direct collision course (<0.25 nm) or on a passing course (1.0–1.4 nm).

Decision aids. Operators had access to two decision aids for navigation and collision avoidance. A rough assessment of distances could be made using the range rings, but an electronic bearing line (EBL) facility could also be used to display the range and bearing of any point selected. The control panel at the right of the radarscope included facilities for changing the heading and speed of OS (although the speed option was rarely used in practice by our participants). Digital readouts of the heading, speed, range, and bearing of any vessel, as well as predicted closest point of approach (CPA) and time remaining to this position (TCPA), could be displayed on a panel above the radarscope by acquiring (clicking on) a chosen vessel. In Figure 1, the vessel approaching from port has been acquired. The predicted CPA enables an operator immediately to assess the risk of collision, whereas predicted TCPA indicates how much time is available to make any necessary course change. In addition, a predictor facility mode (test maneuver) allowed the operator to model the consequences of a planned maneuver before making it. When the test maneuver mode was toggled on, the OS vector could be set to the heading to be tested, and the predicted relative heading vectors of all other vessels would be displayed.

Secondary Task

In addition to carrying out the primary navigation/collision avoidance task, operators were required to monitor a separate display in order to maintain engine oil temperature within tolerance limits. The requirement to switch between displays allowed us to infer shifts of attention between primary and secondary tasks. The temperature variable would fluctuate slightly around a set value (50 ± 5 units) when in its normal state. At random intervals generated by the computer (mean 46 s) it would enter a drift state, gradually increasing or decreasing in value toward the displayed limits. Operators were instructed to reset the system whenever they diagnosed a drift state. To ensure that participants actively monitored the display for drifts and did not reset it automatically, a cost was incorporated in which false positives always triggered the onset of a new drift state. If the variable was allowed to exceed tolerance limits, an omission error was recorded. Oil temperature was reset to normal for each new trial.

Subjective Mental Workload

At the end of each 6-min trial, operators completed a subjective rating of mental workload (MWL).
Because of the length of the sessions (2 hr) and the need to reduce further demands on participants, this took the form of a single-item visual analog scale. Participants moved a slider along a 100-mm line on the screen (endpoints labeled "very low" and "very high" mental workload) and pressed an "accept" button when they were happy with their rating. This was scored in 2-mm divisions to give an MWL rating in the range of 1 to 50. Simple global (univariate) scales have been shown to provide sensitive and reliable indices of subjective workload in similar work situations (Hendy, Hamilton, & Landry, 1993), and the procedure used here has been found to be minimally intrusive in other long-duration, complex tasks (Hockey, Wastell, & Sauer, 1998; Sauer et al., 2002).

Procedure

Preliminary training. Because participants had no nautical experience prior to taking part in the experiment, they were given extensive formal training on the task (a total of around 12 hr). Before being confronted with the radar simulation, they were given instruction in elementary navigational skills (navigational exercises and relevant collision regulations for the simulated encounters). Before continuing with training on the simulation task, participants were required to practice these component skills in their own time and to obtain perfect or near-perfect scores on formal tests. (Only one had to be eliminated for failing to acquire the necessary skills.) Navigation exercises were selected from a standard radar-plotting handbook used for training nautical cadets. These helped participants to develop an understanding of relative motion displays, vessel movements, the geometry of collision encounters, and calculation of course changes from basic radar-plotting techniques. Instructions were also provided on the essential navigational goals of keeping a safe distance from other vessels and maintaining track. Participants were instructed to make collision avoidance the top priority and to attempt to maintain the good-practice criterion of 1 nm CPA but to return to the default track line once the collision threat had passed. An illustrated lecture was given, explaining the collision regulations relevant for the task.
These were selected from the standard set of maritime regulations for the rules of the road at sea (International Maritime Organisation, 1972) and provided clear guidelines for what action should be taken whenever a collision threat was identified. Essentially, these state that vessels should normally (a) give way to other vessels approaching on their starboard (right) side; (b) make any necessary course alterations in a starboard (clockwise) direction; and (c) stand on (hold present course) when other vessels approach from the port (left) side (because these should give way).

Simulator training. Participants were asked to familiarize themselves with the user manual provided, which explained how to use each of the simulator controls and interpret the display. They were able to refer to this documentation throughout a simulator training session of 20 practice trials. This allowed participants to familiarize themselves with the simulator and to practice making CA maneuvers. Practice trials were a random subset of the full set of 100 experimental trials, although they did include an emergency encounter. During early trials, participants were encouraged to ask for clarification of the interface if necessary. Performance on practice trials was assessed in terms of success in meeting primary task goals (CA and TK) and was augmented by a further set of 10 trials if necessary.

Task sessions. Participants carried out the task in groups of two or three individuals, separated by screens. Five blocks of 20 trials were run over a period of 3 to 5 days. Experimental blocks took a little over 2 hr to complete and were run in either late morning (10:00 a.m.–12:00 p.m.) or early afternoon (2:00–4:00 p.m.). The sequence of trials in each session was self-paced; participants controlled the initiation of each trial by clicking a start button, which allowed them to pause between trials if necessary. These breaks were encouraged to prevent fatigue or other effects of continuous testing, but they typically lasted only 5 to 15 s.

Analysis of Data

Data capture.
Automatic data capture ensured that a detailed record was kept of all significant events and their time of occurrence (acquisition of other vessels, use of the test maneuver toggle, heading or speed changes, CPA, and position of OS at 10-s intervals). For the secondary task, the system recorded reset actions and switches between radar and oil temperature displays. The level of the oil temperature variable was recorded at 10-s intervals, allowing the occurrence of drift and error states to be computed. Data from 1179 of the total 1200 trials (12 participants × 100) were successfully collected for analysis, resulting in a small number of missing observations, which were not replaced in the analysis.

Statistical treatment. For most analyses, a 3 × 2 × 3 × 3 repeated-measures analysis of variance was carried out, with collision threat, target behavior, traffic, and phase as factors. For some analyses only whole-trial measures were available, so phase was omitted. Data transformations were carried out if necessary in order to satisfy assumptions of normality, homogeneity, and sphericity. (This was necessary because means were based on different numbers of events.) A log transform was used for percentage of time on the secondary task. Square root transformations were carried out on secondary task sampling time, duration, frequency, and reset latency. Because no suitable transformation could be carried out on the data for track keeping and collision avoidance, we used logistic regression based on dichotomized (above or below 1 nm) minimum and maximum distance off track.

RESULTS

Primary Task

The various overall measures of navigation and collision avoidance behavior for the six different scenario types are summarized in Table 2. They include performance measures (rule-following behavior, collisions, track keeping), course changes, and use of the two navigation aids (target acquisitions and test maneuvers).

Collision avoidance and track keeping. In the absence of collision threats, overall TK performance (remaining within the good-practice limits of 1 nm of the planned course) was generally good. Performance for normal encounters was near perfect (<1% off track); failure to maintain this criterion was greatest for routine conditions (routine/fixed 15%, routine/altering 13% time off track).
The recognition of the routine collision threat is indicated by the early first course change (mean time of first maneuver = 80 s). In emergency scenarios, in which the threat becomes apparent only late in the trial, the first maneuver occurred much later (mean 213 s). Because of this, overall TK error was, by default, lower (emergency/fixed 4%, emergency/altering 2% off track). CA was measured in terms of CPA, using two criteria: collisions only (CPA < 0.5 nm) and collisions plus near misses (CPA < 0.8 nm). Performance was again good; overall collision rate was low for both criteria (2% and 6%) and occurred almost exclusively under emergency conditions.

TABLE 2: Summary of Primary Task (Ship Control) Measures

                     Collision        Deviation    Course       Rule         Target         Test        Bearings      Headings
                     [+ near miss]    from Track   Changes      Following    Acquisitions   Maneuver    Taken         Entered
Encounter            (% trials)       (% time)     (% trials)   (% trials)   (no./trial)    (% time)    (no./trial)   (no./trial)
Normal/fixed          0.0 [0.0]        0.0           1.5        100.0        1.05           12.8        0.17          2.51
Normal/altering       1.4 [6.6]        0.6          44.1         99.0        0.88           41.1        2.01          2.15
Routine/fixed         0.9 [6.0]       14.7          98.7         96.7        1.20           65.1        5.12          7.71
Routine/altering      1.3 [4.0]       12.6          98.5         93.3        1.07           65.8        5.01          3.94
Emergency/fixed       6.7 [21.3]       3.7          99.7         90.7        0.62           59.6        2.51          1.93
Emergency/altering   10.6 [33.3]       2.4          97.0         81.3        0.88           46.7        1.26          0.92

Course changes and rule following. Participants generally took appropriate action to achieve the primary task goals. As expected, almost no course changes were made in normal/fixed encounters, although they occurred on 44% of normal/altering trials (in which the target started on a collision course but altered to safe), with participants following approved procedure (to starboard) on 99% of occasions. As required, maneuvers were made in 99% of routine encounters, with 95% of the first alterations again following recommended procedure (to starboard). Finally, required maneuvers were made in almost all emergency encounters. Despite the uncertainty associated with these, rule following was high (91% and 81% for emergency/fixed and emergency/altering, respectively). The Headings Entered column in Table 2 shows that the number of course changes entered was much higher for routine encounters (7.71 and 3.94) than for either normal or emergency encounters (around 1–2 per trial).

Use of decision aids. Table 2 suggests that both types of aid were used most often during routine encounters, in which CA threat was present throughout the trial. For target acquisitions, however (Figure 2), the only marked effect was that the aid was used more under high traffic, F(2, 22) = 5.99, p < .01. There was a general reduction over the course of the trial, F(2, 22) = 4.83, p < .05, but no Traffic × Phase interaction, F(4, 44) = 1.00, p > .05, or any other effect.

Use of the test maneuver facility showed more widespread differences among encounter types. Figure 3 shows the percentage of time the aid was selected. As predicted, this remained high throughout routine threat (around 50%) and increased dramatically over the trial for emergency threat. There were strong effects of both threat, F(2, 22) = 42.80, p < .001, and target behavior, F(1, 11) = 12.31, p < .001, and their interaction, F(2, 22) = 32.75, p < .001.

Figure 2. Mean number of target acquisitions as a function of phase and traffic density (distracters).

Unlike target acquisition, this aid showed no effects of traffic. The main effect of phase, F(2, 22) = 49.68, p < .001, was associated primarily with the increase over time under emergency threat. There was an interaction of phase with both threat, F(4, 44) = 27.34, p < .001, and target behavior, F(2, 22) = 8.47, p < .01, as well as a three-way interaction, F(4, 44) = 9.97, p < .001. For emergency encounters, increase in the use of test maneuvers was greater when targets altered course. Figure 3 shows that the use of the test maneuver during emergency/altering changes from the level of the least-demanding scenario (normal/fixed) during Phase 1 to that of routine by the end of the trial.

Figure 3. Use of test maneuvers (% time selected) as a function of phase and encounter type (NF = normal/fixed, NA = normal/altering, RF = routine/fixed, RA = routine/altering, EF = emergency/fixed, EA = emergency/altering).

Subjective Mental Workload

Figure 4 summarizes effects on subjective mental workload ratings (MWLs). These vary over most of the range (4–42 on the 1–50 scale), reflecting widespread differences among conditions. There were main effects of threat, F(2, 22) = 98.60, p < .001, target behavior, F(2, 22) = 46.97, p < .001, and traffic, F(2, 22) = 24.91, p < .001, as well as interactions of threat with both target behavior, F(2, 22) = 32.51, p < .001, and traffic, F(2, 22) = 2.66, p < .05. Effects of target behavior occur only with normal encounters, and those of traffic occur only with routine encounters.

Figure 4. Mental workload rating as a function of traffic (0, 1, 3 distracters) and encounter type (NF = normal/fixed, NA = normal/altering, RF = routine/fixed, RA = routine/altering, EF = emergency/fixed, EA = emergency/altering).

Secondary Task

Performance measures. Table 3 summarizes the secondary task performance data for the six encounter types (with MWL means for comparison purposes). The only sensitive measure was reset latency (time from the onset of a fault to a reset response). Omission errors were rare, averaging much less than one per trial, and reset false positives were even rarer, occurring on only 4% of trials. For reset latency, there were no main effects of encounter variables, but there was an interaction between threat and target behavior, F(2, 22) = 3.36, p < .05. Reset latency increased under emergency encounters only when targets altered course.

COLLISION AVOIDANCE
TABLE 3: Summary of Secondary Task Measures

Encounter            MWL      Omissions   Reset        Sampling   Sampling Rate   Sampling
                     Rating   (no./min)   Latency (s)  Time (%)   (no./min)       Duration (s)
Normal/fixed          6.7     0.18        35.2         40.8       2.17            15.07
Normal/altering      17.0     0.26        36.6         29.9       2.13             9.33
Routine/fixed        27.3     0.29        37.0         19.9       2.13             4.81
Routine/altering     28.0     0.31        37.6         18.5       2.03             3.83
Emergency/fixed      36.3     0.31        36.8         20.9       2.12             3.58
Emergency/altering   40.0     0.56        40.5         27.8       1.11             8.98

As predicted, there was a main effect of phase, F(2, 22) = 5.24, p < .05, resets taking longer later in the trial. In addition, phase interacted with threat, F(4, 44) = 6.45, p < .001, and target behavior, F(2, 22) = 5.45, p < .05. Increases in reset latency over the trial occurred more strongly under emergency and target alteration conditions.

Sampling time. We predicted that time spent sampling the secondary task display would be sensitive to variations in CA demand and to changes over time. Figure 5 shows strong effects of all independent factors. Overall, operators spent around 40% of the trial monitoring the secondary task display during normal encounters and 20% under routine encounters, with a fall over the trial (from 40% to 20%) under emergency conditions. There was a main effect of threat, F(2, 22) = 13.97, p < .001, but not target behavior, F(1, 11) = 1.22, p > .05, although there was a complex interaction, F(1, 22) = 6.56, p < .01. When targets altered course, sampling time was reduced for normal, unchanged for routine, and increased for emergency scenarios. There was also a small main effect of traffic density, F(2, 22) = 5.21, p < .05.

Of greater interest are changes in secondary task sampling over the duration of the trial. Figure 5 shows that the effect of phase, F(2, 22) = 10.80, p < .001, is attributable mainly to changes under emergency scenarios, especially emergency/altering. This is supported by the Phase × Threat interaction, F(4, 44) = 11.63, p < .001, and the three-way interaction among phase, threat, and target behavior, F(4, 44) = 4.22, p < .01. Under emergency/altering, the apparent threat of collision increases from minimal to maximal over the course of the trial. The pattern of findings here can be seen as a mirror image of the increased use of test maneuvers in Figure 3.

To examine the nature of these effects, we decomposed sampling time into its two components, frequency (how often the oil temperature display was sampled) and duration (length of sampling periods). Sampling frequency was found to be near constant across all conditions (0.3–0.4/min). The dramatic reductions in sampling time under increasing threat are almost entirely the result of operators making briefer (rather than less-frequent) inspections. However, sampling frequency was higher in trials where no errors occurred (mean = 0.37/min) than in those incurring either one (mean = 0.27/min) or two (mean = 0.22/min) errors. Effects on sampling duration were similar to those for overall sampling time and are not reported separately here.

Figure 5. Secondary task monitoring (% time selected) as a function of collision threat (separate panels), phase, target behavior (F = fixed, A = altering), and traffic density (0, 1, 3 distracters).

DISCUSSION

Overall, the results confirm that secondary task techniques can be used effectively to assess cognitive demands in a simulated maritime task environment. Higher levels of collision threat were associated with markedly increased ratings of MWL and with impaired performance on the secondary oil pressure monitoring task, in the form of increased reset latency, mediated primarily by a reduction in the average sampling duration. The reduction in secondary task monitoring under increased collision threat was mirrored by an increase in the use of the test maneuver tool, which strongly reflected CA demands. Use of the target acquisition tool proved sensitive only to traffic density and fell off over the course of the trial.
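The decomposition of sampling time into frequency and duration, described in the Results, can be sketched as follows. This is an illustration only: the function name, the episode format, and the example values are hypothetical, and the 360-s trial length is taken from the 6-min trials described in the abstract. The point is that percentage sampling time equals rate multiplied by mean duration, so a fall in sampling time at constant rate must reflect briefer inspections.

```python
def decompose_sampling(episodes, trial_s=360.0):
    """Split secondary-display sampling into rate and mean duration.

    episodes: list of (start_s, end_s) periods during which the secondary
    task display was selected (a hypothetical log format).
    trial_s: trial length in seconds (6-min trials are assumed here).
    """
    durations = [end - start for start, end in episodes]
    count = len(durations)
    total = sum(durations)
    return {
        "percent_time": 100.0 * total / trial_s,
        "rate_per_min": count / (trial_s / 60.0),
        "mean_duration_s": total / count if count else 0.0,
    }


# Two made-up trials with the same number of inspections but different lengths.
long_looks = [(i * 30.0, i * 30.0 + 12.0) for i in range(12)]
short_looks = [(i * 30.0, i * 30.0 + 6.0) for i in range(12)]

for label, episodes in (("long", long_looks), ("short", short_looks)):
    result = decompose_sampling(episodes)
    # Identity: percent_time == rate_per_min * mean_duration_s / 60 * 100
    rebuilt = result["rate_per_min"] * result["mean_duration_s"] / 60.0 * 100.0
    print(label, result, round(rebuilt, 6))
```

In the made-up data, halving the inspection length halves the percentage sampling time while the rate stays the same, which is the pattern the text attributes to increasing threat.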
Of more specific interest for the purposes of the study is that these effects are found not only with objective workload factors (traffic density and marked course change requirements) but also with uncertainty. All these effects were observed most strongly under the two emergency encounters (emergency/fixed and emergency/altering), in which uncertainty was highest. Demands associated with traffic density or standard maneuvering requirements can be assessed quickly and acted upon effectively (using the target acquisition and test maneuver facilities). By contrast, demands arising from rule violations build up over the trial and are not fully resolved even when the emergency maneuver is carried out (because of the increased risk of collision with the late maneuvers). It is no coincidence that the problem of uncertainty about others' intended actions has been identified as one of the most serious impediments to the development of effective collision avoidance systems (Hobday et al., 1993). Mariners in our critical incident survey reported a number of incidents in which near collisions involved rule violations of the kind simulated in the study. They reported high levels of concern and frustration in (a) not knowing what would happen and (b) having to delay course changes until the last minute, when judgments of safety margins were difficult.

To understand what is happening in the two emergency scenarios, recall that they both start out as formally equivalent to normal encounters (emergency/altering with normal/fixed, and emergency/fixed with normal/altering), in which no collision threat is indicated. Emergencies are generated by the target vessel violating the rules of the road (in the first case by altering onto a collision course and in the second case by failing to alter course). Unlike the evasive response in routine encounters, in these cases the navigator must continue to monitor the progress of the target vessel and is discouraged from taking early evasive action by both navigation/CA rules and track-keeping requirements. Although maneuvers still need to be planned in advance, they have to be constantly updated as the threat is sustained and unresolved. In addition (as in real maritime contexts, supported by our interviews), maneuvers are typically knowledge based (Rasmussen, 1983) and can rarely be drawn from the standard repertoire of evasive actions. This interpretation is supported by the data, which illustrate the growth of uncertainty for
emergency scenarios over the course of the trial as a transition from the levels of workload associated with normal scenarios to those for routine scenarios. Figure 3 shows that time spent making test maneuvers early in the trial was small under emergency scenarios, as for the equivalent normal encounters, confirming that participants did not predict the later misbehavior of the target vessel. As uncertainty develops, there is a marked increase in test maneuvers (by a factor of 10 in the case of emergency/altering), until it matches the level for routine encounters, and a corresponding reduction in secondary task sampling (Figure 5). This all happens during a brief (6-min) trial, in which task events occur relatively slowly and infrequently.

An alternative explanation for the pattern of results should, however, be considered. It is possible that the dramatic increase in demands over time during emergency scenarios is attributable not to uncertainty but to the demands of evasive action. This is unlikely, given that both headings and bearings were much less frequent under emergency scenarios than under routine ones (Table 2) and the time of the first course change was much later (mean 213 vs. 80 s, respectively). More specifically, the evasive-action-demands explanation would predict that the increasing secondary task neglect (e.g., sampling time) under emergency conditions coincides with increased course change activity. By contrast, an uncertainty explanation predicts a growing strain on mental resources under emergency conditions, disrupting performance even before maneuvers are made. This cannot be tested easily, because both maneuvers and secondary task decrements occur mainly during the middle phase of the trial; however, a detailed analysis of sampling time (using 60-s phases) showed that decrement did not occur at all for routine scenarios. A more specific analysis was carried out for all trials in which a course change was made. The period before the first maneuver was divided into early and late phases, so that any decrement before a maneuver would appear as a reduction in sampling time from the early to the late phase. This revealed significant decrements for both emergency/fixed and emergency/altering, in both cases for 10 of the 12 participants (sign test; p < .05), whereas there was no sign of decrement for
routine encounters (routine/fixed 5/12 and routine/altering 7/12 participants). We can, therefore, be fairly sure that the emergency conditions make demands on primary task monitoring, even before evasive actions are taken, a predicted effect of uncertainty on cognitive demands.

In generalizing the results to actual marine environments, we need to consider possible limitations of using a simple simulator and specially trained participants. Although it would be desirable to test the findings in a full ship simulator, it can be argued that the simplified simulation allowed us to separate the effects of task factors that are often confounded in more realistic systems. We considered using mariners, but these were not available in sufficient numbers. Our alternative strategy was to use participants who were technically competent (e.g., in computing and engineering) and give them extensive training. Experienced mariners would bring to the situation a range of strategic options not available to our participants, and it would be valuable to know how they would manage uncertainty arising from rule violation. However, on the basis of our interview sample, we have no reason to assume that there would be any major differences, given that they identified problems of predicting actions of other vessels as the major source of demand in collision avoidance.

Performance on the primary tasks (CA and TK) may be used as an indicator of how well our participants adhered to instructions. Both tasks were generally carried out well, given the constraints of the various scenarios. However, it is difficult to assess the adequacy of primary task performance in encounters when evasive action is required. CA and TK were both impaired in emergency encounters, in which large course changes are necessary. However, this cannot be taken as direct evidence of performance decrement, as these tasks are effectively data limited under these conditions (Norman & Bobrow, 1975).
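The sign tests reported for the pre-maneuver sampling decrements can be checked with a short calculation. The sketch below assumes an exact binomial sign test with no ties; the text does not say whether the test was one- or two-sided, so both are computed, and the function name is hypothetical.

```python
from math import comb


def sign_test_p(k, n, two_sided=True):
    """Exact sign-test p value for k of n participants showing a decrement.

    Under the null hypothesis each participant is equally likely to show a
    decrement or not (probability 0.5). Ties are assumed absent.
    """
    if two_sided:
        extreme = max(k, n - k)
        tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
        return min(1.0, 2.0 * tail)
    # One-sided: probability of k or more decrements.
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


# Counts taken from the text: 10/12 for each emergency encounter type,
# 5/12 and 7/12 for the two routine encounter types.
for label, k in (("emergency", 10), ("routine/fixed", 5), ("routine/altering", 7)):
    print(label, round(sign_test_p(k, 12), 4), round(sign_test_p(k, 12, two_sided=False), 4))
```

For 10 of 12 participants the two-sided value is 158/4096 (about .039), consistent with the reported p < .05, whereas 5/12 and 7/12 are far from significance on either version of the test.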
Given the restricted space for maneuvering, no amount of effort or resource application can prevent some deviation from the planned course or the occurrence of near collisions. In routine scenarios, the formal requirement to make course changes again means that TK was technically impaired, although CA performance was maintained within acceptable limits.

In all such cases there was a natural trade-off between CA and TK. In real-life tasks, as here, operators recognize the need to make large changes of course (30°+) in order to minimize uncertainty for others and to anticipate worst-case scenarios, although there is often a reluctance to do so (Cahill, 1983). Our results show that uncertainty about the actions of others, by reducing the time available for information gathering and decision making, has the effect of shifting this trade-off in the direction of increased collisions (or near collisions). In routine encounters, participants made planned track changes, according to agreed rules, so that the risk of collision was minimized. Under the emergency scenarios experienced in the study, sustained uncertainty about what action is required obliges the navigator to hold the original course longer. This means that track keeping is relatively good, but the risk of collision (the top goal) is greater. Ideally, we would have liked to have optimal solutions for each scenario, so that we could separate genuine TK errors from required deviations. However, there is evidence of widespread variability in choice of evasive maneuver, even among mariners (Curtis, 1978).

How can findings such as these be used in the design of better collision avoidance systems? Although there has been little systematic work addressing this problem, we would agree with Lee (1996) and Lee and Sanquist (1996, 2000) in advocating the use of ecological interface displays (Vicente & Rasmussen, 1992). These have the intrinsic advantage of allowing the mariner to attend to the problem at the appropriate level of abstraction: in this case, the constraints of either navigational geometry/dynamics (direction, distance, speed) or social/organizational functions (rules of the road, behavior of other vessels).
Lee (1996) suggested that the increasing use of Automatic Radar Plotting Aid (ARPA) facilities makes the physical constraints of CA maneuvers so salient (e.g., through the generation of safety zones) that the mariner behaves as if they completely represent the situation. This results in overattention to the geometry of encounters and a neglect of the social/organizational factors. Lee and Sanquist (1996) found that mariners on a training simulator sometimes ignored the rules of the road when making avoidance maneuvers with ARPA but not with normal radar. They predicted that such problems are likely to become greater with the increasing use of integrated displays, in which an electronic chart display is overlaid onto the radar screen, increasing the salience of physical constraints further and reducing the redundancy normally available to the operator in correcting position errors. Broadly in line with this prediction, Sauer et al. (2002) showed that integrated displays resulted in better track keeping but imposed higher levels of demand, implying overattention to primary navigational data.

Clearly, rule violations pose a special problem for CA because of the additional demands of uncertainty when target vessels behave in unexpected ways. These not only increase the risk of collision but also result in neglect of auxiliary platform functions, such as engine monitoring. To incorporate information about the intentions of other vessels and to provide direct navigational support remains a major technical goal for CA systems (Hammer & Hara, 1990; Hobday et al., 1993). One possibility (Lee, 1996) would be to use rules-of-the-road constraints to predict course changes for target vessels (plotted on radar/chart displays to show changing areas of safe passage). These can be used in conjunction with ARPA information to make decisions based on a more representative set of possible evasive maneuvers. Violations will still occur, but they would be detectable as mismatches between anticipated and actual tracks, allowing information about others' intentions to play an integrated part in CA decision making.

ACKNOWLEDGMENTS

We are grateful to John Haberley and colleagues at Southampton Institute, U.K., and to Herke Schuffel and colleagues at TNO Soesterberg, Netherlands, for technical advice and to Magda Healey for assistance with data analysis. We also thank the editor and two anonymous referees for a number of helpful suggestions on an earlier draft of the manuscript. The research was supported by Grant GR/J06214 from the U.K. Engineering and Physical Sciences Research Council (Marine Technology Directorate).

REFERENCES

Cahill, R. A. (1983). Collisions and their causes. London: Fairplay.
Cockroft, A. N. (1984, June). Collisions at sea. Safety at Sea, pp. 17–19.
Curtis, R. G. (1978). Determination of mariners' reaction times. Journal of Navigation, 31, 408–417.
Dyer-Smith, M. B. A. (1992). Shipboard organisation: The choices for international shipping. Journal of Navigation, 45, 414–424.
Grabowski, M. (1990). Decision support to masters, mates on watch, and pilots: The piloting expert system. Journal of Navigation, 43, 364–384.
Grocott, D. F. H. (1992). The 21st century navigation station. Journal of Navigation, 45, 315–328.
Hammer, A., & Hara, K. (1990). Knowledge acquisition for collision avoidance maneuver by ship handling simulator. In Proceedings of MARSIM 90, International Conference on Marine Simulation and Ship Maneuverability (pp. 245–252). Tokyo: Society of Naval Architects of Japan.
Hendy, K. C., Hamilton, K. M., & Landry, L. N. (1993). Measuring subjective workload: When is one scale better than many? Human Factors, 35, 579–601.
Hobday, J. S., Rhoden, D., & Jones, P. (1993). The role of expert systems and neural networks in the marine industry (Tech. Investigation, Propulsion and Environmental Engineering Department). London: Lloyd's Register.
Hockey, G. R. J., Wastell, D. G., & Sauer, J. (1998). Effects of sleep deprivation and user interface on complex performance: A multilevel analysis of compensatory control. Human Factors, 40, 233–253.
Hutchins, E. (1995). Cognition in the wild. Cambridge, MA: MIT Press.
International Maritime Organisation. (1972). International regulations for preventing collisions at sea (Amended 1981, 1987, 1989). London: Author.
James, M. K. (1994). The timing of collision avoidance maneuvers: Descriptive mathematical models. Journal of Navigation, 47, 259–272.
Kirwan, B., & Ainsworth, L. K. (1992). A guide to task analysis. London: Taylor & Francis.
Lee, J. D. (1996, November). Design of advanced ship systems: Emerging problems and human factors solutions. Paper presented at the Centro Tecnico Navale (CETENA) Seminar on Human Factors Impact on Ship Design, Genoa, Italy.
Lee, J. D., & Sanquist, T. F. (1996). Maritime automation. In R. Parasuraman & M. Mouloua (Eds.), Automation and human performance: Theory and applications (pp. 365–384). Hillsdale, NJ: Erlbaum.
Lee, J. D., & Sanquist, T. F. (2000). Augmenting the operator function model with cognitive operations: Assessing the cognitive demands of technological innovation in ship navigation. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 30, 273–285.
Norman, D. A., & Bobrow, D. G. (1975). On data-limited and resource-limited processes. Cognitive Psychology, 7, 44–64.
Perrow, C. (1984). Normal accidents. Princeton, NJ: Princeton University Press.
Raby, M., & Lee, J. D. (2001). Fatigue and workload in the maritime industry. In P. A. Hancock & P. A. Desmond (Eds.), Stress, workload and fatigue (pp. 566–580). Mahwah, NJ: Erlbaum.
Rasmussen, J. (1983). Skills, rules, and knowledge: Signals, signs and symbols, and other distinctions in human performance models. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13, 257–266.
Sablowski, N. (1989). Effects of bridge automation on mariners' performance. In A. Coblentz (Ed.), Vigilance and performance in automated systems (pp. 101–110). Dordrecht, Netherlands: Kluwer Academic.
Sanderson, P. M. (1989). The human planning and scheduling role in advanced manufacturing systems: An emerging human factors domain. Human Factors, 31, 635–666.
Sauer, J., Wastell, D. G., Hockey, G. R. J., Crawshaw, C. M., Ishak, M., & Downing, J. (2002). Effects of display design on performance in a simulated ship navigation environment. Ergonomics, 45, 329–347.
Schuffel, H., Boer, J. P. A., & van Breda, L. (1989). The ship's wheelhouse of the nineties: The navigation performance and mental workload of the officer of the watch. Journal of Navigation, 42, 60–72.
Tattersall, A. J., & Hockey, G. R. J. (1995). Level of operator control and changes in heart-rate variability during simulated flight maintenance. Human Factors, 37, 682–698.
Vicente, K. J., & Rasmussen, J. (1992). Ecological interface design: Theoretical foundations. IEEE Transactions on Systems, Man, and Cybernetics, 22, 589–606.
Wagenaar, W. A., & Groeneweg, J. (1987). Accidents at sea: Multiple causes and impossible consequences. In E. Hollnagel, G. Mancini, & D. D. Woods (Eds.), Cognitive engineering in complex dynamic worlds (pp. 133–144). London: Academic.
Wickens, C. D., & Hollands, J. G. (1999). Engineering psychology and human performance (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
Woods, D. D. (1988). Coping with complexity: The psychology of human behavior in complex systems. In L. P. Goodstein, H. B. Andersen, & S. E. Olsen (Eds.), Tasks, errors and mental models (pp. 128–148). London: Taylor & Francis.
Young, H., & Bell, P. (1993). The effect of commercial factors on the introduction of new navigational technology. In Proceedings of the International Maritime Conference: The Impact of New Technologies on the Maritime Industries (1/11/12). Warsash, UK: Southampton Institute of Higher Education.
Zeitlin, L. R. (1995). Estimates of driver mental workload: A long-term field trial of two secondary tasks. Human Factors, 37, 610–620.

G. Robert J. Hockey is professor of human performance and human factors and director of research at the School of Psychology, University of Leeds, UK. He obtained his Ph.D. in experimental psychology from the University of Cambridge, UK, in 1969.

Alex Healey is a freelance researcher attached to the University of Hull, UK. He received his master's degree in human-computer interaction at Heriot-Watt University, UK, in 1992.

Martin Crawshaw is a senior lecturer in psychology at the University of Hull, UK. He obtained his Ph.D. in psychology at Southampton University, UK, in 1975.

David G. Wastell is a professor in the Department of Computation, University of Manchester Institute of Science and Technology, UK. He obtained his Ph.D. in psychophysiology from the University of Durham, UK, in 1978.

Jürgen Sauer is a lecturer in the Institute of Psychology at Darmstadt University of Technology, Germany. He obtained his Ph.D. in human factors from the University of Hull in 1997.

Date received: June 21, 2001
Date accepted: February 26, 2002
