Understanding of the Concept of Numerically "Less" by Bottlenose Dolphins (Tursiops truncatus)
Kelly Jaakkola, Wendi Fellner, Linda Erb, Mandy Rodriguez, and Emily Guarino
Dolphin Research Center

In 2 experiments, bottlenose dolphins (Tursiops truncatus) judged the ordinal relationship between novel numerosities. The dolphins were first trained to choose the exemplar with the fewer number of items when presented with just a few specific comparisons (e.g., 2 vs. 6, 1 vs. 3, and 3 vs. 7). Generalization of this rule was then tested by presenting the dolphins with all possible pairwise comparisons between 1 and 8. The dolphins chose the exemplar with the fewer number of items at levels far above chance, showing that they could recognize and represent numerosities on an ordinal scale. Their pattern of errors was consistent with the idea of an underlying analog magnitude representation.

The idea that nonhuman animals evidence numerical abilities is no longer controversial. Data from numerous laboratories have shown that many nonhuman animals have at least some numerical capacities (for reviews, see Davis & Pérusse, 1988; Dehaene, 1997). The debate in comparative cognition has moved toward two issues: (a) to what extent different animal species understand particular numerical properties, and (b) what representational systems underlie these capacities.

Ordering Numerosities

The current article is specifically concerned with the property of relative numerosity, that is, the understanding that numerosities (e.g., "oneness," "twoness," etc.) can be judged according to their order in an inherent series. Even if one can distinguish three objects from two objects, this distinction could theoretically be recognized as a simple property of a set, like distinguishing redness from greenness. However, the concept of number necessarily entails more than that. Inherent in the number concept is the idea that numbers form an ordered series: not only is three different from two and four, but it is more than two and less than four.
Relative numerosity has been studied in the animal kingdom using a variety of methods, including presenting an animal with sets of objects or series of events (e.g., light flashes) and requiring it to pick the one that has more or fewer objects/events (e.g., Alsop & Honig, 1991; Beran, 2001; Boysen & Berntson, 1995; Call, 2000; Dooley & Gill, 1977; Hauser, Carey, & Hauser, 2000; Killian, Yaman, von Fersen, & Güntürkün, 2003; Machado & Keen, 2002; Shumaker, Palkovich, Beck, Guagnano, & Morowitz, 2001; Thomas & Chase, 1980); presenting an animal with symbols or objects associated with particular numbers of food pieces and assessing whether it chooses the symbol/object associated with the greater number of food pieces (e.g., Mitchell, Yao, Sherman, & O'Regan, 1985; Washburn & Rumbaugh, 1991); training an animal to respond to numerosities in a particular numerical order (e.g., ascending 1 → 2 → 3 → 4) and then testing its generalization of the rule with new numerosities (e.g., 5, 6, 7, 8; Brannon & Terrace, 1998, 2000); and training an animal to make a differential response to two numerosities and then testing its response to intermediate numerosities (e.g., Breukelaar & Dalrymple-Alford, 1998; Emmerton, Lohmann, & Niemann, 1997).

The results of these studies have led to claims of understanding relative numerosity for many animal species, including chimpanzees (Beran, 2001; Boysen & Berntson, 1995; Dooley & Gill, 1977), orangutans (Call, 2000; Shumaker et al., 2001), rhesus monkeys (Brannon & Terrace, 1998, 2000; Hauser et al., 2000; Washburn & Rumbaugh, 1991), squirrel monkeys (Thomas & Chase, 1980), dolphins (Killian et al., 2003; Mitchell et al., 1985), rats (Breukelaar & Dalrymple-Alford, 1998), and pigeons (Alsop & Honig, 1991; Emmerton et al., 1997; Machado & Keen, 2002).
Note, however, that to qualify as explicit knowledge of relative numerosity, four criteria must be met: (a) Nonnumerical cues such as stimulus area, density, or duration must be controlled for; otherwise, the animals may be judging the relative magnitude of these continuous dimensions rather than relative numerosity per se;¹ (b) particular numerosities must be judged as greater than some numerosities and less than others (rather than simply designating some numerosities as "large" and others as "small"); (c) the ability must be demonstrated with numerosities on which the animals were not originally trained; and (d) this generalization to novel numerosities must occur in such a manner that inadvertent cuing from trainers or experimenters is impossible. Using these criteria, researchers have thus far convincingly demonstrated knowledge of relative numerosity only in primates (Brannon & Terrace, 1998, 2000; Hauser et al., 2000).

¹ This is true even in cases in which stimulus items are presented sequentially rather than simultaneously. Even if an animal is shown one M&M at a time, for example, it is not possible to determine whether its responses are based on the total number of M&Ms or the total amount (volume) of candy (Beran, 2001; Call, 2000).

The current study explored the understanding of relative numerosity in the bottlenose dolphin. To our knowledge, there have only been two prior studies of numerical understanding in dolphins. In the first (Mitchell et al., 1985), a dolphin was asked to choose between different objects, each of which was associated with a particular number of fish (e.g., an ice-cube tray was worth one fish; a mixing bowl was worth four fish). Whatever choice the dolphin made, the associated number of fish were thrown to her one at a time. The dolphin learned to choose the object with the greatest fish value. Unfortunately, it is not possible to determine whether the dolphin based her response on number per se, because both the amount of fish (mass, volume) and the duration of the rewarding process also varied systematically with number.

In the second study (Killian et al., 2003), a dolphin was trained to choose which of two stimulus arrays contained the fewer number of elements. However, during the crucial generalization phase in which the dolphin was tested on novel discriminations, the stimuli were placed in the water by an assistant "who kept observing the animal" (Killian et al., 2003, p. 138). It is unclear from the description whether the dolphin could see this person. Without this piece of information, it is unfortunately not possible to rule out inadvertent cuing. Thus, to a large extent, the level of dolphins' numerical understanding remains an open question.

Author note. Kelly Jaakkola, Wendi Fellner, Linda Erb, Mandy Rodriguez, and Emily Guarino, Dolphin Research Center (DRC), Grassy Key, Florida. We thank Adrian Dahood, Shelly Samm, Peter Sugarman, Tammie Anderson, Margaret Thomas, Cheryl Sullivan, and numerous research interns at DRC for help with data collection, as well as Tammie Anderson, Susan Carey, Pat Clough, Kathy Roberts, Elizabeth Spelke, and Marie Trone for helpful comments on earlier versions of this article. We are also grateful to Peter Sugarman for help with design and construction of the testing apparatus and Tommi Jaakkola for technical assistance. Finally, a special thanks to the staff and dolphins of DRC, under the leadership of President Jayne Shannon-Rodriguez, for their cooperation and patience during this project. Correspondence concerning this article should be addressed to Kelly Jaakkola, Dolphin Research Center, 58901 Overseas Highway, Grassy Key, FL 33050. E-mail: kelly@dolphins.org

Journal of Comparative Psychology, 2005, Vol. 119, No. 3, 296–303. Copyright 2005 by the American Psychological Association. 0735-7036/05/$12.00 DOI: 10.1037/0735-7036.119.3.296
Numerical Representation

To fully understand numerical competence in humans and nonhuman animals, a second issue that must be addressed concerns the type of representational system that underlies any demonstrated numerical capacities. The two types of representational systems most often proposed in the current literature are depicted in Figure 1.

The first is an analog magnitude system (see Figure 1B), in which the numerosity of a set is mentally encoded as a single continuous magnitude, like a number line, such that a greater magnitude is indicative of a larger numerosity (e.g., Gallistel & Gelman, 1992, 2000). Comparisons between sets are performed by comparing the extent of the magnitudes representing each set. Because of the nature of this representation, judgments using this system are subject to Weber's law, in which the difficulty of discriminating between two magnitudes is a function of their ratio. For example, discriminating seven from eight should be more difficult than discriminating two from three.

The second type of representational system proposed to account for nonlinguistic numerical competence is the object-file model (e.g., Hauser et al., 2000; Simon, 1997; Uller, Carey, Huntley-Fenner, & Klatt, 1999; see Figure 1C). In this model, individuals in a set are represented as separate symbols (object files), with no single mental entity representing the numerosity of a set. Rather, comparisons between sets are performed by one-to-one matching between the object files for each set. Because there is a limit to the number of objects that can be held in mind simultaneously, numerical judgments are subject to a magnitude limit. That is, accurate judgments should not be possible for numerosities that exceed that attentional limit, typically taken to be four or fewer objects (Feigenson, Carey, & Hauser, 2002).
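As a rough illustration of the contrast between the two models, their qualitative predictions can be sketched in a few lines of Python. The function names, the use of the plain large-to-small ratio as a difficulty index, and the default limit of four items are illustrative assumptions, not part of the study's analysis.

```python
def weber_ratio(small: int, large: int) -> float:
    """Ratio of the larger to the smaller numerosity; values near 1 are harder."""
    if not 0 < small < large:
        raise ValueError("expected 0 < small < large")
    return large / small


def within_object_file_limit(small: int, large: int, limit: int = 4) -> bool:
    """True if both sets fit within the assumed attentional limit."""
    return max(small, large) <= limit


# Compare an easy small-set pair, a hard large-set pair, and an extreme pair.
for pair in [(2, 3), (7, 8), (1, 8)]:
    print(pair, round(weber_ratio(*pair), 2), within_object_file_limit(*pair))
```

Under these assumptions, 7 versus 8 has a ratio closer to 1 than 2 versus 3 and so should be harder on the analog magnitude account, whereas the object-file account separates pairs by whether they exceed the limit rather than by ratio.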
The Current Study

The purpose of the current study was to look for evidence of understanding relative numerosity in the bottlenose dolphin and, if successful, to explore the underlying representational system. Dolphins were presented with two arrays of dots that varied with respect to dot sizes and positions (see Figure 1A). The dolphins were trained to choose the exemplar with the fewer number of items, using just a few specific training pairs (e.g., 2 vs. 6, 1 vs. 3, and 3 vs. 7). Generalization of this rule was then tested by presenting the dolphins with all possible pairwise comparisons between 1 and 8. If the dolphins' performance is consistent with Weber's law, this would suggest an underlying analog magnitude representation. If their performance instead evidences magnitude limits, this would suggest an underlying object-file representation.

Figure 1. Examples of stimuli (A) and corresponding analog magnitude (B) and object-file (C) representations.

Experiment 1

Method

Subject and testing environment. The subject was a male Atlantic bottlenose dolphin (Tursiops truncatus) named Talon, who was born at the Dolphin Research Center in Grassy Key, Florida, and was 10 years old at the time training was initiated. Talon was selected because he was male (to avoid a potential baby-induced hiatus), relatively young, and had time available in his training and public program schedule. He resided primarily with another 12-year-old male dolphin in one of three natural seawater lagoons situated on the Gulf of Mexico. Lagoon A measured 36.6 × 27.4 m and up to 6.1 m deep, depending on tide state; Lagoon B measured 36.6 × 13.7 m and up to 3.7 m deep; and Lagoon C measured 41.1 × 15.2 m and up to 4.9 m deep. Training took place in any of the three lagoons, and testing took place exclusively in Lagoon A. All sessions were videotaped using a camera located across the lagoon from the testing dock.
At various times throughout training and testing, the group was joined by any of three other male dolphins. Talon was fed between 5.5 and 7.7 kg (M = 6.7 kg) of capelin, smelt, and herring daily, approximately 33% of which he received during experimental sessions. During nonexperimental sessions, he continued to participate in other training sessions, including public programs and in-water interactions with trainers and guests.

Stimuli and apparatus. On each trial, the dolphin was presented with two stimulus arrays of dots. Each array was presented on a 61 × 61-cm black plywood board divided into 16 evenly spaced locations (arrayed in a 4 × 4 matrix). Four 1.3-cm-wide strips of black hook-and-loop fastener were attached horizontally to the front of the plywood, running through the center of each row of the matrix. White plywood dots of three different diameters (2.5, 5, and 10 cm) served as the stimuli and were attached to the boards with hook-and-loop fastener.

Across trials, the stimuli varied with respect to both dot sizes and positions, to ensure that the dolphin could not respond on the basis of nonnumerical cues such as stimulus area or pattern. Dot sizes were assigned randomly, subject to surface area constraints. Dot locations varied randomly, subject to the constraint that each dot must be adjacent to at least one other dot in a single continuous configuration (see Figure 1A). This prevented a situation in which it could appear that there were two separate groups of dots on the same board.

The two stimulus arrays were attached to two arms of a presentation apparatus constructed of PVC pipe, at a distance of 1.3 m apart (see Figure 2). The apparatus was designed so that the stimulus boards could be simultaneously rotated from the loading position at 1.0 m back from the edge of the dock, facing up, down to the presentation position of 0.2 m above the water, facing the dolphin.
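The dot-placement constraint described above (a single continuous configuration on a 4 × 4 grid) can be sketched as follows. This is only one way such layouts might be generated: the article does not say how the randomization was carried out, and treating "adjacent" as orthogonal neighbours only is an assumption, as are all function names.

```python
import random

GRID = 4  # each board was divided into a 4 x 4 matrix of locations


def neighbours(cell):
    """Orthogonally adjacent cells inside the grid (this adjacency rule is an assumption)."""
    r, c = cell
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [(r + dr, c + dc) for dr, dc in steps
            if 0 <= r + dr < GRID and 0 <= c + dc < GRID]


def random_connected_layout(n_dots, rng=random):
    """Grow one connected cluster of n_dots cells from a random starting cell."""
    if not 1 <= n_dots <= GRID * GRID:
        raise ValueError("n_dots must be between 1 and 16")
    layout = {(rng.randrange(GRID), rng.randrange(GRID))}
    while len(layout) < n_dots:
        # Candidate cells are empty cells touching the cluster built so far.
        frontier = sorted({nb for cell in layout for nb in neighbours(cell)} - layout)
        layout.add(rng.choice(frontier))
    return layout


def is_connected(layout):
    """Check that every dot can be reached from any other via adjacent dots."""
    start = next(iter(layout))
    seen, stack = {start}, [start]
    while stack:
        for nb in neighbours(stack.pop()):
            if nb in layout and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == set(layout)
```

Growing the cluster one neighbouring cell at a time guarantees the single-group property by construction; `is_connected` is included only as an independent check.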
The area between the two arms of the apparatus was blocked by a research screen constructed of PVC pipe and opaque canvas. An experimenter stood behind the screen out of the dolphin's field of view and used a periscope-like device for one-way viewing of the dolphin. A trainer sat in front of the research screen facing the dolphin. The trainer could not see the front of the stimulus arrays and was blind as to which array was correct. Between trials, canvas side flaps were raised to prevent the dolphin from viewing preparations for the next trial.

Figure 2. Experimental set-up. A: Front view. Note that the trainer cannot see the stimulus arrays, and is blind as to which side is correct. The experimenter watched the dolphin through the periscope located on top of the research screen. B: Side view, showing apparatus at loading position, then rotated into presentation position.

Procedure. Trials ran as follows: Once both boards were loaded onto the apparatus, the side flaps were lowered and the experimenter called out, "Ready." The trainer asked the dolphin to station with his head out of the water, then called back, "Ready." The apparatus was rotated into position and then the trainer gave a hand signal while saying "Less." The experimenter observed the dolphin through the periscope and blew a whistle if he touched the correct board or simply removed the stimuli if he was incorrect. On correct trials, the trainer provided positive reinforcements of fish and social interaction. On incorrect trials, the trainer's response remained neutral.

During the training phase, training decisions such as whether to repeat a trial, how much to reward, and whether to end a session early were left to the discretion of the trainer. During the testing phase, all such decisions were fixed. Normally, one session was run per day, 5 days per week. Sessions typically lasted 20 to 30 min. During sessions, any other dolphins present in the lagoon were kept busy at a separate dock by another trainer.

Training. Three pairs of numerosities were designated as training pairs: 2 versus 6, 1 versus 3, and 3 versus 7. All other combinations were presented for the first time during the testing phase. Initial training made use of only a single numerosity pair: 2 versus 6.² For the earliest stages of training, the discrimination was simplified as much as possible by presenting only medium-sized dots at fixed locations. In addition, the dots on the correct stimulus array were affixed to the board with springs and the board was jostled, providing a "wiggle" cue. Errorless trials in which only the two-dot array was presented were used to train the dolphin to approach the wiggling board and dots.³

² On a single occasion, Talon was accidentally presented with a 2-versus-5 comparison when a dot fell off of the six-dot board.

Training progressed in several steps: First, introduce choice (i.e., nonerrorless) trials; second, eliminate the wiggle cue; third, vary dot locations; fourth, introduce small- and large-sized dots; and fifth, introduce "reversed" trials (i.e., trials in which the array with the fewer number of dots covered more surface area than the array with the greater number of dots; e.g., the two-dot board may have had two large dots that covered 162 cm² of surface area, whereas the six-dot board may have had two small and four medium dots that covered only 91 cm² of surface area).

Training sessions typically consisted of 20 trials and included three types of trials: (a) standard trials at a difficulty level that the dolphin had demonstrated he was capable of solving successfully, (b) errorless trials that previewed the next training step, and (c) probe trials consisting of choice trials at the next training step. For example, consider the point at which the wiggle cue had just been eliminated and varying dot locations were being introduced.
At this point, standard trials consisted of medium dots in fixed locations (which the dolphin had already mastered), errorless trials consisted of one board with two medium dots in any location (showing the dolphin the correct answer for the next step), and probe trials consisted of a board of two versus a board of six medium dots in any location (i.e., the next step). In a typical session, 25% of the trials were errorless. The number of standard and probe trials changed dynamically in such a way that the number of probes increased each time the dolphin demonstrated competency, until the previous probes became the new standard at the next training step.

This procedure worked well until the last step, in which we introduced reversed trials. It became apparent that Talon had been discriminating based on the amount of white space (i.e., surface area of the stimuli) on each board rather than on the number of dots. We tried two strategies to get him past this conceptual difficulty. First, we asked Talon to carry the dots to the boards, and also to try the discrimination using common objects in place of the dots (i.e., we attached objects such as soda cans and paint brushes to the boards). The objective was to try to get him to view the dots as objects rather than as amounts of "stuff." However, Talon reacted to this by losing his motivation to participate in the research. We thus selected a new combination of dots that resulted in nearly equal amounts of white space on both boards: two large dots (162 cm² of surface area) versus one small, four medium, and one large dot (167 cm² of surface area). We then repeated Training Steps 2 (eliminating the wiggle) and 3 (varying dot locations).

After Talon had reached criterion on the comparison of 2 versus 6 with matched surface area, a second pair of numerosities, 1 versus 3, was introduced in probe trials, also with nearly equal amounts of white space on both boards.
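The surface-area figures quoted for the training boards can be checked with simple circle arithmetic. One caveat: the rounded diameters given in the text (2.5, 5, and 10 cm) yield about 78.5 cm² for a large dot, not the roughly 81 cm² implied by "two large dots (162 cm²)." The sketch below therefore assumes the dots were nominally 1, 2, and 4 in. across (2.54, 5.08, and 10.16 cm). That is an inference that happens to reproduce the reported areas, not something the article states, and the names are hypothetical.

```python
from math import pi

# Assumed true diameters in cm (1, 2, and 4 inches); the article reports 2.5, 5, and 10 cm.
DIAMETER_CM = {"small": 2.54, "medium": 5.08, "large": 10.16}


def board_area(**counts: int) -> float:
    """Total white surface area (cm^2) for a board with the given dot counts."""
    return sum(n * pi * (DIAMETER_CM[size] / 2) ** 2 for size, n in counts.items())


two_large = board_area(large=2)                       # two-dot board
six_reversed = board_area(small=2, medium=4)          # six-dot board on a reversed trial
six_matched = board_area(small=1, medium=4, large=1)  # six-dot board with matched area

print(round(two_large), round(six_reversed), round(six_matched))
# prints: 162 91 167
```

On the reversed example, the two-dot board carries more white area (162 cm²) than the six-dot board (91 cm²), so a strategy based on surface area picks the wrong board.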
The number of probe trials was gradually increased until the session was divided evenly between both training pairs (2 vs. 6 and 1 vs. 3). The final step was to gradually increase the difference in the amount of white space between the boards, with half of the trials being reversed, until the difference in white space became noticeable. This time, performance on reversed trials was equal to performance on "consistent" trials (i.e., trials in which the array with the fewer number of dots also covered less surface area than the array with the greater number of dots), indicating that Talon had come to disregard the amount of white space and was now discriminating based on the number of dots.

Finally, a third numerosity pair, 3 versus 7, was introduced in probe trials. Talon was immediately correct on 100% of these trials over two sessions, including both consistent and reversed trials. An additional session was run consisting of an even mix of all three known numerosity pairs. This marked the end of the training phase. In total, by the end of training, Talon had seen 572 errorless trials (i.e., a single array of two dots), 2,764 trials of 2 versus 6, 332 trials of 1 versus 3, and 15 trials of 3 versus 7.

Testing design. Once training was completed, generalization of the "less" rule was tested by presenting the dolphin with all possible pairwise comparisons between 1 and 8. As in training, stimuli were varied with respect to dot sizes and positions. In addition, testing exemplars for each comparison were counterbalanced such that half the trials were reversed with respect to surface area, and half were consistent. Each new pair of numerosities was tested eight times, distributed over 19 sessions. Each session consisted of 20 trials, with familiar trials (i.e., 2 vs. 6, 1 vs. 3, and 3 vs. 7) and test trials intermixed. For the first 7 sessions, there were 12 familiar trials and 8 test trials; for the remaining sessions, there were 8 familiar trials and 12 test trials.
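The trial counts in this design are internally consistent, which a short enumeration makes explicit. The sketch below simply lists the pairs and redoes the arithmetic from the numbers given in the text; the variable names are illustrative.

```python
from itertools import combinations

all_pairs = list(combinations(range(1, 9), 2))   # every small-vs-large pairing of 1..8
trained = {(2, 6), (1, 3), (3, 7)}               # Experiment 1 training pairs
novel = [p for p in all_pairs if p not in trained]

test_trials_needed = len(novel) * 8              # each novel pair was tested eight times
test_trials_scheduled = 7 * 8 + (19 - 7) * 12    # 7 early sessions, then 12 later sessions

print(len(all_pairs), len(novel), test_trials_needed, test_trials_scheduled)
# prints: 28 25 200 200
```

The 25 novel pairs tested eight times each account for exactly the 200 test trials that the 19 sessions provide.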
The specific numerosity pairs tested in each session were chosen randomly, with the constraints that no more than two of the same numerosity pair could occur in any given session, and no consecutive numerosity pairs (e.g., 3 vs. 4) occurred during the first 3 sessions. During each session, half of the trials were reversed with respect to surface area and half were consistent; order of trials was randomized, with the constraint that there were no more than 3 test trials in a row (in the final session there were 5 consecutive test trials); and correct side was assigned randomly, with the constraint that there were never more than three consecutive correct answers on either side.

Coding. The dolphin was coded as making a choice when his rostrum contacted one of the stimulus boards. Accuracy of his choices was coded live during the sessions and later checked by a second experimenter from the videotapes. Initial orientation was defined as the first direction the dolphin faced after the start of the signal and was coded from the videotapes. Because 12 test trials were not videotaped, analyses of initial orientation were conducted using the remaining 188 test trials. A second experimenter independently coded initial orientation for 20% of the generalization trials. Reliability between the two coders was 99% for final accuracy and 100% for initial orientation.

Results

All analyses were conducted on the data from novel numerosity pairs only. The data from trained pairs are included in tables for completeness.

Overall accuracy. Overall, the dolphin chose the exemplar with the fewer number of dots on 83% of the generalization trials (binomial test, p < .001), showing that he could recognize and represent numerosities on an ordinal scale. The accuracy of these judgments was not significantly related to whether surface area was consistent or reversed, χ² corrected for continuity (1, N = 200) = 1.25, ns.
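For readers who want to see what the binomial tests amount to, the following is a minimal sketch of an exact test against chance (p = .5). The one-sided form is an assumption, since the article does not state the sidedness, and the count of 166 correct is inferred from "83% of 200" rather than reported directly.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """One-sided probability of at least `successes` correct responses by chance."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes, trials + 1))


# 83% of 200 generalization trials is about 166 correct (count inferred, not reported).
overall = binomial_tail(166, 200)

# With eight trials per pair, 7 of 8 correct clears .05 one-sided but 6 of 8 does not.
seven_of_eight = binomial_tail(7, 8)   # 9/256, about .035
six_of_eight = binomial_tail(6, 8)     # 37/256, about .145

print(overall < 0.001, seven_of_eight < 0.05, six_of_eight < 0.05)
# prints: True True False
```

The per-pair thresholds match the pattern of asterisks in Table 1, where proportions of .88 (7 of 8) are marked significant and proportions of .75 (6 of 8) are not.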
Because every trial presented a new combination of dot sizes and positions, the only way to succeed was to recognize the numerosity of the arrays. It is theoretically possible, however, that the dolphin could have recognized these numerosities as nominal categories and then learned their ordinal relations extremely rapidly after the first reinforcement. However, first-trial data showed that the dolphin correctly judged the relative numerosities even the first time he was presented with each comparison (76% correct, binomial test, p < .01).

Error patterns. To assess underlying numerical representation, we next analyzed the dolphin's error patterns. The proportion of correct responses for each numerosity pair is presented in the top panel of Table 1. A regression analysis showed a significant linear effect of the ratio of large to small numerosity on the proportion of correct responses for each numerosity pair (β = .488, p < .02). In other words, the dolphin tended to make more errors on trials in which the ratio between the two presented numerosities was small, as predicted by Weber's law. This is consistent with the predictions of the analog magnitude model.

³ Errorless trials were those in which only a single, correct board was presented. Technically speaking, there is no correct answer regarding which board has fewer dots when only one board is presented. However, the only comparison presented in the early trials was 2 versus 6, in which the two-dot board was always the correct answer. The errorless trials were included to help the dolphin learn that rule. Errorless trials were faded out entirely when reversed trials were introduced (Training Step 5).

We next looked at the effect of set size. The object-file model would predict perfect or near-perfect performance for set sizes under a particular threshold, coupled with chance performance above that threshold. We tested each small numerosity for such a magnitude limit.
Binomial tests showed greater than chance performance (ps < .05) both above and below each small numerosity from 1 through 6. It was only above a possible threshold of 7 where performance was at chance. This is much higher than the magnitude limit of four or fewer items typically reported for humans and nonhuman primates (e.g., Feigenson et al., 2002; Hauser et al., 2000). Moreover, the failure at 7 versus 8 (the only numerosity pair above this threshold) is also predicted by the analog magnitude model. Thus, we found no clear evidence to support the predictions of the object-file model.

First look. As a further check on the dolphin's possible numerical representations, we next examined his initial orientation to the stimulus boards. On all trials, the dolphin oriented toward one of the boards immediately as the signal was being given. This initial orientation was correct on 78% of the generalization trials (binomial test, p < .001), showing that he had made an initial choice prior to the signal. On some trials, he then made an orientation reversal, in which he turned back toward the other stimulus array (sometimes choosing that second array, and other times coming back to his original choice). One possibility, therefore, is that he might have initially used an object-file representation but then reverted to an analog magnitude representation when the set sizes were not sufficiently small for object files to handle. If so, one would expect that the error pattern for initial orientation would show the magnitude limit signature of the object-file model rather than the Weber's law signature of an analog magnitude system.

The proportion of correct initial orientations for each numerosity pair is presented in the bottom panel of Table 1. A regression analysis showed a significant linear effect of the ratio of large to small numerosity on the proportion of correct initial orientations for each numerosity pair (β = .643, p < .01), consistent with Weber's law.
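The ratio analysis of final choices can be approximated from the final-choice panel of Table 1. The sketch below computes the Pearson correlation between the large-to-small ratio and the proportion correct for the 25 novel pairs; with a single predictor this equals the standardized regression slope. It is a reconstruction from rounded table values, not the authors' own analysis code, and the trained pairs (1 vs. 3, 2 vs. 6, 3 vs. 7) are omitted as in the article.

```python
from math import sqrt

# Proportion correct for novel pairs, keyed by (small, large), transcribed from Table 1.
accuracy = {
    (1, 2): .38, (1, 4): 1.00, (1, 5): 1.00, (1, 6): 1.00, (1, 7): 1.00, (1, 8): 1.00,
    (2, 3): .88, (2, 4): 1.00, (2, 5): .88, (2, 7): 1.00, (2, 8): 1.00,
    (3, 4): .63, (3, 5): .63, (3, 6): .88, (3, 8): 1.00,
    (4, 5): .38, (4, 6): 1.00, (4, 7): .75, (4, 8): .88,
    (5, 6): .50, (5, 7): 1.00, (5, 8): 1.00,
    (6, 7): .75, (6, 8): .88,
    (7, 8): .25,
}


def pearson_r(xs, ys):
    """Pearson correlation; equal to the standardized slope in a one-predictor regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


ratios = [large / small for small, large in accuracy]
r = pearson_r(ratios, list(accuracy.values()))
print(round(r, 2))
```

Run on these rounded values, the coefficient comes out at roughly .49, in line with the .488 reported for final choices.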
Further, binomial tests showed above-chance performance (ps < .05) for each small numerosity from 1 through 6, with chance performance only appearing above a threshold of 7 (i.e., for the comparison of 7 vs. 8). As was the case with final accuracy, the results of initial orientation are more consistent with the predictions of the analog magnitude model than with those of the object-file model.

Discussion

When presented with displays of dots that controlled for nonnumerical cues, a bottlenose dolphin was able to consistently choose the array with the fewer number of dots. This was true even for first-trial data on novel numerosity pairs. His performance, measured by both initial orientation and final choice, was predicted by the ratio of the numerosities presented, consistent with Weber's law. The only evidence of a magnitude limit came at a possible threshold of 7. However, as the only comparison in that set was 7 versus 8, this result is also predicted by Weber's law. Thus, the dolphin's performance seems better explained by an underlying analog magnitude model than by the object-file model.

One puzzling result from Experiment 1 was that the dolphin performed poorly on the comparison of 1 versus 2 dots. This should be an easy problem on any theory of numerical comparison. It is important to note, however, that this difficulty could have arisen as an artifact of the training procedure. Because the vast majority of his training was performed on the 2-versus-6 discrimination (in which 2 was the correct answer), it could be that the dolphin simply developed a strong bias to pick 2 whenever it was present.

In Experiment 2, we replicate Experiment 1 with another dolphin, with two differences in the training procedure. First, the majority of training used the numerosity pair 1 versus 8, rather than 2 versus 6. If Talon's difficulty with 1 versus 2 was a training artifact, then a second dolphin with a different training history should not show this same difficulty.
Second, we equalized the white space between stimulus boards earlier in the training process to avoid the difficulties that Talon encountered when he initially based his choices on stimulus area rather than on numerosity.

Table 1
Experiment 1: Proportion of Talon's Correct Responses for Each Numerosity Pair

Small                               Large numerosity
numerosity            2        3        4        5        6        7        8

Final choice
1                   .38   (.85*)    1.00*    1.00*    1.00*    1.00*    1.00*
2                           .88*    1.00*     .88*  (1.00*)    1.00*    1.00*
3                                    .63      .63     .88*   (.88*)    1.00*
4                                             .38    1.00*     .75      .88*
5                                                      .50    1.00*    1.00*
6                                                               .75     .88*
7                                                                        .25

Initial orientation
1                   .50    (.79)    1.00*    1.00*    1.00*    1.00*    1.00*
2                           .88*     .88*     .83   (.96*)    1.00*    1.00*
3                                    .63      .29     .88*    (.78)     .71
4                                             .50      .67     .75      .67
5                                                      .75     .75      .75
6                                                               .86     .50
7                                                                        .75

Note. Talon is the bottlenose dolphin who was the subject in Experiment 1. Parentheses indicate trained discriminations.
* p < .05.

Experiment 2

Method

Subject and testing environment. The subject was a male Atlantic bottlenose dolphin named Rainbow, who was collected from the Gulf of Mexico at approximately 4 years of age and was approximately 24 years old at the time training was initiated. He resided in one of two natural seawater lagoons situated on the Gulf of Mexico. Lagoon D measured 28.6 × 18.3 m and up to 4.9 m deep, depending on tide state; and Lagoon E measured 27.4 × 13.1 m and up to 2.4 m deep. Training took place in either of the two lagoons, and testing took place exclusively in Lagoon D. All sessions were videotaped using a camera located across the lagoon from the testing dock. At various times throughout training and testing, Rainbow was joined by any of five other dolphins. Rainbow was fed between 7.3 and 10.4 kg (M = 8.5 kg) of capelin, smelt, sardines, and herring daily, approximately 33% of which he received during experimental sessions. During nonexperimental sessions, he continued to participate in other training sessions, including public programs and in-water interactions with trainers and guests.
Stimuli and apparatus. The stimuli and apparatus were identical to those used in Experiment 1. Procedure. The procedure for individual trials was identical to that of Experiment 1. Training. Four pairs of numerosities were used as training pairs: 1 versus 8, 3 versus 7, 2 versus 4, and 4 versus 7. All other combinations were presented for the first time during the testing phase. Initial training made use of only a single numerosity pair: 1 versus 8. For the earliest stages of training, the discrimination was simplified as much as possible by presenting only medium-sized dots at fixed locations. In addition, the dot on the correct stimulus array was affixed to the board with a spring and the board was jostled, providing a wiggle cue. Errorless trials in which only the one-dot array was presented were used to train the dolphin to approach the wiggling board and dot. Training progressed in several steps: First, intro- duce choice (i.e., nonerrorless) trials; second, eliminate the wiggle cue; third, vary dot locations; and fourth, equalize white space on the boards. We selected a new combination of dots that resulted in nearly equal amounts of white space on both boards: one large dot (81 cm 2 of surface area) versus five small and three medium dots (86 cm 2 of surface area). As in Experiment 1, training sessions typically consisted of 20 trials and included three types of trials: (a) standard trials at a difficulty level that the dolphin had demonstrated he was capable of solving successfully, (b) errorless trials that previewed the next training step, and (c) probe trials consisting of choice trials at the next training step. The number of standard and probe trials changed dynamically in such a way that the number of probes increased each time the dolphin demonstrated competency, until the previous probes became the new standard at the next training step. 
Errorless trials were faded out entirely at the point where we introduced equalized white space between boards (Training Step 4).

After Rainbow had reached criterion on the comparison of 1 versus 8 with similar surface areas, a second pair of numerosities (3 versus 7) was introduced in probe trials, also with nearly equal amounts of white space on both boards. The number of probe trials was gradually increased until the session was divided evenly between both training pairs (1 vs. 8 and 3 vs. 7). The final step was to gradually increase the difference in the amount of white space between the boards, until the difference in white space became noticeable. Then a third numerosity pair (2 versus 4) was introduced in probe trials. Rainbow was correct on seven out of eight (88%) of these trials over two sessions, including both consistent and reversed trials.

At this point, Rainbow unfortunately began to decline to participate in research sessions, because of the distraction of a female dolphin who had been moved into his lagoon. Over the next 3 months, he chose to participate on only 4 days. When he regained interest in research in December, we decided to delay testing until January, to accommodate staff vacations. In the meantime, we continued to run research sessions a few times a week, using all three training pairs to date. In January, we introduced a fourth numerosity pair (4 versus 7) in probe trials. Rainbow was correct on 12 out of 14 (86%) of these trials over three sessions, including both consistent and reversed trials. This marked the end of the training phase. In total, by the end of training, Rainbow had seen 301 errorless trials (i.e., a single array of one dot), 1,861 trials of 1 versus 8, 524 trials of 3 versus 7, 118 trials of 2 versus 4, and 14 trials of 4 versus 7.

Testing design. The testing design was the same as in Experiment 1, with the following differences: Each new pair of numerosities was tested eight times, distributed over 18 sessions.
Each session consisted of 20 trials, with familiar trials (i.e., 1 vs. 8, 3 vs. 7, 2 vs. 4, and 4 vs. 7) and test trials intermixed. For the first 6 sessions, there were 12 familiar trials and 8 test trials; for the remaining sessions, there were 8 familiar trials and 12 test trials.

Coding. Coding was identical to that of Experiment 1. Because 8 test trials were not videotaped, initial orientation analyses were conducted using the remaining 184 test trials. Reliability between the two coders was 100% for both final accuracy and initial orientation.

Results

All analyses were conducted on the data from novel numerosity pairs only. The data from trained pairs are included in tables and figures for completeness.

Overall accuracy. Overall, the dolphin chose the exemplar with the fewer number of dots on 82% of the generalization trials (binomial test, p < .001), showing that he could recognize and represent numerosities on an ordinal scale. The accuracy of these judgments was not significantly related to whether surface area was consistent or reversed, χ² corrected for continuity (1, N = 184) = 2.42, ns. As in Experiment 1, first-trial data showed that the dolphin correctly judged the relative numerosities even the first time he was presented with each comparison (88% correct, binomial test, p < .01).

Error patterns. To assess underlying numerical representation, we next analyzed the dolphin's error patterns. The proportion of correct responses for each numerosity pair is presented in the top panel of Table 2. As in Experiment 1, a regression analysis showed a significant linear effect of the ratio of large to small numerosity on the proportion of correct responses for each numerosity pair (β = .507, p < .02). That is, the dolphin tended to make more errors on trials in which the ratio between the two presented numerosities was small, as predicted by Weber's law.

We next looked at the effect of set size.
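Before turning to set size, the ratio effect reported above can be recomputed from the per-pair proportions. The following is a minimal Python sketch, not the authors' analysis script. It transcribes the novel-pair proportions from Table 2 (trained pairs omitted), uses the ratio of large to small numerosity as the single predictor, and computes the Pearson correlation, which in a one-predictor regression equals the standardized slope. Treating the reported coefficients (.507 for final choice, .445 for initial orientation) as that quantity is an assumption, and the names FINAL_CHOICE, INITIAL_ORIENTATION, pearson_r, and ratio_effect are illustrative only.

```python
from math import sqrt

# Proportion correct for NOVEL pairs only, transcribed from Table 2.
# Outer key = small numerosity, inner key = large numerosity.
# Trained pairs (1 vs. 8, 2 vs. 4, 3 vs. 7, 4 vs. 7) are left out.
FINAL_CHOICE = {
    1: {2: .88, 3: .75, 4: 1.00, 5: 1.00, 6: 1.00, 7: 1.00},
    2: {3: .63, 5: .63, 6: .88, 7: 1.00, 8: .88},
    3: {4: 1.00, 5: .25, 6: .88, 8: .88},
    4: {5: .75, 6: .88, 8: .88},
    5: {6: .75, 7: .75, 8: .88},
    6: {7: .50, 8: .88},
    7: {8: .75},
}
INITIAL_ORIENTATION = {
    1: {2: .63, 3: .63, 4: .88, 5: 1.00, 6: .88, 7: 1.00},
    2: {3: .67, 5: .50, 6: .86, 7: 1.00, 8: .75},
    3: {4: .88, 5: .38, 6: 1.00, 8: .88},
    4: {5: .63, 6: .75, 8: .63},
    5: {6: .88, 7: .71, 8: .88},
    6: {7: .38, 8: 1.00},
    7: {8: .63},
}


def pearson_r(xs, ys):
    """Pearson correlation; equals the standardized slope with one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ratio_effect(table):
    """Correlate large/small ratio with proportion correct across pairs."""
    ratios, accuracy = [], []
    for small, row in table.items():
        for large, proportion in row.items():
            ratios.append(large / small)
            accuracy.append(proportion)
    return pearson_r(ratios, accuracy)


if __name__ == "__main__":
    n_pairs = sum(len(row) for row in FINAL_CHOICE.values())
    print("novel pairs:", n_pairs)
    print("final choice:        r = %.3f" % ratio_effect(FINAL_CHOICE))
    print("initial orientation: r = %.3f" % ratio_effect(INITIAL_ORIENTATION))
```

Under these assumptions the sketch yields coefficients close to the reported values for both panels, computed over the 24 novel pairs. A positive coefficient means accuracy rises as the ratio between the two numerosities grows, which is the direction Weber's law predicts.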
The object-file model would predict perfect or near-perfect performance for set sizes under a particular threshold, coupled with chance performance above that threshold. We tested each small numerosity for such a magnitude limit. As in Experiment 1, binomial tests showed above chance performance (ps < .05) both above and below each small numerosity from 1 through 6, with chance performance only appearing above a threshold of 7 (i.e., for the comparison of 7 vs. 8). Thus, the dolphin's performance was more consistent with the predictions of the analog magnitude model than with the object-file model. Note that in contrast to Talon's performance in Experiment 1, Rainbow performed above chance with the comparison of 1 versus 2. It thus seems that Talon's difficulty with this comparison was likely an artifact of training history.

Table 2
Experiment 2: Proportion of Rainbow's Correct Responses for Each Numerosity Pair

                  Large numerosity
Small numerosity  2        3        4        5        6        7        8
Final choice
  1               .88*     .75      1.00*    1.00*    1.00*    1.00*    (1.00*)
  2                        .63      (.81*)   .63      .88*     1.00*    .88*
  3                                 1.00*    .25      .88*     (.93*)   .88*
  4                                          .75      .88*     (.88*)   .88*
  5                                                   .75      .75      .88*
  6                                                            .50      .88*
  7                                                                     .75
Initial orientation
  1               .63      .63      .88*     1.00*    .88*     1.00*    (.98*)
  2                        .67      (.71*)   .50      .86      1.00*    .75
  3                                 .88*     .38      1.00*    (.82*)   .88*
  4                                          .63      .75      (.80*)   .63
  5                                                   .88*     .71      .88*
  6                                                            .38      1.00*
  7                                                                     .63

Note. Rainbow is the bottlenose dolphin who was the subject in Experiment 2. Parentheses indicate trained discriminations.
p < .07. * p < .05.

First look. As a further check on the dolphin's possible numerical representations, we also examined his initial orientation to the stimulus boards. On all trials, the dolphin oriented toward one of the arrays immediately as the signal was being given. This initial orientation was correct on 78% of the generalization trials (binomial test, p < .001), showing that he had made an initial choice prior to the signal.
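The binomial tests used throughout these results compare the number of correct choices with chance (.5). The sketch below is an illustration rather than the authors' procedure: it computes an exact upper-tail probability with the standard library, and it assumes a one-sided test, which the text does not state. The helper name binom_tail is hypothetical.

```python
from math import comb


def binom_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p), i.e. a one-sided p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Each novel pair was tested eight times, so a cell of .88 is 7 of 8
    # and a cell of .75 is 6 of 8.
    for correct in (6, 7, 8):
        print(f"{correct} of 8 correct: p = {binom_tail(correct, 8):.4f}")

    # Assumption: 88% correct on first trials corresponds to 21 of 24 pairs.
    print(f"21 of 24 correct: p = {binom_tail(21, 24):.6f}")
```

With 8 trials per pair, 7 correct gives 9/256 (about .035) and 6 correct gives 37/256 (about .145), which matches the pattern of asterisks on .88 and their absence on .75 in the tables. If a cell of .86 reflects 6 of 7 coded trials, the same calculation gives 8/128 (about .063), which would fall under the p < .07 note of Table 2.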
The proportion of correct initial orientations for each numerosity pair is presented in the bottom panel of Table 2. As in Experiment 1, a regression analysis showed a significant linear effect of the ratio of large to small numerosity on the proportion of correct initial orientations for each numerosity pair (β = .445, p < .03), consistent with Weber's law. Further, binomial tests showed above chance performance (ps < .05) for each small numerosity from 1 through 6, with chance performance only appearing above a threshold of 7 (i.e., for the comparison of 7 vs. 8). Thus, as was the case with final accuracy, the results of initial orientation are more consistent with the predictions of the analog magnitude model than with those of the object-file model.

Discussion

The dolphin in Experiment 2 showed essentially the same pattern of results as the dolphin in Experiment 1. He consistently chose the array with the fewer number of dots, even the first time he was presented with novel numerosity pairs. His pattern of errors was predicted by the ratio of large to small numerosity, consistent with Weber's law. The only evidence of a possible magnitude limit came at a threshold of 7 (i.e., for the comparison of 7 vs. 8), which is also predicted by Weber's law. Thus, the dolphin's performance seems better explained by an underlying analog magnitude model than by the object-file model.

General Discussion

This study adds two new findings to the literature on animal concepts of number. First, when presented with displays of dots that controlled for nonnumerical cues, dolphins were able to consistently choose the array with the fewer number of dots. This was true even for first-trial data on novel numerosity pairs. Thus, bottlenose dolphins are able to discriminate numerosities and to reason about them with respect to an ordinal scale. Second, the pattern of errors in the dolphins' initial orientation and final choice conformed to Weber's law.
This suggests an underlying analog magnitude representational system as has been proposed to account for other human and nonhuman animal results (e.g., Dehaene, 1997; Gallistel & Gelman, 1992; Whalen, Gallistel, & Gelman, 1999). In contrast, we found no evidence of the type of magnitude limit that would suggest that the dolphins were using an object-file model in this task.

In their review of the literature, Davis and Perusse (1988) proposed that subitizing, which they characterized as a perceptual process that rapidly assesses the numerosity of a small quantity of items, can account for much of the data on animals' numerical abilities. Although the experiments presented here were not designed to speak to this issue, there are several aspects of our results that suggest that subitizing was not behind the performance of the dolphins in this task.

First, subitizing is generally agreed to operate only on numerosities up to 3 or 4 in humans (e.g., Dehaene, 1992; Mandler & Shebo, 1982). The dolphins, however, showed success at comparisons as large as 6 versus 8. Indeed, if subitizing were at work, the pattern we would expect to see would be the same sort of magnitude limit signature predicted by the object-file model, most likely with respect to the dolphins' first response (i.e., initial orientation). Recall, however, that we found no such evidence of a magnitude limit for either of our dolphins.

Second, the model of subitizing advocated by Davis and Perusse (1988) holds that each numerosity is recognized as a nominal category, similar to "triangle" and "square," and as such is not a numerical process at all. On that model, there is no mechanism that can account for the fact that the dolphins' error patterns conformed to Weber's law. Indeed, that model of subitizing cannot account for ordinal results at all (Brannon & Terrace, 2000).
If dolphins were to recognize 1, 2, and 3 in the same way that they might recognize cat, horse, and dog, then they would have to learn individually that 3 is greater than 2, which is in turn greater than 1. This study demonstrated that they can make exactly these kinds of judgments without needing to learn them individually.

References

Alsop, B., & Honig, W. K. (1991). Sequential stimuli and relative numerosity discriminations in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 17, 386–395.
Beran, M. J. (2001). Summation and numerousness judgments of sequentially presented sets of items by chimpanzees (Pan troglodytes). Journal of Comparative Psychology, 115, 181–191.
Boysen, S. T., & Berntson, G. G. (1995). Responses to quantity: Perceptual versus cognitive mechanisms in chimpanzees (Pan troglodytes). Journal of Experimental Psychology: Animal Behavior Processes, 21, 82–86.
Brannon, E. M., & Terrace, H. S. (1998, October 23). Ordering of the numerosities 1 to 9 by monkeys. Science, 282, 746–749.
Brannon, E. M., & Terrace, H. S. (2000). Representation of the numerosities 1–9 by rhesus macaques (Macaca mulatta). Journal of Experimental Psychology: Animal Behavior Processes, 26, 31–49.
Breukelaar, J. W. C., & Dalrymple-Alford, J. C. (1998). Timing ability and numerical competence in rats. Journal of Experimental Psychology: Animal Behavior Processes, 24, 84–97.
Call, J. (2000). Estimating and operating on discrete quantities in orangutans (Pongo pygmaeus). Journal of Comparative Psychology, 114, 136–147.
Davis, H., & Perusse, R. (1988). Numerical competence in animals: Definitional issues, current evidence, and a new research agenda. Behavioral and Brain Sciences, 11, 561–615.
Dehaene, S. (1992). Varieties of numerical abilities. Cognition, 44, 1–42.
Dehaene, S. (1997). The number sense: How the mind creates mathematics. New York: Oxford University Press.
Dooley, G. B., & Gill, T. V. (1977).
Acquisition and use of mathematical skills by a linguistic chimpanzee. In D. M. Rumbaugh (Ed.), Language learning by a chimpanzee: The LANA project (pp. 247–260). New York: Academic Press.
Emmerton, J., Lohmann, A., & Niemann, J. (1997). Pigeons' serial ordering of numerosity with visual arrays. Animal Learning and Behavior, 25, 234–244.
Feigenson, L., Carey, S., & Hauser, M. (2002). The representations underlying infants' choice of more: Object files versus analog magnitudes. Psychological Science, 13, 150–156.
Gallistel, C. R., & Gelman, R. (1992). Preverbal and verbal counting and computation. Cognition, 44, 43–74.
Gallistel, C. R., & Gelman, R. (2000). Non-verbal numerical cognition: From reals to integers. Trends in Cognitive Sciences, 4, 59–65.
Hauser, M. D., Carey, S., & Hauser, L. B. (2000). Spontaneous number representation in semi-free-ranging rhesus monkeys. Proceedings of the Royal Society of London: Biological Sciences, 267, 829–833.
Killian, A., Yaman, S., von Fersen, L., & Gunturkun, O. (2003). A bottlenose dolphin discriminates visual stimuli differing in numerosity. Learning & Behavior, 31, 133–142.
Machado, A., & Keen, R. (2002). Relative numerosity discrimination in the pigeon: Further tests of the linear-exponential-ratio model. Behavioural Processes, 57, 131–148.
Mandler, G., & Shebo, B. J. (1982). Subitizing: An analysis of its component processes. Journal of Experimental Psychology: General, 111, 1–22.
Mitchell, R. W., Yao, P., Sherman, P. T., & O'Regan, M. (1985). Discriminative responding of a dolphin (Tursiops truncatus) to differentially rewarded stimuli. Journal of Comparative Psychology, 99, 218–225.
Shumaker, R. W., Palkovich, A. M., Beck, B. B., Guagnano, G. A., & Morowitz, H. (2001). Spontaneous use of magnitude discrimination and ordination by the orangutan (Pongo pygmaeus). Journal of Comparative Psychology, 115, 385–391.
Simon, T. J. (1997).
Reconceptualizing the origins of number knowledge: A non-numerical account. Cognitive Development, 12, 349–372.
Thomas, R. K., & Chase, L. (1980). Relative numerousness judgments by squirrel monkeys. Bulletin of the Psychonomic Society, 16, 79–82.
Uller, C., Carey, S., Huntley-Fenner, G., & Klatt, L. (1999). What representations might underlie infant numerical knowledge. Cognitive Development, 14, 1–36.
Washburn, D. A., & Rumbaugh, D. M. (1991). Ordinal judgments of numerical symbols by macaques (Macaca mulatta). Psychological Science, 2, 190–193.
Whalen, J., Gallistel, C. R., & Gelman, R. (1999). Nonverbal counting in humans: The psychophysics of number representation. Psychological Science, 10, 130–137.

Received January 24, 2004
Revision received December 8, 2004
Accepted December 11, 2004