Risk Context Effects in Inductive Reasoning: An Experimental and Computational Modeling Study Kayo Sakamoto and Masanori Nakagawa Tokyo Institute of Technology, 2-21-1 O-okayama, Meguro-ku, Tokyo, 152-8552, Japan {sakamoto.k.ad, nakagawa.m.ad}@m.titech.ac.jp
Abstract. Mechanisms that underlie the inductive reasoning process in risk contexts are investigated. Experimental results indicate that people rate the same inductive reasoning argument differently according to the direction of risk aversion. In seeking to provide the most valid explanation of this, two kinds of models based on a Support Vector Machine (SVM) that process different knowledge spaces are proposed and compared. These knowledge spaces—a feature-based space and a category-based space—are both constructed from the soft clustering of the same corpus data. In simulations, the category-based model replicated the experimental findings for the two risk conditions slightly more successfully than the feature-based model, using two differently estimated sets of model parameters. Finally, the cognitive explanation of contextual inductive reasoning offered by the SVM-based category model is discussed.

Keywords: inductive reasoning, categorization, risk, natural language processing, corpus-based conceptual clustering, Support Vector Machines.
B. Kokinov et al. (Eds.): CONTEXT 2007, LNAI 4635, pp. 425–438, 2007. © Springer-Verlag Berlin Heidelberg 2007

1 Introduction

This study deals with one kind of inductive reasoning argument (e.g., Rips, 1975; Osherson, Smith, Wilkie, Lopez, and Shafir, 1990), such as:

The person likes wine.
The person doesn’t like beer.
--------------------------------
The person likes champagne.

In this type of argument, its strength (the likelihood of the conclusion below the line given the premises above the line) depends mainly on the entities in each sentence (e.g., “wine”, “beer”, “champagne”), since the sentences share the same basic predicate (e.g., “The person likes ~.” and “The person doesn’t like ~.”). However, in real-world situations, even reasoning-based behavior that involves such a simple argument evaluation can entail some element of risk context. For example, the relatively straightforward situation of giving somebody a present involves some risk. Even if you knew that the person in question likes wine but not beer, could you reasonably infer their reaction to receiving a bottle of champagne from you? If the person were a close friend, they would be unlikely to
take offense if they did not like champagne. Therefore, in this situation, you would be fairly safe in inferring that the person “probably” likes champagne. On the other hand, when faced with the risk of upsetting your boss, which could have more serious consequences, you might make a different inference, telling yourself how “unlikely” it is that the boss likes champagne. In these different risk contexts, the same argument strength should be evaluated differently, which means that human ratings of argument strength are by nature context-dependent. This study examines the impact of risk contexts on inductive reasoning. While several studies have discussed the context-dependency of inductive reasoning arguments (e.g., [3]), they have only addressed the issue with identical entity sets and by changing the predicate. Accordingly, they claim that the information required for similarity computation differs for different predicates, that is, for different semantic contexts. The present study, however, reports that identical arguments (consisting of the same premise and conclusion propositions) are rated differently in different situational contexts. Thus, the findings indicate that people modify the similarity information used to rate argument strength according to the given situational context, which results in different ratings. In particular, the situational contexts treated in the present study involve ‘concocted’ social evaluations: argument ratings are scored, and each score is presented as an index of the rater’s social ability. In such contexts, people tend to avoid the risk of a low evaluation of their social ability. The present study clarifies the effect of such risk contexts on ratings of inductive reasoning arguments.
The outline of this study is as follows. First, an experiment is described which indicates that people rate the same inductive reasoning arguments differently according to the direction of risk aversion. Then, two kinds of models based on a Support Vector Machine (SVM), which differ in how they explain the mechanisms underlying people’s performance in the experiment, are proposed, and their simulation results are compared in terms of fit to the data. Finally, we argue that inductive reasoning in risk contexts is best explained by an SVM-based category model that adjusts the similarities among positive premise entities, negative premise entities, and conclusion entities.
2 Experiment

An experiment was conducted to examine whether people’s ratings of inductive reasoning arguments are influenced by risk aversion strategies concerning social evaluations. Specifically, we compare participants’ ratings under two distinct risk conditions: in the over-estimation risk condition, over-estimated ratings of an argument’s likelihood entail a score-decreasing risk, while in the under-estimation risk condition, under-estimated ratings entail a score-decreasing risk.

2.1 Method

2.1.1 Participants
77 Japanese undergraduate students participated, of whom 34 were assigned to the over-estimation risk condition and the remaining 43 to the under-estimation risk condition.
2.1.2 Task and Condition
The experimental task was to rate the likelihood of inductive reasoning arguments on a 7-point scale (e.g., [13], [11]). Unlike the usual inductive reasoning task, each rating was scored according to its deviation from a ‘concocted’ right answer. A rating that corresponded to this right answer received the full score. In the over-estimation risk condition, the further the likelihood rating rose above the right answer, the greater the reduction in score; conversely, in the under-estimation risk condition, the further the rating fell below the right answer, the greater the reduction. Score allocations for each condition are shown in Table 1. Participants were told a cover story that their scores indicated their social ability. At the end of the experiment, each participant’s rank among all participants, based on total score, was announced.

Table 1. Allocation of scores in each risk condition
deviation from the right answer    UNDER condition   OVER condition
over 3 points overestimating       add 0             minus 100
2 points overestimating            add 35            minus 65
1 point overestimating             add 65            minus 35
corresponds to the right answer    add 100           add 100
1 point underestimating            minus 35          add 65
2 points underestimating           minus 65          add 35
over 3 points underestimating      minus 100         add 0
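Read as a procedure, the allocation in Table 1 amounts to the following scoring rule (a minimal sketch; the function name and the "over"/"under" condition labels are ours, not the paper's):

```python
def score(rating: int, right_answer: int, condition: str) -> int:
    """Score a 7-point likelihood rating against the 'concocted' right answer.

    condition is "over" (over-estimation risk) or "under" (under-estimation
    risk). A deviation in the penalized direction subtracts points, while a
    deviation in the safe direction still adds a diminishing bonus (Table 1).
    """
    deviation = rating - right_answer      # positive means over-estimating
    d = min(abs(deviation), 3)             # deviations of 3 or more are capped
    if d == 0:
        return 100                         # matching the right answer scores fully
    penalized = (deviation > 0) == (condition == "over")
    return {1: -35, 2: -65, 3: -100}[d] if penalized else {1: 65, 2: 35, 3: 0}[d]

# The same 2-point over-estimation is penalized in one condition only:
assert score(2, 0, condition="over") == -65
assert score(2, 0, condition="under") == 35
```

Note the asymmetry that drives the risk aversion strategies: a deviation in the safe direction never costs points, so shading one's ratings away from the penalized direction is always the prudent response.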
2.1.3 Materials
In the experiment, 4 sets of inductive reasoning arguments were used. Each set contained 8 arguments, and each argument consisted of 2 positive premises, one negative premise, and a conclusion. Within each set, the premises were fixed and combined with each of 8 conclusions. The premise and conclusion statements all consisted of a combination of a predicate (“Mr. A likes ~”) and an entity (e.g., steak), such as “Mr. A likes steak.” In the case of negative premises, the predicate involved a negative verbal form, such as “Mr. A doesn’t like Japanese noodles.” The positive and negative premise entities and the conclusion entities in each argument set were selected from those in [15] and [16]. The first of these earlier studies verified that participants can discriminate corpus-derived latent categories in their ratings of inductive reasoning arguments, while the second examined whether participants can distinguish between entities of the same latent category based on similarities with positive premise entities or negative premise entities. Those previous studies, however, did not consider contextual conditions. Accordingly, the concocted right answer for each argument was assigned by referring to stable rating data from the previous studies.
2.1.4 Procedure
The entire experimental procedure was controlled by a web application running in Internet Explorer 6.0 (see Figure 1). Participants took part in the experiment as an exercise in a web application class. The web application comprised an instruction section, 4 argument rating sections, and a final ranking announcement. In the instruction section, participants were given an overview of the experiment: a cover story concerning its purpose (to measure the social ability to guess another person’s preferences), the flow of the experiment, and the scoring/ranking system for their risk condition. During the first argument rating section, unlike the later sections, rating feedback was given. This included the right answer, the participant’s rating, the difference between the rating and the right answer, the participant’s current score, and the current maximum score. At the end of the first rating section, participants were told their interim scores and a false ranking (the same for all participants). During the subsequent rating sections, no rating feedback was provided apart from the interim score presented at the end of each section. After all the rating sections were completed, the experiment ended with participants being given their final scores and their actually computed rankings, followed by debriefing.
Fig. 1. Example of experiment in the under-estimation risk condition (translated into English)
2.2 Results

Only the data from the three subsequent rating sections (without feedback) were analyzed for differences between the two conditions. Ratings on the 7-point scale were converted to a numerical scale (−3 to 3). The average ratings over the 3 sets of arguments (24 arguments) were 0.014 (under-estimation risk condition) and 0.138 (over-estimation risk condition) respectively, differing significantly between the two conditions.

The estimated values for parameter a are positive (a > 0), while the estimated values for parameter b are negative (b < 0), in both conditions for both versions. This is consistent with the model’s assumption that the argument ratings are based on the similarities of conclusion entities with positive premise entities and the dissimilarities with negative premise entities. The absolute ratio of the estimated parameters (|b/a|) in the over-estimation risk condition is higher than in the under-estimation risk condition. This indicates that, in the proposed model, these parameters fulfill an adjustment role that can account for shifts in argument ratings under different risk conditions. Thus, the estimation results support the validity of the model’s response selection assumption—that response decisions based on similarity estimations are biased by risk aversion strategies in social evaluation contexts where the participant’s social ability is evaluated. In other words, participants in the over-estimation risk condition placed greater emphasis on the conclusion entities’ dissimilarities with the negative premise than participants in the under-estimation risk condition did. In the over-estimation risk condition, the “optimal” response is one that does not overly stress the conclusion entity’s similarity with the positive premises. In the under-estimation risk condition, the “optimal” response is the opposite: one that does not overly stress the conclusion entity’s dissimilarity with the negative premise.
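The adjustment role of the parameters can be illustrated with a deliberately simplified linear stand-in for the model's response rule (the weights a and b follow the paper; the linear form and all numeric values are illustrative assumptions, not the SVM computation itself):

```python
def argument_rating(sim_pos: float, sim_neg: float, a: float, b: float) -> float:
    # Simplified stand-in for the model's response rule: the conclusion
    # entity's similarity to the positive premise entities is weighted by
    # a (> 0), its similarity to the negative premise entity by b (< 0).
    return a * sim_pos + b * sim_neg

# Identical similarity pattern, condition-specific weights (illustrative):
sim_pos, sim_neg = 0.8, 0.5
r_under = argument_rating(sim_pos, sim_neg, a=1.0, b=-0.4)  # |b/a| = 0.4
r_over = argument_rating(sim_pos, sim_neg, a=1.0, b=-0.9)   # |b/a| = 0.9
assert r_over < r_under  # a larger |b/a| lowers the rating of the same argument
```

Because the similarities themselves are held fixed, the condition-dependent ratio |b/a| alone shifts the rating of the same argument, which is exactly the adjustment role attributed to the parameters above.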
4 Discussion

The present study adopts two approaches to examining inductive reasoning in risk contexts: an experimental study, and the proposal and simulation of computational models. The results of the experiment suggest that people’s ratings of an argument vary according to the risk context. When an over-estimation would incur a risk of the score being reduced, people tend to rate arguments as less “likely”, while the same argument tends to be rated as more “likely” when it is an under-estimation that risks decreasing the score. The results from simulations of the proposed models indicate that the SVM-based model that processes corpus-oriented categorical knowledge was more successful in replicating the empirical data than the model that processes corpus-oriented feature knowledge. The replication of the empirical data for the two risk conditions was achieved using two differently estimated sets of model parameters. Thus, the mechanisms that underlie inductive reasoning would seem to be based on category-based prototypical representations of conceptual knowledge and complex computations of similarity. In particular, the mechanism underlying the risk context effects in inductive reasoning is explained by similarity adjustment based on risk aversion strategies toward social evaluation contexts.
Fig. 2. Similarity adjustment for each risk condition: (1) the over-estimation risk condition; (2) the under-estimation risk condition. (Each panel positions “wine”, “beer”, and “champagne” relative to the response boundaries of the “Person A likes” and “Person A doesn’t like” categories.)
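The boundary mechanism depicted in Figure 2 can be sketched as a simple membership test; the distance and radius values below are illustrative numbers, not quantities estimated from the model:

```python
def in_category(distance: float, boundary_radius: float) -> bool:
    # An entity falls inside the temporal "Person A likes" category when its
    # fixed similarity-space distance lies within the adjusted response boundary.
    return distance <= boundary_radius

d_champagne = 0.6                     # fixed distance to the positive premises
radius_over, radius_under = 0.5, 0.8  # the boundary shrinks as |b/a| grows

assert not in_category(d_champagne, radius_over)  # "unlikely": over-estimation risk
assert in_category(d_champagne, radius_under)     # "likely": under-estimation risk
```

The point of the sketch is that the distance never changes between conditions; only the boundary radius does, mirroring the figure's fixed inter-entity similarities and shifting dashed boundaries.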
Figure 2 is a graphic representation of the similarity adjustment mechanism for both risk conditions, based on ratings for the following argument:

Person A likes wine.
Person A doesn’t like beer.
--------------------------------
Person A likes champagne.

Note that the similarities between entities (distances in the graphic representation) do not change; rather, the response boundaries (the orange and green dashed lines) change. These changes can be attributed to the difference in the absolute ratio of the estimated parameters |b/a|. The relative adjustment sizes (boundary sizes) for the temporal category “Person A likes” and for the temporal contrast category “Person A doesn’t like” reflect the size of the ratio |b/a|. Since |b/a| for the
over-estimation risk condition is higher than that in the under-estimation risk condition, the relative adjustment size for the temporal category “Person A likes” in the over-estimation risk condition is smaller than that in the under-estimation risk condition. With such a mechanism, the conclusion “Person A likes champagne” will be less “likely” in the over-estimation risk condition and more “likely” in the under-estimation risk condition. This effect of different ratios |b/a| is brought about by people’s risk aversion strategies toward social contexts in which their responses based on inductive reasoning are evaluated. This study, which addresses Rips’s [13] type of inductive reasoning (called category-based induction or property induction) in risk contexts, is noteworthy in drawing together three particularly interesting interdisciplinary perspectives. First, the present study shows that the strength of an identical inductive reasoning argument depends on the risk context that is incorporated as part of the experimental design. Previous studies of contextual inductive reasoning have only dealt with arguments that have identical entity sets, changing the predicates (e.g., [3], [20]). Second, the cognitive mechanisms that underlie our empirical findings can be clearly explained by a model based on an SVM utilizing a kernel method that was originally inspired by mathematics (e.g., [21]). While it might also be possible to develop a multilayer neural network that adequately models inductive reasoning in risk contexts, if evaluated only in terms of data fitting, such a model would be unable to account for the cognitive mechanisms that underlie the adjustment of two kinds of similarities—one between a conclusion entity and the positive premise entities, and one between the conclusion entity and the negative premise entities. Finally, the knowledge spaces processed in the proposed model are constructed from corpus clustering results.
Previous models of inductive reasoning have utilized knowledge spaces constructed from psychological ratings (e.g., [19], [11]). However, because that methodology involves the vast cost of conducting psychological evaluations for an extraordinarily large number of features and similarities, those models all focused on entities within rather restricted domains, such as animal categories. In the proposed model, instead of relying on feature or similarity rating data, knowledge spaces are constructed from corpus-clustering results. This means that rating costs are avoided, because information for large numbers of words can be computed from a corpus. This makes it possible to conduct predictive simulations for many more conclusion entities than those used in the experiment in this paper. Moreover, these predictive simulations can be applied to constructing an induction-based search engine that searches for concepts similar to positive examples and dissimilar from negative examples (cf. [15]). Furthermore, since the corpus-based methodology is combined with a computational model treating contextual effects, the present study can also be applied to a ‘contextual’ induction-based search engine that provides search results appropriate to a given search context. For example, in the previously mentioned situation where you know that a person likes wine but not beer, and the risk of unsuitable search results is high (e.g., when you are thinking about a present for your boss, whom you risk upsetting), this kind of search engine would provide results that eliminate champagne. In a different search situation, when you are thinking about a present for a close friend, the search engine would provide results that include champagne. Therefore, the present study, which draws together three interdisciplinary perspectives, opens up exciting application possibilities.
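As a sketch of how such a ‘contextual’ induction-based search engine might rank candidates, the following hypothetical scoring weights similarity to negative examples by the search context's risk (all names and similarity values below are illustrative assumptions, not the paper's implementation):

```python
def contextual_search(sim_pos: dict, sim_neg: dict, risk_weight: float) -> list:
    """Rank candidate concepts by similarity to the positive examples minus a
    context-dependent, risk-weighted similarity to the negative examples."""
    scored = {c: sim_pos[c] - risk_weight * sim_neg[c] for c in sim_pos}
    return sorted(scored, key=scored.get, reverse=True)

# Toy similarities for the wine (positive) / beer (negative) example:
sim_pos = {"champagne": 0.8, "sake": 0.5, "cola": 0.1}
sim_neg = {"champagne": 0.6, "sake": 0.2, "cola": 0.1}

assert contextual_search(sim_pos, sim_neg, 0.2)[0] == "champagne"  # close friend
assert contextual_search(sim_pos, sim_neg, 1.5)[0] == "sake"       # risky boss
```

Raising the risk weight plays the same role as raising |b/a| in the model: it suppresses candidates that resemble the negative examples without changing the underlying similarities.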
Acknowledgments. This study was supported by the Tokyo Institute of Technology 21COE Program, “Framework for Systematization and Application of Large-scale Knowledge Resources”. Furthermore, the authors would like to thank Dr. T. Joyce, a postdoctoral researcher with the 21COE-LKR, for his critical reading of our manuscripts and valuable comments on an earlier draft.
References

1. Ashby, F.G., Maddox, W.T.: Relations between prototype, exemplar, and decision bound models of categorization. Journal of Mathematical Psychology 37, 372–400 (1993)
2. Hampton, J.A.: Polymorphous concepts in semantic memory. Journal of Verbal Learning and Verbal Behavior 18, 441–461 (1979)
3. Heit, E., Rubinstein, J.: Similarity and property effects in inductive reasoning. Journal of Experimental Psychology: Learning, Memory, and Cognition 20, 411–422 (1994)
4. Hofmann, T.: Probabilistic latent semantic indexing. In: Proceedings of the 22nd International Conference on Research and Development in Information Retrieval: SIGIR ’99, pp. 50–57 (1999)
5. Kameya, Y., Sato, T.: Computation of probabilistic relationship between concepts and their attributes using a statistical analysis of Japanese corpora. In: Proceedings of the Symposium on Large-scale Knowledge Resources: LKR2005 (2005)
6. Kruschke, J.K.: ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review 99, 22–44 (1992)
7. Kudo, T., Matsumoto, Y.: Japanese dependency analysis using cascaded chunking. In: Proceedings of the 6th Conference on Natural Language Learning: CoNLL 2002, pp. 63–69 (2002)
8. Matsuka, T., Nickerson, J.V., Jian, J.: A prototype model that learns and generalizes Medin, Altom, Edelson, & Freko (1982) XOR category structure as humans do. In: Proceedings of the Twenty-Eighth Annual Conference of the Cognitive Science Society (2006)
9. Medin, D.L., Altom, M.W., Edelson, S.M., Freko, D.: Correlated symptoms and simulated medical classification. Journal of Experimental Psychology: Learning, Memory, and Cognition 8, 37–50 (1982)
10. Nosofsky, R.M.: Attention, similarity, and the identification–categorization relationship. Journal of Experimental Psychology: General 115, 39–57 (1986)
11. Osherson, D.N., Smith, E.E., Wilkie, O., Lopez, A., Shafir, E.: Category-based induction. Psychological Review 97(2), 185–200 (1990)
12. Pereira, F., Tishby, N., Lee, L.: Distributional clustering of English words. In: Proceedings of the 31st Meeting of the Association for Computational Linguistics, pp. 183–190 (1993)
13. Rips, L.J.: Inductive judgments about natural categories. Journal of Verbal Learning and Verbal Behavior 14, 665–681 (1975)
14. Rosch, E.: On the internal structure of perceptual and semantic categories. In: Moore, T.E. (ed.) Cognitive Development and the Acquisition of Language, pp. 111–144. Academic Press, New York (1973)
15. Sakamoto, K., Terai, A., Nakagawa, M.: Computational models of inductive reasoning and their psychological examination: Towards an induction-based search engine. In: Proceedings of the Twenty-Seventh Annual Conference of the Cognitive Science Society, pp. 1907–1912 (2005)
16. Sakamoto, K., Nakagawa, M.: The effects of negative premises on inductive reasoning: A psychological experiment and computational modeling study. In: Proceedings of the Twenty-Eighth Annual Conference of the Cognitive Science Society, pp. 2081–2086 (2006)
17. Sakamoto, K., Terai, A., Nakagawa, M.: Computational models of inductive reasoning using the statistical analysis of a Japanese corpus. The Journal of Cognitive Systems Research (in press) (2007)
18. Sanjana, N.E., Tenenbaum, J.B.: Bayesian models of inductive generalization. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Advances in Neural Information Processing Systems 15. MIT Press, Cambridge, MA (2003)
19. Sloman, S.A.: Feature-based induction. Cognitive Psychology 25, 231–280 (1993)
20. Smith, E.E., Shafir, E., Osherson, D.: Similarity, plausibility, and judgments of probability. Cognition 49, 67–96 (1993)
21. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, Heidelberg (1995)