PROCEEDINGS OF THE
XXVIII SCIENTIFIC
CONFERENCE
EMPIRICAL
STUDIES IN
PSYCHOLOGY
31st MARCH - 3rd APRIL, 2022
FACULTY OF PHILOSOPHY, UNIVERSITY OF BELGRADE
INSTITUTE OF PSYCHOLOGY
LABORATORY FOR EXPERIMENTAL PSYCHOLOGY
FACULTY OF PHILOSOPHY, UNIVERSITY OF BELGRADE
Belgrade, 2022
PROGRAMME COMMITTEE
prof. dr Orlando M. Lourenço
prof. dr Claus-Christian Carbon
prof. dr Agostini Tiziano
prof. dr Lucia Tramonte
prof. dr Maria do Céu Taveira
prof. dr José M. Peiró
prof. dr Gonida Sofia-Eleftheria
prof. dr Laurie Beth Feldman
prof. dr Joana Maria Mas
doc. dr Milica Vukelić
doc. dr Ivana Stepanović Ilić
dr Zora Krnjaić
prof. dr Dejan Todorović
prof. dr Sunčica Zdravković
prof. dr Iris Žeželj
doc. dr Danka Purić
prof. dr Zvonimir Galić
prof. dr Dušica Filipović Đurđević
prof. dr Slobodan Marković
prof. dr Ksenija Krstić
prof. dr Dražen Domijan
doc. dr Oliver Toškov
doc. dr Olja Jovanov
doc. dr Dobrinka Kuzmanov
doc. dr Bojana Bodroža
doc. dr Ivana Jakovljev
doc. Dragan Janković
prof. dr Pavle Valerjev
prof. dr Denis Bratko
prof. dr Petar Čolović
doc. dr Jelena Matanov
dr Janko Međedović
doc. dr Marija Branković
dr Anja Wertag
dr Jelena Radišić
doc. dr Dragana Stanojev
doc. dr Maja Savić
dr Nataša Simić
dr Maša Popović
dr Darinka Anđelković
prof. dr Tamara Džamonja Ignjatović
doc. dr Kaja Damnjanov
dr Marko Živanović
dr Maša Vukčević Marković
prof. dr Goran Opačić
prof. dr Aleksandar Kostić
dr Zorana Zupan
dr Marina Videnović (chairwoman)
ORGANIZING COMMITTEE
dr Marina Videnović
prof. dr Slobodan Marković
prof. dr Dušica Filipović Đurđević
Olga Marković Rosić
doc. dr Ivana Stepanović Ilić
Ksenija Mišić
Milana Rajić
dr Marko Živanović
doc. dr Kaja Damnjanov
dr Nataša Simić
Teodora Vuletić
Anđela Milošević
Ana Avramović
Natalija Ignjatović
Milica Ninković
Jovan Ivanović
EDITORS
dr Marina Videnović, Research Associate
dr Nataša Simić, Senior Research Associate
doc. dr Ivana Stepanović Ilić
doc. dr Kaja Damnjanović, Research Associate
Milana Rajić, Research Assistant
Cover photo:
Spring-loaded switch (E. Zimmermann, Leipzig-Berlin)
from Collection of old scientific instruments, Laboratory for Experimental Psychology, Faculty of Philosophy,
University of Belgrade
Proofreading and layout: Teodora Vuletić
Towards an Accessible Assessment of Reasoning: The Relation of Statistical
Reasoning and Classic Reasoning Task Performance
Pavle Valerjev (valerjev@unizd.hr)
Department of Psychology, University of Zadar
Marin Dujmović (marin.dujmovic@bristol.ac.uk)
School of Psychological Science, University of Bristol
Abstract
Classic reasoning tasks typically require computerised
administration and tight experimental control, and are
therefore largely inaccessible to researchers outside the field.
While there have been attempts to develop reasoning
assessments, these have resulted in comprehensive yet
difficult-to-implement instruments. This study is part of a
project aimed at determining which key factors such an
instrument needs to cover while remaining easy to administer
and accessible. Modified versions of three standard reasoning
tasks (the base rate neglect task, the Linda problem and the
covariation detection task) were administered alongside a test
of statistical reasoning in order to assess how performance on
the tasks and on the test relate. The study was conducted in
two countries (Croatia and the UK) and two languages. A
robust relation between performance on the reasoning tasks
and on the test was observed in both samples. We conclude
that the full reasoning assessment must include a factor
covering statistical reasoning, as it is robustly related to
overall reasoning performance.
Keywords: dual-process theory; reasoning; statistical
reasoning; probabilistic reasoning; rationality
Introduction
Modern dual-process models of reasoning posit that there are
multiple different Type 1 processes (De Neys, 2012; Glockner
& Witteman, 2010; Pennycook et al., 2015). These can be
heuristics such as availability or representativeness, which
are traditional Type 1 processes, but also include logical
intuitions, probabilistic heuristics and others which have
traditionally been defined as Type 2. Tasks may cue multiple
Type 1 processes to produce responses. If conflict between
Type 1 responses arises and is detected, Type 2 processing is
required to resolve it (Dujmović & Valerjev, 2018; Pennycook
et al., 2015). The resolution of conflict may
result in accepting the dominant response, cognitive
decoupling in favour of an alternative, or abandoning all
Type 1 responses in favour of more analytical processing.
The tasks developed to probe these processes usually
require careful measurement in a computerized laboratory
and quite a labour-intensive process of developing batteries
of items. Stanovich, West, and Toplak (2016) have done
extensive work in an attempt to develop measures of
rationality. Their work resulted in the Comprehensive
Assessment of Rational Thinking (CART). The key word is
comprehensive: the assessment covers many factors that have
been identified as components of rational thought, but it is too
large and insufficiently focused for most researchers to administer.
The overall goal of this process is to develop a compact
reasoning assessment instrument from the perspective of
modern dual-process models, while being accessible to
researchers outside the field. The aim of this particular study
is to determine whether a measure of statistical reasoning,
which mostly measures probabilistic reasoning, shows robust
and significant relationships with performance on a number
of tasks usually used in reasoning research. These tasks
routinely include judgments of probability, or simply require
probabilistic reasoning in order to provide a normatively
correct response. It was expected that all of the tasks used in
the current study would correlate with the aforementioned
measure.
Methods
Participants
Participants were recruited from the UK (N = 298) and from
Croatia (N = 292). Samples were equalised by gender ratio
(71.19% female), urban-to-rural residence ratio (12.38%
rural, 8.81% small town, 78.81% urban), age (M = 31.42) and
highest achieved education level (46.78% high school, 29.66%
undergraduate, 23.56% graduate or higher).
Materials
Reasoning tasks
Three tasks adhere to a similar structure where one response
to the problem can be made based on what would be
normatively correct, and another can be made based on a
heuristic.
The first of these is the base rate neglect task (BR task),
which we further modified (De Neys & Glumicic, 2008;
Dujmović & Valerjev, 2018). The task can be seen in Figure
1. In this task the response can be made either based on the
mathematical probability or based on the stereotype.
Participants gave estimates of both populations being
extroverted and sociable, then they gave probability estimates
for each of the two responses.
Figure 1: Example of a modified base rate neglect task
Based on these four estimates, a bias score towards heuristic
reasoning was calculated (equations (1) and (2)).
p(A) = [p(char.A) × N(A)] / [p(char.A) × N(A) + p(char.B) × N(B)]   (1)

Bias = p(A) − estimated(A)   (2)
For example, a participant gave the estimate that only 5% of
all mathematicians are sociable and extroverted (p(char.A) in
(1)), but 90% of hospitality workers have those traits
(p(char.B) in (1)). Given that there are 993 mathematicians
(N(A) in (1)) and 7 hospitality workers (N(B) in (1)), p(A) in
(1) represents the normatively correct probability, given this
participant's estimates, that a randomly chosen person is a
mathematician. In this case that would have been 88.74%. If the
participant estimated the probability of a person being a
mathematician to be 70% (estimated(A) in (2)) then the bias
is 18.74%. This means that the participant was swayed by the
stereotype and the difference in how the traits are distributed
for mathematicians and hospitality workers. Even though this
participant is ascribing a higher probability to the person
being a mathematician, the estimate is sub-optimal. This has
the advantage of measuring bias even when participants give
a categorically correct response (higher probability for the
person being from the appropriate group). The resulting
measure is a continuous variable even when based on one or
a small number of tasks.
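Purely as an illustrative sketch (not part of the original study materials), the scoring described above can be expressed in a few lines of Python, using the worked-example values from the text; the sign convention follows the worked example (bias = normatively correct probability minus the participant's estimate):

```python
# Illustrative sketch of the BR-task bias score (equations (1) and (2)).
# The values below are the worked-example numbers from the text.

def base_rate_bias(p_char_a, n_a, p_char_b, n_b, estimated_a):
    """Bias (in percentage points) towards the stereotype.

    p_char_a, p_char_b: estimated proportions (0-1) of each group
    having the stereotyped traits; n_a, n_b: group sizes;
    estimated_a: participant's probability estimate (%) for group A.
    """
    # Equation (1): normatively correct probability for group A, in %
    p_a = 100 * (p_char_a * n_a) / (p_char_a * n_a + p_char_b * n_b)
    # Equation (2), as in the worked example: correct minus estimated
    return p_a - estimated_a

# 5% of 993 mathematicians vs. 90% of 7 hospitality workers; estimate 70%
print(round(base_rate_bias(0.05, 993, 0.90, 7, 70), 2))  # 18.74
```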
The second reasoning task was a modified version of the
Linda problem (Dujmović et al., 2021). In the classic task
people overestimate the probability of the conjunction of two
occurrences. The classic task can be seen in Figure 2.
Participants were asked to give independent probability
estimates for both individual occurrences, and their
conjunction.
Figure 2: Classic Linda problem
Based on the estimates, a bias towards the conjunction fallacy
was calculated (equations (3) and (4)).
p(A&B) = p(A) × p(B)   (3)

Bias = estimated(A&B) − p(A&B)   (4)
For example, if a participant estimates that the probability of
Linda being a bank teller is 25% (p(A) in (3)), and the
probability of Linda being active in the feminist movement is
70% (p(B) in (3)), then the correct probability of the
conjunction is 17.5% (p(A&B) in (3)). If the participant
estimated that the probability of the conjunction was 60%
(estimated(A&B) in (4)), then the bias is equal to 42.5%. This
means that the participant is estimating the conjunction to be
more probable than it actually is and the level of inaccuracy
is measured as the bias.
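As with the BR task, the conjunction-fallacy bias can be sketched in a few lines (illustrative only, using the worked-example values from the text):

```python
# Illustrative sketch of the conjunction-fallacy bias (equations (3)
# and (4)), using the worked-example values from the text.
# All values are in percent.

def conjunction_bias(p_a, p_b, estimated_conjunction):
    # Equation (3): correct conjunction probability under independence
    p_conjunction = p_a * p_b / 100
    # Equation (4): estimated conjunction minus the correct one
    return estimated_conjunction - p_conjunction

# Bank teller 25%, active in the feminist movement 70%,
# estimated conjunction 60%
print(conjunction_bias(25, 70, 60))  # 42.5
```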
The final reasoning task is the covariation detection task
(Valerjev & Dujmović, 2019). The task can be seen in Figure
3. A bias towards high frequencies, rather than towards
processing ratios, would lead participants to conclude that
positive outcomes are positively correlated with administration
of the vaccine, while the correct response is that positive
outcomes are negatively correlated with administering the
vaccine (responses were given on a -3 to 3 Likert scale).
The modified CRT
The cognitive reflection test (Frederick, 2005) is well known
and widely used for research into human rationality. The
prototypical example of a CRT item is the bat and ball
problem shown below.
A bat and a ball cost $1.10 in total.
The bat costs $1.00 more than the ball.
How much does the ball cost?
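The well-documented heuristic response to this item is 10 cents, while the normatively correct response is 5 cents; a minimal arithmetic check (illustrative, not part of the original materials), worked in integer cents to avoid floating-point noise:

```python
# The bat-and-ball item: ball + bat = 110 cents, bat = ball + 100 cents.
# Solving: ball + (ball + 100) = 110  =>  2 * ball = 10  =>  ball = 5.
total, difference = 110, 100      # $1.10 in total; bat costs $1.00 more
ball = (total - difference) // 2  # 5 cents, not the intuitive 10 cents
bat = ball + difference           # 105 cents
assert ball + bat == total and bat - ball == difference
print(ball)  # 5
```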
In this study we used a modified version consisting of four
items. Each item had four possible responses: the correct
(analytical) response, the quick but incorrect (heuristic)
response, and two fillers (Valerjev, 2020).
Figure 3: The Covariation detection task
The TSR
The Test of statistical reasoning (Rapan & Valerjev, 2020;
2021) is the first stage in developing a measure that could
correlate well with reasoning tasks. An example from the
TSR can be seen below.
A box contains 4 white, 6 blue, and 8 black balls. A single
ball is drawn. What is the probability that the ball is blue?
The test consisted of eleven time-limited (45 seconds per
item) 4-alternative forced-choice tasks.
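The sample item above reduces to elementary probability; an illustrative one-liner (not part of the test materials):

```python
# P(blue) for the sample TSR item: 4 white, 6 blue and 8 black balls.
from fractions import Fraction

p_blue = Fraction(6, 4 + 6 + 8)  # 6 favourable out of 18 equally likely
print(p_blue)  # 1/3
```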
Procedure
Participants completed the study via PsyToolkit (Stoet, 2010;
2017). Participants completed the CRT followed by the BR
task, the Linda problem, Covariation detection and the TSR.
The order was the same for each participant, but the order of
items/estimates within each task was randomized. Each
estimate/item was presented independently.
Scores on the CRT and TSR for each participant were
calculated as the sum of correct responses. In the BR task and
the Linda problem, the scores were calculated as a bias
towards heuristic thinking. Finally, scores on the covariation
detection task were mapped to a 1-7 scale on which higher
results indicate a stronger bias towards heuristic thinking.
Results
Descriptive statistics can be seen in Table 1.
Table 1: Descriptive statistics for the measures

Measure                   M      SD     Skewness
CRT                       1.94   1.25    0.04
BR bias                  19.66  28.91    0.76
Linda bias               11.86  20.78    1.24
Covariation detection     3.80   1.59   -0.18
TSR                       7.07   2.15   -0.29
To determine how well the CRT and reasoning tasks predict
scores on the TSR, regression analyses were conducted on the
combined sample and on each national sample (Table 2).
Table 2: Regression analyses with TSR scores as the
criterion and reasoning measures as predictors

UK sample
Predictor                  r      β      t
CRT                       .46    .33    6.44**
BR bias                  -.31   -.14    2.68**
Linda bias               -.39   -.27    5.35**
Covariation detection    -.29   -.14    2.81**
R = .58; R² = .34; R²adj = .33; F(4, 290) = 37.27**

Croatian sample
Predictor                  r      β      t
CRT                       .37    .28    4.59**
BR bias                  -.26   -.18    3.30**
Linda bias               -.17   -.12    2.17*
Covariation detection    -.28   -.24    4.59**
R = .49; R² = .24; R²adj = .23; F(4, 284) = 22.15**

Combined sample
Predictor                  r      β      t
CRT                       .42    .31    8.21**
BR bias                  -.27   -.16    4.46**
Linda bias               -.30   -.19    5.36**
Covariation detection    -.29   -.20    5.47**
R = .54; R² = .29; R²adj = .28; F(4, 579) = 58.44**

*p < .05; **p < .001
The regression analyses show that all of the tasks are
significant predictors of TSR scores in both national samples.
The patterns of results were similar across the two countries,
apart from bias on the Linda problem being less strongly
correlated with TSR scores in the Croatian sample. This
seems to be due to the Croatian sample containing more
participants who had undergone at least some statistics
training, which considerably decreases Linda bias and its
correlation with the TSR.
Discussion
The study aimed to determine the relation between a measure
of statistical reasoning and performance on classic
reasoning tasks. Results showed a robust relationship both
when analysing overall data and national samples
independently. The relationships reported in the results are
expected given that reasoning tasks routinely include aspects
of probabilistic reasoning which is what the TSR primarily
measures.
The variance in TSR scores explained by the reasoning
measures is promising for future work, though some
reservations should be taken into consideration. First, this
version of the TSR was time-limited, which routinely results
in settling for the dominant Type 1 response. Since
incentivising Type 1 reasoning is common to reasoning tasks,
the time limit may be a key factor behind the stronger
relationships. Second,
the shared computerised method of administering both the
reasoning tasks and the TSR contributes to the strength of the
relations. It is important to investigate whether the results
generalize to other settings. Finally, reasoning tasks were
limited to one problem per task. Sets of items have been
developed and will be administered as batteries in the future.
The end goal of this research is to develop a measure that is
readily available and easy to administer for researchers across
different fields, in contrast to the current tasks, which are
mainly limited to a fairly small research community. Such a
measure would make reasoning research and the dual-process
approach more accessible and more widespread, since
heuristics and analytical processing are part of real-world
reasoning, decision making and problem solving. Future steps
include detecting other relevant factors that need to be part of
such an assessment, establishing that they are indeed related
to performance on established reasoning tasks, creating a
manageable, accessible, curtailed version of the assessment
covering all the determined factors, and validating the final
version.
References
De Neys, W. (2012). Bias and conflict: A case for logical
intuitions. Perspectives on Psychological Science, 7(1),
28-38.
De Neys, W., & Glumicic, T. (2008). Conflict monitoring in
dual process theories of thinking. Cognition, 106, 1248-
1299.
Dujmović, M., & Valerjev, P. (2018). The influence of
conflict monitoring on meta-reasoning and response times
in a base rate task. Quarterly Journal of Experimental
Psychology, 71(12), 2548-2561.
Dujmović, M., Valerjev, P., & Bajšanski, I. (2021). The role
of representativeness in reasoning and metacognitive
processes: an in-depth analysis of the Linda problem.
Thinking and Reasoning, 27(2), 161-186.
Frederick, S. (2005). Cognitive reflection and decision
making. Journal of Economic Perspectives, 19(4), 25-42.
Glockner, A., & Witteman, C. (2010). Beyond dual-process
models: A categorization of processes underlying intuitive
judgment and decision making. Thinking & Reasoning,
16(1), 1-25.
Pennycook, G., Fugelsang, J.A. & Koehler, D.J. (2015).
What makes us think? A three-stage dual-process model of
analytic engagement. Cognitive Psychology, 80, 34-72.
Rapan, K., & Valerjev, P. (2020). Test statističkog
rasuđivanja [Test of statistical reasoning]. In V.Ć. Adorić,
I. Burić, I. Macuka, M. Nikolić Ivanišević, & A. Slišković
(Eds.), Zbirka psihologijskih skala i upitnika - Svezak 10
[Collection of psychological scales and questionnaires
Volume 10], (pp. 103-112), Zadar: University of Zadar.
Rapan, K., & Valerjev, P. (2021). Is automation of statistical
reasoning a suitable mindware in a base-rate neglect task?
Psychological Topics, 30(3), 447-466.
Stanovich, K., West, R., & Toplak, M. (2016). The rationality
quotient: Toward a test of rational thinking. Cambridge:
MIT Press.
Stoet, G. (2010). PsyToolkit - A software package for
programming psychological experiments using Linux.
Behavior Research Methods, 42(4), 1096-1104.
Stoet, G. (2017). PsyToolkit: A novel web-based method for
running online questionnaires and reaction-time
experiments. Teaching of Psychology, 44(1), 24-31.
Valerjev, P., & Dujmović, M. (2019). Performance and
metacognition in scientific reasoning: The covariation
detection task. Psychological Topics, 28(1), 93-113.
Valerjev, P. (2020). Chronometry and meta-reasoning in a
modified Cognitive Reflection Test. In K. Damnjanović,
O. Tošković, & S. Marković (Eds.), Proceedings of the
XXV Scientific Conference Empirical Studies in
Psychology (pp. 31-34). Belgrade: Institute of Psychology,
Laboratory for Experimental Psychology, Faculty of
Philosophy, University of Belgrade.
CIP Katalogizacija u publikaciji
Narodna biblioteka Srbije, Beograd
PROCEEDINGS OF THE XXVIII SCIENTIFIC CONFERENCE EMPIRICAL STUDIES IN
PSYCHOLOGY (28; 2022, Beograd)
[Zbornik radova] / XXVIII naučni skup Empirijska istraživanja u psihologiji
31. mart-3.april 2022; Filozofski fakultet, Univerzitet u Beogradu; [organizatori]
Institut za psihologiju i Laboratorija za eksperimentalnu psihologiju 1. Izd
Beograd: Filozofski fakultet, 2022 160 str.
Kor. Nasl. Zbornik radova na srp. i engl. jeziku elektronsko izdanje
ISBN 978-86-6427-245-2
1. Institut za psihologiju (Beograd)
2. Laboratorija za eksperimentalnu psihologiju (Beograd)
a) Psihologija Empirijska istraživanja – Zbornik radova